SubICap: Towards Subword-informed Image Captioning [article]

Naeha Sharif, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah
2020 arXiv   pre-print
Existing Image Captioning (IC) systems model words as atomic units in captions and are unable to exploit the structural information in the words. This makes representation of rare words very difficult and out-of-vocabulary words impossible. Moreover, to avoid computational complexity, existing IC models operate over a modest sized vocabulary of frequent words, such that the identity of rare words is lost. In this work we address this common limitation of IC systems in dealing with rare words in
... the corpora. We decompose words into smaller constituent units, 'subwords', and represent captions as a sequence of subwords instead of words. This helps represent all words in the corpora using a significantly smaller subword vocabulary, leading to better parameter learning. Using subword language modeling, our captioning system improves various metric scores, with a training vocabulary approximately 90% smaller than that of the baseline and various state-of-the-art word-level models. Our quantitative and qualitative results and analysis signify the efficacy of our proposed approach.
arXiv:2012.13122v1 fatcat:zn4plh6q5bevzbncrunxiiw5ta
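
The abstract above describes representing captions as sequences of subwords but does not say which segmentation algorithm is used. As a minimal sketch only, the following assumes a greedy longest-match segmentation against a hand-written, hypothetical subword vocabulary (`subword_vocab`); the function names and the single-character fallback are illustrative assumptions, not the authors' method.

```python
def segment_word(word, subword_vocab):
    """Split a word into subwords by greedy longest match.

    Assumption: characters not covered by the vocabulary fall back to
    single-character pieces, so every word can be represented.
    """
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in subword_vocab or end - start == 1:
                pieces.append(piece)
                start = end
                break
    return pieces


def caption_to_subwords(caption, subword_vocab):
    """Turn a whitespace-tokenised caption into one flat subword sequence."""
    subwords = []
    for word in caption.lower().split():
        subwords.extend(segment_word(word, subword_vocab))
    return subwords


# Hypothetical toy vocabulary: five subwords cover several surface forms.
toy_vocab = {"snow", "board", "ing", "er", "s"}
example = caption_to_subwords("snowboarder snowboarding", toy_vocab)
```

In this toy setting two distinct words share the pieces `snow` and `board`, which illustrates how a small subword vocabulary can cover rare word forms that a word-level vocabulary would drop.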

Towards Visual Affordance Learning: A Benchmark for Affordance Segmentation and Recognition [article]

Zeyad Osama Khalifa, Syed Afaq Ali Shah
2022 arXiv   pre-print
The physical and textural attributes of objects have been widely studied for recognition, detection and segmentation tasks in computer vision. A number of datasets, such as large scale ImageNet, have been proposed for feature learning using data hungry deep neural networks and for hand-crafted feature extraction. To intelligently interact with objects, robots and intelligent machines need the ability to infer beyond the traditional physical/textural attributes, and understand/learn visual cues,
... called visual affordances, for affordance recognition, detection and segmentation. To date there is no publicly available large dataset for visual affordance understanding and learning. In this paper, we introduce a large scale multi-view RGBD visual affordance learning dataset, a benchmark of 47210 RGBD images from 37 object categories, annotated with 15 visual affordance categories and 35 cluttered/complex scenes with different objects and multiple affordances. To the best of our knowledge, this is the first ever and the largest multi-view RGBD visual affordance learning dataset. We benchmark the proposed dataset for affordance recognition and segmentation. To achieve this we propose an Affordance Recognition Network a.k.a. ARNet. In addition, four state-of-the-art deep learning networks are evaluated for the affordance segmentation task. Our experimental results showcase the challenging nature of the dataset and present definite prospects for new and robust affordance learning algorithms. The dataset is available at: https://sites.google.com/view/afaqshah/dataset.
arXiv:2203.14092v1 fatcat:jhpsjl6z4nd4nk43m5ivonuqwy

Deep Bayesian Image Set Classification: A Defence Approach against Adversarial Attacks [article]

Nima Mirnateghi, Syed Afaq Ali Shah, Mohammed Bennamoun
2021 arXiv   pre-print
Syed Afaq Ali Shah received the PhD degree in computer vision and machine learning from The University of Western Australia (UWA), Crawley, WA, Australia.  ...  Shah are with the Discipline of Information Technology, Murdoch University, Perth, Australia. • M.  ... 
arXiv:2108.10217v1 fatcat:s6hmgbf2sffj3c3rldm6dbefou

Automatic Number Plate Recognition: A Detailed Survey of Relevant Algorithms

Lubna, Naveed Mufti, Syed Afaq Ali Shah
2021 Sensors  
Technologies and services for smart-vehicles and Intelligent-Transportation-Systems (ITS) continue to revolutionize many aspects of human life. This paper presents a detailed survey of current techniques and advancements in Automatic-Number-Plate-Recognition (ANPR) systems, with a comprehensive performance comparison of various real-time tested and simulated algorithms, including those involving computer vision (CV). ANPR technology has the ability to detect and recognize vehicles by
... number-plates using recognition techniques. Even with the best algorithms, a successful ANPR system deployment may require additional hardware to maximize its accuracy. The number plate condition, non-standardized formats, complex scenes, camera quality, camera mount position, tolerance to distortion, motion-blur, contrast problems, reflections, processing and memory limitations, environmental conditions, indoor/outdoor or day/night shots, software-tools or other hardware-based constraints may undermine its performance. This inconsistency, challenging environments and other complexities make ANPR an interesting field for researchers. The Internet-of-Things is beginning to shape the future of many industries and is paving new ways for ITS. ANPR can be well utilized by integrating with RFID-systems, GPS, Android platforms and other similar technologies. Deep-Learning techniques are widely utilized in the CV field for better detection rates. This research aims to advance the state-of-knowledge in ITS (ANPR) built on CV algorithms by citing relevant prior work, analyzing and presenting a survey of extraction, segmentation and recognition techniques, and providing guidelines on future trends in this area.
doi:10.3390/s21093028 pmid:33925845 fatcat:rljgab5qlne4vi3njjyylvhxxi

WEmbSim: A Simple yet Effective Metric for Image Captioning [article]

Naeha Sharif, Lyndon White, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah
2020 arXiv   pre-print
The area of automatic image caption evaluation is still undergoing intensive research to address the needs of generating captions which can meet adequacy and fluency requirements. Based on our past attempts at developing highly sophisticated learning-based metrics, we have discovered that a simple cosine similarity measure using the Mean of Word Embeddings (MOWE) of captions can actually achieve a surprisingly high performance on unsupervised caption evaluation. This inspires our proposed work
... an effective metric WEmbSim, which beats complex measures such as SPICE, CIDEr and WMD at system-level correlation with human judgments. Moreover, it also achieves the best accuracy at matching human consensus scores for caption pairs, against commonly used unsupervised methods. Therefore, we believe that WEmbSim sets a new baseline for any complex metric to be justified.
arXiv:2012.13137v1 fatcat:gw25ymwapnhm5p5dhrl5x76roi
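
The core idea named in the abstract above, cosine similarity between the Mean of Word Embeddings (MOWE) of two captions, can be sketched in a few lines. This is an illustrative sketch only: the three-dimensional vectors in `TOY_EMBEDDINGS` are made up, skipping out-of-vocabulary tokens is an assumption, and the abstract does not say which pretrained embeddings or preprocessing the authors use.

```python
import math

# Hypothetical toy embeddings; a real metric would load pretrained vectors.
TOY_EMBEDDINGS = {
    "a": [0.1, 0.0, 0.2],
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "runs": [0.0, 0.9, 0.1],
    "sleeps": [0.0, 0.1, 0.9],
}


def mean_of_word_embeddings(caption, embeddings):
    """Average the vectors of the caption's in-vocabulary tokens."""
    vectors = [embeddings[t] for t in caption.lower().split() if t in embeddings]
    if not vectors:
        raise ValueError("caption has no in-vocabulary tokens")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mowe_similarity(candidate, reference, embeddings=TOY_EMBEDDINGS):
    """Score a candidate caption against one reference caption."""
    return cosine_similarity(
        mean_of_word_embeddings(candidate, embeddings),
        mean_of_word_embeddings(reference, embeddings),
    )


close = mowe_similarity("a puppy runs", "a dog runs")
far = mowe_similarity("a dog sleeps", "a dog runs")
```

With these toy vectors, the paraphrase ("a puppy runs") scores higher against the reference than the caption describing a different action ("a dog sleeps"). How scores are aggregated over several reference captions is not covered by this sketch.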

Efficient Image Set Classification using Linear Regression based Image Reconstruction [article]

Syed Afaq Ali Shah, Uzair Nadeem, Mohammed Bennamoun, Ferdous Sohel, Roberto Togneri
2017 arXiv   pre-print
We propose a novel image set classification technique using linear regression models. Downsampled gallery image sets are interpreted as subspaces of a high dimensional space to avoid the computationally expensive training step. We estimate regression models for each test image using the class specific gallery subspaces. Images of the test set are then reconstructed using the regression models. Based on the minimum reconstruction error between the reconstructed and the original images, a
... voting strategy is used to classify the test set. We performed extensive evaluation on the benchmark UCSD/Honda, CMU Mobo and YouTube Celebrity datasets for face classification, and ETH-80 dataset for object classification. The results demonstrate that by using only a small amount of training data, our technique achieved competitive classification accuracy and superior computational speed compared with the state-of-the-art methods.
arXiv:1701.02485v1 fatcat:ub2ur5klfbhp7o5chui3drugrq
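
The pipeline in the abstract above (class-specific gallery subspaces, per-image reconstruction, minimum reconstruction error, voting) can be illustrated with a small dependency-free sketch. Several choices here are assumptions rather than the authors' implementation: images are plain lists of floats, the least-squares reconstruction is computed by projecting onto a Gram-Schmidt basis of each gallery instead of solving the regression explicitly (the two give the same reconstruction), and a simple majority vote stands in for the voting strategy, whose details are truncated in the listing.

```python
import math
from collections import Counter


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt basis for the subspace spanned by the gallery vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = _dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(_dot(w, w))
        if norm > tol:  # skip vectors that add no new direction
            basis.append([wi / norm for wi in w])
    return basis


def reconstruction_error(image, basis):
    """Distance between an image and its projection onto the subspace."""
    recon = [0.0] * len(image)
    for b in basis:
        coeff = _dot(image, b)
        recon = [r + coeff * bi for r, bi in zip(recon, b)]
    return math.sqrt(sum((x - r) ** 2 for x, r in zip(image, recon)))


def classify_image_set(test_images, galleries):
    """Each test image votes for the class that reconstructs it best."""
    bases = {label: orthonormal_basis(imgs) for label, imgs in galleries.items()}
    votes = Counter()
    for image in test_images:
        best = min(bases, key=lambda label: reconstruction_error(image, bases[label]))
        votes[best] += 1
    return votes.most_common(1)[0][0]


# Hypothetical 3-pixel "images": class A spans the first two axes, class B the third.
toy_galleries = {"A": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], "B": [[0.0, 0.0, 1.0]]}
toy_test_set = [[1.0, 1.0, 0.1], [2.0, 0.5, 0.0], [0.0, 0.2, 1.0]]
predicted = classify_image_set(toy_test_set, toy_galleries)
```

Two of the three toy test images are reconstructed best by class A's subspace, so the set is assigned to A even though one image votes for B. No training step is needed beyond building each class basis, which is the property the abstract highlights.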

CommuNety: A Deep Learning System for the Prediction of Cohesive Social Communities [article]

Syed Afaq Ali Shah, Weifeng Deng, Jianxin Li, Muhammad Aamir Cheema, Abdul Bais
2020 arXiv   pre-print
Fig. 8. Size Distribution of Communities. Fig. 9. Average Network Density. Afaq Shah received the Ph.D. degree in computer vision and machine learning from The University of Western Australia (  ...  Shah is with the Department of Information Technology, Mathematics and Statistics, Murdoch University, Australia, e-mail: afaq.shah@murdoch.edu.au. W.  ... 
arXiv:2007.14741v1 fatcat:ifpsao223zcthchbnqj7duplma

NNEval: Neural Network Based Evaluation Metric for Image Captioning [chapter]

Naeha Sharif, Lyndon White, Mohammed Bennamoun, Syed Afaq Ali Shah
2018 Lecture Notes in Computer Science  
The automatic evaluation of image descriptions is an intricate task, and it is highly important in the development and fine-grained analysis of captioning systems. Existing metrics to automatically evaluate image captioning systems fail to achieve a satisfactory level of correlation with human judgements at the sentence level. Moreover, these metrics, unlike humans, tend to focus on specific aspects of quality, such as the n-gram overlap or the semantic meaning. In this paper, we present the
... st learning-based metric to evaluate image captions. Our proposed framework enables us to incorporate both lexical and semantic information into a single learned metric. This results in an evaluator that takes into account various linguistic features to assess the caption quality. The experiments we performed to assess the proposed metric show improvements upon the state of the art in terms of correlation with human judgements and demonstrate its superior robustness to distractions.
doi:10.1007/978-3-030-01237-3_3 fatcat:kokhsilt3vcqndc5vo2zqugqgu

Application of MXenes in Perovskite Solar Cells: A Short Review

Syed Afaq Ali Shah, Muhammad Hassan Sayyad, Karim Khan, Jinghua Sun, Zhongyi Guo
2021 Nanomaterials  
Application of MXene materials in perovskite solar cells (PSCs) has attracted considerable attention owing to their supreme electrical conductivity, excellent carrier mobility, adjustable surface functional groups, excellent transparency and superior mechanical properties. This article reviews the progress made so far in using Ti3C2Tx MXene materials in the building blocks of perovskite solar cells such as electrodes, hole transport layer (HTL), electron transport layer (ETL) and perovskite
... active layer. Moreover, we provide an outlook on the exciting opportunities this recently developed field offers, and the challenges faced in effectively incorporating MXene materials in the building blocks of PSCs for better operational stability and enhanced performance.
doi:10.3390/nano11082151 pmid:34443979 pmcid:PMC8401012 fatcat:mbiqgc6oqnbbnc3khedjka7cfm

Machine Learning Approaches for Prediction of Facial Rejuvenation using Real and Synthetic Data

Syed Afaq Ali Shah, Mohammed Bennamoun, Michael Molton
2019 IEEE Access  
This paper proposes novel machine learning approaches to predict the outcome of facial rejuvenation prior to a cosmetic procedure. This is achieved by estimating the required amount of dermal filler volume that needs to be applied on the face by learning the underlying structural mapping from the pretreatment and posttreatment 3D face images. We develop and train our proposed deep neural network, called Rejuv3DNet, designed specifically to predict the dermal filler volume. We also propose the
... kernel regression (KR)-based model to validate and improve our volume estimation results using regression. Our other contributions include the development of the first 3D face cosmetic dataset, which consists of real-world pretreatment and posttreatment 3D face images, and a novel technique for the generation of synthetic cosmetic treatment 3D face images. Our experimental results show that the proposed Rejuv3DNet and the KR model achieve prediction accuracies of 62.5% and 66.67%, respectively, on real-world data, and of 75.2% and 89.5%, and 77.2% and 90.1%, on our two different synthetic datasets. Our proposed techniques have been found to be computationally efficient, achieving near real-time prediction performance. The reported accuracies are our preliminary results for proof of concept, which can be improved with more data. The proposed approach has the potential for further investigation in the cosmetic surgery domain. Index terms: deep learning, deep neural network, facial analysis, regression.
doi:10.1109/access.2019.2899379 fatcat:uiszlvtfmrccphmvfrklesplza

Scene Graph Generation: A Comprehensive Survey [article]

Guangming Zhu, Liang Zhang, Youliang Jiang, Yixuan Dang, Haoran Hou, Peiyi Shen, Mingtao Feng, Xia Zhao, Qiguang Miao, Syed Afaq Ali Shah, Mohammed Bennamoun
2022 arXiv   pre-print
Deep learning techniques have led to remarkable breakthroughs in the field of generic object detection and have spawned a lot of scene-understanding tasks in recent years. Scene graph has been the focus of research because of its powerful semantic representation and applications to scene understanding. Scene Graph Generation (SGG) refers to the task of automatically mapping an image into a semantic structural scene graph, which requires the correct labeling of detected objects and their
... ships. Although this is a challenging task, the community has proposed a lot of SGG approaches and achieved good results. In this paper, we provide a comprehensive survey of recent achievements in this field brought about by deep learning techniques. We review 138 representative works that cover different input modalities, and systematically summarize existing methods of image-based SGG from the perspective of feature extraction and fusion. We attempt to connect and systematize the existing visual relationship detection methods, to summarize, and interpret the mechanisms and the strategies of SGG in a comprehensive way. Finally, we finish this survey with deep discussions about current existing problems and future research directions. This survey will help readers to develop a better understanding of the current research status and ideas.
arXiv:2201.00443v2 fatcat:s4w7sdf6dndzneujly54srh5c4

Learning-based Composite Metrics for Improved Caption Evaluation

Naeha Sharif, Lyndon White, Mohammed Bennamoun, Syed Afaq Ali Shah
2018 Proceedings of ACL 2018, Student Research Workshop  
The evaluation of image caption quality is a challenging task, which requires the assessment of two main aspects in a caption: adequacy and fluency. These quality aspects can be judged using a combination of several linguistic features. However, most of the current image captioning metrics focus only on specific linguistic facets, such as the lexical or semantic, and fail to meet a satisfactory level of correlation with human judgements at the sentence-level. We propose a learning-based
... k to incorporate the scores of a set of lexical and semantic metrics as features, to capture the adequacy and fluency of captions at different linguistic levels. Our experimental results demonstrate that composite metrics draw upon the strengths of standalone measures to yield improved correlation and accuracy.
doi:10.18653/v1/p18-3003 dblp:conf/acl/SharifWBS18 fatcat:pn2r4slifzev5ejax66pk5kq5i

A novel 3D vorticity based approach for automatic registration of low resolution range images

Syed Afaq Ali Shah, Mohammed Bennamoun, Farid Boussaid
2015 Pattern Recognition  
This paper tackles the problem of feature matching and range image registration. Our approach is based on a novel set of discriminating three-dimensional (3D) local features, named 3D-Vor (Vorticity). In contrast to conventional local feature representation techniques, which use the vector field (i.e. surface normals) to just construct their local reference frames, the proposed feature representation exploits the vorticity of the vector field computed at each point of the local surface to
... e the distinctive characteristics at each point of the underlying 3D surface. The 3D-Vor descriptors of two range images are then matched using a fully automatic feature matching algorithm which identifies correspondences between the two range images. Correspondences are verified in a local validation step of the proposed algorithm and used for the pairwise registration of the range images. Quantitative results on low resolution Kinect 3D data (Washington RGB-D dataset) show that our proposed automatic registration algorithm is accurate and computationally efficient. The performance evaluation of the proposed descriptor was also carried out on the challenging low resolution Washington RGB-D (Kinect) object dataset, for the task of automatic range image registration. Reported experimental results show that the proposed local surface descriptor is robust to resolution and noise, and more accurate than state-of-the-art techniques. It achieves 90% registration accuracy compared to 50%, 69.2% and 52% for spin image, 3D SURF and SISI/LD-SIFT descriptors, respectively.
doi:10.1016/j.patcog.2015.03.014 fatcat:m7ndkqzrsrgcdoslq2fj2pessy

WEmbSim: A Simple yet Effective Metric for Image Captioning

Naeha Sharif, Lyndon White, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah
2020 2020 Digital Image Computing: Techniques and Applications (DICTA)  
The area of automatic image caption evaluation is still undergoing intensive research to address the needs of generating captions which can meet adequacy and fluency requirements. Based on our past attempts at developing highly sophisticated learning-based metrics, we have discovered that a simple cosine similarity measure using the Mean of Word Embeddings (MOWE) of captions can actually achieve a surprisingly high performance on unsupervised caption evaluation. This inspires our proposed work
... n an effective metric WEmbSim, which beats complex measures such as SPICE, CIDEr and WMD at system-level correlation with human judgments. Moreover, it also achieves the best accuracy at matching human consensus scores for caption pairs, against commonly used unsupervised methods. Therefore, we believe that WEmbSim sets a new baseline for any complex metric to be justified.
doi:10.1109/dicta51227.2020.9363392 fatcat:bdbpf3xwhbcq3df7yvjdkmfp2e

Effect of Different Levels of Zinc and Compost on Yield and Yield Components of Wheat

Khadim Dawar, Wajid Ali, Hamida Bibi, Ishaq Ahmad Mian, Mian Afaq Ahmad, Muhammad Baqir Hussain, Muqarrab Ali, Shamsher Ali, Shah Fahad, Saeed ur Rehman, Rahul Datta, Asad Syed (+1 others)
2022 Agronomy  
Management of organic matter and micronutrients is very important for the sustainable improvement of soil health. Poor soil organic matter usually results in lower availability of zinc (Zn) micronutrients in plants. Such deficiency in Zn causes a significant decrease in the growth and yield of crops. The need at the current time is to balance the application of organic amendments with Zn micronutrients to achieve optimum crop yields. Thus, the current study was conducted to investigate wheat,
... ing compost as organic matter and Zn as a micronutrient. There were three levels of compost (i.e., control (0C), 5 t/ha (5C) and 10 t/ha (10C)) and four levels of Zn (control (0Zn), 2.5 kg Zn/ha (2.5Zn), 5.0 kg Zn/ha (5.0Zn) and 10.0 kg Zn/ha (10.0Zn)) applied with three replicates. The addition of 10C under 10.0Zn produced significantly better results for the maximum enhancement in plant height (8.08%), tillers/m2 (21.61%), spikes/m2 (22.33%) and spike length (40.50%) compared to 0C. Significant enhancements in 1000-grain weight, biological yield and grain yield also validated the effectiveness of 10C under 10.0Zn compared to 0C. In conclusion, application of 10C with 10.0Zn showed the potential to improve wheat growth and yield attributes. The addition of 10C with 10.0Zn also regulated soil mineral N, total soil N and extractable soil P. Further investigation is recommended with different soil textures to verify 10C with 10.0Zn as the best amendment for the enhancement of wheat yield in poor organic matter and Zn-deficient soils.
doi:10.3390/agronomy12071562 fatcat:grlwutcfkzfgzmpw7ipygtkkcu