25 Hits in 0.51 sec

Semantic Autoencoder for Zero-Shot Learning [article]

Elyor Kodirov, Tao Xiang, Shaogang Gong
2017 arXiv   pre-print
Semantic Autoencoder for Zero-Shot Learning. Elyor Kodirov  ...  E. Kodirov, T. Xiang, Z. Fu, and S. Gong. Unsupervised domain adaptation for zero-shot learning.  ...  Caltech-UCSD Birds-200-2011 dataset, California Institute of Technology.  ...
arXiv:1704.08345v1 fatcat:3tlzperd2nc3vmdmmigqavdonm

Instance Cross Entropy for Deep Metric Learning [article]

Xinshao Wang, Elyor Kodirov, Yang Hua, Neil Robertson
2019 arXiv   pre-print
Loss functions play a crucial role in deep metric learning, and thus a variety of them have been proposed. Some supervise the learning process by pairwise or triplet-wise similarity constraints, while others take advantage of structured similarity information among multiple data points. In this work, we approach deep metric learning from a novel perspective. We propose instance cross entropy (ICE), which measures the difference between an estimated instance-level matching distribution and its ground-truth one. ICE has three main appealing properties. Firstly, similar to categorical cross entropy (CCE), ICE has a clear probabilistic interpretation and exploits structured semantic similarity information for learning supervision. Secondly, ICE is scalable to arbitrarily large training data, as it learns on mini-batches iteratively and is independent of the training set size. Thirdly, motivated by our relative weight analysis, seamless sample reweighting is incorporated: it rescales samples' gradients to control the degree of differentiation over training examples instead of truncating them by sample mining. In addition to its simplicity and intuitiveness, extensive experiments on three real-world benchmarks demonstrate the superiority of ICE.
arXiv:1911.09976v1 fatcat:ae3enq5rxzfa7dmxustmbmsa5a
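
As a reading aid for the record above, here is a minimal sketch of an instance-level matching distribution scored with cross entropy, in the spirit of the abstract: pairwise similarities within a mini-batch are turned into a per-anchor softmax distribution and compared against the ground-truth "same identity" distribution. The cosine similarities, the temperature `tau`, and the uniform target distribution are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only; names and formulation are assumptions, not the paper's ICE.
import torch
import torch.nn.functional as F

def instance_matching_loss(embeddings, labels, tau=0.1):
    """embeddings: (B, D) mini-batch embeddings; labels: (B,) integer identity labels."""
    z = F.normalize(embeddings, dim=1)                 # unit-length embeddings
    sim = z @ z.t() / tau                              # scaled pairwise similarities
    self_mask = torch.eye(len(labels), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))    # exclude self-matches
    log_p = F.log_softmax(sim, dim=1)                  # estimated matching distribution
    log_p = log_p.masked_fill(self_mask, 0.0)          # avoid 0 * (-inf) below

    # ground-truth matching distribution: uniform over other samples of the same identity
    positives = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    target = positives.float()
    target = target / target.sum(dim=1, keepdim=True).clamp(min=1)

    # cross entropy between the estimated and ground-truth distributions
    return -(target * log_p).sum(dim=1).mean()
```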

ID-aware Quality for Set-based Person Re-identification [article]

Xinshao Wang, Elyor Kodirov, Yang Hua, Neil M. Robertson
2019 arXiv   pre-print
Set-based person re-identification (SReID) is a matching problem that aims to verify whether two sets are of the same identity (ID). Existing SReID models typically generate a feature representation per image and aggregate them to represent the set as a single embedding. However, they can easily be perturbed by noise--perceptually/semantically low-quality images--which is inevitable due to imperfect tracking/detection systems, or overfit to trivial images. In this work, we present a novel and simple solution to this problem based on ID-aware quality, which measures the perceptual and semantic quality of images guided by their ID information. Specifically, we propose an ID-aware Embedding that consists of two key components: (1) Feature learning attention, which aims to learn robust image embeddings by focusing on 'medium' hard images. This way it can prevent overfitting to trivial images and alleviate the influence of outliers. (2) Feature fusion attention, which fuses image embeddings in the set to obtain the set-level embedding. It ignores noisy information and pays more attention to discriminative images in order to aggregate more discriminative information. Experimental results on four datasets show that our method outperforms state-of-the-art approaches despite its simplicity.
arXiv:1911.09143v1 fatcat:kvwq7qloq5brhlsa4smi4cinii
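
A minimal sketch of quality-weighted set aggregation in the spirit of the abstract above: each image in a set receives a scalar quality score, and the set-level embedding is the score-weighted sum of the image embeddings. The linear scoring head and softmax normalisation are illustrative stand-ins; the paper's ID-guided quality measure is not modelled here.

```python
# Illustrative sketch only; the scoring head is an assumption, not the paper's ID-aware quality.
import torch
import torch.nn as nn

class QualityWeightedPooling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)        # hypothetical per-image quality head

    def forward(self, image_embeddings):      # (N, D): N images belonging to one set
        q = self.score(image_embeddings)      # (N, 1) raw quality scores
        w = torch.softmax(q, dim=0)           # normalise scores over the set
        return (w * image_embeddings).sum(0)  # (D,) set-level embedding
```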

GAN-based Pose-aware Regulation for Video-based Person Re-identification [article]

Alessandro Borgia and Yang Hua and Elyor Kodirov and Neil M. Robertson
2019 arXiv   pre-print
Video-based person re-identification deals with the inherent difficulty of matching unregulated sequences of different lengths and with incomplete target pose/viewpoint structure. Common approaches operate either by reducing the problem to the still-image case, incurring a significant information loss, or by exploiting inter-sequence temporal dependencies, as in Siamese Recurrent Neural Networks or in gait analysis. However, in all cases, the inter-sequence pose/viewpoint misalignment is not considered, and the existing spatial approaches are mostly limited to the still-image context. To this end, we propose a novel approach that exploits the rich video information more effectively by accounting for the role that the changing pose/viewpoint factor plays in the sequence matching process. Specifically, our approach consists of two components. The first attempts to complement the original pose-incomplete information carried by the sequences with synthetic GAN-generated images, and fuses their feature vectors into a more discriminative, viewpoint-insensitive embedding, namely Weighted Fusion (WF). The second performs an explicit pose-based alignment of sequence pairs to promote coherent feature matching, namely Weighted-Pose Regulation (WPR). Extensive experiments on two large video-based benchmark datasets show that our approach considerably outperforms existing methods.
arXiv:1903.11552v1 fatcat:ypujostuwvaozaotf2t3hzoyze

Semantic Autoencoder for Zero-Shot Learning

Elyor Kodirov, Tao Xiang, Shaogang Gong
2017 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
Existing zero-shot learning (ZSL) models typically learn a projection function from a feature space to a semantic embedding space (e.g. attribute space). However, such a projection function is only concerned with predicting the training (seen) class semantic representation (e.g. attribute prediction) or classification. When applied to test data, which in the context of ZSL contain different (unseen) classes without training data, a ZSL model typically suffers from the projection domain shift problem. In this work, we present a novel solution to ZSL based on learning a Semantic AutoEncoder (SAE). Taking the encoder-decoder paradigm, the encoder aims to project a visual feature vector into the semantic space, as in existing ZSL models. However, the decoder exerts an additional constraint: the projection/code must be able to reconstruct the original visual feature. We show that with this additional reconstruction constraint, the projection function learned from the seen classes generalises better to the new unseen classes. Importantly, the encoder and decoder are linear and symmetric, which enables us to develop an extremely efficient learning algorithm. Extensive experiments on six benchmark datasets demonstrate that the proposed SAE significantly outperforms existing ZSL models, with the additional benefit of lower computational cost. Furthermore, when the SAE is applied to the supervised clustering problem, it also beats the state of the art.
doi:10.1109/cvpr.2017.473 dblp:conf/cvpr/KodirovXG17 fatcat:sx5n4ldqingdndlejm43teebuu
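
The abstract notes that the encoder and decoder are linear and symmetric, which admits a very efficient solver. Below is a minimal sketch, assuming the commonly used relaxation min_W ||X - WᵀS||² + λ||WX - S||², whose stationarity condition is a Sylvester equation; the shapes, the prediction step, and λ are illustrative choices.

```python
# Illustrative sketch under the assumed relaxed objective; not a verified reference implementation.
import numpy as np
from scipy.linalg import solve_sylvester

def fit_linear_semantic_encoder(X, S, lam=0.2):
    """X: (d, n) visual features; S: (k, n) semantic vectors (e.g. attributes)."""
    # Setting the gradient of ||X - W^T S||^2 + lam * ||W X - S||^2 to zero gives
    #   (S S^T) W + W (lam * X X^T) = (1 + lam) * S X^T, a Sylvester equation.
    A = S @ S.T
    B = lam * (X @ X.T)
    C = (1 + lam) * (S @ X.T)
    return solve_sylvester(A, B, C)            # encoder W; W.T acts as the tied decoder

def predict_unseen_class(W, x, unseen_prototypes):
    """Nearest unseen-class prototype in the semantic space (illustrative)."""
    s_hat = W @ x                                              # encode a test feature
    dists = np.linalg.norm(unseen_prototypes - s_hat, axis=1)  # (num_unseen,)
    return int(np.argmin(dists))
```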

Unsupervised Domain Adaptation for Zero-Shot Learning

Elyor Kodirov, Tao Xiang, Zhenyong Fu, Shaogang Gong
2015 2015 IEEE International Conference on Computer Vision (ICCV)  
Zero-shot learning (ZSL) can be considered a special case of transfer learning in which the source and target domains have different tasks/label spaces and the target domain is unlabelled, providing little guidance for the knowledge transfer. A ZSL method typically assumes that the two domains share a common semantic representation space, into which a visual feature vector extracted from an image/video can be projected/embedded using a projection function. Existing approaches learn the projection function from the source domain and apply it without adaptation to the target domain. They are thus based on naive knowledge transfer, and the learned projections are prone to the domain shift problem. In this paper, a novel ZSL method is proposed based on unsupervised domain adaptation. Specifically, we formulate a novel regularised sparse coding framework which uses the target domain class labels' projections in the semantic space to regularise the learned target domain projection, thus effectively overcoming the projection domain shift problem. Extensive experiments on four object and action recognition benchmark datasets show that the proposed ZSL method significantly outperforms the state of the art.
doi:10.1109/iccv.2015.282 dblp:conf/iccv/KodirovXFG15 fatcat:r2o6prdiabbileemh4dmtddcoq
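
A minimal sketch of regularised sparse coding in the spirit of the abstract: codes reconstruct the target data through a dictionary, stay sparse, and are pulled toward a prior that stands in for the target-class projections in the semantic space. The ISTA-style update and every parameter name are illustrative assumptions, not the paper's algorithm.

```python
# Illustrative sketch only; the regulariser and update rule are assumptions.
import numpy as np

def soft_threshold(A, t):
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def regularised_sparse_coding(X, D, P, lam1=0.1, lam2=1.0, iters=200):
    """X: (d, n) data; D: (d, k) dictionary; P: (k, n) prior codes (semantic projections)."""
    # Step size from the Lipschitz constant of the smooth part of the objective.
    step = 1.0 / (2.0 * (np.linalg.norm(D, 2) ** 2 + lam2))
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(iters):
        grad = -2.0 * D.T @ (X - D @ A) + 2.0 * lam2 * (A - P)  # smooth gradient
        A = soft_threshold(A - step * grad, step * lam1)        # proximal L1 step
    return A
```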

Cross-View Discriminative Feature Learning for Person Re-Identification

Alessandro Borgia, Yang Hua, Elyor Kodirov, Neil M. Robertson
2018 IEEE Transactions on Image Processing  
Kodirov is with Anyvision, Belfast, UK. Email: elyor@anyvision.co N. M. Robertson is with EEECS/ECIT, Queen's University Belfast, UK.  ... 
doi:10.1109/tip.2018.2851098 pmid:29994678 fatcat:nxpidryun5erfg3cmglke7ob6a

Dictionary Learning with Iterative Laplacian Regularisation for Unsupervised Person Re-identification

Elyor Kodirov, Tao Xiang, Shaogang Gong
2015 Procedings of the British Machine Vision Conference 2015  
Many existing approaches to person re-identification (Re-ID) are based on supervised learning, which requires hundreds of matching pairs to be labelled for each pair of cameras. This severely limits their scalability for real-world applications. This work aims to overcome this limitation by developing a novel unsupervised Re-ID approach. The approach is based on a new dictionary learning for sparse coding formulation with a graph Laplacian regularisation term whose value is set iteratively. As an unsupervised model, the dictionary learning model is well suited to the unsupervised task, whilst the regularisation term enables the exploitation of cross-view identity-discriminative information ignored by existing unsupervised Re-ID methods. Importantly, this model is also flexible in utilising any labelled data if available. Experiments on two benchmark datasets demonstrate that the proposed approach significantly outperforms the state of the art.
doi:10.5244/c.29.44 dblp:conf/bmvc/KodirovXG15 fatcat:igt3ogwj4rddnkoz5eiwv5gzxy
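
A minimal sketch of the general mechanism the abstract describes, namely graph-Laplacian-regularised coding with an iteratively refreshed graph: a kNN graph over the current codes defines L, the codes are updated against reconstruction plus β·tr(A L Aᵀ), and the graph is rebuilt each outer iteration. The dictionary is held fixed for brevity, and the graph construction, step size, and β are illustrative assumptions.

```python
# Illustrative sketch only; not the paper's full dictionary learning procedure.
import numpy as np

def knn_laplacian(A, k=5):
    """Symmetric kNN graph over the columns of A and its unnormalised Laplacian."""
    n = A.shape[1]
    d2 = ((A[:, :, None] - A[:, None, :]) ** 2).sum(axis=0)  # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        W[i, np.argsort(d2[i])[1:k + 1]] = 1.0               # skip self, link k neighbours
    W = np.maximum(W, W.T)                                    # symmetrise
    return np.diag(W.sum(axis=1)) - W                         # L = degree - adjacency

def laplacian_regularised_codes(X, D, beta=0.1, outer=10, inner=50, lr=1e-3):
    """X: (d, n) features; D: (d, k) dictionary, held fixed here for brevity."""
    A = np.linalg.lstsq(D, X, rcond=None)[0]                  # warm-start codes
    for _ in range(outer):
        L = knn_laplacian(A)                                  # refresh graph from current codes
        for _ in range(inner):
            grad = -2.0 * D.T @ (X - D @ A) + 2.0 * beta * (A @ L)
            A -= lr * grad                                    # gradient step on the smooth objective
        # (sparsity and dictionary updates from the full formulation are omitted here)
    return A
```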

Ranked List Loss for Deep Metric Learning

Xinshao Wang, Yang Hua, Elyor Kodirov, Neil M Robertson
2021 IEEE Transactions on Pattern Analysis and Machine Intelligence  
Kodirov received his Ph.D. degree from the School of Electronic Engineering and Computer Science, Queen Mary University of London, in 2017, and the Master's degree in computer science from Chonnam National University  ... 
doi:10.1109/tpami.2021.3068449 pmid:33760730 fatcat:onbyudurbfhwjgk3lagd3azhqa

Zero-shot object recognition by semantic manifold distance

Zhenyong Fu, Tao A Xiang, Elyor Kodirov, Shaogang Gong
2015 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
Object recognition by zero-shot learning (ZSL) aims to recognise objects without seeing any visual examples, by learning knowledge transfer between seen and unseen object classes. This is typically achieved by exploring a semantic embedding space, such as an attribute space or a semantic word vector space. In such a space, both seen and unseen class labels, as well as image features, can be embedded (projected), and the similarity between them can thus be measured directly. Existing works differ in which embedding space is used and how the visual data are projected into it. Yet they all measure similarity in that space using a conventional distance metric (e.g. cosine) that does not consider the rich intrinsic structure, i.e. the semantic manifold, of the semantic categories in the embedding space. In this paper we propose to model the semantic manifold in an embedding space using a semantic class label graph. The semantic manifold structure is used to redefine the distance metric in the semantic embedding space for more effective ZSL. The proposed semantic manifold distance is computed using a novel absorbing Markov chain process (AMP), which has a very efficient closed-form solution. The proposed model improves upon and seamlessly unifies various existing ZSL algorithms. Extensive experiments on both the large-scale ImageNet dataset and the widely used Animal with Attribute (AwA) dataset show that our model significantly outperforms the state of the art.
doi:10.1109/cvpr.2015.7298879 dblp:conf/cvpr/FuXKG15 fatcat:2ophdrj72vfcffimbzrjbighgi
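
The closed form mentioned in the abstract is presumably the standard absorbing-Markov-chain machinery; a minimal sketch of those statistics on a class-label graph follows. How the transition matrix is built from the semantic class graph, and how the final distance is read off, are assumptions for illustration.

```python
# Illustrative sketch of standard absorbing-Markov-chain statistics; graph construction is assumed.
import numpy as np

def absorption_statistics(Q, R):
    """Q: (t, t) transient-to-transient transitions; R: (t, a) transient-to-absorbing transitions."""
    t = Q.shape[0]
    N = np.linalg.inv(np.eye(t) - Q)   # fundamental matrix: expected visit counts
    B = N @ R                          # probability of being absorbed by each absorbing class
    steps = N @ np.ones(t)             # expected number of steps before absorption
    return N, B, steps

# The expected steps to absorption can serve as a manifold-aware "distance" from each
# transient node (e.g. a projected test sample or a seen class) to an unseen class
# treated as an absorbing node on the semantic class graph.
```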

Ranked List Loss for Deep Metric Learning

Xinshao Wang, Yang Hua, Elyor Kodirov, Guosheng Hu, Romain Garnier, Neil M. Robertson
2019 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
Kodirov is currently a senior researcher at AnyVision, Belfast, UK.  ... 
doi:10.1109/cvpr.2019.00535 dblp:conf/cvpr/WangHKHGR19 fatcat:jrkbck4ljvfixii7dvlt2h3cwm

Zero-Shot Learning on Semantic Class Prototype Graph

Zhenyong Fu, Tao Xiang, Elyor Kodirov, Shaogang Gong
2018 IEEE Transactions on Pattern Analysis and Machine Intelligence  
However, we found that none of them, including SSE-ReLU [65] and Kodirov et al.  ...  Compared to the inductive learning-based methods, our model beats the closest competitor Deep-SCoRe [43] by 8.2%. (2) As expected, the three transductive methods (Kodirov et al.  ... 
doi:10.1109/tpami.2017.2737007 pmid:28796607 fatcat:glg5peo2h5gytp6fyktg6j7bay

GAN-Based Pose-Aware Regulation for Video-Based Person Re-Identification

Alessandro Borgia, Yang Hua, Elyor Kodirov, Neil Robertson
2019 2019 IEEE Winter Conference on Applications of Computer Vision (WACV)  
Video-based person re-identification deals with the inherent difficulty of matching unregulated sequences of different lengths and with incomplete target pose/viewpoint structure. Common approaches operate either by reducing the problem to the still-image case, incurring a significant information loss, or by exploiting inter-sequence temporal dependencies, as in Siamese Recurrent Neural Networks or in gait analysis. However, in all cases, the inter-sequence pose/viewpoint misalignment is not considered, and the existing spatial approaches are mostly limited to the still-image context. To this end, we propose a novel approach that exploits the rich video information more effectively by accounting for the role that the changing pose/viewpoint factor plays in the sequence matching process. Specifically, our approach consists of two components. The first attempts to complement the original pose-incomplete information carried by the sequences with synthetic GAN-generated images, and fuses their feature vectors into a more discriminative, viewpoint-insensitive embedding, namely Weighted Fusion (WF). The second performs an explicit pose-based alignment of sequence pairs to promote coherent feature matching, namely Weighted-Pose Regulation (WPR). Extensive experiments on two large video-based benchmark datasets show that our approach considerably outperforms existing methods.
doi:10.1109/wacv.2019.00130 dblp:conf/wacv/BorgiaHKR19 fatcat:s6yzanspdvcojgh7x2cjtusmoq

Deep Metric Learning by Online Soft Mining and Class-Aware Attention [article]

Xinshao Wang, Yang Hua, Elyor Kodirov, Guosheng Hu, Neil M. Robertson
2019 arXiv   pre-print
Deep metric learning aims to learn a deep embedding that can capture the semantic similarity of data points. Given the availability of massive training samples, deep metric learning is known to suffer from slow convergence due to a large fraction of trivial samples. Therefore, most existing methods resort to sample mining strategies for selecting non-trivial samples to accelerate convergence and improve performance. In this work, we identify two critical limitations of these sample mining methods and provide solutions for both of them. First, previous mining methods assign one binary score to each sample, i.e., dropping or keeping it, so they only select a subset of relevant samples in a mini-batch. Therefore, we propose a novel sample mining method, called Online Soft Mining (OSM), which assigns one continuous score to each sample to make use of all samples in the mini-batch. OSM learns extended manifolds that preserve useful intra-class variances by focusing on more similar positives. Second, the existing methods are easily influenced by outliers, as these are generally included in the mined subset. To address this, we introduce Class-Aware Attention (CAA), which assigns little attention to abnormal data samples. Furthermore, by combining OSM and CAA, we propose a novel weighted contrastive loss to learn discriminative embeddings. Extensive experiments on two fine-grained visual categorisation datasets and two video-based person re-identification benchmarks show that our method significantly outperforms the state of the art.
arXiv:1811.01459v3 fatcat:ntm2v4vjcfdfhmlhhjjkle7y7u
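
A minimal sketch of a weighted contrastive loss with continuous weights, echoing the OSM and CAA ideas in the abstract: a soft score favours more similar positives instead of hard mining, and a class-aware weight (here, an auxiliary classifier's confidence on the true class) downweights likely outliers. The exact weighting functions are illustrative assumptions, not the paper's definitions.

```python
# Illustrative sketch only; weighting functions are assumptions, not OSM/CAA as defined in the paper.
import torch
import torch.nn.functional as F

def weighted_contrastive_loss(emb, labels, class_logits, margin=0.5, sigma=0.5):
    """emb: (B, D); labels: (B,) int64; class_logits: (B, C) from an auxiliary classifier.
    Assumes every identity appears at least twice in the mini-batch."""
    z = F.normalize(emb, dim=1)
    sim = z @ z.t()                                                   # (B, B) cosine similarities
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    eye = torch.eye(len(labels), dtype=torch.bool, device=z.device)

    # class-aware weight: classifier confidence on the true class, low for likely outliers
    conf = F.softmax(class_logits, dim=1).gather(1, labels.view(-1, 1)).squeeze(1)
    pair_conf = torch.minimum(conf.unsqueeze(0), conf.unsqueeze(1))

    # soft mining: continuous scores that favour more similar positives
    pos_w = torch.exp(-(1.0 - sim) / sigma) * pair_conf

    pos_loss = (pos_w * (1.0 - sim))[same & ~eye]                     # pull positives together
    neg_loss = (pair_conf * F.relu(sim - margin))[~same]              # push negatives below margin
    return pos_loss.mean() + neg_loss.mean()
```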

Deep Metric Learning by Online Soft Mining and Class-Aware Attention

Xinshao Wang, Yang Hua, Elyor Kodirov, Guosheng Hu, Neil M. Robertson
2019 PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE  
Deep metric learning aims to learn a deep embedding that can capture the semantic similarity of data points. Given the availability of massive training samples, deep metric learning is known to suffer from slow convergence due to a large fraction of trivial samples. Therefore, most existing methods resort to sample mining strategies for selecting non-trivial samples to accelerate convergence and improve performance. In this work, we identify two critical limitations of these sample mining methods and provide solutions for both of them. First, previous mining methods assign one binary score to each sample, i.e., dropping or keeping it, so they only select a subset of relevant samples in a mini-batch. Therefore, we propose a novel sample mining method, called Online Soft Mining (OSM), which assigns one continuous score to each sample to make use of all samples in the mini-batch. OSM learns extended manifolds that preserve useful intra-class variances by focusing on more similar positives. Second, the existing methods are easily influenced by outliers, as these are generally included in the mined subset. To address this, we introduce Class-Aware Attention (CAA), which assigns little attention to abnormal data samples. Furthermore, by combining OSM and CAA, we propose a novel weighted contrastive loss to learn discriminative embeddings. Extensive experiments on two fine-grained visual categorisation datasets and two video-based person re-identification benchmarks show that our method significantly outperforms the state of the art.
doi:10.1609/aaai.v33i01.33015361 fatcat:wklvfundbjd25lb2pthegegzam
Showing results 1 — 15 out of 25 results