626 Hits in 6.8 sec

Joint Latent Subspace Learning and Regression for Cross-Modal Retrieval

Jianlong Wu, Zhouchen Lin, Hongbin Zha
2017 Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '17  
Cross-modal retrieval has received much attention in recent years. A commonly used approach is to project multi-modality data into a common subspace and then perform retrieval.  ...  Then we construct a graph model to project the multi-modality data into the latent space. Finally, we combine these two processes to jointly learn the latent space and the regression.  ...  CONCLUSIONS In this paper, we proposed a novel framework to jointly learn the latent space and the regression for cross-modal retrieval.  ... 
doi:10.1145/3077136.3080678 dblp:conf/sigir/WuLZ17 fatcat:lnqzdgi4wfefpfgt7ybt5226ii

Variational Autoencoder with CCA for Audio-Visual Cross-Modal Retrieval [article]

Jiwei Zhang, Yi Yu, Suhua Tang, Jianming Wu, Wei Li
2021 arXiv   pre-print
Additionally, in this way, the cross-modal discrepancy from intra-modal and inter-modal information is simultaneously eliminated in the joint embedding subspace.  ...  Cross-modal retrieval uses one modality as a query to retrieve data from another modality, and has become a popular topic in information retrieval, machine learning, and databases.  ...  Zha, “Joint latent subspace learning and regression for cross-modal retrieval,” in Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval,  ... 
arXiv:2112.02601v1 fatcat:iowujbzu4vfqhei5gwo3hma2d4

Modality-dependent Cross-media Retrieval [article]

Yunchao Wei, Yao Zhao, Zhenfeng Zhu, Shikui Wei, Yanhui Xiao, Jiashi Feng, Shuicheng Yan
2015 arXiv   pre-print
project images and text from their original feature spaces into two common latent subspaces (one for I2T and the other for T2I).  ...  Different from previous works, we propose a modality-dependent cross-media retrieval (MDCR) model, where two pairs of projections are learned for different cross-media retrieval tasks instead of one  ...  et al. 2013; Wei et al. 2014; try to learn an optimal common latent subspace for multi-modal data.  ... 
arXiv:1506.06628v2 fatcat:vbnedfrefzdldk67prpojzl2v4

A Comprehensive Survey on Cross-modal Retrieval [article]

Kaiye Wang, Qiyue Yin, Wei Wang, Shu Wu, Liang Wang
2016 arXiv   pre-print
In this paper, we first review a number of representative methods for cross-modal retrieval and classify them into two main groups: 1) real-valued representation learning, and 2) binary representation  ...  To speed up cross-modal retrieval, a number of binary representation learning methods have been proposed to map different modalities of data into a common Hamming space.  ...  [74] propose a probabilistic latent factor model, called multimodal latent binary embedding (MLBE), to learn hash functions for cross-modal retrieval.  ... 
arXiv:1607.06215v1 fatcat:jfbmmlvzrvcmtmzezogzuxvvqu
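The binary-representation family the survey describes maps each modality into a shared Hamming space so that retrieval reduces to fast bit comparisons. A minimal sketch of that idea (not the MLBE model itself): hash one modality with a random projection, fit the other modality's projection by least squares so paired items receive matching codes, then retrieve by Hamming distance. All dimensions, the 32-bit code length, the toy data, and the alignment-by-regression step are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy paired data: 100 "image" vectors and their "text" counterparts.
# txt is a linear transform of img plus small noise, so true pairs
# should end up with near-identical binary codes.
n, d = 100, 64
img = rng.standard_normal((n, d))
A = rng.standard_normal((d, d))
txt = img @ A + 0.01 * rng.standard_normal((n, d))

n_bits = 32
W_img = rng.standard_normal((d, n_bits))           # random hashing directions
# Align the text projection so that txt @ W_txt approximates img @ W_img.
W_txt, *_ = np.linalg.lstsq(txt, img @ W_img, rcond=None)

img_codes = (img @ W_img) > 0                      # sign binarization
txt_codes = (txt @ W_txt) > 0

def hamming_nn(query_code, codes):
    """Index of the code with the smallest Hamming distance to the query."""
    return int(np.argmin(np.count_nonzero(codes != query_code, axis=1)))

# Text-to-image retrieval: each text code should hit its paired image.
hits = sum(hamming_nn(txt_codes[i], img_codes) == i for i in range(n))
print(f"top-1 text->image accuracy: {hits / n:.2f}")
```

Because the toy views are related almost exactly linearly, the learned codes agree on nearly every bit and top-1 retrieval is close to perfect; real cross-modal data needs the learned (often supervised) hash functions the survey catalogues.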

Semi-Supervised Cross-Modal Retrieval Based on Discriminative Comapping

Li Liu, Xiao Dong, Tianshi Wang
2020 Complexity  
Most cross-modal retrieval methods based on subspace learning focus only on learning the projection matrices that map different modalities to a common subspace and pay less attention to the retrieval task  ...  To address these two limitations and make full use of unlabelled data, we propose a novel semi-supervised method for cross-modal retrieval named modal-related retrieval based on discriminative comapping  ...  To solve the problems mentioned above, this paper proposes a novel semi-supervised joint learning framework for cross-modal retrieval by integrating common subspace learning, task-related learning  ... 
doi:10.1155/2020/1462429 fatcat:nvdtfct4crdozf5wa6fppdaaaq

Modality-Dependent Cross-Modal Retrieval Based on Graph Regularization

Guanhua Wang, Hua Ji, Dexin Kong, Na Zhang
2020 Mobile Information Systems  
Nowadays, the heterogeneity gap of different modalities is the key problem for cross-modal retrieval.  ...  In order to fully exploit the potential correlation of different modalities, we propose a cross-modal retrieval framework based on graph regularization and modality dependence (GRMD).  ...  Acknowledgments: This work was partially supported by the National Natural Science Foundation of China (grant nos. 61772322, 61572298, 61702310, and 61873151).  ... 
doi:10.1155/2020/4164692 fatcat:2ku7xp5x65bkhggvu7x7skspja

Cross-Modal Manifold Learning for Cross-modal Retrieval [article]

Sailesh Conjeti, Anees Kazi, Nassir Navab, Amin Katouzian
2016 arXiv   pre-print
This paper presents a new scalable algorithm for cross-modal similarity preserving retrieval in a learnt manifold space.  ...  The inter-, and intra-modality affinity matrices are then computed to reinforce original data skeleton using perturbed minimum spanning tree (pMST), and maximizing the affinity among similar cross-modal  ...  Regression for glioma assessment through cross-modal retrieval: We pose the retrieval task defined over a publicly available multi-protocol MR dataset for Glioma assessment (BraTS) [10] .  ... 
arXiv:1612.06098v1 fatcat:itleeic7qzenpn6m4diwgxswkq

Self-Paced Cross-Modal Subspace Matching

Jian Liang, Zhihang Li, Dong Cao, Ran He, Jingdong Wang
2016 Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval - SIGIR '16  
Then, we formulate the unsupervised cross-modal matching problem as a non-convex joint feature learning and data grouping problem.  ...  This paper proposes a Self-Paced Cross-Modal Subspace Matching (SCSM) method for unsupervised multimodal data.  ...  However, it continues to be the best algorithm for unsupervised cross-modal subspace learning.  ... 
doi:10.1145/2911451.2911527 dblp:conf/sigir/LiangLCHW16 fatcat:ogq5jwnxtvhlbkbbzbm7rzzgoa

A new approach to cross-modal multimedia retrieval

Nikhil Rasiwasia, Jose Costa Pereira, Emanuele Coviello, Gabriel Doyle, Gert R.G. Lanckriet, Roger Levy, Nuno Vasconcelos
2010 Proceedings of the international conference on Multimedia - MM '10  
The two hypotheses are studied in the context of the task of cross-modal document retrieval.  ...  The cross-modal model is also shown to outperform state-of-the-art image retrieval systems on a unimodal retrieval task.  ...  This leads to the schematic representation of Figure 2 , where CCA defines a common subspace (U) for cross-modal retrieval.  ... 
doi:10.1145/1873951.1873987 dblp:conf/mm/RasiwasiaPCDLLV10 fatcat:2qph2zemvfhz3n3jxpkeod3o5a
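The CCA-defined common subspace (U) from the entry above can be sketched in a few lines of NumPy: whiten each view via SVD, then take the SVD of the product of the whitened views; the singular values are the canonical correlations, and the projections map both modalities into the shared space. The toy data (a shared latent signal plus per-view noise) and all dimensions are illustrative assumptions, not the paper's wikipedia-article setup.

```python
import numpy as np

def cca(X, Y, k):
    """Classical (unregularized) CCA via per-view whitening + SVD.

    Returns projections Wx, Wy and the top-k canonical correlations;
    columns of X @ Wx and Y @ Wy are maximally correlated pairs.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    Ux, Sx, Vxt = np.linalg.svd(X, full_matrices=False)
    Uy, Sy, Vyt = np.linalg.svd(Y, full_matrices=False)
    U, S, Vt = np.linalg.svd(Ux.T @ Uy)            # canonical correlations
    Wx = Vxt.T @ np.diag(1.0 / Sx) @ U[:, :k]
    Wy = Vyt.T @ np.diag(1.0 / Sy) @ Vt.T[:, :k]
    return Wx, Wy, S[:k]

rng = np.random.default_rng(1)
z = rng.standard_normal((500, 2))                   # shared latent "semantics"
X = np.hstack([z, rng.standard_normal((500, 6))])   # "image" view
Y = np.hstack([z @ rng.standard_normal((2, 2)),     # "text" view
               rng.standard_normal((500, 4))])

Wx, Wy, corrs = cca(X, Y, k=2)
print("canonical correlations:", np.round(corrs, 3))
```

Since both toy views contain the latent signal exactly, the top canonical correlations come out near 1. In the retrieval setting of the paper, image and text features replace the toy views, and cross-modal ranking is done by distance between X @ Wx and Y @ Wy in the common subspace.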

Semantic Consistency Cross-modal Retrieval with Semi-supervised Graph Regularization

Gongwen Xu, Xiaomei Li, Zhijun Zhang
2020 IEEE Access  
INDEX TERMS Cross-modal retrieval, semi-supervised, graph regularization, subspace learning.  ...  Most of the existing cross-modal retrieval methods make use of labeled data to learn projection matrices for different modal data.  ...  Joint latent subspace learning and regression (JLSLR) [29] uses spectral regression when learning potential subspaces.  ... 
doi:10.1109/access.2020.2966220 fatcat:pqaj2o6hdzhczgbsr2dkzl7ssy

Multi-modal Mutual Topic Reinforce Modeling for Cross-media Retrieval

Yanfei Wang, Fei Wu, Jun Song, Xi Li, Yueting Zhuang
2014 Proceedings of the ACM International Conference on Multimedia - MM '14  
In principle, M³R is capable of simultaneously accomplishing the following two learning tasks: 1) modality-specific (e.g., image-specific or text-specific) latent topic learning; and 2) cross-modal  ...  Motivated by this task, we propose a supervised multi-modal mutual topic reinforce modeling (M³R) approach, which seeks to build a joint cross-modal probabilistic graphical model for discovering the  ...  For example, after the maximally correlated subspace of text and image features is obtained by CCA, logistic regression is employed for cross-media retrieval in [23].  ... 
doi:10.1145/2647868.2654901 dblp:conf/mm/WangWSLZ14 fatcat:rb65bjlp4zbjnh2qxc64x5w3da

Cross-modal Retrieval with Label Completion

Xing Xu, Fumin Shen, Yang Yang, Heng Tao Shen, Li He, Jingkuan Song
2016 Proceedings of the 2016 ACM on Multimedia Conference - MM '16  
Extensive experiments on two large-scale multi-modal datasets demonstrate the superiority of our model for both label completion and cross-modal retrieval over the state of the art.  ...  Most supervised cross-modal retrieval methods learn discriminative common subspaces that minimize the heterogeneity of different modalities by exploiting label information.  ...  To this end, many approaches have been proposed to learn a common latent subspace for cross-modal retrieval, where the projected features of different modalities are homogeneous and can be directly matched  ... 
doi:10.1145/2964284.2967231 dblp:conf/mm/XuS0SHS16 fatcat:crv6k5sm5vboznwqb7z3ure47a

Semantic convex matrix factorisation for cross-media retrieval

Yixian Fang, Yuwei Ren, Huaxiang Zhang
2019 IET Image Processing  
To address these problems, a semantic convex matrix factorisation subspace learning approach is proposed for cross-media retrieval between image and text.  ...  When utilising matrix factorisation to extract latent features for cross-media retrieval, semantic information may be lost in the process of factorisation.  ...  , 2017CXGC0703) and the Natural Science Foundation of Shandong China (nos.  ... 
doi:10.1049/iet-ipr.2018.5853 fatcat:sdtyhoiowbapvhwdwya6qvipt4
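Matrix-factorisation subspace learning of the kind this entry describes extracts a shared latent code Z such that each modality's feature matrix factors as V ≈ Z @ H. A loose sketch of the idea (plain joint truncated SVD over toy data, not the paper's semantic convex factorisation; every size and name here is invented): the least-squares-optimal shared Z for the concatenated views comes from a rank-k SVD.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy paired features that share a low-rank latent structure.
n, k = 200, 5
Z_true = rng.standard_normal((n, k))
X = Z_true @ rng.standard_normal((k, 40)) + 0.05 * rng.standard_normal((n, 40))
Y = Z_true @ rng.standard_normal((k, 30)) + 0.05 * rng.standard_normal((n, 30))

# Joint factorisation [X | Y] ~ Z @ [Hx | Hy]: the shared latent code Z
# minimizing total squared error is given by a truncated SVD.
C = np.hstack([X, Y])
mean = C.mean(axis=0)
U, S, Vt = np.linalg.svd(C - mean, full_matrices=False)
Z = U[:, :k] * S[:k]          # shared latent features, one row per image/text pair

recon = Z @ Vt[:k] + mean
err = np.linalg.norm(C - recon) / np.linalg.norm(C)
print(f"relative reconstruction error: {err:.3f}")
```

Cross-media retrieval then matches rows of Z directly, since both modalities contribute to the same latent coordinates; the paper's contribution is keeping semantic (label) information from being lost during the factorisation, which this plain SVD sketch does not attempt.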

On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval

Jose Costa Pereira, Emanuele Coviello, Gabriel Doyle, Nikhil Rasiwasia, Gert R. G. Lanckriet, Roger Levy, Nuno Vasconcelos
2014 IEEE Transactions on Pattern Analysis and Machine Intelligence  
All approaches are shown successful for text retrieval in response to image queries and vice versa.  ...  A mathematical formulation is proposed, equating the design of cross-modal retrieval systems to that of isomorphic feature spaces for different content modalities.  ...  This can be done by learning a joint probability distribution for words and visual features, for example, using latent Dirichlet allocation (LDA) models [14] , probabilistic latent semantic analysis (  ... 
doi:10.1109/tpami.2013.142 pmid:24457508 fatcat:nnzkvhf4l5f4rb2kxgqt5banfe

Cross-modal Subspace Learning via Kernel Correlation Maximization and Discriminative Structure Preserving [article]

Jun Yu, Xiao-Jun Wu
2020 arXiv   pre-print
In this paper, we propose a novel framework, termed Cross-modal subspace learning via Kernel correlation maximization and Discriminative structure-preserving (CKD), to solve this problem in two aspects  ...  However, most existing works focus on learning a latent subspace, and the semantically structural information is not well preserved. Thus, these approaches cannot achieve the desired results.  ...  [6] [7] [8] The other is to learn a latent common subspace where the distance between data of different modalities can be measured, which is also termed cross-modal subspace learning.  ... 
arXiv:1904.00776v3 fatcat:5cj2yphukfeqddvilipkwnuphe
Showing results 1 — 15 out of 626 results