Survey on Deep Multi-modal Data Analytics: Collaboration, Rivalry and Fusion [article]

Yang Wang
2020 arXiv   pre-print
With the development of web technology, multi-modal or multi-view data has surged as a major stream for big data, where each modal/view encodes individual property of data objects.  ...  Most of the existing state-of-the-art focused on how to fuse the energy or information from multi-modal spaces to deliver a superior performance over their counterparts with single modal.  ...  Among these, cross-modal retrieval is a hot research topic attracting tremendous attention. To deal with the problem of cross-modal retrieval, Wang et al.  ... 
arXiv:2006.08159v1 fatcat:g4467zmutndglmy35n3eyfwxku

Semantic Modeling of Textual Relationships in Cross-Modal Retrieval [article]

Jing Yu, Chenghao Yang, Zengchang Qin, Zhuoqian Yang, Yue Hu and Weifeng Zhang
2019 arXiv   pre-print
Feature modeling of different modalities is a basic problem in current research of cross-modal information retrieval.  ...  A dual-path neural network is adopted to learn multi-modal representations of information and cross-modal similarity measure jointly.  ...  INTRODUCTION Cross-modal information retrieval (CMIR), which enables queries from one modality to retrieve information in another, plays an increasingly important role in intelligent searching and recommendation  ... 
arXiv:1810.13151v3 fatcat:3fc62dkndvde3g5ilp32sbfybi

HERO: HiErarchical spatio-tempoRal reasOning with Contrastive Action Correspondence for End-to-End Video Object Grounding [article]

Mengze Li and Tianbao Wang and Haoyu Zhang and Shengyu Zhang and Zhou Zhao and Wenqiao Zhang and Jiaxu Miao and Shiliang Pu and Fei Wu
2022 arXiv   pre-print
Furthermore, our proposed pyramid and shifted alignment mechanisms are effective to improve the cross-modal information utilization of neighborhood spatial regions and temporal frames.  ...  In this paper, we tackle this task by a novel framework called HiErarchical spatio-tempoRal reasOning (HERO) with contrastive action correspondence.  ...  They fuse multi-modal information step by step from different semantics and spatio-temporal levels.  ... 
arXiv:2208.05818v1 fatcat:cq7mh2dl5bdbfhwsj74buty2k4

Multi-modal Memory Enhancement Attention Network for Image-Text Matching

Zhong Ji, Zhigang Lin, Haoran Wang, Yuqing He
2020 IEEE Access  
by constructing a Multi-Modal Memory Enhancement (M3E) module.  ...  Specifically, it sequentially restores the intra-modal and multimodal information into the memory items, and they conversely persistently memorize cross-modal shared semantics to improve the latent embeddings  ...  Then, the multi-modal memory vectors are built via fusing the formers to capture the multi-modal semantics.  ... 
doi:10.1109/access.2020.2975594 fatcat:ciiubythzzevpkw2ip5csnjwf4

Self-supervised asymmetric deep hashing with margin-scalable constraint [article]

Zhengyang Yu, Song Wu, Zhihao Dou, Erwin M. Bakker
2021 arXiv   pre-print
SADH implements a self-supervised network to sufficiently preserve semantic information in a semantic feature dictionary and a semantic code dictionary for the semantics of the given dataset, which efficiently  ...  and precisely guides a feature learning network to preserve multilabel semantic information using an asymmetric learning strategy.  ...  With a novel asymmetric guidance mechanism, rich semantic information preserved by Semantic-Network can be seamlessly transferred to Image-Network, which can ensure that the global semantic relevance can  ... 
arXiv:2012.03820v3 fatcat:fscm4ggdyrct3o6kso53mmriou

Research on Cross-media Science and Technology Information Data Retrieval [article]

Yang Jiang and Zhe Xue and Ang Li
2022 arXiv   pre-print
Therefore, in view of the above research background, it is of profound practical significance to study the cross-media science and technology information data retrieval system based on deep semantic features  ...  Since the era of big data, the Internet has been flooded with all kinds of information. Browsing information through the Internet has become an integral part of people's daily life.  ...  However, due to the characteristics of multi-source and multi-modal information of cross-media technology information data, how to design a unified collection, filtering, storage and processing process  ... 
arXiv:2204.04887v1 fatcat:l3uq7stdjng3xhujllhe5pvdmu

Multi-modal Mutual Topic Reinforce Modeling for Cross-media Retrieval

Yanfei Wang, Fei Wu, Jun Song, Xi Li, Yueting Zhuang
2014 Proceedings of the ACM International Conference on Multimedia - MM '14  
As an important and challenging problem in the multimedia area, multi-modal data understanding aims to explore the intrinsic semantic information across different modalities in a collaborative manner.  ...  Motivated by this task, we propose a supervised multi-modal mutual topic reinforce modeling (M³R) approach, which seeks to build a joint cross-modal probabilistic graphical model for discovering the  ...  for Multi-Modal retrieval (SliM²) [31]: SliM² is a supervised dictionary learning approach with group structures utilizing the class information to jointly learn discriminative multi-modal dictionaries  ... 
doi:10.1145/2647868.2654901 dblp:conf/mm/WangWSLZ14 fatcat:rb65bjlp4zbjnh2qxc64x5w3da

Special issue on multimedia recommendation and multi-modal data analysis

Xiangnan He, Zhenguang Liu, Hanwang Zhang, Chong-Wah Ngo, Svebor Karaman, Yongfeng Zhang
2019 Multimedia Systems  
multi-modal 3D model recognition.  ...  Global information (i.e., detected attributes) and local information (i.e., appearance features) extracted from the video are added as extra inputs to each cell of LSTM, with the aim of collaboratively  ... 
doi:10.1007/s00530-019-00639-3 fatcat:z2qpdombobf4zonjjpe5pvfqrm

Search for Multi-modality Data in Digital Libraries [chapter]

Jun Yang, Yueting Zhuang, Qing Li
2001 Lecture Notes in Computer Science  
Unlike most previously proposed retrieval approaches that focus on a specific media type, this paper presents 2M2Net as a seamless integration framework for retrieval of multi-modality data in digital  ...  As its specific approaches, a learning-from-elements strategy is devised for propagation of semantic descriptions, and a cross media search mechanism with relevance feedback is proposed for evaluation and  ...  It features a learning-from-elements strategy for propagation of the semantic descriptions, as well as a cross media search mechanism that is tailored to multi-modality data.  ... 
doi:10.1007/3-540-45453-5_62 fatcat:4oz3qzdxpzdc5o2uchpsszuda4

Deep Multi-level Semantic Hashing for Cross-modal Retrieval

Zhenyan Ji, Weina Yao, Wei Wei, Houbing Song, Huaiyu Pi
2019 IEEE Access  
And a deep hashing framework is designed for multi-label image-text cross retrieval tasks.  ...  However, few methods have explored the rich semantic information implicit in multi-label data to improve the accuracy of searching results.  ...  CONCLUSION In this paper, we propose a deep multi-level semantic hashing method for cross-modal retrieval.  ... 
doi:10.1109/access.2019.2899536 fatcat:xynopqlgyfhe3ef6su55zqczim

A Comprehensive Survey on Cross-modal Retrieval [article]

Kaiye Wang, Qiyue Yin, Wei Wang, Shu Wu, Liang Wang
2016 arXiv   pre-print
Various methods have been proposed to deal with such a problem.  ...  To speed up the cross-modal retrieval, a number of binary representation learning methods are proposed to map different modalities of data into a common Hamming space.  ...  [70] propose a multi-view hashing algorithm for cross-modal retrieval, called Iterative Multi-View Hashing (IMVH).  ... 
arXiv:1607.06215v1 fatcat:jfbmmlvzrvcmtmzezogzuxvvqu

Correlation-based Feature Analysis and Multi-Modality Fusion framework for multimedia semantic retrieval

Hsin-Yu Ha, Yimin Yang, Fausto C. Fleites, Shu-Ching Chen
2013 2013 IEEE International Conference on Multimedia and Expo (ICME)  
In this paper, we propose a Correlation based Feature Analysis (CFA) and Multi-Modality Fusion (CFA-MMF) framework for multimedia semantic concept retrieval.  ...  A correlation matrix is built upon feature pair correlations, and then a MaxST is constructed based on the correlation matrix.  ...  RELATED WORK The related works in the area of multimedia semantic retrieval can be roughly summarized into (1) uni-modality based approaches and (2) multi-modality based approaches, from an information-fusion  ... 
doi:10.1109/icme.2013.6607639 dblp:conf/icmcs/HaYFC13 fatcat:yivpfv2eineylcddtrxgrc3vde

Modality-dependent Cross-media Retrieval [article]

Yunchao Wei, Yao Zhao, Zhenfeng Zhu, Shikui Wei, Yanhui Xiao, Jiashi Feng, Shuicheng Yan
2015 arXiv   pre-print
Different from previous works, we propose a modality-dependent cross-media retrieval (MDCR) model, where two couples of projections are learned for different cross-media retrieval tasks instead of one  ...  Specifically, by jointly optimizing the correlation between images and text and the linear regression from one modal space (image or text) to the semantic space, two couples of mappings are learned to  ...  Different from [Gong et al. 2013] which incorporates the semantic information as a third view, in this paper, semantic information is employed to determine a common latent space with a fixed dimension  ... 
arXiv:1506.06628v2 fatcat:vbnedfrefzdldk67prpojzl2v4

Attention-Aware Deep Adversarial Hashing for Cross-Modal Retrieval [chapter]

Xi Zhang, Hanjiang Lai, Jiashi Feng
2018 Lecture Notes in Computer Science  
Due to the rapid growth of multi-modal data, hashing methods for cross-modal retrieval have received considerable attention.  ...  To further address this problem, we propose an adversarial hashing network with an attention mechanism to enhance the measurement of content similarities by selectively focusing on the informative parts  ...  For instance, cross-view hashing (CVH) [27] extends spectral hashing from uni-modal to multi-modal scenarios.  ... 
doi:10.1007/978-3-030-01267-0_36 fatcat:ztkjxybu6nd2tgcm636lto2swm

A Review of Hashing Methods for Multimodal Retrieval

Wenming Cao, Wenshuo Feng, Qiubin Lin, Guitao Cao, Zhihai He
2020 IEEE Access  
With the advent of the information age, the amount of multimedia data has exploded. That makes fast and efficient retrieval in multimodal data become an urgent requirement.  ...  INDEX TERMS Multimedia, multimodal retrieval, hashing method, deep learning, reviews. VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License.  ...  MULTI/CROSS-MODAL METHODS 1) CROSS-VIEW HASHING (CVH) [34] & INTER-MEDIA HASHING (IMH) [35] The Cross-View Hashing is an extension of the spectral hashing.  ... 
doi:10.1109/access.2020.2968154 fatcat:e3vmte5hrnhu3b3lf5ws4gwnhm