17,244 Hits in 5.4 sec

Multi-Modal Knowledge Representation Learning via Webly-Supervised Relationships Mining

Fudong Nian, Bing-Kun Bao, Teng Li, Changsheng Xu
2017 Proceedings of the 2017 ACM on Multimedia Conference - MM '17  
automatically. (2) It is able to learn a common knowledge space which is independent of both task and modality by the proposed Bi-enhanced Cross-modal Deep Neural Network (BC-DNN). (3) It has the ability ... webly-supervised multi-modal relationship mining, and bi-enhanced cross-modal knowledge representation learning. ... Secondly, MM-KRL is beneficial for many downstream multi-modal tasks, such as cross-modal retrieval, visual relationship recognition and image/video captioning. ...
doi:10.1145/3123266.3123443 dblp:conf/mm/NianBLX17 fatcat:e5wyg4iykzgexcb6vohf2okshm

Cross-modal Retrieval with Label Completion

Xing Xu, Fumin Shen, Yang Yang, Heng Tao Shen, Li He, Jingkuan Song
2016 Proceedings of the 2016 ACM on Multimedia Conference - MM '16  
We thus formulate the subspace learning problem as a coregularized learning framework based on multi-modal features and incomplete labels.  ...  Cross-modal retrieval has been attracting increasing attention because of the explosion of multi-modal data, e.g., texts and images.  ...  We incorporate the three criteria simultaneously and integrate the label completion and cross-modal retrieval into a joint learning framework.  ... 
doi:10.1145/2964284.2967231 dblp:conf/mm/XuS0SHS16 fatcat:crv6k5sm5vboznwqb7z3ure47a

Self-supervised Feature Learning by Cross-modality and Cross-view Correspondences [article]

Longlong Jing, Yucheng Chen, Ling Zhang, Mingyi He, Yingli Tian
2020 arXiv   pre-print
both 2D image features and 3D point cloud features by exploiting cross-modality and cross-view correspondences without using any human annotated labels. ... The effectiveness of the learned 2D and 3D features is evaluated by transferring them on five different tasks including multi-view 2D shape recognition, 3D shape recognition, multi-view 2D shape retrieval ... Hassani et al. proposed a multi-task learning framework to learn features by optimizing three different tasks including clustering, prediction, and reconstruction [14]. ...
arXiv:2004.05749v1 fatcat:fbpilwf3hjaxxdpulizzwjmnfa

From Intra-Modal to Inter-Modal Space: Multi-task Learning of Shared Representations for Cross-Modal Retrieval

Jaeyoung Choi, Martha Larson, Gerald Friedland, Alan Hanjalic
2019 2019 IEEE Fifth International Conference on Multimedia Big Data (BigMM)  
We propose a two-stage shared representation learning framework with intra-modal optimization and subsequent cross-modal transfer learning of semantic structure that produces a robust shared representation  ...  We integrate multi-task learning into each step, making it possible to leverage multiple datasets, annotated with different concepts, as if they were one large dataset.  ...  We integrate multi-task learning into our proposed framework to leverage multiple data sets as if they were one large data set, in order to learn a robust joint semantic representation for video, image  ... 
doi:10.1109/bigmm.2019.00-48 dblp:conf/bigmm/ChoiLFH19 fatcat:kvpmsqqadfaznab2wp7rqih5n4

Cross-Modality Deep Feature Learning for Brain Tumor Segmentation

Dingwen Zhang, Guohai Huang, Qiang Zhang, Jungong Han, Junwei Han, Yizhou Yu
2020 Pattern Recognition  
To this end, this paper proposes a novel cross-modality deep feature learning framework to segment brain tumors from the multi-modality MRI data.  ...  The proposed cross-modality deep feature learning framework consists of two learning processes: the cross-modality feature transition (CMFT) process and the cross-modality feature fusion (CMFF) process  ...  , and the China Postdoctoral Support Scheme for Innovative Talents under Grant BX20180236.  ... 
doi:10.1016/j.patcog.2020.107562 fatcat:6lb4es3v3ngwdaenjo3x42e3he

Advancing Medical Imaging Informatics by Deep Learning-Based Domain Adaptation

Anirudh Choudhary, Li Tong, Yuanda Zhu, May D. Wang
2020 IMIA Yearbook of Medical Informatics  
, image modality, and learning scenarios.  ...  There has been a rapid development of deep learning (DL) models for medical imaging. However, DL requires a large labeled dataset for training the models.  ...  Also alternative multi-modal frameworks such as MUNIT [30] can be explored.  ... 
doi:10.1055/s-0040-1702009 pmid:32823306 fatcat:gtlhoh6m3fh4hcumfzdlpdohr4

Multi-label Cross-Modal Retrieval

Viresh Ranjan, Nikhil Rasiwasia, C. V. Jawahar
2015 2015 IEEE International Conference on Computer Vision (ICCV)  
This results in a discriminative subspace which is better suited for cross-modal retrieval tasks.  ...  In this work, we address the problem of cross-modal retrieval in presence of multi-label annotations.  ...  In this work, we propose Multi-Label Canonical Correlation Analysis (ml-CCA), a cross-modal retrieval framework for multi-label datasets. ml-CCA utilizes multi-label information while learning a common  ... 
doi:10.1109/iccv.2015.466 dblp:conf/iccv/RanjanRJ15 fatcat:fozcl5x2ojg43ceyaeyoqdfztq
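The ml-CCA entry above extends canonical correlation analysis to multi-label data. As a point of reference, here is a minimal sketch of plain CCA, the classical algorithm being extended, not the paper's multi-label variant; the toy "image"/"text" features and the small `reg` ridge term are illustrative assumptions.

```python
import numpy as np

def cca(X, Y, k=1, reg=1e-6):
    """Plain canonical correlation analysis via whitening + SVD.

    Returns projection matrices Wx, Wy and the top-k canonical
    correlations. `reg` adds a small ridge so the covariance
    inverses stay stable on small samples.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n

    def inv_sqrt(S):
        # symmetric inverse square root via eigendecomposition
        vals, vecs = np.linalg.eigh(S)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    return Kx @ U[:, :k], Ky @ Vt[:k].T, s[:k]

# Toy "image" and "text" features sharing one latent signal.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = z @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(500, 5))
Y = z @ rng.normal(size=(1, 4)) + 0.1 * rng.normal(size=(500, 4))
Wx, Wy, corrs = cca(X, Y, k=1)
print(round(float(corrs[0]), 2))  # close to 1 for a strongly shared signal
```

The canonical correlation approaches 1 when the two feature sets share a strong latent signal, which is the property cross-modal retrieval methods in this subspace family exploit.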

Self-supervised Contrastive Video-Speech Representation Learning for Ultrasound [article]

Jianbo Jiao, Yifan Cai, Mohammad Alsharid, Lior Drukker, Aris T. Papageorghiou, J. Alison Noble
2020 arXiv   pre-print
Within this framework, we introduce cross-modal contrastive learning and an affinity-aware self-paced learning scheme to enhance correlation modelling.  ...  Experimental evaluations on multi-modal fetal ultrasound video and audio show that the proposed approach is able to learn strong representations and transfers well to downstream tasks of standard plane  ...  Conclusion In this paper, we propose a self-supervised representation learning framework for ultrasound video-speech multi-modal data.  ... 
arXiv:2008.06607v1 fatcat:vmss2jhwi5e3pal3le323wifxy
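Several entries in this listing, including the one above, build on cross-modal contrastive learning. Below is a hedged numpy sketch of the generic symmetric InfoNCE objective such frameworks typically start from; the batch size, temperature, and toy "video"/"speech" features are illustrative assumptions, not the paper's actual training setup.

```python
import numpy as np

def info_nce(a, b, temperature=0.1):
    """Symmetric InfoNCE: a[i] and b[i] form a positive pair,
    every other pairing in the batch acts as a negative."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature          # (n, n) cosine similarities
    labels = np.arange(len(a))

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)            # stabilise
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()             # positives on diagonal

    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(1)
video = rng.normal(size=(8, 16))
speech_aligned = video + 0.05 * rng.normal(size=(8, 16))  # paired modality
speech_shuffled = speech_aligned[rng.permutation(8)]      # broken pairing
print(info_nce(video, speech_aligned) < info_nce(video, speech_shuffled))
```

Correctly paired batches yield a much lower loss than shuffled ones, which is what drives the two encoders toward a shared embedding space.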

Sentiment and Emotion-Aware Multi-Modal Complaint Identification

Apoorva Singh, Soumyodeep Dey, Anamitra Singha, Sriparna Saha
2022 Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence  
We present an attention-based multi-modal, adversarial multi-task deep neural network model for complaint detection to demonstrate the utility of the multi-modal dataset.  ...  Experimental results indicate that the multi-modality and multi-tasking complaint identification outperforms uni-modal and single-task variants.  ...  multi-task adversarial learning framework for multi-modal complaint, emotion, and sentiment analysis.  ... 
doi:10.1609/aaai.v36i11.21476 fatcat:t2idhnb6vzbphhijyejxfj3lcu

A Unified Continuous Learning Framework for Multi-modal Knowledge Discovery and Pre-training [article]

Zhihao Fan, Zhongyu Wei, Jingjing Chen, Siyuan Wang, Zejun Li, Jiarong Xu, Xuanjing Huang
2022 arXiv   pre-print
For knowledge discovery, a pre-trained model is used to identify cross-modal links on the graph. ... Taking the open-domain uni-modal datasets of images and texts as input, we maintain a knowledge graph as the foundation to support these two tasks. ... Conclusion: In this paper, we propose a unified continuous learning framework for multi-modal knowledge discovery and pre-training based ...
arXiv:2206.05555v1 fatcat:wick43nxbrf5vdqb67kthnk7fq

M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining [article]

Xiao Dong, Xunlin Zhan, Yangxin Wu, Yunchao Wei, Michael C. Kampffmeyer, Xiaoyong Wei, Minlong Lu, Yaowei Wang, Xiaodan Liang
2022 arXiv   pre-print
, where the importance of each modality is learned directly from the modality embeddings and impacts the inter-modality contrastive learning and masked tasks within a multi-modal transformer model.  ...  We further propose Self-harmonized ContrAstive LEarning (SCALE), a novel pretraining framework that integrates the different modalities into a unified model through an adaptive feature fusion mechanism  ...  Specifically, we present human annotators with a matching task, where annotators are asked to select the matching image-text pairs for a given query image-text pair.  ... 
arXiv:2109.04275v5 fatcat:h5ngubiktfgr3plnx4l4bjz35i

Multi-modal Misinformation Detection: Approaches, Challenges and Opportunities [article]

Sara Abdali
2022 arXiv   pre-print
Thus, many research efforts have been put into development of automatic techniques for detecting possible cross-modal discordances in web-based media.  ...  have recently targeted contextual correlations between modalities e.g., text and image.  ...  However, in recent years, due to the sheer use of multi-modal platforms, many automated techniques for multi-modal tasks such as Visual Question Answering (VQA) [4, 26, 28, 33, 73] , image captioning  ... 
arXiv:2203.13883v3 fatcat:ari4onbo45ejfnwdnjgdti5daq

A Multi-modal and Multi-task Learning Method for Action Unit and Expression Recognition [article]

Yue Jin, Tianqing Zheng, Chao Gao, Guoqiang Xu
2021 arXiv   pre-print
In this paper, we introduce a multi-modal and multi-task learning method by using both visual and audio information. ... We use both AU and expression annotations to train the model and apply a sequence model to further extract associations between video frames. ... Figure 2: A multi-modal framework. Figure 3: A multi-task visual stream training framework. ...
arXiv:2107.04187v2 fatcat:n4zv7nuggvfgdgtuhhfqieqxcu

Towards Cross-Modality Medical Image Segmentation with Online Mutual Knowledge Distillation

Kang Li, Lequan Yu, Shujun Wang, Pheng-Ann Heng
2020 Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence  
other state-of-the-art multi-modality learning methods.  ...  To alleviate the learning difficulties caused by modality-specific appearance discrepancy, we first present an Image Alignment Module (IAM) to narrow the appearance gap between assistant and target modality  ...  Moeskops et al. (2016) investigated how to utilize multi-modality information under multi-task learning frameworks.  ... 
doi:10.1609/aaai.v34i01.5421 fatcat:4w3zi75kbzgznkz6acsuayvlte
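The entry above trains modality-specific segmentation networks with online mutual knowledge distillation. As a generic illustration, not the paper's exact loss, the core ingredient can be sketched as a symmetric temperature-softened KL term between two networks' logits; the temperature value and toy logits below are assumptions.

```python
import numpy as np

def softmax(z, T):
    """Temperature-softened softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mutual_kd(logits_a, logits_b, T=2.0):
    """Symmetric KL between two networks' softened predictions,
    scaled by T^2 as is conventional for distillation losses."""
    pa, pb = softmax(logits_a, T), softmax(logits_b, T)
    kl = lambda p, q: (p * (np.log(p) - np.log(q))).sum(axis=-1).mean()
    return 0.5 * (kl(pa, pb) + kl(pb, pa)) * T * T

rng = np.random.default_rng(0)
seg_a = rng.normal(size=(6, 3))                   # e.g. one branch's class logits
seg_b = seg_a + 0.01 * rng.normal(size=(6, 3))    # a near-agreeing second branch
print(mutual_kd(seg_a, seg_a))  # identical predictions -> 0.0
```

Because both directions of the KL are minimised jointly, each network acts as both teacher and student, which is what "online mutual" distillation refers to in this line of work.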

Towards Cross-modality Medical Image Segmentation with Online Mutual Knowledge Distillation [article]

Kang Li, Lequan Yu, Shujun Wang, Pheng-Ann Heng
2020 arXiv   pre-print
other state-of-the-art multi-modality learning methods.  ...  To alleviate the learning difficulties caused by modality-specific appearance discrepancy, we first present an Image Alignment Module (IAM) to narrow the appearance gap between assistant and target modality  ...  (2016) investigated how to utilize multi-modality information under multi-task learning frameworks. Recently, Valindria et al.  ... 
arXiv:2010.01532v1 fatcat:ipaxqvcjpne3jf44yyin7ywxxe
Showing results 1 — 15 out of 17,244 results