Multi-Label Learning With Fused Multimodal Bi-Relational Graph
2014
IEEE transactions on multimedia
Experimental results with our proposed method on two standard multi-label image datasets are very promising. Index Terms: graph-based semi-supervised learning, multi-label classification, multimodal. ...
Such a representation allows for effective exploitation of both feature complementariness and label correlation. This contrasts with previous work where these two factors are considered in isolation. ...
Ziyu Guan for his helpful discussions. ...
doi:10.1109/tmm.2013.2291218
fatcat:icqd2ejzpnf5rff5kfdkeuvv7u
Learning to name faces
2013
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '13
their weak labels for naming the query facial image. ...
major components: (i) we enhance the weak labels of top-ranked similar images by exploiting the "label smoothness" assumption; (ii) we construct the multimodal representations of a facial image by extracting ...
Algorithm for Learning to Name Faces. In the above, we separately discuss the three key factors that affect the final annotation result of the proposed SBFA framework, including the label matrix Y, the ...
doi:10.1145/2484028.2484040
dblp:conf/sigir/WangHWZ0M13
fatcat:qiaiak4sivaqfmracgmnejmbza
Multimodal representation, indexing, automated annotation and retrieval of image collections via non-negative matrix factorization
2012
Neurocomputing
This paper presents a novel method based on non-negative matrix factorization to generate multimodal image representations that integrate visual features and text information. ...
The proposed approach discovers a set of latent factors that correlate multimodal data in the same representation space. ...
Two main requirements are herein considered to approximate the matrix factorization in Equation 2. ...
doi:10.1016/j.neucom.2011.04.037
fatcat:6joubook3jd5zljqxj34thjbna
Online Matrix Factorization for Space Embedding Multilabel Annotation
[chapter]
2013
Lecture Notes in Computer Science
The paper presents an online matrix factorization algorithm for multilabel learning. ...
This method addresses the multi-label annotation problem finding a joint embedding that represents both instances and labels in a common latent space. ...
Radiológicas Usando Semántica Latente", "Diseño e implementación de un sistema de cómputo sobre recursos heterogéneos para la identificación de estructuras atmosféricas en predicción climatológica" and LACCIR "Multimodal ...
doi:10.1007/978-3-642-41822-8_43
fatcat:mfeovm7pnneetmmit5hj2rxlkq
Detection of Illicit Drug Trafficking Events on Instagram: A Deep Multimodal Multilabel Learning Approach
[article]
2021
arXiv
pre-print
We have constructed a large-scale dataset MM-IDTE with manually annotated multiple drug labels to support fine-grained detection of illicit drugs. ...
Specifically, our model takes text and image data as the input and combines multimodal information to predict multiple labels of illicit drugs. ...
correlations [48], to exploiting label correlations for multi-label learning. ...
arXiv:2108.08920v1
fatcat:k6adlinv7baapkzbogr3vqzura
Multimodal Metric Learning for Tag-based Music Retrieval
[article]
2020
arXiv
pre-print
Also, metric learning has already proven its suitability for cross-modal retrieval tasks in other domains (e.g., text-to-image) by jointly learning a multimodal embedding space. ...
In this paper, we investigate three ideas to successfully introduce multimodal metric learning for tag-based music retrieval: elaborate triplet sampling, acoustic and cultural music information, and domain-specific ...
This metric learning model with side information demonstrated its versatility in multi-label zero-shot annotation and retrieval tasks. ...
arXiv:2010.16030v1
fatcat:opjfd2xoc5avnhjgkmlf7e3f3u
Video Captioning with Guidance of Multimodal Latent Topics
2017
Proceedings of the 2017 ACM on Multimedia Conference - MM '17
For the topic prediction task, we use the mined topics as the teacher to train a student topic prediction model, which learns to predict the latent topics from multimodal contents of videos. ...
We formulate the topic-aware caption generation as a multi-task learning problem, in which we add a parallel task, topic prediction, in addition to the caption task. ...
So the 3-way factorization method [16, 24] is used to share parameters. ...
doi:10.1145/3123266.3123420
dblp:conf/mm/ChenCJH17
fatcat:st3ogxnthbczhnr7kygbgf7psu
Multimodal Co-learning: Challenges, Applications with Datasets, Recent Advances and Future Directions
[article]
2021
arXiv
pre-print
However, in real-world tasks, it is typically observed that one or more modalities are missing, noisy, lacking annotated data, have unreliable labels, or are scarce in training, testing, or both ...
Our final goal is to discuss challenges and perspectives along with the important ideas and directions for future work that we hope to be beneficial for the entire research community focusing on this exciting ...
One way to create a multimodal embedding is to have a projection of aligned multiple modalities data into a common sub-space governed by a similarity matrix. ...
arXiv:2107.13782v2
fatcat:s4spofwxjndb7leqbcqnwbifq4
Multi modal semantic indexing for image retrieval
2010
Proceedings of the ACM International Conference on Image and Video Retrieval - CIVR '10
In this paper, we propose two techniques: Multi-modal Latent Semantic Indexing (MMLSI) and Multi-Modal Probabilistic Latent Semantic Analysis (MMpLSA). ...
The experimental results demonstrate an improved accuracy over other single and multi-modal methods. ...
This is a naive way of managing multimode data. The disadvantages include shadowing of one mode by another by factors that include dictionary size, distribution etc. ...
doi:10.1145/1816041.1816091
dblp:conf/civr/PullaJ10
fatcat:zv5chquwufazxcocghtrt5hnpu
Logically at Factify 2022: Multimodal Fact Verification
[article]
2022
arXiv
pre-print
This paper describes our participant system for the multi-modal fact verification (Factify) challenge at AAAI 2022. ...
Finally, we highlight challenges of the task and multimodal dataset for future research. ...
Thus, the supported maximum sequence length and the optimum document context size are two of the key factors to be considered. ...
arXiv:2112.09253v2
fatcat:cn4xj4dcybcgrb3clpufkxepmq
Video Captioning with Guidance of Multimodal Latent Topics
[article]
2017
arXiv
pre-print
For the topic prediction task, we use the mined topics as the teacher to train a student topic prediction model, which learns to predict the latent topics from multimodal contents of videos. ...
We formulate the topic-aware caption generation as a multi-task learning problem, in which we add a parallel task, topic prediction, in addition to the caption task. ...
So the 3-way factorization method [16, 24] is used to share parameters. ...
arXiv:1708.09667v2
fatcat:pf5ybcxzhnfufmasqctsfe3xtu
Affective Computing for Large-scale Heterogeneous Multimedia Data
2019
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
... images, music, videos, and multimodal data, with the focus on both handcrafted features-based methods and deep learning methods. ...
We briefly describe the available datasets for evaluating AC algorithms. ...
Multimodal fusion can be done in model-based and model-agnostic ways. ...
doi:10.1145/3363560
fatcat:m56udtjlxrauvmj6d5z2r2zdeu
Knowledge Extraction And Representation Learning For Music Recommendation And Classification
2017
Zenodo
Next, we focus on learning new data representations from multimodal content using deep learning architectures, addressing the problems of cold-start music recommendation and multi-label music genre classification ...
To this end, we first focus on the problem of linking music-related texts with online knowledge repositories and on the automated construction of music knowledge bases. ...
Labels factorization. Let M be the binary matrix of items I and labels L, where m_ij = 1 if item i_i is annotated with label l_j and m_ij = 0 otherwise. ...
doi:10.5281/zenodo.1048497
fatcat:kdh5jhvocbh3riwln6n2f756su
Knowledge Extraction And Representation Learning For Music Recommendation And Classification
2017
Zenodo
Next, we focus on learning new data representations from multimodal content using deep learning architectures, addressing the problems of cold-start music recommendation and multi-label music genre classification ...
To this end, we first focus on the problem of linking music-related texts with online knowledge repositories and on the automated construction of music knowledge bases. ...
Labels factorization. Let M be the binary matrix of items I and labels L, where m_ij = 1 if item i_i is annotated with label l_j and m_ij = 0 otherwise. ...
doi:10.5281/zenodo.1100973
fatcat:yfpmc6qxbbakjp6qzvywyoaoci
Large Scale Image Indexing Using Online Non-negative Semantic Embedding
[chapter]
2013
Lecture Notes in Computer Science
This paper presents a novel method to address the problem of indexing a large set of images taking advantage of associated multimodal content such as text or tags. ...
The principal advantage of the proposed method is its formulation as an online learning algorithm, which can scale to deal with large image collections. ...
[6] propose multimodal matrix factorization algorithms based on SGD to decompose a training data set, and find correspondences between visual patterns and text terms in large image collection. ...
doi:10.1007/978-3-642-41822-8_46
fatcat:cnsejphfuzgmtmmpnejkjwo5ya