Cross-media Event Extraction and Recommendation
2016
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations
We have developed a comprehensive system that searches, identifies, organizes and summarizes complex events from multiple data modalities. ...
It also recommends events related to the user's ongoing search based on previously selected attribute values and dimensions of events being viewed. ...
Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon. ...
doi:10.18653/v1/n16-3015
dblp:conf/naacl/LuVTRGKZWLCJCHW16
fatcat:kxehxhclqzacpa6rtxijgqgsqy
On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval
2014
IEEE Transactions on Pattern Analysis and Machine Intelligence
All approaches are shown successful for text retrieval in response to image queries and vice versa. ...
This problem addresses the design of retrieval systems that support queries across content modalities, for example, using an image to search for texts. ...
For example, in [29] a query text and in [30] a query image are used to retrieve similar text documents and images, respectively, based on low-level text (e.g., words) and image (e.g., DCTs) representations ...
doi:10.1109/tpami.2013.142
pmid:24457508
fatcat:nnzkvhf4l5f4rb2kxgqt5banfe
Cross-Modal Information Retrieval – A Case Study on Chinese Wikipedia
[chapter]
2012
Lecture Notes in Computer Science
Probability models have been used in cross-modal multimedia information retrieval recently by building conjunctive models bridging the text and image components. ...
We investigate the problems of retrieving texts (ranked by semantic closeness) given an image query, and vice versa. ...
This work is partially funded by the NCET Program of MOE, the SRF for ROCS, the Fundamental Research Funds for the Central Universities and Graduate Innovative Practice Fund of BUAA. ...
doi:10.1007/978-3-642-35527-1_2
fatcat:4mjzgdhvfrax7nmb4vawi6ppva
Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval
[article]
2020
arXiv
pre-print
To facilitate video retrieval with complex queries, we propose a Tree-augmented Cross-modal Encoding method by jointly learning the linguistic structure of queries and the temporal representation of videos ...
Traditional methods mainly favor the concept-based paradigm on retrieval with simple queries, which are usually ineffective for complex queries that carry far more complex semantics. ...
Figure 2: An illustration of our tree-augmented cross-modal encoding method for complex-query video retrieval. ...
arXiv:2007.02503v1
fatcat:eptt6v2lirbgxet6bqm7wjjpzu
Deep Learning Techniques for Future Intelligent Cross-Media Retrieval
[article]
2020
arXiv
pre-print
Then, we present some well-known cross-media datasets used for retrieval, considering the importance of these datasets in the context of deep learning based cross-media retrieval approaches. ...
In this paper, we provide a novel taxonomy according to the challenges faced by multi-modal deep learning approaches in solving cross-media retrieval, namely: representation, alignment, and translation ...
Multimodal alignment is significant for cross-media retrieval, as it allows us to retrieve the contents of different modality based on input query (e.g., image retrieval in case of the text as a query, ...
arXiv:2008.01191v1
fatcat:t63bg55w2vdqjcprzaaidrmprq
Dual Encoding for Zero-Example Video Retrieval
[article]
2019
arXiv
pre-print
Given videos as sequences of frames and queries as sequences of words, an effective sequence-to-sequence cross-modal matching is required. ...
The majority of existing methods are concept based, extracting relevant concepts from queries and videos and accordingly establishing associations between the two modalities. ...
As for query representation, the authors design relatively complex linguistic rules to extract relevant concepts from a given query. Ueki et al. ...
arXiv:1809.06181v3
fatcat:tkjlbrflojhazdeq2wcihhdony
Deep Multimodal Learning for Affective Analysis and Retrieval
2015
IEEE transactions on multimedia
emotion classification and cross-modal retrieval. ...
More importantly, the joint representation enables emotion-oriented cross-modal retrieval, for example, retrieval of videos using the text query "crazy cat". ...
For the visual modality, different from the results in Table III, SentiBank and E-MDBM-V ... [Table IV: Mean Average Precision@20 of text-based, video-based, and multimodal queries for retrieving emotional ...]
doi:10.1109/tmm.2015.2482228
fatcat:7tozmatnhvbj7hjjohkofngecq
Weakly-Supervised Visual-Retriever-Reader for Knowledge-based Question Answering
[article]
2021
arXiv
pre-print
We introduce various ways to retrieve knowledge using text and images and two reader styles: classification and extraction. Both the retriever and reader are trained with weak supervision. ...
One dataset that is mostly used in evaluating knowledge-based VQA is OK-VQA, but it lacks a gold standard knowledge corpus for retrieval. ...
Acknowledgements The authors acknowledge support from the NSF grant 1816039, DARPA grant W911NF2020006, DARPA grant FA875019C0003, and ONR award N00014-20-1-2332; and thank the reviewers for their feedback ...
arXiv:2109.04014v1
fatcat:rnm2ghrosbd4xkctt4jnozfndu
A Survey on Content-based Image Retrieval
2017
International Journal of Advanced Computer Science and Applications
In this article, a survey on state-of-the-art content-based image retrieval, including empirical and theoretical work, is presented. ...
These databases can be counter-productive if they are not coupled with efficient Content-Based Image Retrieval (CBIR) tools. ...
ACKNOWLEDGMENT This work was supported by the Research Centre of the College of Computer and Information Sciences, King Saud University. The author is grateful for this support. ...
doi:10.14569/ijacsa.2017.080521
fatcat:kzfskamd25coxcj3537z6z3ty4
A support vector approach for cross-modal search of images and texts
2017
Computer Vision and Image Understanding
In this paper, we study two complementary cross-modal prediction tasks: (i) predicting text(s) given a query image ("Im2Text"), and (ii) predicting image(s) given a piece of text ("Text2Im"). ...
We propose a novel Structural SVM based unified framework for these two tasks, and show how it can be efficiently trained and tested. ...
This implies that the normalized correlation based loss function models the cross-modal patterns better than the other two loss functions. ...
doi:10.1016/j.cviu.2016.10.001
fatcat:4762cgs7cbflxh72kke72cagyi
Intermediate Annotationless Dynamical Object-Index-Based Query in Large Image Archives with Holographic Representation
1996
Journal of Visual Communication and Image Representation
This paper presents a new parallel and distributed associative network based technique for content-based image retrieval (CBIR) with dynamic indices. ...
The paper presents the mechanism, architecture and performance of an image archival and retrieval system realized with this new network. ...
and content-based retrieval in image archives [6, 10] . ...
doi:10.1006/jvci.1996.0033
fatcat:c6u52bcfcrdtflct43inog2kqq
Visual Goal-Step Inference using wikiHow
[article]
2021
arXiv
pre-print
We propose the Visual Goal-Step Inference (VGSI) task, where a model is given a textual goal and must choose which of four images represents a plausible step towards that goal. ...
With a new dataset harvested from wikiHow consisting of 772,277 images representing human actions, we show that our task is challenging for state-of-the-art multimodal models. ...
We thank Chenyu Liu for annotations. We also thank Simmi Mourya, Keren Fuentes, Carl Vondrick, Zsolt Kira, Mohit Bansal, Lara Martin, and anonymous reviewers for their valuable feedback. ...
arXiv:2104.05845v2
fatcat:nsli5d55zza3hjsyeih2j3aili
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
[article]
2020
arXiv
pre-print
We introduce a new pre-trainable generic representation for visual-linguistic tasks, called Visual-Linguistic BERT (VL-BERT for short). ...
It is designed to fit most visual-linguistic downstream tasks. ...
ACKNOWLEDGMENTS The work is partially supported by the National Natural Science Foundation of China under grant No. U19B2044 and No. 61836011. ...
arXiv:1908.08530v4
fatcat:venc4egmz5hhbe4oeyt5f2wgku
Overview of the ImageCLEF 2006 Photographic Retrieval and Object Annotation Tasks
[chapter]
2007
Lecture Notes in Computer Science
Topics have been categorised and analysed with respect to attributes including an estimation of their "visualness" and linguistic complexity. ...
These tasks provide both the resources and the framework necessary to perform comparative laboratorystyle evaluation of visual information systems for image retrieval and automatic image annotation. ...
Special thanks to viventura, the IAPR and LTUtech for providing their image databases for this year's tasks, and to Tobias Weyand for creating the web interface for submissions. ...
doi:10.1007/978-3-540-74999-8_71
fatcat:nwavr7byzbbflp3b7wvxd4lem4
From Visual Attributes to Adjectives through Decompositional Distributional Semantics
2015
Transactions of the Association for Computational Linguistics
We can thus achieve better attribute (and object) label retrieval by treating images as "visual phrases", and decomposing their linguistic representation into an attribute-denoting adjective and an object-denoting ...
By building on the recent "zero-shot learning" approach, and paying attention to the linguistic nature of attributes as noun modifiers, and specifically adjectives, we show that it is possible to tag images ...
Acknowledgments We thank the TACL reviewers for their feedback. We were supported by ERC 2011 Starting Independent Research Grant n. 283554 (COMPOSES). ...
doi:10.1162/tacl_a_00132
fatcat:cig3svaf75f57jjgo2r4bxfeum