Technology Demonstration
2017
Proceedings of the 2017 ACM SIGCHI Conference on Creativity and Cognition - C&C '17
This data is then used to create ideation metadata that can be used in comparative analysis of ideation 'events'. ...
text relationships fed into the ideation environment. Multimodal data: gathered through the use of gesture mapping (using stereoscopic 3D cameras such as the LEAP Motion [2]) and video analytics. ...
ACKNOWLEDGMENTS We thank London College of Communication, University of the Arts London, whose Programme Director Sabbatical Scheme (disciplinary research) significantly enabled the presentation of the ...
doi:10.1145/3059454.3078711
dblp:conf/candc/StopherS17
fatcat:pnu67i2ck5cnnprytfq543roja
Visual and Textual Analysis of Social Media and Satellite Images for Flood Detection @ Multimedia Satellite Task MediaEval 2017
2017
MediaEval Benchmarking Initiative for Multimedia Evaluation
Satellite images (FDSI). ...
in order to discriminate flooded from non-flooded images. ...
Textual information is also retrieved by leveraging the metadata of the social media images by using DBpedia Spotlight annotation tool [2] . ...
dblp:conf/mediaeval/AvgerinakisMAMG17
fatcat:jj526zempjdydmcz3rmdbgo3tm
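The entry above combines a visual classifier with textual cues drawn from image metadata to separate flooded from non-flooded images. As a minimal illustration of the general idea (not the authors' actual pipeline; the per-modality scores and the fusion weight below are hypothetical), a late-fusion rule might look like:

```python
# Minimal late-fusion sketch for flooded / non-flooded classification.
# The per-modality scores and the fusion weight are hypothetical; the
# paper's actual models and weighting are not given in the snippet.

def fuse_scores(visual_score: float, textual_score: float, w: float = 0.6) -> float:
    """Weighted late fusion of two per-image confidence scores in [0, 1]."""
    return w * visual_score + (1.0 - w) * textual_score

def is_flooded(visual_score: float, textual_score: float, threshold: float = 0.5) -> bool:
    return fuse_scores(visual_score, textual_score) >= threshold

# Example: strong visual evidence, weak textual evidence.
print(is_flooded(0.9, 0.2))  # visual cue dominates with w=0.6 -> True
```

A learned fusion (e.g. a classifier over both scores) would replace the fixed weight in practice.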
Unified hypergraph for image ranking in a multimodal context
2012
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Image ranking has long been studied, yet it remains a very challenging problem. Increasingly, online images come with additional metadata such as user annotations and geographic coordinates. ...
Image ranking is then formulated as a ranking problem on a unified hypergraph. ...
CONCLUSIONS In this paper, we address the image ranking problem for online community photo database, and focus on combining multimodal information such as image visual features, user tags and geo-locations ...
doi:10.1109/icassp.2012.6288382
dblp:conf/icassp/XuSGM12
fatcat:xjjizcto5rdcvd7qod4ynwjtwu
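The hypergraph entry above formulates ranking over vertices (images) connected by hyperedges that each group images sharing a tag or a geographic cell. A toy sketch of that formulation, using a personalized random walk on the hypergraph transition matrix (synthetic data; not the paper's exact objective):

```python
import numpy as np

# Toy unified-hypergraph ranking: vertices are images, hyperedges group
# images sharing a tag or a geo cell. All data below is synthetic.

H = np.array([  # incidence: 4 images x 3 hyperedges
    [1, 0, 1],  # image 0: tag "beach", geo cell A
    [1, 1, 0],  # image 1: tags "beach", "sunset"
    [0, 1, 1],  # image 2: tag "sunset", geo cell A
    [0, 0, 1],  # image 3: geo cell A
], dtype=float)

w = np.array([1.0, 1.0, 0.5])   # hyperedge weights (geo weighted lower)
Dv = H @ w                      # vertex degrees
De = H.sum(axis=0)              # hyperedge degrees

# Row-stochastic transition matrix of the hypergraph random walk.
P = (H * w / De) @ H.T / Dv[:, None]

q = np.array([1.0, 0.0, 0.0, 0.0])  # query: image 0 is relevant
r = q.copy()
for _ in range(100):                # random walk with restart
    r = 0.85 * P.T @ r + 0.15 * q

ranking = np.argsort(-r)
print(ranking)                      # query image ranks first
```

Images sharing more (and heavier) hyperedges with the query accumulate more walk probability, which is the intuition behind ranking on a unified hypergraph.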
Introduction to the special issue on "Semantic Multimedia"
2008
Journal of Web Semantics
The third paper "Image Retrieval++ - Web Image Retrieval with An Enhanced Multi-Modality Ontology" by Huan Wang, Liang-Tien Chia and Song Liu investigates the design and use of an ontology for multimodal ...
similarity matching and ranking for multimedia retrieval on the Web. ...
doi:10.1016/j.websem.2008.02.004
fatcat:sqv6pfnxknfbhd5zmuagz56u5m
Introduction to the special issue on "semantic multimedia"
2008
Multimedia tools and applications
Research in this area is important because the amount of information available as multimedia for the purpose of entertainment, security, teaching or technical documentation is overwhelming but the understanding ...
The third paper "Image Retrieval++ - Web Image Retrieval with An Enhanced Multi-Modality Ontology" by Huan Wang, Liang-Tien Chia and Song Liu investigates the design and use of an ontology for multimodal ...
similarity matching and ranking for multimedia retrieval on the Web. ...
doi:10.1007/s11042-008-0215-2
fatcat:vqc7qmyd2naxtkiksbccksk5di
Social interactions over geographic-aware multimedia systems
2013
Proceedings of the 21st ACM international conference on Multimedia - MM '13
... (e.g., tweets, videos, images) can provide us with socially complementary information to predict users' needs. ...
understanding of the basics of location-aware contextual descriptions and their relations to social multimedia scenes, but may also serve to highlight relationships that can be collaboratively applied to multimodal ...
... (e.g., leveraging the Foursquare API) as an affordable and effective tool that can be used to annotate geographic metadata with meaningful categories [1, 5] to make them more informative for geographic-aware ...
doi:10.1145/2502081.2502236
dblp:conf/mm/ZimmermannY13
fatcat:2df6ljo65rgwnaxbd65b56bgaa
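The entry above describes annotating geographic metadata with meaningful venue categories. A self-contained sketch of the idea, using a nearest-venue lookup over a made-up local venue table (a real system would query a places API such as Foursquare instead):

```python
from math import radians, sin, cos, asin, sqrt

# Sketch: annotate a photo's (lat, lon) metadata with the category of the
# nearest known venue. The venue table is hypothetical.

VENUES = [
    (51.5007, -0.1246, "Landmark"),   # Big Ben
    (51.5081, -0.0759, "Castle"),     # Tower of London
    (51.5194, -0.1270, "Museum"),     # British Museum
]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def categorize(lat, lon):
    """Return the category of the closest known venue."""
    return min(VENUES, key=lambda v: haversine_km(lat, lon, v[0], v[1]))[2]

print(categorize(51.5010, -0.1240))  # near Big Ben -> "Landmark"
```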
Prototype Demonstration: Video Content Personalization for IPTV Services
2007
2007 4th IEEE Consumer Communications and Networking Conference
Query relevance ranking with temporal and other metadata constraints is used to form timely, focused content sets for users. ...
We propose a solution based on stored user interest profiles and multimodal processing for content segmentation to produce manageable content subsets for users. ...
doi:10.1109/ccnc.2007.250
dblp:conf/ccnc/GibbonLDRBS07
fatcat:rvfzih3xujbtrbtq7b6folavb4
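The IPTV entry above ranks content by query relevance under temporal metadata constraints. A hedged sketch of that kind of scoring (keyword overlap damped by a recency decay; the scoring function and half-life are hypothetical, not the authors' method):

```python
from datetime import datetime, timedelta
import math

# Sketch of query-relevance ranking with a temporal constraint:
# keyword overlap with the query, damped by an exponential recency decay.

def score(segment, query_terms, now, half_life_hours=24.0):
    """Keyword overlap scaled by a half-life recency decay."""
    terms = set(segment["title"].lower().split())
    overlap = len(terms & query_terms) / max(len(query_terms), 1)
    age_h = (now - segment["published"]).total_seconds() / 3600.0
    return overlap * math.exp(-math.log(2) * age_h / half_life_hours)

now = datetime(2007, 1, 10, 12, 0)
segments = [
    {"title": "Election results special", "published": now - timedelta(hours=2)},
    {"title": "Election results recap", "published": now - timedelta(hours=48)},
    {"title": "Weather update", "published": now - timedelta(hours=1)},
]
ranked = sorted(segments, key=lambda s: score(s, {"election", "results"}, now), reverse=True)
print([s["title"] for s in ranked])
```

Recency here acts as a soft constraint; a hard temporal filter would simply drop segments outside a window before scoring.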
An adaptable search engine for multimodal information retrieval
2009
Journal of the American Society for Information Science and Technology
Multimodal information retrieval is a research problem of great interest in all domains, due to the huge collections of multimedia data available in different contexts like text, image, audio and video ...
This paper gives an overview of multimodal information retrieval and the challenges in its progress. ...
Recent efforts in the field of multimodal retrieval systems have led to a growing research community and a number of academic and industrial projects. ...
doi:10.1002/asi.21091
fatcat:jgcvyun6zna7vkze5wcliibzxu
Movie genome: alleviating new item cold start in movie recommendation
2019
User modeling and user-adapted interaction
... Deldjoo et al. to better exploit complementary information between different modalities; (iii) proposing a two-step hybrid approach which trains a CF model on warm items (items with interactions) and leverages ...
Currently, the most common approach to this problem is to switch to a purely CBF method, usually by exploiting textual metadata. ...
for each image. ...
doi:10.1007/s11257-019-09221-y
fatcat:cnzhzxwjlfbd7g4hhafalhtji4
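The cold-start entry above trains a collaborative-filtering (CF) model on warm items and falls back to content features for new items. A minimal sketch of that switch, with synthetic stand-ins for the learned CF factors and content ("genome") vectors:

```python
import numpy as np

# Sketch of the two-step idea: CF dot product for warm items, content
# similarity to the user's liked items for cold ones. All vectors are
# synthetic stand-ins, not the paper's learned representations.

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def score_item(user_cf, item_cf, item_content, liked_content, is_cold):
    if not is_cold:                 # warm: dot product of CF factors
        return float(user_cf @ item_cf)
    # cold: average content similarity to items the user liked
    return float(np.mean([cosine(item_content, c) for c in liked_content]))

user_cf = np.array([0.9, 0.1])
liked_content = [np.array([1.0, 0.0, 0.0]), np.array([0.8, 0.2, 0.0])]

warm = score_item(user_cf, np.array([0.8, 0.2]), None, liked_content, is_cold=False)
cold = score_item(user_cf, None, np.array([0.9, 0.1, 0.0]), liked_content, is_cold=True)
print(round(warm, 3), round(cold, 3))
```

The point of the hybrid is that both branches produce comparable relevance scores, so one ranking can mix warm and cold items.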
A CNN-RNN Framework for Image Annotation from Visual Cues and Social Network Metadata
[article]
2020
arXiv
pre-print
Metadata accompanying images on social-media represent an ideal source of additional information for retrieving proper neighborhoods easing image annotation task. ...
Images represent a commonly used form of visual communication among people. ...
We also acknowledge the UNIPD CAPRI Consortium, for its support and access to computing resources. ...
arXiv:1910.05770v2
fatcat:4htuagvvfvaoxdw5ufjgvs3n2u
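The CNN-RNN entry above uses social-media metadata to retrieve neighbourhoods of similar images that ease annotation. A stripped-down sketch of the retrieval step alone, using Jaccard similarity over tag sets with a majority vote (the dataset and vote rule are illustrative, not the paper's model):

```python
# Sketch: retrieve neighbour images via metadata (tag-set) similarity and
# propagate their labels to a query image by majority vote. The tiny
# database and the vote rule are hypothetical.

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

DATABASE = [
    ({"beach", "sunset", "sea"}, "coast"),
    ({"beach", "sand"}, "coast"),
    ({"mountain", "snow"}, "alpine"),
]

def annotate(query_tags: set, k: int = 2) -> str:
    """Label a query image by majority vote over its k nearest neighbours."""
    neighbours = sorted(DATABASE, key=lambda e: jaccard(query_tags, e[0]), reverse=True)[:k]
    labels = [lab for _, lab in neighbours]
    return max(set(labels), key=labels.count)

print(annotate({"beach", "sea"}))  # -> "coast"
```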
For recalling instances of non-planar and non-rigid shapes, spatial configurations that emphasize topology consistency while allowing for local variations in matches have been incorporated. ...
In IS, names of the instances are inferred from similar visual examples searched through a million-scale image dataset. ...
... (i.e., the metadata and the image) is extracted from. We consider two clues to model the noise level of the metadata. ...
doi:10.1145/2393347.2393432
dblp:conf/mm/ZhangPN12
fatcat:ipaqc6w7dbeblezxe5gouyziey
Leveraging Known Data for Missing Label Prediction in Cultural Heritage Context
2018
Applied Sciences
In this paper, we present a novel multimodal classification approach for cultural heritage assets that relies on a multitask neural network where a convolutional neural network (CNN) is designed for visual ...
In this paper, we tackle the challenge of automatically classifying and annotating cultural heritage assets using their visual features as well as the metadata available at hand. ...
It would be interesting to leverage these metadata for the identification and prediction of the missing data. ...
doi:10.3390/app8101768
fatcat:jho5iv2r7razxbi4hdg5plewpi
1 Million Captioned Dutch Newspaper Images
2016
Zenodo
The dataset is suitable for experiments in automatic image captioning, image-article matching, object recognition, and data-to-text generation for weather forecasting. ...
This type of multi-modal data offers an interesting basis for vision and language research but most existing datasets use crowdsourced text, which removes the images from their original context. ...
Furthermore, off-the-shelf object recognition methods are unlikely to work on scanned black-and-white images. An alternative use for this dataset is Multimodal Ranking. ...
doi:10.5281/zenodo.844462
fatcat:ckai7wpqf5fbxatjmhjr3szd2m
Knowledge discovery over community-sharing media: From signal to intelligence
2009
2009 IEEE International Conference on Multimedia and Expo
the design of efficient search, mining, and visualization methods for manipulation. ...
Besides plain visual or audio signals, such large-scale media are augmented with rich context such as user-provided tags, geolocations, time, device metadata, and so on, benefiting a wide variety of potential ...
Specifically, both image content and context information are leveraged in a joint matrix factorization framework for theme discovery and tag prediction. ...
doi:10.1109/icme.2009.5202775
dblp:conf/icmcs/HsuMY09
fatcat:nzblrwkfjfdcvfhlc65pvkotbi
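The entry above leverages image content and context jointly in a matrix factorization framework for theme discovery and tag prediction. A toy version of that idea: factor an image-tag matrix and an image-context matrix with shared image factors, so context ties similar images together and fills in missing tags. The data and the (standard multiplicative NMF) updates are illustrative, not the paper's formulation:

```python
import numpy as np

# Joint NMF sketch: X (images x tags) and Y (images x context features)
# share latent image factors U; this is equivalent to NMF on the
# concatenated matrix [X | Y]. All data is synthetic.

rng = np.random.default_rng(0)
X = np.array([[1, 1, 0],    # image 1 is missing tag 1
              [1, 0, 0],
              [0, 0, 1],
              [0, 0, 1]], dtype=float)
Y = np.array([[1, 0],       # context (e.g. geo cell) ties images 0-1 and 2-3
              [1, 0],
              [0, 1],
              [0, 1]], dtype=float)

k, eps = 2, 1e-9
U = rng.random((4, k)); V = rng.random((3, k)); W = rng.random((2, k))

for _ in range(300):        # multiplicative updates for the joint objective
    U *= (X @ V + Y @ W) / (U @ (V.T @ V + W.T @ W) + eps)
    V *= (X.T @ U) / (V @ (U.T @ U) + eps)
    W *= (Y.T @ U) / (W @ (U.T @ U) + eps)

X_hat = U @ V.T             # reconstructed tag scores
# Shared factors link image 1 to image 0, so the missing on-theme tag 1
# scores higher for image 1 than the off-theme tag 2.
print(X_hat[1, 1] > X_hat[1, 2])
```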
An Interpretable Approach to Hateful Meme Detection
[article]
2021
arXiv
pre-print
Hateful memes are an emerging method of spreading hate on the internet, relying on both images and text to convey a hateful message. ...
Multimodal/meme hate detection State-of-the-art multimodal hate speech detection often includes unimodally pretraining models for each modality, for early and late fusion, as well as multimodal pretraining ...
The first meme, despite nonhateful individual modalities, is flagged by the model by leveraging the image to gain the context needed to understand the text. ...
arXiv:2108.10069v1
fatcat:2fmmd4nhjrbixoj5zpp5l66oii
Showing results 1 — 15 out of 1,130 results