Technology Demonstration

Benjamin C. Stopher, Oliver J. Smith
2017 Proceedings of the 2017 ACM SIGCHI Conference on Creativity and Cognition - C&C '17  
This data is then used to create ideation metadata that can be used in comparative analysis of ideation 'events'.  ...  text relationships fed into the ideation environment.  Multimodal data: gathered through the use of gesture mapping (using stereoscopic 3D cameras such as the LEAP Motion [2]) and video analytics.  ...  ACKNOWLEDGMENTS We thank London College of Communication, University of the Arts London, whose Programme Director Sabbatical Scheme (disciplinary research) significantly enabled the presentation of the  ... 
doi:10.1145/3059454.3078711 dblp:conf/candc/StopherS17 fatcat:pnu67i2ck5cnnprytfq543roja

Visual and Textual Analysis of Social Media and Satellite Images for Flood Detection @ Multimedia Satellite Task MediaEval 2017

Konstantinos Avgerinakis, Anastasia Moumtzidou, Stelios Andreadis, Emmanouil Michail, Ilias Gialampoukidis, Stefanos Vrochidis, Ioannis Kompatsiaris
2017 MediaEval Benchmarking Initiative for Multimedia Evaluation  
Satellite images (FDSI).  ...  in order to discriminate flooded from non-flooded images.  ...  Textual information is also retrieved by leveraging the metadata of the social media images by using DBpedia Spotlight annotation tool [2].  ... 
dblp:conf/mediaeval/AvgerinakisMAMG17 fatcat:jj526zempjdydmcz3rmdbgo3tm

Unified hypergraph for image ranking in a multimodal context

Jiejun Xu, Vishwakarma Singh, Ziyu Guan, B.S. Manjunath
2012 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
Image ranking has long been studied, yet it remains a very challenging problem. Increasingly, online images come with additional metadata such as user annotations and geographic coordinates.  ...  Image ranking is then formulated as a ranking problem on a unified hypergraph.  ...  CONCLUSIONS In this paper, we address the image ranking problem for online community photo database, and focus on combining multimodal information such as image visual features, user tags and geo-locations  ... 
doi:10.1109/icassp.2012.6288382 dblp:conf/icassp/XuSGM12 fatcat:xjjizcto5rdcvd7qod4ynwjtwu

Introduction to the special issue on "Semantic Multimedia"

Yannis Avrithis, Noel E. O'Connor, Steffen Staab, Raphael Troncy
2008 Journal of Web Semantics  
The third paper "Image Retrieval ++ ---Web Image Retrieval with An Enhanced Multi-Modality Ontology" by Huan Wang, Liang-Tien Chia and Song Liu investigates the design and use of an ontology for multimodal  ...  similarity matching and ranking for multimedia retrieval on the Web.  ... 
doi:10.1016/j.websem.2008.02.004 fatcat:sqv6pfnxknfbhd5zmuagz56u5m

Introduction to the special issue on "semantic multimedia"

Yannis Avrithis, Noel E. O'Connor, Steffen Staab, Raphael Troncy
2008 Multimedia tools and applications  
Research in this area is important because the amount of information available as multimedia for the purpose of entertainment, security, teaching or technical documentation is overwhelming but the understanding  ...  The third paper "Image Retrieval ++ ---Web Image Retrieval with An Enhanced Multi-Modality Ontology" by Huan Wang, Liang-Tien Chia and Song Liu investigates the design and use of an ontology for multimodal  ...  similarity matching and ranking for multimedia retrieval on the Web.  ... 
doi:10.1007/s11042-008-0215-2 fatcat:vqc7qmyd2naxtkiksbccksk5di

Social interactions over geographic-aware multimedia systems

Roger Zimmermann, Yi Yu
2013 Proceedings of the 21st ACM international conference on Multimedia - MM '13  
., tweets, videos, images) can provide us with socially complementary information to predict users' needs.  ...  understanding of the basics of location-aware contextual descriptions and its relations to social multimedia scenes, but may also serve to highlight relationships that can be collaboratively applied to multimodal  ...  ., leveraging the Foursquare API) as an affordable and effective tool can be used to annotate geographic metadata with meaningful categories [1, 5] to make them more informative for geographic-aware  ... 
doi:10.1145/2502081.2502236 dblp:conf/mm/ZimmermannY13 fatcat:2df6ljo65rgwnaxbd65b56bgaa

Prototype Demonstration: Video Content Personalization for IPTV Services

David C. Gibbon, Zhu Liu, Harris Drucker, Bernard Renger, Lee Begeja, Behzad Shahraray
2007 2007 4th IEEE Consumer Communications and Networking Conference  
Query relevance ranking with temporal and other metadata constraints is used to form timely, focused content sets for users.  ...  We propose a solution based on stored user interest profiles and multimodal processing for content segmentation to produce manageable content subsets for users.  ... 
doi:10.1109/ccnc.2007.250 dblp:conf/ccnc/GibbonLDRBS07 fatcat:rvfzih3xujbtrbtq7b6folavb4

An adaptable search engine for multimodal information retrieval

Gilles Hubert, Josiane Mothe
2009 Journal of the American Society for Information Science and Technology  
Multimodal information retrieval is a research problem of great interest in all domains, due to the huge collections of multimedia data available in different contexts like text, image, audio and video  ...  This paper is an overview of multimodal information retrieval, challenges in the progress of multimodal information retrieval. General Terms Multi Modal Information Retrieval, Information Retrieval.  ...  Recent efforts in the field of multimodal retrieval systems have led to a growing research community and a number of academic and industrial projects.  ... 
doi:10.1002/asi.21091 fatcat:jgcvyun6zna7vkze5wcliibzxu

Movie genome: alleviating new item cold start in movie recommendation

Yashar Deldjoo, Maurizio Ferrari Dacrema, Mihai Gabriel Constantin, Hamid Eghbal-zadeh, Stefano Cereda, Markus Schedl, Bogdan Ionescu, Paolo Cremonesi
2019 User modeling and user-adapted interaction  
works Deldjoo et al. to better exploit complementary information between different modalities; (iii) proposing a two-step hybrid approach which trains a CF model on warm items (items with interactions) and leverages  ...  Currently, the most common approach to this problem is to switch to a purely CBF method, usually by exploiting textual metadata.  ...  for each image.  ... 
doi:10.1007/s11257-019-09221-y fatcat:cnzhzxwjlfbd7g4hhafalhtji4

A CNN-RNN Framework for Image Annotation from Visual Cues and Social Network Metadata [article]

Tobia Tesan, Pasquale Coscia, Lamberto Ballan
2020 arXiv   pre-print
Metadata accompanying images on social-media represent an ideal source of additional information for retrieving proper neighborhoods easing image annotation task.  ...  Images represent a commonly used form of visual communication among people.  ...  We also acknowledge the UNIPD CAPRI Consortium, for its support and access to computing resources.  ... 
arXiv:1910.05770v2 fatcat:4htuagvvfvaoxdw5ufjgvs3n2u


Wei Zhang, Lei Pang, Chong-Wah Ngo
2012 Proceedings of the 20th ACM international conference on Multimedia - MM '12  
For recalling instances of non-planar and non-rigid shapes, spatial configurations that emphasize topology consistency while allowing for local variations in matches have been incorporated.  ...  In IS, names of the instances are inferred from similar visual examples searched through a million-scale image dataset.  ...  ., the metadata and the image) 2 is extracted from. We consider two clues to model the noise level of the metadata.  ... 
doi:10.1145/2393347.2393432 dblp:conf/mm/ZhangPN12 fatcat:ipaqc6w7dbeblezxe5gouyziey

Leveraging Known Data for Missing Label Prediction in Cultural Heritage Context

Abdelhak Belhi, Abdelaziz Bouras, Sebti Foufou
2018 Applied Sciences  
In this paper, we present a novel multimodal classification approach for cultural heritage assets that relies on a multitask neural network where a convolutional neural network (CNN) is designed for visual  ...  In this paper, we tackle the challenge of automatically classifying and annotating cultural heritage assets using their visual features as well as the metadata available at hand.  ...  It would be interesting to leverage these metadata for the identification and prediction of the missing data.  ... 
doi:10.3390/app8101768 fatcat:jho5iv2r7razxbi4hdg5plewpi

1 Million Captioned Dutch Newspaper Images

Desmond Elliott, Martijn Kleppe
2016 Zenodo  
The dataset is suitable for experiments in automatic image captioning, image-article matching, object recognition, and data-to-text generation for weather forecasting.  ...  This type of multi-modal data offers an interesting basis for vision and language research but most existing datasets use crowdsourced text, which removes the images from their original context.  ...  Furthermore, off-the-shelf object recognition methods are unlikely to work on scanned black-and-white images. An alternative use for this dataset is Multimodal Ranking.  ... 
doi:10.5281/zenodo.844462 fatcat:ckai7wpqf5fbxatjmhjr3szd2m

Knowledge discovery over community-sharing media: From signal to intelligence

Winston Hsu, Tao Mei, Rong Yan
2009 2009 IEEE International Conference on Multimedia and Expo  
the design of efficient search, mining, and visualization methods for manipulation.  ...  Besides plain visual or audio signals, such large-scale media are augmented with rich context such as user-provided tags, geolocations, time, device metadata, and so on, benefiting a wide variety of potential  ...  Specifically, both image content and context information are leveraged in a joint matrix factorization framework for theme discovery and tag prediction.  ... 
doi:10.1109/icme.2009.5202775 dblp:conf/icmcs/HsuMY09 fatcat:nzblrwkfjfdcvfhlc65pvkotbi

An Interpretable Approach to Hateful Meme Detection [article]

Tanvi Deshpande, Nitya Mani
2021 arXiv   pre-print
Hateful memes are an emerging method of spreading hate on the internet, relying on both images and text to convey a hateful message.  ...  Multimodal/meme hate detection State-of-the-art multimodal hate speech detection often includes unimodally pretraining models for each modality, for early and late fusion, as well as multimodal pretraining  ...  The first meme, despite nonhateful individual modalities, is flagged by the model by leveraging the image to gain the context needed to understand the text.  ... 
arXiv:2108.10069v1 fatcat:2fmmd4nhjrbixoj5zpp5l66oii