Multimodal Distributional Semantics
by E. Bruni, N. K. Tran, M. Baroni
2014, Volume 49, pp. 1-47
Abstract
Distributional semantic models derive computational representations of word meaning from the patterns of co-occurrence of words in text. Such models have been a success story of computational linguistics, being able to provide reliable estimates of semantic relatedness for the many semantic tasks requiring them. However, distributional models extract meaning information exclusively from text, which is an extremely impoverished basis compared to the rich perceptual sources that ground human semantic knowledge. We address the lack of perceptual grounding of distributional models by exploiting computer vision techniques that automatically identify discrete "visual words" in images, so that the distributional representation of a word can be extended to also encompass its co-occurrence with the visual words of images it is associated with. We propose a flexible architecture to integrate text- and image-based distributional information, and we show in a set of empirical tests that our integrated model is superior to the purely text-based approach, and it provides somewhat complementary semantic information with respect to the latter.
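As a rough illustration of the integration idea described in the abstract, the sketch below fuses a text-based co-occurrence vector with a visual-word count vector by weighted concatenation and compares words with cosine similarity. The function names, the `alpha` weight, and the toy counts are assumptions made for illustration; they are not taken from the paper, and the extraction of visual words from images is not shown.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(text_vec, image_vec, alpha=0.5):
    """Weighted concatenation of normalized text and image vectors.

    alpha is a hypothetical mixing weight: 1.0 keeps only the text channel,
    0.0 keeps only the image channel.
    """
    t = l2_normalize(text_vec)
    v = l2_normalize(image_vec)
    return [alpha * x for x in t] + [(1 - alpha) * x for x in v]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


# Toy data (invented): co-occurrence counts with 3 context words,
# and counts of 4 visual words in images associated with each target word.
TEXT = {"moon": [4.0, 0.0, 1.0], "sun": [3.0, 1.0, 0.0]}
IMAGE = {"moon": [0.0, 5.0, 2.0, 1.0], "sun": [1.0, 4.0, 0.0, 2.0]}

moon = fuse(TEXT["moon"], IMAGE["moon"], alpha=0.5)
sun = fuse(TEXT["sun"], IMAGE["sun"], alpha=0.5)
similarity = cosine(moon, sun)
```

Normalizing each channel before concatenation keeps one modality from dominating purely because its raw counts are larger; the weight then controls the balance explicitly.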
Type: article-journal
Stage: published
Date: 2014-01-23
Open Access Publication
In DOAJ
In ISSN ROAD
Not in Keepers Registry
ISSN-L: 1076-9757