CMU-MOSEAS: A Multimodal Language Dataset for Spanish, Portuguese, German and French
2020
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Modeling multimodal language is a core research area in natural language processing. While languages such as English have relatively large multimodal language resources, other widely spoken languages across the globe have few or no large-scale datasets in this area. This disproportionately affects native speakers of languages other than English. As a step towards building more equitable and inclusive multimodal systems, we introduce the first large-scale multimodal language dataset for Spanish,
doi:10.18653/v1/2020.emnlp-main.141
pmid:33969362
pmcid:PMC8106386
fatcat:rdq566qrk5h5lmweffg6whts2q