A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; the original URL remains accessible. The file type is application/pdf.
Getting the subtext without the text: Scalable multimodal sentiment classification from visual and acoustic modalities
[article] · 2018 · arXiv pre-print
In the last decade, video blogs (vlogs) have become an extremely popular medium through which people express sentiment. The ubiquity of these videos has increased the importance of multimodal fusion models, which combine video and audio features with traditional text features for automatic sentiment detection. Multimodal fusion offers a unique opportunity to build models that learn from the full depth of expression available to human viewers. In the detection of sentiment in these
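The multimodal fusion the abstract describes can be illustrated with a minimal feature-level (early-fusion) sketch: per-clip visual and acoustic feature vectors are concatenated and passed to a classifier. All dimensions, weights, and data below are invented for illustration and are not taken from the paper.

```python
import numpy as np

# Toy data: 4 video clips with hypothetical per-clip feature vectors.
# Dimensions are arbitrary placeholders, not the paper's actual features.
rng = np.random.default_rng(0)
visual = rng.normal(size=(4, 35))    # e.g. facial-expression descriptors
acoustic = rng.normal(size=(4, 74))  # e.g. prosodic/spectral descriptors

# Early fusion: concatenate both modalities into one joint representation.
fused = np.concatenate([visual, acoustic], axis=1)  # shape (4, 109)

# A toy linear sentiment scorer on the fused features (random weights
# stand in for a trained model).
w = rng.normal(size=fused.shape[1])
scores = fused @ w
labels = (scores > 0).astype(int)  # 1 = positive sentiment, 0 = negative
```

In practice the scorer would be a trained model, and fusion may instead happen late (combining per-modality predictions), but concatenation is the simplest form of the feature-level fusion the abstract refers to.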
arXiv:1807.01122v1
fatcat:6w2ym2lxwzdixmt3bd3o4rbyfe