A multimodal learning interface for grounding spoken language in sensory perceptions
2003
Proceedings of the 5th International Conference on Multimodal Interfaces (ICMI '03)
We present a multimodal interface that learns words from natural interactions with users. In light of studies of human language development, the learning system is trained in an unsupervised mode in which users perform everyday tasks while providing natural language descriptions of their behaviors. The system collects acoustic signals in concert with user-centric multisensory information from nonspeech modalities, such as the user's perspective video, gaze positions, head directions, and hand movements.
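The grounding approach sketched in the abstract rests on temporal co-occurrence: spoken words are associated with the sensory observations captured while they were uttered. Below is a minimal Python sketch of that pairing step, assuming a simple timestamped representation; all names here (SensoryFrame, SpokenSegment, cooccurring, the margin parameter) are hypothetical illustrations, not the paper's actual data structures or algorithm.

```python
from dataclasses import dataclass

# Hypothetical types for illustration only; not from the paper.
@dataclass
class SensoryFrame:
    t: float                      # timestamp in seconds
    gaze: tuple[float, float]     # gaze position in the user's view
    head: tuple[float, float]     # head direction (pan, tilt)
    hand: tuple[float, float, float]  # hand position (x, y, z)

@dataclass
class SpokenSegment:
    t_start: float                # segment start time in seconds
    t_end: float                  # segment end time in seconds
    word: str                     # word hypothesis from the acoustic signal

def cooccurring(segment: SpokenSegment,
                frames: list[SensoryFrame],
                margin: float = 0.5) -> list[SensoryFrame]:
    """Return sensory frames that overlap a spoken segment in time
    (within a small margin) -- the raw material for associating a
    word with the perceptions it was uttered alongside."""
    return [f for f in frames
            if segment.t_start - margin <= f.t <= segment.t_end + margin]

# Toy usage: the word "cup" paired with frames captured while it was spoken.
frames = [SensoryFrame(t=0.1 * i, gaze=(0.4, 0.6), head=(10.0, -5.0),
                       hand=(0.2, 0.1, 0.3)) for i in range(30)]
seg = SpokenSegment(t_start=1.0, t_end=1.4, word="cup")
print(seg.word, "co-occurs with", len(cooccurring(seg, frames)), "frames")
```

In an unsupervised setting like the one described, repeated co-occurrence counts of this kind, accumulated across many interactions, are what would let word-to-percept associations emerge without labeled data.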
doi:10.1145/958432.958465
dblp:conf/icmi/YuB03
fatcat:moi3pp7swjctrpy6amho4zbqv4