MultiQT: Multimodal learning for real-time question tracking in speech
2020
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
We address the challenging and practical task of labeling questions in speech in real time during telephone calls to emergency medical services in English, which embeds within a broader decision support system for emergency call-takers. We propose a novel multimodal approach to real-time sequence labeling in speech. Our model treats speech and its own textual representation as two separate modalities or views, as it jointly learns from streamed audio and its noisy transcription into text via automatic speech recognition.
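The abstract describes the general shape of the model rather than its exact layers. As a rough illustration only, the sketch below shows one minimal way to realize the two-view idea: an audio encoder over log-mel frames and an embedding of ASR tokens, fused by concatenation into a per-time-step question label. Everything here is an assumption for illustration; the class name, layer sizes, concatenation fusion, and the premise that ASR tokens are pre-aligned to the downsampled audio frame rate are not taken from the paper.

```python
# Hypothetical sketch, NOT the authors' MultiQT architecture: a two-view
# sequence labeler that fuses streamed audio features with embeddings of a
# noisy ASR transcript and predicts a label for every time step.
import torch
import torch.nn as nn


class TwoViewQuestionTagger(nn.Module):
    def __init__(self, n_mels=80, vocab_size=10_000, hidden=256, n_labels=2):
        super().__init__()
        # Audio view: strided 1-D convolutions over log-mel frames
        # (each stride-2 layer halves the time resolution).
        self.audio_enc = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
        )
        # Text view: embeddings of ASR tokens, assumed already aligned
        # to the downsampled audio frame rate.
        self.text_enc = nn.Embedding(vocab_size, hidden)
        # Late fusion by concatenation, then a per-step classifier.
        self.classifier = nn.Linear(2 * hidden, n_labels)

    def forward(self, mels, tokens):
        # mels: (batch, n_mels, frames); tokens: (batch, frames // 4).
        a = self.audio_enc(mels).transpose(1, 2)   # (batch, T, hidden)
        t = self.text_enc(tokens)                  # (batch, T, hidden)
        return self.classifier(torch.cat([a, t], dim=-1))  # (batch, T, n_labels)


# Usage: per-frame logits over {no-question, question} for a streamed window.
model = TwoViewQuestionTagger()
logits = model(torch.randn(1, 80, 400), torch.randint(0, 10_000, (1, 100)))
print(logits.shape)  # torch.Size([1, 100, 2])
```

Concatenation is only the simplest fusion choice; the point of the sketch is that both views contribute features at every labeled time step, so the model can fall back on audio when the transcription is noisy.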
doi:10.18653/v1/2020.acl-main.215
fatcat:nwsmg3k6wfgddmnwfolyjmhwt4