D4.2 Report on Discourse-Aware Machine Translation for Audiovisual Data

Maija Hirvonen, Maarit Koponen, Umut Sulubacak, Jörg Tiedemann
2019 Zenodo  
Machine translation is conventionally based on processing isolated textual sentences, disregarding the broader context around them, as well as any cues that are not explicit in text. This formulation is unable to reliably determine referential discourse phenomena such as anaphora and connectives, which creates a barrier to the production of cohesive and coherent language. While exploiting the audio and visual modalities provides a window into the context, machine translation must also venture beyond the sentence, and become aware of the discourse in the entire document. There has been a surge of research in discourse-aware machine translation in the last two decades, primarily focusing on the design and evaluation of document-level machine translation systems, and on expanding the textual context in machine translation architectures without compromising computational efficiency. Following the recent advances, we have likewise put substantial effort into the development of discourse-aware machine translation systems, albeit without considerable improvement. In this deliverable, we start by breaking down the most salient phenomena to explain their relevance, and present a brief survey of successful approaches to discourse-aware machine translation, followed by descriptions of the models we have developed within WP4 of the MeMAD project. We dedicate the rest of the report to our analysis of user evaluation data collected in an experiment where professional translators tested post-editing of machine translation for subtitling. Finally, we conclude our report with discussions of future directions in utilising dedicated subtitle translation systems, speaker and dialogue information, and end-to-end speech translation models in the last year of the project.
doi:10.5281/zenodo.3690764 fatcat:yrgoykfj4jbnzjrpdt6iw4qgo4