Discourse Segmentation of German Texts

Uladzimir Sidarenka, Andreas Peldszus, Manfred Stede
2015 Journal for Language Technology and Computational Linguistics  
This paper addresses the problem of segmenting German texts into minimal discourse units, as they are needed, for example, in RST-based discourse parsing. We discuss relevant variants of the problem, introduce the design of our annotation guidelines, and provide the results of an extensive interannotator agreement study of the corpus. Afterwards, we report on our experiments with three automatic classifiers that rely on the output of state-of-the-art parsers and use different amounts and kinds of syntactic knowledge: constituent parsing versus dependency parsing; tree-structure classification versus sequence labeling. Finally, we compare our approaches with the recent discourse segmentation methods proposed for English.
dblp:journals/ldvf/SidarenkaPS15 fatcat:555oe4e3zraqljsf2bahb3f6by