Robust dependency parsing of spontaneous Japanese speech and its evaluation

Tomohiro Ohno, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki
2004 Interspeech 2004   unpublished
Spontaneously spoken Japanese contains many grammatically ill-formed phenomena, such as fillers, hesitations, and inversions, that do not appear in written language. This paper proposes a method of robust dependency parsing that uses a large-scale spoken language corpus, and evaluates the effectiveness and robustness of the method on spontaneously spoken dialogue sentences. By utilizing stochastic information about the appearance of ill-formed phenomena, the method can [...]ly parse spoken Japanese that includes fillers, inversions, or dependencies over utterance units. In the experiment, the parsing accuracy was 87.0%, and we confirmed that it is effective to use the location of a bunsetsu and the distance between bunsetsus as stochastic information.
doi:10.21437/interspeech.2004-237 fatcat:xbkzh6gy7bgc7iylregycycpdu
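
The abstract names two kinds of stochastic information: the location of a bunsetsu and the distance between bunsetsus. The following is a minimal sketch of how such features might feed a dependency probability estimated from corpus counts. It is not the authors' model: the feature classes, the additive smoothing, the greedy per-bunsetsu head choice, and all function names (`bucket`, `features`, `train`, `prob`, `best_head`) are assumptions made for illustration, and bunsetsus with no head (for example fillers) are not treated specially.

```python
from collections import Counter


def bucket(distance):
    """Map a signed bunsetsu distance to a coarse class.

    A negative distance means the head precedes the dependent (an inversion).
    The class boundaries are an assumption, not taken from the paper.
    """
    if distance < 0:
        return "back"
    if distance == 1:
        return "1"
    if distance <= 3:
        return "2-3"
    return "4+"


def features(i, j, n):
    """Distance class of the pair (i, j) plus a location flag:
    whether bunsetsu i is the last one in an utterance unit of n bunsetsus."""
    return (bucket(j - i), i == n - 1)


def train(corpus):
    """Count feature occurrences over all ordered bunsetsu pairs.

    corpus: list of head lists; heads[i] is the index of the head of
    bunsetsu i, or None if bunsetsu i has no head.
    """
    linked, seen = Counter(), Counter()
    for heads in corpus:
        n = len(heads)
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                f = features(i, j, n)
                seen[f] += 1
                if heads[i] == j:
                    linked[f] += 1
    return linked, seen


def prob(linked, seen, f, alpha=0.5):
    """Additively smoothed relative frequency of a dependency given features f."""
    return (linked[f] + alpha) / (seen[f] + 2 * alpha)


def best_head(i, n, linked, seen):
    """Greedily pick the most probable head for bunsetsu i (a simplification)."""
    candidates = [j for j in range(n) if j != i]
    return max(candidates, key=lambda j: prob(linked, seen, features(i, j, n)))
```

A full parser would presumably search for the dependency structure that is most probable for the whole sentence under structural constraints, rather than choosing each head independently as this sketch does.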