Extraction of Suppositional Adverb and Clause-Final Modality Form Distant Collocations Using a Web Corpus and Corpus Query System and its Application to Japanese Language Learning

Irena Srdanović, Bor Hodošček, Andrej Bekeš, Kikuko Nishina
2009 Journal of Natural Language Processing  
Affiliations: Tokyo Institute of Technology; University of Ljubljana, Slovenia
A systematic account of Japanese modality forms, and of distant collocations between modal adverbs and clause-final modality forms, is lacking in the field of natural language processing. The same holds for the coverage of this kind of linguistic information in Japanese language education. To remedy this deficiency, in this paper we make the extraction of collocations between Japanese adverbs and clause-final modality forms possible using the corpus query system Sketch Engine, and we examine possibilities for its application in Japanese language learning, focusing on learners' dictionaries. First, by analyzing various Japanese language corpora, we create a long list of modality forms and their variations. Then, we examine how ChaSen morphologically analyzes these forms and retag a sample of the large-scale Japanese web corpus JpWaC, grouping all morphemes that correspond to an individual modality form together under a new modality tag. Finally, we load the newly tagged corpus into the Sketch Engine (SkE), modify the gramrel file, and obtain Word Sketch results for collocations between suppositional adverbs and modality forms. The evaluation of the collocation results shows that the proposed method reaches an accuracy above 93%. The results can be used in the creation of Japanese learners' dictionaries and other language materials, or directly in language teaching and learning.
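The retagging step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the token format, the `MODALITY_FORMS` list, the `Modality` tag name, and the function `retag_modality` are all hypothetical, and the morpheme splits shown are illustrative rather than actual ChaSen output. The sketch only shows the general idea of merging a run of morphemes that corresponds to one modality form into a single token with a new tag.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (surface form, part-of-speech tag)

# Hypothetical modality forms, each written as the sequence of surface
# morphemes an analyzer might split it into. Illustrative only.
MODALITY_FORMS = [
    ("か", "も", "しれ", "ない"),
    ("に", "ちがい", "ない"),
    ("だろ", "う"),
]

MODALITY_TAG = "Modality"  # hypothetical tag name


def retag_modality(tokens: List[Token]) -> List[Token]:
    """Merge morpheme runs that match a known modality form into one token."""
    # Try longer forms first so that the longest match wins.
    forms = sorted(MODALITY_FORMS, key=len, reverse=True)
    merged: List[Token] = []
    i = 0
    while i < len(tokens):
        for form in forms:
            surfaces = tuple(s for s, _ in tokens[i:i + len(form)])
            if surfaces == form:
                # Replace the whole run with a single modality token.
                merged.append(("".join(form), MODALITY_TAG))
                i += len(form)
                break
        else:
            # No modality form starts here; keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


# Made-up example sentence: "たぶん雨が降るかもしれない"
sentence = [
    ("たぶん", "Adverb"),
    ("雨", "Noun"),
    ("が", "Particle"),
    ("降る", "Verb"),
    ("か", "Particle"),
    ("も", "Particle"),
    ("しれ", "Verb"),
    ("ない", "Aux"),
]
result = retag_modality(sentence)
```

With the modality form collapsed into one token, a query over the corpus can relate the adverb to a single clause-final unit instead of a variable-length morpheme sequence, which is the property the retagging is meant to provide.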
doi:10.5715/jnlp.16.4_29 fatcat:c5levechlvesfeqi6ruswijmui