A Three-Layered Collocation Extraction Tool and Its Application in China English Studies [chapter]

Jingxiang Cao, Dan Li, Degen Huang
2015 Lecture Notes in Computer Science  
We design a three-layered collocation extraction tool that integrates syntactic and semantic knowledge, and we apply it in China English studies. The tool first extracts peripheral collocations in the frequency layer from dependency triples, then extracts semi-peripheral collocations in the syntactic layer using association measures, and finally extracts core collocations in the semantic layer with a similar-word thesaurus. The syntactic constraints filter out much of the noise from surface co-occurrences, and the semantic constraints are effective in identifying the very "core" collocations. The tool is applied to automatically extract collocations from a large corpus of China English that we compiled, in order to explore how China English, as a variety of English, is nativized. We then analyze the similarities and differences among the typical China English collocations of a group of verbs. The tool and results can be applied in the compilation of language resources for Chinese-English translation and corpus-based China studies.
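The three layers described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `extract_layers`, the thresholds `min_freq` and `min_pmi`, the choice of pointwise mutual information as the association measure, and the substitution test used for the semantic layer are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def extract_layers(triples, thesaurus, min_freq=2, min_pmi=1.0):
    """Hypothetical three-layer filter over (head, relation, dependent) triples.

    thesaurus maps a word to an iterable of similar words (assumed format).
    Returns the peripheral, semi-peripheral and core sets of triples.
    """
    counts = Counter(triples)

    # Layer 1 (frequency): keep triples seen at least min_freq times.
    peripheral = {t for t, c in counts.items() if c >= min_freq}

    # Marginal counts per relation, needed for the association measure.
    head_counts, dep_counts, rel_counts = Counter(), Counter(), Counter()
    for (head, rel, dep), c in counts.items():
        head_counts[(head, rel)] += c
        dep_counts[(rel, dep)] += c
        rel_counts[rel] += c

    def pmi(triple):
        # Pointwise mutual information of head and dependent within one relation.
        head, rel, dep = triple
        joint = counts[triple] * rel_counts[rel]
        return math.log2(joint / (head_counts[(head, rel)] * dep_counts[(rel, dep)]))

    # Layer 2 (syntactic): keep triples whose association score passes min_pmi.
    semi_peripheral = {t for t in peripheral if pmi(t) >= min_pmi}

    # Layer 3 (semantic): assumed criterion. A triple is "core" when no
    # similar word from the thesaurus can replace the dependent and still
    # form a frequent triple with the same head and relation.
    core = set()
    for head, rel, dep in semi_peripheral:
        substitutes = thesaurus.get(dep, ())
        if not any((head, rel, s) in peripheral for s in substitutes):
            core.add((head, rel, dep))

    return peripheral, semi_peripheral, core
```

Each layer only narrows the output of the previous one, so the core set is always a subset of the semi-peripheral set, which is in turn a subset of the peripheral set.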
doi:10.1007/978-3-319-25816-4_4 fatcat:hue7vvd7cbcnhed2prdoen4e4y