Slavic Corpus and Computational Linguistics

Dagmar Divjak, Serge Sharoff, Tomaž Erjavec
2017 Journal of Slavic Linguistics  
In this paper, we focus on corpus-linguistic studies that address theoretical questions and on computational linguistic work on corpus annotation that makes corpora useful for linguistic work. First, we discuss why the corpus-linguistic approach was discredited by generative linguists in the second half of the 20th century, and how it made a comeback through advances in computing and was adopted by usage-based linguistics at the beginning of the 21st century. Then, we move on to an overview of ... necessary and common annotation layers and the issues that are encountered when performing automatic annotation, with special emphasis on Slavic languages. Finally, we survey the types of research requiring corpora that Slavic linguists are involved in worldwide, and the resources they have at their disposal.
doi:10.1353/jsl.2017.0008 fatcat:i7iqxtcd5vejtlssz3cezrczgy