The file type is application/pdf.
Token Level Code-Switching Detection Using Wikipedia as a Lexical Resource
[chapter]
2018
Lecture Notes in Computer Science
We present a novel lexicon-based classification approach for code-switching detection on Twitter. The main aim is to develop a simple lexical look-up classifier based on frequency information retrieved from Wikipedia. We evaluate the classifier on three language pairs: Spanish-English, Dutch-English, and German-Turkish. The results indicate that our figures for Spanish-English are competitive with current state-of-the-art classifiers, even though the approach is simplistic and based solely on word frequency information.
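To make the idea of a frequency-based lexical look-up concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify tokenization, normalization, or tie-breaking, so everything below is an assumption. The table `TOY_FREQS` holds invented counts standing in for word frequencies that would be extracted from Wikipedia, and the names `relative_freq`, `tag_token`, and `tag_tweet` are hypothetical.

```python
from collections import Counter

# Invented toy counts standing in for per-language word frequencies
# that would be derived from Wikipedia. Not real data.
TOY_FREQS = {
    "en": Counter({"the": 500, "is": 300, "no": 40, "party": 20, "me": 60}),
    "es": Counter({"la": 450, "es": 320, "no": 200, "fiesta": 25, "me": 150}),
}


def relative_freq(token, lang):
    """Relative frequency of a token in one language's table (0.0 if unseen)."""
    counts = TOY_FREQS[lang]
    total = sum(counts.values())
    return counts[token] / total


def tag_token(token, langs=("en", "es"), unknown="unk"):
    """Assign the language in which the lowercased token is relatively most frequent.

    Assumption: tokens unseen in every table get the label `unknown`.
    """
    token = token.lower()
    scores = {lang: relative_freq(token, lang) for lang in langs}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else unknown


def tag_tweet(text):
    """Whitespace-tokenize a tweet (an assumption) and tag each token."""
    return [(tok, tag_token(tok)) for tok in text.split()]


if __name__ == "__main__":
    print(tag_tweet("la fiesta is over"))
```

Words shared by both languages, such as "no" or "me" in the toy tables, are resolved purely by which table gives them the higher relative frequency. That is the simplification the abstract alludes to when it calls the approach simplistic: no context from neighbouring tokens is used.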
doi:10.1007/978-3-319-73706-5_16
fatcat:toom7gz435f2bhkynu4v6plyge