Predicting Adolescents' Educational Track from Chat Messages on Dutch Social Media

Lisa Hilte, Walter Daelemans, Reinhild Vandekerckhove
2018 Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis  
We aim to predict Flemish adolescents' educational track based on their Dutch social media writing. We distinguish between the three main types of Belgian secondary education: General (theory-oriented), Vocational (practice-oriented), and Technical Secondary Education (hybrid). The best results are obtained with a Naive Bayes model, which reaches an F-score of 0.68 (std. dev. 0.05) in 10-fold cross-validation experiments on the training data and an F-score of 0.60 on unseen data. Many of the most informative features are character n-grams containing specific occurrences of chatspeak phenomena such as emoticons. While detecting the most theory-oriented and the most practice-oriented educational tracks seems to be a relatively easy task, the hybrid Technical level appears, as expected, to be much harder to capture based on online writing style.
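
To make the setup in the abstract concrete, the following is a minimal sketch of a Naive Bayes classifier over character n-grams, written with the Python standard library only. It is not the authors' implementation: the n-gram range (1 to 3), the add-one smoothing, the class name `CharNgramNB`, the labels `track_a` and `track_b`, and the four toy messages are all assumptions made for illustration, and the toy messages are invented rather than taken from the corpus.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n_min=1, n_max=3):
    """Return all character n-grams of length n_min..n_max (range is an assumption)."""
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


class CharNgramNB:
    """Hypothetical multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)      # documents per class (for priors)
        self.counts = defaultdict(Counter)     # n-gram counts per class
        for text, label in zip(texts, labels):
            self.counts[label].update(char_ngrams(text))
        self.vocab = {g for c in self.counts.values() for g in c}
        self.totals = {lab: sum(c.values()) for lab, c in self.counts.items()}
        return self

    def predict(self, text):
        n_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            # Log prior plus smoothed log likelihood of each known n-gram.
            score = math.log(self.doc_counts[label] / n_docs)
            for gram in char_ngrams(text):
                if gram in self.vocab:  # n-grams unseen in training are skipped
                    score += math.log((self.counts[label][gram] + 1)
                                      / (self.totals[label] + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data with arbitrary labels; not from the paper's corpus.
toy_texts = [
    "I have finished the assignment.",
    "We will discuss the chapter tomorrow.",
    "omg lol :p xxx",
    "hahaha so cooool :d :d",
]
toy_labels = ["track_a", "track_a", "track_b", "track_b"]

model = CharNgramNB().fit(toy_texts, toy_labels)
print(model.predict(":p lol"))
```

Character n-grams are a natural fit here because emoticons and other chatspeak markers are short character sequences that word-level tokenization tends to split or discard. Evaluating such a model as the abstract describes would additionally require 10-fold cross-validation and a held-out test set, which this sketch omits.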
doi:10.18653/v1/w18-6248 dblp:conf/wassa/HilteDV18 fatcat:blispywwmfhwbmoq4wasmp4w3u