CLUZH at VarDial GDI 2017: Testing a Variety of Machine Learning Tools for the Classification of Swiss German Dialects

Simon Clematide, Peter Makarov
2017 Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)  
Our submissions for the GDI 2017 Shared Task are the results of three different types of classifiers: Naïve Bayes, Conditional Random Fields (CRF), and Support Vector Machine (SVM). Our CRF-based run achieves a weighted F1 score of 65% (third rank), 0.9% behind the best system. Measured by classification accuracy, our ensemble run (Naïve Bayes, CRF, SVM) reaches 67% (second rank), 1% below the best system. We also describe our experiments with Recurrent Neural Network (RNN) architectures. Since they performed worse than our non-neural approaches, we did not include them in the submission.
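
The abstract reports two different metrics (weighted F1 and classification accuracy), and a run can rank differently under each. The following is a minimal sketch of how the two are commonly computed, not the shared task's official scorer. The function names `accuracy` and `weighted_f1` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of items whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def weighted_f1(gold, pred):
    """Per-class F1, averaged with weights equal to each class's gold support."""
    support = Counter(gold)
    total = 0.0
    for label, n in support.items():
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(gold)


# Toy data (hypothetical, not taken from the GDI data set).
gold = ["BE", "BE", "BE", "ZH"]
pred = ["BE", "BE", "BE", "BE"]

print(f"accuracy    = {accuracy(gold, pred):.3f}")
print(f"weighted F1 = {weighted_f1(gold, pred):.3f}")
```

On this toy input the classifier never predicts the minority class, so weighted F1 falls below accuracy. This is one way the two metrics can produce different rankings.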
doi:10.18653/v1/w17-1221 dblp:conf/vardial/ClematideM17 fatcat:nomlyj55nveylkw6q2h2sahieu