Gender prediction using lexical, morphological, syntactic and character-based features in Dutch

Evgenii Glazunov
2019 Computational Linguistics in the Netherlands  
This work is the result of participation in a shared task on gender detection in Dutch. The task was to predict gender within and across different genres. The work applies existing ideas about using lexical and more abstract text representations (morphological and syntactic labels, text bleaching). It compares different features across genres in two types of tasks, in-genre and cross-genre prediction, and presents two pipelines. Using three types of features, we found that lexical features are more significant, ... though other features also show good results, making the model more robust. Final scores were in the range 0.61-0.64 for in-genre and 0.53-0.56 for cross-genre prediction.
dblp:conf/clin/Glazunov19 fatcat:nuqa5oq35ne7rhhjnff6aewbnm