Cross-Genre Age and Gender Identification in Social Media

Anam Zahid, Aadarsh Sampath, Anindya Dey, Golnoosh Farnadi
2016 Conference and Labs of the Evaluation Forum  
This paper briefly describes the methods adopted for the author-profiling task of the PAN 2016 competition [1]. Author profiling is the task of predicting an author's age and gender from their writing. We follow a two-level ensemble approach to the cross-genre author profiling task, in which training and test documents come from different genres, and we use soft voting to build the classification ensemble. To include various ... re sets, we first train logistic regression models on word n-gram, character n-gram, and part-of-speech n-gram features extracted for each genre. We then combine the single-genre predictive models trained on the blog, social media, and Twitter data sources to build our multi-genre ensemble. The experimental results indicate that our approach performs well in both single-genre and cross-genre author profiling tasks.
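The two-level soft-voting idea described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each model already outputs class probabilities, that all models are weighted equally unless weights are passed, and that the first level combines the feature-set models within a genre while the second level combines the genre ensembles. The names `soft_vote`, `two_level_vote`, and `predict_label` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

# A prediction maps each class label to its predicted probability.
Probs = Dict[str, float]


def soft_vote(predictions: Sequence[Probs],
              weights: Optional[Sequence[float]] = None) -> Probs:
    """Soft voting: (weighted) average of class-probability predictions."""
    if not predictions:
        raise ValueError("need at least one prediction")
    if weights is None:
        # Assumption: equal weights when none are given.
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("one weight per prediction is required")
    total = float(sum(weights))
    return {
        label: sum(w * p[label] for w, p in zip(weights, predictions)) / total
        for label in predictions[0]
    }


def two_level_vote(per_genre: Dict[str, List[Probs]]) -> Probs:
    """Level 1 votes over the feature-set models of each genre;
    level 2 votes over the resulting single-genre ensembles."""
    genre_level = [soft_vote(models) for models in per_genre.values()]
    return soft_vote(genre_level)


def predict_label(probs: Probs) -> str:
    """Return the class with the highest averaged probability."""
    return max(probs, key=probs.get)


if __name__ == "__main__":
    # Made-up probabilities for one document, purely for illustration:
    # three feature-set models (word, character, POS n-grams) per genre.
    example = {
        "blog": [{"female": 0.6, "male": 0.4},
                 {"female": 0.7, "male": 0.3},
                 {"female": 0.4, "male": 0.6}],
        "social_media": [{"female": 0.3, "male": 0.7},
                         {"female": 0.5, "male": 0.5},
                         {"female": 0.4, "male": 0.6}],
        "twitter": [{"female": 0.8, "male": 0.2},
                    {"female": 0.6, "male": 0.4},
                    {"female": 0.7, "male": 0.3}],
    }
    combined = two_level_vote(example)
    print(predict_label(combined), combined)
```

Averaging probabilities rather than counting hard votes lets a confident model outweigh several uncertain ones, which is the usual motivation for soft voting.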
dblp:conf/clef/ZahidSDF16 fatcat:npeywd3ew5dddjb5fhfph3shuy