CapetownMilanoTirana for GxG at Evalita2018. Simple n-gram based models perform well for gender prediction. Sometimes. (Short Paper)

By: Angelo Basile, Gareth Dwyer, Chiara Rubagotti

Year: 2018

Abstract 

In this paper we describe our participation in the Evalita 2018 GxG cross-genre/domain gender prediction shared task for Italian. Building on previous results obtained on in-genre gender prediction, we assess the robustness of a linear model using n-grams in a cross-genre setting. We show that performance drops significantly when the training and testing genres differ. Furthermore, we experiment with abstract features in an attempt to capture genre-independent signals. We achieve an average F1-score of 0.55 on the official in-genre test set, which ranks first out of five submissions, and 0.51 on the cross-genre test set.
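
As a rough illustration of the kind of n-gram features a linear model like the one in the abstract could consume, here is a minimal sketch in plain Python. The abstract does not specify the feature set, so the function names, the n-gram ranges (character 3 to 5, word 1 to 2), and the `c:`/`w:` prefixes are all assumptions made for this example, not the authors' configuration.

```python
from collections import Counter


def char_ngrams(text, n_min=3, n_max=5):
    """Count character n-grams of length n_min..n_max (ranges are assumed)."""
    text = text.lower()
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def word_ngrams(text, n_min=1, n_max=2):
    """Count whitespace-token n-grams of length n_min..n_max (ranges are assumed)."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def extract_features(text):
    """Merge both n-gram views into one sparse feature dictionary.

    The "c:" and "w:" prefixes (hypothetical naming) keep character and
    word n-grams from colliding in the shared feature space.
    """
    features = Counter()
    for gram, count in char_ngrams(text).items():
        features["c:" + gram] = count
    for gram, count in word_ngrams(text).items():
        features["w:" + gram] = count
    return features
```

A feature dictionary of this shape can be handed to any linear classifier that accepts sparse counts. Because such features are largely lexical, they tend to reflect the vocabulary of the training genre, which is consistent with the cross-genre performance drop reported above.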
 
