CapetownMilanoTirana for GxG at Evalita2018. Simple n-gram based models perform well for gender prediction. Sometimes.(Short Paper)
By: Angelo Basile, Gareth Dwyer, Chiara Rubagotti
In this paper we describe our participation in the Evalita 2018 GxG crossgenre/domain gender prediction shared task for Italian. Building on previous results obtained on in-genre gender prediction, we try to assess the robustness of a linear model using n-grams in a crossgenre setting. We show that performance drops significantly when the training and testing genres differ. Furthermore, we experiment with abstract features in trying to capture genre-independent features. We achieve an average F1-score of 0.55 on the official in-genre test set—being thus ranked first out of five submissions—and 0.51 on the cross-genre test set.
Necessary cookies are absolutely essential for the website to function properly. This category only includes cookies that ensures basic functionalities and security features of the website. These cookies do not store any personal information.
Advertisement cookies are used to track visitors across websites. The intention is to display ads that are relevant and engaging for the individual user and thereby more valuable for publishers and third party advertisers.
Any cookies that may not be particularly necessary for the website to function and is used specifically to collect user personal data via analytics, ads, other embedded contents are termed as non-necessary cookies. It is mandatory to procure user consent prior to running these cookies on your website.