Shared Task on Cross-Genre Gender Detection in Dutch
The shared task organised in the context of CLIN29 in Groningen is concerned with binary gender prediction within and across different genres in Dutch. It is modelled on an existing shared task for Italian, GxG. Please note that we adopt a broad and non-technical notion of “genre”, mainly to indicate the different sources we obtain data from.
Important dates
October 5: train and dev data available
December 7: test data available
December 14: deadline for sending in predictions
December 21: results known for participants
January 18: short paper deadline
January 31: CLIN!
|Timeslot||Program (Benedenzaal 1)|
|16:15 - 16:30||Introduction by GxG organisers|
|16:30 - 17:00||Participants' presentations|
|17:00 - 17:15||Discussion (everyone!)|
Papers and Presentations
|Matej Martinc and Senja Pollak||Pooled LSTM for Dutch cross-genre gender classification [PDF]||Yes|
|Lennart Faber, Ian Matroos, Leon Melein and Wessel Reijngoud||Co-Training vs. Simple SVM Comparing Two Approaches for Cross-Genre Gender Prediction [PDF]||Yes|
|Rianne Bos, Kelly Dekker and Harmjan Setz||Embedding and Clustering for Cross-Genre Gender Prediction [PDF]||Yes|
|Eva Vanmassenhove, Amit Moryossef, Alberto Poncelas, Andy Way and Dimitar Shterionov||ABI Neural Ensemble Model for Gender Prediction [PDF]||Yes|
|Eduardo Brito, Rafet Sifa and Christian Bauckhage||Two Attempts to Predict Author Gender in Cross-Genre Settings in Dutch [PDF]||Yes|
|Gerlof Bouma||Exploring Combining Training Datasets for the CLIN 2019 Shared Task on Cross-genre Gender Detection in Dutch [PDF]||No|
|Evgenii Glazunov||Gender prediction using lexical, morphological, syntactic and character-based features in Dutch [PDF]||No|
Shared task participants are invited to submit papers describing their systems and results. The papers should be a maximum of 5 pages excluding references, and should use the ACL 2018 2-column style file. All papers will be peer-reviewed by at least 1 member of the program committee. Good papers should describe the systems in sufficient detail and provide insight into what was effective for performance on the shared task and what was not.
The program committee consists of:
- Malvina Nissim (University of Groningen)
- Walter Daelemans (University of Antwerp)
- Masha Medvedeva (University of Groningen)
- Tim Kreutz (University of Antwerp)
- Hessel Haagsma (University of Groningen)
Given a (collection of) text(s) from a specific genre, the gender of the author has to be predicted. The task is cast as a binary classification task, with gender represented as F (female) or M (male). Gender prediction will be done in two ways:
- using a model which has been trained on the same genre
- using a model which has been trained on anything but that genre
A crucial aspect of this task is the design of the settings, as they are key to shedding light on the core question: are there indicative traits across genres that can be leveraged to model gender in a rather genre-independent way?
This question will be answered by making participants train and test their models on datasets from different genres. For comparison, participants will also submit genre-specific models that will be tested on the very same genre they have been trained on. In-genre modelling will (i) shed light on which genres might be easier to model, i.e. where gender traits are more prominent; and (ii) make it easier to quantify the loss when modelling gender across genres.
More specifically, participants will be asked to submit up to six different models:
|Twitter in-genre model||non-Twitter model for Twitter|
|YouTube in-genre model||non-YouTube model for YouTube|
|News in-genre model||non-News model for News|
In the cross-genre setting, the only constraint is that participants must not train on any instance from the genre they are testing on. Other than that, participants are free to combine the other datasets as they wish.
Participants are also free to use external resources as they wish, provided the cross-genre settings are carefully preserved, and everything used is described in detail.
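As a minimal sketch of how the six settings can be assembled, the snippet below builds one in-genre and one cross-genre training set per target genre. The genre keys, the `build_training_sets` helper and the `(text, label)` data layout are assumptions made for illustration only; they are not part of the official data format.

```python
# Hypothetical genre keys; the released data may name them differently.
GENRES = ("twitter", "youtube", "news")


def build_training_sets(data):
    """Map (setting, target genre) to the training instances allowed for it.

    `data` maps each genre to a list of (text, label) pairs, label "F" or "M".
    """
    settings = {}
    for target in GENRES:
        # In-genre: train only on the genre that will be tested.
        settings[("in-genre", target)] = list(data[target])
        # Cross-genre: train on everything except the target genre.
        settings[("cross-genre", target)] = [
            example
            for genre in GENRES
            if genre != target
            for example in data[genre]
        ]
    return settings


# Toy placeholder instances, for illustration only.
toy = {
    "twitter": [("tweet text", "F")],
    "youtube": [("comment text", "M")],
    "news": [("article text", "F")],
}
training_sets = build_training_sets(toy)
print(len(training_sets))  # six settings in total
```

Combining the two remaining genres is only one option here; using a single other genre, or external resources, would equally satisfy the constraint.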
This is a binary classification task with balanced classes. As is standard in author profiling, we will evaluate performance using accuracy.
In order to derive two final scores, one for the in-genre and one for the cross-genre settings, we will simply average over the three accuracies obtained per genre. We will keep the two rankings separate. For determining the official “winner”, we will use the cross-genre ranking.
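The scoring described above can be sketched as follows. The function names and the per-genre accuracy values are illustrative placeholders, not the organisers' evaluation script or actual results.

```python
def accuracy(gold, predicted):
    """Fraction of instances whose predicted label matches the gold label."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equally long")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def final_score(per_genre_accuracy):
    """Unweighted mean of the per-genre accuracies for one setting."""
    return sum(per_genre_accuracy.values()) / len(per_genre_accuracy)


# Accuracy on a toy set of four instances, three of them predicted correctly.
toy_accuracy = accuracy(["F", "M", "F", "M"], ["F", "M", "M", "M"])

# Placeholder accuracies for one setting (for example, cross-genre).
cross_genre = {"twitter": 0.5, "youtube": 0.75, "news": 0.25}
cross_genre_score = final_score(cross_genre)
```

The same `final_score` call would be applied separately to the three in-genre accuracies, giving the second ranking.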
For all settings, given that the datasets are balanced for gender distribution, random assignment yields a 50% baseline.
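A small simulation illustrates the baseline. It is a sketch only: the helper name, the seed and the toy label list are assumptions, and the exact value printed depends on the seed, but on a balanced set it stays close to 0.5.

```python
import random


def random_baseline_accuracy(gold, seed=0):
    """Accuracy of labels drawn uniformly at random from F and M."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    guesses = [rng.choice("FM") for _ in gold]
    return sum(g == p for g, p in zip(gold, guesses)) / len(gold)


# Balanced toy gold labels: 5000 F and 5000 M.
balanced_gold = ["F", "M"] * 5000
baseline = random_baseline_accuracy(balanced_gold)
print(baseline)
```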
We use Dutch data from the following three genres:
- Twitter
- YouTube
- News
We might also introduce a new ‘secret’ genre only at test time, to further test the portability of models without any specific tuning, but this will be confirmed later on.
Gender distribution is balanced in all datasets (50/50), and datasets in all genres are of comparable sizes in terms of tokens.
Data is made available to participants through a link provided upon request. Requests must be made via email (see Contact below), after which access to the data will be granted.
- CLCG Groningen (Malvina Nissim, Hessel Haagsma, Masha Medvedeva)
- CLiPS Antwerp (Walter Daelemans, Tim Kreutz)
- Koninklijke Bibliotheek (Martijn Kleppe, Steven Claeyssens)