Processing and enriching social media data: The South Slavic perspective, and beyond

Social media are known to be a diverse and rich source of information for various areas of research. Processing social media texts with standard language technologies, however, yields error rates several times higher than on standard texts. Furthermore, however rich the social media sources are, researchers regularly need additional data enrichment, such as that provided by author profiling. In the first part of my talk I will present a series of technologies for South Slavic languages aimed at non-standard data, such as standardness predictors, normalisers and taggers. In the second part I will present experiments on language-independent author profiling tasks such as user type identification and gender prediction.

Nikola Ljubešić is an assistant professor at the University of Zagreb and a postdoctoral researcher at the Jožef Stefan Institute in Ljubljana. His main research interests are linguistic processing of South Slavic languages, processing of non-standard language, social media analytics and computational social science.

nikola.txt · Last modified: 2019/02/06 16:03 (external edit)