Loanword-o-meter: Studying Dutch Loanwords across Genre and over Time
Kaspar Beelen, Nicoline van der Sijs, Joey Stofberg and Martin Reynaert
It is a well-known fact that language contact results in language change (Van Coetsem 1988). This is most clearly visible in the transfer of loanwords. For Dutch, these loanwords have been extensively described and analyzed (Van der Sijs 2001, 2005, Van Veen & Van der Sijs 1997). These descriptions were by necessity based on lexicographical reference works, not on corpora.
Currently there are lively discussions in the press about whether the influence of loanwords, especially English loanwords, on Dutch is increasing, and whether this holds for all genres and sociolects or only for some. To substantiate these discussions with empirical evidence we have built a loanword-o-meter: a simple computational tool that measures the number and frequency of loanwords in Dutch texts and categorizes these words by their language of origin.
In this way we can answer research questions such as: have the number and frequency of loanwords changed over the last decades? Is a preference for loanwords associated with genre, with sociological variables such as age or gender, or with political inclination?
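The kind of measurement described above can be illustrated with a minimal Python sketch. It is not the actual loanword-o-meter: the lexicon `TOY_LEXICON`, the function name `measure_loanwords`, and the simple regular-expression tokenization are all assumptions made for illustration. It counts loanword types (distinct words) and tokens (occurrences) in a text and groups the tokens by language of origin.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: lowercase word form -> language of origin.
# A real lexicon would come from etymological data; these few entries
# are illustrative only.
TOY_LEXICON = {
    "computer": "English",
    "weekend": "English",
    "cadeau": "French",
    "bureau": "French",
    "überhaupt": "German",
}


def measure_loanwords(text, lexicon):
    """Count loanword types and tokens in `text`, grouped by origin."""
    # Naive tokenization (an assumption): lowercase runs of word characters.
    tokens = re.findall(r"\w+", text.lower())

    # Occurrences of each word form that appears in the lexicon.
    loan_counts = Counter(t for t in tokens if t in lexicon)

    # Sum the occurrences per language of origin.
    by_origin = Counter()
    for word, n in loan_counts.items():
        by_origin[lexicon[word]] += n

    loan_tokens = sum(loan_counts.values())
    return {
        "total_tokens": len(tokens),
        "loanword_types": len(loan_counts),
        "loanword_tokens": loan_tokens,
        "loanword_share": loan_tokens / len(tokens) if tokens else 0.0,
        "tokens_by_origin": dict(by_origin),
    }


if __name__ == "__main__":
    sample = (
        "In het weekend kocht ik een computer en een cadeau. "
        "De computer staat op mijn bureau."
    )
    print(measure_loanwords(sample, TOY_LEXICON))
```

Running the same function over texts grouped by decade or genre and comparing `loanword_share` would give a first, rough view of change over time. A realistic setup would also need lemmatization and handling of homographs, which this sketch leaves out.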
In our presentation we will discuss the workings of the tool. The application, a Python script, was developed by intern Stofberg under the supervision of Beelen, using the data from Van der Sijs. The INT (Instituut voor de Nederlandse Taal) has built the application into the PICCL pipeline created by Reynaert (Reynaert et al. 2015). We will end our presentation with a few case studies.