Description


Introduction

Listening to radio or television, one notices the tendency for standard Dutch (ABN) to become more and more differentiated, i.e. regionally colored. Hoppenbrouwers (1990) showed the opposite tendency for dialects. Influenced by standard Dutch and by each other, they have become less differentiated and have fused into larger wholes: regiolects (see also Hinskens (1993), Auer & Hinskens (1996), and Hinskens, Auer & Kerswill (2005)). While earlier scholars usually described this change in terms of single linguistic phenomena, we plan to investigate it using modern web-based and computational techniques in order to obtain an overall picture. Our goal is to examine how the change from dialects to regiolects is reflected in the production and perception of the dialect speakers. The results of our research will give insight into the nature of language change and dialect levelling. The research is also relevant for historical linguists, since it gives information about the direction and rates of sound change.

The research will be based on representative Dutch dialects of approximately 80 locations in the Netherlands and North Belgium. Perceptive distances are obtained on the basis of a web survey in which speakers listen to recordings. Computational distances are found on the basis of the transcriptions of the recordings. In the experiments two groups are distinguished: conservative dialect speakers (old males) and innovative dialect speakers (young females).

We will test three hypotheses. First, perceptive distance measurements which are based on the recordings of innovative speakers will suggest larger areas than those which are based on the recordings of conservative speakers. Second, the change from dialect to regiolect affects the lexical level ('kopstubber' becomes 'roagebol') more strongly than the phonological ('hoes' becomes 'huus') and phonetic levels. Third, this change also affects the perception of the speakers, but perception lags behind production.

The project started on November 1, 2007 and ends on October 31, 2011. The host institution is the Meertens Institute in Amsterdam. The supervisor is Prof. Dr. F.L.M.P. Hinskens. The international supervising group consists of the following members:

    Prof. Dr. April McMahon, The University of Edinburgh / School of Philosophy, Psychology and Language Sciences / Linguistics and English Language, Edinburgh, United Kingdom
    Prof. Dr. Brian Joseph, The Ohio State University / Department of Linguistics, Columbus, Ohio, United States of America
    Prof. Dr. Hans Goebl, Universität Salzburg / Fachbereich Romanistik, Salzburg, Austria
    Prof. Dr. Ir. John Nerbonne, University of Groningen / Faculty of Arts / Department of Humanities Computing, Groningen, The Netherlands



Research topic

Listening to radio or television, one notices the tendency for standard Dutch (ABN) to become more and more differentiated, i.e. regionally colored. The standard speech of many speakers betrays the areas they come from. While the standard language becomes more differentiated, the opposite tendency can be observed for dialects. Influenced by standard Dutch and by each other, they become less differentiated and fuse into larger wholes: regiolects. This change is extensively described by Hoppenbrouwers (1990).

A regiolect is a continuum of intermediate language forms which includes the whole structural space between dialect and standard language (Hoppenbrouwers (1990), p. 84, see also Hinskens (1993), Auer & Hinskens (1996) and Hinskens, Auer & Kerswill (2005)). Regiolects are the result of increased mobility and migration on the one hand, and the influence of the standard language in education and communication on the other hand. Important sociolinguistic factors are the speakers' age, sex, education and degree of urbanization (Hoppenbrouwers (1990), pp. 86 and 172), where old, rural, poorly educated males and young, urban, highly educated females are the extremes [conservative, traditional dialect] and [innovative, regiolect], respectively.

The goal of this research is to examine how the change from Dutch dialects to regiolects is reflected in the production and perception of the dialect speakers. We would like to test three hypotheses:

    (1) Perceptive distance measurements which are based on the recordings of innovative speakers will suggest larger and less sharply distinguished areas than those which are based on the recordings of conservative speakers. We expect that especially relatively small dialect areas which comprise only a few places, will be fused with larger areas.
    (2) This change affects variation at the lexical level ('kopstubber' becomes 'roagebol') more strongly than variation at the lexical phonological level ('hoes' becomes 'huus'). The lexical phonological level is affected more strongly than the postlexical level (e.g. sandhi phenomena), which in turn is affected more strongly than the purely phonetic level (for example dialect-specific pronunciations of speech segments). For all levels we expect that the most significant changes will be found in areas where dialects are relatively distant from the standard language.
    (3) The change from dialect to regiolect, found in speech production, influences the speakers’ perception. Conservative speakers perceive dialect groups (many small groups), innovative speakers perceive regiolect groups (fewer and larger groups). Since perception mainly follows production, speakers’ production will change more than their perception.
In the Motie van tolerantie en attentie, edited by Siemon Reker in 2001, we read that saving dialect material in archives and the study of dialect change and dialect loss should contribute to an accurate evaluation of the embedding, relevance and emotional value of dialects in our society. Our research is intended to be such a contribution.


Web survey

In the Atlas of the Netherlands (Smidt, Schuurmans & Ploeger, 1963-1978) a map is included showing the dialect landscape divided into 28 groups. The map was compiled by Jo Daan in 1969. The Dutch part is based on intuitions about dialect similarity of 1500 respondents, collected in 1939. These intuitions may be linguistically based for the greater part, but they are probably also influenced by social and/or economic factors. An experiment that was probably less influenced by non-linguistic factors was carried out by Charlotte Gooskens (Gooskens (2005), see also Gooskens & Heeringa (2004)). In each of 15 locations in Norway a school class listened to fragments of each of the 15 varieties. For each dialect the pupils judged the distance to their own dialect on a scale from 1 (similar to native dialect) to 10 (not similar to native dialect). This works especially well for the Norwegian situation, where everyone speaks dialect. In the present-day Dutch situation at most a few pupils speak dialect. A more practical approach is the use of a web survey, such as the one carried out by a small group of students at the University of Groningen under our supervision. Fragments of 11 Germanic varieties were put online, and native speakers of these varieties were asked to listen to the fragments and to rate the distance to their mother tongue on a scale from 1 (no distance) to 10 (maximum distance). We adopt the latter approach, with three modifications.

    (1) To obtain results which are detailed enough, our research will be based on approximately 80 dialects. Most of the 28 groups on Daan’s map will be represented by at least one, and usually two or three dialects. Since doing the experiment should not take too long for a subject, subjects get presented recordings of their own dialect (or the geographically closest one), the five geographically closest dialects and nine randomly chosen dialects in the rest of the Dutch area, so fifteen dialects in total.
    (2) Our aim is to study dialect change. We will select a contemporary text and collect two recordings of this text for each dialect, one pronounced by an old male, and one by a young female. In the web survey we distinguish: a. old males listening to old males, b. old males listening to young females, c. young females listening to old males, and d. young females listening to young females.
    (3) Since subjects are native speakers of different dialects, we have to ensure that their judgments are given on the same scale. In the web survey the meaning of the ratings will be explained as follows: 1=variety is equal, 4=variety is different, but I would not switch to standard Dutch in conversation with the recorded speaker, 7=variety is different, I would switch to standard Dutch, 10=variety is maximally different. The subject will be told that the whole range of the scale need not be used.
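
The stimulus selection described under (1) can be illustrated with a minimal sketch. It assumes that every recorded dialect has map coordinates and that straight-line distance is an acceptable stand-in for geographic distance. The function name, the coordinate format and the toy data are hypothetical and are not part of the project design.

```python
import math
import random


def select_stimuli(own, coords, n_nearest=5, n_random=9, seed=None):
    """Return the dialects presented to one subject.

    own    -- the subject's own dialect (or the geographically closest one)
    coords -- dict mapping dialect name to hypothetical (x, y) map coordinates
    """
    rng = random.Random(seed)
    own_x, own_y = coords[own]
    # Sort all other dialects by straight-line distance to the own dialect.
    others = sorted(
        (name for name in coords if name != own),
        key=lambda name: math.hypot(coords[name][0] - own_x,
                                    coords[name][1] - own_y),
    )
    nearest = others[:n_nearest]
    # Draw the remaining stimuli at random from the rest of the area.
    randoms = rng.sample(others[n_nearest:], n_random)
    return [own] + nearest + randoms


# Toy example: 80 made-up locations on a line (assumption, not real data).
example_coords = {f"d{i}": (float(i), 0.0) for i in range(80)}
example_stimuli = select_stimuli("d0", example_coords, seed=1)
print(len(example_stimuli))  # 15 dialects in total
```

With real locations, a great-circle distance on latitude and longitude would be the more natural choice. The presentation order would presumably be randomised as well, but the design above does not specify this.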


Questions to be answered

The perceptive measurements of the web survey, together with transcriptions of the recordings used in the web survey, will enable us to answer the following questions:

1. Are dialects changing into regiolects?

The question will be answered on the basis of the judgments we obtain in the web survey. On the basis of these judgments, the dialects will be clustered, i.e. classified into groups so that similar dialects are in the same group. For both sets of judgments (those based on the recordings of the old males and those based on the recordings of the young females) we will determine the natural number of groups (clusters). We will use the elbow criterion, which says that the number of clusters should be chosen so that adding another cluster does not add significant information. If the percentage of variance explained by the clusters is plotted against the number of clusters, the first clusters will add much information (explain a lot of variance), but at some point the marginal gain will drop, giving an angle in the graph (the elbow) (Aldenderfer & Blashfield, 1984). We will also consider the L method, an efficient algorithm that finds the "knee" in a 'number of clusters vs. clustering evaluation metric' graph. The method was introduced by Salvador & Chan (2004). In this way we test our first hypothesis that dialect areas have fused into larger and less sharply distinguished areas, namely regiolects. We may also test the hypothesis that especially small dialect areas fuse with larger ones.

2. Is the lexical level affected more strongly than the phonological and phonetic levels?

All of the recordings will be transcribed and digitized. The digitized transcriptions are the input for the computational procedures with which the distances are calculated. For the lexical level, we will use either a simple binary measure, in which two forms are either equal (0) or different (1), or Goebl’s weighted similarity measure, a method in which the coincidence of rarely used forms counts more heavily than that of more frequent ones (Goebl (1984), p. 85; for application to Dutch lexical distances see Heeringa & Nerbonne (2006)). Lexical phonological, postlexical and purely phonetic differences are measured using Levenshtein distance, a string edit distance measure (for application to Dutch see Heeringa (2004) and Heeringa & Nerbonne (2006)).
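The two simplest measures mentioned above can be illustrated with a short sketch. It makes several assumptions: orthographic strings stand in for phonetic transcriptions, every edit operation costs 1, and the binary lexical measure is averaged over items. The cited work may use refined segment costs and length normalisation. Goebl's weighted measure is not shown.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions
    needed to turn sequence a into sequence b (unit costs)."""
    previous = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, start=1):
        current = [i]
        for j, seg_b in enumerate(b, start=1):
            cost = 0 if seg_a == seg_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def binary_lexical_distance(words_a, words_b):
    """Proportion of items for which two dialects use different lexemes.

    Each pair of forms counts as equal (0) or different (1); averaging
    over the items is an assumption made for this illustration.
    """
    if len(words_a) != len(words_b):
        raise ValueError("both dialects need a form for every item")
    differing = sum(1 for wa, wb in zip(words_a, words_b) if wa != wb)
    return differing / len(words_a)


# Examples taken from the hypotheses above, in orthographic form.
print(levenshtein("hoes", "huus"))  # → 2
print(binary_lexical_distance(["kopstubber", "huus"],
                              ["roagebol", "huus"]))  # → 0.5
```

Because `levenshtein` accepts any sequence, a transcription can also be passed as a list of segment symbols, so that a segment with diacritics is compared as one unit.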

For each linguistic level the measurements will be performed on the basis of the old male speakers and on the basis of the young female speakers separately. First, we will determine whether the number of natural groups found on the basis of the latter measurements is smaller than the number of natural groups based on the former measurements, as expected if dialect areas fuse into larger regiolect areas. Second, we will test on which linguistic level the difference between the two classes is largest. In this way we test the second hypothesis that the lexical level will be affected more strongly than the phonological and phonetic levels.

Additionally, we will compare the dialects to standard Dutch for each linguistic level. We expect that the recordings of the young female speakers will be closer to standard Dutch than those of the old male speakers. Similar research was carried out by Heeringa & Nerbonne (2000) and Heeringa et al. (2000), but since we measure the degree of convergence per linguistic level, we are able to answer the question of which linguistic level shows convergence most clearly. Per level we may test the hypothesis that the change from dialect to regiolect especially affects areas where dialects are relatively distant from the standard language.

3. Is the perception of the speakers affected? Has the speech production of the speakers changed more than the speakers’ perception?

These questions will be answered on the basis of the judgments which are obtained with the web survey. When both old males and young females listen to recordings of the same class (either old males or young females), we expect that the young female judgments will suggest fewer groups. This would confirm our third hypothesis that the perception of the speakers has changed from distinguishing dialects to distinguishing regiolects. Janson (1983) writes that 'for an individual in a situation of change, perception seems to lag behind production'. We will compare the contrasts in number and size of groups between the old males and young females at the perceptive level with the contrasts found at the production level, thus testing the hypothesis that perception lags behind production in the change from dialects to regiolects.


Innovation

There exist numerous studies about language change and dialect convergence. In contrast to many of those studies, we do not focus on particular eye-catching phenomena, but want to study these phenomena systematically on the basis of large amounts of data, using modern web-based and computational techniques to obtain an overall picture of this change, which affects both the production and the perception of dialect speakers.

The results of the web survey on the one hand, and the computational phonetic transcription-based measurements on the other hand, enable us to investigate whether there is a change in the dialects themselves, i.e. in the production of the speakers, and whether dialect speakers are becoming less sensitive to (minor) dialect differences, i.e. whether there is a change in the perception of the speakers. The results of our research will give insight into the nature of language change and dialect levelling. The research may also be important for historical linguists since it gives us quantitative information about the direction and rates of sound change.

Furthermore, this project will result in a large set of dialect recordings, accessible for both scientists and non-scientists, together with consistent phonetic transcriptions made by one transcriber. We will also obtain a large database of dialect perceptions, i.e. the judgments of dialectal speakers concerning their own and other varieties. The database will serve to test our hypotheses about language change, but should also prove to be a valuable resource for future work by other researchers.


References

Aldenderfer, M.S. & R.K. Blashfield (1984). Cluster Analysis. Newbury Park (CA): Sage.

Auer, P. & F. Hinskens (1996). 'The convergence and divergence of dialects in Europe. New and not so new developments in an old area'. In: U. Ammon, K. J. Mattheier & P. H. Nelde (eds.), Sociolinguistica, International Yearbook of European Sociolinguistics, volume 10, Convergence and divergence of dialects in Europe. Tübingen: Max Niemeyer Verlag, pp. 1-30.

Daan, J. & D. P. Blok (1969). Van randstad tot landrand. Toelichting bij de kaart: dialecten en naamkunde. Bijdragen en mededelingen der Dialectencommissie van de Koninklijke Nederlandse Akademie van Wetenschappen te Amsterdam 37, Amsterdam: N.V. Noord-Hollandsche uitgevers maatschappij.

Goebl, H. (1984). Dialektometrische Studien. Anhand italoromanischer, rätoromanischer und galloromanischer Sprachmaterialien aus AIS und ALF, volume 191, 192 and 193 of Beihefte zur Zeitschrift für romanische Philologie. Tübingen: Max Niemeyer Verlag. With assistance of S. Selberherr, W.-D. Rase and H. Pudlatz.

Goeman, A. C. M. (1999). 'Dialects and the Subjective Judgments of Speakers'. In: D. Preston (ed.), Handbook of Perceptual Dialectology, volume I. Amsterdam and Philadelphia: John Benjamins Publishing Company, pp. 135-144. Translated by B. E. Evans.

Gooskens, Ch. (2005). 'How well can Norwegians identify their dialects?' In: Nordic Journal of Linguistics, 28 (1), pp. 37-60.

Gooskens, Ch. & W. Heeringa (2004). 'Perceptive Evaluation of Levenshtein Dialect Distance Measurements using Norwegian Dialect Data'. In: Language Variation and Change, volume 16, number 3, pp. 189-207.

Gooskens, Ch. & W. Heeringa (2006). 'The relative contribution of pronunciation, lexical and prosodic differences to the perceived distances between Norwegian dialects'. In: J. Nerbonne & W. Kretzschmar, Jr. (eds.), Literary and Linguistic Computing, special issue on Progress in Dialectometry: Toward Explanation, volume 21, number 4, pp. 477-492.

Heeringa, W., J. Nerbonne, H. Niebaum, R. Nieuweboer & P. Kleiweg (2000). 'Dutch-German Contact in and around Bentheim'. In: D. Gilbers, J. Nerbonne & J. Schaeken (eds.), Languages in Contact. Studies in Slavic and General Linguistics, Volume 28, Amsterdam and Atlanta GA: Rodopi, pp. 145-156.

Heeringa, W. & J. Nerbonne (2000). 'Change, Convergence and Divergence among Dutch and Frisian'. In: P. Boersma, Ph. H. Breuker, L. G. Jansma & J. van der Vaart (eds.), Philologia Frisica Anno 1999. Lêzingen fan it fyftjinde Frysk filologekongres, Ljouwert: Fryske Akademy, pp. 88-109.

Heeringa, W. (2004). Measuring dialect pronunciation differences using Levenshtein distance. PhD thesis, University of Groningen, Groningen.

Heeringa, W. & J. Nerbonne (2006). 'De analyse van taalvariatie in het Nederlandse dialectgebied: methoden en resultaten op basis van lexicon en uitspraak'. In: Nederlandse Taalkunde, volume 11, number 3, pp. 218-257.

Hinskens, F. (1993). 'Dialectnivellering en regiolectvorming. Bevindingen en beschouwingen'. In: F. Hinskens, C. Hoppenbrouwers & J. Taeldeman (eds.), Dialectverlies en Regiolectvorming, special issue of Taal en Tongval, number 6, pp. 40-61.

Hinskens, F., P. Auer & P. Kerswill (2005). 'The study of dialect convergence and divergence: conceptual and methodological considerations'. In: P. Auer, F. Hinskens & P. Kerswill (eds.), Dialect change. The convergence and divergence of dialects in contemporary societies. Cambridge: Cambridge University Press, pp. 1-48.

Hoppenbrouwers, C. (1990). Het regiolect; van dialect tot Algemeen Nederlands. Muiderberg: Coutinho.

Janson, T. (1983). 'Sound change in perception and production'. In: Language, volume 59, number 1, pp. 18-34.

Salvador, S. & P. Chan (2004). 'Determining the number of clusters/segments in hierarchical clustering/segmentation algorithms'. In: Proceedings of the 16th IEEE International Conference on Tools with AI, pp. 576-584.

Smidt, M. de, F. Schuurmans & H. A. Ploeger (1963-1978). Atlas of the Netherlands. Compiled by the Foundation for the Scientific Atlas of the Netherlands. The Hague: Government Printing and Publishing Office.