Dialectology: Aggregate Dialectal Variation

Instructor: John Nerbonne (course under development)
Course Number: LSA.107
Mon. & Wed. 10:10-11:50, June 27-July 13
2005 Linguistics Institute (Harvard/MIT)

2005 Linguistics Institute Course, from left: John Nerbonne, Jonathan Gajdos, Kari Hiltula, Claire Insel, Thea Park, Nynke de Haas, Anne Ribbert, Michiel Verhagen, Rachel Utain-Evans, Holman Tse, Griet Coupe, Pamela Rutecki, Eric Mayfield, Joseph ?, Kotoe Tashiro, ?.

Announcements 2005

Sat. 9/7 I've added three exercises and also specified that students who wish to receive credit for the course should turn in at least four pages of exercises. The exercise on finding isoglosses counts as one page.
Wed. 29/6 Thanks to Jonathan Gajdos, the pointer to the LAMSAS site has been updated.
Wed. 29/6 I've added some exercises.

Description

This course will focus on what Hans Goebl has called the "linguistic management of space," that is, how language variation is structured geographically. We shall begin with basic ideas from categorical data analysis, which we shall apply to lexical and syntactic data, and then examine a technique from computational linguistics (edit distance, also known as Levenshtein distance) for analyzing sequences of segments in pronunciation. We shall examine questions of validity and consistency, and show how to visualize analyses using the L04 package, with the emphasis on visualization for exploration and understanding. Time permitting, we shall turn to one or two advanced topics, e.g., explanatory models of the geographic conditioning of language variation and/or the role of linguistic structure in the geographic distribution.
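As a rough illustration of the edit-distance idea, here is a minimal Python sketch. It assumes unit costs for insertion, deletion and substitution, and it treats each pronunciation as a list of segments. The function name and the two transcriptions are invented for illustration; this is not the L04 implementation, which offers more refined segment costs.

```python
def levenshtein(source, target):
    """Minimum number of insertions, deletions and substitutions needed
    to turn `source` into `target`, each operation costing 1 (an assumption)."""
    # previous[j] holds the distance between the segments of `source`
    # processed so far and the first j segments of `target`.
    previous = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        current = [i]  # turning i source segments into nothing costs i deletions
        for j, t in enumerate(target, start=1):
            substitution = previous[j - 1] + (0 if s == t else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


# Two hypothetical pronunciations of the same word, one segment per list item.
variant_a = ["m", "ɪ", "l", "k"]
variant_b = ["m", "ɛ", "l", "ə", "k"]
print(levenshtein(variant_a, variant_b))  # prints 2: one substitution, one insertion
```

Working on lists of segments rather than raw strings keeps multi-character symbols (for example a vowel plus a diacritic) together as one unit, provided the transcription has been split into segments beforehand.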

This three-week course precedes Bill Kretzschmar's three-week course on the feature-based analysis of language variation. We have coordinated the two courses with Kretzschmar: this one deliberately concentrates on the analysis of large aggregates, while his will concentrate on analyses based on single features.

The course assumes no familiarity with dialectology or computational techniques. Some basic linguistics is helpful, as is an unintimidated attitude toward software. We have three goals:

to show how aggregate analysis works, and give participants a chance to learn it
to provide tools for exploring and evaluating analyses
to compare aggregate analysis with other sorts of analyses

Students will have the opportunity to practice aggregate analysis of language variation on data from American dialects.
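To give a feel for what "aggregate" means here, the following Python sketch computes a categorical lexical distance between every pair of sites as the share of concepts for which the two sites give different words. The site names, the responses and the function names are all invented toy material, not LAMSAS data, and the simple proportion used here is only one assumed way of aggregating.

```python
from itertools import combinations

# Hypothetical toy data: site -> {concept: word recorded at that site}.
responses = {
    "SiteA": {"dragonfly": "snake doctor", "pail": "bucket", "pancake": "flapjack"},
    "SiteB": {"dragonfly": "snake feeder", "pail": "bucket", "pancake": "pancake"},
    "SiteC": {"dragonfly": "darning needle", "pail": "pail", "pancake": "pancake"},
}


def lexical_distance(site_a, site_b):
    """Share of concepts recorded at both sites for which the words differ."""
    shared = [concept for concept in site_a if concept in site_b]
    if not shared:
        raise ValueError("the two sites have no concepts in common")
    differing = sum(1 for concept in shared if site_a[concept] != site_b[concept])
    return differing / len(shared)


def distance_matrix(data):
    """Distance for every unordered pair of sites, keyed by (site, site)."""
    return {
        (first, second): lexical_distance(data[first], data[second])
        for first, second in combinations(sorted(data), 2)
    }


for pair, distance in distance_matrix(responses).items():
    print(pair, round(distance, 2))
```

No single concept decides the outcome: each one contributes equally to the site-by-site distance, and it is the resulting table of distances, not any individual feature, that is then explored and visualized.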

Schedule

Introduction. Concepts & History.
Measuring Lexical Differences
Measuring Pronunciation Differences
Validating and Calibrating.
Location, location, location. Why are there dialect differences?
Linguistic Structure in Dialect Differences

Readings.

Introduction. Chambers & Trudgill (1998) Chap. 9; Nerbonne & Kretzschmar (2003).
Measuring Lexical Differences. Nerbonne & Kleiweg (2003).
Measuring Pronunciation Differences. Nerbonne, Heeringa & Kleiweg (1999); Heeringa (2004) Chap. 5.
Validating and Calibrating. Heeringa, Nerbonne & Kleiweg (2002); Heeringa (2004) Chap. 7.
Location, location, location. Why are there dialect differences? Heeringa & Nerbonne (2002); Chambers & Trudgill (1998) Chap. 11; Nerbonne, van Gemert & Heeringa (submitted).
Linguistic Structure in Dialect Differences. Nerbonne (submitted).

Literature

See literature list.

Some Useful Links

LAMSAS home page
Dialectometric work on LAMSAS at the University of Groningen
Peter Kleiweg's L04, a dialectometric package focusing on the use of Levenshtein distance to assay pronunciation difference. Note especially that there is a tutorial on its use.

An online demo of how Levenshtein distance works.
Hans Goebl's Dialectometry Project.

Exercises

The exercises are available here.

Further work?

There are two graduate student stipends available at the University of Groningen to work on dialectometry. More information here.

John Nerbonne

Last modified: Fri Oct 7 17:22:37 CEST 2005