RuG/L04

Tutorial

1. Introduction

RuG/L04 is software for dialectometrics and cartography, and it can be put to other uses as well.

In plain terms: suppose that, at several locations in some area, you have collected the local pronunciation variants of a large number of words. Using RuG/L04, you can compare these words and make dialect maps of the results.

You start by making a distance table: a table that holds, for each pair of locations, a number expressing how much the words differ between those two locations. For each word, the variant forms from the two locations are compared, and the final difference between the two locations is the average of all these word differences.
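The averaging step can be sketched in a few lines of Python. This is only an illustration, not the code RuG/L04 itself uses: the data layout (a dictionary per location), the function names, and the crude 0/1 word measure are all assumptions made for this example, as is the choice to average only over words recorded in both locations.

```python
from itertools import combinations


def binary_difference(a, b):
    """Stand-in word measure (an assumption): 0 if identical, else 1."""
    return 0.0 if a == b else 1.0


def location_difference(forms_a, forms_b, word_difference=binary_difference):
    """Average the word differences over all words recorded in both locations."""
    shared = [w for w in forms_a if w in forms_b]
    if not shared:
        raise ValueError("the two locations have no words in common")
    total = sum(word_difference(forms_a[w], forms_b[w]) for w in shared)
    return total / len(shared)


def distance_table(data, word_difference=binary_difference):
    """Return {(loc1, loc2): difference} for every unordered pair of locations."""
    return {
        (p, q): location_difference(data[p], data[q], word_difference)
        for p, q in combinations(sorted(data), 2)
    }


# Hypothetical toy data: location -> {word: local variant}
data = {
    "A": {"house": "hus", "milk": "melk", "two": "twee"},
    "B": {"house": "huis", "milk": "melk", "two": "twee"},
    "C": {"house": "hoes", "milk": "molke", "two": "twa"},
}
table = distance_table(data)
for pair, value in table.items():
    print(pair, round(value, 3))
```

Any other word measure can be passed in as the word_difference argument, as long as it takes two variant forms and returns a number.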

There are several measures that express the difference between two words numerically. One is the Levenshtein distance (also called string edit distance, or edit distance for short). Another is the Gewichteter Identitätswert, or G.I.W.

The Levenshtein algorithm is demonstrated and explained elsewhere.
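As a rough illustration of the edit distance mentioned above, here is a minimal Python sketch of the textbook Levenshtein distance, in which every insertion, deletion and substitution costs 1. These unit costs are an assumption made for the example; the measure actually applied to dialect data may weight operations differently or normalise the result.

```python
def levenshtein(a, b):
    """Textbook edit distance between sequences a and b, with unit costs."""
    # prev[j] holds the distance between the first i-1 symbols of a
    # and the first j symbols of b.
    prev = list(range(len(b) + 1))
    for i, symbol_a in enumerate(a, start=1):
        curr = [i]  # turning i symbols of a into nothing takes i deletions
        for j, symbol_b in enumerate(b, start=1):
            substitution = 0 if symbol_a == symbol_b else 1
            curr.append(min(
                prev[j] + 1,                 # delete symbol_a
                curr[j - 1] + 1,             # insert symbol_b
                prev[j - 1] + substitution,  # match or substitute
            ))
        prev = curr
    return prev[-1]


print(levenshtein("melk", "molke"))
```

The function works on any pair of sequences, so a word can also be passed as a list of phonetic segments instead of a plain string.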

Next, you process the difference table. You can make a partition into groups by means of clustering, or derive a spatial distribution by means of multidimensional scaling. With the result, you draw a dialect map.
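To give an idea of what clustering does with a difference table, the following Python sketch repeatedly merges the two closest groups until a requested number of groups remains. It uses average linkage, which is just one of several possible clustering criteria and not necessarily the one you would choose in practice. The function names and the toy values are hypothetical.

```python
def average_linkage(group_a, group_b, dist):
    """Mean difference over all pairs with one location in each group."""
    total = sum(dist[frozenset((p, q))] for p in group_a for q in group_b)
    return total / (len(group_a) * len(group_b))


def cluster(locations, dist, k):
    """Merge the two closest groups until only k groups remain."""
    if k < 1:
        raise ValueError("k must be at least 1")
    groups = [[loc] for loc in locations]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = average_linkage(groups[i], groups[j], dist)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        groups[i] = groups[i] + groups[j]  # merge group j into group i
        del groups[j]
    return groups


# Hypothetical difference table for four locations
raw = {
    ("A", "B"): 0.10, ("C", "D"): 0.20,
    ("A", "C"): 0.80, ("A", "D"): 0.90,
    ("B", "C"): 0.70, ("B", "D"): 0.85,
}
dist = {frozenset(pair): value for pair, value in raw.items()}
groups = cluster(["A", "B", "C", "D"], dist, k=2)
print(groups)
```

Each resulting group would then be given its own colour or pattern on the map.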

The basic ideas, the how and why of maps like the ones you see below, are discussed elsewhere.

The diagram below gives a simplified view of some of the major steps:

The next diagram shows a few more possibilities:

RuG/L04 is a collection of stand-alone programs. Each program performs a single step in the processing sequence. This means that you can use the programs independently, and apply them to other tasks. For example, you could make maps of genetic differences, if you have obtained a table of genetic differences between locations, using software not included in the current package.

The rubrical index of the manual gives an overview of all available programs.

Apart from the stand-alone programs, there is an R library. R is a powerful statistics system, freely available for all major computer platforms. Using the library, you can import and export data in a format that is used by the other software, and use R for analysis and further processing of the data.