Fieldworker effect
Our methods are very sensitive to variations in the way the data was recorded.
The clustering below (left map) shows an almost exact division by
fieldworker (right map):
two clusters for Guy Lowman, two clusters for Raven McDavid, and one for the
remaining fieldworkers. (This last cluster does not mean that the data from
the remaining fieldworkers agree with each other.)
grouping: fieldworker + worksheet + community: 570 locations
(two locations merged because of identical coordinates)
map (left): clustering into five groups, using Ward's Method
data: raw phonetic strings, no lexical variants
method: Levenshtein
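As a rough illustration of the measurement step listed above, the sketch below computes a plain Levenshtein distance between transcription strings and averages it over the items two locations share. It is a minimal sketch under assumptions, not the project's implementation: unit costs for insertion, deletion and substitution, no length normalisation, and one string per item are all assumed here, and the names `location_distance`, `site_1` and `site_2` as well as the toy transcriptions are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two sequences, with unit cost per operation."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # deleting the first i symbols of a
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (x != y),  # substitution, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def location_distance(loc_a, loc_b):
    """Mean Levenshtein distance over the items recorded at both locations."""
    shared = sorted(set(loc_a) & set(loc_b))
    if not shared:
        raise ValueError("the two locations share no items")
    total = sum(levenshtein(loc_a[k], loc_b[k]) for k in shared)
    return total / len(shared)


# Hypothetical toy transcriptions, not LAMSAS data.
site_1 = {"item_a": "haus", "item_b": "wotr"}
site_2 = {"item_a": "hus", "item_b": "watr"}
print(location_distance(site_1, site_2))
```

Computing this value for every pair of locations gives a distance matrix, which could then be passed to a hierarchical clustering routine that implements Ward's Method. Because the distance is taken over raw strings, any systematic difference in how a fieldworker transcribed a sound adds to the distance in the same way a real dialect difference would.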
The plot below shows some of the differences between how Lowman and McDavid did
their work. Lowman seems to have coached the informants more to get consistent
results. McDavid may have allowed the informants to respond more freely,
which could explain why the number of responses per informant varies
much more with McDavid than with Lowman.
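One way to put a number on such a difference is to compare the spread of the per-informant response counts for each fieldworker. The following is only a sketch under assumptions: the record layout `(fieldworker, informant, n_responses)`, the function name `response_spread`, the fieldworker labels "A" and "B", and all counts are hypothetical and are not taken from LAMSAS.

```python
from collections import defaultdict
from statistics import mean, pstdev


def response_spread(records):
    """Return {fieldworker: (mean, population std dev)} of response counts.

    records: iterable of (fieldworker, informant, n_responses) tuples.
    """
    by_worker = defaultdict(list)
    for fieldworker, _informant, n_responses in records:
        by_worker[fieldworker].append(n_responses)
    return {w: (mean(counts), pstdev(counts)) for w, counts in by_worker.items()}


# Hypothetical counts for two fieldworkers labelled "A" and "B".
records = [
    ("A", "a1", 100), ("A", "a2", 102), ("A", "a3", 98),
    ("B", "b1", 80), ("B", "b2", 120), ("B", "b3", 100),
]
spread = response_spread(records)
for worker, (avg, sd) in sorted(spread.items()):
    print(worker, avg, round(sd, 2))
```

In this toy input both fieldworkers have the same mean number of responses, but the standard deviation for "B" is much larger, which is the kind of pattern described above.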
We cannot tell if the data collected by Lowman is more valuable than that
collected by McDavid, or vice versa. All we can say is that the Lowman data seems
more suitable for computational analysis, both because of its uniformity and
because it covers by far the largest area of LAMSAS.
There are probably also differences between fieldworkers in how they
transcribed sounds. Those differences may be minor, but they frustrate our
computational analysis.
So, unfortunately, we are not able to analyse LAMSAS as a whole.