1 Abstract

In this study, we compare the English pronunciation of Dutch and German speakers with that of native English speakers. We focus on the articulatory trajectories obtained using electromagnetic articulography and investigate two sound contrasts in particular: /t/-/θ/ and /s/-/ʃ/. Our results show that while the German speakers make both sound contrasts adequately, the Dutch speakers do not distinguish them clearly. To further evaluate these results, both a human Dutch listener and an automatic speech recognition (ASR) system classified the pronounced words on the basis of the acoustic recording. Both classifications lined up with the articulatory results. For the Dutch speakers, /θ/-words (and /s/-words) were more frequently recognized as /t/-words (and /ʃ/-words). However, the intended utterance was still recognized in the majority of cases for the Dutch speakers. The perceptual results therefore do not support a complete merger of the sounds in Dutch.

Journal: Submitted to Proceedings of ISSP 2017

Preprint: http://www.martijnwieling.nl/files/ISSP-Wieling.pdf

Keywords: Generalized additive modeling; Tutorial; Articulography; Second language acquisition

## Generated on: August 30, 2017 - 23:39:25

2 Libraries and functions

The following commands load the necessary functions and libraries and show the version information.

# install packages if not yet installed
packages <- c("mgcv","itsadug","lme4")
if (length(setdiff(packages, rownames(installed.packages()))) > 0) {
  install.packages(setdiff(packages, rownames(installed.packages())))  
}

# load required packages
library(mgcv)
library(itsadug)
library(lme4)

# version information
R.version.string
## [1] "R version 3.4.1 (2017-06-30)"
cat(paste('mgcv version:',packageVersion('mgcv')))
## mgcv version: 1.8.18
cat(paste('itsadug version:',packageVersion('itsadug')))
## itsadug version: 2.2.4
cat(paste('lme4 version:',packageVersion('lme4')))
## lme4 version: 1.1.12

3 Production datasets

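The following commands download the two production datasets (if they are not yet present) and load them. The columns of the datasets are explained below.

if (!file.exists('datth.rda')) { 
    download.file('http://www.let.rug.nl/wieling/ISSP2017/datth.rda', 'datth.rda')
}
if (!file.exists('datsh.rda')) { 
    download.file('http://www.let.rug.nl/wieling/ISSP2017/datsh.rda', 'datsh.rda')
}

# load the datasets (assumes each .rda file stores an object of the same name)
load('datth.rda')
load('datsh.rda')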

3.1 Column names

The dataset datsh consists of 265599 rows and 10 columns, whereas the dataset datth consists of 223954 rows and 10 columns. Both datasets have the following column names:

##  [1] "Speaker" "Lang"    "Sensor"  "Axis"    "Trial"   "Word"    "Sound"  
##  [8] "Loc"     "Time"    "Pos"

3.2 Data description

  • Speaker – ID of the speaker
  • Lang – Native language of the speaker ("NL" for Dutch, "DE" for German, or "EN" for English)
  • Sensor – The sensor (in this case only "TT", the tongue tip sensor)
  • Axis – The axis (in this case only "X", the anterior-posterior position)
  • Trial – The trial number of the word
  • Word – The label of the word
  • Sound – The sound contrast ("TH" for words with the dental fricative and "T" for words with the stop in dataset datth; "SH" for words with the post-alveolar fricative and "S" for words with the alveolar fricative in dataset datsh)
  • Loc – The location in the word where the sound contrast occurs ("START" when it occurs at the beginning of the word or "END" when it occurs at the end of the word)
  • Time – The normalized time (from 0, the beginning of the word, to 1, the end of the word)
  • Pos – The standardized (mean 0, standard deviation 1) position for each speaker of the T1 sensor in the anterior-posterior direction (higher values, more anterior)
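
To make the Time and Pos definitions above concrete, the following is a minimal sketch of the two transformations on invented toy data. The data frame `toy`, the columns `RawTime` and `RawPos`, and all values are hypothetical; the datasets used here already contain the transformed columns, and the exact preprocessing applied to them is not shown in this document.

```r
# Toy data: two hypothetical speakers, one word each (all values invented)
toy <- data.frame(
  Speaker = rep(c("s1", "s2"), each = 5),
  RawTime = c(0.10, 0.15, 0.20, 0.25, 0.30, 1.00, 1.10, 1.20, 1.30, 1.40),
  RawPos  = c(10, 12, 15, 13, 11, 40, 44, 50, 46, 42)
)

# Time: rescale the time stamps of each word to the interval [0, 1]
# (here every speaker has a single word, so grouping by Speaker suffices)
toy$Time <- ave(toy$RawTime, toy$Speaker,
                FUN = function(t) (t - min(t)) / (max(t) - min(t)))

# Pos: z-transform per speaker (mean 0, standard deviation 1)
toy$Pos <- ave(toy$RawPos, toy$Speaker,
               FUN = function(p) (p - mean(p)) / sd(p))

toy[, c("Speaker", "Time", "Pos")]
```

After these steps, positions are comparable across speakers with differently sized vocal tracts, and words of different durations share a common time axis.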

4 Contrasting /θ/ and /t/ in production

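# mark the first observation of each time series (per speaker and trial);
# the resulting column start.event is used as AR.start in the model below
datth <- start_event(datth,event=c("Speaker","Trial"))

# combine language group and position in the word into a single factor
datth$LangLoc <- interaction(datth$Lang, datth$Loc)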

datth$IsENTHStart <- (datth$Lang == "EN" & datth$Sound == "TH" & datth$Loc == "Start")*1

datth$IsNLTHStart <- (datth$Lang == "NL" & datth$Sound == "TH" & datth$Loc == "Start")*1

datth$IsDETHStart <- (datth$Lang == "DE" & datth$Sound == "TH" & datth$Loc == "Start")*1

datth$IsENTHEnd <- (datth$Lang == "EN" & datth$Sound == "TH" & datth$Loc == "End")*1

datth$IsNLTHEnd <- (datth$Lang == "NL" & datth$Sound == "TH" & datth$Loc == "End")*1

datth$IsDETHEnd <- (datth$Lang == "DE" & datth$Sound == "TH" & datth$Loc == "End")*1

datth$SpeakerSoundLoc <- interaction(datth$Speaker, datth$Sound, datth$Loc)
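
The predictors created above can be illustrated on a small invented design. This is only a sketch: the data frame `toy` is hypothetical and simply crosses the three languages, two sounds, and two word positions, using the same level labels as the code above.

```r
# Hypothetical design: one row per Lang x Sound x Loc combination
toy <- expand.grid(Lang  = c("EN", "NL", "DE"),
                   Sound = c("T", "TH"),
                   Loc   = c("Start", "End"))

# Factor with one level per language-by-position combination (3 x 2 levels)
toy$LangLoc <- interaction(toy$Lang, toy$Loc)

# Binary indicator: 1 only for Dutch speakers' word-initial TH, 0 otherwise
toy$IsNLTHStart <- (toy$Lang == "NL" & toy$Sound == "TH" &
                      toy$Loc == "Start") * 1

levels(toy$LangLoc)
toy[toy$IsNLTHStart == 1, ]
```

When such a 0/1 variable is used as a `by` variable in a smooth, the smooth contributes nothing for rows where the indicator is 0. It therefore captures how the trajectory for that group's TH words differs from the trajectory of the corresponding T words.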

system.time(th1 <- bam(Pos ~ LangLoc + s(Time,by=LangLoc) + 
                         s(Time,by=IsENTHStart) + s(Time,by=IsENTHEnd) + 
                         s(Time,by=IsNLTHStart) + s(Time,by=IsNLTHEnd) + 
                         s(Time,by=IsDETHStart) + s(Time,by=IsDETHEnd) + 
                         s(Time,SpeakerSoundLoc,bs="fs",m=1) + s(Time,Word,bs="fs",m=1), 
                       data=datth, discrete=TRUE, rho=0.999, nthreads=8, 
                       AR.start=datth$start.event))
## Warning in gam.side(sm, X, tol = .Machine$double.eps^0.5): model has
## repeated 1-d smooths of same variable.
##    user  system elapsed 
## 872.532   6.548 150.889
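
The core idea of the model above, a reference smooth plus a binary difference smooth over the same time variable, can be tried out on simulated data. The sketch below is an assumption-laden toy version, not the analysis itself: the data frame `sim`, the indicator `IsTH`, and the simulated curves are invented, and the random factor smooths, the AR(1) settings (`rho`, `AR.start`), and `discrete=TRUE` are left out to keep it small.

```r
library(mgcv)

set.seed(1)
n <- 200

# Simulated trajectories: a reference condition (IsTH = 0) and a second
# condition (IsTH = 1) that differs from it by 0.5 + Time
sim <- data.frame(Time = rep(seq(0, 1, length.out = n), 2),
                  IsTH = rep(c(0, 1), each = n))
sim$Pos <- sin(2 * pi * sim$Time) +
  sim$IsTH * (0.5 + sim$Time) +
  rnorm(2 * n, sd = 0.2)

# s(Time) models the reference trajectory; s(Time, by = IsTH) models the
# difference between the two conditions (it is zero whenever IsTH is 0)
m <- bam(Pos ~ s(Time) + s(Time, by = IsTH), data = sim)

# One row per smooth: the reference smooth and the difference smooth
summary(m)$s.table

# Estimated difference between the two conditions halfway through the word
nd <- data.frame(Time = c(0.5, 0.5), IsTH = c(0, 1))
est_diff <- as.numeric(diff(predict(m, newdata = nd)))
est_diff
```

In this setup, the row for the difference smooth in the summary table indicates whether the two conditions differ over time, which is how the six `Is...` smooths in the model above can be read for each language group and word position.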