1 Abstract

In phonetics, many datasets are encountered which deal with dynamic data collected over time. Examples include diphthongal formant trajectories and articulator trajectories observed using electromagnetic articulography. Traditional approaches for analyzing this type of data generally aggregate data over a certain timespan, or only include measurements at a fixed time point (e.g., formant measurements at the midpoint of a vowel). In this paper, I discuss generalized additive modeling, a non-linear regression method which does not require aggregation or the pre-selection of a fixed time point. Instead, the method is able to identify general patterns over dynamically varying data, while simultaneously accounting for subject and item-related variability. An advantage of this approach is that patterns may be discovered which are hidden when data is aggregated or when a single time point is selected. A corresponding disadvantage is that these analyses are generally more time consuming and complex. This tutorial aims to overcome this disadvantage by providing a hands-on introduction to generalized additive modeling using articulatory trajectories from L1 and L2 speakers of English within the freely available R environment. All data and R code is made available to reproduce the analysis presented in this paper.

Journal: Journal of Phonetics (https://www.sciencedirect.com/science/article/abs/pii/S0095447017301377)

Preprint: http://www.martijnwieling.nl/files/GAM-tutorial-Wieling.pdf

Keywords: Generalized additive modeling, Tutorial, Articulography, Second language acquisition

2 Important note

A recent publication (Sóskuthy, 2021) has shown that the random-effects specification in the models below (from model m6 onwards) results in overly conservative \(p\)-values (i.e. \(p\)-values which are higher than they should be) when determining the fixed-effect differences between two levels (such as Word in model m6, or Sound in model ffmc0s). The reason for this is that the factor smooth does not take the dependency between the subjects with respect to the two levels of the fixed-effect factor into account, thereby moving some of the differences between the two levels to the random effects (and hence an estimated uncertainty about the difference in the fixed-effects which is too high, i.e. confidence bands which are too high). To obtain a more appropriate estimated significance of the difference curve, the random-effects specification (for m6) s(Time, Subject, by=Word, bs="fs", m=1) should be replaced by (with WordO as an orderdered factor, see Section 6.4.3) s(Time, Subject, bs="fs", m=1) + s(Time, Subject, by=WordO, bs="fs", m=1). Note, however, that this model is suitable to estimate the difference between the two words (Tenth and Tent), but should not be used to make inferences about the word-specific non-linear trajectories, as the confidence bands for the non-reference level will be overly large (overly conservative). Instead, model m6 should be used for that purpose. An example of this approach (for all words ending in "th") is shown in Section 7.4. Note that the other models in this tutorial which focus on assessing whether differences are significant are (still) overly conservative, and would need to be refit using the adjusted random-effects specification. This was not done here (except for the additional model shown in Section 7.4) to not deviate from the results discussed in the paper (which this tutorial accompanies).

3 Libraries and functions

The following commands load the necessary functions and libraries and show the version information.

# install packages if not yet installed
packages <- c("mgcv","itsadug","lme4","sp")
if (length(setdiff(packages, rownames(installed.packages()))) > 0) {
  install.packages(setdiff(packages, rownames(installed.packages())))  
}

# load required packages
library(mgcv)
library(itsadug)
library(sp) # for colors which also print well in grayscale
library(lme4)

# version information
R.version.string
# [1] "R version 4.1.0 (2021-05-18)"
cat(paste("mgcv version:",packageVersion("mgcv")))
# mgcv version: 1.8.36
cat(paste("itsadug version:",packageVersion("itsadug")))
# itsadug version: 2.4

4 Dataset

The following shows the columns of the full dataset and their explanation.

if (!file.exists("full.rda")) { 
    download.file("http://www.let.rug.nl/wieling/Tutorial/full.rda","full.rda") # 2 MB
}

load("full.rda")

4.1 Column names

The dataset consists of 126177 rows and 9 columns with the following column names:

str(full)
# 'data.frame': 126177 obs. of  9 variables:
#  $ Speaker: Factor w/ 42 levels "VENI_EN_1","VENI_EN_10",..: 1 1 1 1 1 1 1 1 1 1 ...
#  $ Lang   : Factor w/ 2 levels "EN","NL": 1 1 1 1 1 1 1 1 1 1 ...
#  $ Word   : Factor w/ 20 levels "faith","fate",..: 18 18 18 18 18 18 18 18 18 18 ...
#  $ Sound  : Factor w/ 2 levels "T","TH": 1 1 1 1 1 1 1 1 1 1 ...
#  $ Loc    : Factor w/ 2 levels "Final","Init": 2 2 2 2 2 2 2 2 2 2 ...
#  $ Trial  : int  1 1 1 1 1 1 1 1 1 1 ...
#  $ Time   : num  0 0.0161 0.0323 0.0484 0.0645 ...
#  $ Pos    : num  -0.392 -0.44 -0.44 -0.503 -0.513 ...
#  $ Pos01  : num  0.434 0.425 0.425 0.412 0.41 ...

4.2 Data description

  • Speaker - ID of the speaker
  • Lang - Native language of the speaker ("NL" for Dutch, or "EN" for * English)
  • Word - The label of the word
  • Sound - The sound contrast ("TH" for words with the dental fricative, "T" for words with the stop)
  • Loc - The location where in the word the sound contrasts occurs ("Init" when it occurs at the beginning of the word or "Final" when it occurs at the back of the word)
  • Trial - The trial number of the word for each speaker
  • Time - The normalized (between 0: beginning of the word, to 1: end of the word)
  • Pos - The standardized (mean 0, standard deviation 1) position for each speaker of the T1 sensor in the anterior-posterior direction (higher values, more anterior)
  • Pos01 - The normalized (1: most anterior, 0: most posterior) position for each speaker of the T1 sensor

4.3 Example data

head(full)
#     Speaker Lang Word Sound  Loc Trial   Time    Pos Pos01
# 1 VENI_EN_1   EN tick     T Init     1 0.0000 -0.392 0.434
# 2 VENI_EN_1   EN tick     T Init     1 0.0161 -0.440 0.425
# 3 VENI_EN_1   EN tick     T Init     1 0.0323 -0.440 0.425
# 4 VENI_EN_1   EN tick     T Init     1 0.0484 -0.503 0.412
# 5 VENI_EN_1   EN tick     T Init     1 0.0645 -0.513 0.410
# 6 VENI_EN_1   EN tick     T Init     1 0.0806 -0.677 0.378

5 Visualizing basis functions

# Summary:
#   * Word : factor; set to the value(s): tenth. 
#   * Time : numeric predictor; with 100 values ranging from 0.000000 to 1.000000.

6 Comparing "tenth" and "tent"

In the following, several models are fitted for the two words "tenth" and "tent".

words = c("tent","tenth")
dat = droplevels(full[full$Word %in% words,])
dat$Word = relevel(dat$Word,"tent") # tent is set as the reference level

# track first observation per trial (to correct for autocorrelation later on)
# and sort per trajectory
dat$start.event = (dat$Time == 0) 
dat = dat[order(dat$Speaker,dat$Trial,dat$Time),]
# dat <- start_event(dat,event=c("Speaker","Trial")) # bug in itsadug 2.4

6.1 Fitting a linear model

m1 <- bam(Pos ~ Word, data=dat, method="fREML")
(smry1 <- summary(m1))
# 
# Family: gaussian 
# Link function: identity 
# 
# Formula:
# Pos ~ Word
# 
# Parametric coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)   0.0654     0.0117    5.57  2.5e-08 ***
# Wordtenth     0.6642     0.0164   40.41  < 2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# 
# R-sq.(adj) =  0.113   Deviance explained = 11.3%
# -REML =  17307  Scale est. = 0.86694   n = 12839

6.2 Fitting a non-linear model

m2 <- bam(Pos ~ Word + s(Time, by=Word, bs="tp", k=10), data=dat)
(smry2 <- summary(m2))
# 
# Family: gaussian 
# Link function: identity 
# 
# Formula:
# Pos ~ Word + s(Time, by = Word, bs = "tp", k = 10)
# 
# Parametric coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)   0.0651     0.0107    6.11    1e-09 ***
# Wordtenth     0.6600     0.0149   44.18   <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Approximate significance of smooth terms:
#                    edf Ref.df     F p-value    
# s(Time):Wordtent  7.52   8.46  28.4  <2e-16 ***
# s(Time):Wordtenth 8.55   8.94 276.2  <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# R-sq.(adj) =  0.267   Deviance explained = 26.8%
# fREML =  16112  Scale est. = 0.71584   n = 12839
gam.check(m2)
# 
# Method: fREML   Optimizer: perf newton
# full convergence after 9 iterations.
# Gradient range [-4.63e-07,3.87e-07]
# (score 16112 & scale 0.716).
# Hessian positive definite, eigenvalue range [2.95,6418].
# Model rank =  20 / 20 
# 
# Basis dimension (k) checking results. Low p-value (k-index<1) may
# indicate that k is too low, especially if edf is close to k'.
# 
#                     k'  edf k-index p-value
# s(Time):Wordtent  9.00 7.52       1    0.58
# s(Time):Wordtenth 9.00 8.55       1    0.60

6.3 Visualizing the results

par(mfrow=c(1,2))
plot(m2, ylim=c(-1,2), select=1)
abline(h=0)
plot(m2, ylim=c(-1,2), select=2)
abline(h=0)

two.colors = bpy.colors(n = 2, cutoff.tails = 0.7, alpha = 1.0)
plot_smooth(m2, view="Time", plot_all="Word", main="", rug=FALSE, ylim=c(-1,2), col=two.colors) 
# Summary:
#   * Word : factor; set to the value(s): tent, tenth. 
#   * Time : numeric predictor; with 30 values ranging from 0.000000 to 1.000000. 
#   * NOTE : No random effects in the model to cancel.
# 

plot_diff(m2, view="Time", comp=list(Word=c("tenth","tent")), ylim=c(-1,2))
# Summary:
#   * Time : numeric predictor; with 100 values ranging from 0.000000 to 1.000000. 
#   * NOTE : No random effects in the model to cancel.
# 

# 
# Time window(s) of significant difference(s):
#   0.070707 - 1.000000

6.4 Complexity necessary?

6.4.1 Model comparison

m2a.ml <- bam(Pos ~ Word + s(Time), data=dat, method="ML")
m2b.ml <- bam(Pos ~ Word + s(Time, by=Word), data=dat, method="ML")
compareML(m2a.ml, m2b.ml)
# m2a.ml: Pos ~ Word + s(Time)
# 
# m2b.ml: Pos ~ Word + s(Time, by = Word)
# Warning in sprintf("***", h1): one argument not used by format '***'
# 
# Chi-square test of ML scores
# -----
#    Model Score Edf Difference    Df  p.value Sig.
# 1 m2a.ml 16505   4                               
# 2 m2b.ml 16103   6    401.810 2.000  < 2e-16  ***
# 
# AIC difference: 823.83, model m2b.ml has lower AIC.

6.4.2 Binary difference smooth

dat$IsTenth <- (dat$Word == "tenth")*1

m2.bin <- bam(Pos ~ s(Time) + s(Time,by=IsTenth), data=dat)
(smry2bin <- summary(m2.bin))
# 
# Family: gaussian 
# Link function: identity 
# 
# Formula:
# Pos ~ s(Time) + s(Time, by = IsTenth)
# 
# Parametric coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)   0.0650     0.0107    6.09  1.1e-09 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Approximate significance of smooth terms:
#                  edf Ref.df     F p-value    
# s(Time)         7.69   8.49  28.8  <2e-16 ***
# s(Time):IsTenth 9.02   9.66 293.9  <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# R-sq.(adj) =  0.267   Deviance explained = 26.8%
# fREML =  16111  Scale est. = 0.71583   n = 12839
plot(m2.bin, select=2, shade=TRUE, ylim=c(-1,2))
abline(h=0)

6.4.2.1 Binary predictors: use only once!

summary(bam(Pos ~ s(Time) + s(Time,by=IsTenth) + s(Trial) + s(Trial,by=IsTenth), data=dat))
# 
# Family: gaussian 
# Link function: identity 
# 
# Formula:
# Pos ~ s(Time) + s(Time, by = IsTenth) + s(Trial) + s(Trial, by = IsTenth)
# 
# Parametric coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  -0.0253     0.0261   -0.97     0.33
# 
# Approximate significance of smooth terms:
#                   edf Ref.df     F p-value    
# s(Time)          7.76   8.53  31.4  <2e-16 ***
# s(Time):IsTenth  8.08   8.70 108.3  <2e-16 ***
# s(Trial)         8.97   9.00  75.7  <2e-16 ***
# s(Trial):IsTenth 9.97  10.00 101.4  <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Rank: 38/39
# R-sq.(adj) =  0.329   Deviance explained = 33.1%
# fREML =  15622  Scale est. = 0.65564   n = 12839
summary(bam(Pos ~ s(Trial) + s(Trial,by=IsTenth) + s(Time) + s(Time,by=IsTenth), data=dat))
# 
# Family: gaussian 
# Link function: identity 
# 
# Formula:
# Pos ~ s(Trial) + s(Trial, by = IsTenth) + s(Time) + s(Time, by = IsTenth)
# 
# Parametric coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  -0.0253     0.0261   -0.97     0.33
# 
# Approximate significance of smooth terms:
#                   edf Ref.df     F p-value    
# s(Trial)         8.97   9.00  75.7  <2e-16 ***
# s(Trial):IsTenth 8.97   9.00 109.7  <2e-16 ***
# s(Time)          7.76   8.53  31.4  <2e-16 ***
# s(Time):IsTenth  9.08   9.70 156.8  <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Rank: 38/39
# R-sq.(adj) =  0.329   Deviance explained = 33.1%
# fREML =  15622  Scale est. = 0.65564   n = 12839

Note that the edf value for the binary smooths will differ by exactly 1 comparing these two models. The reason for this problematic order effect is that the model is not able to determine which of the binary difference smooths will include the constant difference between the two words.

6.4.3 Ordered factor difference

dat$WordO <- as.ordered(dat$Word)
contrasts(dat$WordO) <- "contr.treatment"

m2.ord <- bam(Pos ~ s(Time) + s(Time,by=WordO) + WordO, data=dat)
(smrym2.ord <- summary(m2.ord))
# 
# Family: gaussian 
# Link function: identity 
# 
# Formula:
# Pos ~ s(Time) + s(Time, by = WordO) + WordO
# 
# Parametric coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)   0.0651     0.0107     6.1  1.1e-09 ***
# WordOtenth    0.6601     0.0149    44.2  < 2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Approximate significance of smooth terms:
#                     edf Ref.df    F p-value    
# s(Time)            7.69   8.48 28.8  <2e-16 ***
# s(Time):WordOtenth 8.02   8.66 99.8  <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# R-sq.(adj) =  0.267   Deviance explained = 26.8%
# fREML =  16111  Scale est. = 0.71583   n = 12839
plot(m2.ord, select=2, shade=TRUE, ylim=c(-1,2))
abline(h=0)

6.5 Model criticism

Model criticism shows the residuals are fine.

gam.check(m2)

# 
# Method: fREML   Optimizer: perf newton
# full convergence after 9 iterations.
# Gradient range [-4.63e-07,3.87e-07]
# (score 16112 & scale 0.716).
# Hessian positive definite, eigenvalue range [2.95,6418].
# Model rank =  20 / 20 
# 
# Basis dimension (k) checking results. Low p-value (k-index<1) may
# indicate that k is too low, especially if edf is close to k'.
# 
#                     k'  edf k-index p-value
# s(Time):Wordtent  9.00 7.52    1.02    0.91
# s(Time):Wordtenth 9.00 8.55    1.02    0.92

6.6 Mixed-effects

6.6.1 Random intercepts

m3 <- bam(Pos ~ Word + s(Time, by=Word) + s(Speaker,bs="re"), data=dat)
(smrym3 <- summary(m3))
# 
# Family: gaussian 
# Link function: identity 
# 
# Formula:
# Pos ~ Word + s(Time, by = Word) + s(Speaker, bs = "re")
# 
# Parametric coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)   0.0914     0.0680    1.35     0.18    
# Wordtenth     0.6782     0.0134   50.79   <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Approximate significance of smooth terms:
#                     edf Ref.df     F p-value    
# s(Time):Wordtent   7.76   8.61  36.3  <2e-16 ***
# s(Time):Wordtenth  8.64   8.96 352.8  <2e-16 ***
# s(Speaker)        40.58  41.00  86.9  <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# R-sq.(adj) =  0.427   Deviance explained = 42.9%
# fREML =  14634  Scale est. = 0.56012   n = 12839
par(mfrow=c(1,2))
plot_smooth(m3, view="Time", plot_all="Word", main="m3", rug=FALSE, rm.ranef=T, ylim=c(-1,2), 
            col=two.colors)
# Summary:
#   * Word : factor; set to the value(s): tent, tenth. 
#   * Time : numeric predictor; with 30 values ranging from 0.000000 to 1.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Speaker)
# 
plot_diff(m3, view="Time", comp=list(Word=c("tenth","tent")), rm.ranef=T, ylim=c(-1,2))
# Summary:
#   * Time : numeric predictor; with 100 values ranging from 0.000000 to 1.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Speaker)
# 

# 
# Time window(s) of significant difference(s):
#   0.050505 - 1.000000

6.6.2 Random slopes

m4 <- bam(Pos ~ Word + s(Time, by=Word) + s(Speaker,bs="re") + s(Speaker,Word,bs="re"), data=dat)
(smrym4 <- summary(m4))
# 
# Family: gaussian 
# Link function: identity 
# 
# Formula:
# Pos ~ Word + s(Time, by = Word) + s(Speaker, bs = "re") + s(Speaker, 
#     Word, bs = "re")
# 
# Parametric coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)   0.1087     0.0828    1.31     0.19    
# Wordtenth     0.6179     0.1032    5.99  2.2e-09 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Approximate significance of smooth terms:
#                     edf Ref.df      F p-value    
# s(Time):Wordtent   7.95   8.71   44.6 < 2e-16 ***
# s(Time):Wordtenth  8.70   8.97  433.0 < 2e-16 ***
# s(Speaker)        15.48  41.00 1080.0    0.11    
# s(Speaker,Word)   64.59  81.00  960.4 3.3e-05 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# R-sq.(adj) =  0.534   Deviance explained = 53.7%
# fREML =  13397  Scale est. = 0.45546   n = 12839
compareML(m3,m4)
# m3: Pos ~ Word + s(Time, by = Word) + s(Speaker, bs = "re")
# 
# m4: Pos ~ Word + s(Time, by = Word) + s(Speaker, bs = "re") + s(Speaker, 
#     Word, bs = "re")
# Warning in sprintf("***", h1): one argument not used by format '***'
# 
# Chi-square test of fREML scores
# -----
#   Model Score Edf Difference    Df  p.value Sig.
# 1    m3 14634   7                               
# 2    m4 13397   8   1237.194 1.000  < 2e-16  ***
# 
# AIC difference: 2616.08, model m4 has lower AIC.
par(mfrow=c(1,2))
plot_smooth(m4, view="Time", plot_all="Word", main="m4", rug=FALSE, rm.ranef=T, ylim=c(-1,2), 
            col=two.colors) 
# Summary:
#   * Word : factor; set to the value(s): tent, tenth. (Might be canceled as random effect, check below.) 
#   * Time : numeric predictor; with 30 values ranging from 0.000000 to 1.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Speaker),s(Speaker,Word)
# 
plot_diff(m4, view="Time", comp=list(Word=c("tenth","tent")), rm.ranef=T, ylim=c(-1,2))
# Summary:
#   * Time : numeric predictor; with 100 values ranging from 0.000000 to 1.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Speaker),s(Speaker,Word)
# 

# 
# Time window(s) of significant difference(s):
#   0.151515 - 0.282828
#   0.393939 - 1.000000

6.6.3 Factor smooths

m5 <- bam(Pos ~ Word + s(Time, by=Word) + s(Speaker,Word,bs="re") + s(Time,Speaker,bs="fs",m=1), 
          data=dat)
(smrym5 <- summary(m5))
# 
# Family: gaussian 
# Link function: identity 
# 
# Formula:
# Pos ~ Word + s(Time, by = Word) + s(Speaker, Word, bs = "re") + 
#     s(Time, Speaker, bs = "fs", m = 1)
# 
# Parametric coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)   0.1090     0.0829    1.31     0.19    
# Wordtenth     0.6176     0.1032    5.99  2.2e-09 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Approximate significance of smooth terms:
#                      edf Ref.df      F p-value    
# s(Time):Wordtent    7.41   7.97   9.47  <2e-16 ***
# s(Time):Wordtenth   8.57   8.80  43.72  <2e-16 ***
# s(Speaker,Word)    64.79  81.00  52.32  <2e-16 ***
# s(Time,Speaker)   296.36 377.00 249.79   0.075 .  
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# R-sq.(adj) =   0.67   Deviance explained =   68%
# fREML =  11596  Scale est. = 0.32251   n = 12839
plot(m5, select=4)

compareML(m4,m5)
# m4: Pos ~ Word + s(Time, by = Word) + s(Speaker, bs = "re") + s(Speaker, 
#     Word, bs = "re")
# 
# m5: Pos ~ Word + s(Time, by = Word) + s(Speaker, Word, bs = "re") + 
#     s(Time, Speaker, bs = "fs", m = 1)
# Warning in sprintf("***", h1): one argument not used by format '***'
# 
# Chi-square test of fREML scores
# -----
#   Model Score Edf Difference    Df  p.value Sig.
# 1    m4 13397   8                               
# 2    m5 11596   9   1801.628 1.000  < 2e-16  ***
# 
# AIC difference: 4156.35, model m5 has lower AIC.
par(mfrow=c(1,2))
plot_smooth(m5, view="Time", plot_all="Word", main="m5", rug=FALSE, rm.ranef=T, ylim=c(-1,2), 
            col=two.colors) 
# Summary:
#   * Word : factor; set to the value(s): tent, tenth. (Might be canceled as random effect, check below.) 
#   * Time : numeric predictor; with 30 values ranging from 0.000000 to 1.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Speaker,Word),s(Time,Speaker)
# 
plot_diff(m5, view="Time", comp=list(Word=c("tenth","tent")), rm.ranef=T, ylim=c(-1,2))
# Summary:
#   * Time : numeric predictor; with 100 values ranging from 0.000000 to 1.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Speaker,Word),s(Time,Speaker)
# 

# 
# Time window(s) of significant difference(s):
#   0.141414 - 0.292929
#   0.393939 - 1.000000
m6 <- bam(Pos ~ Word + s(Time, by=Word) + s(Time,Speaker,by=Word,bs="fs",m=1), data=dat)
# Warning in gam.side(sm, X, tol = .Machine$double.eps^0.5): model has repeated 1-d smooths of same
# variable.
(smrym6 <- summary(m6))
# 
# Family: gaussian 
# Link function: identity 
# 
# Formula:
# Pos ~ Word + s(Time, by = Word) + s(Time, Speaker, by = Word, 
#     bs = "fs", m = 1)
# 
# Parametric coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)    0.109      0.094    1.16     0.25    
# Wordtenth      0.612      0.117    5.22  1.9e-07 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Approximate significance of smooth terms:
#                              edf Ref.df     F p-value    
# s(Time):Wordtent            7.54   7.95  9.31  <2e-16 ***
# s(Time):Wordtenth           8.38   8.54 22.13  <2e-16 ***
# s(Time,Speaker):Wordtent  316.83 377.00 38.13  <2e-16 ***
# s(Time,Speaker):Wordtenth 328.72 368.00 43.27  <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# R-sq.(adj) =  0.782   Deviance explained = 79.3%
# fREML = 9386.6  Scale est. = 0.21284   n = 12839
compareML(m5,m6)
# m5: Pos ~ Word + s(Time, by = Word) + s(Speaker, Word, bs = "re") + 
#     s(Time, Speaker, bs = "fs", m = 1)
# 
# m6: Pos ~ Word + s(Time, by = Word) + s(Time, Speaker, by = Word, 
#     bs = "fs", m = 1)
# Warning in sprintf("***", h1): one argument not used by format '***'
# 
# Chi-square test of fREML scores
# -----
#   Model Score Edf Difference    Df  p.value Sig.
# 1    m5 11596   9                               
# 2    m6  9387  10   2209.002 1.000  < 2e-16  ***
# 
# AIC difference: 5063.34, model m6 has lower AIC.
par(mfrow=c(1,2))
plot_smooth(m6, view="Time", plot_all="Word", main="m6", rug=FALSE, rm.ranef=T, ylim=c(-1,2), 
            col=two.colors) 
# Summary:
#   * Word : factor; set to the value(s): tent, tenth. 
#   * Time : numeric predictor; with 30 values ranging from 0.000000 to 1.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Time,Speaker):Wordtent,s(Time,Speaker):Wordtenth
# 
plot_diff(m6, view="Time", comp=list(Word=c("tenth","tent")), rm.ranef=T, ylim=c(-1,2))
# Summary:
#   * Time : numeric predictor; with 100 values ranging from 0.000000 to 1.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Time,Speaker):Wordtent,s(Time,Speaker):Wordtenth
# 

# 
# Time window(s) of significant difference(s):
#   0.404040 - 1.000000

6.7 Autocorrelation

m6acf <- acf_resid(m6)

(rhoval <- m6acf[2]) # autocorrelation at lag 1
#     1 
# 0.912
m7 <- bam(Pos ~ Word + s(Time, by=Word) + s(Time,Speaker,by=Word,bs="fs",m=1), data=dat, 
          rho=rhoval, AR.start=dat$start.event) 
# Warning in gam.side(sm, X, tol = .Machine$double.eps^0.5): model has repeated 1-d smooths of same
# variable.
acf_resid(m7)

summary(m7)
# 
# Family: gaussian 
# Link function: identity 
# 
# Formula:
# Pos ~ Word + s(Time, by = Word) + s(Time, Speaker, by = Word, 
#     bs = "fs", m = 1)
# 
# Parametric coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)   0.1093     0.0896    1.22     0.22    
# Wordtenth     0.6137     0.1126    5.45  5.1e-08 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Approximate significance of smooth terms:
#                             edf Ref.df     F p-value    
# s(Time):Wordtent            7.4   8.07  8.88  <2e-16 ***
# s(Time):Wordtenth           8.3   8.58 21.29  <2e-16 ***
# s(Time,Speaker):Wordtent  231.2 377.00  2.86  <2e-16 ***
# s(Time,Speaker):Wordtenth 271.1 368.00  3.86  <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# R-sq.(adj) =  0.762   Deviance explained = 77.2%
# fREML = -3293.3  Scale est. = 0.18777   n = 12839
par(mfrow=c(1,2))
plot_smooth(m7, view="Time", plot_all="Word", main="m7", rug=FALSE, rm.ranef=T, ylim=c(-1,2), 
            col=two.colors) 
# Summary:
#   * Word : factor; set to the value(s): tent, tenth. 
#   * Time : numeric predictor; with 30 values ranging from 0.000000 to 1.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Time,Speaker):Wordtent,s(Time,Speaker):Wordtenth
# 
plot_diff(m7, view="Time", comp=list(Word=c("tenth","tent")), rm.ranef=T, ylim=c(-1,2))
# Summary:
#   * Time : numeric predictor; with 100 values ranging from 0.000000 to 1.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Time,Speaker):Wordtent,s(Time,Speaker):Wordtenth
# 

# 
# Time window(s) of significant difference(s):
#   0.404040 - 1.000000
m7.alt <- bam(Pos ~ Word + s(Time, by=Word) + s(Time,Speaker,by=Word,bs="fs",m=1), data=dat, 
          rho=rhoval - 0.1, AR.start=dat$start.event) 
# Warning in gam.side(sm, X, tol = .Machine$double.eps^0.5): model has repeated 1-d smooths of same
# variable.
acf_resid(m7.alt)

m7.alt2 <- bam(Pos ~ Word + s(Time, by=Word) + s(Time,Speaker,by=Word,bs="fs",m=1), data=dat, 
          rho=0.99, AR.start=dat$start.event) 

acf_resid(m7.alt2)
par(mfrow=c(1,2))
plot_smooth(m7.alt2, view="Time", plot_all="Word", main="m7.alt2", rug=FALSE, rm.ranef=T, ylim=c(-1,2), 
            col=two.colors) 
plot_diff(m7.alt2, view="Time", comp=list(Word=c("tenth","tent")), rm.ranef=T, ylim=c(-1,2))

6.8 Tensor product interaction

m8 <- bam(Pos ~ Word + te(Time, Trial, by=Word) + s(Time,Speaker,by=Word,bs="fs",m=1), data=dat, 
          rho=rhoval, AR.start=dat$start.event)
(smrym8 <- summary(m8))
# 
# Family: gaussian 
# Link function: identity 
# 
# Formula:
# Pos ~ Word + te(Time, Trial, by = Word) + s(Time, Speaker, by = Word, 
#     bs = "fs", m = 1)
# 
# Parametric coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)   0.0895     0.0937    0.96     0.34    
# Wordtenth     0.6636     0.1184    5.60  2.2e-08 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Approximate significance of smooth terms:
#                              edf Ref.df     F p-value    
# te(Time,Trial):Wordtent     9.09   9.92  8.07  <2e-16 ***
# te(Time,Trial):Wordtenth    8.55   8.77 15.61  <2e-16 ***
# s(Time,Speaker):Wordtent  232.93 377.00  2.88  <2e-16 ***
# s(Time,Speaker):Wordtenth 282.68 368.00  4.15  <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# R-sq.(adj) =  0.776   Deviance explained = 78.5%
# fREML =  -3290  Scale est. = 0.18737   n = 12839
par(mfrow=c(2,2))
fvisgam(m8, view=c("Time","Trial"), cond=list(Word=c("tent")), main="m8: tent", rm.ranef=T,
        zlim=c(-0.9,1.6), ylim=c(0,600), color="gray") 
# Summary:
#   * Word : factor; set to the value(s): tent. 
#   * Time : numeric predictor; with 30 values ranging from 0.000000 to 1.000000. 
#   * Trial : numeric predictor; with 30 values ranging from 0.000000 to 600.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Time,Speaker):Wordtent,s(Time,Speaker):Wordtenth
# 
# Warning in gradientLegend(c(min.z, max.z), n.seg = 3, pos = 0.875, color = pal, : Increase right
# margin to fit labels or decrease the number of decimals, see help(gradientLegend).
fvisgam(m8, view=c("Time","Trial"), cond=list(Word=c("tenth")), main="m8: tenth", rm.ranef=T,
        zlim=c(-0.9,1.6), ylim=c(0,600), color="gray") 
# Summary:
#   * Word : factor; set to the value(s): tenth. 
#   * Time : numeric predictor; with 30 values ranging from 0.000000 to 1.000000. 
#   * Trial : numeric predictor; with 30 values ranging from 0.000000 to 600.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Time,Speaker):Wordtent,s(Time,Speaker):Wordtenth
# 
# Warning in gradientLegend(c(min.z, max.z), n.seg = 3, pos = 0.875, color = pal, : Increase right
# margin to fit labels or decrease the number of decimals, see help(gradientLegend).
plot_diff2(m8, view=c("Time","Trial"), comp=list(Word=c("tenth","tent")), rm.ranef=T,  se=0,
           main="Difference tenth - tent", zlim=c(-0.1,1.2), ylim=c(0,600), color="gray")
# Summary:
#   * Time : numeric predictor; with 30 values ranging from 0.000000 to 1.000000. 
#   * Trial : numeric predictor; with 30 values ranging from 0.000000 to 600.000000. 
#   * Speaker : factor; set to the value(s): VENI_NL_5. (Might be canceled as random effect, check below.) 
#   * NOTE : The following random effects columns are canceled: s(Time,Speaker):Wordtent,s(Time,Speaker):Wordtenth
# 
# Warning in gradientLegend(zlim, n.seg = 3, pos = 0.875, dec = dec, color = color, : Increase right
# margin to fit labels or decrease the number of decimals, see help(gradientLegend).