1 Introduction

In this lab session, we will look at lexical decision data discussed by Baayen in his book (Baayen, 2008, Ch. 7.1). Make sure you have the most recent version of R 64 bits installed.

You can download the R markdown file here: lab.Rmd (make sure to not download it as a .txt file, but rather as a .Rmd file), add your answers to this file, and generate the HTML report (see for details the final section).

2 Initialization

Load the required packages and download the required files

# test if the necessary packages are installed; if not, install them
packages <- c("lme4", "car", "boot", "multcomp", "emmeans", "visreg")
if (length(setdiff(packages, rownames(installed.packages()))) > 0) {
    install.packages(setdiff(packages, rownames(installed.packages())),
        repos = "http://cran.us.r-project.org")
}

# load the packages
library(lme4)
library(car)
library(visreg)
library(boot)
library(multcomp)
library(emmeans)

# download and load the following files containing the dataset and
# some useful plotting functions:
if (!file.exists("myFunctions.R")) {
    download.file("http://www.let.rug.nl/wieling/Statistics/Mixed-Effects/lab/myFunctions.R",
        "myFunctions.R")
}
if (!file.exists("lexdec.rda")) {
    download.file("http://www.let.rug.nl/wieling/Statistics/Mixed-Effects/lab/lexdec.rda",
        "lexdec.rda")
}
source("myFunctions.R")
load("lexdec.rda")
# display version information
R.version.string
packageVersion("lme4")
packageVersion("car")
packageVersion("visreg")
packageVersion("boot")
packageVersion("multcomp")
packageVersion("emmeans")

3 Data investigation

Look at the data to discover its structure.

head(lexdec)
str(lexdec)
summary(lexdec)
nrow(lexdec)
plot(density(lexdec$Frequency))

# Frequency normally shows a skewed distribution, and was therefore
# log-transformed

# This shows the original distribution:
plot(density(exp(lexdec$Frequency)))

3.1 Data cleaning

Here we remove impossible RTs (< 200ms. or > 1100ms.) and we only focus on correct responses.

lexdec2 <- lexdec[exp(lexdec$RT) > 200 & exp(lexdec$RT) < 1100, ]
nrow(lexdec2)/nrow(lexdec)  # proportion of data retained
dat <- lexdec2[lexdec2$Correct == "correct", ]

4 Linear regression

We create our first linear regression model (no mixed-effects regression yet) and we inspect the summary. As always, we first center the numerical predictors.

dat$Trial.c = dat$Trial - mean(dat$Trial)
linearModel <- lm(RT ~ Trial.c, data = dat)
summary(linearModel)

5 Random intercepts

Since we would like our results to be generalizable across words and subjects, we add random intercepts for these factors.

5.1 Random intercepts for subject

We start with a random intercept for Subject: (1|Subject).

mixedModel1 <- lmer(RT ~ Trial.c + (1 | Subject), data = dat)
summary(mixedModel1, cor = F)

Looking at the results, we see that the p-values are not present anymore (compared to the linear regression model), but we can use \(|t| \geq 1.65 \implies p < 0.05\) (one-tailed) and \(|t| \geq 1.96 \implies p < 0.05\) (two-tailed) for a large data set.

The explained variance of the model can be found as follows:

cor(fitted(mixedModel1), dat$RT)^2

# explained variance of fixed effects only:
cor(fitted(linearModel), dat$RT)^2

Clearly most of the variance is due to the random intercepts. For this reason the explained variance of a mixed-effects regression model is not a useful measure of how well the model performs.

To see how the intercepts for subjects vary, we can plot them. The custom function myInterceptPlot extracts the random effects via coef.

myInterceptPlot(dat, "RT", "Trial.c", "Subject", mixedModel1)

5.2 Exercise 1

In this exercise, we’ll include random intercepts for word. Create a new model mixedModel2 which additionally includes random intercepts per word. Show the summary of the model, plot the random intercepts and evaluate if the by-word random intercepts are necessary by comparing the two models using anova. Does the anova parameter refit need to be set to T or F?

# your code goes here

5.3 Adding a fixed effect

Next, we’ll add another fixed effect to our model.

mixedModel3 <- lmer(RT ~ Trial.c + NativeLanguage + (1 | Subject) + (1 |
    Word), data = dat)
summary(mixedModel3, cor = F)
# NativeLanguage is significant, people with the Other native
# language are slower

To assess if the fixed-effect factor improves the model, we can again use anova which automatically refits the models using ML (which is appropriate when comparing fixed effects). While significant parameters are generally indicative of a significant improvement, this needs to be verified using model comparison (especially for interactions or multilevel predictors).

anova(mixedModel2, mixedModel3)

5.4 Exercise 2

In this exercise, we’ll add another fixed-effect predictor. Add the centered frequency as a fixed-effect predictor to the model and assess if the model improves using model comparison. Store the model as mixedModel4.

# your code goes here

6 Random slopes

6.1 Take note of warning messages!

We now want to see if subjects become slower or faster during the course of the experiment: include a random slope for Trial.c. If you see warning messages, you should read these carefully!

dat$Frequency.c <- dat$Frequency - mean(dat$Frequency)
mixedModel5 <- lmer(RT ~ Trial.c + NativeLanguage + Frequency.c + (1 |
    Subject) + (0 + Trial.c | Subject) + (1 | Word), data = dat)

The model gives a warning that the model is nearly unidentifiable and recommends rescaling the predictors. We follow its recommendation for trial, as this variable caused the problem:

# z-transforming (centering, then dividing by standard deviation)
dat$Trial.z <- dat$Trial.c/sd(dat$Trial.c)
mixedModel5 <- lmer(RT ~ Trial.z + NativeLanguage + Frequency.c + (1 |
    Subject) + (0 + Trial.z | Subject) + (1 | Word), data = dat)

6.2 Exercise 3

Please assess if mixedModel5 is an improvement over mixedModel4 and additionally assess if a correlation parameter between the random intercept and the by-subject random slope for trial is necessary.

# your code goes here

6.3 Exercise 4

In this exercise, we will test if subjects show a different response depending on word frequency: include a random slope for frequency per subject in mixedModel6 and assess which model is best. Continue from the best model from Exercise 3.

# your code goes here

6.4 Plotting random slopes

We can plot the random slopes (based on Chapter 7 of Baayen, 2008) as follows:

myXYPlot(dat, "RT", "Frequency.c", "Subject", mixedModel6)

Similarly, we can plot the sorted coefficients - from high negative (subjects are very fast for more frequent words) to less negative (subjects are less fast for more frequent words). Compare both plots. The first shows the lines, the second only the slope of the line.

myCoefPlot(mixedModel6, "Subject", "Frequency.c")

6.5 Solving convergence issues

Now we test if the effect of native language varies per word. Since this is a factorial predictor, it always needs to be put in the model together with the random intercept in one block.

mixedModel7 <- lmer(RT ~ Trial.z + NativeLanguage + Frequency.c + (1 |
    Subject) + (0 + Trial.z | Subject) + (0 + Frequency.c | Subject) +
    (1 + NativeLanguage | Word), data = dat)
summary(mixedModel7, cor = F)

The above model converges, but in an earlier lme4 version this was not the case. If your model does not converge, you can try different optimizers to see if one of those finds a solution. The default optimizer is bobyqa. Another is Nelder_Mead which we try now and also does converge.

mixedModel7a1 <- lmer(RT ~ Trial.z + NativeLanguage + Frequency.c + (1 |
    Subject) + (0 + Trial.z | Subject) + (0 + Frequency.c | Subject) +
    (1 + NativeLanguage | Word), data = dat, control = lmerControl(optimizer = "Nelder_Mead"))

A further optimizer is the one used in an earlier version (0.9) of the lme4 package, which also converges.

mixedModel7a2 <- lmer(RT ~ Trial.z + NativeLanguage + Frequency.c + (1 |
    Subject) + (0 + Trial.z | Subject) + (0 + Frequency.c | Subject) +
    (1 + NativeLanguage | Word), data = dat, control = lmerControl(optimizer = "nloptwrap"))

As all models converged, their summary and AIC is effectively identical

summary(mixedModel7a1, cor = F)
summary(mixedModel7a2, cor = F)
AIC(mixedModel7a1) - AIC(mixedModel7a2)
AIC(mixedModel7) - AIC(mixedModel7a1)

If the model would not have converged with any of the optimizers, the next step would be to drop NativeLanguage as a random slope (i.e. stick with mixedModel6).

Now we can assess if the random slope is necessary.

anova(mixedModel6, mixedModel7a1, refit = F)

And we can visualize the random slopes. Do you also see an interesting pattern?

myCoefPlot(mixedModel7a1, "Word", "NativeLanguageOther")

7 Testing interactions

We saw that non-native speakers showed a specific pattern in their responses to a word. Perhaps this is related to word class, so we add this interaction and see if this reaches significance and also influences the random slopes.

mixedModel8 <- lmer(RT ~ Trial.z + NativeLanguage * Class + Frequency.c +
    (1 | Subject) + (0 + Trial.z | Subject) + (0 + Frequency.c | Subject) +
    (1 + NativeLanguage | Word), data = dat)
anova(mixedModel7a1, mixedModel8)
summary(mixedModel8, cor = F)  # interaction clearly significant

7.1 Exercise 5

In this exercise, please visualize the interaction.

# your code goes here

7.2 Random slope necessary?

Perhaps the random slope of NativeLanguage per word is not necessary anymore?

mixedModel8simpler <- lmer(RT ~ Trial.z + NativeLanguage * Class + Frequency.c +
    (1 | Subject) + (0 + Trial.z | Subject) + (0 + Frequency.c | Subject) +
    (1 | Word), data = dat)
anova(mixedModel8simpler, mixedModel8, refit = F)
# the more complex model is not better (i.e. the random slope is not
# necessary anymore)

8 Assumptions met?

The following commands assess if the assumptions are met.

library(car)
acf(resid(mixedModel8simpler))  # a bit of autocorrelation, but not too serious
vif(mixedModel8simpler)  # no multicollinearity (use car::vif if this command gives an error)
qqnorm(resid(mixedModel8simpler))
qqline(resid(mixedModel8simpler))
plot(resid(mixedModel8simpler), fitted(mixedModel8simpler))

9 Model criticism

The quantile-quantile-plot shows some problems at the tails, so model criticism is needed. We employ model criticism by removing the data points which are problematic for the model.

dat2 <- dat[abs(scale(resid(mixedModel8simpler))) < 2.5, ]
(1 - (nrow(dat2))/nrow(dat)) * 100  # 2.37% removed
mixedModel8simpler.v2 <- lmer(RT ~ Trial.z + NativeLanguage * Class + Frequency.c +
    (1 | Subject) + (0 + Trial.z | Subject) + (0 + Frequency.c | Subject) +
    (1 | Word), data = dat2)

The quantile-quantile plot and the heteroscedasticity plot both look much better.

qqnorm(resid(mixedModel8simpler.v2))
qqline(resid(mixedModel8simpler.v2))
plot(resid(mixedModel8simpler.v2), fitted(mixedModel8simpler.v2))

10 Bootstrapping

To get confidence bands for the estimates, we can conduct a bootstrapping analysis as follows.

set.seed(99)  # to ensure replicable results
library(boot)
(bs <- confint(mixedModel8simpler.v2, method = "boot", nsim = 100, level = 0.95,
    .progress = "none", PBargs = list(style = 3)))

11 Multiple comparison

To conduct multiple comparisons (i.e. comparing multiple factor levels) the library multcomp can be used as follows:

library(multcomp)
dat2$Factor <- interaction(dat2$NativeLanguage, dat2$Class)
model <- lmer(RT ~ Trial.z + Factor + Frequency.c + (1 | Subject) + (0 +
    Trial.z | Subject) + (0 + Frequency.c | Subject) + (1 | Word), data = dat2)
summary(model, cor = F)
summary(glht(model, linfct = mcp(Factor = "Tukey")))

Alternatively, we could ask if complex differs from simplex for the two different classes (animals vs. plants). We can use the function emmeans for this

library(emmeans)
model <- lmer(RT ~ Trial.z + NativeLanguage * Class + Frequency.c + (1 |
    Subject) + (0 + Trial.z | Subject) + (0 + Frequency.c | Subject) +
    (1 | Word), data = dat2)
emmeans(model, list(pairwise ~ NativeLanguage | Class))

12 Concluding remarks

  • Model comparison is only valid when the models have been fitted on the same dataset. So if you add a variable with missing values, this is not the case anymore. The function anova will indicate this. When you want to see if the addition of this variable is necessary, you need to fit the simpler model also on the dataset where the observations with missing values for the new variable have been excluded.
  • The explained variance of your model will be very high due to the random-effects structure, so optimizing your explained variance is not useful. Instead you should use model comparison to determine which model to use. If you compare AIC values instead of using anova, make sure you fit your models with REML=T when comparing random effects, and REML=F when comparing fixed effects.
  • Visualize your results, including the random effects. They may reveal patterns suggesting the inclusion of several control variables
  • Use visualization to interpret interactions
  • If you are interested, take a look at https://www.let.rug.nl/wieling/Statistics/Mixed-Effects/lab/myFunctions.R
  • If you have a dependent variable Y which is binary (1: success, 0: failure), you can follow the same approach as before, but now using logistic regression. The commands are very similar: you use glmer instead of lmer and specify family='binomial': glmer(Y ~ ..., family='binomial')
    • Note that if you need to change the optimizer, use control = glmerControl(...)
    • There is no REML option for binomial models, so you always fit the same type of model
    • Residuals do not need to be normally distributed anymore
    • Alternatively, for logistic regression, you may also specify your dependent variable as the number of successes and the number of failures (e.g., over a certain time span and stored in columns NrSuccesses and NrFailures) in the following way: glmer(cbind(NrSuccesses,NrFailures) ~ ..., family='binomial')

13 Replication

To generate the output of the analysis presented above, first install Pandoc. Then copy the following lines to the most recent version of R. (Or use the Knit HTML button when viewing the file lab.Rmd in RStudio.) Note that if you get a Pandoc conversion error in Mac OSX, you can try adding self_contained: no just below the line html_document: at the top of this file.

# install rmarkdown package if not installed
if (!"rmarkdown" %in% rownames(installed.packages())) {
    install.packages("rmarkdown", repos = "http://cran.us.r-project.org")
}
library(rmarkdown)  # load rmarkdown package

# download original file if not already exists (to prevent
# overwriting)
if (!file.exists("lab.Rmd")) {
    download.file("http://www.let.rug.nl/wieling/Statistics/Mixed-Effects/lab/lab.Rmd",
        "lab.Rmd")
}

# generate output
render("lab.Rmd")  # generates html file with results

# view output in browser
browseURL(paste("file://", file.path(getwd(), "lab.html"), sep = ""))  # shows result