1 Introduction

In this lab session, we will look at eye tracking data (Loerts et al., 2013). Make sure you have the most recent version of R 64 bits installed.

You can download the R markdown file here: lab.Rmd, add your answers to this file, and generate the HTML report (see also the final section).

2 Initialization

Load the required packages and download the required files.

# test if the necessary packages are installed if not, install them
packages <- c("lme4","car","gplots","visreg")
if (length(setdiff(packages, rownames(installed.packages()))) > 0) {
  install.packages(setdiff(packages, rownames(installed.packages())), repos='http://cran.us.r-project.org')  
}

# load the packages
library(lme4)
library(car)
# Loading required package: carData
library(visreg)
library(gplots) # used for the generation of focus plots
# 
# Attaching package: 'gplots'
# The following object is masked from 'package:stats':
# 
#     lowess
# download and load the following files containing the dataset and some functions:
if (!file.exists('myFunctions2.R')) { 
download.file('http://www.let.rug.nl/wieling/Statistics/Logistic-Mixed-Effects/lab/myFunctions2.R', 
              'myFunctions2.R')
}

if (!file.exists('eye.rda')) { 
download.file('http://www.let.rug.nl/wieling/Statistics/Logistic-Mixed-Effects/lab/eye.rda', 
              'eye.rda')
}

if (!file.exists('eyeAll.rda')) { 
download.file('http://www.let.rug.nl/wieling/Statistics/Logistic-Mixed-Effects/lab/eyeAll.rda', 
              'eyeAll.rda')
}

source('myFunctions2.R') # custom plotting functions
load('eye.rda') # data for mixed-effects regression, only SameColor == TRUE
load('eyeAll.rda') # data for making focus plots
# display version information
R.version.string
packageVersion("lme4")
packageVersion("car")
packageVersion("gplots")
packageVersion("visreg")

3 Data investigation

Look at the data to discover its structure.

head(eye)
str(eye)
summary(eye)
dim(eye)  # number of rows, columns
plot(density(eye$Age))  # e.g., to investigate distributions

3.1 Exercise 1

In this exercise, investigate eyeAll yourself. This dataset is used to generate the focus plots.

# your code goes here

3.2 Investigating the focus plots

The following plots show the various focus plots split by gender competitor and target type (common vs. neuter).

myFocusPlot(eyeAll)

The following plots show the various focus plots split by definiteness as well.

myFocusPlot(eyeAll, splitDefiniteness = T)

With the following code we can focus on a certain time window.

myFocusPlot(eyeAll, startTime = 500, endTime = 2000)

Finally, we can also focus on the competitor vs. distractors.

myFocusPlot(eyeAll, startTime = 500, endTime = 2000, plotTarget = F)

4 Mixed-effects regression

We create our first random effects model (no fixed-effects yet).

model <- glmer(cbind(TargetFocus, CompFocus) ~ (1 | Subject) + (1 | Item),
    data = eye, family = "binomial")
summary(model)

We visualize the random intercepts. In order to visualize the correct y-axis, we visualize the empirical logit (which is the same as the logit, but then including a small adaptation such that the divisor is never equal to 0).

eye$elog <- log((eye$TargetFocus + 0.5)/(eye$CompFocus + 0.5))
myInterceptPlot(eye, "elog", "TrialID", "Subject", model)  # for subjects
myInterceptPlot(eye, "elog", "TrialID", "Item", model)  # for items

Now we test if a by-item analysis is necessary.

model1 <- glmer(cbind(TargetFocus, CompFocus) ~ (1 | Subject), data = eye,
    family = "binomial")
model2 <- glmer(cbind(TargetFocus, CompFocus) ~ (1 | Subject) + (1 | Item),
    data = eye, family = "binomial")
anova(model1, model2)  # yes

4.1 Adding a fixed-effect factor

We try to add a fixed-effect factor for color, but we get a warning.

model3 <- glmer(cbind(TargetFocus, CompFocus) ~ SameColor + (1 | Subject) +
    (1 | Item), data = eye, family = "binomial")

This is due to the column being constant in this subset.

table(eye$SameColor)

We can, of course, add other predictors, such as the contrast between male and female IsMale (which is 1 for a male and 0 for a female).

model4 <- glmer(cbind(TargetFocus, CompFocus) ~ IsMale + (1 | Subject) +
    (1 | Item), data = eye, family = "binomial")
summary(model4)

As IsMale is significant, we retain it as a predictor.

4.2 Exercise 2

Why does it not make sense to vary the effect of IsMale per Subject as a random slope? Would it make sense to vary it per Item instead?

Answer: ….

If it does, assess if this random slope is necessary and store it as model4b. In the following we will continue without the random slope for simplicity.

# your code goes here

4.3 Adding the predictors of interest

We now investigate the effect of gender competitor and target gender

model5 <- glmer(cbind(TargetFocus, CompFocus) ~ IsMale + TargetNeuter +
    SameGender + (1 | Subject) + (1 | Item), data = eye, family = "binomial")
summary(model5)

Both predictors are not significant (as in the lecture), but there may be an interaction…

model6 <- glmer(cbind(TargetFocus, CompFocus) ~ IsMale + TargetNeuter *
    SameGender + (1 | Subject) + (1 | Item), data = eye, family = "binomial")
# another way to write the formula would be:
# cbind(TargetFocus,CompFocus) ~ IsMale + TargetNeuter + SameGender +
# TargetNeuter:SameGender
summary(model6)

Now both predictors are significant as well as their interaction. Of course, we still need to test if the interaction is really necessary and interpret the interaction by visualizing it.

4.4 Exercise 3

In this exercise, please assess if the interaction is necessary by comparing model6 to model4. Also visualize the interaction if it is necessary in the optimal model.

# your code goes here

Why are we not comparing model6 to model5?

Answer: ….

4.5 Including random slopes

We need to test if the inclusion of the interacting predictors as random slopes is necessary, as it may influence the significance of the fixed effects (in the interaction or separately).

model7 <- glmer(cbind(TargetFocus, CompFocus) ~ IsMale + TargetNeuter *
    SameGender + (1 + TargetNeuter * SameGender | Subject) + (1 | Item),
    data = eye, family = "binomial")
summary(model7)
anova(model6, model7)

The latter model (model7) is much better. It also illustrates that the pattern in your data may become non-significant when taking into account the adequate random-effects structure in your data. (Note that you’d normally add the random slopes one-by-one, but here we added all three at the same time as a shortcut - in principle you still need to assess if this random-effects structure is indeed best, and no simpler structure is better.)

4.6 Adding the color of the target

Now add the factor TargetColor to model6 (we ignore the random slopes in the rest of this lab session, to make computation quicker).

model8 <- glmer(cbind(TargetFocus, CompFocus) ~ IsMale + TargetNeuter *
    SameGender + TargetColor + (1 | Subject) + (1 | Item), data = eye,
    family = "binomial")
summary(model8)

The summary suggests that we need a contrast for brown:

eye$TargetBrown = (eye$TargetColor == "brown") * 1  # 1 if brown, 0 otherwise

4.7 Exercise 4

In this exercise, we’ll add a contrast to the model. Add the contrast TargetBrown to model (model6) and test if this model should be preferred over the more complex model with all color contrasts (model8). Store the model as model9.

# your code goes here

4.8 Exercise 5

In this exercise, we’ll assess if a random slope for TargetBrown is necessary. Assess if TargetBrown needs to be included as a by-subject and/or by-item random slope in model9. Store the model as model10.

# your code goes here

4.9 Exercise 6

Conduct model criticism on the basis of model model10.

5 Final remarks

While the present analysis (i.e. logistic mixed-effects regression) is better than using linear mixed-effects regression modeling, the analysis could be improved by taking into account the time course via generalized additive modeling. In that case our dependent variable would be binary, indicating if the subject focused on the target or not at each specific time point.

6 Replication

To generate the output of the analysis presented above, first install Pandoc. Then copy the following lines to the most recent version of R. Alternatively, you can download the file, open it in R studio, and then press the button Knit HTML. (Or use the Knit HTML button when viewing the file lab.Rmd in RStudio.)

# install rmarkdown package if not installed
if (!"rmarkdown" %in% rownames(installed.packages())) {
    install.packages("rmarkdown", repos = "http://cran.us.r-project.org")
}
library(rmarkdown)  # load rmarkdown package

# download original file if not already exists (to prevent
# overwriting)
if (!file.exists("lab.Rmd")) {
    download.file("http://www.let.rug.nl/wieling/Statistics/Logistic-Mixed-Effects/lab/lab.Rmd",
        "lab.Rmd")
}

# generate output
render("lab.Rmd")  # generates html file with results

# view output in browser
browseURL(paste("file://", file.path(getwd(), "lab.html"), sep = ""))  # shows result