Install or update the latest versions of the packages mgcv and itsadug:
install.packages(c("mgcv", "itsadug"))
Load libraries:
library(mgcv)
library(itsadug)
# switch off itsadug's info messages (use infoMessages("on") to print them to the command line)
infoMessages("off")
Check R version and package versions:
R.version.string
## [1] "R version 3.2.1 (2015-06-18)"
packageVersion("mgcv")
## [1] '1.8.6'
packageVersion("itsadug")
## [1] '1.0.2'
If these versions do not agree with the ones included in the Lab Assignment file, please let us know.
Download the source file nonlinear_0.1.tar.gz and install the package with the following commands in R:
download.file("http://www.sfs.uni-tuebingen.de/~jvanrij/packages/nonlinear_0.1.tar.gz", "nonlinear_0.1.tar.gz")
install.packages("nonlinear_0.1.tar.gz", type="source", repos=NULL)
If the package is installed successfully, you can delete the source file.
Run the demo about random effects as follows:
library(nonlinear)
demoRandom()
Download the data pupilsize_S1.rds and load the data in R:
download.file("http://www.sfs.uni-tuebingen.de/~jvanrij/LSA2015/Data/pupilsize_S1.rds", "pupilsize_S1.rds", mode="wb")
dat <- readRDS('pupilsize_S1.rds')
Inspect the data frame:
head(dat)
str(dat)
The data set is a subset of the full data, with only 10 participants included, so that the more complex models below can be run in reasonable time.
The column Pupil contains the raw pupil dilation measure (blinks have been removed). The data were sampled at 500 Hz, but downsampled to 50 Hz (i.e., one measurement every 20 ms).
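A quick sanity check of the sampling rate (a sketch, using the first event in the loaded data):
# Within a trial, consecutive timestamps should be 20 ms apart (50 Hz):
head(diff(dat[dat$Event == unique(dat$Event)[1], ]$Time))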
Plot the grand average of the data. You can use your own code, or complete the code in the R block below.
# First we group the timestamps in bins of 100 ms:
dat$Timebin <- timeBins(dat$Time, 100)
# Calculate the grand averages with tapply or aggregate for the different timebins.
# Complete the code:
# - replace the dots in line A with the column of dat
# that contains the data to be averaged.
# - replace the dots in line B with the column(s) of dat
#   that group the data.
avg <- aggregate( dat$Pupil, # line A
list( Timebin=dat$Timebin ), # line B
mean, na.rm=TRUE)
# Plot the averages:
# Complete the code:
# - replace the dots in line A with the values to plot on the x-axis
# - replace the dots in line B with the values to plot on the y-axis
plot( avg$Timebin , # line A
avg$x , # line B
type='l', main="Grand average",
xlab='Time (ms)', ylab='Pupil size',
ylim=c(1000,1200),
lwd=2,
bty='n')
# Note: see help(par) for info about the different parameters
Alternative ways to use aggregate to plot the data:
avg <- aggregate( Pupil ~ Timebin, data=dat, mean, na.rm=TRUE)
# or:
avg <- aggregate( Pupil ~ Timebin + Context, data=dat, mean, na.rm=TRUE)
Example of plotting two lines, one for each condition:
# First we group the timestamps in bins of 100 ms:
dat$Timebin <- timeBins(dat$Time, 100)
# Calculate the grand averages with tapply or aggregate for the different timebins.
# Complete the code:
# - replace the dots in line A with the column of dat
# that contains the data to be averaged.
# - replace the dots in line B with the column(s) of dat
#   that group the data.
avg <- aggregate( dat$Pupil, # line A
list( Timebin=dat$Timebin, Context=dat$Context), # line B
mean, na.rm=TRUE)
# Plot the averages:
# Complete the code:
# - replace the dots in line A with the values to plot on the x-axis
# - replace the dots in line B with the values to plot on the y-axis
plot( avg[avg$Context=="AP", ]$Timebin , # line A
avg[avg$Context=="AP", ]$x , # line B
type='l', main="Grand averages",
xlab='Time (ms)', ylab='Pupil size',
ylim=c(1000,1200),
lwd=2,
bty='n')
lines(avg[avg$Context=="PA", ]$Timebin , # line A
avg[avg$Context=="PA", ]$x, lwd=2, lty=2 )
legend("bottomright",
legend=c("AP", "PA"),
lwd=2, lty=c(1,2))
# Note: see help(par) for info about the different parameters
What does the Timebin column represent? In case you want to see the code for calculating the time bins, type timeBins in the command line. Why would it sometimes be useful to plot Timebin rather than raw timestamps? The Timebin column groups the timestamps into larger time bins of 100 ms. This makes the line smoother and allows for inspection of the major effects.
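If you cannot easily check the timeBins code, here is a rough sketch of what such binning amounts to (an approximation for illustration; the actual implementation in itsadug may differ in details such as the position of the bin label):
binsize <- 100
# Assign each timestamp the midpoint of its 100 ms bin:
timebin.sketch <- (floor(dat$Time / binsize) + 0.5) * binsize
head(cbind(Time = dat$Time, Timebin = timebin.sketch))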
To see how far the time series run, type tapply(dat$Time, list(dat$Event), max) in the command line. Most time series run until 3000 ms, so the averages after 3000 ms are based on a limited number of data points. That is why these may become more extreme.
Check the number of rows in the data. With more than 10,000 data points you should use the function bam() rather than gam(). Also, when you are planning to include many random smooths and other complex model terms, bam() is more efficient.
nrow(dat)
## [1] 44093
Run the model with a nonlinear effect for Time. Replace the dots ... with a nonlinear smooth for Time.
m1 <- bam( Pupil ~ s(Time), data=dat )
Inspect the summary (summary(m1)):
summary(m1)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Pupil ~ s(Time)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1090.137 1.963 555.4 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(Time) 5.013 6.114 58.81 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.00808 Deviance explained = 0.819%
## fREML = 3.2806e+05 Scale est. = 1.6979e+05 n = 44093
What does the intercept represent? The intercept is a constant value added to the other model terms (here only s(Time)). Because we don't have any categorical predictors included, it is roughly the mean value of pupil dilation (1089.92).
What do the estimated degrees of freedom (edf) indicate? The edf is 5, which means that the smooth is not a straight line (in that case, the edf would be 1).
The smooth is at some point significantly different from a flat horizontal zero line: the confidence interval of the smooth does not everywhere contain the zero line.
Use the following code to visualize the smooth.
par(mfrow=c(1,2), cex=1.1)
# Partial effect plot:
plot(m1, select=1, shade=TRUE)
abline(h=0)
# Fitted values:
plot_smooth(m1, view="Time")
The first plot visualizes the partial effect of the smooth of Time. So it selectively visualizes the first line of the smooth term summary; no other terms are included. The second plot visualizes the summed or fitted effects, which are the model's predictions: every predictor in the model is set to a certain value and all the resulting effects are summed.
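To make the relation between the two plots concrete, here is a small check (a sketch, not part of the assignment): for m1, the fitted values equal the intercept plus the partial effect of s(Time).
# Partial effects per model term (excluding the intercept):
pe <- predict(m1, type="terms")
# Adding the intercept gives the fitted values (identity link):
head(coef(m1)["(Intercept)"] + rowSums(pe))
head(fitted(m1))  # same values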
Participants differ considerably in pupil dilation size. Visualize the individual patterns. You could use the code below, or use another plotting method.
boxplot(Pupil ~ Subject, data=dat)
# or alternatively:
library(lattice)
xyplot(Pupil ~ Time | Subject, pch=16, cex=.5, data = dat)
To take into account differences between participants we could add a random intercept for each subject to the model. Replace the dots ... with random intercepts for Subject and random intercepts for Item.
m2 <- bam( Pupil ~ s(Time)
+ s(Subject, bs='re')
+ s(Item, bs='re'), data=dat )
Inspect the summary (summary(m2)).
What does the intercept represent? The intercept is a constant value added to the other model terms (here the smooth s(Time) and the random intercepts).
Use the following code to visualize the smooth.
# make sure to have messages on:
infoMessages("on")
# split plot window in two:
par(mfrow=c(1,2), cex=1.1)
# Partial effect plot:
plot(m2, select=1, shade=TRUE)
abline(h=0)
# Fitted values:
plot_smooth(m2, view="Time")
## Summary:
## * Time : numeric predictor; with 30 values ranging from 0.000000 to 3243.000000.
## * Subject : factor; set to the value(s): s08.
## * Item : factor; set to the value(s): 30.
What do the plots represent? (Also pay attention to the summary that the plot_smooth function prints in the command window.) The first plot visualizes the partial effect of Time, i.e., the first line of the smooth term summary. The second plot visualizes the summed effects over time, i.e., the intercept and the values for Subject and Item are included. The message in the command line indicates that the plot shows the fitted effect of Time for subject "s08" and item "30".
Run the plot_smooth function again, but now add rm.ranef=TRUE as an argument: plot_smooth(m2, view="Time", rm.ranef=TRUE). Why did the plot change so much?
plot_smooth(m2, view="Time", rm.ranef=TRUE)
## Summary:
## * Time : numeric predictor; with 30 values ranging from 0.000000 to 3243.000000.
## * Subject : factor; set to the value(s): s08. (Might be canceled as random effect, check below.)
## * Item : factor; set to the value(s): 30. (Might be canceled as random effect, check below.)
## * NOTE : The following random effects columns are canceled: s(Subject),s(Item)
##
Now the effects of subject and item (the intercept adjustments) have been canceled out. The confidence intervals have increased, because the plot now shows the uncertainty around the estimated average. In the previous plot the confidence bands were narrower, because the model was more certain about the intercept of that specific subject and item combination.
We now inspect the random effects. Run the following code.
par(mfrow=c(1,3), cex=1.1)
# Partial effect plot:
plot(m2, select=2)
# get the values for individual subjects:
rand <- get_random(m2)
# note that get_random returns a list with
# the random adjustments for each random
# intercept or slope:
rand
## $`s(Subject)`
## s01 s02 s03 s04 s05 s07
## -101.35582 -236.97476 39.30879 69.15500 -702.93196 -203.94801
## s08 s09 s10 s06
## 386.88710 -55.72687 849.04654 -43.46002
##
## $`s(Item)`
## 1 2 3 4 5 6
## -22.382871 -45.003660 26.324149 17.031028 14.617727 -39.481939
## 7 8 9 10 11 12
## -30.403224 -23.352711 27.480479 26.427325 24.066908 82.179005
## 13 14 15 16 17 18
## -79.583049 7.162514 53.285312 110.270764 -43.767303 -51.976279
## 19 20 21 22 23 24
## -62.005568 -48.999765 13.031326 18.448216 76.034841 -17.816977
## 25 26 27 28 29 30
## -37.125977 16.777733 -6.531431 -28.830664 23.932075 34.731534
## 31 32
## -42.589029 8.049511
# similar plot with subj names added:
xy <- qqnorm(rand[[1]], main='s(Subject)',
col='steelblue', pch=16, bty='n') # <- some additional arguments that could be left out
qqline(rand[[1]])
text( xy$x, rand[[1]], labels=names( rand[[1]] ),
pos=3, col="steelblue", xpd=TRUE) # <- some additional arguments that could be left out
# same for items:
xy <- qqnorm(rand[[2]], main='s(Item)',
col='steelblue', pch=16, bty='n') # <- some additional arguments that could be left out
qqline(rand[[2]])
text( xy$x, rand[[2]], labels=names( rand[[2]] ),
pos=3, col="steelblue", xpd=TRUE) # <- some additional arguments that could be left out
The plots show the random adjustments of the intercept for Subject and Item. A negative random intercept shifts the whole fitted effect down by that constant value.
The random intercept component s(Subject, bs="re") equals (1|Subject) in lmer.
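For readers who know the lme4 package, a hedged sketch of this correspondence (the nonlinear smooth has no direct lmer counterpart, so a linear Time term is used in the comparison; not part of the assignment):
# library(lme4)
# lmer(Pupil ~ Time + (1 | Subject) + (1 | Item), data=dat)
# corresponds, apart from the nonlinearity, to:
# bam(Pupil ~ Time + s(Subject, bs='re') + s(Item, bs='re'), data=dat)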
Inspect the random intercepts of s05, s10, and s06.
1. s05 has a very negative random intercept, which means a negative adjustment of the intercept. So this participant generally has a smaller pupil dilation size.
2. s10 has a very positive random intercept, which means a positive adjustment of the intercept. So this participant generally has a much larger pupil dilation size.
3. The value of s06 is very close to 0, which means only a small adjustment of the intercept. So this participant has a pupil dilation size close to the average.
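These values can be looked up directly in the rand object created above (compare with the printed output: s05 ≈ -703, s10 ≈ 849, s06 ≈ -43):
# Random intercept adjustments for the three subjects discussed:
rand[[1]][c("s05", "s10", "s06")]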
The following plotting function produces plots that compare the fitted effects with the data. Run the code. The most important line of the function is the first command dat$fit <- fitted(model)
: The function stores the fitted values (model predictions) as a separate column in the data.
# This is a simple function for comparing the
# fitted effect with the data:
plot.fitted <- function(model, subject, n=3, column="Pupil"){
dat$fit <- fitted(model)
# extract the items for this subject:
items <- unique( as.character(dat[dat$Subject==subject, "Item"] ))
# if n > length of items, then take length of items:
n <- min(n, length(items))
# select n random items
items <- sample(items,n)
# select these items from the data:
tmp <- droplevels( dat[dat$Subject==subject & dat$Item %in% items,
c("Time", "Item", column, "fit")])
# set up empty plot window
emptyPlot(range(tmp$Time), range(tmp[,column]), main=subject)
# plot the items and their fitted effects:
for(i in items){
lines(tmp[tmp$Item==i,]$Time, tmp[tmp$Item==i,column], col='darkgray', xpd=TRUE)
lines(tmp[tmp$Item==i,]$Time, tmp[tmp$Item==i,]$fit, col='red', xpd=TRUE)
}
}
## end of function
You can run the function with a subject as input. The optional argument n
tells the function how many items to plot of that particular subject (defaults to 3):
par(mfrow=c(1,2))
plot.fitted(m2, 's01')
plot.fitted(m2, 's10', n=15)
The y-limits are different for each plot. With a smaller y-axis range, the pattern in the curves is more pronounced. However, the trend over time is the same for all participants.
Same for all participants: the intercept, the smooth s(Time), and the intercept adjustments for items, s(Item). Different: the intercept adjustments for subjects, s(Subject); each subject has a different adjustment.
Run the following model (ignore the warning) with random smooths over Time for each participant, and include similar random smooths for items.
m3 <- bam( Pupil ~ s(Time)
+ s(Time, Subject, bs='fs', m=1)
+ s(Time, Item, bs='fs', m=1), data=dat )
## Warning in gam.side(sm, X, tol = .Machine$double.eps^0.5): model has
## repeated 1-d smooths of same variable.
Inspect the summary (summary(m3)).
summary(m3)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Pupil ~ s(Time) + s(Time, Subject, bs = "fs", m = 1) + s(Time,
## Item, bs = "fs", m = 1)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1086 128 8.483 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(Time) 4.701 5.099 4.998 0.000128 ***
## s(Time,Subject) 74.230 89.000 6244.853 < 2e-16 ***
## s(Time,Item) 201.938 287.000 30.866 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.928 Deviance explained = 92.9%
## fREML = 2.7052e+05 Scale est. = 12244 n = 44093
What happened to the smooth of Time? Is it still significant? The smooth has become less wiggly (around 4.7 edf), but it is still nonlinear, and it is still significantly different from 0.
Why do the random smooth terms have such high edf values? These terms fit 10 smooths, one for each participant, and 32 smooths, one for each item.
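These counts can be verified from the data (a quick check, assuming Subject and Item are stored as factors, as the random smooths require):
# Number of levels behind each random smooth term:
nlevels(dat$Subject)  # 10 participants
nlevels(dat$Item)     # 32 items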
Use the following code to visualize the smooth.
par(mfrow=c(1,3), cex=1.1)
# Model term 1
plot(m3, select=1, shade=TRUE, scale=0)
abline(h=0)
# Model term 2
plot(m3, select=2)
abline(h=0)
# Summed effect of time:
plot_smooth(m3, view="Time", plot_all="Subject", rug=FALSE)
## Summary:
## * Time : numeric predictor; with 30 values ranging from 0.000000 to 3243.000000.
## * Subject : factor; set to the value(s): s01, s02, s03, s04, s05, s06, s07, s08, s09, s10.
## * Item : factor; set to the value(s): 30.
The first plot visualizes the partial effect of the main effect of Time (without the intercept or other model terms included). The second plot visualizes the partial effect of the random smooths for Subject over Time; these are the random adjustments of the general time smooth for each subject. The last plot shows the summed effects for each subject. For every line this includes the intercept, the main trend of Time, a random adjustment for that participant over Time, and a random adjustment for an arbitrary item (item 30, according to the command line summary).
Side step: flexible visualization functions (advanced)
The following three plots show the separate components that build up the third plot. Note that with the function inspect_random
the individual item plots could be visualized.
par(mfrow=c(1,3), cex=1.1)
# s(Time) + intercept
# no random effects
plot_smooth(m3, view="Time",
main="intercept + s(Time)",
ylim=c(500,2000),
rm.ranef=TRUE, rug=FALSE)
# random effect for item 30:
inspect_random(m3, select=3, cond=list(Item=30), lwd=2,
main="+ Item 30", ylim=c(-750,750))
# random effects for subjects:
inspect_random(m3, select=2, main="+ s(Time, Subject)", ylim=c(-750,750), xpd=TRUE)
How could we plot the effects for each subject without the random effects for items included? With the argument rm.ranef=TRUE both the Subject and the Item effects would be removed, so we need a more selective method.
The function get_predictions from the package itsadug (which in turn is a wrapper around mgcv's predict.gam) allows for retrieving model predictions with specific random effects canceled. The rm.ranef argument accepts a logical value, or a regular expression when specific effects need to be removed. Make sure that this expression only matches the random effect to remove, and does not by accident apply to a fixed effect.
par(mfrow=c(1,2), cex=1.1)
# same time values as previous plots:
times <- seq(0,3243, length=30)
emptyPlot(range(times), c(400,2000),
main="No item effects", xlab='Time (ms)',
ylab="Summed effect, excl. item effect")
subj <- unique(dat$Subject)
for(su in 1:length(subj)){
p <- get_predictions(m3,
cond=list(Time=times, Subject=subj[su]),
rm.ranef="Item")
plot_error(p$Time, p$fit, p$CI, shade=TRUE,
col=rainbow(length(subj))[su], lwd=2, xpd=TRUE)
}
# For comparison:
# plot with item effects included:
plot_smooth(m3, view="Time", plot_all="Subject", rug=FALSE,
main="Incl. Item 30", xlab='Time (ms)',
ylab="Summed effect, incl. item effect")
If you use predictions like this, make sure that your predictions make sense.
The random smooths capture differences in the pupil dilation average (intercept differences) and differences in trend over time.
Use the same function as before to inspect the fitted effects:
par(mfrow=c(1,1))
plot.fitted(m3, 's01', n=10)
Although the random smooths capture some of the variation between participants, they do not capture all the variation within subjects.
In addition to the random smooths over Time, we could include a random intercept for Event (each unique combination of Trial and Subject).
Run the following model (may take some time):
m4 <- bam( Pupil ~ s(Time)
+ s(Time, Subject, bs='fs', m=1)
+ s(Time, Item, bs='fs', m=1)
+ s(Event, bs='re') , data=dat )
## Warning in gam.side(sm, X, tol = .Machine$double.eps^0.5): model has
## repeated 1-d smooths of same variable.
Use the following code to visualize the smooth terms:
par(mfrow=c(1,3), cex=1.1)
# Model term 1
plot(m4, select=1, shade=TRUE, scale=0)
abline(h=0)
# Model term 2
plot(m4, select=2)
abline(h=0)
# Model term 3
plot(m4, select=3)
abline(h=0)
Inspection of the fitted effects:
par(mfrow=c(1,1))
plot.fitted(m4, 's01', n=15)
Why do we include a random intercept for each Event rather than for each Item? Why would the random effects of Subject and Item in many time series data not be sufficient? Random smooths for Subject capture the average pattern for each participant, and random smooths for Item capture the average pattern for each item. However, a time series consists of multiple measurements which are not only determined by the averages of participants and items, but are also influenced by events happening in that particular time series (e.g., associations with stimuli, or still thinking about the previous trial). If these individual time series show a lot of variation, as in pupil dilation data, the individual time series are not fit very well by the model. This will result in autocorrelation, because the model fit deviates from the data in a systematic way. Random intercepts for individual time series reduce the distance between the data and the model fit, thereby reducing the size of the residuals.
Different answers are possible. Our opinion: the model fit has improved with the event intercept adjustments, but a random smooth for each Event, instead of the Subject and Item random smooths, might capture additional variation within each time series. However, that takes a long time to run and is for many data sets simply not feasible.
Assumptions of regression:
1. Residuals are normally distributed
2. There is no structure in the residuals (independent observations)
To test whether this model meets these assumptions, we inspect the residuals of the model.
Plot the distribution of the residuals in a QQ-norm plot:
qqnorm(resid(m4))
qqline(resid(m4))
Plot the ACF of the residuals:
par(mfrow=c(1,1))
acf(resid(m4))
On the x-axis: quantiles from a normal (Gaussian) distribution. On the y-axis: quantiles from the residuals of the model.
The straight line is where the residual quantiles would lie if they followed a normal distribution.
The residuals clearly do not follow a normal (Gaussian) distribution: the most negative residuals are more negative than expected under a normal distribution, and the extreme positives are more positive than expected. This looks more like a t-distribution, which is steeper in shape (with heavier tails) than the normal distribution, as illustrated in the two QQ-plots (and associated density plots) below.
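The referenced illustration can be reproduced with simulated data (a sketch; the choice of df=3 is arbitrary, just to make the heavy tails visible):
# Compare QQ-plots of a normal and a heavy-tailed (t-distributed) sample:
set.seed(123)
x.norm <- rnorm(5000)
x.t <- rt(5000, df=3)
par(mfrow=c(1,2))
qqnorm(x.norm, main="Normal sample"); qqline(x.norm)
qqnorm(x.t, main="t-distributed sample"); qqline(x.t)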
An ACF plot visualizes the autocorrelation in the residuals. On the x-axis are the lags, i.e., the distances between residuals. The y-axis shows the average correlation between data points at each lag. Lag 0 represents the correlation between the data points (i.e., the residuals of the model) and themselves, which is always equal to 1. Lag 1 is the average correlation between all the residuals and the residuals one measurement further; Lag 2 is the average correlation between all the residuals and the residuals two measurements further. So basically Lag \(n\) equals \(corr(residuals_{t}, residuals_{t+n})\). Here are some examples:
# lags 0, 1, 2, and 5:
tmp <- data.frame(t0 = resid(m4),
t1 = move_n_point(resid(m4), n=1),
t2 = move_n_point(resid(m4), n=2),
t5 = move_n_point(resid(m4), n=5))
# move_n_point shifts the values n points forward
# Print ACF values:
print(acf(resid(m4), plot=FALSE))
##
## Autocorrelations of series 'resid(m4)', by lag
##
## 0 1 2 3 4 5 6 7 8 9
## 1.000 0.977 0.953 0.925 0.893 0.859 0.822 0.784 0.744 0.704
## 10 11 12 13 14 15 16 17 18 19
## 0.663 0.623 0.583 0.543 0.505 0.468 0.432 0.398 0.365 0.333
## 20 21 22 23 24 25 26 27 28 29
## 0.303 0.275 0.248 0.222 0.198 0.175 0.152 0.132 0.112 0.092
## 30 31 32 33 34 35 36 37 38 39
## 0.073 0.056 0.038 0.022 0.006 -0.010 -0.025 -0.039 -0.053 -0.066
## 40 41 42 43 44 45 46
## -0.079 -0.090 -0.102 -0.113 -0.123 -0.133 -0.142
# Compare correlations
# (if necessary, remove NA's in first row(s))
# Lag 0:
cor(tmp$t0, tmp$t0)
## [1] 1
# Lag 1:
tmp1 <- tmp[2:nrow(tmp),]
cor(tmp1$t0, tmp1$t1)
## [1] 0.9774553
# Lag 2:
tmp2 <- tmp[3:nrow(tmp),]
cor(tmp2$t0, tmp2$t2)
## [1] 0.9530716
# Lag 5:
tmp5 <- tmp[6:nrow(tmp),]
cor(tmp5$t0, tmp5$t5)
## [1] 0.8588949
The residuals are clearly not independent: there is a strong correlation between the residuals and the residuals of the following data points (Lag 1 correlation of .977). So there is structure in the residuals that is not captured by the regression model.
We can clearly see this structure when plotting the residuals over Time: individual time series are recognizable in the residuals (which should not be the case if the residuals were randomly distributed).
plot(dat$Time, residuals(m4),
pch=16, col=alpha(1), cex=.75, # layout points
ylab = "Residuals", xlab="Time", # labels
bty='n')
abline(h=0)
To account for autocorrelation in the residuals, we could include an AR1 model in the GAMM.
First, we need to generate a ‘start event’ column:
dat <- start_event(dat, event="Event")
head(dat[, c("Subject", "Item", "Time", "start.event")])
## Subject Item Time start.event
## 1 s01 1 2 TRUE
## 2 s01 1 22 FALSE
## 3 s01 1 42 FALSE
## 4 s01 1 62 FALSE
## 5 s01 1 82 FALSE
## 6 s01 1 102 FALSE
# one true and rest false for each event:
head( table(dat$Event, dat$start.event) )
##
## FALSE TRUE
## s01.1 147 1
## s02.1 142 1
## s03.1 142 1
## s04.1 147 1
## s05.1 142 1
## s07.1 142 1
The first time point of each trial and subject combination (i.e., of each individual time series) is set to TRUE and all other data points to FALSE, marking the start of each individual time series.
The AR1 model needs this information, because without it the model would assume that all data points are connected and correlated. However, between two time series there may have been some time without data recording, so the data points of different time series are much less correlated. The AR.start column tells the model which values are correlated and where the breaks between time series are.
Second, we need a starting value for the parameter rho, which indicates the amount of autocorrelation in the residuals. A good starting point is the Lag 1 value:
rho <- acf(resid(m4), plot=FALSE)$acf[2]
rho
## [1] 0.9774546
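To build intuition for what the AR1 model assumes, here is a small simulation (a sketch with made-up numbers): each residual equals rho times the previous residual, plus independent Gaussian noise.
# Simulate an AR1 series with rho = 0.98 and inspect its ACF:
set.seed(42)
n <- 1000
rho.sim <- 0.98
e <- numeric(n)
e[1] <- rnorm(1)
for(i in 2:n){
  e[i] <- rho.sim * e[i-1] + rnorm(1)
}
acf(e)  # slowly decaying ACF, similar to the residuals of m4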
Run the model again with AR1 model included:
m4b <- bam( Pupil ~ s(Time)
+ s(Time, Subject, bs='fs', m=1)
+ s(Time, Item, bs='fs', m=1)
+ s(Event, bs='re')
, data=dat, AR.start=dat$start.event, rho=rho )
## Warning in gam.side(sm, X, tol = .Machine$double.eps^0.5): model has
## repeated 1-d smooths of same variable.
Inspect the summary and the plots to see what has changed in the model.
summary(m4)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Pupil ~ s(Time) + s(Time, Subject, bs = "fs", m = 1) + s(Time,
## Item, bs = "fs", m = 1) + s(Event, bs = "re")
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1070.3 127.3 8.407 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(Time) 1.00 1 1.102e+01 0.000901 ***
## s(Time,Subject) 83.04 89 4.944e+06 < 2e-16 ***
## s(Time,Item) 223.47 287 8.064e+03 2.89e-15 ***
## s(Event) 286.14 308 2.873e+02 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.976 Deviance explained = 97.7%
## fREML = 2.471e+05 Scale est. = 4053.1 n = 44093
summary(m4b)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Pupil ~ s(Time) + s(Time, Subject, bs = "fs", m = 1) + s(Time,
## Item, bs = "fs", m = 1) + s(Event, bs = "re")
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1078.5 127.7 8.444 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(Time) 8.229 8.335 13.22 < 2e-16 ***
## s(Time,Subject) 80.082 89.000 18748.90 < 2e-16 ***
## s(Time,Item) 240.571 287.000 32.96 2.91e-11 ***
## s(Event) 268.872 308.000 15.40 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.972 Deviance explained = 97.2%
## fREML = 1.5454e+05 Scale est. = 1362.2 n = 44093
The most striking differences in the summaries are that including the AR1 model resulted in a more wiggly smooth for Time (edf 1 (linear) changed to edf 8.3), and a higher intercept (from 1070.3 to 1078.5).
par(mfrow=c(1,2))
plot(m4, select=1, scale=0, rug=FALSE,
main="Model m4", shade=TRUE)
abline(h=0)
plot(m4b, select=1, scale=0, rug=FALSE,
main="Model m4b", shade=TRUE)
abline(h=0)
The plots confirm the more wiggly pattern in the model that includes the AR1 model.
In addition, some of the random effects have changed. Below, the comparison for the Item smooths is plotted:
par(mfrow=c(1,2))
plot(m4, select=3, rug=FALSE,
main="Model m4", shade=TRUE, ylim=c(-100,100))
abline(h=0)
plot(m4b, select=3, rug=FALSE,
main="Model m4b", shade=TRUE)
abline(h=0)
Two plots of the new residuals:
par(mfrow=c(1,2))
acf(resid(m4b), main='Residuals')
acf_resid(m4b, main='Corrected residuals')
- Left plot: residuals of the model.
- Right plot: residuals of the model after correction for autocorrelation by the AR1 model. This plot shows that the AR1 model reduced the autocorrelation, but did not eliminate it completely.
Use compareML to compare the two models, m4 and m4b. Which one is preferred?
compareML(m4, m4b)
## m4: Pupil ~ s(Time) + s(Time, Subject, bs = "fs", m = 1) + s(Time,
## m4: Item, bs = "fs", m = 1) + s(Event, bs = "re")
##
## m4b: Pupil ~ s(Time) + s(Time, Subject, bs = "fs", m = 1) + s(Time,
## m4b: Item, bs = "fs", m = 1) + s(Event, bs = "re")
##
## Model m4b preferred: lower fREML score (92565.223), and equal df (0.000).
## -----
## Model Score Edf Difference Df
## 1 m4 247102.6 8
## 2 m4b 154537.4 8 -92565.223 0.000
## Warning in compareML(m4, m4b): AIC is not reliable, because an AR1 model
## is included (rho1 = 0.000000, rho2 = 0.977455).
Model m4b is clearly preferred, because it has a considerably lower fREML score, although the degrees of freedom are the same.
With the arguments split_pred and n, the function acf_resid compares the ACF (in particular the Lag 1 value) for different events:
acf_resid(m4b, split_pred=list(Event=dat$Event), n=9)
## Quantiles to be plotted:
## 0% 12.5% 25% 37.5% 50% 62.5%
## -0.31079713 -0.02105642 0.02197881 0.05849034 0.09767531 0.15852502
## 75% 87.5% 100%
## 0.23152098 0.32193121 0.79064821
Each of the panels is representative of a quantile of the data (a 1/9th part). These plots indicate that there are some extreme events with high autocorrelation, for which including the AR1 model did not work.
A different value of \(\rho\) for each subject, or maybe even for each time series event, would be better than just one value for all trials and subjects together.
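One way to inspect this is to compute the Lag 1 autocorrelation per subject (a rough sketch, not part of the assignment; it ignores event boundaries within a subject and assumes no rows were dropped during fitting, so the values are only indicative):
# Lag 1 autocorrelation of the m4 residuals, per subject:
lag1.per.subject <- sapply(split(resid(m4), dat$Subject), function(r){
  cor(r[-length(r)], r[-1])
})
round(sort(lag1.per.subject), 3)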
Hand in the R code, R output, the plots, and the answers on the questions in the section ‘d. Model criticism’.
Deadline: before next class.
Format: html document (R Markdown)
An easy way to include everything in a single document is by using R markdown.
Quick instructions for R Markdown in R Studio: www.sfs.uni-tuebingen.de/~jvanrij/Rinfo.html
When NOT using R Studio, try the following steps:
download.file('http://www.let.rug.nl/~wieling/statscourse/lecture4/lab/answers/lab-including-answers.Rmd', 'analysis.Rmd')
library(rmarkdown)
render('analysis.Rmd') # generates html file with results
browseURL(paste('file://', file.path(getwd(),'analysis.html'),
sep='')) # shows result
Alternatively, you could hand in a pdf that contains all the information.
The assignments can be sent by mail to vanrij.jacolien 'AT' gmail.com, with "LSA lab 1" in the subject line.
Note: nonlinear interactions and the use of the ti()
smooth will be explained in the next class.
Often we are interested in the change in pupil size after a stimulus. In that case it might be useful to work with the baselined pupil size. Instead of baselining, we could also include the baseline as a fixed-effect predictor, to account for the pupil size at the start of the analysis window:
m5 <- bam( Pupil ~ s(Time)
+ s(baseline) + ti(baseline, Time)
+ s(Time, Subject, bs='fs', m=1)
+ s(Time, Item, bs='fs', m=1)
+ s(Event, bs='re') , data=dat )
Note that the effects are highly similar to a model of baselined pupil dilation:
m5b <- bam( Pupil_base ~ s(Time)
+ s(baseline) + ti(baseline, Time)
+ s(Time, Subject, bs='fs', m=1)
+ s(Time, Item, bs='fs', m=1)
+ s(Event, bs='re') , data=dat )
The predictor baseline and the interaction between baseline and Time are included to account for differences in the trend of pupil dilation depending on the size of the pupil at the start of the trial.
To account for the effect of gaze position on pupil dilation, we include an interaction between the X position and Y position of the fixation. As both predictors are on the same scale (pixels on the screen), we can use the function s() for modeling the interaction.
m6 <- bam( Pupil ~ s(Time)
+ s(baseline) + ti(baseline, Time)
+ s(Xgaze, Ygaze)
+ s(Time, Subject, bs='fs', m=1)
+ s(Time, Item, bs='fs', m=1)
+ s(Event, bs='re') , data=dat )
Note that this is not included as a random effect by subject, but rather as a fixed effect to reduce the processing time for the model.
Inspect the interaction:
par(mfrow=c(1,2), cex=1.1)
plot(m6, select=4)  # term 4 is s(Xgaze, Ygaze)
# same plot but now with colors:
pvisgam(m6, select=4, view=c("Xgaze", "Ygaze"))
# make white the areas where no data was observed:
fadeRug(dat$Xgaze, dat$Ygaze)
# indicate the center of the screen:
abline(v=1024/2, lty=3)
abline(h=768/2, lty=3)