plot(lls$BAC, lls$ratingNL, xlab='BAC', ylab='rating (NL)')
Introduction to linear regression
Call:
lm(formula = ratingNL ~ 1 + BAC, data = lls)
Residuals:
Min 1Q Median 3Q Max
-1.0653 -0.3496 0.0867 0.2953 1.0466
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.3497 0.0995 3.51 0.0011 **
BAC -0.3505 0.1624 -2.16 0.0369 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.456 on 41 degrees of freedom
Multiple R-squared: 0.102, Adjusted R-squared: 0.0801
F-statistic: 4.66 on 1 and 41 DF, p-value: 0.0369
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.34970 0.099513 3.5141 0.0010909
BAC -0.35049 0.162448 -2.1576 0.0368728
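Interpretation of the slope: for each unit increase in BAC, ratingNL (dependent variable) decreases by 0.35
Interpretation of the intercept: someone with a BAC of 0 is expected to have a ratingNL of 0.35
Regression formula: ratingNL = 0.35 + -0.35 * BAC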
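As a minimal sketch, the regression formula can be evaluated for a few BAC values. The coefficients are the rounded estimates from the table (0.35 and -0.35); the helper name `fitted_rating` is hypothetical and not part of the original slides.

```r
# Rounded coefficients taken from the coefficient table
b0 <- 0.35    # intercept: expected ratingNL when BAC = 0
b1 <- -0.35   # slope: change in ratingNL per unit increase in BAC

# Hypothetical helper that evaluates ratingNL = b0 + b1 * BAC
fitted_rating <- function(bac) b0 + b1 * bac

bac_values <- c(0, 0.5, 1.5)
fits <- fitted_rating(bac_values)
print(data.frame(BAC = bac_values, fitted_ratingNL = fits))
```

With a fitted model object, `predict()` with a `newdata` data frame returns the same kind of values, based on the unrounded coefficients.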
BAC of \(0\): fitted ratingNL of \(0.35 + -0.35 * 0 = 0.35\) (= Intercept)
BAC of \(0.5\): fitted ratingNL of \(0.35 + -0.35 * 0.5 = 0.175\)
BAC of \(1.5\): fitted ratingNL of \(0.35 + -0.35 * 1.5 = -0.175\)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.35 0.0995 3.51 0.00109
BAC -0.35 0.1624 -2.16 0.03687
lls$BAC.c = lls$BAC - mean(lls$BAC) # center BAC
summary(mc <- lm(ratingNL ~ BAC.c, data=lls))$coef # results with centered BAC
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.196 0.0695 2.82 0.00736
BAC.c -0.350 0.1624 -2.16 0.03687
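Interpretation of the intercept: people with an average BAC (BAC = 0.44 => BAC.c = 0) have an expected L1 rating of 0.196
Interpretation of the slope: for each unit increase in BAC, ratingNL decreases by 0.35
lls$BAC.z = (lls$BAC - mean(lls$BAC)) / sd(lls$BAC) # z-transform BAC values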
summary(mz <- lm(ratingNL ~ BAC.z, data=lls))$coef # results with z-transformed BAC
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.196 0.0695 2.82 0.00736
BAC.z -0.152 0.0704 -2.16 0.03687
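The effect of centering and z-transforming on the coefficients can be checked with a small sketch. It uses simulated data as a stand-in, not the `lls` data set; the variable names `x`, `y`, `m_raw`, `m_c` and `m_z` are assumptions made for illustration only.

```r
set.seed(1)                                   # simulated data, NOT the lls data
n <- 43
x <- runif(n, 0, 1.5)                         # stand-in for BAC
y <- 0.35 - 0.35 * x + rnorm(n, sd = 0.45)    # stand-in for ratingNL

x_c <- x - mean(x)                            # centered predictor
x_z <- (x - mean(x)) / sd(x)                  # z-transformed predictor

m_raw <- lm(y ~ x)
m_c   <- lm(y ~ x_c)
m_z   <- lm(y ~ x_z)

# Centering: same slope, intercept becomes the mean of y
# z-transforming: slope is multiplied by sd(x), t-value is unchanged
print(rbind(raw = coef(m_raw), centered = coef(m_c), z = coef(m_z)))
print(c(t_raw = summary(m_raw)$coef[2, 3], t_z = summary(m_z)$coef[2, 3]))
```

In all three models the t-value and p-value of the predictor are the same; only the scale on which the intercept and slope are expressed changes.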
E.g., \(z \geq 2 \implies p \approx 0.025\)
But \(z\)-values can only be determined when \(\sigma\) is known
\[z = \frac{m - \mu}{\sigma / \sqrt{n}}\]
\[SE = s / \sqrt{n}\]
\[t = \frac{\beta}{SE}\]
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.35 0.0995 3.51 0.00109
BAC -0.35 0.1624 -2.16 0.03687
[1] 0.00109
[1] 0.0368693
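Values like these can be recomputed by hand from the estimates and standard errors, using \(t = \beta / SE\) and the \(t\)-distribution with 41 degrees of freedom. The sketch below is one way to do so and is not necessarily the code used for the slides; the object names are assumptions.

```r
# Estimates and standard errors copied from the coefficient table
est <- c(Intercept = 0.34970, BAC = -0.35049)
se  <- c(Intercept = 0.099513, BAC = 0.162448)
df  <- 41                            # residual degrees of freedom (43 - 2)

t_val <- est / se                    # t = estimate / standard error
p_two <- 2 * pt(-abs(t_val), df)     # two-sided p-value from the t-distribution

print(round(t_val, 4))
print(signif(p_two, 3))
```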
R does it for us
Call:
lm(formula = ratingNL ~ 1 + BAC, data = lls)
Residuals:
Min 1Q Median 3Q Max
-1.0653 -0.3496 0.0867 0.2953 1.0466
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.3497 0.0995 3.51 0.0011 **
BAC -0.3505 0.1624 -2.16 0.0369 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.456 on 41 degrees of freedom
Multiple R-squared: 0.102, Adjusted R-squared: 0.0801
F-statistic: 4.66 on 1 and 41 DF, p-value: 0.0369
[1] -0.31931
[1] 0.10196
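The two values above look like the correlation between BAC and ratingNL and its square, which matches the Multiple R-squared of the model. That reading is an assumption, since the code that produced them is not shown. A sketch on simulated stand-in data (not the `lls` data) illustrates the relation for a model with a single predictor:

```r
set.seed(2)                                   # simulated data, NOT the lls data
n <- 43
x <- runif(n, 0, 1.5)                         # stand-in for BAC
y <- 0.35 - 0.35 * x + rnorm(n, sd = 0.45)    # stand-in for ratingNL

r  <- cor(x, y)                               # Pearson correlation
r2 <- summary(lm(y ~ x))$r.squared            # R-squared of the simple regression

print(c(r = r, r_squared = r^2, model_r2 = r2))

# The values printed on the slide are consistent with the same relation
slide_r <- -0.31931
print(slide_r^2)
```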
Our RQ was: Does alcohol have a negative influence on L1 language proficiency? Our hypotheses were: \(H_0\): \(\beta_{bac} = 0\), \(H_a\): \(\beta_{bac} < 0\). We obtained alcohol percentages and L1 proficiency ratings for 43 individuals. We fitted a regression model with the L1 ratings as the dependent variable (DV) and blood alcohol concentration (BAC) as the independent variable (IV). We verified that the required assumptions were satisfied (i.e. normally distributed and homoscedastic residuals). The regression coefficient for BAC was -0.35, with an associated one-tailed \(p\)-value of 0.018. The effect size, measured in \(R^2\), was 0.102 (medium). This indicates that the predictor BAC accounts for about 10% of the variance in L1 proficiency ratings. See Table X and Figure Y for the exact pattern. As the \(p\)-value was lower than our significance threshold \(\alpha\) of 0.05, we reject the null hypothesis and accept the alternative hypothesis that there is a significant negative influence of alcohol on L1 language proficiency.
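The one-tailed \(p\)-value in the report follows from the \(t\)-value of BAC. Because the estimate lies in the direction predicted by \(H_a\) (negative), it equals half of the two-tailed \(p\)-value. A minimal sketch, with object names chosen for illustration:

```r
t_bac <- -2.1576                     # t-value for BAC from the coefficient table
df    <- 41                          # residual degrees of freedom

p_two <- 2 * pt(-abs(t_bac), df)     # two-tailed p-value, as reported by summary()
p_one <- pt(t_bac, df)               # one-tailed p-value for H_a: beta < 0 (lower tail)

print(round(c(two_tailed = p_two, one_tailed = p_one), 3))
```

Had the estimate been positive, the lower-tail probability would be larger than 0.5, so halving the two-tailed value is only valid when the effect is in the hypothesized direction.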
Practice this in laboratory exercises!
Thank you for your attention!
https://www.martijnwieling.nl
m.b.wieling@rug.nl