Statistics I

\(t\)-tests

Martijn Wieling
University of Groningen

Question 1: last lecture

Last lecture

  • How to reason about the population using a sample (CLT)
  • Calculating the standard error (\(SE\))
    • Standard error: used when reasoning about the population using a sample
    • Standard deviation: used when comparing an individual to the population
  • Calculating a confidence interval
  • Specifying a concrete testable hypothesis based on a research question
  • Specifying the null (\(H_0\)) and alternative hypothesis (\(H_a\))
  • Conducting a \(z\)-test and using the results to evaluate a hypothesis
  • Definition of a \(p\)-value: probability of obtaining data at least as extreme as the observed data, given that \(H_0\) is true
  • Evaluating the statistical significance given \(p\)-value and \(\alpha\)-level
  • Difference between a one-tailed and a two-tailed test
  • Type I and II errors

This lecture

  • Introduction to \(t\)-test
  • Three types of \(t\)-tests:
    • Single sample \(t\)-test
    • Independent samples \(t\)-test
    • Paired samples \(t\)-test
  • Effect size
  • How to report?

Introduction: \(t\)-test similar to \(z\)-test

  • Last lecture: \(z\)-test is used for comparing averages when \(\sigma\) is known
    • \(\sigma\) is only known for standardized tests, such as IQ tests
  • When \(\sigma\) is not known (in most cases), we can use the \(t\)-test
    • This test includes an estimation of \(\sigma\) based on sample standard deviation \(s\)

Calculating \(t\)-value

  • Very similar to calculating the \(z\)-value for a sample (using the standard error):

\[t = \frac{m - \mu}{s / \sqrt{n}} \hspace{70pt} z = \frac{m - \mu}{\sigma / \sqrt{n}}\]

  • Only difference: sample standard deviation \(s\) is used instead of \(\sigma\)
  • The precise formula depends on the type of \(t\)-test (independent samples, etc.)
    • (But for the exam, you only have to know the basic formulas shown above)
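  • For illustration, a minimal R sketch of this calculation (the summary values below are made up):
m <- 105; mu <- 100; s <- 15; n <- 36  # hypothetical sample mean, H0 mean, sd and size
(m - mu) / (s / sqrt(n))  # t-value: same formula as z, but using s instead of sigma
# [1] 2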

Obtaining \(p\)-values on the basis of \(t\)-values

  • \(z\)-values are compared to the standard normal distribution
  • But \(t\)-values are compared to the \(t\)-distribution
  • \(t\)-distributions look similar to the standard normal distribution
    • but their exact shape depends on the number of degrees of freedom (dF)

What are degrees of freedom?

  • There are five balloons, each with a different color
  • There are five students (\(n = 5\)) who need to select a balloon
    • If 4 students have selected a balloon (dF = 4), student no. 5 gets the last balloon
  • Similarly: if we have a fixed mean value calculated from 10 values
    • 9 values may vary in their value, but the 10th is fixed: dF = 10 - 1 = 9
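  • A small R illustration of this idea (the nine values and the fixed mean below are made up):
x9 <- c(4, 7, 2, 9, 5, 1, 8, 3, 6)  # 9 values may vary freely
m <- 5                              # the mean of all 10 values is fixed at 5
10 * m - sum(x9)                    # the 10th value is then completely determined
# [1] 5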

Question 2

\(t\)-distribution vs. normal distribution

  • Difference between normal distribution and \(t\)-distribution is large for small dFs
  • When dF \(\geq\) 100, the difference is negligible
  • As the shape differs, the \(p\)-value associated with a certain \(t\)-value also changes
  • That is why it is essential to specify dF when describing the results of a \(t\)-test: \(t\)(dF)
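  • This can be illustrated in R by comparing critical values (for \(\alpha\) = 0.05, two-tailed):
qnorm(0.025, lower.tail = F)  # critical z-value (no dF involved)
# [1] 1.96
qt(0.025, df = 5, lower.tail = F)  # critical t-value for dF = 5
# [1] 2.5706
qt(0.025, df = 100, lower.tail = F)  # critical t-value for dF = 100: close to 1.96
# [1] 1.984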

Visualizing \(t\)-distributions

[Figure: \(t\)-distributions for different dF values compared to the standard normal distribution]

  • For significance (given \(\alpha\)), higher (absolute) \(t\)-values are needed than \(z\)-values (but only when dF < 100; otherwise the \(t\)- and \(z\)-values are practically equal)
qt(0.025, df = 10, lower.tail = F)  # crit. t-value (alpha = 0.025) for dF = 10
# [1] 2.2281

Question 3

Answer to question 3

pt(2, 10, lower.tail = F) * 2  # two-sided p-value = 2 * one-sided p-value
# [1] 0.073388

[Figure: two-sided \(p\)-value visualized in the \(t\)-distribution with dF = 10]

  • Dark gray area: \(p\) < 0.05 (2-tailed)

Three types of \(t\)-tests

  • Single sample \(t\)-test: compare a sample mean with a fixed value
  • Independent samples \(t\)-test: compare the means of two independent groups
  • Paired samples \(t\)-test: compare pairs of (dependent) values (e.g., repeated measurements of the same subjects)
  • Requirement for all \(t\)-tests: Data should be approximately normally distributed
    • Otherwise: use non-parametric tests (discussed in next lecture)
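  • A brief sketch of how each type is run in R with t.test() (all data vectors below are hypothetical):
groupA <- c(7.8, 7.2, 8.1, 7.5, 7.9, 6.8)  # hypothetical scores of group A
groupB <- c(7.1, 6.9, 7.4, 7.0, 7.3, 6.5)  # hypothetical scores of group B
pre <- c(6.5, 7.0, 7.2, 6.8, 7.4, 6.9)     # hypothetical first measurement
post <- c(7.0, 7.3, 7.5, 7.1, 7.6, 7.2)    # hypothetical second measurement (same subjects)
t.test(groupA, mu = 7.5)          # single sample: compare mean of groupA to a fixed value
t.test(groupA, groupB)            # independent samples: compare two separate groups
t.test(pre, post, paired = TRUE)  # paired samples: compare two measurements per subject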

Single sample \(t\)-test

\[t = \frac{m - \mu}{s / \sqrt{n}}\]

  • Used to compare a sample mean to a fixed value
  • \(H_0\): \(\mu = \mu_0\) and \(H_a\): \(\mu \neq \mu_0\)
  • Larger (absolute) \(t\)-values give reason to reject \(H_0\)
  • Automatic calculation in R using the function t.test() (see the sketch below)
  • Standardized effect size is measured as difference in standard deviations
    • Cohen's \(d\): \(d = (m - \mu) / s\)
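  • A minimal sketch in R (the vector scores is hypothetical); Cohen's \(d\) is not part of the t.test() output, so it is calculated separately:
scores <- c(7.8, 7.2, 8.1, 7.5, 7.9, 6.8, 7.4, 8.0)  # hypothetical sample
res <- t.test(scores, mu = 7.5)    # single sample t-test against mu0 = 7.5
res$statistic                      # t-value
res$p.value                        # two-sided p-value
(mean(scores) - 7.5) / sd(scores)  # Cohen's d: difference in standard deviations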

Assumptions for the single sample \(t\)-test

  • Data randomly selected from population
  • Data measured at interval or ratio scale
  • Observations are independent
  • Observations are approximately normally distributed
    • But the \(t\)-test is robust to non-normality for larger samples (\(n > 30\)); a simple visual check is sketched below
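  • A sketch of a visual normality check in R (the vector scores is hypothetical):
scores <- rnorm(50, mean = 7.5, sd = 1)  # hypothetical sample for illustration
hist(scores)    # histogram: roughly bell-shaped?
qqnorm(scores)  # quantile-quantile plot: points roughly on a straight line?
qqline(scores)  # adds a reference line to the Q-Q plot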

Single sample \(t\)-test: example

  • Given our English proficiency data, we'd like to assess whether the average English score differs from 7.5
  • \(H_0\): \(\mu = 7.5\) and \(H_a\): \(\mu \neq 7.5\)
  • We use \(\alpha\) = 0.05
  • Sample mean \(m\) = 7.62
  • Sample standard deviation \(s\) = 0.92
  • Sample size \(n\) = 500
  • Degrees of freedom of the \(t\)-test: 500 - 1 = 499

Step 1: \(t\)-test assumptions met?

  • Data randomly selected from population ?
  • Data measured at interval scale ✓
  • Independent observations ✓
  • Data roughly normally distributed (or > 30 observations) ✓
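  • A minimal R sketch of the remaining calculation, using the rounded summary values above (so the results are approximate); with the raw data one would simply pass the score vector to t.test() with mu = 7.5:
m <- 7.62; s <- 0.92; n <- 500       # rounded sample mean, standard deviation and size
(tval <- (m - 7.5) / (s / sqrt(n)))  # t-value
# [1] 2.9166
2 * pt(tval, df = n - 1, lower.tail = F)  # two-sided p-value (roughly 0.004): below alpha = 0.05, so H0 is rejected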