1 Introduction

In this lab session, we will use the same data as in the lab session associated with the lecture “Introduction to R and data exploration”. You will have to fill in most commands yourself, but this should be feasible given the slides of the lecture, which can be viewed here: https://www.let.rug.nl/wieling/Statistics/Basic-Tests.

While you can just enter the commands in RStudio, it is also possible to modify the source of this so-called R Markdown file directly in RStudio and press the “Knit HTML” button to generate an HTML file that contains both the commands you used and their output. You can download the file to your current working directory in R by pasting the following command: download.file('http://www.let.rug.nl/wieling/Statistics/Basic-Tests/lab/lab.Rmd', 'lab.Rmd'). You can then open this file in RStudio.

In this file, all R commands located within chunks (beginning and ending with three backticks) will be evaluated. Creating an R Markdown file is very useful, as it makes your analysis reproducible and easy for others to check. Note that chunks have options with which you can customize the output. For more information, see: https://raw.githubusercontent.com/rstudio/cheatsheets/master/rmarkdown-2.0.pdf

2 Importing the data

We will first download a CSV file generated in Excel and import the data into R. We then add the two columns that were created during the aforementioned lab session.

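# download the data file into the current working directory
download.file('http://www.let.rug.nl/wieling/Statistics/Basic-Tests/lab/mtcars.csv', 'mtcars.csv')
# read.csv2 assumes ';' as the field separator and ',' as the decimal mark
dat <- read.csv2('mtcars.csv')
# horsepower relative to weight
dat$relHP <- dat$hp / dat$wt
# flag cars with a relative horsepower above 42 as sportscars
dat$sportscar <- FALSE
dat$sportscar[dat$relHP > 42] <- TRUE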
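
The choice of read.csv2 over read.csv matters for files exported from Excel with semicolons as separators and decimal commas. The following is a minimal sketch of that difference, using a made-up two-row temporary file (the file contents and the names tmp, semi and explicit are purely illustrative and are not part of the lab data):

```r
# Hypothetical two-row file: ';' separates the fields, ',' is the decimal mark
tmp <- tempfile(fileext = ".csv")
writeLines(c("model;wt;hp", "A;2,62;110", "B;3,44;175"), tmp)

# read.csv2 uses sep = ";" and dec = "," by default
semi <- read.csv2(tmp)

# The same result with read.csv requires both arguments to be set explicitly
explicit <- read.csv(tmp, sep = ";", dec = ",")

str(semi)                   # wt is read as a numeric column
identical(semi, explicit)
```

If a numeric column such as wt shows up as character after importing, a mismatch between the file's separators and the reading function is a likely cause.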

3 The structure of the data

Note that this dataset is similar to the mtcars dataset that is available by default in R, so the description of the columns can be obtained with ?mtcars. There is one additional column, ‘region’, which contains the region the car maker originates from.

4 Running basic statistical analyses

In this section, we will conduct several basic statistical analyses using the data.

4.1 Comparing one or two means

# We will investigate whether the mean weight in the sample differs
# significantly from 3 (wt is expressed in units of 1000 lbs).

# First look at the distribution of the weight: is wt normally distributed?

# Visualize the weight using a box-plot, and add a horizontal line at height 3

# Run the one-sample t-test and obtain the effect size (Cohen's d)

# Assess if the weight of sportscars differs significantly from non-sportscars.
# First assess if the distribution of wt of both groups is approximately normal.
# If not, use an appropriate non-parametric alternative. Also calculate the effect size.
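
As a rough illustration of the functions involved (not the reference answer), the sketch below runs both comparisons on R's built-in mtcars data instead of the downloaded file, so results for dat may differ. The object names (cars, d_one, d_two and so on) are illustrative, and the pooled-SD version of Cohen's d is one common choice of effect size, assumed here rather than taken from the lecture.

```r
# Built-in mtcars stands in for the downloaded 'dat' (assumption: same wt and hp columns)
cars <- mtcars
cars$relHP <- cars$hp / cars$wt
cars$sportscar <- cars$relHP > 42            # same cut-off as in section 2

# --- One sample: does the mean of wt differ from 3 (i.e. 3000 lbs)? ---
shapiro.test(cars$wt)                        # normality check; qqnorm() is a visual alternative
boxplot(cars$wt, ylab = "Weight (1000 lbs)")
abline(h = 3, lty = 2)                       # reference line at the hypothesised mean
tt <- t.test(cars$wt, mu = 3)
d_one <- (mean(cars$wt) - 3) / sd(cars$wt)   # Cohen's d for one sample

# --- Two groups: sportscars versus non-sportscars ---
tapply(cars$wt, cars$sportscar, function(x) shapiro.test(x)$p.value)
wt_sport <- cars$wt[cars$sportscar]
wt_other <- cars$wt[!cars$sportscar]
tt2 <- t.test(wt_sport, wt_other)                      # Welch t-test, if normality is plausible
wx  <- wilcox.test(wt_sport, wt_other, exact = FALSE)  # rank-based alternative otherwise

# Cohen's d based on the pooled standard deviation
n1 <- length(wt_sport)
n2 <- length(wt_other)
sd_pooled <- sqrt(((n1 - 1) * var(wt_sport) + (n2 - 1) * var(wt_other)) / (n1 + n2 - 2))
d_two <- (mean(wt_sport) - mean(wt_other)) / sd_pooled

tt; d_one
tt2; wx; d_two
```

Setting exact = FALSE avoids the warning wilcox.test gives when the data contain ties, which is the case for wt in the built-in data.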

4.2 Assessing categorical dependence

# Assess if there is a dependency between transmission and sportscar.
# Also obtain the effect size.
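
One possible way to approach this, sketched on the built-in mtcars data rather than the downloaded file: cross-tabulate the two variables, run a chi-square test, and derive Cramér's V from the test statistic. The column am (0 = automatic, 1 = manual, per ?mtcars) is assumed to be the transmission variable, and the names cars, tab and cramers_v are illustrative.

```r
# Built-in mtcars stands in for the downloaded 'dat'
cars <- mtcars
cars$sportscar <- (cars$hp / cars$wt) > 42
cars$transmission <- factor(cars$am, levels = c(0, 1),
                            labels = c("automatic", "manual"))

# Contingency table of transmission by sportscar status
tab <- table(cars$transmission, cars$sportscar)
tab

# Chi-square test without Yates' continuity correction,
# so that Cramer's V follows directly from X-squared
chi <- chisq.test(tab, correct = FALSE)
chi
chi$expected                 # check that the expected counts are not too small

# Cramer's V: sqrt(X^2 / (N * (min(rows, cols) - 1)))
cramers_v <- sqrt(unname(chi$statistic) / (sum(tab) * (min(dim(tab)) - 1)))
cramers_v
```

When expected counts are small, fisher.test(tab) is a commonly used alternative to the chi-square test.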

4.3 Comparing three or more means

# Assess if the region of the car maker influences the weight. Start with visualizing the data.
# If region influences weight, also assess which regions differ. Also obtain the effect size.
# For simplicity, you may assume that the data is normally distributed (even though it is not).

# Investigate if the region of the car maker and the type of car (sportscar or not)
# influence the weight. 

# Finally, assess if the number of carburetors (carb) is a significant covariate and
# obtain the effect sizes.
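
The sequence of models can be sketched as follows. Because R's built-in mtcars has no region column, the number of cylinders (cyl) is used here purely as a stand-in grouping factor; with the downloaded data you would use region instead. All object names are illustrative, and eta squared and partial eta squared are computed by hand from the sums of squares as one possible choice of effect size.

```r
# Built-in mtcars stands in for the downloaded 'dat'
cars <- mtcars
cars$sportscar <- (cars$hp / cars$wt) > 42
cars$group <- factor(cars$cyl)   # STAND-IN for 'region', which built-in mtcars lacks

# Visualize weight per group
boxplot(wt ~ group, data = cars, ylab = "Weight (1000 lbs)")

# One-way ANOVA, post-hoc comparisons and eta squared
fit1 <- aov(wt ~ group, data = cars)
summary(fit1)
TukeyHSD(fit1)                       # which groups differ?
ss1 <- summary(fit1)[[1]][["Sum Sq"]]
eta_sq <- ss1[1] / sum(ss1)          # SS_between / SS_total

# Two factors, then the same model with carb added as a numeric covariate
fit2 <- aov(wt ~ group + sportscar, data = cars)
fit3 <- aov(wt ~ group + sportscar + carb, data = cars)
summary(fit3)
cmp <- anova(fit2, fit3)             # F-test: does adding carb improve the model?
cmp

# Partial eta squared per term: SS_effect / (SS_effect + SS_residual)
tab3 <- summary(fit3)[[1]]
ss3 <- tab3[["Sum Sq"]]
k <- length(ss3)                     # the last row holds the residuals
partial_eta_sq <- ss3[-k] / (ss3[-k] + ss3[k])
names(partial_eta_sq) <- trimws(rownames(tab3))[-k]

eta_sq
partial_eta_sq
```

Note that aov uses sequential (Type I) sums of squares, so with unbalanced groups the order of the terms in the formula affects the results.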

5 Answers

The answers to the questions in this file can be viewed here: https://www.let.rug.nl/wieling/Statistics/Basic-Tests/lab/answers. The associated R markdown file can be downloaded here: https://www.let.rug.nl/wieling/Statistics/Basic-Tests/lab/answers/answers.Rmd

6 Replication

From within RStudio, you can simply download this file using the commands:

# download the original file if it does not exist yet (to prevent overwriting)
if (!file.exists('lab.Rmd')) {
  download.file('http://www.let.rug.nl/wieling/Statistics/Basic-Tests/lab/lab.Rmd', 'lab.Rmd')
}

Subsequently, open it in the editor and use the “Knit HTML” button to generate the HTML file.

If you use plain R, you first have to install Pandoc. Then paste the following lines into the most recent version of R.

# install the rmarkdown package if it is not installed yet
if (!requireNamespace("rmarkdown", quietly = TRUE)) {
   install.packages("rmarkdown")
}
library(rmarkdown) # load rmarkdown package

# download the original file if it does not exist yet (to prevent overwriting)
if (!file.exists('lab.Rmd')) { 
  download.file('http://www.let.rug.nl/wieling/Statistics/Basic-Tests/lab/lab.Rmd', 'lab.Rmd')
} 

# generate output
render('lab.Rmd') # generates html file with results

# view output in browser
browseURL(paste('file://', file.path(getwd(),'lab.html'), sep='')) # shows result