Exercise 7
----------
linear regression

Jim Baumann and Leah Jones of the School of Education, Purdue University, studied methods for teaching reading. The students in the study took two tests before the lessons and three tests after. The variable "before" gives each student's average on the tests before the lessons, and the variable "after" gives the average on the tests after the lessons. The results are given in the table below. (Source: research by Jim Baumann and Leah Jones, School of Education, Purdue University.)

case  before  after
   1    3.50  16.67
   2    5.50  18.33
   3    6.50  17.00
   4    9.00  19.67
   5   10.50  21.67
   6   14.00  20.67
   7   11.00  20.67
   8    9.50  14.00
   9    7.50  16.00
  10    8.00  17.67
  11   10.00  19.33
  12    5.50  17.67
  13    8.50  16.33
  14    7.00  20.00
  15    7.00  15.33
  16   10.00  22.67
  17    6.50  15.33
  18    8.50  15.00
  19    7.00  15.00
  20    6.00  15.00
  21    5.00  21.00
  22    7.50  15.67

The data must be entered by hand. Define the data columns and choose suitable variable names.

a. Draw a scatterplot of "after" (vertical axis) against "before" (horizontal axis), with the least-squares line. Is there a linear relationship? Is it appropriate to fit a least-squares line? Examine the residuals (the differences between the observed values and the values predicted by the least-squares line). Draw two scatterplots: residual against case number and residual against before-score. The mean of the residuals always equals 0. Draw the line residual = 0 in each of the two scatterplots. Do you see any suspicious patterns or unusual observations?

b. Determine b1 and b0 and give the equation of the least-squares line.

c. We will investigate the residuals further. Draw a normal quantile plot of the residuals. Do the points lie on a straight line? Are the residuals approximately normally distributed? Give s, the standard error of the residuals.

d. Determine s_b1, the standard error of b1, and s_b0, the standard error of b0.

e. We want to check whether the students who scored relatively high before the lessons also score relatively high after the lessons. Give a 95% confidence interval for beta1. Formulate H_0 and H_a and show that the test scores after the lessons are positively correlated with the scores before the lessons.

f. The constant beta0 represents the average score after the lessons for students with a score of 0 before the lessons. Give a 95% confidence interval for beta0. Formulate H_0 and H_a and show that beta0 is positive.
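
The calculations asked for in parts b to f can be checked with a short script. What follows is only a minimal sketch in Python, not the statistics package the exercise may have in mind. It assumes the usual simple linear regression formulas with n - 2 = 20 degrees of freedom. The critical values 2.086 (two-sided 95%) and 1.725 (one-sided 5%) are copied from a standard t table for 20 degrees of freedom. All variable names are illustrative choices. The plots of parts a and c are not covered.

```python
import math

# Data from the table: averages before and after the lessons, cases 1..22.
before = [3.50, 5.50, 6.50, 9.00, 10.50, 14.00, 11.00, 9.50, 7.50, 8.00, 10.00,
          5.50, 8.50, 7.00, 7.00, 10.00, 6.50, 8.50, 7.00, 6.00, 5.00, 7.50]
after = [16.67, 18.33, 17.00, 19.67, 21.67, 20.67, 20.67, 14.00, 16.00, 17.67, 19.33,
         17.67, 16.33, 20.00, 15.33, 22.67, 15.33, 15.00, 15.00, 15.00, 21.00, 15.67]

n = len(before)
x_bar = sum(before) / n
y_bar = sum(after) / n

# Sums of squares and cross-products around the means.
s_xx = sum((x - x_bar) ** 2 for x in before)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(before, after))

# Part b: slope and intercept of the least-squares line.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residuals: observed minus predicted values (their mean is 0).
residuals = [y - (b0 + b1 * x) for x, y in zip(before, after)]

# Part c: s, with n - 2 degrees of freedom.
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Part d: standard errors of the slope and the intercept.
se_b1 = s / math.sqrt(s_xx)
se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / s_xx)

# Assumption: critical values taken from a t table with 20 degrees of freedom.
T_TWO_SIDED_95 = 2.086  # for the 95% confidence intervals
T_ONE_SIDED_05 = 1.725  # for one-sided tests at the 5% level

# Parts e and f: confidence intervals and t statistics for H_0: beta = 0.
ci_b1 = (b1 - T_TWO_SIDED_95 * se_b1, b1 + T_TWO_SIDED_95 * se_b1)
ci_b0 = (b0 - T_TWO_SIDED_95 * se_b0, b0 + T_TWO_SIDED_95 * se_b0)
t_b1 = b1 / se_b1
t_b0 = b0 / se_b0

print(f"least-squares line: after = {b0:.3f} + {b1:.3f} * before")
print(f"s = {s:.3f}, s_b1 = {se_b1:.3f}, s_b0 = {se_b0:.3f}")
print(f"95% CI for beta1: ({ci_b1[0]:.3f}, {ci_b1[1]:.3f}), t = {t_b1:.2f}")
print(f"95% CI for beta0: ({ci_b0[0]:.3f}, {ci_b0[1]:.3f}), t = {t_b0:.2f}")
print(f"reject H_0: beta1 = 0 in favour of beta1 > 0: {t_b1 > T_ONE_SIDED_05}")
print(f"reject H_0: beta0 = 0 in favour of beta0 > 0: {t_b0 > T_ONE_SIDED_05}")
```

In this sketch the one-sided tests compare each t statistic with the tabulated critical value rather than computing a P-value, which keeps the script free of external libraries.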