We will use data on miles driven for the same 11 people working 4-day weeks and 5-day weeks. We looked at the files in the R directory to find out what the name of the file was (mileage.txt), then read it into a data frame named mileage. We then read it a second time, without assigning the result, so that R would print the data and show us the variable names. Then we created a variable diff for the differences, typed diff to see them, and then used t.test on the differences.
> mileage <- read.delim(file="mileage.txt",header=TRUE)
> attach(mileage)
> read.delim(file="mileage.txt",header=TRUE)
      Name X5.Day_mileage X4.Day_mileage
1     Jeff           2798           2914
2    Betty           7724           6112
3    Roger           7505           6177
4      Tom            838           1102
5    Aimee           4592           3281
6     Greg           8107           4997
7  Larry G           1228           1695
8      Tad           8718           6606
9  Larry M           1097           1063
10  Leslie           8089           6392
11     Lee           3807           3362
> diff = X5.Day_mileage - X4.Day_mileage
> diff
 [1] -116 1612 1328 -264 1311 3110 -467 2112   34 1697  445
> t.test(diff)

        One Sample t-test

data:  diff
t = 2.858, df = 10, p-value = 0.01701
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
  216.4276 1747.5724
sample estimates:
mean of x
      982
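The same analysis can be sketched without the data file by typing the values in from the listing above. This is only an illustration, not part of the original session: the names five_day, four_day and d are our own, and we use d instead of diff to avoid confusion with R's built-in diff() function. It also shows that a one-sample t-test on the differences agrees with t.test called on the two columns with paired = TRUE.

```r
# Mileage values copied from the printed data frame, in the same row order.
five_day <- c(2798, 7724, 7505, 838, 4592, 8107, 1228, 8718, 1097, 8089, 3807)
four_day <- c(2914, 6112, 6177, 1102, 3281, 4997, 1695, 6606, 1063, 6392, 3362)

# Differences, 5-day minus 4-day, one per person.
d <- five_day - four_day

# One-sample test on the differences, as in the session above.
one_sample <- t.test(d)

# Equivalent paired test on the two original columns.
paired <- t.test(five_day, four_day, paired = TRUE)

print(one_sample)
print(paired)

# TRUE if the two approaches give the same t statistic.
same_t <- isTRUE(all.equal(unname(one_sample$statistic),
                           unname(paired$statistic)))
print(same_t)
```

Typing the data in keeps the example self-contained, and avoiding attach() means the columns cannot be confused with other objects of the same name.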
We reject the hypothesis of no difference (p-value = 0.017). Note the free confidence interval. On average, a 5-day week means roughly 216 to 1748 extra miles driven per week. We should also plot the data!
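To see where the numbers in the t.test output come from, here is a minimal sketch that recomputes them by hand from the differences printed above. The variable names (d, se, t_stat, and so on) are our own. The statistic is the mean difference divided by its standard error, and the interval is the mean plus or minus a t critical value times that standard error.

```r
# The 11 differences (5-day minus 4-day) as printed in the session.
d <- c(-116, 1612, 1328, -264, 1311, 3110, -467, 2112, 34, 1697, 445)

n      <- length(d)              # number of people
se     <- sd(d) / sqrt(n)        # standard error of the mean difference
t_stat <- mean(d) / se           # t statistic for H0: mean difference = 0

# Two-sided p-value from the t distribution with n - 1 degrees of freedom.
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# 95 percent confidence interval for the mean difference.
ci <- mean(d) + c(-1, 1) * qt(0.975, df = n - 1) * se

print(round(c(mean = mean(d), t = t_stat, p = p_value), 4))
print(round(ci, 4))
```

The results should match the t.test output: the same mean, t statistic, p-value, and interval endpoints.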
> stem(diff)

  The decimal point is 3 digit(s) to the right of the |

  -0 | 531
   0 | 04
   1 | 3367
   2 | 1
   3 | 1
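This is not too bad for such a small dataset: there are no outliers and no severe departure from a roughly normal shape. Note that we need to look at the differences, not the original data, because the t procedure is being applied to the differences.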
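A normal quantile plot is another way to look at the shape of the differences. The sketch below was not part of the original session and is offered only as an optional extra check. It uses qqnorm, qqline and shapiro.test from base R, and the names d, qq and sw are our own. With only 11 observations, neither the plot nor the test can tell us much, so treat both as rough guides.

```r
# The 11 differences (5-day minus 4-day) as printed in the session.
d <- c(-116, 1612, 1328, -264, 1311, 3110, -467, 2112, 34, 1697, 445)

# Normal quantile plot: points near the line suggest a roughly normal shape.
qq <- qqnorm(d, main = "Normal Q-Q plot of mileage differences")
qqline(d)

# Shapiro-Wilk test of normality; a small p-value would signal trouble.
sw <- shapiro.test(d)
print(sw)
```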
© 2006 Robert W. Hayden