We will use data on miles driven by the same 11 people when working 4-day weeks and when working 5-day weeks. We looked at the files in the R directory to find the name of the file (mileage.txt), then read it into a data frame named mileage and attached it. We then read the file again, without saving the result, just to display the data and see the variable names. Then we created a variable diff for the differences, typed diff to see them, and used t.test on the differences.
> mileage <- read.delim(file="mileage.txt",header=TRUE)
> attach(mileage)
> read.delim(file="mileage.txt",header=TRUE)
      Name X5.Day_mileage X4.Day_mileage
1     Jeff           2798           2914
2    Betty           7724           6112
3    Roger           7505           6177
4      Tom            838           1102
5    Aimee           4592           3281
6     Greg           8107           4997
7  Larry G           1228           1695
8      Tad           8718           6606
9  Larry M           1097           1063
10  Leslie           8089           6392
11     Lee           3807           3362
> diff = X5.Day_mileage - X4.Day_mileage
> diff
[1] -116 1612 1328 -264 1311 3110 -467 2112 34 1697 445
> t.test(diff)
One Sample t-test
data: diff
t = 2.858, df = 10, p-value = 0.01701
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
216.4276 1747.5724
sample estimates:
mean of x
982
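The numbers in this output can be reproduced by hand from the differences. The following is a minimal sketch, not part of the original session: the names d, n, dbar, se, tstat, pval and ci are ours, and the differences are retyped from the output above so that the code runs without mileage.txt.

```r
# Differences (5-day minus 4-day), retyped from the printed values of diff
d <- c(-116, 1612, 1328, -264, 1311, 3110, -467, 2112, 34, 1697, 445)

n     <- length(d)          # number of people (pairs)
dbar  <- mean(d)            # mean difference
se    <- sd(d) / sqrt(n)    # standard error of the mean difference
tstat <- dbar / se          # t statistic for H0: true mean difference is 0

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
pval <- 2 * pt(-abs(tstat), df = n - 1)

# 95% confidence interval: mean plus or minus critical value times standard error
ci <- dbar + c(-1, 1) * qt(0.975, df = n - 1) * se

round(c(t = tstat, p = pval, lower = ci[1], upper = ci[2]), 4)
```

The rounded values should agree with the t, p-value and confidence interval reported by t.test above.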
We reject the hypothesis of no difference (p = 0.017). Note the free confidence interval: we are 95% confident that a 5-day week means, on average, roughly 216 to 1748 extra miles driven over the period recorded, compared with a 4-day week. We should also plot the data!
> stem(diff)

  The decimal point is 3 digit(s) to the right of the |

  -0 | 531
   0 | 04
   1 | 3367
   2 | 1
   3 | 1
The shape of the stem plot is not too bad for such a small dataset: there are no outliers or strong skewness to make us doubt the t procedures. Note that we need to look at the differences, not the original data.
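To see why the differences matter, it may help to compare three calls to t.test. This is a sketch under our own assumptions: the vectors x5 and x4 are hypothetical names holding the two mileage columns, retyped from the listing so the code runs without the file or attach.

```r
# Mileage columns retyped from the data listing (5-day and 4-day schedules)
x5 <- c(2798, 7724, 7505, 838, 4592, 8107, 1228, 8718, 1097, 8089, 3807)
x4 <- c(2914, 6112, 6177, 1102, 3281, 4997, 1695, 6606, 1063, 6392, 3362)

one_samp <- t.test(x5 - x4)                # one-sample test on the differences
paired   <- t.test(x5, x4, paired = TRUE)  # paired test on the two columns
unpaired <- t.test(x5, x4)                 # two-sample (Welch) test, ignores pairing

# Collect the three p-values side by side
c(one_sample = one_samp$p.value,
  paired     = paired$p.value,
  unpaired   = unpaired$p.value)
```

The first two calls are the same analysis and give the same p-value. The third treats the columns as independent samples, so the large person-to-person variation in mileage swamps the schedule effect and the p-value is far larger.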
© 2006 Robert W. Hayden