The data here come from a huge table of records of heart attack victims. Getting tables into R is a bit complicated, so use this file, which contains only the data on the DIED variable (coded 1 = died). Save it on your hard drive in the directory where the R program is located. If you name the file DIED4R.txt, you can use this R command to read in the data:
> died = scan(file="DIED4R.txt")
Read 12844 items
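If R says it cannot open the file, it is probably looking in a different directory. Two standard base R workarounds (the path shown is only an example, not part of this exercise): change the working directory with setwd(), or browse to the file interactively with file.choose().

> setwd("C:/mydata")               # example path: wherever you saved the file
> died = scan(file=file.choose())  # or browse to the file in a dialog box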
This puts the data into a variable called "died". Use table() on this variable to get the counts if you do not already have them:
> table(died)
died
    0     1
11434  1410
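Because died is coded 0/1, a few other commands recover the same numbers; this is just an optional cross-check, not a required step:

> sum(died)     # number of 1's, i.e. deaths
[1] 1410
> length(died)  # number of subjects
[1] 12844
> mean(died)    # sample proportion of deaths
[1] 0.1097789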
1410 of the patients died. A single command, prop.test(), gives a confidence interval and tests any hypothesized proportion p0 you specify. Here we test whether the results from this hospital match a hypothetical national average of 10%. Ignore the X-squared value and use the p-value for the hypothesis test. We need the number of 1's (from the table command), the number of subjects (from scan or length), and the hypothesized proportion.
> prop.test(1410,12844,p=0.1)

        1-sample proportions test with continuity correction

data:  1410 out of 12844, null probability 0.1
X-squared = 13.5385, df = 1, p-value = 0.0002337
alternative hypothesis: true p is not equal to 0.1
95 percent confidence interval:
 0.1044507 0.1153421
sample estimates:
        p
0.1097789
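If you want to reuse these numbers rather than copy them off the screen, you can save the result and extract its pieces; the component names below are the standard ones for objects returned by prop.test():

> result = prop.test(1410,12844,p=0.1)
> result$p.value   # p-value for the hypothesis test
> result$conf.int  # the 95 percent confidence interval
> result$estimate  # the sample proportion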
We reject the hypothesis that the local proportion is the same as the national proportion. However, the confidence interval, roughly 10.4% to 11.5% against a hypothesized 10%, indicates that the local rate is only slightly higher. Is that national average really exactly 10.0000%? Is there enough of a difference to matter in practice?
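One way to gauge practical importance: the observed rate exceeds the hypothesized 10% by about one percentage point, so over these 12844 patients the hospital saw roughly 126 more deaths than an exact 10% rate would predict.

> 1410 - 0.1*12844   # excess deaths relative to an exact 10% rate
[1] 125.6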
© 2006 Robert W. Hayden