As you can tell, there are several ways to obtain a confidence interval for a process probability. When the sample size is large, they will yield very similar results for the endpoints and the coverage rate. When the sample size is small, the Adjusted Wald method is preferred. You can also use the exact Binomial (Clopper-Pearson) method when the sample size is small, but those intervals tend to be wider and don't have the estimate \(\pm\) margin-of-error simplicity; on the other hand, the method is more computationally efficient than the Blaker method. Below we compare the 95% confidence intervals for the 8 of 10 and 71 of 361 studies for these methods plus a fourth, the "Score interval," which is a more direct inversion of the one-sample proportion \(z\)-test, as well as the adjusted Blaker method discussed in Investigation 1.6.
Wald interval (aka normal approximation, aka asymptotic). Finds the values of \(\pi\) so that \(P(-z \le \frac{\hat{p} - \pi}{\sqrt{\hat{p}(1-\hat{p})/n}} \le z) = C\text{.}\) Formula: \(\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}\text{.}\) Tends to have poor coverage properties when \(n\) is small or \(\pi\) is extreme.
95% CI, 8 of 10: (.5521, 1.048); 95% CI, 71 of 361: (.1557, .2377)
Technology: TBI applet, JMP, or R (iscamonepropztest)
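Each formula in this table is easy to verify directly. As a check on the Wald endpoints above, here is a minimal Python sketch (the function name `wald_ci` is ours for illustration, not from any package):

```python
from math import sqrt

def wald_ci(x, n, z=1.96):
    """Wald interval: p-hat plus/minus z * sqrt(p-hat * (1 - p-hat) / n)."""
    phat = x / n
    margin = z * sqrt(phat * (1 - phat) / n)
    return phat - margin, phat + margin

print(wald_ci(8, 10))    # about (0.552, 1.048) -- note the upper endpoint exceeds 1
print(wald_ci(71, 361))  # about (0.156, 0.238)
```

The impossible upper endpoint for the 8 of 10 study is one concrete symptom of the Wald interval's poor behavior with small \(n\) and \(\hat{p}\) far from 0.5.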
Plus Four (a special case of the Adjusted Wald, for 95% confidence). Add two successes and two failures to the sample results. Formula: \(\tilde{p} \pm z^* \sqrt{\frac{\tilde{p}(1-\tilde{p})}{\tilde{n}}}\text{,}\) where \(\tilde{p} = (X + 2)/(n + 4)\) and \(\tilde{n} = n + 4\text{.}\) Very good coverage properties.
95% CI, 8 of 10: (.4776, .9509); 95% CI, 71 of 361: (.1590, .2410)
Technology: TBI applet, JMP, or R, after adjusting the sample data
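The "adjusting the sample data" step amounts to one line of arithmetic before the Wald formula. A sketch (the name `plus_four_ci` is ours):

```python
from math import sqrt

def plus_four_ci(x, n, z=1.96):
    """Plus Four: apply the Wald formula after adding 2 successes and 2 failures."""
    ptilde = (x + 2) / (n + 4)   # adjusted sample proportion
    margin = z * sqrt(ptilde * (1 - ptilde) / (n + 4))
    return ptilde - margin, ptilde + margin

print(plus_four_ci(8, 10))    # about (0.478, 0.951)
print(plus_four_ci(71, 361))  # about (0.159, 0.241)
```

Compare with the Wald results for the 8 of 10 study: the adjustment pulls the interval back inside [0, 1].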
Adjusted Wald (aka Agresti-Coull). Formula: \(\tilde{p} \pm z^* \sqrt{\frac{\tilde{p}(1-\tilde{p})}{\tilde{n}}}\text{,}\) where \(\tilde{p} = (X + 0.5z^{*2})/(n + z^{*2})\) and \(\tilde{n} = n + z^{*2}\text{.}\) Very good coverage properties.
95% CI, 8 of 10: (.4794, .9541); 95% CI, 71 of 361: (.1588, .2409)
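The general Adjusted Wald recipe is the same idea with \(z^{*2}/2\) pseudo-successes and pseudo-failures instead of exactly 2 of each (for 95% confidence, \(z^{*2}/2 \approx 1.92\), which is why Plus Four is so close). A sketch (`agresti_coull_ci` is our name):

```python
from math import sqrt

def agresti_coull_ci(x, n, z=1.96):
    """Adjusted Wald: add z^2/2 successes and z^2/2 failures, then apply the Wald formula."""
    ntilde = n + z * z
    ptilde = (x + 0.5 * z * z) / ntilde
    margin = z * sqrt(ptilde * (1 - ptilde) / ntilde)
    return ptilde - margin, ptilde + margin

print(agresti_coull_ci(8, 10))    # about (0.479, 0.954)
print(agresti_coull_ci(71, 361))  # about (0.159, 0.241)
```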
Wilson interval (aka Score interval). Finds the values of \(\pi\) so that \(P(-z \le \frac{\hat{p} - \pi}{\sqrt{\pi(1-\pi)/n}} \le z) = C\text{.}\) Formula: \(\frac{\hat{p} + \frac{1}{2n}z^2 \pm z \sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}}}{1 + z^2/n}\text{.}\) Can have poor coverage properties with \(\pi\) near 0 or 1.
95% CI, 8 of 10: (.4902, .9433); 95% CI, 71 of 361: (.1590, .2408)
Technology: JMP (Analyze > Distribution) or R prop.test without continuity correction
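The closed-form Wilson formula above translates directly to code (`wilson_ci` is our name):

```python
from math import sqrt

def wilson_ci(x, n, z=1.96):
    """Wilson (score) interval, from the closed-form solution of the score inequality."""
    phat = x / n
    denom = 1 + z * z / n
    center = (phat + z * z / (2 * n)) / denom
    half = z * sqrt(phat * (1 - phat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

print(wilson_ci(8, 10))    # about (0.490, 0.943)
print(wilson_ci(71, 361))  # about (0.159, 0.241)
```

A little algebra shows the Wilson center \((\hat{p} + z^2/2n)/(1 + z^2/n) = (X + z^2/2)/(n + z^2)\) is exactly the Adjusted Wald \(\tilde{p}\); the Adjusted Wald interval is essentially the Wilson interval with a simplified standard error.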
Wilson interval with continuity correction. The same score inversion, adjusted for the discreteness of the binomial distribution; slightly wider than the Wilson interval.
95% CI, 8 of 10: (.4422, .9646); 95% CI, 71 of 361: (.1577, .2423)
Technology: R prop.test with continuity correction
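There is a closed form for the continuity-corrected interval, but it is easier to see what is being inverted by solving numerically: each endpoint satisfies \(|\hat{p} - \pi| - \frac{1}{2n} = z\sqrt{\pi(1-\pi)/n}\). A bisection sketch (our own helper, assuming \(0 < X < n\); not library code):

```python
from math import sqrt

def score_cc_ci(x, n, z=1.96):
    """Invert the score test with continuity correction:
    endpoints satisfy |p-hat - pi| - 1/(2n) = z * sqrt(pi * (1 - pi) / n)."""
    phat = x / n

    def g(pi):
        return abs(phat - pi) - 1 / (2 * n) - z * sqrt(pi * (1 - pi) / n)

    def solve(inner, outer):
        # g < 0 at the inner end (near p-hat), g > 0 at the outer end (0 or 1);
        # bisect for the sign change
        for _ in range(60):
            mid = (inner + outer) / 2
            if g(mid) < 0:
                inner = mid
            else:
                outer = mid
        return (inner + outer) / 2

    return solve(phat, 0.0), solve(phat, 1.0)

print(score_cc_ci(8, 10))    # about (0.442, 0.965)
print(score_cc_ci(71, 361))  # about (0.158, 0.242)
```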
Exact Binomial (Clopper-Pearson). Finds the values of \(\pi\) so that \(P(X \le k) \ge (1 - C)/2\) and \(P(X \ge k) \ge (1 - C)/2\text{,}\) where \(k\) is the observed number of successes. Tends to be "conservative" (wider than necessary, with coverage rate at least \(C\)).
95% CI, 8 of 10: (.4439, .9748); 95% CI, 71 of 361: (.1569, .2415)
Technology: JMP (summary stats) or R using iscambinomtest
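The endpoints are the values of \(\pi\) where each binomial tail probability equals \((1-C)/2\), which can be found with nothing more than the binomial CDF and bisection. A sketch (our own helper names, standard library only):

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(x, n, conf=0.95):
    """Lower endpoint solves P(X >= x) = alpha/2; upper solves P(X <= x) = alpha/2."""
    alpha = 1 - conf

    def solve(f, increasing):
        # f crosses alpha/2 exactly once on (0, 1); bisect for the root
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if (f(mid) < alpha / 2) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(lambda p: 1 - binom_cdf(x - 1, n, p), True)
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), False)
    return lower, upper

print(clopper_pearson(8, 10))    # about (0.444, 0.975)
print(clopper_pearson(71, 361))  # about (0.157, 0.241)
```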
Blaker (Inv 1.6). Finds the values of \(\pi\) for which the two-sided p-value (combining tail probabilities) is at least \(1 - C\text{.}\) Tends to be shorter than the Clopper-Pearson interval.
95% CI, 8 of 10: (.4445, .9632); 95% CI, 71 of 361: (.1570, .2403)
Technology: R (blakerCI)
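blakerCI is an R function, but the method itself can be sketched in a few lines: keep every \(\pi\) whose Blaker "acceptability" (the probability of a tail statistic at least as extreme as the one observed) is at least \(1 - C\). A grid-search sketch under our own helper names; a fine grid trades speed for simplicity, so only the small study is demonstrated:

```python
from math import comb

def blaker_ci(x, n, conf=0.95, points=4000):
    """Keep every pi on a grid whose Blaker acceptability is at least 1 - conf."""
    alpha = 1 - conf
    binoms = [comb(n, k) for k in range(n + 1)]  # precompute binomial coefficients

    def acceptability(p):
        pmf = [binoms[k] * p**k * (1 - p)**(n - k) for k in range(n + 1)]
        lower_tails, running = [], 0.0
        for q in pmf:
            running += q
            lower_tails.append(running)
        # tail(k) = min(P(X <= k), P(X >= k))
        tails = [min(lower_tails[k], 1 - lower_tails[k] + pmf[k]) for k in range(n + 1)]
        t0 = tails[x]
        # p-value: probability of an outcome with tail statistic <= observed
        return sum(pmf[k] for k in range(n + 1) if tails[k] <= t0 + 1e-12)

    inside = [i / points for i in range(1, points) if acceptability(i / points) >= alpha]
    return min(inside), max(inside)

print(blaker_ci(8, 10))  # close to the (.4445, .9632) reported above
```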
Whatever procedure is used to determine the confidence interval, you interpret a (valid) interval the same way: as the interval of plausible values for the parameter. For example, we are 95% confident that the underlying probability of death within 30 days of a heart transplant operation at St. George's Hospital is between 0.16 and 0.24, where by "95% confident" we mean that if we were to use this procedure to construct intervals from thousands of representative samples, roughly 95% of those intervals would succeed in capturing the actual (but unknown) value of the parameter of interest.
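This long-run interpretation can itself be checked by simulation: draw many samples from a process with a known probability, construct an interval from each, and count how often the intervals capture that probability. A sketch using the Wilson interval, with an assumed "true" \(\pi = 0.2\) and \(n = 361\) (both choices are ours for illustration):

```python
import random
from math import sqrt

def wilson_ci(x, n, z=1.96):
    """Wilson (score) interval."""
    phat = x / n
    denom = 1 + z * z / n
    center = (phat + z * z / (2 * n)) / denom
    half = z * sqrt(phat * (1 - phat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

random.seed(1)
pi, n, reps = 0.2, 361, 5000   # assumed "true" process probability and sample size
hits = 0
for _ in range(reps):
    x = sum(random.random() < pi for _ in range(n))  # one simulated binomial sample
    lo, hi = wilson_ci(x, n)
    hits += lo <= pi <= hi
print(hits / reps)  # the observed coverage rate, close to 0.95
```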
Also note how to use technology to perform these calculations. You should NOT use the Simulating Confidence Intervals applet to construct a confidence interval for a particular sample of data; that applet is for exploring the long-run coverage behavior of these procedures.