42
Stata Technical Bulletin
STB-58
This article presents a dialog-box-driven program, sskdlg, to calculate the sample size for the kappa statistic when
two unique raters evaluate a binary event. sskdlg is geared towards calculating a sample size from a precision-oriented
perspective, that is, choosing a sample size so that the standard error of the estimate and the resulting limits of a confidence
interval do not exceed specified values. The program is based on the asymptotic variance presented by Fleiss et al. (1969) (see
also Fleiss 1981, equations 13.15-13.18) and follows the procedure outlined by Cantor (1996). This procedure is based on a
quantity
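$$
Q = (1-\pi_e)^{-4}\Bigl\{\sum_{i}\pi_{ii}\bigl[(1-\pi_e)-(\pi_{.i}+\pi_{i.})(1-\pi_o)\bigr]^2
+(1-\pi_o)^2\sum_{i\neq j}\pi_{ij}(\pi_{.i}+\pi_{j.})^2
-(\pi_o\pi_e-2\pi_e+\pi_o)^2\Bigr\}
$$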
where, given a 2 × 2 table, $\pi_e = \pi_{1.}\pi_{.1} + \pi_{2.}\pi_{.2}$ and $\pi_o = \pi_{11} + \pi_{22}$. Since Q equals the variance of kappa times the sample
size n, the sample size can be solved for once a target standard error is fixed: n = Q / SE^2.
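The calculation can be illustrated with a short Python sketch. It is not the sskdlg source code; the function names are hypothetical. It assumes that the 2 × 2 cell probabilities are recovered from the hypothesized kappa and the two marginal proportions of positives, that Q is the Fleiss et al. (1969) variance expression multiplied by n, that the half-width of the confidence interval equals the normal quantile times the standard error, and that the result is rounded up to the next integer.

```python
from math import ceil
from statistics import NormalDist


def cell_probs(kappa, p1, p2):
    """Cell probabilities of the 2 x 2 table implied by kappa and the marginals.

    p1 and p2 are the proportions of positives for rater 1 (rows) and
    rater 2 (columns). This parameterization is an assumption of the sketch.
    """
    pi_e = p1 * p2 + (1 - p1) * (1 - p2)   # chance agreement
    pi_o = kappa * (1 - pi_e) + pi_e       # observed agreement
    p11 = (pi_o + p1 + p2 - 1) / 2         # both raters positive
    return [[p11, p1 - p11], [p2 - p11, pi_o - p11]]


def q_value(kappa, p1, p2):
    """Q = n * Var(kappa), using the Fleiss et al. (1969) asymptotic variance."""
    pi = cell_probs(kappa, p1, p2)
    row = [pi[0][0] + pi[0][1], pi[1][0] + pi[1][1]]   # pi_{i.}
    col = [pi[0][0] + pi[1][0], pi[0][1] + pi[1][1]]   # pi_{.i}
    pi_o = pi[0][0] + pi[1][1]
    pi_e = row[0] * col[0] + row[1] * col[1]
    diag = sum(
        pi[i][i] * ((1 - pi_e) - (col[i] + row[i]) * (1 - pi_o)) ** 2
        for i in range(2)
    )
    off = (1 - pi_o) ** 2 * sum(
        pi[i][j] * (col[i] + row[j]) ** 2
        for i in range(2) for j in range(2) if i != j
    )
    last = (pi_o * pi_e - 2 * pi_e + pi_o) ** 2
    return (diag + off - last) / (1 - pi_e) ** 4


def sample_size(kappa, p1, p2, half_width, level=95.0):
    """Smallest n for which z * sqrt(Q / n) does not exceed half_width."""
    z = NormalDist().inv_cdf(0.5 + level / 200)
    return ceil(q_value(kappa, p1, p2) * (z / half_width) ** 2)


if __name__ == "__main__":
    print(q_value(0.0, 0.5, 0.5))
    print(sample_size(0.0, 0.5, 0.5, half_width=0.1))
```

With kappa equal to 0 and both marginals equal to 0.5, Q reduces to 1, so the required sample size is simply (z / d)^2, where d is the desired half-width.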
[Figure: the "Sample size for kappa" dialog box. It contains input fields for Hypothesized kappa, Prop. pos. by rater 1, Prop. pos. by rater 2, Half-width conf. int., and Confidence level; check boxes for Show value of Q, Show value of maximum Q, and Keep variables on exit; a Graph Options group with the buttons GraphQ, GraphS, and GraphD; a Sample size for diff. field; and the buttons OK, Exit, and Help.]
The dialog box is presented above. Without any options, clicking the “OK” button displays the calculated sample size in
the Results window. The following five parameters are needed.
Hypothesized kappa: The estimate of kappa the researcher expects to find. sskdlg uses a default value of 0, which corresponds to
calculating the standard error under the assumption that the two ratings are independent. Although more realistic values are
possible and should be encouraged, this default value, albeit extremely conservative, is suitable for projecting the sample size
of a study that would ultimately be analyzed using the traditional equation for the standard error, as found in Stata's kap
program. Depending on the other conditions specified, the expected kappa can take any value within the permissible bounds, which
never extend beyond -1 to 1. If the specified kappa is incompatible with the selected marginals (the proportion of positives
expected for each rater) or is outside the plausible range, sskdlg displays a warning and the sample size is not calculated.
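One way to see why a hypothesized kappa can be incompatible with the marginals is to compute the range of kappa that a pair of marginals allows. The Python sketch below is an illustration under stated assumptions, not the check that sskdlg itself performs; the function names are hypothetical. It assumes the bounds follow from requiring every cell probability of the 2 × 2 table to lie between 0 and 1.

```python
def kappa_bounds(p1, p2):
    """Range of kappa attainable for given proportions of positives.

    p1 and p2 are the marginal proportions of positives for raters 1 and 2.
    The bounds come from the largest and smallest observed agreement that
    keeps all four cell probabilities non-negative.
    """
    pi_e = p1 * p2 + (1 - p1) * (1 - p2)   # chance agreement
    if pi_e >= 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    # Largest possible agreement: put as much mass as possible on the diagonal.
    pi_o_max = min(p1, p2) + min(1 - p1, 1 - p2)
    # Smallest possible agreement: put as little mass as possible on the diagonal.
    pi_o_min = max(0.0, p1 + p2 - 1) + max(0.0, 1 - p1 - p2)
    lower = (pi_o_min - pi_e) / (1 - pi_e)
    upper = (pi_o_max - pi_e) / (1 - pi_e)
    return lower, upper


def is_compatible(kappa, p1, p2, tol=1e-12):
    """True if the hypothesized kappa is attainable with these marginals."""
    lower, upper = kappa_bounds(p1, p2)
    return lower - tol <= kappa <= upper + tol


if __name__ == "__main__":
    print(kappa_bounds(0.5, 0.5))
    print(kappa_bounds(0.5, 0.8))
    print(is_compatible(0.6, 0.5, 0.8))
```

Under these assumptions, the full range from -1 to 1 is available only when both marginals equal 0.5; unequal marginals narrow the range, which is why a kappa that looks plausible on its own can still be rejected.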