Computing optimal sampling designs for two-stage studies



Stata Technical Bulletin

      7      1      0      1      1     .001      8
      8      1      0      2      2     .004     10
      9      1      0      3      3     .013     10
     10      1      1      1      4     .002     10
     11      1      1      2      5     .003     10
     12      1      1      3      6     .002     10

the optimal sampling fraction (sample size) for grp_yz 1 = .118 (7)
the optimal sampling fraction (sample size) for grp_yz 2 = .231 (87)
the optimal sampling fraction (sample size) for grp_yz 3 = .044 (82)
the optimal sampling fraction (sample size) for grp_yz 4 = .145 (22)
the optimal sampling fraction (sample size) for grp_yz 5 = .079 (11)
the optimal sampling fraction (sample size) for grp_yz 6 = .119 (16)
the optimal sampling fraction (sample size) for grp_yz 7 = 1 (3)
the optimal sampling fraction (sample size) for grp_yz 8 = 1 (11)
the optimal sampling fraction (sample size) for grp_yz 9 = 1 (36)
the optimal sampling fraction (sample size) for grp_yz 10 = 1 (6)
the optimal sampling fraction (sample size) for grp_yz 11 = 1 (8)
the optimal sampling fraction (sample size) for grp_yz 12 = 1 (6)
the optimal number of obs = 2799
the minimum variance for lvedbp : .00038298
total budget spent : 10023

Note that the optimal design samples all available cases and a varying proportion of controls in the different sex-weight categories.
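
As a rough check on this output, the second-stage sample sizes for the six case strata (grp_yz 7 to 12, where mort = 1) can be compared with the expected number of first-stage observations in each stratum. The sketch below assumes that the reported sample size is the optimal first-stage size (2799) multiplied by the stratum prevalence and the sampling fraction, then rounded; this relationship is inferred from the figures above rather than stated in the optbud output, and the matrix names prevcase and ncase are ours.

```stata
* Prevalences of the six case strata (grp_yz 7-12), as listed in the table above
matrix prevcase = (.001,.004,.013,.002,.003,.002)'
* Assumed relationship: expected stratum count = first-stage size x prevalence
* (the sampling fraction is 1 for every case stratum, so no further factor is needed)
matrix ncase = 2799*prevcase
matrix list ncase
```

The products are approximately 2.8, 11.2, 36.4, 5.6, 8.4, and 5.6, which round to the reported sample sizes of 3, 11, 36, 6, 8, and 6.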

Example 3

In Example 2, we used the optbud command to find an optimal design subject to a budget of £10,000, where the cost
per first-stage observation was £2 and the cost per second-stage observation was £15. The minimum achievable
variance for the variable lvedbp was .00038298.
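
The total of 10023 reported in Example 2 can be reproduced from these unit costs. The following is a sketch that assumes the total is simply the first-stage cost plus the second-stage cost, with the second-stage size taken as the sum of the twelve reported stratum sample sizes; the scalar names n2total and totcost are ours.

```stata
* Sum of the twelve second-stage sample sizes reported by optbud
scalar n2total = 7+87+82+22+11+16+3+11+36+6+8+6
* Assumed cost model: 2 per first-stage observation (n = 2799)
* plus 15 per second-stage observation
scalar totcost = 2*2799 + 15*scalar(n2total)
display scalar(n2total)
display scalar(totcost)
```

This gives 295 second-stage observations and a cost of 10023, matching the reported total. The small excess over 10,000 presumably reflects rounding of the sample sizes to whole numbers.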

Now we reverse the question. If we wish to achieve a variance of .00038298 for lvedbp, which design minimizes the
study cost? The command optprec calculates the design that minimizes the cost subject to a desired precision, and
so can be used to answer this question.

. use wtpilot

. coding mort sex wtcat
(output omitted )

. matrix prev=(0.02,.134,.670,.054,.05,.047,.001,.004,.013,.002,.003,.002)'

. optprec mort sex-surg, first(sex wtcat) prev(prev) var(7) prec(.00038298) c1(2) c2(15)

(output omitted )

The optimal design for this case is exactly the same as its counterpart in Example 2, as these are simply two ways of asking
the same question.
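
One simple check before passing a vector to prev() is that its entries sum to 1 across the twelve strata. This is a sketch resting on our assumption that the entries are stratum proportions of the first-stage population; the matrix name total is ours.

```stata
matrix prev=(0.02,.134,.670,.054,.05,.047,.001,.004,.013,.002,.003,.002)'
* J(1,12,1) is a 1 x 12 row of ones, so the product is the sum of the entries
matrix total = J(1,12,1)*prev
display total[1,1]
```

For the vector used here, the sum is 1 up to floating-point rounding.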

References

Reilly, M. 1996. Optimal sampling strategies for two-stage studies. American Journal of Epidemiology 143: 92-100.

Reilly, M. and M. S. Pepe. 1995. A mean score method for missing and auxiliary covariate data in regression models. Biometrika 82: 299-314.

sxd3 Sample size for the kappa-statistic of interrater agreement

Michael E. Reichenheim, Instituto de Medicina Social/UERJ, Brazil

Abstract: The dialog-box-driven program sskdlg for calculating the sample size for the kappa-statistic when there are two
unique raters evaluating a binary event is introduced and illustrated.

Keywords: sample size, kappa statistics, dialog box.

Introduction

In recent years there has been an increasing call for researchers in the fields of psychiatry and epidemiology to account for
the stochasticity of reliability estimators, among them the kappa-statistic measure of interrater agreement (Shrout and Newman
1989). Yet, if this issue is to be addressed properly, calculating sample sizes in the planning stage of an investigation becomes
mandatory. Although some proposals for calculating sample size are available in the literature (Linnet 1987; Donner and
Eliasziw 1992; Cantor 1996; Walter et al. 1998; Shrout and Newman 1989), to our knowledge there has been only a limited
implementation in one sample-size-oriented package (Statistical Solutions 1999) and none in any of the major commercial
statistical software packages, Stata included.


