Computing optimal sampling designs for two-stage studies



Stata Technical Bulletin

 7    1    0    1    1    .001     8
 8    1    0    2    2    .004    10
 9    1    0    3    3    .013    10
10    1    1    1    4    .002    10
11    1    1    2    5    .003    10
12    1    1    3    6    .002    10

the optimal sampling fraction (sample size) for grp_yz 1 = .118 (7)
the optimal sampling fraction (sample size) for grp_yz 2 = .231 (87)
the optimal sampling fraction (sample size) for grp_yz 3 = .044 (82)
the optimal sampling fraction (sample size) for grp_yz 4 = .145 (22)
the optimal sampling fraction (sample size) for grp_yz 5 = .079 (11)
the optimal sampling fraction (sample size) for grp_yz 6 = .119 (16)
the optimal sampling fraction (sample size) for grp_yz 7 = 1 (3)
the optimal sampling fraction (sample size) for grp_yz 8 = 1 (11)
the optimal sampling fraction (sample size) for grp_yz 9 = 1 (36)
the optimal sampling fraction (sample size) for grp_yz 10 = 1 (6)
the optimal sampling fraction (sample size) for grp_yz 11 = 1 (8)
the optimal sampling fraction (sample size) for grp_yz 12 = 1 (6)
the optimal number of obs = 2799
the minimum variance for lve : .00038298
total budget spent : 10023

Note that the optimal design samples all available cases and a varying proportion of controls in the different sex-weight categories.
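The printed sample sizes can be cross-checked against the sampling fractions. The sketch below assumes that each second-stage sample size is roughly (sampling fraction) × (stratum prevalence) × (optimal number of first-stage observations), and that the prevalences are those listed in the prev matrix of Example 3. The Python names are illustrative and are not part of the Stata commands. Because the printed fractions are rounded to three decimals, agreement is expected only to within about one observation.

```python
# Cross-check of the optimal design reported above (illustrative only).
# Assumption: second-stage size ~= fraction * prevalence * first-stage n.

n_first = 2799  # "the optimal number of obs" in the output

# Stratum prevalences in grp_yz order 1..12 (taken from the prev matrix).
prev = [0.02, 0.134, 0.670, 0.054, 0.05, 0.047,
        0.001, 0.004, 0.013, 0.002, 0.003, 0.002]

# Optimal sampling fractions and sample sizes as printed in the output.
fraction = [0.118, 0.231, 0.044, 0.145, 0.079, 0.119, 1, 1, 1, 1, 1, 1]
reported = [7, 87, 82, 22, 11, 16, 3, 11, 36, 6, 8, 6]

# Approximate second-stage sample size implied by each fraction.
expected = [f * p * n_first for f, p in zip(fraction, prev)]

for g, (e, r) in enumerate(zip(expected, reported), start=1):
    print(f"grp_yz {g:2d}: expected {e:6.2f}, reported {r:3d}")

# Largest gap between the approximation and the printed sizes.
max_gap = max(abs(e - r) for e, r in zip(expected, reported))
```

Under these assumptions every stratum agrees to within one observation. Strata 7 to 12 (the cases) have fraction 1, so their sample sizes are simply the expected stratum counts.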

Example 3

In Example 2, we used the optbud command to find an optimal design subject to a budget of £10,000, where the cost
per first-stage observation was £2 and the cost per second-stage observation was £15. The minimum achievable variance
for the variable lvedbp was .00038298.
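The "total budget spent" figure in the Example 2 output follows from these unit costs. A minimal check, assuming the total cost is c1 × (first-stage observations) + c2 × (sum of second-stage sample sizes); the variable names are illustrative:

```python
# Budget arithmetic for the optimal design in Example 2 (illustrative check).
c1, c2 = 2, 15   # cost per first- and second-stage observation, in pounds
n_first = 2799   # optimal number of first-stage observations

# Second-stage sample sizes for grp_yz 1..12, as printed in the output.
n_second = [7, 87, 82, 22, 11, 16, 3, 11, 36, 6, 8, 6]

# Assumed cost model: every first-stage unit costs c1,
# every second-stage unit costs an additional c2.
total_cost = c1 * n_first + c2 * sum(n_second)
print(total_cost)
```

Under this assumption the total is 10,023, which matches the reported figure. The small excess over £10,000 is presumably due to rounding the sample sizes to whole observations.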

Now we reverse the question. If we wish to achieve a variance of .00038298 for lvedbp, which design minimizes the
study cost? The optprec command calculates the design that minimizes cost subject to a desired precision, and
so can be used to answer this question.

. use wtpilot

. coding mort sex wtcat
(output omitted )

. matrix prev=(0.02,.134,.670,.054,.05,.047,.001,.004,.013,.002,.003,.002)'

. optprec mort sex-surg, first(sex wtcat) prev(prev) var(7) prec(.00038298) c1(2) c2(15)

(output omitted )
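As a sanity check on the prev vector, the sketch below confirms that its 12 entries sum to one and pairs each entry with a stratum. The ordering (mort varying slowest, then sex, then wtcat, with mort = 0 first) is an assumption inferred from the listing of strata 7 to 12 earlier in this section; it is not stated by the commands themselves.

```python
from itertools import product

# Prevalences as passed to optprec via the prev matrix.
prev = [0.02, 0.134, 0.670, 0.054, 0.05, 0.047,
        0.001, 0.004, 0.013, 0.002, 0.003, 0.002]

# Assumed stratum order: mort in {0,1}, sex in {0,1}, wtcat in {1,2,3}.
strata = list(product([0, 1], [0, 1], [1, 2, 3]))

for g, ((mort, sex, wtcat), p) in enumerate(zip(strata, prev), start=1):
    print(f"grp_yz {g:2d}: mort={mort} sex={sex} wtcat={wtcat} prev={p}")

# The prevalences should describe the whole first-stage population.
total = sum(prev)

# Share of the population in the case strata (mort = 1).
case_share = sum(p for (mort, _, _), p in zip(strata, prev) if mort == 1)
```

Under the assumed ordering, cases make up about 2.5% of the first-stage population, which may help explain why the optimal design samples all of them.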

The optimal design for this case is exactly the same as its counterpart in Example 2, as these are simply two ways of asking
the same question.


sxd3 Sample size for the kappa-statistic of interrater agreement

Michael E. Reichenheim, Instituto de Medicina Social/UERJ, Brazil

Abstract: The dialog-box-driven program sskdlg for calculating the sample size for the kappa-statistic when there are two
unique raters evaluating a binary event is introduced and illustrated.

Keywords: sample size, kappa statistics, dialog box.

Introduction

In recent years there has been an increasing call for researchers in psychiatry and epidemiology to account for
the stochasticity of reliability estimators, among them the kappa-statistic measure of interrater agreement (Shrout and Newman
1989). If this issue is to be addressed properly, calculating sample sizes at the planning stage of an investigation becomes
mandatory. Although some proposals for calculating sample size are available in the literature (Linnet 1987; Donner and Eliasziw
1992; Cantor 1996; Walter et al. 1998; Shrout and Newman 1989), to our knowledge there has been only a limited implementation
in one sample-size-oriented package (Statistical Solutions 1999) and none in any of the major commercial statistical software
packages, Stata included.


