Update to a program for saving a model fit as a dataset



38    Stata Technical Bulletin    STB-58


Notice that each of the optfixn, optbud, and optprec commands has an option coding, which can be used if one is
sure of the order in which the vector of first-stage sample sizes or prevalences should be entered. This option results in coding
being called automatically from inside the optimal sampling command. Since this creates variables named
grp_yz and grp_z, an error message will be generated if variables with these names already exist.

Options

first(varlist) specifies the first-stage covariates.

n1(vecname) specifies the vector of first-stage sample sizes for each stratum.

prev(vecname) specifies the vector of prevalences for each stratum.

n2(#) specifies the second-stage sample sizes (used only with optfixn).

b(#) specifies the available budget (used only with optbud).

c1(#) specifies the cost per observation at the first stage (used with optbud and optprec).

c2(#) specifies the cost per observation at the second stage (used with optbud and optprec).

var(#) specifies the position in the logistic regression model of the covariate whose variance is to be minimized (that is,
optimized). For example, in the simple model $Y = b_0 + b_1 X_1 + b_2 X_2$, if we want to minimize the variance of the
estimated coefficient of $X_1$, then var = 2.

prec(#) specifies the desired precision, that is, the variance (used only with optprec).

coding(#) is a logical flag; the default of 0 (that is, false) means that prior to calling optfixn, optbud, or optprec one must
have run the coding command.

Example 1

The following example is from CASS (the Coronary Artery Surgery Study) and appears in Reilly (1996). This study collected
data on operative mortality and various risk factors for 8,096 subjects. Let us suppose that at the first stage we have only
mortality status Y and sex Z, as specified in the table below, and that it has been agreed to record the age for a subsample of
1,000 subjects in order to estimate the sex-adjusted odds ratio for age. The example is fictitious, as we do have all the covariates
on all subjects, but for illustrative purposes we ignore this information (that is, we set the values to missing). In order to compute
optimal sample sizes, we require pilot data in all of the strata of the table, and so we "sampled" (reset the missing values to
the actual age values) a randomly selected 25 observations from each stratum. The resulting dataset of 100 observations is
available as pilotcas accompanying this insert.

                      male     female
            Y         Z = 0    Z = 1

alive       Y = 0     6,666    1,228

deceased    Y = 1       144       58
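
The pilot sampling described above (25 subjects drawn at random from each of the four strata) can be illustrated with a short Python sketch. This is not how pilotcas was produced; it only assumes the stratum counts in the table, uses integer indices as stand-ins for subject records, and the function name `draw_pilot` and the seed are hypothetical.

```python
import random

# First-stage counts from the table, keyed by (mort, sex).
first_stage_counts = {(0, 0): 6666, (0, 1): 1228, (1, 0): 144, (1, 1): 58}


def draw_pilot(counts, per_stratum=25, seed=0):
    """Draw a simple random sample of fixed size within each stratum.

    Returns a dict mapping each stratum to the sorted indices selected.
    Indices are placeholders for subject records (an assumption of this sketch).
    """
    rng = random.Random(seed)
    pilot = {}
    for stratum, n in counts.items():
        # Sampling without replacement inside the stratum.
        pilot[stratum] = sorted(rng.sample(range(n), per_stratum))
    return pilot


pilot = draw_pilot(first_stage_counts)
total_pilot = sum(len(ids) for ids in pilot.values())
print(total_pilot)  # 4 strata x 25 subjects = 100
```

Sampling the same number from every stratum guarantees that even the smallest cell (58 deceased females) contributes pilot data.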

We start by computing the optimal allocation for a second-stage sample of 1,000.

. use pilotcas

. coding mort sex

    grp_yz     mort      sex    grp_z     nobs

         1        0        0        1       25
         2        0        1        2       25
         3        1        0        1       25
         4        1        1        2       25

for functions requiring first stage sample sizes/prevalences
enter these in the order of grp_yz
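
To make the stratum codes concrete, the following Python sketch reproduces the table printed by coding. The source does not show how the command is implemented, so this assumes that codes are assigned in ascending order of (mort, sex) for grp_yz and of sex alone for grp_z; the function name `coding_sketch` is hypothetical, and the records contain only the two first-stage variables.

```python
def coding_sketch(records, y="mort", z="sex"):
    """Return rows (grp_yz, y, z, grp_z, nobs), one per observed stratum.

    Assumption: codes start at 1 and follow the sorted order of the levels.
    """
    yz_levels = sorted({(r[y], r[z]) for r in records})
    z_levels = sorted({r[z] for r in records})
    grp_yz = {level: i + 1 for i, level in enumerate(yz_levels)}
    grp_z = {level: i + 1 for i, level in enumerate(z_levels)}
    rows = []
    for y_val, z_val in yz_levels:
        # Count the observations that fall in this (y, z) stratum.
        nobs = sum(1 for r in records if (r[y], r[z]) == (y_val, z_val))
        rows.append((grp_yz[(y_val, z_val)], y_val, z_val, grp_z[z_val], nobs))
    return rows


# Stand-in for the pilot data: 25 records in each (mort, sex) stratum.
records = [
    {"mort": m, "sex": s} for m in (0, 1) for s in (0, 1) for _ in range(25)
]
table = coding_sketch(records)
for row in table:
    print(row)
```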

The coding function tells us that we have to enter the vector of first-stage sample sizes in the order specified in the following
table.

First element     grp_yz = 1    first-stage sample size for living (mort = 0) males (sex = 0)
Second element    grp_yz = 2    first-stage sample size for living females
Third element     grp_yz = 3    first-stage sample size for deceased (mort = 1) males
Fourth element    grp_yz = 4    first-stage sample size for deceased females
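
As a quick check of the ordering, the Python sketch below arranges the CASS first-stage counts in grp_yz order. It also computes stratum proportions, on the assumption (not stated in this excerpt) that the prevalences expected by prev() are each stratum's share of the first-stage sample. The variable names are illustrative only.

```python
# First-stage counts from the CASS table, keyed by (mort, sex).
counts = {(1, 1): 58, (0, 1): 1228, (1, 0): 144, (0, 0): 6666}

# Ascending (mort, sex) order corresponds to grp_yz = 1, 2, 3, 4.
order = sorted(counts)
n1 = [counts[key] for key in order]

# Assumed definition of prevalence: stratum share of the first-stage sample.
total = sum(n1)
prev = [n / total for n in n1]

print(n1)     # [6666, 1228, 144, 58]
print(total)  # 8096
print([round(p, 4) for p in prev])
```

Keying the counts by (mort, sex) and sorting, rather than typing the vector by hand, avoids entering the strata in the wrong order.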


