Computing optimal sampling designs for two-stage studies





Stata Technical Bulletin


STB-58


Description

The meanscor command performs a weighted logistic regression using the mean score method. The command requires the
complete covariate(s) to be categorical. By default, the output reports the regression coefficient estimates and their
standard errors in odds-ratio form.

An important area of application of this command is the analysis of data from a two-stage study. In this type of study,
some variables are incomplete because only a subset of the study subjects is sampled at the second stage (Reilly 1996).

Options

first(varlist) specifies the complete covariates.

second(varlist) specifies the incomplete covariates.

odd(#) specifies whether results are reported as odds ratios (odd = 1) or as regression coefficients (odd = 0). The default is 1.
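The two reporting formats are related by exponentiation. The following Python sketch converts between them, assuming the odds-ratio standard error is obtained by the delta method (odds ratio times coefficient standard error), which is the usual convention for odds-ratio output in Stata but is not stated in this insert. The function names are hypothetical, and the input figures are the x row of the example output reported later in this insert.

```python
import math


def odds_ratio_to_coef(odds_ratio, se_odds_ratio):
    """Convert an odds ratio and its delta-method SE to the coefficient scale."""
    coef = math.log(odds_ratio)
    se_coef = se_odds_ratio / odds_ratio  # assumption: delta-method SE
    return coef, se_coef


def coef_to_odds_ratio(coef, se_coef, z_crit=1.959964):
    """Convert a coefficient and its SE to an odds ratio, SE, and 95% interval."""
    odds_ratio = math.exp(coef)
    se_odds_ratio = odds_ratio * se_coef  # assumption: delta-method SE
    # The interval is formed on the coefficient scale, then exponentiated.
    ci = (math.exp(coef - z_crit * se_coef), math.exp(coef + z_crit * se_coef))
    return odds_ratio, se_odds_ratio, ci


# Figures taken from the x row of the example output in this insert.
coef, se_coef = odds_ratio_to_coef(2.770173, 0.282211)
z_stat = coef / se_coef
odds_ratio, se_or, ci = coef_to_odds_ratio(coef, se_coef)
print(coef, se_coef, z_stat, ci)
```

Under this assumption the test statistic and interval computed from the coefficient scale agree with the z value and confidence limits shown in the example output, which suggests that the test and interval are the same in either format and only the point estimate and standard error are rescaled.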

Methods and Formulas

The mean score estimates will maximize the weighted likelihood

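$$\sum_{i} \left( 1 + \frac{n^{I(z_i, y_i)}}{n^{C(z_i, y_i)}} \right) \log p_\beta(y_i \mid x_i)$$

where the sum is over the complete observations, $n^{I(z_i, y_i)}$ is the number of incomplete observations in each stratum defined by the different levels of response $y_i$ and
complete covariates $z_i$, and $n^{C(z_i, y_i)}$ is the number of complete observations in each stratum.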

As the above equation indicates, the mean score method weights each complete observation according to the total number
of observations in the same stratum.
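As an illustration of this weighting, the Python sketch below counts complete and incomplete observations in each (response, complete covariate) stratum and returns the weight 1 + (incomplete count)/(complete count) for each stratum. It is not the meanscor implementation: the function name, the record layout (y, z, x) with x set to None when incomplete, and the toy data are all assumptions made for the example.

```python
from collections import Counter


def mean_score_weights(records):
    """Return {(y, z): weight} for strata with at least one complete record.

    records: iterable of (y, z, x), where x is None for incomplete records.
    """
    total = Counter((y, z) for y, z, x in records)
    complete = Counter((y, z) for y, z, x in records if x is not None)
    weights = {}
    for stratum, n_complete in complete.items():
        n_incomplete = total[stratum] - n_complete
        # Equivalent to (total in stratum) / (complete in stratum).
        weights[stratum] = 1 + n_incomplete / n_complete
    return weights


# Hypothetical toy data: z is 1 for positive x and 0 otherwise.
records = [
    (1, 0, -0.3), (1, 0, None), (1, 0, None),  # 1 complete, 2 incomplete
    (0, 1, 1.2), (0, 1, 0.8),                  # 2 complete, 0 incomplete
    (0, 0, -0.1), (0, 0, None),                # 1 complete, 1 incomplete
]
weights = mean_score_weights(records)
print(weights)
```

Summing the weights over the complete records recovers the total number of records, which is one way to see that each complete observation stands in for the incomplete observations in its stratum. A stratum with no complete observations has no defined weight and is simply omitted by this sketch.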

The asymptotic variance of the mean score estimate is given by

$$\mathrm{Var}(\hat{\beta}) = \frac{1}{n} \left( I^{-1} + I^{-1} V I^{-1} \right)$$

where n is the total number of observations, and I is the usual information matrix. V is estimated by the matrix

$$\sum_{(y, z)} \frac{n(y, z) \, n^{I(y, z)}}{n^{C(y, z)}} \, \mathrm{Var}\left( S_\beta(y, z) \right)$$

where $\mathrm{Var}(S_\beta(y, z))$ is the variance-covariance matrix of the score in each $(y, z)$ stratum.

We can regard the second term of the variance expression as a penalty for the incomplete observations. Hence,
the mean score estimates will have larger variance than the estimates obtained if all observations were complete, but smaller
variance than the estimates from an analysis of complete cases only.

Examples

We begin with a simulated dataset. We generated 1,000 observations of a predictor variable x from the standard normal
distribution. The response variable y was then generated as a Bernoulli random variable with p = exp(x)/{1 + exp(x)}. A
dichotomous surrogate variable for x, called z, was generated as one for positive x and zero otherwise.

A random subsample of 500 observations had their x value deleted (set to missing). The dataset, called sim_miss.dta, is
provided with this insert as an illustration and can be analyzed using the mean score method by

. use sim_miss

. meanscor y x, first(z) second(x)
meanscore estimates

         |  odd-ratio   Std. Err.        z     P>|z|     [95% Conf. Interval]
---------+-------------------------------------------------------------------
    cons |   1.050643    .0751759    0.690     0.490     .9131663    1.208817
       x |   2.770173     .282211   10.002     0.000     2.268772    3.382384
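For readers without the dataset, the Python sketch below mimics the simulation described above and fits the weighted logistic regression of y on x by Newton-Raphson, weighting each complete observation by the stratum total divided by the stratum complete count. It is an illustration under stated assumptions, not the meanscor code: the function names and the seed are hypothetical, the random draws differ from those in sim_miss.dta, and standard errors are not computed, so the estimates will be close to but not equal to the figures reported above.

```python
import math
import random
from collections import Counter


def simulate(n=1000, n_incomplete=500, seed=58):
    """Simulate (y, z, x) records; x is None where it was deleted."""
    rng = random.Random(seed)  # arbitrary seed, not the one used for sim_miss.dta
    records = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = math.exp(x) / (1.0 + math.exp(x))
        y = 1 if rng.random() < p else 0
        z = 1 if x > 0 else 0
        records.append([y, z, x])
    for i in rng.sample(range(n), n_incomplete):
        records[i][2] = None  # delete x for a random subsample
    return records


def fit_mean_score(records, iterations=25):
    """Weighted logistic regression of y on x using (y, z) stratum weights."""
    total = Counter((y, z) for y, z, x in records)
    complete = Counter((y, z) for y, z, x in records if x is not None)
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for y, z, x in records:
            if x is None:
                continue  # only complete observations enter the sum
            w = total[(y, z)] / complete[(y, z)]
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += w * (y - p)            # weighted score, intercept
            g1 += w * (y - p) * x        # weighted score, slope
            h00 += w * p * (1.0 - p)     # weighted information entries
            h01 += w * p * (1.0 - p) * x
            h11 += w * p * (1.0 - p) * x * x
        det = h00 * h11 - h01 * h01
        # Newton-Raphson step: solve the 2 x 2 system directly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


records = simulate()
b0, b1 = fit_mean_score(records)
print("odds ratio, cons:", math.exp(b0))
print("odds ratio, x:", math.exp(b1))
```

Because the true coefficients in the simulation are 0 for the constant and 1 for x, the fitted odds ratios should fall near 1 and near exp(1), about 2.72, in line with the estimates in the output above.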


