Stata Technical Bulletin
We can compare this to the logistic regression analysis using only the complete observations:
. keep if x~=.
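. logit y x, or
Logit estimates                                   Number of obs   =        500
                                                  LR chi2(1)      =      92.97
                                                  Prob > chi2     =     0.0000
Log likelihood = -299.15925                       Pseudo R2       =
------------------------------------------------------------------------------
       y | Odds Ratio   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |   2.771684   .3326964      8.493   0.000      2.190638    3.506847
------------------------------------------------------------------------------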
Note that the mean score estimate above has a smaller standard error, reflecting the additional information used in the analysis.
Also, since z is a surrogate for x, it is not used in the complete case analysis.
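The columns of the complete case output can be checked against one another. The sketch below assumes the usual convention for logit with the or option: the standard error of the odds ratio is obtained by the delta method, and z and the confidence limits are computed on the log odds scale. The scalar names are illustrative only, and invnormal() may be named invnorm() in older releases.

```stata
* Values copied from the complete case logit output for x
scalar or_x  = 2.771684
scalar se_or = .3326964

* Move to the log odds scale: se(b) = se(OR)/OR by the delta method
scalar b_x  = ln(or_x)
scalar se_b = se_or/or_x

* z statistic and 95% confidence limits, transformed back to odds ratios
display "z      = " %6.3f (b_x/se_b)
display "95% CI = " %9.6f exp(b_x - invnormal(.975)*se_b) " to " %9.6f exp(b_x + invnormal(.975)*se_b)
```

The displayed values should agree with the z and interval columns of the output up to rounding.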
Next, we consider a real example of an application of the mean score method to a case-control study of the association
between ectopic pregnancy and sexually transmitted diseases; see Reilly and Pepe (1995) for a full description of the data.
. use ectopic
. meanscor y gonn-chlam,first(gonn-sexptn) second(chlam)
meanscore estimates
------------------------------------------------------------------------------
         | Odds Ratio   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
    cons |   .4543184   .0987123     -3.631   0.000      .2967666    .6955137
    gonn |   .9495978   .2856096     -0.172   0.863      .5266531    1.712201
   contr |   .0943838   .0176643    -12.612   0.000      .0654021    .1362082
  sexptn |   2.099286   .4938943      3.152   0.002      1.323766    3.329139
   chlam |   2.471606   .7808384      2.864   0.004      1.330653    4.590858
------------------------------------------------------------------------------
For comparison, an analysis of complete cases only gives
. keep if chlam ~=.
. logit y gonn-chlam, or
Logit estimates                                   Number of obs   =        327
                                                  Prob > chi2     =
Log likelihood = -169.54627                       Pseudo R2       =
------------------------------------------------------------------------------
       y | Odds Ratio   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
    gonn |   .7445515   .3132037     -0.701   0.483      .3264582    1.698095
   contr |   .1098308   .0303352     -7.997   0.000       .063918    .1887231
  sexptn |    1.93898   .7101447      1.808   0.071       .945853     3.97487
   chlam |    2.47682   .7576623      2.965   0.003      1.359912    4.511054
------------------------------------------------------------------------------
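Because the odds ratios differ between the two analyses, their standard errors are easier to compare on the log odds scale. The sketch below assumes the delta-method relation se(log OR) = se(OR)/OR and uses only the figures printed in the two tables above; in each line the first value is from the mean score analysis and the second from the complete case analysis.

```stata
* Log-scale standard errors, se(log OR) = se(OR)/OR
* First value: mean score analysis; second value: complete case analysis
display "gonn:   " %6.4f (.2856096/.9495978) "  vs  " %6.4f (.3132037/.7445515)
display "contr:  " %6.4f (.0176643/.0943838) "  vs  " %6.4f (.0303352/.1098308)
display "sexptn: " %6.4f (.4938943/2.099286) "  vs  " %6.4f (.7101447/1.93898)
display "chlam:  " %6.4f (.7808384/2.471606) "  vs  " %6.4f (.7576623/2.47682)
```

On this scale the mean score standard errors for gonn, contr, and sexptn are noticeably smaller than the complete case ones, while the two values for chlam are nearly equal.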
References
Reilly, M. 1996. Optimal sampling strategies for two-stage studies. American Journal of Epidemiology 143: 92-100.
Reilly, M. and M. S. Pepe. 1995. A mean score method for missing and auxiliary covariate data in regression models. Biometrika 82: 299-314.
sg157 Predicted values calculated from linear or logistic regression models
Joanne M. Garrett, University of North Carolina
Abstract: The program predcalc for easily calculating predicted values and confidence intervals from linear or logistic regression
model estimates for specified values of the X variables is introduced and illustrated.
Keywords: regression models, predicted values.
Syntax
predcalc yvar, xvar(xvarlist) [ level(#) model linear ]
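The kind of calculation described in the abstract can be sketched with built-in commands. This is not the predcalc code itself: the auto dataset, the choice of mpg as the only X variable, the value 25, and the scalar names are all assumptions made for illustration. The confidence limits are formed on the logit scale and then transformed to probabilities.

```stata
* Illustration only: predicted probability and 95% CI at a chosen X value
sysuse auto, clear
logit foreign mpg

* Linear predictor at mpg = 25, with its standard error
lincom _cons + 25*mpg
scalar xb_hat = r(estimate)
scalar se_hat = r(se)

* Transform the estimate and its limits from the logit scale to probabilities
display "Predicted probability = " %6.4f invlogit(xb_hat)
display "95% CI = " %6.4f invlogit(xb_hat - invnormal(.975)*se_hat) " to " %6.4f invlogit(xb_hat + invnormal(.975)*se_hat)
```

For a linear regression model the same lincom estimate would be reported directly, without the invlogit() transformation.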