Update to a program for saving a model fit as a dataset



Stata Technical Bulletin



We can compare this to the logistic regression analysis using only the complete observations:

. keep if x~=.

. logit y x, or

Logit estimates                                   Number of obs   =        500
                                                  LR chi2(1)      =      92.97
                                                  Prob > chi2     =     0.0000
Log likelihood = -299.15925                       Pseudo R2       =     0.1345

       y | Odds Ratio   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |   2.771684   .3326964      8.493   0.000      2.190638    3.506847

Note that the mean score estimate above has a smaller standard error, reflecting the additional information used in the analysis.
Also, since z is only a surrogate for x, it is not used in the complete-case analysis.
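
The columns of the complete-case table above are linked by the usual Wald calculation on the log-odds scale. The following Python sketch is not part of the original bulletin. It assumes that the reported standard error is the delta-method standard error of the odds ratio itself and that the interval uses a normal critical value of about 1.96. The function name wald_from_odds_ratio is illustrative only.

```python
import math

def wald_from_odds_ratio(odds_ratio, std_err, z_crit=1.959964):
    """Return (z, lower, upper) for an odds ratio and its standard error.

    Assumption: std_err is the delta-method SE of the odds ratio, so the
    SE of the log odds ratio is std_err / odds_ratio.
    """
    log_or = math.log(odds_ratio)
    se_log = std_err / odds_ratio
    z = log_or / se_log
    # Build the interval on the log scale, then exponentiate.
    lower = math.exp(log_or - z_crit * se_log)
    upper = math.exp(log_or + z_crit * se_log)
    return z, lower, upper

# Odds ratio and standard error for x, copied from the complete-case table.
z, lower, upper = wald_from_odds_ratio(2.771684, 0.3326964)
print(round(z, 3), round(lower, 3), round(upper, 3))
```

Under these assumptions the result agrees with the tabulated z of 8.493 and the interval from 2.190638 to 3.506847 to about three decimal places, which suggests the table was produced this way.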

Next, we consider a real example: an application of the mean score method to a case-control study of the association
between ectopic pregnancy and sexually transmitted diseases. See Reilly and Pepe (1995) for a full description of the data.

. use ectopic

. meanscor y gonn-chlam, first(gonn-sexptn) second(chlam)
meanscore estimates

         | Odds Ratio   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
    cons |   .4543184   .0987123     -3.631   0.000      .2967666    .6955137
    gonn |   .9495978   .2856096     -0.172   0.863      .5266531    1.712201
   contr |   .0943838   .0176643    -12.612   0.000      .0654021    .1362082
  sexptn |   2.099286   .4938943      3.152   0.002      1.323766    3.329139
   chlam |   2.471606   .7808384      2.864   0.004      1.330653    4.590858

For comparison, an analysis of complete cases only gives

. keep if chlam ~=.

. logit y gonn-chlam, or

Logit estimates                                   Number of obs   =        327
                                                  LR chi2(4)      =     104.24
                                                  Prob > chi2     =     0.0000
Log likelihood = -169.54627                       Pseudo R2       =     0.2351

       y | Odds Ratio   Std. Err.       z     P>|z|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
    gonn |   .7445515   .3132037     -0.701   0.483      .3264582    1.698095
   contr |   .1098308   .0303352     -7.997   0.000       .063918    .1887231
  sexptn |    1.93898   .7101447      1.808   0.071       .945853     3.97487
   chlam |    2.47682   .7576623      2.965   0.003      1.359912    4.511054
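
Odds ratios of different sizes make the raw standard errors in the two ectopic-pregnancy tables hard to compare directly. The following Python sketch is not part of the original bulletin. It puts both analyses on the log-odds scale, assuming each reported standard error is the delta-method standard error of the odds ratio, and prints the ratio of the mean score to the complete-case standard error for each covariate. The helper name log_scale_se is illustrative only.

```python
# Odds ratios and standard errors copied from the two ectopic-pregnancy tables.
mean_score = {
    "gonn": (0.9495978, 0.2856096),
    "contr": (0.0943838, 0.0176643),
    "sexptn": (2.099286, 0.4938943),
    "chlam": (2.471606, 0.7808384),
}
complete_case = {
    "gonn": (0.7445515, 0.3132037),
    "contr": (0.1098308, 0.0303352),
    "sexptn": (1.93898, 0.7101447),
    "chlam": (2.47682, 0.7576623),
}

def log_scale_se(odds_ratio, std_err):
    """SE of the log odds ratio, assuming a delta-method SE for the odds ratio."""
    return std_err / odds_ratio

se_ratio = {}
for name in mean_score:
    ms = log_scale_se(*mean_score[name])
    cc = log_scale_se(*complete_case[name])
    se_ratio[name] = ms / cc  # below 1 means the mean score SE is smaller
    print(f"{name:8s} mean score {ms:.3f}  complete case {cc:.3f}  ratio {se_ratio[name]:.2f}")
```

On these numbers the log-scale standard errors of gonn, contr, and sexptn are about 28 to 36 percent smaller under the mean score analysis. That of chlam, the covariate listed in second(), is about 3 percent larger.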

References

Reilly, M. 1996. Optimal sampling strategies for two-stage studies. American Journal of Epidemiology 143: 92-100.

Reilly, M. and M. S. Pepe. 1995. A mean score method for missing and auxiliary covariate data in regression models. Biometrika 82: 299-314.

sg157 Predicted values calculated from linear or logistic regression models

Joanne M. Garrett, University of North Carolina, [email protected]

Abstract: The program predcalc for easily calculating predicted values and confidence intervals from linear or logistic regression model estimates for specified values of the X variables is introduced and illustrated.

Keywords: regression models, predicted values.

Syntax

predcalc yvar, xvar(xvarlist) [ level(#) model linear ]
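
As a rough illustration of the kind of calculation the abstract describes, the following Python sketch computes a predicted value and a Wald confidence interval at specified covariate values. It is a generic textbook calculation and is not predcalc's own code. The function name predicted_with_ci, the coefficients, and the covariance matrix are all hypothetical. A normal critical value is assumed for both model types, which is only an approximation for linear regression, where a t quantile is customary.

```python
import math

def predicted_with_ci(coefs, cov, x_values, logistic=True, z_crit=1.959964):
    """Predicted value and Wald interval at the given covariate values.

    coefs: coefficients with the constant first.
    cov: covariance matrix of coefs, as a list of lists.
    x_values: covariate values, excluding the constant.
    """
    x = [1.0] + list(x_values)
    n = len(x)
    # Linear predictor and its standard error, sqrt(x' V x).
    xb = sum(b * xi for b, xi in zip(coefs, x))
    var = sum(x[i] * cov[i][j] * x[j] for i in range(n) for j in range(n))
    se = math.sqrt(var)
    lo, hi = xb - z_crit * se, xb + z_crit * se
    if logistic:
        # Transform the linear-scale limits to probabilities.
        inv_logit = lambda t: 1.0 / (1.0 + math.exp(-t))
        return inv_logit(xb), inv_logit(lo), inv_logit(hi)
    return xb, lo, hi

# Hypothetical estimates, for illustration only (not taken from the bulletin).
coefs = [-2.0, 0.5]
cov = [[0.04, -0.01], [-0.01, 0.01]]
p, p_lo, p_hi = predicted_with_ci(coefs, cov, [2.0])
print(round(p, 4), round(p_lo, 4), round(p_hi, 4))
```

For a logistic model the interval is formed on the linear predictor scale and then transformed, which keeps both limits between 0 and 1.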


