Update to a program for saving a model fit as a dataset

Stata Technical Bulletin

We can compare this to the logistic regression analysis using only the complete observations:

. keep if x~=.

. logit y x, or

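Logit estimates                                   Number of obs   =        500
                                                  LR chi2(1)      =      92.97
                                                  Prob > chi2     =     0.0000
Log likelihood = -299.15925                       Pseudo R2       =     0.1345

-----------------------------------------------------------------------------
         | odd-ratio   Std. Err.          z   P>|z|    [95% Conf.   Interval]
---------+-------------------------------------------------------------------
       x |  2.771684    .3326964      8.493   0.000      2.190638    3.506847
-----------------------------------------------------------------------------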

Note that the mean score estimate above had smaller standard error, reflecting the additional information used in the analysis.
Also, since z is a surrogate for x, it is not used in the complete case analysis.
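
The z statistic and confidence limits reported for x in the complete-case output can be recovered from the odds ratio and its standard error. The sketch below is written in Python purely as an arithmetic check and is not part of the Stata session. It assumes the usual conventions for odds-ratio output: the standard error of the odds ratio is the delta-method value (odds ratio times the standard error of the log odds ratio), and the interval is the exponentiated Wald interval for the coefficient. The function name wald_from_odds_ratio is hypothetical.

```python
import math
from statistics import NormalDist


def wald_from_odds_ratio(odds_ratio, se_or, level=0.95):
    """Recover the Wald z and confidence limits from an odds ratio and its SE.

    Assumes se_or is the delta-method standard error, i.e.
    se_or = odds_ratio * se(log odds ratio).
    """
    b = math.log(odds_ratio)          # coefficient on the log-odds scale
    se_b = se_or / odds_ratio         # standard error of the coefficient
    z = b / se_b                      # Wald statistic
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    lower = math.exp(b - crit * se_b)
    upper = math.exp(b + crit * se_b)
    return z, lower, upper


# Values taken from the complete-case logit output for x
z, lower, upper = wald_from_odds_ratio(2.771684, 0.3326964)
print(f"z = {z:.3f}, 95% CI = [{lower:.4f}, {upper:.4f}]")
```

Because the interval is computed on the log-odds scale and then exponentiated, it is not symmetric about the odds ratio.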

Next, we consider a real example of an application of the mean score method to a case-control study of the association
between ectopic pregnancy and sexually transmitted diseases; see Reilly and Pepe (1995) for a full description of the data.

. use ectopic

. meanscor y gonn-chlam, first(gonn-sexptn) second(chlam)
meanscore estimates

         | odd-ratio   Std. Err.          z   P>|z|    [95% Conf.   Interval]
---------+-------------------------------------------------------------------
    cons |  .4543184    .0987123     -3.631   0.000      .2967666    .6955137
    gonn |  .9495978    .2856096     -0.172   0.863      .5266531    1.712201
   contr |  .0943838    .0176643    -12.612   0.000      .0654021    .1362082
  sexptn |  2.099286    .4938943      3.152   0.002      1.323766    3.329139
   chlam |  2.471606    .7808384      2.864   0.004      1.330653    4.590858

For comparison, an analysis of complete cases only gives

. keep if chlam ~=.

. logit y gonn-chlam, or
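
Logit estimates                                   Number of obs   =        327
                                                  LR chi2(4)      =     104.24
                                                  Prob > chi2     =     0.0000
Log likelihood = -169.54627                       Pseudo R2       =     0.2351

-----------------------------------------------------------------------------
         | odd-ratio   Std. Err.          z   P>|z|    [95% Conf.   Interval]
---------+-------------------------------------------------------------------
    gonn |  .7445515    .3132037     -0.701   0.483      .3264582    1.698095
   contr |  .1098308    .0303352     -7.997   0.000       .063918    .1887231
  sexptn |   1.93898    .7101447      1.808   0.071       .945853     3.97487
   chlam |   2.47682    .7576623      2.965   0.003      1.359912    4.511054
-----------------------------------------------------------------------------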
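
The standard errors in the two ectopic pregnancy analyses are reported on the odds-ratio scale, so they are not directly comparable when the odds ratios themselves differ. One way to compare precision is to convert each to the log-odds scale by dividing the standard error by the odds ratio. The sketch below does this in Python as an arithmetic illustration only. It is not part of the Stata session, it assumes delta-method standard errors, and the names used are hypothetical.

```python
# (odds ratio, standard error) pairs copied from the two outputs above
mean_score = {
    "gonn":   (0.9495978, 0.2856096),
    "contr":  (0.0943838, 0.0176643),
    "sexptn": (2.099286,  0.4938943),
    "chlam":  (2.471606,  0.7808384),
}
complete_case = {
    "gonn":   (0.7445515, 0.3132037),
    "contr":  (0.1098308, 0.0303352),
    "sexptn": (1.93898,   0.7101447),
    "chlam":  (2.47682,   0.7576623),
}


def log_scale_se(odds_ratio, se_or):
    """Standard error of the log odds ratio, assuming a delta-method se_or."""
    return se_or / odds_ratio


# Ratio below 1 means the mean score estimate is more precise
ratios = {}
for name in mean_score:
    ms = log_scale_se(*mean_score[name])
    cc = log_scale_se(*complete_case[name])
    ratios[name] = ms / cc
    print(f"{name:>7}: mean score {ms:.3f}, complete case {cc:.3f}, "
          f"ratio {ratios[name]:.2f}")
```

On this scale the mean score standard errors for gonn, contr, and sexptn are clearly smaller than the complete-case ones, while the standard error for chlam, the only covariate collected at the second stage, is about the same in the two analyses.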

References

Reilly, M. 1996. Optimal sampling strategies for two-stage studies. American Journal of Epidemiology 143: 92-100.

Reilly, M. and M. S. Pepe. 1995. A mean score method for missing and auxiliary covariate data in regression models. Biometrika 82: 299-314.

sg157 Predicted values calculated from linear or logistic regression models

Joanne M. Garrett, University of North Carolina

Abstract: The program predcalc for easily calculating predicted values and confidence intervals from linear or logistic regression
model estimates for specified values of the X variables is introduced and illustrated.

Keywords: regression models, predicted values.

Syntax

predcalc yvar, xvar(xvarlist) [ level(#) model linear ]
