Stata Technical Bulletin
Statistics calculated by the i option are stored in variables named presid, hat, stpresid, etc. Here is a listing of the
additional diagnostic variables created.
logindex = Logit; Index value
sepred = Standard error of index
pred = Probability of success (1)
mpred = Prob of covariate pattern success
presid = Pearson Residual
stpresid = Standardized Pearson Residual
hat = Hat matrix diagonal
dev = Deviance
cook = Cook's distance
deltad = Change in Deviance
deltax = Change in Pearson chi-square
deltab = Difference in coefficient due to
deletion of observation and others
The formulas for each are given below, although the interested reader is directed to Hosmer and Lemeshow (1989) for a
detailed discussion. Also see Hamilton (1992).
Stored in presid is $r_j = (y_j - m_j\,\mathrm{pred}_j)/\sqrt{m_j\,\mathrm{pred}_j(1 - \mathrm{pred}_j)}$, where $j$ represents the observation number, $m_j$ the number
of observations sharing $j$'s covariate pattern, and $y_j$ the number of positive responses within the covariate pattern of which $j$
is a member. $\mathrm{pred}_j$ is the predicted probability of a positive outcome. Note that this residual is the same for all observations
sharing the same covariate pattern.
mpred contains $m_j\,\mathrm{pred}_j$, which is the expected number of positive responses for observations sharing $j$'s covariate pattern.
hat contains $h_j = (\mathrm{sepred}_j^2)(m_j\,\mathrm{pred}_j)(1 - \mathrm{pred}_j)$, where sepred represents the standard error of index.
stpresid contains $r_{sj} = r_j/\sqrt{1 - h_j}$.
dev contains
$$d_j = \pm\sqrt{2\left[\,y_j \ln\!\left(\frac{y_j}{m_j\,\mathrm{pred}_j}\right) + (m_j - y_j)\ln\!\left(\frac{m_j - y_j}{m_j(1 - \mathrm{pred}_j)}\right)\right]}$$
The sign of the deviance residual is identical to the sign of $(y_j - m_j\,\mathrm{pred}_j)$. For observations that are the only member
of their covariate pattern ($m_j = 1$), $d_j = \sqrt{-2\ln \mathrm{pred}_j}$ for observed positive responses and $d_j = -\sqrt{-2\ln(1 - \mathrm{pred}_j)}$ for observed negative responses.
cook contains $r_j^2 h_j/(1 - h_j)$.
deltax contains $r_j^2/(1 - h_j) = r_{sj}^2$.
deltad contains $d_j^2 + r_j^2 h_j/(1 - h_j)$.
deltab contains $r_j^2 h_j/(1 - h_j)^2$.
The Pearson $\chi^2$ statistic is $\sum_j r_j^2$ and the deviance statistic is $\sum_j d_j^2$.
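The formulas above can be collected into a short numerical sketch. The Python function below is only an illustration of the arithmetic, not the logiodd2 implementation; the function name pattern_diagnostics and its argument names are hypothetical, and it assumes the caller already has y, m, pred, and sepred for one covariate pattern.

```python
import math

def pattern_diagnostics(y, m, pred, sepred):
    """Diagnostics for one covariate pattern (illustrative sketch only).

    y      -- number of positive responses in the pattern
    m      -- number of observations sharing the pattern
    pred   -- predicted probability of a positive outcome
    sepred -- standard error of the index (linear predictor)
    """
    expected = m * pred                      # mpred as defined in the text
    variance = m * pred * (1.0 - pred)

    presid = (y - expected) / math.sqrt(variance)
    hat = sepred ** 2 * variance
    stpresid = presid / math.sqrt(1.0 - hat)

    def xlog(a, b):
        # a * ln(a / b), with the usual convention that 0 * ln(0) = 0
        return 0.0 if a == 0 else a * math.log(a / b)

    d2 = 2.0 * (xlog(y, expected) + xlog(m - y, m * (1.0 - pred)))
    # the deviance residual takes the sign of (y - m * pred)
    dev = math.copysign(math.sqrt(max(d2, 0.0)), y - expected)

    cook = presid ** 2 * hat / (1.0 - hat)
    return {
        "mpred": expected,
        "presid": presid,
        "hat": hat,
        "stpresid": stpresid,
        "dev": dev,
        "cook": cook,
        "deltax": presid ** 2 / (1.0 - hat),
        "deltad": dev ** 2 + cook,
        "deltab": presid ** 2 * hat / (1.0 - hat) ** 2,
    }

# Made-up inputs: a single positive response in a pattern of size one.
diag = pattern_diagnostics(y=1, m=1, pred=0.2, sepred=0.5)
```

With these made-up inputs, deltax equals the square of stpresid, and dev reduces to the single-observation form given above for a positive response.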
Hosmer and Lemeshow suggest the following four diagnostic plots:
. gr deltax pred, xlab ylab yline(4)
. gr deltad pred, xlab ylab yline(4)
. gr deltab pred, xlab ylab yline(1)
. gr deltax pred=deltab, xlab ylab yline(4)
Observations or covariate patterns whose values exceed yline are considered significantly influential.
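The same screening can be done numerically rather than by eye. The sketch below applies the yline cutoffs used in the plots (4 for deltax and deltad, 1 for deltab) to a list of records; the function name flag_influential and the sample values are hypothetical and serve only to illustrate the rule.

```python
# Cutoffs taken from the yline() values in the plots above.
CUTOFFS = {"deltax": 4.0, "deltad": 4.0, "deltab": 1.0}

def flag_influential(records, cutoffs=CUTOFFS):
    """Return (index, [names of exceeded diagnostics]) for each flagged record."""
    flagged = []
    for i, rec in enumerate(records):
        exceeded = [name for name, cut in cutoffs.items() if rec[name] > cut]
        if exceeded:
            flagged.append((i, exceeded))
    return flagged

# Made-up diagnostic values for three covariate patterns.
sample = [
    {"deltax": 0.8, "deltad": 1.1, "deltab": 0.05},
    {"deltax": 5.2, "deltad": 4.6, "deltab": 0.30},
    {"deltax": 2.0, "deltad": 2.5, "deltab": 1.40},
]
flagged = flag_influential(sample)
```

A record is reported once, together with every diagnostic on which it exceeds its cutoff, so a pattern that is extreme on several measures stands out.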
An Example
Hosmer and Lemeshow present a full model example based on a study of low birth-weight babies. The following is part
of the output.
. logiodd2 low age race2 race3 smoke ht ui lwd ptd inter1 inter2, i
Number of Predictors = 10
Number of Non-Missing Obs = 189
Number of Covariate Patterns = 128
Pearson X2 Statistic = 137.7503
P>chi2(117) = 0.0923
Deviance = 147.3371
P>chi2(117) = 0.0303
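As a rough check on the output above, the reported p-values can be recomputed from the statistics and their degrees of freedom. The sketch below uses the closed-form finite series for the upper tail of a chi-square distribution with integer degrees of freedom; the function name chi2_upper_tail is hypothetical. The reading that 117 equals the number of covariate patterns less the number of estimated coefficients including the constant (128 - 11) follows the usual Hosmer and Lemeshow convention and is an assumption here, since the output does not state the rule.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with integer df exceeds x), via the finite series form."""
    chi = math.sqrt(x)
    z = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)  # normal density at chi
    if df % 2 == 1:
        # odd df: 2*Q(chi) + 2*Z(chi) * sum of chi^(2r-1) / (1*3*...*(2r-1))
        total = 0.0
        term = chi
        for r in range(1, (df - 1) // 2 + 1):
            if r > 1:
                term *= x / (2 * r - 1)
            total += term
        return math.erfc(chi / math.sqrt(2.0)) + 2.0 * z * total
    # even df: sqrt(2*pi)*Z(chi) * (1 + sum of chi^(2r) / (2*4*...*(2r)))
    total = 1.0
    term = 1.0
    for r in range(1, (df - 2) // 2 + 1):
        term *= x / (2 * r)
        total += term
    return math.sqrt(2.0 * math.pi) * z * total

patterns, predictors = 128, 10
df = patterns - (predictors + 1)              # assumed rule; gives 117

p_pearson = chi2_upper_tail(137.7503, df)     # compare with the reported 0.0923
p_deviance = chi2_upper_tail(147.3371, df)    # compare with the reported 0.0303
```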