Stata Technical Bulletin
Statistics calculated by the i option are stored in variables named presid, hat, stpresid, etc. Here is a listing of the
additional diagnostic variables created.
logindex = Logit; Index value
sepred = Standard error of index
pred = Probability of success (1)
mpred = Prob of covariate pattern success
presid = Pearson Residual
stpresid = Standardized Pearson Residual
hat = Hat matrix diagonal
dev = Deviance
cook = Cook's distance
deltad = Change in Deviance
deltax = Change in Pearson chi-square
deltab = Difference in coefficient due to
deletion of observation and others
The formulas for each are given below, although the interested reader is directed to Hosmer and Lemeshow (1989) for a
detailed discussion. Also see Hamilton (1992).
Stored in presid is $r_j = (y_j - m_j\,\mathrm{pred}_j)/\sqrt{m_j\,\mathrm{pred}_j(1 - \mathrm{pred}_j)}$, where j represents the observation number, $m_j$ the number
of observations sharing j's covariate pattern, and $y_j$ the number of positive responses within the covariate pattern of which j
is a member. $\mathrm{pred}_j$ is the predicted probability of a positive outcome. Note that this residual is the same for all observations
sharing the same covariate pattern.
mpred contains $m_j\,\mathrm{pred}_j$, which is the expected number of positive responses for observations sharing j's covariate pattern.
hat contains $h_j = (\mathrm{sepred}_j^2)(m_j\,\mathrm{pred}_j)(1 - \mathrm{pred}_j)$, where sepred represents the standard error of the index.
stpresid contains $rs_j = r_j/\sqrt{1 - h_j}$.
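The three quantities above can be checked by hand for a single covariate pattern. The following is a minimal Python sketch of the formulas as stated, not the code used by logiodd2; the function name and the input values are hypothetical and chosen only for illustration.

```python
import math

def pearson_diagnostics(y, m, pred, sepred):
    """Return (presid, hat, stpresid) for one covariate pattern.

    y      : number of positive responses in the pattern
    m      : number of observations sharing the pattern
    pred   : predicted probability of a positive outcome
    sepred : standard error of the index (linear predictor)
    """
    variance = m * pred * (1.0 - pred)            # m_j pred_j (1 - pred_j)
    presid = (y - m * pred) / math.sqrt(variance)  # Pearson residual r_j
    hat = sepred ** 2 * variance                   # hat matrix diagonal h_j
    stpresid = presid / math.sqrt(1.0 - hat)       # standardized residual rs_j
    return presid, hat, stpresid

# Hypothetical pattern: 4 observations, 3 positive responses (not from the bulletin).
presid, hat, stpresid = pearson_diagnostics(y=3, m=4, pred=0.5, sepred=0.3)
print(presid, hat, stpresid)
```

Because $h_j$ lies between 0 and 1, the standardized residual is never smaller in magnitude than the Pearson residual.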
dev contains
$$d_j = \pm\sqrt{2\left[\,y_j \ln\left(\frac{y_j}{m_j\,\mathrm{pred}_j}\right) + (m_j - y_j)\ln\left(\frac{m_j - y_j}{m_j(1 - \mathrm{pred}_j)}\right)\right]}$$
The sign of the deviance residual is identical to the sign of $(y_j - m_j\,\mathrm{pred}_j)$. For a covariate pattern that contains only a single
observation ($m_j = 1$), $d_j = \sqrt{-2\ln \mathrm{pred}_j}$ for observed positive responses and $d_j = -\sqrt{-2\ln(1 - \mathrm{pred}_j)}$ for observed negative responses.
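The deviance residual and its single-observation special cases can be illustrated with a short Python sketch. This is an illustration of the standard formula under the usual convention that a term with a zero count contributes nothing (0 ln 0 = 0); the function name is hypothetical and the code is not taken from logiodd2.

```python
import math

def deviance_residual(y, m, pred):
    """Deviance residual d_j for a covariate pattern with y positives out of m."""
    def term(count, expected):
        # Convention (an assumption stated in the lead-in): 0 * ln(0) is taken as 0.
        return count * math.log(count / expected) if count > 0 else 0.0

    inner = 2.0 * (term(y, m * pred) + term(m - y, m * (1.0 - pred)))
    magnitude = math.sqrt(max(inner, 0.0))  # guard against tiny negative rounding error
    # The residual takes the sign of (y - m * pred).
    return math.copysign(magnitude, y - m * pred)

# Single-observation patterns reduce to the two special cases given in the text.
d_positive = deviance_residual(y=1, m=1, pred=0.8)  # sqrt(-2 ln pred)
d_negative = deviance_residual(y=0, m=1, pred=0.8)  # -sqrt(-2 ln(1 - pred))
print(d_positive, d_negative)
```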
cook contains $r_j^2 h_j/(1 - h_j)$.
deltax contains $r_j^2/(1 - h_j) = rs_j^2$.
deltad contains $d_j^2 + r_j^2 h_j/(1 - h_j)$.
deltab contains $r_j^2 h_j/(1 - h_j)^2$.
The Pearson $\chi^2$ statistic is $\sum r_j^2$ and the deviance statistic is $\sum d_j^2$.
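The influence statistics are simple functions of the Pearson residual, the deviance residual, and the hat diagonal. The Python sketch below follows the formulas as stated in the text; the function names and inputs are hypothetical. It assumes the two summary statistics are summed with one residual per covariate pattern, which is a reading of the text rather than something it states explicitly.

```python
def influence_statistics(presid, dev, hat):
    """Return cook, deltax, deltad, and deltab for one covariate pattern."""
    r2 = presid ** 2
    cook = r2 * hat / (1.0 - hat)
    return {
        "cook": cook,
        "deltax": r2 / (1.0 - hat),             # equals the squared standardized residual
        "deltad": dev ** 2 + cook,              # change in deviance
        "deltab": r2 * hat / (1.0 - hat) ** 2,  # change in coefficients
    }

def fit_statistics(presids, devs):
    """Pearson chi-square and deviance as sums of squared residuals.

    Assumption: each list holds one residual per covariate pattern.
    """
    pearson_chi2 = sum(r * r for r in presids)
    deviance = sum(d * d for d in devs)
    return pearson_chi2, deviance

# Hypothetical values, not taken from the bulletin.
stats = influence_statistics(presid=1.0, dev=1.2, hat=0.09)
print(stats)
```

Note that deltab equals cook divided by $(1 - h_j)$, so the two differ noticeably only for high-leverage patterns.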
Hosmer and Lemeshow suggest the following four diagnostic plots:
. gr deltax pred, xlab ylab yline(4)
. gr deltad pred, xlab ylab yline(4)
. gr deltab pred, xlab ylab yline(1)
. gr deltax pred=deltab, xlab ylab yline(4)
Observations or covariate patterns whose values exceed yline are considered significantly influential.
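The same screening can be done numerically instead of by eye. The Python sketch below flags covariate patterns whose statistics exceed the yline values used in the plot commands (4 for deltax and deltad, 1 for deltab); the function name, the data layout, and the example values are hypothetical.

```python
# Cutoffs taken from the yline() values in the plot commands.
CUTOFFS = {"deltax": 4.0, "deltad": 4.0, "deltab": 1.0}

def flag_influential(patterns):
    """Return (index, exceeded statistics) for each pattern above any cutoff."""
    flagged = []
    for index, stats in enumerate(patterns):
        exceeded = [name for name, cutoff in CUTOFFS.items() if stats[name] > cutoff]
        if exceeded:
            flagged.append((index, exceeded))
    return flagged

# Hypothetical values for three covariate patterns (not taken from the bulletin).
patterns = [
    {"deltax": 0.8, "deltad": 1.1, "deltab": 0.05},
    {"deltax": 5.2, "deltad": 4.6, "deltab": 0.30},
    {"deltax": 2.0, "deltad": 2.4, "deltab": 1.40},
]
flagged = flag_influential(patterns)
print(flagged)
```

A pattern can be flagged on fit (deltax, deltad) without being flagged on coefficient change (deltab), and vice versa, which is why the plots are examined separately.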
An Example
Hosmer and Lemeshow present a full model example based on a study of low birth-weight babies. The following is part
of the output.
. logiodd2 low age race2 race3 smoke ht ui lwd ptd inter1 inter2, i
Number of Predictors = 10
Number of Non-Missing Obs = 189
Number of Covariate Patterns = 128
Pearson X2 Statistic = 137.7503
P>chi2(117) = 0.0923
Deviance = 147.3371
P>chi2(117) = 0.0303
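The reported degrees of freedom and p-values can be reproduced independently. The Python sketch below assumes the degrees of freedom are the number of covariate patterns minus the number of predictors minus one for the constant (128 - 10 - 1 = 117, which matches the output), and it evaluates the chi-square upper tail with a textbook series and continued-fraction expansion of the incomplete gamma function. All function names are hypothetical; this is not how logiodd2 computes the values.

```python
import math

def residual_df(n_patterns, n_predictors):
    # Assumption: one extra parameter for the constant term.
    return n_patterns - (n_predictors + 1)

def _lower_gamma_series(a, x, eps=1e-14, max_iter=10000):
    """Regularized lower incomplete gamma P(a, x) by series; suited to x < a + 1."""
    term = 1.0 / a
    total = term
    ap = a
    for _ in range(max_iter):
        ap += 1.0
        term *= x / ap
        total += term
        if abs(term) < abs(total) * eps:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def _upper_gamma_cf(a, x, eps=1e-14, max_iter=10000):
    """Regularized upper incomplete gamma Q(a, x) by continued fraction; suited to x >= a + 1."""
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return math.exp(-x + a * math.log(x) - math.lgamma(a)) * h

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0.0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    if half_x < a + 1.0:
        return 1.0 - _lower_gamma_series(a, half_x)
    return _upper_gamma_cf(a, half_x)

df = residual_df(n_patterns=128, n_predictors=10)
print(df, chi2_sf(137.7503, df), chi2_sf(147.3371, df))
```

The two tail probabilities should agree with the P>chi2(117) lines in the output to the precision shown there.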