Stata Technical Bulletin
sg42.1
Extensions to the regpred command
Mead Over, World Bank, [email protected]
regpred2 is a superset of Joanne Garrett’s useful regpred command, which appeared in STB-26 (July 1995) as entry sg42.
regpred2 does everything that regpred does and adds four options: inst, one, zero, and level.
The syntax for regpred2 is
regpred2 yvar xvar [if exp], from(#) to(#) [ inc(#)
adjust(covlist) inst(ivlist) one(varlist) zero(varlist)
level(#) poly(#) nomodel nolist noplot graph_options ]
The inst option adds the capability to perform instrumental variable estimation. If the inst option is specified with a list
of instrumental variables, regpred2 passes that list to the regress command, which uses it to produce instrumental variable
estimates in the conventional manner documented in the Stata manual. The predictions and forecast interval are then
calculated and presented using the instrumental variable (or two-stage least squares) estimates instead of the ordinary least squares
estimates.
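For readers working in later Stata releases, the kind of estimate that inst() requests can be sketched with the built-in ivregress command. This is not part of regpred2 and does not show how regpred2 calls regress internally. The auto dataset, the choice of mpg as the endogenous regressor, and the instruments displacement and gear_ratio are all assumptions made purely for illustration.

```stata
* Illustrative only: variables come from Stata's shipped auto dataset,
* not from the sg42 cholesterol example.
sysuse auto, clear

* Two-stage least squares: mpg is treated as endogenous, weight as exogenous;
* displacement and gear_ratio are the (assumed) instruments.
ivregress 2sls price weight (mpg = displacement gear_ratio)

* Linear predictions computed from the instrumental-variable coefficients
predict double price_iv, xb
```

The predictions in price_iv use the two-stage least squares coefficients rather than the ordinary least squares ones, which is the substitution the inst option makes.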
Examples of the one() and zero() options
The regpred command includes the option adjust(covlist), which allows the user to specify a list of covariates to
be set to their means in computing the predicted values. In applications where some of the right-hand-side variables are dummy
variables representing categorical variables, it is often more useful to compute predictions for specific values of those dummy variables.
Using one of the examples supplied in sg42, suppose that the regression is of serum cholesterol on age and race. The command
. regpred2 chi age, adj(race) from(40) to(80) poly(2)
will present predictions of the (quadratic) relationship between age and cholesterol for the person of average race in the data, just
as the original regpred would. However, for various reasons this may be of less interest than the separate curves for race==0
and race==1. These separate curves can be produced by the commands:
. regpred2 chi age, adj(race) from(40) to(80) poly(2) zero(race)
. regpred2 chi age, adj(race) from(40) to(80) poly(2) one(race)
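The difference between holding a dummy at its mean and setting it to 0 or 1 can be checked by hand with built-in commands. The following is a minimal sketch that assumes Stata's shipped auto dataset, with foreign standing in for race and mpg for age. It reproduces the arithmetic only, not the regpred2 output or its forecast intervals.

```stata
* Illustrative only: foreign plays the role of the 0/1 dummy (race),
* mpg the role of the continuous predictor (age).
sysuse auto, clear
generate mpg2 = mpg^2
regress price mpg mpg2 foreign

* Sample mean of the dummy, the value adjust() would plug in
summarize foreign, meanonly
local fbar = r(mean)

* Quadratic prediction at mpg = 20, excluding the dummy's contribution
local base = _b[_cons] + _b[mpg]*20 + _b[mpg2]*20^2

display "dummy at its mean: " `base' + _b[foreign]*`fbar'
display "dummy set to 0:    " `base'
display "dummy set to 1:    " `base' + _b[foreign]
```

The first displayed value lies between the other two because the mean of a 0/1 dummy is the sample proportion of ones.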
It might be instructive to superimpose the two graphs in the same figure. regpred2 will not superimpose the two separate
graphs, but the user can do this with the Stata Graphics Editor (STAGE) program available separately from Stata. Alternatively,
the predicted values from the two executions of regpred2 can be retained and assembled using an explicit graph command.
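One way to assemble the two curves with an explicit graph command is sketched below in current Stata syntax, which differs from the graph syntax available when regpred2 was written. The sketch does not assume anything about how regpred2 stores its predictions. It refits the model with regress on the shipped auto dataset (foreign standing in for race, mpg for age) and builds the two sets of predictions directly. The variable names yhat0, yhat1, and foreign_orig are hypothetical.

```stata
* Illustrative only: auto dataset, foreign as the dummy, mpg as the x variable.
sysuse auto, clear
generate mpg2 = mpg^2
regress price mpg mpg2 foreign

* Keep the observed dummy, then predict with it forced to 0 and to 1
generate byte foreign_orig = foreign
replace foreign = 0
predict double yhat0, xb
replace foreign = 1
predict double yhat1, xb

* Restore the observed values of the dummy
replace foreign = foreign_orig
drop foreign_orig

* Superimpose the two predicted curves in one figure
sort mpg
twoway (line yhat0 mpg) (line yhat1 mpg), ///
    legend(order(1 "foreign==0" 2 "foreign==1")) ytitle("Predicted price")
```

Because the model has no interaction terms, the two curves are parallel and separated by the coefficient on the dummy.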
A categorical variable might have more than two values. For example, there might be three “races” in the data. In this case
the three races would be represented by two dummy variables, as follows:
                     Value of dummy variable
Race of subject      racew      raceb
white                  1          0
black                  0          1
asian                  0          0
The third dummy, racea, must be omitted from the regression in order to avoid perfect multicollinearity. With this arrangement
of the data, regpred2 can be used to compute predictions for each of the three races with the following commands.
For white subjects, the command would be
. regpred2 chi age, adj(race) from(40) to(80) one(racew) zero(raceb)