16
Stata Technical Bulletin
STB-20
Categorical variable interactions
i.varname1*i.varname2 creates the dummy variables associated with the interaction of the categorical variables varname1
and varname2. The identification rules (which categories are omitted) are the same as for i.varname. For instance, assume
agegrp takes on four values and race takes on three values. Typing
. xi: regress y i.agegrp*i.race
results in the model:
y = a + b2 Iagegr_2 + b3 Iagegr_3 + b4 Iagegr_4 (agegrp dummies)
    + c2 Irace_2 + c3 Irace_3 (race dummies)
    + d22 IaXr_2_2 + d23 IaXr_2_3 + d32 IaXr_3_2 + d33 IaXr_3_3 (agegrp*race dummies)
    + d42 IaXr_4_2 + d43 IaXr_4_3
    + u
That is,
. xi: regress y i.agegrp*i.race
results in the same model as typing:
. xi: regress y i.agegrp i.race i.agegrp*i.race
While there are lots of other ways the interaction could have been parameterized, this method has the advantage that one can
test the joint significance of the interactions by typing:
. testparm IaXr*
Returning to the estimation step, whether you specify i.agegrp*i.race or i.race*i.agegrp makes no difference (other than
in the names given to the interaction terms; in the first case, the names will begin with IaXr; in the second, IrXa). Thus,
. xi: regress y i.race*i.agegrp
estimates the same model.
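The construction of the interaction terms can be sketched outside Stata. The following Python fragment is only an illustration, not the xi implementation: it assumes agegrp is coded 1 to 4 and race 1 to 3, that the lowest category of each variable is the omitted one (as in the model above), and that each interaction dummy is the product of the two corresponding main-effect dummies. The function names and the index layout of the IrXa names are hypothetical.

```python
def interaction_terms(levels_a, levels_b, prefix):
    """Names of the interaction dummies, omitting the first level of each variable."""
    return [f"{prefix}_{i}_{j}" for i in levels_a[1:] for j in levels_b[1:]]


def interaction_row(a, b, levels_a, levels_b, prefix):
    """Interaction dummies for one observation with categories a and b.

    Each term is the product of two 0/1 main-effect dummies.
    """
    return {
        f"{prefix}_{i}_{j}": int(a == i) * int(b == j)
        for i in levels_a[1:]
        for j in levels_b[1:]
    }


agegrp_levels = [1, 2, 3, 4]  # assumed coding
race_levels = [1, 2, 3]       # assumed coding

names = interaction_terms(agegrp_levels, race_levels, "IaXr")
# Reversing the order changes the names but not the number of terms.
reversed_names = interaction_terms(race_levels, agegrp_levels, "IrXa")

# An observation in agegrp 4 and race 2 switches on exactly one term.
row = interaction_row(4, 2, agegrp_levels, race_levels, "IaXr")
print(names)
print(row)
```

Under these assumptions both orderings yield six interaction terms, which is why the two specifications estimate the same model.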
You may also include multiple interactions simultaneously:
. xi: regress y i.agegrp*i.race i.agegrp*i.sex
The model estimated is
y = a + b2 Iagegr_2 + b3 Iagegr_3 + b4 Iagegr_4 (agegrp dummies)
    + c2 Irace_2 + c3 Irace_3 (race dummies)
    + d22 IaXr_2_2 + d23 IaXr_2_3 + d32 IaXr_3_2 + d33 IaXr_3_3 (agegrp*race dummies)
    + d42 IaXr_4_2 + d43 IaXr_4_3
    + e2 Isex_2 (sex dummy)
    + f22 IaXs_2_2 + f32 IaXs_3_2 + f42 IaXs_4_2 (agegrp*sex dummies)
    + u
Note that the agegrp dummies are (correctly) included only once.
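A rough way to see why the agegrp dummies appear only once is to expand each interaction into its main effects and product terms and to skip any column already in the model. The Python sketch below is an illustration under stated assumptions, not the xi code: the level codings (agegrp 1 to 4, race 1 to 3, sex 1 to 2), the omission of the lowest category, and the helper names are all hypothetical.

```python
def main_dummies(stub, levels):
    """Main-effect dummy names, omitting the first (lowest) level."""
    return [f"{stub}_{k}" for k in levels[1:]]


def expand(terms, levels, stubs, prefixes):
    """Expand a list of (a, b) interactions into column names.

    Each interaction contributes both sets of main effects plus the
    product terms; a column already present is not added again.
    """
    columns = []
    for a, b in terms:
        products = [
            f"{prefixes[(a, b)]}_{i}_{j}"
            for i in levels[a][1:]
            for j in levels[b][1:]
        ]
        candidates = (
            main_dummies(stubs[a], levels[a])
            + main_dummies(stubs[b], levels[b])
            + products
        )
        for name in candidates:
            if name not in columns:
                columns.append(name)
    return columns


levels = {"agegrp": [1, 2, 3, 4], "race": [1, 2, 3], "sex": [1, 2]}  # assumed codings
stubs = {"agegrp": "Iagegr", "race": "Irace", "sex": "Isex"}
prefixes = {("agegrp", "race"): "IaXr", ("agegrp", "sex"): "IaXs"}

cols = expand([("agegrp", "race"), ("agegrp", "sex")], levels, stubs, prefixes)
print(cols)
```

With these inputs the expansion gives 15 columns: 3 agegrp dummies, 2 race dummies, 6 agegrp*race terms, 1 sex dummy, and 3 agegrp*sex terms.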
Interactions with continuous variables
i.varname1*varname2 (as distinguished from i.varname1*i.varname2, note the second i.) specifies an interaction of a
categorical variable with a continuous variable. For instance,
. xi: regress y i.agegr*wgt
results in the model:
y = a + b2 Iagegr_2 + b3 Iagegr_3 + b4 Iagegr_4 (agegrp dummies)
    + c wgt (continuous wgt effect)
    + d2 IaXwgt_2 + d3 IaXwgt_3 + d4 IaXwgt_4 (agegrp*wgt interactions)
    + u
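For a categorical-by-continuous interaction, each interaction variable can be thought of as the group dummy multiplied by the continuous variable. The Python sketch below illustrates that construction for one observation; it is not the xi implementation, and it assumes agegrp is coded 1 to 4 with the lowest category omitted. The function name is hypothetical.

```python
def categorical_by_continuous(agegrp, wgt, levels):
    """Regressors for one observation under i.agegrp*wgt.

    Returns the agegrp dummies, wgt itself, and the products
    dummy * wgt, omitting the first (lowest) agegrp level.
    """
    row = {"wgt": wgt}
    for k in levels[1:]:
        dummy = int(agegrp == k)
        row[f"Iagegr_{k}"] = dummy
        row[f"IaXwgt_{k}"] = dummy * wgt
    return row


# An observation in agegrp 3 weighing 150.0 (illustrative values).
row = categorical_by_continuous(3, 150.0, [1, 2, 3, 4])
print(row)
```

Only IaXwgt_3 carries the weight for this observation, so in the model above the wgt slope is c in the omitted group and c plus the corresponding interaction coefficient in each other group.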
A variation on this notation, using | rather than *, omits the agegrp dummies. Typing
. xi: regress y i.agegr|wgt