Stata Technical Bulletin


STB-20


3. Estimate the model on the remaining k - 1 indicator variables.

It is this procedure that xi automates.

Using xi: Overview

xi provides a convenient way to include dummy or indicator variables when estimating a model (say with regress,
logistic, etc.). For instance, assume the categorical variable agegrp contains 1 for ages 20-24, 2 for ages 25-39, 3 for ages
40-44, etc. Typing

. xi: logistic outcome weight i.agegrp bp

estimates a logistic regression of outcome on weight, dummies for each agegrp category, and bp. That is, xi searches out
and expands terms starting with “i.” but leaves the other variables alone. xi will expand both numeric and string categorical
variables, so if you had a string variable race containing “white,” “black,” and “other,” typing

. xi: logistic outcome weight bp i.agegrp i.race

would include indicator variables for the race group as well.

The i. indicator variables xi expands may appear anywhere in the varlist, so

. xi: logistic outcome i.agegrp weight i.race bp

would estimate the same model.

You can also create interactions of categorical variables; typing

. xi: logistic outcome weight bp i.agegrp*i.race

estimates a model including indicator variables for all agegrp and race combinations.

You can interact dummy variables with continuous variables:

. xi: logistic outcome bp i.agegrp*weight i.race
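
One way to read such a term, stated here as an assumption rather than as a description of xi's internals, is that it stands for the main effects of agegrp and weight together with the product of each agegrp dummy and weight. A by-hand sketch under that assumption follows. It supposes agegrp takes the values 1 through 4, leaves out i.race for brevity, and uses variable names of our own choosing (ag1 through ag4 and ag2w through ag4w), not names xi would create.

. * one indicator per observed value of agegrp: ag1, ag2, ag3, ag4
. tabulate agegrp, generate(ag)

. * products of each retained dummy with the continuous variable
. generate ag2w = ag2*weight
. generate ag3w = ag3*weight
. generate ag4w = ag4*weight

. * main effects plus products; ag1 (smallest value) is the omitted group
. logistic outcome bp weight ag2 ag3 ag4 ag2w ag3w ag4w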

And, of course, you can include multiple interactions:

. xi: logistic outcome bp i.agegrp*weight i.agegrp*i.race

We will now back up and consider each of xi’s features in detail.

Indicator variables for simple effects

When you type ‘i.varname’, xi internally tabulates varname (which may be a string or a numeric variable) and creates
indicator (dummy) variables for each observed value, omitting the indicator for the smallest value. For instance, say agegrp
takes on the values 1, 2, 3, and 4. Typing

. xi: logistic outcome i.agegrp

creates indicator variables named Iagegr_2, Iagegr_3, and Iagegr_4. (xi chooses the names and tries to make them readable;
xi guarantees that the names are unique.) The expanded logistic model then is

. logistic outcome Iagegr_2 Iagegr_3 Iagegr_4

Afterwards, you can drop the new variables xi leaves behind by typing ‘drop I*’ (note capitalization).
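
To see what xi saves you, here is a sketch of the same model built by hand. It assumes agegrp takes the values 1 through 4. The stub ag and the resulting names ag1 through ag4 are our own choices for illustration, not names xi would create.

. * create one indicator per observed value of agegrp: ag1, ag2, ag3, ag4
. tabulate agegrp, generate(ag)

. * omit ag1, the indicator for the smallest value, and estimate on the rest
. logistic outcome ag2 ag3 ag4

Unlike the dummies xi creates, these are not replaced automatically on the next run; they remain in the data set until you drop them yourself.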

xi provides the following features when you type ‘i.varname’:

1. varname may be string or numeric.

2. Dummy variables are created automatically.

3. By default, the dummy-variable set is identified by dropping the dummy corresponding to the smallest value of the variable
(how to specify otherwise is discussed below).

4. The new dummy variables are left in your data set. You can drop them by typing ‘drop I*’. You do not have to do this;
each time you use the xi prefix or command, any dummies it generated previously are dropped and
new ones are created.

5. The new dummy variables have variable labels so you can determine to what they correspond by typing ‘describe’ or
‘describe I*’.

6. xi may be used with any Stata command (not just logistic).
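
As a sketch of features 4 through 6 together, the session below uses xi with regress rather than logistic. The choice of bp as the dependent variable is ours, made only for illustration; the variable names are those of the examples above.

. * xi works as a prefix to regress just as it does to logistic
. xi: regress bp weight i.agegrp

. * the variable labels show which category each dummy stands for
. describe I*

. * optional: xi would drop and re-create these on its next use anyway
. drop I*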


