Stata Technical Bulletin
The remainder of this insert presents a series of Stata commands I have written to produce p values adjusted for multiple
comparisons. The first command, mcompp, calculates and saves the p values but produces no output. The second command,
mcompr1, reports contrasts that are not significantly different from each other. The third command, mcompr2, displays a report
of all pairwise significant differences.
mcompp: Calculate p values
The syntax of mcompp is
mcompp varlist [, nocons bonferroni(varname) scheffe(varname) sidak(varname) ]
varlist is a list of dummy variables that (in combination with the constant) defines a set of categories. This is a list of variables
as might have been produced with a tabulate, generate() command, and these variables must also appear in the list of
explanatory variables in the most recent estimation command. (All of Stata’s estimation commands store the parameter estimates
and the covariance matrix of estimates. This feature of Stata underlies the design of mcompp.) In the current version of mcompp,
there must be one dummy variable for each category except the default, or omitted, category. The dummy variables must be
coded so that ‘1’ means the category defined is present and ‘0’ means it is absent. This is the standard convention for Boolean
algebra, but it is not the way data are usually received from a survey.
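A minimal sketch of how such dummy variables might be prepared is given here. It assumes a current Stata release and the auto dataset shipped with it, with rep78 as the category variable; the stub r is chosen to match the names r2 through r5 used in the example in this insert. The variable answer and its 1 = yes, 2 = no coding are made up purely to illustrate recoding survey-style data.

```stata
* Assumption: the auto dataset shipped with Stata; rep78 defines the categories
sysuse auto, clear

* One 0/1 dummy per category of rep78: r1, r2, ..., r5
* (r1 can then serve as the omitted, default category)
tabulate rep78, generate(r)

* Hypothetical survey-style item coded 1 = yes, 2 = no
generate byte answer = cond(foreign == 1, 1, 2)

* Recode so that 1 means present and 0 means absent;
* observations with a missing answer stay missing
generate byte yes = (answer == 1) if answer < .
```

Leaving r1 out of the regression makes it the default category, which is the layout mcompp expects when nocons is not specified.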
The options named for the three methods (bonferroni, sidak, and scheffe) specify variables that are to contain the
p values for the Bonferroni, Sidak, and Scheffe methods, respectively.
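For orientation, the Bonferroni and Sidak adjustments can be sketched in their usual textbook form: with m contrasts and an unadjusted p value p, the Bonferroni value is min(1, m*p) and the Sidak value is 1 - (1 - p)^m. Whether mcompp computes exactly these expressions is an assumption; the source does not show its internals. The unadjusted p value below is back-calculated from the ninth Bonferroni value in the example listing in this insert (.052915 divided by 10 contrasts), which is also an assumption. The Scheffe adjustment is omitted because it needs the test statistic and degrees of freedom, which the listing does not report.

```stata
* Assumed: 10 pairwise contrasts (5 categories), as in the example listing
local m = 10

* Assumed unadjusted p value, back-calculated from a Bonferroni value of .052915
local p = .052915/10

* Textbook adjustments (not confirmed to be the code inside mcompp)
scalar bonf_p  = min(1, `m' * `p')
scalar sidak_p = 1 - (1 - `p')^`m'

display "Bonferroni: " scalar(bonf_p)
display "Sidak:      " scalar(sidak_p)
```

Under these assumptions the two results come out at about .0529 and .0517, in line with the ninth row of the listing.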
If nocons is specified, then the variable list is assumed to contain an exhaustive list of categories to be compared. Otherwise,
the list is assumed to omit one dummy variable corresponding to the default category.
The nocons option can be used even if it was not used in the original regression. This feature allows you to perform
multiple comparisons on a subset of the categories represented by the dummy variables in the regression. For example, I recently
needed to analyze differences between 34 health plans, but I only wanted to show contrasts within market areas. For all regions
except the region that contained the default plan, I used the nocons option and the dummy variables standing for all the plans
in that region.
We can illustrate mcompp by applying it to our auto price regression:
. mcompp r2 r3 r4 r5, scheffe(schpl) bonferr(bonpl) sidak(sidpl)
. list schpl bonpl sidpl in 1/10
          schpl       bonpl       sidpl
  1.   .9926566           1    .9999364
  2.   .9786549           1    .9992037
  3.   .9997282           1           1
  4.   .8092963           1    .9075922
  5.   .8137907           1    .9117512
  6.   .7040055           1    .7921936
  7.   .3465602    .3683796    .3129417
  8.   .1976694    .1536581    .1434571
  9.    .092699     .052915    .0516726
 10.   .5256634    .7744335    .5533879
The contrasts are listed in the following order: r2 vs. default, r3 vs. default, r3 vs. r2, r4 vs. default, r4 vs. r3, etc.
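The ordering can be sketched as a pair of nested loops. This is an inferred pattern, not something the source confirms: it assumes each category is compared first with the default and then with every category before it in the variable list. In particular, the abbreviated list above does not show r4 vs. r2, which this sketch places between r4 vs. default and r4 vs. r3. Under that assumption, five categories give ten contrasts, matching the ten rows listed.

```stata
* Assumed ordering: each category vs. the default, then vs. each earlier category
local cats "default r2 r3 r4 r5"
local n : word count `cats'
local k = 0

forvalues i = 2/`n' {
    local a : word `i' of `cats'
    local last = `i' - 1
    forvalues j = 1/`last' {
        local b : word `j' of `cats'
        local k = `k' + 1
        * Print the contrast number and the pair being compared
        display "`k'. `a' vs. `b'"
    }
}
```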
As you can see, the output from mcompp is difficult to decipher. The next two programs, mcompr1 and mcompr2, display
this output in more readable forms.
mcompr1: Report similar groups
One question we might want to answer after estimating the auto price regression is which groups of repair categories have
similar prices. In this context, “similar” means there are no statistically significant differences between any of the categories
within a group. mcompr1 creates a report of similar groups, using the output from mcompp as its criterion for similarity.
The syntax of mcompr1 is
mcompr1 varname [, cutoff(#) default(name) id(varname) generate(varname) label ]