Stata Technical Bulletin
STB-22
The 5 MPG difference is still highly significant. At this point, we might be tempted to halt the analysis and to rely on the
t test to support the hypothesis that foreign cars get better mileage than domestic cars. Instead, let’s calculate the overlapping
coefficient to see whether the statistically significant difference detected by the t test corresponds to a meaningful difference in
the distributions of MPG between groups.
. overlap mpg, by(foreign)
MLE of overlap:
Variances equal:    0.9313    Dissimilarity Index=1-OVL:  0.0687
Variances UNequal:  0.6421    Dissimilarity Index=1-OVL:  0.3579
When the variance is assumed to be the same in both groups, the overlapping coefficient is .93, indicating substantial overlap
of the two distributions and raising questions about the interpretation of the t test for these data. Note that, for these data, the
overlapping coefficient is sensitive to the assumption of equal variances. When the variances are allowed to differ, the overlapping
coefficient drops to .64, indicating substantially less overlap than the initial calculation.
The immediate version of the program, overlapi, produces the same results as overlap. The immediate version makes it
easy to calculate the overlapping coefficient when the data are not readily available, for instance, when only summary statistics
are reported in a book or research paper.
. overlapi 19.82692 24.77273 4.743297 6.61187 52 22
MLE of overlap:
Variances equal:    0.9313    Dissimilarity Index=1-OVL:  0.0687
Variances UNequal:  0.6421    Dissimilarity Index=1-OVL:  0.3579
An “improved” rank-sum statistic
The Mann-Whitney-Wilcoxon two-sample rank-sum test provides a nonparametric alternative to the t test (signrank [5s];
Moses, Emerson, and Hosseini 1992; Fleiss 1981). There are a variety of equivalent ways of stating the test, but the intuition
behind the test is straightforward. A characteristic is measured in two independent samples; in our example above, we’ve
measured fuel efficiency in a sample of domestic cars and a sample of foreign cars. The null hypothesis states there is no
systematic difference in the characteristic between the two samples. The rank-sum test merges the two samples and sorts them
in order of the characteristic. Under the null hypothesis, the expected sum of the ranks of the observations from the first sample
is equal to the expected sum of the ranks of the observations from the second sample, corrected for any imbalance in sample
sizes. The Mann-Whitney U statistic is a function of the sum of the ranks with a known distribution under the null hypothesis.
A U statistic can be calculated for either sample, and it can be shown that
    U1 + U2 = mn
where Ui is the U statistic for the ith sample and m and n are the numbers of observations in the two samples. By convention,
the smaller of U1 and U2 is the test statistic.
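The identity above can be illustrated with a short do-file sketch that ranks the pooled MPG values and builds a U statistic for each group. This is only a sketch under stated assumptions: it assumes a recent Stata release (sysuse and egen's rank() function do not appear in the session shown in this article), it assumes the shipped auto data match the data used here, and it assumes the common definition Ui = Ri - ni(ni + 1)/2, where Ri is the rank sum and ni the size of sample i. The variable and scalar names are illustrative.

```stata
* Assumption: a recent Stata release with the shipped auto data
sysuse auto, clear

* Rank mpg in the pooled sample; tied values receive the average rank
egen double rk = rank(mpg)

* Rank sum and sample size for the foreign cars
quietly summarize rk if foreign == 1
scalar R1 = r(sum)
scalar n1 = r(N)

* Rank sum and sample size for the domestic cars
quietly summarize rk if foreign == 0
scalar R2 = r(sum)
scalar n2 = r(N)

* U statistics (assumed definition: rank sum minus its minimum possible value)
scalar U1 = R1 - n1*(n1 + 1)/2
scalar U2 = R2 - n2*(n2 + 1)/2

* The two U statistics should sum to the product of the sample sizes
scalar Usum = U1 + U2
scalar mn = n1*n2
display "U1 = " U1 "   U2 = " U2 "   U1 + U2 = " Usum "   mn = " mn
```

If the identity holds, the last two numbers displayed are equal, and the smaller of U1 and U2 is the value used as the test statistic.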
Stata’s ranksum command reports the p value of the rank-sum test. As with the t test, ranksum provides a measure of the
statistical significance but not the substantive importance of the difference in the locations of the distributions of the characteristic
between the two samples.
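ranksum2 augments Stata’s ranksum command by also reporting U/mn. From the formula above, U1 and U2 are each expected to
equal mn/2 under the null hypothesis, so U/mn should be close to 1/2. Because U is the smaller of U1 and U2, 0 ≤ U/mn ≤ 1/2.
Thus, this ratio provides an intuitive measure of how much the data deviate from the null hypothesis: the closer it is to 0, the
greater the separation between the two samples.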
We use the automobile data again to demonstrate ranksum2:
. use \stata\auto, clear
(1978 Automobile Data)
. ranksum2 mpg, by(foreign)
Test: Equality of medians (Two-Sample Wilcoxon Rank-Sum)
Sum of Ranks:  1086.5 (foreign == 1)
Expected Sum:  825
z-statistic    3.09
Prob > |z|     0.0020
U/mn           .27141608
Every line but the last is identical to that produced by Stata’s ranksum command. The last line reports the ratio of the U statistic
to the product of the numbers of observations in each of the two samples. ranksum2 also augments the stored results supplied
by ranksum: S_5 contains the Mann-Whitney U statistic and S_6 contains the ratio U/mn.
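The reported ratio can be checked by hand from the numbers printed above. The following is a minimal sketch, assuming the common definition U = R - n(n + 1)/2 for a sample with rank sum R and size n (the formula ranksum2 uses internally is not spelled out here), and taking the group sizes, 52 domestic and 22 foreign cars, from the overlapi example.

```stata
* Expected rank sum for the 22 foreign cars among 74 cars in total
display 22*(74 + 1)/2

* U for the foreign cars: observed rank sum minus the minimum possible rank sum
display 1086.5 - 22*(22 + 1)/2

* U for the domestic cars, from the identity U1 + U2 = mn
display 52*22 - 833.5

* Ratio of the smaller U to mn
display 310.5/(52*22)
```

The four results are 825, 833.5, 310.5, and .27141608. The first matches the Expected Sum line and the last matches the U/mn line, which suggests that the assumed definition is consistent with what ranksum2 reports.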