


16   Stata Technical Bulletin   STB-22


     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          2        2.90        2.90
          2 |          8       11.59       14.49
          3 |         30       43.48       57.97
          4 |         18       26.09       84.06
          5 |         11       15.94      100.00
------------+-----------------------------------
      Total |         69      100.00

. regress price r2 r3 r4 r5 length displ weight mpg

      Source |       SS       df       MS               Number of obs =      69
-------------+------------------------------            F(  8,    60) =    6.47
       Model |   267197833     8  33399729.2            Prob > F      =  0.0000
    Residual |   309599125    60  5159985.42            R-square      =  0.4632
-------------+------------------------------            Adj R-square  =  0.3917
       Total |   576796959    68  8482308.22            Root MSE      =  2271.6

       price |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
-------------+------------------------------------------------------------------
          r2 |   907.3499   1817.764     0.499   0.619      -2728.719    4543.419
          r3 |   1105.359   1668.122     0.663   0.510      -2231.381    4442.099
          r4 |   2147.658   1702.115     1.262   0.212       -1257.08    5552.395
          r5 |   3816.672    1787.51     2.135   0.037       241.1194    7392.226
      length |  -117.3064   40.65207    -2.886   0.005      -198.6226   -35.99012
       displ |   8.447532   8.423298     1.003   0.320      -8.401571    25.29664
      weight |   4.089227   1.597143     2.560   0.013       .8944658    7.283989
         mpg |  -129.2005   84.52707    -1.529   0.132      -298.2799    39.87876
       _cons |   15158.53   6179.409     2.453   0.017       2797.871    27519.19

According to these estimates, cars with fair repair records cost an average of $907 more than cars with poor repair records.
The gap increases with each improvement in repair record. Cars with excellent repair records cost an average of $3,817 more
than cars with poor repair records.
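The arithmetic behind these comparisons can be sketched in Python (coefficient values are taken from the regression table above; the coefficient on each dummy is that group's estimated average price gap relative to the omitted category, repair record 1):

```python
# Coefficients on the repair-record dummies from the regression above.
# Group 1 (poor) is the omitted category, so its implicit coefficient is 0.
coef = {"r1": 0.0, "r2": 907.3499, "r3": 1105.359, "r4": 2147.658, "r5": 3816.672}

def gap(a, b):
    """Estimated average price difference between repair groups a and b."""
    return coef[a] - coef[b]

print(round(gap("r5", "r1")))   # excellent vs. poor: 3817
print(round(gap("r5", "r2")))   # excellent vs. fair: 2909
```

Any pairwise contrast among the five groups is just a difference of two dummy coefficients.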

The question may now arise: Which pairs of groups (categories of rep78) can we legitimately claim are different from each
other; which of these differences are unlikely to have arisen by chance? The answer hinges on what we view as “legitimate”.

The aggressive investigator might argue that groups 1 and 5 are different on the strength of the t statistic for the coefficient on r5 (t = 2.135, with a p value of .037). The cautious investigator (or journal editor), however, would counter that many comparisons of the different groups could have been made. Perhaps this test was selected for focus solely because it happens to show a "significant" difference. And when multiple comparisons are made, the probability under the null of finding, say, a t statistic as large as 2.135 is greater than .037. But how much greater is it? That is, what is the correct p value for this t statistic when multiple comparisons are made?

There are many philosophical views on this problem. I examine the mechanics of one view, traditional adjustment for multiple comparisons, in the context of regression-like models. (See [5s] oneway for a discussion of this approach in an ANOVA context.) This view provides methods for making each test more conservative when there are multiple comparisons, so the overall probability of making a Type I error for any pairwise comparison (declaring a difference significant when it is merely due to chance) remains less than a predetermined value, such as 5 percent. We discuss three widely used approaches: the Bonferroni, Sidak, and Scheffe tests.

The Bonferroni test is the simplest to implement. In this method, the cautious investigator would note that 10 pairs of groups could have been compared and treat a reported p value of 0.037 as if it were 10 × 0.037 = 0.37. It would take a t value of 2.9146 to be "significant" at the 5 percent level according to this logic. Using the Bonferroni rule, the contrast r5 vs. r1 just misses attaining significance.
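A minimal Python sketch of this adjustment (the factor of 10 is the number of pairwise contrasts among the five repair-record groups, 5 choose 2):

```python
from math import comb

def bonferroni_p(p, k):
    """Bonferroni-adjusted p value: multiply by the number of comparisons, cap at 1."""
    return min(1.0, k * p)

n_groups = 5
n_pairs = comb(n_groups, 2)                      # 10 pairwise contrasts
print(n_pairs)                                   # 10
print(round(bonferroni_p(0.037, n_pairs), 2))    # 0.37
```

The adjustment is conservative because it bounds the probability of any false rejection by the sum of the individual error rates.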

The Sidak test is almost identical to the Bonferroni, unless the number of comparison groups is quite large. In our example, the relevant critical value is about the same; a t statistic must be at least 2.9063 to be significant. The Scheffe test is even more conservative, requiring a t statistic of 3.178. The Scheffe procedure is designed to hold for any linear combination of the categories, not just for contrasts (comparisons of any two categories).
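For concreteness, the Sidak adjustment replaces Bonferroni's multiplicative bound with the exact calculation under independence, 1 - (1 - p)^k, and the Scheffe critical t for contrasts among k groups is sqrt((k - 1) F(k - 1, df)) at the chosen level. A sketch of both (the F quantile 2.525 for F(4, 60) at the 5 percent level is an assumed table value, since the standard library has no F inverse):

```python
def sidak_p(p, k):
    """Sidak-adjusted p value: exact when the k tests are independent."""
    return 1.0 - (1.0 - p) ** k

print(round(sidak_p(0.037, 10), 3))   # 0.314 -- slightly below Bonferroni's 0.37

# Scheffe critical t for contrasts among k groups with df residual degrees of
# freedom: |t| >= sqrt((k - 1) * F_.05(k - 1, df)).  With k = 5 and df = 60,
# F_.05(4, 60) ~= 2.525 (assumed table value), matching the 3.178 quoted above.
k, f_crit = 5, 2.525
print(round(((k - 1) * f_crit) ** 0.5, 2))   # 3.18
```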

There is another consideration in a regression model that doesn't arise in the one-way ANOVA context. In the ANOVA, the group means are guaranteed to be independent, since they come from independent samples. In the regression, adjustment for the other covariates introduces correlation between the estimated category means. The Scheffe method is a conservative answer that applies equally well in the regression and ANOVA models; the Bonferroni and Sidak methods can become non-conservative in a regression context.


