


16   Stata Technical Bulletin   STB-22


     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          2        2.90        2.90
          2 |          8       11.59       14.49
          3 |         30       43.48       57.97
          4 |         18       26.09       84.06
          5 |         11       15.94      100.00
------------+-----------------------------------
      Total |         69      100.00

. regress price r2 r3 r4 r5 length displ weight mpg

      Source |       SS       df       MS               Number of obs =      69
-------------+------------------------------            F(  8,    60) =    6.47
       Model |   267197833     8  33399729.2            Prob > F      =  0.0000
    Residual |   309599125    60  5159985.42            R-square      =  0.4632
-------------+------------------------------            Adj R-square  =  0.3917
       Total |   576796959    68  8482308.22            Root MSE      =  2271.6

       price |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
-------------+------------------------------------------------------------------
          r2 |   907.3499   1817.764     0.499   0.619      -2728.719    4543.419
          r3 |   1105.359   1668.122     0.663   0.510      -2231.381    4442.099
          r4 |   2147.658   1702.115     1.262   0.212       -1257.08    5552.395
          r5 |   3816.672    1787.51     2.135   0.037       241.1194    7392.226
      length |  -117.3064   40.65207    -2.886   0.005      -198.6226   -35.99012
       displ |   8.447532   8.423298     1.003   0.320      -8.401571    25.29664
      weight |   4.089227   1.597143     2.560   0.013       .8944658    7.283989
         mpg |  -129.2005   84.52707    -1.529   0.132      -298.2799    39.87876
       _cons |   15158.53   6179.409     2.453   0.017       2797.871    27519.19

According to these estimates, cars with fair repair records cost an average of $907 more than cars with poor repair records.
The gap increases with each improvement in repair record. Cars with excellent repair records cost an average of $3,817 more
than cars with poor repair records.
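The arithmetic behind these comparisons can be sketched in Python (coefficient values are taken from the regression table above; the coefficient on each dummy is that group's estimated average price gap relative to the omitted category, repair record 1):

```python
# Coefficients on the repair-record dummies from the regression above.
# Group 1 (poor) is the omitted category, so its implicit coefficient is 0.
coef = {"r1": 0.0, "r2": 907.3499, "r3": 1105.359, "r4": 2147.658, "r5": 3816.672}

def gap(a, b):
    """Estimated average price difference between repair groups a and b."""
    return coef[a] - coef[b]

print(round(gap("r5", "r1")))   # excellent vs. poor: 3817
print(round(gap("r5", "r2")))   # excellent vs. fair: 2909
```

Any pairwise contrast among the five groups is just a difference of two dummy coefficients.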

The question may now arise: Which pairs of groups (categories of rep78) can we legitimately claim are different from each
other; which of these differences are unlikely to have arisen by chance? The answer hinges on what we view as “legitimate”.

The aggressive investigator might argue that groups 1 and 5 are different on the strength of the t statistic for the coefficient on r5 (t = 2.135, with a p value of .037). The cautious investigator (or journal editor), however, would counter that many comparisons of the different groups could have been made. Perhaps this test was selected for focus solely because it happens to show a "significant" difference. And when multiple comparisons are made, the probability under the null of finding, say, a t statistic as large as 2.135 is greater than .037. But how much greater is it? That is, what is the correct p value for this t statistic when multiple comparisons are made?

There are many philosophical views on this problem. I examine the mechanics of one view, traditional adjustment for multiple comparisons, in the context of regression-like models. (See [5s] oneway for a discussion of this approach in an ANOVA context.) This view provides methods for making each test more conservative when there are multiple comparisons, so the overall probability of making a Type I error for any pairwise comparison (declaring a difference significant when it is merely due to chance) remains less than a predetermined value, such as 5 percent. We discuss three widely used approaches: the Bonferroni, Sidak, and Scheffe tests.

The Bonferroni test is the simplest to implement. In this method, the cautious investigator would note that 10 pairs of groups could have been compared and treat a reported p value of 0.037 as if it were 10 × 0.037 = 0.37. It would take a t value of 2.9146 to be "significant" at the 5 percent level according to this logic. Using the Bonferroni rule, the contrast r5 vs. r1 just misses attaining significance.
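A minimal Python sketch of this adjustment (the factor of 10 is the number of pairwise contrasts among the five repair-record groups, 5 choose 2):

```python
from math import comb

def bonferroni_p(p, k):
    """Bonferroni-adjusted p value: multiply by the number of comparisons, cap at 1."""
    return min(1.0, k * p)

n_groups = 5
n_pairs = comb(n_groups, 2)                      # 10 pairwise contrasts
print(n_pairs)                                   # 10
print(round(bonferroni_p(0.037, n_pairs), 2))    # 0.37
```

The adjustment is conservative because it bounds the probability of any false rejection by the sum of the individual error rates.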

The Sidak test is almost identical to the Bonferroni, unless the number of comparison groups is quite large. In our example, the relevant critical value is about the same; a t statistic must be at least 2.9063 to be significant. The Scheffe test is even more conservative, requiring a t statistic of 3.178. The Scheffe procedure is designed to hold for any linear combination of the categories, not just for contrasts (comparisons of any two categories).
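For concreteness, the Sidak adjustment replaces Bonferroni's multiplicative bound with the exact calculation under independence, 1 - (1 - p)^k, and the Scheffe critical t for contrasts among k groups is sqrt((k - 1) F(k - 1, df)) at the chosen level. A sketch of both (the F quantile 2.525 for F(4, 60) at the 5 percent level is an assumed table value, since the standard library has no F inverse):

```python
def sidak_p(p, k):
    """Sidak-adjusted p value: exact when the k tests are independent."""
    return 1.0 - (1.0 - p) ** k

print(round(sidak_p(0.037, 10), 3))   # 0.314 -- slightly below Bonferroni's 0.37

# Scheffe critical t for contrasts among k groups with df residual degrees of
# freedom: |t| >= sqrt((k - 1) * F_.05(k - 1, df)).  With k = 5 and df = 60,
# F_.05(4, 60) ~= 2.525 (assumed table value), matching the 3.178 quoted above.
k, f_crit = 5, 2.525
print(round(((k - 1) * f_crit) ** 0.5, 2))   # 3.18
```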

There is another consideration in a regression model that doesn't arise in the one-way ANOVA context. In the ANOVA, the group means are guaranteed to be independent, since they come from independent samples. In the regression, adjustment for the other covariates introduces correlation between the estimated category means. The Scheffe method is a conservative answer that applies equally well in the regression and ANOVA models; the Bonferroni and Sidak methods can become non-conservative in a regression context.


