We are left with a vague feeling that since many statistical tests and estimates assume normality, we ought to test for
non-normality even if we can’t really interpret the results. In this unsatisfactory situation, a choice between available tests,
if it is to be made at all, should, I contend, be based mainly on power comparisons. Much has been written on this subject
which I shall not try to summarize here. Generally speaking, the power of the K2, W and Shapiro-Francia W' tests seems
broadly comparable and considerably better than that of the older Kolmogorov and Pearson chi-square tests. K2 seems weak against
skewed, short-tailed distributions. W is weak against symmetric, rather long-tailed distributions. W' is weak against symmetric,
short-tailed distributions. No test is perfect!
References
D’Agostino, R. B., A. Belanger and R. B. D’Agostino, Jr. 1990. A suggestion for using powerful and informative tests of normality. American
Statistician 44(4): 316-321.
——. 1991. sg3.3: Comment on tests of normality. Stata Technical Bulletin 3: 20.
Hosking, J. R. M. 1990. L-moments: analysis and estimation of distributions using linear combinations of order statistics. Journal of the Royal Statistical Society B 52: 105-124.
Royston, J. P. 1982. An extension of Shapiro and Wilk’s W test for normality to large samples. Applied Statistics 31: 115-124.
——. 1991a. sg3.1: Tests for departure from normality. Stata Technical Bulletin 2: 16-17.
——. 1991b. sg3.5: Comment on sg3.4 and an improved D’Agostino test. Stata Technical Bulletin 3: 23-24.
——. 1991c. sg3.2: Shapiro-Wilk and Shapiro-Francia tests. Stata Technical Bulletin 3: 19-20.
——. 1991d. Estimating departure from normality. Statistics in Medicine 10: 1283-1293.
——. 1991e. Which measures of skewness and kurtosis are best? Statistics in Medicine 10, in press.
smv1 Single factor repeated measures ANOVA
Joseph Hilbe, Editor, STB, FAX 602-860-1446
The syntax for the ranova command is
ranova varlist [if exp] [in range]
ranova automatically checks for missing values across variables listed on the command line. When a missing value is
found in any variable, it deletes the observation from active memory. However, the original data set is restored to memory upon
completion of the analysis. The program provides information regarding excluded observations.
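For example, if the repeated measurements on each subject were held in three hypothetical variables named time1, time2, and time3 (the variable names, including group below, are purely illustrative), the command could be issued as

    . ranova time1 time2 time3

or, using the optional qualifiers to restrict the sample,

    . ranova time1 time2 time3 if group==1 in 1/30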
The statistical design permits analysis of repeated (treatment) measures on the same individuals. Each variable in the varlist
corresponds to one treatment. ranova tests the hypothesis that all treatments have the same mean against the alternative that the
treatment means are not all equal. This test is similar to a two-way analysis of variance in which the factors are
treatment and subject, but the data are organized differently. Total model variability is divided into
1. SS Treatment: the variability resulting from the independent variable; that is, the levels or categories of response.
2. SS Within: the variability that cannot be accounted for by the independent variable.
a. SS Subjects: the variability resulting from individual differences.
b. SS Error: the variability resulting from random factors.
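In symbols, with $k$ treatments (the variables in the varlist) and $n$ subjects, this is the usual single-factor repeated-measures decomposition; the following is a sketch of the resulting quantities, not a transcription of ranova's output:

$$SS_{\text{Total}} = SS_{\text{Treatment}} + SS_{\text{Within}}, \qquad SS_{\text{Within}} = SS_{\text{Subjects}} + SS_{\text{Error}}$$

$$F = \frac{MS_{\text{Treatment}}}{MS_{\text{Error}}} = \frac{SS_{\text{Treatment}}/(k-1)}{SS_{\text{Error}}/[(k-1)(n-1)]}$$

which is referred to an $F$ distribution with $(k-1)$ and $(k-1)(n-1)$ degrees of freedom.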
This test works for the simplest special case of the repeated measures design, but it does not handle any complications. Since
each individual provides a value for each level or category of the independent variable, it is possible to measure the individual
difference variability. This is not possible in randomized designs. However, there are several complications that may make this
test invalid. In many instances when individuals are being measured over time, there may be a carry-over effect from early
measurements to later ones. This will bias the test statistic. Moreover, when there are more than two measurements, the model
assumes homogeneity of covariance. This assumption is violated, for example, when one pair of levels is fairly close in time
whereas another pair is more distant. Violations of this sort affect the Type I error rate. The problem can be ameliorated
by using the Huynh-Feldt correction or by transforming the repetitions of dependent variables into separate dependent variables
and analyzing the model by profile analysis or MANOVA.
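For reference (the description above does not spell out the computation), the Huynh-Feldt approach leaves the $F$ statistic itself unchanged but evaluates it against degrees of freedom multiplied by an estimated correction factor $\hat{\varepsilon}$, with $1/(k-1) \le \hat{\varepsilon} \le 1$; a sketch of the adjusted degrees of freedom is

$$df_1 = \hat{\varepsilon}(k-1), \qquad df_2 = \hat{\varepsilon}(k-1)(n-1)$$

so that when sphericity holds ($\hat{\varepsilon} = 1$) the unadjusted test is recovered.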
If a significant difference between levels of the independent variable has been determined, Tukey HSD tests may be calculated
to ascertain which level(s) are significantly different. The formula is
$HSD = q\sqrt{MS_{\text{error}}/N}$, where the appropriate $q$ value is found in a Studentized Range table.
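In practice, assuming the $N$ above denotes the number of subjects contributing to each treatment mean (the formula does not define it), a pair of treatment levels $i$ and $j$ is judged significantly different when the absolute difference of their means exceeds HSD:

$$|\bar{x}_i - \bar{x}_j| > q\sqrt{MS_{\text{error}}/N}$$

with $q$ read from the Studentized Range table for $k$ means and $(k-1)(n-1)$ error degrees of freedom.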