simplify my work. At present I use Windows Standard mode under QEMM with the QEMM parameter EMBMEM. If x is one’s
maximum RAM and y is the RAM to be reserved for Intercooled Stata, EMBMEM is set to y. The result is fairly crippled.
Not only does Windows lose access to x − y RAM, but it is also impossible to exploit the task-switching option that would
be available if Intercooled Stata could be run in enhanced mode. Is there a way to run Intercooled Stata in enhanced mode?
A. The present version of Intercooled Stata is unable to run under Windows 3 in enhanced mode—you must use standard or
real mode. In order to run in protected mode, a program must be compiled to support the DOS Protected Mode Interface
(DPMI) standard, something that can only be done by CRC. Many protected-mode programs (AutoCAD, for example) have a
similar problem. CRC tells me that they will soon release a version of Intercooled Stata that will run in enhanced mode.
In the meantime, depending on your hardware constraints as well as on other software considerations, you may employ any
of the following:
1. In real mode, call Windows by using the win/r command. Then click on “File” and “Run.” Type, for example,
c:\stata\istata.exe.
2. In standard mode, call Windows by using the win/s command. Then proceed as in 1. As you suggest, most users will need
to set the QEMM EMBMEM parameter as described in your question; a sketch of such a configuration follows this list. Always
check the documentation that came with your computer. Research, experimentation, and patience seem critical.
3. Opt out of Windows and run Intercooled Stata from DOS.
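For concreteness, here is a sketch of the standard-mode setup described in option 2, assuming QEMM is installed in c:\qemm
and that 2048K is the amount y to be reserved for Intercooled Stata; both the directory and the figure are illustrative, and the
exact behavior of EMBMEM should be checked against your QEMM documentation. The relevant config.sys line would read

    device=c:\qemm\qemm386.sys embmem=2048

after which Windows is started with win/s and Intercooled Stata is launched as in option 1.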
sg3.6 A response to sg3.3: comment on tests of normality
Patrick Royston, Royal Postgraduate Medical School, London, FAX (011)-44-81-740 3119
In my opinion, the distribution of the D’Agostino-Pearson K² statistic for testing for non-normality is not close enough
to χ²(2) for practical use. As I showed in sg3.1 (Royston 1991a), it rejects the null hypothesis of normality too often,
particularly for small samples (n < 100, say) and for stringent tests (e.g., significance levels of 0.01 or less). However, in sg3.5
(Royston 1991b), I supplied a correction to K² which overcame the problem, so that it is no longer an issue.
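In Stata terms, running the corrected test is a one-line affair. A minimal sketch, assuming the sg3.5 ado-file is installed; the
command name sktest and the variable mpg are illustrative rather than definitive:

    . sktest mpg

The corrected version reports an adjusted chi-squared statistic in place of the raw K², so that the resulting P values can be
taken at face value even in small samples.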
I certainly concur with D’Agostino et al.’s (1990, 1991) stated aim of replacing the Kolmogorov test with better tests.
D’Agostino et al. (1990) recommended both K² and the Shapiro-Wilk W as good “omnibus” tests. One of their problems with
W was that it was unavailable for n > 50; this is not so: see Royston (1982) and swilk.ado in sg3.2 (Royston 1991c). They
also complained that if W rejected the null hypothesis, it provided no information on the nature of the departure from normality.
Meaningful indices of non-normality can in fact be derived by linear transformation of the Shapiro-Wilk and Shapiro-Francia
statistics (Royston 1991d). The index for the Shapiro-Francia test is called V' and is proportional to the variance of the difference
between the ordered data and expected normal order statistics. It provides an intuitively reasonable link between the normal plot
and the test of non-normality.
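To see how this works in practice, a brief sketch, assuming the sg3.2 ado-files (swilk.ado and its Shapiro-Francia counterpart)
are installed; the variable name is illustrative:

    . swilk mpg
    . sfrancia mpg

Alongside W and W', the output includes the V and V' indices; values near 1 are consistent with normality, while larger values
quantify the departure in the manner just described.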
My approach to testing for non-normality lines up with that of D’Agostino et al. (1990): “A good complete normality
analysis would consist of the use of the [normal probability] plot plus the statistics.” In that order! The normal plot shows all the
data and gives more information about possible non-normality than any number of summary statistics. The information includes
the presence of outliers and whether the data are skew and/or short- or long-tailed. However, it may be useful to know whether
non-linearities in the normal plot are more likely to be “real” than to be caused by chance fluctuation. That is the job of tests
like K², W, etc.
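A minimal sketch of that two-step analysis, using current command and variable names purely for illustration:

    . qnorm mpg
    . swilk mpg

The normal plot is inspected first, for outliers, skewness, and tail weight; the formal test is then consulted to judge whether
any non-linearity visible in the plot exceeds chance fluctuation.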
D’Agostino’s comment “as sample sizes increase, as any applied researcher knows, these tests will reject the null hypothesis
...” applies to any test of significance whatever. It is a well-recognised drawback of the hypothesis-testing approach and is one
reason why modern statistical practice leans towards estimates with confidence intervals rather than P values. It is true that √b₁
and b₂ do estimate population skewness and kurtosis. Unfortunately, however, the estimates may be highly biased (especially
for skew populations), the bias depends on the sample size and, as far as I know, confidence intervals are not available, so
their value seems to be limited. Arguably better indices of population shape are based on so-called L-moments (Hosking 1990,
Royston 1991e), but even then confidence intervals are problematic.
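For reference, the moment-based estimates under discussion are what summarize reports; a sketch, with the variable name
illustrative:

    . summarize mpg, detail

The Skewness and Kurtosis entries in the output are √b₁ and b₂; as noted above, they come with no confidence intervals, and
their bias depends on both the sample size and the shape of the population.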
A major problem with tests and estimates of non-normality underlies D’Agostino et al.’s (1991) comment “[skewness and
kurtosis] can help us to judge if our later inferences will be affected by the nonnormality.” I would ask, how? If one is trying
to use the sample mean and standard deviation to estimate centiles (such as the 5th and 95th, a common clinical application) of
a distribution believed to be approximately normal, even slight departures from normality may make the estimates unacceptably
inaccurate. What values of √b₁ and b₂ (or any other statistic) indicate “slight” non-normality here? Similarly, one would like to
know whether, for example, a given t-test or confidence interval for a mean is valid or is compromised by non-normality in the
data. Until answers to specific questions of this sort are available, the inferential value of non-normality statistics is doubtful.
Clearly, much research is needed.
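To make the centile application above concrete: under normality the 5th and 95th centiles are estimated by the mean plus or
minus 1.645 standard deviations, since invnormal(0.95) ≈ 1.645. A sketch in current Stata syntax, with the variable x illustrative:

    . quietly summarize x
    . display r(mean) + invnormal(0.05)*r(sd)
    . display r(mean) + invnormal(0.95)*r(sd)

If the data are even mildly skew, one of these estimates may be badly off in the corresponding tail, which is precisely the
difficulty raised above.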