Update to a program for saving a model fit as a dataset





Stata Technical Bulletin


STB-58


probability of only 90% under the Conover method, compared to 94% under the cendif method. The two methods show little
or no difference, either in geometric mean confidence interval width or in coverage probability, when the variances are equal
and the Conover assumption is therefore true. From the results so far, I would recommend the cendif method as an improved
version of the Conover method, offering insurance against the possibility that the Conover assumption is wildly wrong, at little
or no price in performance if the Conover assumption is right. However, I am planning to carry out further simulations on the
two methods and to report the results in due course.

Example 1

In the auto data, we compare weights of American and foreign cars. We use cid and cendif to estimate the median
difference:

. cid weight,by(foreign) median unpaired

Rank-based confidence interval for difference in medians by foreign

Variable |  Obs   Estimate      K     [95% Conf. Interval]
---------+------------------------------------------------
  weight |   74       1095    406        720       1350

. cendif weight,by(foreign)

Y-variable: weight (Weight (lbs.))
Grouped by: foreign (Car type)
Group numbers:

    Car type |      Freq.     Percent        Cum.
-------------+-----------------------------------
    Domestic |         52       70.27       70.27
     Foreign |         22       29.73      100.00
-------------+-----------------------------------
       Total |         74      100.00

Transformation: Fisher's z
95% confidence interval(s) for percentile difference(s)
between values of weight in first and second groups:

        Percent   Pctl-Dif    Minimum    Maximum
r1           50       1095        750       1330

We note that the median difference in weight is 1,095 pounds according to both cid and cendif. However, the confidence
limits given by cendif are 750 and 1,330 pounds, whereas the confidence limits given by cid are 720 and 1,350 pounds. This
is because foreign cars are fewer in number and less variable in weight than American cars, and cid assumes equal variances,
whereas cendif allows for unequal variances. If we carry out equal-variance and unequal-variance t tests (not shown), we find
a similar difference in the width of the confidence limits for the mean difference.
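The t tests mentioned above are not shown in the source. A minimal sketch of how they might be run is given below. It assumes a recent Stata in which the auto data can be loaded with sysuse (in older versions the data would typically be loaded with use auto); the tabstat line is added only to display the group sizes and standard deviations that drive the comparison.

```stata
* Load the auto data shipped with Stata (assumes sysuse is available)
sysuse auto, clear

* Group sizes, means, and standard deviations of weight by car type
tabstat weight, by(foreign) statistics(n mean sd)

* Two-sample t test assuming equal variances
ttest weight, by(foreign)

* Two-sample t test allowing unequal variances
ttest weight, by(foreign) unequal
```

Comparing the two confidence intervals for the mean difference printed by ttest gives the parametric counterpart of the comparison between cid and cendif.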

cendif can also calculate confidence intervals for percentiles other than medians. These contain information about the
degree of overlap between the two populations. Here, we estimate the 25th, 50th, and 75th percentile differences, using the
centile option.

. cendif weight,by(foreign) ce(25 50 75)

Y-variable: weight (Weight (lbs.))
Grouped by: foreign (Car type)
Group numbers:

    Car type |      Freq.     Percent        Cum.
-------------+-----------------------------------
    Domestic |         52       70.27       70.27
     Foreign |         22       29.73      100.00
-------------+-----------------------------------
       Total |         74      100.00

Transformation: Fisher's z
95% confidence interval(s) for percentile difference(s)
between values of weight in first and second groups:

        Percent   Pctl-Dif    Minimum    Maximum
r1           25        485        100        810
r2           50       1095        750       1330
r3           75       1555       1320       1790
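One way to read these estimates, assuming that the percentile differences are percentiles of the set of all pairwise differences between a weight in the first group and a weight in the second group, is to build those differences directly. The sketch below is not part of cendif. It assumes a recent Stata with sysuse, is meant to be run as a do-file, and uses the illustrative variable names w_dom, w_for, and diff. The percentile rule used by summarize may not match the one used by cendif exactly, so small discrepancies are possible.

```stata
sysuse auto, clear

* Save the domestic weights in a temporary file
preserve
keep if foreign == 0
keep weight
rename weight w_dom
tempfile dom
save `dom'
restore

* Keep the foreign weights and pair each with every domestic weight
keep if foreign == 1
keep weight
rename weight w_for
cross using `dom'

* Difference: first group (Domestic) minus second group (Foreign)
generate diff = w_dom - w_for

* Percentiles of the 52 x 22 = 1,144 pairwise differences
summarize diff, detail
display r(p25) "  " r(p50) "  " r(p75)
```

This gives point estimates only; the confidence limits reported by cendif are not reproduced by this sketch.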

If we want to estimate percentile ratios of weight, rather than percentile differences, then we simply take logs and use the
eform option.

. gene logwt=log(weight)

. cendif logwt,by(foreign) ce(25 50 75) eform


