Computing optimal sampling designs for two-stage studies



Stata Technical Bulletin


STB-58


probability of only 90% under the Conover method, compared to 94% under the cendif method. The two methods show little
or no difference, either in geometric mean confidence interval width or in coverage probability, when the variances are equal
and the Conover assumption is therefore true. From the results so far, I would recommend the cendif method as an improved
version of the Conover method, offering insurance against the possibility that the Conover assumption is wildly wrong, at little
or no price in performance if the Conover assumption is right. However, I am planning to carry out further simulations on the
two methods and to report the results in due course.

Example 1

In the auto data, we compare weights of American and foreign cars. We use cid and cendif to estimate the median
difference:

. cid weight,by(foreign) median unpaired

Rank-based confidence interval for difference in medians by foreign

Variable |   Obs   Estimate       K     [95% Conf. Interval]
---------+-------------------------------------------------------------
  weight |    74       1095     406          720        1350

. cendif weight,by(foreign)

Y-variable: weight (Weight (lbs.))
Grouped by: foreign (Car type)
Group numbers:

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
   Domestic |         52       70.27       70.27
    Foreign |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00

Transformation: Fisher's z
95% confidence interval(s) for percentile difference(s)
between values of weight in first and second groups:

      Percent   Pctl-Dif    Minimum    Maximum
r1         50       1095        750       1330

We note that the median difference in weight is 1,095 pounds according to both cid and cendif. However, the confidence
limits given by cendif are 750 and 1,330 pounds, whereas the confidence limits given by cid are 720 and 1,350 pounds. This
is because foreign cars are fewer in number and less variable in weight than American cars, and cid assumes equal variances,
whereas cendif allows for unequal variances. If we carry out equal-variance and unequal-variance t tests (not shown), we find
a similar difference in the width of the confidence intervals for the mean difference.
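
The t tests mentioned above are not shown in the text. A minimal sketch of how they might be run follows, assuming a recent Stata in which the auto data can be loaded with sysuse (an assumption; the text does not say how the data were loaded). The tabstat line is added only to display the group sizes and standard deviations that the explanation above relies on.

```stata
* Assumption: a recent Stata in which -sysuse- loads the shipped auto data
sysuse auto, clear

* Group sizes, means, and standard deviations of weight by car type
tabstat weight, by(foreign) statistics(n mean sd)

* Equal-variance (pooled) two-sample t test
ttest weight, by(foreign)

* Unequal-variance two-sample t test
ttest weight, by(foreign) unequal
```

Comparing the confidence intervals for the mean difference in the two ttest outputs should show the contrast described above.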

cendif can also calculate confidence intervals for percentiles other than medians. These contain information about the
degree of overlap between the two populations. Here, we estimate the 25th, 50th, and 75th percentile differences, using the
centile option.

. cendif weight,by(foreign) ce(25 50 75)

Y-variable: weight (Weight (lbs.))
Grouped by: foreign (Car type)
Group numbers:

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
   Domestic |         52       70.27       70.27
    Foreign |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00

Transformation: Fisher's z
95% confidence interval(s) for percentile difference(s)
between values of weight in first and second groups:

      Percent   Pctl-Dif    Minimum    Maximum
r1         25        485        100        810
r2         50       1095        750       1330
r3         75       1555       1320       1790
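
As a rough cross-check on the point estimates above, one can form every pairwise difference between a domestic and a foreign car's weight and look at the percentiles of those differences. The sketch below rests on two assumptions: that cendif's percentile differences are percentiles of these between-group pairwise differences (the Hodges-Lehmann definition in the case of the median), and that a recent Stata with sysuse is available. The variable and tempfile names are hypothetical. Percentile conventions differ between commands, so the values may not match the cendif output exactly, and this sketch gives no valid confidence limits.

```stata
* Assumption: a recent Stata in which -sysuse- loads the shipped auto data
sysuse auto, clear

* Save the foreign cars' weights in a temporary file
preserve
keep if foreign == 1
keep weight
rename weight wt_foreign
tempfile foreigncars
save "`foreigncars'"
restore

* Keep the domestic cars' weights and pair each with every foreign car
keep if foreign == 0
keep weight
rename weight wt_domestic
cross using "`foreigncars'"

* 52 x 22 = 1,144 pairwise differences (first group minus second group)
generate diff = wt_domestic - wt_foreign

* Point estimates only: 25th, 50th, and 75th percentiles of the differences
tabstat diff, statistics(p25 p50 p75)
```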

If we want to estimate percentile ratios of weight, rather than percentile differences, then we simply take logs and use the
eform option.

. gene logwt=log(weight)

. cendif logwt,by(foreign) ce(25 50 75) eform


