Computing optimal sampling designs for two-stage studies

Stata Technical Bulletin

STB-58

probability of only 90% under the Conover method, compared to 94% under the cendif method. The two methods show little
or no difference, either in geometric mean confidence interval width or in coverage probability, when the variances are equal
and the Conover assumption is therefore true. From the results so far, I would recommend the cendif method as an improved
version of the Conover method, offering insurance against the possibility that the Conover assumption is wildly wrong, at little
or no price in performance if the Conover assumption is right. However, I am planning to carry out further simulations on the
two methods and to report the results in due course.

Example 1

In the auto data, we compare weights of American and foreign cars. We use cid and cendif to estimate the median
difference:

. cid weight,by(foreign) median unpaired

Rank-based confidence interval for difference in medians by foreign

Variable |   Obs   Estimate      K   [95% Conf. Interval]
---------+-------------------------------------------------------------
  weight |    74       1095    406        720       1350

. cendif weight,by(foreign)

Y-variable: weight (Weight (lbs.))
Grouped by: foreign (Car type)
Group numbers:

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
   Domestic |         52       70.27       70.27
    Foreign |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00

Transformation: Fisher's z
95% confidence interval(s) for percentile difference(s)
between values of weight in first and second groups:

     Percent   Pctl_Dif   Minimum   Maximum
r1        50       1095       750      1330

We note that the median difference in weight is 1,095 pounds according to both cid and cendif. However, the confidence
limits given by cendif are 750 and 1,330 pounds, whereas the confidence limits given by cid are 720 and 1,350 pounds. This
is because foreign cars are fewer in number and less variable in weight than American cars, and cid assumes equal variances,
whereas cendif allows for unequal variances. If we carry out equal-variance and unequal-variance t tests (not shown), we find
a similar difference in the width of the confidence limits for the mean difference.
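
The t tests mentioned above are not shown in the source. A minimal sketch of how they might be run follows. It assumes a Stata release that supplies the auto data through sysuse, which is an assumption, since the session above does not show how the data were loaded.

```stata
* Assumption: the auto data are available through -sysuse-.
sysuse auto, clear

* Group sizes, means, and standard deviations of weight by car type
tabstat weight, by(foreign) statistics(n mean sd)

* Equal-variance (pooled) two-sample t test of weight by car type
ttest weight, by(foreign)

* Unequal-variance two-sample t test of weight by car type
ttest weight, by(foreign) unequal
```

Each ttest call reports a confidence interval for the mean difference, so the two interval widths can be compared in the same way as the cid and cendif limits.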

cendif can also calculate confidence intervals for percentiles other than medians. These contain information about the
degree of overlap between the two populations. Here, we estimate the 25th, 50th, and 75th percentile differences, using the
centile option.

. cendif weight,by(foreign) ce(25 50 75)

Y-variable: weight (Weight (lbs.))
Grouped by: foreign (Car type)
Group numbers:

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
   Domestic |         52       70.27       70.27
    Foreign |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00

Transformation: Fisher's z
95% confidence interval(s) for percentile difference(s)
between values of weight in first and second groups:

     Percent   Pctl_Dif   Minimum   Maximum
r1        25        485       100       810
r2        50       1095       750      1330
r3        75       1555      1320      1790
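
To see what a percentile difference measures, one can form the underlying differences directly. This assumes that a percentile difference is a percentile of all pairwise differences between a value in the first group and a value in the second (the Hodges-Lehmann definition in the median case). The sketch below is illustrative only and is not how cendif is implemented. The names w_dom, w_for, and diff are hypothetical, and the interpolation rule used by centile may give point estimates that differ slightly from those above. The confidence limits printed by centile should be ignored here, because the pairwise differences are not independent observations.

```stata
* Assumption: the auto data are available through -sysuse-.
sysuse auto, clear

* Save the domestic weights to a temporary file
preserve
keep if foreign == 0
keep weight
rename weight w_dom
tempfile dom
save "`dom'"
restore

* Keep the foreign weights and pair each one with every domestic weight
keep if foreign == 1
keep weight
rename weight w_for
cross using "`dom'"

* One observation per (foreign, domestic) pair: 22 x 52 = 1144 differences
generate diff = w_dom - w_for
centile diff, centile(25 50 75)
```

Run from a do-file, this leaves 1,144 pairwise differences in memory, whose sample percentiles can be set beside the Pctl_Dif column.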

If we want to estimate percentile ratios of weight, rather than percentile differences, then we simply take logs and use the
eform option.

. gene logwt=log(weight)

. cendif logwt,by(foreign) ce(25 50 75) eform
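
The reason that taking logs and then using eform yields ratios can be sketched as follows, assuming that eform simply exponentiates the estimates and confidence limits computed on the log scale. For a weight $y_1$ from the first group and a weight $y_2$ from the second group:

```latex
\exp\bigl(\log y_1 - \log y_2\bigr) = \frac{y_1}{y_2}
```

Because the exponential function is increasing, an order statistic of the pairwise differences in logwt maps to the same order statistic of the pairwise ratios $y_1/y_2$. The exponentiated percentile differences can therefore be read as percentile ratios of weight.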
