14
Stata Technical Bulletin
STB-8
Formulae
Default case. I basically use the method of Mood and Graybill (1963, 408). Let $x_1 \le x_2 \le \cdots \le x_n$ be a sample of
size $n$ arranged in ascending order. Denote the estimated $q$th centile of the $x$'s as $c_q$. We require that $0 < q < 100$. Let
$R = (n + 1)q/100$ have integer part $r$ and fractional part $f$, that is, $r = \mathrm{int}(R)$ and $f = R - r$. (If $R$ is itself an integer, then
$r = R$ and $f = 0$.) Note that $0 \le r \le n$. For convenience, define $x_0 = x_1$ and $x_{n+1} = x_n$. Then $C_q$ is estimated by
$$c_q = x_r + f \times (x_{r+1} - x_r),$$
that is, $c_q$ is a weighted average of $x_r$ and $x_{r+1}$. Loosely speaking, a (conservative) $p$% confidence interval for $C_q$ involves
finding the observations ranked $t$ and $u$ which correspond respectively to the $\alpha = (100 - p)/200$ and $1 - \alpha$ quantiles of a
binomial distribution with parameters $n$ and $q/100$, i.e., $B(n, q/100)$. More precisely, define the $i$th value ($i = 0, \ldots, n$) of
the cumulative binomial distribution function to be $F_i = P(X \le i)$, where $X$ has distribution $B(n, q/100)$. For convenience,
let $F_{-1} = 0$ and $F_{n+1} = 1$. Then $t$ is found such that $F_t \le \alpha$ and $F_{t+1} > \alpha$, and $u$ is found such that $1 - F_u \le \alpha$ and
$1 - F_{u-1} > \alpha$.
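The point estimate defined at the start of this paragraph can be sketched in a few lines of Python. This is only an illustration of the formula, not the Stata implementation; the function name `centile` is hypothetical, and the data are the 13 observations of the worked example in this section.

```python
def centile(values, q):
    """Estimate the q-th centile (0 < q < 100) by linear interpolation on ranks."""
    if not 0 < q < 100:
        raise ValueError("q must satisfy 0 < q < 100")
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("need at least one observation")
    # padded[i] holds x_i for i = 0..n+1, using x_0 = x_1 and x_{n+1} = x_n.
    padded = [x[0]] + x + [x[-1]]
    R = (n + 1) * q / 100
    r = int(R)   # integer part of R
    f = R - r    # fractional part of R
    return padded[r] + f * (padded[r + 1] - padded[r])


# The 13 observations from the worked example in this section.
data = [5, 7, 10, 15, 23, 28, 33, 37, 45, 59, 77, 104, 211]
print(centile(data, 50))  # 33.0 (R = 7, f = 0, so the 7th observation)
print(centile(data, 25))  # 12.5 (R = 3.5: halfway between x_3 = 10 and x_4 = 15)
```

Padding the sorted sample with one copy of each extreme value is one simple way to honour the two boundary conventions without special cases.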
With the cci option in force, the (conservative) confidence interval is $(x_{t+1}, x_{u+1})$ and its actual coverage is $F_u - F_t$.
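The search for the ranks and the resulting conservative interval might be sketched as follows. This is a hedged Python illustration rather than the code behind the cci option; `binomial_cdf` and `conservative_ci` are hypothetical names, and the sketch assumes $0 < p < 100$ and a non-empty sample.

```python
from math import comb


def binomial_cdf(n, prob):
    """Return [F_0, ..., F_n], where F_i = P(X <= i) for X ~ B(n, prob)."""
    cdf, total = [], 0.0
    for i in range(n + 1):
        total += comb(n, i) * prob**i * (1 - prob) ** (n - i)
        cdf.append(min(total, 1.0))
    cdf[-1] = 1.0  # F_n is exactly 1; remove accumulated rounding error
    return cdf


def conservative_ci(values, q, p=95):
    """Return (lower, upper, coverage) for the conservative interval."""
    x = sorted(values)
    n = len(x)
    alpha = (100 - p) / 200
    cdf = binomial_cdf(n, q / 100)

    def F(i):  # conventions F_{-1} = 0 and F_{n+1} = 1
        return 0.0 if i < 0 else 1.0 if i > n else cdf[i]

    def obs(i):  # conventions x_0 = x_1 and x_{n+1} = x_n
        return x[min(max(i, 1), n) - 1]

    # t: the largest rank with F_t <= alpha, so that F_{t+1} > alpha.
    t = max(i for i in range(-1, n + 1) if F(i) <= alpha)
    # u: the smallest rank with 1 - F_u <= alpha, so that 1 - F_{u-1} > alpha.
    u = min(i for i in range(0, n + 1) if 1 - F(i) <= alpha)
    return obs(t + 1), obs(u + 1), F(u) - F(t)


# The 13 observations from the worked example in this section.
data = [5, 7, 10, 15, 23, 28, 33, 37, 45, 59, 77, 104, 211]
lower, upper, coverage = conservative_ci(data, 50)
print(lower, upper, round(coverage, 4))  # 10 77 0.9775
```

The unrounded coverage is 8008/8192, about .9775; the figure .9776 quoted in the worked example comes from differencing the four-decimal table entries.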
The default case uses linear interpolation on the $F_i$ as follows. Let
$$g = (\alpha - F_t)/(F_{t+1} - F_t),$$
$$h = [\alpha - (1 - F_u)]/[(1 - F_{u-1}) - (1 - F_u)] = (\alpha - 1 + F_u)/(F_u - F_{u-1}).$$
Then the interpolated lower and upper confidence limits $(c_{qL}, c_{qU})$ for $C_q$ are
$$c_{qL} = x_{t+1} + g \times (x_{t+2} - x_{t+1}),$$
$$c_{qU} = x_{u+1} - h \times (x_{u+1} - x_u).$$
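As a rough check on the interpolation step, the following Python sketch evaluates $g$, $h$ and the two limits for given ranks $t$ and $u$. The function name `interpolated_limits` is hypothetical. The sketch takes $h$ with denominator $F_u - F_{u-1}$, which is the form the arithmetic of the worked example uses, and it is fed the data, the four-decimal $F_i$, and the ranks $t = 2$, $u = 10$ from that example.

```python
def interpolated_limits(x, cdf, t, u, alpha):
    """Interpolated confidence limits (c_qL, c_qU) from ranks t and u.

    x   : observations sorted ascending (x[0] is x_1)
    cdf : [F_0, ..., F_n] for B(n, q/100)
    """
    n = len(x)

    def F(i):  # conventions F_{-1} = 0 and F_{n+1} = 1
        return 0.0 if i < 0 else 1.0 if i > n else cdf[i]

    def obs(i):  # conventions x_0 = x_1 and x_{n+1} = x_n
        return x[min(max(i, 1), n) - 1]

    g = (alpha - F(t)) / (F(t + 1) - F(t))
    h = (alpha - (1 - F(u))) / (F(u) - F(u - 1))
    lower = obs(t + 1) + g * (obs(t + 2) - obs(t + 1))
    upper = obs(u + 1) - h * (obs(u + 1) - obs(u))
    return lower, upper


# Worked example: the data and the four-decimal F_i from the table in the text.
data = [5, 7, 10, 15, 23, 28, 33, 37, 45, 59, 77, 104, 211]
F_table = [0.0001, 0.0017, 0.0112, 0.0461, 0.1334, 0.2905, 0.5000,
           0.7095, 0.8666, 0.9539, 0.9888, 0.9983, 0.9999, 1.0000]
lower, upper = interpolated_limits(data, F_table, t=2, u=10, alpha=0.025)
print(f"{lower:.2f} {upper:.2f}")  # 11.98 69.88
```

The upper limit prints as 69.88 rather than the 69.89 of the worked example because the example rounds $h$ to .395 before multiplying; with unrounded binomial probabilities the limits are about 11.97 and 69.90.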
For example, suppose we want a 95% confidence interval for the median of a sample of size 13. So $n = 13$, $q = 50$, $p = 95$,
$\alpha = .025$, $R = 14 \times 50/100 = 7$, $f = 0$. The median is therefore the 7th observation. Some example data $x_i$ and the values of
$F_i$ are as follows:
| $i$ | $F_i$ | $1 - F_i$ | $x_i$ | $i$ | $F_i$ | $1 - F_i$ | $x_i$ |
|---|---|---|---|---|---|---|---|
| 0 | 0.0001 | 0.9999 | - | 7 | 0.7095 | 0.2905 | 33 |
| 1 | 0.0017 | 0.9983 | 5 | 8 | 0.8666 | 0.1334 | 37 |
| 2 | 0.0112 | 0.9888 | 7 | 9 | 0.9539 | 0.0461 | 45 |
| 3 | 0.0461 | 0.9539 | 10 | 10 | 0.9888 | 0.0112 | 59 |
| 4 | 0.1334 | 0.8666 | 15 | 11 | 0.9983 | 0.0017 | 77 |
| 5 | 0.2905 | 0.7095 | 23 | 12 | 0.9999 | 0.0001 | 104 |
| 6 | 0.5000 | 0.5000 | 28 | 13 | 1.0000 | 0.0000 | 211 |
The median is $x_7 = 33$. Also, $F_2 < .025$ and $F_3 > .025$ so $t = 2$; $1 - F_{10} \le .025$ and $1 - F_9 > .025$ so $u = 10$. The conservative confidence interval is therefore
$$(c_{50L}, c_{50U}) = (x_3, x_{11}) = (10, 77),$$
with actual coverage $F_{10} - F_2 = .9888 - .0112 = .9776$ (97.8% confidence). For the interpolation calculation, we have
$$g = (.025 - .0112)/(.0461 - .0112) = .395,$$
$$h = (.025 - 1 + .9888)/(.9888 - .9539) = .395.$$
So
$$c_{50L} = x_3 + .395 \times (x_4 - x_3) = 10 + .395 \times 5 = 11.98,$$
$$c_{50U} = x_{11} - .395 \times (x_{11} - x_{10}) = 77 - .395 \times 18 = 69.89.$$
normal case. The value of $c_q$ is as above. Its s.e. is given by the formula
$$s_q = \sqrt{q(100 - q)} \,/\, [100\, n\, Z(c_q; \bar{x}, s)]$$