14
Stata Technical Bulletin
STB-8
Formulæ
Default case. I basically use the method of Mood and Graybill (1963, 408). Let $x_1 \le x_2 \le \cdots \le x_n$ be a sample of size $n$ arranged in ascending order. Denote the estimated $q$th centile of the $x$'s as $c_q$. We require that $0 < q < 100$. Let $R = (n + 1)q/100$ have integer part $r$ and fractional part $f$, that is, $r = \mathrm{int}(R)$ and $f = R - r$. (If $R$ is itself an integer, then $r = R$ and $f = 0$.) Note that $0 \le r \le n$. For convenience, define $x_0 = x_1$ and $x_{n+1} = x_n$. Then $C_q$ is estimated by
$$c_q = x_r + f \times (x_{r+1} - x_r),$$
that is, $c_q$ is a weighted average of $x_r$ and $x_{r+1}$. Loosely speaking, a (conservative) $p$% confidence interval for $C_q$ involves finding the observations ranked $t$ and $u$ which correspond respectively to the $\alpha = (100 - p)/200$ and $1 - \alpha$ quantiles of a binomial distribution with parameters $n$ and $q/100$, i.e., $B(n, q/100)$. More precisely, define the $i$th value ($i = 0, \ldots, n$) of the cumulative binomial distribution function to be $F_i = P(X \le i)$, where $X$ has distribution $B(n, q/100)$. For convenience, let $F_{-1} = 0$ and $F_{n+1} = 1$. Then $t$ is found such that $F_t \le \alpha$ and $F_{t+1} > \alpha$, and $u$ is found such that $1 - F_u \le \alpha$ and $1 - F_{u-1} > \alpha$.
With the cci option in force, the (conservative) confidence interval is $(x_{t+1}, x_{u+1})$ and its actual coverage is $F_u - F_t$.
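The point estimate and the conservative interval can be sketched in a few lines of Python (not Stata), using only the standard library. This is an illustration under stated assumptions rather than the command's actual implementation: the function names `binom_cdf`, `centile_point`, and `conservative_ci` are hypothetical, and the binomial distribution function is computed by direct summation.

```python
from math import comb, floor

def binom_cdf(n, prob):
    """Return [F_0, ..., F_n] with F_i = P(X <= i) for X ~ B(n, prob)."""
    cdf, total = [], 0.0
    for i in range(n + 1):
        total += comb(n, i) * prob**i * (1 - prob) ** (n - i)
        cdf.append(total)
    cdf[-1] = 1.0  # F_n is exactly 1; remove accumulated rounding error
    return cdf

def centile_point(x, q):
    """Estimate c_q = x_r + f * (x_{r+1} - x_r) for 0 < q < 100."""
    xs = sorted(x)
    n = len(xs)
    # padded[k] is x_k for k = 0, ..., n + 1, with x_0 = x_1 and x_{n+1} = x_n
    padded = [xs[0]] + xs + [xs[-1]]
    R = (n + 1) * q / 100
    r = floor(R)
    f = R - r
    return padded[r] + f * (padded[r + 1] - padded[r])

def conservative_ci(x, q, p=95):
    """Return (x_{t+1}, x_{u+1}, F_u - F_t) for a conservative p% interval."""
    xs = sorted(x)
    n = len(xs)
    padded = [xs[0]] + xs + [xs[-1]]
    alpha = (100 - p) / 200
    cdf = binom_cdf(n, q / 100)

    def F(i):
        # Convention from the text: F_{-1} = 0 and F_{n+1} = 1
        if i < 0:
            return 0.0
        if i > n:
            return 1.0
        return cdf[i]

    # t is the largest rank with F_t <= alpha (so F_{t+1} > alpha)
    t = max(i for i in range(-1, n + 1) if F(i) <= alpha)
    # u is the smallest rank with 1 - F_u <= alpha (so 1 - F_{u-1} > alpha)
    u = min(i for i in range(0, n + 2) if 1 - F(i) <= alpha)
    return padded[t + 1], padded[u + 1], F(u) - F(t)

x = [5, 7, 10, 15, 23, 28, 33, 37, 45, 59, 77, 104, 211]
print(centile_point(x, 50))
print(conservative_ci(x, 50, 95))
```

Because $F_i$ is nondecreasing in $i$, searching for the largest rank satisfying $F_t \le \alpha$ is equivalent to the pair of conditions on $t$, and likewise for $u$.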
The default case uses linear interpolation on the $F_i$ as follows. Let
$$g = (\alpha - F_t)/(F_{t+1} - F_t),$$
$$h = [\alpha - (1 - F_u)]/[(1 - F_{u-1}) - (1 - F_u)] = (\alpha - 1 + F_u)/(F_u - F_{u-1}).$$
Then the interpolated lower and upper confidence limits $(c_{qL}, c_{qU})$ for $C_q$ are
$$c_{qL} = x_{t+1} + g \times (x_{t+2} - x_{t+1}),$$
$$c_{qU} = x_{u+1} - h \times (x_{u+1} - x_u).$$
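The interpolation step can be sketched as follows, again in Python with the standard library only. The function name `interpolated_ci` is hypothetical, and the direct summation of binomial probabilities is an assumption made for self-containment, not a description of how the command computes $F_i$.

```python
from math import comb

def interpolated_ci(x, q, p=95):
    """Return (c_qL, c_qU) by linear interpolation on the binomial F_i."""
    xs = sorted(x)
    n = len(xs)
    alpha = (100 - p) / 200
    prob = q / 100

    # cum[i + 1] holds F_i for i = -1, ..., n + 1 (F_{-1} = 0, F_{n+1} = 1)
    cum = [0.0]
    for i in range(n + 1):
        cum.append(cum[-1] + comb(n, i) * prob**i * (1 - prob) ** (n - i))
    cum[n + 1] = 1.0  # F_n is exactly 1
    cum.append(1.0)   # F_{n+1}

    def F(i):
        return cum[i + 1]

    def xk(k):
        # x_k for k = 0, ..., n + 1, with x_0 = x_1 and x_{n+1} = x_n
        return xs[min(max(k, 1), n) - 1]

    t = max(i for i in range(-1, n) if F(i) <= alpha)
    u = min(i for i in range(0, n + 1) if 1 - F(i) <= alpha)

    g = (alpha - F(t)) / (F(t + 1) - F(t))
    h = (alpha - (1 - F(u))) / (F(u) - F(u - 1))

    lower = xk(t + 1) + g * (xk(t + 2) - xk(t + 1))
    upper = xk(u + 1) - h * (xk(u + 1) - xk(u))
    return lower, upper

x = [5, 7, 10, 15, 23, 28, 33, 37, 45, 59, 77, 104, 211]
print(interpolated_ci(x, 50, 95))
```

Both denominators are strictly positive by the definitions of $t$ and $u$, so $g$ and $h$ always lie in $[0, 1)$ and the interpolated limits fall inside the conservative interval $(x_{t+1}, x_{u+1})$.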
For example, suppose we want a 95% confidence interval for the median of a sample of size 13. So $n = 13$, $q = 50$, $p = 95$, $\alpha = .025$, $R = 14 \times 50/100 = 7$, $f = 0$. The median is therefore the 7th observation. Some example data $x_i$ and the values of $F_i$ are as follows:
| $i$ | $F_i$ | $1 - F_i$ | $x_i$ | $i$ | $F_i$ | $1 - F_i$ | $x_i$ |
|-----|--------|--------|-----|-----|--------|--------|-----|
| 0 | 0.0001 | 0.9999 | - | 7 | 0.7095 | 0.2905 | 33 |
| 1 | 0.0017 | 0.9983 | 5 | 8 | 0.8666 | 0.1334 | 37 |
| 2 | 0.0112 | 0.9888 | 7 | 9 | 0.9539 | 0.0461 | 45 |
| 3 | 0.0461 | 0.9539 | 10 | 10 | 0.9888 | 0.0112 | 59 |
| 4 | 0.1334 | 0.8666 | 15 | 11 | 0.9983 | 0.0017 | 77 |
| 5 | 0.2905 | 0.7095 | 23 | 12 | 0.9999 | 0.0001 | 104 |
| 6 | 0.5000 | 0.5000 | 28 | 13 | 1.0000 | 0.0000 | 211 |
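The tabulated values of $F_i$ are the cumulative probabilities of $B(13, 0.5)$ and can be checked with a short Python sketch. The helper name `binomial_cdf_table` is hypothetical and the summation is written out directly rather than taken from any statistics library.

```python
from math import comb

def binomial_cdf_table(n, q):
    """Return [F_0, ..., F_n] for X ~ B(n, q/100), each rounded to 4 places."""
    prob = q / 100
    total, table = 0.0, []
    for i in range(n + 1):
        total += comb(n, i) * prob**i * (1 - prob) ** (n - i)
        table.append(round(total, 4))
    return table

# Print i, F_i and 1 - F_i for the worked example (n = 13, q = 50)
for i, Fi in enumerate(binomial_cdf_table(13, 50)):
    print(f"{i:2d}  {Fi:.4f}  {1 - Fi:.4f}")
```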
The median is $x_7 = 33$. Also, $F_2 < .025$ and $F_3 > .025$ so $t = 2$; $1 - F_{10} \le .025$ and $1 - F_9 > .025$ so $u = 10$. The conservative confidence interval is therefore
$$(c_{50L}, c_{50U}) = (x_3, x_{11}) = (10, 77),$$
with actual coverage $F_{10} - F_2 = .9888 - .0112 = .9776$ (97.8% confidence). For the interpolation calculation, we have
$$g = (.025 - .0112)/(.0461 - .0112) = .395,$$
$$h = (.025 - 1 + .9888)/(.9888 - .9539) = .395.$$
So
$$c_{50L} = x_3 + .395 \times (x_4 - x_3) = 10 + .395 \times 5 = 11.98,$$
$$c_{50U} = x_{11} - .395 \times (x_{11} - x_{10}) = 77 - .395 \times 18 = 69.89.$$
Normal case. The value of $c_q$ is as above. Its s.e. is given by the formula
$$s_q = \sqrt{q(100 - q)} \,/\, [100\, n\, Z(c_q; \bar{x}, s)]$$