14    Stata Technical Bulletin    STB-8


Formul

Default case. I basically use the method of Mood and Graybill (1963, 408). Let $x_1 \le x_2 \le \cdots \le x_n$ be a sample of size $n$ arranged in ascending order. Denote the estimated $q$th centile of the $x$'s as $c_q$. We require that $0 < q < 100$. Let $R = (n + 1)q/100$ have integer part $r$ and fractional part $f$, that is, $r = \mathrm{int}(R)$ and $f = R - r$. (If $R$ is itself an integer, then $r = R$ and $f = 0$.) Note that $0 \le r \le n$. For convenience, define $x_0 = x_1$ and $x_{n+1} = x_n$. Then $C_q$ is estimated by

$$c_q = x_r + f \times (x_{r+1} - x_r),$$

that is, $c_q$ is a weighted average of $x_r$ and $x_{r+1}$. Loosely speaking, a (conservative) $p$% confidence interval for $C_q$ involves finding the observations ranked $t$ and $u$ which correspond respectively to the $\alpha = (100 - p)/200$ and $1 - \alpha$ quantiles of a binomial distribution with parameters $n$ and $q/100$, i.e., $B(n, q/100)$. More precisely, define the $i$th value ($i = 0, \ldots, n$) of the cumulative binomial distribution function to be $F_i = P(X \le i)$, where $X$ has distribution $B(n, q/100)$. For convenience, let $F_{-1} = 0$ and $F_{n+1} = 1$. Then $t$ is found such that $F_t \le \alpha$ and $F_{t+1} > \alpha$, and $u$ is found such that $1 - F_u \le \alpha$ and $1 - F_{u-1} > \alpha$.
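The point estimate defined at the start of this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by the command: the function name `centile_estimate` is hypothetical, and the list is padded at both ends to mimic the conventions for the observation below the first and above the last.

```python
import math

def centile_estimate(xs, q):
    """Sketch of c_q = x_r + f * (x_{r+1} - x_r) with R = (n + 1) * q / 100.

    `centile_estimate` is a hypothetical name used for illustration only.
    """
    if not 0 < q < 100:
        raise ValueError("q must satisfy 0 < q < 100")
    x = sorted(xs)
    if not x:
        raise ValueError("sample must not be empty")
    n = len(x)
    # padded[i] holds x_i in the 1-based notation of the text,
    # with padded[0] = x_1 and padded[n + 1] = x_n.
    padded = [x[0]] + x + [x[-1]]
    R = (n + 1) * q / 100
    r = math.floor(R)   # integer part, 0 <= r <= n
    f = R - r           # fractional part
    return padded[r] + f * (padded[r + 1] - padded[r])
```

With an odd sample size and `q = 50`, `R` is an integer and the function returns the middle observation; for other values of `q` it interpolates between two adjacent order statistics.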

With the cci option in force, the (conservative) confidence interval is $(x_{t+1}, x_{u+1})$ and its actual coverage is $F_u - F_t$.
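As a rough illustration of the conservative interval, the following Python sketch searches the binomial distribution function for the two ranks and returns the limits together with the actual coverage. The names `binom_cdf` and `conservative_ci` are hypothetical, the binomial probabilities are summed directly with `math.comb` (Python 3.8 or later), and the handling of ranks that fall outside the sample is an assumption based on the conventions stated in the text.

```python
import math

def binom_cdf(n, prob):
    """Return [F_0, ..., F_n] with F_i = P(X <= i) for X ~ B(n, prob)."""
    cdf, total = [], 0.0
    for i in range(n + 1):
        total += math.comb(n, i) * prob**i * (1 - prob)**(n - i)
        cdf.append(total)
    cdf[-1] = 1.0  # guard against rounding error in the final sum
    return cdf

def conservative_ci(xs, q, p=95):
    """Sketch: return (lower, upper, coverage) for the q-th centile."""
    if not 0 < q < 100 or not 0 < p < 100:
        raise ValueError("q and p must lie strictly between 0 and 100")
    x = sorted(xs)
    n = len(x)
    alpha = (100 - p) / 200
    cdf = binom_cdf(n, q / 100)

    def F(i):
        # Distribution function with the conventions F_{-1} = 0, F_{n+1} = 1.
        if i < 0:
            return 0.0
        if i > n:
            return 1.0
        return cdf[i]

    def X(i):
        # 1-based order statistic with x_0 = x_1 and x_{n+1} = x_n.
        return x[min(max(i, 1), n) - 1]

    # t is the largest rank with F_t <= alpha (so F_{t+1} > alpha).
    t = max(i for i in range(-1, n + 1) if F(i) <= alpha)
    # u is the smallest rank with 1 - F_u <= alpha (so 1 - F_{u-1} > alpha).
    u = min(i for i in range(0, n + 2) if 1 - F(i) <= alpha)
    return X(t + 1), X(u + 1), F(u) - F(t)
```

Because the binomial distribution is discrete, the returned coverage is generally larger than the nominal level, which is why the interval is called conservative.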

The default case uses linear interpolation on the $F_i$ as follows. Let

$$g = (\alpha - F_t)/(F_{t+1} - F_t),$$

$$h = [\alpha - (1 - F_u)]/[(1 - F_{u-1}) - (1 - F_u)] = (\alpha - 1 + F_u)/(F_u - F_{u-1}).$$

Then the interpolated lower and upper confidence limits $(c_{qL}, c_{qU})$ for $C_q$ are

$$c_{qL} = x_{t+1} + g \times (x_{t+2} - x_{t+1}),$$

$$c_{qU} = x_{u+1} - h \times (x_{u+1} - x_u).$$
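The interpolation step can be sketched in Python as follows. This is an illustrative reading of the formulas rather than the command's own implementation: `interpolated_ci` is a hypothetical name, the ranks are located by a direct search over the binomial distribution function, and the upper weight is computed with the positive denominator (the difference between the distribution function at the upper rank and at the rank below it), which is the form the worked example uses.

```python
import math

def interpolated_ci(xs, q, p=95):
    """Sketch: interpolated (lower, upper) confidence limits for the q-th centile."""
    if not 0 < q < 100 or not 0 < p < 100:
        raise ValueError("q and p must lie strictly between 0 and 100")
    x = sorted(xs)
    n = len(x)
    alpha = (100 - p) / 200
    prob = q / 100

    # Cumulative binomial probabilities F_0, ..., F_n for B(n, q/100).
    cdf, total = [], 0.0
    for i in range(n + 1):
        total += math.comb(n, i) * prob**i * (1 - prob)**(n - i)
        cdf.append(total)
    cdf[-1] = 1.0  # guard against rounding error in the final sum

    def F(i):
        # Conventions F_{-1} = 0 and F_{n+1} = 1.
        if i < 0:
            return 0.0
        if i > n:
            return 1.0
        return cdf[i]

    def X(i):
        # 1-based order statistic with x_0 = x_1 and x_{n+1} = x_n.
        return x[min(max(i, 1), n) - 1]

    # Largest t with F_t <= alpha; smallest u with 1 - F_u <= alpha.
    t = max(i for i in range(-1, n + 1) if F(i) <= alpha)
    u = min(i for i in range(0, n + 2) if 1 - F(i) <= alpha)

    # Interpolation weights; both denominators are strictly positive
    # by the way t and u are chosen.
    g = (alpha - F(t)) / (F(t + 1) - F(t))
    h = (alpha - (1 - F(u))) / (F(u) - F(u - 1))

    lower = X(t + 1) + g * (X(t + 2) - X(t + 1))
    upper = X(u + 1) - h * (X(u + 1) - X(u))
    return lower, upper
```

Applied to the 13-observation sample of the worked example with `q = 50` and `p = 95`, this sketch gives limits of about 11.97 and 69.90. These agree with the hand calculation up to the rounding of the two weights to three decimals.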

For example, suppose we want a 95% confidence interval for the median of a sample of size 13. So $n = 13$, $q = 50$, $p = 95$, $\alpha = .025$, $R = 14 \times 50/100 = 7$, $f = 0$. The median is therefore the 7th observation. Some example data $x_i$ and the values of $F_i$ are as follows:

 i     F_i    1-F_i   x_i       i     F_i    1-F_i   x_i
 0   0.0001  0.9999     -       7   0.7095  0.2905    33
 1   0.0017  0.9983     5       8   0.8666  0.1334    37
 2   0.0112  0.9888     7       9   0.9539  0.0461    45
 3   0.0461  0.9539    10      10   0.9888  0.0112    59
 4   0.1334  0.8666    15      11   0.9983  0.0017    77
 5   0.2905  0.7095    23      12   0.9999  0.0001   104
 6   0.5000  0.5000    28      13   1.0000  0.0000   211

The median is $x_7 = 33$. Also, $F_2 \le .025$ and $F_3 > .025$ so $t = 2$; $1 - F_{10} \le .025$ and $1 - F_9 > .025$ so $u = 10$. The conservative confidence interval is therefore

$$(c_{50L}, c_{50U}) = (x_3, x_{11}) = (10, 77),$$

with actual coverage $F_{10} - F_2 = .9888 - .0112 = .9776$ (97.8% confidence). For the interpolation calculation, we have

$$g = (.025 - .0112)/(.0461 - .0112) = .395,$$

$$h = (.025 - 1 + .9888)/(.9888 - .9539) = .395.$$

So

$$c_{50L} = x_3 + .395 \times (x_4 - x_3) = 10 + .395 \times 5 = 11.98,$$

$$c_{50U} = x_{11} - .395 \times (x_{11} - x_{10}) = 77 - .395 \times 18 = 69.89.$$

normal case. The value of $c_q$ is as above. Its s.e. is given by the formula

$$s_q = \sqrt{q(100 - q)} \,/\, [100\, n\, Z(c_q; \bar{x}, s)]$$


