Stata Technical Bulletin


STB-8


“copied-back” the information for length 37 from length 38 whereas nlsm more agnostically filled in missing (applying two
even-span smoothers results in shifting the data one unit forward, so information for the first observation is lost).

I do not have an explanation for the remaining three differences except to assert that the results reported by nlsm are as
intended, which is not to say that they are necessarily more correct. There is obviously a difference in assumptions about how
the start-up tail is to be handled between the two routines although, interestingly, that difference is not reflected in how the
trailing tail is handled. (Not too much should be made of that, however. Define the function rev() as the function reversing a
sequence, e.g., $\mathrm{rev}(y_t) = y_{N-t+1}$. Let
$S()$ be some smoother. One is tempted to think that $S(y_t) = \mathrm{rev}(S(\mathrm{rev}(y_t)))$. That
is true for median smoothers of odd span, the Hanning smoother, and the end-point rule. It is not, however, true for median
smoothers of even span.)
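
The reversal property can be checked numerically. The following is a minimal Python sketch, not the code of either routine discussed here: the function names median3 and median2 are hypothetical, the end values of the span-3 smoother are simply copied, and the span-2 smoother is assumed to store each result at the later of its two indices, leaving the first position missing.

```python
from statistics import median


def rev(seq):
    """Return the sequence in reverse order."""
    return list(reversed(seq))


def median3(y):
    """Running median of span 3; the two end values are copied unchanged."""
    out = list(y)
    for t in range(1, len(y) - 1):
        out[t] = median(y[t - 1:t + 2])
    return out


def median2(y):
    """Running median of span 2, stored at the later index (assumed convention).

    The first position has no predecessor, so it is left as None (missing).
    """
    return [None] + [median(y[t - 1:t + 1]) for t in range(1, len(y))]


y = [4, 1, 7, 3, 9, 2]

# Odd span: smoothing the reversed series and reversing back changes nothing.
odd_symmetric = median3(y) == rev(median3(rev(y)))

# Even span: the half-unit shift moves the missing value to the other end.
even_symmetric = median2(y) == rev(median2(rev(y)))

print(odd_symmetric, even_symmetric)
```

Under these assumptions the odd-span smoother passes the check and the even-span smoother fails it, because the missing value sits at the start of the series in one case and at the end in the other.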

In any case, the tails produced by any of these smoothers should not be taken too seriously: they are based on too little
data and too many approximations and fix-up rules. The purpose of the smoother is to reveal the pattern in the middle portion
of the data.

References

Salgado-Ugarte, I. H. and J. C. Garcia. 1992. sed7: Resistant smoothing using Stata. Stata Technical Bulletin 7: 8-11.

Tukey, J. W. 1977. Exploratory Data Analysis, Ch. 7. Reading, MA: Addison-Wesley Publishing Company.

Velleman, P. F. 1977. Robust nonlinear data smoothers: Definitions and recommendations. Proc. Natl. Acad. Sci. USA 74(2): 434-436.

Velleman, P. F. 1980. Definition and comparison of robust nonlinear data smoothing algorithms. Journal of the American Statistical Association 75(371): 609-615.

sg1.3 Nonlinear regression command, bug fix

Patrick Royston, Royal Postgraduate Medical School, London, FAX (011)-44-81-740 3119

nlpred incorrectly calculates the predictions and residuals when nl is used with the lnlsq (log least squares) option. The
bug is fixed when the update on the STB-8 diskette is installed. nlpred is used after nl to obtain predicted values and residuals
much as predict is used after regress or fit. The mistake affected only calculations made when the log least squares option
was specified during estimation.

sg7 Centile estimation command

Patrick Royston, Royal Postgraduate Medical School, London, FAX (011)-44-81-740 3119

Stata’s summarize, detail command supplies sample estimates of the 1, 5, 10, 25, 50, 75, 90, 95 and 99th (per)centiles.
To extend summarize, I provide an ado-file for Stata version 3.0 which estimates arbitrary centiles for one or more variables
and calculates confidence intervals, using a choice of methods.

The syntax of centile is

centile [varlist] [if exp] [in range] [, centile(# [# ...]) cci normal meansd level(#) ]

The $q$th centile of a continuous random variable $X$ is defined as the value of $C_q$ which fulfills the condition
$P(X < C_q) = q/100$. The value of $q$ must be in the range $0 < q < 100$, though $q$ is not necessarily an integer. By default,
centile estimates $C_q$ for the variables in varlist and for the value(s) of $q$ given in centile(# ...). It makes no
assumptions as to the distribution of $X$ and, if necessary, uses linear interpolation between neighboring sample values.
Extreme centiles (for example, the 99th centile in samples smaller than 100) are fixed at the minimum or maximum sample
value. An ‘exact’ confidence interval for $C_q$ is also given, using the binomial-based method described below (see Formulæ).
The detailed theory is given by Conover (1980, 111-116). Again, linear interpolation is employed to improve the accuracy of
the estimated confidence limits, but extremes are fixed at the minimum or maximum sample value.
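
The interpolation rule can be illustrated with a short Python sketch. It is not the ado-file itself: the function name centile_interp is hypothetical, and the fractional rank $(n+1)q/100$ is an assumption, since the exact formula belongs to the Formulæ section and is not reproduced in this passage.

```python
def centile_interp(values, q):
    """Estimate the q-th centile by linear interpolation between order statistics.

    Assumption: the fractional rank is (n + 1) * q / 100. Ranks that fall
    outside 1..n are fixed at the sample minimum or maximum.
    """
    if not 0 < q < 100:
        raise ValueError("q must satisfy 0 < q < 100")
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("at least one observation is required")
    rank = (n + 1) * q / 100
    r = int(rank)        # integer part: the lower neighbouring order statistic
    frac = rank - r      # fractional part: interpolation weight
    if r < 1:
        return x[0]      # extreme low centile: fixed at the minimum
    if r >= n:
        return x[-1]     # extreme high centile: fixed at the maximum
    return x[r - 1] + frac * (x[r] - x[r - 1])


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(centile_interp(data, 50))   # middle order statistic
print(centile_interp(data, 25))   # halfway between the 2nd and 3rd values
print(centile_interp(data, 99))   # fixed at the sample maximum
```

With nine observations the 99th centile has a fractional rank beyond the largest order statistic, so the sketch returns the sample maximum, as described above for extreme centiles.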

You can prevent centile from interpolating when calculating binomial-based confidence intervals by specifying the
conservative confidence interval option cci. The resulting intervals are in general wider than with the default; that is, the
coverage (confidence level) tends to be greater than the nominal value (given as usual by level(#), by default 95%).
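
A binomial-based interval without interpolation can be sketched as follows. This is the textbook order-statistic construction, offered as an illustration only and not claimed to match the exact rule used by centile; the names binom_cdf and conservative_ci are hypothetical. It relies on the fact that the number of observations not exceeding $C_q$ is binomial with parameters $n$ and $q/100$, and it picks the two ranks so that the coverage is at least the nominal level.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(B <= k) for B ~ Binomial(n, p); zero when k is negative."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i)
               for i in range(min(k, n) + 1))


def conservative_ci(values, q, level=95):
    """Order-statistic confidence interval for the q-th centile, no interpolation.

    Coverage of (x_(r), x_(s)) is P(r <= B <= s - 1), which the chosen ranks
    keep at or above level/100. Ranks outside 1..n are fixed at the sample
    minimum or maximum.
    """
    x = sorted(values)
    n = len(x)
    p = q / 100
    alpha = 1 - level / 100
    # Largest rank r whose lower-tail probability P(B <= r - 1) is at most alpha/2.
    r = max(k for k in range(n + 1) if binom_cdf(k - 1, n, p) <= alpha / 2)
    # Smallest rank s whose cumulative probability P(B <= s - 1) reaches 1 - alpha/2.
    s = min(k for k in range(1, n + 2) if binom_cdf(k - 1, n, p) >= 1 - alpha / 2)
    return x[max(r, 1) - 1], x[min(s, n) - 1]


print(conservative_ci(range(1, 21), 50))   # interval for the median of 1..20
```

Because the ranks are rounded outward rather than interpolated, the attained coverage is usually above the nominal level, which is the behavior described for the cci option.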

If the data are believed to be normally distributed (a common case), two alternate methods for estimating centiles are offered.
If normal is specified, $C_q$ is calculated as just described, but its confidence interval is based on a formula for the standard
error (s.e.) of a normal-distribution quantile given by Kendall and Stuart (1969, 237). If meansd is alternatively specified,
$C_q$ is estimated as $\bar{x} + z_q \times s$, where $\bar{x}$ and $s$ are the sample mean and standard deviation and $z_q$ is
the $q$th centile of the standard normal distribution (e.g., $z_{95} = 1.645$). The confidence interval is derived from the
s.e. of the estimate of $C_q$.
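
The mean-and-standard-deviation estimate is simple enough to sketch in a few lines of Python. The function name centile_meansd is hypothetical, and only the point estimate is shown; the standard-error formula is not given in this passage, so no confidence interval is attempted.

```python
from statistics import NormalDist, mean, stdev


def centile_meansd(values, q):
    """Estimate the q-th centile as mean + z_q * s under a normality assumption.

    z_q is the q-th centile of the standard normal distribution and s is the
    sample standard deviation (divisor n - 1).
    """
    if not 0 < q < 100:
        raise ValueError("q must satisfy 0 < q < 100")
    z_q = NormalDist().inv_cdf(q / 100)
    return mean(values) + z_q * stdev(values)


z95 = NormalDist().inv_cdf(0.95)   # approximately 1.645, as quoted in the text
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(z95, 3))
print(centile_meansd(sample, 50))  # the 50th centile equals the sample mean
```

Unlike the interpolation-based estimate, this one can fall outside the range of the observed data, since it depends on the sample only through its mean and standard deviation.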


