Stata Technical Bulletin
result of applying 3 an infinite number of times. R should only be used with odd-span smoothers, since even-span smoothers are
not guaranteed to converge.
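To illustrate what repetition means, the following Python sketch (not nlsm's own code) applies a span-3 running median and repeats it until a pass leaves the series unchanged. The function names, the max_passes guard, and the choice to copy in the end points are assumptions made for the example.

```python
def median3(a, b, c):
    """Return the median of three numbers."""
    return sorted((a, b, c))[1]


def smooth3(y):
    """One pass of a span-3 running median; end points are copied in (assumption)."""
    z = list(y)
    for t in range(1, len(y) - 1):
        z[t] = median3(y[t - 1], y[t], y[t + 1])
    return z


def smooth3r(y, max_passes=1000):
    """Repeat smooth3 until a pass changes nothing (the '3R' smoother).

    max_passes is a hypothetical safety limit, not part of the method.
    """
    z = list(y)
    for _ in range(max_passes):
        nxt = smooth3(z)
        if nxt == z:  # converged: another pass would change nothing
            break
        z = nxt
    return z
```

For the series 1, 5, 2, 8, 3, 9, 4, a single pass gives 1, 2, 5, 3, 8, 4, 4, and repetition settles at 1, 2, 3, 4, 4, 4, 4.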
The smoother 453R2 applies a span-4 smoother, followed by a span-5 smoother, followed by repeated applications of a
span-3 smoother, followed by a span-2 smoother.
End-point rule
The end-point rule E modifies the values $z_1$ and $z_N$ according to the formulas:
$z_1 = \mathrm{median}(3z_2 - 2z_3,\ z_1,\ z_2)$
$z_N = \mathrm{median}(3z_{N-2} - 2z_{N-1},\ z_N,\ z_{N-1})$
When the end-point rule is not applied, end-points are typically “copied-in,” i.e., $z_1 = y_1$ and $z_N = y_N$.
Splitting operator
The smoothers 3 and 3R can produce flat-topped hills and valleys. The split operator S is an attempt to eliminate such hills
and valleys by splitting the sequence, applying the end-point rule E, rejoining the series, and then resmoothing by 3R.
The S operator may be applied only after 3, 3R, or S.
It is recommended that the S operator be repeated once (SS) or until no further changes take place (SR).
Hanning smoother
H is the Hanning linear smoother $z_t = (y_{t-1} + 2y_t + y_{t+1})/4$. End points are copied in, $z_1 = y_1$ and $z_N = y_N$. H should
be applied only after all nonlinear smoothers.
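The Hanning formula can be written out directly. The Python sketch below is an illustration under the definition given here, not nlsm's implementation; the function name hanning is an assumption.

```python
def hanning(y):
    """Hanning smoother: z[t] = (y[t-1] + 2*y[t] + y[t+1]) / 4.

    The first and last values are copied in unchanged.
    """
    z = [float(v) for v in y]
    for t in range(1, len(y) - 1):
        z[t] = (y[t - 1] + 2 * y[t] + y[t + 1]) / 4
    return z
```

Because the weights 1/4, 1/2, 1/4 sum to one and are symmetric, a series that lies on a straight line is returned unchanged.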
Twicing
A smoother divides the data into a smooth and a rough; observed = smooth + rough. If the smoothing is successful, the
rough should exhibit no pattern. Twicing refers to applying the smoother to the observed, calculating the rough, and then applying
the smoother to the rough. The resulting “smoothed rough” is then added back to the smooth from the first step.
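The twicing steps can be sketched in Python as follows. This is a minimal illustration, not nlsm's code: the names twice and smooth3 are assumptions, and a single span-3 running median with copied-in end points stands in for whatever smoother is being twiced.

```python
def smooth3(y):
    """Stand-in smoother: span-3 running median, end points copied in."""
    z = list(y)
    for t in range(1, len(y) - 1):
        z[t] = sorted((y[t - 1], y[t], y[t + 1]))[1]
    return z


def twice(smoother, y):
    """Apply `smoother` with twicing.

    observed = smooth + rough; the rough is smoothed in turn and the
    smoothed rough is added back to the first smooth.
    """
    smooth = smoother(y)
    rough = [obs - s for obs, s in zip(y, smooth)]
    smoothed_rough = smoother(rough)
    return [s + r for s, r in zip(smooth, smoothed_rough)]
```

When the first pass already captures all of the pattern, the smoothed rough is zero everywhere and twicing returns the first smooth unchanged.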
Examples
A few examples of how nlsm can be used:
. nlsm 3 coalprdn, gen(smcp)
. nlsm 3r coalprdn, gen(smcp2)
. nlsm 3rss coalprdn, gen(smcp3)
. nlsm 3rssh3rssh3 coalprdn, gen(smcp4)
. nlsm 3rssh,twice coalprdn, gen(smcp5)
. nlsm 4253eh,twice gnp, gen(sgnp)
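To show how a compound smoother name such as 3rssh,twice decomposes into its component operators, here is a hypothetical Python sketch. It is not nlsm's parser: the function name, the return format, and the assumption that spans are single digits 1 to 9 are all illustrative. It checks only the one ordering rule stated above, that S may follow 3, 3R, or S.

```python
def parse_smoother(spec):
    """Split a smoother name into its steps (illustrative sketch only).

    Returns (steps, twice): steps is a list of one-character strings,
    a digit for a running median of that span or one of 'R', 'E', 'S', 'H';
    twice is True when the name ends in ',twice'.
    """
    name, _, option = spec.partition(",")
    option = option.strip().lower()
    if option not in ("", "twice"):
        raise ValueError(f"unknown option: {option!r}")

    steps = []
    for ch in name.strip().upper():
        if ch not in "123456789RESH":  # assumption: single-digit spans
            raise ValueError(f"unknown smoother component: {ch!r}")
        if ch == "S":
            # S may be applied only after 3, 3R, or S.
            after_3 = steps[-1:] == ["3"]
            after_3r = steps[-2:] == ["3", "R"]
            after_s = steps[-1:] == ["S"]
            if not (after_3 or after_3r or after_s):
                raise ValueError("S may follow only 3, 3R, or S")
        steps.append(ch)
    return steps, option == "twice"
```

For instance, parse_smoother("4253eh,twice") yields the steps 4, 2, 5, 3, E, H with twicing requested, while parse_smoother("5s") is rejected because S does not follow 3, 3R, or S.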
Certifications
nlsm has been tested on most of the examples provided in Tukey (1977) and produces identical results to those reported.
Salgado-Ugarte and Garcia (1992) provided in Table 1 the results of applying 4253EH to their length-frequency data. In comparison
to the results calculated by nlsm, the following differences were observed:
Standard body length | Frequency (individuals) | Ugarte-Garcia | nlsm smoothed values
37                   | 6                       | 6.0000        | .
38                   | 10                      | 6.0000        | 6.7500
39                   | 3                       | 6.0000        | 6.3125
40                   | 7                       | 6.0000        | 6.0625
41                   | 5                       | 6.0000        | 6.0000
42                   | 9                       | 5.9375        | 5.9375
For the remaining lengths 43-67, there were no differences in the results. The results are therefore identical for lengths 41-67; only the results
for the first four lengths (37-40) differ. The difference in the first observation is due to a difference of implementation; Ugarte-Garcia