Stata Technical Bulletin
newvarlist is intended mainly for programmers and allows them to store the splines in temporary variables with temporary
names.)
Methods and formulas
The principles and definitions of B-splines are given in de Boor (1978) and Ziegler (1969). Practical applications in chemistry
are described in Wold (1971, 1974). B-splines are also used in signal processing, where they are associated with a wavelet
transformation (Unser et al. 1992).
Splines are a method of defining models regressing a scalar Y-variate with respect to a scalar X-variate. By definition, a
kth-degree spline is defined with reference to a set of $q$ knots $s_1 < s_2 < \cdots < s_q$, dividing the x-axis into intervals of the form
$[s_j, s_{j+1})$. In each of those intervals, the regression is a kth-degree polynomial in X (usually a different one in each interval), but
the polynomials in any two contiguous intervals have the same jth derivatives at the knot separating the two intervals, for $j$ from
zero to $k - 1$. By convention, the zeroth derivative is the function itself, so a zeroth-degree spline is simply a right-continuous
step function, and a first-degree spline is a simple linear interpolation of values between the knots. (The intervals
$[s_j, s_{j+1})$ are closed on the left and open on the right, but this convention only matters for splines of degree zero, which
are therefore right-continuous rather than left-continuous.)
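Splines can be defined using plus-functions. For a power $k$ and a knot $s$, the kth-power plus-function at $s$ is defined as
$$P_k(x; s) = \begin{cases} (x - s)^k, & \text{if } x \ge s \\ 0, & \text{otherwise} \end{cases} \tag{1}$$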
The plus-functions are a basis for the space of splines. That is to say, for any kth-degree spline $S(\cdot)$, with knots $s_1 < s_2 < \cdots < s_q$,
there exists a $q$-vector $a$ such that, for any $x$,
$$S(x) = \sum_{j=1}^{q} a_j P_k(x; s_j) \tag{2}$$
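The expansion in (2) can be sketched in a few lines of Python. This is only an illustration, not part of bspline or frencurv: the function names `plus_function` and `spline_value` are hypothetical, and the plus-function is assumed to be $(x - s)^k$ for $x \ge s$ and zero otherwise, which is the usual right-continuous definition.

```python
def plus_function(x, s, k):
    """k-th power plus-function at knot s (assumed: (x - s)**k if x >= s, else 0)."""
    if x < s:
        return 0.0
    # Python evaluates 0.0 ** 0 as 1.0, so k = 0 yields a right-continuous step.
    return float(x - s) ** k


def spline_value(x, knots, coefs, k):
    """Evaluate S(x) = sum_j a_j * P_k(x; s_j), the expansion in equation (2)."""
    return sum(a * plus_function(x, s, k) for s, a in zip(knots, coefs))


# Hypothetical example: a first-degree spline with knots 0, 1, 2 and
# coefficients 1, -2, 1 rises from 0 to 1 and then falls back to 0.
knots = [0.0, 1.0, 2.0]
coefs = [1.0, -2.0, 1.0]
values = [spline_value(x, knots, coefs, 1) for x in (0.5, 1.0, 1.5, 3.0)]
print(values)  # [0.5, 1.0, 0.5, 0.0]
```

The example coefficients were chosen so that the plus-functions cancel beyond the last knot, which shows how a spline with bounded values can still be written as a sum of unbounded terms.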
It might seem that, to fit a spline in a covariate X to a Y-variate, all we have to do is to define a design matrix $U$ such
that $U_{ij} = P_k(x_i; s_j)$ and fit $\beta$ as a vector of regression coefficients. This is not a good idea for two reasons. First, there are
problems with stability, as $P_k(x; s)$ will be very large for $k > 1$ and $x$ much greater than $s$. Second, the $\beta$-parameters estimated
will not be easy to explain in words to a nonmathematician. The first problem was solved with the introduction of B-splines
by Schoenberg in the 1960s, and these are calculated by bspline. The second problem is solved using frencurv, which calls
bspline and then transforms the B-splines, so that the regression parameters will simply be values of the spline at reference
points.
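The first problem can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not a reproduction of anything in bspline: the helper `plus_design_matrix`, the grid of x-values, and the knots are all hypothetical, and the plus-function is assumed to be $(x - s)^k$ for $x \ge s$ and zero otherwise.

```python
def plus_function(x, s, k):
    """k-th power plus-function at knot s (assumed: (x - s)**k if x >= s, else 0)."""
    if x < s:
        return 0.0
    return float(x - s) ** k


def plus_design_matrix(xs, knots, k):
    """Design matrix U with U[i][j] = P_k(x_i; s_j)."""
    return [[plus_function(x, s, k) for s in knots] for x in xs]


# Hypothetical data: x = 0, 1, ..., 100 with four knots and a cubic spline.
xs = [float(i) for i in range(101)]
knots = [0.0, 25.0, 50.0, 75.0]
U = plus_design_matrix(xs, knots, 3)

# Compare the largest and smallest nonzero entries of U.
nonzero = [u for row in U for u in row if u > 0.0]
largest, smallest = max(nonzero), min(nonzero)
print(largest, smallest)  # 1000000.0 1.0
```

Even on this modest range, the entries of the cubic plus-function design matrix span six orders of magnitude, which is the kind of scaling that makes the regression numerically awkward.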
The B-splines define an alternative basis of the splines with a given set of knots. Ziegler (1969) defines the B-spline for a
set of $k + 2$ knots $s_1 < s_2 < \cdots < s_{k+2}$ as
$$B(x; s_1, \ldots, s_{k+2}) = (k + 1) \sum_{j=1}^{k+2} \left[ \prod_{1 \le h \le k+2,\; h \ne j} (s_h - s_j) \right]^{-1} P_k(x; s_j) \tag{3}$$
The B-spline (3) is positive for $x$ in the open interval $(s_1, s_{k+2})$ and zero for other $x$. If the $s_j$ are part of an extended set of
knots extending forwards to $+\infty$ and backwards to $-\infty$, then the set of B-splines based on sets of $k + 2$ consecutive knots
forms a basis of the set of all kth-degree splines defined on the full set of knots. Figure 1 shows the constant, linear, quadratic,
and cubic B-splines originating at zero and corresponding to unit knots.
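As a check on (3), the following Python sketch evaluates the B-spline directly from the sum over plus-functions. It is an illustration only: the name `ziegler_bspline` is hypothetical, the plus-function is assumed to be $(x - s)^k$ for $x \ge s$ and zero otherwise, and each term of the sum is assumed to be the plus-function divided by the product of the differences $s_h - s_j$ over $h \ne j$.

```python
def plus_function(x, s, k):
    """k-th power plus-function at knot s (assumed: (x - s)**k if x >= s, else 0)."""
    if x < s:
        return 0.0
    return float(x - s) ** k


def ziegler_bspline(x, knots, k):
    """Evaluate B(x; s_1, ..., s_{k+2}) from the sum in equation (3)."""
    if len(knots) != k + 2:
        raise ValueError("a k-th degree B-spline needs exactly k + 2 knots")
    total = 0.0
    for j, s_j in enumerate(knots):
        # Product of (s_h - s_j) over all knots other than s_j.
        denom = 1.0
        for h, s_h in enumerate(knots):
            if h != j:
                denom *= (s_h - s_j)
        total += plus_function(x, s_j, k) / denom
    return (k + 1) * total


# Linear B-spline on unit knots 0, 1, 2: a "hat" peaking at x = 1.
hat = [ziegler_bspline(x, [0.0, 1.0, 2.0], 1) for x in (0.5, 1.0, 1.5)]
print(hat)  # [0.5, 1.0, 0.5]

# Cubic B-spline on unit knots 0, ..., 4, evaluated at its midpoint
# (approximately 2/3, up to floating-point rounding).
cubic_mid = ziegler_bspline(2.0, [0.0, 1.0, 2.0, 3.0, 4.0], 3)
```

Beyond the last knot the terms cancel algebraically, but in floating-point arithmetic the cancellation may leave a small nonzero residue, so an implementation would normally return zero outside $[s_1, s_{k+2})$ explicitly.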
For the purposes of bspline and frencurv, I have taken the liberty of redefining B-splines by scaling the $B(x; s_1, \ldots, s_{k+2})$
in (3) by a factor equal to the mean distance between two consecutive knots to arrive at the scale-invariant B-spline
$$A(x; s_1, \ldots, s_{k+2}) = \frac{s_{k+2} - s_1}{k + 1}\, B(x; s_1, \ldots, s_{k+2}) = \begin{cases} \sum_{j=1}^{k+2} \prod_{h=1}^{k+2} \phi_{jh}(x), & \text{if } s_1 \le x < s_{k+2} \\ 0, & \text{otherwise} \end{cases}$$
where the functions $\phi_{jh}(\cdot)$ are defined by
$$\phi_{jh}(x) = \begin{cases} 1, & \text{if } h = j \\ (s_{k+2} - s_1)/(s_h - s_j), & \text{if } h = j + 1 \\ P_1(x; s_j)/(s_h - s_j), & \text{otherwise} \end{cases} \tag{4}$$
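The two expressions for the scale-invariant B-spline can be compared numerically. The Python sketch below is an illustration, not the code used by bspline: the function names are hypothetical, the plus-function is assumed to be $(x - s)^k$ for $x \ge s$ and zero otherwise, and the "otherwise" case of $\phi_{jh}$ is read as the first-power plus-function $P_1(x; s_j)$ divided by $s_h - s_j$.

```python
def plus_function(x, s, k):
    """k-th power plus-function at knot s (assumed: (x - s)**k if x >= s, else 0)."""
    if x < s:
        return 0.0
    return float(x - s) ** k


def scaled_bspline(x, knots, k):
    """A(x; s_1, ..., s_{k+2}) as the sum over j of the product over h of phi_jh(x)."""
    if len(knots) != k + 2:
        raise ValueError("a k-th degree B-spline needs exactly k + 2 knots")
    if not (knots[0] <= x < knots[-1]):
        return 0.0
    span = knots[-1] - knots[0]
    total = 0.0
    for j, s_j in enumerate(knots):
        product = 1.0
        for h, s_h in enumerate(knots):
            if h == j:
                factor = 1.0
            elif h == j + 1:
                factor = span / (s_h - s_j)
            else:
                # Assumed reading: first-power plus-function over the knot difference.
                factor = plus_function(x, s_j, 1) / (s_h - s_j)
            product *= factor
        total += product
    return total


def scaled_bspline_direct(x, knots, k):
    """The same quantity as (s_{k+2} - s_1) / (k + 1) times B(x; ...) from equation (3)."""
    if not (knots[0] <= x < knots[-1]):
        return 0.0
    total = 0.0
    for j, s_j in enumerate(knots):
        denom = 1.0
        for h, s_h in enumerate(knots):
            if h != j:
                denom *= (s_h - s_j)
        total += plus_function(x, s_j, k) / denom
    return (knots[-1] - knots[0]) * total


# Hypothetical unequally spaced knots for a quadratic B-spline.
knots = [0.0, 1.0, 3.0, 7.0]
grid = [0.25 * i for i in range(28)]  # points in [0, 7)
max_gap = max(abs(scaled_bspline(x, knots, 2) - scaled_bspline_direct(x, knots, 2))
              for x in grid)
```

In `scaled_bspline_direct` the factor $(k + 1)$ in (3) cancels against the $1/(k + 1)$ in the scaling, leaving only the knot span as a multiplier. With unequal knots 0, 2, 6 the linear version still peaks at 1 at the middle knot, which illustrates the scale invariance.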