Stata Technical Bulletin
STB-22
sp_adj is similar to spline except that it permits the user to specify other covariates that are to be included in the
regression model. The smooth is then that part of the “linear predictor” corresponding to the spline basis, and it is this that is
plotted against the x-variable. The smooth thus represents the transformation of the covariate. If the regression is carried out
with a non-linear link, the smooth will not be in the same scale as the original y-variable. For this reason the original data is
not included on the graph.
spbase simply generates the truncated power basis for a natural cubic spline. The basis may then be used to adjust for the
x-variable in a variety of regression models.
A natural cubic spline is a piecewise cubic polynomial that is everywhere twice continuously differentiable. In addition it
is linear beyond the extreme knots; these are taken to be the minimum and maximum of the x-variable. The number or position
of the interior knots may be specified. By default the program selects approximately N^(1/4) knots and places them at equally
spaced percentiles (N is the number of observations).
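The default knot rule can be illustrated with a short Python sketch. This is not the code of the Stata programs: the function name default_interior_knots is hypothetical, and the rounding of N^(1/4) and the percentile interpolation rule are assumptions, since the text only says "approximately".

```python
import math

def default_interior_knots(x):
    """Place about N**(1/4) interior knots at equally spaced percentiles of x.

    Assumptions (not from the source): the knot count is rounded to the
    nearest integer, and percentiles are linearly interpolated between
    order statistics.
    """
    xs = sorted(x)
    n = len(xs)
    k = max(1, round(n ** 0.25))       # number of interior knots
    knots = []
    for i in range(1, k + 1):
        p = i / (k + 1)                # equally spaced fractions in (0, 1)
        pos = p * (n - 1)              # fractional index into the sorted data
        lo = math.floor(pos)
        hi = min(lo + 1, n - 1)
        knots.append(xs[lo] + (pos - lo) * (xs[hi] - xs[lo]))
    return knots

x = list(range(1, 82))                 # 81 observations, so 81**0.25 = 3 knots
print(default_interior_knots(x))       # the quartiles of 1, ..., 81
```

The extreme knots (the minimum and maximum of x) are not returned here; under this sketch they would simply be min(x) and max(x).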
Remarks
1. There is much overlap between the three programs, and a reader who is good at Stata programming may wish to write a
“hidden” ado-file, so that these three programs can be rewritten with each one calling the hidden program.
2. The truncated power basis is easy to define, but computationally unstable. Ideally these programs would be converted to
produce a B-spline basis.
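To make the truncated power construction concrete, here is a minimal Python sketch of one textbook parameterization of a natural cubic spline basis. It is an assumption that this matches the basis the programs generate; the function name natural_spline_terms is hypothetical. With k interior knots it returns k nonlinear terms, which together with x itself and a constant span the natural cubic splines on those knots.

```python
def natural_spline_terms(x, knots):
    """Nonlinear terms of a natural cubic spline basis at a scalar x.

    `knots` is the full sorted knot list: lower extreme knot, interior
    knots, upper extreme knot (at least three values). Returns one term
    per interior knot. Textbook construction, assumed for illustration.
    """
    def cube_plus(u):
        # Truncated cubic: u**3 for u > 0, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    last = knots[-1]

    def d(knot):
        # Scaled difference of truncated cubics; above `last` it is quadratic.
        return (cube_plus(x - knot) - cube_plus(x - last)) / (last - knot)

    ref = d(knots[-2])                 # subtracting this cancels the quadratic part
    return [d(knot) - ref for knot in knots[:-2]]

def second_difference(f, x, h=0.5):
    # Zero (up to rounding) wherever f is linear on [x - h, x + h].
    return f(x + h) - 2 * f(x) + f(x - h)

knots = [0.0, 1.0, 2.0, 3.0]           # two interior knots: 1 and 2
for j in range(2):
    term = lambda x, j=j: natural_spline_terms(x, knots)[j]
    # Beyond the upper extreme knot the term is linear; between knots it is not.
    print(j, second_difference(term, 4.0), second_difference(term, 1.5))
```

Each term is built from cubes of distances to the knots, so the columns can become very large and strongly correlated when x has a wide range, which is one way to see the instability mentioned in Remark 2.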
Syntax
spline yvar xvar [if exp] [in range] [, nknots(#) knots(# ... #) regress(command)
gen(smooth_fit) logit nograph graph_options ]
sp_adj yvar xvar [if exp] [in range] [, nknots(#) knots(# ... #) regress(command)
adjust(varlist) gen(x_transform) nograph graph_options ]
spbase xvar [if exp] [in range] , gen(basis) [ nknots(#) knots(# ... #) ]
Options
gen(smooth_fit) creates a new variable smooth_fit containing the smoothed fitted values from spline. Note that if the logit
option is specified, the values in smooth_fit will be in the logit scale.
gen(x_transform) creates a new variable x_transform containing the component of the linear predictor corresponding to the
spline in xvar produced by sp_adj.
gen(basis) is not optional; it creates k new variables basis1, ..., basisk, where k is the number of internal knots. xvar together
with the k new variables form the spline basis. The names of all the variables in the basis are contained in a macro with
the name of basis.
knots(# ... #) specifies the exact location of the interior knots. The numbers may be separated by spaces and/or commas.
nknots(#) specifies the number of interior knots. nknots is ignored if the locations are specified using knots.
regress(command) selects the estimation command used to fit the model. By default spline and sp_adj both use regress.
The programs have been tested with blogit, bprobit, clogit, cox, glogit, gprobit, logistic, logit, poisson,
and probit. If clogit is used, the stratification variable may be specified in the usual way. Similarly for the censoring
variable with cox. The programs may require modification for use with other estimation commands such as mlogit. The
stratification variable in clogit and the censoring variable in cox may be regarded as “covariates” and for this reason
the options dead and strata may only be used with sp_adj. To use blogit, bprobit, glogit, or gprobit, you must
specify the pos_var as the xvar and the pop_var as the first variable in the varlist of adjust.
adjust(varlist) adds the variables in varlist to the regression, so that the transform of xvar is adjusted. This option is not
required; for instance, one may use sp_adj with reg(clogit) without adjust. To use blogit, bprobit, glogit, or
gprobit, the first variable in the varlist of adjust must be the pop_var.