sample forecasting.18
Some of the extensions alluded to above are designed to make Nelson-Siegel consistent with no-
arbitrage pricing. It is not obvious to us, however, that use of arbitrage-free models is necessary or
desirable for producing good forecasts.19 Indeed we have shown that our model (which is not arbitrage-
free) produces good forecasts, whereas Duffee (2002) and others have recently shown that the popular
affine no-arbitrage models produce very poor forecasts. Moreover, although our model is not
theoretically arbitrage-free, we expect it to be nearly arbitrage-free empirically. Because the U.S.
Treasury bond market is very liquid, Treasury bond yields should themselves be nearly arbitrage-free;
given the very good fit of our model to those yields, its fitted curves should be nearly arbitrage-free as well.
In closing, we would like to elaborate on the likely reason for the forecasting success of our
approach, which relies heavily on a broad interpretation of the shrinkage principle. The essence of our
approach is intentionally to impose substantial a priori structure, motivated by simplicity, parsimony,
and theory, in an explicit attempt to avoid data mining and hence enhance out-of-sample forecasting
ability. This includes our use of a tightly-parametric model that places strict structure on factor loadings
in accordance with simple theoretical desiderata for the discount function, our decision to fix λ, our
emphasis on simple univariate modeling of the factors based upon our theoretically-derived interpretation
of the model as one of approximately orthogonal level, slope and curvature factors, and our emphasis on
the simplest possible AR(1) factor dynamics. All of this is in keeping with a broad interpretation of the
“shrinkage principle,” which has a firm foundation in Bayes-Stein theory, in empirical intuition, and in
an accumulated track record of good performance (e.g., Garcia-Ferrer et al., 1987; Zellner and Hong,
1989; Zellner and Min, 1993). Here we interpret the shrinkage principle as the insight that imposition of
restrictions, which will of course degrade in-sample fit, may nevertheless be helpful for out-of-sample
forecasting, even if the restrictions are false. The fact that the shrinkage principle works in the yield-
curve context, as it does in so many other contexts, is precisely what theory and empirical experience
would lead one to expect. This is not to say, of course, that our specification is in any sense uniquely
best, and we make no claims to that effect. Rather, the broad lesson of the paper is that, in the
yield-curve context, the shrinkage perspective, which tends to produce seemingly naive but truly
sophisticatedly simple models (of which ours is one example), may be very appealing when the goal is
18 See Diebold (2004).
19 See Dai and Singleton (2002) for an interesting analysis that explores certain aspects of the
tradeoff between freedom from arbitrage and forecasting performance.