If we have a single sample for each regime ($q = 1$), or alternatively under the
assumption that all samples derive from the same fixed distribution, parameter estimation reduces to
the minimization of a single KI in which the true model is approximated by only one nonparametric density.
This is the nonparametric equivalent of quasi maximum likelihood estimation (NPQMLE). In this case the
similarity weight is defined as follows:
\[
w_{ij} = \log f_j(x_i/s; \theta)\, \hat{f}_n(x_i), \tag{8}
\]
and hence the maximization problem is given by:
\[
\max_{\theta_{M_j}} \sum_{j} \sum_{i=1}^{N} w_{ij} = \max_{\theta_{M_j}} \sum_{j} \sum_{i=1}^{N} \log f_j(x_i/s; \theta)\, \hat{f}_n(x_i), \tag{9}
\]
since this is the only part of KI that depends on the parameters.⁸
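To make the criterion in (9) concrete, the following is a minimal sketch under assumptions that are not part of the original text: a single Gaussian candidate with parameters `mu` and `sigma`, no conditioning on $s$, a Gaussian kernel, and a bandwidth `h` supplied by the user. All function names are hypothetical. For this particular candidate the weighted criterion has a closed-form maximizer, namely the density-weighted mean and variance.

```python
import math

def kde_at(points, sample, h):
    """Gaussian-kernel density estimate of `sample`, evaluated at each of `points`."""
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((xi - x) / h) ** 2) for xi in sample)
            for x in points]

def gaussian_logpdf(x, mu, sigma):
    """Log-density of the (assumed) Gaussian candidate model."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)

def npqmle_criterion(sample, mu, sigma, h):
    """Sum over i of log f(x_i; theta) times the kernel density estimate at x_i."""
    weights = kde_at(sample, sample, h)
    return sum(w * gaussian_logpdf(x, mu, sigma) for x, w in zip(sample, weights))

def npqmle_gaussian(sample, h):
    """Closed-form maximizer of the weighted criterion for a Gaussian candidate."""
    weights = kde_at(sample, sample, h)
    total = sum(weights)
    mu = sum(w * x for x, w in zip(sample, weights)) / total
    var = sum(w * (x - mu) ** 2 for x, w in zip(sample, weights)) / total
    return mu, math.sqrt(var)
```

If all weights were equal, the same formulas would return the ordinary sample mean and variance, which is the Gaussian QMLE; the only change here is the density-based weighting.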
In this approach, as in QMLE, the criterion functional to be maximized is
\[
\int \log f_j(x/s; \theta)\, dF_n(x).
\]
But while in QMLE the weighting function $F_n(x)$ is chosen to be equal to
\[
F_n(x) = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}_{(-\infty, x]}(x_i),
\]
such that the empirical criterion becomes
\[
\frac{1}{N} \sum_{i=1}^{N} \log f_j(x_i/s; \theta);
\]
in NPQMLE, $F_n(x)$ is chosen to have the following form
\[
F_n(x) = \int_{-\infty}^{x} \hat{f}_n(u)\, du = \frac{1}{nh} \sum_{i=1}^{n \le N} \int_{-\infty}^{x} K_h(x_i - u)\, du,
\]
such that the empirical criterion becomes equation (2). This implies that not every observation receives a
mass equal to $1/N$. Instead, each observation is rescaled by a smoother weight that depends on the
bandwidth parameter $h$ and the kernel function $K$. It is this different weighting scheme that allows a
'better' finite-sample performance than QMLE. In particular, as documented through a limited
set of Monte Carlo experiments, the parameter estimates obtained using NPQMLE deliver a KI between the
true model and the parametric candidate that is smaller than that obtained using QMLE.
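The difference between the two weighting schemes can be illustrated numerically. The sketch below assumes a Gaussian kernel and a user-chosen bandwidth `h`; the function names are hypothetical, and the NPQMLE weights are normalized to sum to one purely to make them comparable with $1/N$ (this rescaling does not change the maximizer of the criterion).

```python
import math

def kde_at(points, sample, h):
    """Gaussian-kernel density estimate of `sample`, evaluated at each of `points`."""
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((xi - x) / h) ** 2) for xi in sample)
            for x in points]

def criterion_weights(sample, h):
    """Return (QMLE weights, normalized NPQMLE weights) for each observation."""
    n = len(sample)
    qmle = [1.0 / n] * n                     # empirical CDF: equal mass on every x_i
    density = kde_at(sample, sample, h)      # kernel density estimate at each x_i
    total = sum(density)
    npqmle = [d / total for d in density]    # normalized only for comparison (assumption)
    return qmle, npqmle

# Four clustered observations and one isolated observation
qmle_w, npqmle_w = criterion_weights([0.0, 0.1, 0.2, 0.3, 5.0], h=0.5)
```

In this toy sample the isolated observation receives less than $1/N$ under the kernel-based scheme, while the clustered observations receive more.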
Finally, in this context the best model is obtained by solving the following problem:
\[
\inf_{\{j:1,\dots,J\}} \; \min_{\theta_{M_j} \in \Theta} \; \sum_{i=1}^{N} \left[ \log \hat{f}_n(x_i) - \log f_j(x_i/s; \theta) \right] \hat{f}_n(x_i). \tag{10}
\]
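The selection step in (10) can be sketched as a loop over candidate families: fit each candidate by maximizing its weighted log-likelihood, evaluate the resulting KI-type score, and keep the smallest. The example below is only illustrative and rests on several assumptions not made in the text: two hypothetical candidates (Gaussian and Laplace) with closed-form weighted fits, no conditioning on $s$, a Gaussian kernel, and a user-supplied bandwidth `h`.

```python
import math

def kde_at(points, sample, h):
    """Gaussian-kernel density estimate of `sample`, evaluated at each of `points`."""
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((xi - x) / h) ** 2) for xi in sample)
            for x in points]

def fit_gaussian(sample, w):
    """Weighted Gaussian fit; returns the fitted log-density."""
    total = sum(w)
    mu = sum(wi * x for x, wi in zip(sample, w)) / total
    var = sum(wi * (x - mu) ** 2 for x, wi in zip(sample, w)) / total
    return lambda x: -0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)

def fit_laplace(sample, w):
    """Weighted Laplace fit (weighted median and mean absolute deviation)."""
    total = sum(w)
    acc, m = 0.0, None
    for x, wi in sorted(zip(sample, w)):
        acc += wi
        if acc >= 0.5 * total:               # first point where half the weight is reached
            m = x
            break
    b = sum(wi * abs(x - m) for x, wi in zip(sample, w)) / total
    return lambda x: -math.log(2.0 * b) - abs(x - m) / b

def select_model(sample, h, candidates):
    """Score each candidate by sum_i [log f_n(x_i) - log f_j(x_i)] f_n(x_i); smallest wins."""
    w = kde_at(sample, sample, h)
    scores = {}
    for name, fit in candidates.items():
        logpdf = fit(sample, w)
        scores[name] = sum((math.log(wi) - logpdf(x)) * wi for x, wi in zip(sample, w))
    best = min(scores, key=scores.get)
    return best, scores

sample = [-2.1, -0.9, -0.3, 0.0, 0.2, 0.4, 1.1, 1.8]
best, scores = select_model(sample, h=0.6,
                            candidates={"gaussian": fit_gaussian, "laplace": fit_laplace})
```

Which candidate is selected depends on the data and on the bandwidth, so the sketch returns all scores alongside the winner rather than reporting a single verdict.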
⁸ Minimizing the functional $\int \left[ \log \hat{f}_n(x) - \log f_j(x/s; \theta) \right] dF_n(x)$ or maximizing $\int \log f_j(x/s; \theta)\, dF_n(x)$ with respect to $\theta$ yields the
same result.