Density Estimation and Combination under Model Ambiguity



For these reasons, it seems natural to determine $w_{qj}$ as the negative of the distance between the nonparametric density $\hat{f}_{nq}(x \mid s)$ and the model $f_j(x \mid s; \theta)$:

\[
w_{qj} = -\mathrm{KI}\big(\hat{f}_{nq}(x),\, f_j(x \mid s; \theta)\big), \tag{1}
\]

where $\mathrm{KI}(\hat{f}_{nq}(x), f_j(x \mid s; \theta))$ is the Kullback-Leibler distance⁵, whose empirical version in this study is defined as follows:

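\[
\mathrm{KI}_{qj} = \sum_{i=1}^{N_q} \hat{f}_{nq}(x_i) \log\left\{ \frac{\hat{f}_{nq}(x_i)}{f_j(x_i; \theta)} \right\}, \tag{2}
\]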

where $i$ indexes the observations contained in sample $q$. For simplicity, I have dropped the index for the regime $s$.
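As a rough illustration of the empirical distance in equation (2), the sketch below evaluates it for a single sample, assuming the log term is the usual Kullback-Leibler ratio of the nonparametric density to the model density. Everything beyond that formula is an assumption made for the example: the Gaussian kernel estimator standing in for the nonparametric density, the fixed bandwidth, the normal candidate model, the made-up sample values, and the helper names `gaussian_kde`, `normal_pdf`, and `empirical_kl`. It uses only the Python standard library.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a Gaussian kernel density estimate of `sample` (illustrative choice)."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample) / norm

    return density


def normal_pdf(x, mu, sigma):
    """Density of a normal candidate model with parameters theta = (mu, sigma)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def empirical_kl(sample, f_hat, f_model):
    """Sum over observations x_i of f_hat(x_i) * log(f_hat(x_i) / f_model(x_i))."""
    total = 0.0
    for x in sample:
        p = f_hat(x)
        total += p * math.log(p / f_model(x))
    return total


# Hypothetical sample q (made-up numbers, not data from the study).
sample_q = [-1.2, -0.5, -0.1, 0.0, 0.3, 0.4, 0.9, 1.5]
f_hat_q = gaussian_kde(sample_q, bandwidth=0.5)

# Two hypothetical parameter values for the same normal model.
kl_near = empirical_kl(sample_q, f_hat_q, lambda x: normal_pdf(x, 0.0, 1.0))
kl_far = empirical_kl(sample_q, f_hat_q, lambda x: normal_pdf(x, 3.0, 1.0))

# Weights as in equation (1): the negative of the distance.
w_near, w_far = -kl_near, -kl_far
```

In this example the model centred near the data receives the larger weight. Note that the empirical sum, unlike the population Kullback-Leibler distance, is not guaranteed to be nonnegative.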

If the values of the optimal parameters were known, the prediction rule, which ranks the plausibility of each model by the sum of its weights over the past cases, would lead us to choose $f_1$ rather than $f_2$ as the predictive density if and only if:

\[
\sum_{q \in C_s} w_{q1} > \sum_{q \in C_s} w_{q2}, \tag{3}
\]

(where $C_s$ is a partition of $C$ and represents the set of past cases relative to regime $s$) or equivalently:

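\[
\sum_{q \in C_s} \mathrm{KI}\big(\hat{f}_{nq}(x),\, f_1(x \mid s; \theta)\big) < \sum_{q \in C_s} \mathrm{KI}\big(\hat{f}_{nq}(x),\, f_2(x \mid s; \theta)\big). \tag{4}
\]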
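To make the step from (3) to (4) explicit: substituting the definition of the weights in (1) turns each sum of weights into the negative of a sum of distances, and multiplying through by -1 reverses the inequality. The notation follows the surrounding equations, with the hat marking the nonparametric estimate.

```latex
\begin{align*}
\sum_{q \in C_s} w_{q1} > \sum_{q \in C_s} w_{q2}
&\iff -\sum_{q \in C_s} \mathrm{KI}\big(\hat{f}_{nq}(x),\, f_1(x \mid s; \theta)\big)
      > -\sum_{q \in C_s} \mathrm{KI}\big(\hat{f}_{nq}(x),\, f_2(x \mid s; \theta)\big) \\
&\iff \sum_{q \in C_s} \mathrm{KI}\big(\hat{f}_{nq}(x),\, f_1(x \mid s; \theta)\big)
      < \sum_{q \in C_s} \mathrm{KI}\big(\hat{f}_{nq}(x),\, f_2(x \mid s; \theta)\big).
\end{align*}
```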

The sum of the weights relative to model $f_1$ can be interpreted, as in Gilboa and Schmeidler (2001), as the “aggregate similarity or plausibility” of model $f_1$. However, as the values of the optimal parameters are unknown, it is necessary to estimate them. Since the model with the largest aggregate similarity to past cases is the most appropriate for achieving a good prediction, the candidate model’s parameters $\theta_{M_j s}$ are obtained in the following way:

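\[
\max_{\theta_{M_j s}} \sum_{q \in C_s} w_{qj} \;\Longleftrightarrow\; \min_{\theta_{M_j s}} \sum_{q \in C_s} \mathrm{KI}\big(\hat{f}_{nq}(x),\, f_j(x \mid s; \theta)\big). \tag{5}
\]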

The minimization of the sum of these pseudo-distances yields the optimal minimum contrast (MC) estimates⁶ of the parameters that characterize the a priori distributions. This method gives us the opportunity to extract the information contained in a nonparametric estimate while preserving the simplicity of a parametric model. This goal can be achieved by density matching: the optimal model is derived to be consistent with the observed distribution of the data⁷.
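A minimal sketch of the minimum contrast step in equation (5), assuming a normal candidate model, a Gaussian kernel estimate for each past sample, and a coarse grid search in place of a proper numerical optimizer. These choices, the function names, and the sample values are illustrative assumptions, not the procedure used in the study.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a Gaussian kernel density estimate of `sample` (illustrative choice)."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample) / norm

    return density


def normal_log_pdf(x, mu, sigma):
    """Log-density of the normal candidate model, theta = (mu, sigma)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def summed_contrast(theta, weighted_points):
    """Sum over samples q and observations i of f_hat(x_i) * log(f_hat(x_i) / f_j(x_i; theta))."""
    mu, sigma = theta
    return sum(p * (math.log(p) - normal_log_pdf(x, mu, sigma)) for x, p in weighted_points)


def mc_estimate(samples, bandwidth, mu_grid, sigma_grid):
    """Grid-search minimizer of the summed empirical distance over all past samples."""
    # Evaluate each sample's own kernel estimate at its own observations, once.
    weighted_points = []
    for sample in samples:
        f_hat = gaussian_kde(sample, bandwidth)
        weighted_points.extend((x, f_hat(x)) for x in sample)
    candidates = [(mu, sigma) for mu in mu_grid for sigma in sigma_grid]
    theta_hat = min(candidates, key=lambda theta: summed_contrast(theta, weighted_points))
    return theta_hat, summed_contrast(theta_hat, weighted_points)


# Two hypothetical past samples from the same regime (made-up numbers).
samples = [
    [1.0, 1.6, 1.9, 2.0, 2.1, 2.4, 3.0],
    [1.5, 1.8, 2.0, 2.2, 2.5],
]
mu_grid = [i / 10 for i in range(0, 41)]      # 0.0, 0.1, ..., 4.0
sigma_grid = [i / 10 for i in range(1, 21)]   # 0.1, 0.2, ..., 2.0

theta_hat, contrast = mc_estimate(samples, 0.4, mu_grid, sigma_grid)
```

Because the term in $\log \hat{f}_{nq}(x_i)$ does not depend on $\theta$, the minimizer is the same as that of a likelihood weighted by the kernel estimate, which is one way to read the density-matching interpretation.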

It follows then that the rank of the competing models is obtained as follows:

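\[
f_1 \succ f_2 \iff \min_{\theta_{M_1} \in \Theta} \sum_{q \in C_s} \mathrm{KI}\big(\hat{f}_{nq}(x),\, f_1(x \mid s; \theta)\big) < \min_{\theta_{M_2} \in \Theta} \sum_{q \in C_s} \mathrm{KI}\big(\hat{f}_{nq}(x),\, f_2(x \mid s; \theta)\big), \tag{6}
\]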


which in turn implies that the best model can be represented by the following prediction rule:

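\[
\inf_{j \in \{1, \dots, J\}} \left\{ \min_{\theta_{M_j} \in \Theta} \sum_{q \in C_s} \mathrm{KI}\big(\hat{f}_{nq}(x),\, f_j(x \mid s; \theta)\big) \right\}. \tag{7}
\]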
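The prediction rule in (7) can be sketched as a nested minimization: an inner search over each model's parameters and an outer choice of the model with the smallest minimized sum. The example below assumes two candidate families (normal and Laplace), a Gaussian kernel estimate for each past sample, and a shared parameter grid. All of these, along with the function names and sample values, are hypothetical stand-ins chosen for illustration.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a Gaussian kernel density estimate of `sample` (illustrative choice)."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample) / norm

    return density


def normal_log_pdf(x, theta):
    mu, sigma = theta
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def laplace_log_pdf(x, theta):
    loc, scale = theta
    return -abs(x - loc) / scale - math.log(2.0 * scale)


def min_summed_contrast(log_pdf, grid, weighted_points):
    """Inner step: minimize the summed empirical distance over theta for one model."""

    def contrast(theta):
        return sum(p * (math.log(p) - log_pdf(x, theta)) for x, p in weighted_points)

    theta_hat = min(grid, key=contrast)
    return contrast(theta_hat), theta_hat


def select_model(models, samples, bandwidth):
    """Outer step: pick the model whose minimized summed distance is smallest."""
    weighted_points = []
    for sample in samples:
        f_hat = gaussian_kde(sample, bandwidth)
        weighted_points.extend((x, f_hat(x)) for x in sample)
    scores = {
        name: min_summed_contrast(log_pdf, grid, weighted_points)[0]
        for name, (log_pdf, grid) in models.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Shared (location, scale) grid and hypothetical past samples (made-up numbers).
grid = [(loc / 10, scale / 10) for loc in range(0, 41) for scale in range(1, 21)]
models = {"normal": (normal_log_pdf, grid), "laplace": (laplace_log_pdf, grid)}
samples = [
    [1.0, 1.6, 1.9, 2.0, 2.1, 2.4, 3.0],
    [1.5, 1.8, 2.0, 2.2, 2.5],
]

best_model, scores = select_model(models, samples, bandwidth=0.4)
```

Since both families are searched on the same grid, the comparison reflects the grid resolution as well as the models themselves; a finer grid or a numerical optimizer would be the natural refinement.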


5 Many other distances could be chosen; on this point, see Ullah A. (1996).

6 See Dhrymes P. J. (1994), p. 282.

7 See Aït-Sahalia Y. (1996).


