Density Estimation and Combination under Model Ambiguity

For these reasons, it seems natural to determine $w_{qj}$ by the negative of the distance between the
nonparametric density $\hat{f}_{nq}(x/s)$ and the model $f_j(x/s; \theta)$:

$$w_{qj} = -KI\big(\hat{f}_{nq}(x),\, f_j(x/s; \theta)\big), \qquad (1)$$

where $KI(\hat{f}_{nq}(x), f_j(x/s; \theta))$ is the Kullback-Leibler distance⁵, whose empirical version in this study is
defined as follows:

$$KI_{qj} = \sum_{i} \hat{f}_{nq}(x_i) \log\left\{ \frac{\hat{f}_{nq}(x_i)}{f_j(x_i, \theta)} \right\}, \qquad (2)$$

where i is the index for all observations contained in a sample q. For simplicity I dropped the index relative
to the regime s.
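
As a rough illustration of equations (1) and (2), the following sketch computes the empirical contrast for one sample. It reads equation (2) as the sum over observations of the estimated density times the log ratio of the estimated density to the model density. The Gaussian kernel estimator, the bandwidth, the data and the two normal candidate models are assumptions made for this example only; they are not taken from the paper.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a N(mu, sigma^2) distribution evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def kernel_density(sample, bandwidth):
    """Return a Gaussian kernel density estimate (an assumed stand-in for f_hat_nq)."""
    n = len(sample)

    def f_hat(x):
        return sum(gaussian_pdf(x, xi, bandwidth) for xi in sample) / n

    return f_hat


def empirical_kl(sample, f_hat, model_pdf):
    """Equation (2) as read here: sum_i f_hat(x_i) * log(f_hat(x_i) / f_j(x_i))."""
    total = 0.0
    for xi in sample:
        p = f_hat(xi)
        total += p * math.log(p / model_pdf(xi))
    return total


# Hypothetical sample q and two hypothetical candidate models.
sample_q = [-1.5, -0.8, -0.3, 0.0, 0.2, 0.7, 1.1, 1.6]
f_hat_q = kernel_density(sample_q, bandwidth=0.5)
kl_near = empirical_kl(sample_q, f_hat_q, lambda x: gaussian_pdf(x, 0.0, 1.0))
kl_far = empirical_kl(sample_q, f_hat_q, lambda x: gaussian_pdf(x, 5.0, 1.0))

# Weights as in equation (1): the negative of the distance.
w_near, w_far = -kl_near, -kl_far
```

In this toy setting the model centred near the data receives the smaller contrast and therefore the larger weight. Note that the empirical sum is not an integral, so it need not be non-negative.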

If the values of the optimal parameters were known, the prediction rule, which ranks the plausibility of each
model by the sum of its weights over the past cases, would lead us to choose $f_1$ rather than $f_2$ as the
predictive density if and only if:

$$\sum_{q \in C_s} w_{q1} > \sum_{q \in C_s} w_{q2}, \qquad (3)$$

(where $C_s$ is a partition of $C$ and represents the set of past cases relative to regime $s$) or equivalently:

$$\sum_{q \in C_s} KI\big(\hat{f}_{nq}(x),\, f_1(x/s; \theta)\big) < \sum_{q \in C_s} KI\big(\hat{f}_{nq}(x),\, f_2(x/s; \theta)\big). \qquad (4)$$
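
The equivalence between (3) and (4) can be spelled out by substituting definition (1) into (3) and multiplying both sides by $-1$, which reverses the inequality. The hat on the nonparametric density is notation assumed here for the estimate:

```latex
\begin{align*}
\sum_{q \in C_s} w_{q1} > \sum_{q \in C_s} w_{q2}
&\iff -\sum_{q \in C_s} KI\big(\hat{f}_{nq}(x),\, f_1(x/s; \theta)\big)
      > -\sum_{q \in C_s} KI\big(\hat{f}_{nq}(x),\, f_2(x/s; \theta)\big) \\
&\iff \sum_{q \in C_s} KI\big(\hat{f}_{nq}(x),\, f_1(x/s; \theta)\big)
      < \sum_{q \in C_s} KI\big(\hat{f}_{nq}(x),\, f_2(x/s; \theta)\big).
\end{align*}
```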

The sum of the weights relative to model $f_1$ can be interpreted, as in Gilboa and Schmeidler (2001), as
the “aggregate similarity or plausibility” of model $f_1$. However, since the values of the optimal parameters
are unknown, they must be estimated. Since the model with the largest aggregate similarity to past cases
is the most appropriate for achieving a good prediction, the candidate model’s parameters $\hat{\theta}^{M}_{js}$ are obtained
in the following way:

$$\max_{\hat{\theta}^{M}_{js}} \sum_{q \in C_s} w_{qj} = \min_{\hat{\theta}^{M}_{js}} \sum_{q \in C_s} KI\big(\hat{f}_{nq}(x),\, f_j(x/s; \theta)\big). \qquad (5)$$

The minimization of the sum of these pseudo-distances allows us to obtain the optimal minimum contrast
(MC) estimates⁶ of the parameters that characterize the a priori distributions. This method gives us the
opportunity to extract the information contained in a nonparametric estimate, while preserving the simplicity
of a parametric model. This goal can be achieved by density-matching: the optimal model is derived to be
consistent with the observed distribution of the data⁷.
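
A minimal sketch of the minimum contrast step may help fix ideas. Because each weight is the negative of the corresponding distance, maximizing the summed weights and minimizing the summed distances select the same parameter values. The normal candidate family, the Gaussian kernel estimator, the bandwidth, the grid search and the data below are all assumptions made for illustration; the paper does not specify an optimizer.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a N(mu, sigma^2) distribution evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def density_weights(cases, bandwidth):
    """Pool (x_i, f_hat_q(x_i)) pairs, using one Gaussian kernel estimate per case q."""
    pairs = []
    for sample in cases:
        n = len(sample)
        for xi in sample:
            f_hat = sum(normal_pdf(xi, xk, bandwidth) for xk in sample) / n
            pairs.append((xi, f_hat))
    return pairs


def summed_contrast(pairs, mu, sigma):
    """Sum over cases and observations of f_hat(x_i) * log(f_hat(x_i) / f_j(x_i; theta))."""
    return sum(p * math.log(p / normal_pdf(x, mu, sigma)) for x, p in pairs)


def minimum_contrast_fit(cases, bandwidth, mu_grid, sigma_grid):
    """Grid search for the (mu, sigma) pair with the smallest summed contrast."""
    pairs = density_weights(cases, bandwidth)
    return min(
        (summed_contrast(pairs, mu, sigma), mu, sigma)
        for mu in mu_grid
        for sigma in sigma_grid
    )


# Hypothetical past cases for one regime s, each symmetric around 2.
cases_s = [[1.0, 1.5, 2.0, 2.5, 3.0], [1.2, 1.8, 2.0, 2.2, 2.8]]
mu_grid = [0.5 * k for k in range(9)]          # 0.0, 0.5, ..., 4.0
sigma_grid = [0.25 * k for k in range(1, 9)]   # 0.25, 0.5, ..., 2.0
best_value, mu_hat, sigma_hat = minimum_contrast_fit(cases_s, 0.4, mu_grid, sigma_grid)
```

A grid search is used only to keep the sketch dependency-free; a numerical optimizer would normally replace it. Since both toy samples are symmetric around 2, the fitted location lands on the grid point 2.0.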

It follows then that the rank of the competing models is obtained as follows:

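$$f_1 \succ f_2 \iff \min_{\hat{\theta}^{M}_{1} \in \Theta} \sum_{q \in C_s} KI\big(\hat{f}_{nq}(x),\, f_1(x/s; \theta)\big) < \min_{\hat{\theta}^{M}_{2} \in \Theta} \sum_{q \in C_s} KI\big(\hat{f}_{nq}(x),\, f_2(x/s; \theta)\big), \qquad (6)$$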

which in turn implies that the best model can be represented by the following prediction rule:

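$$\inf_{j \in \{1, \dots, J\}} \left\{ \min_{\hat{\theta}^{M}_{j} \in \Theta} \sum_{q \in C_s} KI\big(\hat{f}_{nq}(x),\, f_j(x/s; \theta)\big) \right\} \qquad (7)$$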
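
The ranking in (6) and the prediction rule in (7) can be sketched as a two-level search: an inner minimization over each model's parameters and an outer choice of the model with the smallest minimized contrast. The two candidate families (normal and Laplace), the kernel estimator, the grids and the data are hypothetical choices for this example, not the models studied in the paper.

```python
import math


def normal_pdf(x, loc, scale):
    """Density of a normal distribution with mean loc and standard deviation scale."""
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))


def laplace_pdf(x, loc, scale):
    """Density of a Laplace distribution with location loc and scale parameter scale."""
    return math.exp(-abs(x - loc) / scale) / (2.0 * scale)


def density_weights(cases, bandwidth):
    """Pool (x_i, f_hat_q(x_i)) pairs, using one Gaussian kernel estimate per case q."""
    pairs = []
    for sample in cases:
        n = len(sample)
        for xi in sample:
            f_hat = sum(normal_pdf(xi, xk, bandwidth) for xk in sample) / n
            pairs.append((xi, f_hat))
    return pairs


def min_contrast(pairs, pdf, loc_grid, scale_grid):
    """Inner step: smallest summed contrast over the parameter grid for one family."""
    return min(
        (sum(p * math.log(p / pdf(x, loc, scale)) for x, p in pairs), loc, scale)
        for loc in loc_grid
        for scale in scale_grid
    )


def select_model(cases, bandwidth, families, loc_grid, scale_grid):
    """Outer step: fit every family, then pick the one with the smallest minimum."""
    pairs = density_weights(cases, bandwidth)
    fitted = {
        name: min_contrast(pairs, pdf, loc_grid, scale_grid)
        for name, pdf in families.items()
    }
    best = min(fitted, key=lambda name: fitted[name][0])
    return fitted, best


# Hypothetical past cases for one regime s and two hypothetical candidate families.
cases_s = [[1.0, 1.5, 2.0, 2.5, 3.0], [1.2, 1.8, 2.0, 2.2, 2.8]]
families = {"normal": normal_pdf, "laplace": laplace_pdf}
loc_grid = [0.5 * k for k in range(9)]
scale_grid = [0.25 * k for k in range(1, 9)]
fitted, best = select_model(cases_s, 0.4, families, loc_grid, scale_grid)
```

Here `fitted` maps each family name to its minimized contrast and the parameters that attain it, and `best` names the family that rule (7) would select on this toy data.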

⁵ Many other distances could be chosen; on this point see Ullah A. (1996).

⁶ See Dhrymes P. J. (1994), p. 282.

⁷ See Aït Sahalia Y. (1996).
