The Generalized Maximum Entropy (GME) method allows us to consistently and efficiently
estimate equations with nonnegativity constraints and relatively few degrees of freedom
without imposing restrictions on the error process. The GME estimates are robust even if
errors are not normal and the exogenous variables are correlated. Entropy is used to measure
the uncertainty (state of knowledge) we have about the occurrence of a collection of events.
Given a random variable x with possible outcomes x_s, s = 1, 2, ..., N, occurring with probabilities π_s, the entropy of the distribution π = (π_1, ..., π_N) is

S(π) = −∑_{s=1}^{N} π_s ln π_s ,   with   ∑_{s=1}^{N} π_s = 1
S reaches its maximum, ln N, when all the π_s are equal, and its minimum of zero when one of the π_s is equal to one.
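For instance, a minimal check in Python (the four-outcome distributions are illustrative, not taken from the data):

import numpy as np

def entropy(pi):
    # Shannon entropy S(pi) = -sum_s pi_s ln pi_s, with 0 ln 0 taken as 0
    pi = np.asarray(pi, dtype=float)
    nz = pi[pi > 0]
    return -np.sum(nz * np.log(nz))

print(entropy([0.25, 0.25, 0.25, 0.25]))  # uniform: ln 4 ≈ 1.386, the maximum
print(entropy([1.0, 0.0, 0.0, 0.0]))      # degenerate: 0, the minimum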
In order to recover the unknown π_s that characterize M moments of a given dataset, one can maximize entropy subject to sample moment information. Consider a sample of T draws of an independently and identically distributed random variable x that can take N values x_s with probabilities π_s. The number of times x_s occurs, f_s, defines a vector of outcomes (f_1, ..., f_N) such that ∑_s f_s = T. Maximizing the entropy S amounts to selecting the vector of outcomes that is most likely to be drawn. The frequency that maximizes entropy is an estimate of the true distribution, which can be altered by extra (sample or non-sample) information. The ME method thus picks the distribution consistent with the data that is closest to the uniform distribution.
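To make this step concrete, the following sketch recovers the π_s by maximizing entropy subject to a single moment constraint (the mean); the four support points, the sample mean of 2.5 and the SLSQP solver are hypothetical choices, not taken from the paper.

import numpy as np
from scipy.optimize import minimize

xs = np.array([1.0, 2.0, 3.0, 4.0])    # possible outcomes x_s
sample_mean = 2.5                      # observed sample moment

def neg_entropy(pi):
    # negative entropy; the small epsilon guards against log(0)
    return np.sum(pi * np.log(pi + 1e-12))

constraints = [
    {"type": "eq", "fun": lambda pi: np.sum(pi) - 1.0},        # adding-up
    {"type": "eq", "fun": lambda pi: pi @ xs - sample_mean},   # moment condition
]
pi0 = np.full(len(xs), 1.0 / len(xs))  # start from the uniform distribution
res = minimize(neg_entropy, pi0, method="SLSQP",
               bounds=[(0.0, 1.0)] * len(xs), constraints=constraints)
print(res.x)  # ME probabilities consistent with the mean constraint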
The GME approach uses each observation, treating the moment conditions as stochastic restrictions. The parameters to be estimated are expressed as probabilities, using a support space (i.e. a set of discrete points spanning symmetric, uniformly spaced intervals) and a vector of corresponding unknown weights. The GME estimator maximizes the joint entropy of all the probabilities representing the parameters to be estimated and the error terms, subject to the data and the various constraints. The desirable properties described by Golan et al. (2001) include the ability to impose nonlinear and inequality constraints, and the efficiency of the estimator in small samples.
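For a linear model y_t = ∑_k x_{tk} β_k + ε_t, this reparameterization takes the standard form below (a sketch following the Golan et al. formulation; z_{km} and v_j denote the support points for the parameters and the errors):

max_{p,w} H(p, w) = −∑_{k,m} p_{km} ln p_{km} − ∑_{t,j} w_{tj} ln w_{tj}

subject to y_t = ∑_k x_{tk} (∑_m z_{km} p_{km}) + ∑_j v_j w_{tj}, with ∑_m p_{km} = 1 and ∑_j w_{tj} = 1,

so that each parameter is recovered as β_k = ∑_m z_{km} p_{km} and each error as ε_t = ∑_j v_j w_{tj}.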
In practice, we set the support values for the parameters and the residuals as triplets {−α, 0, α}. An initial value of the parameters to be estimated is set, and entropy is maximized under the constraints that the data match equation (7) and the various restrictions imposed: b > 0; 0 < f < 1; θ ≥ 0; p1 ≥ θ + p2.
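The sketch below illustrates this setup on a deliberately simplified stand-in for equation (7), a one-parameter linear model with simulated data; α, the data and the supports are placeholders. Nonnegativity of b is imposed here by truncating its support at zero, one common device, whereas the text imposes the restrictions directly in the optimization.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
T = 20
x = rng.uniform(1.0, 2.0, T)
y = 1.5 * x + rng.normal(0.0, 0.1, T)    # simulated data, true b = 1.5

alpha = 5.0
zb = np.array([0.0, alpha / 2, alpha])   # parameter support, truncated so b > 0
ze = np.array([-alpha, 0.0, alpha])      # error support, the triplet {-alpha, 0, alpha}

def unpack(theta):
    p = theta[:3]                        # weights on the parameter support
    w = theta[3:].reshape(T, 3)          # weights on the error support, one row per observation
    return p, w

def neg_joint_entropy(theta):
    # joint entropy of parameter and error probabilities (negated for minimization)
    return np.sum(theta * np.log(theta + 1e-12))

cons = [
    {"type": "eq", "fun": lambda th: y - ((zb @ unpack(th)[0]) * x + unpack(th)[1] @ ze)},  # data constraint
    {"type": "eq", "fun": lambda th: unpack(th)[0].sum() - 1.0},                            # adding-up for p
    {"type": "eq", "fun": lambda th: unpack(th)[1].sum(axis=1) - 1.0},                      # adding-up for w
]
theta0 = np.full(3 + 3 * T, 1.0 / 3.0)   # uniform starting weights
res = minimize(neg_joint_entropy, theta0, method="SLSQP",
               bounds=[(0.0, 1.0)] * theta0.size, constraints=cons)
print("b_hat =", zb @ unpack(res.x)[0])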
Estimation results are provided in Table A2. The stars indicate estimates significantly different from zero at the 10% (*) and 5% (**) thresholds. It is noteworthy that most of the parameters of interest, i.e. parameters b, f and e, are significantly different from zero, and the fit is satisfactory (with the exception of Italy). This suggests that in the specification adopted, where the usual marginal cost becomes Cm = p2 + θ, θ is indeed a positive function of p1, a negative function of p2 and a positive function of the quota.
Table A2. Estimation results
Country        R²     DW     b        e        f
Netherlands    0.83   1.92   1.09**   0.57*    0.69**
UK             0.60   2.08   3.97**   0.76**   0.99**
Belgium        0.82   1.95   3.74**   0.61*    0.38**
France         0.78   1.80   0.32**   0.45*    0.34*
Germany        0.96   1.89   0.36**   0        0.56*
Spain          0.87   1.76   2.82**   0        0.65**
Italy          0.45   1.72   0.67**   2.64**   0.001**

* significant at the 10% level; ** significant at the 5% level.