(6) $$MNNLL = -\frac{n}{2}\ln|\Sigma| - 0.5\sum_{j=1}^{m}\ln|\psi_j| + \sum_{i=1}^{n}\sum_{j=1}^{m}\ln(G_{ji}) - 0.5\sum_{i=1}^{n}\sum_{j=1}^{m}\left[\{H_i * \Sigma^{-1}\} \mathbin{.*} H_i\right];$$
where ∑ is an m×m matrix with unit diagonal elements and off-diagonal elements pjk; Gji is as defined in
equation (5) if Yj (and thus Yj*) is not normally distributed, or Gji = σj⁻¹ if Yj is normally distributed; and Hi
is a 1×m row vector with elements Hji (j=1,...,m), also defined in equation (5) if Yj is not normally
distributed, with Hji=(Yji*-Xji*βj)∕σj if Yj is normally distributed. The operator * indicates matrix
multiplication, and .* indicates element-by-element multiplication.
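To make the structure of equation (6) concrete, the following is a minimal Python/NumPy sketch that evaluates it from its building blocks. The function name `mnnll` and its argument layout are illustrative assumptions, not part of the original procedure. Because ψj is not defined in this section, the terms ln∣ψj∣ are passed in precomputed; the usage example sets them to zero (ψj taken as an identity matrix), which is an assumption made only for illustration, and uses the normal-case definitions Gji = 1∕σj and Hji = residual∕σj.

```python
import numpy as np

def mnnll(H, G, Sigma, logdet_psi):
    """Evaluate equation (6) from its building blocks (illustrative sketch).

    H          : (n, m) array whose row i is the 1 x m vector H_i
    G          : (n, m) array of the G_ji terms
    Sigma      : (m, m) cross-equation correlation matrix
    logdet_psi : length-m array of ln|psi_j|, assumed precomputed
    """
    n, m = H.shape
    sign, logdet_sigma = np.linalg.slogdet(Sigma)
    if sign <= 0:
        raise ValueError("Sigma must be positive definite")
    Sigma_inv = np.linalg.inv(Sigma)
    # {H_i * Sigma^-1} .* H_i for every i, summed over i and j
    quad = np.sum((H @ Sigma_inv) * H)
    return (-0.5 * n * logdet_sigma
            - 0.5 * np.sum(logdet_psi)
            + np.sum(np.log(G))
            - 0.5 * quad)

# Usage under the normal-error case (all numbers below are hypothetical).
rng = np.random.default_rng(0)
n, m = 200, 2
sigma = np.array([1.5, 0.7])                      # sigma_j for each equation
Sigma = np.array([[1.0, 0.4], [0.4, 1.0]])        # correlation matrix
resid = rng.multivariate_normal(np.zeros(m), Sigma, size=n) * sigma
H = resid / sigma                                 # H_ji = residual / sigma_j
G = np.tile(1.0 / sigma, (n, 1))                  # G_ji = 1 / sigma_j
value = mnnll(H, G, Sigma, logdet_psi=np.zeros(m))  # psi_j = I assumed
```

Summing the elements of {Hi*∑⁻¹}.*Hi over j gives the quadratic form Hi ∑⁻¹ Hi′, so in this normal case the result matches the usual multivariate normal log-likelihood with the additive constant omitted.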
The multivariate log-likelihood function [equation (6)] simply links m univariate functions
[equation (5)] through the cross-error term correlation matrix ∑. As in the normal error case, if some of the
m dependent variables of interest are correlated with one another, using equation (6) to estimate the βj
vectors jointly should be more efficient than estimating them
separately. Maximum likelihood estimation is conducted by finding the values of the parameters (Θj, μj, βj,
and those in the ∑, Pj, and ψj matrices) that maximize the log-likelihood function [equation (6)]. This is
achieved through numerical optimization procedures, such as the Newton-Raphson algorithm, which are
available in most econometric software packages, including Gauss 386i. These pre-programmed procedures
only require a few standard command lines and the log-likelihood function. In addition to parameter estimates,
they provide standard errors based on a numerical estimate of the Hessian matrix of this function.
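The estimation step can be sketched as follows. This is not the Gauss procedure referred to above; it is a small Python/NumPy illustration, under stated assumptions, of a Newton-Raphson iteration that uses finite-difference derivatives and reports standard errors from the inverse of the negative numerical Hessian. The helper names (`num_grad`, `num_hess`, `newton_raphson`), the step-halving safeguard, the simulated data, and the single-equation normal log-likelihood used as the target are all assumptions chosen for illustration.

```python
import numpy as np

def num_grad(f, theta, h=1e-5):
    """Central-difference gradient of f at theta."""
    g = np.zeros_like(theta)
    for k in range(theta.size):
        e = np.zeros_like(theta)
        e[k] = h
        g[k] = (f(theta + e) - f(theta - e)) / (2 * h)
    return g

def num_hess(f, theta, h=1e-4):
    """Central-difference Hessian of f at theta."""
    p = theta.size
    hess = np.zeros((p, p))
    for a in range(p):
        ea = np.zeros(p)
        ea[a] = h
        for b in range(a, p):
            eb = np.zeros(p)
            eb[b] = h
            hess[a, b] = hess[b, a] = (
                f(theta + ea + eb) - f(theta + ea - eb)
                - f(theta - ea + eb) + f(theta - ea - eb)) / (4 * h * h)
    return hess

def newton_raphson(loglik, theta0, tol=1e-6, max_iter=50):
    """Maximize loglik; return estimates and Hessian-based standard errors."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(num_hess(loglik, theta), num_grad(loglik, theta))
        if np.max(np.abs(step)) < tol:
            break
        # Step halving (an added safeguard): shrink the step until
        # the log-likelihood does not decrease.
        scale = 1.0
        while loglik(theta - scale * step) < loglik(theta) and scale > 1e-4:
            scale *= 0.5
        theta = theta - scale * step
    cov = np.linalg.inv(-num_hess(loglik, theta))
    return theta, np.sqrt(np.diag(cov))

# Hypothetical data: one equation with normal errors and a binary regressor.
rng = np.random.default_rng(1)
n = 400
x = rng.integers(0, 2, size=n).astype(float)
y = -1.0 + x + rng.normal(size=n)

def loglik(theta):
    """Normal log-likelihood (constant omitted); theta = (b0, b1, ln sigma)."""
    b0, b1, log_s = theta
    r = y - b0 - b1 * x
    return -n * log_s - 0.5 * np.sum(r ** 2) / np.exp(2 * log_s)

theta0 = np.array([y.mean(), 0.0, np.log(y.std())])
theta_hat, se = newton_raphson(loglik, theta0)
```

Parameterizing the scale as ln σ keeps σ positive without constraints. The same routine could in principle be pointed at a function that evaluates equation (6), with the parameter vector extended accordingly.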
Monte Carlo Simulation Analysis
The sample design used by Hsieh and Manski; Newey; and McDonald and White was adopted for
the Monte Carlo simulation to ensure comparability with previous results. For the first phase of the
simulation, the regression model is given by:
(7) Yji = βj0 + βjiXji + Uji = -1 + Xji + Uji (j=1);
where the explanatory variable Xji = 1 with a probability of 0.5 and Xji = 0 with a probability of 0.5. Xji is
also assumed to be statistically independent of Uji. Thus, each model can be interpreted as estimating a shift
parameter that separates two distributions that are identical except for a location parameter. The specifications for