However, such problems can be avoided by means of econometric models for
count data.
The most frequently applied count data model is the Poisson regression
model.16 It is obtained by assuming that each realization of the count depen-
dent variable yi for cross-sectional observation i (in our case, yi would be the
number of headquarters hosted by municipality i) is drawn from a Poisson
distribution with parameter λ(xi; β) = exp(xi′β), where xi′ is a 1 × K vector
of explanatory variables and β is a K × 1 parameter vector to be estimated.
The conditional probability distribution of the count variable is given by
\[
f(y_i) = \frac{\exp\bigl(-\exp(x_i'\beta)\bigr)\,\exp(y_i x_i'\beta)}{y_i!},
\]
and the conditional mean and variance are simultaneously determined by the
parameter λ(xi; β):
\[
E(y_i \mid x_i) = \mathrm{Var}(y_i \mid x_i) = \exp(x_i'\beta).
\]
This last feature of the Poisson distribution (referred to as equidispersion, or
equality of mean and variance) often renders the Poisson regression model too
restrictive in applications. In particular, the model tends to under-predict the
frequency of zeros and of large counts for data in which the actual variance is
larger than the mean (referred to as over-dispersion). In our application, we
have both a large number of zeros and a few very large counts, so that
over-dispersion is likely to be a problem. As Figure 1 shows, the distribution
has a very long right tail: in 2005, 86% of the municipalities hosted no
multinational, while one municipality hosted 615 multinationals.
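
To make the equidispersion property and its consequences concrete, the following minimal Python sketch evaluates the Poisson probability function numerically. The regressor and coefficient values are hypothetical and chosen only for illustration. The second part assumes an intercept-only Poisson model (a simplification, since the regression model lets λ vary across municipalities) calibrated to reproduce the 86% share of zeros, and asks how likely a count of 615 would then be.

```python
import math

def poisson_rate(x, beta):
    """lambda(x; beta) = exp(x'beta) for a single observation."""
    return math.exp(sum(xk * bk for xk, bk in zip(x, beta)))

def poisson_log_pmf(y, lam):
    """log f(y) = -lam + y*log(lam) - log(y!)."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)

def poisson_pmf(y, lam):
    return math.exp(poisson_log_pmf(y, lam))

def moments(lam, y_max=500):
    """Mean and variance obtained by summing the pmf over 0..y_max."""
    probs = [poisson_pmf(y, lam) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

# Hypothetical regressors (constant plus one covariate) and coefficients.
x = [1.0, 0.5]
beta = [0.2, 0.8]
lam = poisson_rate(x, beta)      # exp(0.2 + 0.8 * 0.5) = exp(0.6)
mean, var = moments(lam)         # both should equal lam (equidispersion)

# Assumption: intercept-only Poisson whose zero probability is 0.86,
# i.e. exp(-lam_zero) = 0.86.
lam_zero = -math.log(0.86)
log_p_615 = poisson_log_pmf(615, lam_zero)

print(f"lambda = {lam:.4f}, mean = {mean:.4f}, variance = {var:.4f}")
print(f"rate matching 86% zeros = {lam_zero:.4f}")
print(f"log P(y = 615) at that rate = {log_p_615:.1f}")
```

Under this simplifying assumption, a rate of roughly 0.15 is needed to match the share of zeros, and the log-probability of observing a count of 615 at that rate is far below -1000. A single Poisson parameter therefore cannot accommodate both features of the data at once.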
A more flexible alternative to the Poisson regression model is
the negative binomial (NB) model, which allows for unobserved heterogeneity
by treating the parameter λ of the Poisson process as a random variable. This
model is obtained by setting λi = μiνi, where μi = exp(xi′β), and the random
component νi > 0 is gamma-distributed with E(νi) = 1 and Var(νi) = α.
The conditional mean and variance of the NB model are17
\[
E(y_i \mid x_i) = \mu_i,
\]
\[
\mathrm{Var}(y_i \mid x_i) = \mu_i(1 + \alpha\mu_i).
\]
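
The moment formulas above can be checked numerically. The sketch below is an illustration rather than part of the estimation procedure: it uses the closed-form NB type-II probability function that results from integrating the gamma-distributed νi out of the Poisson distribution (a standard result that is not derived in the text above), and the values of μ and α are hypothetical.

```python
import math

def nb2_log_pmf(y, mu, alpha):
    """Log pmf of the NB type-II model with mean mu and dispersion alpha.

    With r = 1/alpha:
    f(y) = Gamma(y + r) / (Gamma(r) * y!) * (r/(r+mu))**r * (mu/(r+mu))**y
    """
    r = 1.0 / alpha
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu))
            + y * math.log(mu / (r + mu)))

def nb2_moments(mu, alpha, y_max=5000):
    """Total probability, mean and variance from summing the pmf over 0..y_max."""
    probs = [math.exp(nb2_log_pmf(y, mu, alpha)) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var

# Hypothetical parameter values, for illustration only.
mu, alpha = 2.0, 1.5
total, mean, var = nb2_moments(mu, alpha)

# Probability of a zero count under NB2 and under a Poisson with the same mean.
p0_nb = math.exp(nb2_log_pmf(0, mu, alpha))
p0_poisson = math.exp(-mu)

print(f"mean = {mean:.4f} (formula: {mu})")
print(f"variance = {var:.4f} (formula: {mu * (1 + alpha * mu)})")
print(f"P(y = 0): NB2 = {p0_nb:.4f}, Poisson = {p0_poisson:.4f}")
```

For the same mean, the NB type-II distribution assigns more probability to zero than the Poisson distribution does, and its variance exceeds the mean whenever α > 0, which is the additional flexibility needed for over-dispersed counts.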
16 For a thorough discussion of the count data models discussed in this section, see
Winkelmann (2003) and Cameron and Trivedi (2006).
17 The model with this particular parametrization is known as the NB type-II model (see
Cameron and Trivedi, 2006).