categorical variables (e.g., industry affiliation) may have a large number of levels (categories),
we do not report the individual estimates for each category (i.e., for each dummy variable) but in-
stead provide partial R2s for each variable or effect. Partial R2s are preferred over t-statistics in
analyses with a large number of observations since the significance of simple t-tests does not
express the explanatory power of a variable or an effect (McCloskey and Ziliak, 1996). Partial
R2s are defined as (see Greene, 2003, p. 36):
$$R^2_{x|z} = \frac{R^2_{x,z} - R^2_z}{1 - R^2_z} \qquad (3)$$
where $R^2_{x|z}$ is the partial R2 of variable(s) x, $R^2_{x,z}$ is the R2 for the model including all variables x
and z, and $R^2_z$ is the model R2 where only the z-variables are included.
The partial R2 of a variable expresses how much of the variation of the dependent variable
can be explained by this particular variable, or by a subset of dummy variables (representing
a categorical variable), given that the other variables are included in the model. Therefore,
the partial R2 measures the difference between the model’s R2 with and without a certain variable or
effect, relative to the variation left unexplained without it. Theil (1971) emphasizes the importance of measuring the incremental contribution of
a variable for explaining the dependent variable. Furthermore, Flury (1989) and Shea (1997)
argue that partial statistics should be especially taken into consideration when analyzing the rel-
evance of variables in multivariate models. Moreover, Hamilton (1987) highlights the merit of
partial correlations in determining which explanatory variables to keep in the case of correlated
variables.
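As an illustration of how a partial R2 for a block of dummy variables could be computed, the following minimal sketch fits the two nested OLS models and combines their R2 values as in the Greene (2003) definition. The data are synthetic, and the function names (`r_squared`, `partial_r_squared`), the number of industry levels, and the coefficient values are assumptions made only for this example; they are not taken from the study's estimation code.

```python
import numpy as np


def r_squared(y, X):
    """R2 of an OLS regression of y on X (an intercept is added here)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - (resid @ resid) / tss


def partial_r_squared(y, X, Z):
    """Partial R2 of the block X given Z: (R2_xz - R2_z) / (1 - R2_z)."""
    r2_full = r_squared(y, np.column_stack([X, Z]))
    r2_z = r_squared(y, Z)
    return (r2_full - r2_z) / (1.0 - r2_z)


# Synthetic example (assumed data): two continuous controls z and one
# categorical variable with four levels, entered as three dummies.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 2))
industry = rng.integers(0, 4, size=n)
dummies = np.eye(4)[industry][:, 1:]  # first level is the reference category
effects = np.array([0.0, 0.5, -0.5, 1.0])
y = z @ np.array([1.0, -0.5]) + effects[industry] + rng.normal(size=n)

partial = partial_r_squared(y, dummies, z)
print(f"partial R2 of the industry dummies: {partial:.3f}")
```

The whole set of dummies is treated as one block, so a single partial R2 is reported for the categorical variable rather than one estimate per category.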
Since the productive efficiency estimate for each firm is time invariant, the second step of the
analysis is based on the cross-section of firms. All explanatory variables are included as firm-
specific averages over the observation period. Even in this cross-sectional setup it is possible to
include year dummies for the years a firm is included in the sample. The respective year dummy
is set to 1 if the firm is observed in that year and to 0 otherwise. The estimation of year dummies with
cross-sectional data is possible since not all firms are observed over the entire period; some
firms are included only in subperiods. The year dummies capture the overall trend of the
firms’ average efficiency. For instance, if average efficiency improves over time we should find
significantly higher estimates of the year dummy variables for the later years compared to the
first years of the sample period.
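The construction of the cross-section described above can be sketched as follows. The firm identifiers, years, and covariate values are hypothetical, and the variable names (`panel`, `cross_section`, `mean_x`, `d_<year>`) are assumptions chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical unbalanced firm-year panel: (firm_id, year, covariate value).
panel = [
    ("A", 2000, 10.0), ("A", 2001, 12.0), ("A", 2002, 14.0),
    ("B", 2001, 5.0), ("B", 2002, 7.0),
    ("C", 2000, 8.0),
]

years = sorted({year for _, year, _ in panel})
values = defaultdict(list)   # covariate values observed for each firm
observed = defaultdict(set)  # years in which each firm is observed
for firm, year, value in panel:
    values[firm].append(value)
    observed[firm].add(year)

# One row per firm: the firm-specific average of the covariate plus one
# dummy per year, set to 1 if the firm is observed in that year, else 0.
cross_section = {}
for firm in sorted(values):
    row = {"mean_x": sum(values[firm]) / len(values[firm])}
    for year in years:
        row[f"d_{year}"] = 1 if year in observed[firm] else 0
    cross_section[firm] = row

for firm, row in cross_section.items():
    print(firm, row)
```

If every firm were observed in every year, all dummies would equal 1 and be collinear with the intercept; the dummies are identified only because the panel is unbalanced.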
Table 4 provides an overview of the firm-level information available in the Cost Structure
Census that is included in the second step of our analyses. The dataset provides a unique
opportunity to investigate the relative importance of a broad range of determinants of efficiency
that have not been investigated in previous studies due to data constraints. In our single study,