vii Since many villages have no incoming workers, we provide Tobit estimates along with OLS to account for the
censoring of the dependent variable at zero. Another estimation issue is an omitted variables problem since
we do not observe characteristics of laborers in labor markets that could be hired by local firms but are not. As
discussed above, rural firms substitute incoming labor for local workers depending on wage differences between
incoming and local workers. Hence, we would like to include wages for local and incoming workers in the model, as is
done in other studies on the demand for migrant workers (e.g., Straubhaar, 1988). Unfortunately, while we have wage
observations for local labor in all villages with enterprises, we only have wage observations for incoming workers in the
villages that hire them. One solution might be to predict an in-migrant wage for villages without incoming workers and
use these predicted wages in the regression analysis. However, we also lack human capital characteristics and
information on other traits of incoming migrants who are working in the local labor pool. Due to these limitations, our
specification uses only local wages. Although in theory this introduces omitted variable bias, the bias is likely not very large.
Rozelle, Zhang, and Hughart (1999) have shown that there is no statistical difference in the local off-farm wage rate
between any pair of the provinces that migrants in China come from (such as Sichuan, Shaanxi, Henan, and Hubei).
Differences among migrants’ wages from different areas mostly reflect transportation costs (which are likely to be
minimal for workers staying more than a few months) and local costs of living. These differences are largely
represented by the provincial dummy variables.
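
As an illustration of why Tobit is reported alongside OLS, the following is a minimal sketch of the log-likelihood of a Tobit model that is left-censored at zero. It is not the estimation code used for the results in this paper. The function and argument names (tobit_loglik, beta, sigma, X, y, lower) are hypothetical, and the sketch assumes normally distributed errors with constant variance.

```python
import math


def norm_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tobit_loglik(beta, sigma, X, y, lower=0.0):
    """Log-likelihood of a Tobit model left-censored at `lower`.

    beta  : list of coefficients (hypothetical name)
    sigma : standard deviation of the latent error, must be positive
    X     : list of regressor rows, one row per village
    y     : observed outcomes, e.g. percent of incoming workers
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for x_i, y_i in zip(X, y):
        mu = sum(b * x for b, x in zip(beta, x_i))  # linear index x_i'beta
        if y_i <= lower:
            # Censored observation: contributes P(latent outcome <= lower).
            total += math.log(norm_cdf((lower - mu) / sigma))
        else:
            # Uncensored observation: contributes the normal density.
            total += norm_logpdf((y_i - mu) / sigma) - math.log(sigma)
    return total
```

Maximizing this function over beta and sigma (for example with a numerical optimizer) would give the Tobit estimates. OLS, by contrast, treats the zeros as ordinary observations of the outcome, which is why the two sets of estimates can differ when many villages report no incoming workers.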
viii We also estimated separate regressions for male and female workers and found that the coefficient on local labor
costs is significant for both OLS and Tobit specifications of the demand for male migrants.
ix A lagged dependent variable is frequently used in regression analysis to hold constant a set of one or more
unobserved, village-specific factors that are assumed to be fixed over time and affect the dependent variable in some
way beyond the effects of the other regressors in the equation. For example, in our case the lagged dependent variable
in the regressions in table 6, columns 3 and 4 is the percent of incoming workers in each village in 1988 (which is used
to explain the dependent variable, the percent of incoming workers in 1995).
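
One way to write the specification described in this note is sketched below. The notation is assumed for illustration and does not appear in the text: $s_{i,t}$ is the percent of incoming workers in village $i$ in year $t$, $x_i$ is the vector of other regressors, and $\varepsilon_i$ is the error term.

```latex
\[
  s_{i,1995} \;=\; \alpha \;+\; \rho\, s_{i,1988} \;+\; x_i'\beta \;+\; \varepsilon_i
\]
```

Under this reading, $\rho$ absorbs the influence of time-invariant, village-specific factors that already shaped the 1988 share, so that $\beta$ reflects the effects of the other regressors conditional on the earlier level of in-migration.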