The variable “average number of cigarettes smoked per day” takes integer values only,
so one might think of estimating a classical count-data model rather than the Tobit
model proposed in this paper. However, we do not follow this modeling approach since
tobacco consumption is not a genuine count-data phenomenon. In fact, any amount of
tobacco can be consumed if cigarettes are partially smoked. Moreover, in the survey
that is used for our analysis - as in many similar ones - individuals were asked about
the average number of cigarettes smoked per day. Consequently, the correct answers
to this question would not necessarily take integer values even if cigarettes were always
completely smoked by individuals. In other words, the fact that cigarette consumption
is measured as an integer reflects imprecise reporting, i.e. rounding error, rather
than an underlying Poisson process. Parametric count-data models like the
Poisson or the negative binomial, therefore, are likely to misspecify the true data
generating process.
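The averaging argument can be illustrated with a small worked example. The sketch below is purely illustrative: the daily counts are hypothetical and the helper names `average_per_day` and `reported_value` are not taken from the survey or from our estimation code. It assumes that a respondent smokes only whole cigarettes and rounds the true daily average to the nearest integer when answering.

```python
from fractions import Fraction


def average_per_day(daily_counts):
    """Exact average of whole cigarettes smoked per day (hypothetical helper)."""
    return Fraction(sum(daily_counts), len(daily_counts))


def reported_value(daily_counts):
    """Integer answer, assuming the respondent rounds to the nearest whole number."""
    return round(average_per_day(daily_counts))


# Hypothetical week in which every cigarette is smoked completely.
week = [3, 4, 2, 5, 3, 4, 4]

true_average = average_per_day(week)  # 25/7, not an integer
reported = reported_value(week)       # integer answer after rounding

print(f"true average: {float(true_average):.3f}, reported: {reported}")
```

Even though every daily count is an integer, the true average is 25/7; the integer in the data arises only through rounding.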
In our empirical analysis, we control for gender, age, age squared and a dummy
variable indicating residence in West Germany. Moreover, the vector $x_{it}$ includes parental
education, parental marital status, the number of children in the parental home, as well as the
way individuals grew up, reflecting the social background of the family. For the
latter, we distinguish between having grown up with the mother, with the father, or with both,
the last captured by an interaction term. Parental education is included in the specification
in the form of four dummies: parent has a “low schooling degree”, “a medium degree”,
“a high degree”, or a “university degree”. “Parent has no degree” serves as reference
group. By interacting parental education with dummy variables indicating having
grown up with the parent, we allow parental education to have an effect only if the
respondent grew up with that parent. Parental marital status is measured by one
dummy variable indicating whether the parents are married. Variables often controlled for
by other authors (e.g. Chaloupka & Laixuthai (1997), Williams (2005), Yen (2005)),
such as own education, marital and labor market status, number of children, current
living situation and income, are not used as explanatory variables because of
their potential endogeneity. Nevertheless, we also experimented with including
these variables in additional specifications, but this does not change
our main findings.19
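The construction of the interacted parental-education dummies can be sketched as follows. This is a minimal illustration under assumed variable names (`EDU_LEVELS`, `parental_education_dummies` and the category labels are hypothetical, not the names used in the data set): each of the four education dummies is multiplied by the indicator for having grown up with the respective parent, so that all four are zero either for the reference group or when the respondent did not grow up with that parent.

```python
# Education categories named in the text; "none" is the reference group.
EDU_LEVELS = ["low", "medium", "high", "university"]


def parental_education_dummies(parent, education, grew_up_with):
    """Education dummies for one parent, interacted with the grew-up-with dummy.

    parent:        label such as "mother" or "father" (hypothetical naming)
    education:     one of EDU_LEVELS, or "none" for the reference group
    grew_up_with:  True if the respondent grew up with this parent
    """
    return {
        f"{parent}_edu_{level}": int(grew_up_with and education == level)
        for level in EDU_LEVELS
    }


# Grew up with a university-educated mother: exactly one dummy equals 1.
mother = parental_education_dummies("mother", "university", True)

# Father has a high degree, but the respondent did not grow up with him:
# all dummies are 0, so his education cannot affect the outcome.
father = parental_education_dummies("father", "high", False)

# Parent without a degree (reference group): all dummies are 0 as well.
reference = parental_education_dummies("mother", "none", True)
```

Under this coding, the coefficient on each dummy is identified only from respondents who grew up with the respective parent.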
As discussed, parental smoking and drinking habits serve as instruments $z^{c}_{it}$ and $z^{a}_{it}$.
Individuals who have already moved out of the parental home are asked retrospectively
about these variables. For our regression analysis, each parent’s smoking behavior is
19 See Tables 7 and 8 in Appendix B for estimation results.