5.1 Instrumental Variables
A key feature of this framework is that unobservables do not bias the estimated treatment effect as long as
an instrumental variable can be found that is non-trivially related to treatment assignment but
is uncorrelated with other variables which are omitted from the outcome equation of interest.
Thus if we are dealing with a “broken” experimental design premised on randomizing treatment,
and we have a concern that not all of the important variables predicting treatment can be
observed given the survey instrument employed, IVs might offer a useful alternative.
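The two requirements stated above can be written compactly. The notation is an assumption made for illustration: Z is a placeholder name for the instrument, T is treatment, and u is the error term of the outcome equation, which absorbs the omitted variables.

```latex
\operatorname{Cov}(Z, T) \neq 0 \quad \text{(instrument is related to treatment)}, \qquad
\operatorname{Cov}(Z, u) = 0 \quad \text{(instrument is unrelated to omitted variables)}
```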
5.1.1 Wald Estimator: Binary Treatment-Binary IV
Consider once again the single difference estimator introduced earlier. A regression equivalent
of that estimator is:
$$ y_{ij} = \alpha + \delta T_{ij} + u_{ij} $$
where T is our treatment dummy; y is our outcome variable; and i and j index villages/PSUs
and households, respectively.
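The equivalence between the single difference estimator and this regression can be checked numerically. The sketch below uses simulated data; the sample size, coefficients, and variable names are illustrative assumptions, not values from the text.

```python
import numpy as np

# Simulated data (all values here are assumptions for illustration only)
rng = np.random.default_rng(0)
n = 1000
T = rng.integers(0, 2, size=n).astype(float)   # treatment dummy
y = 1.0 + 2.0 * T + rng.normal(size=n)         # outcome with a made-up effect of 2

# OLS of y on a constant and the treatment dummy
X = np.column_stack([np.ones(n), T])
alpha_hat, delta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# Single difference: mean outcome of treated minus mean outcome of untreated
single_diff = y[T == 1].mean() - y[T == 0].mean()

print(delta_hat, single_diff)
```

With a single binary regressor the OLS slope is the difference in group means, and the intercept is the mean outcome of the untreated group, so the two printed numbers agree up to floating-point error.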
A simple alternative to this naive approach is the Wald estimator (Angrist, 1990). This
estimator is a special case of the local average treatment effect (LATE) estimator (Imbens and
Angrist, 1994) where we instrument T with a binary variable.
Let this variable be denoted as Pij. Then as long as Pij does not perfectly predict Tij,
it can be shown that δ is simply equal to the ratio of the difference in means for y (between
households with P = 1 and P = 0) to the difference in means for T (between households with
P = 1 and P = 0). For the most parsimonious case given above where we use a single IV, the
IV estimate of the slope can be written as
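$$
\delta
= \frac{\sum_{i=1}^{N} (P_{ij} - \bar{P})(y_{ij} - \bar{y})}{\sum_{i=1}^{N} (P_{ij} - \bar{P})(T_{ij} - \bar{T})}
= \frac{\sum_{i=1}^{N} P_{ij}(y_{ij} - \bar{y})}{\sum_{i=1}^{N} P_{ij}(T_{ij} - \bar{T})}
= \frac{\bar{y}_1 - \bar{y}_0}{\bar{T}_1 - \bar{T}_0}
$$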
The complete derivation is given in appendix A1. The standard choice for an IV in this
context is to use some indicator of eligibility.
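A small simulation can illustrate the Wald estimator. The data-generating process below is an assumption invented for this sketch: an unobservable v drives both take-up and the outcome, P is a randomly assigned eligibility indicator, and the true effect is set to 2. The function names `wald_estimate` and `iv_slope` are hypothetical helpers, not part of any library.

```python
import numpy as np


def wald_estimate(y, T, P):
    """Difference in mean y (P=1 vs P=0) divided by difference in mean T."""
    y, T, P = (np.asarray(a, dtype=float) for a in (y, T, P))
    reduced_form = y[P == 1].mean() - y[P == 0].mean()
    first_stage = T[P == 1].mean() - T[P == 0].mean()
    return reduced_form / first_stage


def iv_slope(y, T, P):
    """Covariance-ratio form of the single-instrument IV slope."""
    y, T, P = (np.asarray(a, dtype=float) for a in (y, T, P))
    p_dev = P - P.mean()
    return np.sum(p_dev * (y - y.mean())) / np.sum(p_dev * (T - T.mean()))


# Simulated data (assumed for illustration only)
rng = np.random.default_rng(42)
n = 5000
P = rng.integers(0, 2, size=n).astype(float)                 # eligibility indicator
v = rng.normal(size=n)                                       # unobservable
T = (1.5 * P + v + rng.normal(size=n) > 1.0).astype(float)   # take-up depends on P and v
y = 1.0 + 2.0 * T + v + rng.normal(size=n)                   # outcome depends on T and v

naive = y[T == 1].mean() - y[T == 0].mean()   # single difference, confounded by v
wald = wald_estimate(y, T, P)
slope = iv_slope(y, T, P)

print("naive single difference:", naive)
print("Wald estimate:          ", wald)
print("IV covariance ratio:    ", slope)
```

The last two numbers coincide because the covariance-ratio and difference-in-means forms are algebraically identical when the instrument is binary. In this made-up design the naive single difference is pushed upward by v, while the Wald estimate should lie close to the assumed effect of 2, subject to sampling noise.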
5.1.2 IV Estimator: Continuous Treatment-Binary IV
Often the rules governing participation in a health program might invalidate the use of eligibility
as an IV. For example, many health interventions are deliberately targeted to poorer segments
of a population. If the outcome of interest is some type of welfare metric (say consumption),
then a model such as the one above will have an implausible exclusion restriction since a variable
such as P is likely to covary with y (the outcome variable of interest). However, exogenous
variation can sometimes be extracted through innovative use of prior information about rollout
or other features of program implementation. For example, if the health program is targeted
to poor villages but at a centralised location such as a clinic, then spatial information such as
the distance from sampled households to the clinic could in principle be used to construct a
model with more plausible exclusion restrictions.
How exactly might this work? Let D refer to a measure of distance such as the
one just discussed and let P be defined as in the previous model. Now let’s imagine we are
interested in estimating the impact of some health intervention which is best understood as a
“dose” (for example, the treatment for iron deficiency anemia ranges from 3-12 months and then
has to be complemented for the rest of the patient’s life by a more iron-enriched diet than was
the case prior to the onset of treatment). As before, denote treatment (this time assumed
continuous) as T. Plausibly, D, P