■ Average Treatment Effect (ATE). This is the expected gain from participating in a
program for a randomly chosen individual (Heckman, Tobias and Vytlacil, 2001),
calculated as the difference between the expected outcomes with and without treatment:
$$\alpha_{ATE} = E(\Delta) = E(Y_1) - E(Y_0) \tag{4}$$
■ Average Treatment Effect on the Treated (ATET). This is the average gain from
treatment for those who select into the treatment (Heckman, Tobias and Vytlacil, 2001):
$$\alpha_{ATET} = E(\Delta \mid D = 1) = E(Y_1 \mid D = 1) - E(Y_0 \mid D = 1) \tag{5}$$
■ Average Treatment Effect on the Untreated (ATEU). This is the effect for non-
participants which may be useful for future policy decisions on extending treatment to
groups that were excluded from treatment (Caliendo, 2006):
$$\alpha_{ATEU} = E(\Delta \mid D = 0) = E(Y_1 \mid D = 0) - E(Y_0 \mid D = 0) \tag{6}$$
■ Marginal Treatment Effect (MTE).1 This is the expected effect of treatment conditional
on the observed (X) and unobserved (Ud) characteristics of participants (Heckman and
Vytlacil, 2005).2 One interpretation is that it is the mean gain for an individual with
characteristics X and unobservables Ud who is indifferent between receiving treatment
and not, given a set of Z values, z, where Φ(α, z) = ud. It is defined as:
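$$\begin{aligned}
MTE(X, U_d) &\equiv E(\Delta \mid X = x, U_d = u_d) = E(Y_1 - Y_0 \mid X = x, U_{di} = u_d) \\
&= E(\gamma \mid X = x, U_{di} = u_d) = X(\beta_1 - \beta_0) + E[u_{1i} - u_{0i} \mid U_{di} = u_d]
\end{aligned} \tag{7}$$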
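As an illustration of how the ATE, ATET and ATEU relate to one another, the following minimal sketch simulates both potential outcomes for every individual, which is possible only in a simulation and never with real data. The outcome equations, the coefficient values and the participation rule (participate when the gain exceeds a random cost) are assumptions made purely for illustration; they are not the model or the decision rule of equation (1).

```python
import random
from statistics import mean


def simulate(n=10_000, seed=0):
    """Draw hypothetical potential outcomes (y0, y1) and a participation indicator d."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)               # observed characteristic
        y0 = 1.0 + 0.5 * x + rng.gauss(0, 1)  # outcome without treatment
        y1 = 2.0 + 1.0 * x + rng.gauss(0, 1)  # outcome with treatment
        # Assumed rule: participate when the gain exceeds a random cost.
        cost = rng.gauss(0.5, 1)
        d = 1 if (y1 - y0) > cost else 0
        rows.append((y0, y1, d))
    return rows


def treatment_effects(rows):
    """Compute sample analogues of equations (4), (5) and (6)."""
    gains = [y1 - y0 for y0, y1, _ in rows]
    gains_treated = [y1 - y0 for y0, y1, d in rows if d == 1]
    gains_untreated = [y1 - y0 for y0, y1, d in rows if d == 0]
    return {
        "ATE": mean(gains),
        "ATET": mean(gains_treated),
        "ATEU": mean(gains_untreated),
        "p_treated": len(gains_treated) / len(rows),
    }


effects = treatment_effects(simulate())
for name, value in effects.items():
    print(f"{name}: {value:.3f}")
```

Because individuals in this sketch select into treatment on their gains, the ATET exceeds the ATE, which in turn exceeds the ATEU. The ATE is also the participation-weighted average of the other two: ATE = P(D=1)·ATET + P(D=0)·ATEU.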
The challenge posed by selection bias is evident from the ATET, which depends on a
hypothetical outcome: the outcome in the absence of treatment for individuals who received
treatment (Caliendo, 2006). With non-experimental data, this counterfactual outcome is
unobserved and generally does not equal the outcome of non-participants:
$$E(Y_0 \mid D = 1) \neq E(Y_0 \mid D = 0) \tag{8}$$
Selection bias may arise since participants and non-participants may be deliberately selected
groups with different outcomes, even in the absence of treatment, due to observable and
unobservable factors that may determine participation (Caliendo, 2006):
$$E(Y_1 \mid D = 1) - E(Y_0 \mid D = 0) = \underbrace{E(Y_1 - Y_0 \mid D = 1)}_{\text{ATET}} + \underbrace{\left[ E(Y_0 \mid D = 1) - E(Y_0 \mid D = 0) \right]}_{\text{Selection bias}} \tag{9}$$
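The decomposition in equation (9) can be checked numerically. The sketch below is a hypothetical simulation, not the data or model of this paper: an unobserved factor (called `ability` here) raises both the untreated outcome and the probability of participation, and the treatment gain is fixed at 1.5 for everyone. All names and parameter values are assumptions chosen for illustration.

```python
import random
from statistics import mean


def simulate_selected_sample(n=5_000, seed=1):
    """Simulate (y0, y1, d) where participation depends on an unobserved factor."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        ability = rng.gauss(0, 1)  # unobserved; drives outcome and participation
        y0 = 10.0 + 2.0 * ability + rng.gauss(0, 1)
        y1 = y0 + 1.5              # assumed constant gain from treatment
        d = 1 if ability + rng.gauss(0, 1) > 0 else 0
        sample.append((y0, y1, d))
    return sample


def decompose(sample):
    """Return the naive difference, the ATET and the selection bias of equation (9)."""
    y1_treated = mean([y1 for _, y1, d in sample if d == 1])
    y0_treated = mean([y0 for y0, _, d in sample if d == 1])
    y0_untreated = mean([y0 for y0, _, d in sample if d == 0])
    naive = y1_treated - y0_untreated      # left-hand side of (9)
    atet = mean([y1 - y0 for y0, y1, d in sample if d == 1])
    selection_bias = y0_treated - y0_untreated
    return naive, atet, selection_bias


naive, atet, bias = decompose(simulate_selected_sample())
print(f"naive difference: {naive:.3f}")
print(f"ATET:             {atet:.3f}")
print(f"selection bias:   {bias:.3f}")
```

In this sketch the naive comparison of participants with non-participants overstates the true gain of 1.5, because participants would have had higher outcomes even without treatment. The term E(Y0 | D = 1) can be computed here only because the simulation generates both potential outcomes.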
1 Bjorklund and Moffitt (1987) are credited with introducing this concept to the literature.
2 The unobserved characteristics are introduced into the model by the decision rule described by equation (1).