Also, this quantity should be relatively small, since we generally do not expect
small changes in the data to cause extreme changes in the values or sensitivity of
estimates. Third, since unlikely large or distant observations may represent
data errors, their influence on estimates should become zero. Such a property
is characterized by the rejection point,
$$\rho(T,F) = \inf_{r>0}\{r : \mathrm{IF}(x;T,F) = 0,\ \|x\| \ge r\}, \tag{4}$$
which indicates the non-influence of large observations.
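
As a rough numerical illustration of the rejection point, the sketch below assumes a location M-estimator at a symmetric model with known scale, for which the influence function is proportional to the score function ψ. The function names, the tuning constants 1.345 and 4.685, and the finite grid are illustrative assumptions, not part of the original text. The Huber ψ never vanishes for large x, so no finite rejection point is found, whereas the redescending Tukey biweight ψ is zero beyond its constant c.

```python
import numpy as np

def psi_huber(x, k=1.345):
    # Huber score: linear in the middle, constant (never zero) in the tails.
    return np.clip(np.asarray(x, dtype=float), -k, k)

def psi_biweight(x, c=4.685):
    # Tukey biweight score: redescends to exactly zero for |x| > c.
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= c, x * (1.0 - (x / c) ** 2) ** 2, 0.0)

def rejection_point(psi, grid):
    """Smallest grid radius r such that psi(x) == 0 at every grid point x >= r.

    Returns np.inf when psi is still nonzero at the largest grid point.
    The grid is assumed non-negative; psi is assumed odd, so one side suffices.
    """
    grid = np.sort(np.asarray(grid, dtype=float))
    nonzero = np.nonzero(psi(grid) != 0.0)[0]
    if nonzero.size == 0:
        return grid[0]
    last = nonzero[-1]
    if last == grid.size - 1:
        return np.inf
    return grid[last + 1]

grid = np.linspace(0.0, 10.0, 1001)  # step 0.01, hypothetical resolution
rp_huber = rejection_point(psi_huber, grid)
rp_biweight = rejection_point(psi_biweight, grid)
print(rp_huber, rp_biweight)
```

The grid-based value for the biweight can only approximate c up to the grid step, and "infinite" here only means that no zero tail was found within the chosen grid.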
Alternatively, the behavior of the estimator T can be studied for any finite
amount ε of contamination. The most common property considered in this
context is the estimator’s bias $b(T,H) = E_H\{T(H)\} - E_F\{T(F)\}$, which
measures the distance between the estimates for clean data, T(F), and
contaminated data, T(H), $H \in \mathcal{F}_{\varepsilon,G}$. The corresponding maximum-bias curve
measures the maximum bias of T on $\mathcal{F}_{\varepsilon,G}$ at any ε:
$$B(\varepsilon, T) = \sup_{x \in \mathbb{R}} b\{T, (1-\varepsilon)F + \varepsilon\delta_x\}. \tag{5}$$
Although the computation of this curve is rather complex, Berrendero and
Zamar (2001) provide a general methodology for its computation in the context
of linear regression.
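
For simple location functionals the curve in (5) can be sketched directly. The example below is only an illustration under stated assumptions: F is taken to be the standard normal distribution, the bias is taken as the difference of functional values T(H) - T(F) rather than a finite-sample expectation, the supremum is approximated by a maximum over a finite grid of contamination points x, and all function names are hypothetical. It contrasts the mean, whose bias εx grows without bound in x, with the median, whose bias stays bounded for ε < 1/2.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed clean model F = N(0, 1)

def mean_functional(eps, x):
    # Mean of H = (1 - eps) * N(0, 1) + eps * delta_x.
    return eps * x

def median_functional(eps, x):
    # Median of H = (1 - eps) * N(0, 1) + eps * delta_x, valid for 0 <= eps < 0.5.
    # The point mass at x can pull the median no further than these two bounds.
    m_lo = STD_NORMAL.inv_cdf((0.5 - eps) / (1.0 - eps))
    m_hi = STD_NORMAL.inv_cdf(0.5 / (1.0 - eps))
    return min(max(x, m_lo), m_hi)

def max_bias(functional, eps, xs):
    # Grid approximation of B(eps, T): largest T(H) - T(F) over points x in xs.
    clean = functional(0.0, 0.0)
    return max(functional(eps, x) - clean for x in xs)

xs = [0.5 * i for i in range(-200, 201)]  # contamination points in [-100, 100]
for eps in (0.05, 0.1, 0.25, 0.4):
    print(eps, max_bias(mean_functional, eps, xs), max_bias(median_functional, eps, xs))
```

Widening the grid `xs` increases the value reported for the mean in proportion, while the value for the median does not change once the grid extends past the upper bound `m_hi`.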
The maximum-bias curve is not only useful on its own, but also allows us
to define further scalar measures of robustness. The most prominent is the
breakdown point (Hampel, 1971), which is defined as the smallest amount ε