3.2.5 Test Cost Conditional on Individual Case
The cost of performing a certain test may depend on
idiosyncratic properties of the individual case.
3.2.6 Test Cost Conditional on Time of Test
The cost of performing a certain test may depend on the
timing of the test.
4. Cost of Teacher
Suppose we have a practically unlimited supply of
unclassified examples (i.e., cases, feature vectors), but it
is expensive to determine the correct class of an example.
For example, every human is a potential case for medical
diagnosis, but we require a physician to determine the
correct diagnosis for each person. A learning algorithm
could seek to reduce the cost of teaching by actively
selecting cases for the teacher. A wise learner would
classify the easy cases by itself and reserve the difficult
cases for its teacher.
If a learner has no choice in the cases that it must classify,
then it can rationally determine whether it should
pay the cost of a teacher only when it knows the cost of
misclassification errors. A rational learner would, for each
new case, calculate the expected cost of classifying the
case by itself versus the cost of asking a teacher to
classify the case. This scenario can be handled by using a
rectangular cost matrix, as we discussed in Section 2.
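The comparison described above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: it assumes the learner already has class probability estimates for the new case, and it represents the rectangular cost matrix as one row per action (each class guess plus a hypothetical "ask_teacher" action) and one column per true class. The function name, the action labels, and all cost values are invented for the example.

```python
def choose_action(class_probs, cost_matrix, actions):
    """Return the action with the lowest expected cost for one case.

    class_probs[j]    : estimated probability that the true class is j
    cost_matrix[i][j] : cost of taking action i when the true class is j
    actions[i]        : label for row i (a class guess or "ask_teacher")
    """
    # Expected cost of each action = probability-weighted sum over true classes.
    expected = [
        sum(p * c for p, c in zip(class_probs, row))
        for row in cost_matrix
    ]
    best = min(range(len(actions)), key=lambda i: expected[i])
    return actions[best], expected


# Hypothetical two-class problem; every number here is illustrative.
ACTIONS = ["healthy", "sick", "ask_teacher"]
COSTS = [
    [0.0, 10.0],  # guess "healthy": free if right, 10 if the case is sick
    [4.0, 0.0],   # guess "sick": 4 if the case is healthy, free if right
    [1.0, 1.0],   # ask the teacher: flat fee whatever the true class
]

confident = choose_action([0.95, 0.05], COSTS, ACTIONS)  # learner guesses
uncertain = choose_action([0.60, 0.40], COSTS, ACTIONS)  # learner asks
```

With these assumed numbers, the learner classifies the first case itself and defers the second to the teacher, because the teacher's fee is lower than the expected cost of either guess.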
In a more interesting scenario, the learner can explore a
(possibly infinite) set of unclassified (unlabelled)
examples and select examples to ask the teacher to
classify. This kind of learning problem is known as active
learning. In this scenario, we can rationally seek to
minimize the cost of the teacher even when we do not
know the cost of misclassification errors, provided we assume
that asking the teacher costs more than a correct
classification (otherwise the learner would always ask the
teacher) but less than an incorrect classification
(otherwise the learner would never ask the teacher). However,
we may be able to make better decisions if we have more
information about the cost of misclassification errors.
4.1 Constant Teacher Cost
In the simplest situation, the cost of asking a teacher to
classify a case is assumed to be the same for all cases.
This is the usual assumption in the active learning
literature (Cohn et al., 1995, 1996; Krogh and Vedelsby,
1995; Hasenjager and Ritter, 1998).
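As a rough sketch of the constant-cost setting, suppose the learner has a fixed labelling budget and a pool of unclassified examples with estimated class probabilities. The selection rule below is uncertainty sampling (query the least confident examples first), chosen here only for illustration; the function name, the budget, and the probabilities are assumptions of the example.

```python
def select_queries(pool_probs, teacher_cost, budget):
    """Pick pool indices to send to the teacher, least confident first.

    pool_probs[i] : estimated class probabilities for pool example i
    teacher_cost  : constant cost of one query (same for every case)
    budget        : total amount the learner may spend on the teacher
    """
    n_affordable = int(budget // teacher_cost)
    # Confidence is taken to be the probability of the most likely class.
    by_confidence = sorted(
        range(len(pool_probs)), key=lambda i: max(pool_probs[i])
    )
    return by_confidence[:n_affordable]


# Illustrative pool of four unlabelled examples.
pool = [[0.98, 0.02], [0.55, 0.45], [0.80, 0.20], [0.51, 0.49]]
queries = select_queries(pool, teacher_cost=1.0, budget=2.0)
```

Because every query costs the same, the budget translates directly into a number of queries, and the only decision left is which examples to spend them on.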
4.2 Conditional Teacher Cost
In a more complex situation, the cost of asking a teacher
to classify a case may vary with the circumstances of the
case. For example, the cost may increase with the
complexity of the case. On the other hand, the teacher
may choose to penalize the student for asking for the class of
a trivial case.
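One hypothetical way to express such a conditional cost is as a function of a per-case complexity score. The sketch below assumes that complexity is available as a non-negative number, that cost grows linearly with it, and that cases below a threshold count as trivial and incur a penalty. None of these choices come from the text; the parameter names and default values are illustrative.

```python
def teacher_cost(complexity, base=1.0, per_complexity=0.5,
                 trivial_below=1.0, trivial_penalty=3.0):
    """Cost of asking the teacher about one case (illustrative model).

    complexity      : assumed non-negative difficulty score for the case
    base            : fee charged for any query
    per_complexity  : extra fee per unit of complexity
    trivial_below   : cases with complexity under this count as trivial
    trivial_penalty : surcharge for bothering the teacher with a trivial case
    """
    cost = base + per_complexity * complexity
    if complexity < trivial_below:
        cost += trivial_penalty
    return cost


trivial_case = teacher_cost(0.0)   # base fee plus the trivial-case penalty
moderate_case = teacher_cost(2.0)  # base fee plus a small complexity charge
hard_case = teacher_cost(6.0)      # base fee plus a large complexity charge
```

Under these assumed parameters the cheapest queries are the moderately difficult ones, since trivial cases are penalized and very complex cases are expensive for the teacher to resolve.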
5. Cost of Intervention
Suppose we have data from a manufacturing process.
Each feature might be a measurement of an aspect of the
process, while the classes might be different types of
products. A learning algorithm could induce rules that
predict the type of product, given the corresponding
features. Suppose we wish to intervene in the
manufacturing process, to make more of one type of
product. We could give the induced rules a causal
interpretation.
For example, assume that we have a continuous process,
such as petroleum distillation. Suppose a rule says, “If
sensor A has a value greater than B, then the yield of
product type C will increase.” If this rule has causal
significance, then we may be able to increase the amount
of product type C by intervening in the process so that
sensor A consistently has a value greater than B. There
may be a cost associated with this intervention. Each
feature may have a corresponding cost, where the cost
represents the effort required to intervene in the
manufacturing process at the particular point represented
by the feature (Verdenius, 1991).
This is somewhat different from the idea of assigning a
cost to a feature based on the effort required to measure
the feature. Instead, the cost represents the effort required
to manipulate the process in order to alter the feature's
value.
5.1 Constant Intervention Cost
In the simplest scenario, the cost of intervention for a
given feature is the same for all cases (Verdenius, 1991).
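A minimal sketch of how constant intervention costs might be used, assuming each induced rule names one feature to manipulate and one product type that it favours. The feature names, product labels, and costs below are hypothetical, and the sketch ignores how large a yield increase each rule promises.

```python
# Hypothetical fixed cost of intervening at each feature (same for all cases).
INTERVENTION_COST = {"sensor_A": 5.0, "sensor_D": 2.0, "sensor_E": 9.0}

# Hypothetical induced rules: (feature to manipulate, product type favoured).
RULES = [("sensor_A", "C"), ("sensor_D", "C"), ("sensor_E", "F")]


def cheapest_intervention(rules, costs, target_product):
    """Return the cheapest feature to manipulate for the target product.

    Returns None when no rule favours the target product.
    """
    candidates = [
        feature for feature, product in rules if product == target_product
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda feature: costs[feature])


choice = cheapest_intervention(RULES, INTERVENTION_COST, "C")
```

Here two rules favour product type C, and the sketch selects the one whose feature is cheaper to manipulate.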
5.2 Conditional Intervention Cost
In a more complex scenario, the cost of intervention for a
feature may depend on the particular case (for a
continuous process, “observation” may be a more
appropriate term than “case”). For example, if a sensor is
observed to be near its average value, it may be relatively
easy to manipulate the process in order to move the
average up or down slightly. However, if the sensor is
observed to be far from its average value, it may be quite
difficult to move it even further from its average value
(van Someren et al., 1997).
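The sensor example can be illustrated with a simple cost function. This is only a sketch under assumed functional form: the cost is taken to be proportional to the size of the requested change, inflated by how far the target value lies from the sensor's average, both measured in standard deviations. The formula and parameter names are assumptions of the example, not a model from the cited work.

```python
def intervention_cost(observed, target, mean, std, base=1.0, steepness=1.0):
    """Illustrative cost of moving a sensor from `observed` to `target`.

    mean, std : assumed long-run average and standard deviation of the sensor
    base      : cost per standard deviation of change near the average
    steepness : how quickly cost grows as the target leaves the average
    """
    # Size of the requested change, in standard deviations.
    step = abs(target - observed) / std
    # Distance of the target from the average, in standard deviations.
    target_distance = abs(target - mean) / std
    return base * step * (1.0 + steepness * target_distance)


# Same step size (half a standard deviation), different starting points.
near_average = intervention_cost(observed=100.0, target=101.0,
                                 mean=100.0, std=2.0)
far_from_average = intervention_cost(observed=106.0, target=107.0,
                                     mean=100.0, std=2.0)
```

With these assumed values, the same half-standard-deviation adjustment is cheaper when the sensor starts at its average than when it is already three standard deviations away and is being pushed further out.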
6. Cost of Unwanted Achievements
When we are dealing with the scenario described in
Section 5, where induced rules are used to intervene in a
causal process, the nature of misclassification error cost
changes. Suppose a rule says, “If sensor A has a value
greater than B, then the yield of product type C will