Provided by Cognitive Sciences ePrint Archive
Types of Cost in Inductive Concept Learning
Peter Turney PETER.TURNEY@NRC.CA
Institute for Information Technology, National Research Council of Canada, M-50 Montreal Road, Ottawa, Ontario,
Canada, K1A 0R6
Abstract
Inductive concept learning is the task of learning
to assign cases to a discrete set of classes. In
real-world applications of concept learning, there
are many different types of cost involved. The
majority of the machine learning literature
ignores all types of cost (unless accuracy is
interpreted as a type of cost measure). A few
papers have investigated the cost of
misclassification errors. Very few papers have
examined the many other types of cost. In this
paper, we attempt to create a taxonomy of the
different types of cost that are involved in
inductive concept learning. This taxonomy may
help to organize the literature on cost-sensitive
learning. We hope that it will inspire researchers
to investigate all types of cost in inductive
concept learning in more depth.
1. Introduction
This paper is an attempt to list the different costs that may
be involved in inductive concept learning. The paper
assumes the standard inductive concept learning scenario.
We have a set of cases (i.e., examples, vectors,
observations) represented as vectors in an abstract space
of features (i.e., tests, measurements, sensor values,
attribute values). Each case belongs to a class (i.e., the
feature space is partitioned into a finite set of distinct
subsets; there is a function mapping from feature space
into a finite set of symbols). The learning algorithm
generates hypotheses that may be used to predict the class
of new cases.
In the following, “cost” should be interpreted in its most
abstract sense. Cost may be measured in many different
units, such as monetary units (dollars), temporal units
(seconds), or abstract units of utility (utils). In medical
diagnosis, cost may include such things as the quality of
life of the patient, in so far as such things can be
(approximately) measured. In image recognition, cost
might be measured in terms of the CPU time required for
certain computations. We take “benefit” to be equivalent
to negative cost.
Often we are uncertain about costs. We can represent this
uncertainty with a probability distribution over a range of
possible costs. This applies to all of the following costs.
In this paper, for ease of exposition, we will assume that
we are certain about costs.
2. Cost of Misclassification Errors
Suppose there are C classes. In general, we may have a
C x C matrix, where the element in row i and column j
specifies the cost of assigning a case to class i, when it
actually belongs in class j. Typically (but not necessarily)
the cost is zero when i equals j. In a minor variation on
this approach, we may have a rectangular matrix, where
there is an extra row for the cost of assigning a case to the
unknown (or “too-difficult-for-this-learner”) class.
2.1 Constant Error Cost
The cost of a certain type of error (the value of a cell in
the cost matrix) may be a constant (the same value for all
cases). This is the most commonly investigated type of
cost; for example, see Breiman et al. (1984) or Hermans
et al. (1974).
If the cost is zero if i equals j and one otherwise, then our
cost measure is the familiar error-rate measure. If the cost
is one if i equals j and zero otherwise, then our cost
measure (in this case, our “benefit measure”) is the
familiar accuracy measure.
2.2 Conditional Error Cost
The cost of a certain type of error may be conditional on
the circumstances.
2.2.1 ERROR COST CONDITIONAL ON INDIVIDUAL CASE
The cost of a classification error may depend on the
nature of the particular case. For example, in detection of
fraud, the cost of missing a particular case of fraud will
depend on the amount of money involved in that
particular case (Fawcett and Provost, 1996, 1997).
Similarly, the cost of a certain kind of mistaken medical
diagnosis may be conditional on the particular patient
who is misdiagnosed. For example, the misdiagnosis may
be more costly in elderly patients.