Types of Cost in Inductive Concept Learning



Case-based reasoning, for example, typically has a low
dynamic complexity during training, but a high dynamic
complexity during testing. On the other hand, neural
networks typically have a high dynamic complexity
during training, but a low dynamic complexity during
testing.

8. Cost of Cases

There is often a cost associated with acquiring cases (i.e.,
examples, feature vectors). Typically a machine learning
researcher is given a small set of cases, and acquiring
further cases is either very expensive or practically
impossible. This is why many papers are concerned with
the “learning curve” (performance as a function of the
sample size) of a machine learning algorithm.
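
The learning curve itself can be estimated empirically by training on nested subsets of the available data and measuring error on a held-out set. The following sketch illustrates this; scikit-learn, the synthetic data, and the choice of classifier are illustrative assumptions, not part of the original discussion.

```python
# Illustrative sketch: estimate a learning curve (error rate as a function
# of training set size) by training on nested subsets of the data.
# The data set and classifier are arbitrary choices for illustration.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

for n in [25, 50, 100, 200, 400, 800]:
    clf = DecisionTreeClassifier(random_state=0).fit(X_train[:n], y_train[:n])
    error = 1.0 - clf.score(X_test, y_test)
    print(f"training set size {n:4d}: error rate {error:.3f}")
```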

8.1 Cost of Cases for a Batch Learner

Suppose that we plan to use a batch learning algorithm to
build a model that will be embedded in a certain software
system. The model will be built once, using a set of
training data. The software system will perform some
task, using the embedded model, a certain number of
times over the operational lifetime of the system.

For a given learning algorithm, if we can estimate (1) the
learning curve (the relation between training set size and
misclassification error rate), (2) the expected number of
classifications that the learned model will make when
embedded in the operational system, over the lifetime of
the system, (3) the cost of misclassification errors, and (4)
the cost of acquiring cases for training data, then we can
calculate the combined cost of training (building the
model) and operating (using the model) as a function of
training set size. We can then optimize the size of the
training set to minimize this combined cost (Provost et al., 1999).
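
As a concrete (hypothetical) illustration of this calculation, the sketch below assumes a power-law learning curve err(n) = a * n^(-b) together with arbitrary costs per case and per misclassification; none of these values are taken from Provost et al. (1999). It computes the combined cost as a function of training set size and selects the size that minimizes it.

```python
# Illustrative sketch: combined cost of training and operating as a
# function of training set size n, under an assumed power-law learning
# curve and assumed costs. All constants are hypothetical.
def combined_cost(n, cost_per_case=2.0, cost_per_error=50.0,
                  lifetime_classifications=100_000, a=0.9, b=0.35):
    error_rate = a * n ** (-b)                  # (1) assumed learning curve
    training_cost = cost_per_case * n           # (4) cost of acquiring n cases
    operating_cost = (lifetime_classifications  # (2) classifications over lifetime
                      * error_rate
                      * cost_per_error)         # (3) cost per misclassification
    return training_cost + operating_cost

# Optimize the training set size by brute-force search over candidates.
sizes = range(10, 20_001, 10)
best_n = min(sizes, key=combined_cost)
print(best_n, round(combined_cost(best_n), 2))
```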

Alternatively, an adaptive learning system, given (1) the
expected number of classifications that the learned model
will make when embedded in the operational system, (2)
the cost of misclassification errors, and (3) the cost of
acquiring cases for training data, could adjust its learning
curve (fast but naïve versus slow but sophisticated) and
training set size to optimize the combined cost of training
and operating.
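
The same kind of calculation can express this adaptive choice: given the operating conditions, compare a fast-but-naïve learner (a curve that improves quickly but flattens at a high error rate) against a slow-but-sophisticated one (better asymptotic error), and pick the learner and training set size with the lowest combined cost. The learning curves and costs below are illustrative assumptions.

```python
# Illustrative sketch: choose a learner (and training set size) by comparing
# combined training-plus-operating costs under assumed learning curves.
def best_plan(curves, cost_per_case=2.0, cost_per_error=50.0,
              lifetime_classifications=100_000, max_cases=20_000):
    plans = []
    for name, err in curves.items():
        for n in range(10, max_cases + 1, 10):
            total = (cost_per_case * n
                     + lifetime_classifications * err(n) * cost_per_error)
            plans.append((total, name, n))
    return min(plans)  # (lowest combined cost, chosen learner, training set size)

curves = {
    "fast_naive": lambda n: 0.30 + 0.5 * n ** -0.5,    # flattens near 30% error
    "slow_sophisticated": lambda n: 0.9 * n ** -0.35,  # keeps improving with n
}
print(best_plan(curves))
```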

8.2 Cost of Cases for an Incremental Learner

Suppose that we plan to use an incremental learning
algorithm to build a model that will be embedded in a
certain software system. Unlike the batch learning
scenario, the model will be continuously refined over the
operational lifetime of the system. However, it is likely
that the software system cannot be operationally deployed
without any training. We must decide how many training
cases we should give to the incremental learner before it
becomes sufficiently reliable to deploy the software
system. To make this decision rationally, we need to
assign a cost to acquiring cases for training data. The
situation is similar to the batch learning situation, except
that we suppose that the misclassification error rate will
continue to decrease after the software system is
deployed.
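
A hypothetical version of this decision is sketched below. It assumes the same power-law learning curve as before and, in addition, assumes that every classification made after deployment also supplies a new training case at no cost, so the error rate keeps falling during operation; only the initial cases must be purchased. All constants are illustrative.

```python
# Illustrative sketch: choose how many training cases n0 to acquire before
# deploying an incremental learner, assuming the error rate continues to
# fall as free cases arrive during operation. All constants are hypothetical.
import numpy as np

def deployment_cost(n0, cost_per_case=2.0, cost_per_error=50.0,
                    lifetime_classifications=100_000, a=0.9, b=0.35):
    training_cost = cost_per_case * n0
    # Number of cases seen by the learner at each classification it makes.
    cases_seen = n0 + np.arange(lifetime_classifications, dtype=float)
    error_rates = a * cases_seen ** (-b)        # assumed learning curve
    operating_cost = cost_per_error * error_rates.sum()
    return training_cost + operating_cost

best_n0 = min(range(10, 5_001, 10), key=deployment_cost)
print(best_n0, round(deployment_cost(best_n0), 2))
```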

9. Human-Computer Interaction Cost

There is a human cost to using inductive learning
software. This cost includes finding the right features for
describing the cases, finding the right parameters for
optimizing the performance of the learning algorithm,
converting the data to the format required by the learning
algorithm, analyzing the output of the learning algorithm,
and incorporating domain knowledge into the learning
algorithm or the learned model.

9.1 HCI Cost of Data Engineering

By “data engineering”, we mean the steps required to
prepare the data so that they are suitable for a standard
inductive concept learning algorithm. This includes
finding the right features and converting the data to the
required format. Although there has been some discussion
of the issues involved in data engineering (Turney et al., 1995), we are not aware of any attempt to measure the
HCI costs involved in data engineering.

9.2 HCI Cost of Parameter Setting

Most learning algorithms have a number of parameters
that affect their performance, often by adjusting their bias.
There is a cost involved in determining the best parameter
settings. Often cross-validation is used to set the
parameters (Breiman et al., 1984). Again, we are not
aware of any attempt to measure the HCI costs of
parameter setting.
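
As an illustration of this common practice, the sketch below sets two parameters of a decision tree learner by cross-validated grid search; scikit-learn, the synthetic data, and the parameter grid are illustrative assumptions rather than a prescribed procedure.

```python
# Illustrative sketch: set learner parameters by cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 4, 8, None],
                "min_samples_leaf": [1, 5, 20]},
    cv=5,                       # 5-fold cross-validation
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```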

9.3 HCI Cost of Analysis of Learned Models

There is a human cost associated with understanding
induced models, which is particularly important when the
aim of inductive concept learning is to gain insight into
the physical process that generated the data, rather than to
predict the class of future cases. This is often discussed in
the decision tree induction literature, where it is (crudely)
measured by the number of nodes in the induced decision
tree (Mingers, 1989).
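
The node-count measure is simple to compute once a tree has been induced, as the sketch below shows; scikit-learn and the synthetic data are illustrative assumptions.

```python
# Illustrative sketch: the crude comprehensibility measure discussed above,
# the number of nodes in an induced decision tree.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
print("nodes in induced tree:", tree.tree_.node_count)
```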

9.4 HCI Cost of Incorporating Domain Knowledge

Several researchers have examined ways of embedding
domain knowledge in a learning algorithm (Opitz and
Shavlik, 1997). It has often been observed, in the context
of expert system construction, that acquiring domain
knowledge from a domain expert is a major bottleneck.
We suppose that it would also be a bottleneck when incorporating domain knowledge into a learning algorithm or a learned model.


