ROBUST CLASSIFICATION WITH CONTEXT-SENSITIVE FEATURES
Peter D. Turney
Knowledge Systems Laboratory
Institute for Information Technology
National Research Council Canada
Ottawa, Ontario, Canada, K1A 0R6
ABSTRACT
This paper addresses the problem of classifying observations when features are context-sensitive, especially when the testing set involves a context that is different from the training set. The paper begins with a precise definition of the problem, then presents general strategies for enhancing the performance of classification algorithms on this type of problem. These strategies are tested on three domains. The first domain is the diagnosis of gas turbine engines: the problem is to diagnose a faulty engine in one context, such as warm weather, when the fault has previously been seen only in another context, such as cold weather. The second domain is speech recognition, where the context is given by the identity of the speaker: the problem is to recognize words spoken by a new speaker who is not represented in the training set. The third domain is medical prognosis: the problem is to predict whether a patient with hepatitis will live or die, where the context is the age of the patient. For all three domains, exploiting context results in substantially more accurate classification.
INTRODUCTION
A large body of research in machine learning is
concerned with algorithms for classifying observations,
where the observations are described by vectors in a multidimensional space of features. It often happens that a feature is context-sensitive. For example, when diagnosing spinal diseases, the significance of a certain level of flexibility in the spine depends on the age of the patient.
This paper addresses the classification of observations
when the features are context-sensitive.
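To make the notion concrete, the spinal example above can be sketched as follows. This is a minimal Python illustration with invented thresholds and numbers (the paper does not give any); it shows how the same value of a primary feature can imply different classes depending on a contextual feature:

```python
# Hypothetical illustration of a context-sensitive feature: the same
# spinal-flexibility score means different things depending on the
# patient's age (the contextual feature). All numbers are invented.

def diagnose(flexibility, age):
    """Toy rule: a flexibility score of 5 is abnormally low for a child,
    but normal for an elderly patient (invented thresholds)."""
    threshold = 7 if age < 18 else 4   # context shifts the decision boundary
    return "diseased" if flexibility < threshold else "healthy"

print(diagnose(5, age=10))   # -> "diseased"
print(diagnose(5, age=70))   # -> "healthy": same feature value, other context
```

A classifier that discards the age feature cannot separate these two cases, which is precisely the difficulty the paper addresses.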
In empirical studies of classification algorithms, it is
common to randomly divide a set of data into a testing set
and a training set. In this paper, for two of the three
domains, the testing set and the training set have been deliberately chosen so that the contextual features take different values in the training set and the testing set. This adds an extra level of difficulty to the classification problem.
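The deliberate split described above can be sketched in Python. The dataset and field names here are hypothetical (a toy version of the warm/cold-weather engine setting from the abstract); the point is only that the split is made on the contextual feature, not at random:

```python
# Split a dataset on a contextual feature rather than at random, so that
# the training and testing contexts are disjoint. Records are invented.
records = [
    {"context": "cold", "features": [1.2, 0.3], "label": "fault_A"},
    {"context": "cold", "features": [0.9, 0.4], "label": "healthy"},
    {"context": "warm", "features": [1.5, 0.2], "label": "fault_A"},
    {"context": "warm", "features": [1.1, 0.5], "label": "healthy"},
]

train = [r for r in records if r["context"] == "cold"]
test  = [r for r in records if r["context"] == "warm"]

# Every class appears in both halves, but only under different contexts:
assert {r["label"] for r in train} == {r["label"] for r in test}
```

Under a random split, warm-weather instances of each fault would leak into the training set; splitting on the context forces the classifier to generalize across contexts.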
The paper begins with a precise definition of context.
General strategies for exploiting contextual information
are then given. The strategies are tested on three domains.
First, the paper shows how contextual information can
improve the diagnosis of faults in an aircraft gas turbine
engine. The classification algorithms used on the engine
data were a form of instance-based learning (IBL) [1, 2, 3]
and a form of multivariate linear regression (MLR) [4].
Both algorithms benefit from contextual information.
Second, the paper shows how context can be used to
improve speech recognition. The speech recognition data
were classified using IBL and cascade-correlation (CC)
[5]. Again, both algorithms benefit from exploiting
context. Third, the paper shows how context can be used
to improve the accuracy of medical prognosis. Hepatitis
data were classified using IBL.
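Since IBL is used in all three domains, a reference point may help: in its simplest form, instance-based learning is nearest-neighbour classification. The following is a minimal sketch (1-NN under Euclidean distance, with invented data), not the specific IBL variant used in the paper:

```python
import math

def nearest_neighbour(train, query):
    """Classify `query` with the label of its closest training instance
    (1-NN, Euclidean distance). `train` is a list of (vector, label) pairs."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    closest = min(train, key=lambda inst: dist(inst[0], query))
    return closest[1]

train = [([0.0, 0.0], "healthy"), ([1.0, 1.0], "faulty")]
print(nearest_neighbour(train, [0.9, 0.8]))  # closest to (1, 1) -> "faulty"
```

Whether contextual features are included in the distance computation, or used to transform the other features first, is exactly the kind of strategic choice the paper evaluates.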
The results in the three test domains are followed by a discussion of their interpretation. The work presented here is then compared with related work by other researchers, and future work is discussed. Finally, conclusions are drawn. For the three domains (engine diagnosis, speech recognition, and medical prognosis) and the three classification algorithms (IBL, MLR, and CC) studied here, exploiting contextual information yields a significant increase in classification accuracy.