MACHINE LEARNING
the m speakers and n classes in the training set. The intra-
class deviation of xi is:
\[
\sigma_{\mathrm{intra}} = \frac{1}{mn} \sum_{j=1}^{m} \sum_{k=1}^{n} \sigma_{j,k} \qquad (14)
\]
The ratio of inter-class deviation to intra-class deviation is
high when a feature varies greatly across class boundaries,
but varies little within a class. A high weight (a high ratio)
suggests that the feature will be useful for classification.
This is a form of contextual weighting, because the weight
is calculated on the basis of the speaker’s identity, which is
a contextual feature.
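The intra-class deviation can be sketched in code. Only equation (14) is reproduced here; the function name `sigma_intra` is illustrative, and the inter-class deviation (defined earlier in the paper, not in this excerpt) would supply the numerator of the weight:

```python
import numpy as np

def sigma_intra(x, speakers, classes):
    """Intra-class deviation of a feature (equation 14): the mean of the
    per-(speaker, class) standard deviations sigma_{j,k}, averaged over
    all m speakers and n classes."""
    devs = [np.std(x[(speakers == j) & (classes == k)])
            for j in np.unique(speakers)
            for k in np.unique(classes)]
    return float(np.mean(devs))
```

The feature weight described in the text would then be the ratio of the inter-class deviation to this value; a high ratio marks a feature that varies across class boundaries but little within a class.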
Table 4 shows the results of using different combina-
tions of these three strategies with IBL. These results show
that there is a form of synergy here, since the sum of the
improvements of each strategy used separately is less than
the improvement of the three strategies used together
((58 - 56) + (55 - 56) + (58 - 56) = 3% for the sum
of the three strategies used separately versus
66 - 56 = 10% for the three strategies used together).
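The synergy arithmetic above can be restated as a short check (percentages taken from Table 4; variable names are illustrative):

```python
baseline = 56                 # no preprocessing
separate = [58, 55, 58]       # strategies 1, 2, and 5 applied alone
combined = 66                 # all three strategies together

sum_of_gains = sum(p - baseline for p in separate)  # 3 points
combined_gain = combined - baseline                 # 10 points

# Synergy: the combined gain exceeds the sum of the separate gains.
assert combined_gain > sum_of_gains
```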
The three strategies were also tested with cascade-
correlation [5]. Because of the time required for training
CC, results were gathered for only two cases. With no preprocessing, cascade-correlation correctly classified 216 observations (47%). With preprocessing by all three strategies, it correctly classified 236 observations (51%). This shows that contextual information can
be of benefit for both neural networks and nearest
neighbor pattern recognition.
HEPATITIS PROGNOSIS
As in the previous section, this section examines
strategies 1, 2, and 5: contextual normalization, contextual
expansion, and contextual weighting. The problem is to
determine whether hepatitis patients will live or die from
their disease. There are seventeen primary features, of
which twelve are discrete (such as “patient is taking
steroids”, “patient reports fatigue”) and five are continu-
ous (such as “patient’s bilirubin level”). There are two
contextual features, of which one is discrete (patient’s sex)
and one is continuous (patient’s age). The patient’s sex
was not used in the following experiments, since 90% of
the patients were male. The observations fall in two
classes (live or die) [10]. There are many missing values in
the hepatitis data. These were filled in by using the single-
nearest neighbor algorithm with the training data.
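The single-nearest-neighbor fill-in described above might be sketched as follows. The function name and the use of NaN to mark missing values are assumptions, and since the paper's distance measure is not given in this excerpt, mean squared difference over the features both rows share is used for illustration:

```python
import numpy as np

def impute_1nn(data):
    """Fill each row's missing values (NaN) by copying them from the row's
    single nearest neighbor, where distance is computed over the features
    both rows have present. Ties are broken by the first minimum."""
    filled = data.copy()
    for i, row in enumerate(data):
        missing = np.isnan(row)
        if not missing.any():
            continue
        best, best_d = None, np.inf
        for j, other in enumerate(data):
            # Skip the row itself and donors missing the needed values.
            if j == i or np.isnan(other[missing]).any():
                continue
            shared = ~np.isnan(row) & ~np.isnan(other)
            if not shared.any():
                continue
            d = np.mean((row[shared] - other[shared]) ** 2)
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            filled[i, missing] = data[best, missing]
    return filled
```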
For hepatitis prognosis, bilirubin level is a primary
feature for determining whether the patient will die from
the disease. The age of the patient is a contextual feature,
since we can achieve more accurate prognoses by using
the patient’s age. Age is not a primary feature, since
knowing the patient’s age, by itself, does not help us to
make a prognosis. In support of this claim, compare rows
one and three in Table 5. Adding age as a feature actually
reduces accuracy. Background knowledge does not help us
to determine whether age is primary or contextual, since it
is plausible that the patient’s age could be a primary factor
in hepatitis prognosis. In this case, we must use the data to
estimate the probability distribution. The data suggest that
age is a contextual feature.
The data were divided into a training set and a testing
set. Unlike the previous two experiments, there was no
systematic distinction between the training and testing
sets. The data consist of 155 observations, which were
randomly split to make 10 pairs of training and testing
sets. In each pair, there were 100 training observations and
55 testing observations. Thus the total number of observa-
tions for testing purposes was 550.
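The splitting procedure can be sketched as follows (the function name and seed are illustrative):

```python
import random

def make_splits(n_obs=155, n_train=100, n_pairs=10, seed=0):
    """Randomly split the observation indices into 10 train/test pairs,
    as described above: 100 training and 55 testing observations each."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        idx = list(range(n_obs))
        rng.shuffle(idx)
        pairs.append((idx[:n_train], idx[n_train:]))
    return pairs
```

Summing the test halves over the 10 pairs gives the 550 total testing observations reported above.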
Three of the five strategies discussed above were
applied to the data:
Contextual normalization: Each feature was normalized
by equation (11), where the context vector c is simply the
patient’s age. Age was converted into a discrete feature by
dividing age into five intervals, with an equal number of
Table 4: The three strategies applied to the vowel data.

strategy 1 | strategy 2 | strategy 5 | no. correct | percent
No         | No         | No         | 258         | 56
No         | No         | Yes        | 269         | 58
No         | Yes        | No         | 253         | 55
No         | Yes        | Yes        | 272         | 59
Yes        | No         | No         | 267         | 58
Yes        | No         | Yes        | 295         | 64
Yes        | Yes        | No         | 273         | 59
Yes        | Yes        | Yes        | 305         | 66

Table 5: The three strategies applied to the hepatitis data.

strategy 1 | strategy 2 | strategy 5 | no. correct | percent
No         | No         | No         | 393         | 71
No         | No         | Yes        | 393         | 71
No         | Yes        | No         | 390         | 71
No         | Yes        | Yes        | 391         | 71
Yes        | No         | No         | 454         | 83
Yes        | No         | Yes        | 460         | 84
Yes        | Yes        | No         | 457         | 83
Yes        | Yes        | Yes        | 464         | 84