to extract any meaningful results. By way of
comparison, a majority class rule would have performed at
the same level (56% of positive examples were rated ‘0’ on
our scale). Second, work in dream analysis often
concentrates on the negative sentiments in dreams, since
they are typically more present and more differentiated than
positive sentiments [3], [4]. The negative scale can
therefore be useful in isolation.
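
The majority class rule mentioned above can be illustrated with a minimal sketch. The ratings list is hypothetical: only its 56% share of ‘0’ ratings comes from the text, and the function name is an assumption made for illustration.

```python
from collections import Counter


def majority_class_accuracy(labels):
    """Accuracy of a rule that always predicts the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)


# Hypothetical ratings on the 0-3 scale; only the 56% share of '0'
# mirrors the figure quoted in the text, the rest is made up.
ratings = [0] * 56 + [1] * 24 + [2] * 14 + [3] * 6
baseline = majority_class_accuracy(ratings)
print(f"majority-class accuracy: {baseline:.2f}")
```

Any classifier whose accuracy does not exceed this value has learned nothing beyond the label distribution.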
Negative orientation
Level | Description | Sample passage
0 | Neutral | “I was back in Halifax with some
1 | Lightly | “I then got on the street beside a
2 | Moderately | “I ran to the car and it wouldn’t
3 | Highly | “When we got there we were in
Table 2: Description of the negative scales.
Automatic Dream Analysis
The algorithmic framework presented in this section makes
use of the online version¹ of the General Inquirer (GI) [10],
the online version² of the Linguistic Inquiry and Word
Count (LIWC) [9], the weighted GI and HM lexicons
introduced by Turney and Littman [13], and a bag-of-words
approach that uses the Balie³ text pre-processing
software. Results are computed using the Weka
machine learning toolkit [15].
¹ http://www.webuse.umd.edu:9090/
² http://www.liwc.net/liwcresearch.php
³ http://balie.sourceforge.net
The General Inquirer
The first analysis is performed using the General Inquirer
[10]. This resource contains 3,600 words labeled “positive”
or “negative” (the “Pos” and “Neg” tags in GI, respectively).
Moreover, each word is paired with disambiguation rules
that identify whether a specific occurrence carries the
sentiment or not. For instance, when the word “kind” is used
as an adjective, it means “benevolent, charitable” and has a
positive orientation; when it is used as a noun, it has no
specific orientation. For a particular dream, for
example, we obtain “Neg” = 1.6%, meaning that 1.6% of
the words have an unambiguous negative orientation (e.g.,
ANGRY, DISTURB, ...). Although the features are used to
score the negative content of dreams, we still include the
positive cues, which may be useful. From a machine learning
point of view, we create a dataset with the following features:
1. the number of positive words in GI
2. the number of negative words in GI
3. the percentage of positive words in GI
4. the percentage of negative words in GI
5. the difference 1-2
6. the log ratio 1/2
7. the difference 3-4
8. the log ratio 3/4
9. the negative orientation level {0,1,2,3}
Features 1 to 4 are taken directly from the GI output.
Features 5 and 7 give the difference, which is the
“remaining” positive or negative strength of a dream.
Features 6 and 8 give the log ratio, a value that is related
to the difference but is less sensitive to the magnitude of the
compared features.
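
As an illustration, the nine values above could be assembled as in the following sketch. The function name, the dictionary keys, and the additive smoothing constant `eps` are assumptions: the text does not say how a dream with zero positive or zero negative words is handled in the log ratios.

```python
import math


def gi_features(n_pos, n_neg, n_words, level, eps=1.0):
    """Build one dataset row from General Inquirer counts.

    eps is an assumed additive smoothing constant that keeps the log
    ratios finite when either count is zero.
    """
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    pct_pos = 100.0 * n_pos / n_words
    pct_neg = 100.0 * n_neg / n_words
    return {
        "pos_count": n_pos,                                    # feature 1
        "neg_count": n_neg,                                    # feature 2
        "pos_pct": pct_pos,                                    # feature 3
        "neg_pct": pct_neg,                                    # feature 4
        "count_diff": n_pos - n_neg,                           # feature 5
        "count_log_ratio": math.log((n_pos + eps) / (n_neg + eps)),      # 6
        "pct_diff": pct_pos - pct_neg,                         # feature 7
        "pct_log_ratio": math.log((pct_pos + eps) / (pct_neg + eps)),    # 8
        "level": level,                                        # feature 9
    }


# Hypothetical dream: 200 words, 6 tagged Pos, 3 tagged Neg, rated level 1.
row = gi_features(n_pos=6, n_neg=3, n_words=200, level=1)
```

The last entry is the class label that a learner would predict from the other eight values.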
The Linguistic Inquiry and Word Count
The second resource we analyzed is the Linguistic Inquiry
and Word Count [9] software. The LIWC measures
the percentage of positive and negative words in texts.
The LIWC dictionary is composed of 2290 words and
word stems. In contrast with the GI, this resource makes
no use of disambiguation rules; it relies on simple word
counts. The strength of LIWC is its scrupulous choice of
words, made by multiple experts who reached near perfect
agreement. We used the following features:
1. the percentage of positive words in LIWC
2. the percentage of negative words in LIWC
3. the difference 1-2
4. the log ratio 1/2
5. the negative orientation level {0,1,2,3}
Again, we use a feature for the difference of percentage
scores and a feature for the log ratio.
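
Counting against a list of words and word stems, with no disambiguation, can be sketched as follows. This is not the LIWC software itself: the trailing `*` stem convention, the tokenizer, and the toy entries are all assumptions made for illustration.

```python
import re


def liwc_percentages(text, pos_entries, neg_entries):
    """Percentage of tokens that match each word list.

    An entry ending in '*' is treated as a stem and matches any token
    that starts with it; any other entry must equal the token exactly.
    """
    def matches(token, entries):
        for entry in entries:
            if entry.endswith("*"):
                if token.startswith(entry[:-1]):
                    return True
            elif token == entry:
                return True
        return False

    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0, 0.0
    n_pos = sum(matches(t, pos_entries) for t in tokens)
    n_neg = sum(matches(t, neg_entries) for t in tokens)
    return 100.0 * n_pos / len(tokens), 100.0 * n_neg / len(tokens)


# Toy entries, not taken from the real LIWC dictionary.
pos_pct, neg_pct = liwc_percentages(
    "I was afraid and then I was happy",
    pos_entries=["happ*"],
    neg_entries=["afraid", "fear*"],
)
```

Because matching is purely lexical, a word such as “kind” would be counted the same way whether it appears as an adjective or as a noun.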
The Weighted GI and HM
A third strategy is to use the weighted GI and HM lexicons
as described in Turney and Littman [13]. The HM lexicon
originates from work by Hatzivassiloglou and McKeown
[5] that evaluates the semantic orientation of 1600
adjectives. The GI lexicon is derived from the General
Inquirer used in the previous section. In both resources,
each word has a weight that represents its orientation and
strength in the general case. For instance, in the weighted
GI, the word “kind” has a weight of +0.056. The ‘+’ sign
means the orientation is positive, and the absolute value
near 0 means the word is almost neutral (perhaps
because of its meaning as a noun). For
comparison, an unambiguous word such as “outstanding”
has a weight of +13.41, while “broken-hearted” has a
weight of -14.29.
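
Reading the sign as orientation and the absolute value as strength can be made concrete with a short sketch. It uses only the three weights quoted above; the dictionary and function names are hypothetical, and nothing here is meant to describe how the weights are turned into features.

```python
# The three weights quoted in the text; a real lexicon has many more entries.
WEIGHTS = {"kind": 0.056, "outstanding": 13.41, "broken-hearted": -14.29}


def describe(word, weights=WEIGHTS):
    """Return (orientation, strength) for a word in the weighted lexicon."""
    w = weights[word]
    if w > 0:
        orientation = "positive"
    elif w < 0:
        orientation = "negative"
    else:
        orientation = "neutral"
    return orientation, abs(w)


# Rank the words from strongest to weakest, regardless of sign.
by_strength = sorted(WEIGHTS, key=lambda word: abs(WEIGHTS[word]), reverse=True)
```

With these values, “kind” is positive but ranks last in strength, which matches its near-neutral reading.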