A Computational Model of Children's Semantic Memory
Guy Denhiere ([email protected])
L.P.C & C.N.R.S. Universite de Provence
Case 66, 3 place Victor Hugo
13331 Marseille Cedex, France
Benoit Lemaire ([email protected])
L.S.E., University of Grenoble 2, BP 47
38040 Grenoble Cedex 9, France
Abstract
A computational model of children's semantic memory is
built from the Latent Semantic Analysis (LSA) of a
multisource child corpus. Three tests of the model are
described, simulating a vocabulary test, an association test
and a recall task. For each one, results from experiments with
children are presented and compared to the model data.
The model fits the experimental data well, which means that this simulation of
children's semantic memory can be used to simulate a variety
of children's cognitive processes.
Introduction
Models of human language processing are usually based on
a layer of basic semantic representations on top of which
cognitive processes are described. For instance, the
construction-integration model (Kintsch, 1998) describes
processes that operate on a network of propositions. These
basic representations can just be descriptions of what the
human memory looks like, in order for the upper models to
be explicitly stated, but they can also be operationalized so
that the model can be tested on a computer. In the first case,
these representations are usually designed by hand, but this
method prevents large-scale simulations.
This was the case with Kintsch's construction-integration
model until 1998. Before that, researchers had to code
propositions by hand and guess relevant values to code the
strength of links between nodes. Then Kintsch (1998) used
the Latent Semantic Analysis (LSA) model (Deerwester et
al., 1990; Landauer et al., 1998) which provides a way to
automatically build these basic representations. This was a
major step since the construction-integration processes
could then be tested on a large variety of inputs, while being
less dependent on idiosyncratic codings. Such a mechanism
for automatically constructing basic semantic representations
should be carefully designed and tested so that it
simulates human semantic memory as closely as possible.
LSA is nowadays considered a good candidate for
modeling an adult's semantic memory from a large
corpus of representative texts: Bellissens et al. (2002),
Kintsch (2000) and Lemaire & Bianco (2003) used it for
modeling metaphor comprehension; Pariollaud et al. (2002)
used it for modeling the comprehension of idiomatic
expressions; Howard & Kahana (2002) relied on it to model
free recall and episodic memory retrieval; Laham (1997) did
the same for modeling categorization processes; Landauer
& Dumais (1997) designed a model of vocabulary
acquisition based on LSA; Lemaire & Dessus (2001),
Rehder et al. (1998) and Wolfe et al. (1998) used it for
modeling knowledge assessment; Quesada et al. (2001)
modeled complex problem solving by means of LSA basic
representations; Wolfe & Goldman (2003) worked on a
model of reasoning about historical accounts based on LSA.
However, to our knowledge, no set of computational basic
representations has been built that fully mimics children's
semantic memory.
This paper presents such a model. First, we
introduce LSA. We then describe our corpus, which is
intended to mimic the kind of texts children are exposed to.
Finally, we present three experiments that aim at
validating the model.
Latent Semantic Analysis
Basic semantic representations
There are many ways of constructing basic semantic
representations that can be processed by a computer. The
first one is to build them by hand. Powerful formalisms like
description logic (Borgida, 1996) or semantic networks
(Sowa, 1991) have been designed to accurately represent
concepts, properties and relations. However, in spite of
huge efforts (Lenat, 1995), no full set of symbolic
representations has been made that can be considered a reasonable
model of human semantic memory. Hand-coding semantic
information is tedious and, as we mention later, symbolic
representations might not be the best formalism for that.
Another strategy is to rely on corpora to get the semantic
information. Artificial intelligence researchers have
designed sophisticated syntactic processing tools for
automatically describing the knowledge using the kind of
symbolic formalisms mentioned earlier. They usually refer
to them as ontologies or knowledge bases (Vossen, 2003).
However, in spite of great strides, this approach still cannot
provide the basic semantic representations that
cognitive researchers need. First, it cannot be fully
automated, except in specific domains, which prevents
complete coverage of the language. Second, and somewhat
paradoxically, because the descriptions are so elaborate, it
is very hard to design reasoning processes on top of them.
For instance, a simple process like estimating the degree of
semantic association is very hard to operationalize on
complex structures like semantic networks.
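In a vector-based model such as LSA, by contrast, semantic association reduces to a cosine between term vectors obtained from a truncated singular value decomposition of a term-document matrix. The sketch below illustrates this on an invented toy matrix (five made-up terms, four tiny "documents"); a real LSA space is built from a corpus of millions of words, but the mechanics are the same.

```python
import numpy as np

# Toy term-document count matrix (rows = terms, columns = documents).
# The terms and counts are invented purely for illustration.
terms = ["dog", "puppy", "bone", "car", "engine"]
X = np.array([
    [2, 1, 0, 0],  # dog
    [1, 2, 0, 0],  # puppy
    [1, 1, 0, 0],  # bone
    [0, 0, 2, 1],  # car
    [0, 0, 1, 2],  # engine
], dtype=float)

# LSA: singular value decomposition, truncated to the k largest dimensions.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
term_vectors = U[:, :k] * s[:k]  # each row is a term's semantic vector

def similarity(a, b):
    """Cosine between two term vectors: the usual LSA association measure."""
    u = term_vectors[terms.index(a)]
    v = term_vectors[terms.index(b)]
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Terms that share contexts end up close; terms that never co-occur do not.
print(similarity("dog", "puppy"))   # close to 1
print(similarity("dog", "engine"))  # close to 0
```

The point of the example is that, unlike traversing a hand-built semantic network, this association measure is a single, uniform computation that applies to any pair of words in the space.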