FRANCK RAMUS
ber of languages studied does not guarantee that this pattern
would hold for other unrelated languages. Indeed, considering the great variety of cues present in speech, properties other than rhythm may have allowed discrimination, and may therefore be considered confounding factors. This concern
has led researchers to look for a second line of evidence, by
reducing the speech cues that were available for discrimi-
nation. Thus, Mehler et al. (1988) successfully replicated
their experiments after low-pass filtering their stimuli at 400
Hz. This process, which eliminates the higher frequencies
of speech, and therefore most of the phonetic information,
is thought to preserve only its prosodic properties (rhythm
and intonation). Similarly, the experiments by Nazzi et al.
(1998) used filtered speech exclusively, and Ramus et al.
(2000) used sentences that were resynthesized in such a way
as to preserve only prosodic cues (see below). Thus, there
is converging evidence that prosody is all newborns need to
discriminate languages.
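As an illustration, low-pass filtering at 400 Hz of the kind used by Mehler et al. (1988) can be sketched as follows. This is a minimal sketch using SciPy; the filter order and the zero-phase design are our assumptions, not details reported in the original study:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def lowpass_400hz(signal, sample_rate):
    """Low-pass filter a speech waveform at 400 Hz, removing most
    phonetic (segmental) information while sparing prosody."""
    # 8th-order Butterworth low-pass filter with a 400 Hz cutoff
    # (assumed design; the original study does not report filter details)
    sos = butter(8, 400, btype="low", fs=sample_rate, output="sos")
    # Forward-backward filtering avoids phase distortion
    return sosfiltfilt(sos, signal)

# Example on a synthetic signal (white noise) sampled at 16 kHz
fs = 16000
x = np.random.default_rng(0).standard_normal(fs)
y = lowpass_400hz(x, fs)
```

Since the passband covers only a small fraction of the spectrum, the filtered signal retains far less energy than the original while keeping the slow amplitude and pitch modulations that carry rhythm and intonation.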
Nevertheless, prosody does not reduce to rhythm. It re-
mains possible that its other major component, intonation,
plays a role in the observed discriminations. Although we
do not know of typological studies of intonation that would allow specific predictions for all the pairs of languages considered, it is, for instance, predictable that English
and Japanese should be discriminable on the basis of their in-
tonation. Indeed, English is a Head-Complement language,
whereas Japanese is Complement-Head1, and this syntac-
tic parameter is said to have a prosodic correlate, promi-
nence, which is signaled both in terms of rhythm and intona-
tion (Nespor, Guasti, & Christophe, 1996). Moreover, there
is empirical evidence that some languages are discriminable
purely by their intonation, including precisely English and Japanese (Ramus & Mehler, 1999), English and French (Maidment, 1976, 1983), and English and Dutch (de Pijper, 1983).
In order to assess whether newborns actually perceive linguistic rhythm, it is therefore necessary to eliminate the intonation confound, that is, to go beyond speech filtering and remove intonation from the stimuli.
Ramus and Mehler (1999) have adapted a technique,
speech resynthesis, to selectively degrade the different com-
ponents of speech, including rhythm and intonation. This
technique has notably been used to resynthesize different
versions of English and Japanese sentences, and assess which
components of speech were sufficient for discrimination of
the two languages. The different versions included (a) broad
phonotactics + prosody, (b) prosody, (c) rhythm only, and (d)
intonation only. Results showed that pure rhythm was suf-
ficient for French subjects to discriminate between the two
languages. Pure intonation was also sufficient, but the task
was more difficult and required explicit knowledge of one of the two target languages. In the present series of experiments,
we wish to apply the same rationale to the study of language
discrimination by newborns, i.e., progressively eliminate the
speech cues available for discrimination, and finally assess
whether linguistic rhythm is, as hypothesized, the critical
cue.
Experiment 1: Natural speech
This first experiment aims to test the discrimination of two
languages in the most unconstrained condition, using natu-
ral, unsynthesized sentences. The two languages we have
selected are Dutch and Japanese. The discrimination of this
pair of languages was previously tested on 2- to 3-month-old English infants and yielded only a marginally significant result (Christophe & Morton, 1998). This was interpreted as
showing a growing focus on the native language, hence a
loss of interest in foreign ones (consistent with Mehler et al.,
1988). This pair of languages has never been tested on new-
borns, but it is expected to be easy to discriminate, given
the English-Japanese discrimination by French newborns ob-
tained by Nazzi et al. (1998), and the fact that English and
Dutch are very close in many respects, including rhythm.
Materials and Method
All the experiments included in this paper use the same
methodology unless otherwise stated. Since we have made
special efforts to improve upon previously used procedures,
our methodology is described below in great detail.
Stimuli
Dutch and Japanese sentences were taken from a corpus compiled by Nazzi (1997; Nazzi et al., 1998), comprising short news-like sentences read by four female native speakers per language2. We selected 5 sentences per speaker, i.e.,
20 sentences per language, matched in number of syllables
(15 to 19, with an average of 17) and in duration (3120 ms ±186 for Dutch, 3040 ms ±292 for Japanese, F(1,39) = 1.1, p = 0.3). We were also concerned about the possibility that speakers in one language might have a higher pitch than speakers in the other language. Average fundamental frequency3 is indeed significantly different between the two languages: 216 Hz ±19 for Dutch, 235 Hz ±15 for Japanese, F(1,39) = 11.8, p = 0.001. This is compensated for through
resynthesis in Experiment 2, and we will see that this had no
influence on discrimination. Sentences in subsequent exper-
iments were resynthesized from these 40 source sentences,
and differ only with respect to the type of synthesis that was
used4.
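For concreteness, the per-sentence F0 averaging described in footnote 3 can be sketched as follows. This is a minimal sketch of the averaging step only; the actual F0 extraction was performed with the Bliss software, and the frame values below are hypothetical:

```python
import numpy as np

def mean_f0(f0_track):
    """Average F0 of one sentence, ignoring unvoiced frames.

    `f0_track` is a sequence of F0 estimates sampled every 5 ms,
    with 0 marking unvoiced frames (as in footnote 3).
    """
    f0 = np.asarray(f0_track, dtype=float)
    voiced = f0[f0 > 0]          # keep only non-zero (voiced) frames
    return float(voiced.mean()) if voiced.size else 0.0

# Hypothetical track: voiced frames around 220 Hz with unvoiced gaps
track = [0, 0, 218, 221, 223, 0, 219, 0, 0]
print(mean_f0(track))  # → 220.25
```

Averaging only the non-zero frames prevents unvoiced stretches (pauses, voiceless consonants) from artificially lowering a speaker's mean pitch.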
Experimental protocol
As is customary when testing newborns, we used the non-
nutritive sucking technique in a habituation paradigm (Eimas
et al., 1971). Compared with previous studies (see Fernald &
1 For example, relative phrases come after the corresponding
verb in English, but before it in Japanese.
2 This corpus consists exclusively of adult-directed speech.
3 Fundamental frequency was extracted at intervals of 5 ms using
the Bliss software. We calculated an average F0 for each sentence,
as the average of all its non-zero F0 values.
4 Samples of the different types of stimuli
used in the present experiments can be heard on:
http://www.ehess.fr/centres/lscp/persons/ramus/resynth/ecoute.htm.