WORD FORM FREQUENCY AND PHONE DURATIONS IN FINNISH INFORMAL
DIALOGUE
Mietta Lennes
Department of Speech Sciences
University of Helsinki
The author’s work has been funded by the Academy of Finland (projects 53623, 53005).

ABSTRACT
Finnish is a quantity language. In continuous speech, however, the absolute durations of speech sounds do not systematically reflect phonemic length. Phonetic duration depends on many factors, e.g., the articulation, syllable structure, position within the word, accent placement within the utterance, speaking rate and speech style. Moreover, different speech sounds have different probabilities in speech, which may affect their temporal properties. This problem is addressed in the present study. Speech sound durations measured from informal spoken dialogues are compared with the frequencies of the corresponding word forms. Phone durations measured from the initial syllables of frequent words are found to be generally shorter than those in rare words.

1. INTRODUCTION
In the Finnish language, at least eight vowel phonemes and 13 consonant phonemes can be distinguished. Each of these phonemes may occur phonologically as either long or short. In continuous speech, however, the absolute durations of speech sounds do not systematically reflect phonemic length. Both classical and recent studies exist on speech sound durations and their relationships with phonological quantity in Finnish [1, 2, among others]. Phonetic duration is known to depend on at least the phoneme type, its position within the moraic structure, accent placement within the utterance, speaking rate and style. However, most of the existing studies have been performed on read-aloud or prepared speech.

The probabilities of words and phonemes in spoken Finnish are different from those in written language. Speakers need not produce common and predictable words as clearly as rare, informative words. The frequencies of word forms have been shown to affect the articulatory reduction of vowel phones within word tokens in at least Finnish, Dutch, and Russian [3, 4]. Accentuation tends to affect segmental durations as well. Therefore, it may be expected that the probabilities of words contribute to phone durations in Finnish speech. The aim of this study is to investigate the relationships of word frequency and phone durations in six spontaneous, informal Finnish dialogues. Phone durations of different phone classes are studied as well.

2. METHODS
Six dialogues were recorded from 12 native speakers (five females) of Finnish, aged between 20 and 30 years. All speakers were university students and they had lived in the capital city area of Finland (Helsinki, Espoo, or Vantaa) for most of their lives. The participants of each dialogue knew each other well.

2.1. Recordings
The recordings were performed in an anechoic room at the Laboratory of Acoustics and Audio Signal Processing at Helsinki University of Technology. The speakers were sitting two meters apart, facing in opposite directions. They heard both their own and the other speaker’s voice through headphones. Thus, the situation somewhat resembled a telephone conversation. The speakers were left alone in the room for 45-60 minutes, and a few general topics were given for them to discuss. However, they were instructed not to force themselves to keep to these topics.

Each speaker’s voice was recorded with a high-quality headset microphone to a separate channel of a DAT recorder. The digital stereo signal was then transferred to a computer. The channels were separated into two sound files of identical length.

2.2. Annotation
Each speaker’s utterances were first transliterated following Finnish orthographic conventions using the Praat program [5]. Boundaries of utterances, words, and syllables were annotated semi-automatically as separate tiers. All phones in the recorded material from 10 speakers were automatically segmented and labelled. Fragments of these phonetic
annotations were manually corrected and used for further analyses.

[Figure 1: log(Frequency), 0-8, plotted against word forms, 0-2500.]
Fig. 1. The frequency distribution of 2441 orthographically different word forms within five informal Finnish dialogues. The most common word form ‘se’ occurred 2291 times in the material, the rarest words only once. Word frequency is shown as logarithmic. Vertical lines indicate the division into six subgroups, numbered 1-6 from lowest to highest frequency, that were used for comparing phone durations.

3. RESULTS

3.1. Word form frequencies
A frequency dictionary of 2441 word forms was created from a total of 45044 word tokens spoken by the 10 speakers within 6 dialogues. Since morphological analyses have not yet been completed, the frequency dictionary did in some cases contain several structurally identical occurrences of a word form. The word frequency distribution is shown in Figure 1.

3.2. Phone durations
All phone duration measurements were done with the Praat program [5]. In order to build a set of data that would best reflect variability due to the contextual probability of words, all single-word utterances were excluded from the phone duration analysis. Also, utterance-initial stop consonants were excluded, since they tend to be unusually short (the mere bursts of stops). Moreover, all utterance-final phones were discarded to reduce effects of pre-pausal lengthening. Since every word token has at least one syllable and since the word-initial syllables allow for the greatest structural complexity in Finnish, the phone durations were investigated primarily in initial syllables. The data also contains geminates that occurred at the first syllable border. A total of 6585 phone segments (3583 syllables, 312 utterances) were thus analyzed.

3.2.1. Associating phone segments with phonemes
The “phonemic” structure of syllables and words was automatically derived from the orthographic transcripts of each type of unit. This method is often used in speech technology, since the Finnish orthography closely corresponds to phonemic structure. However, this is only a pragmatic means to arrive at a closed and definite set of labels for phonetic segments, and the result depends solely on what the transcriber has written. For instance, the length of a phoneme is determined by whether the transcriber has typed a single or a double character in the orthographic transcript, and this decision is in turn mostly determined by written forms. Therefore, in this paper, the notion of phoneme refers only to the label of a phonetic segment that was derived from the word label, and not to any phonologically defined entity.

In spontaneous speech, phonetic segmentations usually contain sequences of segments which differ from the expected “phonemic” sequence with regard to the number of segments and their labels. The syllable provides an articulatorily motivated unit that can in most cases be used to automatically associate the segmented and transcribed phones with phoneme labels, as long as the syllable boundaries have been marked in the segmentation. Each annotated syllable was divided into at most five structural parts: onset1, onset2, nucleus, coda1, and coda2, of which the nucleus (a vowel part) was always required. Thus, a long vowel phoneme would have the total duration of all vowel phone segments that had been segmented within the boundaries of one syllable. Diphthongs were also dealt with as a separate group.

3.2.2. Phone durations
The word forms were sorted according to their frequency
and divided into six groups that were numbered from 1 to 6
according to increasing frequency (group 1 containing rare
word forms and group 6 the most frequent words, see figure
1).
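
As a rough illustration of the grouping described above, the following sketch counts orthographic word forms and assigns each form to one of six frequency groups. The paper does not state how the group boundaries were chosen, so equal-width bins on log frequency are an assumption here, as are the function names and the toy token list.

```python
import math
from collections import Counter


def build_frequency_dictionary(tokens):
    """Count how often each orthographic word form occurs."""
    return Counter(tokens)


def assign_frequency_groups(freqs, n_groups=6):
    """Map each word form to a group 1..n_groups by log frequency.

    Assumption: groups are equal-width bins on log(frequency); the
    original study may have placed the boundaries differently.
    """
    logs = {form: math.log(count) for form, count in freqs.items()}
    lo, hi = min(logs.values()), max(logs.values())
    width = (hi - lo) / n_groups
    groups = {}
    for form, value in logs.items():
        if width == 0:
            groups[form] = 1  # all forms are equally frequent
        else:
            # The most frequent form would land just past the last bin,
            # so it is clamped into group n_groups.
            groups[form] = min(int((value - lo) / width) + 1, n_groups)
    return groups


# Toy data, not taken from the corpus.
tokens = ["se"] * 100 + ["niin"] * 20 + ["joo"] * 3 + ["talo"]
freqs = build_frequency_dictionary(tokens)
groups = assign_frequency_groups(freqs)
```

With these toy counts, the rarest form falls into group 1 and the most frequent form into group 6. A division based on frequency rank rather than log-frequency width would be an equally plausible reading of Figure 1.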
Figure 2 shows the distributions of phone durations for
short and long phonemes and diphthongs within word-initial
syllables according to the six frequency groups. Frequency
groups 5 and 6 contain mostly function words. Such
common function words as niin, joo, siis, ei can be held
responsible for the longest durations of the long phonemes
and diphthongs in group 6.
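
A comparison of this kind can be sketched by collecting durations per quantity class and frequency group and summarising each cell. The records below are invented for illustration, and the use of the median as the summary statistic is an assumption, chosen because duration data tend to be skewed; the paper itself presents the distributions graphically.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (quantity class, frequency group, duration in ms).
records = [
    ("short", 1, 70.0), ("short", 1, 90.0), ("short", 6, 50.0),
    ("long", 1, 150.0), ("long", 6, 110.0), ("long", 6, 130.0),
    ("diphthong", 6, 140.0),
]


def median_by_class_and_group(rows):
    """Return the median duration for each (class, group) cell."""
    buckets = defaultdict(list)
    for quantity, group, duration in rows:
        buckets[(quantity, group)].append(duration)
    return {key: median(values) for key, values in buckets.items()}


summary = median_by_class_and_group(records)
```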
The separation between long and short quantities (as de-
termined from the orthographic labels) is most apparent for
the infrequent words in group 1. The mean durations of
short and long phonemes are different for all word frequencies, but the distribution of values is non-symmetric and
there is a large amount of variability.

Figure 3 indicates how phone duration for short phonemes within word-initial syllables varies by word form frequency for different phoneme labels. There is again a great deal of variability in the duration values, suggesting that many factors probably contribute to them. However, the smoothing curves show a similar, slightly downward trend for each phoneme label. The phoneme labels with the highest frequencies in the whole dialogue material were /i,e,A/ for vowels and /s,t,n/ for consonants.

In the current study, the word form frequencies have only been analysed on the basis of orthographic transcripts, and homonymous forms have not been separately considered. A morphological analysis of the word forms in this corpus is underway, which may help to build models for phone duration.

[Figure 2: three panels (Short phonemes, Long phonemes, Diphthongs); y-axis Duration (ms), 0-250; x-axis Word form frequency group, 1-6.]
Fig. 2. Durations of short and long phonemes and diphthongs within word-initial syllables of six different word frequency classes. Single-word utterances, utterance-initial stops and utterance-final phones were excluded. Word frequency increases from group 1 (left) to group 6 (right). Despite the variability, a negative correlation was found for both short and long phonemes. Some outliers of very long duration are not visible.
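
The negative association between word form frequency and phone duration could be checked along the following lines. The paper does not say which correlation measure was used, so the Pearson coefficient on log frequency is an assumption, and the observations are invented rather than taken from the corpus.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical observations: (word form frequency, phone duration in ms).
observations = [(1, 82.0), (3, 75.0), (20, 71.0), (150, 64.0), (2291, 58.0)]
log_freqs = [math.log(freq) for freq, _ in observations]
durations = [duration for _, duration in observations]

# A negative value means durations shorten as log frequency grows.
r = pearson_r(log_freqs, durations)
```

Because the duration distributions are reported to be non-symmetric, a rank-based measure might be a more robust choice on real data.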
4. CONCLUSIONS
It has been shown that in casual Finnish speech, the phone
durations in the initial syllables of words tend to be shorter
in frequent and predictable words. This may reflect a
general increase in speaking rate. Some of the duration-based
contrasts that exist in clearly pronounced speech may also be
less important in highly predictable parts of casual speech.
5. REFERENCES
[1] Jaakko Lehtonen, Aspects of quantity in stan-
dard Finnish, Number [VI] in Studia Philologica
Jyväskyläensia. Jyväskylä: University of Jyväskylä,
1970.
[2] Michael O’Dell, Intrinsic timing and quantity in
Finnish, Ph.D. thesis, University of Tampere, 2004.
[3] Mietta Lennes, “On the expected variability of vowel
quality in Finnish informal dialogue,” in Proceedings
of the 15th International Congress of Phonetic Sciences
(ICPhS), Barcelona, Spain, M. Solé, D. Recasens, and
J. Romero, Eds., 2003, pp. 2985–2988.
[4] Rob J. J. H. van Son, Olga Bolotova, Mietta Lennes, and
Louis C. W. Pols, “Frequency effects on vowel reduc-
tion in three typologically different languages (Dutch,
Finnish, Russian),” in ICSLP 2004 (INTERSPEECH),
4.-8.10.2004, Jeju Island, Korea, 2004.
[5] Paul Boersma and David Weenink, “Praat: doing
phonetics by computer,” 1992–2004, available at:
http://www.praat.org/.
[Figure 3: one panel per phoneme label (a, e, f, h, i, j, l, m, n, N, o, r, s, u, v, y, ä, ö); y-axis Phone duration, 50-150; x-axis log(Word form frequency), 2-6.]
Fig. 3. Phone durations for short phonemes that occurred in word-initial syllables. Frequency of word forms increases from
left to right. The smooth curves indicate that an increase in word form frequency is apparently associated with slightly
shorter phone durations. There are only a few observations for such rare phoneme labels as /ö/ and /f/. Word-initial stops
were excluded from the data, but geminates at the 1st-2nd syllable border were included. A small number of large values are
not visible.