IS SENTENCE VIABLE?
The 3rd International Conference
on Cognitive Science
Moscow, June 21, 2008
Andrej A. Kibrik ([email protected])
Vera I. Podlesskaya ([email protected])
1
Does spoken language
consist of sentences?
 Sheer facts:
 Spoken language is the primary form of language
 Spoken language does not contain periods, question
marks and other explicit signals of sentence
boundaries
 Research question:
 Is sentence, as a theoretical construct, as identifiable
and as basic for the primary form of language as it is
(or as it is thought to be) for written language?
2
Sentence in spoken language

Position 1: sentence is a universal and basic unit of
language
 Assumption typically held not only by linguists but also by
other cognitive scientists
 “With no more than 50 to 100 K words humans can create and
understand an infinite number of sentences” (Bernstein et al.
1994: 349-350)

 Psycholinguistics: “Sentence processing”
 But sentence is very far from being obvious in spoken language
Position 2: avoidance of the issue, typical of discourse-oriented linguists
 If so, how could sentences become so entrenched in
written language?
3
Night Dream Stories
 Corpus of spoken Russian stories
 Speakers: children and adolescents
 Subject matter: retelling of night dreams
 Discourse type: monologic narrative
(personal stories)
 Speech act type: declaratives
4
Two basic features of spoken
discourse
 Segmentation
 Transitional continuity
5
Segmentation
 Elementary discourse units (EDUs)
 Identified on the basis of a conjunction of
prosodic criteria:
   Tempo pattern
   Loudness pattern
   Integral tonal contour
   Presence of an accentual center
   Pausing pattern
 Speakers tend to organize EDUs as clausal units
6
Example of segmentation Z54
Discourse transcription

/мы с= || ехали на \автобусеw.
/my s= || exali na \avtobusew.
We rode on bus

...(0.6) /Я /первая села в \автобус.
...(0.6) /Ja /pervaja sela v \avtobus.
I first got on bus

..(0.4) А /тогда уже д= || ..(0.2) закрывались \двери,
..(0.4) A /togda uže d= || ..(0.2) zakryvalis’ \dveri,
And then already d= were.closing doors

..(0.1) и /’Аня не –успела \сесть.
..(0.1) i /Anja ne –uspela \sest’.
and Anja not managed get.in

...(0.7) Иw мм(0.4) /\когда-а ..(0.2) ’’(0.3) ..(0.4) {ЧМОКАНЬЕ 0.2} ..(0.4) когда я приехала на нашу /остановку’,
...(0.7) IW mm(0.4) /\kogda-a ..(0.2) ’’(0.3) ..(0.4) {SMACKING 0.2} ..(0.4) kogda ja priexala na našu /ostanovku’,
And when when I arrived to our station
7
Transitional continuity
 Term by J. DuBois et al. 1992
 Alternative term by Sandro V. Kodzasov: phase
 Discourse semantic category: ‘end’ vs. ‘non-end’
(=expectation of a forthcoming end)
 Hierarchical nature of phase
 End of tentative sentence: falling tonal accent
 Non-end: rising tonal accent
8
A canonical example of the
transitional continuity distinction
z57:15-16

..(0.4) /\Мы-ы’ ..(0.4) \как бы за них /взя-ались,
..(0.4) /\My-y’ ..(0.4) \kak by za nix /vzja-alis’,
We sort of at them got.hold
Rising (“comma”): non-end

...(0.5) и-и ввь= || ..(0.2) полетели \вве-ерх.
...(0.5) i-i vv’= || ..(0.2) poleteli \vve-erx.
and flew upward
Falling (“period”): end

 If things were that easy, sentence
would be uncontroversial
9
Uncanonical situation:
Non-end with a falling tonal accent

....(1.5) /\Озеро ...(0.5) какое-то,
....(1.5) /\Ozero ...(0.5) kakoe-to,
Lake some

..(0.3) (Или /\речка,
..(0.3) (Ili /\rečka,
Either river

или /\озеро,
ili /\ozero,
or lake

но по-моему \озеро,
no po-moemu \ozero,
but I guess lake

потому что’ ..(0.2) как-то-оw
potomu čto’ ..(0.2) kak-to-oW
because somehow

...(0.6) \маленькое такое,
...(0.6) \malen’koe takoe,
small such

\небольшое.)
\nebol’šoe.)
minor

....(1.0) ’и-иh ...(0.7) через /него
....(1.0) ’i-iH ...(0.7) čerez /nego
and across it

..(0.3) как-то \бревно какое-то,
..(0.3) kak-to \brevno kakoe-to,
somehow log some

типа \моста.
tipa \mosta.
like bridge
10
The problem of two kinds of
falling
 The existence of non-final falling may call
the relevance of sentence into question
 However, the distinction between two
kinds of falling is very systematic
 The two kinds of falling:
 are prosodically distinct
 have distinct discourse functions
11
Prosodic criteria of the final vs.
non-final falling distinction

Primary criteria:
1. Target frequency band
2. Post-accent behavior
12
Criterion 1: Target frequency
band
 Final falling (“period”): targets the
bottom of the speaker’s F0 range
 Non-final falling (“falling comma”): targets
a level several dozen Hz (several
semitones) higher
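The relation between "several dozen Hz" and "several semitones" can be checked with the standard conversion of 12 semitones per octave. The sketch below is only an illustration: the function names and the 2-semitone threshold are assumptions made here, not part of the authors' procedure. The 210 Hz and 170 Hz values are the ones quoted for the Z54 example in this talk.

```python
import math


def semitones_between(f_high_hz: float, f_low_hz: float) -> float:
    """Interval between two frequencies in semitones (12 per octave)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def falling_type(target_hz: float, floor_hz: float, threshold_st: float = 2.0) -> str:
    """Label a falling accent by how far its F0 target lies above the speaker's floor.

    threshold_st is a hypothetical cut-off chosen for illustration only.
    """
    distance_st = semitones_between(target_hz, floor_hz)
    return "final" if distance_st < threshold_st else "non-final"


# Z54 example: non-final falling at 210 Hz, final falling at 170 Hz.
interval = semitones_between(210.0, 170.0)
print(round(interval, 2))            # about 3.66 semitones
print(falling_type(170.0, 170.0))    # final
print(falling_type(210.0, 170.0))    # non-final
```

With these figures a 40 Hz gap corresponds to roughly three and a half semitones, which is consistent with the wording of Criterion 1.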
13
F0 graph for the “lake” example
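[Figure: F0 contour of the “lake” example. Labeled accents: \ozero, / \malen’koe takoe, / \nebol’šoe. / \brevno kakoe-to, / \mosta. Values marked on the graph: 12, 10, 12, 8, 5.]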
14
Non-final falling (210 Hz),
final falling (170 Hz),
rising, post-rising falling Z54:
4-5

..(0.4) А /тогда уже д= || ..(0.2) закрывались \двери,
..(0.4) A /togda uže d= || ..(0.2) zakryvalis’ \dveri,
And then already d= were.closing doors
Non-final falling: 210 Hz

..(0.1) и /’Аня не –успела \сесть.
..(0.1) i /Anja ne –uspela \sest’.
and Anja not managed get.in
Final falling: 170 Hz

...(0.7) Иw мм(0.4) /\когда-а ..(0.2) ’’(0.3) ..(0.4) {ЧМОКАНЬЕ 0.2} ..(0.4) когда я приехала на нашу /остановку’,
...(0.7) IW mm(0.4) /\kogda-a ..(0.2) ’’(0.3) ..(0.4) {SMACKING 0.2} ..(0.4) kogda ja priexala na našu /ostanovku’,
And when when I arrived to our station
15
Criterion 2: Post-accent
behavior
 Final falling (“period”): steady falling on
the post-accent syllables
 Non-final falling (“comma”): lack of falling
on post-accent syllables, often rise of tone
(V-curve)
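Criterion 2 can be pictured as a check on the F0 values of the syllables that follow the accented one. The sketch below is a simplified illustration under stated assumptions: the function name, the 2 Hz tolerance, and the sample values are hypothetical and do not come from the corpus.

```python
def post_accent_behavior(accent_hz, post_accent_hz, tolerance_hz=2.0):
    """Classify a falling accent by what F0 does on the post-accent syllables.

    accent_hz      -- F0 at the end of the accented syllable
    post_accent_hz -- F0 values of the following syllables, in order
    tolerance_hz   -- hypothetical margin below which a change counts as level
    """
    contour = [accent_hz, *post_accent_hz]
    steps = [later - earlier for earlier, later in zip(contour, contour[1:])]
    if not steps:
        return "undetermined"          # no post-accent syllables to inspect
    if all(step < -tolerance_hz for step in steps):
        return "final"                 # steady falling after the accent
    if any(step > tolerance_hz for step in steps):
        return "non-final (V-curve)"   # tone rises again after the accent
    return "non-final (level)"         # no further falling, no clear rise


print(post_accent_behavior(200.0, [190.0, 180.0]))   # final
print(post_accent_behavior(200.0, [215.0]))          # non-final (V-curve)
print(post_accent_behavior(200.0, [201.0, 199.0]))   # non-final (level)
```

A real analysis would work on a measured pitch track rather than one value per syllable; the point here is only the logic of the distinction.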
16
V-curve z26

....(5.7) /Домик ...(0.6) был /около \реч↑ки,
....(5.7) /Domik ...(0.6) byl /okolo \reč↑ki,
Little.house was near creek

....(3.3) /рядом были \–родник-ки,
....(3.3) /rjadom byli \–rodnik-ki,
nearby were springs

..(0.4) и \–ле-ес.
..(0.4) i \–le-es.
and forest

F0 values marked on the graph: 260 Hz, 240 Hz, 235 Hz
17
Secondary criteria
3. Pausing pattern
4. Reset vs. latching
5. Steepness of falling
6. Interval of falling
18
The final vs. non-final falling
distinction
 A speaker’s prosodic pattern must first be
identified
 On its basis, final and non-final falling
can be distinguished with a high degree of
robustness
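Because the criteria are relative to each speaker, some estimate of the speaker's own F0 range is needed before any accent is classified. The following is a minimal sketch of one possible way to obtain such an estimate; the function name, the 5% trimming of extreme values, and the synthetic data are all assumptions made for illustration, not the procedure used in the study.

```python
import math


def speaker_f0_profile(f0_hz, trim=0.05):
    """Estimate a speaker's F0 floor, ceiling and range from voiced F0 samples.

    trim -- hypothetical share of extreme values dropped at each end,
            to reduce the influence of pitch-tracking errors
    """
    if not 0.0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    values = sorted(f0_hz)
    if not values:
        raise ValueError("no F0 samples given")
    k = int(len(values) * trim)
    trimmed = values[k:len(values) - k] if k > 0 else values
    floor_hz, ceiling_hz = trimmed[0], trimmed[-1]
    return {
        "floor_hz": floor_hz,
        "ceiling_hz": ceiling_hz,
        "range_st": 12.0 * math.log2(ceiling_hz / floor_hz),
    }


# Synthetic samples, one value per Hz from 100 to 299 (not real data).
profile = speaker_f0_profile(range(100, 300))
print(profile["floor_hz"], profile["ceiling_hz"])   # 110 289
```

The floor obtained this way could then serve as the reference point for the target frequency band of final falling.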
19
Contexts of non-final falling
 Anticipatory mirror-image adaptation
 Inset
 Stepwise falling
20
Anticipatory mirror-image
adaptation
 ....(1.8) Когда
Kogda
when
 ...(0.5) что-о
čto-o
that
я
\услышала,
ja \uslyšala,
I
heard
/бомба гремит,
/bomba gremit,
bomb
growls
21
Inset
 /Входит ’ ’ ..(0.1) это ...(0.5) /\ма-аль↑чик,
/Vxodit ’ ’ ..(0.1) èto ...(0.5) /\ma-al’↑čik,
enters here boy

/\ну к \другому,
/\nu k \drugomu,
well to another

 ..(0.1) и \говорит:
..(0.1) i \govorit:
and says
22
Stepwise falling

....(1.5) /\Озеро ...(0.5) какое-то,
....(1.5) /\Ozero ...(0.5) kakoe-to,
Lake some

..(0.3) (Или /\речка,
..(0.3) (Ili /\rečka,
Either river

или /\озеро,
ili /\ozero,
or lake

но по-моему \озеро,
no po-moemu \ozero,
but I guess lake

потому что’ ..(0.2) как-то-оw
potomu čto’ ..(0.2) kak-to-oW
because somehow

...(0.6) \маленькое такое,
...(0.6) \malen’koe takoe,
small such

\небольшое.)
\nebol’šoe.)
minor

F0 values marked on the graph: 210 Hz, 190 Hz, 160 Hz
23
Representation of EDU continuity
types in corpus
[Bar chart: number of EDUs per continuity type. Categories: Final falling; Non-final falling; (Non-final) rising. Bar values: 1188, 894, 606. Vertical axis: 0 to 1200.]
24
The status of sentence
 In the speech of most speakers final falling is
clearly distinct from non-final patterns
 Final intonation, expressly distinct from non-final
intonation (both rising and falling), makes the
notion of sentence valid for spoken discourse
 Speakers “know” when they complete a
sentence and when they do not
 Apparently, spoken sentences are the prototype
of written sentences
25
Functions of sentences
 Ease the processing by creating
intermediate informational chunks
 Chafe: superfoci of consciousness
26
However
 Identification of sentences is possible only on the basis
of a complex analytic procedure
 It depends on prior understanding of a speaker’s
prosodic “portrait”
 There are prototypes of final and non-final falling, but
there are also intermediate instances, so
sentencehood may be a matter of degree
 Significant adjustment is needed to apply the procedure
to a different discourse type or a different language
 Therefore, sentence is an elusive, intermediate, non-basic unit of language
27
EDUs vs. sentences: degree
of variability
[Chart: EDUs, distribution in terms of number of words. 53% of EDUs are 3±1 words long; 80% are 3±2 words long.]
[Chart: Sentences, distribution in terms of number of EDUs.]
28
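The figures "53% – 3±1" and "80% – 3±2" read as shares of EDUs whose length falls within a given distance of three words. A minimal sketch of that computation follows; the function name and the word counts are invented for illustration and are not corpus data.

```python
def share_within(lengths, center, radius):
    """Share of items whose length lies within center ± radius (inclusive)."""
    if not lengths:
        raise ValueError("no lengths given")
    hits = sum(1 for n in lengths if abs(n - center) <= radius)
    return hits / len(lengths)


# Hypothetical EDU lengths in words (not taken from the corpus).
edu_lengths = [1, 2, 3, 3, 4, 5, 8]
print(share_within(edu_lengths, center=3, radius=1))   # 4 of 7 EDUs
print(share_within(edu_lengths, center=3, radius=2))   # 6 of 7 EDUs
```

Applying the same function to sentence lengths measured in EDUs would show the contrast the slides describe: a narrow band captures most EDUs but far fewer sentences.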
EDUs vs. sentences: degree
of variability
 Unlike EDUs, sentences are highly variable
 Speakers with short sentences
 Speakers with long sentences equaling
stories
 Clause chaining
29
Conclusions
 Sentence is an intermediate hierarchical
grouping between a whole discourse and
an EDU (roughly, clause)
 Sentence is very far from being a
basic unit of spoken language
30
Acknowledgement
Nikolay Korotaev, member of our project
31