IS SENTENCE VIABLE?
The 3rd International Conference on Cognitive Science
Moscow, June 21, 2008
Andrej A. Kibrik
Vera I. Podlesskaya

Does spoken language consist of sentences?
Plain facts:
  Spoken language is the primary form of language
  Spoken language does not contain periods, question marks, or other explicit signals of sentence boundaries
Research question:
  Is the sentence, as a theoretical construct, as identifiable and as basic for the primary form of language as it is (or as it is thought to be) for written language?

Sentence in spoken language
Position 1: the sentence is a universal and basic unit of language
  Assumption typically held not only by linguists but also by other cognitive scientists
  “With no more than 50 to 100 K words humans can create and understand an infinite number of sentences” (Bernstein et al. 1994: 349-350)
  Psycholinguistics: “Sentence processing”
  But the sentence is very far from being obvious in spoken language
Position 2: avoidance of the issue, typical of discourse-oriented linguists
  If so, how could sentences become so entrenched in written language?

Night Dream Stories
Corpus of spoken Russian stories
Speakers: children and adolescents
Subject matter: retelling of night dreams
Discourse type: monologic narrative (personal stories)
Speech act type: declaratives

Two basic features of spoken discourse
Segmentation
Transitional continuity

Segmentation
Elementary discourse units (EDUs)
Identified on the basis of a conjunction of prosodic criteria:
  Tempo pattern
  Loudness pattern
  Integral tonal contour
  Presence of an accentual center
  Pausing pattern
Speakers tend to organize EDUs as clausal units

Example of segmentation
Discourse transcription, Z54

  /мы с= || ехали на \автобусеw.
  /my s= || exali na \avtobusew.
  We rode on bus

  ...(0.6) /Я /первая села в \автобус.
  ...(0.6) /Ja /pervaja sela v \avtobus.
  I first got on bus

  ..(0.4) А /тогда уже д= || ..(0.2) закрывались \двери,
  ..(0.4) A /togda uže d= || ..(0.2) zakryvalis’ \dveri,
  And then already d= were.closing doors

  ..(0.1) и /’Аня не –успела \сесть.
  ..(0.1) i /Anja ne –uspela \sest’.
  and Anja not managed get.in

  ...(0.7) Иw мм(0.4) /\когда-а ..(0.2) ’’(0.3) ..(0.4) {ЧМОКАНЬЕ 0.2} ..(0.4) когда я приехала на нашу /остановку’,
  ...(0.7) IW mm(0.4) /\kogda-a ..(0.2) ’’(0.3) ..(0.4) {SMACKING 0.2} ..(0.4) kogda ja priexala na našu /ostanovku’,
  And when when I arrived to our station

Transitional continuity
Term from J. Du Bois et al. 1992
Alternative term by Sandro V. Kodzasov: phase
Discourse semantic category: ‘end’ vs. ‘non-end’ (= expectation of a forthcoming end)
Hierarchical nature of phase
End of tentative sentence: falling tonal accent
Non-end: rising tonal accent

A canonical example of the transitional continuity distinction
z57:15-16

  ..(0.4) /\Мы-ы’ ..(0.4) \как бы за них /взя-ались,
  ..(0.4) /\My-y’ ..(0.4) \kak by za nix /vzja-alis’,
  We sort of at them got.hold
  Rising (“comma”): non-end

  ...(0.5) и-и ввь= || ..(0.2) полетели \вве-ерх.
  ...(0.5) i-i vv’= || ..(0.2) poleteli \vve-erx.
  and flew upward
  Falling (“period”): end

If things were that easy, the sentence would be uncontroversial

Non-canonical situation: non-end with a falling tonal accent

  ....(1.5) /\Озеро ...(0.5) какое-то,
  ....(1.5) /\Ozero ...(0.5) kakoe-to,
  Lake some

  ..(0.3) (Или /\речка,
  ..(0.3) (Ili /\rečka,
  Either river

  или /\озеро,
  ili /\ozero,
  or lake

  но по-моему \озеро,
  no po-moemu \ozero,
  but I guess lake

  потому что’ ..(0.2) как-то-оw
  potomu čto’ ..(0.2) kak-to-oW
  because somehow

  ...(0.6) \маленькое такое,
  ...(0.6) \malen’koe takoe,
  small such

  \небольшое.)
  \nebol’šoe.)
  minor

  ....(1.0) ’и-иh ...(0.7) через /него
  ....(1.0) ’i-iH ...(0.7) čerez /nego
  and across it

  ..(0.3) как-то \бревно какое-то,
  ..(0.3) kak-to \brevno kakoe-to,
  somehow log some

  типа \моста.
  tipa \mosta.
  like bridge

The problem of two kinds of falling
The existence of non-final falling may call the relevance of the sentence into question
However, the distinction between the two kinds of falling is very systematic
The two kinds of falling:
  are prosodically distinct
  have distinct discourse functions

Prosodic criteria of the final vs. non-final falling distinction
Primary criteria:
  1. Target frequency band
  2. Post-accent behavior

Criterion 1: Target frequency band
Final falling (“period”): targets the bottom of the speaker’s F0 range
Non-final falling (“falling comma”): targets a level several dozen Hz (several semitones) higher

F0 graph for the “lake” example
[Figure: F0 graph; labeled accents: \ozero, \malen’koe takoe, \nebol’šoe, \brevno kakoe-to, \mosta; plotted values: 12, 10, 12, 8, 5]

Non-final falling (210 Hz), final falling (170 Hz), rising, post-rising falling
Z54: 4-5

  ..(0.4) А /тогда уже д= || ..(0.2) закрывались \двери,
  ..(0.4) A /togda uže d= || ..(0.2) zakryvalis’ \dveri,
  And then already d= were.closing doors

  ..(0.1) и /’Аня не –успела \сесть.
  ..(0.1) i /Anja ne –uspela \sest’.
  and Anja not managed get.in

  ...(0.7) Иw мм(0.4) /\когда-а ..(0.2) ’’(0.3) ..(0.4) {ЧМОКАНЬЕ 0.2} ..(0.4) когда я приехала на нашу /остановку’,
  ...(0.7) IW mm(0.4) /\kogda-a ..(0.2) ’’(0.3) ..(0.4) {SMACKING 0.2} ..(0.4) kogda ja priexala na našu /ostanovku’,
  And when when I arrived to our station

Criterion 2: Post-accent behavior
Final falling (“period”): steady falling on the post-accent syllables
Non-final falling (“comma”): lack of falling on the post-accent syllables, often a rise of tone (V-curve)

V-curve
z26
[Figure labels: 260 Hz, 240 Hz, 235 Hz]

  ....(5.7) /Домик ...(0.6) был /около \реч↑ки,
  ....(5.7) /Domik ...(0.6) byl /okolo \reč↑ki,
  Little.house was near creek

  ....(3.3) /рядом были \–родник-ки,
  ....(3.3) /rjadom byli \–rodnik-ki,
  nearby were springs

  ..(0.4) и \–ле-ес.
  ..(0.4) i \–le-es.
  and forest

Secondary criteria
  3. Pausing pattern
  4. Reset vs. latching
  5. Steepness of falling
  6. Interval of falling

The final vs. non-final falling distinction
A speaker’s prosodic pattern must be identified first
On that basis, the distinction between final and non-final falling can be identified with a high degree of robustness

Contexts of non-final falling
Anticipatory mirror-image adaptation
Inset
Stepwise falling

Anticipatory mirror-image adaptation

  ....(1.8) Когда я \услышала,
  ....(1.8) Kogda ja \uslyšala,
  when I heard

  ...(0.5) что-о /бомба гремит,
  ...(0.5) čto-o /bomba gremit,
  that bomb growls

Inset

  /Входит ’ ’ ..(0.1) это ...(0.5) /\ма-аль↑чик,
  /Vxodit ’ ’ ..(0.1) èto ...(0.5) /\ma-al’↑čik,
  enters here boy

  /\ну к \другому,
  /\nu k \drugomu,
  well to another

  ..(0.1) и \говорит:
  ..(0.1) i \govorit:
  and says

Stepwise falling
[Figure labels: 210 Hz, 190 Hz, 160 Hz]

  ....(1.5) /\Озеро ...(0.5) какое-то,
  ....(1.5) /\Ozero ...(0.5) kakoe-to,
  Lake some

  ..(0.3) (Или /\речка,
  ..(0.3) (Ili /\rečka,
  Either river

  или /\озеро,
  ili /\ozero,
  or lake

  но по-моему \озеро,
  no po-moemu \ozero,
  but I guess lake

  потому что’ ..(0.2) как-то-оw
  potomu čto’ ..(0.2) kak-to-oW
  because somehow

  ...(0.6) \маленькое такое,
  ...(0.6) \malen’koe takoe,
  small such

  \небольшое.)
  \nebol’šoe.)
  minor

Representation of EDU continuity types in corpus
[Bar chart: number of EDUs by continuity type; categories: final falling, non-final falling, (non-final) rising; bar values: 1188, 894, 606; vertical axis 0 to 1200]

The status of sentence
In the speech of most speakers, final falling is clearly distinct from non-final patterns
Final intonation, expressly distinct from non-final intonation (both rising and falling), makes the notion of sentence valid for spoken discourse
Speakers “know” when they complete a sentence and when they do not
Apparently, spoken sentences are the prototype of written sentences

Functions of sentences
Ease processing by creating intermediate informational chunks
Chafe: superfoci of consciousness

However
Identification of sentences is possible only on the basis of a complex analytic procedure
It depends on prior understanding of a speaker’s prosodic “portrait”
There are prototypes of final and non-final falling, but there are also intermediate instances; therefore sentencehood may be a matter of degree
Significant tuning is necessary to apply the procedure to a different discourse type or a different language
Therefore, the sentence is an elusive, intermediate, non-basic unit of language

EDUs vs. sentences: degree of variability
EDUs: distribution in terms of number of words
  [Histogram: horizontal axis 0 to 15 words]
  53% of EDUs: 3±1 words
  80% of EDUs: 3±2 words
Sentences: distribution in terms of number of EDUs
  [Histogram: horizontal axis 1 to 29 EDUs]

EDUs vs. sentences: degree of variability
Unlike EDUs, sentences are highly variable
  Speakers with short sentences
  Speakers with long sentences equaling stories
  Clause chaining

Conclusions
The sentence is an intermediate hierarchical grouping between a whole discourse and an EDU (roughly, a clause)
The sentence is very far from being a basic unit of spoken language

Acknowledgement
Nikolay Korotaev, member of our project
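
Illustrative sketch of the two primary criteria
The two primary criteria for distinguishing final from non-final falling (target frequency band and post-accent behavior) can be approximated in a few lines of Python. This is only a sketch under stated assumptions, not the procedure used in the project: the function names, the 2-semitone margin, and the "intermediate" label are hypothetical, and the speaker's F0 floor is assumed to be known from a prior analysis of that speaker's prosodic pattern. The 210 Hz and 170 Hz values come from the Z54 example, with 170 Hz taken here as the speaker's floor.

```python
import math


def semitones_between(f_high_hz, f_low_hz):
    """Musical interval between two frequencies, in semitones."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def classify_falling(target_hz, floor_hz, post_accent_hz=(), margin_st=2.0):
    """Label a falling accent as 'final', 'non-final', or 'intermediate'.

    target_hz      -- F0 reached on the accented syllable
    floor_hz       -- bottom of this speaker's F0 range (assumed known)
    post_accent_hz -- F0 on each post-accent syllable, in order (may be empty)
    margin_st      -- hypothetical tolerance around the floor, in semitones
    """
    # Criterion 1: a final falling targets the bottom of the speaker's range.
    near_floor = semitones_between(target_hz, floor_hz) <= margin_st

    # With no post-accent syllables, criterion 2 cannot be evaluated.
    if not post_accent_hz:
        return "final" if near_floor else "non-final"

    # Criterion 2: a final falling keeps falling on the post-accent syllables;
    # a non-final falling stays level or rises (V-curve).
    contour = [target_hz, *post_accent_hz]
    keeps_falling = all(later < earlier
                        for earlier, later in zip(contour, contour[1:]))

    if near_floor and keeps_falling:
        return "final"
    if not near_floor and not keeps_falling:
        return "non-final"
    # The criteria disagree: sentencehood may be a matter of degree.
    return "intermediate"


if __name__ == "__main__":
    print(round(semitones_between(210, 170), 2))  # interval in semitones
    print(classify_falling(210, 170))             # accent ending at 210 Hz
    print(classify_falling(170, 170))             # accent ending at 170 Hz
```

With these assumptions, the 210 Hz target lies about 3.7 semitones above the 170 Hz floor, which matches the "several semitones higher" description of non-final falling. The secondary criteria (pausing, reset vs. latching, steepness, interval) are deliberately left out of the sketch.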