Title: Number Words
1Number Words Frequency in Modern Lithuanian
- Adriano Cerri
- University of Pisa, Department of Linguistics
- adriano.cerri_at_for.unipi.it
2Introduction
Methodology
Data Remarks
Conclusions
Future directions of study
3History
Etymology
Numerals
Linguistic typology
Anthropology
Quantitative studies
Psychology
4Numerals in many of the world's languages (cf.
Stampe 1976, Greenberg 1978)
- they are part of a system
- they play different roles (simple units, main
bases, secondary bases, upper units, etc.)
Number words frequency?
5Basic questions
- Are numerals used with random frequency?
- If a pattern of use emerges, how can this pattern
be understood within the structure of the system?
6Introduction
Methodology
Data Remarks
Conclusions
Future directions of study
7Target language: Modern Lithuanian
- Useful tools
- - L. Grumadienė & V. Žilinskienė (1997-1998), Dažninis dabartinės rašomosios lietuvių kalbos žodynas [Frequency Dictionary of Modern Written Lithuanian]
- - A. Utka (2009), Dažninis rašytinės lietuvių kalbos žodynas [Frequency Dictionary of Written Lithuanian]
- - Dabartinės lietuvių kalbos tekstynas [Corpus of Contemporary Lithuanian Language] (CCLL), donelaitis.vdu.lt
- - Lietuvių mokslo kalbos tekstynas [Corpus Academicum Lithuanicum] (CorALit), coralit.lt
8The Dictionaries: Advantages
        M.          F.
NOM.    penki       penkios
GEN.    penkių      penkių
DAT.    penkiems    penkioms
ACC.    penkis      penkias
INS.    penkiais    penkiomis
LOC.    penkiuose   penkiose
NOM.M penki: total occurrences 187
9The Dictionaries: Limits
- Complex numerals (two or more number words, e.g.
du šimtai septyniasdešimt trys '273') are not
registered as a single numeral; their
components are counted separately (e.g. 2, 100,
70, 3)
Consequence: complex numerals are not
represented, while their single components are
overrepresented
Original database on number word frequency,
using the CCLL
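The limit described on this slide can be illustrated with a minimal sketch. It assumes a pre-tokenized text and a tiny hand-picked set of number words; `NUMBER_WORDS`, `count_components`, and `count_numerals` are hypothetical names, the sample sentence is invented, and treating every unbroken run of number words as one numeral is a simplifying assumption rather than the procedure actually used to build the database.

```python
from collections import Counter
from itertools import groupby

# Toy lexicon: only the number words needed for this example (an assumption).
NUMBER_WORDS = {"du", "šimtai", "septyniasdešimt", "trys"}


def count_components(tokens):
    """Count every number word on its own, as a frequency dictionary does."""
    return Counter(t for t in tokens if t in NUMBER_WORDS)


def count_numerals(tokens):
    """Count each unbroken run of number words as one (possibly complex) numeral."""
    counts = Counter()
    for is_number, run in groupby(tokens, key=lambda t: t in NUMBER_WORDS):
        if is_number:
            counts[" ".join(run)] += 1
    return counts


# Invented sample: "273 books and three notebooks".
tokens = "du šimtai septyniasdešimt trys knygos ir trys sąsiuviniai".split()

by_component = count_components(tokens)
by_numeral = count_numerals(tokens)

print(by_component["trys"])                          # 2: simple '3' plus the '3' inside '273'
print(by_numeral["trys"])                            # 1: only the simple numeral '3'
print(by_numeral["du šimtai septyniasdešimt trys"])  # 1: the complex numeral '273'
```

Counting by component credits trys twice, although the numeral '3' is used only once; counting by numeral keeps '273' and '3' apart.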
10- Search: simple numerals (e.g. keturi '4')
Number word            Occurr.
NOM.M    keturi        6399
NOM.F    keturios      2809
GEN.M/F  keturių       6421
DAT.M    keturiems     596
DAT.F    keturioms     283
ACC.M    keturis       5982
ACC.F    keturias      3145
INS.M    keturiais     929
INS.F    keturiomis    374
LOC.M    keturiuose    312
LOC.F    keturiose     480
Total                  27,730
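The total for a simple numeral is the sum over its inflected forms. The sketch below recomputes it from the counts listed on this slide; the names `KETURI_FORMS`, `total`, and `most_frequent` are illustrative only, and the feminine accusative label for keturias is an editorial reading of the slide.

```python
# (case/gender label, word form, occurrences in the CCLL) as listed on the slide.
KETURI_FORMS = [
    ("NOM.M", "keturi", 6399),
    ("NOM.F", "keturios", 2809),
    ("GEN.M/F", "keturių", 6421),
    ("DAT.M", "keturiems", 596),
    ("DAT.F", "keturioms", 283),
    ("ACC.M", "keturis", 5982),
    ("ACC.F", "keturias", 3145),
    ("INS.M", "keturiais", 929),
    ("INS.F", "keturiomis", 374),
    ("LOC.M", "keturiuose", 312),
    ("LOC.F", "keturiose", 480),
]

# Lemma frequency = sum of the frequencies of all inflected forms.
total = sum(count for _, _, count in KETURI_FORMS)

# The single most frequent form, found by comparing the counts.
most_frequent = max(KETURI_FORMS, key=lambda row: row[2])

print(total)             # 27730
print(most_frequent[1])  # keturių
```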
12Search: complex numerals (e.g. dvidešimt penki '25')
s studentu grupe iš visos Europos. Dvidešimt penki instrumentalistai dirba dra
iai daugiau nei kitam mirtingajam ( dvidešimt penki lavonai vardan grožio!). Pe
d, jo nuomone, Lietuvoje yra kokie dvidešimt penki verti demesio skulptoriai i
3, penkiolika futbolininku - po 2, dvidešimt penki - po 1. Ši savaitgali ir
tau, kad meluoji! Buvo mažiausiai dvidešimt penki gorciai, tik išmatavome neg
dalyvavo trisdešimt trys teatrai. Dvidešimt penki iš ju vaidino lietuviu, o a
iu tuos tris šimtus metru, turesiu dvidešimt progu tuo isitikinti penki jusu s
imk savo pelna. O tas pelnas buvo dvidešimt penki kartai, kuriuos jis visados
iesiausias kelias i Daugpili - vos dvidešimt penki kilometrai. Taciau ten Riman
i tai, kas priklauso. Priklause dvidešimt penki kirciai, kuriuos jis labai s
kiekvienais metais ne mažiau kaip dvidešimt penki milijardai doleriu pervedami i
kompaktiniu disku deželes (telpa dvidešimt penki sargiai). Dar roskildieciai ji
automobilio modeli - "Carisma". Dvidešimt penki šalies gyventojai, savo lan
iaus ir D.Gireno skrydžiu, kai "... dvidešimt penki tukstanciai lietuviu nesulau
nkauskui. "Senuku" asortimentas - dvidešimt penki tukstanciai prekiu vakariet
lturos skyriaus ataskaitoje... Dvidešimt penki žymiausi ivairiu kartu Balta
jo pulko karininku buvo areštuoti dvidešimt septyni, taip pat penki puskarinin
13Search: complex numerals (e.g. dvidešimt penki '25')
First word    Contextual word
dvidešimt     penki
dvidešimt     penkios
dvidešimt     penkių
dvidešimt     penkiems
dvidešimt     penkioms
dvidešimt     penkis
dvidešimt     penkias
dvidešimt     penkiais
dvidešimt     penkiomis
dvidešimt     penkiuose
dvidešimt     penkiose
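The query list on this slide pairs the indeclinable tens word with every inflected form of the unit word. A minimal sketch of that pairing follows, together with a strict adjacency count. The helper names `build_queries` and `count_adjacent` are hypothetical, and requiring the two words to be directly adjacent is an assumption of this sketch, not a documented behaviour of the CCLL search interface.

```python
TENS_WORD = "dvidešimt"

# The eleven distinct forms of penki '5' (the genitive is shared by both genders).
PENKI_FORMS = [
    "penki", "penkios", "penkių", "penkiems", "penkioms", "penkis",
    "penkias", "penkiais", "penkiomis", "penkiuose", "penkiose",
]


def build_queries(first_word, forms):
    """Pair the first word with each contextual word, one query per form."""
    return [(first_word, form) for form in forms]


def count_adjacent(tokens, first, second):
    """Count places where `first` is immediately followed by `second`."""
    cleaned = [t.lower().strip('.,!?()"') for t in tokens]
    return sum(1 for a, b in zip(cleaned, cleaned[1:]) if (a, b) == (first, second))


queries = build_queries(TENS_WORD, PENKI_FORMS)

# Two fragments taken from the concordance sample on the previous slide.
adjacent_hit = "vos dvidešimt penki kilometrai".split()
distant_hit = "turesiu dvidešimt progu tuo isitikinti penki".split()

print(len(queries))                                       # 11
print(count_adjacent(adjacent_hit, "dvidešimt", "penki"))  # 1
print(count_adjacent(distant_hit, "dvidešimt", "penki"))   # 0
```

Under the adjacency assumption, the second fragment is not counted as an occurrence of '25', because other words stand between dvidešimt and penki.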
14Introduction
Methodology
Data Remarks
Conclusions
Future directions of study
15Table 1. Counts of number word occurrences in
the Corpus of Contemporary Lithuanian Language
(CCLL)
16Chart 1. Number word occurrences in the Corpus
of Contemporary Lithuanian Language (CCLL)
17Chart 2. Numerals 1-9
Trend: frequency decreases as numerical value
increases
(cf. Hurford 1987: 91 for Modern English)
18Chart 3. The tens
19Chart 4. The series of round numerals
20Chart 5. Numerals 11-19
21Chart 6. Numerals 21-29
22Chart 7. The peaks of frequency
Correspondence between the structural role of a
numeral, its cognitive salience and its frequency
of use
23The base (10) of the system is an upper-level unit
(see Charts 2 and 3).
24Introduction
Methodology
Data Remarks
Conclusions
Future directions of study
25Main results
- Lithuanian number words are not used with random frequency
- Trend: within each cycle, the lower the numeral is, the higher its frequency
- Frequency can be subject to comparative predictions (e.g. frequency of 4 > frequency of 9)
- The cycle 1-9 serves as a basic model ruled by the above-mentioned trend
- The whole system proceeds by reproducing the basic model
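The comparative prediction can be written down as a small rule. The sketch below is one possible reading, limited to numerals 1-99: it assumes that a "cycle" is a run of non-round numerals within one decade (1-9, 11-19, 21-29, ...) and that round numerals form a separate series. `cycle_of` and `predicted_more_frequent` are hypothetical names, and the function states an expectation, not a measured result.

```python
def cycle_of(n):
    """Return the decade cycle of a non-round numeral: 0 for 1-9, 1 for 11-19, ..."""
    if n <= 0 or n % 10 == 0:
        raise ValueError("round numerals are treated as a separate series")
    return n // 10


def predicted_more_frequent(a, b):
    """Within one cycle, predict the lower numeral to be the more frequent one.

    Returns None when the numerals belong to different cycles, because the
    trend as stated only compares numerals inside the same cycle.
    """
    if cycle_of(a) != cycle_of(b):
        return None
    return min(a, b)


print(predicted_more_frequent(4, 9))    # 4
print(predicted_more_frequent(27, 21))  # 21
print(predicted_more_frequent(4, 15))   # None
```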
26Main results
- Vienas '1' is the most frequently used numeral
- It serves as a model for those numerals sharing the semantic trait of unity (10, 100, 1000, etc.)
- A correspondence is shown between the structural role of a numeral, its cognitive salience and its frequency of use
- Round numerals attract a higher number of occurrences
27Round numerals
- fulfil the universal need of milestones along
the endless path of numbers
- more salient
- more frequent
- more suitable for approximate uses (to round
off a quantity)
28Introduction
Methodology
Data Remarks
Conclusions
Future directions of study
29- Other languages, especially non-decimal ones
Cross-linguistic perspective: a frequency
typology of numerals?
What is culturally determined? What is universal?
31References
- Bybee & Hopper (eds.) (2001) Frequency and the Emergence of Linguistic Structure. Amsterdam: John Benjamins.
- Bybee (2007) Frequency of Use and the Organization of Language. Oxford: Oxford University Press.
- CCLL: Corpus of Contemporary Lithuanian Language / Dabartinės lietuvių kalbos tekstynas, http://donelaitis.vdu.lt.
- CorALit: Corpus Academicum Lithuanicum / Lietuvių mokslo kalbos tekstynas, http://coralit.lt.
- Greenberg (1978) Generalizations about numeral systems. In J.H. Greenberg, C.A. Ferguson, E.A. Moravcsik (eds.), Universals of Human Language, vol. 3: Word Structure. Stanford: Stanford University Press, 249-295.
- Grumadienė & Žilinskienė (1997) Dažninis dabartinės rašomosios lietuvių kalbos žodynas (mažėjančio dažnio tvarka). Vilnius: Lietuvių kalbos institutas, Matematikos ir informatikos institutas.
- Grumadienė & Žilinskienė (1998) Dažninis dabartinės rašomosios lietuvių kalbos žodynas (abėcėlės tvarka). Vilnius: Lietuvių kalbos institutas, Matematikos ir informatikos institutas.
- Hurford (1987) Language and Number: The Emergence of a Cognitive System. Oxford: Basil Blackwell.
- Kaufman, Lord, Reese & Volkmann (1949) The Discrimination of Visual Number. American Journal of Psychology, 62 (4), 498-525.
- Mandler & Shebo (1982) Subitizing: An Analysis of its Component Processes. Journal of Experimental Psychology: General, 111, 1-22.
- Rūķe-Draviņa (1979) On numerals in Baltic and Slavic languages. Acta Baltico-Slavica, 12, 53-66.
- Stampe (1976) Cardinal Number Systems. In S.S. Mufwene, C.A. Walker, S.B. Steever (eds.), Papers from the Twelfth Regional Meeting of the Chicago Linguistic Society. Chicago: Chicago Linguistic Society, 594-609.
- Thorndike & Lorge (1944) The Teacher's Word Book of 30,000 Words. New York: Columbia University, Teachers College.
- Trick & Pylyshyn (1994) Why are small and large numbers enumerated differently? A limited-capacity preattentive stage in vision. Psychological Review, 101 (1), 80-102.
- Utka (2009) Dažninis rašytinės lietuvių kalbos žodynas. Kaunas: VDU leidykla.