Typology: Language Sampling Anna Siewierska - PowerPoint PPT Presentation

1
Typology: Language Sampling
Anna Siewierska
Dik Bakker
6
Empirical Cycle
[Diagram: the empirical cycle. Labels: languages (L); DATA; PROVISIONAL definition, categories (C1 C3), hypotheses; TEST]
8
Overview
  • Collecting language data
  • Why a sample?
  • Types of biases in samples
  • Two strategies
  • Samples in typological literature
  • The DV method

10
Data collecting
Languages of the world: n ≈ 7000
Sample (50-500)
11
Data collecting
  • Why not all languages in our database?
  • Too many
  • Only < 1000 well described (grammar)
  • < 2000 partial sketch
  • Not (always) necessary. Sometimes even wrong
  • Impossible even in principle

17
All Languages impossible
  • Extant languages: 7000
  • Extinct languages: 500 (Ruhlen 1991)
  • Latin, Cl. Greek, Gothic, Hebrew, Hittite, ...
  • Cl. Turkic, Cl. Tibetan, Archaic Chinese, ...
  • Manx, Cornish, ...
  • Problem?

18
All Languages impossible
  • Extant languages: 7000
  • Extinct languages: 500 (Ruhlen 1991)
  • Latin, Cl. Greek, Gothic, Hebrew, Hittite, ...
  • Cl. Turkic, Cl. Tibetan, Archaic Chinese, ...
  • Manx, Cornish, ...
  • No native speaker intuitions

20
All Languages impossible
  • Extant languages: 7000
  • Extinct languages: 500 (Ruhlen 1991)
  • Latin, Cl. Greek, Gothic, Hebrew, Hittite, ...
  • Cl. Turkic, Cl. Tibetan, Archaic Chinese, ...
  • Manx, Cornish, ...
  • Illinois, Mohican, Massachusett, Carolina, ...
  • X1, X2, X3, ..., Xn

21
All Languages impossible
Extant languages: 7000
Extinct languages: 500 (Ruhlen 1991)
X1, X2, X3, ..., Xn: ????
22
All Languages impossible
Homo sapiens: 200,000 BP
Great Leap Forward: 40,000 BP
Average n of lgs: 6000
Diachronic change: 1000 years
X lgs = (40,000 / 1000) × 6000 = 240,000
24
All Languages impossible
Extant languages: 7000
Extinct languages: 500
X1, X2, X3, ..., Xn: 240,000
Human languages: 247,500
3.0%
26
All Languages impossible
Extant, documented: 1500
Extinct languages: 500
X1, X2, X3, ..., Xn: 240,000
Human languages: 247,500
0.6%
spoken anno 2000
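The back-of-the-envelope estimate on these slides can be replayed in a few lines. The sketch below only restates the slides' own figures; the variable names are illustrative. Reading 3.0 and 0.6 as percentages of the 247,500 total, with 7,500 (extant plus extinct) and 1,500 (extant, documented) as numerators, is an assumption that happens to match the rounded values.

```python
# Figures taken from the slides; names are illustrative only.
YEARS_SINCE_GREAT_LEAP = 40_000   # "Great Leap Forward", 40,000 BP
TURNOVER_YEARS = 1_000            # diachronic change: one "generation" of languages
AVG_LANGUAGES = 6_000             # average number of languages at any one time
EXTANT = 7_000
EXTINCT_KNOWN = 500               # Ruhlen 1991
EXTANT_DOCUMENTED = 1_500


def estimate_unknown_languages() -> int:
    """(40,000 / 1000) generations times 6000 languages each."""
    generations = YEARS_SINCE_GREAT_LEAP // TURNOVER_YEARS
    return generations * AVG_LANGUAGES


def share(part: int, whole: int) -> float:
    """Percentage of `whole` covered by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


unknown = estimate_unknown_languages()
total = EXTANT + EXTINCT_KNOWN + unknown

print(unknown, total)
# Assumed numerators (see the note above):
print(share(EXTANT + EXTINCT_KNOWN, total))
print(share(EXTANT_DOCUMENTED, total))
```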
28
All Languages impossible
Extant, documented: 1500
Extinct languages: 500
X1, X2, X3, ..., Xn: 240,000
Human languages: 247,500
0.6%
spoken anno 2000
Typology: Universals of Human Language
31
All Languages impossible
Uniformitarianism (Lass 1997)
Extant, documented: 1500
Extinct, documented: < 100
X1, X2, X3, ..., Xn: 240,000
Human languages: 247,500
spoken anno 2000
Human Language
34
All Languages impossible
Extant, documented: 1500
Extinct languages: 500
X1, X2, X3, ..., Xn: 240,000
Human languages: 247,500
0.6%
spoken anno 2000
Typology: Variety among human languages
37
Variety: rare types
  • Clicks (only in one family: Khoisan, 30 lgs)
  • Active nominal marking (Pomo, Laz)
  • Opposite person hierarchy Acc-Erg (Tib.-Burm.)
  • Tripartite agreement on ditransitives
  • Syntactic ergativity (Aus, Maya)
  • Adverbial agreement with focal (Aus, Cauc)
  • OSV main clause order (S.Am)
  • N.B. combination of (rare) features (cf. Greenberg)
38
Variety: rare types
  • Clicks (only in one family: Khoisan, 30 lgs)
  • Active nominal marking (Pomo, Laz)
  • Opposite person hierarchy Acc-Erg (Tib.-Burm.)
  • Tripartite agreement on ditransitives
  • Syntactic ergativity (Aus, Maya)
  • Adverbial agreement with focal (Aus, Cauc)
  • OSV main clause order (S.Am)
  → Rara et Rarissima
39
Data collecting
  • Why not all languages in our database?
  • Too many
  • Only < 1000 well described (grammar)
  • < 2000 partial sketch
  • Not (always) necessary. Sometimes even wrong
  • Impossible even in principle
  • Problematic for variety
  • Possibly not for universality

42
Too many languages
Samples in the typological literature
Greenberg (1963)               Word order           30
Hawkins (1983)                 Word order           225
Tomlin (1986)                  Word order           402
Nichols (1992)                 Head/Dep marking     174
Bybee (1994)                   Tense/Aspect/Mood    76
Siewierska & Bakker (1990-)    Pers.Agr.            450
Dryer (1985-)                  Word order           1200
Typical PhD project (1 person, 3 years)             50-100
44
Data collecting
  • Why not all languages in our database?
  • Too many → sample inevitable
  • Only < 1000 well described (grammar)
  • < 2000 partial sketch
  • Not (always) necessary. Sometimes even wrong

47
Lack of material
  • Bibliographic bias
  • (very) old
  • scarce
  • theory specific (Tagmemics, GG)
  • restricted to phonology and morphology
  • biased selection of the world's languages

50
Lack of material
  • Further types of bias
  • Genetic
  • Indo-European, Ugric, Bantu
  • Australian, Amerindian, Papuan - -

52
Lack of material
  • Further types of bias
  • Genetic
  • Areal
  • Sprachbund: Balkan
  • Circum-Baltic
  • C.America
  • S.E.Asia

57
Lack of material
  • Further types of bias
  • Genetic
  • Areal
  • Typological
  • Parametric variables (Hawkins 1983)
  • Prep → Dem Num Adj Gen Rel N NP

59
Lack of material
  • Further types of bias
  • Genetic
  • Areal
  • Typological
  • Parametric variables (Hawkins 1983)
  • PrepNounModHierarchy
  • Prep → ((NDem OR NNum → NA) AND (NA → NGen) AND (NGen → NRel))
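The implicational statement on this slide can be checked mechanically. The sketch below is a minimal illustration, assuming that "NDem OR NNum → NA" groups as (NDem OR NNum) → NA and that each order is coded as a boolean (True when the noun precedes the modifier). The function and parameter names are hypothetical.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: p -> q."""
    return (not p) or q


def satisfies_hierarchy(prep, n_dem, n_num, n_adj, n_gen, n_rel) -> bool:
    """Prep -> ((NDem or NNum -> NA) and (NA -> NGen) and (NGen -> NRel))."""
    consequent = (
        implies(n_dem or n_num, n_adj)
        and implies(n_adj, n_gen)
        and implies(n_gen, n_rel)
    )
    return implies(prep, consequent)


def allowed_prepositional_types():
    """All (NDem, NNum, NA, NGen, NRel) combinations open to a prepositional language."""
    return [
        combo
        for combo in product([False, True], repeat=5)
        if satisfies_hierarchy(True, *combo)
    ]


# A postpositional language is unconstrained by this statement:
print(satisfies_hierarchy(False, True, True, False, False, False))
# A prepositional language with NA but without NGen violates it:
print(satisfies_hierarchy(True, False, False, True, False, True))
print(len(allowed_prepositional_types()))
```

Under this reading, only a small subset of the 32 logically possible modifier-order combinations is open to prepositional languages, which is what makes the variable typologically biased: sampling on one parameter constrains the others.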

62
Lack of material
  • Further types of bias
  • Genetic
  • Areal
  • Typological
  • Cultural
  • Linguistic relativity (Sapir, Whorf)
  • Lucy (1992): count nouns vs classifiers, counting tasks

66
Lack of material
  • Further types of bias
  • Genetic
  • Areal
  • Typological
  • Cultural
  • Community size
  • Small → high genetic drift (Kimura 1983)
  • Also linguistic drift? (Dahl: hunter/gatherer)
  • N.B. OSV/OVS only in < 3000 languages

68
Lack of material
  • Further types of bias
  • Genetic
  • Areal
  • Typological
  • Cultural
  • Community size
  • Language contact
  • Borrowed phenomenon measured twice

69
Lack of material
  • Further types of bias
  • Genetic
  • Areal
  • Typological
  • Cultural
  • Community size
  • Language contact
  • BUT contact may also create new types

71
Data collecting
  • Why not all languages in our database?
  • Only < 1000 well described (grammar)
  • < 2000 partial sketch
  • (bibliographical bias)
  • Not (always) necessary. Sometimes even wrong

72
Data collecting
  • Why not all languages in our database?
  • Only < 1000 well described (grammar)
  • < 2000 partial sketch
  • Cater for biases by stratifying for the relevant dimensions
  • Not (always) necessary. Sometimes even wrong
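Stratifying for the relevant dimensions can be sketched as grouping the candidate languages by the bias parameters and drawing from each group separately. The example below is a generic illustration, not the procedure used by the authors. The language records, the field names `family` and `area`, and the function name are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(languages, keys, per_stratum=1, seed=0):
    """Draw up to `per_stratum` languages from every combination of `keys`."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = defaultdict(list)
    for lang in languages:
        strata[tuple(lang[k] for k in keys)].append(lang)
    sample = []
    for stratum in sorted(strata):
        members = strata[stratum]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical candidate pool: F1 is over-described relative to F2.
LANGUAGES = [
    {"name": "lang_a", "family": "F1", "area": "A1"},
    {"name": "lang_b", "family": "F1", "area": "A1"},
    {"name": "lang_c", "family": "F1", "area": "A2"},
    {"name": "lang_d", "family": "F2", "area": "A2"},
    {"name": "lang_e", "family": "F2", "area": "A2"},
]

sample = stratified_sample(LANGUAGES, keys=("family", "area"))
print([lang["name"] for lang in sample])
```

Each (family, area) combination contributes at most one language, so a heavily documented family cannot dominate the sample merely because more grammars exist for it.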

74
Small is beautiful
A good sample may be better than a large sample.
Sample type and size depend on the goal of the project:
  • Establish the probability of a language type (e.g. prepositional vs postpositional) → Probability sample
  • Explore the existing variety on a certain dimension (e.g. case systems, combination of order patterns) → Variety sample
75
Small is beautiful
  • 1. Probability sample
  • Only independent cases
  • Control for
  • - genetic relations
  • - language contact
  • But relative stability of relevant variables
  • - Reflexive passive (Romance vs Slavic)

77
Large may be better
  • 2. Variety sample
  • Maximum (all?) different cases
  • Cater for
  • - variation in genetic/areal groups
  • - typically cyclical
  • - stop when no new cases found
  • Research parameters typically unknown!
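The cyclical procedure with its stopping rule can be sketched as a loop over successive batches of languages that ends once a batch yields no new type. This is a minimal illustration under assumed names: `batches`, `get_type`, and the `patience` parameter are hypothetical, and the toy data are not real.

```python
def collect_variety(batches, get_type, patience=1):
    """Add batches of languages until `patience` batches in a row add no new type."""
    seen = set()
    sampled = []
    quiet_rounds = 0
    for batch in batches:
        new_types = {get_type(lang) for lang in batch} - seen
        sampled.extend(batch)
        seen |= new_types
        quiet_rounds = 0 if new_types else quiet_rounds + 1
        if quiet_rounds >= patience:
            break  # "stop when no new cases found"
    return seen, sampled


# Toy data: (language, word-order type)
BATCHES = [
    [("l1", "SOV"), ("l2", "SVO")],
    [("l3", "VSO")],
    [("l4", "SOV")],
    [("l5", "OSV")],
]

types_found, languages_used = collect_variety(BATCHES, get_type=lambda lang: lang[1])
print(sorted(types_found), len(languages_used))
```

In the toy run the loop stops after the third batch and never reaches the rare OSV language, which illustrates why a variety sample "cannot be too large": stopping early risks missing rare types.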

78
Probability vs Variety
  • Probability sample
  • relatively small (30-150)
  • may be too large (double cases)
  • Variety sample
  • relatively large (> 200)
  • cannot be too large (just superfluous)

79
Sampling in the literature
Introductions to Typology
Comrie (1981)    pp. 9-12     (4)
Croft (1990)     pp. 18-26    (9)
Whaley (1997)    pp. 36-43    (8)
Song (2001)      pp. 17-38    (22)
80
Probability sampling
  • Bell (1978)
  • genetic, areal and typological bias
  • 478 genetic groups (> 3000 years' depth)
  • per family, n of lgs proportional to n of groups
  • problems
  • sample < 478: selection
  • small families disappear
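The "small families disappear" problem follows directly from proportional allocation with rounding. The sketch below illustrates it with made-up numbers: the family names and group counts are hypothetical, and simple rounding is an assumption about how the quota would be turned into whole languages.

```python
def proportional_allocation(groups_per_family, sample_size):
    """Languages per family, proportional to the family's number of genetic groups."""
    total_groups = sum(groups_per_family.values())
    return {
        family: round(sample_size * n / total_groups)
        for family, n in groups_per_family.items()
    }


# Hypothetical families with their numbers of genetic groups.
GROUPS = {"FamilyA": 60, "FamilyB": 30, "FamilyC": 9, "FamilyD": 1}

allocation = proportional_allocation(GROUPS, sample_size=20)
dropped = [family for family, n in allocation.items() if n == 0]
print(allocation)
print(dropped)
```

With a sample smaller than the number of groups, a family whose quota falls below half a language gets none at all. In general the rounded quotas need not add up to the requested sample size either, so some further selection step is unavoidable.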

81
Probability sampling
  • Perkins (1980)
  • Bell, plus stratified for culture (Murdock 1967)
  • 50 languages with optimal genetic and cultural distance
  • good for probability, too small for variety

82
Probability sampling
  • Dryer (1989)
  • Bell, but
  • 322 established genera, 3500-4000 years deep
  • variable values established per genus, not per language (mainly stable, else the most frequent)
  • 5 macro-areas, counting genera per area
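Counting genera rather than languages can be sketched as two steps: reduce each genus to its most frequent value, then tally genera per area. The code below is an illustrative reading of the bullets above, not Dryer's own procedure in detail. The records, area labels, and genus labels are invented.

```python
from collections import Counter, defaultdict


def genus_value(values):
    """Most frequent value among a genus's languages (ties: first one seen)."""
    return Counter(values).most_common(1)[0][0]


def genera_per_area(records):
    """Count genera, not languages, per (area, value)."""
    by_genus = defaultdict(list)
    for area, genus, value in records:
        by_genus[(area, genus)].append(value)
    counts = defaultdict(Counter)
    for (area, _genus), values in by_genus.items():
        counts[area][genus_value(values)] += 1
    return {area: dict(c) for area, c in counts.items()}


# Hypothetical records: (macro-area, genus, basic word order of one language)
RECORDS = [
    ("Area1", "g1", "SOV"), ("Area1", "g1", "SOV"), ("Area1", "g1", "SVO"),
    ("Area1", "g2", "SOV"),
    ("Area1", "g3", "SVO"),
    ("Area2", "g4", "SOV"),
    ("Area2", "g5", "SVO"), ("Area2", "g5", "SVO"),
    ("Area2", "g6", "SOV"),
]

counts = genera_per_area(RECORDS)
sov_preferred_everywhere = all(
    c.get("SOV", 0) > c.get("SVO", 0) for c in counts.values()
)
print(counts, sov_preferred_everywhere)
```

A preference is only taken seriously here when the same value leads in every area, which guards against one large area or family driving the result.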

83
Probability sampling
 
→ SOV > SVO
 
84
Probability sampling
 
 
  • Good for universal preferences on stable variables
  • Unclear how to generalize to other types of sampling, with languages central
85
Variety sampling
 
Characteristics
  • Create variety samples of any size
  • Free choice of classification used (Gen/Ar/Typ)
  • Stratification on other parameter (Gen/Ar/Typ)
  • Generate new samples, evaluate existing samples
  • Fully formalized and computer implemented
 
86
Variety sampling
 
  • Central idea
  • classifications express linguistic (dis)similarities between languages
  • established on the basis of expert knowledge
  • subject to cyclical improvement and refinement
  • best starting point for explorative research into variation among languages

 
88
Variety sampling
[Diagram: family trees for Afro-Asiatic, Amerindian, Caucasian, Dravidian]
Minimum sample: 1 language per family
91
Variety sampling
[Diagram: family trees with example languages. Afro-Asiatic: HBR, ARB; Amerindian: QUE, GUA; Caucasian: GEO, CHE; Dravidian: KAN, TAM]
Select the language with the best description (for the purpose)
92
Variety sampling
[Diagram: family trees with example languages. Afro-Asiatic: HBR, ARB; Amerindian: QUE, GUA; Caucasian: GEO, CHE; Dravidian: KAN, TAM]
Includes all ISOLATES: Basque, Burushaski, Ket, Nahali, ...
93
Variety sampling
[Diagram: family trees for Afro-Asiatic, Amerindian, Caucasian, Dravidian]
Minimum sample: 1 language per family
Ruhlen (1991): 27
Ethnologue (2005): 120
Basic Sample
Murdock (1967): 50
94
Variety sampling
[Diagram: family trees for Afro-Asiatic, Amerindian, Caucasian, Dravidian, labelled DV=3, DV=6, DV=2]
Extending the Basic Sample to preferred size
e.g. extend Ruhlen-based sample from 27 → 50
KEY: relative complexity of family tree
95
Variety sampling
[Diagram: family trees for Afro-Asiatic, Amerindian, Caucasian, Dravidian, labelled DV=3, DV=6, DV=2]
Adjusting DV values to full tree structure
  • Recursively down the trees
  • Lower levels contribute relatively less to DV
96
Variety sampling
[Diagram: family trees for Afro-Asiatic, Amerindian, Caucasian, Dravidian, labelled DV=3, DV=6, DV=2]
Formula for weight per level:
C_k = C_{k-1} + (N_k - N_{k-1}) * ((MAX - (k-1)) / MAX)
See Rijkhoff & Bakker (1998)
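The per-level formula can be sketched as a simple recurrence. Several points below are assumptions, since the slide does not define its symbols: N_k is read as the number of nodes at level k of a family tree, MAX as the maximum depth used for weighting, and both N_0 and C_0 are taken to be 0. The node counts are invented, and whether the final DV is the last C_k or some normalised form of it is not stated on the slide.

```python
def level_contributions(nodes_per_level, max_depth):
    """Running totals C_1..C_n for
    C_k = C_{k-1} + (N_k - N_{k-1}) * ((MAX - (k-1)) / MAX).
    """
    c = 0.0       # assumption: C_0 = 0
    prev_n = 0    # assumption: N_0 = 0
    totals = []
    for k, n in enumerate(nodes_per_level, start=1):
        weight = (max_depth - (k - 1)) / max_depth  # shrinks as k grows
        c += (n - prev_n) * weight
        totals.append(c)
        prev_n = n
    return totals


# Hypothetical tree: 2 nodes at level 1, 5 at level 2, 9 at level 3; MAX = 4.
totals = level_contributions([2, 5, 9], max_depth=4)
print(totals)
```

The weight falls from 1 at the top level towards 1/MAX at the deepest one, so a split high in the tree adds more than the same split further down. That matches the remark that lower levels contribute relatively less to DV.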
98
Variety sampling
[Diagram: family trees for Afro-Asiatic, Amerindian, Caucasian, Dravidian, labelled 3, 6, 2 and DV=55.5, DV=178.4, DV=8.5]
Formula for weight per level:
C_k = C_{k-1} + (N_k - N_{k-1}) * ((MAX - (k-1)) / MAX)
101
Variety sampling
[Figure: RUHLEN (1991)]
102
Variety sampling
[Figure: values 5.9, 3.3, 6.1]
104
Variety sampling
[Diagram: family trees for Afro-Asiatic, Amerindian, Caucasian, Dravidian, labelled DV=55.5, DV=178.4, DV=8.5]
  • Computer program
  • Number of lgs per family given sample size
  • Optimal distribution over subbranches (maximum distance → maximum variety)

112
Variety sampling
[Diagram: subtree Amerind (51 / 854), Andean (3 / 30); nodes NORTH, SOUTH, AYMA, QUECH, CAHUA, URA]
115
Variety sampling output
 
   
 
   
Typical output
 
 
 
 
120
Variety sampling output
Classification: Ruhlen91
Criterion 1: Diversity Value
dynamic/global/average
Sample size: 100 (1.90% of 5273)

Afro-Asiatic (55.53/6/258)          6
Altaic (15.07/2/62)                 2
Amerind (178.44/6/854)              18
Australian (67.58/30/262)           7
Na-Dene (9.44/2/41)                 1
Niger-Kordofanian (90.38/2/1068)    9
Basque (1.00/0/0)                   1
Etruscan (1.00/0/0)                 1
Gilyak (1.00/0/0)                   1
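One way to turn diversity values into a number of languages per family is to give every family its guaranteed one language and then share out the rest in proportion to DV. The sketch below is a generic largest-remainder scheme, offered as an assumption. It is not the authors' program and is not claimed to reproduce the figures in the output above. The DV values are taken from that output, while the function name and the sample size of 20 are illustrative.

```python
def allocate_by_dv(dv_by_family, sample_size):
    """One language per family first, then the remainder in proportion to DV."""
    n_families = len(dv_by_family)
    if sample_size < n_families:
        raise ValueError("sample is smaller than the minimum of one per family")
    allocation = {family: 1 for family in dv_by_family}
    remaining = sample_size - n_families
    total_dv = sum(dv_by_family.values())
    quotas = {f: remaining * dv / total_dv for f, dv in dv_by_family.items()}
    for family, quota in quotas.items():
        allocation[family] += int(quota)  # whole part of the quota
    leftover = sample_size - sum(allocation.values())
    # Hand the last few slots to the largest fractional remainders.
    by_remainder = sorted(quotas, key=lambda f: quotas[f] - int(quotas[f]), reverse=True)
    for family in by_remainder[:leftover]:
        allocation[family] += 1
    return allocation


DV = {"Afro-Asiatic": 55.53, "Altaic": 15.07, "Amerind": 178.44, "Basque": 1.00}
allocation = allocate_by_dv(DV, sample_size=20)
print(allocation)
```

The guaranteed minimum keeps isolates such as Basque in the sample, while families with complex trees receive most of the additional languages.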
 
 
 
 
123
Variety sampling output
Classification: Ruhlen91
Criterion 1: Diversity Value
dynamic/global/average
Sample size: 100 (1.90% of 5273)

Niger-Kordofanian (2/1068)  9
  Niger-Congo (2/1036)  8
    Niger-Congo Proper (2/1007)  7
      Central Niger-Congo (2/961)  6
        South Central Niger-Congo (3/755)  3
          Eastern (9/703)  1
          Western (2/47)  1
          Ijo-Defaka (2/5)  1
        North Central Niger-Congo (4/206)  3
      West Atlantic (3/46)  1
    Mande (3/29)  1
  Kordofanian (2/32)  1
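The distribution over subbranches can be checked for internal consistency: at every node, the languages and the sample slots of the children should add up to the parent's figures. The sketch below encodes the Niger-Kordofanian output as a nested tuple. The nesting is inferred from the counts, which is an assumption, and the second number in each bracket is read as the number of languages under the node.

```python
# (name, languages under node, languages allocated, children)
TREE = ("Niger-Kordofanian", 1068, 9, [
    ("Niger-Congo", 1036, 8, [
        ("Niger-Congo Proper", 1007, 7, [
            ("Central Niger-Congo", 961, 6, [
                ("South Central Niger-Congo", 755, 3, [
                    ("Eastern", 703, 1, []),
                    ("Western", 47, 1, []),
                    ("Ijo-Defaka", 5, 1, []),
                ]),
                ("North Central Niger-Congo", 206, 3, []),
            ]),
            ("West Atlantic", 46, 1, []),
        ]),
        ("Mande", 29, 1, []),
    ]),
    ("Kordofanian", 32, 1, []),
])


def inconsistent_nodes(node):
    """Names of nodes whose children do not sum to the node's own figures."""
    name, n_langs, allocated, children = node
    bad = []
    if children:
        langs_ok = sum(child[1] for child in children) == n_langs
        slots_ok = sum(child[2] for child in children) == allocated
        if not (langs_ok and slots_ok):
            bad.append(name)
        for child in children:
            bad.extend(inconsistent_nodes(child))
    return bad


print(inconsistent_nodes(TREE))
```

An empty result means the inferred nesting agrees with every count in the output, which supports reading the listing as a top-down split of the family's nine languages.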
 
 
 
 
130
Variety sampling
Side effect of large (variety) sample: hidden diachrony
 
 
 
 
131
Variety sampling
 
   
 
   
  • Problems
  • works only on tree-shaped classifications
  • time depth in genetic trees unbalanced
  • not good for probability samples
  • Creoles? Extinct languages?

 
 
 
 
135
Round off
Two Sample Strategies
1. Probability sample
   - relatively small
   - control for Gen/Ar/Typ bias
2. Variety sample
   - relatively large
   - may be stratified for bias parameters
   - may have diachronic dimension
 
 
 
 
137
Round off
Sample Types
1. Probability sample
2. Variety sample
3. Random sample: when bias is unimportant
4. Convenience sample: when bibliographical constraints kick in ...
 
 
 
 
138
?