Given an annotated corpus

1
Given an annotated corpus
Using annotated corpora to study syntactic
variation and change
Ann Taylor, University of York (UK)
2
Outline
  • Introduction to our series of syntactically
    annotated corpora of earlier stages of English
  • Illustration of the kind of research that can be
    done with these corpora that couldn't be done
    without them

3
Syntactically annotated corpora of earlier stages
of English
  • The York-Toronto-Helsinki Parsed Corpus of Old
    English Prose (Taylor et al., 2003)
  • The Penn-Helsinki Parsed Corpus of Middle
    English II (Kroch and Taylor, 2000)
  • The Penn-Helsinki Parsed Corpus of Early Modern
    English (Kroch et al., 2005)
  • The Parsed Corpus of Early English
    Correspondence (Taylor et al., 2006)
  • The Penn Parsed Corpus of Modern British English
    (Kroch et al., in progress)

4
Corpus    Period        Word count
YCOE      c.800-1100     1,452,086
PPCME2    1125-1500      1,155,965
PPCEME    1500-1710      1,657,058
PCEEC     1410-1700      2,162,134
Total                    6,427,243
PPCMBE    1710-1914      3,000,000
5
Audience
  • The corpora are intended primarily to support
    quantitative work in language variation and
    change
  • Goals:
      • Easy access to structures, not just lexis or
        part of speech
      • Large enough to generate valid statistics
      • Sufficient coverage to trace changes over time

6
The annotation system
  • A modified Penn Treebank scheme
  • Cosmetic changes:
      • Nodes are given labels more familiar to
        generative linguists
  • Major changes:
      • No VP
      • Function is marked on a wider range of sentential
        and NP nodes, but not on PPs

7
( (IP-MAT (CONJ and)
          (NP-SBJ (PRO I))
          (BEP am)
          (ADJP (ADJ sure)
                (CP-THT (C 0)
                        (IP-SUB (NP-SBJ (PRO I))
                                (MD shall)
                                (VB desyre)
                                (NP-OB1 (PRO it)))))
          (PP (PN because)
              (CP-ADV (C 0)
                      (IP-SUB (NP-SBJ (PRO you))
                              (BEP are)
                              (ADVP-LOC (ADV there)))))
          (. .))
  (ID OSBORNE,5.002.40))
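
The bracketed notation above is easy to read into a program. The following is a minimal Python sketch, not part of the original slides and not the corpus's own tooling: it parses the tree shown on this slide into nested lists, recovers the word string, and lists the phrase labels. The function names (`tokenize`, `parse`, `leaves`, `phrase_labels`) are made up for illustration, and treating `0` as an empty element to skip is an assumption.

```python
import re

# The tree from this slide, copied as plain text.
TREE = """
( (IP-MAT (CONJ and) (NP-SBJ (PRO I)) (BEP am)
          (ADJP (ADJ sure)
                (CP-THT (C 0)
                        (IP-SUB (NP-SBJ (PRO I)) (MD shall) (VB desyre)
                                (NP-OB1 (PRO it)))))
          (PP (PN because)
              (CP-ADV (C 0)
                      (IP-SUB (NP-SBJ (PRO you)) (BEP are)
                              (ADVP-LOC (ADV there)))))
          (. .))
  (ID OSBORNE,5.002.40))
"""

def tokenize(text):
    # Each parenthesis is its own token; everything else splits on whitespace.
    return re.findall(r"[()]|[^\s()]+", text)

def parse(tokens):
    """Parse one bracketed tree into nested lists: [label, child, ...]."""
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        items = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                items.append(node())
            else:
                items.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing ")"
        return items

    return node()

def leaves(tree):
    """Yield (tag, word) pairs for terminal nodes, left to right."""
    if len(tree) == 2 and all(isinstance(x, str) for x in tree):
        yield tuple(tree)
    else:
        for child in tree:
            if isinstance(child, list):
                yield from leaves(child)

def phrase_labels(tree):
    """Yield the labels of nodes that have at least one child node."""
    if tree and isinstance(tree[0], str) and any(isinstance(c, list) for c in tree[1:]):
        yield tree[0]
    for child in tree:
        if isinstance(child, list):
            yield from phrase_labels(child)

tree = parse(tokenize(TREE))
# Drop the ID node and empty elements (assumed here to be written as 0).
words = [w for tag, w in leaves(tree) if tag != "ID" and w != "0"]
labels = list(phrase_labels(tree))

print(" ".join(words))  # and I am sure I shall desyre it because you are there .
print(labels)
```

The label list for this sentence contains function-tagged nodes such as NP-SBJ, NP-OB1 and ADVP-LOC and no VP node, which matches the description of the annotation system on the previous slide.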
8
Old English
( (IP-MAT (CONJ ac)
          (NP-NOM (PRON he))
          (VBD bediglode)
          (ADVP (ADV swa) (ADV teah))
          (NP-ACC (PRO his) (NA dada))
          (NP-DAT (DD tam) (ND casere)
                  (NP-DAT-PRN (NRD Dioclitiane))
                  (CP-REL (WNP-NOM-1 (DN se))
                          (C 0)
                          (IP-SUB (NP-NOM T-1)
                                  (BEDI was)
                                  (NP-NOM-PRD (NP-GEN (NG deofles))
                                              (NN biggencga)))))
          (. .))
  (ID coaelive,ALS_Sebastian8.1215))
9
Correspondence Corpus
( (METADATA (AUTHOR BRIAN_DUPPA:MALE:FRIEND:1589:61)
            (RECIPIENT JUSTINIAN_ISHAM:MALE:FRIEND:1611:39)
            (LETTER DUPPA_001:E3:1650:AUTOGRAPH:FRIEND))
  (IP-IMP (IP-MAT-PRN (NP-SBJ (PRO I))
                      (VBP pray))
          (VBI putt)
          (NP-OB1 (PRO it))
          (PP (P upon)
              (NP (PRO your) (N score)))
          (. ,))
  (ID DUPPA,4.001.13))
10
Searching the corpora with CorpusSearch
  • Searches structures using dominance and
    precedence relations
  • Generates statistics
  • Can search its own output
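
To make "dominance and precedence relations" concrete, here is a small Python sketch, not taken from the slides. It does not use CorpusSearch or its query syntax; it only imitates the two relations on a hand-built copy of part of the Osborne tree from slide 7 (the because-clause is left out for brevity). The helper names `children`, `subtrees`, `dominates` and `find_verb_object` are hypothetical.

```python
# Nested lists: [label, child, ...]; terminal nodes are [tag, word].
CLAUSE = ["IP-MAT",
          ["CONJ", "and"],
          ["NP-SBJ", ["PRO", "I"]],
          ["BEP", "am"],
          ["ADJP", ["ADJ", "sure"],
           ["CP-THT", ["C", "0"],
            ["IP-SUB", ["NP-SBJ", ["PRO", "I"]],
             ["MD", "shall"],
             ["VB", "desyre"],
             ["NP-OB1", ["PRO", "it"]]]]]]

def children(node):
    """Immediate daughters of a node (words are not nodes)."""
    return [c for c in node[1:] if isinstance(c, list)]

def subtrees(node):
    """Yield the node itself and every node below it, left to right."""
    yield node
    for child in children(node):
        yield from subtrees(child)

def dominates(node, label):
    """True if the node dominates, at any depth, a node with this label."""
    return any(d[0] == label for c in children(node) for d in subtrees(c))

def find_verb_object(root, verb_tag="VB", obj_label="NP-OB1"):
    """For every clause that immediately dominates both a verb and an
    object, report the clause label and whether the verb precedes the
    object (VO) or follows it (OV)."""
    hits = []
    for node in subtrees(root):
        labels = [k[0] for k in children(node)]
        if verb_tag in labels and obj_label in labels:
            order = "VO" if labels.index(verb_tag) < labels.index(obj_label) else "OV"
            hits.append((node[0], order))
    return hits

# IP-MAT dominates the object, but only IP-SUB immediately dominates it.
print(dominates(CLAUSE, "NP-OB1"))                 # True
print("NP-OB1" in [k[0] for k in children(CLAUSE)])  # False
print(find_verb_object(CLAUSE))                    # [('IP-SUB', 'VO')]
```

A search of this shape (a clause immediately dominating a verb and an object, plus their relative order) is the kind of structural query that a plain word search cannot express.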

11
Variation in verb-object order in Old and Middle
English
  • (Pintzuk & Taylor 2006)

12
Verb-object order in Old English

Ac  he sceal pa  sacfullan   gesibbian
But he must  the quarrelsome reconcile
But he must reconcile the quarrelsome ...
(colwstan1,ALet_2_Wulfstan_1188.256)

Se wolde gelytlian pone lyfigendan hælend
He would diminish  the  living     lord
He would diminish the living lord ...
(colwstan1,ALet_2_Wulfstan_155.98)
13
Verb-object order in Middle English

ear    he hefde his ranceun fulleliche ipaizet
before he had   his ransom  fully      paid
Before he had fully paid his ransom ...
(CMANCRIW,II.101.1228)

zef pu  wult habben bricht sichde wid  pine heorte  echnen
if  you will have   bright sight  with your heart's eyes
If you will have bright sight with your heart's eyes ...
(CMANCRIW,II.73.839)
14
  • The question: what factors affect object position
    in OE and ME?
  • The data: 10,000 tokens containing a medial
    auxiliary, a non-finite verb and an object

15
Factors affecting object position
  • Date of text
  • Length of object
  • Type of object
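
As a rough illustration of how a token might be coded for such factors, here is a Python sketch that is not from the slides. It hand-codes three of the examples from slides 12 and 13, derives verb-object order from the positions of the non-finite verb and the object, and measures object length in words. The `Token` class, the word-index coding, the use of the slide heading (OE or ME) as a coarse stand-in for date, and the choice of words as the unit of length are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Token:
    period: str   # "OE" or "ME", taken from the slide headings
    words: list   # the clause, one word per item
    verb: int     # index of the non-finite verb
    obj: tuple    # (start, end) indices of the object, end exclusive

    @property
    def order(self):
        # The object either starts before the verb (OV) or after it (VO).
        return "OV" if self.obj[0] < self.verb else "VO"

    @property
    def object_length(self):
        # Length counted in words (an assumption; other units are possible).
        return self.obj[1] - self.obj[0]

    @property
    def object_text(self):
        return " ".join(self.words[self.obj[0]:self.obj[1]])

# Hand-coded from the glossed examples; the indices are illustrative.
TOKENS = [
    Token("OE", "Ac he sceal pa sacfullan gesibbian".split(), verb=5, obj=(3, 5)),
    Token("OE", "Se wolde gelytlian pone lyfigendan hælend".split(), verb=2, obj=(3, 6)),
    Token("ME", "ear he hefde his ranceun fulleliche ipaizet".split(), verb=6, obj=(3, 5)),
]

for t in TOKENS:
    print(t.period, t.order, t.object_length, t.object_text)
```

With a parsed corpus, the verb and object positions would come from a structural search rather than from hand-coding, and each row would also carry the object type and the date of its text.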

16
(No Transcript)
17
(No Transcript)
18
(No Transcript)
19
(No Transcript)
20
Type of object
  • Quantified
  • Negative
  • Positive (non-negative, non-quantified)

21
Quantified objects (Middle English)

zef ze  habbed ani god  don
if  you have   any good done
... if you have done any good ...
(CMANCRIW,I.76.310)

fordon pe he scal  azein zeuen awiht
for       he shall again give  something
... for he shall again give something.
(CMLAMBX1,31.396)
22
Negative objects (Middle English)

pt   he ne  mai nan ping  don us buten   godes leaue
that he neg can no  thing do  us without God's leave
... that he can do nothing to us without God's leave.
(CMANCRIW,II.169.2346)

swa pet  ho  ne  scal  of   pere wunde habbe nan oder  uuel
so  that she neg shall from her  wound have  no  other evil
... so that she shall have no other evil from her wound.
(CMLAMB1,83.195)
23
(No Transcript)
24
(No Transcript)
25
(No Transcript)
26
Syntactic variation in Present-Day English (PDE)
  • Heavy-NP shift (Wasow & Arnold)
  • Dative alternation (Wasow, Bresnan)
  • Particle shift (Gries)
  • Saxon vs. of-genitive (Szmrecsanyi)
  • Complementizer omission in complement and
    relative clauses (Jaeger & Wasow)
  • Topicalization, etc. (Cresswell)

27
Conclusions
  • The study of syntactic variation is an up-and-coming
    topic in linguistics
  • It can't be studied using the usual methods
    (introspection, intuition) but requires naturally
    occurring data
  • Plain-text corpora are of limited use for this
  • To study syntactic variation efficiently, you
    really need annotated data, and the more the
    better