Title: Dependency Parsing with Reference to Slovene, Spanish and Swedish


1
Dependency Parsing with Reference to Slovene,
Spanish and Swedish
  • Simon Corston-Oliver
  • Anthony Aue
  • Microsoft Research

2
Noteworthy results
  • Slovene
  • Labeled DA 72.42 (second)
  • Not significantly different from #1 (73.44)
  • Swedish
  • #1 for unlabeled DA (89.54)
  • Much worse than #3 for labeled DA (79.69 vs.
    82.31)

3
Outline
  • Two-stage pipeline
  • Identify unlabeled directed dependencies
  • Label the dependencies

4
Parser
  • Unlabeled directed dependencies
  • Discriminatively trained linear classifier
  • Projective dependencies only
  • Parse features
  • Case-normalized surface form and lemma
  • POS of each token
  • POS of intervening and neighboring tokens
  • Combinations of these
  • Direction and distance of attachment
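
The feature list above can be made concrete with a minimal sketch of feature extraction for one candidate head-dependent pair. This is illustrative only and is not the authors' implementation: the `Token` type, the `arc_features` function, the feature-name strings, the distance cap of 5, and the `<pad>` boundary symbol are all assumptions introduced here.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    form: str   # surface form as it appears in the sentence
    lemma: str
    pos: str    # part-of-speech tag


def arc_features(sent: List[Token], head: int, dep: int) -> List[str]:
    """Return string features for a candidate arc head -> dep (0-based indices)."""
    h, d = sent[head], sent[dep]
    lo, hi = min(head, dep), max(head, dep)

    # Direction and distance of attachment (the cap of 5 is a hypothetical choice).
    direction = "R" if dep > head else "L"
    distance = min(hi - lo, 5)

    feats = [
        # Case-normalized surface form and lemma of both tokens.
        f"h_form={h.form.lower()}",
        f"d_form={d.form.lower()}",
        f"h_lemma={h.lemma.lower()}",
        f"d_lemma={d.lemma.lower()}",
        # POS of each token, plus one simple combination of the two.
        f"h_pos={h.pos}",
        f"d_pos={d.pos}",
        f"h_pos+d_pos={h.pos}+{d.pos}",
        f"dir={direction}",
        f"dist={distance}",
    ]

    # POS of each token strictly between head and dependent.
    for tok in sent[lo + 1:hi]:
        feats.append(f"between_pos={h.pos}+{tok.pos}+{d.pos}")

    # POS of immediate neighbors, padded at the sentence boundaries.
    def pos_at(i: int) -> str:
        return sent[i].pos if 0 <= i < len(sent) else "<pad>"

    feats.append(f"h_prev_pos={pos_at(head - 1)}")
    feats.append(f"h_next_pos={pos_at(head + 1)}")
    feats.append(f"d_prev_pos={pos_at(dep - 1)}")
    feats.append(f"d_next_pos={pos_at(dep + 1)}")
    return feats


# Usage with made-up tags: candidate arc from "barks" (head) to "The" (dependent).
sent = [Token("The", "the", "DET"), Token("dog", "dog", "NOUN"),
        Token("barks", "bark", "VERB")]
feats = arc_features(sent, head=2, dep=0)
```

A linear classifier would then score each candidate arc as a weighted sum over such features; how the scores are combined into a projective tree is not specified on the slide and is left out here.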

5
POS features
  • Use fine-grained POS tags for all languages except
    Dutch and Turkish
  • Swedish: normalize tags for auxiliaries
  • Original: vara (be) AV, måst (must) MV
  • Replace with aux
  • Unlabeled DA 89.23 → 89.45
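
The Swedish tag normalization can be sketched as a simple mapping step applied before feature extraction. Only the two tags named on the slide (AV and MV) are included; the full set of auxiliary tags that was collapsed is not given, so `AUX_TAGS` is deliberately partial. The function name and the other tags in the usage line are illustrative placeholders.

```python
from typing import List

# Only AV (vara, "be") and MV (måst, "must") are named on the slide.
# Any further verb-specific auxiliary tags would have to be added here.
AUX_TAGS = {"AV", "MV"}


def normalize_aux_tags(tags: List[str], aux_label: str = "aux") -> List[str]:
    """Replace every tag in AUX_TAGS with one shared auxiliary label."""
    return [aux_label if tag in AUX_TAGS else tag for tag in tags]


# Usage: the non-auxiliary tags are left untouched.
normalized = normalize_aux_tags(["PO", "AV", "NN"])
```

Collapsing verb-specific tags into one label lets the classifier share statistics across auxiliaries instead of learning each one separately.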

6
Root identification features
  • Many errors identifying root in periphrastic
    constructions with aux and participle
  • E.g. German: aux/modal in second position in
    declarative main clause, but in initial position
    with subj-aux inversion
  • New features
  • POS sequence to left of each token
  • Leftmost finite verb and not preceded by
    subordinating conj or relative pron
  • Sentence does (not) contain finite verb
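
The three new root features can be sketched as follows. This is one possible reading rather than the authors' code: the `Tok` type, the `finite` flag, the tag names in `SUBORDINATORS`, the left-context window of 3, and the choice to treat a subordinator anywhere to the left as "preceding" are all assumptions.

```python
from typing import Dict, List, NamedTuple


class Tok(NamedTuple):
    pos: str
    finite: bool = False  # True for finite verbs (assumed recoverable from the tags)


# Hypothetical tag names for subordinating conjunctions and relative pronouns;
# each treebank uses its own inventory.
SUBORDINATORS = {"SCONJ", "RELPRON"}


def root_features(sent: List[Tok], i: int, window: int = 3) -> Dict[str, object]:
    """Features describing how root-like token i is."""
    left = sent[:i]
    leftmost_finite = sent[i].finite and not any(t.finite for t in left)
    after_subordinator = any(t.pos in SUBORDINATORS for t in left)
    return {
        # POS sequence to the left of the token (truncated to `window` tags).
        "left_pos_seq": "_".join(t.pos for t in left[-window:]) or "<start>",
        # Leftmost finite verb, not preceded by a subordinator.
        "leftmost_finite_main": leftmost_finite and not after_subordinator,
        # Whether the sentence contains any finite verb at all.
        "sentence_has_finite": any(t.finite for t in sent),
    }


# Usage: pronoun, finite auxiliary, participle (a periphrastic construction).
main_clause = [Tok("PRON"), Tok("AUX", True), Tok("PART")]
aux_feats = root_features(main_clause, 1)
```

In this example the auxiliary gets `leftmost_finite_main` set to true while the participle does not, which is the kind of signal the slide describes for periphrastic constructions.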

7
Root identification features
  • Danish improved
  • RA 94.12 → 94.72
  • Spanish improved
  • RA 80.08 → 83.57
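
One way to compare the two gains is relative error reduction, which the slides do not report; the short calculation below is derived only from the four RA figures above, and the function name is illustrative.

```python
def error_reduction(before: float, after: float) -> float:
    """Relative reduction in error rate (percent) for accuracies given in percent."""
    err_before = 100.0 - before
    err_after = 100.0 - after
    return 100.0 * (err_before - err_after) / err_before


results = {"Danish": (94.12, 94.72), "Spanish": (80.08, 83.57)}
for lang, (before, after) in results.items():
    print(f"{lang}: +{after - before:.2f} absolute, "
          f"{error_reduction(before, after):.1f}% relative error reduction")
```

On these figures the absolute gains are 0.60 points for Danish and 3.49 points for Spanish, corresponding to roughly 10.2% and 17.5% of the remaining root errors.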

8
Labeling dependencies
  • Use a maximum entropy classifier (Berger et al.,
    1996)
  • Fast to train
  • Good probability estimates
  • Intended to jointly model sets of labels
  • Actually labeled independently
  • Better results with SVM?
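
Independent labeling with a maximum entropy model can be sketched as a softmax over per-label feature weights, applied to each dependency on its own. The weights below are hand-set for illustration (no training is shown), and the label names, feature strings, and function names are assumptions rather than the authors' setup.

```python
import math
from typing import Dict, List

Weights = Dict[str, Dict[str, float]]  # label -> feature -> weight


def label_distribution(feats: List[str], weights: Weights) -> Dict[str, float]:
    """Maxent (softmax) distribution over labels for one dependency."""
    scores = {lab: sum(w.get(f, 0.0) for f in feats) for lab, w in weights.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {lab: math.exp(s - top) for lab, s in scores.items()}
    z = sum(exps.values())
    return {lab: e / z for lab, e in exps.items()}


def label_independently(arcs: List[List[str]], weights: Weights) -> List[str]:
    """Pick the most probable label for each dependency, ignoring its siblings."""
    labels = []
    for feats in arcs:
        dist = label_distribution(feats, weights)
        labels.append(max(dist, key=dist.get))
    return labels


# Usage with hypothetical weights and features.
weights = {
    "SUBJ": {"dep_pos=NOUN": 1.0, "dir=L": 0.5},
    "OBJ": {"dep_pos=NOUN": 1.0, "dir=R": 0.5},
}
arcs = [["dep_pos=NOUN", "dir=L"], ["dep_pos=NOUN", "dir=R"]]
labels = label_independently(arcs, weights)
```

Because each arc is decided separately, nothing stops two dependents of the same head from both receiving the same label, which is the limitation that joint modeling of label sets would address.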

9
Swedish using SVMs

10
Japanese using SVMs

11
Conclusion
  • Two-stage pipeline
  • Feature engineering important
  • For predicting dependencies
  • For labeling dependencies
  • Replacing maxent classifier with SVM gave boost