Statistical Phrase-Based Translation. Authors: Koehn, Och, Marcu - PowerPoint PPT Presentation

Transcript and Presenter's Notes



1
Statistical Phrase-Based Translation
Authors: Koehn, Och, Marcu
  • Presented by Albert Bertram
  • Titles, charts, graphs, figures and tables were
    extracted from the paper.
  • Critique and snarky remarks, however, are
    original.

2
Motivation
  • We have a new way of learning phrase
    translations, but
  • What is the best method to extract phrase
    translation pairs?

3
What to do, what to do
  • Compose a framework for consistent comparison
  • Implement each algorithm
  • Compare the results

4
Evaluation Framework
  • Phrases
  • Models involved
  • Language model
  • Statistical model for translation
  • Distortion model
  • Decoder

5
Evaluation Framework Phrases
  • We all know what phrases are right?
  • NP, VP, wait, what? Oh.
  • Here, they're generic spanning and
    non-overlapping subsequences of words.
  • Are these guys really linguists?

6
Evaluation Framework Models
  • Language Model
  • Trigram: usually p(e_n | e_{n-1}, e_{n-2})
  • Translation Model
  • argmax_e p(e|f) = argmax_e p(f|e) p(e)
  • e_best = argmax_e p(f|e) p_LM(e) ω^length(e)
  • p(f|e) is decomposed into phrase translation
    probabilities φ(f_i|e_i) and distortion
    probabilities d(a_i - b_{i-1})
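The noisy-channel search e_best = argmax_e p(f|e) p_LM(e) ω^length(e) can be sketched in a few lines. Here `p_f_given_e` and `p_lm` are hypothetical placeholder functions standing in for the translation and language models, not the paper's actual implementations:

```python
import math

def score(e, f, p_f_given_e, p_lm, omega=1.0):
    """Noisy-channel score p(f|e) * p_LM(e) * omega^length(e), in log space."""
    return (math.log(p_f_given_e(f, e))
            + math.log(p_lm(e))
            + len(e) * math.log(omega))

def best_translation(f, candidates, p_f_given_e, p_lm, omega=1.0):
    """e_best = argmax over candidate English sentences of the score above."""
    return max(candidates, key=lambda e: score(e, f, p_f_given_e, p_lm, omega))
```

With ω = 1 the length factor drops out; ω > 1 rewards longer output, counterbalancing the language model's preference for short sentences.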

7
Evaluation Framework Models
  • Distortion Model
  • d(a_i - b_{i-1})
  • a_i = start position of the foreign phrase
    translated into the i-th English phrase
  • b_{i-1} = end position of the foreign phrase
    translated into the (i-1)-th English phrase
  • Learned from the joint probability model Ping
    told us about
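The paper's distortion model penalizes jumps between phrases as d(a_i - b_{i-1}) = α^|a_i - b_{i-1} - 1|, so monotone translation costs nothing. A minimal sketch (the b_0 = 0 convention and the function name are my choices for illustration):

```python
def distortion_cost(phrase_starts, phrase_ends, alpha=0.5):
    """Product over phrases of d(a_i - b_{i-1}) = alpha^|a_i - b_{i-1} - 1|.
    phrase_starts[i] = a_i, start (1-based) of the foreign phrase producing
    the i-th English phrase; phrase_ends[i] = b_i, its end. b_0 is taken as
    0, so a first phrase starting at position 1 costs alpha^0 = 1."""
    cost = 1.0
    prev_end = 0
    for a, b in zip(phrase_starts, phrase_ends):
        cost *= alpha ** abs(a - prev_end - 1)
        prev_end = b
    return cost
```

For a fully monotone segmentation the exponent is 0 at every step, so the cost stays 1.0; any reordering multiplies in a penalty.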

8
Evaluation Framework Decoder
  • Left-to-right incremental
  • Stack-based beam search
  • Estimates future costs
  • Same decoder used in all experiments
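The stack organization of the decoder can be caricatured like this. This toy version is monotone only, with no language model, distortion cost, or future-cost estimate, so it illustrates the one-stack-per-coverage-count bookkeeping rather than the actual decoder:

```python
import heapq
from collections import defaultdict

def beam_decode(f_words, phrase_table, beam=5):
    """Toy left-to-right, monotone stack decoding. One stack per number of
    covered foreign words; each stack is pruned to `beam` hypotheses.
    `phrase_table` maps a foreign word tuple to [(english, log_prob), ...]."""
    n = len(f_words)
    stacks = defaultdict(list)          # covered count -> [(logprob, english)]
    stacks[0] = [(0.0, "")]
    for covered in range(n):
        for logp, eng in stacks[covered]:
            for end in range(covered + 1, n + 1):
                src = tuple(f_words[covered:end])
                for tgt, p in phrase_table.get(src, []):
                    stacks[end].append((logp + p, (eng + " " + tgt).strip()))
        for k in list(stacks):          # histogram pruning
            stacks[k] = heapq.nlargest(beam, stacks[k])
    return max(stacks[n], default=(float("-inf"), None))[1]
```

Using the same decoder for every phrase model is what makes the paper's comparison apples-to-apples: only the phrase table changes between experiments.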

9
Baseline Experiments
  • Word-based alignment
  • Syntactic phrases
  • Phrase alignments

10
Baseline Word-based Alignment
  • Learn the phrases from word alignments
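Phrase-pair extraction from a word alignment can be sketched as follows. This is a simplified version (0-based indices, no expansion over unaligned boundary words): a pair is kept only if no alignment link crosses out of either span:

```python
def extract_phrases(alignment, f_len, max_len=3):
    """Extract phrase pairs consistent with a word alignment: every link
    inside the foreign span must land inside the English span and vice
    versa. `alignment` is a set of (f_index, e_index) pairs, 0-based."""
    pairs = set()
    for f1 in range(f_len):
        for f2 in range(f1, min(f1 + max_len, f_len)):
            # English positions linked to the foreign span [f1, f2]
            e_links = [e for (f, e) in alignment if f1 <= f <= f2]
            if not e_links:
                continue
            e1, e2 = min(e_links), max(e_links)
            if e2 - e1 >= max_len:
                continue
            # consistency: no link from the English span leaves the foreign span
            if all(f1 <= f <= f2 for (f, e) in alignment if e1 <= e <= e2):
                pairs.add(((f1, f2), (e1, e2)))
    return pairs
```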

11
Baseline Syntactic Phrases
  • Learn only syntactically correct phrases
  • Start with the word based alignment
  • Prune out the phrase pairs which aren't subtrees
    in the parsed sentences for either language.

12
Baseline Phrase Alignment
  • Marcu and Wong, 2002
  • Yes, this is the paper Ping just presented.

13
Experiment Background
  • Europarl and BLEU
  • Training corpora of 10, 20, 40, 80, 160, and 320
    thousand sentence pairs

14
Baseline Results
Notice the bottom row there? Comparing these
models is like taking a 5-year-old to a chess
tournament.
15
Baseline Results
16
More Experiments
  • Weighting Syntactic Phrases
  • Maximum Phrase Length
  • Lexical Weighting
  • Phrase Extraction Heuristic
  • Simpler Underlying Word-Based Models
  • Other Languages

17
Experiments and Results
  • Weighting Syntactic Phrases
  • Double the count on syntactic phrases
  • Is that sufficient?
  • Insufficient post-analysis on this one
  • The same BLEU score
  • Were the translations syntactically better?
  • Did the translations at least use more syntactic
    phrases?

18
Experiments and Results
  • Maximum Phrase Length

19
Experiments and Results
  • Lexical Weighting
  • Lexical probability distribution
  • Lexical Weight
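The lexical weight scores a phrase pair by, for each foreign word, averaging its word-translation probabilities over the English words it is aligned to: p_w(f|e, a) = prod_i (1/#links(i)) * sum over aligned j of w(f_i|e_j). A sketch, where `w` is a hypothetical word-translation probability table:

```python
def lexical_weight(f_phrase, e_phrase, alignment, w):
    """p_w(f|e, a): for each foreign word, average w(f_i|e_j) over its
    alignment links; unlinked foreign words score against NULL.
    `w` is a dict {(f_word, e_word_or_None): prob}."""
    prob = 1.0
    for i, f_word in enumerate(f_phrase):
        links = [j for (fi, j) in alignment if fi == i]
        if links:
            prob *= sum(w.get((f_word, e_phrase[j]), 0.0)
                        for j in links) / len(links)
        else:
            prob *= w.get((f_word, None), 0.0)  # NULL alignment
    return prob
```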

20
Experiments and Results
  • Lexical Weighting
  • Example

21
Experiments and Results
  • Lexical Weighting
  • With multiple alignments
  • Extended to fit this model

22
Experiments and Results
  • Lexical Weighting
  • Improvement: 0.01 BLEU

23
Experiments and Results
  • Phrase Extraction Heuristic
  • Align Bidirectionally
  • Note this gives two different word alignment sets
  • Start with the intersection of the two sets
  • Add possible alignments
  • Only if they're in the union of the sets
  • Only if they connect at least one previously
    unaligned word

24
Experiments and Results
  • Phrase Extraction Heuristic
  • Algorithm
  • Start with the first English word
  • Expand only directly adjacent alignment points
  • Move to the next English word, repeat.
  • Finally add non-adjacent alignment points which
    meet the heuristic criteria.
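The expansion stage of the two slides above can be sketched as follows. This is a simplified symmetrization: it starts from the intersection and repeatedly adds neighboring union points that cover a previously unaligned word; the final pass over non-adjacent points is omitted:

```python
def grow_alignment(fe_align, ef_align):
    """Symmetrize two directional word alignments (sets of (f, e) pairs):
    start from the intersection, then add directly adjacent points from
    the union that align at least one previously unaligned word."""
    union = fe_align | ef_align
    aligned = set(fe_align & ef_align)
    neighbors = [(-1, 0), (1, 0), (0, -1), (0, 1),
                 (-1, -1), (-1, 1), (1, -1), (1, 1)]
    added = True
    while added:
        added = False
        for (f, e) in sorted(aligned):
            for df, de in neighbors:
                cand = (f + df, e + de)
                if cand in union and cand not in aligned:
                    f_new = all(fc != cand[0] for (fc, _) in aligned)
                    e_new = all(ec != cand[1] for (_, ec) in aligned)
                    if f_new or e_new:
                        aligned.add(cand)
                        added = True
    return aligned
```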

25
Experiments and Results
26
Experiments and Results
  • Simpler Underlying Word-Based Models
  • IBM models 1-4

27
Experiments and Results
  • Other Languages