Machine Translation - PowerPoint PPT Presentation

Slides: 32
Provided by: michae266
Transcript and Presenter's Notes

Title: Machine Translation


1
Machine Translation
  • A Presentation by
  • Julie Conlonova,
  • Rob Chase,
  • and Eric Pomerleau

2
Overview
  • Language Alignment System
  • Datasets
  • Sentence-aligned sets for training (e.g. the
    Hansards Corpus, European Parliament
    Proceedings Parallel Corpus)
  • A word-aligned set for testing and evaluation to
    measure accuracy and precision
  • Decoding

3
Language Alignment
  • Goal: produce a word-aligned set from a
    sentence-aligned dataset
  • First step on the road toward Statistical Machine
    Translation
  • Example problem:
  • The motion to adjourn the House is now deemed to
    have been adopted.
  • La motion portant que la Chambre s'ajourne
    maintenant est réputée adoptée.

4
IBM Models 1 and 2 (Kevin Knight, A Statistical
MT Tutorial Workbook, 1999)
  • Each capable of being used to produce a
    word-aligned dataset separately.
  • EM Algorithm
  • Model 1 produces T-values based on normalized
    fractional counting of corresponding words.
  • Additionally, Model 2 uses A-values for reverse
    distortion probabilities: probabilities based
    on the positions of the words
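
For reference, the two models are usually written as below, in the notation of Knight's workbook. This is the standard textbook formulation rather than something stated on the slide. The symbols are assumptions of this note: e is a source sentence of length l, f is a target sentence of length m, a_j is the source position linked to target word j, and epsilon is a constant.

```latex
% Model 1: every alignment is equally likely, so only the T-values matter
P(f, a \mid e) = \frac{\epsilon}{(l+1)^{m}} \prod_{j=1}^{m} t(f_j \mid e_{a_j})

% Model 2: the uniform term is replaced by position-dependent A-values
P(f, a \mid e) = \epsilon \prod_{j=1}^{m} t(f_j \mid e_{a_j}) \, a(a_j \mid j, l, m)
```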

5
Training Data
  • European Parliament Proceedings Parallel Corpus
    1996-2003
  • Aligned Languages
  • English - French
  • English - Dutch
  • English - Italian
  • English - Finnish
  • English - Portuguese
  • English - Spanish
  • English - Greek

6
Training Data cont.
  • Eliminated
  • Misaligned sentences
  • Sentences with 50 or more words
  • XML tags
  • Symbols and numerical characters other than
    commas and periods
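
The filtering above could be sketched roughly as follows. This is a minimal illustration, not the presenters' code: the function names are hypothetical, the slide does not say how misaligned pairs were detected (here a pair with an empty side is treated as misaligned), and dropping apostrophes along with other symbols is an assumption.

```python
import re

MAX_WORDS = 50  # sentences with 50 or more words are dropped

TAG_RE = re.compile(r"<[^>]+>")
# Anything that is not a letter, whitespace, comma or period, plus digits/underscores.
SYMBOL_RE = re.compile(r"[^\w\s,.]|[\d_]")


def clean_sentence(text):
    """Strip XML tags, then symbols and digits other than commas and periods."""
    text = TAG_RE.sub(" ", text)
    text = SYMBOL_RE.sub("", text)
    return " ".join(text.split())


def clean_pairs(pairs):
    """Filter a list of (source, target) sentence pairs."""
    kept = []
    for src, tgt in pairs:
        src, tgt = clean_sentence(src), clean_sentence(tgt)
        if not src or not tgt:
            continue  # assumption: an empty side means the pair is misaligned
        if len(src.split()) >= MAX_WORDS or len(tgt.split()) >= MAX_WORDS:
            continue
        kept.append((src, tgt))
    return kept
```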

7
Ideally
http://www.cs.berkeley.edu/klein/cs294-5

8
Bypassing Interlingua: Models I-III
  • Variables contributing to the probability of a
    sentence
  • Correlation between words in the source/target
    languages
  • Fertility of a word
  • Correlation between order of words in source
    sentence and order of words in target

9
A Translation Matrix
        Rob   Cat   is    Dog
Rob     1     0     0     0
Gato    0     1     0     0
es      0     0     .5    0
esta    0     0     .5    0
Perro   0     0     0     1

10
Building the Translation Matrix: Starting from
alignments
  • Find the sentence alignment
  • If a word in the source aligns with a word in the
    target, then increment the translation matrix.
  • Normalize the translation matrix

11
Can't Find Alignments
  • Most sentences in the Hansards corpus are about 60
    words long, and many are over 100.
  • 100^100 possible alignments
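
The size of that search space is easy to check. The sketch below assumes each target word links to exactly one source word (optionally including an empty NULL word, as in the IBM models), so a pair of 100-word sentences has 100 to the power 100 candidate alignments. The function name is hypothetical.

```python
def num_alignments(src_len, tgt_len, allow_null=False):
    """Number of ways to link every target word to one source position."""
    choices = src_len + 1 if allow_null else src_len
    return choices ** tgt_len


n = num_alignments(100, 100)
print(len(str(n)) - 1)  # order of magnitude: 200, i.e. 10**200 alignments
```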

12
Counting
  • Rob is a boy. Rob es nino.
  • Rob is tall. Rob es alto.
  • Eric is tall. Eric es alto.
  • Base counts on co-occurrence, weighting based on
    sentence length.
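
One plausible reading of this counting step is sketched below. The weighting scheme is an assumption: each English word spreads a single unit of count evenly over the foreign words in the paired sentence, so the weight is one over the foreign sentence length. The slide does not give the exact formula.

```python
from collections import defaultdict


def weighted_cooccurrence(pairs):
    """pairs: list of (english_words, foreign_words)."""
    counts = defaultdict(lambda: defaultdict(float))
    for english, foreign in pairs:
        weight = 1.0 / len(foreign)  # assumed length-based weight
        for e in english:
            for f in foreign:
                counts[e][f] += weight
    return counts


pairs = [
    ("Rob is a boy".split(), "Rob es nino".split()),
    ("Rob is tall".split(), "Rob es alto".split()),
    ("Eric is tall".split(), "Eric es alto".split()),
]
counts = weighted_cooccurrence(pairs)
```

Words that co-occur in every pair, such as "is" and "es", accumulate the most weight, which is the signal the later iterations sharpen.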

13
Iterative Convergence
  • Use the Expectation Maximization (EM) algorithm
  • Creates translation matrix

        Rob   is    tall  boy
Rob     .66   .33   .25   .25
es      .30   .66   .25   .25
alto    .2    .05   .5    0
nino    .2    .05   0     .5
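
A compact sketch of EM training for IBM Model 1 on the three toy sentence pairs follows. It is a textbook-style illustration under simplifying assumptions (no NULL word, a fixed number of iterations, a hypothetical function name), not the presenters' implementation, and it is not expected to reproduce the exact numbers in the table above.

```python
from collections import defaultdict


def train_model1(pairs, iterations=10):
    """pairs: list of (english_words, foreign_words). Returns t[e][f] ~ P(f | e)."""
    f_vocab = {f for _, fs in pairs for f in fs}
    uniform = 1.0 / len(f_vocab)
    t = defaultdict(lambda: defaultdict(lambda: uniform))
    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        total = defaultdict(float)
        # E-step: fractional counts, normalized over the English words in each pair
        for es, fs in pairs:
            for f in fs:
                norm = sum(t[e][f] for e in es)
                for e in es:
                    frac = t[e][f] / norm
                    count[e][f] += frac
                    total[e] += frac
        # M-step: renormalize the counts into new T-values
        new_t = defaultdict(lambda: defaultdict(float))
        for e, row in count.items():
            for f, c in row.items():
                new_t[e][f] = c / total[e]
        t = new_t
    return t


pairs = [
    ("Rob is a boy".split(), "Rob es nino".split()),
    ("Rob is tall".split(), "Rob es alto".split()),
    ("Eric is tall".split(), "Eric es alto".split()),
]
t = train_model1(pairs, iterations=20)
```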
14
Distorting the Sentence
  • Word order changes between languages
  • How is a sentence with 2 words distorted?
  • How is a sentence with 3 words distorted?
  • How is a sentence with
  • To keep track of this information we use

15
A tesseract!
  • (A quadruply nested default dictionary)
  • This could be a problem if there are more than
    100 words in a sentence.
  • 100 x 100 x 100 x 100 entries is too big for RAM and
    takes too much time
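
A quadruply nested default dictionary can be built as sketched below. The index order (source length, target length, target position, source position) and the helper names are assumptions. The point of the sketch is that only touched entries are stored, whereas a dense table for 100-word sentences would need 100 to the power 4 cells.

```python
from collections import defaultdict


def make_a_table():
    """a[l][m][j][i]: A-value for linking target position j to source position i
    in a sentence pair with source length l and target length m."""
    return defaultdict(
        lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(float)))
    )


def count_entries(table):
    """Number of stored leaf values in the nested table."""
    return sum(
        len(by_i)
        for by_m in table.values()
        for by_j in by_m.values()
        for by_i in by_j.values()
    )


a = make_a_table()
a[3][3][0][1] += 0.5
a[3][3][2][2] += 0.25
dense_cells = 100 ** 4  # 100,000,000 cells if every index ran up to 100
```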

16
Broad Look at MT
  • The translation process can be described simply
    as
  • Decoding the meaning of the source text, and
  • Re-encoding this meaning in the target language.
  • - Translation Process, Wikipedia, May 2006

17
Decoding
  • How to go from the T-matrix and A-matrix to a
    word alignment?
  • There are several approaches

18
Viterbi
  • If only doing alignment, much smaller memory and
    time requirements.
  • Returns optimal path.
  • T-Matrix probabilities function as the emission
    matrix
  • A-Matrix probabilities concerned with the
    positioning of words
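
As a rough illustration of alignment decoding: under IBM Model 2 as usually stated, each target word's link is chosen independently, so the most probable alignment can be found with a per-word argmax over the product of a T-value and an A-value. The sketch below shows that simplified case. The dictionary layouts and the function name are assumptions, and it is not necessarily the Viterbi variant the presenters ran.

```python
def best_alignment(es, fs, t, a):
    """es, fs: source and target word lists.
    t: dict mapping (target_word, source_word) to a T-value.
    a: dict mapping (i, j, l, m) to an A-value; missing entries count as 1.0.
    Returns, for each target position j, the best source position i."""
    l, m = len(es), len(fs)
    alignment = []
    for j, f in enumerate(fs):
        scores = [
            t.get((f, e), 0.0) * a.get((i, j, l, m), 1.0)
            for i, e in enumerate(es)
        ]
        alignment.append(max(range(l), key=scores.__getitem__))
    return alignment


es = ["the", "house"]
fs = ["la", "chambre"]
t = {
    ("la", "the"): 0.9, ("la", "house"): 0.1,
    ("chambre", "house"): 0.8, ("chambre", "the"): 0.05,
}
print(best_alignment(es, fs, t, {}))  # [0, 1]
```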

19
Decoding as a Translator
  • Without supplying a translated sentence to the
    program, it is capable of being a stand-alone
    translator instead of a word aligner.
  • However, while the Viterbi algorithm runs quickly
    with pruning for decoding, for translating the
    run time skyrockets.

20
Greedy Hill Climbing (Knight and Koehn, What's New
in Statistical Machine Translation, 2003)
  • Best first search
  • 2-step look ahead to avoid getting stuck in most
    probable local maxima
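
The idea of greedy climbing with a two-step look-ahead can be shown on a toy search space. This is a generic sketch, not the decoder from the cited tutorial: the `neighbors` and `score` callables are hypothetical stand-ins for "apply one local change to the translation" and "model probability".

```python
def hill_climb(start, neighbors, score, lookahead=2):
    """Move to the best state reachable within `lookahead` steps
    until no such state improves the score."""
    current = start
    while True:
        best, best_score = current, score(current)
        frontier = [current]
        for _ in range(lookahead):
            frontier = [n for s in frontier for n in neighbors(s)]
            for cand in frontier:
                cand_score = score(cand)
                if cand_score > best_score:
                    best, best_score = cand, cand_score
        if best == current:
            return current
        current = best


# Toy landscape: state 1 is a local maximum, state 3 is the global one.
landscape = {0: 0.0, 1: 1.0, 2: 0.5, 3: 2.0, 4: 1.0}


def toy_score(x):
    return landscape.get(x, -1.0)


def toy_neighbors(x):
    return [x - 1, x + 1]
```

With a one-step look-ahead the search stops at state 1; with two steps it sees past the dip at state 2 and reaches state 3.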

21
Beam Search (Knight and Koehn, What's New in
Statistical Machine Translation, 2003)
  • Optimization of Best First Search with heuristics
    and beam of choices
  • Exponential tradeoff when increasing the beam
    width
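
A minimal beam search over word-by-word translation options is sketched below. It assumes monotone (left-to-right) translation, and both the option lists and the `score_step` function (a stand-in for a language model) are hypothetical. Real decoders also handle reordering and use future-cost heuristics.

```python
import math


def beam_search(options, score_step, beam_width=3):
    """options: one list of (word, probability) candidates per position.
    score_step(prefix, word): extra log-score for appending word to prefix.
    Returns the best-scoring word sequence found."""
    beam = [(0.0, [])]
    for choices in options:
        expanded = []
        for logp, prefix in beam:
            for word, p in choices:
                new_score = logp + math.log(p) + score_step(prefix, word)
                expanded.append((new_score, prefix + [word]))
        expanded.sort(key=lambda item: item[0], reverse=True)
        beam = expanded[:beam_width]  # keep only the best hypotheses
    return beam[0][1]


options = [[("the", 0.6), ("a", 0.4)], [("house", 0.7), ("home", 0.3)]]


def toy_lm(prefix, word):
    # Hypothetical penalty that makes "the house" unattractive.
    return -5.0 if prefix[-1:] == ["the"] and word == "house" else 0.0
```

With a beam width of 1 the search commits to "the" and ends with "the home"; a width of 2 keeps "a" alive and finds the better-scoring "a house".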

22
Other Decoding Methods (Knight and Koehn, What's New
in Statistical Machine Translation, 2003)
  • Finite State Transducer
  • Mapping between languages based on a finite
    automaton
  • Parsing
  • String to Tree Model

23
Problem: One to Many
  • Necessary to take all alignments over a certain
    probability in order to capture the probability
    that e has fertility at least a given value

Al-Onaizan, Curin, Jahr, et al., Statistical
Machine Translation, 1999

24
Results
  • Study done in 2003 on word alignment error rates
    in Hansards corpus
  • Model 2
  • 29.3% on 8K training sentence pairs
  • 19.5% on 1.47M training sentence pairs
  • Optimized Model 6
  • 20.3% on 8K training sentence pairs
  • 8.7% on 1.47M training sentence pairs
  • Och and Ney, A Systematic Comparison of Various
    Statistical Alignment Models, 2003
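
For context, the alignment error rate (AER) used in that study compares predicted links against human-annotated "sure" and "possible" links. The sketch below follows the definition as commonly cited from Och and Ney. The function name and the representation of links as sets of index pairs are assumptions.

```python
def alignment_error_rate(sure, possible, predicted):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|), where S is the set of sure
    links, P the possible links (taken to include S) and A the predicted links."""
    possible = possible | sure
    matched = len(predicted & sure) + len(predicted & possible)
    return 1.0 - matched / (len(predicted) + len(sure))


sure = {(0, 0), (1, 1)}
possible = {(1, 2)}
predicted = {(0, 0), (1, 2), (2, 2)}
aer = alignment_error_rate(sure, possible, predicted)
```

Lower is better: a prediction that exactly matches the sure links scores 0.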

25
Expected Accuracy
  • 70% overall
  • Language performance
  • Dutch
  • French
  • Italian, Spanish, Portuguese
  • Greek
  • Finnish

26
Possible Future Work
  • Given more time, we would have implemented IBM
    Model 3
  • Additionally uses n, p, and d parameters for
    weighted alignments
  • n, fertility: the number of words produced by one word
  • d, distortion
  • p, a parameter for words that aren't produced
    directly by a source word
  • Invokes Model 2 for scoring

27
Another Possible Translation Scheme
  • Example-Based Machine Translation
  • Translation-by-Analogy
  • Can sometimes achieve better than the gist
    translations from other models

28
Why Is Improving Machine Translation Necessary?

29
A Chinese to English Translation

30
The End
  • Are there any questions/comments?
