Extraction of Bilingual Information from Parallel Texts
CSAW 2004, September 2004

Transcript and Presenter's Notes
1
Extraction of Bilingual Information from Parallel
Texts
  • Mike Rosner

2
Outline
  • Machine Translation
  • Traditional vs. Statistical Architectures
  • Experimental Results
  • Conclusions

3
Translational Equivalence: a many-to-many relation
[Figure: many-to-many links between SOURCE and TARGET words]
4
Traditional Machine Translation
5
Remarks
  • Character of the system
  • Knowledge-based.
  • High-quality results if the domain is well delimited.
  • Knowledge takes the form of specialised rules
    (analysis, synthesis, transfer).
  • Problems
  • Limited coverage.
  • Knowledge-acquisition bottleneck.
  • Extensibility.

6
Statistical Translation
  • Robust
  • Domain independent
  • Extensible
  • Does not require language specialists
  • Uses noisy channel model of translation

7
Noisy Channel Model: Sentence Translation (Brown et al. 1990)
[Figure: source sentence passes through a noisy channel, emerging as the target sentence]
8
The Problem of Translation
  • Given a sentence T of the target language, seek
    the sentence S from which a translator produced
    T, i.e.
  • find the S that maximises P(S|T)
  • By Bayes' theorem
  • P(S|T) = P(S) x P(T|S) / P(T)
  • whose denominator is independent of S.
  • Hence it suffices to maximise P(S) x P(T|S)
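The decision rule above can be sketched in a few lines of Python; the probability tables are illustrative toy values, not figures from the presentation:

```python
# Noisy-channel decoding sketch: choose the source sentence S that
# maximises P(S) * P(T|S); the denominator P(T) is constant in S
# and can be ignored. Toy probability tables for illustration only.
lm = {"John loves Mary": 0.6, "Mary loves John": 0.4}       # P(S)
tm = {("Jean aime Marie", "John loves Mary"): 0.7,          # P(T|S)
      ("Jean aime Marie", "Mary loves John"): 0.1}

def decode(t):
    # argmax over the candidate source sentences in the language model
    return max(lm, key=lambda s: lm[s] * tm.get((t, s), 0.0))

print(decode("Jean aime Marie"))  # → John loves Mary
```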

9
A Statistical MT System
[Figure: a Source Language Model supplies P(S), a Translation Model supplies P(T|S); their product P(S,T) feeds a Decoder that maps the observed T back to the most probable S]
10
The Three Components of a Statistical MT model
  • Method for computing language model probabilities
    P(S)
  • Method for computing translation probabilities
    P(T|S)
  • Method for searching amongst source sentences for
    the one that maximises P(S) x P(T|S)

11
Probabilistic Language Models
  • General: P(s1 s2 ... sn) = P(s1) P(s2|s1)
    ... P(sn|s1, ..., s(n-1))
  • Trigram: P(s1 s2 ... sn) = P(s1) P(s2|s1) P(s3|s1,s2)
    ... P(sn|s(n-2),s(n-1))
  • Bigram: P(s1 s2 ... sn) = P(s1) P(s2|s1)
    ... P(sn|s(n-1))
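A bigram model of this kind can be estimated by relative frequency; a minimal sketch on a two-sentence toy corpus (a real system would smooth unseen bigrams):

```python
from collections import defaultdict

# Count bigrams and their history words, then estimate
# P(w2 | w1) = count(w1, w2) / count(w1).
corpus = [["<s>", "John", "loves", "Mary", "</s>"],
          ["<s>", "Mary", "loves", "John", "</s>"]]

bigram = defaultdict(int)
history = defaultdict(int)
for sent in corpus:
    for w1, w2 in zip(sent, sent[1:]):
        bigram[(w1, w2)] += 1
        history[w1] += 1

def p_sentence(words):
    # product of conditional bigram probabilities
    p = 1.0
    for w1, w2 in zip(words, words[1:]):
        p *= bigram[(w1, w2)] / history[w1]
    return p

print(p_sentence(["<s>", "John", "loves", "Mary", "</s>"]))  # → 0.0625
```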

12
A Simple Alignment Based Translation Model
  • Assumption: the target sentence is generated from
    the source sentence word by word:
    S = John loves Mary
    T = Jean aime Marie

13
Sentence Translation Probability
  • According to this model, the translation
    probability of the sentence is just the product
    of the translation probabilities of the words.
  • P(T|S) = P(Jean aime Marie | John loves Mary)
    = P(Jean|John) P(aime|loves) P(Marie|Mary)
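In code that product is a single loop over the aligned word pairs; the table values are invented for illustration:

```python
# Word-by-word translation probability under the one-to-one
# alignment assumption; t holds illustrative P(target | source) values.
t = {("Jean", "John"): 0.9, ("aime", "loves"): 0.8, ("Marie", "Mary"): 0.9}

def p_translation(target, source):
    # product of P(t_i | s_i) over aligned word pairs
    p = 1.0
    for tw, sw in zip(target, source):
        p *= t[(tw, sw)]
    return p

print(p_translation(["Jean", "aime", "Marie"], ["John", "loves", "Mary"]))
# 0.9 * 0.8 * 0.9 = 0.648
```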

14
More Realistic Example
The proposal will not now be implemented
Les propositions ne seront pas mises en application maintenant
15
Some Further Parameters
  • Word translation probability: P(t|s)
  • Fertility: the number of words in the target that
    are paired with each source word (0 ... N)
  • Distortion: the difference in sentence position
    between the source word and the target word:
    P(i|j,l)

16
Searching
  • Maintain a list of hypotheses. Initial hypothesis:
    (Jean aime Marie | )
  • Search proceeds iteratively. At each iteration we
    extend the most promising hypotheses with
    additional words:
    (Jean aime Marie | John(1))
    (Jean aime Marie | loves(2))
    (Jean aime Marie | Mary(3))
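Hypothesis-extension search of this kind is usually organised as a beam search: keep only the k most promising partial hypotheses at each iteration. A minimal sketch, with an invented scoring function standing in for P(S) x P(T|S):

```python
import heapq

def beam_search(vocab, score, steps, beam_width=2):
    # Each hypothesis is (partial sentence, score); start empty.
    beam = [((), 1.0)]
    for _ in range(steps):
        candidates = [(words + (w,), p * score(words, w))
                      for words, p in beam for w in vocab]
        # keep only the beam_width most promising hypotheses
        beam = heapq.nlargest(beam_width, candidates, key=lambda h: h[1])
    return beam[0]

# Toy scoring that prefers the word order John, loves, Mary.
order = {"John": 0, "loves": 1, "Mary": 2}
def score(words, w):
    return 0.9 if order[w] == len(words) else 0.1

best, p = beam_search(["John", "loves", "Mary"], score, steps=3)
print(best)  # → ('John', 'loves', 'Mary')
```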

17
Parameter Estimation
  • In general, requires large quantities of data
  • For the language model, we need only source-language
    text.
  • For the translation model, we need pairs of sentences
    that are translations of each other.
  • Use the EM algorithm (Baum 1972) to optimise the
    model parameters.
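For the translation model, the EM loop over sentence pairs looks roughly like this; a Model-1-style sketch on invented toy pairs (not the Hansard data), starting from uniform translation probabilities:

```python
from collections import defaultdict
from itertools import product

# Toy parallel corpus: (source English, target French) sentence pairs.
pairs = [(["the", "house"], ["la", "maison"]),
         (["the", "book"], ["le", "livre"]),
         (["a", "book"], ["un", "livre"])]

src_vocab = {w for s, _ in pairs for w in s}
tgt_vocab = {w for _, t in pairs for w in t}
# Initialise every P(f | e) uniformly.
t = {(f, e): 1.0 / len(tgt_vocab) for e, f in product(src_vocab, tgt_vocab)}

for _ in range(10):                      # a few EM iterations
    count = defaultdict(float)           # expected counts c(f, e)
    total = defaultdict(float)           # expected counts c(e)
    for src, tgt in pairs:               # E-step
        for f in tgt:
            z = sum(t[(f, e)] for e in src)
            for e in src:
                c = t[(f, e)] / z
                count[(f, e)] += c
                total[e] += c
    for f, e in t:                       # M-step: renormalise
        t[(f, e)] = count[(f, e)] / total[e]

print(round(t[("livre", "book")], 3))
```

After a few iterations the probability mass for "book" concentrates on "livre", which co-occurs with it in two of the three pairs.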

18
Experiment (Brown et al. 1990)
  • Hansard: 40,000 pairs of sentences, approx.
    800,000 words in each language.
  • Considered the 9,000 most common words in each
    language.
  • Assumptions (initial parameter values):
  • each of the 9,000 target words equally likely as a
    translation of each of the source words;
  • each of the fertilities from 0 to 25 equally
    likely for each of the 9,000 source words;
  • each target position equally likely given each
    source position and target length.

19
English "not"
  • French Probability
  • pas .469
  • ne .460
  • non .024
  • pas du tout .003
  • faux .003
  • plus .002
  • ce .002
  • que .002
  • jamais .002
  • Fertility Probability
  • 2 .758
  • 0 .133
  • 1 .106

20
English "hear"
  • French Probability
  • bravo .992
  • entendre .005
  • entendu .002
  • entends .001
  • Fertility Probability
  • 0 .584
  • 1 .416

21
Bajada 2003/4
  • 400 sentence pairs from the Malta/EU accession treaty
  • Three different types of alignment:
  • Paragraph (precision 97%, recall 97%)
  • Sentence (precision 91%, recall 95%)
  • Word: 2 translation models
  • Model 1: distortion-independent
  • Model 2: distortion-dependent

22
Bajada 2003/4
23
Conclusion/Future Work
  • Larger data sets
  • Finer models of word/word translation
    probabilities, taking into account:
  • fertility
  • morphological variants of the same word
  • Role and tools for a bilingual informant (not a
    linguistic specialist)