Extraction of Bilingual Information from Parallel Texts
CSAW 2004, September 2004

Transcript and Presenter's Notes
1
Extraction of Bilingual Information from Parallel
Texts
  • Mike Rosner

2
Outline
  • Machine Translation
  • Traditional vs. Statistical Architectures
  • Experimental Results
  • Conclusions

3
Translational Equivalence: a many-to-many relation
[Figure: many-to-many links between SOURCE and TARGET words]
4
Traditional Machine Translation
5
Remarks
  • Character of the system
  • Knowledge-based.
  • High-quality results if the domain is well delimited.
  • Knowledge takes the form of specialised rules
    (analysis, synthesis, transfer).
  • Problems
  • Limited coverage.
  • Knowledge-acquisition bottleneck.
  • Extensibility.

6
Statistical Translation
  • Robust
  • Domain independent
  • Extensible
  • Does not require language specialists
  • Uses noisy channel model of translation

7
Noisy Channel Model: Sentence Translation (Brown et al. 1990)
[Figure: source sentence passes through a noisy channel, emerging as the target sentence]
8
The Problem of Translation
  • Given a sentence T of the target language, seek
    the sentence S from which a translator produced
    T, i.e.
  • find the S that maximises P(S|T)
  • By Bayes' theorem
  • P(S|T) = P(S) x P(T|S) / P(T)
  • whose denominator is independent of S.
  • Hence it suffices to maximise P(S) x P(T|S)
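The decision rule above can be sketched in a few lines of Python; the probability tables are illustrative toy values, not figures from the presentation:

```python
# Noisy-channel decoding sketch: choose the source sentence S that
# maximises P(S) * P(T|S); the denominator P(T) is constant in S
# and can be ignored. Toy probability tables for illustration only.
lm = {"John loves Mary": 0.6, "Mary loves John": 0.4}       # P(S)
tm = {("Jean aime Marie", "John loves Mary"): 0.7,          # P(T|S)
      ("Jean aime Marie", "Mary loves John"): 0.1}

def decode(t):
    # argmax over the candidate source sentences in the language model
    return max(lm, key=lambda s: lm[s] * tm.get((t, s), 0.0))

print(decode("Jean aime Marie"))  # → John loves Mary
```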

9
A Statistical MT System
[Figure: a Source Language Model supplies P(S), a Translation Model supplies P(T|S); their product P(S,T) feeds a Decoder that maps the observed T back to the most probable S]
10
The Three Components of a Statistical MT model
  • Method for computing language model probabilities
    P(S)
  • Method for computing translation probabilities
    P(T|S)
  • Method for searching amongst source sentences for
    the one that maximises P(S) x P(T|S)

11
Probabilistic Language Models
  • General: P(s1 s2 ... sn) = P(s1) P(s2|s1)
    ... P(sn|s1, ..., s(n-1))
  • Trigram: P(s1 s2 ... sn) = P(s1) P(s2|s1) P(s3|s1,s2)
    ... P(sn|s(n-2),s(n-1))
  • Bigram: P(s1 s2 ... sn) = P(s1) P(s2|s1)
    ... P(sn|s(n-1))
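A bigram model of this kind can be estimated by relative frequency; a minimal sketch on a two-sentence toy corpus (a real system would smooth unseen bigrams):

```python
from collections import defaultdict

# Count bigrams and their history words, then estimate
# P(w2 | w1) = count(w1, w2) / count(w1).
corpus = [["<s>", "John", "loves", "Mary", "</s>"],
          ["<s>", "Mary", "loves", "John", "</s>"]]

bigram = defaultdict(int)
history = defaultdict(int)
for sent in corpus:
    for w1, w2 in zip(sent, sent[1:]):
        bigram[(w1, w2)] += 1
        history[w1] += 1

def p_sentence(words):
    # product of conditional bigram probabilities
    p = 1.0
    for w1, w2 in zip(words, words[1:]):
        p *= bigram[(w1, w2)] / history[w1]
    return p

print(p_sentence(["<s>", "John", "loves", "Mary", "</s>"]))  # → 0.0625
```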

12
A Simple Alignment Based Translation Model
  • Assumption: the target sentence is generated from
    the source sentence word by word:
    S = John loves Mary
    T = Jean aime Marie

13
Sentence Translation Probability
  • According to this model, the translation
    probability of the sentence is just the product
    of the translation probabilities of the words.
  • P(T|S) = P(Jean aime Marie | John loves Mary)
    = P(Jean|John) P(aime|loves) P(Marie|Mary)
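In code that product is a single loop over the aligned word pairs; the table values are invented for illustration:

```python
# Word-by-word translation probability under the one-to-one
# alignment assumption; t holds illustrative P(target | source) values.
t = {("Jean", "John"): 0.9, ("aime", "loves"): 0.8, ("Marie", "Mary"): 0.9}

def p_translation(target, source):
    # product of P(t_i | s_i) over aligned word pairs
    p = 1.0
    for tw, sw in zip(target, source):
        p *= t[(tw, sw)]
    return p

print(p_translation(["Jean", "aime", "Marie"], ["John", "loves", "Mary"]))
# 0.9 * 0.8 * 0.9 = 0.648
```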

14
More Realistic Example
The proposal will not now be implemented
Les propositions ne seront pas mises en application maintenant
15
Some Further Parameters
  • Word translation probability: P(t|s)
  • Fertility: the number of words in the target that
    are paired with each source word (0 ... N)
  • Distortion: the difference in sentence position
    between the source word and the target word:
    P(i|j,l)

16
Searching
  • Maintain a list of hypotheses. Initial hypothesis:
    (Jean aime Marie | )
  • Search proceeds iteratively. At each iteration we
    extend the most promising hypotheses with
    additional words:
    (Jean aime Marie | John(1))
    (Jean aime Marie | loves(2))
    (Jean aime Marie | Mary(3))
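Hypothesis-extension search of this kind is usually organised as a beam search: keep only the k most promising partial hypotheses at each iteration. A minimal sketch, with an invented scoring function standing in for P(S) x P(T|S):

```python
import heapq

def beam_search(vocab, score, steps, beam_width=2):
    # Each hypothesis is (partial sentence, score); start empty.
    beam = [((), 1.0)]
    for _ in range(steps):
        candidates = [(words + (w,), p * score(words, w))
                      for words, p in beam for w in vocab]
        # keep only the beam_width most promising hypotheses
        beam = heapq.nlargest(beam_width, candidates, key=lambda h: h[1])
    return beam[0]

# Toy scoring that prefers the word order John, loves, Mary.
order = {"John": 0, "loves": 1, "Mary": 2}
def score(words, w):
    return 0.9 if order[w] == len(words) else 0.1

best, p = beam_search(["John", "loves", "Mary"], score, steps=3)
print(best)  # → ('John', 'loves', 'Mary')
```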

17
Parameter Estimation
  • In general, requires large quantities of data
  • For the language model, we need only source-language
    text.
  • For the translation model, we need pairs of sentences
    that are translations of each other.
  • Use the EM algorithm (Baum 1972) to optimise the
    model parameters.
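For the translation model, the EM loop over sentence pairs looks roughly like this; a Model-1-style sketch on invented toy pairs (not the Hansard data), starting from uniform translation probabilities:

```python
from collections import defaultdict
from itertools import product

# Toy parallel corpus: (source English, target French) sentence pairs.
pairs = [(["the", "house"], ["la", "maison"]),
         (["the", "book"], ["le", "livre"]),
         (["a", "book"], ["un", "livre"])]

src_vocab = {w for s, _ in pairs for w in s}
tgt_vocab = {w for _, t in pairs for w in t}
# Initialise every P(f | e) uniformly.
t = {(f, e): 1.0 / len(tgt_vocab) for e, f in product(src_vocab, tgt_vocab)}

for _ in range(10):                      # a few EM iterations
    count = defaultdict(float)           # expected counts c(f, e)
    total = defaultdict(float)           # expected counts c(e)
    for src, tgt in pairs:               # E-step
        for f in tgt:
            z = sum(t[(f, e)] for e in src)
            for e in src:
                c = t[(f, e)] / z
                count[(f, e)] += c
                total[e] += c
    for f, e in t:                       # M-step: renormalise
        t[(f, e)] = count[(f, e)] / total[e]

print(round(t[("livre", "book")], 3))
```

After a few iterations the probability mass for "book" concentrates on "livre", which co-occurs with it in two of the three pairs.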

18
Experiment (Brown et al. 1990)
  • Hansard: 40,000 pairs of sentences, approx.
    800,000 words in each language.
  • Considered the 9,000 most common words in each
    language.
  • Assumptions (initial parameter values):
  • each of the 9,000 target words equally likely as a
    translation of each of the source words;
  • each of the fertilities from 0 to 25 equally
    likely for each of the 9,000 source words;
  • each target position equally likely given each
    source position and target length.

19
English "not"
  • French Probability
  • pas .469
  • ne .460
  • non .024
  • pas du tout .003
  • faux .003
  • plus .002
  • ce .002
  • que .002
  • jamais .002
  • Fertility Probability
  • 2 .758
  • 0 .133
  • 1 .106

20
English "hear"
  • French Probability
  • bravo .992
  • entendre .005
  • entendu .002
  • entends .001
  • Fertility Probability
  • 0 .584
  • 1 .416

21
Bajada 2003/4
  • 400 sentence pairs from the Malta/EU accession treaty
  • Three different types of alignment:
  • Paragraph (precision 97%, recall 97%)
  • Sentence (precision 91%, recall 95%)
  • Word: 2 translation models
  • Model 1: distortion-independent
  • Model 2: distortion-dependent

22
Bajada 2003/4
23
Conclusion/Future Work
  • Larger data sets
  • Finer models of word/word translation
    probabilities, taking into account:
  • fertility
  • morphological variants of the same word
  • Role and tools for a bilingual informant (not a
    linguistic specialist)