Title: LING 138/238 SYMBSYS 138 Intro to Computer Speech and Language Processing
1. LING 138/238 SYMBSYS 138: Intro to Computer Speech and Language Processing
- Lecture 12: Machine Translation (II)
- November 4, 2004
- Dan Jurafsky
Thanks to Kevin Knight for much of this material!!
2. Outline for MT Week
- Intro and a little history
- Language Similarities and Divergences
- Four main MT Approaches
- Transfer
- Interlingua
- Direct
- Statistical
- Evaluation
3. Thanks to Bonnie Dorr!
- Next ten slides draw from her slides on BLEU
4. How do we evaluate MT? Human
- Fluency
- Overall fluency
- Human rating of sentences read out loud
- Cohesion (Lexical chains, anaphora, ellipsis)
- Hand-checking for cohesion.
- Well-formedness
- 5-point scale of syntactic correctness
- Fidelity (same information as source?)
- Hand rating of target text on 100pt scale
- Clarity
- Comprehensibility
- Noise test
- Multiple choice questionnaire
- Readability
- cloze
5. Evaluating MT: Problems
- Asking humans to judge sentences on a 5-point scale for 10 factors takes time and money (weeks or months!)
- We can't build language engineering systems if we can only evaluate them once every quarter!!!!
- We need a metric that we can run every time we change our algorithm.
- It would be OK if it wasn't perfect, as long as it tended to correlate with the expensive human metrics, which we could still run quarterly.
6. BiLingual Evaluation Understudy (BLEU; Papineni et al., 2001)
http://www.research.ibm.com/people/k/kishore/RC22176.pdf
- Automatic technique, but…
- Requires the pre-existence of human (reference) translations
- Approach:
- Produce a corpus of high-quality human translations
- Judge "closeness" numerically (word-error rate)
- Compare n-gram matches between a candidate translation and 1 or more reference translations
7. BLEU Comparison
Chinese-English translation example:
Candidate 1: It is a guide to action which ensures that the military always obeys the commands of the party.
Candidate 2: It is to insure the troops forever hearing the activity guidebook that party direct.
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
8. How Do We Compute BLEU Scores?
- Intuition: What percentage of words in the candidate occurred in some human translation?
- Proposal: count up the # of candidate translation words (unigrams) that occur in any reference translation, and divide by the total # of words in the candidate translation
- But we can't just count the total # of overlapping N-grams!
- Candidate: the the the the the the the
- Reference 1: The cat is on the mat
- Solution: a reference word should be considered exhausted after a matching candidate word is identified.
9Modified n-gram precision
- For each word compute:
- (1) the maximum number of times it occurs in any single reference translation
- (2) the number of times it occurs in the candidate translation
- Instead of using count (2) directly, use the minimum of (1) and (2), i.e. clip the counts at the max for any single reference translation
- Now use that modified count.
- And divide by the number of candidate words (see the sketch below).
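In code, the clipping idea is compact. A minimal sketch in Python (the function name modified_precision and the pre-tokenized-input assumption are mine, not from the paper):

```python
from collections import Counter

def modified_precision(candidate, references, n=1):
    """Clipped n-gram precision: each candidate n-gram can match only
    as many times as it appears in the best single reference."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate)
    clipped = sum(min(count, max(ngrams(ref)[gram] for ref in references))
                  for gram, count in cand.items())
    return clipped, sum(cand.values())
```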
10. Modified Unigram Precision: Candidate 1
It(1) is(1) a(1) guide(1) to(1) action(1) which(1) ensures(1) that(2) the(4) military(1) always(1) obeys(0) the commands(1) of(1) the party(1)
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
What's the answer???
17/18
11. Modified Unigram Precision: Candidate 2
It(1) is(1) to(1) insure(0) the(4) troops(0) forever(1) hearing(0) the activity(0) guidebook(0) that(2) party(1) direct(0)
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
What's the answer???
8/14
12. Modified Bigram Precision: Candidate 1
It is(1) is a(1) a guide(1) guide to(1) to action(1) action which(0) which ensures(0) ensures that(1) that the(1) the military(1) military always(0) always obeys(0) obeys the(0) the commands(0) commands of(0) of the(1) the party(1)
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
What's the answer???
10/17
13. Modified Bigram Precision: Candidate 2
It is(1) is to(0) to insure(0) insure the(0) the troops(0) troops forever(0) forever hearing(0) hearing the(0) the activity(0) activity guidebook(0) guidebook that(0) that party(0) party direct(0)
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
What's the answer???
1/13
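Running the sketch above on these slide examples (after lowercasing and stripping punctuation) should reproduce all four numbers:

```python
c1 = ("it is a guide to action which ensures that the military "
      "always obeys the commands of the party").split()
c2 = ("it is to insure the troops forever hearing the activity "
      "guidebook that party direct").split()
refs = [
    ("it is a guide to action that ensures that the military "
     "will forever heed party commands").split(),
    ("it is the guiding principle which guarantees the military "
     "forces always being under the command of the party").split(),
    ("it is the practical guide for the army always to heed "
     "the directions of the party").split(),
]
print(modified_precision(c1, refs, n=1))  # (17, 18)
print(modified_precision(c2, refs, n=1))  # (8, 14)
print(modified_precision(c1, refs, n=2))  # (10, 17)
print(modified_precision(c2, refs, n=2))  # (1, 13)
```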
14. Catching Cheaters
Candidate: the(2) the the the(0) the(0) the(0) the(0)
Reference 1: The cat is on the mat
Reference 2: There is a cat on the mat
What's the unigram answer?
2/7
What's the bigram answer?
0/6
15. BLEU distinguishes human from machine translations
16. BLEU: problems with sentence length
- Candidate: "of the"
- Solution: a brevity penalty prefers candidate translations which are the same length as one of the references
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
Problem: modified unigram precision is 2/2, bigram 1/1!
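The brevity penalty from the BLEU paper multiplies the geometric mean of the n-gram precisions. A sketch building on modified_precision above (helper names mine):

```python
import math

def brevity_penalty(cand_len, ref_lens):
    """1.0 if the candidate is at least as long as the closest
    reference; otherwise exp(1 - r/c), which decays toward 0."""
    r = min(ref_lens, key=lambda rl: (abs(rl - cand_len), rl))
    return 1.0 if cand_len >= r else math.exp(1.0 - r / cand_len)

def bleu(candidate, references, max_n=4):
    """BP times the geometric mean of modified 1..max_n-gram precisions."""
    log_sum = 0.0
    for n in range(1, max_n + 1):
        clipped, total = modified_precision(candidate, references, n)
        if total == 0 or clipped == 0:
            return 0.0  # one zero precision zeroes the geometric mean
        log_sum += math.log(clipped / total)
    bp = brevity_penalty(len(candidate), [len(r) for r in references])
    return bp * math.exp(log_sum / max_n)
```

For the two-word candidate "of the" against the 16-18-word references above, BP = exp(1 - 16/2) ≈ 0.0009, which crushes the perfect 2/2 and 1/1 precisions.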
17. Statistical MT
- Fidelity and fluency
- Best translation = the one that maximizes both
- Developed by researchers who were originally in speech recognition at IBM
- Called the IBM model
18. The IBM model
- Hmm, those two factors might look familiar…
- Yup, it's Bayes' rule
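The equation this slide presumably displayed is the standard noisy-channel decomposition (reconstructed here, since the original image did not survive):

```latex
\hat{T} = \operatorname*{argmax}_{T} P(T \mid S)
        = \operatorname*{argmax}_{T} \frac{P(S \mid T)\,P(T)}{P(S)}
        = \operatorname*{argmax}_{T} \underbrace{P(S \mid T)}_{\text{faithfulness}} \; \underbrace{P(T)}_{\text{fluency}}
```

P(S) can be dropped from the argmax because it does not depend on T.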
19. Fluency: P(T)
- How do we measure that this sentence:
- "That car was almost crash onto me"
- is less fluent than this one:
- "That car almost hit me."
- Answer: language models (N-grams!)
- For example, P(hit | almost) > P(was | almost)
- But we can use any other, more sophisticated model of grammar
- Advantage: this is monolingual knowledge!
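A toy bigram language model makes the comparison concrete. This sketch just counts and divides; the corpus is invented for illustration:

```python
from collections import Counter

def train_bigram_lm(corpus):
    """MLE bigram model: P(w2 | w1) = count(w1 w2) / count(w1)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return lambda w1, w2: (bigrams[(w1, w2)] / unigrams[w1]
                           if unigrams[w1] else 0.0)

p = train_bigram_lm([
    "that car almost hit me",
    "the truck almost hit the wall",
    "it was almost noon",
])
print(p("almost", "hit"))    # 2/3
print(p("almost", "crash"))  # 0.0 -- "almost crash" is unattested
```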
20. Faithfulness: P(S|T)
- French: "ça me plaît" ("that me pleases")
- English:
- "That pleases me" - most faithful
- "I like it"
- "I'll take that one"
- How do we quantify this?
- Intuition: the degree to which words in one sentence are plausible translations of words in the other sentence
- Product of probabilities that each word in the target sentence would generate each word in the source sentence.
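One simple way to cash out "product of probabilities" is IBM Model 1-style scoring, where each source word may be generated by any target word. A sketch; the translation table here is a made-up toy:

```python
def faithfulness(source, target, t):
    """Model 1-style P(S|T): for each source word, average the
    probability that each target word generated it, then multiply."""
    score = 1.0
    for s in source:
        score *= sum(t.get((s, e), 0.0) for e in target) / len(target)
    return score

# Toy t(s|e) table -- numbers invented for illustration
t = {("ça", "that"): 0.8, ("me", "me"): 0.9, ("plaît", "pleases"): 0.7,
     ("ça", "it"): 0.3, ("me", "I"): 0.2, ("plaît", "like"): 0.4}
print(faithfulness(["ça", "me", "plaît"], ["that", "pleases", "me"], t))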
21. Faithfulness: P(S|T)
- We need to know, for every target language word, the probability of it mapping to every source language word.
- How do we learn these probabilities?
- Parallel texts!
- Often we have two texts that are translations of each other
- If we knew which word in the source text mapped to which word in the target text, we could just count!
22. Faithfulness: P(S|T)
- Sentence alignment:
- Figuring out which source language sentence maps to which target language sentence
- Word alignment:
- Figuring out which source language word maps to which target language word
23. Big Point about Faithfulness and Fluency
- The job of the faithfulness model P(S|T) is just to model "bag of words": which words come from, say, English to Spanish.
- P(S|T) doesn't have to worry about internal facts about Spanish word order; that's the job of P(T)
- P(T) can do "bag generation": put the following words in order
- Have programming a seen never I language better
- Actual the hashing is since not collision-free usually the is less perfectly the of somewhat capacity table
24. P(T) and bag generation: the answer
- Usually the actual capacity of the table is somewhat less, since the hashing is not perfectly collision-free
25. A motivating example
- Japanese phrase: "2000nen taio"
- 2000nen:
- 2000 (highest probability)
- Y2K
- 2000 years
- 2000 year
- taio:
- correspondence (highest probability)
- corresponding
- equivalent
- tackle
- dealing with
- deal with
P(S|T) alone prefers "2000 correspondence"
Adding P(T) might produce the correct "dealing with Y2K"
26. More formally: The IBM Model
- Let's flesh out these intuitions about P(S|T) and P(T) a bit.
- Many of the next slides are drawn from Kevin Knight's fantastic "A Statistical MT Tutorial Workbook"!
27. IBM Model 3 as a probabilistic version of Direct MT
- We translate English into Spanish as follows:
- Replace the words in the English sentence by Spanish words
- Scramble the words around to look like Spanish order
- But we can't propose that English words are replaced by Spanish words one-for-one, because translations aren't the same length.
28. IBM Model 3 (from Knight 1999)
- For each word ei in the English sentence, choose a fertility φi. The choice of φi depends only on ei, not on other words or φ's.
- For each word ei, generate φi Spanish words. The choice of Spanish word depends only on the English word ei, not on the English context or on any other Spanish words.
- Permute all the Spanish words. Each Spanish word gets assigned an absolute target position slot (1, 2, 3, etc.). The choice of Spanish word position depends only on the absolute position of the English word generating it.
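A toy sampler for this three-step generative story (the parameter tables and function names are invented for illustration; real tables are learned from parallel text):

```python
import random

def sample(dist):
    """Draw one outcome from a {value: probability} dict."""
    return random.choices(list(dist), weights=list(dist.values()))[0]

def model3_generate(english, fertility, translate):
    """The three steps: fertilities, word-for-word translation, then
    permutation. A uniform shuffle stands in for the real distortion
    model d(target position | source position, lengths)."""
    out = []
    for e in english:
        phi = sample(fertility[e])                         # step 1
        out += [sample(translate[e]) for _ in range(phi)]  # step 2
    random.shuffle(out)                                    # step 3
    return out

fertility = {"not": {1: 0.9, 2: 0.1}, "slap": {3: 0.8, 1: 0.2}}
translate = {"not": {"no": 1.0},
             "slap": {"daba": 0.4, "una": 0.3, "bofetada": 0.3}}
print(model3_generate(["not", "slap"], fertility, translate))
```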
29. Translation as String Rewriting (from Knight 1999)
- Mary did not slap the green witch
- Assign fertilities: 1 = copy word over, 2 = copy it twice, etc.; 0 = delete
- Mary not slap slap slap the the green witch
- Replace English words with Spanish ones, one-for-one
- Mary no daba una bofetada a la verde bruja
- Permute the words
- Mary no daba una bofetada a la bruja verde
30. Model 3 P(S|T): training parameters
- What are the parameters for this model? Just look at the dependencies:
- Words: t(casa|house)
- Fertilities: n(1|house): the probability that "house" will produce exactly 1 Spanish word whenever it appears
- Distortions: d(5|2): the probability that the English word in position 2 of the English sentence generates a Spanish word in position 5 of the Spanish translation
- Actually, distortions are d(5|2,4,6), where 4 is the length of the English sentence and 6 is the Spanish length
- Remember, P(S|T) doesn't have to model fluency
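As data structures, these are just lookup tables. A sketch of how they might be held in Python (all values invented, not trained):

```python
# Toy Model 3 parameter tables for P(S|T)
t = {("casa", "house"): 0.8, ("hogar", "house"): 0.1}  # translation t(s|e)
n = {(1, "house"): 0.9, (0, "did"): 0.87}              # fertility  n(phi|e)
d = {(5, 2, 4, 6): 0.2}                                # distortion d(j|i,l,m)
```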
31. Model 3: one last twist
- Imagine some Spanish words are spurious: they appear in the Spanish even though they weren't in the English original
- Like function words: we generated "a la" from "the" by giving "the" fertility 2
- Instead, we could give "the" fertility 1 and generate "a" spuriously
- Do this by pretending every English sentence contains an invisible word NULL as word 0.
- Then parameters like t(a|NULL) give the probability of the word "a" being generated spuriously from NULL
32. Spurious words
- We could imagine having n(3|NULL) (the probability of there being exactly 3 spurious words in a Spanish translation)
- Instead of n(0|NULL), n(1|NULL), …, n(25|NULL), have a single parameter p1
- After assigning fertilities to the non-NULL English words, we want to generate (say) z Spanish words
- As we generate each of the z words, we optionally toss in a spurious Spanish word with probability p1
- Probability of not tossing in a spurious word: p0 = 1 - p1
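In other words, the number of spurious words comes out binomial: if z real Spanish words are generated, then (following Knight's workbook, as I read it)

```latex
P(\phi_0 = k \mid z) = \binom{z}{k} \, p_1^{\,k} \, p_0^{\,z-k},
\qquad p_0 = 1 - p_1
```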
33. Distortion probabilities for spurious words
- Can't just have d(5|0,4,6), i.e. the chance that a NULL-generated word will end up in position 5
- Why? These are spurious words! They could occur anywhere!! Too hard to predict
- Instead:
- Use the normal-word distortion parameters to choose positions for the normally-generated Spanish words
- Put the NULL-generated words into the empty slots left over
- If there are three NULL-generated words and three empty slots, then there are 3!, or six, ways of slotting them all in
- We'll assign a probability of 1/6 to each way
34. The real Model 3
- For each word ei in the English sentence, choose a fertility φi with probability n(φi|ei)
- Choose the number φ0 of spurious Spanish words to be generated from e0 = NULL, using p1 and the sum of fertilities from step 1
- Let m be the sum of fertilities for all words, including NULL
- For each i = 0, 1, 2, …, l and k = 1, 2, …, φi:
- choose a Spanish word τik with probability t(τik|ei)
- For each i = 1, 2, …, l and k = 1, 2, …, φi:
- choose a target Spanish position πik with probability d(πik|i,l,m)
- For each k = 1, 2, …, φ0, choose a position π0k from the φ0 - k + 1 remaining vacant positions in 1, 2, …, m, for a total probability of 1/φ0!
- Output the Spanish sentence with words τik in positions πik (0 ≤ i ≤ l, 1 ≤ k ≤ φi)
35. String rewriting
- Mary did not slap the green witch (input)
- Mary not slap slap slap the green witch (choose fertilities)
- Mary not slap slap slap NULL the green witch (choose number of spurious words)
- Mary no daba una bofetada a la verde bruja (choose translations)
- Mary no daba una bofetada a la bruja verde (choose target positions)
36. Model 3 parameters
- n, t, p, d
- If we had English strings and step-by-step rewritings into Spanish, we could:
- Compute n(0|did) by locating every instance of "did" and seeing what happens to it during the first rewriting step
- If "did" appeared 15,000 times and was deleted during the first rewriting step 13,000 times, then n(0|did) = 13/15
37. Alignments
- NULL And the program has been implemented
- Le programme a été mis en application
- (alignment links connect each French word to the English word, or NULL, that generated it)
- If we had lots of alignments like this, we could count:
- n(0|did): how many times does "did" connect to no French words?
- t(maison|house): how many of all the French words generated by "house" were "maison"?
- d(5|2,4,6): out of all the times a word in position 2 moved somewhere, how many times was it to position 5?
38. Where to get alignments
- It turns out we can bootstrap alignments if we just have a bilingual corpus:
- 1. Assume some startup values for n, d, t, etc.
- 2. Use the values of n, d, t, etc. to run Model 3 in forced alignment mode, i.e. to pick the best word alignments between sentences
- 3. Use these alignments to retrain n, d, t, etc.
- 4. Go to step 2
- This is called the Expectation-Maximization, or EM, algorithm (sketched below for the simpler Model 1)
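Model 3's EM is involved, so here is the same bootstrapping loop for the simpler IBM Model 1, where the expected counts have a closed form. A simplified sketch (no NULL word, unsmoothed):

```python
from collections import defaultdict

def model1_em(pairs, iterations=10):
    """EM for IBM Model 1. pairs: list of (source_words, target_words).
    E-step: expected alignment counts under the current t-table.
    M-step: renormalize the counts into a new t-table."""
    t = defaultdict(lambda: 1.0)  # uniform start; the E-step normalizes
    for _ in range(iterations):
        counts = defaultdict(float)  # expected count of (s, e) links
        totals = defaultdict(float)  # expected count of e as a generator
        for src, tgt in pairs:
            for s in src:
                z = sum(t[(s, e)] for e in tgt)
                for e in tgt:
                    c = t[(s, e)] / z  # posterior that e generated s
                    counts[(s, e)] += c
                    totals[e] += c
        t = defaultdict(float, {(s, e): c / totals[e]
                                for (s, e), c in counts.items()})
    return t

pairs = [(["la", "casa"], ["the", "house"]),
         (["la", "puerta"], ["the", "door"])]
t = model1_em(pairs)
print(round(t[("casa", "house")], 3))  # climbs toward 1.0 across iterations
```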
39. Summary
- Intro and a little history
- Language Similarities and Divergences
- Four main MT Approaches
- Transfer
- Interlingua
- Direct
- Statistical
- Evaluation
40. Classes
- LINGUIST 139M/239M. Human and Machine Translation (Martin Kay)
- CS 224N. Natural Language Processing (Chris Manning)