Title: LING 138/238 SYMBSYS 138 Intro to Computer Speech and Language Processing
1. LING 138/238 SYMBSYS 138: Intro to Computer Speech and Language Processing
- Lecture 12: Machine Translation (II)
- November 4, 2004
- Dan Jurafsky
Thanks to Kevin Knight for much of this material!!
2. Outline for MT Week
- Intro and a little history
- Language Similarities and Divergences
- Four main MT Approaches
- Transfer
- Interlingua
- Direct
- Statistical
- Evaluation
3. Thanks to Bonnie Dorr!
- Next ten slides draw from her slides on BLEU
4. How do we evaluate MT? Human
- Fluency
- Overall fluency
- Human rating of sentences read out loud
- Cohesion (Lexical chains, anaphora, ellipsis)
- Hand-checking for cohesion.
- Well-formedness
- 5-point scale of syntactic correctness
- Fidelity (same information as source?)
- Hand rating of target text on 100pt scale
- Clarity
- Comprehensibility
- Noise test
- Multiple choice questionnaire
- Readability
- cloze
5. Evaluating MT: Problems
- Asking humans to judge sentences on a 5-point scale for 10 factors takes time and money (weeks or months!)
- We can't build language engineering systems if we can only evaluate them once every quarter!!!!
- We need a metric that we can run every time we change our algorithm.
- It would be OK if it wasn't perfect, as long as it tended to correlate with the expensive human metrics, which we could still run quarterly.
6. BiLingual Evaluation Understudy (BLEU; Papineni et al., 2001)
http://www.research.ibm.com/people/k/kishore/RC22176.pdf
- Automatic technique, but…
- Requires the pre-existence of human (reference) translations
- Approach:
- Produce a corpus of high-quality human translations
- Judge "closeness" numerically (word-error rate)
- Compare n-gram matches between a candidate translation and 1 or more reference translations
7. BLEU Comparison
Chinese-English translation example:
Candidate 1: It is a guide to action which ensures that the military always obeys the commands of the party.
Candidate 2: It is to insure the troops forever hearing the activity guidebook that party direct.
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
8. How Do We Compute BLEU Scores?
- Intuition: What percentage of words in the candidate occurred in some human translation?
- Proposal: count up the # of candidate translation words (unigrams) that occur in any reference translation, and divide by the total # of words in the candidate translation
- But we can't just count the total # of overlapping N-grams!
- Candidate: the the the the the the the
- Reference 1: The cat is on the mat
- Solution: a reference word should be considered exhausted after a matching candidate word is identified.
9Modified n-gram precision
- For each word compute:
- (1) the maximum number of times it occurs in any single reference translation
- (2) the number of times it occurs in the candidate translation
- Instead of using count (2) directly, use the minimum of (1) and (2), i.e. clip the counts at the max for any single reference translation
- Now use that modified count.
- And divide by the number of candidate words (see the sketch below).
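In code, the clipping idea is compact. A minimal sketch in Python (the function name modified_precision and the pre-tokenized-input assumption are mine, not from the paper):

```python
from collections import Counter

def modified_precision(candidate, references, n=1):
    """Clipped n-gram precision: each candidate n-gram can match only
    as many times as it appears in the best single reference."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate)
    clipped = sum(min(count, max(ngrams(ref)[gram] for ref in references))
                  for gram, count in cand.items())
    return clipped, sum(cand.values())
```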
10. Modified Unigram Precision: Candidate 1
It(1) is(1) a(1) guide(1) to(1) action(1) which(1) ensures(1) that(2) the(4) military(1) always(1) obeys(0) the commands(1) of(1) the party(1)
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
What's the answer???
17/18
11. Modified Unigram Precision: Candidate 2
It(1) is(1) to(1) insure(0) the(4) troops(0) forever(1) hearing(0) the activity(0) guidebook(0) that(2) party(1) direct(0)
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
What's the answer???
8/14
12. Modified Bigram Precision: Candidate 1
It is(1) is a(1) a guide(1) guide to(1) to action(1) action which(0) which ensures(0) ensures that(1) that the(1) the military(1) military always(0) always obeys(0) obeys the(0) the commands(0) commands of(0) of the(1) the party(1)
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
What's the answer???
10/17
13. Modified Bigram Precision: Candidate 2
It is(1) is to(0) to insure(0) insure the(0) the troops(0) troops forever(0) forever hearing(0) hearing the(0) the activity(0) activity guidebook(0) guidebook that(0) that party(0) party direct(0)
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
What's the answer???
1/13
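Running the sketch above on these slide examples (after lowercasing and stripping punctuation) should reproduce all four numbers:

```python
c1 = ("it is a guide to action which ensures that the military "
      "always obeys the commands of the party").split()
c2 = ("it is to insure the troops forever hearing the activity "
      "guidebook that party direct").split()
refs = [
    ("it is a guide to action that ensures that the military "
     "will forever heed party commands").split(),
    ("it is the guiding principle which guarantees the military "
     "forces always being under the command of the party").split(),
    ("it is the practical guide for the army always to heed "
     "the directions of the party").split(),
]
print(modified_precision(c1, refs, n=1))  # (17, 18)
print(modified_precision(c2, refs, n=1))  # (8, 14)
print(modified_precision(c1, refs, n=2))  # (10, 17)
print(modified_precision(c2, refs, n=2))  # (1, 13)
```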
14. Catching Cheaters
Candidate: the(2) the the the(0) the(0) the(0) the(0)
Reference 1: The cat is on the mat
Reference 2: There is a cat on the mat
What's the unigram answer?
2/7
What's the bigram answer?
0/6
15. BLEU distinguishes human from machine translations
16. BLEU: problems with sentence length
- Candidate: "of the"
- Solution: a brevity penalty prefers candidate translations which are the same length as one of the references
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
Problem: modified unigram precision is 2/2, bigram 1/1!
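The brevity penalty from the BLEU paper multiplies the geometric mean of the n-gram precisions. A sketch building on modified_precision above (helper names mine):

```python
import math

def brevity_penalty(cand_len, ref_lens):
    """1.0 if the candidate is at least as long as the closest
    reference; otherwise exp(1 - r/c), which decays toward 0."""
    r = min(ref_lens, key=lambda rl: (abs(rl - cand_len), rl))
    return 1.0 if cand_len >= r else math.exp(1.0 - r / cand_len)

def bleu(candidate, references, max_n=4):
    """BP times the geometric mean of modified 1..max_n-gram precisions."""
    log_sum = 0.0
    for n in range(1, max_n + 1):
        clipped, total = modified_precision(candidate, references, n)
        if total == 0 or clipped == 0:
            return 0.0  # one zero precision zeroes the geometric mean
        log_sum += math.log(clipped / total)
    bp = brevity_penalty(len(candidate), [len(r) for r in references])
    return bp * math.exp(log_sum / max_n)
```

For the two-word candidate "of the" against the 16-18-word references above, BP = exp(1 - 16/2) ≈ 0.0009, which crushes the perfect 2/2 and 1/1 precisions.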
17. Statistical MT
- Fidelity and fluency
- Best translation = the one that maximizes both
- Developed by researchers who were originally in speech recognition at IBM
- Called the IBM model
18. The IBM model
- Hmm, those two factors might look familiar…
- Yup, it's Bayes' rule
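The equation this slide presumably displayed is the standard noisy-channel decomposition (reconstructed here, since the original image did not survive):

```latex
\hat{T} = \operatorname*{argmax}_{T} P(T \mid S)
        = \operatorname*{argmax}_{T} \frac{P(S \mid T)\,P(T)}{P(S)}
        = \operatorname*{argmax}_{T} \underbrace{P(S \mid T)}_{\text{faithfulness}} \; \underbrace{P(T)}_{\text{fluency}}
```

P(S) can be dropped from the argmax because it does not depend on T.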
19. Fluency: P(T)
- How do we measure that this sentence:
- "That car was almost crash onto me"
- is less fluent than this one:
- "That car almost hit me."
- Answer: language models (N-grams!)
- For example, P(hit | almost) > P(was | almost)
- But we can use any other, more sophisticated model of grammar
- Advantage: this is monolingual knowledge!
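A toy bigram language model makes the comparison concrete. This sketch just counts and divides; the corpus is invented for illustration:

```python
from collections import Counter

def train_bigram_lm(corpus):
    """MLE bigram model: P(w2 | w1) = count(w1 w2) / count(w1)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return lambda w1, w2: (bigrams[(w1, w2)] / unigrams[w1]
                           if unigrams[w1] else 0.0)

p = train_bigram_lm([
    "that car almost hit me",
    "the truck almost hit the wall",
    "it was almost noon",
])
print(p("almost", "hit"))    # 2/3
print(p("almost", "crash"))  # 0.0 -- "almost crash" is unattested
```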
20. Faithfulness: P(S|T)
- French: "ça me plaît" ("that me pleases")
- English:
- "That pleases me" - most faithful
- "I like it"
- "I'll take that one"
- How do we quantify this?
- Intuition: the degree to which words in one sentence are plausible translations of words in the other sentence
- Product of probabilities that each word in the target sentence would generate each word in the source sentence.
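One simple way to cash out "product of probabilities" is IBM Model 1-style scoring, where each source word may be generated by any target word. A sketch; the translation table here is a made-up toy:

```python
def faithfulness(source, target, t):
    """Model 1-style P(S|T): for each source word, average the
    probability that each target word generated it, then multiply."""
    score = 1.0
    for s in source:
        score *= sum(t.get((s, e), 0.0) for e in target) / len(target)
    return score

# Toy t(s|e) table -- numbers invented for illustration
t = {("ça", "that"): 0.8, ("me", "me"): 0.9, ("plaît", "pleases"): 0.7,
     ("ça", "it"): 0.3, ("me", "I"): 0.2, ("plaît", "like"): 0.4}
print(faithfulness(["ça", "me", "plaît"], ["that", "pleases", "me"], t))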
21. Faithfulness: P(S|T)
- We need to know, for every target language word, the probability of it mapping to every source language word.
- How do we learn these probabilities?
- Parallel texts!
- Often we have two texts that are translations of each other
- If we knew which word in the source text mapped to which word in the target text, we could just count!
22. Faithfulness: P(S|T)
- Sentence alignment:
- Figuring out which source language sentence maps to which target language sentence
- Word alignment:
- Figuring out which source language word maps to which target language word
23. Big Point about Faithfulness and Fluency
- The job of the faithfulness model P(S|T) is just to model "bag of words": which words come from, say, English to Spanish.
- P(S|T) doesn't have to worry about internal facts about Spanish word order; that's the job of P(T)
- P(T) can do "bag generation": put the following words in order
- Have programming a seen never I language better
- Actual the hashing is since not collision-free usually the is less perfectly the of somewhat capacity table
24. P(T) and bag generation: the answer
- Usually the actual capacity of the table is somewhat less, since the hashing is not perfectly collision-free
25. A motivating example
- Japanese phrase: "2000nen taio"
- 2000nen:
- 2000 (highest probability)
- Y2K
- 2000 years
- 2000 year
- taio:
- correspondence (highest probability)
- corresponding
- equivalent
- tackle
- dealing with
- deal with
P(S|T) alone prefers "2000 correspondence"
Adding P(T) might produce the correct "dealing with Y2K"
26. More formally: The IBM Model
- Let's flesh out these intuitions about P(S|T) and P(T) a bit.
- Many of the next slides are drawn from Kevin Knight's fantastic "A Statistical MT Tutorial Workbook"!
27. IBM Model 3 as a probabilistic version of Direct MT
- We translate English into Spanish as follows:
- Replace the words in the English sentence by Spanish words
- Scramble the words around to look like Spanish order
- But we can't propose that English words are replaced by Spanish words one-for-one, because translations aren't the same length.
28. IBM Model 3 (from Knight 1999)
- For each word ei in the English sentence, choose a fertility φi. The choice of φi depends only on ei, not on other words or φ's.
- For each word ei, generate φi Spanish words. The choice of Spanish word depends only on the English word ei, not on the English context or on any other Spanish words.
- Permute all the Spanish words. Each Spanish word gets assigned an absolute target position slot (1, 2, 3, etc.). The choice of Spanish word position depends only on the absolute position of the English word generating it.
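A toy sampler for this three-step generative story (the parameter tables and function names are invented for illustration; real tables are learned from parallel text):

```python
import random

def sample(dist):
    """Draw one outcome from a {value: probability} dict."""
    return random.choices(list(dist), weights=list(dist.values()))[0]

def model3_generate(english, fertility, translate):
    """The three steps: fertilities, word-for-word translation, then
    permutation. A uniform shuffle stands in for the real distortion
    model d(target position | source position, lengths)."""
    out = []
    for e in english:
        phi = sample(fertility[e])                         # step 1
        out += [sample(translate[e]) for _ in range(phi)]  # step 2
    random.shuffle(out)                                    # step 3
    return out

fertility = {"not": {1: 0.9, 2: 0.1}, "slap": {3: 0.8, 1: 0.2}}
translate = {"not": {"no": 1.0},
             "slap": {"daba": 0.4, "una": 0.3, "bofetada": 0.3}}
print(model3_generate(["not", "slap"], fertility, translate))
```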
29. Translation as String Rewriting (from Knight 1999)
- Mary did not slap the green witch
- Assign fertilities: 1 = copy word over, 2 = copy it twice, etc.; 0 = delete
- Mary not slap slap slap the the green witch
- Replace English words with Spanish ones, one-for-one
- Mary no daba una bofetada a la verde bruja
- Permute the words
- Mary no daba una bofetada a la bruja verde
30. Model 3 P(S|T): training parameters
- What are the parameters for this model? Just look at the dependencies:
- Words: t(casa|house)
- Fertilities: n(1|house): the probability that "house" will produce exactly 1 Spanish word whenever it appears
- Distortions: d(5|2): the probability that the English word in position 2 of the English sentence generates a Spanish word in position 5 of the Spanish translation
- Actually, distortions are d(5|2,4,6), where 4 is the length of the English sentence and 6 is the Spanish length
- Remember, P(S|T) doesn't have to model fluency
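As data structures, these are just lookup tables. A sketch of how they might be held in Python (all values invented, not trained):

```python
# Toy Model 3 parameter tables for P(S|T)
t = {("casa", "house"): 0.8, ("hogar", "house"): 0.1}  # translation t(s|e)
n = {(1, "house"): 0.9, (0, "did"): 0.87}              # fertility  n(phi|e)
d = {(5, 2, 4, 6): 0.2}                                # distortion d(j|i,l,m)
```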
31. Model 3: one last twist
- Imagine some Spanish words are spurious: they appear in the Spanish even though they weren't in the English original
- Like function words: we generated "a la" from "the" by giving "the" fertility 2
- Instead, we could give "the" fertility 1 and generate "a" spuriously
- Do this by pretending every English sentence contains an invisible word NULL as word 0.
- Then parameters like t(a|NULL) give the probability of the word "a" being generated spuriously from NULL
32. Spurious words
- We could imagine having n(3|NULL) (the probability of there being exactly 3 spurious words in a Spanish translation)
- Instead of n(0|NULL), n(1|NULL), …, n(25|NULL), have a single parameter p1
- After assigning fertilities to the non-NULL English words, we want to generate (say) z Spanish words
- As we generate each of the z words, we optionally toss in a spurious Spanish word with probability p1
- Probability of not tossing in a spurious word: p0 = 1 - p1
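In other words, the number of spurious words comes out binomial: if z real Spanish words are generated, then (following Knight's workbook, as I read it)

```latex
P(\phi_0 = k \mid z) = \binom{z}{k} \, p_1^{\,k} \, p_0^{\,z-k},
\qquad p_0 = 1 - p_1
```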
33. Distortion probabilities for spurious words
- Can't just have d(5|0,4,6), i.e. the chance that a NULL-generated word will end up in position 5
- Why? These are spurious words! They could occur anywhere!! Too hard to predict
- Instead:
- Use the normal-word distortion parameters to choose positions for the normally-generated Spanish words
- Put the NULL-generated words into the empty slots left over
- If there are three NULL-generated words and three empty slots, then there are 3!, or six, ways of slotting them all in
- We'll assign a probability of 1/6 to each way
34. The real Model 3
- For each word ei in the English sentence, choose a fertility φi with probability n(φi|ei)
- Choose the number φ0 of spurious Spanish words to be generated from e0 = NULL, using p1 and the sum of fertilities from step 1
- Let m be the sum of fertilities for all words, including NULL
- For each i = 0, 1, 2, …, l and k = 1, 2, …, φi:
- choose a Spanish word τik with probability t(τik|ei)
- For each i = 1, 2, …, l and k = 1, 2, …, φi:
- choose a target Spanish position πik with probability d(πik|i,l,m)
- For each k = 1, 2, …, φ0, choose a position π0k from the φ0 - k + 1 remaining vacant positions in 1, 2, …, m, for a total probability of 1/φ0!
- Output the Spanish sentence with words τik in positions πik (0 ≤ i ≤ l, 1 ≤ k ≤ φi)
35. String rewriting
- Mary did not slap the green witch (input)
- Mary not slap slap slap the green witch (choose fertilities)
- Mary not slap slap slap NULL the green witch (choose number of spurious words)
- Mary no daba una bofetada a la verde bruja (choose translations)
- Mary no daba una bofetada a la bruja verde (choose target positions)
36. Model 3 parameters
- n, t, p, d
- If we had English strings and step-by-step rewritings into Spanish, we could:
- Compute n(0|did) by locating every instance of "did" and seeing what happens to it during the first rewriting step
- If "did" appeared 15,000 times and was deleted during the first rewriting step 13,000 times, then n(0|did) = 13/15
37. Alignments
- NULL And the program has been implemented
- Le programme a été mis en application
- (alignment links connect each French word to the English word, or NULL, that generated it)
- If we had lots of alignments like this, we could count:
- n(0|did): how many times does "did" connect to no French words?
- t(maison|house): how many of all the French words generated by "house" were "maison"?
- d(5|2,4,6): out of all the times a word in position 2 moved somewhere, how many times was it to position 5?
38. Where to get alignments
- It turns out we can bootstrap alignments if we just have a bilingual corpus:
- 1. Assume some startup values for n, d, t, etc.
- 2. Use the values of n, d, t, etc. to run Model 3 in forced alignment mode, i.e. to pick the best word alignments between sentences
- 3. Use these alignments to retrain n, d, t, etc.
- 4. Go to step 2
- This is called the Expectation-Maximization, or EM, algorithm (sketched below for the simpler Model 1)
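Model 3's EM is involved, so here is the same bootstrapping loop for the simpler IBM Model 1, where the expected counts have a closed form. A simplified sketch (no NULL word, unsmoothed):

```python
from collections import defaultdict

def model1_em(pairs, iterations=10):
    """EM for IBM Model 1. pairs: list of (source_words, target_words).
    E-step: expected alignment counts under the current t-table.
    M-step: renormalize the counts into a new t-table."""
    t = defaultdict(lambda: 1.0)  # uniform start; the E-step normalizes
    for _ in range(iterations):
        counts = defaultdict(float)  # expected count of (s, e) links
        totals = defaultdict(float)  # expected count of e as a generator
        for src, tgt in pairs:
            for s in src:
                z = sum(t[(s, e)] for e in tgt)
                for e in tgt:
                    c = t[(s, e)] / z  # posterior that e generated s
                    counts[(s, e)] += c
                    totals[e] += c
        t = defaultdict(float, {(s, e): c / totals[e]
                                for (s, e), c in counts.items()})
    return t

pairs = [(["la", "casa"], ["the", "house"]),
         (["la", "puerta"], ["the", "door"])]
t = model1_em(pairs)
print(round(t[("casa", "house")], 3))  # climbs toward 1.0 across iterations
```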
39. Summary
- Intro and a little history
- Language Similarities and Divergences
- Four main MT Approaches
- Transfer
- Interlingua
- Direct
- Statistical
- Evaluation
40. Classes
- LINGUIST 139M/239M. Human and Machine Translation (Martin Kay)
- CS 224N. Natural Language Processing (Chris Manning)