Machine translation (I): MT overview

1
Machine translation (I): MT overview
  • Ling 571
  • Fei Xia
  • Week 9: 11/22/05 -- 11/29/05

2
Plan for the rest of the quarter
  • MT
  • Part I: MT overview, 11/22 -- 11/29
  • Part II: Word-based SMT, 12/1 -- 12/6
  • Next quarter: seminar on MT
  • Starting with a real baseline system
  • Improving the system in various ways
  • Reading and presenting recent papers
  • Project presentation: 12/8

3
Homework and Quizzes
  • Hw 8: due on 11/23 (tomorrow)
  • Hw 9: due on 12/6
  • Hw 10: presentation due on 12/8, report due on
    12/13
  • Quiz 3: 11/29
  • Quiz 4: 12/6

4
Outline
  • MT in a nutshell
  • Major challenges in MT
  • Major approaches
  • Evaluation of MT systems

5
MT in a nutshell
6
Q1: What is the ultimate goal of translation?
  • Translation: source language → target language
    (S → T)
  • Ultimate goal: find the perfect translation for
    text in S, thus allowing people to appreciate
    the text in S without knowing S
  • Accuracy: faithful to S, including meaning,
    connotation, style, ...
  • Fluency: the translation is as natural as an
    utterance in T.

7
Q2: What are the perfect translations?
  • What do Accuracy and Fluency mean?
  • Ex1: Compliment / downplayer
  • (1) A: Your daughter was phenomenal.
  • (2) B: No. She was just so-so.
  • Ex2: Greeting: how's everything?
  • Old days: chi1 le5 ma5? (Have you eaten?)
  • 1980s -- now: fa1 le5 ma5? (Have (you) gotten
    rich?)
  • 2000s -- now: li2 le5 ma5? (Have (you) gotten
    divorced?)
  • The answer: it depends

8
Q3: Can humans always get the perfect translations?
  • Novels: Shakespeare, Cao Xueqin, ...
  • - hidden messages: c1 c0, c2 c0, c3 c0, c4 c0
  • c1 c2 c3 c4
  • Word play, jokes, puns
  • What do prisoners use to call each other?
  • Cell phones.
  • Concept gaps: double jeopardy, go Greek, feng shui,
    bei fen, ...
  • Other constraints: lyrics, dubbing, poems.

9
Crazy English by Richard Lederer
  • Let's face it: English is a crazy language. There
    is no egg in eggplant or ham in hamburger,
    neither apple nor pine in pineapple.
  • When a house burns up, it burns down. You fill in
    a form by filling it out and an alarm clock goes
    off by going on.
  • When the stars are out, they are visible, but
    when the lights are out, they are invisible. And
    why, when I wind up my watch, I start it, but
    when I wind up this essay, I end it?

10
How to translate it?
  • Compound words: Let's face it: English is a
    crazy language. There is no egg in eggplant or
    ham in hamburger, neither apple nor pine in
    pineapple.
  • Verb+particle: When a house burns up, it burns
    down. You fill in a form by filling it out and an
    alarm clock goes off by going on.
  • Predicate+argument: When the stars are out, they
    are visible, but when the lights are out, they
    are invisible. And why, when I wind up my watch,
    I start it, but when I wind up this essay, I end
    it?

11
Q4: Can machines be as good as humans in
translation quality?
  • We know there are things that even humans cannot
    translate perfectly.
  • For things that humans can translate, will
    machines ever be as good as humans in translation
    quality?
  • Never say never.
  • Not in the near future.

12
Q5: What is MT good for?
  • Rough translation: web data
  • Computer-aided human translation
  • Translation for limited domains
  • Cross-lingual IR
  • Machines are better than humans in:
  • Speed: much faster than humans
  • Memory: can easily memorize millions of
    word/phrase translations.
  • Manpower: machines are much cheaper than humans
  • Fast learner: it takes minutes or hours to build
    a new system. Erasable memory ?
  • Never complain, never get tired, ...

13
Q6: What's the MT history? (Based on work by
John Hutchins)
  • Before the computer: in the mid-1930s, a
    French-Armenian, Georges Artsrouni, and a Russian,
    Petr Troyanskii, applied for patents for
    translating machines.
  • The pioneers (1947-1954): the first public MT
    demo was given in 1954 (by IBM and Georgetown
    University).
  • The decade of optimism (1954-1966): ALPAC
    (Automatic Language Processing Advisory
    Committee) report in 1966: "there is no immediate
    or predictable prospect of useful machine
    translation."

14
A brief history of MT (cont)
  • The aftermath of the ALPAC report (1966-1980): a
    virtual end to MT research
  • The 1980s: Interlingua, example-based MT
  • The 1990s: Statistical MT
  • The 2000s: Hybrid MT

15
Q7: Where are we now?
  • Huge potential/need due to the internet,
    globalization and international politics.
  • Quick development time due to SMT, the
    availability of parallel data and computers.
  • Translation is reasonable for language pairs with
    a large amount of resources.
  • Starting to include more minor languages.

16
MT in a nutshell
  • What is the ultimate goal of translation?
  • What are the perfect translations?
  • Can humans always get perfect translations?
  • Can machines be as good as humans?
  • What is MT good for?
  • What is the MT history?
  • Where are we now?

17
Outline
  • MT in a nutshell
  • Major challenges in MT
  • Major approaches
  • Evaluation of MT systems

18
Major challenges in MT
19
Major challenges
  • Getting the right words
  • Choosing the correct root form
  • Getting the correct inflected form
  • Inserting spontaneous words
  • Putting the words in the correct order
  • Word order: SVO vs. SOV, ...
  • Unique constructions
  • Divergence

20
Lexical choice
  • Homonymy/Polysemy: bank, run
  • Concept gap: no corresponding concepts in another
    language: go Greek, go Dutch, feng shui, lame duck,
    Chinese idioms, ...
  • Coding (Concept → lexeme mapping) differences
  • More distinction in one language: e.g., kinship
    vocabulary.
  • Different division of conceptual space

21
More distinction: the cousin example
  • Translations: male/female, older or younger than
    the speaker, from mother's or father's side,
    parent's brother's or sister's child
  • 堂兄 (tang xiong): father's brother's son, who is
    older
  • 堂弟 (tang di): father's brother's son, who is
    younger
  • 表兄 (biao xiong):
  • 姨表兄 (yi biao xiong): mother's sister's son, who
    is older
  • 舅表兄 (jiu biao xiong): mother's brother's son, who
    is older
  • 姑表兄 (gu biao xiong): father's sister's son, who
    is older
  • 表弟 (biao di): mother's sibling's son, who is
    younger
  • 16 translations → 8 → 4.
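
The cousin example can be sketched as a lookup from kinship features to a term. This is only an illustration: the function name `cousin_term` and the boolean feature encoding are hypothetical, and the female forms (jie, mei) are added by analogy because the slide lists only male cousins. Under these assumptions the 16 feature combinations collapse to 8 coarse terms, which may be what the slide's "16 → 8" refers to.

```python
from itertools import product

def cousin_term(via_father: bool, via_brother: bool, male: bool, older: bool) -> str:
    """Return the pinyin of a coarse Chinese term for English 'cousin'.

    tang: child of the father's brother; biao: every other cousin.
    """
    prefix = "tang" if (via_father and via_brother) else "biao"
    if male:
        suffix = "xiong" if older else "di"
    else:
        suffix = "jie" if older else "mei"
    return f"{prefix} {suffix}"

# 2 x 2 x 2 x 2 = 16 feature combinations for the single English word "cousin".
combos = list(product([True, False], repeat=4))
terms = {cousin_term(*c) for c in combos}
print(len(combos), len(terms))
```

Translating "cousin" into Chinese therefore requires information (side of the family, age relative to the speaker) that the English sentence usually does not state.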

22
More distinction: the aunt example
  • Mother's or father's side; is a sister or is a
    brother's wife
  • ??: father's sister
  • ??: father's younger brother's wife
  • ?: father's elder brother's wife
  • ??: mother's sister
  • ??: mother's brother's wife

23
A large happy family
  • ????: mother's mother's 3rd sister's husband.
  • 3rd among all the sisters, or among all the
    siblings, or among all the members in the
    extended family with the same bei fen.
  • Same bei fen = same distance to the root node of
    a family tree.

24
Sapir-Whorf hypothesis
  • Edward Sapir (1884-1936): American linguist and
    anthropologist.
  • Benjamin Lee Whorf (1897-1941): Sapir's student.
  • Linguistic determinism/relativism: the language
    we use to some extent determines the way in which
    we view and think about the world around us.
  • Strong determinism: language determines thought;
    language and thought are identical.
  • Weak determinism: thought is merely affected by
    or influenced by our language.

25
Sapir-Whorf hypothesis (cont)
  • Snow
  • Whorf: the number of words the Inuit people have
    for snow → Inuit people treat snow differently
    than someone who lives in a less snow-dependent
    environment.
  • Pullum (1991): Other languages transmit the same
    ideas using phrases instead of single words.
  • My personal experience
  • Preference of one language over the other

26
Color coding study
  • Hypothesis: if one language categorizes color
    differently than another language, then the
    different groups should perceive it differently
    also.
  • A 1970 study
  • Task: give people a sample of 160 colors and ask
    them to sort them.
  • People: English speakers (blue-green distinction)
    and Berinmo speakers (nol-wor distinction)
  • The Berinmo speakers were better at matching
    colors across their nol and wor categories than
    across the English blue and green categories, and
    English speakers were better at matching colors
    across blue and green than across the Berinmo nol
    and wor (Sawyer, 1999).

27
Sapir-Whorf hypothesis (cont)
  • What's the relation between language, thought,
    and cultural perception of reality?
  • Does language affect thought? If so, to what
    degree?
  • The hypothesis is still under much debate.

28
Major challenges
  • Getting the right words
  • Choosing the correct root form
  • Getting the correct inflected form
  • Inserting spontaneous words
  • Putting the words in the correct order
  • Word order: SVO vs. SOV, ...
  • Unique construction
  • Structural divergence

29
Choosing the appropriate inflection
  • Inflection: gender, number, case, tense, ...
  • Ex:
  • Number (Ch-Eng): all the concrete nouns
  • ch_book → book, books
  • Gender (Eng-Fr): all the adjectives
  • Case (Eng-Korean): all the arguments
  • Tense (Ch-Eng): all the verbs
  • ch_buy → buy, bought, will buy

30
Inserting spontaneous words
  • Function words
  • Determiners (Ch-Eng):
  • ch_book → a book, the book, the books,
    books
  • Prepositions (Ch-Eng):
  • ch_November → in November
  • Relative pronouns (Ch-Eng):
  • ch_buy ch_book de ch_person → the person
    who bought /book/
  • Possessive pronouns (Ch-Eng):
  • ch_he ch_raise ch_hand → He raised his
    hand(s)
  • Conjunction (Eng-Ch):
  • Although S1, S2 → ch_although S1, ch_but S2

31
Inserting spontaneous words (cont)
  • Content words
  • Dropped argument (Ch-Eng):
  • ch_buy le ma → Has Subj bought Obj?
  • Chinese first name (Eng-Ch):
  • Jiang → ch_Jiang ch_Zemin
  • Abbreviations, acronyms (Ch-Eng):
  • ch_12 ch_big → the 12th National Congress of
    the CPC (Communist Party of China)

32
Major challenges
  • Getting the right words
  • Choosing the correct root form
  • Getting the correct inflected form
  • Inserting spontaneous words
  • Putting the words in the correct order
  • Word order: SVO vs. SOV, ...
  • Unique construction
  • Structural divergence

33
Word order
  • SVO, SOV, VSO, ...
  • VP PP → PP VP
  • VP AdvP → AdvP VP
  • Adj N → N Adj
  • NP PP → PP NP
  • NP S → S NP
  • P NP → NP P
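
Reordering patterns like these can be applied to a parse tree as child permutations. The sketch below is a toy illustration, not a description of any particular system: the tree encoding, the `RULES` table, and the parent labels attached to each pattern (the slide gives only the child sequences) are all assumptions.

```python
# A tree node is (label, [children]); a leaf is (label, "word").
# Hypothetical rule table: (parent label, child labels) -> new child order.
RULES = {
    ("VP", ("VP", "PP")): (1, 0),   # VP PP -> PP VP
    ("NP", ("Adj", "N")): (1, 0),   # Adj N -> N Adj
}

def reorder(node):
    """Recursively apply the reordering rules, bottom-up."""
    label, body = node
    if isinstance(body, str):            # leaf: nothing to reorder
        return node
    kids = [reorder(k) for k in body]
    perm = RULES.get((label, tuple(k[0] for k in kids)))
    if perm is not None:
        kids = [kids[i] for i in perm]
    return (label, kids)

def words(node):
    """Read the words off the tree, left to right."""
    label, body = node
    if isinstance(body, str):
        return [body]
    return [w for k in body for w in words(k)]

tree = ("VP", [
    ("VP", [("V", "bought"), ("NP", [("Det", "a"), ("N", "book")])]),
    ("PP", [("P", "in"), ("N", "November")]),
])
print(" ".join(words(reorder(tree))))
```

The output keeps the source words; a full transfer step would also translate them and choose inflections.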

34
Unique Constructions
  • Overt wh-movement (Eng-Ch):
  • Eng: Why do you think that he came yesterday?
  • Ch: you why think he yesterday come ASP?
  • Ch: you think he yesterday why come?
  • Ba-construction (Ch-Eng):
  • She ba homework finish ASP → She finished her
    homework.
  • He ba wall dig ASP CL hole → He dug a hole in
    the wall.
  • She ba orange peel ASP skin → She peeled the
    orange's skin.

35
Translation divergences (based on Bonnie Dorr's
work)
  • Thematic divergence: I like Mary →
  • S: María me gusta a mí (Mary pleases me)
  • Promotional divergence: John usually goes home →
  • S: Juan suele ir a casa (John tends to go
    home)
  • Demotional divergence: I like eating → G: Ich esse
    gern (I eat likingly)
  • Structural divergence: John entered the house →
  • S: Juan entró en la casa (John entered in
    the house)

36
Translation divergences (cont)
  • Conflational divergence: I stabbed John →
  • S: Yo le di puñaladas a Juan (I gave
    knife-wounds to John)
  • Categorial divergence: I am hungry →
  • G: Ich habe Hunger (I have hunger)
  • Lexical divergence: John broke into the room →
  • S: Juan forzó la entrada al cuarto (John
    forced the entry to the room)

37
Outline
  • MT in a nutshell
  • Major challenges in MT
  • Major approaches
  • Evaluation of MT systems

38
How do humans do translation?
  • Learn a foreign language (MT: training stage)
  • Memorize word translations (MT: translation lexicon)
  • Learn some patterns (MT: templates, transfer rules)
  • Exercise (MT: reinforcement learning? reranking?)
  • Passive activity: read, listen
  • Active activity: write, speak
  • Translation (MT: decoding stage)
  • Understand the sentence (MT: parsing, semantic
    analysis?)
  • Clarify or ask for help (optional) (MT:
    interactive MT?)
  • Translate the sentence (MT: word-level?
    phrase-level? generate from meaning?)

39
What kinds of resources are available to MT?
  • Translation lexicon
  • Bilingual dictionary
  • Templates, transfer rules
  • Grammar books
  • Parallel data, comparable data
  • Thesaurus, WordNet, FrameNet,
  • NLP tools: tokenizer, morph analyzer, parser, ...
  • → More resources for major languages, fewer for
    minor languages.

40
Major approaches
  • Transfer-based
  • Interlingua
  • Example-based (EBMT)
  • Statistical MT (SMT)
  • Hybrid approach

41
The MT triangle
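[Figure: the MT triangle. Apex: meaning (interlingua). The two sides are analysis (source side) and synthesis (target side). Levels from top to bottom: transfer-based; phrase-based SMT, EBMT; word-based SMT, EBMT. Base: source word to target word.]
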
42
Transfer-based MT
  • Analysis, transfer, generation
  • Parse the source sentence
  • Transform the parse tree with transfer rules
  • Translate source words
  • Get the target sentence from the tree
  • Resources required
  • Source parser
  • A translation lexicon
  • A set of transfer rules
  • An example: Mary bought a book yesterday.

43
Transfer-based MT (cont)
  • Parsing: linguistically motivated grammar or
    formal grammar?
  • Transfer: context-free rules? Additional
    constraints on the rules? Apply at most one rule
    at each level? How are rules created?
  • Translating words: word-to-word translation?
  • Generation: using LM or other additional
    knowledge?
  • How to create the needed resources automatically?

44
Interlingua
  • With one system per language pair and direction,
    n languages need n(n-1) MT systems.
  • Interlingua uses a language-independent
    representation.
  • Conceptually, Interlingua is elegant: we only
    need n analyzers and n generators.
  • Resources needed:
  • A language-independent representation
  • Sophisticated analyzers
  • Sophisticated generators
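
The counting argument can be checked with a few lines of arithmetic. The function names below are illustrative only, and the count assumes one directed system per ordered language pair.

```python
def direct_systems(n: int) -> int:
    """One directed MT system per ordered language pair: n(n-1)."""
    return n * (n - 1)

def interlingua_components(n: int) -> int:
    """One analyzer plus one generator per language: 2n."""
    return 2 * n

for n in (3, 10, 20):
    print(n, direct_systems(n), interlingua_components(n))
```

The pairwise count grows quadratically while the interlingua count grows linearly, which is the sense in which the approach is conceptually elegant.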

45
Interlingua (cont)
  • Questions
  • Does a language-independent meaning representation
    really exist? If so, what does it look like?
  • It requires deep analysis: how to get such an
    analyzer (e.g., for semantic analysis)?
  • It requires non-trivial generation: how is that
    done?
  • It forces disambiguation at various levels:
    lexical, syntactic, semantic, discourse levels.
  • It cannot take advantage of similarities between
    a particular language pair.

46
Example-based MT
  • Basic idea: translate a sentence by using the
    closest match in parallel data.
  • First proposed by Nagao (1981).
  • Ex:
  • Training data:
  • w1 w2 w3 w4 → w1' w2' w3' w4'
  • w5 w6 w7 → w5' w6' w7'
  • w8 w9 → w8' w9'
  • Test sent:
  • w1 w2 w6 w7 w9 → w1' w2' w6' w7' w9'

47
EBMT (cont)
  • Types of EBMT
  • Lexical (shallow)
  • Morphological / POS analysis
  • Parse-tree based (deep)
  • Types of data required by EBMT systems
  • Parallel text
  • Bilingual dictionary
  • Thesaurus for computing semantic similarity
  • Syntactic parser, dependency parser, etc.

48
EBMT (cont)
  • Word alignment: using dictionary and heuristics
  • → exact match
  • Generalization:
  • Clusters: dates, numbers, colors, shapes, etc.
  • Clusters can be built by hand or learned
    automatically.
  • Ex:
  • Exact match: 12 players met in Paris last Tuesday
    → 12 Spieler trafen sich letzten Dienstag in Paris
  • Templates: <num> players met in <city> <time> →
    <num> Spieler trafen sich <time> in <city>
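
Template-based generalization can be sketched as slot matching followed by slot filling. The example below is a minimal illustration under stated assumptions: slots are written as `<num>`, `<city>`, `<time>`, and the `CLUSTERS` lexicon (including the extra entries beyond the slide's sentence) is hypothetical, hand-built data rather than anything learned.

```python
import re

# Hypothetical cluster lexicons (English -> German); illustrative only.
CLUSTERS = {
    "num": {"12": "12", "11": "11"},
    "city": {"Paris": "Paris", "Berlin": "Berlin"},
    "time": {"last Tuesday": "letzten Dienstag", "last Monday": "letzten Montag"},
}
SRC_TEMPLATE = "<num> players met in <city> <time>"
TGT_TEMPLATE = "<num> Spieler trafen sich <time> in <city>"

def template_regex(template):
    """Turn '<slot>' markers into named groups that match cluster members."""
    parts = re.split(r"<(\w+)>", template)   # literals at even indexes, slots at odd
    pieces = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pieces.append(re.escape(part))
        else:
            options = "|".join(re.escape(k) for k in CLUSTERS[part])
            pieces.append(f"(?P<{part}>{options})")
    return re.compile("^" + "".join(pieces) + "$")

def translate(sentence):
    """Translate a sentence that fits the source template, else return None."""
    match = template_regex(SRC_TEMPLATE).match(sentence)
    if match is None:
        return None
    out = TGT_TEMPLATE
    for slot, value in match.groupdict().items():
        out = out.replace(f"<{slot}>", CLUSTERS[slot][value])
    return out

print(translate("12 players met in Paris last Tuesday"))
print(translate("11 players met in Berlin last Monday"))
```

The second call shows the point of generalization: a sentence never seen in the training data is still covered because each slot accepts any member of its cluster.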

49
Statistical MT
  • Basic idea: learn all the parameters from
    parallel data.
  • Major types
  • Word-based
  • Phrase-based
  • Strengths
  • Easy to build, and it requires no human knowledge
  • Good performance when a large amount of training
    data is available.
  • Weaknesses
  • How to express linguistic generalization?

50
Comparison of resource requirements
[Table: columns are Transfer-based, Interlingua, EBMT, SMT; rows are dictionary, transfer rules, parser (?), semantic analyzer, parallel data, others. Only the "others" row survives: universal representation (Interlingua), thesaurus (EBMT).]

51
Hybrid MT
  • Basic idea: combine strengths of different
    approaches
  • Syntax-based: generalization at syntactic level
  • Interlingua: conceptually elegant
  • EBMT: memorizing translations of n-grams;
    generalization at various levels.
  • SMT: fully automatic; using LM; optimizing some
    objective functions.
  • Types of hybrid MT:
  • Borrowing concepts/methods:
  • SMT from EBMT: phrase-based SMT; alignment
    templates
  • EBMT from SMT: automatically learned translation
    lexicon
  • Transfer-based from SMT: automatically learned
    translation lexicon and transfer rules; using LM
  • Using two MTs in a pipeline:
  • Using transfer-based MT as a preprocessor of SMT
  • Using multiple MTs in parallel, then adding a
    re-ranker.

52
Outline
  • MT in a nutshell
  • Major challenges in MT
  • Major approaches
  • Evaluation of MT systems

53
Evaluation
  • Unlike many NLP tasks (e.g., tagging, chunking,
    parsing, IE, pronoun resolution), there is no
    single gold standard for MT.
  • Human evaluation: accuracy, fluency, ...
  • Problem: expensive, slow, subjective,
    non-reusable.
  • Automatic measures:
  • Edit distance
  • Word error rate (WER), Position-independent WER
    (PER)
  • Simple string accuracy (SSA), Generation string
    accuracy (GSA)
  • BLEU

54
Edit distance
  • The edit distance (a.k.a. Levenshtein distance)
    is defined as the minimal cost of transforming
    str1 into str2, using three operations
    (substitution, insertion, deletion).
  • Let the operation costs be subCost, insCost, and
    delCost, respectively.
  • Let |str1| = m and |str2| = n; D(i, j) stores the
    edit distance of converting str1[1..i] to
    str2[1..j].
  • D(m, n) is the answer that we are looking for.
  • Use DP; the complexity is O(mn).

55
Calculating edit distance
  • D(0, 0) = 0
  • D(i, 0) = delCost * i
  • D(0, j) = insCost * j
  • D(i+1, j+1) =
    min( D(i, j) + sub,
         D(i+1, j) + insCost,
         D(i, j+1) + delCost )
  • sub = 0 if str1[i+1] = str2[j+1],
    subCost otherwise

56
An example
  • Sys: w1 w2 w3 w4
  • Ref: w1 w3 w2
  • All three costs are 1.
  • Edit distance = 2

DP table D(i, j) (rows: Sys prefix length i = 0..4; columns: Ref prefix length j = 0..3):
0 1 2 3
1 0 1 2
2 1 1 1
3 2 1 2
4 3 2 2
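
The recurrence can be written directly as a dynamic program. This is a minimal sketch: the function and parameter names are illustrative, tokens are compared as whole words, and the table is kept in full (rather than two rows) to stay close to the definition of D(i, j).

```python
def edit_distance(src, tgt, sub_cost=1, ins_cost=1, del_cost=1):
    """Minimal cost of converting the token list src into tgt."""
    m, n = len(src), len(tgt)
    # D[i][j] = cost of converting src[:i] into tgt[:j]
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = del_cost * i          # delete all of src[:i]
    for j in range(1, n + 1):
        D[0][j] = ins_cost * j          # insert all of tgt[:j]
    for i in range(m):
        for j in range(n):
            sub = 0 if src[i] == tgt[j] else sub_cost
            D[i + 1][j + 1] = min(
                D[i][j] + sub,            # match or substitute
                D[i + 1][j] + ins_cost,   # insert tgt[j]
                D[i][j + 1] + del_cost,   # delete src[i]
            )
    return D[m][n]

sys_out = "w1 w2 w3 w4".split()
ref = "w1 w3 w2".split()
print(edit_distance(sys_out, ref))  # 2
```

Both loops run over every (i, j) pair once, which gives the O(mn) complexity.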
57
WER, PER, and SSA
  • WER (word error rate): edit distance divided by
    |Ref|, the length of the reference.
  • PER (position-independent WER): same as WER but
    disregards word ordering
  • SSA (simple string accuracy): 1 - WER
  • Previous example:
  • Sys: w1 w2 w3 w4
  • Ref: w1 w3 w2
  • Edit distance = 2
  • WER = 2/3
  • PER = 1/3
  • SSA = 1/3
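
The three measures can be computed for the example as follows. This is a sketch under assumptions: unit edit costs, and one common bag-of-words formulation of PER (unmatched words relative to the longer of the two strings), which reproduces the 1/3 in the example but is not necessarily the exact definition the slide had in mind.

```python
from collections import Counter
from fractions import Fraction

def edit_distance(src, tgt):
    """Levenshtein distance over tokens with unit costs (two-row DP)."""
    prev = list(range(len(tgt) + 1))
    for i, s in enumerate(src, 1):
        cur = [i]
        for j, t in enumerate(tgt, 1):
            cur.append(min(prev[j - 1] + (s != t),  # match or substitute
                           cur[j - 1] + 1,          # insert
                           prev[j] + 1))            # delete
        prev = cur
    return prev[-1]

def wer(sys_out, ref):
    return Fraction(edit_distance(sys_out, ref), len(ref))

def per(sys_out, ref):
    # Words are matched as bags, so ordering is ignored.
    overlap = sum((Counter(sys_out) & Counter(ref)).values())
    return Fraction(max(len(sys_out), len(ref)) - overlap, len(ref))

def ssa(sys_out, ref):
    return 1 - wer(sys_out, ref)

sys_out = "w1 w2 w3 w4".split()
ref = "w1 w3 w2".split()
print(wer(sys_out, ref), per(sys_out, ref), ssa(sys_out, ref))
```

Exact fractions are used so the results can be compared directly with the values in the example.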

58
Generation string accuracy (GSA)
  • Example
  • Ref: w1 w2 w3 w4
  • Sys: w2 w3 w4 w1
  • Del = 1, Ins = 1 → SSA = 1 - 2/4 = 1/2
  • Move = 1, Del = 0, Ins = 0 → GSA = 1 - 1/4 = 3/4

59
BLEU
  • Proposed by Papineni et al. (2002)
  • Most widely used in the MT community.
  • BLEU is a weighted geometric average of n-gram
    precisions (p_n) between the system output and all
    references, multiplied by a brevity penalty (BP).

60
N-gram precision
  • N-gram precision: the percent of n-grams in the
    system output that are correct.
  • Clipping:
  • Sys: the the the the the the
  • Ref: the cat sat on the mat
  • Unigram precision: 6/6 without clipping, 2/6 with
    clipping
  • Max_Ref_count: the maximum number of times an
    n-gram occurs in any single reference translation.
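
Clipping can be sketched for unigrams as follows. The function name is illustrative; each system word is credited at most Max_Ref_count times, where the maximum is taken over the references.

```python
from collections import Counter

def clipped_unigram_precision(sys_out, refs):
    """Return (clipped matches, total system unigrams)."""
    sys_counts = Counter(sys_out)
    clipped = 0
    for word, count in sys_counts.items():
        # Largest count of this word in any single reference.
        max_ref_count = max(Counter(ref)[word] for ref in refs)
        clipped += min(count, max_ref_count)
    return clipped, len(sys_out)

sys_out = "the the the the the the".split()
refs = ["the cat sat on the mat".split()]
print(clipped_unigram_precision(sys_out, refs))  # (2, 6)
```

Without clipping every "the" would count as correct; with clipping only two do, because "the" occurs twice in the reference.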

61
N-gram precision
  • i.e. the percent of n-grams in the system output
    that are correct (after clipping).

62
Brevity Penalty
  • For each sentence s_i in the system output, find
    the closest-matching reference r_i (in terms of
    length).
  • Longer system output is already penalized by the
    n-gram precision measure.
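
A sketch of the brevity penalty, using the formula from Papineni et al. (2002): with c the total length of the system output and r the sum of the closest reference lengths, BP = 1 if c > r, and exp(1 - r/c) otherwise. The function name and the tie-breaking rule (prefer the shorter reference) are assumptions, and the system output is assumed to be non-empty.

```python
import math

def brevity_penalty(sys_sents, ref_sets):
    """sys_sents: list of token lists; ref_sets: list of lists of token lists."""
    c = r = 0
    for sys_sent, refs in zip(sys_sents, ref_sets):
        c += len(sys_sent)
        # Closest reference length; ties go to the shorter reference.
        r += min((len(ref) for ref in refs),
                 key=lambda length: (abs(length - len(sys_sent)), length))
    return 1.0 if c > r else math.exp(1 - r / c)

sys_sents = ["the cat was on the mat".split()]
ref_sets = [["the cat sat on a mat".split(),
             "there was a cat on the mat".split()]]
print(brevity_penalty(sys_sents, ref_sets))  # 1.0
```

Only output that is shorter than its references is penalized here, since overly long output already loses n-gram precision.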

63
An example
  • Sys: The cat was on the mat
  • Ref1: The cat sat on a mat
  • Ref2: There was a cat on the mat
  • Assuming N = 3
  • p1 = 5/6, p2 = 3/5, p3 = 1/4, BP = 1 → BLEU = 0.50
  • What if N = 4?
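
The example can be reproduced with a short single-sentence BLEU sketch. Assumptions: tokens are lowercased (needed to obtain p1 = 5/6, since "The" and "the" must match), the n-gram weights are uniform (1/N each), and the score is defined as 0 when any p_n is 0. The function names are illustrative, not a library API.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(sys_out, refs, n):
    """Return (clipped matches, total system n-grams)."""
    sys_counts = ngrams(sys_out, n)
    max_ref = Counter()
    for ref in refs:
        max_ref |= ngrams(ref, n)                    # element-wise max over references
    clipped = sum((sys_counts & max_ref).values())   # element-wise min = clipping
    return clipped, sum(sys_counts.values())

def bleu(sys_out, refs, max_n):
    log_sum = 0.0
    for n in range(1, max_n + 1):
        hit, total = modified_precision(sys_out, refs, n)
        if hit == 0:
            return 0.0                               # geometric mean with a zero factor
        log_sum += math.log(hit / total) / max_n     # uniform weights 1/N
    c = len(sys_out)
    r = min((len(ref) for ref in refs), key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_sum)

sys_out = "the cat was on the mat".split()
refs = ["the cat sat on a mat".split(), "there was a cat on the mat".split()]
print([modified_precision(sys_out, refs, n) for n in (1, 2, 3)])
print(round(bleu(sys_out, refs, 3), 2))
```

Calling `bleu(sys_out, refs, 4)` with the same data answers the follow-up question for N = 4.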

64
Summary
  • MT in a nutshell
  • Major challenges in MT
  • Choose the right words (root form, inflection,
    spontaneous words)
  • Put them in right positions (word order, unique
    constructions, divergences)

65
Summary (cont)
  • Major approaches
  • Transfer-based MT
  • Interlingua
  • Example-based MT
  • Statistical MT
  • Hybrid MT
  • Evaluation of MT systems
  • Edit distance
  • WER, PER, SSA, GSA
  • BLEU