Example-based Machine Translation based on Deeper NLP


1
Example-based Machine Translation based on Deeper NLP

Toshiaki Nakazawa (1), Kun Yu (1), Sadao Kurohashi (2)

(1) Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan, 113-8656
(2) Graduate School of Informatics, Kyoto University, Kyoto, Japan, 606-8501
2
Outline

Why EBMT?
Description of Kyoto-U EBMT System
Japanese Particular Processing
Pronoun Estimation
Japanese Flexible Matching
Result and Discussion
Conclusion and Future Work

4
Why EBMT?

Pursuing deep NLP

- Improvement of fundamental analyses leads to improvement of MT
- Feedback from MT can be expected

EBMT setting is suitable in many cases

- Not a large corpus, but similar translation examples in a relatively close domain
- e.g. translation of manuals, patent translation

6
Kyoto-U System Overview
7
Structure-based Alignment

- Step 1: Dependency structure transformation
- Step 2: Word/phrase correspondence detection
- Step 3: Correspondence disambiguation
- Step 4: Handling remaining words
- Step 5: Registration to database

8
Step 1: Dependency Structure Transformation

J: JUMAN/KNP
E: Charniak's nlparser -> dependency tree

9
Step 2: Word Correspondence Detection

- KENKYUSYA J-E, E-J dictionaries (300K entries)
- Transliteration (person/place names, Katakana words)

Ex) [Japanese word lost in extraction]
-> shinjuku (similarity 1.0)
-> shinjuku, sinjuku, synjucu, ...
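
The transliteration example above can be sketched as follows. This is a minimal illustration rather than the system's actual method: the variant table `VARIANTS`, the helper names, and the choice of normalized edit distance as the similarity measure are all assumptions made for this example.

```python
from itertools import product

# Hypothetical table of romanization variants per syllable (an assumption,
# chosen only so that the spellings shown on the slide are generated).
VARIANTS = {
    "shi": ["shi", "si", "sy"],
    "n": ["n"],
    "ju": ["ju", "jyu", "zyu"],
    "ku": ["ku", "cu"],
}


def candidates(syllables):
    """Generate every romanized spelling by combining per-syllable variants."""
    options = [VARIANTS.get(s, [s]) for s in syllables]
    return ["".join(parts) for parts in product(*options)]


def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity(a, b):
    """1.0 for identical strings, decreasing with normalized edit distance."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def best_match(syllables, english_word):
    """Return (spelling, similarity) for the candidate closest to the English word."""
    word = english_word.lower()
    scored = ((c, similarity(c, word)) for c in candidates(syllables))
    return max(scored, key=lambda pair: pair[1])
```

Calling `best_match(["shi", "n", "ju", "ku"], "Shinjuku")` selects the spelling `shinjuku`, which is identical to the lowercased English word and therefore has similarity 1.0.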
[Figure: word correspondences between a Japanese sentence (characters lost in extraction) and the English phrases "the car", "came", "at me", "from the side", "at the intersection"]
10
Step 3: Correspondence Disambiguation

- Calculate a correspondence score for each ambiguous candidate based on the unambiguous alignments
- Select the correspondence with the higher score

dist_J / dist_E: distance to an unambiguous correspondence in the Japanese / English tree
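
The scoring formula itself did not survive in this transcript, so the sketch below only illustrates the idea of preferring candidates that lie close to unambiguous correspondences in both trees. The formula used here, a sum of 1 / (1 + dist_J + dist_E) over the unambiguous pairs, is an assumption for illustration, as are the function names and the parent-map tree representation.

```python
def path_to_root(parent, node):
    """Nodes from `node` up to the root, inclusive."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def tree_distance(parent, a, b):
    """Number of edges on the path between nodes a and b in one tree."""
    depth_from_a = {n: d for d, n in enumerate(path_to_root(parent, a))}
    for depth_from_b, n in enumerate(path_to_root(parent, b)):
        if n in depth_from_a:  # first common ancestor
            return depth_from_a[n] + depth_from_b
    raise ValueError("nodes are not in the same tree")


def score(candidate, anchors, parent_j, parent_e):
    """Assumed score: closer to unambiguous pairs in both trees is better."""
    j, e = candidate
    return sum(
        1.0 / (1 + tree_distance(parent_j, j, aj) + tree_distance(parent_e, e, ae))
        for aj, ae in anchors
    )


def disambiguate(candidates, anchors, parent_j, parent_e):
    """Select the candidate correspondence with the highest score."""
    return max(candidates, key=lambda c: score(c, anchors, parent_j, parent_e))
```

Each tree is given as a dictionary mapping a node to its parent (`None` for the root), and `anchors` is the list of unambiguous (Japanese node, English node) pairs.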
11
Step 3: Correspondence Disambiguation (cont.)
[Figure: worked example; the values 0.8, 1.5, and 1.0 appear in the figure]
12
Step 4: Handling Remaining Words

- Align the root nodes if they remain unaligned
- Merge base NP nodes
- Merge remaining words into their ancestor nodes

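[Figure: the same example sentence after handling remaining words (Japanese characters lost in extraction), with the English phrases "the car", "came", "at me", "from the side", "at the intersection"]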
13
Step 5: Registration to Database

14
Translation

Translation example (TE) retrieval
- for all the sub-trees in the input
TE selection
- prefer larger examples
TE combination
- greedily from the root node
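
The three steps above can be sketched as a greedy cover of the input tree. This is a simplified illustration under stated assumptions: trees and examples are nested `(word, children)` tuples, English toy words stand in for source-side nodes, child matching takes the first fit without backtracking, and the sketch only reports which examples were chosen instead of assembling the target sentence. All names are hypothetical.

```python
def size(pattern):
    """Number of nodes in an example pattern."""
    return 1 + sum(size(child) for child in pattern[1])


def match(pattern, node):
    """Return the uncovered child subtrees if `pattern` matches at `node`, else None."""
    p_word, p_children = pattern
    word, children = node
    if p_word != word:
        return None
    remaining = list(children)
    uncovered = []
    for p_child in p_children:
        for i, child in enumerate(remaining):
            sub = match(p_child, child)
            if sub is not None:  # first fitting child wins (no backtracking)
                uncovered.extend(sub)
                del remaining[i]
                break
        else:
            return None  # a pattern child has no counterpart in the input
    return uncovered + remaining


def combine(node, examples):
    """Greedy cover from the root: largest matching example first, then recurse."""
    best = None
    for pattern, target in examples:
        uncovered = match(pattern, node)
        if uncovered is not None and (best is None or size(pattern) > size(best[0])):
            best = (pattern, target, uncovered)
    if best is None:
        # No example covers this node; record it as untranslated (assumption).
        chosen, uncovered = [(node[0], None)], list(node[1])
    else:
        chosen, uncovered = [(node[0], best[1])], best[2]
    for child in uncovered:
        chosen.extend(combine(child, examples))
    return chosen
```

The result is a list of `(root word of the covered fragment, target text)` pairs in the order they were selected; a real system would additionally have to order and merge the target fragments.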

15
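Combination Example
[Figure: TE combination for an input sentence; not recoverable from the transcript]
16
Combination Example (cont.)
[Figure: TE combination for an input sentence, continued; not recoverable from the transcript]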
18
Pronoun Estimation

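Pronouns are often omitted in Japanese sentences

Omitted in TE
- TE: [Japanese text lost in extraction] -> I've a stomachache
- Input: [Japanese text lost in extraction] -> I I've a stomachache
Omitted in Input
- TE: [Japanese text lost in extraction] -> Will you mail this to Japan?
- Input: [Japanese text lost in extraction] -> Will you mail to Japan?

Both outputs are wrong: the first duplicates "I", and the second drops "this".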
19
Pronoun Estimation (cont.)

Estimate the omitted pronoun from modality and subject case

Omitted in TE
- TE: (pronoun restored) [Japanese text lost in extraction] -> I've a stomachache
- Input: [Japanese text lost in extraction] -> I've a stomachache
Omitted in Input
- TE: [Japanese text lost in extraction] -> Will you mail this to Japan?
- Input: (pronoun restored) [Japanese text lost in extraction] -> Will you mail this to Japan?
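
A rule of this kind might look like the sketch below. The slides do not give the actual rules, so the modality labels, the mapping in `PRONOUN_BY_MODALITY`, and the function names are assumptions for illustration only, and the sketch covers omitted subjects and nothing else.

```python
# Hypothetical rule table (assumption): sentence modality -> subject pronoun
# to restore when the sentence has no explicit subject case.
PRONOUN_BY_MODALITY = {
    "declarative": "I",
    "request": "you",
    "question": "you",
}


def estimate_subject(has_subject_case, modality):
    """Return the pronoun to restore, or None if nothing should be added."""
    if has_subject_case:
        return None  # an explicit subject is already present
    return PRONOUN_BY_MODALITY.get(modality)


def restore_subject(words, has_subject_case, modality):
    """Prepend the estimated pronoun, marked with parentheses, to the word list."""
    pronoun = estimate_subject(has_subject_case, modality)
    if pronoun is None:
        return list(words)
    return [f"({pronoun})"] + list(words)
```

Applying the same restoration to both the translation examples and the input makes the two sides comparable before matching, which is what the corrected outputs above rely on.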
20
Various Expressions in Japanese

Synonymous Relation

- Hiragana/Katakana/Kanji variations: りんご / リンゴ / 林檎 (apple)
- Variations of Katakana expressions: コンピュータ / コンピューター (computer)
- Synonymous words: 登山 / 山登り ("climbing mountain" vs "mountain climbing")
- Synonymous phrases: [two Japanese phrases lost in extraction] ("nearest" vs "most near")

Resources:
- Morphological Analyzer
- Automatically Acquired from Japanese Dictionaries

Hypernym-Hyponym Relation

- [Japanese terms lost in extraction]: disaster -> earthquake, typhoon
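
Matching that tolerates these relations could be sketched as below. The tables use romanized toy entries (tozan / yamanobori for "mountain climbing", jishin for "earthquake", taifuu for "typhoon", saigai for "disaster"), and the score values 1.0, 0.9, and 0.5 are arbitrary assumptions; the slides do not state how the system weights each relation.

```python
# Hypothetical resources (assumptions for illustration only).
SYNONYM_GROUPS = [
    {"tozan", "yamanobori"},  # both mean "mountain climbing"
]
HYPERNYM = {
    "jishin": "saigai",  # earthquake -> disaster
    "taifuu": "saigai",  # typhoon -> disaster
}


def match_score(input_word, example_word):
    """Score a word pair: exact > synonym > hypernym/hyponym > no match."""
    if input_word == example_word:
        return 1.0
    if any(input_word in g and example_word in g for g in SYNONYM_GROUPS):
        return 0.9
    if HYPERNYM.get(input_word) == example_word or HYPERNYM.get(example_word) == input_word:
        return 0.5
    return 0.0
```

Two hyponyms of the same hypernym, such as jishin and taifuu, score 0.0 against each other here; whether siblings should match at all is a design choice this sketch leaves open.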
21
Japanese Flexible Matching
22
IWSLT06 Evaluation Results

Open data track (JE)
Conditions: correct recognition translation, ASR output translation

Input                Data  BLEU             NIST
Correct recognition  Dev1  0.5087           9.6803
Correct recognition  Dev2  0.4881           9.4918
Correct recognition  Dev3  0.4468           9.1883
Correct recognition  Dev4  0.1921           5.7880
Correct recognition  Test  0.1655 (8th/14)  5.4325 (8th/14)
ASR output           Dev4  0.1590           5.0107
ASR output           Test  0.1418 (9th/14)  4.8804 (10th/14)
23
Results Discussion