CSA4050: Advanced Topics in NLP - PowerPoint PPT Presentation

Provided by: MikeR2

Title: CSA4050: Advanced Topics in NLP


1
CSA4050: Advanced Topics in NLP
  • Example Based MT

2
Example Based MT
  • Man does translation, first, by properly
    decomposing an input sentence into certain
    fragmental phrases, then by translating these
    phrases into other language phrases, and finally
    by properly composing these fragmental
    translations into one long sentence.
  • Nagao, 1984

3
Example
  • Problem: translate "He buys a book on
    international politics"
  • Database:
  • "He buys a notebook" = "Kare wa noto o kau"
    (gloss: he TOPIC notebook OBJECT buys)
  • "I read a book on international politics" =
    "Watashi wa kokusai seiji nitsuite kakareta hon
    o yomu"
  • Answer: "Kare wa kokusai seiji nitsuite kakareta
    hon o kau."

4
Translation Process
  • Input: he buys a book on international politics
  • Identify and translate fragments:
  • he buys → kare wa ... kau
  • a book on international politics → kokusai seiji
    nitsuite kakareta hon
  • Combine translated fragments: kare wa kokusai
    seiji nitsuite kakareta hon kau

5
Three Step Process
  • Match: identify relevant source-language examples
    in the database.
  • Align: find the corresponding fragments in the
    target language.
  • Recombine: assemble the target-language fragments
    into sentences.
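As a rough illustration of the three steps, the Python sketch below runs the book example through a toy pipeline. The hand-written fragment table, the "X" slot notation, and the function names are assumptions made for illustration; a real system would derive fragments and alignments from the example database rather than list them by hand.

```python
# Toy illustration of match / align / recombine on the slide example.
# The fragment table is hand-written (an assumption), not learned from data.

# Source fragment -> target fragment; "X" marks a slot to be filled.
FRAGMENTS = {
    "he buys X": "kare wa X o kau",
    "a book on international politics": "kokusai seiji nitsuite kakareta hon",
}

def match(sentence):
    """Match: find a template whose fixed prefix starts the sentence."""
    for source in FRAGMENTS:
        if source.endswith(" X"):
            prefix = source[:-1]  # drop the slot, keep the trailing space
            if sentence.startswith(prefix):
                return source, sentence[len(prefix):]
    return None, sentence

def align(fragment):
    """Align: look up the target-language side of a source fragment."""
    return FRAGMENTS.get(fragment)

def recombine(template, filler):
    """Recombine: put the translated filler into the template slot."""
    return template.replace("X", filler)

def translate(sentence):
    """Return a translation, or None if the toy table cannot cover it."""
    template_source, rest = match(sentence.lower())
    if template_source is None:
        return None
    template_target, filler_target = align(template_source), align(rest)
    if filler_target is None:
        return None
    return recombine(template_target, filler_target)

print(translate("He buys a book on international politics"))
```

The template "kare wa X o kau" is taken from the stored pair "He buys a notebook" / "Kare wa noto o kau", with the object replaced by a slot.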

6
Matching
  • Nature of matching process depends on database
    organisation.
  • In the simplest case it is simply a bilingual
    corpus that has been aligned at sentence level.
  • Effectively this means that the database is a
    collection of sentence pairs.
  • Identification of relevant pairs is carried out
    by matching sentences.
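A minimal sketch of this simplest case, assuming the database is just a Python list of (source, target) pairs. The similarity measure used here (the ratio from the standard library's difflib.SequenceMatcher, applied to word lists) is only one possible choice and is not prescribed by the slides.

```python
from difflib import SequenceMatcher

# A sentence-aligned bilingual corpus as a list of (source, target) pairs.
DATABASE = [
    ("He buys a notebook", "Kare wa noto o kau"),
    ("I read a book on international politics",
     "Watashi wa kokusai seiji nitsuite kakareta hon o yomu"),
]

def similarity(a, b):
    """Similarity in [0, 1] between two sentences, compared word by word."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()

def rank_examples(sentence, database=DATABASE):
    """Return the stored pairs ordered from most to least similar source."""
    return sorted(database,
                  key=lambda pair: similarity(sentence, pair[0]),
                  reverse=True)

for source, target in rank_examples("He buys a book on international politics"):
    print(source, "=>", target)
```

Ranking all pairs, rather than returning a single best match, keeps partially relevant examples available for the later alignment and recombination steps.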

7
Matching
  • Different methods of sentence/sentence matching
  • Character Based
  • Word Based
  • Structure Based
  • Partial

8
Character Based
  • Problem: semantic edit distance versus character
    edit distance
  • Paper tray A holds up to 400 sheets
  • Paper tray B holds up to 400 sheets
    (character edit distance d = 1)
  • The large paper tray
  • The small paper tray
    (character edit distance d = 5)

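The two distances can be checked with a short Python sketch of character-level Levenshtein distance. The slides do not say which edit-distance variant was used; unit costs for insertion, deletion, and substitution are assumed here.

```python
def edit_distance(a, b):
    """Character-level Levenshtein distance between strings a and b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

print(edit_distance("Paper tray A holds up to 400 sheets",
                    "Paper tray B holds up to 400 sheets"))  # 1
print(edit_distance("The large paper tray",
                    "The small paper tray"))                 # 5
```

The character distance counts only how many characters change, so it says nothing about how far apart the two sentences are in meaning.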
9
Word Based
  • Allows matching between different words on the
    basis of a similarity metric (e.g. based on
    semantic features).
  • This allows inexact matches and also solves the
    problem of which stored translation to choose.
  • The following case involves different
    translations of the word "eat" in Japanese.
  • Stored: "A man eats vegetables." and "Acid eats
    metal."
  • Example to translate: "He eats potatoes"
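A minimal sketch of word-based matching on the "eat" example. The semantic feature sets, the stopword list, and the position-by-position comparison are all hypothetical simplifications chosen for illustration; a real system would typically draw word similarity from a thesaurus or similar resource.

```python
# Hypothetical semantic features (assumed for illustration only).
FEATURES = {
    "he": {"human", "animate", "male"},
    "man": {"human", "animate"},
    "acid": {"substance", "corrosive"},
    "vegetables": {"food", "plant"},
    "potatoes": {"food", "plant"},
    "metal": {"substance", "solid"},
}
STOPWORDS = {"a", "an", "the"}

def content_words(sentence):
    """Lower-case the sentence and drop the final full stop and articles."""
    words = sentence.lower().rstrip(".").split()
    return [w for w in words if w not in STOPWORDS]

def word_similarity(w1, w2):
    """1.0 for identical words, else Jaccard overlap of feature sets."""
    if w1 == w2:
        return 1.0
    f1, f2 = FEATURES.get(w1, set()), FEATURES.get(w2, set())
    if not f1 or not f2:
        return 0.0
    return len(f1 & f2) / len(f1 | f2)

def sentence_similarity(s1, s2):
    """Mean word similarity, comparing words position by position."""
    words1, words2 = content_words(s1), content_words(s2)
    if not words1 or len(words1) != len(words2):
        return 0.0  # toy restriction: only equal-length sentences compare
    total = sum(word_similarity(a, b) for a, b in zip(words1, words2))
    return total / len(words1)

STORED = ["A man eats vegetables.", "Acid eats metal."]

def best_example(sentence, stored=STORED):
    """Pick the stored example whose words are most similar overall."""
    return max(stored, key=lambda ex: sentence_similarity(sentence, ex))

print(best_example("He eats potatoes"))
```

Under these assumed features, "He eats potatoes" is closer to "A man eats vegetables." than to "Acid eats metal.", so the translation of "eat" stored with the first example would be reused.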

10
Matching Fragments

  • Input: He buys a book on international politics
  • Stored examples overlapping the start of the
    input: he buys a notebook; he buys a horse; he
    buys a car for his mum; he buys a politician;
    John buys toothpaste
  • Stored examples overlapping the rest of the
    input: he reads a book on international
    politics; books on international politics are
    exciting; John sold a book on antiques

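One way to find such overlaps is to look for the longest run of consecutive words that the input shares with each stored example. The sketch below uses difflib.SequenceMatcher from the Python standard library on word lists; the function name and the choice of "longest contiguous run" as the fragment criterion are assumptions, not the method given in the slides.

```python
from difflib import SequenceMatcher

def shared_fragment(sentence, example):
    """Longest run of consecutive words common to sentence and example."""
    a, b = sentence.lower().split(), example.lower().split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    m = matcher.find_longest_match(0, len(a), 0, len(b))
    return " ".join(a[m.a:m.a + m.size])

INPUT = "He buys a book on international politics"
EXAMPLES = [
    "he buys a notebook",
    "John buys toothpaste",
    "he reads a book on international politics",
    "books on international politics are exciting",
    "John sold a book on antiques",
]

for example in EXAMPLES:
    print(repr(shared_fragment(INPUT, example)), "<-", example)
```

Each stored example contributes a different fragment of the input, which is why several examples may be needed to cover one sentence.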
11
Bibliography
  • H. Somers, "Example Based MT", Review Article,
    Machine Translation 14.2