Creation of a Russian-English Translation Program

Provided by: tjhsstEdu
Learn more at: https://www.tjhsst.edu

1
Creation of a Russian-English Translation Program
Karen Shiells
2
Purpose
  • Object-oriented approach
  • Interactive machine translation
  • Designed for aid, not independent translation
  • Explore algorithms used in machine translation
  • Identify grammatical obstacles to translation
  • Create a base to expand later

3
Scope of Study
  • Machine translation is and will be imperfect
  • Modern translation uses statistical methods
  • Project is limited to:
      • Separating base words from morphological endings
      • Constructing syntax trees from source text
      • Generating simple English output from tree
      • Identifying words already known to the program
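
The first item, separating base words from morphological endings, can be illustrated with a minimal sketch. The ending table, the transliteration, and the function name `split_ending` are assumptions made for illustration; the slides do not show the program's actual tables or code.

```python
# Hypothetical, heavily simplified ending table for transliterated
# feminine nouns such as "lampa"; real Russian declension has more cases
# and more ambiguity than this.
NOUN_ENDINGS = {
    "a": ("nominative", "singular"),
    "u": ("accusative", "singular"),
    "y": ("genitive", "singular"),
    "e": ("prepositional", "singular"),
    "oj": ("instrumental", "singular"),
}


def split_ending(word, known_stems):
    """Return (stem, ending, features) if stripping an ending leaves a known stem."""
    # Try longer endings first so "oj" is tested before one-letter endings.
    for ending in sorted(NOUN_ENDINGS, key=len, reverse=True):
        if word.endswith(ending):
            stem = word[: -len(ending)]
            if stem in known_stems:
                return stem, ending, NOUN_ENDINGS[ending]
    return None  # the word is not known to the program


stems = {"lamp"}
print(split_ending("lampu", stems))   # ('lamp', 'u', ('accusative', 'singular'))
print(split_ending("lampoj", stems))  # ('lamp', 'oj', ('instrumental', 'singular'))
print(split_ending("stol", stems))    # None
```

Checking the stem against the dictionary also covers the last item in the list: a word whose stem is not found is reported as unknown rather than guessed.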

4
Other Research
  • Part-of-speech tagging
      • Uses probability to identify parts of speech
      • Applied to unknown words and structures
      • Complex labeling systems, beyond conventional parts of speech
  • Translation algorithms
      • Massive dictionaries store words and information
      • Aided by verb categorization
      • Omit unknown words and translate without them
      • Output is usually comprehensible, but requires human revision

5
Old Methods
  • Direct Translation
      • First method
      • Rearranges sentences without parsing
      • Based on rules of transfer for specific languages
  • Interlingua
      • From era of international languages
      • Uses one representation as an intermediary
      • Intermediary is usually a constructed language
      • Easier to add language pairs

6
Syntactic Transfer
  • Similar to interlingua
  • Generates syntax tree using specific parser
  • Rearranges tree to fit target structure
  • Uses specific generation method to form output
  • Entire algorithm specific to one language pair
  • Best quality translations
  • Relatively new
  • Not as common in commercial software

7
Alternative Structures
  • Valency
      • Stores number of complements for each word
      • Type of complements not specified
      • Occupies less space in dictionary
  • Phrase-Structure Representation
      • Most familiar: noun phrase, verb phrase, etc.
      • Breaks sentence into superstructures
      • Puts terminal symbols only in leaves
      • Non-terminal symbols for branches

8
Dependency Trees
  • Use words as nodes, not just leaves
  • Examples:
      • Verb dependent on subject
      • Objects dependent on verb
      • Adjectives dependent on nouns
      • Prepositions vary by type of prepositional phrase
  • Easier to verify agreement between words
  • Occupy less space
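
A dependency tree of this kind can be sketched as a small node class in which every node is itself a word. The class name `Node`, the role labels, and the transliterated example sentence are assumptions for illustration, not the project's actual code. The attachment directions follow the examples on this slide (verb under subject, object under verb, adjective under noun).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One word of the sentence; branches and leaves are both words."""
    word: str
    role: str
    dependents: list = field(default_factory=list)

    def add(self, child):
        self.dependents.append(child)
        return child

    def show(self, depth=0):
        """Return the tree as indented lines, one word per line."""
        lines = ["  " * depth + f"{self.word} ({self.role})"]
        for dependent in self.dependents:
            lines.extend(dependent.show(depth + 1))
        return lines


# "devochka chitaet novuju knigu" (the girl reads a new book)
subject = Node("devochka", "subject")
verb = subject.add(Node("chitaet", "verb"))
obj = verb.add(Node("knigu", "direct object"))
obj.add(Node("novuju", "adjective"))

print("\n".join(subject.show()))
```

Because no non-terminal nodes such as "noun phrase" are stored, the tree needs exactly one node per word, which is the space saving the slide refers to.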

9
Object Orientation
  • Object-oriented approach allows more flexibility
      • Endings, cases, and declensions are classes
      • Fewer hard-coded rules
      • Methods for locating dependents are in classes
  • Modular design allows gradual changes
      • Changes in lexical analysis do not affect parsing
      • Changes in dictionary do not affect translation
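
One way to picture this design is the sketch below, in which an ending is an object carrying its grammatical features and the noun class owns the method that locates its dependents. All class names, fields, and sample endings here are hypothetical; the slides do not list the program's real classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ending:
    """A morphological ending together with the features it signals."""
    suffix: str
    case: str
    number: str
    gender: str

    def agrees_with(self, other):
        return (self.case, self.number, self.gender) == (
            other.case, other.number, other.gender)


@dataclass
class Word:
    stem: str
    ending: Ending

    @property
    def form(self):
        return self.stem + self.ending.suffix


class Adjective(Word):
    pass


class Noun(Word):
    def find_dependents(self, words):
        """Locate adjectives whose endings agree with this noun's ending."""
        return [w for w in words
                if isinstance(w, Adjective) and w.ending.agrees_with(self.ending)]


acc_f_noun = Ending("u", "accusative", "singular", "feminine")
acc_f_adj = Ending("uju", "accusative", "singular", "feminine")
nom_m_adj = Ending("yj", "nominative", "singular", "masculine")

noun = Noun("knig", acc_f_noun)
words = [Adjective("nov", acc_f_adj), Adjective("star", nom_m_adj), noun]
print([w.form for w in noun.find_dependents(words)])  # ['novuju']
```

The agreement rule lives in `Ending` rather than in the parser, so adding a new declension means adding data, not rewriting matching code.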

10
Verb Typing
  • Divides verbs into categories, for example:
      • Transitive
      • Intransitive
      • Directional or non-directional motion
  • Condenses structure storage
      • Dictionary stores only the type of each verb
      • Particular structures are derived from the general type
      • Code can apply to general structures, not specific ones
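
As a rough sketch of how verb typing condenses storage, each dictionary entry below holds only a type name, and the complement structure is looked up once per type. The type names come from the slide; the complement lists and the sample verbs are assumptions chosen for illustration.

```python
# Hypothetical complement structure per verb type (defined once).
VERB_TYPE_COMPLEMENTS = {
    "transitive": ["subject", "direct object"],
    "intransitive": ["subject"],
    "directional motion": ["subject", "prepositional phrase"],
    "non-directional motion": ["subject", "prepositional phrase"],
}

# The dictionary stores only the type of each verb (transliterated).
VERB_TYPES = {
    "chitat'": "transitive",
    "spat'": "intransitive",
    "idti": "directional motion",
    "xodit'": "non-directional motion",
}


def complements_for(verb):
    """Derive the particular structure of a verb from its general type."""
    return VERB_TYPE_COMPLEMENTS[VERB_TYPES[verb]]


print(complements_for("chitat'"))  # ['subject', 'direct object']
print(complements_for("spat'"))    # ['subject']
```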

11
Dictionary
  • Open, save, add, remove, and search functions
  • Stores:
      • Russian nominative
      • English nominatives
      • Part of speech
      • Noun/pronoun attributes
      • Verb types
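
A dictionary with these functions might look like the sketch below. The JSON file format, the `Entry` fields, and the method names are assumptions for illustration (`load` plays the role of "open"); the slides do not describe how the real dictionary is stored.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Entry:
    russian: str          # Russian nominative (transliterated)
    english: list         # one or more English nominatives
    part_of_speech: str
    attributes: dict = field(default_factory=dict)  # e.g. gender, verb type


class Dictionary:
    def __init__(self):
        self.entries = {}

    def add(self, entry):
        self.entries[entry.russian] = entry

    def remove(self, russian):
        return self.entries.pop(russian, None)

    def search(self, russian):
        return self.entries.get(russian)

    def save(self, path):
        with open(path, "w", encoding="utf-8") as handle:
            json.dump([asdict(e) for e in self.entries.values()],
                      handle, ensure_ascii=False, indent=2)

    @classmethod
    def load(cls, path):
        dictionary = cls()
        with open(path, encoding="utf-8") as handle:
            for raw in json.load(handle):
                dictionary.add(Entry(**raw))
        return dictionary


words = Dictionary()
words.add(Entry("kniga", ["book"], "noun", {"gender": "feminine"}))
words.add(Entry("chitat'", ["read"], "verb", {"type": "transitive"}))
print(words.search("kniga").english)  # ['book']
```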

12
Translator
  • Uses transliteration for ease of testing
  • Can be easily converted to Unicode Cyrillic
  • Debugging output to terminal window
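
Converting transliterated text to Unicode Cyrillic can be sketched as a table lookup that tries the longest Latin sequence first. The slides do not say which transliteration scheme the program uses, so the partial table below is an assumption made purely for illustration.

```python
# Hypothetical, partial transliteration table (lowercase only).
TRANSLIT = {
    "shch": "щ", "zh": "ж", "ch": "ч", "sh": "ш", "ju": "ю", "ja": "я",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е", "z": "з",
    "i": "и", "j": "й", "k": "к", "l": "л", "m": "м", "n": "н", "o": "о",
    "p": "п", "r": "р", "s": "с", "t": "т", "u": "у", "f": "ф", "x": "х",
    "c": "ц", "y": "ы", "'": "ь",
}

# Longest keys first, so "shch" is matched before "sh" and "ch".
_KEYS = sorted(TRANSLIT, key=len, reverse=True)


def to_cyrillic(text):
    """Replace each transliterated sequence with its Cyrillic letter."""
    out = []
    i = 0
    while i < len(text):
        for key in _KEYS:
            if text.startswith(key, i):
                out.append(TRANSLIT[key])
                i += len(key)
                break
        else:
            out.append(text[i])  # spaces, punctuation, unmapped characters
            i += 1
    return "".join(out)


print(to_cyrillic("kniga"))    # книга
print(to_cyrillic("chitaju"))  # читаю
```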

13
Results
  • Subject, verb, direct object translated
      • Subject is first nominative
      • Verb matched by gender, number, and person
      • Direct object is first accusative
  • Adjectives matched to nouns
      • Matched by case, number, and gender
  • Word order not considered
      • Word order should be accounted for, but isn't
      • Adjectives should attach to the nearest matching noun, not just any matching noun
      • Prepositional objects should be near their prepositions
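
The selection rules reported here (first nominative as subject, first accusative as direct object, verb matched by agreement) can be sketched as follows. The word records, their field names, and the ready-made English glosses are assumptions for illustration; in the real program these would come from the lexical analysis and the dictionary.

```python
def agrees(verb, subject):
    """A verb agrees if every feature it marks matches the subject's."""
    return all(verb.get(f) in (None, subject.get(f))
               for f in ("gender", "number", "person"))


def translate_svo(words):
    """Pick subject, verb, and object, then emit English in S-V-O order."""
    nominals = [w for w in words if w["pos"] in ("noun", "pronoun")]
    subject = next((w for w in nominals if w["case"] == "nominative"), None)
    obj = next((w for w in nominals if w["case"] == "accusative"), None)
    verb = None
    if subject is not None:
        verb = next((w for w in words
                     if w["pos"] == "verb" and agrees(w, subject)), None)
    return " ".join(w["english"] for w in (subject, verb, obj) if w is not None)


# "knigu chitaet devochka": object-verb-subject order in the source text
sentence = [
    {"form": "knigu", "pos": "noun", "case": "accusative",
     "number": "sg", "gender": "f", "person": 3, "english": "book"},
    {"form": "chitaet", "pos": "verb",
     "number": "sg", "person": 3, "english": "reads"},
    {"form": "devochka", "pos": "noun", "case": "nominative",
     "number": "sg", "gender": "f", "person": 3, "english": "girl"},
]
print(translate_svo(sentence))  # girl reads book
```

Because the roles are chosen by case alone, the source word order has no effect on the output, which is exactly the limitation noted under "Word order not considered".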

14
Conclusions
  • Part-of-speech guessing could be added easily
  • When a subordinate is not found, add to list
  • For each unmatched word, prompt user
  • Allow selection between subordinates not found
  • Verb typing would be harder, but helpful
  • Restricting complements makes parsing more precise
  • More efficient: no need to search for all possible complements
  • Prepositions could be associated with nouns
  • Even in inflecting languages, word order matters
  • Subordinates should be located by proximity
  • Multiple functions use the same inflections
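
Locating subordinates by proximity could be sketched as below, where an adjective attaches to the closest agreeing noun instead of the first one found. The function names and word records are hypothetical, and the example phrase is chosen so that two nouns carry the same inflection.

```python
def agree(adjective, noun):
    return all(adjective[f] == noun[f] for f in ("case", "number", "gender"))


def first_match(adj_index, words):
    """Order-blind rule: take the first agreeing noun in the sentence."""
    adjective = words[adj_index]
    for i, word in enumerate(words):
        if word["pos"] == "noun" and agree(adjective, word):
            return i
    return None


def nearest_match(adj_index, words):
    """Proximity rule: take the agreeing noun closest to the adjective."""
    adjective = words[adj_index]
    candidates = [i for i, word in enumerate(words)
                  if word["pos"] == "noun" and agree(adjective, word)]
    if not candidates:
        return None
    return min(candidates, key=lambda i: abs(i - adj_index))


ACC_F = {"case": "accusative", "number": "sg", "gender": "f"}
# "staruju gazetu i novuju knigu" (an old newspaper and a new book)
phrase = [
    {"form": "staruju", "pos": "adjective", **ACC_F},
    {"form": "gazetu", "pos": "noun", **ACC_F},
    {"form": "i", "pos": "conjunction"},
    {"form": "novuju", "pos": "adjective", **ACC_F},
    {"form": "knigu", "pos": "noun", **ACC_F},
]
print(phrase[first_match(3, phrase)]["form"])    # gazetu (wrong attachment)
print(phrase[nearest_match(3, phrase)]["form"])  # knigu
```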

15
Bibliography
  • Allen, James. Natural Language Understanding.
    New York: Benjamin/Cummings Publishing Company, 1995.
  • Arnold, Doug, Lorna Balkan, Siety Meijer, R. Lee
    Humphreys, and Louisa Sadler. Machine Translation:
    An Introductory Guide. London: NCC Blackwell, 1994.
    Available online:
    http://www.essex.ac.uk/linguistics/clmt/MTbook/PostScript.
  • Barber, Charles. The English Language: A Historical
    Introduction. Cambridge: Cambridge University Press, 1993.
  • Beard, Robert. Russian: An Interactive On-Line
    Reference Grammar. November 1, 2005. Available online:
    http://www.alphadictionary.com/rusgrammar/.
  • Comrie, Bernard, ed. The World's Major Languages.
    Oxford: Oxford University Press, 1990.
  • Hutchins, John and Harold Somers. An Introduction to
    Machine Translation. London: Academic Press, 1992.
    Available online:
    http://ourworld.compuserve.com/hompages/WJHutchins/IntroMT-TOC.htm.