The Translation Process: a Digression

1
The Translation Process: a Digression
  • CMPT 371
  • Fall 2004
  • J.W. Benham

2
Phases of Translation
  • Lexical Analysis
      • Produces tokens
  • Syntactic Analysis (Parsing)
      • Analysis of structure of source
      • Produces parse tree (at least in principle)
  • Intermediate representation
      • Syntax tree
  • Optimization
      • Removing stop words
      • Other transformations??
  • Target string generation

3
Lexical Analysis and Tokens
  • A token is the smallest meaningful unit of the
    source string
      • Search term
      • Boolean operation
      • Search restriction specifier
      • URL
      • Date
  • The lexical analyzer finds tokens (actually the
    corresponding lexemes) in the source string

4
Lexical Analysis Example
  • Source string: hydrogen OR hybrid OR "low
    emission" AND automobile OR car OR vehicle AND
    NOT (SUV OR "Sport Utility Vehicle")
  • Tokens:
      • SearchTerm hydrogen
      • OrOperation
      • SearchTerm hybrid
      • OrOperation
      • SearchTerm low emission
      • AndOperation
      • SearchTerm automobile
      • OrOperation
      • SearchTerm car
      • OrOperation
      • SearchTerm vehicle
      • AndOperation
      • NotOperation
      • LeftParenthesis
      • SearchTerm SUV
      • OrOperation
      • SearchTerm Sport Utility Vehicle
      • RightParenthesis
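
The tokenization on this slide can be sketched in Python. This is only an illustration, not the course's lexer: the token names come from the slide, while the regular expression, the `tokenize` function, and the convention that multi-word terms are written in double quotes are assumptions.

```python
import re

# One alternative per lexeme shape: quoted phrase, parenthesis, bare word.
# The pattern is an assumption; the slides do not define the lexical rules.
LEXEME = re.compile(r'"([^"]*)"|([()])|([^\s()"]+)')

KEYWORDS = {"AND": "AndOperation", "OR": "OrOperation", "NOT": "NotOperation"}
PARENS = {"(": "LeftParenthesis", ")": "RightParenthesis"}


def tokenize(source):
    """Return a list of (token_kind, lexeme) pairs for a search string."""
    tokens = []
    for match in LEXEME.finditer(source):
        phrase, paren, word = match.groups()
        if phrase is not None:
            tokens.append(("SearchTerm", phrase))    # quoted multi-word term
        elif paren is not None:
            tokens.append((PARENS[paren], paren))
        elif word in KEYWORDS:
            tokens.append((KEYWORDS[word], word))    # Boolean operation
        else:
            tokens.append(("SearchTerm", word))      # ordinary search term
    return tokens


for kind, lexeme in tokenize('hydrogen OR "low emission" AND NOT (SUV)'):
    print(kind, lexeme)
```

Each pair keeps both the token kind and the lexeme, which mirrors the slide's remark that the analyzer really finds lexemes in the source string.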

5
Syntactic Specification: BNF
  • sourceString ::= searchExpr searchRestrictions
  • searchExpr ::= searchExpr AND searchExpr
        | searchExpr OR searchExpr
        | NOT searchExpr
        | ( searchExpr )
        | searchTerm
  • searchRestriction ::= ...
  • Operator precedence information also needed
      • Precedence, highest to lowest: parenthesized
        string, NOT, OR, AND
      • Left associative
      • Precedence table or precedence functions

6
Syntactic Analysis (Parsing)
  • Produces parse tree
  • Represents structure of search string
  • Parser may not produce an explicit representation
  • Operator precedence parsing uses a stack to keep
    track of tokens that are not yet processed

7
Precedence Table
8
Use of Precedence Table
  • if (topOfStack > currentToken)
      • pop stack
      • while (topOfStack > lastTokenPopped)
          • pop stack
      • process tokens from stack
      • push currentToken
  • else if (topOfStack < currentToken)
      • push currentToken
  • else // equal precedence
      • pop stack
      • process token from stack and currentToken
        together
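
A stack-driven parse in the spirit of this slide can be sketched in Python. This is not the slide's table-driven algorithm: numeric precedence levels stand in for the precedence table, and the `parse` function, the nested-tuple tree, and the input format (a list of lexemes with phrases already joined) are all assumptions made for illustration.

```python
# Higher number binds tighter. The order NOT > OR > AND follows the slides.
PRECEDENCE = {"NOT": 3, "OR": 2, "AND": 1}


def parse(tokens):
    """Parse a list of lexemes into a nested-tuple tree using an operator stack."""
    operands, operators = [], []

    def reduce_top():
        # Pop one operator and combine it with its operand(s).
        op = operators.pop()
        if op == "NOT":
            operands.append(("NOT", operands.pop()))
        else:
            right = operands.pop()
            left = operands.pop()
            operands.append((op, left, right))

    for tok in tokens:
        if tok == "(":
            operators.append(tok)
        elif tok == ")":
            while operators[-1] != "(":
                reduce_top()
            operators.pop()                 # discard the matching "("
        elif tok == "NOT":
            operators.append(tok)           # prefix operator: always push
        elif tok in PRECEDENCE:
            # Reduce while the stack top binds at least as tightly; reducing
            # on equal precedence is what makes AND and OR left associative.
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok]):
                reduce_top()
            operators.append(tok)
        else:
            operands.append(tok)            # search term
    while operators:
        reduce_top()
    return operands[0]


print(parse(["hydrogen", "OR", "hybrid", "AND", "NOT", "(", "SUV", ")"]))
```

The sketch assumes well-formed input; a real translator would also need to report unbalanced parentheses and missing operands.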

9
Example Parse Tree (1)
expr
  expr
    expr
      expr
        expr: Term hydrogen
        OR
        expr: Term hybrid
      OR
      expr: Term low emission
    AND
    expr
      expr
        expr: Term automobile
        OR
        expr: Term car
      OR
      expr: Term vehicle
  AND
  [subtree on next slide]
10
Example Parse Tree (2)
expr [repeated from previous slide]
  [subtree from previous slide]
  AND
  expr
    NOT
    expr
      (
      expr
        expr: Term SUV
        OR
        expr: Term Sport Utility Vehicle
      )
11
Intermediate Representation
  • Syntax tree
  • Syntax directed translation (SDT)
  • Example: expr ::= expr1 AND expr2
      • expr.ST = createSyntaxTree(expr1.ST, AND,
        expr2.ST)
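
The semantic rule on this slide might look roughly like this in Python. The `SyntaxTree` class, its field names, and the `term` helper are hypothetical; only the idea of building `expr.ST` from `expr1.ST`, the operator, and `expr2.ST` comes from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SyntaxTree:
    """A syntax tree node: an operator with children, or a search-term leaf."""
    label: str
    left: Optional["SyntaxTree"] = None
    right: Optional["SyntaxTree"] = None


def create_syntax_tree(left, operator, right):
    """Semantic action for a production of the form expr ::= expr1 OP expr2."""
    return SyntaxTree(operator, left, right)


def term(text):
    """Semantic action for expr ::= searchTerm (hypothetical helper)."""
    return SyntaxTree(text)


# expr.ST for: hydrogen OR hybrid
tree = create_syntax_tree(term("hydrogen"), "OR", term("hybrid"))
print(tree.label, tree.left.label, tree.right.label)
```

The syntax tree keeps only operators and terms, so the chains of `expr` nodes and the parentheses of the parse tree disappear.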

12
Example Syntax Tree
AND
  AND
    OR
      OR
        Term hydrogen
        Term hybrid
      Term low emission
    OR
      OR
        Term automobile
        Term car
      Term vehicle
  NOT
    OR
      Term SUV
      Term sport utility vehicle
13
Optimization
  • Typically results in a restructured intermediate
    representation (syntax tree)
  • Elimination of common stop words
      • May be search-engine specific
  • Restructuring of the syntax tree
      • This will require more thought
  • Levels of optimization (including none)
      • Capture the user's exact intentions, even if it
        will be inefficient
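
Stop-word elimination on a syntax tree could be sketched as below. The stop-word list, the `remove_stop_words` function, and the nested-tuple tree shape (`(operator, left, right)`, `("NOT", child)`, or a plain string for a term) are assumptions for illustration. The slides note that the real list may depend on the search engine.

```python
# Illustrative stop words only; a real list would be search-engine specific.
STOP_WORDS = {"a", "an", "the", "of"}


def remove_stop_words(tree):
    """Return the tree without stop-word leaves, or None if nothing is left."""
    if isinstance(tree, str):
        return None if tree.lower() in STOP_WORDS else tree
    op, *children = tree
    kept = [c for c in (remove_stop_words(child) for child in children)
            if c is not None]
    if op == "NOT":
        return ("NOT", kept[0]) if kept else None
    if len(kept) == 2:
        return (op, kept[0], kept[1])
    # A binary operator that lost an operand collapses to the survivor.
    return kept[0] if kept else None


print(remove_stop_words(("AND", "the", ("OR", "car", "vehicle"))))
```

Collapsing an operator that has lost an operand is one possible restructuring; whether it preserves the user's intent in every case is the kind of question the slide leaves open.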

14
Target String Generation
  • Back end generates the translated string from the
    syntax tree
      • Uses operations supported by the target search
        engine
  • May be difficult to use a generic back end
    translator guided by the search engine (SE) syntax
    specification
      • system re-configuration
  • Hand-coded back end translator associated with
    each search engine
      • Some information can be recorded in the SE syntax
        specification

15
Example Translated Search String
  • Suppose the search engine syntax includes:
      • AND is the default operation
      • OR has precedence over AND
      • Parentheses are not supported
  • The translated string will be:
      • hydrogen OR hybrid OR "low emission" automobile
        OR car OR vehicle NOT SUV NOT "sport utility
        vehicle"
  • Note the use of De Morgan's laws
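
A hand-coded back end for the hypothetical search engine on this slide might be sketched as follows. The functions `push_not` and `generate`, the nested-tuple tree shape, and the choice to quote multi-word terms with double quotes are assumptions. The sketch first pushes NOT down to the leaves with De Morgan's laws, then emits AND as plain juxtaposition.

```python
def push_not(tree, negate=False):
    """Push NOT down to the leaves using De Morgan's laws."""
    if isinstance(tree, str):
        return ("NOT", tree) if negate else tree
    if tree[0] == "NOT":
        return push_not(tree[1], not negate)   # double negation cancels
    op, left, right = tree
    if negate:
        op = "AND" if op == "OR" else "OR"     # NOT (a OR b) = NOT a AND NOT b
    return (op, push_not(left, negate), push_not(right, negate))


def generate(tree):
    """Emit a string for an engine with default AND, OR above AND, no parentheses."""
    if isinstance(tree, str):
        return f'"{tree}"' if " " in tree else tree
    if tree[0] == "NOT":
        return "NOT " + generate(tree[1])
    op, left, right = tree
    if op == "OR":
        if any(not isinstance(c, str) and c[0] == "AND" for c in (left, right)):
            raise ValueError("AND under OR needs parentheses, which the target lacks")
        return generate(left) + " OR " + generate(right)
    return generate(left) + " " + generate(right)   # AND is the default operation


tree = ("AND",
        ("AND",
         ("OR", ("OR", "hydrogen", "hybrid"), "low emission"),
         ("OR", ("OR", "automobile", "car"), "vehicle")),
        ("NOT", ("OR", "SUV", "sport utility vehicle")))
print(generate(push_not(tree)))
```

The sketch raises an error when an AND ends up beneath an OR, because this target syntax cannot group it without parentheses; a fuller back end might distribute OR over AND instead.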