ENGLISH TO SQL USING FSTS AND SMT - PowerPoint PPT Presentation

Slides: 16
Provided by: caen2
Transcript and Presenter's Notes

1
ENGLISH TO SQL USING FSTS AND SMT
  • SHYAM GALA
  • EECS 595
  • NATURAL LANGUAGE PROCESSING
  • FINAL PROJECT

2
TRADITIONAL APPROACH
  • All the NLDBIs (natural language database
    interfaces) presented so far use a process that
    involves
  • - Conversion of an NL statement into its logical
    representation by applying syntactic and semantic
    analysis.
  • - Conversion of the internal representation into
    a database query using

3
TRADITIONAL APPROACH
TOKEN BASED AND TEMPLATE BASED APPROACHES 1
The number of steps involved in these traditional
methods makes them computationally very expensive.
4
MY APPROACH
  • I try to analyze two approaches that achieve the
    conversion in a lightweight manner
  • The first approach involves the use of SMT to do
    the conversion.
  • The second approach involves the use of FSTs to
    achieve the conversion.

5
STATISTICAL MACHINE TRANSLATION
  • SMT involves the following 3 steps
  • Generation of a Language Model (Training)
  • Generation of a Translation Model (Training)
  • Doing the actual translation with a decoder that
    uses the 2 generated models (Translation)
  • Once training is done, translation is very
    inexpensive
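
As a rough illustration of the language-model step only, the sketch below trains an add-one-smoothed trigram model on a tiny target-side corpus and scores candidate outputs. The three SQL sentences, the class name TrigramModel, and the smoothing choice are all assumptions made for this example; they are not the corpus or toolkit used in the project, and no translation model or decoder is shown.

```python
import math
from collections import Counter


def trigrams(tokens):
    """Return the trigrams of a token list, padded with sentence markers."""
    padded = ["<s>", "<s>"] + tokens + ["</s>"]
    return [tuple(padded[i:i + 3]) for i in range(len(padded) - 2)]


class TrigramModel:
    """Add-one-smoothed trigram language model (illustrative only)."""

    def __init__(self, sentences):
        self.tri = Counter()   # counts of (w1, w2, w3)
        self.bi = Counter()    # counts of the (w1, w2) histories
        self.vocab = {"</s>"}
        for sentence in sentences:
            tokens = sentence.split()
            self.vocab.update(tokens)
            for gram in trigrams(tokens):
                self.tri[gram] += 1
                self.bi[gram[:2]] += 1

    def log_prob(self, sentence):
        """Natural-log probability of a whitespace-tokenized sentence."""
        v = len(self.vocab)
        return sum(
            math.log((self.tri[gram] + 1) / (self.bi[gram[:2]] + v))
            for gram in trigrams(sentence.split())
        )


# Hypothetical target-side (SQL) half of a parallel corpus.
CORPUS = [
    "SELECT * FROM empinfo WHERE first = 'Mary'",
    "SELECT id FROM film WHERE title = 'Casablanca'",
    "SELECT salary FROM empinfo WHERE id = 7",
]

model = TrigramModel(CORPUS)
well_formed = "SELECT * FROM empinfo WHERE first = 'Mary'"
scrambled = "empinfo FROM * SELECT first WHERE 'Mary' ="
print(model.log_prob(well_formed), model.log_prob(scrambled))
```

The two candidates contain the same tokens, so the comparison is not skewed by sentence length. A decoder would combine such a score with translation-model scores when ranking candidate outputs.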

6
STATISTICAL MACHINE TRANSLATION
  • Had to create the parallel corpus manually
  • Experiments and Evaluations
  • Some sample results using a trigram model

Sentence: Show all employees whose first name is
Mary
Output: from empinfo first where 'Mary'
Sentence: What is the id of the film 'Casablanca'.
Output: title='Alien' SELECT id salary) id Casablanca.
7
STATISTICAL MACHINE TRANSLATION
  • Possible reasons for bad results
  • Small size of data
  • Applying unstructured language model to a highly
    structured language
  • Future potential in using the structured language
    model introduced by Knight and Yamada

8
FINITE STATE TRANSDUCERS
  • This is the 2nd approach that I tried
  • It involves 2 steps
  • Extracting Keywords, Indicators, and their
    relative ordering from a given string using FSTs
  • Conversion of the output of the FST to a SQL
    query using a postprocessor written in Java.
  • This is also a very inexpensive method

9
KEYWORD EXTRACTION
  • The first step of the process is to feed the
    English statement to the FST
  • FST looks for certain keywords and indicators
    based on the database schema

Sentence: Show the names of managers who have
employees in a department located in Bombay
Output: selectnamemanagerwhoseemployeedepartmentlocation'Bombay'
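
The extraction step can be approximated with a simple lookup, shown here as a minimal sketch rather than the FST built for the project. The LEXICON and STOPWORDS tables are hypothetical and cover only this one sentence; a plain dictionary stands in for the transducer, and any word that is neither a keyword nor a stopword is treated as a quoted value.

```python
# Hypothetical lexicon: surface word -> keyword or indicator.
# In the slides this mapping is derived from the database schema.
LEXICON = {
    "show": "select",
    "names": "name",
    "name": "name",
    "managers": "manager",
    "who": "whose",
    "whose": "whose",
    "employees": "employee",
    "department": "department",
    "located": "location",
}

# Hypothetical words that carry no information for the query.
STOPWORDS = {"the", "of", "have", "in", "a", "an", "all", "is"}


def extract_keywords(sentence):
    """Return keywords, indicators and values in their original order."""
    tokens = []
    for raw in sentence.split():
        word = raw.strip(".,?!'\"")
        if not word:
            continue
        lower = word.lower()
        if lower in LEXICON:
            tokens.append(LEXICON[lower])
        elif lower not in STOPWORDS:
            # Unknown content words are assumed to be literal values.
            tokens.append(f"'{word}'")
    return tokens


sentence = ("Show the names of managers who have employees "
            "in a department located in Bombay")
print(extract_keywords(sentence))
```

Keeping the tokens as a list preserves the relative ordering that the later stage relies on.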
10
POSTPROCESSOR
  • The output of the FST is fed to the postprocessor
  • The postprocessor tries to break the input into
    the Objects component and the optional Conditions
    component
  • Needs knowledge about the database

11
Objects Component
  • Looks at all the fields that have been mentioned
    and tries to associate them with their respective
    tables
  • If more than 1 table is involved
  • Generates a subquery
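
One way to picture the field-to-table association is the sketch below. The SCHEMA dictionary and the nearest-mentioned-table rule are assumptions made for illustration; the slides do not say how ambiguity is resolved, and subquery generation is not covered here.

```python
# Hypothetical schema: table name -> column names.
SCHEMA = {
    "manager": ["managerid", "name"],
    "employee": ["employeeid", "managerid", "departmentid", "name"],
    "department": ["departmentid", "location"],
}


def resolve_fields(tokens, schema=SCHEMA):
    """Qualify every field token with the table it most likely belongs to."""
    resolved = []
    for i, tok in enumerate(tokens):
        owners = [table for table in schema if tok in schema[table]]
        if not owners:
            continue  # not a field name
        # Positions of owning tables that were explicitly mentioned.
        mentioned = [j for j, other in enumerate(tokens) if other in owners]
        if mentioned:
            table = tokens[min(mentioned, key=lambda j: abs(j - i))]
        elif len(owners) == 1:
            table = owners[0]
        else:
            raise ValueError(f"ambiguous field: {tok}")
        resolved.append(f"{table}.{tok}")
    return resolved


print(resolve_fields(["select", "name", "manager"]))
```

Here "name" exists in two tables, and the explicit mention of "manager" next to it settles the choice.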

12
Conditions Component
  • Iterates over the fields
  • Once it finds a field, it finds the closest
    operator and value
  • Goes through the same process as above to
    find the associated table
  • Repeats this process for all the fields
  • Checks for all the tables in the query so far and
    links them up.
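
The steps above can be sketched as follows. Everything here is an assumption for illustration: the SCHEMA, the rule that tables are linked through a shared column ending in "id", the default "=" operator when none is found, and the use of the "whose" indicator to mark where the conditions start.

```python
# Hypothetical schema: table name -> column names.
SCHEMA = {
    "manager": ["managerid", "name"],
    "employee": ["employeeid", "managerid", "departmentid", "name"],
    "department": ["departmentid", "location"],
}
OPERATORS = {"=", "<", ">", "<=", ">=", "<>"}


def is_value(token):
    """Quoted strings and plain numbers count as values."""
    return token.startswith("'") or token.replace(".", "", 1).isdigit()


def nearest(tokens, index, predicate):
    """Return the matching token closest to position index, or None."""
    hits = [j for j, tok in enumerate(tokens) if predicate(tok)]
    if not hits:
        return None
    return tokens[min(hits, key=lambda j: abs(j - index))]


def build_where(tokens, schema=SCHEMA):
    """Return (tables, where_clause) for a keyword token list."""
    tables = []
    for tok in tokens:
        if tok in schema and tok not in tables:
            tables.append(tok)
    # Only tokens after the "whose" indicator describe conditions.
    cond = tokens[tokens.index("whose") + 1:] if "whose" in tokens else []
    conditions = []
    for i, tok in enumerate(cond):
        owners = [t for t in schema if tok in schema[t]]
        if not owners:
            continue  # not a field name
        table = next((t for t in owners if t in tables), owners[0])
        if table not in tables:
            tables.append(table)
        op = nearest(cond, i, lambda t: t in OPERATORS) or "="
        value = nearest(cond, i, is_value)
        if value is not None:
            conditions.append(f"{table}.{tok} {op} {value}")
    # Link neighbouring tables through a shared key column.
    joins = []
    for left, right in zip(tables, tables[1:]):
        shared = [c for c in schema[left]
                  if c in schema[right] and c.endswith("id")]
        if shared:
            joins.append(f"{left}.{shared[0]} = {right}.{shared[0]}")
    return tables, " and ".join(joins + conditions)


tokens = ["select", "name", "manager", "whose", "employee",
          "department", "location", "=", "'Bombay'"]
tables, where = build_where(tokens)
print("from " + ", ".join(tables) + " where " + where)
```

Linking only neighbouring tables is the simplest choice that works for a chain such as manager, employee, department; a real schema would need the foreign-key metadata instead of a naming convention.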

13
NL Sentence:
Show the names of managers who have employees in
a department located in Bombay
FST output:
selectnamemanagerwhoseemployeedepartmentlocation'Bombay'
Postprocessor, Objects component:
Select manager.name from manager, employee,
department
Postprocessor, Conditions component:
where manager.managerid = employee.managerid and
employee.departmentid = department.departmentid
and department.location = 'Bombay'
Resulting query:
Select manager.name from manager, employee,
department where manager.managerid =
employee.managerid and employee.departmentid =
department.departmentid and department.location =
'Bombay'
14
LIMITATIONS AND SOLUTIONS
  • Building a domain-specific FST can be
    cumbersome
  • The FST can be replaced by a Perl script, which
    makes this task a lot easier
  • There are commands that it will not be able to
    process
  • Can be addressed by iterating back and forth
    with the user to see if there is an equivalent
    sentence in the preferred structure, using pattern
    matching
  • Needs knowledge about the database
  • This can be automated by a Java program that
    connects to a DB and retrieves all the useful
    metadata
  • And a one-time effort from the user to associate
    keywords with tables, fields, and operators.
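
The metadata retrieval might look like the sketch below. It uses Python with the standard sqlite3 module and an in-memory SQLite database instead of the Java program the slide proposes, and the function name read_schema and the two demo tables are assumptions for this example.

```python
import sqlite3


def read_schema(conn):
    """Return {table name: [column names]} for an SQLite connection."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Each PRAGMA row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [column[1] for column in columns]
    return schema


# Hypothetical demo database.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE manager (managerid INTEGER, name TEXT)")
demo.execute("CREATE TABLE department (departmentid INTEGER, location TEXT)")
print(read_schema(demo))
```

The resulting dictionary is the kind of schema knowledge the keyword lexicon and the postprocessor both depend on; the keyword associations would still have to come from the user.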

15
CONCLUSION
  • We have seen 2 lightweight methods that try to
    address the problem of converting an English
    statement into a SQL statement
  • SMT was not very successful, but does warrant
    some further effort
  • The keywords approach seems really promising and
    definitely deserves some attention