Implementation of a QA system in a real context

1
Implementation of a QA system in a real context
  • Carlos Amaral (Priberam, Portugal)
  • Dominique Laurent (Synapse Développement, France)

Workshop TellMeMore, November 24, 2006, C. Amaral, D. Laurent
2
  • 1. The Question-Answering system
  • What is a QA system?
  • A system that extracts an answer (or several) to
    a request (a question) from a corpus
  • The issue of question type
  • One answer or several, possibly a list, drawn from
    one or several documents; possibly a Yes/No answer
  • Over a corpus in one or several languages

3
1.1. QA and Language Processing
  • A QA system appears to be a language processing
    (LP) application par excellence
  • However, certain systems are based solely on
    pattern matching (cf. Soubotine and Soubotine, TREC
    2003)
  • These systems seem to have reached their limits
  • While they can handle everything factual,
    complex questions/queries are far beyond their
    capabilities
  • The best systems validated at TREC and CLEF are
    based on automated language processing
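
As a rough illustration of the pattern-matching approach mentioned above (not the method of any cited system), the sketch below answers a factoid question by turning it into a surface pattern and searching the corpus for it. The pattern table, the function name `answer_by_pattern` and the sample sentences are all hypothetical.

```python
import re

# Hypothetical surface patterns: a question shape paired with an answer template.
# The {x} slot is filled with the entity captured from the question.
PATTERNS = [
    (re.compile(r"^when was (?P<x>.+?) born\s*\?$", re.I),
     r"{x} was born (?:on |in )?(?P<answer>[^.,;]+)"),
    (re.compile(r"^who wrote (?P<x>.+?)\s*\?$", re.I),
     r"{x} was written by (?P<answer>[^.,;]+)"),
]


def answer_by_pattern(question, corpus):
    """Return the first answer found by a surface pattern, or None."""
    for question_re, answer_template in PATTERNS:
        m = question_re.match(question.strip())
        if not m:
            continue
        # Escape the entity so it is matched literally in the corpus.
        answer_re = re.compile(
            answer_template.format(x=re.escape(m.group("x"))), re.I
        )
        for sentence in corpus:
            hit = answer_re.search(sentence)
            if hit:
                return hit.group("answer").strip()
    return None


corpus = ["Hamlet was written by William Shakespeare."]
paraphrase = ["William Shakespeare is the author of Hamlet."]
print(answer_by_pattern("Who wrote Hamlet?", corpus))
print(answer_by_pattern("Who wrote Hamlet?", paraphrase))
print(answer_by_pattern("Why does Hamlet hesitate?", corpus))
```

The second and third calls return nothing: a paraphrased sentence or a non-factual question falls outside the pattern table, which is the limitation the slide points to.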

4
  • 1.2. Our QA system
  • First developed (1999-2001) within a French
    innovation project (Anvar)
  • Then (end of 2001 to end of 2003) within the
    European project TRUST (FP5)
  • Currently (2005/06) within the European project
    M-CAST (FP6)
  • Main features: targets B2B and B2C,
    multilingual, based on intensive NLP

5
A modular design
  • Language modules: French, Italian, Portuguese,
    Polish, English, Czech
  • Indexing engine
  • Text extraction engine
  • Documents
  • Visualization of results
  • Index
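
One way to picture the modular design above is a shared indexing engine that delegates all language-specific work to pluggable modules. The sketch below is only an assumption about how such a split could look: the class names `LanguageModule` and `IndexingEngine`, the stopword lists and the whitespace tokenizer are hypothetical stand-ins for the much deeper linguistic analysis a real module would perform.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple


@dataclass
class LanguageModule:
    """Hypothetical language module: here just tokenizing and stopword removal."""
    code: str
    stopwords: Set[str] = field(default_factory=set)

    def analyze(self, text: str) -> List[str]:
        tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
        return [t for t in tokens if t and t not in self.stopwords]


class IndexingEngine:
    """Language-independent engine that relies on the registered modules."""

    def __init__(self, modules: Iterable[LanguageModule]) -> None:
        self.modules: Dict[str, LanguageModule] = {m.code: m for m in modules}
        self.index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, doc_id: str, language: str, text: str) -> None:
        for term in self.modules[language].analyze(text):
            self.index[(language, term)].add(doc_id)

    def search(self, language: str, query: str) -> Set[str]:
        terms = self.modules[language].analyze(query)
        if not terms:
            return set()
        # A document must contain every query term (boolean AND).
        postings = [self.index.get((language, t), set()) for t in terms]
        return set.intersection(*postings)


engine = IndexingEngine([
    LanguageModule("en", {"the", "of", "is", "a"}),
    LanguageModule("fr", {"le", "la", "de", "est", "une"}),
])
engine.add("d1", "en", "Hamlet is a tragedy.")
engine.add("d2", "fr", "Hamlet est une tragédie.")
print(engine.search("en", "the tragedy of Hamlet"))
print(engine.search("fr", "la tragédie"))
```

Adding a language then means registering one more module; the engine itself does not change.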
6
7
  • 1.3. Evaluations of the QA system
  • Professional benchmarking contests and campaigns
    such as EQueR (2004) and CLEF (2005, 2006)
  • Evaluations of the French, English, Portuguese
    and Spanish language modules, in monolingual and
    multilingual settings

8
CLEF 2005
9
CLEF 2006
10
  • In CLEF 2005 and CLEF 2006, the best monolingual
    engines were our systems for Portuguese and
    French, and the best multilingual systems
    were ours for English-French,
    Portuguese-French, Spanish-Portuguese and
    Portuguese-Spanish.
  • Synapse Développement and Priberam are now
    partners in the Quaero project.

11
  • 2. Implementation in the M-CAST project
  • Tests carried out on books from the Czech National
    Library and the Torun library in Poland
  • Processes several million digitized
    documents
  • Manages metadata and UDC classification
  • Accommodates questions and answers in English,
    French, Italian, Portuguese, Polish and Czech
  • Implemented on both library portals

12
2.1. Adaptation to Digital Library Resources
  • Scanned texts are of poor quality
  • => Spell checker to improve the quality of
    documents
  • One book, many pages
  • => Management of multi-part documents during
    semantic analysis
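
The slide does not say how the spell checker works. As a minimal sketch of the first point only, the example below repairs OCR-damaged words by snapping them to the closest entry of a lexicon, using Python's standard `difflib`. The lexicon, the `cutoff` value and the function name `correct_ocr` are assumptions made for illustration.

```python
import difflib
import re

# Hypothetical lexicon; a real system would use a full dictionary per language.
LEXICON = {"the", "of", "national", "library", "books", "digitized"}


def correct_ocr(text, lexicon=LEXICON, cutoff=0.8):
    """Replace each unknown word with its closest lexicon entry, if close enough."""
    vocabulary = sorted(lexicon)

    def fix(match):
        word = match.group(0)
        lower = word.lower()
        if lower in lexicon:
            return word
        candidates = difflib.get_close_matches(lower, vocabulary, n=1, cutoff=cutoff)
        # Note: a corrected word comes back in lowercase; unknown words are kept.
        return candidates[0] if candidates else word

    # Only runs of letters are checked; digits and punctuation pass through.
    return re.sub(r"[^\W\d_]+", fix, text)


print(correct_ocr("the natiunal libnary"))
```

Words with no sufficiently close lexicon entry, such as proper nouns, are left untouched, so a conservative `cutoff` trades recall for fewer false corrections.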

13
2.2. Integration of Dublin Core document attributes
  • Storage of Dublin Core attributes as metadata
  • QA example: "Who is the author of Hamlet?"
  • Adaptation of the system to search in metadata
  • Use of this metadata as filters
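
To make the two uses of metadata above concrete, the sketch below answers an authorship question directly from the Dublin Core `creator` element and uses other elements as filters. The element names (`title`, `creator`, `language`, `date`) are standard Dublin Core; the records, the question pattern and both function names are hypothetical.

```python
import re

# Hypothetical catalogue records keyed by Dublin Core element names.
RECORDS = [
    {"title": "Hamlet", "creator": "William Shakespeare", "language": "en", "date": "1603"},
    {"title": "Pan Tadeusz", "creator": "Adam Mickiewicz", "language": "pl", "date": "1834"},
]

AUTHOR_QUESTION = re.compile(
    r"^who (?:is|was) the author of (?P<title>.+?)\s*\?$", re.I
)


def answer_from_metadata(question, records):
    """Answer an authorship question from metadata, or return None."""
    m = AUTHOR_QUESTION.match(question.strip())
    if not m:
        return None  # not a metadata question: fall back to full-text QA
    wanted = m.group("title").casefold()
    for record in records:
        if record.get("title", "").casefold() == wanted:
            return record.get("creator")
    return None


def filter_records(records, **criteria):
    """Keep only the records whose metadata matches every criterion."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


print(answer_from_metadata("Who is the author of Hamlet ?", RECORDS))
print([r["title"] for r in filter_records(RECORDS, language="pl")])
```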

14
2.3. Universal Decimal Classification
  • Storage of UDC codes for each document
  • Search through UDC codes
  • Filtering through UDC codes
  • Semantic disambiguation through UDC codes
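
Because UDC codes are hierarchical, a prefix comparison is enough to sketch both filtering and a simple form of disambiguation. The example below is an assumption about how this could be done, not the project's implementation: the document store, the sense table and the function names are hypothetical, the class labels in the comments are indicative only, and UDC auxiliary signs (such as ":" and "+") are not handled.

```python
# Hypothetical store: document id -> main UDC code.
UDC_BY_DOC = {
    "doc-a": "821.111",  # English literature
    "doc-b": "51",       # mathematics
    "doc-c": "53",       # physics
    "doc-d": "78",       # music
}

# Hypothetical sense inventory: each sense of a word is tied to a UDC class.
SENSES = {
    "field": {"53": "physical field", "63": "agricultural land"},
}


def filter_by_udc(udc_by_doc, prefix):
    """Return the ids of documents classified under the given UDC class."""
    return sorted(d for d, code in udc_by_doc.items() if code.startswith(prefix))


def disambiguate(word, doc_code, senses=SENSES):
    """Pick the sense whose UDC class is the most specific match for the document."""
    best = None
    for prefix, sense in senses.get(word, {}).items():
        if doc_code.startswith(prefix) and (best is None or len(prefix) > len(best[0])):
            best = (prefix, sense)
    return best[1] if best else None


print(filter_by_udc(UDC_BY_DOC, "5"))
print(disambiguate("field", "537"))
print(disambiguate("field", "78"))
```

When no sense matches the document's class, the function returns nothing and the decision would have to fall back on the linguistic context.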

15
Technical architecture
16
End of presentation
I would appreciate your questions! Thank you - Merci!