NLP Technology Applied to e-discovery

1
NLP Technology Applied to e-discovery
  • Bill Underwood
  • Principal Research Scientist
  • william.underwood_at_gtri.gatech.edu
  • The Current Status and Future of Search and
    Retrieval Technology
  • WG1 Mid-Year Meeting
  • Cambridge, Maryland, April 21-22, 2002

2
Research Sponsored by ERA Program of NARA
  • Application of Natural Language Processing
    Technology to effectively:
  • Summarize Series of Presidential e-records
  • Identify FOIA exemptions and PRA restrictions in
    Presidential e-records
  • Search for e-records relevant to a FOIA request
  • Search for e-records in massive collections in
    support of legal discovery

3
NLP Methods in Document Retrieval
  • Morphological processing
  • Identifying words
  • Parsing-Linguistic representation
  • Word sense disambiguation
  • Represent, identify and exploit semantic
    relationships
  • Conceptual indexing
  • Matching concepts in query to conceptual index
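
As a rough illustration of the last two bullets (conceptual indexing and matching query concepts against the index), here is a minimal sketch. The concept lexicon, the crude suffix stripping that stands in for morphological processing, and the sample documents are all hypothetical and are not taken from the presentation.

```python
from collections import defaultdict

# Hypothetical lexicon mapping word stems to concept labels (illustration only).
CONCEPTS = {
    "nominat": "APPOINTMENT",
    "appoint": "APPOINTMENT",
    "request": "REQUEST",
    "ask": "REQUEST",
    "memo": "DOCUMENT",
    "letter": "DOCUMENT",
}

SUFFIXES = ("ions", "ion", "ing", "ed", "es", "s", "e")


def stem(word):
    """Very crude morphological processing: strip one common suffix."""
    word = word.lower().strip(".,;:!?")
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def concepts_in(text):
    """Return the set of concept labels whose stems occur in the text."""
    return {CONCEPTS[stem(w)] for w in text.split() if stem(w) in CONCEPTS}


def build_index(docs):
    """Conceptual index: concept label -> ids of documents expressing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for concept in concepts_in(text):
            index[concept].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every concept found in the query."""
    wanted = concepts_in(query)
    if not wanted:
        return set()
    return set.intersection(*(index.get(c, set()) for c in wanted))


docs = {
    "d1": "The President nominated a new judge.",
    "d2": "Staff requested the memo on who was appointed.",
    "d3": "Lunch menu for Tuesday.",
}
index = build_index(docs)
hits = search(index, "letters asking about nominations")
print(hits)
```

In this toy example the query and document d2 share no keywords, yet d2 is returned because both map to the same three concepts. A real system would need word sense disambiguation and a far larger lexicon, which this sketch does not attempt.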

4
Current Weaknesses of NLP in Information Retrieval
  • NLP methods of document retrieval have failed to
    perform better than Boolean and statistical
    methods. Why?
  • Broad nature of retrieval tasks
  • Lack of weighting scheme for compound terms
  • Poor word sense disambiguation for documents and
    queries.
  • Need to handle verbs as well as nouns and noun
    phrases.
  • Poor POS tagging
  • Need better parsing algorithms and grammars.
  • Inadequate handling of negation

5
Advanced NLP Methods Applied to PERPOS Research
Tasks
  • Morphological analysis
  • Word sense disambiguation
  • Larger lexicon
  • Domain-dependent lexicons
  • Information extraction to identify classes of
    words
  • Template filling to identify communication acts
    of records (nominate, request information,
    provide information)
  • Learning and identification of document types
  • Method of reasoning with negation in NL
  • Conceptual taxonomy
  • Rule-based reasoning
  • Question answering technology
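
To make the template-filling bullet more concrete, the following sketch fills a simple template for the three communication acts named above (nominate, request information, provide information). The cue patterns and the sample record are hypothetical; regular expressions are used here only as a stand-in for whatever information extraction method the research actually employs.

```python
import re

# Hypothetical cue patterns, one per communication act (illustration only).
ACT_PATTERNS = {
    "nominate": re.compile(
        r"\b(?:I|we)\s+(?:hereby\s+)?nominate\s+(?P<object>[^.]+)", re.I
    ),
    "request information": re.compile(
        r"\bplease\s+(?:send|provide|forward)\s+(?P<object>[^.]+)", re.I
    ),
    "provide information": re.compile(
        r"\b(?:attached|enclosed)\s+(?:is|are)\s+(?P<object>[^.]+)", re.I
    ),
}


def fill_templates(text):
    """Return one filled template per communication act detected in the text."""
    templates = []
    for act, pattern in ACT_PATTERNS.items():
        for match in pattern.finditer(text):
            templates.append(
                {"act": act, "object": match.group("object").strip()}
            )
    return templates


record = "I hereby nominate Jane Doe to the board. Please send her resume by Friday."
for template in fill_templates(record):
    print(template)
```

Each filled template records the act and the text it applies to. A fuller version would presumably add slots for sender, recipient, and date, but those are not specified in the slides.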

6
Plausible, Hybrid Approach to Investigating
e-discovery
  • Formulate the e-discovery task not just as search
    terms but also in terms of the complaint itself,
    including the parties and laws involved. Express
    the kinds of evidence that would enable one to
    prove the case as a series of questions or
    if-then rules drawn from precedent cases and
    experience.
  • Use a COTS text retrieval system with Boolean
    queries and statistical methods to retrieve
    documents using key terms related to the case.
  • Use contextual knowledge with questions and NLP
    methods (e.g., question answering) to review the
    retrieved documents to determine more precisely
    those relevant to the case, i.e., those that
    would represent evidence.
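
The two-stage approach on this slide can be sketched as follows. This is a minimal sketch under stated assumptions: the Boolean filter with term-frequency ranking stands in for a COTS retrieval system, the keyword-based if-then rules stand in for question answering, and the case questions and sample documents are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def boolean_retrieve(docs, all_of=(), any_of=()):
    """Stage 1: Boolean filter, then rank by raw frequency of the key terms."""
    scored = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        vocab = set(tokens)
        if not all(term in vocab for term in all_of):
            continue
        if any_of and not any(term in vocab for term in any_of):
            continue
        score = sum(tokens.count(term) for term in (*all_of, *any_of))
        scored.append((score, doc_id))
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in ranked]


# Stage 2: hypothetical if-then rules, each tied to a question about the case.
RULES = [
    ("Did the sender know of the defect?",
     lambda t: "aware of" in t and "defect" in t),
    ("Was disclosure discouraged?",
     lambda t: "do not tell" in t or "keep this quiet" in t),
]


def review(docs, candidates):
    """Map each candidate document to the case questions it appears to answer."""
    evidence = {}
    for doc_id in candidates:
        text = docs[doc_id].lower()
        answered = [question for question, check in RULES if check(text)]
        if answered:
            evidence[doc_id] = answered
    return evidence


docs = {
    "m1": "We are aware of the brake defect. Keep this quiet until the recall.",
    "m2": "The brake supplier invoice is attached.",
    "m3": "No defect was found in the brake test.",
    "m4": "Picnic on Friday.",
}
candidates = boolean_retrieve(
    docs, all_of=("brake",), any_of=("defect", "recall", "invoice")
)
evidence = review(docs, candidates)
print(candidates)
print(evidence)
```

Stage 1 retrieves m1, m2, and m3 on key terms alone, including m3, where the defect is negated. Stage 2 keeps only m1, the one document that answers the case questions. Real question answering would of course go well beyond substring checks.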