Scanned Documents

About This Presentation
Description:

Transducer Capabilities. OCR. MT. Handwriting. Speech. The Big Picture. Find the words ... Duplicate detection for declassification productivity and anti-tiling ...

Slides: 10
Provided by: agn82

Transcript and Presenter's Notes

1
Scanned Documents
  • LBSC 796/INFM 718R
  • Douglas W. Oard
  • Week 8, October 29, 2007

4
Expanding the Search Space
Scanned Docs
[Figure text: Identity Harriet Later, I learned that John had not heard]
5
High Payoff Investments
Searchable Fraction
Transducer Capabilities
6
The Big Picture
  • Find the words
  • Index the words
  • Do ranked retrieval
  • Use that system to find what you want
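
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the system described in the lecture: it assumes the words have already been found (the `docs` dictionary stands in for hypothetical OCR output), and it uses a plain TF-IDF score as one common choice of ranking function. The names `tokenize`, `build_index`, and `rank` are made up for this sketch.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Index the words: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def rank(query, index, num_docs):
    """Do ranked retrieval: sum a TF-IDF weight over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # a term the recognizer never produced matches nothing
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += (1 + math.log(tf)) * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical recognizer output for three scanned pages (assumed data).
docs = {
    "page1": "Later, I learned that John had not heard",
    "page2": "John wrote to Harriet about the letter",
    "page3": "The letter was never sent",
}
index = build_index(docs)
results = rank("Harriet letter", index, len(docs))
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

With this data, page2 matches both query terms and ranks first, page3 matches only "letter", and page1 is not returned at all.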

7
Some Issues
  • Language-based search without language!
  • Shape codes
  • Accuracy-selection effect of ranked retrieval
  • Poor recognition scatters in the query-term space
  • Blind relevance feedback
  • Based on clean text
  • Image-domain summaries
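
The shape-code idea can be illustrated with a small sketch. The three-class mapping below (ascender, descender, x-height) is a simplifying assumption for illustration and is not necessarily the coding used in the lecture; `shape_code` and `shape_index` are hypothetical names. The point is that a word can be matched by its outline without recognizing each character, at the cost of some words sharing a code.

```python
from collections import defaultdict

# Assumed, simplified shape classes for lowercase letters.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def shape_code(word):
    """Map a word to a string of coarse shape classes.

    'A' = tall character (ascender, capital, or digit)
    'g' = character with a descender
    'x' = character that stays within the x-height
    Punctuation and other symbols are skipped.
    """
    code = []
    for ch in word:
        if ch.isupper() or ch.isdigit() or ch in ASCENDERS:
            code.append("A")
        elif ch in DESCENDERS:
            code.append("g")
        elif ch.isalpha():
            code.append("x")
    return "".join(code)


def shape_index(words):
    """Group vocabulary words by their shape code."""
    index = defaultdict(set)
    for word in words:
        index[shape_code(word)].add(word)
    return index


vocabulary = ["heard", "beard", "learned", "John"]
index = shape_index(vocabulary)

# "hcard" is a hypothetical misrecognition of "heard" (e read as c).
print(shape_code("heard"), shape_code("hcard"))
print(sorted(index[shape_code("hcard")]))
```

Because "e" and "c" fall in the same class, the misrecognized "hcard" still maps to the code of "heard". The same lookup also returns "beard", which shows the ambiguity a shape-coded index has to tolerate.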

8
Some Applications
  • Case management for litigation
  • Duplicate detection for declassification
    productivity and anti-tiling
  • Knowledge management from everything I have ever
    xeroxed or faxed
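
Duplicate detection over recognized text is often approached with word shingles and Jaccard similarity. The slide does not say which method is meant, so the sketch below is an assumption that shows one common technique; the sample strings are invented, and `shingles` and `jaccard` are hypothetical helper names.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Invented example text: a page, a rescan with one recognition error,
# and an unrelated page.
original = "Later, I learned that John had not heard from Harriet"
rescan = "Later I learned that John had not heard from Harrict"
unrelated = "The committee approved the budget for the next fiscal year"

sim_rescan = jaccard(shingles(original), shingles(rescan))
sim_unrelated = jaccard(shingles(original), shingles(unrelated))
print(round(sim_rescan, 3), sim_unrelated)
```

A single recognition error changes only the shingles that contain the damaged word, so the rescan still scores well above the unrelated page. A threshold on this score (the value would need tuning on real data) could then flag likely duplicates for review.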

9
Some Applications
  • Legacy Tobacco Documents Library
  • http://legacy.library.ucsf.edu/
  • Google Books
  • http://books.google.com/
  • George Washington's Papers
  • http://ciir.cs.umass.edu/irdemo/hw-demo/