Lucene Lab 2 - PowerPoint PPT Presentation

Provided by: Chuck161

Transcript and Presenter's Notes

1
Lucene Lab 2
  • 030209

2
General IR Process
  • 1st, Index
  • -- Start indexing (start stepping through all files)
  • -- Tokenize & stem each file
  • -- Index
  • 2nd, Query/Search
  • -- User enters (roughly) natural language query
  • -- Tokenize & stem the query
  • -- Run query against index
  • -- Results
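The two phases above can be sketched in a few lines of plain Java. This is only an illustration of the general IR process, not Lucene code: the class name TinyIndex, the sample documents, and the deliberately crude "stemmer" (it just strips a trailing "s") are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical toy index; not part of Lucene.
class TinyIndex {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new TreeMap<>();

    // Tokenize on non-letters, lowercase, then apply a crude stand-in for stemming:
    // strip one trailing "s" from tokens longer than three letters.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String tok : text.toLowerCase().split("[^a-z]+")) {
            if (tok.isEmpty()) continue;
            if (tok.length() > 3 && tok.endsWith("s")) {
                tok = tok.substring(0, tok.length() - 1);
            }
            terms.add(tok);
        }
        return terms;
    }

    // 1st, Index: analyze the document and record each term.
    void add(int docId, String text) {
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // 2nd, Query/Search: analyze the query the same way, then keep the
    // documents that contain every query term (AND semantics).
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : analyze(query)) {
            Set<Integer> docs = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) result = new TreeSet<>(docs);
            else result.retainAll(docs);
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add(1, "Parking rules for visitors");
        index.add(2, "Visitor parking rule changes");
        index.add(3, "Cafeteria menu");
        System.out.println(index.search("visitor rules")); // prints [1, 2]
    }
}
```

The query "visitor rules" finds both parking documents only because the query passes through the same analyze step as the files did at indexing time.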
3
Lucene Process
  • 1st, Index
  • -- IndexWriter.java
  • -- StandardAnalyzer.java or other analyzer
  • -- Index
  • 2nd, Query/Search
  • -- User enters (roughly) natural language query
  • -- Tokenize & stem the query
  • -- Run query against index
  • -- Results
4
Lucene Lab
  • All below will be run against the policies
    directory.
  • 1) Create your own StopWord file and run it with
    the StopAnalyzer. Export the results to an XML
    file.
  • Send the source file, the XML file, and your
    StopWord file to Jeff by the beginning of class
    Wed.
  • 2) Compile the SearchFiles.java program and run it
    against your indices. Do this for:
  • -- indexing with the StandardAnalyzer
  • -- indexing with the SimpleAnalyzer
  • -- indexing with the StopAnalyzer
  • -- indexing with the StopAnalyzer with your stop
    words
  • For each of the above, do one run with the
    Streaming option and one with the Paging
    option. The \docs\demo2.html file briefly
    discusses the difference. Review the usage
    statement in the source code to see how to select
    between the two. Take a screen shot of the
    results.
  • So this portion of the Lab/Homework will produce a
    total of 8 screen shots: a screen shot of the
    Streaming option and a screen shot of the Paging
    option for each of the four indices above.
  • REMEMBER: The SearchFiles program must use
    THE SAME ANALYZER as the one that created the
    index being searched. For example, when you
    search the index created with the StopAnalyzer,
    your SearchFiles program must invoke the
    same analyzer (StopAnalyzer in this case) in order
    to get appropriate results.
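The same-analyzer rule can be seen in a small plain-Java sketch. It does not use the Lucene classes: simple and stop are hypothetical stand-ins that only imitate SimpleAnalyzer-like and StopAnalyzer-like behaviour, the stop-word file format (one word per line) is an assumption, and the toy search requires every query term to match. In Lucene itself the visible effect of a mismatch depends on the query type, so treat this as an illustration of the principle only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class; the analyzers below are imitations, not Lucene's.
class AnalyzerMatchDemo {
    // SimpleAnalyzer-like stand-in: split on non-letters and lowercase.
    static List<String> simple(String text) {
        return Arrays.stream(text.toLowerCase().split("[^a-z]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    // StopAnalyzer-like stand-in: simple() plus removal of the given stop words.
    static Function<String, List<String>> stop(Set<String> stopWords) {
        return text -> simple(text).stream()
                .filter(t -> !stopWords.contains(t))
                .collect(Collectors.toList());
    }

    // Assumed file format: one stop word per line, blank lines ignored.
    static Set<String> loadStopWords(Path file) throws IOException {
        return Files.readAllLines(file).stream()
                .map(String::trim).map(String::toLowerCase)
                .filter(w -> !w.isEmpty())
                .collect(Collectors.toSet());
    }

    // Build term -> document ids using the given analyzer.
    static Map<String, Set<Integer>> index(List<String> docs,
                                           Function<String, List<String>> analyzer) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String term : analyzer.apply(docs.get(id))) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
        return postings;
    }

    // AND query: every analyzed query term must occur in the document.
    static Set<Integer> search(Map<String, Set<Integer>> postings, String query,
                               Function<String, List<String>> analyzer) {
        Set<Integer> hits = null;
        for (String term : analyzer.apply(query)) {
            Set<Integer> docs = postings.getOrDefault(term, Collections.emptySet());
            if (hits == null) hits = new TreeSet<>(docs);
            else hits.retainAll(docs);
        }
        return hits == null ? new TreeSet<>() : hits;
    }

    public static void main(String[] args) throws IOException {
        // Write a throwaway stop-word file so the example is self-contained.
        Path stopFile = Files.createTempFile("stopwords", ".txt");
        Files.write(stopFile, Arrays.asList("the", "of", "for"));
        Function<String, List<String>> stopAnalyzer = stop(loadStopWords(stopFile));

        List<String> docs = Arrays.asList("The parking policy", "Policy for the cafeteria");
        Map<String, Set<Integer>> postings = index(docs, stopAnalyzer);

        // Same analyzer at search time: "the" is dropped and "policy" matches both documents.
        System.out.println(search(postings, "the policy", stopAnalyzer));            // prints [0, 1]
        // Different analyzer: "the" survives analysis but was never indexed, so nothing matches.
        System.out.println(search(postings, "the policy", AnalyzerMatchDemo::simple)); // prints []
    }
}
```

The index never contains "the" because the stop-word filter removed it, so a search-time analyzer that keeps "the" asks for a term that cannot be found.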