Title: Cold Fusion Verity
1- Cold Fusion Verity
- Oracle Enterprise Search
- Acrobat PDF Index Assistant
- Comparison
Arden O. Weiss, ardenweiss_at_verizon.net
2 Scope of Presentation
- The Application using these Tools
- The Menu Structure
- Tool Pros/Cons
- Output Examples
- Conclusions
3 The Application Purposes
- A Cold Fusion Application that:
- Organizes all archival Facilities, Goals and Documents by source and content.
- Assists users in doing archival research using SQL on fields and full-text search logic.
- Reports on the quantity of archival documents and their types.
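As a rough illustration of combining SQL on fields with full-text logic, the sketch below filters rows by a field in SQL and then keeps only rows whose text contains every search word. It is not the application's code: the `documents` table, its columns, and the sample rows are all hypothetical, and plain Python word matching stands in for the Verity index.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, facility TEXT, doc_type TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO documents (facility, doc_type, body) VALUES (?, ?, ?)",
    [
        ("North Site", "interview", "Interview notes on building maintenance records"),
        ("North Site", "report", "Annual maintenance report for the facility"),
        ("South Site", "report", "Budget report with no upkeep section"),
    ],
)

def search(facility, words):
    """Filter by a field with SQL, then keep rows whose text has every word."""
    rows = conn.execute(
        "SELECT id, doc_type, body FROM documents WHERE facility = ? ORDER BY id",
        (facility,),
    ).fetchall()
    wanted = [w.lower() for w in words]
    return [row for row in rows if all(w in row[2].lower().split() for w in wanted)]

def count_by_type():
    """Bean-counting report: number of documents per type."""
    return dict(
        conn.execute("SELECT doc_type, COUNT(*) FROM documents GROUP BY doc_type")
    )
```

Calling `search("North Site", ["maintenance", "report"])` returns only the North Site report row, and `count_by_type()` gives the per-type totals used for quantity reports.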
4 The Application Attributes
- Searches the content of
- Archival PDF files
- Several Oracle Databases
- Email Archive
- Other Public/shared Folders
- Displays Search Results
- Does bean-counting type reports
- Access controlled by Database/App.
5 The Analyses Tree Structure
- Facilities (name/address/contact data)
- Research Goals (description/conflicts/dates)
- Research Documents, Interviews, categorization data (detailed info)
Full-text Verity Search includes all Oracle fields plus PDF documents.
6 The Cold Fusion Program Menu
7 The CF Program In Action
- Accessing documents from the top-down.
- Searching for Facilities.
- Searching for Goals
- Searching for Documents
- Searching for words in Verity Index.
8 Cold Fusion Verity Pros/Cons
- Cold Fusion Verity Pros
- Search under Cold Fusion Program control.
- Search speed and results display is fast.
- Drilldown logic can be used to add specificity.
- Output can be formatted as desired.
- Output can be redirected to other data stores.
- Index update under Cold Fusion Program control.
- Cold Fusion Verity Cons
- Problems Indexing large data stores.
- Context for search results not always obvious.
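The drill-down idea listed among the pros can be sketched as successive filters, where each new term narrows the previous result set rather than searching everything again. This is a minimal sketch with made-up document strings and simple substring matching; it is not how Verity or the Cold Fusion program implements it.

```python
def drill_down(documents, terms):
    """Apply terms one at a time, each narrowing the previous result set."""
    results = list(documents)
    steps = []
    for term in terms:
        needle = term.lower()
        results = [doc for doc in results if needle in doc.lower()]
        steps.append((term, len(results)))  # how many hits remain after this term
    return results, steps

# Hypothetical document summaries, for illustration only.
docs = [
    "Facility survey 1998, north archive",
    "Facility interview 2001, north archive",
    "Goal statement 2001, south archive",
]
results, steps = drill_down(docs, ["archive", "north", "interview"])
```

The `steps` list records how the hit count shrinks with each added term, which is the kind of feedback that helps a user decide when the search is specific enough.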
9 Oracle Search Pros/Cons
- Oracle Enterprise Search Pros
- Search can do a global search of LAN/WAN.
- Indexing of large data stores is excellent.
- Search speed is fast even for large data stores.
- Output is displayed in Google-like display.
- Search can be run external to/parallel w/CF App.
- Oracle Enterprise Search Cons
- Search logic limited to simple queries.
- Output cannot be formatted as desired.
- Output is displayed in Google-like display.
- Search results export is manual via copy/paste.
10 PDF Index Assistant Pros/Cons
- Index Assistant Search Pros
- Indexing of large folders of PDF files works well.
- Search speed is fast even for a big set of PDF files.
- Results are displayed in a well-organized manner.
- Results are displayed/highlighted in full context.
- Search can be run external to/parallel w/CF App.
- Index Assistant Search Cons
- Search is limited to contents of indexed PDF files.
- All included files must be in PDF format.
- Keeping the index current is a manual process.
- Search results export is manual via copy/paste.
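Displaying a hit highlighted in its surrounding context, listed above as a pro, can be illustrated with a small keyword-in-context helper. This is only a sketch of the general idea: the bracket markers, the `width` parameter, and the case-insensitive matching are assumptions, and Acrobat's own highlighting is done inside the PDF viewer.

```python
import re

def highlight_in_context(text, word, width=20):
    """Return each whole-word hit in brackets with `width` characters either side."""
    snippets = []
    pattern = r"\b" + re.escape(word) + r"\b"
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        start = max(0, m.start() - width)
        end = min(len(text), m.end() + width)
        snippets.append(text[start:m.start()] + "[" + m.group(0) + "]" + text[m.end():end])
    return snippets

sample = "The archive holds the Weiss papers."
snippets = highlight_in_context(sample, "weiss", width=10)
```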
11 Oracle Search Screen
12 Oracle Search Example Results
Results can be:
- Grouped by Source, Date, Author, File Format
- Sorted by Relevance, Date, Author, File Format, Title, Path, Language
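Grouping and sorting of this kind can be sketched on a few in-memory result records. The records, field names, and relevance scores below are hypothetical and only show the shape of the operation; they are not Oracle output.

```python
from collections import defaultdict

# Hypothetical result records, for illustration only.
results = [
    {"title": "Site plan", "source": "file share", "author": "Lee",
     "date": "2006-03-01", "relevance": 0.71},
    {"title": "Interview log", "source": "database", "author": "Kim",
     "date": "2007-01-15", "relevance": 0.93},
    {"title": "Memo", "source": "file share", "author": "Kim",
     "date": "2005-11-20", "relevance": 0.42},
]

def group_by(records, attribute):
    """Group result titles under each distinct value of one attribute."""
    groups = defaultdict(list)
    for record in records:
        groups[record[attribute]].append(record["title"])
    return dict(groups)

def sort_by(records, attribute, descending=False):
    """Return result titles ordered by one attribute."""
    ordered = sorted(records, key=lambda r: r[attribute], reverse=descending)
    return [r["title"] for r in ordered]
```

For example, `group_by(results, "source")` collects titles per source, and `sort_by(results, "relevance", descending=True)` lists the most relevant hit first. Dates are kept as ISO strings so that plain string ordering matches chronological order.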
13 Oracle Example Search Results
Matching attribute names include (any or all): Author, Description, Headline 1, 2 or 3, Host, Keywords, Language, Last Modified Date, Mimetype, Reference to Text, Subject, Title, Urldepth, Url
14 Loading Acrobat Index Builder
15 Opening PDF Index (PDX) File
16 Selecting Folders to Include
Rebuild recreates the PDX file from scratch (about 8 min for 1471 PDF files). Build updates the existing PDX file (took seconds when changes were minimal).
17 Finding PDF Files to Include
This is a rebuild operation; it first looks for files to include.
18 Building Acrobat PDX Index
Build (update) operation is faster than Rebuild
operation.
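Why an update is faster than a rebuild can be shown with a toy index that stores a modification stamp per file. This is a sketch of the general incremental-indexing idea, not of Acrobat Catalog's internals; the file names, stamps, and the in-memory `files` mapping are assumptions made for illustration.

```python
def rebuild(files):
    """Index every file from scratch. `files` maps path -> (stamp, text)."""
    index = {
        path: {"stamp": stamp, "words": set(text.lower().split())}
        for path, (stamp, text) in files.items()
    }
    return index, len(files)  # every file was read

def build(index, files):
    """Update an existing index, reading only new or changed files."""
    read = 0
    for path, (stamp, text) in files.items():
        entry = index.get(path)
        if entry is None or entry["stamp"] != stamp:
            index[path] = {"stamp": stamp, "words": set(text.lower().split())}
            read += 1
    for path in list(index):
        if path not in files:  # drop entries for deleted files
            del index[path]
    return index, read

# Hypothetical files: path -> (modification stamp, extracted text).
files = {
    "a.pdf": (1, "weiss archive"),
    "b.pdf": (1, "facility goals"),
    "c.pdf": (1, "interview notes"),
}
index, read_all = rebuild(files)
files["b.pdf"] = (2, "facility goals revised")  # one file changes
index, read_changed = build(index, files)
```

Here the rebuild reads all three files while the later build reads only the one that changed, which mirrors the minutes-versus-seconds difference reported on the slides.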
19 Scheduled PDX Index Updates
- Use a catalog batch PDX file (.bpdx) to schedule when to automatically build, rebuild, update, and purge an index.
- A BPDX text file contains a list of platform-dependent catalog index file paths and flags.
- Use a scheduling application, such as Windows Scheduler, to display the BPDX file in Acrobat.
- Acrobat re-creates the index according to the flags in the BPDX file.
20 Searching PDX Index (1 of 3)
- On Acrobat's Main Menu, click on Edit then Search, or press Shift+Ctrl+F, to display the Search Window.
- Click on the Advanced Search Options link at the bottom of the screen.
- Click on Select Index at the top of the Window to display
21 Searching PDX Index (2 of 3)
- The Search Window then changes to show
Currently Selected Indexes with excellent
search options.
- Enter criteria and Press the Search button to
display results.
22 Searching PDX Index (3 of 3)
- Search for WEISS in the Currently Selected
Indexes -- whole words only checked.
Results shown below.
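The effect of the "whole words only" option can be illustrated with a regular expression that requires word boundaries around the term. This is a sketch of the matching rule only; the sample sentence is made up, and case-insensitive matching is an assumption rather than a documented Acrobat behavior.

```python
import re

def find_matches(text, word, whole_words_only=True):
    """Return the matched strings for `word`, optionally as whole words only."""
    pattern = re.escape(word)
    if whole_words_only:
        pattern = r"\b" + pattern + r"\b"  # require a word boundary on both sides
    return [m.group(0) for m in re.finditer(pattern, text, flags=re.IGNORECASE)]

sample = "Weiss wrote to Weissman; the Weiss file is archived."
whole = find_matches(sample, "WEISS")
partial = find_matches(sample, "WEISS", whole_words_only=False)
```

With the option on, "Weissman" is skipped, so `whole` holds two hits; with it off, `partial` holds three because the name also matches inside the longer word.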
23 Conclusions and Thoughts
- All three search technologies co-exist well.
- Oracle Search is not PDF-centric and may be too broad a search function to easily control.
- Oracle Search may be a good way to discover what missed being put into the CF Archive.
- SQL Server may have functionality similar to Oracle Enterprise Search.
- Acrobat Index Search gets you immediately closer to the real PDFs (Verity does not highlight search words in displayed PDFs).
- Verity and Acrobat are much cheaper dates.
24 Tha-Tha-That's All Folks