Commercial Online Databases and the Internet

1
Commercial Online Databases and the Internet
OSS 99 Global Information Forum
May 24, 1999
Anne Caputo, Dow Jones Interactive Publishing
2
Traditional Search Services Challenge the Web
  • The Internet Searchoff
  • September 1997-February 1998
  • Susan Feldman, DATASEARCH
  • sef2_at_cornell.edu
  • Goal: compare searching traditional online services
    with the World Wide Web
  • Effectiveness in finding information
  • When to use which one
  • Strengths of each approach

3
Searchoff Ground Rules
  • Be a trained, experienced searcher
  • Use a real question from a client
  • Search either Dialog or Dow Jones Interactive
  • Relevance-rank the results
  • Rank the top 30 retrieved documents on a scale of
    1 to 5
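
The 1-to-5 rankings can be rolled up into a single score per search. The sketch below assumes that a "relevance points" total is simply the sum of the rankings over the top 30 documents; the slides do not spell out the formula, and the function name and sample rankings are hypothetical.

```python
def relevance_points(rankings, top_n=30):
    """Sum 1-to-5 relevance rankings over the first top_n retrieved documents.

    Assumption: points = sum of rankings. The slides do not state the formula.
    """
    considered = rankings[:top_n]
    for r in considered:
        if not 1 <= r <= 5:
            raise ValueError(f"ranking out of range: {r}")
    return sum(considered)


# Hypothetical rankings for one search, in retrieval order.
sample = [5, 4, 1, 1, 3, 2, 5]
print(relevance_points(sample))  # 21
```

Capping at `top_n` mirrors the ground rule of judging only the top 30 retrieved documents, so a search that returns thousands of hits is scored on the same footing as one that returns forty.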

4
Subjects Searched
  • Business: 38%
  • Technology: 18%
  • Medicine/Pharmaceuticals: 14%
  • Science: 10%
  • Humanities: 8%
  • Engineering: 6%
  • Other: 6%

5
Web Search Engines Used
  • Alta Vista: 45%
  • Hotbot: 20%
  • Excite: 14%
  • Infoseek: 14%
  • Lycos: 5%
  • Webferret: 2%

6
Internet Search-Off Results
[Bar chart: Web totals vs. Dialog/Dow Jones totals for Relevance Points and Documents. Bar labels: 1400 and 1143 (Relevance Points); 515 and 484 (Documents).]
7
Searching time
  • Total minutes searching time
  • DIALOG/DOW JONES: 594 minutes
  • WWW search engines: 1230 minutes
  • Plus formatting time
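
To put the two totals side by side, a quick calculation helps. This sketch uses only the minute counts from the slide; formatting time is not quantified in the slides, so it is left out.

```python
dialog_dj_minutes = 594   # DIALOG/Dow Jones total, from the slide
web_minutes = 1230        # WWW search engines total, from the slide

# How many times longer the Web searches took, and the gap in hours.
ratio = web_minutes / dialog_dj_minutes
extra_hours = (web_minutes - dialog_dj_minutes) / 60

print(f"Web searching took {ratio:.2f}x as long")          # 2.07x
print(f"Extra time on the Web: {extra_hours:.1f} hours")   # 10.6 hours
```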

8
Searching Assumptions: traditional search engines
  • Information exists on the subject
  • The information is high quality
  • The information is current
  • The information is expensive
  • To find it, we need expertise and training to
    know how and where to search
  • It will be a surprise if we can't find something

9
Searching Assumptions: World Wide Web
  • There MIGHT be information on the topic
  • Quality and timeliness are unpredictable
  • The information is free
  • There's no telling how the search engine works
  • Searching requires no skill
  • Searching requires no training
  • It will be a surprise if we find something

10
Retrieved Documents by Relevance
[Bar chart: number of retrieved documents at each relevance rank, RANKED 1 (less relevant) through RANKED 5 (more relevant), Web vs. DIALOG/Dow Jones. Bar labels: 306, 147, 117, 108, 111, 60, 52, 38, 34, 26.]
11
Conclusion
  • DIALOG training has influenced an entire
    generation of searchers: we automatically shift
    into Boolean

12
Digression
  • Nested Boolean searches don't take advantage of
    the strong points of Web search engines
  • Statistical search engines search a whole
    territory. Boolean engines search for a point
    in that territory
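
The point-versus-territory contrast can be illustrated with a toy example. This is a minimal sketch, not how any particular engine works: the "statistical" side is reduced to counting matching query terms, whereas real engines weight terms in more elaborate ways. The documents and function names are hypothetical.

```python
# Three hypothetical documents, keyed by id.
docs = {
    "a": "online database search pricing",
    "b": "web search engine ranking",
    "c": "database search engine comparison on the web",
}


def boolean_and(query_terms, docs):
    """Return ids of documents containing every query term (a 'point')."""
    return sorted(
        doc_id for doc_id, text in docs.items()
        if all(term in text.split() for term in query_terms)
    )


def overlap_rank(query_terms, docs):
    """Rank documents by how many query terms they contain (a 'territory')."""
    scores = {
        doc_id: sum(term in text.split() for term in query_terms)
        for doc_id, text in docs.items()
    }
    matched = [doc_id for doc_id in scores if scores[doc_id] > 0]
    # Highest score first; ties broken by id for a stable order.
    return sorted(matched, key=lambda doc_id: (-scores[doc_id], doc_id))


query = ["database", "search", "engine", "web"]
print(boolean_and(query, docs))   # ['c']
print(overlap_rank(query, docs))  # ['c', 'b', 'a']
```

The Boolean AND returns only the one document that satisfies every term, while the ranked version returns the same document first and then the partial matches around it.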

13
Web Strategies
  • Map the territory
  • Use your searching skills to create lists of
    related terms
  • Omit Boolean operators
  • Let the search engine work without interference
  • Put the most important and most rare words first
  • Use MORE LIKE THIS to improve results
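
Two of these strategies, omitting Boolean operators and putting the rarest words first, can be sketched as a small query builder. The function name and the term-frequency figures are hypothetical; in practice a searcher judges rarity from experience rather than from a lookup table.

```python
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def build_web_query(terms, doc_frequency):
    """Drop uppercase Boolean operators and put the rarest terms first.

    doc_frequency maps a lowercase term to an assumed count of documents
    containing it; unknown terms are treated as rarest (count 0).
    """
    kept = [t for t in terms if t not in BOOLEAN_OPERATORS]
    ordered = sorted(kept, key=lambda t: doc_frequency.get(t.lower(), 0))
    return " ".join(ordered)


# Hypothetical frequencies, for illustration only.
freq = {"drug": 90000, "studies": 250000, "thalidomide": 1200}
print(build_web_query(["drug", "AND", "studies", "AND", "thalidomide"], freq))
# thalidomide drug studies
```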

14
Web Strategies
  • Use phrases when possible to eliminate irrelevant
    materials
  • Ignore the useless hits and pursue the good ones
  • Don't worry about finding six million documents.
  • Just look at the top 30
  • Rephrase the search
  • Move to another search engine if you don't find
    anything

15
Conclusions: traditional search services
  • Predictable archives
  • Chemical Engineering
  • Electrical Engineering
  • Strengths
  • History and background on companies
  • History and historical figures
  • Market reports, industry reports

16
Conclusions: traditional search services
  • Current drug studies (authoritative)
  • Industry newsletters and journals
  • Financial industry coverage
  • Scholarly journal articles
  • High quality information
  • Quick searches when you know the information is
    likely to be there

17
Conclusions: the Web
  • Pictures and illustrations
  • Some conference coverage and papers
  • Product information that comes from the company
  • Small companies' products/background
  • Medical statistics (current)
  • If you know where to find the information

18
Conclusions: use both
  • To supplement each other for
  • Standards
  • Articles on topics of general interest
  • Popular subjects
  • Organizations
  • Directory information
  • Reviews/evaluations/how-to information

19
Conclusions: use both
  • Government regulations and other agency
    information
  • Competitive intelligence
  • Obscure topics
  • Clues for finding information on and offline

20
Conclusions: general
  • Time is money.
  • Free information that takes too long to find and
    format is expensive information
  • The Web is a new tool.
  • We need to learn to use both online sources
    well
  • Vary strategies and approach to take advantage of
    each medium