Title: Web Search Principles
1 Web Search Principles
2 Introduction
- The World Changes Constantly
- New Companies
- Countries
- News and Issues
- Geography
- Products and Product innovations
- Online resources to improve our quality of life
3 The Vast Internet
- 8 billion pages
- Tripled in the last two years and growing rapidly
4 The Web Is an Immense Resource
- Research your competitors
- Research gospel topics and do genealogy
- Download music, software, and drivers
- Product evaluations
- Job searching
5 The Good, the Bad, and the Ugly
- Gospel resources vs. smut
- Credible vs. non-credible
6 How the Web Is Indexed
- Search Engines
- Single-Threaded (Spider-based)
- Multi-threaded (Meta) Search Engines
- Subject Indexes
- Combinations of the above
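As a rough illustration of how a spider-based engine might build its index, the sketch below follows links breadth-first over a tiny in-memory "web" and records which pages contain each word. The page names, contents, and the `crawl` helper are all hypothetical, made up for this example; real spiders fetch pages over the network and are far more elaborate.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example/home": ("web search principles", ["a.example/tips", "b.example/news"]),
    "a.example/tips": ("search tips and phrases", ["a.example/home"]),
    "b.example/news": ("news about search engines", []),
    "c.example/orphan": ("page with no inbound links", []),
}


def crawl(start, pages):
    """Follow links breadth-first from `start`; return (word index, visited URLs)."""
    index = {}          # word -> set of URLs containing that word
    seen = {start}      # URLs already queued, so no page is indexed twice
    queue = deque([start])
    while queue:
        url = queue.popleft()
        text, links = pages[url]
        for word in text.split():
            index.setdefault(word, set()).add(url)
        for link in links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return index, seen


index, visited = crawl("a.example/home", PAGES)
print(sorted(index["search"]))
print("c.example/orphan" in visited)
```

In this toy setup the spider never reaches `c.example/orphan` because no page links to it, which is one way an index can end up missing content.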
7 Characteristics of All Web Indexes
- Limited Samples
- No central, all-inclusive index
- Samples are partially overlapping
- All indexes miss some content
- Snapshots
- Never completely current
- You need effective searching strategies
8 Percentage of Searches by U.S. Web Searchers
[Chart: share of searches by search engine; surviving labels: "(includes AltaVista and AllTheWeb)" and "(Jeeves)"]
As of July 2005. Source: http://searchenginewatch.com/reports/article.php/2156431
9 Sample Characteristics
Notes:
- Self-reported
- How deep into the page indexing occurs
Stats are as of November 2005.
Source: http://blog.searchenginewatch.com/blog/041111-084221
10 Original Sources of Content (July 2005)
Source: http://searchenginewatch.com/reports/article.php/2156431
11 Features Common to Popular Search Engines
- Boolean searching (AND, OR, NOT)
- Ignore Capitalization
- Wildcards (e.g., three * mice): can search for terms with other words in between
- Search in domain
Source: http://www.philb.com/compare.htm
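To make Boolean searching and case-insensitive matching concrete, here is a minimal sketch over three made-up documents. It assumes a simple word-to-document index held in memory; the helper names (`term`, `and_`, `or_`, `not_`) are invented for this example and do not reflect how any particular engine works internally.

```python
# Hypothetical mini-collection: document id -> text.
DOCS = {
    1: "Three blind mice",
    2: "Three mice ran",
    3: "Blind cats",
}


def build_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)
ALL_IDS = set(DOCS)


def term(word):
    """Look up a word, ignoring capitalization."""
    return INDEX.get(word.lower(), set())


def and_(a, b):
    return a & b          # documents matching both operands


def or_(a, b):
    return a | b          # documents matching either operand


def not_(a):
    return ALL_IDS - a    # documents not matching the operand


print(and_(term("THREE"), term("mice")))         # three AND mice
print(or_(term("blind"), term("ran")))           # blind OR ran
print(and_(term("mice"), not_(term("blind"))))   # mice AND NOT blind
```

Each operator is just a set operation on the matching document ids, which is why AND narrows a search while OR broadens it.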
12 Single-Thread Search Engines
- Powerful Searching Features
- + (Match All Terms)
- Match Any Term
- - (Exclusion)
- Phrases
- Title Search
- URL Search
- Domain Search
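The features above are usually typed straight into the search box. The sketch below assembles such a query string; it assumes Google-style operator spellings (`OR`, a leading `-`, quotation marks for phrases, `intitle:`, `inurl:`, `site:`), which differ between engines, and `build_query` is a hypothetical helper written only for this example.

```python
def build_query(all_terms=(), any_terms=(), exclude=(), phrase=None,
                title=None, url=None, domain=None):
    """Compose a query string using Google-style operators (an assumption)."""
    parts = list(all_terms)                    # plain terms: match all of them
    if any_terms:
        parts.append("(" + " OR ".join(any_terms) + ")")   # match any term
    parts.extend("-" + t for t in exclude)     # exclusion
    if phrase:
        parts.append('"' + phrase + '"')       # exact phrase
    if title:
        parts.append("intitle:" + title)       # title search
    if url:
        parts.append("inurl:" + url)           # URL search
    if domain:
        parts.append("site:" + domain)         # domain search
    return " ".join(parts)


query = build_query(
    all_terms=["genealogy"],
    exclude=["software"],
    phrase="family history",
    domain="example.org",
)
print(query)
```

Check the help page of the engine you are using before relying on any of these spellings.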
13 Other Features of Popular Search Indexes
Source: http://www.philb.com/compare.htm
14 Features Provided by Some of the Listed Search Engines
Source: http://www.philb.com/compare.htm
15 Multi-threaded (Meta) Search Engines
- Sample Multiple Other Databases in one interface
- Broader sample than single-thread
- Can choose which single-thread engines to sample
- Fewer options for specific search control
- Cannot do URL, title, and exclusion searches
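A meta search engine can be pictured as a thin layer that sends one query to several single-thread engines and merges what comes back. The sketch below is only an illustration under that assumption: the two "engines" are stub functions returning fixed, made-up result lists, and the round-robin merge with duplicate removal is one plausible strategy, not a description of how Metacrawler or Dogpile actually combine results.

```python
def engine_a(query):
    """Stub single-thread engine (hypothetical results)."""
    return ["x.example/1", "y.example/2"]


def engine_b(query):
    """Stub single-thread engine (hypothetical results)."""
    return ["y.example/2", "z.example/3", "w.example/4"]


ENGINES = {"a": engine_a, "b": engine_b}


def meta_search(query, engines, only=None):
    """Query the chosen engines and interleave their results without duplicates."""
    chosen = [fn for name, fn in engines.items() if only is None or name in only]
    per_engine = [fn(query) for fn in chosen]
    merged, seen = [], set()
    longest = max((len(results) for results in per_engine), default=0)
    for rank in range(longest):            # take rank 1 from each engine, then rank 2, ...
        for results in per_engine:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged


print(meta_search("web search tips", ENGINES))
print(meta_search("web search tips", ENGINES, only={"b"}))
```

Because the same plain query string goes to every engine, engine-specific operators cannot be passed through reliably, which fits the loss of fine-grained search control noted above.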
16 Examples
- Metacrawler
- http://www.metacrawler.com
- Dogpile
- http://www.dogpile.com
17 Subject Directories
- Organized by humans into topical categories
- Most popular examples
- http://www.yahoo.com/
- Organized by about 100 Yahoo employees
- http://www.dmoz.org/
- Organized by an army of volunteers
- Volunteer editors are specialists in their topics
- Most complete topically
18 Use a Subject Directory When Looking for the Following
- Popular subjects
- e.g., computer games, the history of popular music
- A broad subject
- e.g., the history of art, the third world, the environment
- A so-called metapage
- A webpage compiled and made available by an expert or hobbyist who has collected URLs about a particular subject
19 In-Class Examples
20 Summary Search Tips
- Use search terms, combinations, and phrases
- Reduce the number of haystacks searched by finding a specialized haystack
- Information in a known domain
- Search within a focused directory like Google Directory search
- If your index does not return results
- Try a different index
- Try a meta search engine