Web search engines - PowerPoint PPT Presentation

Provided by: sheng3

Transcript and Presenter's Notes

Title: Web search engines


1
Web search engines
2
Web search engines
  • Brief introduction
  • Some observations

3
Major steps in Web search services
  • Collect Web pages
  • Index Web pages (documents)
  • Search services

4
Index Web pages
  • Stemming: reduce each word to its stem, e.g.
  • step → step,
  • private, privately, privation, privatize → privat,
  • future, futurism → futur
  • Removing stopwords such as a, an, the,
    to, there, and so on.
  • Indexing
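
The three indexing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the suffix list is hypothetical and chosen to reproduce the examples on this slide (real stemmers such as Porter's use more careful rules), the stop-word set contains only the words the slide names, and the function names `stem`, `analyze` and `build_index` are invented here.

```python
import re
from collections import defaultdict

# Stop words named on the slide; a real list would be much longer.
STOPWORDS = {"a", "an", "the", "to", "there"}

# Hypothetical suffix list, tried in order; not a real stemming algorithm.
SUFFIXES = ("ely", "ion", "ism", "ize", "e", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)]
    return word


def analyze(text):
    """Lowercase, split into words, drop stop words, stem the rest."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(t) for t in tokens if t not in STOPWORDS]


def build_index(docs):
    """Build an inverted index: stem -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


index = build_index({1: "the private step", 2: "a future step"})
print(sorted(index["step"]))
```

Storing a set of document ids per stem is what lets the search step look up each query word directly instead of scanning every page.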

5
Search service
  • Find all documents that match the query
    (containing all or some of the words in the query)
  • Rank the documents and put those estimated to be
    most relevant at the top.
  • The ranking algorithm is the key part of a Web search
    service.
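
The matching step (all or some of the query words) can be sketched as set operations on an inverted index. The index contents and the function name `match` below are made up for illustration; ranking of the matched documents is a separate step.

```python
def match(index, query_terms, require_all=True):
    """Return ids of documents containing all (or any) of the query terms."""
    postings = [index.get(term, set()) for term in query_terms]
    if not postings:
        return set()
    if require_all:
        return set.intersection(*postings)  # every term must occur
    return set.union(*postings)  # at least one term must occur


# Hypothetical inverted index: term -> ids of documents containing it.
index = {"volcano": {1, 2, 3}, "form": {2, 3}, "mars": {3, 4}}

print(sorted(match(index, ["volcano", "form"])))
print(sorted(match(index, ["form", "mars"], require_all=False)))
```

Requiring all words gives a smaller, more selective result set; accepting some words gives a larger one that depends more heavily on ranking.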

6
Several factors for ranking
  • Words and their frequencies of occurrence in the
    documents
  • Authority of web documents (via link analysis)
  • User feedback
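
One possible way to combine these three factors into a single score is sketched below. The formula, the weights and the use of inbound-link and click counts as stand-ins for "authority" and "user feedback" are all assumptions made for illustration; the slides do not say how any real engine weighs them.

```python
import math
from collections import Counter


def score(query_terms, doc_terms, inbound_links=0, clicks=0,
          link_weight=0.1, click_weight=0.1):
    """Toy relevance score: term frequency plus damped link and click counts."""
    counts = Counter(doc_terms)
    # Share of the document's words that are query words.
    tf = sum(counts[t] for t in query_terms) / max(len(doc_terms), 1)
    authority = math.log1p(inbound_links)  # assumed proxy for link analysis
    feedback = math.log1p(clicks)          # assumed proxy for user feedback
    return tf + link_weight * authority + click_weight * feedback


def rank(query_terms, docs):
    """docs maps id -> (terms, inbound_links, clicks); best score first."""
    return sorted(docs, key=lambda d: score(query_terms, *docs[d]), reverse=True)


# Hypothetical documents.
docs = {
    "a": (["volcano", "form", "lava"], 0, 0),
    "b": (["volcano", "mars"], 50, 0),
    "c": (["titanic", "film"], 5, 0),
}
print(rank(["volcano", "form"], docs))
```

In this toy data, document "b" outranks "a" even though it matches fewer query words, because many pages link to it. This is the kind of trade-off a ranking algorithm has to settle.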

7
Observation (1)
  • Web search engines do not understand the meaning
    of the information on each web page.
  • Example: for the topic Titanic, many documents
    are about the film Titanic, and one company
    uses the text "Why the Titanic Sank" to
    ironically sell kitchen and bathroom sinks.
  • Another example: searching for "how
    volcanoes are formed" brings back pages on
    volcanoes on the planet Mars.

8
Observation (2)
  • Uncertainty of Web search results
  • 1. The best results were obtained when the wording
    was changed and some other words were added. On most
    occasions the original phrase did not give an adequate
    search.
  • 2. No one type of search comes out best:
    sometimes basic searches work perfectly,
    sometimes the advanced features are required.

9
Observation (3)
  • Huge number of pages retrieved for a
    query
  • For "volcanoes formed", 287,000 results;
  • for "volcanoes made", 473,000 results; and for
    "describe volcanoes", 142,000 results.
  • What can we learn from this? Ranking is very
    important. Another small thing

10
Observation (4)
  • Short queries work better than long queries
  • Also note that most Web search engines
    limit the number of words they accept in a
    query. Google takes up to 10 words.
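
A word limit of this kind can be illustrated with a small helper that shows which words of a long query would be kept and which dropped. The limit of 10 is the figure quoted on this slide; the function name `truncate_query` and the assumption that extra words are simply cut from the end are illustrative only.

```python
MAX_QUERY_WORDS = 10  # figure quoted on the slide; limits differ between engines


def truncate_query(query, limit=MAX_QUERY_WORDS):
    """Split a query into the part within the limit and the words beyond it."""
    words = query.split()
    kept = " ".join(words[:limit])
    dropped = words[limit:]
    return kept, dropped


kept, dropped = truncate_query(
    "please describe in detail how volcanoes on the planet Mars are formed"
)
print(kept)
print(dropped)
```

Checking a query this way before submitting it shows whether the words that matter most come early enough to be used.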

11
Suggestions for a more effective search
  • Use words that are most likely to occur in the
    documents you are looking for.
  • If possible, avoid very common words,
    especially the so-called stopwords.
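
The second suggestion can be applied mechanically by stripping very common words from a query before submitting it. The stop-word set below is an assumption made for illustration (the slides name only a, an, the, to, there), and `simplify_query` is a hypothetical helper.

```python
# Assumed stop-word list; extend it to suit the language of the query.
STOPWORDS = {"a", "an", "the", "to", "there", "of", "is", "are", "how"}


def simplify_query(query):
    """Drop stop words; fall back to the original query if nothing is left."""
    kept = [w for w in query.lower().split() if w not in STOPWORDS]
    return " ".join(kept) if kept else query


print(simplify_query("How are the volcanoes formed"))
```

The fallback matters for queries made up entirely of common words, where removing every word would leave nothing to search for.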

12
Suggestions for a more effective search
  • In some situations, we cannot find the wanted
    documents in one step. Then further action is
    needed.
  • Do some research to find out the reason:
  • 1. The result set is too big and the wanted
    documents are not top-ranked.
  • 2. The result set does not seem to include
    the wanted document.
  • For situation 1, refine the query to be more
    selective (e.g., add more words, use exact
    search, etc.) to promote the wanted documents to
    the top positions.
  • For situation 2, use some
    different words for a new search, or try another
    Web search service if you think the wanted
    documents may not be indexed by a
    particular Web search engine.
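
The decision procedure above can be written down as a small function. It assumes, for illustration only, that the full ranked result list is available and that the wanted document can be recognised by an id; the name `diagnose`, the `top_k` cut-off and the returned labels are all hypothetical.

```python
def diagnose(results, wanted, top_k=10):
    """Suggest the next action for a ranked result list and a wanted document."""
    if wanted in results[:top_k]:
        return "found"
    if wanted in results:
        # Situation 1: retrieved but ranked too low, so make the query more selective.
        return "refine"
    # Situation 2: not retrieved, so try different words or another search service.
    return "rephrase"


print(diagnose(["d1", "d2", "d3"], "d3", top_k=2))
```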