Searching the Internet - PowerPoint PPT Presentation

Provided by: businessco7

1
Searching the Internet
  • Shamrock Rovers

2
Overview
  • Directories
  • Open Directory Project
  • Search Engines
  • Meta-Search Engines
  • Search Techniques
  • Intelligent Agents
  • Invisible Web

3
Introduction
  • The Internet has an endless amount of
    information.
  • Two main tools for searching: Search Engines and
    Directories.
  • Search Engines are the primary tool we use to
    search the Internet.

4
Popularity of Search Engines
  • Web Directories are not as popular as search
    engines.
  • Survey by Nielsen Mega Service Reporting Service

5
Definitions
  • A Web Directory is compiled by human endeavour,
    with a team of editors creating a hierarchical
    structure based on the subject matter.
  • A search engine is a database compiled by robots,
    which are automated agents that crawl the web
    continuously by means of hyperlinks and send
    pages back to the search engine for indexing by
    other automated agents.

6
Web Directories
  • A web directory is a web service that
    categorizes web pages so that you can browse
    links to web pages by topic.
  • Popular web directories include Yahoo!
    (www.Yahoo.com), Looksmart (looksmart.com) and
    the Open Directory (www.dmoz.com).

7
Search Strategies for Search Engines
  • A basic strategy for searching the web:
  • Start your Web browser, such as Netscape or
    Internet Explorer.
  • Pick a directory or index you like and tell your
    browser to go to the index or directory's home
    page.
  • If a Search box is available, type some likely
    keywords in the box and click Search.
  • If you see a list of links to topic areas, click
    a topic area of interest.
  • Adjust and repeat your search until you find
    something you like.

8
Open Directory Project
  • Not only staff but also users participate.
  • It's based on volunteer work.
  • Advantages
  • Economical.
  • Closer to users.
  • Disadvantages
  • Can introduce annoying rubbish such as ads.
  • Difficult to maintain.

9
Web Crawlers ( Spider )
  • Web-specific software agents.
  • Retrieve web pages and store them for search
    engines.
  • Revisit and update stored pages.
  • Follow links on pages to find new sites.
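
The crawling behaviour listed above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory "web" (the `WEB` dictionary and the `*.example` names are invented for illustration) rather than real HTTP fetching, and it shows only the link-following step, not revisiting or updating stored pages.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
# A real crawler would fetch pages over HTTP instead.
WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": ["a.example"],
    "d.example": [],
}

def crawl(start, web):
    """Breadth-first crawl: visit each page once, following links to find new ones."""
    stored = []          # pages handed over for indexing, in visit order
    seen = {start}       # pages already queued, so no page is fetched twice
    queue = deque([start])
    while queue:
        url = queue.popleft()
        stored.append(url)
        for link in web.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return stored

pages = crawl("a.example", WEB)
print(pages)
```

Breadth-first order is only one possible design choice; real crawlers typically add politeness delays, revisit schedules and URL prioritisation on top of this loop.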

10
Google
  • Features
  • Simplistic
  • Intuitive
  • Multinational
  • Efficient
  • Intelligent cache and page info
  • Proximity search
  • Uses a custom bot called Googlebot.

11
Yahoo
  • Features
  • Lots of easy access on front page
  • Large corporate sponsor driven
  • Proximity searching

12
Alta Vista
  • Features
  • Medium access index page
  • Large selection of search options
  • Stores cache of entire page

13
Ask Jeeves
  • Features
  • Friendly interface
  • Humorous and inviting
  • Clean interface

14
Metasearch Engines
  • This engine searches search engines.
  • Databases not needed.
  • More reliable.
  • e.g. www.dogpile.com
  • Convenient for itself,
  • but inconvenient for other search engines!
  • It's robbery!
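
As a rough illustration of how a metasearch engine can work without a database of its own, the sketch below forwards a query to several engines and merges their ranked results. The two engine functions are hypothetical stand-ins that return fixed lists, and scoring by summed reciprocal rank is an assumption made for this example, not a description of how www.dogpile.com merges results.

```python
def engine_one(query):
    # Hypothetical stand-in for a real search engine; returns ranked URLs.
    return ["x.example", "y.example", "z.example"]

def engine_two(query):
    # Second hypothetical engine with a different ranking.
    return ["y.example", "w.example", "x.example"]

def metasearch(query, engines):
    """Query every engine and merge results, scoring each URL by summed 1/rank."""
    scores = {}
    for engine in engines:
        for rank, url in enumerate(engine(query), start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest combined score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda u: (-scores[u], u))

results = metasearch("dogs", [engine_one, engine_two])
print(results)
```

Note that the sketch stores no index at all: every answer comes from the underlying engines, which is the point the slide makes.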

15
Search Techniques
  • There are a variety of search engines but the
    main principles behind them are the same.
  • Search strings are treated as key words.
  • Words such as A, IN, THE, AND, FOR can be omitted
    from the search string.
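
A small sketch of treating a search string as keywords and dropping common words. The stop-word set contains only the five words named on the slide; real engines use their own, longer lists, and the function name `keywords` is invented for this example.

```python
STOP_WORDS = {"a", "in", "the", "and", "for"}  # the words listed on the slide

def keywords(search_string):
    """Split a search string into lowercase keywords, dropping stop words."""
    return [w for w in search_string.lower().split() if w not in STOP_WORDS]

print(keywords("The best food for dogs in Dublin"))
```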

16
Exact Phrase Search
  • Most basic search.
  • Usually returns pages that match any word in the
    search string.
  • Can also return the exact phrase searched for
    although it is rare to get one.
  • For a successful search, search for any word then
    refine the search.
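
The difference between matching any word and matching the exact phrase can be sketched as follows. The three pages and both function names are made up for illustration, and matching is done on whitespace-separated lowercase tokens, which is a simplification.

```python
# Hypothetical page texts, already lowercased.
PAGES = {
    "p1": "shamrock rovers football club",
    "p2": "rovers return inn",
    "p3": "four leaf shamrock",
}

def match_any(query, pages):
    """Return ids of pages containing at least one word of the query."""
    words = set(query.lower().split())
    return sorted(pid for pid, text in pages.items() if words & set(text.split()))

def match_phrase(query, pages):
    """Return ids of pages containing the query words consecutively, in order."""
    phrase = query.lower().split()
    n = len(phrase)
    hits = []
    for pid, text in pages.items():
        tokens = text.split()
        if any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1)):
            hits.append(pid)
    return sorted(hits)

print(match_any("Shamrock Rovers", PAGES))
print(match_phrase("Shamrock Rovers", PAGES))
```

The any-word search returns every page here, while the phrase search narrows the result to a single page, which is why refining a broad search is suggested above.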

17
Boolean Search
  • Allows the user control over how the engine uses
    the words in a search string.
  • There are 4 operators: AND, OR, NOT, NEAR.
  • AND requires all the words in the search string
    to be present.
  • OR requires at least one of the words to be
    present.
  • NOT is the negation operator, e.g. dogs AND breed
    NOT food.
  • NEAR requires the words to appear close to each
    other.
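
One common way to evaluate Boolean queries is with set operations over an inverted index. The sketch below is a minimal illustration under that assumption, using four invented documents; it covers AND, OR and NOT only, since NEAR would also need word positions.

```python
# Hypothetical documents, keyed by id.
DOCS = {
    1: "dogs breed guide",
    2: "dogs breed food",
    3: "cats breed guide",
    4: "dogs food",
}

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(word, set()).add(doc_id)
    return index

INDEX = build_index(DOCS)

def lookup(word):
    """Documents containing the word (empty set if the word is unknown)."""
    return INDEX.get(word, set())

# dogs AND breed NOT food: intersection, then set difference.
result = (lookup("dogs") & lookup("breed")) - lookup("food")
# dogs OR cats: union.
either = lookup("dogs") | lookup("cats")

print(result)
print(either)
```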

18
Other Searches
  • Other searches provide more control to the user.
  • These searches include Title, Site and URL
    searches.
  • These searches are largely self-explanatory. For
    example, a Title search searches the titles of
    websites.
  • They use special commands typed into the search
    engine.
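
A sketch of how such special commands might be handled. The `field:term` syntax, the page records and the `field_search` function are all assumptions made for this example; actual command syntax differs between search engines.

```python
from urllib.parse import urlparse

# Hypothetical indexed pages.
PAGES = [
    {"title": "Dog Breeds", "url": "http://pets.example/dogs/breeds"},
    {"title": "Cat Food", "url": "http://pets.example/cats/food"},
    {"title": "Dog Food Reviews", "url": "http://shop.example/dog-food"},
]

def field_search(command, pages):
    """Handle an assumed 'field:term' command, where field is title, site or url."""
    field, _, term = command.partition(":")
    term = term.lower()
    if field == "title":
        # Title search: term must appear in the page title.
        return [p for p in pages if term in p["title"].lower()]
    if field == "site":
        # Site search: page must be hosted on exactly this host name.
        return [p for p in pages if urlparse(p["url"]).hostname == term]
    if field == "url":
        # URL search: term must appear anywhere in the address.
        return [p for p in pages if term in p["url"].lower()]
    raise ValueError(f"unknown field: {field}")

for page in field_search("title:dog", PAGES):
    print(page["title"])
```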

19
Information Retrieval
  • Searching the web and online databases is viewed
    as an information retrieval problem in Computer
    Science.
  • Three retrieval methods exist:
  • Statistical
  • Semantic
  • Contextual

20
Statistical and Semantic Retrieval
  • Statistical
  • This method emphasizes correlations of word
    counts in documents and document collections
  • Semantic
  • This method uses natural-language processing and
    Artificial Intelligence to process search
    requests from the user.
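
As one example of the statistical approach, the sketch below scores documents with TF-IDF, a standard weighting that combines word counts within a document with counts across the collection. TF-IDF is chosen here for illustration; the slides do not name a specific formula, and the documents are invented.

```python
import math
from collections import Counter

# Hypothetical document collection.
DOCS = {
    "d1": "dogs breed dogs food",
    "d2": "cats breed guide",
    "d3": "dogs guide",
}

def tf_idf_scores(query, docs):
    """Score each document as the sum of tf * idf over the query words."""
    tokenized = {d: text.split() for d, text in docs.items()}
    n_docs = len(tokenized)
    scores = {}
    for doc_id, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for word in query.split():
            # Document frequency: how many documents contain the word.
            df = sum(1 for t in tokenized.values() if word in t)
            if df == 0:
                continue  # word appears nowhere, contributes nothing
            tf = counts[word] / len(tokens)   # frequency within this document
            idf = math.log(n_docs / df)       # rarity across the collection
            score += tf * idf
        scores[doc_id] = score
    return scores

scores = tf_idf_scores("dogs food", DOCS)
ranking = sorted(scores, key=lambda d: -scores[d])
print(ranking)
```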

21
Contextual Retrieval
  • This method takes advantage of the structural
    and contextual inferences in documents by use of
    a thesaurus and encoded relationships among words.
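
The use of a thesaurus can be sketched as query expansion: each search word is widened to include its related words before matching. The tiny thesaurus, the pages and the function names below are invented for illustration, and this covers only the thesaurus part of contextual retrieval, not document structure.

```python
# Hypothetical thesaurus: each word maps to related words.
THESAURUS = {
    "dog": {"canine", "hound"},
    "food": {"feed"},
}

# Hypothetical page texts.
PAGES = {"p1": "hound training tips", "p2": "cat toys", "p3": "dog food"}

def expand_query(query, thesaurus):
    """Return the query words plus every related word from the thesaurus."""
    expanded = set()
    for word in query.lower().split():
        expanded.add(word)
        expanded |= thesaurus.get(word, set())
    return expanded

def search(query, pages, thesaurus):
    """Return ids of pages containing any word of the expanded query."""
    terms = expand_query(query, thesaurus)
    return sorted(pid for pid, text in pages.items()
                  if terms & set(text.lower().split()))

print(search("dog", PAGES, THESAURUS))
```

Without expansion, a search for "dog" would miss the page that only mentions "hound".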

22
Intelligent Agents / Bots
  • Computer Program which gathers information based
    on human input
  • Search engines deploy them
  • Collect and Share information
  • Can also be used alone, as a standalone searching
    tool deployed whenever the user needs
    information.

23
Advantages of Bots
  • They can communicate and co-operate with other
    agents to perform user-tasks quicker.
  • A user's bot resides on the user's computer and
    waits for service orders, day or night.
  • A user can customise intelligent agents and
    adjust them to fit their preferences and wishes.
  • Intelligent agents are able to continuously scan
    the Internet for information.

24
Invisible Web
  • The Visible Web is what we see from the results
    of search engines.
  • The Invisible Web (or Deep Web) is the hidden Web
    content that has been excluded from search engine
    results.

25
Searching the Invisible Web
  • Because of different formats, searching has become
    problematic.
  • Search engines find Web pages by following links
    and can therefore display special databases that
    do not require a password.

26
Searching the Invisible Web (cont)
  • Some websites have a directory which includes
    content not found by search engines (Invisible
    Web Catalog).
  • Invisible Databases allow users to search for
    invisible content. Some sites include:
  • www.internet.com
  • http://incywincy.com
  • http://dir.lycos.com/reference

27
Conclusion
  • Understanding of Directories, Open Directory
    Project and Search Engines.
  • Knowledge of Search and Meta-search Engines.
  • Understanding of searching techniques.
  • Be able to use intelligent agents to search.
  • Understand and be able to search the invisible
    Web.