A categorized list of top websites for Internet exploration

INTRODUCTION: Search Engines - What are they?
How do they work? How about some tips?
CHOICES: What's the difference between
searches, metasearches and other online
databases?
RESOURCES: A categorized list of top websites for
Internet exploration
The Web is potentially a terrific place to get
information on almost any topic. Doing research
without leaving your desk sounds like a great
idea, but all too often you end up wasting
precious time chasing down useless URLs. Almost
everyone agrees that there's got to be a better
way. But for now we're stuck with making the
best use of the search tools that already exist
on the Web.
The main devices we use for finding topics on
the Internet are search engines. Search engines
use software robots to survey the Web and build
their databases. Web documents are retrieved and
indexed. The primary method of searching is by
inputting a keyword. Most search engines now
index every word on every page. If a search
engine finds your keyword on a webpage, it puts
that webpage on your results list.
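The indexing idea described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular search engine is built: the page addresses and texts in PAGES are invented, and the helper names build_index and search are hypothetical.

```python
from collections import defaultdict

# Hypothetical mini "Web": page address -> page text (invented for illustration).
PAGES = {
    "example.org/cider": "hard cider is made from apples",
    "example.org/disk": "the hard drive stores your files",
    "example.org/cats": "cats sleep most of the day",
}

def build_index(pages):
    """Index every word on every page: word -> set of pages containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, keyword):
    """Return the pages whose text contains the keyword, in sorted order."""
    return sorted(index.get(keyword.lower(), set()))

INDEX = build_index(PAGES)
print(search(INDEX, "hard"))  # both the cider page and the disk page match
```

Note that the keyword "hard" matches two unrelated pages; the index only records that a word appears, not what it means.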
The Problem With Keyword Searching
Keyword searches have a tough time distinguishing between
words that are spelled the same way, but mean
something different (e.g., hard cider, a hard
stone, a hard exam, and the hard drive on your
computer). This often results in hits that are
completely irrelevant to your query. Some
search engines also have trouble with so-called
stemming, i.e., if you enter the word "big,"
should they return a hit on the word "bigger"?
What about singular and plural words? What
about verb tenses that differ from the word you
entered by only an "s" or an "ed"? Search
engines also cannot return hits on keywords that
mean the same, but are not actually entered in
your query. A query on heart disease would not
return a document that used the word "cardiac"
instead of "heart."
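To make the stemming and synonym problems concrete, here is a rough Python sketch. Everything in it is an assumption made for illustration: the suffix list, the tiny SYNONYMS table, and the function names naive_stem, expand and matches are all hypothetical, and real engines use far more careful language processing.

```python
# Hypothetical synonym table, invented for illustration only.
SYNONYMS = {"heart": {"cardiac"}}

def naive_stem(word):
    """Strip one common suffix; a deliberately crude stand-in for real stemming."""
    word = word.lower()
    for suffix in ("ed", "er", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            # Collapse a doubled final consonant left behind, e.g. "bigg" -> "big".
            if word[-1] == word[-2] and word[-1] not in "aeiou":
                word = word[:-1]
            break
    return word

def expand(keyword):
    """Return the stems a keyword should match, including listed synonyms."""
    terms = {keyword.lower()} | SYNONYMS.get(keyword.lower(), set())
    return {naive_stem(term) for term in terms}

def matches(keyword, text):
    """True if any word in the text shares a stem with the expanded keyword."""
    text_stems = {naive_stem(word) for word in text.split()}
    return bool(expand(keyword) & text_stems)

print("big" in "a bigger house".split())                 # exact keyword match fails
print(matches("big", "a bigger house"))                  # stemmed match succeeds
print(matches("heart", "cardiac disease risk factors"))  # synonym match succeeds
```

A plain keyword engine behaves like the first line; the other two show what it would take to catch "bigger" and "cardiac".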
Refining Your Search: Advanced Searches
Most sites offer two different types of
searches--"basic" and "advanced." In a "basic"
search, you just enter a keyword without sifting
through any pulldown menus of additional
options. Advanced search refining options
differ from one search engine to another, but
some of the possibilities include the ability to
search on more than one word, to give more
weight to one search term than you give to
another, and to exclude words that might be
likely to muddy the results. You might also be
able to search on proper names, on phrases, and
on words that are found within a certain
proximity to other search terms.
Refining Your Search: Boolean Logic
Boolean logic refers to the logical relationship among
search terms, and is named for the British
mathematician George Boole. Boolean logic
consists of three logical operators: OR, AND, and
NOT. Each operator can be visually described by
using Venn diagrams, as shown on the following
pages. NOTE: Not all search engines permit
Boolean searches; see the Search Engine
Feature Chart.
  • college OR university
  • Query: I would like information about college.
  • In this search, we will retrieve records in which
    AT LEAST ONE of the search terms is present. We
    are searching on the terms college and also
    university since documents containing either of
    these words might be relevant.
  • This is illustrated by:
  • the shaded circle with the word college
    representing all the records that contain the
    word "college"
  • the shaded circle with the word university
    representing all the records that contain the
    word "university"
  • the shaded overlap area representing all the
    records that contain both "college" and
    "university"
  • OR logic is most commonly used to search for
    synonymous terms or concepts.
  • Here is an example of how OR logic works:
      Search terms             Results
      college                  17,320,770
      university               33,685,205
      college OR university    33,702,660

The more terms or concepts we combine in a search
with OR logic, the more records we will retrieve.
For example:

    Search terms                        Results
    college                             17,320,770
    university                          33,685,205
    college OR university               33,702,660
    college OR university OR campus     33,703,082
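OR logic behaves like a set union, which a short Python sketch can show. The record IDs in RECORDS are invented for illustration and search_or is a hypothetical helper; only the final bit of arithmetic uses the counts quoted above, taken at face value.

```python
# Hypothetical record IDs for each search term (invented for illustration).
RECORDS = {
    "college": {1, 2, 3},
    "university": {3, 4, 5, 6},
    "campus": {6, 7},
}

def search_or(*terms):
    """Return every record that contains AT LEAST ONE of the terms."""
    found = set()
    for term in terms:
        found |= RECORDS.get(term, set())  # set union
    return found

for query in (["college"],
              ["college", "university"],
              ["college", "university", "campus"]):
    print(" OR ".join(query), "->", len(search_or(*query)), "records")

# If the quoted counts are exact, the overlap (records containing both words)
# follows from: |A and B| = |A| + |B| - |A or B|.
implied_overlap = 17_320_770 + 33_685_205 - 33_702_660
print(implied_overlap)
```

Each extra OR term can only add records, never remove them, which is why the totals keep growing.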
  • poverty AND crime
  • Query: I'm interested in the relationship between
    poverty and crime.
  • In this search, we retrieve records in which BOTH
    of the search terms are present
  • This is illustrated by the shaded area
    overlapping the two circles representing all the
    records that contain both the word "poverty" and
    the word "crime"
  • Notice how we do not retrieve any records with
    only "poverty" or only "crime"
  • Here is an example of how AND logic works:
      Search terms         Results
      poverty              783,447
      crime                2,962,165
      poverty AND crime    1,677
  • NOTE: On some search engines you must type in the
    symbol + rather than AND

The more terms or concepts we combine in a search
with AND logic, the fewer records we will
retrieve. For example:

    Search terms                     Results
    poverty                          783,447
    crime                            2,962,165
    poverty AND crime                1,677
    poverty AND crime AND gender     76

A few Internet search engines make use of the
proximity operator NEAR. A proximity operator
determines the closeness of terms within a source
document. NEAR is a restrictive AND. The
closeness of the search terms is determined by
the particular search engine. For example, NEAR
in AltaVista (Power Search) means within 10 words.
As another example, Google applies proximity
searching by default.
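The difference between AND and NEAR can be sketched as follows. This is an assumption-laden illustration: the function names contains_all and near are hypothetical, the sample texts are invented, and counting "closeness" as the difference in word positions is just one plausible reading of a 10-word limit.

```python
def contains_all(text, terms):
    """AND: every term must appear somewhere in the text."""
    words = text.lower().split()
    return all(term in words for term in terms)

def near(text, first, second, max_distance=10):
    """NEAR: both terms must appear, at most max_distance word positions apart."""
    words = text.lower().split()
    first_positions = [i for i, word in enumerate(words) if word == first]
    second_positions = [i for i, word in enumerate(words) if word == second]
    return any(abs(i - j) <= max_distance
               for i in first_positions
               for j in second_positions)

close = "poverty is often linked to crime in cities"
far = "poverty " + "filler " * 20 + "crime"  # the two terms are 21 positions apart

print(contains_all(close, ["poverty", "crime"]), near(close, "poverty", "crime"))
print(contains_all(far, ["poverty", "crime"]), near(far, "poverty", "crime"))
```

Both texts satisfy AND, but only the first satisfies NEAR, which is what "a restrictive AND" means in practice.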
  • cats NOT dogs
  • Query: I want to see information about cats, but
    I want to avoid seeing anything about dogs.
  • In this search, we retrieve records in which ONLY
    ONE of the terms is present: "cats" but not "dogs"
  • This is illustrated by the shaded area with the
    word cats representing all the records containing
    the word "cats"
  • No records are retrieved in which the word "dogs"
    appears, even if the word "cats" appears there
  • Here is an example of how NOT logic works:
      Search terms     Results
      cats             3,651,252
      dogs             4,556,515
      cats NOT dogs    81,497
  • NOT logic excludes records from your search
    results. Be careful when you use NOT: the term
    you do want may be present in an important way in
    documents that also contain the word you wish to
    avoid.
  • NOTE: On some search engines you must type in the
    symbol - rather than NOT
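NOT logic corresponds to a set difference. The sketch below uses three invented documents (DOCS) and a hypothetical helper docs_with to show both the basic behaviour and the risk mentioned above.

```python
# Hypothetical documents, invented for illustration.
DOCS = {
    "a": "cats make quiet indoor pets",
    "b": "dogs need daily walks",
    "c": "a guide to cats for people who usually prefer dogs",
}

def docs_with(term):
    """Return the IDs of all documents whose text contains the term."""
    return {doc_id for doc_id, text in DOCS.items() if term in text.split()}

cats_not_dogs = docs_with("cats") - docs_with("dogs")  # set difference
print(sorted(cats_not_dogs))
```

Document "c" is mostly about cats, yet it is dropped because it also mentions dogs. That is the kind of useful record NOT can silently exclude.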

How do Search Engines Work?
Search engines for the general web do not really
search the World Wide Web directly. Each one
searches a database of the full text of web pages
selected from the billions of web pages out there
residing on servers. When you search the web using
a search engine, you are always searching a
somewhat stale copy of the real web page. When you
click on links provided in a search engine's
search results, you retrieve from the server the
current version of the page.

Search engine databases are selected and built by
computer robot programs called spiders. Although
it is said they "crawl" the web in their hunt for
pages to include, in truth they stay in one place.
They find the pages for potential inclusion by
following the links in the pages they already have
in their database (i.e., already "know about").
They cannot think or type a URL or use judgment to
"decide" to go look something up and see what's on
the web about it. (Computers are getting more
sophisticated all the time, but they are still
brainless.)

If a web page is never linked to in any other
page, search engine spiders cannot find it. The
only way a brand new page - one that no other page
has ever linked to - can get into a search engine
is for its URL to be sent by some human to the
search engine companies as a request that the new
page be included. All search engine companies
offer ways to do this.

After spiders find pages, they pass them on to
another computer program for "indexing." This
program identifies the text, links, and other
content in the page and stores it in the search
engine database's files so that the database can
be searched by keyword and whatever more advanced
approaches are offered, and the page will be found
if your search matches its content.

Some types of pages and links are excluded from
most search engines by policy. Others are excluded
because search engine spiders cannot access them.
Pages that are excluded are referred to as the
Invisible Web -- what you don't see in search
engine results. The Invisible Web is estimated to
be two to three or more times bigger than the
visible web.
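The way a spider discovers pages only by following links can be sketched with a tiny in-memory link graph. The page names in LINKS are invented and crawl is a hypothetical helper; a real spider fetches pages over the network, which this sketch deliberately leaves out.

```python
from collections import deque

# Hypothetical link graph: page -> pages it links to (invented for illustration).
LINKS = {
    "home": ["news", "about"],
    "news": ["home", "archive"],
    "about": [],
    "archive": ["news"],
    "orphan": ["home"],  # links out, but no other page links to it
}

def crawl(known_pages):
    """Follow links outward from the pages the spider already knows about."""
    seen = set(known_pages)
    queue = deque(known_pages)
    while queue:
        page = queue.popleft()
        for target in LINKS.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

print(sorted(crawl(["home"])))            # "orphan" is never discovered
print(sorted(crawl(["home", "orphan"])))  # found once its URL is submitted
```

The second call stands in for a human submitting the new page's URL to the search engine.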
  • What Are "Meta-Search" Engines?
  • In a meta-search engine, you submit keywords in
    its search box, and it transmits your search
    simultaneously to several individual search
    engines and their databases of web pages. Within
    a few seconds, you get back results from all the
    search engines queried. Meta-search engines do
    not own a database of Web pages; they send your
    search terms to the databases maintained for
    other search engines.
  • What's WRONG with relying on Meta-Searchers? The
    idea of meta-searching is much better than the
    reality. You would think you would save a lot of
    time by searching only in one place and sparing
    the need to use and learn several separate search
    engines. In fact, that is what people claim. But,
    in truth, meta-searchers offer a quick and dirty
    approach to searching that sometimes works. Take
    a look at these drawbacks:
  • None of them searches Google (unless they pay)
    or Northern Light (ever). Google is the BEST
    search engine database and Northern Light is very
    important in academic research.
  • Most of them dumbly pass your search terms on,
    without any concern for what happens to your
    carefully placed " " or AND, OR or AND NOT, let
    alone your NEAR or your + or -. (Ixquick and
    ProFusion handle complex searches intelligently.)
  • If your search does not get what you want, you
    do not have the ability to refine your search as
    you can in what I consider the most powerful search
    engines around (Google, AltaVista, Northern Light,
    HotBot). All you can do is add a term and
    wonder where the meta-search engine is sending
    it.
  • None of the meta-search engines consistently
    queries all of the search engines it claims to
    query, and you don't know for sure what it is
    querying until you read the results. If you use
    ProFusion's advanced search, you have the best
    control available.
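The basic meta-search idea, passing one query to several engines and merging what comes back, might look like this in Python. The two "engines" here are hypothetical stand-in functions returning invented results, and the simple loop stands in for what a real meta-searcher does simultaneously over the network.

```python
# Hypothetical engines: each takes a query string and returns a result list.
ENGINES = {
    "EngineA": lambda query: [f"a.example/{query}", "shared.example/page"],
    "EngineB": lambda query: ["shared.example/page", f"b.example/{query}"],
}

def meta_search(query):
    """Send the same query to every engine and merge results without duplicates."""
    merged = []
    for name, engine in ENGINES.items():
        for url in engine(query):  # the query is passed on unchanged
            if url not in merged:
                merged.append(url)
    return merged

print(meta_search("cats"))
```

Note that meta_search owns no database of its own, and that it forwards the query text as-is, which is exactly where operators such as AND or NEAR can get lost if an engine does not understand them.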

This and the preceding page were reprinted with
permission from http://www.lib.berkeley.edu/Teach
What are Online Databases?
A database is any
collection of data organized for storage in a
computer's memory. It is designed for easy
access by authorized users. The data may be in
the form of text, numbers, or encoded graphics.
Online databases allow users to search for
specific information within a web site or to
access an archived database of information.
Online databases are usually content specific.
They can include encyclopedia and other library
reference information, journal articles,
collections of art, clip art or literature,
etc. Searching an online database sometimes makes
a lot of sense! If you wanted to find a specific
map, for instance, you would come across one
faster by searching an online atlas database than
by using any search engine. There are several
links included in the RESOURCE chapter to clip
art and library reference material. If you would
like to see what other types of online databases
exist, I recommend this University of Richmond
Search Engines
Kids Engines
Online Reference
Clip Art Sites
Google is a very powerful search engine for
everyday use. It also has a special section
that allows you to search just for images.
HotBot's greatest benefit is its advanced search
feature. A template helps you create a Boolean
or phrase search, or limit by media type, date,
domain, etc.
AltaVista is one of the Internet's largest search
engines. AltaVista includes little words (such
as a, to, be, not) in the search so that you may
search on often-ignored words in a phrase (e.g.,
"Vitamin A" or "to be or not to be").
All The Web allows users to search specific
databases in its system: web pages, pictures,
videos, and mp3 files. Its advanced
search feature contains modifiable word filters.
Each engine that Ixquick queries has its unique
strengths and vulnerabilities, so some but not
all of each engine's top choices are likely to be
relevant for you. Because engines have different
vulnerabilities, irrelevant sites are unlikely to
be prominently selected by multiple engines. When
engines agree that a site is tops, and have
reached that decision in different ways, the site
is likely to be relevant.
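The "agreement" idea in the Ixquick description can be sketched as a simple vote count. This is not Ixquick's actual method, which the text does not spell out; the engine names, site names and the rank_by_agreement helper are all hypothetical.

```python
from collections import Counter

# Hypothetical top results from three engines (invented for illustration).
TOP_RESULTS = {
    "EngineA": ["site1", "site2", "site3"],
    "EngineB": ["site2", "site4", "site1"],
    "EngineC": ["site2", "site5", "site6"],
}

def rank_by_agreement(top_results):
    """Order sites by how many engines placed them among their top choices."""
    votes = Counter()
    for results in top_results.values():
        votes.update(set(results))  # one vote per engine per site
    # Most votes first; ties broken alphabetically for a stable order.
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))

for site, count in rank_by_agreement(TOP_RESULTS):
    print(site, count)
```

A site chosen by all three engines rises to the top, while sites picked by only one engine sink, on the assumption that engines rarely make the same mistake.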
The New ProFusion has over 1000 sources divided
into 200 search groups. You can search hundreds
of sites for targeted content in one query! As
well as providing access to the Web's most
authoritative sources, ProFusion lets you search
more than 500 sources from the Invisible Web, a
vast resource of information neglected by
traditional search engines.
Dogpile, the most used and popular metasearch
engine, utilizes the Web's best search engines
simultaneously, returning the most comprehensive
search results from across the Internet. Dogpile
combs the most popular search engines to compile
a complete list of matches for a user's query and
organizes the results by each individually
searched resource.
Searchalot searches over 69 search engines at
the same time. It saves you time and really does
find what you are searching for! Searchalot
even has a global search with over 237
international search engines.
A searchable online encyclopedia
Another searchable online encyclopedia
A searchable database of all commonly found
reference books including the Columbia
Encyclopedia, Roget's, Bartlett's and more!
An extensive collection of links to general and
subject specific reference resources.
ILOR is a search engine which provides very
useful ways of handling search results. You can
manage search lists and control views in
unique ways.
Ask Jeeves is a search engine that can generally
interpret natural-language questions.
You can therefore ask it questions and it will
answer them with appropriate search results.
Type in a word and find its rhymes, synonyms,
antonyms, definition, related words, similar
sounding words, homophones, or similarly spelled
words.
An engine that categorizes results into folders
by concept, document and site type. This list
(on the left side of the screen) helps to find
the type of information you really need!
