LIS618 lecture 1

1
LIS618 lecture 1
  • Thomas Krichel
  • 2004-02-01

2
structure of talk
  • Recap on Boolean (aurally)
  • Before online searching
  • Working with DIALOG
  • Overview
  • Search command
  • Boolean exercise (on the fly)

3
before a search I
  • What is the purpose of the query?
  • brief overview
  • comprehensive search
  • What perspective on the topic is required?
  • scholarly
  • technical
  • business
  • popular

4
before search II
  • What type of information does the patron want?
  • fulltext
  • bibliographic
  • directory
  • numeric
  • Are there any known sources?
  • authors
  • journals
  • papers
  • conferences

5
before search III
  • What are the language restrictions?
  • What, if any, are the cost restrictions?
  • How current does the data need to be?
  • How much of each record is required?

6
concept analysis
  • This is the art/science of taking the topic to
    search for and developing facets. Example: Internet
    filtering in libraries
  • Internet filter
  • Libraries
  • Controversy, not technical issues
  • We may also need to think about the aim of the
    search.

7
search aims
  • a known needle in a known haystack
  • a known needle in an unknown haystack
  • an unknown needle in an unknown haystack
  • any needle in a haystack
  • the sharpest needle in a haystack
  • most of the sharpest needles in a haystack

8
search aims
  • all the needles in a haystack
  • affirmation of no needles in a haystack
  • things like needles in a haystack
  • is there a new needle in the haystack
  • where are the haystacks
  • needles, haystacks, anything

9
types of searches
  • known-item searches
  • negative searches
  • selective dissemination of information
  • topical or subject searches
  • passage searching, where the user is only
    interested in part of the item

10
search strategies I
  • Building block approach
  • Do a number of elementary searches
  • Combine the resulting sets with Boolean operators
  • This is what I did in the example in the previous
    lecture
  • Works only with the Boolean model
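
The building block approach can be sketched with ordinary sets. This is only an illustration in Python, not DIALOG itself: the toy index, the record numbers and the helper name select are all made up for the example.

```python
# Toy inverted index: term -> set of record numbers (all data is made up).
index = {
    "internet": {1, 2, 3, 5},
    "filter": {2, 3, 7},
    "library": {3, 5, 7, 8},
}


def select(term):
    """Elementary search: return the set of records containing the term."""
    return index.get(term, set())


# Step 1: do a number of elementary searches, one per building block.
s1 = select("internet")
s2 = select("filter")
s3 = select("library")

# Step 2: combine the resulting sets with Boolean operators.
result = s1 & s2 & s3   # internet AND filter AND library
either = s2 | s3        # filter OR library
without = s3 - s2       # library NOT filter

print(sorted(result), sorted(either), sorted(without))
```

Each elementary result is kept as its own set, so a block can be reused in several combinations without searching again.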

11
search strategies II
  • Snowballing approach
  • Start with a very specific query
  • Think of other terms that can be added to get more
    results
  • Stop when a reasonable number of results is
    reached.
  • Not sure this really works well in practice.

12
search strategies III
  • The successive fraction approach is the opposite
    of the snowballing approach
  • First search for a broad concept
  • Then repeat the query by adding various limiting
    factors.
  • Can work well if the IR system allows you to repeat
    and edit queries.
  • But queries can become unwieldy.

13
search strategies IV
  • Most specific facet first
  • Conduct concept analysis
  • Look for the most specific facet
  • Search that first, add others later
  • Presupposes that you have done a decent concept
    analysis.

14
two steps in DIALOG
  • step one: select databases (aka files) to look at
  • step two: perform searches on the selected
    databases
  • You may wonder why one does not have one single
    step like in a search engine. Discuss.
  • today we concentrate on the second step

15
working on selected files
  • We assume that we have selected a database that we
    know and we look at the search interface on the
    selected database.
  • The database selection process is a bit more
    complicated and is covered next week.
  • First, let us log in and look at the command
    prompt.
  • Then we select the first database (file) with the
    begin command

16
the begin command
  • As its name suggests, usually the first command.
  • "begin number, number, ..."
  • selects the files with the given numbers
  • Once they are selected they can be searched.
  • Now select ERIC with "begin 1"
  • "Begin 1" can be abbreviated as "b 1"

17
substeps in the second step
  • Identify search terms
  • Use Dialog basic commands to conduct a search
  • View records online or print the results

18
the 's' (select) command
  • Once we have issued the "begin" command to select a
    database, we issue the "s" command on the
    database.
  • "s query_expression" where query_expression is a
    query expression.
  • This will search the index of the selected database
    in full-text view for the query issued
  • It will not find any of the following: "an", "and",
    "by", "for", "from", "of", "the", "to", "with".
    They are stop words.
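
To see what stop words do to a query, here is a minimal Python sketch. The stop word list is the one on the slide. The function name searchable_terms is hypothetical, and the sketch only shows the idea; it makes no claim about how DIALOG handles stop words internally.

```python
# Stop words as listed on the slide; they cannot be found by a search.
STOP_WORDS = {"an", "and", "by", "for", "from", "of", "the", "to", "with"}


def searchable_terms(phrase):
    """Return the words of a phrase that are not stop words."""
    return [word for word in phrase.lower().split() if word not in STOP_WORDS]


print(searchable_terms("The history of the book"))
```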

19
query expression
  • A query expression contains search terms
    expressed in special ways
  • You can truncate search terms.
  • You can build an elementary expression by putting
    several keywords together. This is achieved by
    DIALOG's connectors.
  • You can combine several expressions with the use
    of Boolean operators
  • We will cover these in turn now.

20
truncation of terms I
  • Open Truncation
  • "select path?" retrieves all words that begin
    with path: paths, pathos, pathway, pathology
  • Controlled-Length Truncation
  • "select path??" retrieves the root and up to two
    additional characters: paths, pathos

21
truncation of terms II
  • Embedded Character truncation can be used for
    variant spellings
  • "select organi?ation" -> organization,
    organisation
  • "select fib??board" -> fiberboard, fibreboard
  • This truncation feature is also useful for
    searching for unusual plural forms
  • "select wom?n" -> woman, women
  • Apparently you can also do prefixes by putting
    the ? at the beginning.
  • "?mobile" -> automobile, metamobile
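
The truncation rules on the last two slides can be approximated with regular expressions. The Python sketch below follows the rules as stated on the slides: a single trailing ? is open, several trailing ? marks allow up to that many extra characters, an embedded ? stands for exactly one character, and a leading ? allows any prefix. The function names are made up, and this is an approximation for illustration, not DIALOG's own matching logic.

```python
import re


def truncation_to_regex(term):
    """Translate a truncated search term into a compiled regular expression."""
    prefix = ""
    suffix = ""
    body = term
    # A leading '?' allows any prefix.
    if body.startswith("?"):
        prefix = r"\w*"
        body = body.lstrip("?")
    # Count the trailing '?' characters and strip them off.
    stripped = body.rstrip("?")
    trailing = len(body) - len(stripped)
    if trailing == 1:
        suffix = r"\w*"                    # open truncation
    elif trailing > 1:
        suffix = r"\w{0,%d}" % trailing    # controlled-length truncation
    # Every remaining '?' is embedded and stands for exactly one character.
    middle = "".join(r"\w" if ch == "?" else re.escape(ch) for ch in stripped)
    return re.compile(prefix + middle + suffix, re.IGNORECASE)


def matches(term, word):
    """True if the whole word matches the truncated term."""
    return truncation_to_regex(term).fullmatch(word) is not None


for term, word in [("path?", "pathology"), ("path??", "pathway"),
                   ("organi?ation", "organisation"), ("?mobile", "automobile")]:
    print(term, word, matches(term, word))
```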

22
use of connectors
  • Connectors are used to put several words
    together.
  • One instance where this is useful is when you
    have words that on their own mean different
    things.
  • For example "mate" is a herbal beverage consumed
    in South America. Looking for mate on the
    Internet retrieves a lot of singles' pages.

23
example terms related to "mate"
  • What other terms could be used?
  • matear (drink mate)
  • matero (mate drinker)
  • cebar (prepare mate)
  • cebador (mate preparer)
  • yerba (mate herb)
  • bombilla (mate straw)

24
connectors I
  • '(W)' requires the terms to appear next to each
    other in the given order, e.g. 'yerba(W)mate?'
    matches "yerba mate".
  • '(i W)', where i is an integer, means followed by
    the second term with at most i words in between,
    e.g. 'ceba?(3W)mate?' matches "cebar un
    maravilloso mate" but not "cebador guapo mirando
    un buen mate"

25
connectors II
  • '(N)' requires the terms to be next to each other
    in either order, e.g. 'yerba(N)mate?' matches
    "yerba mate" or "mate yerba".
  • '(i N)', where i is an integer, means proximity
    within at most i words in either order, e.g.
    'ceba?(3N)mate?' matches "cebar mate" or "matear
    con la cebadora".
  • '(S)' searches for the occurrence of connected
    terms in the same paragraph.
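
The (W) and (N) connectors can be imitated with a small proximity check. This Python sketch is an illustration under simplifying assumptions: text is split on whitespace, punctuation is ignored, search terms are given in lower case, and only open truncation with a trailing ? is supported. The names term_matches and proximity are hypothetical.

```python
def term_matches(pattern, word):
    """A trailing '?' means open truncation; otherwise require equality."""
    if pattern.endswith("?"):
        return word.startswith(pattern[:-1])
    return word == pattern


def proximity(text, first, second, gap=0, ordered=True):
    """True if the terms occur with at most `gap` words between them.

    ordered=True imitates (W) and (nW); ordered=False imitates (N) and (nN).
    """
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if term_matches(first, w)]
    second_positions = [i for i, w in enumerate(words) if term_matches(second, w)]
    for i in first_positions:
        for j in second_positions:
            if i == j:
                continue          # the same word cannot serve as both terms
            if ordered and j < i:
                continue          # (W) requires the first term to come first
            if abs(j - i) - 1 <= gap:
                return True
    return False


# The examples from the two connector slides.
print(proximity("cebar un maravilloso mate", "ceba?", "mate?", gap=3))
print(proximity("cebador guapo mirando un buen mate", "ceba?", "mate?", gap=3))
print(proximity("matear con la cebadora", "ceba?", "mate?", gap=3, ordered=False))
```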

26
using Boolean operators
  • In your query, you can combine several
    expressions with Boolean operators
  • Example: "S LIBRARY(W)SCHOOL? AND
    DISTANCE(W)EDUCATION"
  • But I usually do not issue such fancy queries.

27
executing several searches
  • Several searches can be done sequentially,
    and the result sets are saved by the system.
  • Each time, the system assigns a set number, Si.
  • These can be combined in Boolean expressions,
    e.g. 's S1 or S2 and S3'
  • Remember that Boolean operations are
    set-theoretic!

28
Boolean operators on sets
  • When using Booleans, be aware that "and" has
    higher precedence than "or".
  • Thus
  • a or b and c
  • is not the same as
  • (a or b) and c
  • but it is
  • a or (b and c)
  • Use parentheses when in doubt
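
The precedence rule can be checked on small sets. The sketch below uses Python, where the set operator & (and) also binds more tightly than | (or). The three sets are made-up record numbers, not DIALOG output.

```python
# Three made-up result sets of record numbers.
a = {1, 2}
b = {2, 3}
c = {3, 4}

# Without parentheses, "and" (&) is evaluated before "or" (|).
unparenthesized = a | b & c
grouped_right = a | (b & c)
grouped_left = (a | b) & c

print(unparenthesized == grouped_right)  # True
print(unparenthesized == grouped_left)   # False
```

Here a or (b and c) gives records 1, 2 and 3, while (a or b) and c gives only record 3.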

29
DS (display sets)
  • This command can be executed any time to review
    the sets that have been formed since the last B
    (begin) command.
  • This can be useful to review your search history.

30
the target command
  • "target set" where set is a search result set
    creates a subset of the "statistically most
    relevant results" in the original set.
  • I have not seen details about how this subset is
    computed.
  • A new result set is formed.

31
display: the type command
  • type set/format/range
  • set is a result set
  • format is a format
  • range can be
  • start end
  • start is a record number to start
  • end is a record number to end
  • all

32
standard delivery formats
  • 2: full record except abstract
  • 3 or medium: citation
  • 5 or long: full except full text
  • 6 or free: title and dialog number
  • 8 or short: title plus indexing terms
  • useful to find other indexing terms
  • 9 or full: everything
  • KWIC or K: keywords in context

33
options for delivery
  • I once tried to email results to myself, to no avail
  • You can save the html of the search results in
    the browser.
  • You can print the results within the browser.

34
http://openlib.org/home/krichel
  • Thank you for your attention!

35
  • to do: set up consistent notation