SI 503 Search and Retrieval - PowerPoint PPT Presentation

Provided by: qip
1
SI 503 Search and Retrieval
  • Prof. George W. Furnas
  • Prof. Amy Warner
  • Qiping Zhang
  • Mark Handel

2
SI 503 Search and Retrieval: Outline for the Day
  • Welcome and brief intro to ourselves and the
    course
  • Mechanics of the course: syllabus, requirements,
    etc.
  • Exercise: Search is everywhere you look - Part 1
  • -- Break 1 --
  • Exercise: Search is everywhere you look - Part 2
  • Exercise: Search, Scale and Structure - Parts
    1, 2, 3
  • -- Break 2 --
  • Why SI students should care about search and
    retrieval
  • The Bigger Picture
  • How different searches fit together
  • How search fits with other activities
  • Looking to next week...

3
Welcome and Brief Intro to Ourselves and the
Course
  • Welcome to 503!
  • Who We Are
  • Instructors
  • Prof. George W. Furnas
  • Prof. Amy Warner
  • TAs
  • Qiping Zhang
  • Mark Handel
  • About the course...

4
Foundations Sequence
  • Use of Information (501)--concepts, issues and
    practices aimed at providing an understanding of the
    actual use of information in real work settings
  • Choice and Learning (502)--examines how
    information affects rational choice making and
    how rational choice theory can be applied to the
    design and management of information systems
  • Search and Retrieval (503)--looks at search and
    retrieval in information systems as a continuous
    process, ranging from concepts and procedures
    integral to human-mediated search, to the basic
    issues and mechanisms in collection search, to
    the data structures and algorithms necessary to
    automate the search and retrieval process
  • Social Systems and Collections (504)--considers
    collections of information resources in the
    broadest sense of the term, and the fundamental
    social processes within which such collections
    are embedded and the processes that shape their
    creation
  • Design and Management of Information Systems and
    Services (505)--prepares professionals to invent,
    develop, and implement new systems and services
    and manage their ongoing operation

5
Background and Motivation for SI 503
  • Serves as a gateway course for all
    specializations--Library and Information Services
    (LIS), Archives and Records Management (ARM),
    Human-Computer Interaction (HCI), Multi-Agent
    Systems Design (MAS), Economics of Information
    (EI)
  • Helps us determine the scope, magnitude and
    specific content of Search and Retrieval in
    this emerging, synergistic combination of fields
  • Is primarily based on concepts, issues,
    principles, and theories, rather than specific
    tools, techniques and practices, which are
    covered in advanced courses in specific
    specializations
  • Covers both professional and research literatures
    and perspectives

6
About Search and retrieval
  • Why are search and retrieval important?
  • In its most general form, looking for and getting
    things are significant parts of much human
    activity
  • from cavepersons hunting and gathering food
  • to scholars seeking previous literature,
    mathematicians seeking a proof, or engineers
    seeking a good design
  • Hierarchy of goals, reach an impasse, seek a
    resolution
  • We want to give foundations for understanding
    role of information technology and information
    professions in this activity

7
Approach of this Course
  • This course looks at search and retrieval from a
    variety of perspectives
  • Of use to professionals dedicated to
  • making information, technology, and people work
    together more successfully.
  • Range from understanding
  • how humans search the external visual world
  • and their internal memories
  • to fundamentals of both conceptual and
    computational aspects
  • of electronic information search and retrieval
  • to navigational search
  • to social and organizational memory and retrieval
    processes.

8
Mechanics of the course: syllabus, requirements,
etc.
  • Online at the course website
  • http://madison.si.umich.edu/Transfer/503Lecture01-intro.ppt.sit
  • Let's go look...

9
Mechanics of the course: syllabus, requirements,
etc.
  • IMPORTANT
  • One more thing - always bring to class
  • your copy of the readings, for discussion
  • some blank paper and pen/pencil for exercises

10
Exercise: Search Is Everywhere You Look

11
Exercise: Search Is Everywhere You Look
  • Part 1
  • In pairs, brainstorm and write down as many
    examples of search as you can come up with. Be
    as broad as you can, being inclusive of all
    disciplinary and professional perspectives and
    real life as well (15 min.)
  • As a class, share our lists of examples, making a
    combined list (10 min.)

12
-- Break 1 --
  • We will restart promptly in 10 minutes!

13
Exercise: Search Is Everywhere You Look
  • Part 2
  • In pairs again, try to determine some general
    categories or dimensions along which you would
    group the search examples we have generated (10
    min)
  • As a class, share our findings (5 min)

14
Discussion
  • What makes search hard vs. easy?
  • OPTIONAL: Discuss: How could info tech play a
    role?
  • OPTIONAL: Talk about search vs. retrieval
  • distinction
  • examples

15
Exercise: Search, Scale and Structure

16
Exercise: Search, Scale and Structure
  • Part 1 - Brute Force Search
  • N + 1 volunteers
  • 1 searcher
  • N people to form collections of search items
  • Collection: Line up. First 3 stand, rest
    squat/sit...
  • Searcher: Find the person whose last name would
    come just before yours in alphabetical order
  • Try again with 10 search items (10 standing)

17
Exercise: Search, Scale and Structure
  • Part 1 - The Brute Force Search (cont.)
  • The Brute Force List-Search Algorithm
  • 1. Go to beginning of line of people
  • 2. Let your "best so far" be nothing
  • 3. Ask person in front of you his/her name
  • 4. If it is before you alphabetically, and either
    "best so far" is nothing or it is closer than
    "best so far",
  • Remember the new name as the new "best so far"
  • 5. If you are not at the end,
  • Move to next person
  • Go to Step 3
  • If you are at the end, "best so far" is your
    target (or if that is nothing, you are first in
    the ordering)
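
The steps above can be sketched in Python roughly as follows. This is only an illustration of the classroom procedure, not part of the original slides: the function name `find_predecessor` and the case-insensitive comparison via `casefold()` are assumptions made for the example.

```python
from typing import Optional, Sequence


def find_predecessor(names: Sequence[str], my_name: str) -> Optional[str]:
    """Brute-force scan for the name that comes just before my_name.

    Hypothetical helper: visits every name once, in whatever order
    the line of people happens to be in.
    """
    me = my_name.casefold()
    best_so_far: Optional[str] = None            # Step 2: best so far is nothing
    for name in names:                           # Steps 1, 3, 5: walk the line
        key = name.casefold()
        if key < me:                             # Step 4: before me alphabetically
            if best_so_far is None or key > best_so_far.casefold():
                best_so_far = name               # closer than the previous best
    return best_so_far                           # None means "I am first"
```

For example, `find_predecessor(["Zhang", "Furnas", "Warner", "Handel"], "Smith")` returns `"Handel"`. Every name is examined exactly once, which is why the cost grows in step with N.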

18
Exercise: Search, Scale and Structure
  • Part 1 - The Brute Force Search (cont.)
  • Discussing the Algorithm
  • Structure
  • setup, iteration, stopping condition
  • Important properties
  • Well-definedness: Do all the steps have clear,
    unambiguous meaning?
  • Correctness: Does it do the right thing?
  • Completeness: Does it work for all inputs?
  • Complexity: How much resource does it take as the
    size of the input, N, gets larger?
  • Time
  • Space

19
Exercise: Search, Scale and Structure
  • Part 2 - The Sort
  • Collection: Everyone stand
  • We are going to sort you alphabetically, Right to
    Left (our L-R)
  • Parallel Sort Algorithm
  • The setup
  • 1. Count off from your right by twos (base 2)
  • 2. All 0s raise left hand, 1s raise right hand
  • 3. Find the hand nearest yours
  • 4. Hold it (and put your hands down)

20
Exercise: Search, Scale and Structure
  • Part 2 - The Sort (cont.)
  • Now the actual sort part of the algorithm
  • 5. Ask your partner her/his name
  • 6. If you are out of alphabetical order, switch
    places
  • 7. If there were any switches...
  • everyone hold your current partner's hand
  • raise your free hand
  • grab the nearest free hand (if there is one)
  • drop your old partner
  • you now have a new partner
  • go to Step 5, and repeat
  • If there were no switches, you are done!
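
The partner-swapping procedure resembles what is usually called odd-even transposition sort. The sketch below simulates it sequentially in Python; in the room, all pairs compare at the same moment. Two things here are assumptions rather than part of the slides: the function name, and the stopping rule. The sketch stops only after two consecutive rounds (one with each set of partners) produce no switches, because a single quiet round does not by itself guarantee that the whole line is in order.

```python
from typing import List


def odd_even_transposition_sort(names: List[str]) -> List[str]:
    """Simulate the hand-holding sort on a copy of the line (hypothetical helper)."""
    line = list(names)
    offset = 0           # 0: partners are (0,1), (2,3), ...; 1: partners are (1,2), (3,4), ...
    quiet_rounds = 0     # consecutive rounds in which nobody switched places
    while quiet_rounds < 2:
        switched = False
        # Steps 5-6: every pair compares names and switches if out of order.
        for i in range(offset, len(line) - 1, 2):
            if line[i].casefold() > line[i + 1].casefold():
                line[i], line[i + 1] = line[i + 1], line[i]
                switched = True
        quiet_rounds = 0 if switched else quiet_rounds + 1
        # Step 7: drop the old partner and pair up with the other neighbor.
        offset = 1 - offset
    return line
```

A line such as `["A", "C", "B", "D"]` shows why the extra round matters: the first set of partners, (A, C) and (B, D), sees no switches, yet C and B are still out of order.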

21
Exercise: Search, Scale and Structure
  • Part 2 - The Sort (cont.)
  • Discussing the Algorithm
  • Structure
  • setup, iteration, stopping condition
  • Important properties
  • Well-definedness: Do all the steps have clear,
    unambiguous meaning?
  • Correctness: Does it do the right thing?
  • Completeness: Does it work for all inputs?
  • Complexity: How much resource does it take as the
    size of the input, N, gets larger?
  • Time
  • Space

22
Exercise: Search, Scale and Structure
  • Part 3 - The Search
  • Binary Search of a Sorted List
  • Searcher:
  • 1. Go to the person in the middle of the standing
    row
  • 2. Ask his/her name
  • 3. If he/she is before you alphabetically,
  • tell all those before (but not including)
    him/her to sit
  • If he/she is after you alphabetically,
  • tell him/her and all those after to sit
  • 4. If there is more than one person standing,
  • go to Step 1
  • If only one is standing, he/she is your
    target!
  • (If no one is standing, you are alphabetically
    first.)
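
One way to sketch this search in Python is shown below. It follows the sit-down procedure closely but makes a few assumptions that the slide leaves open: the function name is hypothetical, "the middle" is taken to be the upper of the two middle people when the standing group is even (so the group always shrinks), and the last person left standing is checked once more, since he or she may never have been asked.

```python
from typing import Optional, Sequence


def binary_search_predecessor(sorted_names: Sequence[str], my_name: str) -> Optional[str]:
    """Find the name just before my_name in an alphabetically sorted line."""
    me = my_name.casefold()
    lo, hi = 0, len(sorted_names)     # the people still standing are sorted_names[lo:hi]
    while hi - lo > 1:                # Step 4: more than one person standing
        mid = (lo + hi) // 2          # Step 1: (upper) middle of the standing row
        if sorted_names[mid].casefold() < me:
            lo = mid                  # Step 3: everyone before mid sits; mid stays up
        else:
            hi = mid                  # Step 3: mid and everyone after sit
    # Assumed final check: the one person left may still come after me.
    if lo < hi and sorted_names[lo].casefold() < me:
        return sorted_names[lo]
    return None                       # nobody qualifies: I am alphabetically first
```

With the sorted line `["Furnas", "Handel", "Warner", "Zhang"]`, searching for `"Smith"` asks only Warner and Handel before settling on `"Handel"`. Each question removes about half of the people still standing.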

23
Exercise: Search, Scale and Structure
  • Part 3 - The Binary Search (cont.)
  • Discussing the Algorithm
  • Structure
  • setup, iteration, stopping condition
  • Important properties
  • Well-definedness: Do all the steps have clear,
    unambiguous meaning?
  • Correctness: Does it do the right thing?
  • Completeness: Does it work for all inputs?
  • Complexity: How much resource does it take as the
    size of the input, N, gets larger?
  • Time
  • Space

24
Exercise: Search, Scale and Structure
  • Conclusions
  • Scale Hurts - as N gets large, harder to find
    things
  • e.g., Brute force is O(N)
  • 10 items take approx. 10 time units
  • 100 items take approx. 100 time units
  • 1,000 items take approx. 1,000 time units
  • 1,000,000 items take approx. 1,000,000 time
    units
  • Organizing (e.g., sorting) takes up-front effort
  • But, can lead to much more efficient search
  • e.g., binary search of sorted list is O(log N)
  • 10 items take approx. 3 time units
  • 100 items take approx. 7 time units
  • 1,000 items take approx. 10 time units
  • 1,000,000 items take approx. 20 time units
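
The approximate figures above can be reproduced with a few lines of Python. This is a rough sketch under two assumptions not stated on the slide: one "time unit" is one name comparison, and the binary-search figure is the base-2 logarithm of N rounded to the nearest whole number. The function name `approx_steps` is hypothetical.

```python
import math
from typing import Tuple


def approx_steps(n: int) -> Tuple[int, int]:
    """Return (brute-force steps, binary-search steps) for a list of n items."""
    brute_force = n                       # O(N): ask every person once
    binary = round(math.log2(n))          # O(log N): halve the standing row each time
    return brute_force, binary


for n in (10, 100, 1_000, 1_000_000):
    brute_force, binary = approx_steps(n)
    print(f"{n:>9,} items: brute force ~{brute_force:,}, binary search ~{binary}")
```

Going from 1,000 to 1,000,000 items multiplies the brute-force cost by a thousand but only doubles the binary-search cost, from about 10 to about 20.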

25
-- Break 2 --
  • We will restart promptly in 10 minutes!

26
Why SI students should care about Search and
Retrieval
  • Prep for future courses and specializations
  • HCI
  • Human mem/vis search
  • Better HCI for large systems, helping users
  • search for functionality, information,
  • Design Space search
  • CS/AI/ProblemSpace search for Intelligent
    interfaces
  • MultiAgent Systems
  • Multiagent Information search
  • CS/AI/ProblemSpace search
  • Design space search

27
Why Care? (cont.)
  • Organizational Behavior
  • How organizations maintain and access their
    accumulated knowledge about how to conduct their
    work
  • Information Economics
  • Optimization search
  • Producer-consumer matching
  • Economically optimized search, multiagent search
  • Cost structure of information and search

28
Why Care? (cont.)
  • The Collection Perspective
  • Documents and collections are traditionally
    represented by fairly static mechanisms (i.e.,
    often by human or computer-generated surrogates).
    What would happen if we used concepts and
    methods outside this traditional paradigm to
    visualize virtual documents and collections?
  • Documents are traditionally organized within
    collections on the basis of topical or
    disciplinary similarity of items (LIS), on the
    statistical correlation of the words they contain
    (LIS), or on the basis of the organization of the
    institution from which they came (ARM). What
    would happen if we designed classification and
    other organizational schemes based on what we
    know about how human memory works?

29
Why Care? (cont.)
  • The Computer Science Perspective
  • Designers and developers, as well as service
    providers, of electronic information systems,
    need to know about the fundamental ways of
  • structuring and organizing information in a
    system (data structures)
  • searching basic structures and organizations
    efficiently (algorithms)
  • These same information professionals need to know
    about some of the basic properties of structures
    and algorithms
  • This is part of algorithmic thinking, which is
    fundamental to understanding both the feasibility
    and method for implementing a particular logical
    organization and search approach in an actual
    information system

30
The Bigger Picture
  • How different searches fit together
  • How search fits with other activities

31
How Different Searches Fit Together in The Big
Picture of Search
  • Mini-EXERCISE (10 min)
  • Look for interactions between the search examples
    (or their close variants)
  • Ways they dovetail
  • Ways they compete/complement
  • Pair (5 min) and share (5 min) list

32
Why care about the Big Picture of Search?
  • Rethinking the big picture in this age of change
  • In a stable world, things get optimized over time
  • established decomposition, compartmentalized,
    routinized
  • Physical libraries hold books, organized and searched
    a particular way
  • e.g., Don't hold comic books, your personal mail,
    picture archives
  • In a changing world -- all thrown up in the air
  • taking a new view of the big picture...
  • new decompositions, new syntheses
  • e.g., should these things be treated more
    uniformly, integrated into other activities
    differently...

33
Why care about the Big Picture of Search? (cont.)
  • You should understand the big picture and
    the various interrelationships
  • so the world can have more useful, integrated
    support tools
  • E.g.,
  • Human Memory search (for search terms)
  • Followed by computerized IR search (for docs
    containing those terms)
  • Followed by Human Visual search (of resulting doc
    lists)
  • Support it all with new IT design
  • That is, so you can better
  • develop more integrated support tools yourselves,
    perhaps
  • look for them, as others develop them
  • evaluate them, as they are proposed
  • use them better, once you have them
  • teach and encourage others to do the same

34
How to think about the Big Picture...A Quick
Intro to the MoRAS
  • We live in a world of many Responsive Adaptive
    Systems
  • People's heads, HCI systems, organizations,
    economies, culture, ...
  • Each studied separately by disciplines
    represented in the school
  • Each system, by being a RAS, has rough equivalent
    of
  • motivational mechanisms, choice behavior,
    sensors, effectors, and
  • memory with storage,
  • and (yes!) Search and Retrieval capabilities.
  • Instructive to examine analogies
  • The different RASs fit together
  • Not on different planets
  • but linked or coupled together in a single super
    system we inhabit
  • a Mosaic of Responsive Adaptive Systems (MoRAS)

35
The MoRAS
  • Example: New technology introduced, like
    Caller-ID
  • Because of couplings, many parts of the MoRAS
    change
  • Each is perturbed, and responds and adapts
  • People's heads and behavior, laws, new technology,
    marketplace, new businesses,...
  • Moreover - essence of Coupling in the MoRAS is
    Information
  • so altering IT alters the fundamental structure
    of the Mosaic
  • If we want to design in this environment, we must
    understand the MoRAS
  • MoRAS search example
  • Human Memory -> IR -> Human Visual search
  • compete/complement each other...
  • all of this is our design space
  • Grander cost structure of information

36
Role S&R play in the Bigger Task Picture
(optional)
  • Mini-EXERCISE (pair and share)
  • What is not search?
  • List some examples of Non-Search activities
  • E.g., get inspiration from daily activities, or
    from the other foundation courses
  • Info Needs
  • Choice, Learning
  • Social Systems, Collections
  • Design, Management
  • Consider
  • What is role of S&R in these?
  • What is role of these in S&R?

37
Looking to next week...
  • Week 2--January 15, 1998 (AW)
  • Topic: Collection Search and Retrieval -- I
  • Discussion of the notions of
    collection/document/information space from the LIS
    and archives perspectives, including definitions of
    what documents and collections are and how they are
    currently represented in a variety of systems and
    using a variety of mechanisms
  • Readings
  • Hagler & Simmons (1991) ch. 2 & 8
  • Warwick Framework Dublin Core
    http://www.bibsys.no/warwick.html
  • owley (1992) ch. 12 Miller (ch. 2 3)
  • Don't forget to bring readings, and paper and
    pen/pencil
  • Food Volunteers???