WIRED Future

Slides: 26
Provided by: gslisU

Transcript and Presenter's Notes

1
WIRED Future
  • Quick review of Everything
  • What I do when searching, seeking and retrieving
  • Questions?
  • Projects and Courses in the Fall
  • Course Evaluation

2
WIRED Focus
  • Information Retrieval: representation, storage,
    organization of, and access to information items
  • Focus is on the user information need
  • User information need, for example:
  • Find all docs containing information on Austin
    which:
  • Are hosted by utexas.edu
  • Discuss restaurants
  • Emphasis is on the retrieval of information (not
    data, not just a keyword match)
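
The example need above can be approximated as a filter with three conditions. The sketch below is only the naive keyword version, which is exactly what the slide warns is not enough. The documents, URLs, and the helper names hosted_by and matches_need are made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical documents; URLs and text are invented for illustration.
DOCS = [
    {"url": "http://www.utexas.edu/dining/guide.html",
     "text": "Guide to Austin restaurants near campus"},
    {"url": "http://www.utexas.edu/maps/index.html",
     "text": "Campus maps for Austin visitors"},
    {"url": "http://example.com/food.html",
     "text": "Austin restaurants and food trucks"},
]

def hosted_by(url, domain):
    """True if the URL's host is the domain or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return host == domain or host.endswith("." + domain)

def matches_need(doc):
    """All three conditions of the example information need must hold."""
    words = doc["text"].lower().split()
    return ("austin" in words
            and "restaurants" in words
            and hosted_by(doc["url"], "utexas.edu"))

results = [d["url"] for d in DOCS if matches_need(d)]
print(results)
```

A page that reviews "places to eat downtown" would satisfy the user but fail this filter, which is the gap between matching keywords and retrieving information.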

3
Quick Overview of the IR Process
[Diagram labels: Documents, index, Information Need, query, match, Ranking]

4
Indexing and Searching
  • Query models work against the index
  • Find words, word counts, phrases
  • Sequential search, indexed search
  • Inverted Files and Other Indices
  • Boolean Queries
  • Sequential Searching
  • Pattern Matching
  • Structural Queries
  • Data structures
  • The infrastructure of search
  • Varied per data set and query contexts
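
A minimal sketch of the ideas on this slide, assuming a toy in-memory collection: an inverted file maps each word to the documents containing it, Boolean queries become set operations, and a sequential search is shown for comparison. The document texts and function names are invented for illustration.

```python
from collections import defaultdict

# Toy collection; the document texts are invented for illustration.
DOCS = {
    1: "austin restaurants and live music",
    2: "austin weather report",
    3: "restaurants in dallas",
}

def build_inverted_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def sequential_search(docs, word):
    """Scan every document; no index needed, but cost grows with the collection."""
    return {doc_id for doc_id, text in docs.items()
            if word in text.lower().split()}

index = build_inverted_index(DOCS)

# Boolean queries become set operations on the posting sets.
both = index["austin"] & index["restaurants"]         # AND
either = index["austin"] | index["restaurants"]       # OR
only_austin = index["austin"] - index["restaurants"]  # AND NOT

print(sorted(both), sorted(either), sorted(only_austin))
```

The indexed lookup and the sequential scan return the same documents; the index trades build time and storage for faster queries.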

5
Personalized IR system design
  • How would you design a personal IR system?
  • Who would use it?
  • How would you learn about them?
  • Interests
  • Sources
  • Preferences
  • How do you evaluate a personal system?
  • Understanding users is the key to personalizing
    search or search interfaces.

6
Information Seeking in Context
[Diagram labels: Learning, Information Seeking, Information Retrieval, Browsing Strategy, Analytical Strategy]

7
How do we search?
  • Analytical
  • careful planning
  • recall of query terms
  • iterative query reformulations
  • examination of results
  • batched
  • Browsing
  • heuristic
  • opportunistic
  • recognizing relevant information
  • interactive (as can be)

8
Behavioral Model
  • Recurring Web behavioral patterns that relate
    people's browser actions (Web moves) to their
    browsing/searching context (Web modes)
  • Modes of scanning: Aguilar (1967), Weick & Daft
    (1983, 1984)
  • Moves in information seeking behavior: Ellis
    (1989), Ellis et al. (1993, 1997)

9
ISeek Behaviors and Web Moves
10
What do I use?
  • Starting
  • Bookmarks and groups of bookmarks
  • Search javascripts
  • Chaining
  • Tabbed windows
  • Bookmarking
  • Printing
  • Browsing and Differentiating
  • Firefox/Mozilla recommended links
  • Blogrolls and PageRank
  • Monitoring
  • RSS feeds with RSS reader
  • (Moderated) Listservs
  • Extracting
  • Saving as HTML, Text, or PDF
  • Bookmarking and Printing

11
How do we really use the Web?
  • People don't read, they scan Web pages
  • We move quickly, we know we can go back
  • Quick experimentation, short memory
  • Behaviors that work are reinforced and continued
  • Satisficing makes measures of quality difficult

12
How do I use the Web?
  • Set of standard, daily Web pages
  • Set of occasional Web pages
  • Fridays - movie reviews, show times, previews
  • Monthly - stocks and funds
  • Quick focus on a subject, build a set of
    documents related to that and file for later use
  • I scan quickly down the page and then back up the
    page
  • Site maps, other links, walk up the URL

13
Future Social Issues
  • Who controls the sharing?
  • Who controls the controls?
  • Give to get systems
  • Anonymity vs. Community
  • Community of friends
  • People as data points
  • Free riders
  • Logrolling and Over-rating

14
Future Filtering for IR
  • How about filtering, without the collaboration?
  • Individual preferences
  • Implicit and Explicit
  • Text is analyzed
  • Feature extraction
  • Recall and precision measures
  • New models for multidimensional
    users/uses/ratings
  • Relevance Feedback
  • Faster matching, more accurate
  • Metadata (use data, preferences)
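
As a reminder of the recall and precision measures mentioned above, here is a small sketch using the standard set-based definitions. The document ids are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Standard set-based precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: share of retrieved documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant documents that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical run: 4 documents retrieved, 5 judged relevant, 2 in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"],
                        ["d1", "d3", "d7", "d8", "d9"])
print(p, r)
```

A filter tuned to individual preferences is judged the same way, except that the relevant set is defined per user.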

15
Future Community Centered CF
  • Forming and keeping community
  • Interfaces, functionality
  • Helping people find new information
  • Interactive search
  • Group browsing
  • Mapping community (prefs?)
  • Daily News
  • Rating Web pages
  • Incenting users to share
  • Providing access to stored preferences
  • Fair, open data collection
  • Users can tune data
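
One way to picture rating Web pages in a community is user-based collaborative filtering: predict a member's rating for an unseen page from the ratings of similar members. This is a rough sketch, not a description of any particular system. The users, pages, ratings, and the choice of cosine similarity are all assumptions.

```python
from math import sqrt

# Hypothetical ratings (1-5) of Web pages by community members.
RATINGS = {
    "ann": {"pageA": 5, "pageB": 3, "pageC": 4},
    "bob": {"pageA": 4, "pageB": 2, "pageC": 5, "pageD": 4},
    "cat": {"pageA": 1, "pageB": 5, "pageD": 2},
}

def similarity(a, b):
    """Cosine similarity over the pages both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[p] * b[p] for p in shared)
    norm_a = sqrt(sum(a[p] ** 2 for p in shared))
    norm_b = sqrt(sum(b[p] ** 2 for p in shared))
    return dot / (norm_a * norm_b)

def predict(user, page, ratings):
    """Similarity-weighted average of other users' ratings for the page."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or page not in their:
            continue
        weight = similarity(ratings[user], their)
        num += weight * their[page]
        den += weight
    return num / den if den else None

# ann has not rated pageD; bob (more similar to ann) pulls the estimate up.
print(round(predict("ann", "pageD", RATINGS), 2))
```

The sketch also shows why the social issues matter: a free rider contributes no ratings to average over, and inflated ratings from a few members shift everyone's predictions.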

16
WWW Documents Investigation
  • How do you collect data like this?
  • Web Crawler
  • URL identifier, link follower
  • Index-like processing
  • Markup parser, keyword identifier
  • Domain name translation (and caching)
  • How do these facts help with indexing?
  • Have general characteristics changed?
  • (This would be a great project to update.)
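
The crawler pieces listed above (URL identifier, markup parser, keyword identifier) can be sketched with the Python standard library. This example parses a hard-coded page instead of fetching one, so it leaves out the link follower's queue and the domain name caching. The page content and URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class PageParser(HTMLParser):
    """Collect outgoing links and visible words from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        # URL identifier: resolve each <a href> against the page's own URL.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))

    def handle_data(self, data):
        # Keyword identifier (very naive): lowercase whitespace tokens.
        self.words.extend(data.lower().split())

# Hypothetical page; a real crawler would fetch this over HTTP.
HTML = ('<html><body><h1>Austin Dining</h1>'
        '<a href="/maps/">Maps</a> '
        '<a href="http://example.com/x.html">More</a></body></html>')

parser = PageParser("http://www.utexas.edu/dining/index.html")
parser.feed(HTML)
print(parser.links)
print(parser.words)
```

A link follower would push parser.links onto a queue of pages to visit next, and the collected words would feed the index-like processing step.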

17
Metadata
  • Information that describes a document that is not
    (necessarily) in the document
  • Describes the document in relation to other
    documents
  • Context about the Content
  • Document semantics
  • Internally consistent descriptions of content for
    individual documents, document sets or a
    specified set of content.
  • For collections or individual documents

18
Metadata Types
  • Dublin Core elements
  • MARC (MAchine-Readable Cataloging)
  • What isn't machine readable?
  • Semantic Web elements
  • Bottom-up, derived data
  • Format-based
  • ASCII, EBCDIC
  • RTF
  • PostScript, PDF
  • MIME
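
To make the metadata types concrete, here is a small sketch that describes a file with a few Dublin Core element names (title, creator, date, format, identifier) and fills in the format-based part with a MIME type guessed from the filename. The filename and field values are placeholders, and the record layout is an assumption rather than a standard serialization.

```python
import mimetypes

def describe(filename, title, creator, date):
    """Build a small metadata record using Dublin Core element names."""
    # Format-based metadata: guess the MIME type from the file extension.
    mime_type, _encoding = mimetypes.guess_type(filename)
    return {
        "title": title,
        "creator": creator,
        "date": date,
        "format": mime_type or "application/octet-stream",
        "identifier": filename,
    }

# Placeholder values; none of these come from a real catalog.
record = describe("lecture-notes.pdf", "Lecture notes", "unknown", "unknown")
for element, value in record.items():
    print(f"{element}: {value}")
```

Title, creator, and date describe the document from outside its content, while the format element is derived bottom-up from the file itself.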

19
(No Transcript)
20
(No Transcript)
21
(No Transcript)
22
Digital Libraries
  • We all have them
  • Email boxes, archives
  • Papers written
  • Bookmarks
  • What I have
  • 4GB of academic technical papers
  • Mostly PDF, HTML, text
  • Indexed using Adobe Catalog, htDig, OS X Search
  • Data sets from previous studies
  • Program code
  • Scanned documents

23
Big DigLib Questions
  • What's a document?
  • A file or link
  • How do you trace and track the information source?
  • Filenames, memory, metadata
  • How do you integrate the variety of documents
    and metadata?
  • Stick to standard formats
  • What kind of storage model?
  • Version Control system
  • Server storage
  • Filenames and directories
  • When do you Index?
  • Continuously
  • After a backup
  • Mostly boolean searching with attributes

24
Course Evaluations (next week)
  • Volunteer to get, distribute, collect and turn in
    evaluations
  • Overall level of class expertise relevant for
    you?
  • Favorite readings and type of readings?
  • Least favorite (obscure, difficult) readings?
  • Project ideas and group organization tools?
  • Assignments: Group Work vs. Papers?

25
(No Transcript)