Networked Information Resources - PowerPoint PPT Presentation

About This Presentation
Title:

Networked Information Resources

Slides: 19
Provided by: mct4

Transcript and Presenter's Notes

Title: Networked Information Resources


1
Networked Information Resources
  • RSS, podcasts, search engine basics

2
Information access
  • Searching
  • Browsing
  • Filtering
  • user profile/personalization/SDI (current
    awareness, RSS feeds)
  • collective filtering/social filtering
    (GroupLens, del.icio.us)
  • content-based

3
RSS
  • Really Simple Syndication
  • A family of web feed formats used to publish
    frequently updated digital content, such as
    blogs, news feeds or podcasts.
  • http://www.lib.ntu.edu.tw/RSS/nturss.htm
  • Podcast from NLM
  • RSS Reader/Google reader
  • Users of RSS content use programs called feed
    "readers" or "aggregators"; the user subscribes
    to a feed
    (Wikipedia)
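
What a feed reader does with a subscribed feed can be sketched in a few lines. This is a minimal sketch, not how any particular reader works: the feed text and the `read_feed` helper are made up for illustration, and a real reader would download the feed from its URL and check it again later for new items.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 feed used only for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Library News</title>
    <link>http://example.org/</link>
    <description>Hypothetical feed</description>
    <item>
      <title>New database trial</title>
      <link>http://example.org/news/1</link>
    </item>
    <item>
      <title>Holiday opening hours</title>
      <link>http://example.org/news/2</link>
    </item>
  </channel>
</rss>"""


def read_feed(xml_text):
    """Return the channel title and a list of (item title, item link) pairs."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    items = [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]
    return channel.findtext("title"), items


if __name__ == "__main__":
    feed_title, entries = read_feed(SAMPLE_FEED)
    print(feed_title)
    for entry_title, entry_link in entries:
        print(" -", entry_title, entry_link)
```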

4
Podcast exercise
  • The easy way
  • Podcast tutorial (in Chinese!)
  • Download Audacity (to produce audio
    programs)
  • Create a blog
  • Post your programs on the blog
  • The advanced way
  • Edit the XML manually (covered later)

5
Create your Podcast manually
  • 1. Create a website
  • 2. Prepare XML file for the RSS/Podcast feed
  • 3. Validate the XML file
  • http://feedvalidator.org
  • rss.scripting.com
  • 4. Subscribe to the podcast with one of the RSS readers
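
Steps 2 and 3 can be sketched in Python instead of editing the XML by hand. This is a minimal sketch under stated assumptions: `build_podcast_feed` and `is_well_formed` are hypothetical helper names, the example URLs are placeholders, and the check only confirms that the XML parses. It does not replace a full feed validator such as the one linked above.

```python
import xml.etree.ElementTree as ET


def build_podcast_feed(title, site_url, description, episodes):
    """Build a minimal RSS 2.0 feed; each episode becomes an <item> with an <enclosure>."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = description
    for episode in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = episode["title"]
        # The enclosure points at the audio file; length is its size in bytes.
        ET.SubElement(
            item,
            "enclosure",
            url=episode["url"],
            length=str(episode["bytes"]),
            type="audio/mpeg",
        )
    return ET.tostring(rss, encoding="unicode")


def is_well_formed(xml_text):
    """Return True if the text parses as XML (a weaker check than feed validation)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    feed = build_podcast_feed(
        "My Show",
        "http://example.org/",
        "A demo podcast",
        [{"title": "Episode 1", "url": "http://example.org/ep1.mp3", "bytes": 12345}],
    )
    print(feed)
    print("well-formed:", is_well_formed(feed))
```

The resulting text would be saved as a file on the website from step 1, and that file's URL is what listeners subscribe to.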

6
Information is
  • Everywhere
  • A wealth of information creates a poverty of
    attention
  • Hard to evaluate
  • Second-hand knowledge and cognitive authority
  • Organic
  • Knowledge grows in contexts
  • Structured
  • Bibliographic, behavioral
  • Easy to copy
  • First-copy cost, intellectual property

7
Division of intellectual labor
  • I know very little about the codes of knowledge
    used by the architect and the builder in the
    design and construction of the home, but I
    nonetheless have faith in what they have done.
    My faith is not so much in them, although I
    have to trust their competence, as in the
    authenticity of the expert knowledge which they
    apply, something which I cannot usually check
    exhaustively myself (Giddens, 1990, The
    Consequences of Modernity)

8
Second-hand knowledge
  • it is not that we have conducted a direct test
    of their knowledge. Rather, we have to cite
    indirect tests or indexes of credibility. The
    situation is one in which we may be faced with a
    number of different people all claiming to be
    knowledgeable on the subject, how can we choose
    among them, or how can we defend our choice once
    made? (Wilson, 1983 p.21).

9
How do search engines work?
  • They search the Internet using a crawler (spider)
  • They keep an index of the words/phrases they found
  • They allow users to search for words/phrases
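
The second and third points can be illustrated with a toy inverted index. This is a minimal sketch, assuming the pages have already been fetched; the page texts and the `build_index` and `search` names are made up, and real search engines store far more than a word-to-page mapping.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query, sorted by URL."""
    words = tokenize(query)
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


if __name__ == "__main__":
    pages = {
        "a.html": "RSS feeds and podcasts",
        "b.html": "Search engines index the Web",
        "c.html": "Podcasts about search engines",
    }
    index = build_index(pages)
    print(search(index, "search engines"))
```

Looking a word up in the index is fast because the pages were scanned once, at indexing time, rather than at query time.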

10
Crawler
  • A web crawler (also known as a web spider or ant)
    is a program which browses the Web in a
    methodical, automated manner.
  • Web crawlers are mainly used to create a copy of
    all the visited pages for later processing by a
    search engine that will index the downloaded
    pages to provide fast searches.
    (adapted from Wikipedia)

11
More on crawler
  • A web crawler is one type of software agent. In
    general, it starts with a list of URLs to visit.
    As it visits these URLs, it identifies all the
    hyperlinks in the page and adds them to the list
    of URLs to visit, recursively browsing the Web
    according to a set of policies.
  • Search engine crawler simulator
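
The loop described above can be sketched as follows. This is a minimal sketch under stated assumptions: `fetch` is a hypothetical function that returns a page's HTML (or None if the page is unavailable), the only "policy" is a cap on the number of pages, and a queue replaces literal recursion. A real crawler would also respect robots.txt, rate limits, and so on.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Visit pages breadth-first from the seeds; return {url: html} for fetched pages."""
    frontier = deque(seed_urls)   # the list of URLs still to visit
    seen = set(seed_urls)         # URLs already queued, to avoid revisiting
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links against the page URL
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return pages


if __name__ == "__main__":
    # A made-up, in-memory "web" so the sketch runs without network access.
    fake_web = {
        "http://example.org/": '<a href="/a.html">A</a> <a href="b.html">B</a>',
        "http://example.org/a.html": '<a href="/">home</a> <a href="c.html">C</a>',
        "http://example.org/b.html": "no links here",
        "http://example.org/c.html": '<a href="missing.html">gone</a>',
    }
    for visited in crawl(["http://example.org/"], fake_web.get):
        print(visited)
```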

12
Information retrieval models
  • Exact match: Boolean logic
  • Best match: rank search results by relevancy
    score
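
The difference between the two models can be sketched on a handful of documents. This is a minimal sketch, assuming Boolean AND for the exact-match model and a raw count of query-term occurrences as the relevancy score; the documents and function names are made up, and real systems use more refined scoring.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def exact_match(docs, query):
    """Boolean AND: return the ids of documents that contain every query term."""
    terms = set(tokenize(query))
    if not terms:
        return []
    return sorted(
        doc_id for doc_id, text in docs.items() if terms <= set(tokenize(text))
    )


def best_match(docs, query):
    """Rank documents by how often the query terms occur, highest score first."""
    terms = set(tokenize(query))
    scored = []
    for doc_id, text in docs.items():
        score = sum(1 for word in tokenize(text) if word in terms)
        if score > 0:
            scored.append((doc_id, score))
    # Sort by descending score, then by id so ties have a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    docs = {
        "d1": "web search engines crawl the web",
        "d2": "podcast feeds on the web",
        "d3": "library search tips",
    }
    print(exact_match(docs, "web search"))
    print(best_match(docs, "web search"))
```

Exact match returns an unordered yes/no set, so documents matching only part of the query are dropped; best match keeps them but places them lower in the ranking.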

13
Define relevancy
  • Location of keywords
  • Frequency of keywords (search engine spamming)
  • Link analysis: PageRank

14
PageRank
  • PageRank relies on the uniquely democratic nature
    of the web by using its vast link structure as an
    indicator of an individual page's value.
  • In essence, Google interprets a link from page A
    to page B as a vote, by page A, for page B.
  • Important, high-quality sites receive a higher
    PageRank, which Google remembers each time it
    conducts a search.
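
The "link as vote" idea can be sketched with the commonly published iterative form of PageRank. This is a minimal sketch, not Google's production method: the damping factor of 0.85, the fixed number of iterations, the handling of pages without outgoing links, and the four-page example graph are all assumptions that do not come from the slides.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores for a graph given as {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share, then votes are added on top.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
    for page, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

In this example graph, C receives links from three pages and ends up with the highest score, while D, which no page links to, keeps only the baseline share.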

15
Cognitive authority and IR
  • Two kinds of bibliographic control (Patrick
    Wilson)
  • Describing vs. Exploiting
  • Hints of authority (Human information behavior)
  • Rational calculation or blind faith

16
The link structure of the Web
  • Hubs and authorities

Source: Kleinberg and Lawrence (2001). The
Structure of the Web. Science.
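
Hub and authority scores are usually computed with Kleinberg's iterative procedure, commonly known as HITS: a good hub links to good authorities, and a good authority is linked to by good hubs. The sketch below is a minimal illustration under stated assumptions (a made-up five-page graph, a fixed number of iterations) and is not necessarily the exact formulation used in the cited article.

```python
import math


def hits(links, iterations=50):
    """Return (hub, authority) scores for a graph given as {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the pages linking to it.
        auth = {page: 0.0 for page in pages}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub score: sum of the authority scores of the pages it links to.
        hub = {page: sum(auth[t] for t in links.get(page, [])) for page in pages}
        # Normalize both score sets so the values stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(value * value for value in scores.values())) or 1.0
            for page in scores:
                scores[page] /= norm
    return hub, auth


if __name__ == "__main__":
    graph = {"h1": ["a1", "a2"], "h2": ["a1", "a2"], "h3": ["a1"]}
    hub_scores, auth_scores = hits(graph)
    print("hubs:", {p: round(s, 3) for p, s in hub_scores.items()})
    print("authorities:", {p: round(s, 3) for p, s in auth_scores.items()})
```
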
17
The link structure of the Web
  • Within community link density

Source: Kleinberg and Lawrence (2001). The
Structure of the Web. Science.
18
Google desktop
  • Search the files stored on your PC and the Web
    pages you have viewed.
  • Google Desktop
  • More reference on Google Desktop