Arpeggio: Metadata Searching and Content Sharing with Chord - PowerPoint PPT Presentation

Title:

Arpeggio: Metadata Searching and Content Sharing with Chord

Description:

9/22/09. 1. Arpeggio: Metadata Searching and Content Sharing with Chord ... Arpeggio: a peer-to-peer file-sharing network. based on the Chord lookup primitive. ... – PowerPoint PPT presentation

Slides: 22
Provided by: hpcCsTsi

Transcript and Presenter's Notes


1
Arpeggio: Metadata Searching and Content Sharing
with Chord
  • Austin T. Clements, Dan R. K. Ports, David R.
    Karger
  • MIT Computer Science and Artificial Intelligence
    Laboratory

2
Introduction
  • Arpeggio: a peer-to-peer file-sharing network
    based on the Chord lookup primitive.

3
Introduction
  • Instead of storing content data directly in the
    DHT, small pointers to this data are managed in a
    DHT-like fashion.

4
Metadata
  • Metadata: the file name, its format, etc.
  • For text documents, metadata can be extracted
    manually or algorithmically.
  • Some types of files have metadata built in, for
    example, ID3 tags on MP3 music files.

5
Metadata Block
  • The entire contents of each item's metadata can
    be kept in the index as a metadata block,
  • along with information on how to obtain the file
    contents.

6
Chord
  • Chord provides an efficient LOOKUP primitive that
    maps a key to the node responsible for its value.
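
The mapping from key to responsible node can be sketched as below. This is only an illustration of the successor rule, assuming a toy 16-bit identifier space and a globally known node list; the names `chord_id` and `lookup` are hypothetical, and real Chord routes the lookup through finger tables rather than a sorted list.

```python
import hashlib
from bisect import bisect_left

M = 2 ** 16  # toy identifier space (assumption; Chord itself uses much larger IDs)

def chord_id(name: str) -> int:
    """Hash a string onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % M

def lookup(node_ids, key: int) -> int:
    """Return the successor of `key`: the first node ID >= key, wrapping around."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key)
    return ring[i % len(ring)]  # past the largest ID, wrap to the smallest

nodes = [10, 200, 4000]
responsible = lookup(nodes, chord_id("some-keyword"))
```

A key that falls past the largest node ID wraps around to the smallest one, which is what makes the identifier space a ring.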

7
Searching
  • the DHT maps each keyword to a list of all files
    whose metadata contains that keyword.

8
Keyword-Set Indexing
  • Keyword-set indexes: the index is keyed by sets
    of keywords rather than single keywords.
  • The number of index entries for a file with m
    metadata keywords is I(m) (with at most K
    keywords in a search query).
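
The slide does not spell out I(m). A small sketch, assuming one index entry is created for every non-empty subset of at most K of the file's m keywords, which would give I(m) = C(m,1) + ... + C(m,K). The function names are hypothetical.

```python
from itertools import combinations
from math import comb

def keyword_sets(keywords, K):
    """Enumerate every non-empty keyword subset of size at most K."""
    words = sorted(set(keywords))
    top = min(K, len(words))
    return [frozenset(c)
            for size in range(1, top + 1)
            for c in combinations(words, size)]

def index_entries(m: int, K: int) -> int:
    """I(m) under the assumption above: sum of C(m, i) for i = 1..min(K, m)."""
    return sum(comb(m, i) for i in range(1, min(K, m) + 1))

sets_for_file = keyword_sets(["chord", "arpeggio", "mp3"], K=2)
```

Under this assumption the count grows quickly with m, which is why bounding the subset size by K matters.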

9
Index-side filtering
  • A user may only be interested in files of size
    greater than 1MB, or MP3 files with a bitrate
    greater than 128 Kbps
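
A minimal sketch of index-side filtering, assuming the index node holds full metadata blocks and applies the user's predicate before returning results. The `MetadataBlock` fields and the helper names are hypothetical and chosen to match the two examples above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass(frozen=True)
class MetadataBlock:
    name: str
    fmt: str
    size_bytes: int
    bitrate_kbps: Optional[int] = None

def filter_at_index(blocks: List[MetadataBlock],
                    predicate: Callable[[MetadataBlock], bool]) -> List[MetadataBlock]:
    """Run the filter on the index node so only matching blocks are sent back."""
    return [b for b in blocks if predicate(b)]

def larger_than_1mb(b: MetadataBlock) -> bool:
    return b.size_bytes > 1_000_000

def mp3_above_128kbps(b: MetadataBlock) -> bool:
    return b.fmt == "mp3" and b.bitrate_kbps is not None and b.bitrate_kbps > 128

stored = [
    MetadataBlock("talk.mp3", "mp3", 4_000_000, 192),
    MetadataBlock("demo.mp3", "mp3", 900_000, 96),
    MetadataBlock("notes.txt", "txt", 2_000),
]
big = filter_at_index(stored, larger_than_1mb)
hifi = filter_at_index(stored, mp3_above_128kbps)
```

Filtering where the metadata is stored means non-matching entries never cross the network.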

10
Metadata Expiration
  • Metadata blocks may contain out-of-date
    references to files.
  • Nodes periodically refresh metadata that is
    still valid, thereby resetting its expiration
    counter.
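
Expiration with refresh can be sketched as follows. This assumes a fixed time-to-live per block and a caller-supplied clock value; the class `ExpiringIndex` and its method names are hypothetical.

```python
class ExpiringIndex:
    """Toy index in which every metadata block carries an expiration time."""

    def __init__(self, ttl: int):
        self.ttl = ttl
        self._expires_at = {}  # block ID -> time at which the block expires

    def insert_or_refresh(self, block_id: str, now: int) -> None:
        """Inserting and refreshing are the same operation: reset the counter."""
        self._expires_at[block_id] = now + self.ttl

    def purge(self, now: int):
        """Drop blocks that were not refreshed in time and return their IDs."""
        expired = sorted(b for b, t in self._expires_at.items() if t <= now)
        for b in expired:
            del self._expires_at[b]
        return expired

    def live(self):
        return sorted(self._expires_at)

idx = ExpiringIndex(ttl=10)
idx.insert_or_refresh("a", now=0)
idx.insert_or_refresh("b", now=0)
idx.insert_or_refresh("a", now=8)   # "a" is still shared, so it is refreshed
dropped = idx.purge(now=12)         # "b" was never refreshed
```

Stale references disappear on their own once the sharing node stops refreshing them, so no explicit delete message is required.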

11
Index Gateways
  • A file may be shared by many nodes.
  • Each node will then be inserting the same
    metadata block repeatedly.

12
Index Gateways
  • If the metadata block already exists and is not
    scheduled to expire soon, then there is no need
    to re-insert it.
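
A minimal sketch of that check, assuming every announcement for a block passes through one gateway that remembers the block's expiration time. The class `IndexGateway`, the `refresh_margin` parameter, and the insertion counter are hypothetical.

```python
class IndexGateway:
    """Toy gateway that suppresses redundant insertions of the same block."""

    def __init__(self, ttl: int, refresh_margin: int):
        self.ttl = ttl
        self.refresh_margin = refresh_margin  # what counts as "expiring soon"
        self.expires_at = {}                  # block ID -> expiration time
        self.index_insertions = 0             # times the block was really pushed out

    def announce(self, block_id: str, now: int) -> bool:
        """Return True if the block was (re-)inserted into the indexes."""
        expiry = self.expires_at.get(block_id)
        if expiry is not None and expiry - now > self.refresh_margin:
            return False  # already present and not expiring soon: skip
        self.expires_at[block_id] = now + self.ttl
        self.index_insertions += 1
        return True

gw = IndexGateway(ttl=100, refresh_margin=20)
first = gw.announce("song", now=0)    # first sharer: inserted
second = gw.announce("song", now=5)   # second sharer: suppressed
late = gw.announce("song", now=85)    # close to expiry: re-inserted
```

Only two of the three announcements reach the indexes in this run, however many nodes share the file.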

14
Index Replication
  • The index must be maintained despite node
    failures.
  • Index insertions can be propagated periodically
    as part of index replication.
  • Arpeggio requires only weak consistency of
    indexes.

15
Content Distribution
  • Arpeggio maintains pointers to each peer that
    holds a certain file.
  • The large file content remains on its
    originating nodes.

16
Segmentation
  • Peers that do not have an entire file are able to
    share the chunks they do have.
  • (just like BitTorrent)

17
Segmentation
  • The DHT stores a file block for each file. It
    contains a list of chunk IDs, each of which can
    be used to locate the sources of that chunk.
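
A file block of this shape could be built as in the sketch below. It assumes fixed-size chunks and SHA-1 content hashes as IDs; neither choice is stated on the slide, and the function name and dictionary keys are hypothetical.

```python
import hashlib

def make_file_block(data: bytes, chunk_size: int) -> dict:
    """Split `data` into chunks and list their IDs in a file block."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return {
        # Assumption: IDs are SHA-1 hashes of the content.
        "file_id": hashlib.sha1(data).hexdigest(),
        "chunk_ids": [hashlib.sha1(c).hexdigest() for c in chunks],
    }

file_block = make_file_block(b"abcdefghij", chunk_size=4)
```

Only this small block is stored in the DHT; each chunk ID is then used separately to find peers holding that chunk.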

18
Content-Sharing Subrings
  • Subrings are built with the Diminished Chord
    protocol.
  • The subring is identified by the chunk ID and
    consists of the nodes that are sharing that
    chunk.
  • A LOOKUP for a random Chord ID in the subring
    returns one of the sources.
19
Content-Sharing Subrings
  • Once a node has finished downloading a chunk, it
    becomes a source and can join the subring.
  • Content-sharing subrings offer a general
    mechanism for managing data.

20
Postfetching
  • A peer inserts a request block into the
    network for a particular unavailable file.
  • When a source of that file rejoins the network,
    it finds the request block.
  • The source then sends the file to randomly
    selected nodes with available cache space.

21
Thanks!