Boon Thau Loo1 - PowerPoint PPT Presentation

Analysis and Improvements of P2P Search Networks
Boon Thau Loo (1)
PIER: Joseph Hellerstein (1,2), Ryan Huebsch (1), Sam Mardanbeigi (1),
Sean Rhea (1), Timothy Roscoe (2), Scott Shenker (1,3), Ion Stoica (1)
(1) UC Berkeley, (2) Intel Research Berkeley, (3) ICSI
Introduction
  • Unstructured networks (Gnutella, KaZaA)
      • Two-level hierarchy: ultrapeers and peers.
      • Flooding-based search.
      • Effective for highly replicated items.
  • Inverted index over Distributed Hash Tables (DHTs)
      • Effective for rare items.
      • Guaranteed recall, but may use significant
        bandwidth.
  • Our goals
      • Analysis of Gnutella.
      • Exploring a hybrid search network.
      • Querying Gnutella at scale.

Gnutella's Search Characteristics
  • Observe queries from 20 ultrapeers on PlanetLab.
  • Dynamic flooding: the fewer results a query
    returns, the further it travels in the network
    (TTL 3 to 7).
  • 30-40% of queries return fewer than 10 results.
    Around 10% of queries return no results. Average
    query recall: 22%.
  • Can we use DHTs to improve query recall?
  • 1.5-hour crawl: 105,715 nodes, 17,766 ultrapeers.
  • Ultrapeer degree: 75 leaves / 6 ultrapeers, 30
    leaves / 32 ultrapeers.
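
The dynamic-flooding behaviour described above can be illustrated with a minimal Python sketch. It is a simplification, not Gnutella's actual protocol: it simply re-floods with a larger TTL until enough results arrive, using the TTL range of 3 to 7 from the slide. The function names, the `enough` threshold, and the graph and file tables are hypothetical.

```python
from collections import deque


def flood(graph, files, start, keyword, ttl):
    """Breadth-first flood from `start`, visiting nodes at most `ttl` hops away."""
    seen = {start}
    frontier = deque([(start, 0)])
    hits = []
    while frontier:
        node, hops = frontier.popleft()
        # Each visited node answers with its matching file names.
        hits.extend((node, name) for name in files.get(node, []) if keyword in name)
        if hops < ttl:
            for neighbour in graph.get(node, []):
                if neighbour not in seen:
                    seen.add(neighbour)
                    frontier.append((neighbour, hops + 1))
    return hits


def dynamic_flood(graph, files, start, keyword, enough=10, min_ttl=3, max_ttl=7):
    """Grow the TTL until `enough` results are found (assumed threshold) or max_ttl is reached."""
    for ttl in range(min_ttl, max_ttl + 1):
        hits = flood(graph, files, start, keyword, ttl)
        if len(hits) >= enough:
            break
    return ttl, hits
```

In this sketch a well-replicated item is found at the smallest TTL, while a rare item forces the query to travel further, and an item outside the maximum horizon is never found, which is the recall gap the DHT is meant to close.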

Hybrid Search Network
  • DHT-based Boolean keyword searching
      • PIER: A P2P Relational Query Processor over
        DHTs.
      • For each file, publish FileInformation(docID,
        location, filename, ...) and
        InvertedIndex(keyword, docID).
      • Query evaluation strategies.
  • PIER-Gnutella gateway
      • PIER nodes participate in the Gnutella network
        as ultrapeers.
      • Selectively publishes the observed horizon of
        the ultrapeer (files in leaf nodes and query
        results).
      • Uses PIER for searching rare files; relies on
        Gnutella for searching highly replicated files.
  • Live deployment on PlanetLab
      • 50 PIER-Gnutella ultrapeers.
      • Publish the results of queries with fewer than
        20 results.
      • Execute in PIER the leaf queries that return no
        results after 60 seconds.
      • 20% reduction in queries with no results.
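
To make the publishing scheme above concrete, here is a small Python sketch of an inverted index stored in a DHT and queried with a Boolean AND. It is not PIER's API: `ToyDHT`, `publish`, `search`, `tokenize`, and the key prefixes are hypothetical, and the DHT is simulated in one process by hashing each key to one of several local stores. The two tables mirror FileInformation(docID, location, filename) and InvertedIndex(keyword, docID); the cutoff of 20 results comes from the deployment described above.

```python
import hashlib
import re
from collections import defaultdict


class ToyDHT:
    """In-process stand-in for a DHT: each key hashes to one node-local store."""

    def __init__(self, num_nodes=8):
        self.nodes = [defaultdict(set) for _ in range(num_nodes)]

    def _home(self, key):
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:4], "big") % len(self.nodes)]

    def put(self, key, value):
        self._home(key)[key].add(value)

    def get(self, key):
        return set(self._home(key).get(key, ()))


def tokenize(text):
    """Split a filename or query into lowercase alphanumeric keywords."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def publish(dht, doc_id, location, filename):
    """Publish one FileInformation tuple and one InvertedIndex tuple per keyword."""
    dht.put("file:" + doc_id, (location, filename))
    for keyword in tokenize(filename):
        dht.put("kw:" + keyword, doc_id)


def search(dht, query):
    """AND query: intersect posting lists on docID, then fetch file information."""
    postings = [dht.get("kw:" + keyword) for keyword in tokenize(query)]
    if not postings:
        return []
    doc_ids = set.intersection(*postings)
    return [(doc_id, location, filename)
            for doc_id in sorted(doc_ids)
            for location, filename in sorted(dht.get("file:" + doc_id))]


def should_publish(num_results, threshold=20):
    """Gateway rule from the deployment: publish results of queries with few results."""
    return num_results < threshold
```

Each keyword lookup contacts a single DHT node, so a rare item is found without flooding; the cost is shipping posting lists between nodes for the intersection, which is where the bandwidth concern comes from.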

Querying Gnutella with PIER
Demo: K-horizon Gnutella crawl from multiple
ultrapeers
  • PIER recursive queries
      • Output of one operator tree is placed in a
        storage mechanism (the DHT) and is the input to
        either that tree or another tree.
      • Useful for querying network graphs (web
        hyperlinks, network topologies).
  • Schema: Node(src, hop, ultrapeer, files),
    Link(src, dst)
  • Avoid redundant work when the horizons of
    ultrapeers overlap.
  • Increased aggregate bandwidth and throughput.
  • Potential to exploit geographic proximity.
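
The recursive K-horizon crawl can be sketched as a semi-naive fixpoint in Python, where each round's output becomes the next round's input. This is an illustration under assumptions, not PIER's query plan: the function name is hypothetical, the Node tuples are reduced to (origin, hop, node) with the `files` column omitted, and the reading of `src` as the crawl's starting ultrapeer is a guess at the schema. Links are treated as directed, so a caller would add both directions for an undirected overlay.

```python
from collections import defaultdict


def k_horizon_crawl(links, sources, k):
    """Return the set of (origin, hop, node) tuples within k hops of each origin.

    links   -- iterable of (src, dst) pairs, playing the role of Link(src, dst)
    sources -- starting ultrapeers, one crawl per origin
    """
    neighbours = defaultdict(set)
    for src, dst in links:
        neighbours[src].add(dst)

    node_table = {(origin, 0, origin) for origin in sources}
    seen = {(origin, origin) for origin in sources}
    delta = set(node_table)  # tuples derived in the previous round only

    while delta:
        new = set()
        for origin, hop, node in delta:
            if hop == k:
                continue  # horizon reached for this origin
            for nxt in neighbours[node]:
                if (origin, nxt) not in seen:
                    seen.add((origin, nxt))
                    new.add((origin, hop + 1, nxt))
        node_table |= new
        delta = new  # feed this round's output back in as input
    return node_table
```

Because the adjacency table is built once and shared by every origin, nodes that fall inside several overlapping horizons are expanded from the same stored Link data rather than being re-crawled per ultrapeer.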