UP2P - PowerPoint PPT Presentation

About This Presentation
Title:

UP2P

Description:

Gnutella, Routing Indices, Limewire, Neurogrid. Query Routing. Napster, Kazaa, Limewire, JxtaSearch. Metadata. Community Problem ... – PowerPoint PPT presentation

Slides: 47
Provided by: alokemukhe
Learn more at: http://www.employees.org
Tags: up2p | limewire


Transcript and Presenter's Notes

Title: UP2P


1
U-P2P
  • A Peer-to-peer System for Description and
    Discovery of Resource-sharing Communities
  • Aloke Mukherjee, Carleton University
  • August 28, 2003

2
Peer-to-peer File-sharing
  • Exploit storage capability of the edge
  • Balance load
  • Robustness to failure
  • Weaknesses: Search and Communities

3
Search Problem
  • Lack of structured metadata
  • Filenames, Keyword matching
  • Opaque identifiers
  • Support for popular formats
  • Ignoring structured metadata
  • Implicit indicators
  • Collaborative filtering

4
State of the Art Search
5
Community Problem
  • Not simple to create a community for sharing a
    new file format
  • Current state
  • Different protocols/apps (gnutella, fasttrack,
    jxtasearch)
  • Inadequate metadata (filename matching, limited
    schemas)
  • Ad-hoc attempts aimed at specific domains
  • Scattered and isolated: there is no easy way to
    discover communities

6
State of the Art Communities
7
Improving Search
  • Standard metadata layer
  • Explicit structured metadata
  • All resources are XML files
  • XML Schema used to describe format (e.g. MP3,
    design pattern)
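
As a rough illustration of "all resources are XML files", the sketch below builds a design-pattern resource and checks it against a list of required fields. The field names (`name`, `author`, `intent`) and the helpers `build_resource` and `missing_fields` are hypothetical. The slides use XML Schema for this step; Python's standard library cannot validate XML Schema, so a plain required-field check stands in for it here.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a schema: field name -> is the field required?
PATTERN_FIELDS = {"name": True, "author": True, "intent": False}

def build_resource(root_tag, values):
    """Serialize a dict of metadata fields as a small XML resource."""
    root = ET.Element(root_tag)
    for field, text in values.items():
        ET.SubElement(root, field).text = text
    return root

def missing_fields(resource, fields):
    """Return the required fields that are absent or empty in the resource."""
    return [name for name, required in fields.items()
            if required and not (resource.findtext(name) or "").strip()]

resource = build_resource(
    "designpattern", {"name": "singleton", "author": "gang of four"}
)
xml_text = ET.tostring(resource, encoding="unicode")
```

A resource missing a required field would be reported by `missing_fields` before it is shared, which is the role the schema plays in the slides.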

8
Schema instantiates resource

[Figure: a design-pattern schema (five fields of type string, one of type anyURI) instantiated as a resource with the values "singleton", "gang of four", "when creating a new class", "ensure a class only has", "make the class itself responsible", and http://example.com/singleton.jpg]
9
Automated interface generation
[Figure: a schema and the resource it instantiates, each passed through an XSLT transform]
10
(No Transcript)
11
(No Transcript)
12
Community Creation and Discovery: What is a
Community?
  • Concrete object with defined tuple of attributes
  • Simplest form: (format, protocol, ...)
  • Known examples
  • (mp3, napster) (video, kazaa)
  • Examples that don't exist
  • (design patterns, gnutella) (p2p papers,
    jxtasearch)
  • Tuple is specified as an XML file
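
A minimal sketch of a community tuple written out as an XML file follows. The values `designpatterns`, `designpattern.xsd`, and `gnutella` come from the slides; the `Community` class and the element name `schema` are assumptions made for illustration, not the actual U-P2P file layout.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

@dataclass(frozen=True)
class Community:
    """Hypothetical in-memory form of a (format, protocol, ...) tuple."""
    name: str
    schema: str    # file describing the shared format
    protocol: str  # network used to locate resources

    def to_xml(self):
        """Write the tuple as a small XML document."""
        root = ET.Element("community")
        ET.SubElement(root, "name").text = self.name
        ET.SubElement(root, "schema").text = self.schema
        ET.SubElement(root, "protocol").text = self.protocol
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text):
        """Read the tuple back from its XML form."""
        root = ET.fromstring(text)
        return cls(root.findtext("name"),
                   root.findtext("schema"),
                   root.findtext("protocol"))

patterns = Community("designpatterns", "designpattern.xsd", "gnutella")
round_trip = Community.from_xml(patterns.to_xml())
```

Because the tuple is just a file, it can be shared and searched like any other resource.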

13
Simplifying Community Creation
[Figure: community tuple made up of designpatterns, designpattern.xsd, gnutella, designpattern.stylesheet]
  • User-designed communities
  • Compose schema to describe format
  • Compose community XML file

14
Community as class

15
Metaclass analogy

16
Community discovery is File discovery
  • MP3 community shares MP3 files
  • Community community shares communities

17
Simplifying Community Discovery
  • A Community for Communities: The Root Community
  • Communities are files shared in a real community
  • Root Community includes schema for communities
  • (format, protocol) = (community, centralized db)
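
The idea that community discovery is ordinary file discovery can be sketched as follows. The `Repository` class is a hypothetical in-memory stand-in for the centralized database, and the XML element names are assumptions; only the community names and protocols are taken from the slides.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ROOT = "root community"

class Repository:
    """Hypothetical store: community name -> list of XML resources."""

    def __init__(self):
        self._docs = defaultdict(list)

    def publish(self, community, xml_text):
        self._docs[community].append(ET.fromstring(xml_text))

    def search(self, community, field, keyword):
        """Case-insensitive substring match on one metadata field."""
        return [doc for doc in self._docs[community]
                if keyword.lower() in (doc.findtext(field) or "").lower()]

repo = Repository()
# Community descriptors are resources shared in the root community.
repo.publish(ROOT, "<community><name>designpatterns</name>"
                   "<protocol>gnutella</protocol></community>")
repo.publish(ROOT, "<community><name>mp3</name>"
                   "<protocol>napster</protocol></community>")
# Ordinary resources are shared in the community they belong to.
repo.publish("designpatterns",
             "<designpattern><name>singleton</name></designpattern>")

# Step 1: discover a community by searching the root community.
found = repo.search(ROOT, "name", "design")
# Step 2: search inside the community that was just discovered.
patterns = repo.search(found[0].findtext("name"), "name", "singleton")
```

The same `search` call serves both steps, which is the point of treating communities as files.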

18
Schema for Communities

... name="name" type="xsd:string"/>
... name="protocol" type="protocolTypes"/>



[Figure: The Root Community, a tuple made up of root community, community.xsd, central-db, community.stylesheet]
19
What is U-P2P?
  • A framework that breathes life into these ideas
  • Explicit metadata search and creation for every
    Community
  • Creation of Community tuples
  • (format, protocol etc)
  • Discovery of Community tuples

20
Design
21
Technologies
  • Java
  • Tomcat Servlet Container
  • JavaServer Pages (JSP), Servlets
  • XSLT (transforms), XPath (queries)
  • Java components for XSLT, XPath (Xerces, Xalan)
  • eXist XML Database
  • Log4j (logging infrastructure), JUnit (unit
    testing)
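
To show what an XPath query over explicit metadata looks like, here is a small sketch using the limited XPath subset in Python's standard library rather than the Java components listed above. The collection is made-up sample data; apart from "singleton" and "gang of four", the names are placeholders.

```python
import xml.etree.ElementTree as ET

# Made-up sample collection; element names are assumptions.
collection = ET.fromstring("""
<resources>
  <designpattern><name>singleton</name><author>gang of four</author></designpattern>
  <designpattern><name>pattern-b</name><author>gang of four</author></designpattern>
  <designpattern><name>pattern-c</name><author>someone else</author></designpattern>
</resources>
""")

# Select every design pattern whose author field has the given text.
matches = collection.findall("./designpattern[author='gang of four']")
names = [m.findtext("name") for m in matches]
```

A filename-matching system could not answer this query, because the author never appears in the filename.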

22
Evaluation and Validation: Areas of Interest
  • Publish and Search times as Community size
    increases
  • Breaking down Publish and Search operations
  • Community effect
  • Multiple central servers

23
Publish
24
Search
25
Community Effect
26
Multiple Central Servers
27
Publish with Multiple Servers
28
Vs. Without Multiple Central Servers
29
Contributions
  • Standard Metadata Layer
  • All communities include support for explicit
    metadata search and creation
  • User-designed Communities
  • Users can easily share new formats with full
    support for metadata
  • Community for Communities
  • Prevents fragmented, isolated communities by
    providing metadata about communities and a
    standard method for discovering them
  • Performance and Scalability Gains
  • Communities can improve performance and
    scalability vs. systems where resources are
    undifferentiated
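
The intuition behind the last point can be sketched with a toy count of how many resources a search has to examine. The data and the `scan` helper are invented for illustration and say nothing about the measured timings in this presentation.

```python
def scan(docs, keyword):
    """Linear search that also reports how many documents it examined."""
    hits = [doc for doc in docs if keyword in doc]
    return hits, len(docs)

# Toy data: resources grouped by community (names are placeholders).
by_community = {
    "designpatterns": ["singleton", "pattern-b"],
    "mp3": ["song-a", "song-b", "song-c"],
}

# Undifferentiated system: every resource is a candidate.
everything = [doc for docs in by_community.values() for doc in docs]
flat_hits, flat_cost = scan(everything, "singleton")

# Community-aware system: only one community is searched.
scoped_hits, scoped_cost = scan(by_community["designpatterns"], "singleton")
```

Both searches find the same resource, but the scoped search examines only the resources of one community.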

30
Future Work
  • Performance improvements
  • Protocol independence (adapters for Gnutella,
    Freenet, etc.)
  • Community-aware Gnutella routing
  • More Community parameters (security,
    authentication, etc.)

31
Future Work continued
  • Trust metrics (to differentiate between
    communities, metadata quality)
  • Community evolution
  • Inheritance and multiple inheritance for
    Communities

32
U-P2P Publications
  • A. Mukherjee, B. Esfandiari, N. Arthorne, "U-P2P:
    A Peer-to-peer System for Description and
    Discovery of Resource-sharing Communities," ICDCS
    Workshops 2002, pp. 701-705, July 2002.
  • N. Arthorne, B. Esfandiari, A. Mukherjee, "U-P2P:
    A Peer-to-peer Framework for Universal Resource
    Sharing and Discovery," Proceedings of Freenix
    track of Usenix 2003, pp. 29-38, June 2003.
  • http://u-p2p.sourceforge.net

33
Backup slides
34
WebAdapter User Interaction Model
35
Repository Design
36
Repository Design: Resource IDs
37
Repository Design: XML Database
  • Requirements
  • Flexibility to store wide variety of formats
  • Handle powerful queries over all metadata
  • XML Database better suited than RDBMS
  • Difficult to map fields to rows and columns
  • Chose eXist XML database
  • Open source
  • Written in Java
  • Support for XMLDB API

38
Network Adapter Design
  • Abstract interface to Peer-to-peer Network
  • Routing search requests, handling results,
    handling incoming search requests, etc.
  • Only implemented Hybrid model (Napster model)
  • All peers can act as client and/or server
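
A minimal sketch of an abstract network interface with one hybrid (central index) implementation is shown below. All class and method names are hypothetical and are not taken from the U-P2P code base; the sketch only shows how a central index can hold metadata and peer addresses while the resources stay on the peers.

```python
from abc import ABC, abstractmethod

class NetworkAdapter(ABC):
    """Hypothetical abstract interface to a peer-to-peer network."""

    @abstractmethod
    def publish(self, peer, community, resource_id, metadata):
        """Announce a resource held by `peer`."""

    @abstractmethod
    def search(self, community, keyword):
        """Return (peer, resource_id) pairs whose metadata matches."""

class CentralIndex:
    """Central server of the hybrid model: stores metadata, not files."""

    def __init__(self):
        self._entries = []  # (community, peer, resource_id, metadata)

    def add(self, community, peer, resource_id, metadata):
        self._entries.append((community, peer, resource_id, metadata))

    def query(self, community, keyword):
        return [(peer, rid) for comm, peer, rid, meta in self._entries
                if comm == community and keyword in meta]

class HybridAdapter(NetworkAdapter):
    """Routes publish and search through a shared central index."""

    def __init__(self, index):
        self._index = index

    def publish(self, peer, community, resource_id, metadata):
        self._index.add(community, peer, resource_id, metadata)

    def search(self, community, keyword):
        return self._index.query(community, keyword)

index = CentralIndex()
alice = HybridAdapter(index)
bob = HybridAdapter(index)
alice.publish("alice:8080", "designpatterns", "r1", "singleton gang of four")
hits = bob.search("designpatterns", "singleton")
```

The search result names the peer that holds the resource, so the download itself can go peer to peer. An adapter for another protocol would implement the same two methods.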

39
Network Adapter Protocol
40
Evaluation and Validation: Challenges
  • Finding large XML collections
  • Berkeley Drosophila Genome Project genome
    annotations
  • Other sources: DBLP (CS papers), EDGAR (SEC
    filings), GeneOntology (gene-related concepts)
  • Transforming DTDs to XML Schema (DTDXS package)
  • Automation
  • XML-RPC interface for publish and search
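
An XML-RPC interface for automated publish and search might look roughly like the sketch below. The method names `publish` and `search`, their parameters, and the in-memory `store` are assumptions. To keep the example self-contained, requests are handed straight to the standard library dispatcher through its internal `_marshaled_dispatch` helper instead of going over HTTP.

```python
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher

store = {}  # hypothetical repository: community name -> list of XML strings

def publish(community, xml_text):
    """Store a resource and return how many the community now holds."""
    store.setdefault(community, []).append(xml_text)
    return len(store[community])

def search(community, keyword):
    """Return the stored resources that contain the keyword."""
    return [doc for doc in store.get(community, []) if keyword in doc]

dispatcher = SimpleXMLRPCDispatcher()
dispatcher.register_function(publish, "publish")
dispatcher.register_function(search, "search")

def call(method, *params):
    """Encode an XML-RPC request, dispatch it in-process, decode the reply."""
    request = xmlrpc.client.dumps(params, methodname=method).encode("utf-8")
    # Internal helper of the standard library dispatcher; used here only
    # to avoid opening a network socket in the example.
    response = dispatcher._marshaled_dispatch(request)
    (result,), _ = xmlrpc.client.loads(response)
    return result

doc = "<designpattern><name>singleton</name></designpattern>"
count = call("publish", "designpatterns", doc)
found = call("search", "designpatterns", "singleton")
```

A test script can loop over a large XML collection and time each `call`, which is the kind of automation the slide refers to.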

41
Publish: Breakdown of Operations
42
Publish: Client Timings
43
Publish: Server Timings
44
Network Adapter Protocol
45
Search: Breakdown of Operations
46
Search: Total vs. Server Timings