A Step Back: Reflections on P2P Techniques

1
A Step Back: Reflections on P2P Techniques
  • Indranil Gupta
  • March 16, 2006
  • CS 598IG. SP06.

2
Let's Keep It Short Today
  • 2 P2P or Not 2 P2P?
  • Scooped, Again

3
2 P2P or Not 2 P2P?
  • Mema Roussopoulos
  • Mary Baker
  • David S. H. Rosenthal
  • TJ Giuli
  • Petros Maniatis
  • Jeff Mogul

4
Candidate problems
Kerry
  • Internet Routing (RON)
  • Resource Sharing (PlanetLab)
  • Cooperative Web Caching
  • Internet Backup and Corporate Backup
  • Distributed Digital Libraries
  • Distributed Monitoring
  • Ad hoc Routing in Disaster Recovery
  • Metropolitan-area Cell Phone Forwarding

5
Ideal P2P properties
  • Self Organizing
  • P2P routing
  • Discovery
  • Symmetric communication
  • Peers are approximately equal
  • Decentralized control
  • No single point of failure

6
P2P Networks
Usenet
Gnutella
Images from http://www.cybergeography.org/atlas/more_topology.html
7
2 P2P or Not 2 P2P?
Budget
Relevance
Trust
8
Budget
Low
Effect
High
  • Lowest possible cost per peer, rather than lowest
    global cost
  • BitTorrent, Gnutella, Freenet, etc.
  • SETI@home
  • Dictates how many peers join
  • Decides if P2P is viable for problem
  • Worries less about performance criticality
  • Favors centralized approaches, P2P irrelevant
  • Clusters, High performance computing

9
Relevance
Low
Effect
High
  • Personal data
  • Private data
  • Internet backup
  • Corporate backup
  • Web caching
  • Relevance of resources encourages peers to join
  • When resource relevance is high, cooperation in
    a P2P solution evolves naturally
  • File sharing
  • Freenet
  • Content distribution
  • Internet routing
  • BitTorrent
  • Gnutella
  • Kazaa

10
Trust
Low
Effect
High
  • Encryption
  • Anonymity
  • Freenet
  • Oceanstore
  • Ivy
  • Timestamping
  • MojoNation
  • Mutual trust
  • Risks
  • Gnutella
  • Napster
  • Overlays
  • File sharing
  • Usenet

11
Rate of Change
Low
Effect
High
  • Tangler
  • Freenet
  • LOCKSS
  • Time stamping
  • Content distribution
  • Usenet
  • Flash crowds
  • Churn
  • Timeliness
  • Consistency
  • Internet routing
  • Online net monitoring

12
Criticality
Low
Effect
High
  • Usenet
  • Content distribution
  • Offline net study
  • File sharing
  • Centralized control
  • Accountability
  • Fault tolerance
  • Ad hoc disaster recovery
  • Flash crowds
  • Internet monitoring
  • Routing

13
2 P2P or Not 2 P2P?
Budget
Relevance
Trust
14
Conclusion
  • Framework for analyzing P2P applications
  • Captures constraints and app requirements
  • Limited budget is motivating factor
  • Problems with low relevance are inappropriate for
    P2P
  • Same as our Penny Lane motivation for P2P
    systems
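
The framework can be read as a sequence of checks along the five axes from the preceding slides. The following is a minimal sketch under stated assumptions, not the authors' actual decision tree: the axis order is taken from the slides, the two hard outcomes come from the bullets "Favors centralized approaches, P2P irrelevant" (high budget) and "Problems with low relevance are inappropriate for P2P", and treating the remaining three axes as caution flags is an assumption made here for illustration. The `Problem` class and `assess` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    """A candidate problem rated on the five axes from the slides."""
    name: str
    budget_high: bool
    relevance_high: bool
    mutual_trust_high: bool
    rate_of_change_high: bool
    criticality_high: bool


def assess(p: Problem) -> str:
    """Walk the axes in slide order: budget, relevance, trust, change, criticality."""
    # High budget favors centralized approaches (Budget slide).
    if p.budget_high:
        return "centralized favored"
    # Low relevance is called inappropriate for P2P (Conclusion slide).
    if not p.relevance_high:
        return "P2P inappropriate"
    # Assumption: the remaining axes only raise cautions, they do not veto.
    cautions = []
    if not p.mutual_trust_high:
        cautions.append("low mutual trust")
    if p.rate_of_change_high:
        cautions.append("high rate of change")
    if p.criticality_high:
        cautions.append("high criticality")
    if cautions:
        return "P2P possible, with caution: " + ", ".join(cautions)
    return "P2P good fit"
```

The ratings passed in are the analyst's own judgment; the sketch only makes the order of the checks explicit, which is the point the critique on the next slide questions.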

15
Critique
  • Strengths
  • Quantifies application requirements and suitable
    use cases
  • Generically describes suitability of classes of
    P2P apps
  • Weaknesses
  • Is high-churn P2P inappropriate? Or are most
    current non-Kelips solutions insufficient?
  • Why can't P2P systems handle critical
    applications? It's a question of developing the
    right (e.g., real-time) technologies.
  • Why is the order of preference budget > relevance
    > trust > churn > criticality? Why not a
    different ordering?
  • Fuzzy requirements not accounted for
  • Other requirements: will they evolve as new P2P
    applications emerge?

16
Scooped, Again
  • Jonathan Ledlie
  • Jeff Shneidman
  • Margo Seltzer
  • John Huth

17
Outline
  • Introduction
  • Grid Computing
  • P2P Systems
  • Fallacies preventing cooperation
  • Shared and Disjoint Problems
  • Conclusions

What they are, Goals, Manifestations
18
Introduction
  • Peer-to-Peer vs. Grid Computing
  • Overlapping problem domain
  • P2P focuses on research
  • Grid is concerned with concrete, tangible
    solutions
  • History, repeated: the Web

19
Introduction cont.
  • Current trends
  • Divergent, parallel development
  • Duplication of work
  • Grid: risk of non-optimal solutions
  • Missing out on P2P's strong achievements (search
    and storage scalability, decentralization,
    anonymity, denial of service prevention)
  • Cooperation is the key

20
Grids
  • What is the Grid?
  • A type of parallel and distributed system that
    enables the sharing, selection, and aggregation
    of resources distributed across multiple
    administrative domains based on the resources'
    availability, capability, performance, cost, and
    users' QoS requirements
  • Short version: virtualizing computer resources
  • Large scale heterogeneous resource sharing
    (different platforms, hardware/software
    architectures, and computer languages)
  • Functional classification
  • Computational grids (run batch jobs during idle
    times)
  • Data grids
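
The "selection" part of the definition above can be pictured as filtering and ranking resources on the listed attributes. This is a minimal sketch with made-up fields and thresholds, not the interface of any Grid middleware: `Resource`, `select_resources`, and the chosen attributes (availability, core count as capability, hourly cost) are hypothetical, and real schedulers weigh many more factors.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """A hypothetical resource advertised by some administrative domain."""
    name: str
    available: bool
    cpu_cores: int
    cost_per_hour: float


def select_resources(resources, min_cores, max_cost_per_hour):
    """Keep resources that meet the user's requirements, cheapest first."""
    eligible = [
        r for r in resources
        if r.available
        and r.cpu_cores >= min_cores
        and r.cost_per_hour <= max_cost_per_hour
    ]
    # Sort by cost, then name, so the result order is deterministic.
    return sorted(eligible, key=lambda r: (r.cost_per_hour, r.name))
```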

21
Grid Layout
22
Grid Goals
  • Design goal
  • Solve problems too big for a single
    supercomputer, but retain the flexibility to work
    on multiple smaller problems
  • Self-configuring, self-tuning, self-healing
  • Allow data sharing and support computation across
    administrative domains
  • Standardized programming interface
  • GGF (Global Grid Forum)
  • Globus Toolkit: the de facto standard for grid
    middleware

23
Grid Manifestations
  • Protocols
  • Resource management
  • Grid Resource Allocation Management Protocol
    (GRAM)
  • Information services
  • Monitoring and Discovery Service (MDS)
  • Security services
  • Grid Security Infrastructure (GSI)
  • Data movement and management
  • Global Access to Secondary Storage (GASS),
    GridFTP
  • Tools
  • Grid Portal Software (GridPort, OGCE)
  • Grid Packaging Toolkit
  • Grid-enabled MPI (MPICH-G2)
  • Network Weather Service
  • Condor (CPU cycle scavenging) and Condor-G (job
    submission)
  • APIs
  • Web Services: Open Grid Services Architecture
    (OGSA)

24
P2P
  • What is P2P?
  • A class of applications that take advantage of
    resources (storage, cycles, content, human
    presence) available at the edges of the
    Internet
  • Decentralized, non-hierarchical node organization
  • Inherently untrusted (well)

25
P2P Goals
  • Cost sharing / reduction
  • Every peer responsible for its own cost
  • Reduction of file storage costs
  • Reduction of computation costs
  • Improved scalability / reliability
  • Lack of centralization allows new algorithms
    (CAN, Chord, etc.) to be designed to allow
    improved scalability
  • Resource Aggregation
  • Every peer lends its own resources to the network
  • Increased Autonomy
  • Tasks are performed locally; no central service
    provider

26
P2P Goals cont.
  • Anonymity / Privacy
  • Freenet
  • Dynamism
  • Nodes enter and leave the system in a transparent
    way
  • Ad-hoc communication
  • Members can join and leave based on their
    physical location or interests

27
Summary
  • Grids
  • Parallel, distributed systems concerned with
    resource sharing, selection, aggregation
  • Resource availability, capability, performance,
    cost, and user QoS requirements are considered
  • Self-configuring, self-tuning, self-healing
  • Idle cycle and storage utilization
  • P2P
  • Distributed systems that take advantage of
    resources scattered throughout the Internet
  • Decentralized, non-hierarchical node organization
  • Concerned with fault-tolerance, scalability,
    availability, etc.
  • Idle cycle and storage utilization

28
Summary cont.
  • Grid
  • Distributed computation
  • distributed.net
  • SETI@home
  • Data production / aggregation
  • P2P
  • Distributed file sharing
  • Gnutella, KaZaA
  • Distributed computation
  • distributed.net
  • Anonymity
  • Freenet, Publius

29
Outline
  • Introduction
  • Grid Computing
  • P2P Systems
  • Fallacies preventing cooperation
  • Shared and Disjoint Problems
  • Conclusions

What they are, Goals, Manifestations
30
Fallacies preventing cooperation
  • The technical problems in Grid systems are
    different from those in P2P systems
  • Usage misconception: Grid for computing problems,
    P2P for file sharing
  • Data handling and data production in Grid systems
    have become important
  • P2P used in desktop collaboration and network
    computation
  • Open problems in both camps have striking
    similarities

31
Fallacies preventing cooperation
  • While the technical problems are similar, the
    architectures (physical topology, bandwidth
    availability and use, trust model, etc.) demand
    that the specific solutions be fundamentally
    different
  • Solving common problems through sharing good
    ideas from each community
  • Application-dependent special requirements are
    tailored to application needs; however, the
    technical approaches for solving a particular
    problem could benefit both communities

32
Fallacies preventing cooperation
  • Grid projects do not have the flexibility to try
    new algorithms/ideas because they have to get
    real work done. P2P research is all about this
    flexibility
  • Grid has room for flexible research, too
  • Testing new applications and protocols
  • Users willing to adopt different technologies to
    get the work done

33
Outline
  • Introduction
  • Grid Computing
  • P2P Systems
  • Fallacies preventing cooperation
  • Shared and Disjoint Problems
  • Conclusions

What they are, Goals, Manifestations
34
Shared problems
  • Topology Formation
  • Node join and neighbor discovery
  • Work has been done by both groups
  • Grid: On fully decentralized resource discovery
    in grid environments
  • P2P: Self-organization in P2P systems
  • Grid infrastructure is not flexible: hard coded
  • Could benefit from P2P research prototypes
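
Node join and neighbor discovery can be illustrated with a toy in-memory overlay. This is a minimal sketch, assuming a newcomer already knows one bootstrap peer and simply samples neighbors from that peer's view; it is not the protocol of Gnutella or of any Grid system, and `Peer`, `join`, and `max_neighbors` are hypothetical names.

```python
import random


class Peer:
    """A node in a toy unstructured overlay (in-memory only, no networking)."""

    def __init__(self, peer_id):
        self.peer_id = peer_id
        self.neighbors = set()

    def join(self, bootstrap, max_neighbors=3, rng=None):
        """Discover neighbors through a known bootstrap peer and link to them."""
        rng = rng or random.Random()
        # Candidates: the bootstrap peer plus everyone it already knows.
        candidates = {bootstrap} | bootstrap.neighbors
        candidates.discard(self)
        ordered = sorted(candidates, key=lambda p: p.peer_id)
        chosen = rng.sample(ordered, min(max_neighbors, len(ordered)))
        for peer in chosen:
            # Links are symmetric, matching the "symmetric communication" ideal.
            self.neighbors.add(peer)
            peer.neighbors.add(self)
```

Usage: create `a = Peer(1)`, then call `Peer(2).join(a)`; each later peer that joins through `a` picks up to `max_neighbors` of the peers `a` can see.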

35
Shared problems cont.
  • Utilization
  • Resource discovery, data retrieval
  • P2P hash-based look-up schemes are useful
  • Resource management / optimization
  • How to best utilize resources in a network
  • Data replication/caching examined by both
    communities
  • Scheduling and handling of contention
  • P2P focus: bandwidth usage (e.g., Gnutella)
  • Grid focus: scheduling
  • Load balancing: break large tasks into
    distributed smaller ones
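
The hash-based lookup schemes mentioned above can be sketched as a consistent-hashing ring: keys and nodes are hashed onto the same circular space, and a key belongs to the first node at or after its position. This is a simplified, single-process illustration with global knowledge of all nodes; it is not Chord's or CAN's actual routing, and `ring_position`, `HashRing`, and the 32-bit ring size are assumptions made for the example.

```python
import bisect
import hashlib


def ring_position(key: str, bits: int = 32) -> int:
    """Map a string onto a circular identifier space of 2**bits positions."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class HashRing:
    """Toy ring: every node name is known locally, so lookup is one search."""

    def __init__(self, node_names):
        self._ring = sorted((ring_position(n), n) for n in node_names)
        self._positions = [pos for pos, _ in self._ring]

    def lookup(self, key: str) -> str:
        """Return the node responsible for key (its successor on the ring)."""
        if not self._ring:
            raise LookupError("ring has no nodes")
        idx = bisect.bisect_left(self._positions, ring_position(key))
        # Wrap around to the first node when the key hashes past the last one.
        return self._ring[idx % len(self._ring)][1]
```

One property worth noting: removing a node that does not own a given key leaves that key's owner unchanged, which is why such schemes cope with nodes joining and leaving.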

36
Shared problems cont.
  • Coping with Failure
  • P2P: lossy storage model (Freenet, Gnutella)
  • Considerations for Grid adaptability
  • Different common loss model
  • Storage size (O(petabyte/month))
  • Security-related issues
  • Authenticity: verification of data/computation
  • Availability: resilience to DoS attacks
  • Authorization: ACLs
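
One common way to verify the authenticity of data fetched from an untrusted peer is to compare its cryptographic digest against a digest obtained from a trusted source. This is a minimal sketch of that general idea, not the mechanism of any system named on these slides; `digest_of` and `is_authentic` are hypothetical helpers, and how the expected digest is distributed securely is assumed to be solved elsewhere.

```python
import hashlib
import hmac


def digest_of(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_authentic(data: bytes, expected_digest: str) -> bool:
    """Check received data against a digest obtained from a trusted source."""
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(digest_of(data), expected_digest)
```

Verifying a computation rather than stored data is harder; a digest check only shows that the bytes received are the bytes that were published.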

37
Shared problems cont.
  • Maintenance
  • P2P: essentially no standards or APIs
  • Efforts: Berkeley BOINC, Google Compute,
    overlay standardization
  • Grid: pushes for a standardized API
  • GGF (Global Grid Forum)
  • OGSA (Open Grid Services Architecture)
  • Web-services-oriented API; Globus as reference
    implementation

38
Disjoint Problems
  • Anonymity
  • Not really useful for Grid systems, yet

39
Conclusions
  • A lot of overlap between the goals and research
    interests of the two communities
  • P2P community needs to consider the needs of the
    Grid users to see how existing research can be
    applied successfully to Grid problems
  • Aim for common standards as much as possible

40
Critique
  • Since this paper was published (2003), a little
    bit of convergence has happened, but not as much
    as predicted by these authors and as predicted by
    Foster et al.
  • Will it just take more time?
  • (Skeptic's Viewpoint) Really? Aren't P2P and Grid
    two different areas?
  • They still have mostly-disjoint research
    communities
  • Or is that an opportunity for more research?

41
Have a good Break!
  • Remember: midterm report (with initial
    experimental data) is due April 2!