IEG5270 Advanced Topics in P2P Networking: Introduction

Provided by: courseIe

1
IEG5270 Advanced Topics in P2P Networking
Introduction
  • Dah Ming Chiu
  • Chinese University of Hong Kong

2
What is P2P networking?
  • Which P2P applications are you aware of?
  • What distinguishes a P2P application from a
    non-P2P application?
  • Why are we interested in P2P networking?

3
P2P traffic in the Internet

4
Client-server applications
  • Traditional applications are all client-server
  • A service provider must set up a server, e.g.
  • Email server
  • Web server
  • A client is configured with the server's address/port
  • Server responds to clients one by one

5
Limitations of client/server applications
  • Scalability
  • Compute power, access bandwidth, storage
  • Costs
  • Server, bandwidth, power, management
  • Privacy concerns
  • Some are for illegal reasons
  • The need for fixed address/port

6
Peer-to-peer applications
  • In a pure p2p network, every node is both a
    client and a server
  • As the number of clients increases, the number of
    servers also increases
  • Perfectly scalable
  • Also:
  • Distributes costs
  • Increases privacy
  • May use dynamic addresses

7
Hurdles for p2p applications
  • How to find where things are?
  • instead of static configuration of servers
  • How to make peers able to help each other?
  • A peer usually needs to acquire the right content
    to be able to serve others
  • How to deal with unreliable peers?
  • Peers come and go ("churn")
  • How to make peers willing to help each other?
  • Incentives
  • Why should other peers be trustworthy?

8
A short history of some noteworthy p2p
applications
  • Napster (1999-2002)
  • First p2p file sharing application
  • Relies on a centralized directory to help find
    content
  • Once peer X knows peer Y has what it wants, X
    contacts Y directly
  • X and Y often exchange copyright-protected
    material
  • Napster's server was shut down due to lawsuits
  • Another company bought Napster and still uses the
    name to operate an online music shop
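
The centralized-directory idea above can be sketched in a few lines. This is only an illustration, not Napster's actual protocol: the `Directory` class and its method names are hypothetical, and the file transfer itself (X contacting Y directly) is left out.

```python
class Directory:
    """Toy central index: maps a title to the peers that announced it."""

    def __init__(self):
        self.index = {}  # title -> set of peer names

    def publish(self, peer, titles):
        # A peer tells the directory which titles it can serve.
        for title in titles:
            self.index.setdefault(title, set()).add(peer)

    def unpublish(self, peer):
        # Called when a peer leaves; its entries are dropped.
        for holders in self.index.values():
            holders.discard(peer)

    def search(self, title):
        return sorted(self.index.get(title, ()))


directory = Directory()
directory.publish("Y", ["song.mp3", "talk.mp3"])
directory.publish("Z", ["song.mp3"])

# Peer X asks the directory, then would contact one of the holders directly.
holders = directory.search("song.mp3")
```

Only the lookup goes through the server; the content never does, which is why the directory is both the cheap part of the system and its single point of failure.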

9
History - cont
  • Gnutella (2000-?)
  • Completely distributed p2p file sharing
  • Peers form an overlay network: each peer knows
    its neighbors
  • Each peer floods its request to all other peers
  • Prohibitive overheads
  • Open source
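
To see why flooding is costly, here is a minimal sketch of a query flooded over an overlay with a hop limit. The overlay, the `holdings` map and the `ttl` parameter are made-up illustrations; unlike a real client, this toy version also sends a copy back towards the neighbor the query came from.

```python
from collections import deque


def flood_query(overlay, start, wanted, holdings, ttl=3):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent)."""
    seen = {start}
    hits = set()
    messages = 0
    frontier = deque([(start, ttl)])
    while frontier:
        peer, hops_left = frontier.popleft()
        if wanted in holdings.get(peer, ()):
            hits.add(peer)
        if hops_left == 0:
            continue
        for neighbor in overlay[peer]:
            messages += 1  # every forwarded copy costs a message, even duplicates
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops_left - 1))
    return hits, messages


overlay = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "E"],
    "D": ["B", "E"],
    "E": ["C", "D"],
}
holdings = {"E": {"song.mp3"}}
hits, messages = flood_query(overlay, "A", "song.mp3", holdings, ttl=2)
```

Even in this five-peer overlay, finding one copy two hops away takes 8 messages, most of them duplicates.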

10
History - cont
  • Kazaa or FastTrack (2001-2003)
  • Partly centralized, relies on super-nodes
  • Also met with lawsuits
  • Skype (2003-now)
  • Created by the same people who created Kazaa
  • Uses technology similar to Kazaa's to discover VoIP
    destinations, instead of illegal content
  • Plus other bells and whistles (codec, encryption,
    instant messaging, friends list/presence)
  • Provides relay service if firewalls/NAT prevent
    connectivity

11
History - cont
  • BitTorrent (2002-now)
  • Does not deal with the content discovery problem
    (avoiding legal problem)
  • Solves another problem: how to distribute a file
    to a (large) number of peers
  • Divide a file into many parts (chunks)
  • Give chunks to different peers
  • Peers rely on tracker to find other peers and
    what chunks they have
  • Content flows through multiple, ad-hoc trees
  • Several variations, and several other protocols
    (e.g. eDonkey)
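
A minimal sketch of the two ideas above: cutting a file into chunks, and a tracker that helps peers find each other. It follows the slide's simplification that the tracker also knows who has which chunk; in the deployed protocol the tracker returns only a peer list and peers tell each other which pieces they hold. The `Tracker` class and the peer names are hypothetical.

```python
def split_into_chunks(data: bytes, chunk_size: int):
    """Cut `data` into consecutive chunks of at most `chunk_size` bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class Tracker:
    """Toy tracker: remembers which peer announced which chunk indices."""

    def __init__(self):
        self.have = {}  # peer name -> set of chunk indices

    def announce(self, peer, chunk_indices):
        self.have.setdefault(peer, set()).update(chunk_indices)

    def peers_with(self, index):
        return sorted(p for p, chunks in self.have.items() if index in chunks)


data = b"abcdefghij"
chunks = split_into_chunks(data, 4)

tracker = Tracker()
tracker.announce("seed", range(len(chunks)))  # the seed has everything
tracker.announce("peer1", [0])
tracker.announce("peer2", [1, 2])

sources_for_chunk_2 = tracker.peers_with(2)
```

Because different peers hold different chunks, a newcomer can fetch from several of them at once instead of queueing at the seed.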

12
History - cont
  • PPLive (2005-now)
  • Uses technology similar to BT to stream video (TV)
  • Needs to have a directory of TV channels/programs
  • Some earlier start-ups had legal problems
  • Mainly popular in China
  • Tens of millions of steady customers
  • Several competitors: PPStream, UUC

13
Summary
  • Success stories
  • Skype: p2p VoIP
  • BitTorrent: p2p file sharing
  • PPLive: p2p streaming
  • Some other client-server applications may change to
    the p2p model
  • Microsoft's study of using P2P for VoD service
    (SIGCOMM 2007)
  • Google to use p2p for YouTube?

14
Other notable P2P projects
  • LOCKSS (Lots of Copies Keep Stuff Safe)
  • Data preservation mechanism (suppose libraries
    want to keep books preserved, even when
    publishers die)
  • Data replicated many times
  • How to deal with sabotage? Peers do periodic
    voting to validate all copies.
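
The voting idea can be illustrated with a plain majority vote over content hashes. This is a deliberately simplified sketch, not the LOCKSS protocol itself; the `poll` function and the library names are hypothetical.

```python
import hashlib
from collections import Counter


def poll(copies):
    """copies: peer -> bytes. Return (majority digest, peers whose copy disagrees).

    Returns (None, []) when no digest is held by a strict majority.
    """
    digests = {peer: hashlib.sha256(data).hexdigest() for peer, data in copies.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        return None, []
    damaged = sorted(peer for peer, digest in digests.items() if digest != winner)
    return winner, damaged


copies = {
    "lib1": b"chapter one",
    "lib2": b"chapter one",
    "lib3": b"chapter one",
    "lib4": b"chapter 0ne",  # a corrupted or sabotaged copy
}
winner, damaged = poll(copies)
```

A peer that lands in `damaged` can repair itself by fetching the content again from a peer that voted with the majority.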

15
Other P2P projects
  • SETI@home (Search for Extra-Terrestrial
    Intelligence)
  • Peers devote their compute power to help analyze
    radio signals from a telescope, to check for
    signs of ET
  • P2P because peers are autonomous, operation is
    distributed
  • A grid computer, really

16
Other P2P projects
  • RON (Resilient Overlay Network)
  • In the Internet, packets do not always flow along
    the shortest paths due to ISP peering
  • Peers help relay packets to support shortest path
    routing as much as possible
  • Peers form an overlay network, and check delay
    between each pair of neighboring nodes
  • RON demonstrated that it can provide lower delay
  • Overlay versus p2p
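
The relay idea can be sketched as follows, assuming each peer has measured delays to the others and that at most one relay is used. The delay numbers are invented and `best_path` is a hypothetical helper, not RON's routing code.

```python
def best_path(delay, src, dst):
    """delay[a][b] is the measured delay from a to b.

    Compare the direct path with every one-relay detour and return
    (path, total delay) for the fastest option.
    """
    best = ([src, dst], delay[src][dst])
    for relay in delay:
        if relay in (src, dst):
            continue
        total = delay[src][relay] + delay[relay][dst]
        if total < best[1]:
            best = ([src, relay, dst], total)
    return best


# Delays in milliseconds (made-up numbers): the direct A-C path is slow.
delay = {
    "A": {"B": 30, "C": 120},
    "B": {"A": 30, "C": 40},
    "C": {"A": 120, "B": 40},
}
path, total = best_path(delay, "A", "C")
```

Here relaying through B takes 70 ms against 120 ms on the direct Internet path, which is the kind of gain the overlay is meant to capture.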

17
Instant messaging vs p2p
  • Some consider ICQ as the first p2p application
  • In some sense, it is similar to Skype
  • The communication part is p2p
  • The search part may be partly via a centralized
    server
  • Instant messaging may be integrated with file
    sharing and streaming, as well as VoIP (e.g. QQ
    and QQlive, a p2p application in China)

18
Summary of other p2p applications
  • The idea of P2P overlaps with application layer
    infrastructures, e.g.
  • Grid computing
  • Overlay networks
  • Distributed databases, search
  • DHT is a building block for many such
    infrastructures

19
Important building blocks
  • How to find things?
  • Centralized approach
  • Flooding
  • Partially distributed approach
  • DHT
  • How to distribute content to multiple peers?
  • IP multicast
  • Structured tree(s) (or push)
  • Unstructured (or data-driven, or pull)

20
Distributed Hash Tables
  • Each object (e.g. file, chunk) is mapped to a key
  • The key space is partitioned among the peers
  • Given a key, you know where to store/retrieve the
    object
  • There are quite a few DHT proposals
  • Chord, CAN, Pastry
  • Every DHT supports one main operation
  • Given a key, route to a peer that holds the key
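
A minimal sketch of the key-to-peer mapping, assuming a ring-shaped key space in which each key belongs to the first peer at or after it (the convention used by Chord-like designs). The 16-bit key space, the `Ring` class and the peer names are illustrative assumptions, and the class cheats by knowing every peer; a real DHT reaches the owner by routing.

```python
import hashlib
from bisect import bisect_left


def key_of(name: str, bits: int = 16) -> int:
    """Hash a name (object or peer) onto a ring of 2**bits positions."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    """Toy partition of the key space: a key belongs to the first peer at or after it."""

    def __init__(self, peers):
        self.points = sorted((key_of(p), p) for p in peers)
        self.ids = [ident for ident, _ in self.points]

    def owner(self, key: int) -> str:
        i = bisect_left(self.ids, key)
        return self.points[i % len(self.points)][1]  # wrap around the ring


peers = ["peer-a", "peer-b", "peer-c", "peer-d"]
ring = Ring(peers)
owner = ring.owner(key_of("movie.avi/chunk-7"))
```

Storing and retrieving use the same mapping, so any peer that knows the object's name can compute where to look.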

21
DHT cont
  • A DHT should have the following properties
  • Decentralization: no central management needed
  • Scalability: supports millions of nodes
  • Reliability: still works when peers leave and
    join
  • Other properties
  • Small diameter, small degree, get to destination
    node in O(log n) steps
  • Compared to index server: no single point of
    failure
  • Compared to flooding: more efficient
  • Limitation: supports lookup, not search
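
The logarithmic hop count can be illustrated with a Chord-style lookup in which each node keeps shortcuts ("fingers") at distances 1, 2, 4, ... around the ring. This is a simplified sketch: fingers are recomputed from a global sorted node list instead of being maintained per node, and the 8-bit identifier space and node identifiers are arbitrary choices.

```python
from bisect import bisect_left

M = 8  # identifier bits (assumed); the ring has 2**M positions


def successor(nodes, ident):
    """First node at or clockwise after `ident` (`nodes` must be sorted)."""
    i = bisect_left(nodes, ident % (1 << M))
    return nodes[i % len(nodes)]


def between(x, a, b, inclusive_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] if requested."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b


def lookup(nodes, start, key):
    """Route from `start` towards `key`; return (owner of key, hops taken)."""
    node, hops = start, 0
    while True:
        succ = successor(nodes, node + 1)
        if between(key, node, succ, inclusive_right=True):
            return succ, hops
        # Jump to the farthest finger that does not overshoot the key.
        for i in reversed(range(M)):
            finger = successor(nodes, node + (1 << i))
            if between(finger, node, key):
                node = finger
                hops += 1
                break


nodes = [3, 20, 41, 77, 100, 130, 151, 200, 222, 250]
owner, hops = lookup(nodes, start=3, key=210)
```

Each jump uses a shorter finger than the one before, so the number of hops is bounded by the number of identifier bits rather than by the number of nodes.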

22
The magic of streams
  • Traditional multicast
  • Single tree
  • Receivers do not contribute
  • P2P content distribution
  • Use multiple trees to distribute chunks
  • Peers all contribute uplink bandwidth
  • Peers are unreliable
  • Use a data-driven approach to build distribution
    trees dynamically
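
The benefit of peers contributing uplink bandwidth can be quantified with a standard fluid-model bound from the P2P streaming literature: the stream rate cannot exceed the server's uplink, nor the total uplink divided by the number of receivers. The sketch below uses that bound as an assumption (it is not necessarily the model used later in this course) and compares it with a server that unicasts to every receiver.

```python
def max_p2p_rate(server_up, peer_ups):
    """Upper bound on the rate all peers can receive when peers also upload."""
    n = len(peer_ups)
    return min(server_up, (server_up + sum(peer_ups)) / n)


def server_only_rate(server_up, n):
    """Rate per receiver when the server alone uploads to all n receivers."""
    return server_up / n


# Made-up numbers: server uplink 10 Mbps, 100 peers with 1 Mbps uplink each.
p2p = max_p2p_rate(10, [1] * 100)
alone = server_only_rate(10, 100)
```

With these numbers the server alone sustains 0.1 Mbps per receiver, while using the peers' uplinks raises the bound to about 1.1 Mbps.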

23
Summary of building blocks to be covered in this
course
  • DHT
  • Multi-tree content distribution
  • Models for understanding the capacity achieved by
    these algorithms
  • Others: incentive systems, etc.

24
The economics of p2p applications
  • P2P applications induce a huge amount of traffic
    for ISPs
  • P2P applications can often extract profit from
    users in one way or another
  • Skype provides a gateway into phone networks to
    collect money
  • Streaming and VoD service can generate targeted
    advertising opportunities (the same game Google
    plays)

25
Net neutrality
  • ISPs are often prevented from sharing in the profits
    of content providers (who use p2p technology)
  • To extract such profit, ISPs need near-monopoly
    power
  • Governments do not want ISPs to become monopolies
  • Net neutrality is the implicit business model
  • Charge users based on volume of bits, not on
    content

26
The tussle between ISPs and P2P
  • P2P file sharing or streaming determines its
    own routes for content to travel
  • ISPs' settlements depend on their roles in peering
    relationships
  • E.g. a customer ISP pays both transit providers,
    but may be transiting some p2p traffic for them

[Figure: Customer ISP, Transit Provider 1, Transit Provider 2]

27
Tussle (conflict)
  • How should ISPs consider peering decisions in
    view of p2p traffic?
  • How should ISPs provision their peering
    bandwidth?
  • How should ISPs charge subscribers (peers)?
  • How should ISPs manage their p2p traffic?

28
P2P traffic monitoring and detection
  • Traditional ISP settlements are based on
    NetFlow measurements
  • Gives total traffic volume, and volume for each
    5-minute interval
  • ISPs can see traffic types based on well-known
    ports
  • P2P flows may not use well-known ports
  • Deep packet inspection: check signatures in
    payloads
  • What if payload is encrypted?

29
Detecting P2P traffic by flow behavior
  • How many other nodes a node talks to
  • What combination of protocols is used (UDP, TCP
    etc)
  • Patterns in packet sizes
  • Active research area
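
Two of the behavioral signals above (how many other nodes a host talks to, and whether it mixes TCP and UDP) can be combined into a simple heuristic. The flow-record format, the threshold of 20 peers and the function name are assumptions made for this sketch; it is not a validated classifier and ignores packet sizes.

```python
from collections import defaultdict


def suspected_p2p_hosts(flows, min_peers=20):
    """flows: iterable of (src, dst, protocol) records.

    Flag hosts that talk to many distinct nodes and use both TCP and UDP.
    """
    peers = defaultdict(set)
    protocols = defaultdict(set)
    for src, dst, proto in flows:
        peers[src].add(dst)
        protocols[src].add(proto)
    return sorted(
        host for host in peers
        if len(peers[host]) >= min_peers and {"TCP", "UDP"} <= protocols[host]
    )


# Synthetic trace: one host contacts 30 nodes over TCP and UDP, another only a web server.
flows = [("10.0.0.1", f"node{i}", "TCP" if i % 2 else "UDP") for i in range(30)]
flows += [("10.0.0.2", "webserver", "TCP")] * 5
suspects = suspected_p2p_hosts(flows)
```

Unlike port numbers or payload signatures, these signals survive port changes and encryption, although a busy server or a scanner could also trip such a rule.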

30
ISP-friendly P2P
  • ISPs love and hate p2p applications at the same
    time
  • P2P applications are the killer applications for
    broadband access, drawing and keeping customers
  • P2p traffic quickly eats up any added bandwidth
  • ISPs and researchers are looking for p2p
    algorithms and ISP caching services to reduce the
    cross-ISP traffic demands

31
Discussion: the future of p2p
  • Can it ever be made reliable enough as a service?
  • May need to deploy smart servers to help p2p when
    necessary, e.g. in PPLive.
  • Will ISPs change their traffic controls or
    charges to kill p2p?
  • Unlikely in the near term: ISPs need content
    distribution applications
  • P2p is quite effective, and can be made more
    ISP-friendly

32
Commercial interest in P2P
  • Currently there is a lot of commercial interest
    in p2p technology
  • Large investment in Joost
  • Many other start-ups
  • Interest by Microsoft and Google

33
Research interest in p2p
  • Content distribution algorithms
  • For file sharing
  • For streaming
  • For Video on Demand (VoD)
  • DHTs and their use
  • Revisit naming, addressing and routing in the
    next generation Internet
  • Incentive and reputation systems
  • No good incentive system for p2p streaming

34
Research interests - cont
  • Applying network coding to p2p
  • It may not help improve optimal throughput
  • But it can help simplify peer and chunk selection
    strategies in distribution algorithms
  • P2P traffic classification
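
The simplest instance of network coding is the XOR of two equal-length chunks. The sketch below only illustrates why coded chunks ease selection; practical proposals use random linear combinations over larger fields, and the chunk contents here are made up.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


chunk_a = b"PEER"
chunk_b = b"2PEE"
coded = xor(chunk_a, chunk_b)  # one coded chunk combining both originals

# A peer that already holds either original can recover the other one.
recovered_b = xor(coded, chunk_a)
recovered_a = xor(coded, chunk_b)
```

The coded chunk is useful to a peer missing either original, so the sender does not need to know exactly which chunk each receiver lacks.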

35
Compare to a regular course
  • There are some theoretical foundations, but less
    mature than a regular course
  • The long term prospect may not be so clear, but
    the current interest is high
  • Although undergraduates are allowed, the course will
    run like a graduate course
  • No exams, no textbook
  • Read papers, do some projects, learn to do
    research
  • Practice oral presentation and writing

36
Homework projects
  • Streaming algorithm design and simulation
  • Based on YP Zhou's research paper (ICNP 2007)
  • Under some assumptions, try to design the best
    chunk selection strategy
  • Simulate it and compare to YP's algorithm
  • P2P traffic trace analysis
  • Be a detective: find p2p traffic in a trace
  • Find properties of different p2p applications
  • PlanetLab - ?
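
As a starting point for the chunk selection project, here is a sketch of rarest-first, a widely used baseline in which a peer requests the missing chunk held by the fewest neighbors. It is not claimed to be the strategy from the paper mentioned above, and it ignores playback deadlines, which matter for streaming. The function name and the data layout are assumptions.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Pick the missing chunk held by the fewest neighbors (ties: lowest index).

    neighbor_chunks: neighbor name -> set of chunk indices it holds.
    Returns None when no neighbor has anything this peer still needs.
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(set(chunks) - set(my_chunks))
    if not counts:
        return None
    return min(counts, key=lambda c: (counts[c], c))


my_chunks = {0}
neighbor_chunks = {
    "n1": {0, 1, 2},
    "n2": {1, 2},
    "n3": {1, 3},
}
choice = rarest_first(my_chunks, neighbor_chunks)
```

Here chunk 3 is chosen because only one neighbor holds it; a simulation can swap in other strategies behind the same function signature and compare them.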

37
Individual project
  • If there are too many students, we need to make them
    group projects (2 in a group)
  • Three types I can see:
  • Survey a problem in depth (read several papers
    related to the problem and summarize/discuss)
  • Do some research on a given problem
  • Implement some specific P2P application/mechanism
    and demo it.
  • Will give a list of potential topics
  • Will give a list of papers to read

38
Some example topics
  • Most commercial systems (e.g. PPLive) are not
    based on open source. Try to deduce the algorithm
    for specific p2p streaming applications
  • What is P2P-SIP and what are its applications?
  • Is it possible to implement a viable search
    engine based on p2p technology?

39
Assessment
  • Main project: 50%
  • Oral presentation and written report
  • Homework projects: 40%
  • Class participation: 10%