Peer to Peer Systems and File Sharing - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Peer to Peer Systems and File Sharing


1
Peer to Peer Systems and File Sharing
  • Carl Lagoze, CS431, Cornell University
  • May 3, 2004

Portions borrowed from sources listed on next
slide
2
Sources of this lecture
  • J. Berkes, Decentralized Peer-to-Peer Network
    Architecture: Gnutella and Freenet
  • R. Morris, Chord+DHash+Ivy: Building Principled
    Peer-to-Peer Systems
  • S. Kamvar, M. Schlosser, H. Garcia-Molina,
    EigenRep: Reputation Management in P2P Networks
  • J. Golbeck, B. Parsia, J. Hendler, Trust Networks
    on the Semantic Web

3
Characteristics of P2P Network
  • Sharing of computing resources by direct exchange
  • Blur between clients, servers, routers
  • Nodes are autonomous

4
P2P Advantages
  • Efficient use of resources
  • Scalability
  • Reliability
  • Administrative simplicity
  • Democracy

5
P2P's Political History
  • Major basis in music sharing context
  • Overshadows numerous applications
  • Recent research is investigating generic
    applicability
  • DHTs
  • Reputation and Trust

6
Small-World Phenomenon
  • Milgram's six degrees of separation (1967)
  • Forwarding of letters from Nebraska to Boston, MA
  • Average chain of about six hops

7
Power Laws and Small Worlds
  • Out-degree distribution is proportional to
    1/K^a, where a > 0
  • Characteristic of a variety of phenomena
  • Web Graph
  • IMDb connection (acted in same movie)
  • Social interactions
  • P2P networks (Gnutella)
  • Epidemiology
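
To make the shape of such a distribution concrete, here is a minimal sketch that normalizes 1/k^a over a finite range of out-degrees. The exponent 2.0 and the cutoff of 100 are illustrative assumptions, not values from the lecture.

```python
def power_law_distribution(a, max_degree):
    """Return P(k) for k = 1..max_degree, with P(k) proportional to 1 / k**a."""
    weights = [1.0 / k ** a for k in range(1, max_degree + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# Assumed values for illustration only.
probabilities = power_law_distribution(a=2.0, max_degree=100)
for k in (1, 2, 10, 100):
    print(f"P(out-degree = {k:3d}) = {probabilities[k - 1]:.5f}")
```

Most of the probability mass sits on low degrees, while a few highly connected nodes remain possible. Those hubs are what keep paths short.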

8
Strength of Weak Ties
  • Extension of power-law phenomenon
  • Short-cuts (between cliques) critical to small
    world phenomenon

9
Napster and P2P
  • Not really P2P
  • Central search index
  • Direct interaction for access (p2p)
  • Central index was key to litigation

10
Gnutella
  • Fully P2P
  • Flooded query
  • Scalability problems
  • TTL controls broadcast
  • Query Memory controls circularity
  • Reliability problems
  • But whom to sue?
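
The flooding behaviour above can be sketched in a few lines. This is a simplified model rather than the Gnutella wire protocol: the overlay graph, the peer names, and the `flood_query` function are hypothetical, and each forward is counted as one message.

```python
from collections import deque

def flood_query(graph, start, has_file, ttl):
    """Flood a query from `start`; return (peers that answered, messages sent)."""
    seen = {start}              # query memory: peers that already handled this query
    hits = []
    messages = 0
    frontier = deque([(start, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if peer in has_file and peer != start:
            hits.append(peer)
        if remaining == 0:
            continue            # TTL exhausted: do not forward any further
        for neighbor in graph[peer]:
            messages += 1       # every forward costs one message
            if neighbor in seen:
                continue        # duplicate query is dropped, which breaks cycles
            seen.add(neighbor)
            frontier.append((neighbor, remaining - 1))
    return hits, messages

# Hypothetical five-peer overlay containing the cycle A-B-C-A.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "E"],
    "D": ["B"],
    "E": ["C"],
}
print(flood_query(overlay, "A", {"D", "E"}, ttl=1))
print(flood_query(overlay, "A", {"D", "E"}, ttl=2))
```

With a TTL of 1 the query never reaches the peers holding the file. Raising the TTL finds them, at the cost of more messages, many of them duplicates that are dropped on arrival.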

11
Kazaa
  • Hybrid of Napster model and Gnutella model
  • Notion of a super peer
  • Like a regionalized Napster server
  • Dynamically chosen by characteristics
  • P2P relationship among super-peers
  • Queries directed towards super-peers

12
Free-riding
  • Definition: downloading but not sharing any data
  • On Gnutella networks, 15% of users contribute 94%
    of content

13
Freenet
  • Goal: create an uncensorable and secure global
    information store
  • Anonymity and fault tolerance
  • http://freenet.sourceforge.net/
  • Three types of network messages
  • Advertise storage space to store unknown data
  • Insert a file to the network
  • Request a file (with a key) from the network
  • Use of one-way secure hashes to identify files
    and encryption to store files
  • Node does not know what it is storing
  • Non-traceability of messages
  • A node cannot determine where its message is
    stored
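
As a rough illustration of identifying a file by a one-way hash, the sketch below derives a key from the stored bytes and lets a node hold the data without knowing what it is. SHA-256, the `content_key` function, and the `StorageNode` class are assumptions made for this example, not Freenet's actual key scheme. Encryption is omitted, so the blob stands in for bytes that are already encrypted.

```python
import hashlib

def content_key(data: bytes) -> str:
    """One-way hash of the stored bytes; the key reveals nothing readable."""
    return hashlib.sha256(data).hexdigest()

class StorageNode:
    """Hypothetical node that stores opaque blobs under hash keys."""

    def __init__(self):
        self._blobs = {}

    def insert(self, key: str, blob: bytes) -> None:
        self._blobs[key] = blob

    def request(self, key: str):
        # Returns None when this node does not hold the key.
        return self._blobs.get(key)

node = StorageNode()
blob = b"opaque bytes, assumed to be encrypted before insertion"
key = content_key(blob)
node.insert(key, blob)
print(key[:16], node.request(key) == blob)
```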

14
Freenet Request/Response Sequence
15
Distributed Hash Tables (DHT)
  • Overcoming the flooded search problem
  • Operationally like standard hash tables
  • Data is distributed around the network
  • Features
  • Efficient
  • O(log N) messages per lookup
  • Even distribution of keys among nodes
  • Adaptable
  • Network reconfiguration does not cascade to all
    nodes
  • Robust: replication of tables provides survival of
    node failures

16
Chord
  • One implementation of DHT within a larger P2P
    project
  • http://www.pdos.lcs.mit.edu/chord/
  • Algorithm properties
  • Common hash function distributes node ID (IP) and
    document ID uniformly
  • Maps a content key to its successor node

17
Chord Key Mapping
  • Circular ID space shared by key IDs and node IDs
  • N10 holds K5, K10
  • N32 holds K11, K30
  • N60 holds K33, K40, K52
  • N80 holds K65, K70
  • N100 holds K100
  • Robustness via each node remembering N successors
    and replicating its table at those successors
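
The key-to-node assignment on this slide can be reproduced with a short sketch. The node and key IDs are the ones on the slide. The `successor` helper is a hypothetical name, and a real Chord node would not hold a sorted list of every node the way this example does.

```python
from bisect import bisect_left

def successor(node_ids, key_id):
    """Return the first node ID at or after key_id, wrapping around the ring."""
    ring = sorted(node_ids)
    index = bisect_left(ring, key_id)
    return ring[index % len(ring)]

nodes = [10, 32, 60, 80, 100]
keys = [5, 10, 11, 30, 33, 40, 52, 65, 70, 100]

placement = {}
for key in keys:
    placement.setdefault(successor(nodes, key), []).append(key)

for node_id in sorted(placement):
    print(f"N{node_id}: " + ", ".join(f"K{k}" for k in placement[node_id]))
```

A key larger than every node ID wraps around to the smallest node, which is what makes the ID space circular.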
18
Use of finger table to avoid linear lookups
The i-th finger table position points to the first
node that succeeds n by at least 2^(i-1)
19
Key location with finger table
  • Use finger table to find furthest node that
    precedes key
  • O(log N) hops lead to the target
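
A minimal sketch of finger-table lookup follows, where entry i of node n's table is the first node that succeeds n by at least 2^(i-1). The 7-bit ID space and all function names are assumptions for illustration. Finger tables are computed from a global node list, which a real deployment would not have.

```python
M = 7                 # assumed identifier width: IDs run from 0 to 127
RING_SIZE = 2 ** M

def successor(nodes, ident):
    """First node ID at or after ident, wrapping around the ring."""
    ring = sorted(nodes)
    for node in ring:
        if node >= ident:
            return node
    return ring[0]

def finger_table(nodes, n):
    """Entry i (1-based) is the first node at or after n + 2**(i - 1)."""
    return [successor(nodes, (n + 2 ** (i - 1)) % RING_SIZE)
            for i in range(1, M + 1)]

def between(x, a, b, inclusive_end=False):
    """True if x lies clockwise after a and before b (optionally up to b)."""
    if inclusive_end and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b   # the interval wraps past zero

def lookup(nodes, start, key):
    """Return (node responsible for key, nodes the query visited)."""
    current, path = start, [start]
    while True:
        fingers = finger_table(nodes, current)
        next_node = fingers[0]              # first finger is the immediate successor
        if between(key, current, next_node, inclusive_end=True):
            return next_node, path
        # Jump to the furthest finger that still precedes the key.
        current = next((f for f in reversed(fingers)
                        if between(f, current, key)), next_node)
        path.append(current)

NODES = [10, 32, 60, 80, 100]
print(finger_table(NODES, 10))
print(lookup(NODES, 10, 70))
print(lookup(NODES, 32, 5))
```

Each jump covers at least half of the remaining distance to the key, which is where the logarithmic hop count comes from.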

20
From DHTs to P-trees
  • DHTs only support equality queries
  • Return the value of resources with ID1
  • Need to support range queries
  • OAI-type query: find all of a node's resources
    that were changed between D1 and D2
  • P-tree reuses aspects of fault-tolerant ring of
    Chord with logarithmic search properties for
    equality and range queries.

21
Pastry Project
  • Factors in network locality as part of DHT
    algorithm
  • http://research.microsoft.com/antr/Pastry/

22
Identity, Trust, Reputation
  • Identity
  • Who is making a statement?
  • Certificates, PKI
  • Trust
  • Can I believe the person who is making the
    statement?
  • PGP Web of Trust
  • Reputation
  • What is the history of trust in the person making
    the statement?
  • Reputation management

23
Reputation Issues
  • Small world phenomenon makes web of trust
    feasible
  • Reputation is context specific
  • I can be trusted with questions about OAI-PMH
  • Can you trust me to belay for you?

24
Simple reputation network
[Figure: graph of nodes A, B, and C]
  • A knows and trusts B
  • B knows and trusts C
  • A can infer trust for C

25
Reputation Inference Algorithm
  • Begin at source (node seeking a reputation)
  • Poll each of neighbors whose reputation it trusts
  • Ignore neighbors with bad reputation
  • Have each neighbor recursively find reputation of
    sink (node for which reputation is sought)
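
The recursive polling described above might look like the following sketch. The numeric ratings in [0, 1], the 0.5 cut-off for a bad reputation, and the trust-weighted average used to combine answers are all assumptions made for illustration. The slides do not specify how opinions are combined.

```python
TRUST_THRESHOLD = 0.5   # assumed cut-off below which a neighbor is ignored

def infer_reputation(ratings, source, sink, visited=frozenset()):
    """Return the inferred rating of `sink` as seen from `source`, or None."""
    direct = ratings.get(source, {})
    if sink in direct:
        return direct[sink]             # source already has its own opinion
    visited = visited | {source}
    weighted_sum = 0.0
    total_weight = 0.0
    for neighbor, trust in direct.items():
        if trust < TRUST_THRESHOLD or neighbor in visited:
            continue                    # ignore badly rated or already polled nodes
        opinion = infer_reputation(ratings, neighbor, sink, visited)
        if opinion is None:
            continue                    # this neighbor has no path to the sink
        weighted_sum += trust * opinion
        total_weight += trust
    return weighted_sum / total_weight if total_weight else None

# Hypothetical ratings: A trusts B and distrusts D; B and D both rate C.
ratings = {
    "A": {"B": 0.9, "D": 0.1},
    "B": {"C": 0.8},
    "C": {},
    "D": {"C": 0.0},
}
print(infer_reputation(ratings, "A", "C"))
```

Here D's negative rating of C never reaches A, because A does not trust D enough to poll it.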

26
Accuracy of inferences
  • Incorrect bad rating by a node has minimal effect
  • Will be dropped from path in reputation seeking
  • Will be overcome by correct good rating by
    another node.
  • Incorrect good rating by a node can have
    cascading effect
  • Can cause ratings of good nodes to be ignored
    through lies
  • Serious threat to network
  • Good trust algorithm minimizes effect of bad
    nodes

27
From Golbeck and Hendler
28
Trellis
  • http://trellis.semanticweb.org
  • Semantic Web based system for decision making and
    for assessing reliability of information and
    sources
  • Decision maker can construct compound statements
    justifying a decision and providing a basis for
    others' decisions

29
Trellis (cont.)
  • Components
  • Statements (Carl Lagoze is a bad teacher)
  • Basis of statement
  • http://cornellbigred.collegesports.com/sports/m-crew/mtt/kruse_william00.html
  • Principal source of basis/statement
  • William Kruse
  • Qualifications to state certainty of component

30
Trellis compound statement
From Gil and Ratnakar
31
Advogato
  • Trust metrics for open source software developers
  • http://www.advogato.org/
  • Three levels of trust/certification
  • Master
  • Journeyer
  • Apprentice

32
Advogato (cont)
  • Graph structure of trust
  • Domain of master is only master
  • Domain of journeyer is master and journeyer
  • Domain of apprentice is all
  • Computation of trust is via network flow (a
    well-known problem with efficient solutions)
  • Hard-wired set of users from which all trust
    flows (gods of the system)
  • People reached by the flow are those accepted by
    the trust metric
  • With the three levels, the maxflow is computed
    three times
  • Robust (resistant to attack) and efficient

33
Eigentrust
  • Algorithm for Reputation Management in P2P
    Networks
  • Kamvar, Schlosser, Garcia-Molina (Stanford)
  • http://www.stanford.edu/sdkamvar/research.html

34
Eigentrust Approach
  • Goal: identify sources of inauthentic files and
    bias peers against downloading from them
  • Method: give each peer a trust value based on its
    previous behavior
  • Trust values
  • Local: opinion a peer has of another based on past
    experience
  • Global: trust that the entire system places in a
    peer
  • Want latter computed from aggregate of former
  • Dual goals
  • Know all peers
  • Perform minimal computation and store minimal data

35
Past History Approach
  • Each peer biases its choice of downloads using
    its own opinion vector
  • Problems
  • Each peer has limited past experience
  • Inertia: if a peer has good past experience with
    another, it will be biased towards relying on it

36
Friends of friends approach
  • Ask for opinions of the people who you trust
  • Weigh their opinions by your trust in them
  • Problems
  • If you have a lot of friends: too much to compute
    and store
  • If you have few friends: won't have enough data

37
Eigentrust Approach
  • Whole network cooperates to store and compute
    trust vector
  • Each peer holds its own opinions
  • Each peer holds its own global reputation
  • Iterative algorithm that converges to compute
    global trust ratings (in the nature of PageRank)
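
The iterative computation can be sketched as a plain power iteration. This centralized version is a simplification for illustration only: the distributed storage, the score managers, and any pre-trusted peers are left out, and the local scores, function name, and tolerance are assumptions.

```python
def eigentrust(local_scores, tolerance=1e-9, max_iterations=1000):
    """Return global trust values from a matrix of non-negative local scores.

    local_scores[i][j] is how much peer i trusts peer j from past experience.
    """
    n = len(local_scores)
    normalized = []
    for row in local_scores:
        total = sum(row)
        if total > 0:
            normalized.append([score / total for score in row])
        else:
            normalized.append([1.0 / n] * n)   # a peer with no opinions trusts all equally
    trust = [1.0 / n] * n                       # start from uniform trust
    for _ in range(max_iterations):
        # Each peer's new score is the opinion of others, weighted by their own trust.
        updated = [sum(normalized[i][j] * trust[i] for i in range(n))
                   for j in range(n)]
        change = max(abs(u - t) for u, t in zip(updated, trust))
        trust = updated
        if change < tolerance:
            break
    return trust

# Hypothetical local scores: peers 0 and 1 trust each other, nobody trusts peer 2.
local_scores = [
    [0, 4, 0],
    [3, 0, 0],
    [1, 1, 0],
]
global_trust = eigentrust(local_scores)
print([round(value, 3) for value in global_trust])
```

Peer 2 ends up with no global trust even though it holds opinions of its own, because no other peer vouches for it.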

38
More Eigentrust Issues
  • Secure Score Management
  • Voting among multiple score managers
  • Peer score held by another peer
  • Threat scenarios
  • Malicious individuals (always bad)
  • Malicious collectives (always bad, think highly
    of each other)
  • Camouflaged collectives (sometimes good to trick
    people)
  • Malicious spies (good all the time but friends
    with bad folks)