
1
Topics in Database Systems: Data Management in
Peer-to-Peer Systems
Search and Replication in Unstructured P2P
2
Overview
  • Centralized
  • Constantly-updated directory hosted at central
    locations (does not scale well, update
    bottlenecks, single point of failure)
  • Decentralized but structured
  • The overlay topology is highly controlled and
    files (or metadata/index) are not placed at
    random nodes but at specified locations
  • Decentralized and unstructured
  • Peers connect in an ad-hoc fashion
  • The location of documents/metadata is not
    controlled by the system
  • No guarantee of the success of a search
  • No bounds on search time

3
Overview
  • Blind Search and Variations
  • No information about the location of items
  • Informed Search
  • Maintain (localized) index information
  • Local and Routing Indexes
  • Trade-off: the cost of maintaining the indexes (when
    joining/leaving/updating) vs. the cost of search

4
Blind Search
Flood-based: each node contacts its neighbors,
which in turn contact their neighbors, until the
item is located. Exponential search cost; no
guarantees.
5
Blind Search Issues
  • BFS vs. DFS: better response time, more messages
  • Iterative vs. recursive (return path)
  • TTL (time-to-live) parameter
  • Cycles (duplicate messages)
  • Connectivity
  • Power-law topologies: the i-th most connected
    node has α/i^a neighbors

6
Gnutella Summary
  • Completely decentralized
  • Hit rates are high
  • High fault tolerance
  • Adapts well and dynamically to changing peer
    populations
  • Protocol causes high network traffic (e.g.,
    3.5 Mbps). For example:
  • C = 4 connections per peer, TTL = 7
  • 1 ping packet can cause a very large number of packets
  • No estimates on the duration of queries can be
    given
  • No probability of successful queries can be
    given
  • Topology is unknown, so algorithms cannot exploit
    it
  • Free riding is a problem
  • Reputation of peers is not addressed
  • Simple and robust

7
Summary and Comparison of Approaches
8
More on Search
  • Search Options
  • Query Expressiveness (type of queries)
  • Comprehensiveness (all or just the first (or k)
    results)
  • Topology
  • Data Placement
  • Message Routing

9
Comparison
[Comparison table; the cell contents were lost in transcription]
10
Comparison
[Comparison table, continued; the cell contents were lost in transcription]
11
  • Client-Server performs well
  • But not always feasible
  • Ideal performance is often not the key issue!
  • Things that flood-based systems do well
  • Scaling
  • Decentralization of visibility and liability
  • Finding popular stuff (e.g., caching)
  • Fancy local queries
  • Things that flood-based systems do poorly
  • Finding unpopular stuff
  • Fancy distributed queries
  • Guarantees about anything (answer quality,
    privacy, etc.)

12
Blind Search Variations
Modified BFS: forward the query to only a fraction of the
neighbors (some random subset)
Expanding Ring (Iterative Deepening): start BFS
with a small TTL and repeat the BFS at increasing
depths if the first BFS fails. Works well when
there is some stop condition and a small flood
will satisfy the query; otherwise it causes even bigger
loads than standard flooding. Appropriate when hot
objects are replicated more widely than cold
objects. (A small sketch follows.)
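A minimal Python sketch (ours, not from the paper; the adjacency-dict overlay and the has_item() predicate are assumptions) of plain BFS flooding and the expanding-ring variant built on top of it:

    from collections import deque

    def bfs_flood(graph, source, has_item, ttl):
        # Flood from `source` up to `ttl` hops; return (node_with_item, messages_sent).
        visited = {source}
        frontier = deque([(source, 0)])
        messages = 0
        while frontier:
            node, depth = frontier.popleft()
            if has_item(node):
                return node, messages
            if depth == ttl:
                continue
            for nbr in graph[node]:
                messages += 1              # one query message per edge traversal (duplicates included)
                if nbr not in visited:
                    visited.add(nbr)
                    frontier.append((nbr, depth + 1))
        return None, messages

    def expanding_ring(graph, source, has_item, start_ttl=1, max_ttl=7):
        # Retry the flood with a growing TTL until the item is found or max_ttl is reached.
        total = 0
        for ttl in range(start_ttl, max_ttl + 1):
            hit, msgs = bfs_flood(graph, source, has_item, ttl)
            total += msgs
            if hit is not None:
                return hit, total
        return None, total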
13
Blind Search Methods
  • Random Walks
  • The node that poses the query sends out k query
    messages to an equal number of randomly chosen
    neighbors
  • Each message follows its own path, at each step
    randomly choosing one neighbor to forward it to
  • Each path is called a walker
  • Two methods to terminate each walker:
  • TTL-based, or
  • checking method (the walkers periodically check
    with the query source whether the stop condition has
    been met)
  • It reduces the number of messages to k x TTL in
    the worst case
  • Some kind of local load balancing

14
Blind Search Methods
Random Walks: in addition, the protocol can bias its
walks towards high-degree nodes (choose the
highest-degree neighbor). A sketch of the k-walker
scheme follows.
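The following Python sketch (ours; the overlay/has_item interface is an assumption carried over from the previous sketch) simulates the k-walker scheme with the periodic checking method; real walkers would run in parallel rather than sequentially:

    import random

    def k_walker_search(graph, source, has_item, k=32, ttl=1024, check_every=4):
        # Launch k walkers; each "checks back" with the query source every check_every steps.
        found = {"node": None}                     # shared stop condition held at the query source

        def walk(start):
            node = start
            for step in range(1, ttl + 1):
                if has_item(node):
                    found["node"] = node
                    return
                if step % check_every == 0 and found["node"] is not None:
                    return                         # stop condition already met elsewhere
                node = random.choice(graph[node])  # forward to one randomly chosen neighbor

        neighbors = graph[source]
        for start in random.sample(neighbors, min(k, len(neighbors))):
            walk(start)                            # sequential stand-in for k parallel walkers
            if found["node"] is not None:
                break
        return found["node"]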
15
Topics in Database Systems Data Management in
Peer-to-Peer Systems
Q. Lv et al., Search and Replication in
Unstructured Peer-to-Peer Networks, ICS '02
16
  • Search and Replication in Unstructured
    Peer-to-Peer Networks
  • Type of replication depends on the search
    strategy used
  • A number of blind-search variations of flooding
  • A number of (metadata) replication strategies
  • Evaluation Method Study how they work for a
    number of different topologies and query
    distributions

17
Methodology
  • Performance of search depends on:
  • Network topology: the graph formed by the p2p
    overlay network
  • Query distribution: the distribution of query
    frequencies for individual files
  • Replication: the number of nodes that have a
    particular file

Assumption: fixed network topology and fixed
query distribution. Results still hold if one
assumes that the time to complete a search is
short compared to the time scale of changes in
network topology and query distribution
18
Network Topology
(1) Power-Law Random Graph (PLRG): a 9239-node random
graph. Node degrees follow a power-law
distribution: when ranked from the most connected
to the least, the i-th ranked node has α/i^a
neighbors, where α is a constant. Once the node
degrees are chosen, the nodes are connected randomly
19
Network Topology
(2) Normal Random Graph (Random): a 9836-node random graph
20
Network Topology
(3) Gnutella Graph (Gnutella): a 4736-node graph
obtained in Oct. 2000. Node degrees roughly follow
a two-segment power-law distribution
21
Network Topology
(4) Two-Dimensional Grid (Grid): a two-dimensional
100x100 grid
22
Query Distribution
Assume m objects. Let q_i be the relative
popularity of the i-th object (in terms of
queries issued for it). Values are normalized:
Σ_{i=1..m} q_i = 1
  • (1) Uniform: all objects are equally popular
  •   q_i = 1/m
  • (2) Zipf-like
  •   q_i ∝ 1/i^a
(A small normalization sketch follows.)

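A tiny Python illustration (added; the number of objects m and the exponent a are arbitrary here) of the two query distributions and their normalization:

    def uniform_popularity(m):
        return [1.0 / m] * m                      # q_i = 1/m, sums to 1

    def zipf_popularity(m, a=1.2):
        raw = [1.0 / (i ** a) for i in range(1, m + 1)]
        total = sum(raw)
        return [x / total for x in raw]           # q_i proportional to 1/i^a, normalized to sum to 1

    q = zipf_popularity(100)
    assert abs(sum(q) - 1.0) < 1e-9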
23
Query Distribution Replication
When the replication is uniform, the query
distribution is irrelevant (since all objects are
replicated by the same amount, search times are
equivalent for both hot and cold items). When the
query distribution is uniform, all three
replication distributions are equivalent
(uniform!). Thus, there are three relevant
query-distribution/replication combinations:
  • Uniform/Uniform
  • Zipf-like/Proportional
  • Zipf-like/Square-root

24
Metrics
Pr(success): probability of finding the queried
object before the search terminates
hops: delay in finding an object, measured in number of hops
25
Metrics
msgs per node: overhead of an algorithm, measured
as the average number of search messages
each node in the p2p network has to process
nodes visited
percentage of message duplication
peak msgs: the number of messages that the busiest
node has to process (to identify hot spots)
These are per-query measures. For an aggregated
performance measure, each query is weighted with
its probability
26
Limitation of Flooding
  • Choice of TTL
  • Too low: the node may not find the object even
    if it exists; too high: burdens the network
    unnecessarily

There are many duplicate messages (due to cycles),
particularly in high-connectivity graphs: multiple
copies of a query are sent to a node by multiple
neighbors. Avoiding cycles decreases the number
of links. Duplicate messages can be detected and
not forwarded, BUT the number of duplicate
messages can still be excessive and worsens as
TTL increases
27
Limitation of Flooding: Comparison of the
topologies
Power-law and Gnutella-style graphs are particularly
bad with flooding. Highly connected nodes mean
more duplicate messages, because many nodes'
neighbors overlap. The random graph is best, because in
a truly random graph the duplication ratio (the
likelihood that the next node has already received
the query) is the same as the fraction of nodes
visited so far, as long as that fraction is
small. The random graph also gives a better load
distribution among nodes
28
Random Walks
Experiments show that 16 to 64 walkers give
good results; checking once every 4th step is a
good balance between the overhead of the
checking messages and the benefits of
checking. Keeping state (when the same query
reaches a node, the node chooses randomly a
different neighbor to forward it to) improves
Random and Grid by reducing the
message overhead by up to 30% and the number of
hops by up to 30%; small improvements for Gnutella and PLRG
29
Random Walks
When compared to flooding, the 32-walker random
walk reduces message overhead by roughly two
orders of magnitude for all queries across all
network topologies, at the expense of a slight
increase in the number of hops (increasing from
2-6 to 4-15). When compared to expanding ring,
the 32-walker random walk outperforms expanding
ring as well, particularly in the PLRG and Gnutella
graphs
30
Principles of Search
  • Adaptive termination is very important
  • Expanding ring or the checking method
  • Message duplication should be minimized
  • Preferably, each query should visit a node just
    once
  • Granularity of the coverage should be small
  • Each additional step should not
    significantly increase the number of nodes visited

31
Replication
32
Types of Replication
  • Two types of replication
  • Metadata/Index replication: replicate index entries
  • Data/Document replication: replicate the actual
    data (e.g., music files)

33
Types of Replication
Caching vs. Replication. Cache: store data
retrieved from a previous request
(client-initiated). Replication: more proactive,
a copy of a data item may be stored at a node
even if the node has not requested it
34
Reasons for Replication
  • Reasons for replication
  • Performance
  • load balancing
  • locality: place copies close to the requestor
  • geographic locality (more choices for the next
    step in search)
  • reduce number of hops
  • Availability
  • In case of failures
  • Peer departures

Besides storage, there is a cost associated with
replication: consistency maintenance. Replication makes
reads faster at the expense of slower writes
35
  • No proactive replication (Gnutella)
  • Hosts store and serve only what they requested
  • A copy can be found only by probing a host with a
    copy
  • Proactive replication of keys (metadata
    pointers) for search efficiency (FastTrack, DHTs)
  • Proactive replication of copies for search
    and download efficiency, anonymity. (Freenet)

36
Issues
Which items (data/metadata) to replicate? Based
on popularity. In traditional distributed systems,
also on the rate of reads/writes; cost-benefit: the
ratio read-savings/write-increase. Where to
replicate (allocation scheme)? More later
37
Issues
How/when to update? Both data items and metadata
38
Database-Flavored Replication Control Protocols
Let's assume the existence of a data item x with
copies x1, x2, ..., xn. x: the logical data item; the
xi's: the physical data items
A replication control protocol is responsible for
mapping each read/write on a logical data item
(R(x)/W(x)) to a set of reads/writes on a
(possibly proper) subset of the physical data
item copies of x
39
One Copy Serializability
Correctness: a DBMS for a replicated database
should behave like a DBMS managing a one-copy
(i.e., non-replicated) database, insofar as users
can tell
One-copy serializable (1SR): the schedule of
transactions on a replicated database is
equivalent to a serial execution of those
transactions on a one-copy database
One-copy schedule: replace operations on data
copies with operations on data items
40
ROWA
Read One/Write All (ROWA) A replication control
protocol that maps each read to only one copy of
the item and each write to a set of writes on all
physical data item copies.
If even one of the copies is unavailable, an
update transaction cannot terminate
41
Write-All-Available
Write-all-available A replication control
protocol that maps each read to only one copy of
the item and each write to a set of writes on all
available physical data item copies.
42
Quorum-Based Voting
  • Read quorum Vr and write quorum Vw to read or
    write a data item
  • If a given data item has a total of V votes, the
    quorums have to obey the following rules:
  • Vr + Vw > V
  • Vw > V/2

Rule 1 ensures that a data item is not read and
written by two transactions concurrently
(R/W). Rule 2 ensures that two write operations
from two transactions cannot occur concurrently
on the same data item (W/W). (A worked example follows.)
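As a worked example (not on the slide): with V = 5 votes one can choose Vr = 3 and Vw = 3, since Vr + Vw = 6 > 5 and Vw = 3 > 5/2; every read quorum then intersects every write quorum, and any two write quorums intersect. ROWA corresponds to the extreme choice Vr = 1, Vw = V.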
43
Distributing Writes
Immediate writes
Deferred writes: access only one
copy of the data item; delay the distribution
of writes to other sites until the transaction
has terminated and is ready to commit. The transaction
maintains an intention list of deferred
updates; after it terminates, it sends
the appropriate portion of the intention list to
each site that contains replicated
copies. Optimizations: aborts cost less; may
delay commitment; delays the detection of
conflicts
Primary or master copy: updates at a
single copy per item
44
Eager vs Lazy Replication
Eager replication: keeps all replicas
synchronized by updating all replicas in a single
transaction
Lazy replication: asynchronously
propagates replica updates to other nodes after
the replicating transaction commits
In p2p, lazy replication (or soft state) is used
45
Update Propagation
  • Who initiates the update?
  • Push: by the server holding the item (copy) that changes
  • Pull: by the client holding the copy
  • When?
  • Periodic
  • Immediate
  • Lazy: when an inconsistency is detected
  • Threshold-based: freshness (e.g., number of
    updates or actual time)
  • Value
  • Expiration-time: items expire (become invalid)
    after that time (most often used in p2p)
  • Stateless or stateful (the item owners know
    which nodes hold copies of the item)

46
Replication in Structured P2P
47
CHORD
Metadata replication or redundancy
Invariant to guarantee correctness of
lookups: keep successor nodes up-to-date. Method:
maintain a successor list of the r nearest
successors on the Chord ring. Why?
Availability. How to keep it consistent? Lazily,
through periodic stabilization
48
CHORD
Data replication
Method: replicate the data associated with a key at
the k nodes succeeding the key. Why? Availability
49
CAN
Metadata replication
Multiple realities: with r realities each node is
assigned r coordinate zones, one in every
reality, and holds r independent neighbor
sets. Replicate the hash table in each
reality. Availability: fails only if the nodes
in all r realities fail. Performance: better search;
choose to forward the query to the neighbor with
coordinates closest to the destination
50
CAN
Metadata replication
Overloading coordinate zones: multiple nodes may
share a zone; the hash table may be replicated
among zones. Higher availability. Performance:
choices in the number of neighbors, can select
nodes closer in latency. Cost for consistency
51
CAN
Metadata replication
Multiple hash functions: use k different hash
functions to map a single key onto k points in
the coordinate space. Availability: fails only if
all k replicas are unavailable. Performance:
choose to send the query to the node closest in the
coordinate space, or send the query to all k nodes in
parallel (k parallel searches). Cost for
consistency; query traffic (if parallel searches)
52
CAN
Metadata replication
Hot-spot replication: a node that finds it is
being overloaded by requests for a particular
data key can replicate this key at each of its
neighboring nodes. Then, with a certain
probability, it can choose to either satisfy the
request or forward it. Performance: load
balancing
53
CAN
Metadata replication
Caching: each node maintains a cache of the data
keys it recently accessed. Before forwarding a
request, it first checks whether the requested
key is in its cache, and if so, it can satisfy
the request without forwarding it any
further. The number of cache entries per key grows in
direct proportion to its popularity
54
Replication Theory: Replica Allocation Policies
55
Replication Allocation Scheme
Question: how can replication be used to improve
search efficiency in unstructured networks with a
proactive replication mechanism?
How many copies of each object should there be so that the search
overhead for the object is minimized, assuming
that the total amount of storage for objects in
the network is fixed?
56
Replication Theory
Assume m objects and n nodes. Each object i is
replicated on r_i distinct nodes, and the total
number of objects stored is R, that is,
Σ_{i=1..m} r_i = R. Also, p_i = r_i/R. Assume that object i is
requested with relative rate q_i; we normalize
by setting Σ_{i=1..m} q_i = 1. For convenience,
assume 1 << r_i ≤ n and that q_1 ≥ q_2 ≥ ... ≥ q_m
57
Replication Theory
Assume that searches go on until a copy is
found. Searches consist of randomly probing
sites until the desired object is found: the search
at each step draws a node uniformly at random and
asks it for a copy
58
Search Example
[Figure: two example searches, one locating the object after 2 probes and one after 4 probes]
59
Replication Theory
The probability Pr(k) that the object is found at
the k-th probe is given by: Pr(k) = Pr(not found
in the previous k-1 probes) x Pr(found in one (the
k-th) probe) = (1 - r_i/n)^(k-1) · (r_i/n). k (the search
size, i.e., the step at which the item is found) is a
random variable with geometric distribution and
parameter p = r_i/n, so its expectation is n/r_i
60
Replication Theory
A_i, the expectation (average search size) for object
i, is the inverse of the fraction of sites that
have replicas of the object: A_i = n/r_i. The
average search size A over all objects (average
number of nodes probed per object query) is
A = Σ_i q_i A_i = n Σ_i q_i/r_i
Minimize A = n Σ_i q_i/r_i (a short derivation of A_i is sketched below)
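In LaTeX, a short derivation (added; it just spells out the expectation of the geometric distribution used above):

  \[
    \Pr(k) \;=\; \Bigl(1 - \tfrac{r_i}{n}\Bigr)^{k-1}\,\tfrac{r_i}{n},
    \qquad
    A_i \;=\; \mathbb{E}[k] \;=\; \sum_{k \ge 1} k\,\Pr(k) \;=\; \frac{1}{r_i/n} \;=\; \frac{n}{r_i},
    \qquad
    A \;=\; \sum_{i=1}^{m} q_i A_i \;=\; n \sum_{i=1}^{m} \frac{q_i}{r_i}.
  \]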
61
Replication Theory
If we have no limit on r_i, replicate everything
everywhere. Then the average search size is A_i =
n/r_i = 1 and search becomes trivial
Assume a limit on R and that the average number
of replicas per site ρ = R/n is fixed
How to allocate these R replicas among the m
objects, i.e., how many replicas per object?
62
Uniform Replication
Create the same number of replicas for each
object: r_i = R/m. Average search size for uniform
replication: A_i = n/r_i = m/ρ, so A_uniform = Σ_i q_i · m/ρ =
m/ρ (= m·n/R), which is independent of the query
distribution. It makes sense to allocate more
copies to objects that are frequently queried;
this should reduce the search size for the more
popular objects
63
Proportional Replication
Create a number of replicas for each object
proportional to the query rate: r_i = R·q_i
64
Uniform and Proportional Replication
  • Summary
  • Uniform allocation: p_i = 1/m
  • Simple, resources are divided equally
  • Proportional allocation: p_i = q_i
  • Fair, resources per item proportional to demand
  • Reflects current P2P practices

65
Proportional Replication
Number of replicas for each object ri R
qi Average search size for uniform
replication Ai n/ri n/R qi Aproportioanl Si
qi n/R qi m/? Auniform again independent of
the query distribution Why? Objects whose query
rate are greater than average (gt1/m) do better
with proportional, and the other do better with
uniform The weighted average balances out to be
the same So what is the optimal way to allocate
replicas so that A is minimized?
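The cancellation, written out (added for clarity):

  \[
    A_{\mathrm{prop}}
    \;=\; \sum_{i=1}^{m} q_i\,\frac{n}{r_i}
    \;=\; \sum_{i=1}^{m} q_i\,\frac{n}{R\,q_i}
    \;=\; \sum_{i=1}^{m} \frac{n}{R}
    \;=\; \frac{m\,n}{R}
    \;=\; \frac{m}{\rho}
    \;=\; A_{\mathrm{uniform}}.
  \]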
66
Space of Possible Allocations
  • q_{i+1}/q_i ≤ p_{i+1}/p_i ?
  • As the query rate decreases, how does the
    ratio of allocated replicas behave?
  • Reasonable:
  • p_{i+1}/p_i ≤ 1
  • (= 1 for uniform)

67
Space of Possible Allocations
  • Definition: allocation p1, p2, p3, ..., pm is
    in-between Uniform and Proportional if,
  • for 1 ≤ i < m, q_{i+1}/q_i < p_{i+1}/p_i < 1
  • (the ratio is 1 for uniform and q_{i+1}/q_i for
    proportional; we want to favor popular objects, but not too much)
  • Theorem 1: all (strictly) in-between strategies
    are (strictly) better than Uniform and
    Proportional

Theorem 2: p is worse than Uniform/Proportional
if for all i, p_{i+1}/p_i > 1 (more popular gets
less) OR for all i, q_{i+1}/q_i > p_{i+1}/p_i (less
popular gets less than its fair share)
Proportional and Uniform are the worst
reasonable strategies
68
Space of allocations on 2 items
[Figure: the allocation ratio p2/p1 plotted against the query ratio q2/q1, with Uniform and Proportional marked as the two extremes]
69
So, what is the best strategy?
70
Square-Root Replication
Find the r_i that minimize A = Σ_i q_i A_i = n Σ_i
q_i/r_i. This is achieved by r_i = λ√q_i, where
λ = R/Σ_i √q_i. Then the average search size is
A_optimal = (1/ρ) (Σ_i √q_i)²  (a derivation sketch follows)
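A sketch of why the square root is optimal (added; the standard Lagrange-multiplier argument for minimizing A subject to the fixed replica budget):

  \[
    \min_{r_1,\dots,r_m}\; n\sum_{i} \frac{q_i}{r_i}
    \quad\text{s.t.}\quad \sum_{i} r_i = R
    \;\;\Longrightarrow\;\;
    -\frac{n\,q_i}{r_i^{2}} + \mu = 0
    \;\;\Longrightarrow\;\;
    r_i \propto \sqrt{q_i},\qquad
    r_i = \lambda\sqrt{q_i},\;\; \lambda = \frac{R}{\sum_j \sqrt{q_j}},
  \]
  \[
    A_{\mathrm{opt}} \;=\; n\sum_i \frac{q_i}{\lambda\sqrt{q_i}}
    \;=\; \frac{n}{R}\Bigl(\sum_i \sqrt{q_i}\Bigr)^{2}
    \;=\; \frac{1}{\rho}\Bigl(\sum_i \sqrt{q_i}\Bigr)^{2}.
  \]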
71
How much can we gain by using SR ?
[Figure: the gain A_uniform/A_SR as a function of the Zipf-like query rates]
72
Other Metrics Discussion
  • Utilization rate: the rate of requests that a
    replica of object i receives,
  • U_i = R·q_i/r_i
  • For uniform replication, all objects have the
    same average search size, but replicas have
    utilization rates proportional to their query
    rates
  • Proportional replication achieves perfect load
    balancing with all replicas having the same
    utilization rate, but average search sizes vary
    with more popular objects having smaller average
    search sizes than less popular ones

73
Replication Summary
74
Pareto Distribution (for the queries)
Pareto principle (80-20 rule): 80% of the wealth is
owned by 20% of the population. Zipf asks: what is the
size of the r-th ranked item? Pareto asks: how many
have size > r?
75
Replication (summary)
Each object i is replicated on r_i nodes and the
total number of objects stored is R, that is,
Σ_{i=1..m} r_i = R
  • (1) Uniform: all objects are replicated at the same
    number of nodes,
  • r_i = R/m
  • (2) Proportional: the replication of an object is
    proportional to the query probability of the
    object,
  • r_i ∝ q_i
  • (3) Square-root: the replication of an object i
    is proportional to the square root of its query
    probability q_i,
  • r_i ∝ √q_i
(A small comparison sketch follows.)

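A small Python comparison (ours, not the paper's simulator; the numbers n, R and the Zipf exponent are arbitrary, and the constraint r_i ≤ n is ignored) of the three allocations and the resulting average search size A = n Σ q_i/r_i:

    import math

    def allocate(q, R, policy):
        if policy == "uniform":
            weights = [1.0] * len(q)                  # r_i = R/m
        elif policy == "proportional":
            weights = list(q)                         # r_i proportional to q_i
        else:  # "square-root"
            weights = [math.sqrt(x) for x in q]       # r_i proportional to sqrt(q_i)
        total = sum(weights)
        return [R * w / total for w in weights]

    def avg_search_size(q, r, n):
        return n * sum(qi / ri for qi, ri in zip(q, r))   # A = n * sum_i q_i / r_i

    n, R, m = 10000, 20000, 100
    raw = [1.0 / (i ** 1.2) for i in range(1, m + 1)]     # Zipf-like query rates
    q = [x / sum(raw) for x in raw]
    for policy in ("uniform", "proportional", "square-root"):
        print(policy, round(avg_search_size(q, allocate(q, R, policy), n), 1))
    # uniform and proportional print the same value; square-root prints a smaller one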
76
Assumption that there is at least one copy per
object
  • A query is soluble if there are sufficiently many
    copies of the item.
  • A query is insoluble if the item is rare or
    non-existent.
  • What is the search size of a query?
  • Soluble queries: the number of probes until the answer is
    found.
  • Insoluble queries: the maximum search size

77
  • SR is best for soluble queries
  • Uniform minimizes cost of insoluble queries

What is the optimal strategy?
78
[Figure: average search size for 10^4 items with Zipf-like query rates (exponent 1.5), comparing Uniform and SR allocation for all-soluble, 85%-soluble, and all-insoluble query workloads]
79
We now know what we need.
How do we get there?
80
Replication Algorithms
  • Uniform and Proportional are easy
  • Uniform: when an item is created, replicate its key
    at a fixed number of hosts.
  • Proportional: for each query, replicate the key
    at a fixed number of hosts (need to know or
    estimate the query rate)

Desired properties of algorithm
  • Fully distributed where peers communicate through
    random probes minimal bookkeeping and no more
    communication than what is needed for search.
  • Converge to/obtain SR allocation when query rates
    remain steady.

81
Replication - Implementation
Two strategies are popular. Owner
replication: when a search is successful, the
object is stored at the requestor node only (used
in Gnutella). Path replication: when a search
succeeds, the object is stored at all nodes along
the path from the requestor node to the provider
node (used in Freenet), following the reverse path
back to the requestor
82
Achieving Square-Root Replication
  • How can we achieve square-root replication in
    practice?
  • Assume that each query keeps track of the search
    size
  • Each time a query finishes, the object is
    copied to a number of sites proportional to the
    number of probes
  • On average, object i will be replicated at c·n/r_i
    sites each time a query is issued (for some
    constant c)
  • It can be shown that this converges to square-root allocation

83
Replication - Conclusion
Thus, for square-root replication, an object
should be replicated at a number of nodes that
is proportional to the number of probes that the
search required
84
Replication - Implementation
If a p2p system uses k walkers, the number of
nodes between the requestor and the provider node
is 1/k of the total number of nodes visited (the
number of probes). Then, path replication should result in
square-root replication. Problem: it tends to
place replicas at nodes that are topologically along the
same path
85
Replication - Implementation
Random replication: when a search succeeds, we
count the number of nodes on the path between the
requestor and the provider, say p. Then we randomly
pick p of the nodes that the k walkers visited to
replicate the object. Harder to implement
86
Achieving Square-Root Replication
What about replica deletion? In steady state the
creation rate equals the deletion rate. The
lifetime of replicas must be independent of
object identity or query rate: FIFO or random
deletion is OK; LRU or LFU is not
87
Replication Evaluation
  • Study the three replication strategies in the
    Random graph network topology
  • Simulation Details
  • Place the m distinct objects randomly into the
    network
  • Query generator generates queries according to a
    Poisson process at 5 queries/sec
  • Zipf distribution of queries among the m objects
    (with α = 1.2)
  • For each query, the initiator is chosen randomly
  • Then a 32-walker random walk with state keeping
    and checking every 4 steps
  • Each site stores at most objAllow (= 40) objects
  • Random Deletion
  • Warm-up period of 10,000 secs
  • Snapshots every 2,000 query chunks

88
Replication Evaluation
  • For each replication strategy
  • What kind of replication ratio distribution does
    the strategy generate?
  • What is the average number of messages per node
    in a system using the strategy
  • What is the distribution of number of hops in a
    system using the strategy

89
Evaluation Replication Ratio
Both path and random replication generate
replication ratios quite close to the square root of
the query rates
90
Evaluation Messages
Path replication and random replication reduce
the overall message traffic by a factor of 3 to 4
91
Evaluation Hops
Much of the traffic reduction comes from reducing
the number of hops
Path and random are better than owner. For example,
the fraction of queries that finish within 4 hops is
71% for owner, 86% for path, and 91% for random
92
Summary
  • Random Search/replication Model probes to
    random hosts
  • Proportional allocation current practice
  • Uniform allocation best for insoluble queries
  • Soluble queries
  • Proportional and Uniform allocations are two
    extremes with same average performance
  • Square-Root allocation minimizes Average Search
    Size
  • OPT (all queries) lies between SR and Uniform
  • SR/OPT allocation can be realized by simple
    algorithms.

93
Replication in Unstructured P2P: Epidemic
Algorithms
94
  • Replication Policy
  • How many copies
  • Where (owner, path, random path)
  • Update Policy
  • Synchronous vs Asynchronous
  • Master Copy

95
Methods for spreading updates. Push: updates originate
from the site where the update appeared and must reach
the sites that hold copies. Pull: the sites
holding copies contact the master site. Expiration
times. Epidemics for spreading updates
96
A. Demers et al., Epidemic Algorithms for
Replicated Database Maintenance, SOSP '87
Update at a single site. Randomized algorithms
for distributing updates and driving replicas
towards consistency. Ensure that the effect of
every update is eventually reflected in all
replicas. Sites become fully consistent only when
all updating activity has stopped and the system
has become quiescent. Analogous to epidemics
97
  • Methods for spreading updates
  • Direct mail: each new update is immediately
    mailed from its originating site to all other
    sites
  • Timely and reasonably efficient
  • Not all sites know all other sites
  • Mails may be lost
  • Anti-entropy: every site regularly chooses
    another site at random and, by exchanging content,
    resolves any differences between them
  • Extremely reliable, but requires exchanging
    content and resolving updates
  • Propagates updates much more slowly than direct
    mail

98
  • Methods for spreading updates
  • Rumor mongering
  • Sites are initially ignorant; when a site
    receives a new update it becomes a hot rumor
  • While a site holds a hot rumor, it periodically
    chooses another site at random and ensures that
    the other site has seen the update
  • When a site has tried to share a hot rumor with
    too many sites that have already seen it, the
    site stops treating the rumor as hot and retains
    the update without propagating it further
  • Rumor cycles can be more frequent than
    anti-entropy cycles, because they require fewer
    resources at each site, but there is a chance
    that an update will not reach all sites

99
  • Anti-entropy and rumor spreading are examples of
    epidemic algorithms
  • Three types of sites:
  • Infective: a site that holds an update and is
    willing to share it
  • Susceptible: a site that has not yet received an
    update
  • Removed: a site that has received an update but
    is no longer willing to share it
  • Anti-entropy is a simple epidemic, where all sites are
    always either infective or susceptible

100
A set S of n sites, each storing a copy of a
database. The database copy at site s ∈ S is a
time-varying partial function s.ValueOf: K →
(V ∪ {NIL}) × T, where K is a set of keys, V a set
of values, and T a set of timestamps (totally ordered
by <). V contains the element NIL: s.ValueOf[k] = (NIL, t)
means the item with key k was deleted from the
database at time t. Assume, for simplicity, just one
item: s.ValueOf ∈ (V ∪ {NIL}) × T, thus an ordered
pair consisting of a value and a timestamp. The first
component may be NIL, indicating that the item was
deleted by the time indicated by the second component
101
  • The goal of the update distribution process is to
    drive the system towards
  • ∀ s, s' ∈ S: s.ValueOf = s'.ValueOf
  • Operation invoked to update the database:
  • Update[v : V] ≡ s.ValueOf ← (v, Now)

102
Direct Mail
At the site s where an update occurs:
  FOR EACH s' ∈ S DO
    PostMail[to: s', msg: ("Update", s.ValueOf)]
(s: originator of the update; s': receiver of the
update)
Each site s' receiving the update message
("Update", (u, t)):
  IF s'.ValueOf.t < t THEN s'.ValueOf ← (u, t)
  • The complete set S must be known to s (stateful
    server)
  • PostMail messages are queued so that the server
    is not delayed (asynchronous), but they may fail when
    queues overflow or their destinations are
    inaccessible for a long time
  • n (number of sites) messages per update
  • Traffic proportional to n and the average
    distance between sites

103
Anti-Entropy
At each site s, periodically execute:
  FOR SOME s' ∈ S DO
    ResolveDifference[s, s']
Three ways to execute ResolveDifference[s, s']:
Push (sender/server driven): s pushes its value to s'
  IF s.ValueOf.t > s'.ValueOf.t THEN
    s'.ValueOf ← s.ValueOf
Pull (receiver/client driven): s pulls s''s value from s'
  IF s.ValueOf.t < s'.ValueOf.t THEN
    s.ValueOf ← s'.ValueOf
Push-Pull:
  s.ValueOf.t > s'.ValueOf.t ⇒ s'.ValueOf ← s.ValueOf
  s.ValueOf.t < s'.ValueOf.t ⇒ s.ValueOf ← s'.ValueOf
(A small sketch in code follows.)
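A minimal Python sketch (ours, not the paper's code) of anti-entropy for a single replicated item, where each site holds a (value, timestamp) pair:

    import random

    def resolve_difference(s, s2, mode="push-pull"):
        # Reconcile the copies at two sites; the newer timestamp wins.
        if mode in ("push", "push-pull") and s["t"] > s2["t"]:
            s2["value"], s2["t"] = s["value"], s["t"]
        if mode in ("pull", "push-pull") and s["t"] < s2["t"]:
            s["value"], s["t"] = s2["value"], s2["t"]

    def anti_entropy_cycle(sites, mode="push-pull"):
        # Each site contacts one partner chosen uniformly at random.
        for s in sites:
            partner = random.choice(sites)
            if partner is not s:
                resolve_difference(s, partner, mode)

    sites = [{"value": None, "t": 0} for _ in range(100)]
    sites[0] = {"value": "new", "t": 1}           # a single update injected at one site
    for _ in range(20):
        anti_entropy_cycle(sites)
    print(sum(s["t"] == 1 for s in sites))        # expected: close to 100 after a few cycles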
104
Anti-Entropy
  • Assume that
  • site s' is chosen uniformly at random from the
    set S, and
  • each site executes the anti-entropy algorithm
    once per period
  • It can be proved that
  • an update will eventually infect the entire
    population, and
  • starting from a single infected site, this can
    be achieved in time proportional to the log of
    the population size

105
Anti-Entropy
Let p_i be the probability of a site remaining
susceptible after the i-th cycle of
anti-entropy. For pull, a site remains
susceptible after the (i+1)-th cycle if (a) it was
susceptible after the i-th cycle and (b) it
contacted a susceptible site in the (i+1)-th
cycle: p_{i+1} = (p_i)^2. For push, a site remains
susceptible after the (i+1)-th cycle if (a) it was
susceptible after the i-th cycle and (b) no
infectious site chose to contact it in the (i+1)-th
cycle: p_{i+1} = p_i (1 - 1/n)^{n(1-p_i)}
(1 - 1/n: the site is not contacted by a given node;
n(1-p_i): the number of infectious nodes at cycle i)
Pull is preferable to push: for small p_i, pull converges
quadratically, while push shrinks p_i only by roughly a
factor of e per cycle. (A numeric check follows.)
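A quick numeric check of the two recurrences (added; n = 1000 sites, starting from a single infected site):

    n = 1000
    p_pull = p_push = 1.0 - 1.0 / n
    for cycle in range(1, 16):
        p_pull = p_pull ** 2
        p_push = p_push * (1.0 - 1.0 / n) ** (n * (1.0 - p_push))
        print(cycle, p_pull, p_push)
    # both shrink slowly at first; in the last cycles p_pull falls off quadratically
    # while p_push shrinks only by roughly a factor of e per cycle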
106
Anti-Entropy
  • Naive anti-entropy compares whole database instances
    sent over the network
  • Use checksums instead
  • But what about recent updates, known only at a few
    sites?
  • Keep a list of recent updates (now - timestamp <
    threshold t)
  • Compare the recent updates first, apply them, update
    the checksums, and then compare the checksums; the
    choice of t is an issue
  • Alternatively, maintain an inverted list of updates
    ordered by timestamp
  • Perform anti-entropy by exchanging updates in
    reverse timestamp order until the checksums
    agree
  • Only the updates are sent; when to stop is the issue

107
Complex Epidemics Rumor Spreading
  • Initial state: n individuals, initially inactive
    (susceptible)
  • Rumor planting/spreading:
  • We plant a rumor with one person, who becomes
    active (infective), phoning other people at
    random and sharing the rumor
  • Every person bearing the rumor also becomes
    active and likewise shares the rumor
  • When an active individual makes an unnecessary
    phone call (the recipient already knows the
    rumor), then with probability 1/k the active
    individual loses interest in sharing the rumor
    (becomes removed)
  • We would like to know:
  • how fast the system converges to an inactive
    state (no one is infective), and
  • the percentage of people that know the rumor
    when the inactive state is reached

108
Complex Epidemics Rumor Spreading
Let s, i, r be the fractions of individuals that
are susceptible, infective and removed, with
s + i + r = 1. The dynamics are ds/dt = -s·i and
di/dt = +s·i - (1/k)(1-s)·i, which solve to
s = e^{-(k+1)(1-s)}: an exponential decrease in s.
For k = 1, 20% miss the rumor; for k = 2, only 6% miss it.
(The (1/k)(1-s)·i term models losing interest after
unnecessary phone calls. A small simulation follows.)
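A rough Monte Carlo check of the residue (ours; the sequential loop approximates the random-mixing model behind the differential equations, and n is arbitrary):

    import random

    def rumor_residue(n=5000, k=1):
        knows = [False] * n
        knows[0] = True
        active = [0]                              # sites currently holding a hot rumor
        while active:
            caller = random.choice(active)
            target = random.randrange(n)
            if knows[target]:
                if random.random() < 1.0 / k:     # unnecessary call: lose interest w.p. 1/k
                    active.remove(caller)
            else:
                knows[target] = True
                active.append(target)
        return 1.0 - sum(knows) / n               # fraction that never heard the rumor

    print(rumor_residue(k=1), rumor_residue(k=2))  # roughly 0.20 and 0.06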
109
Criteria to characterize epidemics
  • Residue
  • The value of s when i is zero: the remaining
    susceptibles when the epidemic finishes
  • Traffic
  • m = total update traffic / number of sites
  • Delay
  • Average delay (tavg): the difference between the
    time of the initial injection of an update and
    the arrival of the update at a given site,
    averaged over all sites
  • tlast: the delay until the reception by the
    last site that will receive the update during an
    epidemic

110
Simple variations of rumor spreading
Blind vs. feedback. Feedback variation: a sender
loses interest only if the recipient already knows the
rumor. Blind variation: a sender loses interest
with probability 1/k regardless of the
recipient. Counter vs. coin: instead of losing
interest with probability 1/k, use a counter so
that interest is lost only after k unnecessary
contacts. s = e^{-m}: there are n·m updates sent, and the
probability that a single site misses all of these
updates is (1 - 1/n)^{n·m} ≈ e^{-m}
(m is the traffic)
Counters and feedback improve the delay, with
counters playing the more significant role
111
Simple variations of rumor spreading
Push vs. pull: pull converges faster. If there are
numerous independent updates, a pull request is
likely to find a source with a non-empty rumor
list. If the database is quiescent, the push
phase ceases to introduce traffic overhead,
while pull continues to inject useless
requests for updates
Counter, feedback and pull work better
112
  • Minimization
  • Use push and pull together; if both sites know
    the update, only the site with the smaller
    counter is incremented
  • Connection limit
  • A site can be the recipient of more than one push
    in a cycle, while for pull a site can service an
    unlimited number of requests
  • What if we set a limit?
  • Push gets better (reduced traffic: since the
    spread grows exponentially, most traffic occurs
    at the end)
  • Pull gets worse

113
Hunting: if a connection is rejected, the
choosing site can hunt for alternate
sites. Then push and pull behave similarly
114
Complex Epidemic and Anti-entropy
Anti-entropy can be run infrequently to back up a
complex epidemic, so that every update eventually
reaches (or is superseded at) every site. What
happens when an update is discovered during
anti-entropy? Use rumor mongering (e.g., make it
a hot rumor) or direct mail
115
Deletion and Death Certificates
Replace deleted items with death certificates,
which carry timestamps and spread like ordinary
data. When old copies of deleted items meet death
certificates, the old items are removed. But
when should death certificates themselves be deleted?
116
Dormant Death Certificates
Define some threshold (but then some items may be
resurrected, i.e., re-appear). If the death certificate
is older than the expected time required to
propagate it to all sites, then the existence of
an obsolete copy of the corresponding data item
is unlikely. Delete very old certificates at most
sites, retaining dormant copies at only a few
sites (like antibodies). Use two thresholds, t1
and t2, and attach a list of r retention-site names to
each death certificate (chosen at random when the
death certificate is created). Once t1 is reached,
all servers except the servers in the retention list
delete the death certificate. Dormant death
certificates are deleted when t1 + t2 is reached
117
Anti-Entropy with Dormant Death Certificates
Whenever a dormant death certificate encounters
an obsolete data item, it must be activated
118
Spatial Distribution
How to choose partners? Consider spatial
distributions in which the choice tends to favor
nearby servers
119
Spatial Distribution
The cost of sending an update to a nearby site is
much lower than the cost of sending the update to
a distant site, so favor nearby neighbors. There is a
trade-off between average traffic per link and convergence
time. Example: on a linear network, connecting only to the
nearest neighbor gives O(1) traffic and O(n) convergence,
vs. uniform random connections with O(n) traffic and
O(log n) convergence. Determine the probability of connecting
to a site at distance d: for spreading updates on a line,
use the d^{-2} distribution, i.e., the probability of
connecting to a site at distance d is proportional to d^{-2}.
In general, each site s independently chooses
connections according to a distribution that is a
function of Q_s(d), where Q_s(d) is the cumulative
number of sites at distance d or less from s
120
Spatial Distribution and Anti-Entropy
Extensive simulations on the actual topology with
a number of different spatial distributions. A
different class of distributions is less sensitive
to sudden increases of Q_s(d): let each site s
build a list of the other sites sorted by their
distances from s, and select anti-entropy exchange
partners from the sorted list according to a
function f(i), where i is the partner's position on the
list (averaging the probabilities of selecting
equidistant sites). Non-uniform distributions
induce less load on critical links
121
Spatial Distribution and Rumors
Anti-entropy converges with probability 1 for any
spatial distribution such that for every pair
(s, s') of sites there is a nonzero probability
that s will choose to exchange data with
s'. However, rumor mongering is less robust
against changes in spatial distributions and
network topology. As the spatial distribution is
made less uniform, we can increase the value of k
to compensate
122
Replication II: A Push/Pull Algorithm
Updates in Highly Unreliable, Replicated
Peer-to-Peer Systems, Datta, Hauswirth and Aberer,
ICDCS '03
123
Replication in P2P systems
CAN
P-Grid
Unstructured P2P: a (sub-)network of replicas. How
to update them?
124
Problems in real-world P2P systems
Updates in Highly Unreliable, Replicated
Peer-to-Peer Systems, Datta, Hauswirth and Aberer,
ICDCS '03
  • All replicas need to be informed of updates.
  • Peers have low online probabilities, and a quorum
    cannot be assumed.
  • Eventual consistency is sufficient.
  • Updates are relatively infrequent compared to
    queries.
  • Metrics: communication overhead, latency and the
    percentage of replicas getting the update

125
Problems in real-world P2P systems (continued)
Updates in Highly Unreliable, Replicated
Peer-to-Peer Systems, Datta, Hauswirth and Aberer,
ICDCS '03
  • Replication factor is substantially higher than
    what is assumed for distributed databases.
  • Connectivity among replicas is high.
  • Connectivity graph is random.

126
Updates in replicated P2P systems
[Diagram: a set of replica peers, some online and some offline]
  • The P2P system's search algorithm will find a random
    online replica responsible for the key being
    searched.
  • The replicas need to be consistent (ideally)
  • Probabilistic guarantee: best effort!
  • Assumption: each peer knows a subset of all the
    replicas for an item

127
Updates in Highly Unreliable, Replicated
Peer-to-Peer Systems, Datta, Hauswirth and Aberer,
ICDCS '03
  • Update Propagation combines
  • A push phase is initiated by the originator of
    the update that pushes the new update to a subset
    of responsible peers it knows, which in turn
    propagate it to responsible peers they know, etc
    (similar to flooding with TTL)
  • A pull phase is initiated by a peer that needs
    to update its copy. For example, because (a) it
    was offline (disconnected) or (b) has received a
    pull request but is not sure that it has the most
    up-to-date copy
  • Push and pull are consecutive, but may overlap in
    time

128
Algorithms
  • Push
  • If replica p gets Push(U, V, Rf, t) for a new
    (U, V) pair:
  • define Rp = a random subset (of size R·fr) of the
    replicas known to p
  • with probability PF(t), send Push(U, V, Rf ∪ Rp, t+1)
    to Rp \ Rf

(U, V): item and version; t: a counter (similar to a
TTL)
Rf: partial list of peers that have received the
update; R: number of replicas; fr: fraction of the
total replicas to which peers initially decide to
forward the update (fan-out)
  • Each message carries the list of peers to which the
    update has already been sent
  • Parameters:
  • TTL counter t
  • PF(t): probability (locally determined at each
    peer) of forwarding the update
  • |Rp|: size of the random subset (fan-out)
(A sketch of the push phase follows.)

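A rough Python sketch of the push phase (ours, under assumptions: the Peer class and its fields are invented for illustration, and the recursive call stands in for asynchronous message delivery):

    import random

    class Peer:
        def __init__(self, known_replicas=()):
            self.known_replicas = list(known_replicas)   # subset of the replica set this peer knows
            self.versions = {}                           # item -> latest version seen

    def handle_push(peer, item, version, informed, t, PF, fanout_fraction):
        # Called at `peer` when it receives Push(item, version, informed, t).
        if peer.versions.get(item, -1) >= version:
            return                                       # not a new (item, version) pair: ignore
        peer.versions[item] = version                    # apply the update locally
        known = peer.known_replicas
        subset_size = max(1, int(fanout_fraction * len(known)))
        targets = random.sample(known, min(subset_size, len(known)))
        if random.random() < PF(t):                      # forward with probability PF(t)
            new_informed = informed | set(targets)       # the message carries the informed list
            for q in targets:
                if q not in informed:                    # push only to Rp \ Rf
                    handle_push(q, item, version, new_informed, t + 1, PF, fanout_fraction)

    # Example start: handle_push(origin, "item-x", 1, set(), 0, lambda t: 0.9 ** t, 0.1)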
129
Selective Push
[Diagram: example of selective push over a small replica overlay, marking where extra (redundant) update messages arise]
Avoid sequential redundant updates: partial lists
of informed neighbors are transmitted with the
message
Avoid parallel redundant update messages: messages are
propagated only with probability PF < 1 and to a
fraction of the neighbors
130
Algorithms
  • Pull
  • If p comes online, or has received no Push for time T:
  • contact online replicas
  • pull updates based on version vectors

Strategy: push the update to online peers asap, such
that later all online peers always have the update
(possibly pulled) w.h.p.
131
Scenario1 Dynamic topology
[Diagram: an overlay of nine replica peers whose topology changes between push rounds]
132
Scenario2 Duplicate messages
[Diagram: the same nine-peer overlay, with update messages classified as necessary messages, avoidable duplicates, and unavoidable (?) duplicates]
133
Results: Impact of varying fanout
A limited fanout (fr) is sufficient to spread the
update, since flooding grows exponentially; a large
fanout only causes unnecessary duplicate messages.
[Plot: how many peers learn about the update, for varying fanout]
134
Results: Impact of the probability of peers staying
online in consecutive push rounds
Sigma (σ): the probability of online peers staying
online in consecutive push rounds
135
Results: Impact of varying the probability of pushing
Reduce the probability of forwarding updates as
the number of push rounds increases
136
CUP: Controlled Update Propagation in
Peer-to-Peer Networks, Roussopoulos and Baker '02
PCX (Path Caching with Expiration): cache index
entries at intermediary nodes that lie on the
path taken by a search query; cached entries
typically have expiration times. Not addressed:
which items need to be updated, as well as whether
the interest in updating particular entries has
died out. CUP (Controlled Update Propagation):
asynchronously builds caches of index entries
while answering search queries and propagates
updates of index entries to maintain these caches
(pushes updates)
137
CUP: Controlled Update Propagation in
Peer-to-Peer Networks, Roussopoulos and Baker '02
  • Every node maintains two logical channels per
    neighbor:
  • a query channel, used to forward search queries
  • an update channel, used to forward query
    responses asynchronously to a neighbor and to
    update index entries that are cached at the
    neighbor (to proactively push updates)
  • Queries travel to the node holding the item
  • Updates travel along the reverse path taken by a
    query
  • Query coalescing: if a node receives two or more
    queries for an item, it pushes only one instance
  • Just one update channel (a node does not keep a separate
    open connection per request): all responses go
    through the update channel; interest bits tell the node
    to which neighbors to push the response

138
CUP: Controlled Update Propagation in
Peer-to-Peer Networks, Roussopoulos and Baker '02
  • Each node decides individually:
  • when to receive updates
  • by registering its interest; an
    incentive-based policy determines when to
    cut off incoming updates
  • when to propagate updates

139
CUP: Controlled Update Propagation in
Peer-to-Peer Networks, Roussopoulos and Baker '02
For each key K, node n stores: a flag that
indicates whether the node is waiting to receive
an update for K in response to a query; an
interest vector, where each bit corresponds to a
neighbor and is set or clear depending on whether
the neighbor is or is not interested in receiving
updates for K; and a popularity measure, the request
frequency of each non-local key K for which it
receives queries. The measure is used to
re-evaluate whether it is beneficial to continue
caching and receiving updates for K
140
CUP: Controlled Update Propagation in
Peer-to-Peer Networks, Roussopoulos and Baker '02
For each key, the authority node that owns the
key is the root of the CUP tree. Updates originate
at the root of the tree and travel downstream to
interested nodes. Types of updates: deletes,
refreshes, appends
Example: A is the root for key K3
Applicable to both structured and unstructured networks. In
structured networks, the query path is well defined, with a
bounded number of hops
141
CUP: Controlled Update Propagation in
Peer-to-Peer Networks, Roussopoulos and Baker '02
Handling queries for K: 1. Fresh entries for key
K are cached: use them to push the response to the
querying neighbors. 2. Key K is not in the
cache: it is added and marked as pending (to
coalesce potential bursts). 3. All cached entries
for K have expired: push the query onward. Handling
updates for K: an update of K is forwarded only
to neighbors that have registered interest in K. Also,
an adaptive control mechanism regulates the
rate of pushed updates
142
CUP: Controlled Update Propagation in
Peer-to-Peer Networks, Roussopoulos and Baker '02
The adaptive control mechanism regulates the rate
of pushed updates. Each node N has a capacity U
for pushing updates that varies with its
workload, network bandwidth and/or network
connectivity. N divides U among its outgoing
update channels such that each channel gets a
share that is proportional to the length of its
queue. Entries in the queue may be re-ordered. (A
small sketch of the split follows.)