Image Indexing and Retrieval - PowerPoint PPT Presentation


Slides: 143
Provided by: osfs

Transcript and Presenter's Notes

Title: Image Indexing and Retrieval


1
Topics in Database Systems: Data Management in
Peer-to-Peer Systems
Search & Replication in Unstructured P2P
2
Overview
  • Centralized
  • Constantly-updated directory hosted at central
    locations (do not scale well, updates, single
    points of failure)
  • Decentralized but structured
  • The overlay topology is highly controlled and
    files (or metadata/index) are not placed at
    random nodes but at specified locations
  • Decentralized and Unstructured
  • Peers connect in an ad-hoc fashion
  • The location of document/metadata is not
    controlled by the system
  • No guarantee of search success
  • No bounds on search time

3
Overview
  • Blind Search and Variations
  • No information about the location of items
  • Informed Search
  • Maintain (localized) index information
  • Local and Routing Indexes
  • Trade-off cost of maintaining the indexes (when
    joining/leaving/updating) vs cost for search

4
Blind Search
Flood-based: each node contacts its neighbors,
which in turn contact their neighbors, until the
item is located.
Exponential search time.
No guarantees.
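A minimal sketch of the flood described above, assuming the overlay is given as an adjacency dictionary (the `graph` layout and the function name are illustrative, not part of the original slides). It stops at the first hit, so the message count it reports is a lower bound on what a real flood would send.

```python
from collections import deque

def flood_search(graph, source, has_item, ttl):
    """Breadth-first flood from `source`, at most `ttl` hops deep.

    graph: dict mapping node -> list of neighbors (hypothetical layout).
    has_item: set of nodes holding the requested item.
    Returns (hops to the first copy found or None, messages sent so far).
    """
    seen = {source}                  # duplicate detection: forward a query once
    frontier = deque([(source, 0)])
    messages = 0
    while frontier:
        node, hops = frontier.popleft()
        if node in has_item:
            return hops, messages
        if hops == ttl:
            continue                 # TTL exhausted on this branch
        for neighbor in graph[node]:
            messages += 1            # duplicates still cost a message
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return None, messages
```

On a 6-node ring with the item three hops away, a TTL of 3 finds it, while a TTL of 2 fails after spending messages on both directions of the ring.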
5
Blind Search Issues
  • BFS vs DFS: BFS gives better response time but
    sends more messages
  • Iterative vs Recursive (return path)
  • TTL (time to live) parameter
  • Cycles (duplicate messages)
  • Connectivity
  • Power-Law Topologies: the i-th most connected
    node has $\omega / i^{\alpha}$ neighbors

6
Gnutella Summary
  • Completely decentralized
  • Hit rates are high
  • High fault tolerance
  • Adapts well and dynamically to changing peer
    populations
  • Protocol causes high network traffic (e.g.,
    3.5Mbps). For example
  • 4 connections C / peer, TTL 7
  • 1 ping packet can cause packets
  • No estimates on the duration of queries can be
    given
  • No probability for successful queries can be
    given
  • Topology is unknown, so algorithms cannot exploit
    it
  • Free riding is a problem
  • Reputation of peers is not addressed
  • Simple and robust

7
Summary and Comparison of Approaches
8
More on Search
  • Search Options
  • Query Expressiveness (type of queries)
  • Comprehensiveness (all or just the first (or k)
    results)
  • Topology
  • Data Placement
  • Message Routing

9
Comparison
10
Comparison
11
  • Client-Server performs well
  • But not always feasible
  • Ideal performance is often not the key issue!
  • Things that flood-based systems do well
  • Scaling
  • Decentralization of visibility and liability
  • Finding popular stuff (e.g., caching)
  • Fancy local queries
  • Things that flood-based systems do poorly
  • Finding unpopular stuff
  • Fancy distributed queries
  • Guarantees about anything (answer quality,
    privacy, etc.)

12
Blind Search Variations
Modified-BFS: choose only a fraction of the
neighbors (some random subset).
Expanding Ring or Iterative Deepening: start a BFS
with a small TTL and repeat the BFS at increasing
depths if the first BFS fails.
Works well when there is some stop condition and a
small flood will satisfy the query; otherwise it
causes even bigger loads than standard flooding.
Appropriate when hot objects are replicated more
widely than cold objects.
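The expanding-ring idea can be sketched as follows. The TTL schedule `(1, 2, 4, 8)` and the adjacency-dictionary layout are assumptions made for illustration; the slides do not fix either.

```python
def nodes_within(graph, source, ttl):
    """Set of nodes reachable from `source` in at most `ttl` hops."""
    reached = {source}
    frontier = [source]
    for _ in range(ttl):
        next_frontier = []
        for node in frontier:
            for neighbor in graph[node]:
                if neighbor not in reached:
                    reached.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return reached

def expanding_ring(graph, source, has_item, ttl_schedule=(1, 2, 4, 8)):
    """Repeat a bounded flood with growing TTLs; stop at the first hit.

    Returns (successful TTL or None, total nodes contacted over all rounds).
    """
    contacted = 0
    for ttl in ttl_schedule:
        reached = nodes_within(graph, source, ttl)
        contacted += len(reached) - 1   # the source does not contact itself
        if reached & has_item:
            return ttl, contacted
    return None, contacted
```

Each failed round re-contacts the nodes of the previous rounds, which is why the total load can exceed a single flood when the item is far away or missing.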
13
Blind Search Methods
  • Random Walks
  • The node that poses the query sends out k query
    messages to an equal number of randomly chosen
    neighbors
  • Each message follows its own path, at each step
    randomly choosing one neighbor to forward it to
  • Each path is a walker
  • Two methods to terminate each walker
  • TTL-based or
  • checking method (the walkers periodically check
    with the query source if the stop condition has
    been met)
  • It reduces the number of messages to k x TTL in
    the worst case
  • Some kind of local load-balancing
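A sketch of a k-walker search with TTL-based termination, under simplifying assumptions: walkers move in lockstep, each picks its next hop independently (so two walkers may start on the same neighbor), and success is checked after every step. The function and parameter names are illustrative.

```python
import random

def k_walker_search(graph, source, has_item, k=4, ttl=16, seed=0):
    """Run k random walkers from `source`; returns (found, messages sent)."""
    rng = random.Random(seed)
    walkers = [source] * k
    messages = 0
    for _ in range(ttl):
        for w in range(k):
            # each walker forwards the query to one random neighbor
            walkers[w] = rng.choice(graph[walkers[w]])
            messages += 1
        if any(node in has_item for node in walkers):
            return True, messages
    return False, messages          # never more than k * ttl messages
```

A failed search costs exactly k × TTL messages, which is the worst case quoted above.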

14
Blind Search Methods
Random Walks: in addition, the protocol biases its
walks towards high-degree nodes (it chooses the
highest-degree neighbor)
15
Topics in Database Systems: Data Management in
Peer-to-Peer Systems
Q. Lv et al., Search and Replication in
Unstructured Peer-to-Peer Networks, ICS'02
16
  • Search and Replication in Unstructured
    Peer-to-Peer Networks
  • Type of replication depends on the search
    strategy used
  • A number of blind-search variations of flooding
  • A number of (metadata) replication strategies
  • Evaluation method: study how they work for a
    number of different topologies and query
    distributions

17
Methodology
  • Performance of search depends on
  • Network topology: the graph formed by the p2p
    overlay network
  • Query distribution: the distribution of query
    frequencies for individual files
  • Replication: the number of nodes that have a
    particular file

Assumption: fixed network topology and fixed
query distribution.
Results still hold if one assumes that the time to
complete a search is short compared to the time of
change in network topology and in query
distribution.
18
Network Topology
(1) Power-Law Random Graph (PLRG): a 9239-node random
graph.
Node degrees follow a power-law distribution:
when ranked from the most connected to the least,
the i-th ranked node has $\omega / i^{\alpha}$
neighbors, where $\omega$ is a constant.
Once the node degrees are chosen, the nodes are
connected randomly.
19
Network Topology
(2) Normal Random Graph: a 9836-node random graph
20
Network Topology
(3) Gnutella Graph (Gnutella): a 4736-node graph
obtained in Oct 2000.
Node degrees roughly follow a two-segment
power-law distribution.
21
Network Topology
(4) Two-Dimensional Grid (Grid): a two-dimensional
100x100 grid
22
Query Distribution
Assume m objects.
Let $q_i$ be the relative popularity of the i-th
object (in terms of queries issued for it).
Values are normalized: $\sum_{i=1}^{m} q_i = 1$
  • (1) Uniform: all objects are equally popular
  • $q_i = 1/m$
  • (2) Zipf-like
  • $q_i \propto 1 / i^{\alpha}$
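The two query distributions can be written down directly; this is a small sketch with illustrative function names, where `alpha` is the Zipf exponent.

```python
def uniform_popularity(m):
    """q_i = 1/m for every object."""
    return [1.0 / m] * m

def zipf_popularity(m, alpha):
    """q_i proportional to 1 / i**alpha, normalized so the q_i sum to 1."""
    weights = [1.0 / (i ** alpha) for i in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

With `alpha = 0` the Zipf-like distribution degenerates to the uniform one; larger exponents concentrate more of the query mass on the top-ranked objects.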

23
Query Distribution & Replication
When the replication is uniform, the query
distribution is irrelevant (since all objects are
replicated by the same amount, search times are
equivalent for both hot and cold items).
When the query distribution is uniform, all three
replication distributions are equivalent
(uniform!).
Thus, there are three relevant
query-distribution/replication combinations:
  • Uniform/Uniform
  • Zipf-like/Proportional
  • Zipf-like/Square-root

24
Metrics
Pr(success): probability of finding the queried
object before the search terminates.
#hops: delay in finding an object, as measured in
number of hops.
25
Metrics
#msgs per node: overhead of an algorithm, as
measured in the average number of search messages
each node in the p2p network has to process.
#nodes visited.
Percentage of message duplication.
Peak #msgs: the number of messages that the busiest
node has to process (to identify hot spots).
These are per-query measures.
Aggregated performance measure: each per-query
measure is convolved with the query probability.
26
Limitation of Flooding
  • Choice of TTL
  • Too low, and the node may not find the object even
    if it exists; too high, and it burdens the network
    unnecessarily

There are many duplicate messages (due to cycles),
particularly in high-connectivity graphs.
Multiple copies of a query are sent to a node by
multiple neighbors.
Avoiding cycles decreases the number of links.
Duplicated messages can be detected and not
forwarded, BUT the number of duplicate messages can
still be excessive and worsens as TTL increases.
27
Limitation of Flooding: Comparison of the
topologies
Power-law and Gnutella-style graphs are
particularly bad with flooding.
Highly connected nodes mean more duplicate
messages, because many nodes' neighbors overlap.
Random graph is best, because in a truly random
graph the duplication ratio (the likelihood that
the next node already received the query) is the
same as the fraction of nodes visited so far, as
long as that fraction is small.
Random graph also gives better load distribution
among nodes.
28
Random Walks
Experiments show that 16 to 64 walkers give
good results.
Checking once at every 4th step strikes a good
balance between the overhead of the checking
messages and the benefits of checking.
Keeping state (when the same query reaches a node
again, the node randomly chooses a different
neighbor to forward it to):
improves Random and Grid by reducing the message
overhead by up to 30% and the number of hops by up
to 30%;
small improvements for Gnutella and PLRG.
29
Random Walks
When compared to flooding: the 32-walker random
walk reduces message overhead by roughly two
orders of magnitude for all queries across all
network topologies, at the expense of a slight
increase in the number of hops (increasing from
2-6 to 4-15).
When compared to expanding ring: the 32-walker
random walk outperforms expanding ring as well,
particularly in PLRG and Gnutella graphs.
30
Principles of Search
  • Adaptive termination is very important
  • Expanding ring or the checking method
  • Message duplication should be minimized
  • Preferably, each query should visit a node just
    once
  • Granularity of the coverage should be small
  • Each additional step should not significantly
    increase the number of nodes visited

31
Replication
32
Types of Replication
  • Two types of replication
  • Metadata/Index replication: replicate index entries
  • Data/Document replication: replicate the actual
    data (e.g., music files)

33
Types of Replication
Caching vs Replication
Cache: store data retrieved from a previous
request (client-initiated).
Replication: more proactive; a copy of a data item
may be stored at a node even if the node has not
requested it.
34
Reasons for Replication
  • Reasons for replication
  • Performance
  • load balancing
  • locality place copies close to the requestor
  • geographic locality (more choices for the next
    step in search)
  • reduce number of hops
  • Availability
  • In case of failures
  • Peer departures

Besides storage, there is a cost associated with
replication: consistency maintenance.
Replication makes reads faster at the expense of
slower writes.
35
  • No proactive replication (Gnutella)
  • Hosts store and serve only what they requested
  • A copy can be found only by probing a host with a
    copy
  • Proactive replication of keys (metadata +
    pointer) for search efficiency (FastTrack, DHTs)
  • Proactive replication of copies for search
    and download efficiency, anonymity. (Freenet)

36
Issues
Which items (data/metadata) to replicate: based
on popularity.
In traditional distributed systems, also the rate
of reads/writes; cost-benefit: the ratio
read-savings/write-increase.
Where to replicate (allocation schema): more later.
37
Issues
How/when to update: both data items and metadata
38
Database-Flavored Replication Control Protocols
Let's assume the existence of a data item x with
copies $x_1, x_2, \ldots, x_n$.
x: logical data item; the $x_i$: physical data
items.
A replication control protocol is responsible for
mapping each read/write on a logical data item
(R(x)/W(x)) to a set of reads/writes on a
(possibly proper) subset of the physical data
item copies of x.
39
One Copy Serializability
Correctness: a DBMS for a replicated database
should behave like a DBMS managing a one-copy
(i.e., non-replicated) database insofar as users
can tell.
One-copy serializable (1SR): the schedule of
transactions on a replicated database is
equivalent to a serial execution of those
transactions on a one-copy database.
One-copy schedule: replace operations on data
copies with operations on data items.
40
ROWA
Read One/Write All (ROWA): a replication control
protocol that maps each read to only one copy of
the item and each write to a set of writes on all
physical data item copies.
If even one of the copies is unavailable, an
update transaction cannot terminate.
41
Write-All-Available
Write-all-available: a replication control
protocol that maps each read to only one copy of
the item and each write to a set of writes on all
available physical data item copies.
42
Quorum-Based Voting
  • A read quorum $V_r$ and a write quorum $V_w$ are
    needed to read or write a data item
  • If a given data item has a total of V votes, the
    quorums have to obey the following rules
  • Rule 1: $V_r + V_w > V$
  • Rule 2: $V_w > V/2$

Rule 1 ensures that a data item is not read and
written by two transactions concurrently (R/W).
Rule 2 ensures that two write operations from two
transactions cannot occur concurrently on the same
data item (W/W).
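The two quorum rules translate into a one-line check. This is a sketch with an illustrative function name; votes are assumed to be integers.

```python
def valid_quorums(v_total, v_read, v_write):
    """True if the quorums satisfy Vr + Vw > V and Vw > V/2."""
    return v_read + v_write > v_total and 2 * v_write > v_total
```

With one vote per copy, read-one/write-all is the special case `v_read = 1`, `v_write = v_total`, while a majority scheme uses `v_read = v_write = v_total // 2 + 1`.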
43
Distributing Writes
Immediate writes.
Deferred writes: access only one copy of the data
item, and delay the distribution of writes to other
sites until the transaction has terminated and is
ready to commit.
It maintains an intention list of deferred
updates.
After the transaction terminates, it sends the
appropriate portion of the intention list to each
site that contains replicated copies.
Optimizations: aborts cost less, but this may
delay commitment and delays the detection of
conflicts.
Primary or master copy: updates at a single copy
per item.
44
Eager vs Lazy Replication
Eager replication: keeps all replicas
synchronized by updating all replicas in a single
transaction.
Lazy replication: asynchronously propagates
replica updates to other nodes after the
replicating transaction commits.
In p2p: lazy replication (or soft state).
45
Update Propagation
  • Who initiates the update
  • Push: by the server item (copy) that changes
  • Pull: by the client holding the copy
  • When
  • Periodic
  • Immediate
  • Lazy: when an inconsistency is detected
  • Threshold-based: freshness (e.g., number of
    updates or actual time)
  • Value
  • Expiration time: items expire (become invalid)
    after that time (most often used in p2p)
  • Stateless or stateful (the item owners know
    which nodes hold copies of the item)

46
Replication Structured P2P
47
CHORD
Metadata replication or redundancy
Invariant to guarantee correctness of lookups:
keep successor nodes up-to-date.
Method: each node maintains a successor list of
its r nearest successors on the Chord ring.
Why? Availability.
How to keep it consistent: lazily, through
periodic stabilization.
48
CHORD
Data replication
Method: replicate data associated with a key at
the k nodes succeeding the key.
Why? Availability.
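Choosing the k replica holders for a key can be sketched as below, assuming node identifiers and keys are integers on the same ring and that the first holder is the first node whose identifier is at least the key. The function name is illustrative, not a Chord API.

```python
from bisect import bisect_left

def replica_nodes(node_ids, key, k):
    """Return the k nodes that succeed `key` on the identifier ring."""
    ring = sorted(node_ids)
    start = bisect_left(ring, key)      # first node with id >= key
    count = min(k, len(ring))           # cannot use more nodes than exist
    return [ring[(start + j) % len(ring)] for j in range(count)]
```

The modulo wraps around the ring, so a key close to the largest identifier is replicated on the smallest identifiers as well.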
49
CAN
Metadata replication
Multiple realities: with r realities, each node is
assigned r coordinate zones, one in every
reality, and holds r independent neighbor sets.
Replicate the hash table in each reality.
Availability: fails only if the responsible nodes
in all r realities fail.
Performance: better search; choose to forward the
query to the neighbor with coordinates closest to
the destination.
50
CAN
Metadata replication
Overloading coordinate zones: multiple nodes may
share a zone.
The hash table may be replicated among zones.
Higher availability.
Performance: more choice among neighbors, so a
node can select those closer in latency.
Cost: consistency maintenance.
51
CAN
Metadata replication
Multiple hash functions: use k different hash
functions to map a single key onto k points in
the coordinate space.
Availability: fails only if all k replicas are
unavailable.
Performance: choose to send the query to the node
closest in the coordinate space, or send the query
to all k nodes in parallel (k parallel searches).
Cost: consistency maintenance; query traffic (if
parallel searches).
52
CAN
Metadata replication
Hot-spot replication: a node that finds it is
being overloaded by requests for a particular
data key can replicate this key at each of its
neighboring nodes.
Then, with a certain probability, a neighbor can
choose either to satisfy the request or to forward
it.
Performance: load balancing.
53
CAN
Metadata replication
Caching: each node maintains a cache of the data
keys it recently accessed.
Before forwarding a request, it first checks
whether the requested key is in its cache, and if
so, it can satisfy the request without forwarding
it any further.
The number of cache entries per key grows in
direct proportion to its popularity.
54
Replication Theory Replica Allocation Policies
55
Replication Allocation Scheme
Question: how to use replication to improve
search efficiency in unstructured networks with a
proactive replication mechanism?
How many copies of each object should there be so
that the search overhead for the object is
minimized, assuming that the total amount of
storage for objects in the network is fixed?
56
Replication Theory
Assume m objects and n nodes.
Each object i is replicated on $r_i$ distinct nodes,
and the total number of objects stored is R, that
is, $\sum_{i=1}^{m} r_i = R$.
Also, $p_i = r_i / R$.
Assume that object i is requested with relative
rate $q_i$; we normalize by setting
$\sum_{i=1}^{m} q_i = 1$.
For convenience, assume $1 \ll r_i \le n$ and that
$q_1 \ge q_2 \ge \cdots \ge q_m$.
57
Replication Theory
Assume that searches go on until a copy is
found.
Searches consist of randomly probing sites until
the desired object is found: at each step, the
search draws a node uniformly at random and asks
for a copy.
58
Search Example
  • 2 probes

4 probes
59
Replication Theory
The probability Pr(k) that the object is found at
the k-th probe is given by
$\Pr(k) = \Pr(\text{not found in the previous } k-1 \text{ probes}) \cdot \Pr(\text{found in one (the } k\text{-th) probe}) = (1 - r_i/n)^{k-1} \cdot r_i/n$
k (the search size: the step at which the item is
found) is a random variable with geometric
distribution and success probability $r_i/n$, so
its expectation is $n/r_i$.
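A quick numerical check of the geometric model, as a sketch: the expectation is approximated by truncating the infinite sum at `max_probes` terms (an assumption that is harmless when r_i/n is not tiny). Function names are illustrative.

```python
def probe_probability(r_i, n, k):
    """Pr(k) = (1 - r_i/n)**(k-1) * r_i/n: first hit on the k-th probe."""
    p = r_i / n
    return (1 - p) ** (k - 1) * p

def expected_search_size(r_i, n, max_probes=10_000):
    """Truncated sum of k * Pr(k); approaches n / r_i."""
    return sum(k * probe_probability(r_i, n, k) for k in range(1, max_probes + 1))
```

For an object replicated on 10 of 100 nodes, the truncated sum agrees with n/r_i = 10 to within numerical precision.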
60
Replication Theory
$A_i$: the expectation (average search size) for
object i is the inverse of the fraction of sites
that have replicas of the object: $A_i = n/r_i$
The average search size A over all the objects
(average number of nodes probed per object query):
$A = \sum_i q_i A_i = n \sum_i q_i / r_i$
Minimize $A = n \sum_i q_i / r_i$
61
Replication Theory
If we have no limit on $r_i$, replicate everything
everywhere.
Then the average search size is $A_i = n/r_i = 1$.
Search becomes trivial.
Assume a limit on R and that the average number
of replicas per site, $\rho = R/n$, is fixed.
How to allocate these R replicas among the m
objects: how many replicas per object?
62
Uniform Replication
Create the same number of replicas for each
object: $r_i = R/m$
Average search size for uniform replication (with
$\rho = R/n$): $A_i = n/r_i = m/\rho$
$A_{\text{uniform}} = \sum_i q_i \, m/\rho = m/\rho \; (= m n / R)$
which is independent of the query distribution.
It makes sense to allocate more copies to objects
that are frequently queried; this should reduce
the search size for the more popular objects.
63
Proportional Replication
Create a number of replicas for each object
proportional to the query rate: $r_i = R q_i$
64
Uniform and Proportional Replication
  • Summary
  • Uniform Allocation: $p_i = 1/m$
  • Simple, resources are divided equally
  • Proportional Allocation: $p_i = q_i$
  • Fair, resources per item proportional to demand
  • Reflects current P2P practices

65
Proportional Replication
Number of replicas for each object: $r_i = R q_i$
Average search size for proportional replication:
$A_i = n/r_i = n/(R q_i)$
$A_{\text{proportional}} = \sum_i q_i \, n/(R q_i) = m n / R = m/\rho = A_{\text{uniform}}$
(with $\rho = R/n$), again independent of the query
distribution.
Why? Objects whose query rates are greater than
average ($> 1/m$) do better with proportional, and
the others do better with uniform.
The weighted average balances out to be the same.
So what is the optimal way to allocate replicas so
that A is minimized?
66
Space of Possible Allocations
  • $q_{i+1}/q_i \le p_{i+1}/p_i$
  • As the query rate decreases, how does the
    ratio of allocated replicas behave?
  • Reasonable
  • $p_{i+1}/p_i \le 1$
  • $= 1$ for uniform

67
Space of Possible Allocations
  • Definition: an allocation $p_1, p_2, p_3, \ldots, p_m$ is
    in-between Uniform and Proportional if
  • for $1 \le i < m$: $q_{i+1}/q_i < p_{i+1}/p_i < 1$
  • ($p_{i+1}/p_i = 1$ for uniform and $= q_{i+1}/q_i$
    for proportional; we want to favor popular
    objects, but not too much)
  • Theorem 1: all (strictly) in-between strategies
    are (strictly) better than Uniform and
    Proportional

Theorem 2: p is worse than Uniform/Proportional
if for all i, $p_{i+1}/p_i > 1$ (more popular gets
less) OR for all i, $q_{i+1}/q_i > p_{i+1}/p_i$ (less
popular gets less than its fair share).
Proportional and Uniform are the worst
reasonable strategies.
68
Space of allocations on 2 items
[Figure labels: Uniform, Proportional, $p_2/p_1$, $q_2/q_1$]
69
So, what is the best strategy?
70
Square-Root Replication
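Find the $r_i$ that minimize A, where
$A = \sum_i q_i A_i = n \sum_i q_i / r_i$.
This is achieved for $r_i = \lambda \sqrt{q_i}$, where
$\lambda = R / \sum_i \sqrt{q_i}$.
Then the average search size is
$A_{\text{optimal}} = \frac{1}{\rho} \left( \sum_i \sqrt{q_i} \right)^2$, with $\rho = R/n$.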
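The three allocations can be compared numerically. This is a sketch under the idealized model above, in which fractional replica counts are allowed; the function names and the sample numbers in the usage note are illustrative.

```python
from math import sqrt

def allocate(q, R, strategy):
    """Split R replicas among objects with query rates q (fractions allowed)."""
    if strategy == "uniform":
        weights = [1.0] * len(q)
    elif strategy == "proportional":
        weights = list(q)
    elif strategy == "square-root":
        weights = [sqrt(qi) for qi in q]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    total = sum(weights)
    return [R * w / total for w in weights]

def average_search_size(q, r, n):
    """A = n * sum_i q_i / r_i."""
    return n * sum(qi / ri for qi, ri in zip(q, r))
```

For example, with q = (0.5, 0.3, 0.2), n = 1000 and R = 300, uniform and proportional both give A = m n / R = 10, while the square-root allocation gives a smaller value, (n/R) times the squared sum of the square roots of the q_i.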
71
How much can we gain by using SR?
[Figure: $A_{\text{uniform}} / A_{\text{SR}}$ for Zipf-like query rates]
72
Other Metrics Discussion
  • Utilization rate: the rate of requests that a
    replica of an object i receives
  • $U_i = R q_i / r_i$
  • For uniform replication, all objects have the
    same average search size, but replicas have
    utilization rates proportional to their query
    rates
  • Proportional replication achieves perfect load
    balancing with all replicas having the same
    utilization rate, but average search sizes vary
    with more popular objects having smaller average
    search sizes than less popular ones

73
Replication Summary
74
Pareto Distribution (for the queries)
Pareto principle (80-20 rule): 80% of the wealth
is owned by 20% of the population.
Zipf: what is the size of the r-th ranked?
Pareto: how many have size > r?
75
Replication (summary)
Each object i is replicated on $r_i$ nodes and the
total number of objects stored is R, that is,
$\sum_{i=1}^{m} r_i = R$
  • (1) Uniform: all objects are replicated at the same
    number of nodes
  • $r_i = R/m$
  • (2) Proportional: the replication of an object is
    proportional to the query probability of the
    object
  • $r_i \propto q_i$
  • (3) Square-root: the replication of an object i
    is proportional to the square root of its query
    probability $q_i$
  • $r_i \propto \sqrt{q_i}$

76
Assumption: there is at least one copy per
object
  • A query is soluble if there are sufficiently many
    copies of the item.
  • A query is insoluble if the item is rare or
    nonexistent.
  • What is the search size of a query?
  • Soluble queries: number of probes until an answer
    is found.
  • Insoluble queries: maximum search size

77
  • SR is best for soluble queries
  • Uniform minimizes cost of insoluble queries

What is the optimal strategy?
78
[Figure: $10^4$ items, Zipf-like with w = 1.5; Uniform vs SR for three cases: all soluble, 85% soluble, all insoluble]
79
We now know what we need.
How do we get there?
80
Replication Algorithms
  • Uniform and Proportional are easy
  • Uniform: when an item is created, replicate its key
    in a fixed number of hosts.
  • Proportional: for each query, replicate the key
    in a fixed number of hosts (need to know or
    estimate the query rate)

Desired properties of algorithm
  • Fully distributed, where peers communicate through
    random probes; minimal bookkeeping; and no more
    communication than what is needed for search.
  • Converge to/obtain SR allocation when query rates
    remain steady.

81
Replication - Implementation
Two strategies are popular.
Owner Replication: when a search is successful,
the object is stored at the requestor node only
(used in Gnutella).
Path Replication: when a search succeeds, the
object is stored at all nodes along the path from
the requestor node to the provider node, following
the reverse path back to the requestor (used in
Freenet).
82
Achieving Square-Root Replication
  • How can we achieve square-root replication in
    practice?
  • Assume that each query keeps track of the search
    size
  • Each time a query is finished the object is
    copied to a number of sites proportional to the
    number of probes
  • On average, object i will be replicated $c \cdot n / r_i$
    times each time a query is issued (for some
    constant c)
  • It can be shown that this gives square-root
    replication
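One way to see why this rule leads to square-root replication is a mean-field sketch (a simplification of mine, not taken from the slides): copies of object i are created at rate `creation * q_i / r_i` (queries arrive at rate q_i and each adds about c·n/r_i copies) and every replica is deleted at the same rate `deletion`, independent of the object. The steady state then satisfies r_i² ∝ q_i. The constants and the Euler step below are arbitrary illustrative choices.

```python
def steady_state_replicas(q, creation=100.0, deletion=1.0,
                          step=0.01, iterations=5000):
    """Iterate dr_i/dt = creation * q_i / r_i - deletion * r_i to a fixed point."""
    r = [10.0] * len(q)             # arbitrary common starting point
    for _ in range(iterations):
        r = [ri + step * (creation * qi / ri - deletion * ri)
             for qi, ri in zip(q, r)]
    return r
```

At the fixed point r_i = sqrt(creation · q_i / deletion), so an object queried 16 times as often ends up with 4 times as many replicas.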

83
Replication - Conclusion
Thus, for Square-root replication an object
should be replicated at a number of nodes that
is proportional to the number of probes that the
search required
84
Replication - Implementation
If a p2p system uses k walkers, the number of
nodes between the requestor and the provider node
is 1/k of the total nodes visited (number of
probes).
Then, path replication should result in
square-root replication.
Problem: tends to replicate objects at nodes that
are topologically along the same path.
85
Replication - Implementation
Random Replication: when a search succeeds, we
count the number of nodes on the path between the
requestor and the provider, say p.
Then, randomly pick p of the nodes that the k
walkers visited to replicate the object.
Harder to implement.
86
Achieving Square-Root Replication
What about replica deletion?
Steady state: the creation rate equals the
deletion rate.
The lifetime of replicas must be independent of
object identity or query rate.
FIFO or random deletion is OK; LRU or LFU is not.
87
Replication Evaluation
  • Study the three replication strategies in the
    Random graph network topology
  • Simulation Details
  • Place the m distinct objects randomly into the
    network
  • Query generator generates queries according to a
    Poisson process at 5 queries/sec
  • Zipf-like distribution of queries among the m objects
    (with $\alpha = 1.2$)
  • For each query, the initiator is chosen randomly
  • Then a 32-walker random walk with state keeping
    and checking every 4 steps
  • Each site stores at most objAllow (40) objects
  • Random Deletion
  • Warm-up period of 10,000 secs
  • Snapshots every 2,000 query chunks

88
Replication Evaluation
  • For each replication strategy
  • What kind of replication ratio distribution does
    the strategy generate?
  • What is the average number of messages per node
    in a system using the strategy
  • What is the distribution of number of hops in a
    system using the strategy

89
Evaluation Replication Ratio
Both path and random replication generate
replication ratios quite close to the square root of
the query rates
90
Evaluation Messages
Path replication and random replication reduce
the overall message traffic by a factor of 3 to 4
91
Evaluation Hops
Much of the traffic reduction comes from reducing
the number of hops
Path and random are better than owner.
For example, the percentage of queries that finish
within 4 hops: 71% for owner, 86% for path, 91% for
random.
92
Summary
  • Random search/replication model: probes to
    random hosts
  • Proportional allocation: current practice
  • Uniform allocation: best for insoluble queries
  • Soluble queries
  • Proportional and Uniform allocations are two
    extremes with same average performance
  • Square-Root allocation minimizes Average Search
    Size
  • OPT (all queries) lies between SR and Uniform
  • SR/OPT allocation can be realized by simple
    algorithms.

93
Replication in Unstructured P2P: epidemic
algorithms
94
  • Replication Policy
  • How many copies
  • Where (owner, path, random path)
  • Update Policy
  • Synchronous vs Asynchronous
  • Master Copy

95
Methods for spreading updates
Push: updates originate from the site where the
update appeared, to reach the sites that hold
copies.
Pull: the sites holding copies contact the master
site.
Expiration times.
Epidemics for spreading updates.
96
A. Demers et al, Epidemic Algorithms for
Replicated Database Maintenance, SOSP 87
Update at a single site.
Randomized algorithms for distributing updates and
driving replicas towards consistency.
Ensure that the effect of every update is
eventually reflected in all replicas.
Sites become fully consistent only when all
updating activity has stopped and the system has
become quiescent.
Analogous to epidemics.
97
  • Methods for spreading updates
  • Direct mail: each new update is immediately
    mailed from its originating site to all other
    sites
  • Timely and reasonably efficient
  • Not all sites know all other sites
  • Mails may be lost
  • Anti-entropy: every site regularly chooses
    another site at random and, by exchanging content,
    resolves any differences between them
  • Extremely reliable but requires exchanging
    content and resolving updates
  • Propagates updates much more slowly than direct
    mail

98
  • Methods for spreading updates
  • Rumor mongering
  • Sites are initially ignorant; when a site
    receives a new update, it becomes a hot rumor
  • While a site holds a hot rumor, it periodically
    chooses another site at random and ensures that
    the other site has seen the update
  • When a site has tried to share a hot rumor with
    too many sites that have already seen it, the
    site stops treating the rumor as hot and retains
    the update without propagating it further
  • Rumor cycles can be more frequent than
    anti-entropy cycles, because they require fewer
    resources at each site, but there is a chance
    that an update will not reach all sites

99
  • Anti-entropy and rumor spreading are examples of
    epidemic algorithms
  • Three types of sites
  • Infective: a site that holds an update that it is
    willing to share
  • Susceptible: a site that has not yet received an
    update
  • Removed: a site that has received an update but
    is no longer willing to share it
  • Anti-entropy: a simple epidemic where all sites are
    always either infective or susceptible

100
A set S of n sites, each storing a copy of a
database.
The database copy at site $s \in S$ is a
time-varying partial function
$s.\mathrm{ValueOf}: K \to (u: V \times t: T)$
where K is the set of keys, V the set of values,
and T the set of timestamps (totally ordered by <).
V contains the element NIL:
$s.\mathrm{ValueOf}[k] = (\mathrm{NIL}, t)$ means that the item
with key k has been deleted from the database.
Assume just one item, so
$s.\mathrm{ValueOf} \in (u: V \times t: T)$, that is, an ordered
pair consisting of a value and a timestamp.
The first component may be NIL, indicating that the
item was deleted at the time indicated by the
second component.
101
  • The goal of the update distribution process is to
    drive the system towards
  • $\forall s, s' \in S: s.\mathrm{ValueOf} = s'.\mathrm{ValueOf}$
  • Operation invoked to update the database
  • $\mathrm{Update}[u: V]$: $s.\mathrm{ValueOf} \leftarrow (u, \mathrm{Now})$

102
Direct Mail
At the site s where an update occurs:
For each s' ∈ S: PostMail[to: s', msg: ("Update", s.ValueOf)]
s: originator of the update; s': receiver of the
update.
Each site s' receiving the update message
("Update", (u, t)):
If s'.ValueOf.t < t then s'.ValueOf ← (u, t)
  • The complete set S must be known to s (stateful
    server)
  • PostMail messages are queued so that the server
    is not delayed (asynchronous), but may fail when
    queues overflow or their destinations are
    inaccessible for a long time
  • n (number of sites) messages per update
  • traffic proportional to n and the average
    distance between sites

103
Anti-Entropy
At each site s, periodically execute:
For some s' ∈ S: ResolveDifference[s, s']
Three ways to execute ResolveDifference:
Push (sender (server) driven): if s.ValueOf.t >
s'.ValueOf.t then s'.ValueOf ← s.ValueOf
(s pushes its value to s')
Pull (receiver (client) driven): if s.ValueOf.t <
s'.ValueOf.t then s.ValueOf ← s'.ValueOf
(s pulls from s' and gets the value of s')
Push-Pull:
s.ValueOf.t > s'.ValueOf.t ⇒ s'.ValueOf ← s.ValueOf
s.ValueOf.t < s'.ValueOf.t ⇒ s.ValueOf ← s'.ValueOf
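The three ResolveDifference variants can be sketched for the single-item case. The `Site` class and the function signature are hypothetical, chosen only to mirror the (value, timestamp) pairs above; `None` stands in for NIL.

```python
from dataclasses import dataclass

@dataclass
class Site:
    value: object = None     # None plays the role of NIL
    timestamp: int = 0

def resolve_difference(s, other, mode="push-pull"):
    """Reconcile site `s` with `other`; the newer timestamp wins."""
    if mode in ("push", "push-pull") and s.timestamp > other.timestamp:
        # s holds the newer value and pushes it to the other site
        other.value, other.timestamp = s.value, s.timestamp
    elif mode in ("pull", "push-pull") and s.timestamp < other.timestamp:
        # the other site holds the newer value and s pulls it
        s.value, s.timestamp = other.value, other.timestamp
```

In push mode nothing happens when the initiator is the stale side, and in pull mode nothing happens when the initiator is the fresh side; push-pull covers both cases in one exchange.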
104
Anti-Entropy
  • Assume that
  • Site s is chosen uniformly at random from the
    set S
  • Each site executes the anti-entropy algorithm
    once per period
  • It can be proved that
  • An update will eventually infect the entire
    population
  • Starting from a single affected site, this can
    be achieved in time proportional to the log of
    the population size

105
Anti-Entropy
Let $p_i$ be the probability of a site remaining
susceptible after the i-th cycle of anti-entropy.
For pull: a site remains susceptible after cycle
i+1 if (a) it was susceptible after cycle i and
(b) it contacted a susceptible site in cycle i+1:
$p_{i+1} = (p_i)^2$
For push: a site remains susceptible after cycle
i+1 if (a) it was susceptible after cycle i and
(b) no infectious site chose to contact it in
cycle i+1:
$p_{i+1} = p_i (1 - 1/n)^{n(1 - p_i)}$
where $1 - 1/n$ is the probability that the site is
not contacted by a given node, and $n(1 - p_i)$ is
the number of infectious nodes at cycle i.
Pull is preferable to push.
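Iterating the two recurrences shows the gap between pull and push. This is a sketch with illustrative function names; the starting value and n in the usage note are arbitrary.

```python
def pull_residue(p0, cycles):
    """Apply p_{i+1} = p_i ** 2 for the given number of cycles."""
    p = p0
    for _ in range(cycles):
        p = p * p
    return p

def push_residue(p0, cycles, n):
    """Apply p_{i+1} = p_i * (1 - 1/n) ** (n * (1 - p_i))."""
    p = p0
    for _ in range(cycles):
        p = p * (1 - 1 / n) ** (n * (1 - p))
    return p
```

Starting from 10% susceptible sites with n = 1000, three pull cycles leave about 1e-8 susceptible, while three push cycles leave roughly 0.6%: once p is small, push only shrinks it by about a factor of 1/e per cycle, whereas pull squares it.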
106
Anti-Entropy
  • Naive comparison: the whole database instance is
    sent over the network
  • Use checksums
  • But what about recent updates known only at a few
    sites?
  • Keep a list of recent updates (now - timestamp <
    threshold t)
  • Compare recent updates first, update the checksums,
    and then compare the checksums; choice of t
  • Maintain an inverted list of updates ordered by
    timestamp
  • Perform anti-entropy by exchanging updates in
    reverse timestamp order until the checksums
    agree
  • Send only the updates; when to stop?

107
Complex Epidemics: Rumor Spreading
  • Initial state: n individuals, initially inactive
    (susceptible)
  • Rumor planting and spreading
  • We plant a rumor with one person who becomes
    active (infective), phoning other people at
    random and sharing the rumor
  • Every person bearing the rumor also becomes
    active and likewise shares the rumor
  • When an active individual makes an unnecessary
    phone call (the recipient already knows the
    rumor), then with probability 1/k the active
    individual loses interest in sharing the rumor
    (becomes removed)
  • We would like to know
  • How fast the system converges to an inactive
    state (no one is infective)
  • The percentage of people that know the rumor
    when the inactive state is reached

108
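Complex Epidemics: Rumor Spreading
Let s, i, r be the fractions of individuals that are susceptible, infective and removed, with $s + i + r = 1$
$ds/dt = -s i$
$di/dt = s i - (1/k)(1 - s) i$, where the second term accounts for unnecessary phone calls
Solving for i as a function of s and setting $i = 0$ gives the residue: $s = e^{-(k+1)(1-s)}$
The residue decreases exponentially with k
For k = 1, 20% miss the rumor
For k = 2, only 6% miss it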
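The residue equation $s = e^{-(k+1)(1-s)}$ is implicit in s, so the quoted percentages are easiest to check numerically. The sketch below uses plain fixed-point iteration; the starting guess 0.5 and the iteration count are arbitrary assumptions (s = 1 is also a solution, but the iteration moves away from it).

```python
import math


def residue(k, iterations=200):
    """Solve s = exp(-(k + 1) * (1 - s)) by fixed-point iteration."""
    s = 0.5  # assumed starting guess, away from the trivial solution s = 1
    for _ in range(iterations):
        s = math.exp(-(k + 1) * (1 - s))
    return s


for k in (1, 2):
    print(k, round(residue(k), 3))
```

The values come out close to 0.20 for k = 1 and 0.06 for k = 2, matching the 20% and 6% on the slide.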
109
Criteria to characterize epidemics
  • Residue
  • The value of s when i is zero, i.e. the remaining
    susceptibles when the epidemic finishes
  • Traffic
  • m = total update traffic / number of sites
  • Delay
  • Average delay (t_avg): the difference between the
    time of the initial injection of an update and
    the arrival of the update at a given site,
    averaged over all sites
  • Last delay (t_last): the delay until the reception by the
    last site that will receive the update during an
    epidemic

110
Simple variations of rumor spreading
Blind vs. Feedback
Feedback variation: a sender loses interest only if the recipient already knows the rumor
Blind variation: a sender loses interest with probability 1/k regardless of the recipient
Counter vs. Coin
Instead of losing interest with probability 1/k, use a counter so that we lose interest only after k unnecessary contacts
$s = e^{-m}$, where m is the traffic: there are nm updates sent, and the probability that a single site misses all of them is $(1 - 1/n)^{nm} \approx e^{-m}$
Counters and feedback improve the delay, with counters playing a more significant role
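The variations can be compared with a small push-only simulation that reports the criteria defined earlier (residue, traffic m, t_avg, t_last). This is a rough sketch under assumptions of my own: synchronous cycles, one contact per infective site per cycle, and the names `rumor`, `feedback` and `counter` are illustrative.

```python
import random


def rumor(n, k, feedback=True, counter=False, seed=0):
    """Simulate push rumor mongering; returns residue, traffic, t_avg, t_last."""
    rng = random.Random(seed)
    state = ["S"] * n            # S susceptible, I infective, R removed
    state[0] = "I"               # the rumor is planted at site 0
    misses = [0] * n             # unnecessary contacts so far (counter variant)
    arrival = [None] * n         # cycle in which each site learned the rumor
    arrival[0] = 0
    sent, cycle = 0, 0
    while "I" in state:
        cycle += 1
        for site in [i for i in range(n) if state[i] == "I"]:
            target = rng.randrange(n - 1)
            if target >= site:
                target += 1      # choose a partner other than itself
            sent += 1
            knew = state[target] != "S"
            if not knew:
                state[target] = "I"
                arrival[target] = cycle
            # Feedback: may lose interest only after an unnecessary contact.
            # Blind: may lose interest after every contact.
            if knew or not feedback:
                if counter:
                    misses[site] += 1
                    lose = misses[site] >= k
                else:
                    lose = rng.random() < 1 / k
                if lose:
                    state[site] = "R"
    reached = [t for t in arrival if t is not None]
    return {
        "residue": state.count("S") / n,
        "traffic": sent / n,
        "t_avg": sum(reached) / len(reached),
        "t_last": max(reached),
    }
```

Calling, for example, `rumor(1000, 2, feedback=True, counter=True)` and `rumor(1000, 2, feedback=False, counter=False)` with several seeds gives a feel for how the variants trade residue against traffic and delay.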
111
Simple variations of rumor spreading
Push vs. Pull
Pull converges faster
If there are numerous independent updates, a pull request is likely to find a source with a non-empty rumor list
If the database is quiescent, the push phase ceases to introduce traffic overhead, while pull continues to inject useless requests for updates
Counter, feedback and pull work better
112
  • Minimization
  • Use push and pull together; if both sites already know
    the update, only the site with the smaller
    counter is incremented
  • Connection Limit
  • A site can be the recipient of more than one push
    in a cycle, while for pull, a site can service an
    unlimited number of requests
  • What if we set a limit?
  • Push gets better (traffic is reduced: since the
    spread grows exponentially, most traffic occurs
    at the end)
  • Pull gets worse

113
Hunting: if a connection is rejected, then the
choosing site can hunt for alternate sites
With hunting, push and pull behave similarly
114
Complex Epidemic and Anti-entropy
Anti-entropy can be run infrequently to back up a complex epidemic, so that every update eventually reaches (or is superseded at) every site
What happens when an update is discovered during anti-entropy? Use rumor mongering (e.g., make it a hot rumor) or direct mail
115
Deletion and Death Certificates
Replace deleted items with death certificates
which carry timestamps and spread like ordinary
data When old copies of deleted items meet death
certificates, the old items are removed. But
when to delete death certificates?
116
Dormant Death Certificates
Define some threshold (but some items may be resurrected, i.e. re-appear)
If the death certificate is older than the expected time required to propagate it to all sites, then the existence of an obsolete copy of the corresponding data item is unlikely
Delete very old certificates at most sites, retaining dormant copies at only a few sites (like antibodies)
Use two thresholds, t1 and t2, and attach to each death certificate a list of r retention site names (chosen at random when the death certificate is created)
Once t1 is reached, all servers except those in the retention list delete the death certificate
Dormant death certificates are deleted when t1 + t2 is reached
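The two-threshold rule can be written down as a small decision function. This is a sketch under assumed names (`DeathCertificate`, `make_certificate`, `certificate_state`); timestamps are plain numbers and site names are strings.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DeathCertificate:
    key: str
    timestamp: float             # when the item was deleted
    retention_sites: frozenset   # r site names, chosen at random at creation


def make_certificate(key, now, sites, r, rng):
    """Create a certificate and pick its r retention sites at random."""
    chosen = rng.sample(sorted(sites), r)
    return DeathCertificate(key, now, frozenset(chosen))


def certificate_state(cert, site, now, t1, t2):
    """Return 'active', 'dormant' or 'deleted' for this certificate at this site."""
    age = now - cert.timestamp
    if age < t1:
        return "active"          # every site still holds and spreads it
    if site in cert.retention_sites and age < t1 + t2:
        return "dormant"         # only retention sites keep a copy
    return "deleted"
```

A site outside the retention list drops the certificate once its age reaches t1; a retention site keeps a dormant copy until t1 + t2.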
117
Anti-Entropy with Dormant Death Certificates
Whenever a dormant death certificate encounters
an obsolete data item, it must be activated
118
Spatial Distribution
How to choose partners Consider spatial
distributions in which the choice tends to favor
nearby servers
119
Spatial Distribution
The cost of sending an update to a nearby site is much lower than the cost of sending the update to a distant site
Favor nearby neighbors
Trade-off between average traffic per link and convergence time
Example: on a linear network, contacting only the nearest neighbor gives O(1) traffic per link and O(n) convergence time, whereas uniform random connections give O(n) and O(log n)
Determine the probability of connecting to a site at distance d
For spreading updates on a line, use a $d^{-2}$ distribution: the probability of connecting to a site at distance d is proportional to $d^{-2}$
In general, each site s independently chooses connections according to a distribution that is a function of $Q_s(d)$, where $Q_s(d)$ is the cumulative number of sites at distance d or less from s
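For the line example, partner selection with probability proportional to $d^{-2}$ might look like the sketch below. Sites are assumed to be numbered 0 to n-1 along the line, so the distance between sites s and j is |j - s|; the function names and the exponent parameter `a` are illustrative.

```python
import random


def partner_weights(s, n, a=2.0):
    """Unnormalised weight d**-a for every other site on a line of n sites."""
    return {j: abs(j - s) ** -a for j in range(n) if j != s}


def choose_partner(s, n, rng, a=2.0):
    """Pick an exchange partner for site s, favoring nearby sites."""
    weights = partner_weights(s, n, a)
    sites = list(weights)
    return rng.choices(sites, weights=[weights[j] for j in sites], k=1)[0]
```

With a = 2 a site at distance 2 is chosen a quarter as often as an immediate neighbor; a = 0 would recover the uniform choice.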
120
Spatial Distribution and Anti-Entropy
Extensive simulation on the actual topology with a number of different spatial distributions
A different class of distributions is less sensitive to sudden increases of Qs(d)
Let each site s build a list of the other sites sorted by their distance from s
Select anti-entropy exchange partners from the sorted list according to a function f(i), where i is the partner's position on the list (averaging the probabilities of selecting equidistant sites)
Non-uniform distributions induce less load on critical links
121
Spatial Distribution and Rumors
Anti-entropy converges with probability 1 for a spatial distribution such that for every pair (s, s') of sites there is a nonzero probability that s will choose to exchange data with s'
However, rumor mongering is less robust against changes in spatial distribution and network topology
As the spatial distribution is made less uniform, we can increase the value of k to compensate
122
Replication II: A Push/Pull Algorithm
Updates in Highly Unreliable, Replicated
Peer-to-Peer Systems Datta, Hauswirth, Aberer,
ICDCS03
123
Replication in P2P systems
CAN
P-Grid
Unstructured P2P (sub-) network of replicas How
to update them?
124
Problems in real-world P2P systems
  • All replicas need to be informed of updates.
  • Peers have low online probabilities and a quorum
    cannot be assumed.
  • Eventual consistency is sufficient.
  • Updates are relatively infrequent compared to
    queries.
  • Metrics: communication overhead, latency and
    percentage of replicas getting the update

125
Problems in real-world P2P systems (continued)
  • Replication factor is substantially higher than
    what is assumed for distributed databases.
  • Connectivity among replicas is high.
  • Connectivity graph is random.

126
Updates in replicated P2P systems
(Figure legend: online and offline replicas)
  • A P2P system's search algorithm will find a random
    online replica responsible for the key being
    searched.
  • The replicas need to be consistent (ideally)
  • Probabilistic guarantee: best effort!
  • Assumption: each peer knows a subset of all the
    replicas for an item

127
Updates in Highly Unreliable, Replicated
Peer-to-Peer Systems Datta, Hauswirth, Aberer,
ICDCS03
  • Update Propagation combines
  • A push phase is initiated by the originator of
    the update that pushes the new update to a subset
    of responsible peers it knows, which in turn
    propagate it to responsible peers they know, etc
    (similar to flooding with TTL)
  • A pull phase is initiated by a peer that needs
    to update its copy, for example because (a) it
    was offline (disconnected) or (b) it has received a
    pull request but is not sure that it has the most
    up-to-date copy
  • Push and pull are consecutive, but may overlap in
    time

128
Algorithms
  • Push
  • If replica p gets Push(U, V, Rf, t) for a new
    (U, V) pair
  • Define Rp as a random subset (of size R · fr) of
    replicas known to p
  • With probability PF(t), send Push(U, V, Rf ∪ Rp, t+1)
    to Rp \ Rf

U is the item, V its version and t a counter (similar to
TTL)
Rf is the partial list of peers that have received the
update, R the number of replicas, and fr the fraction of the
total replicas to which peers initially decide to
forward the update (fan-out)
  • Each message keeps the list of peers to which the
    update has been sent
  • Parameters
  • TTL counter t
  • PF(t): probability (locally determined at each
    peer) of sending the update
  • |Rp|: size of the random subset (fan-out)
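The push step at a single replica can be sketched as below. This is an illustration under assumptions, not the authors' implementation: peer ids are strings, `seen` is the set of (U, V) pairs the replica has already received, `pf` is a caller-supplied function giving PF(t), and the function returns the messages to send instead of sending them.

```python
import random


def handle_push(known, seen, update, rf, t, R, fr, pf, rng):
    """Process Push(U, V, Rf, t) at a replica.

    known:  set of replica ids known to this replica
    seen:   set of (U, V) pairs already received (mutated)
    update: the (U, V) pair carried by the message
    rf:     set of replicas already informed according to the message
    Returns a list of (target, (update, new_rf, t + 1)) messages.
    """
    if update in seen:
        return []                      # act only on a new (U, V) pair
    seen.add(update)
    size = min(len(known), round(R * fr))
    rp = set(rng.sample(sorted(known), size))   # random subset Rp
    if rng.random() >= pf(t):
        return []                      # forward only with probability PF(t)
    new_rf = rf | rp                   # Rf ∪ Rp travels with the message
    return [(q, (update, new_rf, t + 1)) for q in sorted(rp - rf)]
```

A decreasing `pf`, for example `lambda t: 0.9 ** t`, plays the role of a soft TTL: later rounds forward less often.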

129
Selective Push
(Figure: push rounds t and t+1 among replicas 1, 2 and 3, with extra update messages marked)
Avoid sequential redundant updates: partial lists of informed neighbors are transmitted with the message
Avoid parallel redundant updates: messages are propagated only with probability PF < 1 and to a fraction of the neighbors
130
Algorithms
  • Pull
  • If p comes online, or has received no Push for time T
  • Contact online replicas
  • Pull updates based on version vectors

Strategy: push the update to online peers as soon as possible, so that later all online peers have the update (possibly pulled) with high probability
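The slides do not spell out how version vectors are compared, so the following is only a generic sketch: a version vector is assumed to be a dict from peer id to update counter, and a pulling peer adopts a remote copy only when the remote vector strictly dominates its own. Concurrent (incomparable) versions are left untouched here.

```python
def dominates(a, b):
    """True if version vector a is >= b in every entry and > b in at least one."""
    keys = set(a) | set(b)
    at_least = all(a.get(k, 0) >= b.get(k, 0) for k in keys)
    strictly = any(a.get(k, 0) > b.get(k, 0) for k in keys)
    return at_least and strictly


def pull(local, online_replicas):
    """Return the newest copy found among the local one and the online replicas.

    Each copy is a dict {"value": ..., "vv": {peer_id: counter}}.
    """
    for remote in online_replicas:
        if dominates(remote["vv"], local["vv"]):
            local = {"value": remote["value"], "vv": dict(remote["vv"])}
    return local
```

A peer that was offline would call `pull` with its stale copy and the copies of whatever replicas it can reach.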
131
Scenario 1: Dynamic topology
(Figure: graph of replicas numbered 1 to 9)
132
Scenario 2: Duplicate messages
(Figure: graph of replicas numbered 1 to 9; legend: necessary messages, avoidable duplicates, unavoidable (?) duplicates)
133
Results: Impact of varying fanout
A limited fanout (fr) is sufficient to spread the
update, since flooding grows exponentially. A large
fanout will cause unnecessary duplicate messages.
(Figure: how many peers learn about the update)
134
Results: Impact of the probability of peers staying
online in consecutive push rounds
Sigma (σ): probability of online peers staying
online in consecutive push rounds
135
Results Impact of varying probability of pushing
Reduce the probability of forwarding updates with
the increase in the number of push rounds
136
CUP Controlled Update Propagation in
Peer-to-Peer Networks Roussopoulos, Baker 02
PCX: Path Caching with Expiration
Cache index entries at intermediary nodes that lie on the path taken by a search query
Cached entries typically have expiration times
Not addressed: which items need to be updated, and whether the interest in updating particular entries has died out
CUP: Controlled Update Propagation
Asynchronously builds caches of index entries while answering search queries
Propagates updates of index entries to maintain these caches (pushes updates)
137
CUP Controlled Update Propagation in
Peer-to-Peer Networks Roussopoulos, Baker 02
  • Every node maintains two logical channels per
    neighbor
  • a query channel used to forward search queries
  • an update channel used to forward query
    responses asynchronously to a neighbor and to
    update index entries that are cached at the
    neighbor (to proactively push updates)
  • Queries travel to the node holding the item
  • Updates travel along the reverse path taken by a
    query
  • Query coalescing: if a node receives two or more
    queries for an item, it pushes only one instance
  • Just one update channel (a node does not keep a separate
    open connection per request). All responses go
    through the update channel; interest bits tell the node
    to which neighbors to push the response

138
CUP Controlled Update Propagation in
Peer-to-Peer Networks Roussopoulos, Baker 02
  • Each node decides individually
  • When to receive updates
  • by registering its interest; an
    incentive-based policy determines when to
    cut off incoming updates
  • When to propagate updates

139
CUP Controlled Update Propagation in
Peer-to-Peer Networks Roussopoulos, Baker 02
For each key K, node n stores
  • a flag that indicates whether the node is waiting to receive
    an update for K in response to a query
  • an interest vector: each bit corresponds to a
    neighbor and is set or clear depending on whether
    the neighbor is or is not interested in receiving
    updates for K
  • a popularity measure, or request frequency, of each
    non-local key K for which it receives queries. The
    measure is used to re-evaluate whether it is beneficial
    to continue caching and receiving updates for K
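The per-key bookkeeping maps naturally onto a small record with a bit mask for the interest vector. The class below is an assumed illustration (the names `KeyState`, `pending`, `interest` and `popularity` are not from the CUP paper), with one bit per neighbor in a fixed neighbor order.

```python
class KeyState:
    """State a node keeps for one key K (illustrative sketch)."""

    def __init__(self, neighbors):
        self.index = {name: i for i, name in enumerate(neighbors)}
        self.pending = False     # waiting for an update in response to a query
        self.interest = 0        # interest vector, one bit per neighbor
        self.popularity = 0      # number of queries received for this key

    def set_interest(self, neighbor):
        self.interest |= 1 << self.index[neighbor]

    def clear_interest(self, neighbor):
        self.interest &= ~(1 << self.index[neighbor])

    def interested(self):
        """Neighbors whose interest bit is currently set."""
        return [n for n, i in self.index.items() if (self.interest >> i) & 1]
```

When an update for K arrives, the node would push it only to `interested()` and could consult `popularity` to decide whether to keep caching K.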
140
CUP Controlled Update Propagation in
Peer-to-Peer Networks Roussopoulos, Baker 02
For each key, the authority node that owns the key is the root of the CUP tree
Updates originate at the root of the tree and travel downstream to interested nodes
Types of updates: deletes, refresh, append
Example: A is the root for K3
Applicable to both structured and unstructured networks. In structured networks, the query path is well-defined with a bounded number of hops
141
CUP Controlled Update Propagation in
Peer-to-Peer Networks Roussopoulos, Baker 02
Handling Queries for K
1. Fresh entries for key K are cached: use them to push the response to the querying neighbor
2. Key K is not in the cache: it is added and marked as pending (to coalesce potential bursts)
3. All cached entries for K have expired: push the query
Handling Updates for K
An update of K is forwarded only to neighbors that have registered interest in K
Also, an adaptive control mechanism regulates the rate of pushed updates
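The three query cases and the update rule can be sketched as two functions. This is a simplified model under assumptions: the cache maps a key to a list of entries with an `expires` time, `pending` is a set of keys with a query in flight, `interest` maps a key to the set of interested neighbors, and case 3 is also treated as pending so that later queries coalesce.

```python
def handle_query(cache, pending, interest, key, neighbor, now):
    """Return ('respond', entries), ('wait', None) or ('forward', None)."""
    interest.setdefault(key, set()).add(neighbor)   # register the neighbor's interest
    fresh = [e for e in cache.get(key, []) if e["expires"] > now]
    if fresh:
        return ("respond", fresh)      # case 1: answer from the cache
    if key in pending:
        return ("wait", None)          # coalesced with an earlier query
    pending.add(key)
    return ("forward", None)           # cases 2 and 3: push the query onward


def handle_update(cache, pending, interest, key, entries):
    """Store the update and return the neighbors it must be pushed to."""
    cache[key] = entries
    pending.discard(key)
    return sorted(interest.get(key, ()))
```

Only the first query for an uncached key is forwarded; the response that eventually arrives as an update is then pushed to every neighbor that asked.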
142
CUP Controlled Update Propagation in
Peer-to-Peer Networks Roussopoulos, Baker 02
Adaptive control mechanism to regulate the rate of pushed updates
Each node N has a capacity U for pushing updates that varies with its workload, network bandwidth and/or network connectivity
N divides U among its outgoing update channels such that each channel gets a share that is proportional to the length of its queue
Entries in the queue may be re-ordered
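The proportional split of the capacity U is a one-line computation. In the sketch below the function name and the dict of queue lengths per channel are assumptions made for illustration.

```python
def divide_capacity(capacity, queue_lengths):
    """Split the push capacity among update channels in proportion to queue length."""
    total = sum(queue_lengths.values())
    if total == 0:
        return {channel: 0.0 for channel in queue_lengths}   # nothing to push
    return {
        channel: capacity * length / total
        for channel, length in queue_lengths.items()
    }
```

For example, with a capacity of 10 and queues of length 3, 1 and 0, the channels receive shares of 7.5, 2.5 and 0.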