Transcript and Presenter's Notes

Title: Byzantine Techniques II


1
Byzantine Techniques II
  • Justin W. Hart
  • CS 614
  • 12/01/2005

2
Papers
  • BAR Fault Tolerance for Cooperative Services.
    Amitanand S. Aiyer et al. (SOSP 2005)
  • Fault-scalable Byzantine Fault-Tolerant Services.
    Michael Abd-El-Malek et al. (SOSP 2005)

3
BAR Fault Tolerance for Cooperative Services
  • BAR Model
  • General Three-Level Architecture
  • BAR-B

4
Motivation
  • General approach to constructing cooperative
    services that span multiple administrative
    domains (MADs)

5
Why is this difficult?
  • Nodes are under the control of multiple
    administrators
  • Broken nodes: Byzantine behaviors
  • Misconfigured, or configured with malicious
    intent
  • Selfish nodes: rational behaviors
  • Alter the protocol to increase local utility

6
Other models?
  • Byzantine models: account for Byzantine
    behavior, but do not handle rational behavior
  • Rational models: account for rational behavior,
    but may break with Byzantine behavior

7
BAR Model
  • Byzantine
  • Behaving arbitrarily or maliciously
  • Altruistic
  • Execute the proposed program, whether it benefits
    them or not
  • Rational
  • Deviate from the proposed program for purposes of
    local benefit

8
BART: BAR Tolerant
  • It's a cruel world
  • At most (n-2)/3 nodes in the system are Byzantine
  • The rest are rational

9
Two classes of protocols
  • Incentive-Compatible Byzantine Fault Tolerant
    (IC-BFT)
  • Guarantees a set of safety and liveness
    properties
  • It is in the best interest of rational nodes to
    follow the protocol exactly
  • Byzantine Altruistic Rational Tolerant (BART)
  • Guarantees a set of safety and liveness
    properties despite the presence of rational nodes
  • IC-BFT is a subset of BART

10
An important concept
  • It isn't enough for a protocol to survive drills
    of a handful of attacks. It must provably
    provide its guarantees.

11
A flavor of things to come
  • Protocol builds on Practical Byzantine Fault
    Tolerance in order to combat Byzantine behavior
  • Protocol uses game theoretical concepts in order
    to combat rational behavior

12
A taste of Nash Equilibrium
The game of chicken (row player's payoff, column player's payoff):

                 Swerve        Go Straight
  Swerve         0, 0          -1, 1
  Go Straight    1, -1         -100, -100
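For concreteness, here is a minimal sketch in Python (not from the paper) of the chicken payoff matrix above, with a brute-force check of which strategy profiles are Nash equilibria, i.e. profiles from which neither player can gain by deviating unilaterally:

STRATEGIES = ["Swerve", "Straight"]

# payoff[(row, col)] = (row player's payoff, column player's payoff)
PAYOFF = {
    ("Swerve", "Swerve"):     (0, 0),
    ("Swerve", "Straight"):   (-1, 1),
    ("Straight", "Swerve"):   (1, -1),
    ("Straight", "Straight"): (-100, -100),
}

def is_nash(row, col):
    """No unilateral deviation by either player improves that player's payoff."""
    r_pay, c_pay = PAYOFF[(row, col)]
    best_row = all(PAYOFF[(alt, col)][0] <= r_pay for alt in STRATEGIES)
    best_col = all(PAYOFF[(row, alt)][1] <= c_pay for alt in STRATEGIES)
    return best_row and best_col

for r in STRATEGIES:
    for c in STRATEGIES:
        print(f"({r}, {c}) Nash equilibrium: {is_nash(r, c)}")
# True only for (Swerve, Straight) and (Straight, Swerve): once following the
# protocol is an equilibrium, rational players have no reason to deviate.
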
13
and the nodes are starving!
  • Nodes require access to a state machine in order
    to complete their objectives
  • Protocol contains methods for punishing rational
    nodes, including denying them access to the state
    machine

14
An expensive notion of identity
  • Identity is established through cryptographic
    keys assigned through a trusted authority
  • Prevents Sybil attacks
  • Bounds the number of Byzantine nodes
  • Gives rational nodes reason to consider long-term
    consequences of their actions
  • Gives real world grounding to identity

15
Assumptions about rational nodes
  • Receive long-term benefit from staying in the
    protocol
  • Conservative when computing the impact of
    Byzantine nodes on their utility
  • If the protocol provides a Nash equilibrium,
    then all rational nodes will follow it
  • Rational nodes do not collude; colluding nodes
    are classified as Byzantine

16
Byzantine nodes
  • Byzantine fault model
  • Strong adversary
  • Adversary can coordinate collusion attacks

17
Important concepts
  • Promptness principle
  • Proof of Misbehavior (POM)
  • Cost balancing

18
Promptness principle
  • If a rational node gains no benefit from delaying
    a message, it will send it as soon as possible

19
Proof of Misbehavior (POM)
  • Self-contained, cryptographic proof of wrongdoing
  • Provides accountability to nodes for their actions

20
Example of POM
  • Node A requests that Node B store a chunk
  • Node B replies that it has stored the chunk
  • Later Node A requests that chunk back
  • Node B sends back random garbage (it hadn't
    stored the chunk) and a signature
  • Because Node A stored a hash of the chunk, it can
    demonstrate misbehavior on the part of Node B
    (see the sketch below)
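A minimal sketch of this POM idea in Python; the message format, and the use of an HMAC as a stand-in for Node B's digital signature, are assumptions for illustration, not BAR-B's actual wire format:

import hashlib, hmac, os

KEY_B = os.urandom(32)          # stands in for B's signing key

def sign(key, data):            # assumption: the real protocol uses public-key signatures
    return hmac.new(key, data, hashlib.sha256).digest()

def verify(key, data, sig):
    return hmac.compare_digest(sign(key, data), sig)

# 1. A stores a chunk on B and remembers its hash (part of the receipt).
chunk = b"backup data from node A"
receipt_hash = hashlib.sha256(chunk).digest()

# 2. Later, B answers a retrieve with garbage, but still signs the response.
garbage = os.urandom(len(chunk))
response = {"data": garbage, "sig": sign(KEY_B, garbage)}

# 3. Anyone holding the receipt can now check the proof of misbehavior:
def is_pom(receipt_hash, response, key_b):
    signed_ok = verify(key_b, response["data"], response["sig"])        # B really sent this
    wrong_data = hashlib.sha256(response["data"]).digest() != receipt_hash
    return signed_ok and wrong_data                                      # signed, but wrong

print("Proof of misbehavior:", is_pom(receipt_hash, response, KEY_B))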

21
but it's a bit more complicated than that!
  • This corresponds to a rather simple behavior to
    combat: aggressively Byzantine behavior.

22
Passive-aggressive behaviors
  • Harder cases than aggressively Byzantine
  • A malicious Node A could merely lie about
    misbehavior on the part of Node B
  • A node could exploit non-determinism in order to
    shirk work

23
Cost Balancing
  • If two behaviors have the same cost, there is no
    reason to choose the wrong one

24
Three-Level Architecture
25
Level 1
  • Unilaterally deny service to nodes that fail to
    deliver messages
  • Tit-for-Tat
  • Balance costs
  • No incentive to make the wrong choice
  • Penance
  • Unilaterally impose extra work on nodes with
    untimely responses

26
Level 2
  • Failure to respond to a request by a state
    machine will generate a POM from a quorum of
    nodes in the state machine

27
Level 3
  • Makes use of reliable work assignment
  • Needs only to provide sufficient information to
    identify valid request/response pairs

28
Nuts and Bolts
  • Level 1
  • Level 2

29
Level 1
  • Ensure long-term benefit to participants
  • The RSM rotates the leadership role to
    participants.
  • Participants want to stay in the system in order
    to control the RSM and complete their protocols
  • Limit non-determinism
  • Self interested nodes could hide behind
    non-determinism to shirk work
  • Use Terminating Reliable Broadcast, rather than
    consensus.
  • In TRB, only the sender can propose a value
  • Other nodes can only adopt this value, or choose
    a default value

30
Level 1
  • Mitigate the effects of residual non-determinism
  • Cost balancing
  • The protocol's preferred choice is no more
    expensive than any other
  • Encouraging timeliness
  • Nodes can inflict sanctions on untimely messages
  • Enforce predictable communication patterns
  • Nodes have to have participated at every step in
    order to have the opportunity to issue a command

31
Terminating Reliable Broadcast
32
3f+2 nodes, rather than 3f+1
  • Suppose a sender s is slow
  • The same group of nodes now want to determine
    that s is slow
  • A new leader is elected
  • Every node but s wants a timely conclusion to
    this, in order to get their turn to propose a
    value to the state machine
  • s is not allowed to participate in this quorum

33
TRB provides a few guarantees
  • They differ during periods of synchrony and
    periods of asynchrony

34
In synchrony
  • Termination
  • Every non-Byzantine process delivers exactly one
    message
  • Agreement
  • If one non-Byzantine process delivers a message m,
    then all non-Byzantine processes eventually
    deliver m

35
In asynchrony
  • Integrity
  • If a non-Byzantine process delivers m, then the
    sender sent m
  • Non-Triviality
  • If the sender is non-Byzantine and sends m, then
    the sender eventually delivers m

36
Message Queue
  • Enforces predictable communication patterns
  • Bubbles
  • A simple retaliation policy
  • Node A's message queue is filled with messages
    that it intends to send to Node B
  • This message queue is interleaved with bubbles
  • Bubbles contain predicates indicating messages
    expected from B
  • No message except the expected one from B
    can fill the bubble
  • No messages in A's queue will go to B until B
    fills the bubble (see the sketch below)
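A minimal sketch of the bubble mechanism in Python; the class and method names are assumptions, not the paper's code:

class OutgoingQueue:
    def __init__(self):
        self.items = []          # either ("msg", payload) or ("bubble", predicate)

    def enqueue_message(self, payload):
        self.items.append(("msg", payload))

    def enqueue_bubble(self, predicate):
        self.items.append(("bubble", predicate))

    def deliver_from_peer(self, incoming):
        """B's message can only fill the first unfilled bubble, and only if it matches."""
        for i, (kind, pred) in enumerate(self.items):
            if kind == "bubble":
                if pred(incoming):
                    del self.items[i]     # bubble filled, queue unblocks
                    return True
                return False              # wrong message: bubble stays, queue stays blocked
        return False

    def sendable(self):
        """Messages A may send to B right now: everything before the first bubble."""
        out = []
        for kind, payload in self.items:
            if kind == "bubble":
                break
            out.append(payload)
        return out

q = OutgoingQueue()
q.enqueue_message("set-turn 7")
q.enqueue_bubble(lambda m: m.startswith("ack 7"))   # expect B's ack for turn 7
q.enqueue_message("agree 8")

print(q.sendable())                  # ['set-turn 7']  -- 'agree 8' is held back
q.deliver_from_peer("ack 7 signed")  # B fills the bubble
print(q.sendable())                  # ['set-turn 7', 'agree 8']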

37
Balanced Messages
  • We've already discussed this quite a bit
  • We assure this at this level of the protocol
  • This is where we get our gigantic timeout message

38
Penance
  • Untimely vector
  • Tracks a node's perception of the responsiveness
    of other nodes
  • When a node becomes a sender, it includes its
    untimely vector with the message

39
Penance
  • All nodes but the sender receive penance messages
    from each node.
  • Because of bubbles, each untimely node must send
    a penance message back in order to continue using
    the system
  • This provides a penalty to those nodes
  • The sender is excluded from this process, because
    it may be motivated to lie in its penance vector,
    in order to avoid the work of transmitting
    penance messages

40
Timeouts and Garbage Collection
  • Set-turn timeout
  • Timeout to take leadership away from the sender
  • Initially 10 seconds in this implementation, in
    order to overcome all expected network delays
  • Can only be changed by the sender
  • Max_response_time
  • Time at which a node is removed from the system,
    its messages discarded and its resources garbage
    collected
  • Set to 1 week or 1 month in the prototypes

41
Global Punishment
  • Badlists
  • Transform local suspicion into POMs
  • Suspicion is recorded in a local node's badlist
  • Sender includes its badlist with its message
  • If, over time, recipients see a node in f + 1
    different senders' badlists, then they, too,
    consider that node to be faulty (sketched below)
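A minimal sketch of the badlist rule in Python; the bookkeeping is an assumption, not the paper's data structures. The point of the f + 1 threshold is that at least one accuser must be non-Byzantine:

from collections import defaultdict

F = 2                                       # tolerated Byzantine nodes

class BadlistTracker:
    def __init__(self, f):
        self.f = f
        self.accusers = defaultdict(set)    # suspected node -> set of accusing senders

    def record_sender_badlist(self, sender, badlist):
        for suspect in badlist:
            self.accusers[suspect].add(sender)

    def globally_faulty(self, node):
        return len(self.accusers[node]) >= self.f + 1

t = BadlistTracker(F)
t.record_sender_badlist("s1", {"nodeX"})
t.record_sender_badlist("s2", {"nodeX", "nodeY"})
print(t.globally_faulty("nodeX"))   # False: only 2 accusers, need f + 1 = 3
t.record_sender_badlist("s3", {"nodeX"})
print(t.globally_faulty("nodeX"))   # True: treat nodeX as faulty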

42
Proof
  • Real proofs do not appear in this paper; they
    appear in the technical report

43
but here's a bit
  • Theorem 1: The TRB protocol satisfies
    Termination, Agreement, Integrity and
    Non-Triviality

44
and a bit more
  • Theorem 2: No node has a unilateral incentive to
    deviate from the protocol
  • Lemma 1: No rational node r benefits from
    delaying sending the set-turn message
  • Follows from penance
  • Lemma 2: No rational node r benefits from sending
    the set-turn message early
  • Sending early could result in a senderTO being
    sent (this protocol uses synchronized clocks, and
    all messages are cryptographically signed)

45
and the rest that's mentioned in the paper
  • Lemma 3: No rational node r benefits from sending
    a malformed set-turn message.
  • The set-turn message only contains the turn
    number. Because of this, doing so reduces to
    either sending late (dealt with in Lemma 1) or
    sending early (dealt with in Lemma 2)

46
Level 2
  • State machine replication is sufficient to
    support a backup service, but the overhead is
    unacceptable
  • 100 participants each backing up 100 MB would
    require 10 GB of drive space per node
  • Assign work to individual nodes, using arithmetic
    codes to provide low-overhead fault-tolerant
    storage

47
Guaranteed Response
  • Direct communication is insufficient when nodes
    can behave rationally
  • We introduce a witness that overhears the
    conversation
  • This eliminates ambiguity
  • Messages are routed through this intermediary

48
Guaranteed Response
49
Guaranteed Response
  • Node A sends a request to Node B through the
    witness
  • The witness stores the request, and enters
    RequestReceived state
  • Node B sends a response to Node A through the
    witness
  • The witness stores the response, and enters the
    ResponseReceived state (sketched below)
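A minimal sketch of the witness's bookkeeping in Python; the two recorded states follow the slide, but the timeout handling and method names are assumptions:

import time

class Witness:
    def __init__(self, timeout_s=10.0):
        self.state = "Idle"
        self.timeout_s = timeout_s
        self.request = None
        self.request_time = None

    def forward_request(self, request):
        # A's request is recorded before being passed on to B
        self.request, self.request_time = request, time.monotonic()
        self.state = "RequestReceived"
        return request

    def forward_response(self, response):
        # B's response is recorded before being passed back to A
        if self.state != "RequestReceived":
            raise RuntimeError("response without a pending request")
        self.state = "ResponseReceived"
        return response

    def check_timeout(self):
        # if B never answered, the witness itself can attest to a NoResponse
        if (self.state == "RequestReceived"
                and time.monotonic() - self.request_time > self.timeout_s):
            self.state = "NoResponse"
        return self.state

w = Witness(timeout_s=0.005)
w.forward_request({"from": "A", "to": "B", "work": "store chunk 17"})
time.sleep(0.01)
print(w.check_timeout())   # 'NoResponse': grounds for a POM against B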

50
Guaranteed Response
  • Deviation from this protocol will cause the
    witness to notice either a timeout from Node B
    or lying on the part of Node A

51
Implementation
  • The system must remain incentive-compatible
  • Communication with the witness node is not in the
    form of actual message sending; it is in the form
    of a command to the RSM
  • Theorem 3: If the witness node enters the
    RequestReceived state for some work w assigned to
    rational node b, then b will execute w
  • Holds if sufficient sanctions exist to motivate
    b to do so

52
State limiting
  • State is limited by limiting the number of slots
    (nodes with which a node can communicate)
    available to a node
  • Applies a limit to the memory overhead
  • Limits the rate at which requests are inserted
    into the system
  • Forces nodes to acknowledge responses to requests
  • Nodes want their slots back

53
Optimization through Credible Threats
54
Optimization through Credible Threats
  • Returns to game theory
  • Protocol is optimized so nodes can communicate
    directly. Add a fast path
  • Nodes register vows with the witness
  • If recipient does not respond, nodes proceed to
    the unoptimized case
  • Analogous to a driver in chicken throwing their
    steering wheel out the window

55
Periodic Work Protocol
  • Witness checks that periodic tasks, such as
    system maintenance are performed
  • It is expected that, with a certain frequency,
    each node in the system will perform such a task
  • Failure to perform one will generate a POM from
    the witness

56
Authoritative Time Service
  • Maintains authoritative time
  • Binds messages sent to that time
  • Guaranteed response protocol relies on this for
    generating NoResponses

57
Authoritative Time Service
  • Each submission to the state machine contains the
    timestamp of the proposer
  • Timestamp is taken to be the maximum of the
    median of timestamps of the previous f + 1
    decisions
  • If no decision is reached, the timestamp
    is the previous authoritative time
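Read literally, the rule above can be sketched as follows in Python; the function and argument names are assumptions:

from statistics import median

def authoritative_time(prev_authoritative, recent_decision_timestamps, f):
    """recent_decision_timestamps: proposer timestamps of the last f + 1 decisions."""
    window = recent_decision_timestamps[-(f + 1):]
    if not window:                       # no decision reached this turn
        return prev_authoritative
    # never allowed to move backwards past the previous authoritative time
    return max(prev_authoritative, median(window))

print(authoritative_time(100.0, [98.0, 104.0, 101.0], f=2))   # 101.0 (median wins)
print(authoritative_time(100.0, [90.0, 91.0, 92.0], f=2))     # 100.0 (cannot go back)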

58
Level 3: BAR-B
  • BAR-B is a cooperative backup system
  • Three operations
  • Store
  • Retrieve
  • Audit

59
Storage
  • Nodes break files up into chunks
  • Chunks are encrypted
  • Chunks are stored on remote nodes
  • Remote nodes send signed receipts and store
    StoreInfos
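A minimal sketch of the store path above in Python; the chunk size, the toy XOR "encryption", and the local record layout are assumptions, not BAR-B's formats:

import hashlib, os

CHUNK_SIZE = 4096

def xor_encrypt(data, key):
    # stand-in for real encryption; only the originator holds the key
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def store_file(data, remote_nodes, key):
    """Split, encrypt, and place chunks; keep a local record per chunk."""
    records = []
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    for i, chunk in enumerate(chunks):
        ciphertext = xor_encrypt(chunk, key)
        node = remote_nodes[i % len(remote_nodes)]     # round-robin placement
        # in the real protocol the StoreInfo goes to `node`, which replies with
        # a signed receipt; here we just keep the hash for later audits/POMs
        records.append({"index": i,
                        "node": node,
                        "chunk_hash": hashlib.sha256(ciphertext).hexdigest()})
    return records

records = store_file(os.urandom(10_000), ["nodeB", "nodeC", "nodeD"], os.urandom(32))
print(len(records), records[0]["node"])   # 3 chunks, first placed on nodeB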

60
Retrieval
  • A node storing a chunk can respond to a request
    for a chunk with
  • The chunk
  • A demonstration that the chunk's lease has
    expired
  • A more recent StoreInfo

61
Auditing
  • Receipts constitute audit records
  • Nodes will exchange receipts in order to verify
    compliance with storage quotas

62
Arithmetic Coding
  • Arithmetic coding is used to keep storage size
    reasonable
  • Backing up 1 GB of data requires roughly 1.3 GB
    of storage
  • Keeping this ratio reasonable is crucial to
    motivate self-interested nodes to participate

63
Request-Response pattern
  • Store
  • Retrieve
  • Audit

64
Retrieve
  • Originator sends a Receipt for the StoreInfo to
    be retrieved
  • Storage node can send
  • A RetrieveConfirm
  • Containing the data and the receipt
  • A RetrieveDeny
  • Containing a receipt and a proof regarding why
  • Anything else
  • Generates a POM

65
Store
  • Originator sends a StoreInfo to be stored
  • Storage node can send
  • A receipt
  • A StoreReject
  • Demonstrates that the node has reached its
    storage commitment
  • Anything else
  • Generates a POM

66
Audit
  • Three phases
  • Auditor requests both OwnList and StoreList from
    auditee
  • Does this for random nodes in the system
  • Lists are checked for inconsistencies
  • Inconsistencies result in a POM

67
Time constraints
  • Data is stored for 30 days
  • After this, it is garbage collected
  • Nodes must renew their leases on stored chunks
    before this expiration in order to keep them in
    the system

68
Sanctions
  • Periodic work protocol forces generation of POMs
    or special NoPOMs
  • POMs and NoPOMs are balanced
  • POMs evict nodes from the system

69
Recovery
  • Nodes must be able to recover after failures
  • Chained membership certificates are used in order
    to allow them to retrieve their old chunks
  • Use of certificate later in the chain is regarded
    as a new node entering the system
  • The old node is regarded as dead
  • The new node is allowed to view the old node's
    chunks

70
Recovery
  • This forces nodes to redistribute the chunks
    they had stored on that node
  • Length of chains is limited, in order to prevent
    nodes from shirking work by using a certificate
    later in the chain

71
Guarantees
  • Data on BAR-B can be retrieved within the lease
    period
  • No POM can be gathered against a node that does
    not deviate from the protocol
  • No node can store more than its quota
  • A time window is available to nodes with
    catastrophic failures for recovery

72
Evaluation
  • Performance is inferior to protocols that do not
    make these guarantees, but acceptable

73
Impact of additional nodes
74
Impact of rotating leadership
75
Impact of fast path optimization
76
Fault-Scalable Byzantine Fault-Tolerant Services
  • Query/Update (Q/U) protocol
  • Optimistic quorum based protocol
  • Better throughput and fault-scalability than
    Replicated State Machines
  • Introduces preferred quorum as an optimization on
    quorum protocols

77
Motivation
  • Compelling need for services and distributed data
    structures to be efficient and fault-tolerant
  • In Byzantine fault-tolerant systems, performance
    drops off sharply as more faults are tolerated

78
Fault Scalability
  • A fault-scalable service is one in which
    performance degrades gracefully as more server
    faults are tolerated

79
Operations-based interface
  • Provides an interface similar to RSMs
  • Exports interfaces comprised of deterministic
    methods
  • Queries
  • Do not modify data
  • Updates
  • Modify data
  • Multi-object updates
  • Allow a set of objects to be updated together

80
Properties
  • Operates correctly under an asynchronous model
  • Queries and updates are strictly serializable
  • In benign execution, they are obstruction-free
  • Cost is an increase in the number of required
    servers: 5b + 1 servers, rather than 3b + 1 servers
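As a back-of-the-envelope comparison in Python (the 4b + 1 quorum size is my reading of the Q/U construction and should be treated as an assumption): Q/U buys single-phase optimism by requiring 5b + 1 servers instead of the 3b + 1 needed by agreement-based replicated state machines.

def servers_needed(b, protocol):
    if protocol == "rsm":       # e.g. PBFT-style replicated state machine
        return 3 * b + 1
    if protocol == "qu":        # Query/Update
        return 5 * b + 1
    raise ValueError(protocol)

def qu_quorum_size(b):
    return 4 * b + 1            # assumption: quorum size used by Q/U

for b in range(1, 5):
    print(f"b={b}: RSM {servers_needed(b, 'rsm')} servers, "
          f"Q/U {servers_needed(b, 'qu')} servers (quorum {qu_quorum_size(b)})")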

81
Optimism
  • Servers store a version history of objects
  • Updates are non-destructive to the objects
  • Use of logical timestamps based on contents of
    update and object state upon which the update is
    conditioned

82
Speedups
  • Preferred quorum, rather than random quorum
  • Addressed later
  • Efficient cryptographic techniques
  • Addressed later

83
Efficiency and Scalability
84
Efficiency
  • Most failure-atomic protocols require at least a
    two-phase commit
  • Prepare
  • Commit
  • The optimistic approach does not need a prepare
    phase
  • This introduces the need for clients to repair
    inconsistent objects
  • The optimistic approach also obviates the need
    for locking!

85
Versioning Servers
  • In order to allow for this, versioning servers
    are employed
  • Each update creates a new version on the server
  • Updates contain information about the version to
    be updated.
  • If no update has been committed since that
    version, the update goes through unimpeded.

86
Throughput-scalability
  • Additional servers, beyond those necessary to
    provide the desired fault tolerance, can provide
    additional throughput

87
Scaleup pitfall?
  • Encourage the use of fine-grained objects, which
    reduce per-object contention
  • If the majority of accesses touch individual
    objects, or only a few objects, then the scale-up
    pitfall can be avoided
  • In the example applications, this holds.

88
No need to partition
  • Other systems achieve throughput-scalability by
    partitioning services
  • This is unnecessary in this system

89
The Query/Update Protocol
90
System model
  • Asynchronous timing
  • Clients and servers may be Byzantine faulty
  • Clients and servers assumed to be computationally
    bounded, assuring effectiveness of cryptography
  • Failure model is a hybrid failure model
  • Benign
  • Malevolent
  • Faulty

91
System model
  • Extends the definition of a fail-prone system
    given by Malkhi and Reiter

92
System model
  • Point-to-point authenticated channels exist
    between all clients and servers
  • Infrastructure deploying symmetric keys on all
    channels
  • Channels are assumed unreliable
  • but, of course, they can be made reliable

93
Overview
  • Clients update objects by issuing requests
    stamped with object versions to version servers.
  • Version servers evaluate these requests.
  • If the request is over an out-of-date version,
    the client's version is corrected and the request
    reissued
  • If an out-of-date server is required to reach a
    quorum, it retrieves an object history from a
    group of other servers
  • If the version matches the server version, of
    course, it is executed
  • Everything else is a variation upon this theme

94
Overview
  • Queries are read only methods
  • Updates modify an object
  • Methods exported take arguments and return
    answers
  • Clients perform operations by issuing requests to
    a quorum
  • A server receives a request. If it accepts it, it
    invokes a method
  • Each update creates a new object version

95
Overview
  • The object version is kept with its logical
    timestamp in a version history called the replica
    history
  • Servers return replica histories in response to
    requests
  • Clients store replica histories in their object
    history set, an array of replica histories
    indexed by server

96
Overview
  • Timestamps in these histories are candidates for
    future operations
  • Candidates are classified in order to determine
    which object version a method should be executed
    upon

97
Overview
  • In non-optimistic operation, a client may need to
    perform a repair
  • Addressed later
  • To perform an operation, a client first retrieves
    an object history set. The client's operation is
    conditioned on this set, which is transmitted
    with the operation.

98
Overview
  • The client sends this operation to a quorum of
    servers.
  • To promote efficiency, the client sends the
    request to a preferred quorum
  • Addressed later
  • Single-phase operation hinges on the availability
    of a preferred quorum, and on concurrency-free
    access.

99
Overview
  • Before executing a request, servers first
    validate its integrity.
  • This is important: servers do not communicate
    object histories directly to each other, so the
    client's data must be validated.
  • Servers use authenticators to do this: lists of
    HMACs that prevent malevolent nodes from
    fabricating replica histories (sketched below).
  • Servers cull replica histories from the
    conditioned-on OHS that they cannot validate
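A minimal sketch of an authenticator in Python; key distribution is simplified to a shared dictionary and the serialization is an assumption, but the shape, one HMAC per server over the replica history, follows the description above:

import hashlib, hmac, os, pickle

SERVERS = ["s0", "s1", "s2", "s3", "s4"]
# pairwise symmetric keys, keyed by unordered server pair
KEYS = {frozenset((a, b)): os.urandom(32) for a in SERVERS for b in SERVERS if a < b}

def make_authenticator(issuer, replica_history):
    """One HMAC per peer server, all over the same replica history."""
    blob = pickle.dumps(replica_history)
    return {peer: hmac.new(KEYS[frozenset((issuer, peer))], blob, hashlib.sha256).digest()
            for peer in SERVERS if peer != issuer}

def validate(receiver, issuer, replica_history, authenticator):
    """A server checks that the history a client relays really came from issuer."""
    blob = pickle.dumps(replica_history)
    expected = hmac.new(KEYS[frozenset((issuer, receiver))], blob, hashlib.sha256).digest()
    return hmac.compare_digest(expected, authenticator[receiver])

history = [("ts", 3, "objA")]                       # toy replica history
auth = make_authenticator("s1", history)
print(validate("s4", "s1", history, auth))                          # True
print(validate("s4", "s1", history + [("ts", 9, "objA")], auth))    # False: forged history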

100
Overview: the last bit
  • Servers validate that they do not have a higher
    timestamp in their local replica histories
  • Failing this, the client repairs
  • Passing this, the method is executed, and a new
    timestamp is created
  • Timestamps are crafted such that they always
    increase in value

101
Preferred Quorums
  • Traditional quorum systems use random quorums,
    but this means that servers frequently need to be
    synced
  • This is to distribute the load
  • Preferred quorums choose to access servers with
    the most up-to-date data, assuring that syncs
    happen less often

102
Preferred Quorums
  • If a preferred quorum cannot be met, clients
    probe for additional servers to add to the quorum
  • Authenticators make it impossible to forge object
    histories for benign servers
  • The new host syncs with b + 1 host servers, in
    order to validate that the data is correct
  • In the prototype, probing selects servers such
    that the load is distributed, using a method
    parameterized on object ID and server ID (see the
    sketch below)
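A minimal sketch of preferred-quorum selection in Python; the hashing scheme is an assumption, the slide only says selection is parameterized on object ID and server ID. Every client derives the same order for an object, so the same servers stay up to date, and probing substitutes members only when one does not respond:

import hashlib

def ranked_servers(object_id, servers):
    """Deterministic, load-spreading order of servers for this object."""
    return sorted(servers,
                  key=lambda s: hashlib.sha256(f"{object_id}:{s}".encode()).hexdigest())

def preferred_quorum(object_id, servers, quorum_size, unresponsive=frozenset()):
    quorum = []
    for s in ranked_servers(object_id, servers):     # probe in deterministic order
        if s not in unresponsive:
            quorum.append(s)
        if len(quorum) == quorum_size:
            return quorum
    raise RuntimeError("not enough responsive servers for a quorum")

servers = [f"s{i}" for i in range(6)]                # n = 5b + 1 with b = 1
print(preferred_quorum("objA", servers, quorum_size=5))
print(preferred_quorum("objA", servers, quorum_size=5, unresponsive={"s2"}))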

103
Concurrency and Repair
  • Concurrent access to an object may fail
  • Two operations
  • Barrier
  • Barrier candidates have no data associated with
    them, and so are safe to select during periods of
    contention
  • Barrier advances the logical clock so as to
    prevent earlier timestamps from completing
  • Copy
  • Copies the latest object data past the barrier,
    so it can be acted upon

104
Concurrency and Repair
  • Clients may repeatedly barrier each other; to
    combat this, an exponential backoff strategy is
    enforced

105
Classification and Constraints
  • Based on partial observations of the global
    system state, an operation may be
  • Complete
  • Repairable
  • Can be repaired using the copy and barrier
    strategy
  • Incomplete
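A minimal sketch of this classification in Python; the exact thresholds are my reading of the quorum-intersection argument and should be treated as assumptions rather than the paper's definitions:

def classify(appearances, quorum_size, b):
    """appearances: number of replica histories in the quorum containing the candidate."""
    if appearances >= quorum_size - b:     # survives intersection with any other quorum
        return "complete"
    if appearances >= b + 1:               # at least one non-faulty server vouches for it
        return "repairable"
    return "incomplete"

Q, B = 5, 1                                # e.g. quorum of 5 with b = 1
for seen in (5, 4, 2, 1):
    print(seen, "->", classify(seen, Q, B))
# 5 -> complete, 4 -> complete, 2 -> repairable, 1 -> incomplete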

106
Multi-Object Updates
  • In this case, servers lock their local copies; if
    they approve the OHS, the update goes through
  • If not, a multi-object repair protocol is run
  • In this case, repair depends on the ability to
    establish all objects in the set
  • Objects in the set are repairable only if all of
    them are repairable; otherwise, objects in the set
    that would individually be repairable are
    reclassified as incomplete.

107
An example of all of this
108
Implementation details
109
Cached object history set
  • Clients cache object history sets during
    execution, and execute updates without first
    querying.
  • If the request fails because of an out-of-date
    OHS, the server returns an up-to-date OHS with
    the failure

110
Optimistic query execution
  • If a client has not accessed an object recently,
    it is still possible to complete in a single
    phase.
  • Servers execute the query on the latest object
    version that they store. Clients then evaluate
    the result normally.

111
Inline repair
  • Does not require a barrier and copy
  • Repairs the candidate in-place, obviating the
    need for a round trip
  • Only possible in cases where there is no
    contention

112
Handling repeated requests
  • Mechanisms may cause requests to be repeated
  • In order to shortcut other checks, the timestamp
    is checked first

113
Retry and backoff policies
  • Update-update requires retry, and backoff to
    avoid livelock
  • Update-query does not; the query can be updated
    in place

114
Object syncing
  • Only 1 server needs to send the entire object
    version state
  • Others send hashes
  • Syncing server then calculates the hash and
    compares it against all others (sketched below)
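A minimal sketch of object syncing in Python; the message shapes and the match threshold are assumptions. Only one server ships the full object version, the rest ship hashes, and the syncing server accepts the data only if enough hashes match it:

import hashlib

def sync_object(full_copy, peer_hashes, required_matches):
    """full_copy: bytes from one server; peer_hashes: hashes from the others."""
    digest = hashlib.sha256(full_copy).hexdigest()
    matches = sum(1 for h in peer_hashes if h == digest)
    if matches >= required_matches:
        return full_copy                    # data vouched for by enough servers
    raise RuntimeError("object data does not match enough peer hashes")

obj = b"object version @ timestamp 42"
good = hashlib.sha256(obj).hexdigest()
print(len(sync_object(obj, [good, good], required_matches=2)))   # accepted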

115
Other speedups
  • Authenticators
  • Authenticators use HMACs rather than digital
    signatures
  • Compact timestamps
  • A collision-resistant hash of the object history
    is used in timestamps, rather than the history
    itself
  • Compact replica histories
  • Replica histories are pruned based on the
    conditioned-on timestamp after updates

116
Malevolent components
  • The astute among you must have noticed the
    possibility of a DoS attack by clients refusing
    to perform exponential backoff
  • Servers could rate-limit clients
  • Clients could also issue updates to a subset of a
    quorum, forcing incomplete updates
  • Lazy verification can be used to verify
    correctness of client operations in the
    background
  • The amount of unverified work by a client can
    then be limited

117
Correctness
  • Operations are strictly serializable
  • To understand, consider the conditioned-on chain.
  • All operations chain back to the initial
    candidate, and a total order is imposed
    on all established operations
  • Operations occur atomically, including those
    spanning multiple objects
  • If no operations span multiple objects, then
    correct operations that complete are also
    linearizable

118
Tests
  • Tests performed on a rack of 76 Intel Pentium 4
    2.8 GHz machines
  • Implemented an increment method and an NFSv3
    metadata service

119
Fault Scalability
120
More fault-scalability
121
Isolated vs Contending
122
NFSv3 metadata
123
References
  • Text and images have been borrowed directly from
    both papers.