Title: RANDOM GRAPHS IN CRYPTOGRAPHY
Slide 1: Random Graphs in Cryptography
- Adi Shamir
- The Weizmann Institute, Israel
May 15, 2007, 7th Haifa Workshop on Interdisciplinary Applications of Graph Theory, Combinatorics and Algorithms
Slide 2: Random Graphs in Cryptography
In this talk I will concentrate on some particular algorithmic issues related to random graphs which are motivated by cryptanalytic applications.
Many of the results I will describe are either unpublished or little known in our community.
Note that in cryptanalysis, constants are important!
Slide 3: Cryptography and Randomness
Cryptography deals with many types of randomness:
- random strings
- random variables
- random functions
- random permutations
- random walks
- ...
Slide 4: Cryptography and Randomness
The notion of random functions (oracles):
- truly random when applied to fresh inputs
- consistent when applied to previously used inputs
Example table:
- f(0) = 37
- f(1) = 92
- f(2) = 78
- f(3) = 51
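To make this concrete, here is a minimal Python sketch (my own, not from the talk) of a lazily sampled random oracle on {0, ..., n-1}: fresh inputs get truly random answers, repeated inputs get consistent ones.

    import random

    class RandomOracle:
        # Lazily sampled random function on {0, ..., n-1}.
        def __init__(self, n, seed=None):
            self.n = n
            self.table = {}                  # remembered answers, for consistency
            self.rng = random.Random(seed)

        def __call__(self, x):
            if x not in self.table:          # fresh input: truly random answer
                self.table[x] = self.rng.randrange(self.n)
            return self.table[x]             # used input: the same answer as before

    f = RandomOracle(100, seed=1)
    assert f(0) == f(0)                      # consistent on repeated queries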
Slide 5: Cryptography and Randomness
This tabular description gives us a local view, which is not very informative.
To see the big picture, we define the random graph G associated with the random function f: each vertex x has a single outgoing edge pointing to f(x).
Slide 6: Cryptography and Randomness
When the function f is a permutation, its associated graph G is very simple: a disjoint union of cycles.
Slide 7: Cryptography and Randomness
However, when the function f is a random function rather than a random permutation, we get a very rich and interesting structure.
Slides 8-9: Random Graph 1, Random Graph 2 (pictures of two sample random functional graphs)
Slide 10: Cryptography and Randomness
There is a huge literature on the structure and combinatorial properties of such random graphs: the distribution of component sizes, tree sizes, cycle sizes, vertex in-degrees, number of predecessors, etc.
Slide 11: Cryptography and Randomness
In many applications we are interested in the behavior of the random function f under iteration. Examples:
- pseudo-random generators
- stream ciphers
- iterated block ciphers and hash functions
- time/memory tradeoff attacks
- randomized iterates
In this case, we are interested in a single path starting at a random vertex within the random graph.
Slides 12-14: A random path in a random graph (animation)
Slide 15: Cryptography and Randomness
Such a path always starts with a tail, and ends with a cycle.
The expected length of both the tail and the cycle is about the square root of the number of vertices.
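This is easy to check empirically. The following Python sketch (my own; the sizes are arbitrary) samples random functions on n = 2^14 points and measures the average tail-plus-cycle length of a random path, which should be close to sqrt(pi*n/2), about 160 here:

    import random

    def rho_length(f, x0):
        # Walk from x0, remembering when each value was first seen.
        seen, x, i = {}, x0, 0
        while x not in seen:
            seen[x] = i
            x, i = f(x), i + 1
        return seen[x], i - seen[x]                  # (tail length, cycle length)

    n, trials, rng = 2**14, 100, random.Random(0)
    avg = 0.0
    for _ in range(trials):
        tab = [rng.randrange(n) for _ in range(n)]   # a fresh random function
        tail, cyc = rho_length(lambda x: tab[x], rng.randrange(n))
        avg += (tail + cyc) / trials
    print(avg, (3.14159 * n / 2) ** 0.5)             # both close to sqrt(pi*n/2)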
Slides 16-20: Interesting algorithmic problems on paths
Assuming that we can only move forwards along edges:
- Find some point on the cycle
- Find the same point a second time
- Find the length of the cycle
- Find the cycle entry point
Slide 21: Interesting algorithmic problems on paths
Why are we interested in these algorithms?
- Pollard's rho algorithm: the cycle length l can be used to find small factors of large numbers, and requires only negligible memory (see the sketch below)
- Finding collisions in hash functions: the cycle entry point can represent a hash function collision
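As an aside, here is a minimal Python sketch of Pollard's rho factoring method, running Floyd's two fingers on the iteration f(x) = x^2 + c mod n (the constants below are the classic textbook choices, not from the talk):

    from math import gcd

    def pollard_rho(n, c=1, x0=2):
        # A collision mod an unknown prime factor p of n shows up
        # as gcd(|x - y|, n) > 1 long before a collision mod n.
        f = lambda x: (x * x + c) % n
        x, y, d = x0, x0, 1
        while d == 1:
            x = f(x)                   # normal speed
            y = f(f(y))                # double speed
            d = gcd(abs(x - y), n)
        return d if d != n else None   # d == n: retry with a different c or x0

    print(pollard_rho(8051))           # 8051 = 83 * 97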
Slide 22: How to find a collision in a given hash function H?
- Exhaustive search: requires 2^n time and no space
- Birthday paradox: construct a large table of 2^{n/2} random hash values, sort it, and look for consecutive equal values; requires both time and space 2^{n/2}
- Random path algorithm: iterate the hash function until you find the entry point into a cycle; requires 2^{n/2} time and very little space (see the sketch below)
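A minimal sketch of the random path idea in Python, using a toy 24-bit hash built by truncating SHA-256 (my own construction, for illustration only): the two distinct predecessors of the cycle entry point form a collision.

    import hashlib

    def h(x, n_bytes=3):
        # Toy hash: the first n_bytes of SHA-256 of x.
        return hashlib.sha256(x.to_bytes(8, 'big')).digest()[:n_bytes]

    def rho_collision(x0=0):
        g = lambda x: int.from_bytes(h(x), 'big')   # iterate on the digest
        slow, fast = g(x0), g(g(x0))                # Floyd phase 1: meeting point
        while slow != fast:
            slow, fast = g(slow), g(g(fast))
        a, b = x0, slow                             # phase 2: walk to the entry point
        if a == b:
            return None                             # x0 was on the cycle; retry
        while g(a) != g(b):
            a, b = g(a), g(b)
        return a, b                                 # a != b but h(a) == h(b)

    print(rho_collision())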
Slide 23: Cycle detection is a very well studied problem
- Floyd
- Pollard
- Brent
- Yao
- Quisquater
- ...
And yet there are new surprising ideas!
Slide 24: The best known technique: Floyd's two-finger algorithm
- Keep two pointers ('fingers')
- Run one of them at normal speed, and the other at double speed, until they collide
(Slides 25-33: animation of the two fingers until they collide.)
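In Python, the two-finger search is only a few lines (a minimal sketch; names are mine):

    def floyd_meet(f, x0):
        # Run a slow and a fast finger until they collide inside the cycle.
        slow, fast = f(x0), f(f(x0))
        while slow != fast:
            slow = f(slow)           # normal speed
            fast = f(f(fast))        # double speed
        return slow                  # some point on the cycle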
Slides 34-39: Can we use Floyd's algorithm to find the entry point into the cycle?
- First find the meeting point
- Move one of the fingers back to the beginning
- Move the two fingers at equal speed
Slides 40-45: Why does it work?
- Denote by d the distance from the beginning to the meeting point
- The fast finger ran another d steps and reached the same point, so d is some (unknown) multiple of the cycle length
- Therefore, running the two marked fingers another d steps reaches the same point again
- So the two fingers meet for the first time at the entrance to the cycle, and then travel together
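Continuing the sketch above, the entry point, tail length and cycle length follow directly (again my own illustrative code):

    def floyd_entry(f, x0):
        meet = floyd_meet(f, x0)     # phase 1: a meeting point on the cycle
        a, b, tail = x0, meet, 0
        while a != b:                # phase 2: equal speed from start and meet
            a, b, tail = f(a), f(b), tail + 1
        entry = a                    # first coincidence = cycle entry point
        length, c = 1, f(entry)
        while c != entry:            # walk once around to measure the cycle
            c, length = f(c), length + 1
        return entry, tail, length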
Slides 46-48: Is this the most efficient cycle detection algorithm?
- When the path has n vertices and the tail is short, Floyd's algorithm requires about 3n steps, and its extension requires up to 5n steps
- When the cycle is short, the fast finger can traverse it many times without noticing
Slide 49: A better idea
- Place checkpoints at fixed intervals
- Update the checkpoints periodically
Slides 50-52: Problems
- Too few checkpoints can miss small cycles
- Too many checkpoints are wasteful
- You do not usually know in which case you are!
Slide 53: Examples of unusually short cycles
- cellular automata (e.g., when simulating the Game of Life)
- stream ciphers (e.g., when one of the LFSRs is stuck at 0)
Slide 54: A very elegant solution, published by Nivasch in 2004
Slide 55: Properties of the Nivasch algorithm
- Uses a single finger
- Uses a negligible amount of memory
- Stops almost immediately after recycling
- Efficient for all possible lengths of cycle and tail
- Ideal for fast hardware implementations
Slide 56: The basic idea of the algorithm
- Maintain a stack of values, which is initially empty
- Insert each new value at the top of the stack
- Force the values in the stack to be monotonically increasing
(Example iteration sequence used in the animation: 4, 3, 6, 7, 9, 5, 0, 8, 2, 1, ...)
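A minimal Python sketch of the stack algorithm (my own code; it returns the repeated value and the two step indices at which it was seen, whose difference is the cycle length):

    def nivasch(f, x0):
        stack = []                           # (value, step) pairs, increasing in value
        x, i = x0, 0
        while True:
            while stack and stack[-1][0] > x:
                stack.pop()                  # keep the stack monotonically increasing
            if stack and stack[-1][0] == x:
                return x, stack[-1][1], i    # value repeated: i - j = cycle length
            stack.append((x, i))
            x, i = f(x), i + 1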
(Slides 57-82: step-by-step animation of the stack algorithm over the example sequence.)
Slide 83: Stop when two identical values appear at the top of the stack.
Slide 84: Claim: the maximal size of the stack is expected to be only logarithmic in the path length, requiring negligible memory.
Slide 85: Claim: the stack algorithm always stops during the second cycle, regardless of the length of the cycle or its tail.
Slide 86: Proof: the smallest value on the cycle cannot be eliminated by any later value. Its second occurrence will eliminate all the higher values separating them on the stack.
Slide 87: The smallest value in the cycle is located at a random position, so we expect to go through the cycle at least once and at most twice (1.5 times on average).
Slide 88: Improvement: partition the values into k types, and use a different stack for each type. Stop the algorithm when a repetition is found in some stack (see the sketch below).
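A sketch of the k-stack variant (my own code, partitioning values by x mod k as one simple choice of 'type'):

    def nivasch_k(f, x0, k):
        stacks = [[] for _ in range(k)]
        x, i = x0, 0
        while True:
            s = stacks[x % k]                # the stack responsible for this type
            while s and s[-1][0] > x:
                s.pop()
            if s and s[-1][0] == x:
                return x, s[-1][1], i        # repetition found in some stack
            s.append((x, i))
            x, i = f(x), i + 1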
Slide 89: The new expected running time is (1 + 1/k)n. Note that n is the minimum possible running time of any cycle detection algorithm, and for k = 100 we exceed it by only 1%.
Slide 90: Unlike Floyd's algorithm, the Nivasch algorithm provides excellent approximations of the lengths of the tail and the cycle as soon as we find a repeated value, with no extra work.
Slide 91: Note that when we stop, the bottom value in each stack contains the smallest value of that type, and that these k values are uniformly distributed along the tail and cycle.
Slide 92: Adding two special points to the k stack bottoms, at least one point must be in the tail and at least one must be in the cycle, regardless of their sizes.
Slide 93: We can now find the two closest points (e.g., 0 and 2) which are just behind the collision point. We can thus find the collision after a short synchronized walk.
Slide 94: The Fundamental Problem of Cryptanalysis
- Given a ciphertext, find the corresponding key
- Given a hash value, find a first or second preimage
Both amount to inverting the easily computed random function f, where f(x) = E_x(0) or f(x) = H(x).
Slide 95: The Random Graph Defined by f
Goal: go backwards. Means: going forwards.
Slide 96: Possible solutions
Method 1, exhaustive search: time complexity T = N, memory complexity M = 1.
Method 2, exhaustive table: time complexity T = 1, memory complexity M = N.
Time/memory tradeoffs find a compromise between the two extremes, i.e., M << N and T << N, by using a free preprocessing stage.
Slide 97: Hellman's T/M Tradeoff (1979)
Preprocessing phase: choose m random startpoints and evaluate chains of length t. Store only the pairs (startpoint, endpoint), sorted by endpoint.
Online phase: from the given y = f(x), complete the chain; then find x by re-calculating the chain from its startpoint.
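A minimal sketch of both phases in Python (my own code; a real implementation would use a sorted table on disk rather than a dict, and many tables rather than one):

    import random

    def hellman_preprocess(f, n, m, t, seed=0):
        # Preprocessing: m chains of length t; keep only endpoint -> startpoint.
        rng = random.Random(seed)
        table = {}
        for _ in range(m):
            start = rng.randrange(n)
            x = start
            for _ in range(t):
                x = f(x)
            table[x] = start
        return table

    def hellman_online(f, t, table, y):
        # Online phase: walk forward from y = f(x); when a stored endpoint is
        # hit, replay its chain from the startpoint to look for a preimage of y.
        x = y
        for _ in range(t + 1):
            if x in table:
                z = table[x]
                for _ in range(t):
                    if f(z) == y:
                        return z        # a preimage of y
                    z = f(z)            # replay failed: it was a false alarm
            x = f(x)
        return None                     # y is not covered by this table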
Slide 98: How can we cover this graph by chains?
The main problem: long chains converge.
Slide 99: Problem: it is hard to cover more than N/t images per table. Why? A new path of length t is likely to collide with the N/t images already covered by the table, due to the birthday paradox.
Hellman's solution: use t independent tables built from t related functions f_i(x) = f(x + i mod N); note that an inversion of f_i yields an inversion of f.
Slide 100: Are these graphs really independent?
- Local properties are preserved, while global properties are modified
- A point which had k predecessors in f will also have k predecessors in f_i, but their identities will change
- In particular, all the graphs will have exactly the same set of leaves, and values which are not in the range of f will not be covered by any path in any table
- On the other hand, the number of components, the sizes of the cycles, and the structure of the trees hanging off the cycles can be very different
Slide 101: Are these graphs really independent?
Hellman's trick is theoretically unfounded, but works very well in practice. To invert a given image, try each of the t functions separately, so both time and space grow by a factor of t: T = t^2, M = mt. By the birthday paradox, the maximum possible values of t and m in a single table satisfy mt^2 = N, which gives the T/M tradeoff curve TM^2 = N^2 (indeed, TM^2 = t^2 * (mt)^2 = (mt^2)^2 = N^2).
Slide 102: A typical choice of parameters
Let c = N^{1/3}. Use c tables, each with m = c paths, each path of length about t = c (stopping each path at the first 'distinguished point' rather than at a fixed length). Together they cover most of the c^3 = N vertices. Total time T = c^2 = N^{2/3}, total space M = c^2 = N^{2/3}.
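A common way to define distinguished points (an assumption on my part; the talk does not spell it out) is to require a fixed number of low-order zero bits, so that a random chain ends after about 2^bits steps on average:

    def is_distinguished(x, bits=20):
        # x is distinguished iff its low `bits` bits are all zero.
        return x & ((1 << bits) - 1) == 0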
Slide 103: Are such tradeoffs practically interesting?
This can be the best approach for cryptosystems with about 80-bit keys, and for cryptosystems whose keys are derived from passwords with up to 16 characters.
Slide 104: A new optimization of Hellman's scheme
If each value has b bits, straightforward implementations require 2b bits of memory per path. I will now show that only b/3 bits are needed. This is a 6-fold saving in memory, which is equivalent to a 36-fold saving in time due to the T/M tradeoff curve TM^2 = N^2.
Slide 105: The new optimization
According to an old Chinese philosopher: "Paths in random graphs are like people: they are born at uniformly distributed startpoints, but die at very unequally distributed endpoints!"
Slide 106: The unequal distribution of endpoints
Distinguished points are much more likely to be near the leaves than deep in this graph, so very few of them will be chosen as endpoints.
Slide 107: The new optimization: forget the endpoints!
Note that the startpoints are arbitrary, so for each one of the c tables we can choose a different interval of c consecutive values as startpoints. Since c = N^{1/3}, only b/3 bits are needed per startpoint. Since we do not store endpoints, this is all we need!
Slide 108: Divide all the c^2 possible distinguished points into about c large regions
- During preprocessing, make sure that each region has at most one path ending at one of its distinguished points
- For each region, memorize the startpoint of this path (if it exists), but not the value of the corresponding endpoint
Slide 109: Problem: this can lead to too many false alarms
A false alarm happens when a stored endpoint is found, but its corresponding startpoint does not lead back to the initial value. In Hellman's scheme, false alarms happen in about half the tables we try. This wastes time, but is hard to avoid.
Slides 110-111: There are two types of false alarms here
- An old false alarm happens when the path from the initial value joins one of the precomputed paths
- A new false alarm happens when the path from the initial value enters a new endpoint
Slide 112: A surprisingly small number of new false alarms are created by forgetting the endpoints
Endpoints which are likely to be chosen by the online phase were also likely to be chosen by the preprocessing phase. Since the Hellman parameters were chosen maximally, with high probability each new path is likely to end in one of the marked endpoints (otherwise we could add more paths to increase our cover!).
Slide 113: The bottom line
Simulations show that the total running time is increased only by a few percent due to new false alarms, and thus it is a complete waste to memorize the endpoints!
Slide 114: Oechslin's Rainbow Tables (2003)
Slide 115: There are many other possible tradeoff schemes
Use a different sequence of functions along each path, such as 1,1,1,2,2,2,3,3,3 or 1,2,3,1,2,3,1,2,3 or pseudorandom, e.g., 1,2,2,1,2,1,1. Or make the choice of the next function depend on previous values.
Slide 116: What kind of random graph are we working with in such schemes?
There was already a slight problem with the multiple graphs of Hellman's scheme, and Oechslin's scheme is even weirder. It's time to define a new notion of a random graph!
Slide 117: Barkan, Biham, and Shamir (Crypto 2006)
We introduced a new type of graph called a Stateful Random Graph, and proved rigorous bounds on the achievable time/memory tradeoffs of any scheme which is based on such graphs, including Hellman, Oechslin, and all their many variants and possible extensions.
Slide 118: The Stateful Random Graph Model
(Diagram: y0 -U-> x1 -f-> y1 -U-> x2 -f-> y2 -U-> ..., carrying hidden states s0, s1, s2, ...)
- The nodes in the graph are pairs (y_i, s_i), with N possible images y_i and S possible states s_i
- The scheme designer can choose any U; then a random f is given
- The increased number of nodes (N*S) can reduce the probability of collisions, and a good U can create more structured graphs
- Examples of states: the table index in Hellman, the column index in Oechslin. We call it a hidden state, since its value is unknown to the attacker when he tries to invert an image y
Slide 119: The Stateful Random Graph Model (cont.)
U in Hellman: x_i = y_{i-1} + s_{i-1} mod N, and s_i = s_{i-1}.
Slide 120: The Stateful Random Graph Model (cont.)
U in Rainbow: x_i = y_{i-1} + s_{i-1} mod N, and s_i = s_{i-1} + 1 mod S.
Slide 121: The Stateful Random Graph Model (cont.)
U in exhaustive search: x_i = s_{i-1}, and s_i = s_{i-1} + 1 mod N, which goes over all the preimages of f in a single cycle.
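The three choices of U translate directly into code; a minimal sketch (the sizes and names are mine):

    N, S = 2**20, 2**10                  # illustrative sizes

    def U_hellman(y, s):
        return (y + s) % N, s            # state = table index, never changes

    def U_rainbow(y, s):
        return (y + s) % N, (s + 1) % S  # state = column index, increments

    def U_exhaustive(y, s):
        return s, (s + 1) % N            # ignore y; enumerate all preimages

    def walk(f, U, y0, s0, steps):
        # One path in the stateful graph: (y, s) -U-> x -f-> (f(x), s').
        y, s = y0, s0
        for _ in range(steps):
            x, s = U(y, s)
            y = f(x)
        return y, s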
Slide 122: Coverage and Collision of Paths
Net coverage: the set of images y_i covered by the M paths.
Gross coverage: the set of nodes (y_i, s_i) covered by the M paths.
Definition: two paths collide if both y_i = y_j and s_i = s_j, i.e., two distinct predecessor nodes (y_{i-1}, s_{i-1}) and (y_{j-1}, s_{j-1}) lead to the same node.
Slide 123: The rigorously proven Coverage Theorem
Let M = N^a for any 0 < a < 1, and let A be the bound defined in the paper. For any U with S hidden states, with overwhelming probability over random f's, the net coverage of any collection of M paths of any length in the stateful random graph is bounded from above by 2A.
Slide 124: Reduction of Best Case to Average Case
For a given U, consider a huge table W indexed by all the possible functions and all the subsets of M startpoints.
W_{i,j} = 1 if the net coverage of f_i and M_j is larger than 2A (0 otherwise).
We want to prove that almost all the rows contain only zeroes, by proving that there are fewer 1's than rows in W.
Slide 125: Upper Bounding Prob(W_{i,j} = 1)
Method:
- Construct an algorithm that counts the net coverage for f_i and M_j
- Analyze the probability that the counted coverage exceeds 2A, i.e., Prob(W_{i,j} = 1), over a random and uniform choice of the startpoints M_j and the function f_i
The combinatorial heart of the proof:
- Define the notion of a coin toss with fixed success probability q
- Define the notion of a 'miracle': many coin tosses with few successes
- Prove that the probability of a miracle is negligible
Slide 126: Bounding Prob(W_{i,j} = 1): Basic Idea
The algorithm traverses the chains, stops each chain at its first collision, and counts the net coverage.
We want to treat each output of f as a truly random number. However, this view is justified only the first time f is applied to an input (a fresh value); otherwise, the output of f is already known. Recall: a collision requires (y_i, s_i) = (y_j, s_j).
Slides 127-134: Bounding Prob(W_{i,j} = 1): Basic Idea (animation)
The counting algorithm maintains S 'fresh buckets', one per hidden state, holding the fresh values produced so far while the chains are traversed.
- A value obtained by applying f to a previously used input is not fresh, even if it lands in another bucket: if we already know f(2) = 7, and 7 is already covered by bucket 1, there is no need to add it to fresh bucket 4.
- A chain must end when a freshly created value, such as f(3), collides with a value already in its fresh bucket.
Clearly NetCoverage <= the sum of the fresh bucket sizes.
Slide 135: Bounding Prob(W_{i,j} = 1): Analysis
What is the probability of a collision between a fresh image y_i = f(x_i) and the values in the fresh bucket? Exactly |FreshBucket|/N, as y_i = f(x_i) is truly random and independent of previous events.
Problem: |FreshBucket| depends on previous probabilistic events and is difficult to analyze.
Slide 136: Bounding Prob(W_{i,j} = 1): Coin Toss
Set a threshold of A/S in each bucket, dividing it into a lower and an upper part.
A 'coin toss' occurs when x_i is fresh and the lower part of its bucket is full. Hence |UpperBuckets| <= #CoinTosses, and NetCoverage <= sum of fresh bucket sizes <= A + |UpperBuckets| <= A + #CoinTosses.
A 'successful' coin toss is one in which y_i collides with the lower part of the bucket. The probability that a coin toss is successful is exactly q = A/(SN), independent of previous events! A successful coin toss implies a collision, so there are at most M successful coin tosses.
Slide 137: Miracles happen with very low probability
A 'miracle' is NetCoverage > 2A. A miracle implies that after A coin tosses there are fewer than M successes, i.e., Prob(Miracle) <= Prob(B(A,q) < M), where B(A,q) is a binomial random variable with q = A/(SN).
Concluding the proof: Prob(W_{i,j} = 1) is so small that the number of 1's in the table is much smaller than the number of rows, so for any tradeoff scheme U with S hidden states, almost all functions f cannot be covered well even by the best subset of M startpoints. QED.
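For intuition (with toy parameters of my own choosing; the proof itself uses asymptotic tail bounds, since A is huge), the binomial tail Prob(B(A,q) < M) can be computed exactly:

    from math import comb

    def prob_binomial_below(A, q, M):
        # Prob(B(A, q) < M), summed exactly term by term.
        return sum(comb(A, i) * q**i * (1 - q)**(A - i) for i in range(M))

    print(prob_binomial_below(A=1000, q=0.05, M=10))   # a tiny tail probability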
Slide 138: Rigorous Lower Bound on the Number of Hidden States S
Requiring a net coverage of at least N/2, the coverage theorem implies a lower bound that the number of hidden states S must satisfy.
Slide 139: Corollaries
To cover most of the vertices of any stateful random graph, you have to use a sufficiently large number of hidden states, which determines the minimal possible running time of the online phase of the attack. This rigorously proven lower bound is applicable to Hellman's scheme, the Rainbow scheme, and any other scheme which can be described by stateful random graphs.
Slide 140: Conclusion
Random graphs are wonderful objects to study. Understanding their structure can lead to many cryptographic and cryptanalytic optimizations. In this talk I gave only a small sample of the published and folklore results at the interface between cryptography and random graph theory.