Title: Maths behind the Diffie Hellman and RSA protocols
1Maths behind the Diffie Hellman and RSA protocols
- What this is going to cover and why
- Primes, products of primes and factorisation
- How to win a million dollars
- How many primes are there and how many very large
numbers are prime?
- Generating and testing very large primes
- Modular exponentiation
- Purposes of Diffie Hellman and RSA protocols
- How strong is RSA and how might it be broken ?
2What we're going to cover and why
- Understanding how asymmetric cryptography works
will involve some mathematics. This concerns how
large prime numbers can be generated and whether
products of 2 large prime numbers can be
factorised. Some exponentiation and remaindering
operations are used on numbers too large to handle
using direct methods, so we'll be covering short cuts
to make these calculations fit our computing
resources.
- This will give us the tools we need to read
more about, and begin to understand, the
descriptions and the security of asymmetric
cryptography protocols including Diffie Hellman
and RSA.
3Primes, products of primes and factorisation
- A prime number has no factors other than itself
and 1. A factor F of N divides N exactly, so
that the remainder you get on dividing N by F is
zero, or N mod F = 0. We will use the word mod to
mean the remaindering or modulus operation.
- RSA is considered secure because it is thought to be
very difficult or infeasible to factor products of 2
large prime numbers. Diffie Hellman is considered
secure because no efficient way is known to reverse
modular exponentiation involving large numbers (the
discrete logarithm problem). Large here means
binary numbers with between 1024 and 4096 digits.
A number with 1000 binary digits would have just
over 300 decimal or 250 hexadecimal digits.
4Multiplying large integers
- When we discuss primes and factorisation, we are
dealing with whole numbers which don't have
fractions and the kind of division which leaves
remainders. Not having fractions simplifies the
storage of these numbers in computer memory.
- To multiply two 1024 bit binary prime numbers
together we use the hardware binary integer
multiplication instructions in our CPU combined
with long multiplication. This needs a programming
library which can handle large integers, because
the 32 or 64 bit integer types directly supported
by hardware are too small. Long multiplication
isn't as fast as multiplying two 32 bit numbers
together directly in hardware to make a 64 bit
one, but is fast enough.
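The long multiplication described above can be sketched as follows. This is an illustration only, not how any particular library does it: the 32 bit limb size and the helper names to_limbs, from_limbs and long_multiply are assumptions made for this example. Python's own integers are already arbitrary precision, so they are used here only to check the result.

```python
LIMB_BITS = 32               # assumed limb size, matching a 32 bit hardware multiply
BASE = 1 << LIMB_BITS


def to_limbs(n):
    """Split a non-negative integer into 32 bit limbs, least significant first."""
    limbs = []
    while n:
        limbs.append(n & (BASE - 1))
        n >>= LIMB_BITS
    return limbs or [0]


def from_limbs(limbs):
    """Rebuild the integer from its limbs."""
    n = 0
    for limb in reversed(limbs):
        n = (n << LIMB_BITS) | limb
    return n


def long_multiply(x, y):
    """Schoolbook long multiplication on two limb lists."""
    result = [0] * (len(x) + len(y))
    for i, xi in enumerate(x):
        carry = 0
        for j, yj in enumerate(y):
            # One 32 x 32 -> 64 bit multiply plus the running sums.
            total = result[i + j] + xi * yj + carry
            result[i + j] = total & (BASE - 1)
            carry = total >> LIMB_BITS
        result[i + len(y)] = carry
    return result


if __name__ == "__main__":
    import random
    a = random.getrandbits(1024)
    b = random.getrandbits(1024)
    print(from_limbs(long_multiply(to_limbs(a), to_limbs(b))) == a * b)
```

Two 1024 bit numbers are 32 limbs each, so this needs 32 x 32 = 1024 hardware multiplications.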
5Remaindering large integers
- Obtaining a result R such that R = a mod b, where a
and b are both large, can involve multiplying b
by 2 as many times as possible to obtain p = b .
2^n (the dot means multiply) while keeping p
no larger than a, and then obtaining R1 = a - p through
subtraction. This process is repeated on R1 to obtain
successively smaller values R2, R3 etc.
until the final result R is obtained such that
0 <= R < b.
- Multiplying by 2 and subtraction are fast
operations on large numbers.
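A minimal sketch of the doubling and subtracting procedure described above. The function name shift_subtract_mod is made up for this example, and multiplying by 2 is written as a left shift by one bit.

```python
def shift_subtract_mod(a, b):
    """Return a mod b using only doubling, comparison and subtraction."""
    if b <= 0 or a < 0:
        raise ValueError("expects a >= 0 and b > 0")
    r = a
    while r >= b:
        p = b
        # Double p as many times as possible while it is no larger than r.
        while (p << 1) <= r:
            p <<= 1
        r -= p              # R1 = a - p, then R2, R3, ... on later passes
    return r


if __name__ == "__main__":
    print(shift_subtract_mod(221, 7))   # same value as 221 % 7
```

Each pass removes at least the top bit of the running remainder, so the number of passes is no more than the number of bits in a.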
6Factorisation
- How do we factorise a number N made by
multiplying large primes P and Q together? One
approach involves trying every number I smaller than
or equal to the square root of N, starting with
2, and seeing if N mod I = 0. We add one to I
each time we try this, printing out any factors
we find. We could speed this up if we only try
dividing N by prime numbers, starting with 2 and
stopping after the square root of N. This is
because e.g. if 2 and 3 are not factors of N,
then multiples of 2 or 3 can't be either.
- If we have a list of primes < sqrt(N) to start
with, this procedure can be made faster. But this
doesn't help us factorise products of large
primes, because we don't have enough computer
memory to store a list of all primes extending
much beyond a few hundred billion.
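The trial division approach described above might look like this in Python. It is a sketch for small numbers only; the function name trial_division is an assumption for this example. Dividing each factor out as soon as it is found means that only primes are ever recorded, even though every I is tried.

```python
def trial_division(n):
    """Return the prime factors of n in ascending order, by trying I = 2, 3, 4, ..."""
    factors = []
    i = 2
    while i * i <= n:            # stop once I exceeds the square root of what is left
        while n % i == 0:        # N mod I = 0, so I is a factor
            factors.append(i)
            n //= i
        i += 1
    if n > 1:
        factors.append(n)        # whatever remains is itself prime
    return factors


if __name__ == "__main__":
    print(trial_division(221))   # [13, 17]
    print(trial_division(3233))  # [53, 61]
```

For a product of two 1024 bit primes the loop would need around 2^1024 steps, which is why this method is of no use against RSA moduli.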
7How to make $1,000,000
- All you have to do is work out how to factor the
product of 2 randomly chosen primes each about
1024 bits long. The security of Internet
commerce depends upon you not succeeding. Your
fame would enable you to earn this amount through
book sales and public speaking engagements.
Anyone who knew how to do this would have to be
offered more than this to keep the method secret.
- The Clay Mathematics Institute offers a prize of
this sum to whoever proves the Riemann hypothesis.
It has been suggested that such a proof might lead
to a method of solving this problem. More on this
below.
8An infinity of prime numbers
- Euclid proved the existence of an infinite number
of primes around 300 BC as follows. Imagine there is
a highest prime P. We could then generate a
product of all the primes: M = 2 . 3 . 5 . 7 ... P
- But then, either M + 1 would be prime (because
(M + 1) mod 2 = (M + 1) mod 3 = (M + 1) mod 5 = ...
= (M + 1) mod P = 1) or M + 1 would have to be the
product of 2 or more primes Q, R > P. Either way
primes higher than P must logically exist once we
imagine a highest prime P to exist. This is a
contradiction. Therefore there is no highest
prime, so an infinite number of prime numbers must
exist.
- Obviously we can't generate primes of infinite
size. So the next question concerns how easily
we can find a prime of an arbitrary size by
generating random numbers. This turns out to be
surprisingly easy.
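Euclid's construction above can be tried on small lists of primes. In this sketch the names euclid_number and smallest_factor are assumptions for illustration. It shows both cases of the proof: M + 1 is sometimes prime itself, and sometimes a product of primes all larger than P.

```python
from math import prod


def euclid_number(primes):
    """Return M + 1, where M is the product of the given primes."""
    return prod(primes) + 1


def smallest_factor(n):
    """Return the smallest factor of n greater than 1 (n itself if n is prime)."""
    i = 2
    while i * i <= n:
        if n % i == 0:
            return i
        i += 1
    return n


if __name__ == "__main__":
    for primes in ([2, 3, 5, 7], [2, 3, 5, 7, 11, 13]):
        m1 = euclid_number(primes)
        # Every prime in the list leaves remainder 1 when dividing M + 1.
        assert all(m1 % p == 1 for p in primes)
        print(primes[-1], m1, smallest_factor(m1))
    # 7 211 211      -> 211 is itself prime
    # 13 30031 59    -> 30031 = 59 . 509, both larger than 13
```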
9How many primes exist below a given number?
Source: http://en.wikipedia.org/wiki/Prime_number#Counting_the_number_of_prime_numbers_below_a_given_number
- Even though the total number of primes is
infinite, one could still ask "Approximately how
many primes are there below 100,000?", or "How
likely is a random 20-digit number to be prime?".
- The prime-counting function π(x) is defined as
the number of primes up to x. There are known
algorithms to compute exact values of π(x) faster
than it would be possible to compute each prime
up to x. Values as large as π(10^20) can be
calculated quickly and accurately with modern
computers. Thus, e.g., π(100,000) = 9592, and
π(10^20) = 2,220,819,602,560,918,840.
- For larger values of x, beyond the reach of
modern equipment, the prime number theorem
provides a good estimate: π(x) is approximately
x/ln(x). Even better estimates are known.
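The figures above can be checked for small x. This sketch counts primes with a sieve of Eratosthenes, a technique not described in these notes and chosen here only because it is short, and compares the count with the x/ln(x) estimate. The function names are assumptions for this example.

```python
from math import log


def count_primes(x):
    """Return the number of primes up to x (the prime-counting function) using a sieve."""
    if x < 2:
        return 0
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    i = 2
    while i * i <= x:
        if is_prime[i]:
            # Cross out every multiple of i from i*i upwards.
            for multiple in range(i * i, x + 1, i):
                is_prime[multiple] = 0
        i += 1
    return sum(is_prime)


def pnt_estimate(x):
    """Prime number theorem estimate x / ln(x)."""
    return x / log(x)


if __name__ == "__main__":
    x = 100_000
    print(count_primes(x))          # 9592
    print(round(pnt_estimate(x)))   # 8686
```

At x = 100,000 the estimate is about 9% too low. The relative error shrinks as x grows.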
10Prime frequency table source:
http://en.wikipedia.org/wiki/Prime_number_theorem
- The column on the right gives average gaps
between primes of size x.
11How to generate a large prime
- To obtain a prime of about 300 decimal digits,
generate 1024 random bits and append a 1 to the
right hand end to ensure it is odd. Test whether
it is prime using a primality test e.g. Fermat or
Rabin Miller. If it isn't, add 2 to it and try
again. Repeat until the number tests prime.
- Extrapolating from the prime frequency table we
expect one odd number in 330 to be prime in this
range (because average prime gaps are about
660). Alternatively we could generate a new random
number each time and try on average 330 times before
we find one, or if we are short of trusted entropy
use a secure pseudorandom generator seeded by the
first random number.
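The search procedure above, sketched in Python with the Fermat test as the primality check. The function names and the choice of 20 rounds are assumptions for this example. Like the description above, it does not force the top bit, so the result may occasionally be shorter than requested, and a Fermat-only check can in principle be fooled by Carmichael numbers.

```python
import secrets


def fermat_probably_prime(n, rounds=20):
    """Fermat test: a prime n satisfies a^(n-1) mod n = 1 for every base a in [2, n - 2]."""
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    for _ in range(rounds):
        a = 2 + secrets.randbelow(n - 3)    # random base in [2, n - 2]
        if pow(a, n - 1, n) != 1:
            return False                    # definitely composite
    return True                             # probably prime


def generate_prime(bits):
    """Generate random bits, append a 1 to make the number odd, then step by 2."""
    candidate = (secrets.randbits(bits) << 1) | 1
    while not fermat_probably_prime(candidate):
        candidate += 2
    return candidate


if __name__ == "__main__":
    print(generate_prime(1024))
```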
12The Rabin Miller primality test
- This and the Fermat test are probabilistic and
are run a number of times to reduce the
probability of a candidate prime being composite
to a very small level. Each Rabin Miller test
reduces the probability of a candidate prime being
composite to at most 1/4 of the previous value. A
different base (a random number or a small prime),
which acts as a potential witness that the candidate
is composite, is used for each cycle. Passing 64
tests reduces the probability of a random number
being composite to at most 1/4^64 = 1/2^128.
- Some composites (e.g. Carmichael numbers) are
good Fermat liars, in that they will pass the
Fermat test for most bases. But these numbers occur
so infrequently that generating the random number
locally reduces this risk to a tiny and known
level.
13Rabin Miller primality test pseudocode source:
http://en.wikipedia.org/wiki/Miller-Rabin_test
- Input: n > 2, an odd integer to be tested for
primality
- Input: k, a parameter that determines the
accuracy of the test
- Output: composite if n is composite, otherwise
probably prime
- Write n - 1 as 2^s . d with d odd by factoring
powers of 2 from n - 1
- LOOP: repeat k times
  - pick a randomly in the range [2, n - 2]
  - x ← a^d mod n
  - if x = 1 or x = n - 1 then do next LOOP
  - for r = 1 .. s - 1
    - x ← x^2 mod n
    - if x = 1 then return composite
    - if x = n - 1 then do next LOOP
  - return composite
- return probably prime
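The pseudocode above translates fairly directly into Python. This is a sketch: the function name is_probably_prime, the default of k = 64 rounds and the handling of n below 5 are additions for this example, and the built-in three-argument pow is used for the modular exponentiation.

```python
import secrets


def is_probably_prime(n, k=64):
    """Rabin Miller test: False means composite, True means probably prime."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    # Write n - 1 as 2^s . d with d odd.
    s, d = 0, n - 1
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(k):                       # LOOP: repeat k times
        a = 2 + secrets.randbelow(n - 3)     # random a in [2, n - 2]
        x = pow(a, d, n)                     # x = a^d mod n
        if x == 1 or x == n - 1:
            continue                         # do next LOOP
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == 1:
                return False                 # composite
            if x == n - 1:
                break                        # do next LOOP
        else:
            return False                     # never reached n - 1: composite
    return True                              # probably prime


if __name__ == "__main__":
    print(is_probably_prime(2**127 - 1))     # a known Mersenne prime
    print(is_probably_prime(561))            # a Carmichael number, so composite
```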
14Using OpenSSL to check if a large number is prime
- The openssl command line tool enables us to try
various cryptographic primitive operations from
the command line. It has a prime option, to test
if a large number is prime or not, using a set of
Rabin Miller tests. Example:
- rich_at_saturn openssl prime 92938982342342198347213987237213423914192384241397283498234172139942317423921392173428317519
- 2D9FE6C6431358699F8490E1B3F2DF525DF375B5FC788690DF09B96E219DC46ED86CA74611D4F is not prime
- rich_at_saturn openssl prime 92938982342342198347213987237213423914192384241397283498234172139942317423921392173428317521
- 2D9FE6C6431358699F8490E1B3F2DF525DF375B5FC788690DF09B96E219DC46ED86CA74611D51 is prime
15Modular exponentiation 1
- A version of this operation that is too slow and
needs too much memory would multiply one number by
itself the number of times given by the exponent
minus 1, and then take a remainder or modulus.
- E.G. a^g mod p
- multiplies a by itself g - 1 times, and takes the
remainder when dividing the result by p. A simple
example of this is if a is 5, g is 3 and p is 11,
then
- E.G. 5^3 mod 11 = 4
- (because 5^3 is 125, and 11^2 is 121, so the
remainder is 125 - 121 = 4)
16Modular exponentiation 2
- A much faster version giving the same result
reduces the number of multiplications by using
repeated squaring. It is based on:
- if b is even
- a^b = (a^(b/2))^2,
- where b/2 is the integer (no fractions) division,
- Example: 2^8 = (2^4)^2 = 256
- if b is odd
- a^b = (a^(b/2))^2 . a
- Example: 2^9 = (2^4)^2 . 2 = 512
17Modular exponentiation 3
- This is done recursively e.g.
- 2^37 = (2^18)^2 . 2
- 2^18 = (2^9)^2
- 2^9 = (2^4)^2 . 2
- 2^4 = (2^2)^2
- 2^2 = 2 . 2
- Exponentiating this way involves 7
multiplications instead of 36. The number of
multiplications is between log2(g) and 2 . log2(g)
if g is the exponent. So instead of needing to
multiply g - 1 times, we need to multiply between
about 1024 and 2048 times if exponent g is a 1024
bit binary number.
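The recursion above can be written out so that it also counts its multiplications. A minimal sketch, assuming an exponent of at least 1; the function name power_counting is made up for this example.

```python
def power_counting(a, b):
    """Return (a^b, number of multiplications used) by repeated squaring, for b >= 1."""
    if b == 1:
        return a, 0
    half, count = power_counting(a, b // 2)   # b // 2 is the integer division
    result = half * half                      # one squaring
    count += 1
    if b % 2 == 1:
        result *= a                           # one extra multiplication when b is odd
        count += 1
    return result, count


if __name__ == "__main__":
    value, multiplications = power_counting(2, 37)
    print(value == 2**37, multiplications)    # True 7
```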
18Modular exponentiation 4
- However, we could still end up with a number with
so many digits (before taking the remainder on
dividing by p) that we wouldn't have enough
memory to store it.
- A second optimisation we can combine with the one
above is to repeatedly take the remainder on
division by p, which will prevent the
intermediate results from growing too large. We
can combine this second optimisation with the
first because if
- f = g . h and f mod i = j
- then ((g mod i) . (h mod i)) mod i = j
19Modular exponentiation 5
- The above identity follows because
- if g = k + q where k = g - (g mod i) and q = g
mod i
- and h = m + n where m = h - (h mod i) and n = h
mod i, then
- f = g . h = (k + q).(m + n) = k.m + q.m + k.n +
q.n .
- In this expression, k and m are exact multiples
of i, so
- k.m mod i = q.m mod i = k.n mod i = 0 .
- Therefore g.h mod i = q.n mod i = j
- So we don't have to multiply g and h, we can
multiply the smaller numbers q and n after
obtaining these remainders.
20Modular exponentiation 6
- Example: start with f = 221, g = 13, h = 17 and i
= 7.
- 221 = 13 . 17 and 221 mod 7 = 4
- 13 mod 7 = 6 and 17 mod 7 = 3 and (6 . 3)
mod 7 = 4
- So we didn't need to get the remainder on
dividing 221 by 7 and we didn't need to multiply
13 and 17 together. We got the same result
keeping our numbers to multiply (6 and 3)
smaller, because we used the remainder on
division by 7 three times instead of once.
- Combining these optimisations into an algorithm
that reduces both the number of multiplications
and the size of the numbers being multiplied makes
modular exponentiation fast.
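Both optimisations combined might look like the sketch below. It uses a loop over the bits of the exponent rather than recursion, which is the same repeated squaring idea arranged to avoid deep recursion on 1024 bit exponents. The name mod_exp is an assumption for this example. Python's built-in pow(a, g, p) does the same job and is used here as a cross-check.

```python
def mod_exp(a, g, p):
    """Return a^g mod p, reducing mod p after every multiplication."""
    result = 1 % p
    base = a % p
    while g > 0:
        if g & 1:                         # this bit of the exponent is set
            result = (result * base) % p
        base = (base * base) % p          # square, then take the remainder
        g >>= 1
    return result


if __name__ == "__main__":
    print(mod_exp(5, 3, 11))              # 4, as in the earlier example
    print(mod_exp(13, 1, 7) * mod_exp(17, 1, 7) % 7)   # (6 . 3) mod 7 = 4
```

No intermediate value ever exceeds p^2, so the memory needed stays at about twice the size of the modulus.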
21Diffie Hellman Protocol 1
- This protocol enables Alice and Bob to establish
a shared secret suitable for one-off use to
support a communication session, without sharing
any secret in advance. On its own it doesn't
authenticate Alice and Bob to each other.
But it can be combined with other public-key
protocols, e.g. by Bob and Alice signing the
shared secret using their semi-permanent GPG
secret keys. Then if Alice and Bob can both trust
each other's keys as genuine, their signatures on
the Diffie Hellman (DH) session key also
authenticate them to each other.
- The DH one-off shared secret or session key
allows for perfect forward secrecy, which means
that a later compromise of a long-duration key does
not compromise the ciphertext of previous sessions
protected by these session keys.
22Diffie Hellman Protocol 2
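The content of this slide is not reproduced in these notes, so what follows is a sketch of the textbook form of the exchange and not a transcription of the slide. The variable names are assumptions, and the modulus p = 23 and generator g = 5 are toy values chosen so that the numbers can be checked by hand. Real parameters are 1024 to 4096 bits long. Alice and Bob each pick a private exponent, exchange g^a mod p and g^b mod p in public, and each raise the value received to their own private exponent.

```python
def dh_exchange(p, g, a, b):
    """Run one textbook Diffie Hellman exchange and return what each side computes."""
    A = pow(g, a, p)          # Alice sends A = g^a mod p to Bob
    B = pow(g, b, p)          # Bob sends B = g^b mod p to Alice
    s_alice = pow(B, a, p)    # Alice computes B^a mod p
    s_bob = pow(A, b, p)      # Bob computes A^b mod p
    return A, B, s_alice, s_bob


if __name__ == "__main__":
    # Toy values: private exponents 6 and 15 would normally be large random numbers.
    A, B, s_alice, s_bob = dh_exchange(p=23, g=5, a=6, b=15)
    print(A, B, s_alice, s_bob)   # 8 19 2 2
```

Both sides arrive at the same value because (g^a)^b = (g^b)^a = g^(a . b), while an eavesdropper sees only p, g, A and B.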
23The RSA protocol 1 source:
http://en.wikipedia.org/wiki/RSA
- Here is an example of RSA encryption and
decryption. The parameters used here are
artificially small, but one can also use OpenSSL
to generate and examine a real keypair.
24The RSA protocol 2 source:
http://en.wikipedia.org/wiki/RSA
- In real life situations the primes selected would
be much larger. In our example, however, it would
be relatively trivial to factor n, 3233, obtained
from the freely available public key, back to the
primes p and q.
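A toy RSA round trip using n = 3233 is sketched below. Only n comes from these notes. The factors p = 61 and q = 53 follow from factoring 3233, while the public exponent e = 17, the message 65 and the function names are assumptions for this example. The private exponent is computed as the inverse of e modulo (p - 1) . (q - 1), which is one of the standard textbook choices. The three-argument pow with exponent -1 needs Python 3.8 or later.

```python
def make_keypair(p, q, e):
    """Return (n, e, d) for textbook RSA, given primes p and q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)       # d . e mod phi = 1
    return n, e, d


def encrypt(m, e, n):
    return pow(m, e, n)       # c = m^e mod n


def decrypt(c, d, n):
    return pow(c, d, n)       # m = c^d mod n


if __name__ == "__main__":
    n, e, d = make_keypair(61, 53, 17)
    c = encrypt(65, e, n)
    print(n, d, c, decrypt(c, d, n))   # 3233 2753 2790 65
```

Anyone who can factor n back into p and q can repeat the make_keypair calculation and recover d, which is why the security of RSA rests on factoring being hard.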
25Quantum computers and Shor's algorithm
- Conventional computers perform computations on
bits. A bit has a value of one or zero. Quantum
computers depend upon the validity of the branch
of physics known as quantum theory. They use
qubits, which can be a 0 or a 1, or a quantum
superposition of both of these states. Shor's
algorithm is theoretically capable of cracking
RSA given a sufficiently large quantum computer.
- D-Wave have announced a quantum computer with as
many as 16 qubits.
26Quantum Cryptography
- The same area of physics which might one day be
used to break RSA might also one day replace it
for some purposes. Quantum cryptography relies on
quantum states, e.g. of single polarised
photons transmitted over fibre optics, being
dependent upon whether they are observed or not.
- Unfortunately this technique requires dedicated
optic-fibre or free-space optical point to point
communication links and expensive equipment. It
is thought that this technique will not work over
a routed network.
27The Riemann Zeta function and factoring source:
http://secamlocal.ex.ac.uk/people/staff/mrwatkin/zeta/ns111100.htm
- The zeta function (above) holds inside it the
secrets of the primes. How come? Like any
function, all it does is turn one number into
another number. If n is 3, for example, you add
up the infinity of terms in the formula to find
out that ζ(3) is roughly 1.2. As the formula
shows, the zeta function can also be written as a
product of infinitely many terms, each based on
one prime number.
- The true significance of this function emerges if
you feed it complex numbers such as 24 + 13i,
combinations of ordinary, real numbers and
so-called imaginary numbers (where i is the
square root of -1). Although they may sound
abstruse, complex numbers are used to simplify
practical calculations in everything from
engineering to quantum mechanics.
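The formula referred to above is not reproduced in these notes. The sketch below assumes the standard definitions: ζ(n) as the sum of 1/k^n over all whole numbers k, and as the product of 1/(1 - p^-n) over all primes p. It computes both for n = 3 with a finite number of terms. The function names and the cut-off points are assumptions for this example.

```python
def zeta_sum(n, terms):
    """Partial sum 1/1^n + 1/2^n + ... + 1/terms^n."""
    return sum(1.0 / k**n for k in range(1, terms + 1))


def primes_up_to(limit):
    """Return the list of primes <= limit using a simple sieve."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, int(limit**0.5) + 1):
        if is_prime[i]:
            for multiple in range(i * i, limit + 1, i):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def zeta_product(n, limit):
    """Partial product of 1 / (1 - p^-n) over the primes p <= limit."""
    result = 1.0
    for p in primes_up_to(limit):
        result *= 1.0 / (1.0 - p**-n)
    return result


if __name__ == "__main__":
    print(zeta_sum(3, 10_000))       # roughly 1.2
    print(zeta_product(3, 1_000))    # roughly 1.2 again, built from primes only
```

That a sum over all whole numbers equals a product over the primes alone is the sense in which the function carries information about the primes.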
28The Riemann Zeta function and factoring 2 source:
http://secamlocal.ex.ac.uk/people/staff/mrwatkin/zeta/ns111100.htm
- For certain complex numbers, the zeta function is
zero. All the known non-trivial "zeros" lie along a
line in the complex plane, with real parts equalling
1/2. Riemann's hypothesis is that every non-trivial
zero lies on this line. If they do, Riemann proved,
the prime numbers must show up as if they are picked
at random, but still following an underlying
distribution.
- The TV thriller "Prime Suspect" was based on the
idea that proving the Riemann hypothesis would
result in breaking all Internet security,
presumably by making it easier to factor products
of large primes. Interestingly enough, there are
deep connections between the Riemann zeta
function, prime numbers and also quantum physics.
29Further Reading
- These lecture notes present an overview and a
limited toolkit, but don't cover the Diffie
Hellman or RSA protocols extensively. Exam
questions will assume that students have read
about these subjects in more depth. The
recommended text (Ferguson, N. & Schneier, B.
(2003) Practical Cryptography, John Wiley &
Sons) is suitable for more thorough study.
- For more general study purposes, the HTML version
of these notes contains various relevant links.