Title: Public Key Cryptosystem
1Public Key Cryptosystem
2Recall Definition
- A public key cryptosystem is a tuple (M, C, K, G, E, D):
- M: cleartext message space
- C: ciphertext space
- K: key space
- G: generates an encryption/decryption key pair from the key length
- E: encrypts cleartext given the encryption key
- D: decrypts ciphertext given the decryption key
3RSA Cryptosystem
- The best-known and most widely used public-key cryptosystem.
- Named after its inventors Rivest, Shamir, and Adleman.
- They received the Turing Award for RSA.
- Based on the hardness of factoring large numbers.
- In fact, its security relies on more than that.
4RSA Key Generation (1)
- Generate two large random primes p, q
- p, q are usually of the same length.
- $n = pq$
- Choose an appropriate exponent e (of the same length).
- Compute d such that for all m, $m^{ed-1} \equiv 1 \pmod{n}$.
5RSA Key Generation (2)
- Public key: (n, e)
- Private key: (n, d)
- Discard p, q.
- This is very important for the security of RSA.
6RSA Encryption/Decryption
- Cleartext space: (an appropriate subset of) $\{1, \ldots, n-1\}$
- Ciphertext space: (an appropriate subset of) $\{1, \ldots, n-1\}$
- $E((n,e), m) = m^e \bmod n$.
- $D((n,d), c) = c^d \bmod n$.
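A minimal sketch of encryption and decryption with a toy key may help here. The primes 61 and 53, the exponent 17, and the function names are illustrative assumptions, not values from the slides. The private exponent is obtained with Python's built-in modular inverse (Python 3.8+) using (p-1)(q-1) as the modulus.

```python
# Toy RSA key; real keys use primes that are hundreds of digits long.
p, q = 61, 53
n = p * q                        # public modulus, 3233
e = 17                           # assumed public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent via modular inverse

def encrypt(m, n, e):
    """E((n, e), m) = m^e mod n."""
    return pow(m, e, n)

def decrypt(c, n, d):
    """D((n, d), c) = c^d mod n."""
    return pow(c, d, n)

m = 65
c = encrypt(m, n, e)
assert decrypt(c, n, d) == m     # decryption undoes encryption
```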
7Why does the decryption work?
8Exponentiation Algorithm
- Both encryption and decryption need to compute modular exponentiation.
- What algorithm do we use to do this?
- By definition, $a^e$ is a multiplied by itself e times.
- Then even computing something like $7^{54238}$ takes a lot of time.
- How about computing $a^e$ for a = 249734797411111243243243224324324234324234234320043543570 and a large e?
9Fast Exponentiation(1)
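- Let's write e in binary, for example
- $e = 1010000000110_2$
- Then $a^e = a^{2^{12}} \cdot a^{2^{10}} \cdot a^{2^{2}} \cdot a^{2^{1}}$
- It is enough if we can quickly compute the powers $a^{2^i}$.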
10Fast Exponentiation (2)
- But how can we quickly compute $a^{2^i}$?
- Start from a.
- Keep squaring; we get $a^2, a^4, a^8, \ldots$
- Continue until we have all the items we need.
11Fast Exponentiation Algorithm
- Input: a, e
- Output: y = a^e
- int b = a, y = 1, ee = e;
- while (ee != 0) {  // invariant: (b^ee) * y == a^e
-   if (ee & 1)  // ee is odd
-     y *= b;  // multiply result by the current power
-   b *= b; ee >>= 1;  // compute next power, drop the lowest bit of ee
- }
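The loop above can be sketched in Python as follows. The function name `fast_pow` and the optional modulus argument are assumptions added for illustration. Reducing mod n after every multiplication is what keeps the intermediate values small in RSA.

```python
def fast_pow(a, e, n=None):
    """Square-and-multiply: a**e, reduced mod n at each step when n is given."""
    b, y, ee = a, 1, e
    while ee != 0:
        # invariant: (b**ee) * y equals a**e (mod n when n is given)
        if ee & 1:                       # lowest bit of ee is set
            y = y * b if n is None else (y * b) % n
        b = b * b if n is None else (b * b) % n   # next power a^(2^i)
        ee >>= 1
    return y

# The loop runs once per bit of e, so about 16 iterations here.
assert fast_pow(7, 54238, 1000003) == pow(7, 54238, 1000003)
```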
12Computing d?
- Now we have seen how the encryption/decryption algorithms work.
- But how does the key generation algorithm work?
- In particular, how do we find d?
- Recall we need, for all m, $m^{ed-1} \equiv 1 \pmod{n}$.
- This requires a little number theory.
13Residue Class (1)
- For any modulus n and any integer a, we can define
- $A = \{a' : a' \equiv a \pmod{n}\}$
- This is called a residue class.
- For any modulus n, the residue classes mod n constitute a partition of the integers.
- Any two distinct residue classes mod n are disjoint.
- Every integer is in a residue class mod n.
14Residue Class (2)
- Being in the same residue class is called being modular equivalent mod n. This is an equivalence relation.
- Any integer is modular equivalent to itself.
- If a and b are modular equivalent, then b and a are modular equivalent.
- If a and b are modular equivalent, and b and c are modular equivalent, then a and c are modular equivalent.
15Residue Class (3)
- Residue classes preserve basic arithmetic.
- If $a' \equiv a \pmod{n}$ and $b' \equiv b \pmod{n}$, then $a' + b' \equiv a + b \pmod{n}$.
- Similar properties hold for subtraction, multiplication, division (where it is defined), etc.
- So we usually use one element of a residue class to represent it.
- For example, we use 1 to represent $1, n+1, -n+1, 2n+1, -2n+1, \ldots$
16Complete Residue System
- There are n residue classes in total with respect to modulus n.
- We use $0, 1, 2, \ldots, n-1$ to represent each of them, respectively.
- Then we have a complete system of residue classes.
- Formally, we define $Z_n = \{0, 1, \ldots, n-1\}$.
17Reduced Residue System
- We are particularly interested in the residue classes that are coprime to n.
- Note that if a number in a residue class is coprime to n, then so are all other numbers in this class.
- The set of these classes is called a reduced system of residue classes.
- Formally, we define $Z_n^* = \{a : \gcd(a,n) = 1 \text{ and } a \in Z_n\}$.
18Multiplication in Zn
- For multiplication (mod n; we often omit "mod n" in what follows), $Z_n^*$ is a group. This means the following properties hold:
- It is closed: if a and b are in $Z_n^*$, then ab is also in $Z_n^*$.
- It is associative: if a, b, c are in $Z_n^*$, then $(ab)c = a(bc)$.
- 1 is in $Z_n^*$. Note that for all a, $1 \cdot a = a \cdot 1 = a$.
- For all a in $Z_n^*$, 1/a is also in $Z_n^*$.
- Note that 1/a (mod n) is defined as the b such that $ab \equiv 1 \pmod{n}$.
19Euler Totient Function
- We define $\Phi(n) = |Z_n^*|$.
- This is called the Euler totient function.
- Intuitively, this is the number of integers in $\{0, 1, \ldots, n-1\}$ that are coprime to n.
- Suppose that $n = ab$.
- If $\gcd(a,b) = 1$, then $\Phi(n) = \Phi(a)\,\Phi(b)$.
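20Computing Φ(n) (1)
21Computing Φ(n) (2)
- What is $\Phi(p^n)$ then, for a prime p?
- $\Phi(p^n)$ is the number of integers in $\{0, 1, \ldots, p^n-1\}$ that are coprime to $p^n$.
- This is equal to the number of integers in $\{0, 1, \ldots, p^n-1\}$ that are coprime to p.
- There are $p^{n-1}$ multiples of p in this range.
- So $\Phi(p^n) = p^n - p^{n-1} = p^{n-1}(p-1)$.
- Therefore, combining this with the product rule for coprime factors, for $n = p_1^{k_1} \cdots p_r^{k_r}$ we get $\Phi(n) = \prod_{i=1}^{r} p_i^{k_i-1}(p_i-1)$.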
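As a sanity check on the prime-power formula, here is a small sketch that computes the totient from a trial-division factorization and compares it with direct counting. The function names are hypothetical, and trial division is only practical for small n.

```python
from math import gcd

def totient(n):
    """Totient via factorization: multiply p^(k-1) * (p-1) over prime powers p^k."""
    result, m, p = 1, n, 2
    while p * p <= m:
        if m % p == 0:
            pk = 1
            while m % p == 0:        # strip the full power of p
                m //= p
                pk *= p
            result *= (pk // p) * (p - 1)
        p += 1
    if m > 1:                        # a single prime factor remains
        result *= m - 1
    return result

def totient_by_counting(n):
    """Number of integers in {0, ..., n-1} that are coprime to n."""
    return sum(1 for a in range(n) if gcd(a, n) == 1)

assert totient(3233) == 60 * 52      # n = 61 * 53
```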
22(Fermat-)Euler Theorem
- Another property of the Euler totient function: for all a in $Z_n^*$, $a^{\Phi(n)} \equiv 1 \pmod{n}$.
- Why is this true?
- Because $Z_n^*$ is a group and $\Phi(n)$ is its size.
23Proof of Euler Theorem (1)
- In general, we show that, for any finite commutative group G and any a in G,
- $a^{|G|} = 1$
- Suppose the elements of G are $b_1, \ldots, b_{|G|}$. Consider $ab_1, \ldots, ab_{|G|}$.
- Clearly, any two elements $ab_i$ and $ab_k$ ($i \neq k$) in this sequence are not equal. (Otherwise $b_i$ and $b_k$ would also be equal.)
- So this is a permutation of $b_1, \ldots, b_{|G|}$.
24Proof of Euler Theorem (2)
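- Since $ab_1, \ldots, ab_{|G|}$ is a permutation of $b_1, \ldots, b_{|G|}$, we have
- $ab_1 \cdots ab_{|G|} = b_1 \cdots b_{|G|}$
- The above is equivalent to
- $a^{|G|}\, b_1 \cdots b_{|G|} = b_1 \cdots b_{|G|}$
- Which means
- $a^{|G|} = 1$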
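The permutation argument can be checked numerically for a small modulus. This is only an illustration with an arbitrarily chosen n = 20, not a proof.

```python
from math import gcd

n = 20
group = [b for b in range(1, n) if gcd(b, n) == 1]   # the reduced residue system mod n
size = len(group)                                    # the group size |G|

for a in group:
    shifted = sorted((a * b) % n for b in group)
    assert shifted == group          # multiplying by a only permutes the elements
    assert pow(a, size, n) == 1      # hence a^|G| = 1 (mod n)
```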
25Returning to Key Generation
- Recall we need to compute d such that for all m (in $Z_n^*$),
- $m^{ed-1} \equiv 1 \pmod{n}$
- By the Euler theorem, this is implied by
- $\Phi(n) \mid ed-1$
- which is equivalent to
- $ed \equiv 1 \pmod{\Phi(n)}$
26Returning to Computing d
- So the requirement (for all m, $m^{ed-1} \equiv 1 \pmod{n}$) is implied by $ed \equiv 1 \pmod{\Phi(n)}$. Strictly speaking, it is equivalent to the slightly weaker condition $ed \equiv 1 \pmod{\mathrm{lcm}(p-1, q-1)}$.
- So to compute d for key generation, it suffices to find d such that $ed \equiv 1 \pmod{\Phi(n)}$.
- Let $u = \Phi(n)$. Now the question is to find an algorithm that computes d such that $ed \equiv 1 \pmod{u}$ when given e and u.
27Compute Modular Inverse
- Recall the definition of 1/e mod something.
- Here we want exactly $d = 1/e \bmod u$.
- This is called the modular inverse.
- To compute d such that $ed \equiv 1 \pmod{u}$, we note that it is equivalent to the existence of v such that
- $ed + uv = 1$.
- Now the problem becomes: given e and u, find d (and v) such that $ed + uv = 1$.
28Euclidean Algorithm
- Suppose e is coprime to u.
- Then $\gcd(e,u) = 1$.
- The Euclidean algorithm finds $\gcd(e,u)$ given e and u.
- When extended, this algorithm also gives d, v such that $ed + uv = \gcd(e,u)$.
- So we can apply it here to find the d we want!
29How Euclidean Algorithm Works?
- Question: given e and u, we need to compute $\gcd(e,u)$.
- We note that $\gcd(e,u)$ divides e mod u and u mod e, since it divides both e and u.
- So we start from e and u.
- Repeat the division step:
- If the smaller number divides the greater number, stop and output the smaller.
- Otherwise, replace the greater number with (greater mod smaller).
30Why Euclidean Algorithm Works?
- It must stop.
- Otherwise, the sum of the two numbers would keep decreasing forever.
- But the sum is always a positive integer.
- This is impossible.
- When it stops, it gives $\gcd(e,u)$.
- Clearly, $\gcd(e,u)$ always divides the two variables in the algorithm, so it also divides the output.
- Then we only need to show that the output divides $\gcd(e,u)$.
31Output Divides gcd(e,u)
- Since all common divisors of e and u divide $\gcd(e,u)$, we only need to show that the output is a common divisor.
- In the last iteration, the output is a common divisor of the two variables.
- When we go back one step, the same property holds.
- Keep going until we reach e and u.
32Extended Euclidean Algorithm
- We still need to extend the Euclidean algorithm to find d, v such that $ed + uv = \gcd(e,u)$.
- Think of the first iteration of the Euclidean algorithm:
- We start from $e = 1 \cdot e + 0 \cdot u$ and $u = 0 \cdot e + 1 \cdot u$.
- Suppose $u > e$. Then we compute u mod e and use it to replace u.
- Here $u \bmod e = 1 \cdot u - (u \text{ div } e) \cdot e$.
- Note that we always know the coefficients of u and e.
- Keep going this way. In the end we have the coefficients of e and u for the output: these are d and v.
- Done with computing d! And done with key generation.
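A minimal sketch of the coefficient bookkeeping, assuming positive inputs. The names `extended_gcd` and `mod_inverse` are illustrative. Each tracked remainder is kept together with its coefficients of e and u.

```python
def extended_gcd(e, u):
    """Return (g, d, v) with e*d + u*v == g == gcd(e, u)."""
    # Every row (r, x, y) satisfies r == e*x + u*y.
    r0, x0, y0 = e, 1, 0     # e = 1*e + 0*u
    r1, x1, y1 = u, 0, 1     # u = 0*e + 1*u
    while r1 != 0:
        k = r0 // r1
        r0, r1 = r1, r0 - k * r1
        x0, x1 = x1, x0 - k * x1
        y0, y1 = y1, y0 - k * y1
    return r0, x0, y0

def mod_inverse(e, u):
    """d in [0, u) with e*d = 1 (mod u); requires gcd(e, u) == 1."""
    g, d, _ = extended_gcd(e, u)
    if g != 1:
        raise ValueError("e is not invertible mod u")
    return d % u

assert mod_inverse(17, 3120) == 2753   # 17 * 2753 = 15 * 3120 + 1
```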
33RSA Assumption
- The security of the RSA cryptosystem is based on the RSA assumption:
- For all efficient algorithms A, for uniformly random k-bit primes p, q, for $n = pq$, for e uniformly picked from $Z_{\Phi(n)}^*$, for m uniformly picked from $Z_n^*$,
- $\Pr[A(n, e, m^e \bmod n) = m]$
- is negligible.
34What does RSA Assumption imply? (1)
- Clearly, it implies that for a uniformly random message m, the adversary can't get any advantage in finding m when given the ciphertext.
- Does it imply that finding d is hard?
- Also clearly yes, since given d it is easy to compute m.
35What does RSA Assumption imply? (2)
- It implies that computing $\Phi(n)$ from n is hard.
- Otherwise, we can compute $d = 1/e \pmod{\Phi(n)}$ from $\Phi(n)$ and e using the extended Euclidean algorithm.
- It implies that factoring n is hard.
- Otherwise, we can factor n to get p and q, and then we get $\Phi(n) = (p-1)(q-1)$.
36What does RSA Assumption imply? (3)
- It implies that finding (a, b) with $a \not\equiv \pm b \pmod{n}$ and $a, b \not\equiv 0 \pmod{n}$ such that $a^2 \equiv b^2 \pmod{n}$ is hard.
- Otherwise, we have $a^2 - b^2 \equiv 0 \pmod{n}$, which means n divides $(a-b)(a+b)$.
- Without loss of generality, suppose $0 < b < a < n$. Then $0 < a-b < n$ and $0 < a+b < 2n$.
- Since $a \not\equiv -b \pmod{n}$, $a+b \neq n$, which means $a+b$ and $a-b$ each contain one of the two prime factors.
- Using the Euclidean algorithm, we can find $\gcd(a+b, n)$ and $\gcd(a-b, n)$, which are the two primes.
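A small worked instance of this argument, using the toy modulus 3233 = 61 * 53 and the hand-picked pair a = 794, b = 1 (illustrative values, chosen so that $a^2 \equiv b^2$ while $a \not\equiv \pm b$):

```python
from math import gcd

def factor_from_squares(a, b, n):
    """Given a^2 = b^2 (mod n) with a != +-b (mod n), return the two factors of n."""
    assert (a * a - b * b) % n == 0
    assert (a - b) % n != 0 and (a + b) % n != 0
    return gcd(a - b, n), gcd(a + b, n)

print(factor_from_squares(794, 1, 3233))   # (61, 53)
```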
37Factoring vs. RSA
- We have seen that the RSA assumption implies factoring is hard. But does the hardness of factoring imply the RSA assumption?
- This is an open question.
- Some results suggest that it probably does not.
38Security of RSA
- Under the RSA assumption, RSA is secure not only against a passive adversary (as we have explained).
- It is also secure against a Chosen Plaintext Attack.
- Here secure means that it is hard to find the cleartext (which is a weak notion).
- Chosen Plaintext Attack (CPA): a stronger model than the passive adversary.
- The active adversary can obtain the ciphertexts corresponding to the cleartexts he chooses.
- Any public key cryptosystem MUST be secure against CPA, by Kerckhoffs's principle: the encryption algorithm and key are public, so the adversary can encrypt anything himself.
39More on Active Adversary
- Chosen Ciphertext Attack (CCA):
- The adversary can obtain the cleartexts corresponding to the ciphertexts he chooses.
- However, this help is only available before the challenge ciphertext is given to the adversary.
- Adaptive Chosen Ciphertext Attack (CCA2):
- Unlike CCA, the help with decryption is always available.
- However, the adversary can't use this help to decrypt the ciphertext he is given. (Otherwise, no encryption algorithm could be secure.)
40Attack on RSA Short Message
- RSA is not randomized: each cleartext has only one ciphertext.
- If we know the message is only a few bits (say, 3 bits), we can test all possible cleartexts.
- We encrypt each of them and compare it with the ciphertext.
- The same attack works whenever the message is known to belong to any small set.
- Imagine encrypting your vote in a presidential election using RSA!
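A sketch of this exhaustive search against the toy public key (3233, 17). The key, the 3-bit "vote", and the function name are illustrative assumptions. Only public information is used.

```python
def brute_force_short_message(c, n, e, candidates):
    """Encrypt every candidate and compare; works because RSA is deterministic."""
    for m in candidates:
        if pow(m, e, n) == c:
            return m
    return None

n, e = 3233, 17                  # public key only; no private key needed
vote = 2                         # hypothetical 3-bit message
c = pow(vote, e, n)
assert brute_force_short_message(c, n, e, range(8)) == vote
```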
41Attack on RSA Meet in the Middle
- The attack on short messages can be enhanced by a meet-in-the-middle attack.
- With good probability, a short message can be factored into two even shorter messages: $m = m_1 m_2$.
- With RSA, $E(m) = m^e \bmod n = m_1^e\, m_2^e \bmod n = E(m_1)E(m_2) \bmod n$.
- Then we can encrypt the even shorter messages and compare their product with the ciphertext.
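One way to organize the comparison is to tabulate E(m1) for all small m1 and then look up c / E(m2) for each small m2. The sketch below assumes the toy key (3233, 17) and a bound of 64 on each factor; the function name is hypothetical, and the modular inverse via `pow` needs Python 3.8+.

```python
def meet_in_the_middle(c, n, e, bound):
    """Find m = m1*m2 with 1 <= m1, m2 <= bound and m^e = c (mod n), if any."""
    # Table mapping E(m1) -> m1 for every small m1.
    table = {pow(m1, e, n): m1 for m1 in range(1, bound + 1)}
    for m2 in range(1, bound + 1):
        try:
            inv = pow(pow(m2, e, n), -1, n)    # 1 / E(m2) mod n
        except ValueError:
            continue                           # E(m2) not invertible mod n
        target = (c * inv) % n                 # should equal E(m1)
        if target in table:
            return table[target] * m2
    return None

n, e = 3233, 17
m = 35 * 41                      # a message that splits into two small factors
assert meet_in_the_middle(pow(m, e, n), n, e, 64) == m
```

This takes about 2 * 64 encryptions instead of trying all 64 * 64 products.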
42Attack in Real Life
- Such an attack is especially effective when RSA is used to encrypt a DES key.
- A DES key is only 56 bits.
- Thus finding the DES key is easy with a meet-in-the-middle attack.
- We can also use it to attack RSA-encrypted passwords.
- So do NOT simply use RSA to encrypt passwords and DES keys.
43Attack on Short Message and RSA Security
- Why is the above attack possible while RSA is secure?
- Because RSA is secure only in the sense that it is hard to find the cleartext.
- Not in the sense that it is hard to find any partial information about the cleartext (i.e., it is not semantically secure).
- With a short message, we essentially already know everything about the message except a small part.
- This remaining small part can't be protected by RSA.
44Attack on RSA CCA2
- Consider an active adversary in the CCA2 model.
- The adversary is given ciphertext c, but he can't use the help of decryption to get the cleartext of c.
- However, he can choose m' and compute $c' = c\,(m')^e$.
- Then he asks for the decryption of c'.
- He gets $(c')^d = (c\,(m')^e)^d = c^d\,(m')^{ed} = c^d\, m'$
- So he can compute $c^d$ from $(c')^d$ by dividing by m' (mod n).
- Thus RSA is not secure against CCA2.
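The blinding trick can be sketched with a simulated decryption oracle. The toy key, the secret value, and all names below are illustrative assumptions. The oracle refuses only the challenge ciphertext itself.

```python
n, e, d = 3233, 17, 2753          # toy key; d is used only inside the oracle

def decryption_oracle(c_query, c_challenge):
    """CCA2 oracle: decrypts anything except the challenge ciphertext."""
    if c_query == c_challenge:
        raise ValueError("refusing to decrypt the challenge")
    return pow(c_query, d, n)

secret = 1234
c = pow(secret, e, n)             # challenge ciphertext given to the adversary

m_blind = 2                       # adversary's chosen m', coprime to n
c_blind = (c * pow(m_blind, e, n)) % n          # c' = c * (m')^e
answer = decryption_oracle(c_blind, c)          # equals c^d * m' mod n
recovered = (answer * pow(m_blind, -1, n)) % n  # divide by m' (Python 3.8+)
assert recovered == secret
```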
45Common Misuse of RSA
- In network security, to accelerate computation, people often use a small exponent e.
- Usually e = 3, sometimes e = 11.
- Many protocols/systems include such examples.
- Is this secure?
- Unfortunately, a small e enables efficient attacks in several settings (for example, on short or related messages).
- Efficient algorithms have been published for such e (e.g., those based on the Coppersmith algorithm).
- Similarly, using a small d is also subject to attack.
46Rabin Cryptosystem
- Another popular public key cryptosystem.
- Invented by Michael Rabin (Turing Award winner).
- Based on the hardness of computing square roots mod n.
47Rabin Key Generation
- Choose two primes p, q of the same length.
- Compute $n = pq$.
- Pick an integer b in $Z_n$.
- Public key: (n, b)
- Private key: (p, q)
48Rabin Encryption Decryption
- For cleartext message m, the ciphertext is
- $c = m(m+b) \bmod n$
- For ciphertext c, the cleartext is
- $m = \left(-b + \sqrt{b^2+4c}\right)/2 \bmod n$
- where $\sqrt{b^2+4c}$ is a square root of $b^2+4c$ mod n that is less than n and greater than 0, and dividing by 2 means multiplying by 1/2 mod n.
49Computing Square Root Mod n
- For decryption, we need $\sqrt{b^2+4c}$, a square root of $b^2+4c$ mod n.
- First, we compute a square root of $b^2+4c$ mod p. Suppose this is $r_1$.
- Second, we compute a square root of $b^2+4c$ mod q. Suppose this is $r_2$.
- Third, we compute r such that $r \equiv r_1 \pmod{p}$ and $r \equiv r_2 \pmod{q}$.
- r is what we want.
50Chinese Remainder Theorem
- Why is r what we want? This is based on the Chinese Remainder Theorem:
- For $n = n_1 \cdots n_k$ (where $n_1, \ldots, n_k$ are pairwise coprime), $r \equiv x \pmod{n}$ if and only if $r \equiv x \pmod{n_1}, \ldots, r \equiv x \pmod{n_k}$.
51Computing Square Root Mod Prime
- But we still need an algorithm to compute a square root of $b^2+4c$ mod a prime (where the prime is p and q, respectively).
- When the prime is $p = 4k+3$, it is easy to compute square roots mod p:
- $r_1 = \pm (b^2+4c)^{(p+1)/4} \pmod{p}$
- For $p = 4k+1$ we also have algorithms. But we often have both p and q $\equiv 3 \pmod{4}$; in this case, n is called a Blum integer.
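A sketch of the square-root formula for primes of the form 4k+3. The function name is an assumption. The final check matters because the formula only yields a root when the input actually has one.

```python
def sqrt_mod_prime_3mod4(a, p):
    """A square root of a modulo a prime p = 4k+3, or None if a has no root."""
    assert p % 4 == 3
    r = pow(a, (p + 1) // 4, p)
    return r if (r * r - a) % p == 0 else None

assert sqrt_mod_prime_3mod4(2, 7) == 4      # 4*4 = 16 = 2 (mod 7)
assert sqrt_mod_prime_3mod4(3, 7) is None   # 3 is not a square mod 7
```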
52Proof for Square Root Mod Prime (1)
- Why is $r_1$ a square root?
- It is easy to see that
- $(r_1)^2 \equiv (b^2+4c)^{(p+1)/2} \pmod{p}$
- Since $b^2+4c$ does have square roots, $(b^2+4c)^{1/2}$ is meaningful. So we can rewrite the above as
- $(r_1)^2 \equiv \left((b^2+4c)^{1/2}\right)^{p+1} \pmod{p}$
53Proof for Square Root Mod Prime (2)
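- Recall $\Phi(p) = p-1$. So we have
- $(r_1)^2 \equiv \left((b^2+4c)^{1/2}\right)^{p+1} \pmod{p}$
- $\equiv \left((b^2+4c)^{1/2}\right)^{\Phi(p)+2} \pmod{p}$
- By the Euler theorem,
- $\equiv \left((b^2+4c)^{1/2}\right)^{2} \pmod{p}$
- $\equiv b^2+4c \pmod{p}$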
54Number of Roots
- Note that we have two square roots mod each prime.
- So there are four square roots mod n.
- Which of these four is the original cleartext?
- We should add redundancy to the cleartext, so that it can be recognized.
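Putting the pieces together, the sketch below encrypts with $c = m(m+b) \bmod n$ and returns every candidate cleartext, one per square root of $b^2+4c$. The toy parameters p = 43, q = 47, b = 5 and all function names are assumptions for illustration; both primes are 3 mod 4 so the exponent formula applies, and `pow(x, -1, n)` needs Python 3.8+.

```python
def crt_pair(r1, p, r2, q):
    """The r in [0, pq) with r = r1 (mod p) and r = r2 (mod q)."""
    return (r1 * q * pow(q, -1, p) + r2 * p * pow(p, -1, q)) % (p * q)

def rabin_encrypt(m, n, b):
    return (m * (m + b)) % n

def rabin_decrypt_candidates(c, p, q, b):
    """All m with m(m+b) = c (mod pq), for primes p, q = 3 (mod 4)."""
    n = p * q
    disc = (b * b + 4 * c) % n
    r1 = pow(disc, (p + 1) // 4, p)      # a square root mod p
    r2 = pow(disc, (q + 1) // 4, q)      # a square root mod q
    half = pow(2, -1, n)                 # 1/2 mod n (n is odd)
    roots = {crt_pair(s1 * r1, p, s2 * r2, q)
             for s1 in (1, -1) for s2 in (1, -1)}
    return sorted(((R - b) * half) % n for R in roots)

p, q, b = 43, 47, 5
n = p * q
c = rabin_encrypt(1000, n, b)
candidates = rabin_decrypt_candidates(c, p, q, b)
assert 1000 in candidates                # the true cleartext is among them
```

Without redundancy in the message, nothing in the ciphertext says which candidate was sent.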
55Security of Rabin
- It is hard to find a cleartext of the Rabin cryptosystem under CPA.
- This is true under the assumption that factoring n is hard.
- Why?
- Suppose there is an efficient algorithm A that computes m from n, b, c.
- Then we can construct an efficient algorithm A' that factors n.
56Reduction of Factorization to Rabin Decryption
- Choose m at random.
- Compute $c = m(m+b) \bmod n$.
- Use A to decrypt c; suppose we get m'.
- Claim: $\gcd(m-m', n)$ is a prime factor of n with probability 1/2.
- But why? We need a closer look at the cleartext.
57Closer Look at Cleartext
- What is m'? Is it necessarily equal to m?
- Not really.
- Recall the decryption is $\left(-b + \sqrt{b^2+4c}\right)/2$.
- There are four possible roots of $b^2+4c$.
- So there are four possible cleartexts, one of which is equal to m, and one of which is equal to m'.
- No algorithm can distinguish m from the other three cleartexts, given n, b, c.
- This is because you get the same c when you start from n, b, and any of the other three cleartexts.
58Four Possible Cleartexts
- What are these four possible cleartexts?
- Let $r_1$ be a square root of $b^2+4c$ mod p.
- Let $r_2$ be a square root of $b^2+4c$ mod q.
- By the Chinese remainder theorem:
- $m_1$ corresponds to a root $R_1$ such that $R_1 \equiv r_1 \pmod{p}$, $R_1 \equiv r_2 \pmod{q}$.
- $m_2$: $R_2 \equiv -r_1 \pmod{p}$, $R_2 \equiv r_2 \pmod{q}$
- $m_3$: $R_3 \equiv r_1 \pmod{p}$, $R_3 \equiv -r_2 \pmod{q}$
- $m_4$: $R_4 \equiv -r_1 \pmod{p}$, $R_4 \equiv -r_2 \pmod{q}$
59Grouping Cleartexts
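- We put the four cleartexts (and the corresponding roots) on the corners of a rectangle.
- If we are lucky enough to get two cleartexts that are neighbors, we can factor n.
[Figure: rectangle whose corners, in order around it, are m1 (R1), m2 (R2), m4 (R4), m3 (R3); neighboring corners differ in the sign of exactly one of r1, r2.]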
60Factoring n based on Two Neighbor Cleartexts
- Note that any two neighbor roots are
- Modular equivalent mod one prime factor
- Modular negative mod the other prime factor.
- So their difference is
- 0 mod the first prime factor
- Non-zero mod the second prime factor.
- Thus gcd(difference, n) is the first prime factor.
61Back to m and m'
- The difference between two roots is 2 times the difference between the corresponding cleartexts (each root is $R = 2m + b$).
- $\gcd(R_1 - R_2, n) = \gcd(2(m_1 - m_2), n)$
- $= \gcd(m_1 - m_2, n)$, since n is odd.
- So we succeed in factoring if m and m' are neighbors on the rectangle.
62Probability of Success
- What is the probability of success?
- m is uniformly distributed among the four cleartexts, independently of m'.
- Two of the four cleartexts are neighbors of m', so the probability is 1/2.
- Done with the security of the Rabin cryptosystem.
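The reduction can be sketched as a loop that repeats the experiment until two neighboring cleartexts turn up. Here `toy_decryptor` is a brute-force stand-in for the hypothetical algorithm A (feasible only because the modulus is tiny), and the parameters 43 * 47 and b = 5 are illustrative assumptions.

```python
import random
from math import gcd

def toy_decryptor(c, n, b):
    """Stand-in for algorithm A: returns some m' with m'(m'+b) = c (mod n)."""
    return next(m for m in range(n) if (m * (m + b)) % n == c)

def factor_with_decryptor(n, b, decryptor, rng, max_tries=64):
    """Factor n using only calls to a Rabin decryptor, as in the reduction."""
    for _ in range(max_tries):
        m = rng.randrange(n)                 # choose m at random
        c = (m * (m + b)) % n                # encrypt it
        m_prime = decryptor(c)               # ask A for a cleartext
        g = gcd(m - m_prime, n)
        if 1 < g < n:                        # m and m' were neighbors
            return g, n // g
    return None

n, b = 43 * 47, 5
result = factor_with_decryptor(n, b, lambda c: toy_decryptor(c, n, b),
                               random.Random(0))
assert result is not None and sorted(result) == [43, 47]
```

Each round succeeds with probability about 1/2, so a handful of rounds is typically enough.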
63Semantic Insecurity.
- Although the Rabin cryptosystem makes it hard to find the cleartext, it does not provide a stronger security guarantee.
- Just like RSA, the Rabin cryptosystem is deterministic.
- Each cleartext has a unique encryption.
- Thus it can't be semantically secure.
64Insecurity against CCA
- Recall that the security is w.r.t. CPA.
- It no longer holds w.r.t. CCA.
- With CCA, the adversary can ask for decryptions.
- So he picks a message at random, encrypts it, and asks for the decryption.
- Then he factors n using his originally picked cleartext and the decryption he gets.
- The Rabin cryptosystem is totally broken.
65Optional Topic Formal Treatment of Public Key
Cryptosystem
- Public key cryptosystems like RSA and Rabin can be modeled as trapdoor one-way functions.
- A trapdoor one-way function is actually a family of functions $\{f_i\}$ with efficient algorithms I, D, F and two security properties.
- For convenience, in the following we assume all $f_i$ are bijections.
66Trapdoor One-way Function (1)
- The three efficient algorithms:
- F computes $f_i(x)$ from i and x.
- D samples x from the domain of $f_i$.
- I takes the security parameter k as input and outputs an index i and the corresponding trapdoor t.
67Trapdoor One-way Function (2)
- Security property: hardness to invert.
- For index i (distributed as the output of I) and x (distributed as the output of D), for all efficient algorithms A, for all polynomials p(), for all sufficiently large k,
- $\Pr[A(f_i(x)) = x] < 1/p(k)$
- (This definition only works for $f_i$ with large domains.)
68Trapdoor One-way Function (3)
- Security property: easiness to invert with the trapdoor.
- There exists an efficient algorithm A such that, for all (i, t) in the range of I, for all x,
- $A(f_i(x), t) = x$
69Trapdoor OWF and PKC
- Index: encryption key.
- Trapdoor: decryption key.
- I: key generation algorithm.
- D: cleartext sampling algorithm.
- F: encryption algorithm.
- Hardness to invert: security guarantee of the PKC.
- Easiness to invert with trapdoor: existence of a decryption algorithm.