Title: PART FIVE: Security issues in distributed systems
1 PART FIVE: Security issues in distributed systems
2 Suggested reading
- Crittografia, P. Ferragina and F. Luccio, Ed. Bollati Boringhieri, 16.
3 Roadmap
- Introduction
- Computer security vs network security
- Attacks to security
- Cryptography
- Private key (symmetric) cryptography
- Public key (asymmetric) cryptography
4 Computer security vs network (i.e., distributed systems) security
- Computer security aims at protecting information within a computer.
- Network security aims at protecting the exchange of information among computers.
5 Basic problems in network security
- Networks are insecure because most communications occur in the clear (e.g., e-mail, http, ftp, ...).
- There are no physical point-to-point connections; data travel
  - through shared lines
  - through third-party routers
- Often there is no authentication of routers, but only (and not always) of users.
- New security issues: e-commerce, e-banking, e-government, etc.
6 Attacks to information security
7 Information security: main issues
- Confidentiality: prevent the data sent from person A to person B from being understood by a third party C.
- Authentication: verify the identity of whoever sends or receives data.
- Integrity: be sure that the data received are identical to those sent.
- Non-repudiation: prevent users who send data from later denying that they sent them (digital signature).
8 Ensuring security: cryptography
- To address the above security issues, one has to make use of cryptographic techniques, i.e., of communication protocols that overcome the influence of adversaries.
- Cryptography (from the Greek kryptos, hidden, and graphein, writing) is "the discipline that in the old days dealt with the study of secret writings; today it is a set of techniques that permit the construction of an encrypted text and the decryption of a cryptogram" (Garzanti, 1972).
9 Cryptography: a brief history
- Cryptography was used in antiquity to hide the content of text messages (in modern terms, to ensure confidentiality).
- Cryptography experienced a tremendous development during the Second World War, when the British mathematician Alan Turing formalized the theory needed to decrypt the Nazi German cryptosystem known as Enigma.
- In 1949, Claude Shannon published a paper ("Communication Theory of Secrecy Systems") that applied his newly founded Theory of Information to secrecy systems; Information Theory, along with Probability, Computational Complexity, and Number Theory, forms the basis of modern cryptography.
- Modern cryptography is concerned with all the aspects (theoretical, computational, implementation-related) of information security.
10 Cryptosystem
- Definition: A cryptosystem is a quintuple $(P, C, K, \mathrm{Cod}, \mathrm{Dec})$, where
  - $P$: finite set of plaintexts
  - $C$: finite set of ciphertexts
  - $K$: set of possible encryption keys
  - $\mathrm{Cod}: P \times K \to C$: encryption function (injective, and hence invertible, for each fixed key)
  - $\mathrm{Dec}: C \times K \to P$: decryption function
- Kerckhoffs' principle: a cryptosystem should be secure even if everything about the system, except the key, is public knowledge; thus, all its strength is based on the inviolability of the key.
- It can be rephrased as Shannon's maxim: "the enemy knows the system!"
11 Symmetric vs asymmetric cryptosystems
- If Cod and Dec use the same key to encrypt and to decrypt a given text, we talk about symmetric cryptosystems, otherwise about asymmetric cryptosystems.
- Symmetric cryptosystem: since the same key is used to encrypt and to decrypt data, sender A and receiver B must share such a key.
- ⇒ Since this key must be kept secret, the main problem is the key exchange!
12 Symmetric key scenario
13 The problem of transmitting the key
- Q: If you want to use a symmetric cipher to protect the data flow between two parties, how do you exchange the secret key?
- A: You must use a secure channel of communication!
14 A first example of a symmetric key cipher: the Caesar cipher
- Let us consider the Italian alphabet (21 letters), and let us construct a cipher that replaces each letter of the alphabet by the letter which is 3 (this is the key!) positions forward, wrapping around at the end of the alphabet.
- For example, the clear text "distributed algorithms" is encrypted in the cryptogram "gnvzuneazhg dolrunzmpv".
- However, as most of the simple ciphers based on transpositions and/or shifting, it can be easily attacked by means of statistical analysis.
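The shift rule above can be sketched in a few lines of Python. This is only an illustration: the constant `ITALIAN_ALPHABET` and the function name `caesar` are assumptions introduced here, and the 21-letter Italian alphabet (no j, k, w, x, y) is the one the example cryptogram is consistent with.

```python
# Italian alphabet (21 letters: no j, k, w, x, y), assumed from the slide's example.
ITALIAN_ALPHABET = "abcdefghilmnopqrstuvz"


def caesar(text: str, key: int, alphabet: str = ITALIAN_ALPHABET) -> str:
    """Shift every letter of `text` by `key` positions, wrapping around."""
    out = []
    for ch in text:
        if ch in alphabet:
            out.append(alphabet[(alphabet.index(ch) + key) % len(alphabet)])
        else:
            out.append(ch)  # spaces and letters outside the alphabet are left untouched
    return "".join(out)


ciphertext = caesar("distributed algorithms", 3)
plaintext = caesar(ciphertext, -3)  # decrypting is a shift by -key
```

Decryption reuses the same function with the opposite shift, which is exactly why sender and receiver must share the key.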
15 Statistical cryptanalysis
- The plaintext is obtained by applying statistical techniques to the frequency of characters or substrings of the ciphertext, compared with the corresponding frequencies in the underlying language.
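As a rough illustration of the idea, the following sketch recovers the key of a shift cipher from letter frequencies alone. It assumes an English plaintext over the 26-letter alphabet whose most frequent letter is "e"; the helper names `shift` and `guess_key` and the sample sentence are made up for this example, and on short or atypical texts the guess can be wrong.

```python
from collections import Counter
import string

ALPHABET = string.ascii_lowercase


def shift(text: str, key: int) -> str:
    """Shift cipher over the 26-letter English alphabet."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + key) % 26] if ch in ALPHABET else ch
        for ch in text
    )


def guess_key(ciphertext: str, most_common_plain: str = "e") -> int:
    """Assume the most frequent ciphertext letter is the image of `most_common_plain`."""
    counts = Counter(ch for ch in ciphertext if ch in ALPHABET)
    top_letter = counts.most_common(1)[0][0]
    return (ALPHABET.index(top_letter) - ALPHABET.index(most_common_plain)) % 26


secret = shift("meet me here when the western tree sheds seven leaves", 3)
key = guess_key(secret)          # the attacker never sees the real key
recovered = shift(secret, -key)
```

The attack works because a shift cipher only relabels the letters: the frequency profile of the language survives encryption unchanged.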
16 Perfect cryptosystems
- A cryptosystem is called perfect if plaintext and ciphertext are statistically independent.
- More formally, if we denote by $P(m)$ the probability that a message from A to B contains $m$, and by $P(m \mid c)$ the probability that the message is equal to $m$ after observing an encrypted message $c$, then a cipher is perfect iff for any $m \in M$ and for any $c \in C$ it holds that $P(m \mid c) = P(m)$.
17 A necessary condition to be perfect
- Theorem (Shannon): A cryptosystem is perfect only if $|K| \geq |M|$.
- Proof: For the sake of contradiction, assume the existence of a perfect cryptosystem with $|K| < |M|$. By the injectivity of the Cod function, $|M| \leq |C|$, i.e., $|K| < |C|$. Let $m$ be a message s.t. $P(m) = p \neq 0$. Then, $m$ can generate at most $|K| < |C|$ ciphered messages (one for each key). It follows that there exists at least one ciphered message $c$ which is not an image of $m$, namely
  $P(m \mid c) = 0 \neq p = P(m)$
  against the assumption of perfectness. □
18 Two very imperfect ciphers
- Assume that $P(m) = p$, $0 < p < 1$, and that $P(m \mid c) = 0 \neq p$; then, a cryptanalyst who sees $c$ is able to infer that the encrypted message is not $m$!
- Assume now that $P(m) = p$, $0 < p < 1$, and that $P(m \mid c) = 1 \neq p$; then, a cryptanalyst who sees $c$ is able to infer that the encrypted message is exactly $m$!
- In all the intermediate cases in which $P(m \mid c) \neq p$, a cryptanalyst is able to infer something by observing the ciphered messages!
19 A perfect cryptosystem
- One-time pad (Vernam G., AT&T, 1917)
- Builds a large random (and not pseudorandom) key, for example using a detector of cosmic rays.
- The ciphertext is constructed by a bitwise XOR between the plaintext $m$ and the key $k$ (recall $1 \oplus 0 = 0 \oplus 1 = 1$ and $1 \oplus 1 = 0 \oplus 0 = 0$).
- A sends $c = m \oplus k$
- B decrypts $m = c \oplus k$ (indeed $x \oplus y \oplus y = x$)
- The key should never be reused (one-time pad).
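A minimal sketch of the scheme on bytes follows. Note the assumption: `secrets.token_bytes` draws from the operating system's cryptographic random generator, which here only stands in for the truly random source (e.g., cosmic rays) the slide asks for; the helper name `xor_bytes` and the sample message are illustrative.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("the pad must be exactly as long as the message")
    return bytes(x ^ y for x, y in zip(a, b))


message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # fresh pad, to be used only once
ciphertext = xor_bytes(message, key)     # A sends c = m XOR k
decrypted = xor_bytes(ciphertext, key)   # B recovers m = c XOR k
```

The length check makes the practical cost visible: the pad has to be as long as everything that will ever be sent with it.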
20 One-time pad is perfect!
- We have to show that $P(m \mid c) = P(m)$. Let $m$ and $c$ be of $n$ bits; by definition of conditional probability, we have
  $P(m \mid c) = P(m \cap c) / P(c)$
  where $P(m \cap c)$ is the probability that A generates $m$ and ciphers it as $c$; then
  $P(m \cap c) = P(m \cap \{c = m \oplus k\})$
  and since $k$ is truly random, the XOR transforms a 0 of $m$ into a 0 of $c$ with probability 1/2, a 0 into a 1 with probability 1/2, a 1 into a 0 with probability 1/2, and finally a 1 into a 1 with probability 1/2; it follows that any bit of $c$ is statistically independent of the corresponding bit of $m$. Thus, $m$ and $c$ are independent, and so
  $P(m \cap \{c = m \oplus k\}) = P(m) \, P(c)$
  from which it follows that $P(m \mid c) = P(m) P(c) / P(c) = P(m)$. □
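The argument can be checked exhaustively on a tiny instance. The sketch below takes 2-bit messages with a deliberately non-uniform distribution (the values in `p_m` are arbitrary assumptions) and a uniformly random 2-bit key, and computes $P(m \mid c)$ exactly with fractions.

```python
from fractions import Fraction
from itertools import product

N_BITS = 2
values = range(2 ** N_BITS)

# Hypothetical, deliberately non-uniform message distribution P(m).
p_m = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
p_k = Fraction(1, 2 ** N_BITS)  # every key is equally likely

# Joint probability P(m and c), summing over the keys that map m to c.
joint = {}
for m, k in product(values, values):
    c = m ^ k
    joint[(m, c)] = joint.get((m, c), Fraction(0)) + p_m[m] * p_k

p_c = {c: sum(joint.get((m, c), 0) for m in values) for c in values}
posterior = {(m, c): joint[(m, c)] / p_c[c] for (m, c) in joint}

# Perfect secrecy: observing c does not change the probability of m.
is_perfect = all(posterior[(m, c)] == p_m[m] for (m, c) in posterior)
```

In this instance every ciphertext turns out to be equally likely, whatever the message distribution is, which is the independence used in the proof.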
21 From perfection to reality ...
- One-time pad is only theoretically perfect: how do A and B actually exchange the key k? If they exchange it a priori (without using a traditional communication channel), then its length bounds the length of the messages that can be encrypted! (Notice, however, that one-time pad was used for the Moscow-Washington red line.)
- Instead of being perfect (i.e., provably secure but practically unusable), the cryptosystems used in practice are computationally secure: the cryptanalytic problem (namely, the decryption of a ciphertext without knowing the key) is believed to be computationally intractable.
22 The state-of-the-art in symmetric key encryption: Rijndael
- Developed by Joan Daemen and Vincent Rijmen.
- This algorithm won the selection for the Advanced Encryption Standard (AES) in 2000. Officially, the Rijndael method has become the standard for symmetric key encryption of the 21st century.
- The cipher uses keys of variable length (128, 192, 256 bits), and a "shuffling network" for the message, in which multiple operations of transposition, substitution, and XORing of blocks of fixed length are performed. Keys are exchanged by encrypting them with RSA cryptography (to be seen next).
23 Limits of symmetric key ciphers
- Does a secure channel of communication to exchange the secret key actually exist in reality? And if it does exist, why use encryption at all?
- In addition, for secure communication between n users, one must exchange a quadratic number of keys (one per pair of users, i.e., $n(n-1)/2$)!
- Finally, the symmetric method does not distinguish between sender and receiver, and so it cannot address security issues like the authentication and the non-repudiation of a message.
24 Asymmetric key (a.k.a. public key) algorithms
- Each subject S has a pair of keys
  - A public key Kpu(S), known to all
  - A private key Kpr(S), known only to S.
- The requirements that a public key algorithm must satisfy are
  - Data encoded with one key can be decrypted only with the other one
  - The private key should never be transmitted over the network
  - It must be very difficult to derive one key from the other (in particular, the private key from the public key).
25 The various public key scenarios
First scenario: A encodes a message with the public key associated with B, which then decodes the message by using its own private key; in this way, confidentiality and integrity are guaranteed (only B can read the message).
26 The various public key scenarios
Second scenario: A signs a message by encoding it with its own private key, and then sends it to B, which then authenticates the message by using the public key associated with A; in this way, authenticity and non-repudiation are guaranteed (all can read the message, but only A can have signed it).
27 The various public key scenarios
Third scenario: A signs a message by encoding it with its own private key, then re-encodes it with the public key associated with B, and sends it to B; B decodes it by using its own private key, and then authenticates it by using the public key associated with A. In this way, confidentiality, integrity, authenticity, and non-repudiation are guaranteed.
28 The birth of PKI systems
- Where do I find the public keys of my recipients?
- Creation of archives of public keys, the public key servers.
- But who guarantees the correspondence of public keys with the respective owners?
- ⇒ Birth of the Certification Authority (CA).
- At this point, who guarantees the validity of a certification authority?
- Act of faith!
29 The mathematics of public key systems
- Public key cryptography was introduced by Diffie and Hellman in 1976.
- Definition: A function f is called one-way if for every x the computation of $y = f(x)$ is simple (i.e., it is in P), while the calculation of $x = f^{-1}(y)$ is computationally hard (i.e., no efficient algorithm for it is known).
- Definition: A one-way function is called trapdoor if the calculation of $x = f^{-1}(y)$ can be made easy once some additional (private) information is known.
- ... But unfortunately for them, they were not able to build a one-way trapdoor function!
30 The RSA algorithm
- Designed in 1977 by Ron Rivest, Adi Shamir and Leonard Adleman; the cipher was patented, and became public domain only in 2000.
- Basic idea: given two (very large) prime numbers p and q, it is easy to calculate the product $n = pq$, while it is very difficult to compute the factorization of n (although this problem is not known to be NP-hard).
- The best factorization algorithms currently available (Quadratic Sieve, Elliptic Curve Method, Pollard's heuristic, etc.) all have a complexity that is super-polynomial in the number of digits of n, in the order of $e^{\sqrt{\ln n \, \ln \ln n}}$.
31 The RSA algorithm
- To ensure security, it is necessary that p and q have at least 200 decimal digits. Indeed, in this way $n = pq$ is 400 digits long, namely it is in the order of $10^{400}$, and so the factorization cost is in the order of
  $e^{\sqrt{\ln n \, \ln \ln n}} \approx e^{79} \approx 10^{34}$
  which is computationally intractable.
- ⇒ Keys are typically 1024 bits long ($2^{1024} \approx 10^{308}$).
- RSA is much slower than symmetric key algorithms, and it is often applied for the transmission of small amounts of data, like the secret key of a symmetric key system (as noticed before).
32 RSA at work: key generation
Recall: $x \equiv y \bmod z$ if and only if the remainder of the integer division of x by z and of y by z is the same, namely $x \bmod z = y \bmod z$ (or, equivalently, there exists an integer k s.t. $x = y + kz$).
- 1. Choose two large primes p and q and compute $n = pq$.
- 2. Compute the Euler totient function w.r.t. n, i.e., the number of positive integers less than n and coprime with it:
  $\varphi(n) = \varphi(pq) = pq - (q-1) - (p-1) - 1 = pq - (p+q) + 1 = (p-1)(q-1) = \varphi(p)\,\varphi(q)$
  (since there are q-1 multiples of p less than n, and p-1 multiples of q less than n).
- 3. Choose a number $0 < e < \varphi(n)$ s.t. $\mathrm{GCD}(e, \varphi(n)) = 1$ (i.e., e and $\varphi(n)$ are coprime).
- 4. Define the public key as (e,n).
- 5. Compute d such that $ed \equiv 1 \bmod \varphi(n)$.
- 6. Define the private key as (d,n).
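Steps 1 to 6 can be sketched as follows. The function name `generate_keys` and the toy values p = 61, q = 53, e = 17 are assumptions chosen for illustration; the sketch trusts the caller to pass primes, and real keys use primes of hundreds of digits.

```python
from math import gcd


def generate_keys(p: int, q: int, e: int):
    """Follow steps 1-6 for given primes p, q and a candidate exponent e."""
    n = p * q                      # step 1
    phi = (p - 1) * (q - 1)        # step 2: Euler totient of n
    if not (0 < e < phi and gcd(e, phi) == 1):
        raise ValueError("e must lie in (0, phi(n)) and be coprime with phi(n)")
    d = pow(e, -1, phi)            # step 5: modular inverse (Python 3.8+)
    return (e, n), (d, n)          # steps 4 and 6: public and private key


public_key, private_key = generate_keys(61, 53, 17)
```

Only `(e, n)` is published; `p`, `q`, `phi` and `d` must stay private, since any of them reveals the private key.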
33 RSA at work
A sends an encrypted message x to B
- The encryption function of A is $\mathrm{Cod}(x) = x^e \bmod n$ (with $x < n$), where (e,n) is the public key of B.
- The decryption function of B is
  $\mathrm{Dec}(\mathrm{Cod}(x)) = \mathrm{Cod}(x)^d \bmod n = (x^e \bmod n)^d \bmod n$
  where (d,n) is the private key of B.
A sends a signed (i.e., non-repudiable) message x to B
- The encryption function of A is $\mathrm{Cod}(x) = x^d \bmod n$ (with $x < n$), where (d,n) is the private key of A.
- The decryption function of B is
  $\mathrm{Dec}(\mathrm{Cod}(x)) = \mathrm{Cod}(x)^e \bmod n = (x^d \bmod n)^e \bmod n$
  where (e,n) is the public key of A.
⇒ Notice that public and private keys can be used interchangeably, since $\mathrm{Dec}(\mathrm{Cod}(x)) = \mathrm{Cod}(\mathrm{Dec}(x))$.
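Both uses of the key pair reduce to the same modular exponentiation, as the following sketch suggests. The key values (a toy pair built from p = 61 and q = 53) and the helper name `apply_key` are assumptions for illustration only.

```python
# Toy key pair for p = 61, q = 53 (assumed values, far too small for real use).
E, D, N = 17, 2753, 3233


def apply_key(x: int, key: tuple) -> int:
    """Compute x^exponent mod n for a key (exponent, n); requires 0 <= x < n."""
    exponent, n = key
    if not 0 <= x < n:
        raise ValueError("the message block must be smaller than n")
    return pow(x, exponent, n)


x = 65

# Confidentiality: encode with the public key, decode with the private key.
ciphertext = apply_key(x, (E, N))
decrypted = apply_key(ciphertext, (D, N))

# Signature: encode with the private key, check with the public key.
signature = apply_key(x, (D, N))
verified = apply_key(signature, (E, N))
```

The only difference between the two scenarios is which key is applied first, which is the interchangeability noted above.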
34 Correctness of RSA: some theorems of modular algebra
- Theorem (modular equations): Equation $ax \equiv b \bmod n$ has a solution iff GCD(a,n) divides b. In such a case, there are exactly GCD(a,n) distinct solutions modulo n.
- Corollary (existence of the inverse): If a and n are coprime, then $ax \equiv 1 \bmod n$ has exactly one positive solution less than n, known as the inverse of a modulo n.
- Euler's Theorem: For any $n > 1$, and for any a coprime with n, we have that $a^{\varphi(n)} \equiv 1 \bmod n$.
35 Correctness of RSA
- Notice that e and $\varphi(n)$ are coprime, and so from the corollary on the existence of the inverse, there exists a unique d less than $\varphi(n)$ s.t. $ed \equiv 1 \bmod \varphi(n)$.
- Here is the strength of RSA: to compute d from e one must know $\varphi(n)$, i.e., p and q, and so one must be able to factorize efficiently!
- Secrecy: we must prove that $\forall x < n$, $\mathrm{Dec}(\mathrm{Cod}(x)) = x$. But
  $\mathrm{Dec}(\mathrm{Cod}(x)) = (x^e \bmod n)^d \bmod n = x^{ed} \bmod n$,
  and so we have to show that $x = x^{ed} \bmod n$. Prove it!
- We distinguish two cases:
  - p and q do not divide x (and so GCD(p,x) = GCD(q,x) = 1, since they are prime)
  - p (or q) divides x, but q (or p) does not divide x.
  - (notice that p and q cannot both divide x, since otherwise we should have $x \geq n$, against the assumptions)
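36 Correctness of RSA (2)
- Case 1: We have GCD(x,n) = 1, and so from Euler's theorem, we have $x^{\varphi(n)} \equiv 1 \bmod n$; since $ed \equiv 1 \bmod \varphi(n)$, we have $ed = 1 + k\varphi(n)$, for some positive integer k. So, since $x < n$, we have
  $x^{ed} \bmod n = x^{1 + k\varphi(n)} \bmod n = x \, (x^{\varphi(n)})^k \bmod n = x \cdot 1^k \bmod n = x$.
- Case 2: Since p divides x, for any positive integer k we have $x \equiv x^k \equiv 0 \bmod p$, namely $(x^k - x) \equiv 0 \bmod p$; in particular, $(x^{ed} - x) \equiv 0 \bmod p$. Since instead q does not divide x, similarly to Case 1 (applying Euler's theorem modulo q), we also have $x^{ed} \equiv x \bmod q$, and so $(x^{ed} - x) \equiv 0 \bmod q$. It follows that $(x^{ed} - x)$ is divisible by both p and q, and then, since p and q are distinct primes, by their product n, from which it follows
  $(x^{ed} - x) \equiv 0 \bmod n \Rightarrow x^{ed} \equiv x \bmod n \Rightarrow x^{ed} \bmod n = x \bmod n = x$. □
- Authentication: it follows from the RSA property
  $\mathrm{Dec}(\mathrm{Cod}(x)) = \mathrm{Cod}(\mathrm{Dec}(x))$.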
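The two cases of the proof can be checked numerically on a small modulus. The sketch below assumes the toy parameters p = 3, q = 11, e = 3, d = 7 (so ed = 21 = 1 + φ(n)); the variable names are illustrative, and a check on one modulus is of course not a proof.

```python
from math import gcd

p, q, e, d = 3, 11, 3, 7                 # toy parameters, assumed for illustration
n, phi = p * q, (p - 1) * (q - 1)

coprime = [x for x in range(1, n) if gcd(x, n) == 1]   # Case 1: neither p nor q divides x
shared = [x for x in range(1, n) if gcd(x, n) > 1]     # Case 2: exactly one of p, q divides x

euler_holds = all(pow(x, phi, n) == 1 for x in coprime)   # Euler's theorem
case1_ok = all(pow(x, e * d, n) == x for x in coprime)
case2_ok = all(pow(x, e * d, n) == x for x in shared)
```

Note that Euler's theorem itself does not apply to the values in `shared`, yet decryption still returns x for them, as Case 2 argues.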
37 RSA at work: an example (1 of 2)
- Assume that A wishes to send a secret message to B; then, by the RSA protocol, B needs to provide its public key to A.
- B needs to generate its keys; then, it selects two large primes, for instance $p = 3$ and $q = 11$ (ehmm, not very large, actually!)
- Then, $n = 33$ and $\varphi(n) = 2 \cdot 10 = 20$.
- Then, B takes $e = 3$, since 3 is coprime with 20 ⇒ (3,33) is the public key of B.
- Then, B searches d s.t. $3d \equiv 1 \bmod 20$. Hence, from $3d = 1 + k \cdot 20$, by setting $k = 1$, we have $d = 7$ ⇒ (7,33) is the private key of B.
- Now, to encrypt a message, A divides it in blocks of bits whose maximum value is less than $n = 33$; then, a block P becomes
  $C = \mathrm{Cod}(P) = P^3 \bmod 33$
- A sends C to B; to decode it, B computes $P = C^7 \bmod 33$.
- In our example, since $n = 33$, a block contains at most 5 bits ($2^5 < 33$); however, in practice, n is in the order of $2^{1024}$, and so blocks have a size of 1024 bits, i.e., 128 ASCII characters (8 bits each).
38 RSA at work: an example (2 of 2)
To visualize the example, let us suppose that the 26 letters of the English alphabet are represented by using 5 bits, and so, since $n = 33$, each block is made up of a single character.
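A possible rendering of this example in code follows. The numbering of the letters (A = 1, ..., Z = 26) and the helper names are assumptions made here, since the slide's original table is not available; the keys (3,33) and (7,33) are those of the example.

```python
PUBLIC_KEY, PRIVATE_KEY = (3, 33), (7, 33)   # keys of B from the example


def encode(word: str) -> list:
    """Map A..Z to 1..26 (assumed numbering; each value fits in 5 bits)."""
    return [ord(ch) - ord("A") + 1 for ch in word]


def decode(blocks: list) -> str:
    return "".join(chr(b + ord("A") - 1) for b in blocks)


def apply_key(blocks: list, key: tuple) -> list:
    exponent, n = key
    return [pow(b, exponent, n) for b in blocks]   # one character per block


blocks = encode("RSA")                         # [18, 19, 1]
cipher = apply_key(blocks, PUBLIC_KEY)         # what A sends to B
plain = decode(apply_key(cipher, PRIVATE_KEY)) # what B reads
```

Because each block is a single character, equal letters always produce equal ciphertext blocks, so with such small blocks the statistical attack on simple substitution ciphers applies again.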
39 Computational Complexity of RSA
- It can be shown that the keys (and thus p, q, e, d) can be generated in polynomial time w.r.t. the size of their binary representation (which is logarithmic in their value).
- In particular, e is usually chosen by taking a quite small prime number (e.g., $e = 3$).
- Instead, d is obtained by a (polynomial) extension of the Euclidean algorithm for computing the GCD (based on the fact that GCD(a,b) = GCD(b, a mod b)).
- However, to find large prime numbers (i.e., p and q), probabilistic primality testing algorithms are used, since deterministic algorithms are too slow (they are polynomial, but with a degree in the order of 10).
- Finally, note that the processes of encryption and decryption can be performed efficiently by repeated squaring (so-called modular exponentiation).
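The two polynomial-time building blocks mentioned above can be sketched as follows: the extended Euclidean algorithm to obtain d, and square-and-multiply for modular exponentiation. Function names are illustrative; in practice Python's built-in `pow(base, exponent, modulus)` already does the latter.

```python
def extended_gcd(a: int, b: int):
    """Return (g, x, y) with g = GCD(a, b) and a*x + b*y = g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r      # GCD(a, b) = GCD(b, a mod b)
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y


def mod_inverse(e: int, phi: int) -> int:
    """Return d in [0, phi) with e*d = 1 mod phi."""
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e and phi(n) are not coprime")
    return x % phi


def mod_pow(base: int, exponent: int, modulus: int) -> int:
    """Square-and-multiply: O(log exponent) modular multiplications."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                         # current bit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus           # square for the next bit
        exponent >>= 1
    return result
```

For instance, `mod_inverse(3, 20)` gives the private exponent 7 for the public exponent 3 and φ(n) = 20.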
40 Searching for p and q
- Recall: the randomized algorithm for computing a MIS returned an answer that was deterministically correct, while its computational complexity was given in probabilistic form. This is known as a randomized Las Vegas algorithm.
- There exists another fundamental type of randomized algorithm, known as randomized Monte Carlo algorithm, in which the answer is probably correct, while the time complexity is deterministically bounded.
- Definition (Monte Carlo algorithm): A Monte Carlo "no-biased" algorithm is a randomized algorithm for solving a given decision problem, such that the answer "no" is always correct, while the answer "yes" may be incorrect with a fixed probability ε. Monte Carlo "yes-biased" algorithms are similarly defined.
- The Miller-Rabin algorithm is a Monte Carlo "no-biased" algorithm to test the primality of a number n. Its time complexity is $O(\log^3 n)$, and its probability of inaccuracy is at most $\varepsilon = 1/4$ (i.e., a YES answer is correct with probability at least 3/4).
41 Miller-Rabin algorithm
- It is based on the following property: given an odd integer n (for which we want to test primality), we write it as $n = 2^s r + 1$, with r odd (thus s is the multiplicity of factor 2 in the decomposition of the even number n-1). Now, given $2 \leq t \leq n-2$, we define the following 2 predicates:
  - (P1) $\mathrm{GCD}(n,t) = 1$
  - (P2) ($t^r \bmod n = 1$) OR (there exists $0 \leq j \leq s-1$ s.t. $t^{2^j r} \bmod n = n-1$, i.e., $t^{2^j r} \equiv -1 \bmod n$).
- Theorem: If n is prime, then every t satisfies both predicates, while if n is composite, then the number of integers between 1 and n-1 that satisfy both predicates is less than n/4.
- ⇒ We run MR(n) k times, testing each time the two predicates on a random integer $2 \leq t \leq n-2$. If the algorithm answers "no" even only once, the number is definitely composite, but if it always answers "yes", then the probability that the number is composite is at most $4^{-k}$, and therefore the probability that the number is prime is
  $P(\mathrm{prime}) = 1 - P(\mathrm{composite}) \geq 1 - 4^{-k}$
  (e.g., if $k = 100$, then $P(\mathrm{prime}) \geq 1 - 10^{-60} \approx 1$)
42 Miller-Rabin algorithm
Miller-Rabin(n)
1. Set n-1 = 2^s r with r odd
2. For i = 1 to k do
   2.1 choose randomly an integer t s.t. 2 ≤ t ≤ n-2
   2.2 if GCD(n,t) > 1 return composite   (condition (P1) is false)
   2.3 compute y = t^r mod n
   2.4 if y ≠ 1 do   (first condition of (P2) is false)
       2.4.1 j = 0
       2.4.2 while ((j ≤ s-1) and (y ≠ n-1))
                 y = t^(2^j r) mod n
                 j = j + 1
       2.4.3 if y ≠ n-1 return composite   (second condition of (P2) is false as well)
3. Return prime (w.h.p. 1 - 4^(-k))
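A runnable sketch of the test follows. It is a variant of the pseudocode rather than a literal transcription: instead of recomputing $t^{2^j r}$ from scratch at every iteration, it squares the previous value, and it adds explicit handling of small and even inputs. The function name and the default k = 20 are assumptions.

```python
import random
from math import gcd


def miller_rabin(n: int, k: int = 20) -> bool:
    """Return False if n is certainly composite, True if n is probably prime."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    # Step 1: write n - 1 = 2^s * r with r odd.
    s, r = 0, n - 1
    while r % 2 == 0:
        s += 1
        r //= 2
    for _ in range(k):
        t = random.randint(2, n - 2)
        if gcd(n, t) > 1:
            return False              # (P1) is false
        y = pow(t, r, n)
        if y == 1 or y == n - 1:
            continue                  # (P2) holds for this t
        for _ in range(s - 1):
            y = pow(y, 2, n)          # y = t^(2^j r) mod n for j = 1, ..., s-1
            if y == n - 1:
                break                 # (P2) holds for this t
        else:
            return False              # (P2) is false: n is composite
    return True                       # prime with probability at least 1 - 4^(-k)
```

A call such as `miller_rabin(2**61 - 1)` returns True for this known Mersenne prime, while composites are rejected except with probability at most $4^{-k}$.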
43 Is it easy to find prime numbers?
- Despite the efficiency of the primality test, it is still unknown if the primes are well distributed and therefore easy to find at random (Riemann hypothesis!). However, we know that their density is quite high, as stated by the following result:
- Gauss' Theorem (prime numbers): Let $\pi(n)$ be the distribution function of prime numbers, i.e., the number of primes less than n. Then the following is satisfied:
  $\lim_{n \to \infty} \pi(n) / (n / \ln n) = 1$
- So, if you search for a prime number of 100 digits, and primes were uniformly distributed, you would have to check "only" about $\ln(10^{100}) \approx 230$ consecutive numbers.
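The density estimate can be compared with an exact count on a range small enough to sieve. The sketch below uses a sieve of Eratosthenes; the bound of one million and the names used are arbitrary choices for illustration.

```python
import math


def primes_below(n: int) -> int:
    """Count the primes smaller than n (n >= 2) with a sieve of Eratosthenes."""
    sieve = bytearray([1]) * n
    sieve[0:2] = b"\x00\x00"                       # 0 and 1 are not prime
    for i in range(2, math.isqrt(n - 1) + 1):
        if sieve[i]:
            # cross out the multiples of i, starting from i*i
            sieve[i * i :: i] = bytearray(len(range(i * i, n, i)))
    return sum(sieve)


n = 1_000_000
exact = primes_below(n)                  # pi(n), counted exactly
estimate = n / math.log(n)               # n / ln n
expected_gap = math.log(10 ** 100)       # about 230 for 100-digit numbers
```

At this size the estimate n / ln n is already within about 10% of the exact count, and the ratio approaches 1 as n grows.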