1
Fast Dictionary Attack

22C196 Security in Distributed System
Dat Tien Nguyen
Mar 29 2007

2
Quick notes

First, I'm very sorry for my English
If you have any questions or comments, feel free
to ask.
Hopefully I won't make you fall asleep

3
Dictionary Attack
4
Definition

(1) A method used to break systems by testing all
possible passwords beginning with words that have
a higher possibility of being used
(from webopedia)

5
Definition (cont.)

(2) An e-mail spamming technique in which the
spammer sends out millions of e-mails with
randomly generated addresses using combinations
of letters added to known domain names
(from webopedia)

6
Dictionary attack vs. brute-force (BF) attack
7
Example

Break Uday's password
What dictionary should we try?
English dictionary
Indian dictionary
His girlfriends' list
Do you want to try a Vietnamese dictionary?

8
Some naïve improvements

Use a larger dictionary or more dictionaries
Apply string manipulations to dictionary words, e.g. for
password
Backward: drowssap
Number-letter replacement: p4ssw0rd
Capitalization: PasSwoRd
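
The manipulations listed on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `mangle` and the particular substitution map are assumptions, not something the presentation specifies.

```python
def mangle(word):
    """Return a set of simple variants of one dictionary word.

    The rules (reverse, digit substitution, capitalization) mirror the
    slide; the exact substitution map is an assumption for illustration.
    """
    leet = str.maketrans({"a": "4", "o": "0", "e": "3"})
    return {
        word,
        word[::-1],            # backward: "password" -> "drowssap"
        word.translate(leet),  # number-letter replacement
        word.capitalize(),     # first letter upper case
        word.upper(),          # all upper case
    }


if __name__ == "__main__":
    for variant in sorted(mangle("password")):
        print(variant)
```

Each rule multiplies the dictionary size, which is why these tweaks only go so far before a smarter structure is needed.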

9
Advanced improvement

Rainbow Table
Markovian/Deterministic Finite Automata Dictionary

10
My presentation

[1] Philippe Oechslin, Making a Faster
Cryptanalytic Time-Memory Trade-Off
[NS05] or [2] A. Narayanan and V. Shmatikov, Fast
Dictionary Attacks on Passwords Using Time-Space
Tradeoff
Fun facts about RainbowCrack

11
Rainbow Table
12
The problem

Given a fixed plaintext $P_0$ and the corresponding
ciphertext $C_0$, where the encryption function is $S$,
find the key $k \in N$ such that
$C_0 = S_k(P_0)$

13
The original method

Try to generate all possible ciphertexts by
encrypting P0 with all N possible keys.
The ciphertexts are organized in chains so that
only the first and the last elements of a chain
need to be stored

14
The original method (cont.)

The chains are created by using a reduction
function R, which creates a key from a
ciphertext.
$R(S_k(P_0))$ is written as $f(k)$

15
The original method (cont.)

m chains of length t are created. Their first and
last elements are stored.
Given a ciphertext C, find the key:
Generate a chain of keys starting with R(C),
up to length t
If a key in this chain matches an endpoint
stored in the table, the key for C is probably in
the corresponding table chain
Use the first key of that chain to regenerate the
whole chain; the expected key is the one that comes
just before R(C)
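
The build and lookup steps above can be sketched as follows. This is a toy under stated assumptions, not the original implementation: MD5 stands in for the cipher S, the key space is shrunk to 2^16, and the names `build_table` and `lookup` are made up for the example.

```python
import hashlib

N = 2 ** 16                 # toy key space size (assumption)
P0 = b"fixed plaintext"     # the fixed plaintext P0


def S(k):
    """Stand-in for encrypting P0 under key k (MD5 is only a placeholder)."""
    return hashlib.md5(P0 + k.to_bytes(4, "big")).digest()


def R(c):
    """Reduction function: map a ciphertext back into the key space."""
    return int.from_bytes(c[:4], "big") % N


def f(k):
    return R(S(k))


def build_table(starts, t):
    """Store only endpoint -> start points for chains of length t."""
    table = {}
    for start in starts:
        k = start
        for _step in range(t):
            k = f(k)
        table.setdefault(k, []).append(start)
    return table


def lookup(table, c, t):
    """Return a key k with S(k) == c, or None if the table does not cover it."""
    y = R(c)
    for _step in range(t):
        for start in table.get(y, []):
            k = start
            for _pos in range(t):       # regenerate the chain from its start
                if S(k) == c:
                    return k
                k = f(k)
            # reaching here means a false alarm for this start point
        y = f(y)
    return None
```

Note that only the dictionary of endpoints is kept in memory; every intermediate key is recomputed on demand, which is the time-memory trade-off.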

16
Disadvantages

Chains starting at different keys can collide and
merge, which reduces the number of distinct keys
covered by a table
Finding a matching endpoint does not imply that
the key is in the table (false alarm)
The key may be in a chain that has the same
endpoint but is not in the table
The chain containing the key may be in the table
but merge with other chains, so we must search
through all chains that have the same endpoint
until the key is found

17
Some improvements

Use more tables to obtain a higher probability
of success. Each table has a different reduction
function.
If the probability of success of one table is
$P_{table}$,
using n tables increases the probability of
success to
$1 - (1 - P_{table})^n$

18
Some improvement (cont.)

Use distinguished points (DP) as endpoints [3]
Distinguished points are points for which a
simple criterion holds true
Example: the first ten bits are zero
Chains are generated until a distinguished point
appears
Given a ciphertext, generate a chain until we
find a distinguished point, and only then look it up
in memory
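
A minimal sketch of chain generation with distinguished points, assuming a toy 20-bit key space and MD5 as a placeholder for the cipher. The names `walk_to_dp` and `is_distinguished` and the `max_len` cut-off are assumptions introduced for the example.

```python
import hashlib

KEY_BITS = 20               # toy key size in bits (assumption)
N = 1 << KEY_BITS
P0 = b"fixed plaintext"


def f(k):
    """One chain step f(k) = R(S_k(P0)), with MD5 as a placeholder for S."""
    c = hashlib.md5(P0 + k.to_bytes(4, "big")).digest()
    return int.from_bytes(c[:4], "big") % N


def is_distinguished(k, zero_bits=10):
    """True when the first (most significant) zero_bits bits of k are zero."""
    return k >> (KEY_BITS - zero_bits) == 0


def walk_to_dp(k, max_len=20000):
    """Iterate f from k until a distinguished point appears.

    Returns (distinguished_point, steps), or None when no distinguished
    point shows up within max_len steps; such a walk is assumed to be
    stuck in a loop and would be discarded.
    """
    for steps in range(1, max_len + 1):
        k = f(k)
        if is_distinguished(k):
            return k, steps
    return None
```

During lookup the same walk would start from R(C), after first checking whether R(C) is itself distinguished, so memory is touched once per walk rather than once per step.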

19
Some improvement (cont.)

Advantages of using distinguished endpoints
Loop detection
Merge detection
Disadvantages
The length of the chains is not fixed
False alarm cases
What happens if the chain starting with R(C)
contains a loop itself?
What happens if the chain starting with R(C)
contains the chain in the table?

20
Rainbow Table

Use a different reduction function for each point
in the chain, applied in succession (R1, then R2, and so on).

21
Rainbow Table (cont.)

Classical table vs. Rainbow table

22
Rainbow Table (cont.)

To look up a key
Apply $R_{n-1}$ to the ciphertext
Look up the result in the endpoints of the table
If we don't find it, apply $R_{n-2}$ and then $f_{n-1}$ to see if
the key was in the second-to-last column
And so on, one column earlier each time
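
The lookup procedure can be sketched as below. This is a toy under stated assumptions: MD5 stands in for the cipher, the key space has 2^16 keys, reduction functions are indexed from 0 to T-1 (so the last one plays the role of R_{n-1}), and making them differ by adding the column index is just one simple choice.

```python
import hashlib

N = 2 ** 16                 # toy key space size (assumption)
T = 50                      # chain length, one reduction function per column
P0 = b"fixed plaintext"


def S(k):
    """Stand-in for encrypting P0 under key k (MD5 is only a placeholder)."""
    return hashlib.md5(P0 + k.to_bytes(4, "big")).digest()


def R(i, c):
    """Reduction function for column i; adding i makes each column differ."""
    return (int.from_bytes(c[:4], "big") + i) % N


def build_table(starts):
    """Map each chain endpoint to the start points that lead to it."""
    table = {}
    for start in starts:
        k = start
        for i in range(T):
            k = R(i, S(k))
        table.setdefault(k, []).append(start)
    return table


def lookup(table, c):
    """Try the last column first, then the second-to-last, and so on."""
    for col in range(T - 1, -1, -1):
        y = R(col, c)                   # assume the key sits in column col
        for i in range(col + 1, T):     # finish the chain to its endpoint
            y = R(i, S(y))
        for start in table.get(y, []):
            k = start
            for i in range(col):        # rebuild the chain up to column col
                k = R(i, S(k))
            if S(k) == c:
                return k                # otherwise it was a false alarm
    return None
```

Because two chains only merge when they collide in the same column, a single sorted table of endpoints is enough to detect merges.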

23
Rainbow Table (cont.)

Advantages
Reduces the work needed to look up a key
Classical tables: $t^2$ calculations
Rainbow table: $t(t-1)/2$ calculations
Merge detection
No loops
Don't have to spend time detecting and rejecting
looping chains
Fixed-length chains
False alarm cases

24
Rainbow Table (cont.)

Improvement
To get a higher success rate
Increase the size of the table
Use more tables
People choose to use more tables because
It is easy to calculate the success rate
One table: 80%
Five tables: $1 - (1 - 0.8)^5 = 0.99968$
It is easy to calculate the number of tables needed
25
Hybrid Markovian/DFA dictionary
26
Why use Markov models?

Passwords are generated by people, so they are
unlikely to be uniformly distributed in the
space of alphabet sequences

27
Why use Markov models?

Markov models are commonly used in Natural
Language Processing
A Markov model defines a probability distribution
over sequences of symbols
Used to generate random, yet pronounceable, passwords
at LANL in the late 1980s
Very effective at guessing passwords generated by
users

28
Markovian filtering

Zero-order Markov model
Each character is generated according to the
underlying probability distribution,
independently of the previously generated
characters
$P(\alpha) = \prod_{x \in \alpha} \nu(x)$

29
Markovian filtering (cont.)

First-order model
Each digram (ordered pair) of characters is
assigned a probability, and each character is
generated by looking at the previous character.
$P(x_1 x_2 \ldots x_n) = \nu(x_1) \prod_{i=1}^{n-1} \nu(x_{i+1} \mid x_i)$
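
Both models can be estimated from a word list by counting. The sketch below is an assumption-laden illustration: the training functions, the plain relative-frequency estimates, and the choice to give unseen characters or digrams probability 0 are not taken from the papers.

```python
import math
from collections import Counter


def train_zero_order(words):
    """Estimate nu(x) as the relative frequency of each character."""
    counts = Counter(ch for w in words for ch in w)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def zero_order_prob(word, nu):
    """P(alpha) = product of nu(x) over the characters of alpha."""
    return math.prod(nu.get(ch, 0.0) for ch in word)


def train_first_order(words):
    """Estimate nu(x1) and nu(x_{i+1} | x_i) from non-empty words."""
    first = Counter(w[0] for w in words)
    pairs = Counter((a, b) for w in words for a, b in zip(w, w[1:]))
    prev_totals = Counter()
    for (a, _b), n in pairs.items():
        prev_totals[a] += n
    total_first = sum(first.values())
    nu_first = {ch: n / total_first for ch, n in first.items()}
    nu_cond = {(a, b): n / prev_totals[a] for (a, b), n in pairs.items()}
    return nu_first, nu_cond


def first_order_prob(word, nu_first, nu_cond):
    """P(x1..xn) = nu(x1) * product of nu(x_{i+1} | x_i)."""
    p = nu_first.get(word[0], 0.0)
    for a, b in zip(word, word[1:]):
        p *= nu_cond.get((a, b), 0.0)
    return p
```

Filtering then amounts to keeping only the sequences whose probability is at least a chosen threshold.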

30
Markovian dictionary

Zero-order dictionary
First-order dictionary

31
Markovian dictionary (cont.)

The models can drastically reduce the size of the
plausible password space
Example: consider 8-character sequences
If $\theta$ is chosen so that 85% of sequences are
ignored,
the dictionary is only about 1/7 of the keyspace
The zero-order dictionary still contains 90% of the
plausible passwords
The first-order dictionary can do even better
(more than 95%)

32
Markovian dictionary (cont.)

The distribution of letter frequencies used in
Markovian filtering is language-specific
Two ways to deal with unknown languages
Combine the keyspaces for two or more languages
Come up with a distribution that works reasonably
well for multiple languages

33
Filtering using Finite Automaton

A search space that contains only alphabetic
sequences is unlikely to have good coverage of
the plausible password space
Humans often mix upper- and lowercase characters and
numbers
Some systems require users to put special characters
in their passwords
Even so, the distribution of the resulting passwords
is not random
Numbers are likely to be at the end
The first character is more likely to be uppercase
Passwords contain mostly lowercase characters

34
Filtering using Finite Automaton

Deterministic finite automata are ideal for
expressing such properties
Specify a set of common regular expressions
All lowercase
One uppercase character followed by all lowercase
Dictionary: the set of sequences that pass
Markovian filtering and are accepted by at least one
DFA corresponding to one of the regular expressions

35
Filtering using Finite Automaton

The complete alphabet
Lowercase chars: 26
Uppercase chars: 26
Numerals: 10
Special chars: space, hyphen, underscore, period
and comma (5)
Total: 67 chars. The associated keyspace of
8-char sequences is $67^8 \approx 4 \times 10^{14}$ (on the order of $10^{15}$),
which is infeasible for a BF attack
Divide this set into 4 categories: a = lowercase,
A = uppercase, n = numerals, s = special.
The input alphabet of the automata consists of
just these 4 symbols.
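
The category mapping and pattern check can be sketched with Python's `re` module. Two caveats: `re` uses a backtracking matcher rather than an explicit DFA, and the third pattern below (lowercase letters followed by digits) is a hypothetical addition motivated by the earlier remark that numbers tend to come at the end.

```python
import re
import string

SPECIAL = " -_.,"   # space, hyphen, underscore, period, comma

# Patterns over the 4-symbol alphabet {a, A, n, s}.
PATTERNS = [
    re.compile(r"a+"),      # all lowercase
    re.compile(r"Aa+"),     # one uppercase followed by all lowercase
    re.compile(r"a+n+"),    # hypothetical: lowercase, then numerals
]


def to_categories(password):
    """Map each character to a, A, n or s; None if outside the alphabet."""
    out = []
    for ch in password:
        if ch in string.ascii_lowercase:
            out.append("a")
        elif ch in string.ascii_uppercase:
            out.append("A")
        elif ch in string.digits:
            out.append("n")
        elif ch in SPECIAL:
            out.append("s")
        else:
            return None
    return "".join(out)


def accepted(password):
    """True if the category string fully matches at least one pattern."""
    cats = to_categories(password)
    return cats is not None and any(p.fullmatch(cats) for p in PATTERNS)
```

Working over four symbols instead of 67 keeps the automata tiny, which matters later when their states are used for indexing.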

36
Indexing Algorithms (IA)

With Markovian filtering and/or DFA filtering we
have a compressed dictionary
We need to index the compressed dictionary
This requires an efficient enumeration algorithm that
takes an index i as input and outputs the i-th element
of the dictionary.

37
Indexing Algorithms (cont.)

Some changes
Modified dictionary
Discretize the probability distribution
$\mu(x) = \log \nu(x)$ and $\lambda = \log \theta$
Discretize the value of $\mu$ to the nearest multiple
of $\mu_0$
If $\mu_0$ is large, memory use is low, but
accuracy is low as well

38
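IA for Zero-order Markovian dictionary

Partial dictionaries

For any string prefix $\alpha$
Partial dictionary: all sequences $\beta$ such that $\alpha\beta$ satisfies
the Markovian property
Note that $D_{\nu,\theta,\ell,\theta',\ell'}$ is well-defined because
$\prod_{x \in \alpha\beta} \nu(x) = \prod_{x \in \alpha} \nu(x) \prod_{x \in \beta} \nu(x) = \theta' \prod_{x \in \beta} \nu(x)$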

39
IA for Zero-order Markovian dictionary

The recursive algorithm to compute the size of a
partial dictionary

partial_size1(current_length, level)
    if level > threshold: return 0
    if total_length == current_length: return 1
    sum = 0
    for each char in alphabet
        sum = sum + partial_size1(current_length + 1, level + mu(char))
    return sum
40
IA for Zero-order Markovian dictionary

Function takes an index as input and returns the
corresponding key

get_key1(current_length, index, level)
    if total_length == current_length: return ""
    sum = 0
    for each char in alphabet
        new_level = level + mu(char)
        // looked up from precomputed array
        size = partial_size1[current_length + 1][new_level]
        if sum + size > index
            // + refers to string concatenation
            return char + get_key1(current_length + 1, index - sum, new_level)
        sum = sum + size
    // control cannot reach here
    print "index larger than keyspace size"
    exit
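
A runnable Python sketch of the two routines follows. It assumes that `mu` holds non-negative integer costs (for example, discretized values of -log of the character probability), so that the level grows along a string and is compared against an upper threshold as in the pseudocode. The wrapper `make_indexer` and the use of memoization in place of a precomputed array are choices made for this example.

```python
from functools import lru_cache


def make_indexer(mu, total_length, threshold):
    """Build (size, get_key) for the zero-order dictionary.

    mu maps each character to a non-negative integer cost (assumption).
    The dictionary holds every string of total_length characters whose
    summed cost does not exceed threshold.
    """
    alphabet = sorted(mu)

    @lru_cache(maxsize=None)
    def partial_size(current_length, level):
        if level > threshold:
            return 0
        if current_length == total_length:
            return 1
        return sum(partial_size(current_length + 1, level + mu[ch])
                   for ch in alphabet)

    def get_key(index):
        """Return the index-th string of the dictionary (0-based)."""
        if not 0 <= index < partial_size(0, 0):
            raise IndexError("index larger than keyspace size")
        chars, level = [], 0
        for pos in range(total_length):
            for ch in alphabet:
                size = partial_size(pos + 1, level + mu[ch])
                if index < size:        # the key lies under this character
                    chars.append(ch)
                    level += mu[ch]
                    break
                index -= size           # skip this whole subtree
        return "".join(chars)

    return partial_size(0, 0), get_key


if __name__ == "__main__":
    size, get_key = make_indexer({"a": 1, "b": 2, "c": 3}, 2, 4)
    print(size, [get_key(i) for i in range(size)])
```

Because `get_key` maps an index directly to a string, the filtered dictionary never has to be stored, which is what lets it be combined with a time-space trade-off.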
41
IAs for other dictionaries

First-order Markovian dictionary
Partial_size2(cur_len,prev_char,level)
Get_key2(cur_len,index,prev_char,level)
DFA dictionary
Partial_size3(cur_len,state)
Get_key3(cur_len,index,state)
Any keyspace
Compute_bins(t)
Get_key(index)
Hybrid Markovian/DFA dictionary
Get_key5(index)
Multiple keyspaces
Get_key6(K1,K2,...,Kn,index)

42
Possible optimizations

The hybrid algorithm should use 50-100 table
lookups and the table must fit into the cache
Get_key6() can be accelerated by pre-computation
Get_key1() can be sped up by reordering the
characters so that the more frequent ones come
first. This strategy works for Get_key2() as well.

43
Experiment

Rainbow vs. Markovian/DFA (paper [2])
142 real passwords from Passware
Search space: 6-character alphanumeric sequences

44
Conclusion
45
Fun facts about Rainbow Crack

All the configurations were done for a 666 MHz CPU
with 256 MB RAM
Nowadays, with a 3.0 GHz CPU and 4 GB RAM, the time
should be about 5 times shorter than in their results
5 rainbow tables
Cracks the Windows LM hash function, but it can be
configured for other hash algorithms such as MD5
or SHA1

46
Fun facts about Rainbow Crack
47
Fun facts about Rainbow Crack
48
Fun facts about Rainbow Crack
49

Tables: 5.2 TB
1 TB is about the same amount of information as
all of the books in a large library
(http://kb.iu.edu/data/ackw.html)
Supported algorithms: 13 (CiscoPIX, LanManager,
MD4, MD5, NTLM, MySQL123, MySQLSHA1, SHA1,
HalfLMChallenge, LMChallenge, NTLMChallenge,
MSCache, Oracle)
It took them 3 years to generate all the tables (97
years for one computer)
Success rate: 100%
http://www.rainbowcrack-online.com/

50
References

[1] Philippe Oechslin, Making a Faster
Cryptanalytic Time-Memory Trade-Off
[2] A. Narayanan and V. Shmatikov, Fast
Dictionary Attacks on Passwords Using Time-Space
Tradeoff
[3] D.E. Denning, Cryptography and Data Security,
page 100. Addison-Wesley, 1982.

51
Thank you for your attention!