Title: ECE 645
1ECE 645 Spring 2007 PROJECT 2 Specification
2Topic Options
3Public Key (Asymmetric) Cryptosystems
Private key of Bob - kB
Public key of Bob - KB
Network
Decryption
Encryption
Bob
Alice
4RSA as a trap-door one-way function
PUBLIC KEY
$C = f(M) = M^e \bmod N$
M
C
$M = f^{-1}(C) = C^d \bmod N$
PRIVATE KEY
$N = P \cdot Q$
P, Q - large prime numbers
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$
5RSA keys
PUBLIC KEY: $\{e, N\}$
PRIVATE KEY: $\{d, P, Q\}$
$N = P \cdot Q$
P, Q - large prime numbers
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$
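A minimal sketch of the key relations on this slide, using toy primes that are far too small for real use. The values P = 61, Q = 53, e = 17 and the function name `make_rsa_keys` are assumptions chosen for illustration and do not come from the slides. The three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
from math import gcd

def make_rsa_keys(p, q, e):
    """Build a toy RSA key pair from primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (P-1)(Q-1)")
    d = pow(e, -1, phi)          # d satisfies e*d = 1 mod (P-1)(Q-1)
    return (e, n), (d, p, q)     # public key, private key

public, private = make_rsa_keys(61, 53, 17)
e, n = public
d, p, q = private

message = 65
cipher = pow(message, e, n)      # C = M^e mod N
recovered = pow(cipher, d, n)    # M = C^d mod N
print(public, private, cipher, recovered)
```

Decryption recovers the original message because e and d are inverses modulo (P-1)(Q-1).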
6Early Factoring Device Lehmer Sieve Bicycle
chain sieve D. H. Lehmer, 1928
Computer Museum, Mountain View, CA
7Supercomputer Cray-1 from 1980s
Computer Museum, Mountain View, CA
8FPGA based supercomputers
Machine                              Released
SRC 6 from SRC Computers             2002
Cray XD1 from Cray                   2005
SGI Altix from SGI                   2005
SRC 7 from SRC Computers, Inc.       2006
9COPACOBANA
Ruhr University, Bochum, and University of Kiel, Germany, 2006
Cost: 8980
120 Spartan 3 FPGAs
Clock frequency: 100 MHz
10Factoring 1024-bit RSA keys using Number Field Sieve (NFS)
[Flow diagram of the NFS steps]
1. Polynomial Selection
2. Relation Collection
   - Sieving
   - Cofactoring of 200 bit and 350 bit numbers: trial division, ECM, p-1 method, rho method
3. Linear Algebra
4. Square Root
11- Topic 1
- Trial Division Sieve
12Topic 1 Trial Division Sieve (1)
- Given
- Inputs
- Variables
- Integers $N_1, N_2, N_3, \ldots$, each of size k bits
- Constants
- 2. Factor base
- set of all primes up to a certain bound B
- $p_1 = 2, p_2 = 3, p_3 = 5, \ldots, p_t \le B$
- Parameters of interest
- $4 \le k \le 512$
- $3 \le B \le 10^5$
13Topic 1 Trial Division Sieve (2)
- Required
- Outputs
- For each integer $N_i$:
- a list of primes from the factor base that divide $N_i$, and
- the number of times each prime divides $N_i$.
- For example, if
- $N_i = p_1^{e_1} \cdot p_2^{e_2} \cdot p_3^{e_3} \cdot M_i$,
- where $M_i$ is not divisible by any prime belonging to the factor base, then
- the output is
- $\{p_1, e_1\}, \{p_2, e_2\}, \{p_3, e_3\}$
14Topic 1 Trial Division Sieve (3)
- Example
- Constants
- $k = 10$, $B = 5$
- Factor base $\{2, 3, 5\}$
- Variables
- $N_1 = 408 = 2^3 \cdot 3 \cdot 17$
- $N_2 = 630 = 2 \cdot 3^2 \cdot 5 \cdot 7$
- Outputs
- $\{2, 3\}, \{3, 1\}$
- $\{2, 1\}, \{3, 2\}, \{5, 1\}$
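The required behavior can be sketched in software as a reference model for checking a hardware design. This is only an illustration of the input/output relation, not the architecture expected in the project; the function names `primes_up_to` and `trial_division` are hypothetical.

```python
def primes_up_to(bound):
    """Return all primes p <= bound (sieve of Eratosthenes)."""
    is_prime = [True] * (bound + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(bound ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, bound + 1, p):
                is_prime[multiple] = False
    return [p for p, flag in enumerate(is_prime) if flag]

def trial_division(n, factor_base):
    """Divide n by every prime in factor_base as often as possible.

    Returns the list of (prime, exponent) pairs with exponent > 0
    and the remaining cofactor M_i.
    """
    pairs = []
    for p in factor_base:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent > 0:
            pairs.append((p, exponent))
    return pairs, n

factor_base = primes_up_to(5)            # B = 5
for n_i in (408, 630):
    print(n_i, trial_division(n_i, factor_base))
```

For the two integers of the example this reproduces the listed outputs, with cofactors 17 and 7 left over.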
15Topic 1 Trial Division Sieve (4)
- Optimization Criteria
- Maximum number of integers $N_i$ fully processed per unit of time for a given k and B.
16- Topic 2
- Greatest Common Divisor
-
- Multiplicative Inverse
17Topic 2 Greatest Common Divisor and Multiplicative Inverse (2)
- Given
- Inputs
- a, N: k-bit integers, $a < N$
- Outputs
- $y = \gcd(a, N)$
- $x = a^{-1} \bmod N$
- i.e., the integer $1 \le x < N$ such that $a \cdot x \bmod N = 1$
- Parameters of interest
- $4 \le k \le 1024$
18Greatest common divisor
Greatest common divisor of a and b, denoted by gcd(a, b), is the largest positive integer that divides both a and b.
$d = \gcd(a, b)$ iff
1) $d \mid a$ and $d \mid b$
2) if $c \mid a$ and $c \mid b$ then $c \le d$
19 gcd(8, 44) = 4
gcd(-15, 65) = 5
gcd(45, 30) = 15
gcd(31, 15) = 1
gcd(121, 169) = 1
20Quotient and remainder
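Given integers a and n, $n > 0$:
$\exists!\ q, r \in \mathbb{Z}$ such that $a = q \cdot n + r$ and $0 \le r < n$
q: quotient, r: remainder (of a divided by n)
$q = \lfloor a / n \rfloor = a \operatorname{div} n$
$r = a - q \cdot n = a - \lfloor a / n \rfloor \cdot n = a \bmod n$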
21Euclids Algorithm for computing gcd(a,b)
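$i$: $-2,\ -1,\ 0,\ 1,\ \ldots,\ t-1,\ t$
$q_i$: $q_{-1},\ q_0,\ q_1,\ \ldots,\ q_{t-1}$
$r_i$: $r_{-2} = \max(a, b),\ r_{-1} = \min(a, b),\ r_0,\ r_1,\ \ldots,\ r_{t-1} = \gcd(a, b),\ r_t = 0$
$r_{i+1} = r_{i-1} \bmod r_i$
$q_i = \lfloor r_{i-1} / r_i \rfloor$
$r_{i+1} = r_{i-1} - q_i \cdot r_i$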
22Euclids Algorithm Example gcd(36, 126)
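$i$: $-2,\ -1,\ 0,\ 1$
$q_i$: $q_{-1} = 3,\ q_0 = 2$
$r_i$: $r_{-2} = \max(a, b) = 126,\ r_{-1} = \min(a, b) = 36,\ r_0 = 18 = \gcd(36, 126),\ r_1 = 0$
$r_{i+1} = r_{i-1} \bmod r_i$
$q_i = \lfloor r_{i-1} / r_i \rfloor$
$r_{i+1} = r_{i-1} - q_i \cdot r_i$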
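The recurrence $r_{i+1} = r_{i-1} \bmod r_i$ can be sketched as a short loop. This is a software reference model only; the function name `euclid_gcd` is hypothetical, and taking absolute values of the inputs is an assumption made so that negative arguments are accepted.

```python
def euclid_gcd(a, b):
    """Greatest common divisor by Euclid's algorithm.

    r_prev and r_cur play the roles of r_{i-1} and r_i; the loop stops
    when the current remainder reaches 0, at which point the previous
    remainder is gcd(a, b).
    """
    a, b = abs(a), abs(b)
    r_prev, r_cur = max(a, b), min(a, b)
    while r_cur != 0:
        r_prev, r_cur = r_cur, r_prev % r_cur
    return r_prev

print(euclid_gcd(36, 126))
```

The same loop handles any pair of integers, for example `euclid_gcd(121, 169)`.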
23Multiplicative inverse modulo n
The multiplicative inverse of a modulo n is an integer (!) x such that
$a \cdot x \equiv 1 \pmod n$
The multiplicative inverse of a modulo n is denoted by $a^{-1} \bmod n$ (some books use other symbols). According to this notation
$a \cdot a^{-1} \equiv 1 \pmod n$
24Extended Euclids Algorithm (1)
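$r_i = x_i \cdot a + y_i \cdot n$
$i$: $-2,\ -1,\ 0,\ 1,\ \ldots,\ t-1,\ t$
$q_i$: $q_{-1} = \lfloor n / a \rfloor,\ q_0,\ q_1,\ \ldots,\ q_{t-1}$
$r_i$: $r_{-2} = n,\ r_{-1} = a,\ r_0,\ r_1,\ \ldots,\ r_{t-1},\ r_t = 0$
$x_i$: $x_{-2} = 0,\ x_{-1} = 1,\ x_0,\ x_1,\ \ldots,\ x_{t-1},\ x_t$
$y_i$: $y_{-2} = 1,\ y_{-1} = 0,\ y_0,\ y_1,\ \ldots,\ y_{t-1},\ y_t$
$q_i = \lfloor r_{i-1} / r_i \rfloor$
$r_{i+1} = r_{i-1} - q_i \cdot r_i$
$x_{i+1} = x_{i-1} - q_i \cdot x_i$
$y_{i+1} = y_{i-1} - q_i \cdot y_i$
$r_{t-1} = x_{t-1} \cdot a + y_{t-1} \cdot n$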
25Extended Euclids Algorithm (2)
$r_{t-1} = x_{t-1} \cdot a + y_{t-1} \cdot n \equiv x_{t-1} \cdot a \pmod n$
If $r_{t-1} = \gcd(a, n) = 1$ then
$x_{t-1} \cdot a \equiv 1 \pmod n$ and as a result
$x_{t-1} = a^{-1} \bmod n$
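26Extended Euclid's Algorithm for computing $z = a^{-1} \bmod n$
$i$: $-2,\ -1,\ 0,\ 1,\ \ldots,\ t-1,\ t$
$q_i$: $q_{-1} = \lfloor n / a \rfloor,\ q_0,\ q_1,\ \ldots,\ q_{t-1}$
$r_i$: $r_{-2} = n,\ r_{-1} = a,\ r_0,\ r_1,\ \ldots,\ r_{t-1} = 1,\ r_t = 0$
$x_i$: $x_{-2} = 0,\ x_{-1} = 1,\ x_0,\ x_1,\ \ldots,\ x_{t-1} = a^{-1} \bmod n,\ x_t = \pm n$
$q_i = \lfloor r_{i-1} / r_i \rfloor$
$r_{i+1} = r_{i-1} - q_i \cdot r_i$
$x_{i+1} = x_{i-1} - q_i \cdot x_i$
Note: If $r_{t-1} \ne 1$ the inverse does not exist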
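27Extended Euclid's Algorithm Example $z = 20^{-1} \bmod 117$
$i$: $-2,\ -1,\ 0,\ 1,\ 2,\ 3,\ 4$
$q_i$: $q_{-1} = 5,\ q_0 = 1,\ q_1 = 5,\ q_2 = 1,\ q_3 = 2$
$r_i$: $r_{-2} = 117,\ r_{-1} = 20,\ r_0 = 17,\ r_1 = 3,\ r_2 = 2,\ r_3 = 1,\ r_4 = 0$
$x_i$: $x_{-2} = 0,\ x_{-1} = 1,\ x_0 = -5,\ x_1 = 6,\ x_2 = -35,\ x_3 = 41 = 20^{-1} \bmod 117,\ x_4 = -117$
$q_i = \lfloor r_{i-1} / r_i \rfloor$
$r_{i+1} = r_{i-1} - q_i \cdot r_i$
$x_{i+1} = x_{i-1} - q_i \cdot x_i$
Check:
$20 \cdot 41 \bmod 117 = 1$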
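The recurrences for $r_i$ and $x_i$ translate directly into a loop that tracks only two consecutive values of each sequence. A minimal sketch, assuming $0 < a < n$; the function name `mod_inverse` and the choice of returning `None` when no inverse exists are illustrative assumptions.

```python
def mod_inverse(a, n):
    """Return a^-1 mod n using the extended Euclidean algorithm.

    Only the x_i sequence is tracked, since y_i is not needed for the
    inverse. Returns None if gcd(a, n) != 1 (no inverse exists).
    """
    r_prev, r_cur = n, a      # r_{-2} = n, r_{-1} = a
    x_prev, x_cur = 0, 1      # x_{-2} = 0, x_{-1} = 1
    while r_cur != 0:
        q = r_prev // r_cur
        r_prev, r_cur = r_cur, r_prev - q * r_cur
        x_prev, x_cur = x_cur, x_prev - q * x_cur
    if r_prev != 1:           # r_{t-1} is the gcd
        return None
    return x_prev % n         # reduce a possibly negative x_{t-1} into [0, n)

print(mod_inverse(20, 117))
```

The final reduction modulo n matters because $x_{t-1}$ can be negative for other inputs.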
28- Topic 3
- RSA Encryption Decryption
- with
- Montgomery Multipliers
- based on Carry Save Adders
29RSA as a trap-door one-way function
PUBLIC KEY
$C = f(M) = M^e \bmod N$
M
C
$M = f^{-1}(C) = C^d \bmod N$
PRIVATE KEY
$N = P \cdot Q$
P, Q - large prime numbers
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$
30Exponentiation $Y = X^E \bmod N$
$E = (e_{L-1}, e_{L-2}, \ldots, e_1, e_0)_2$
Right-to-left binary exponentiation
    Y = 1; S = X;
    for i = 0 to L-1
        if (e_i == 1) Y = Y · S mod N;
        S = S^2 mod N;
Left-to-right binary exponentiation
    Y = 1;
    for i = L-1 downto 0
        Y = Y^2 mod N;
        if (e_i == 1) Y = Y · X mod N;
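Both scanning orders can be sketched in a few lines and checked against each other. This is a software model only; the function names are hypothetical, and the exponent is passed as a Python integer whose bits play the role of $e_{L-1}, \ldots, e_0$.

```python
def exp_right_to_left(x, e, n):
    """Y = X^E mod N, scanning exponent bits from e_0 up to e_{L-1}."""
    y, s = 1 % n, x % n
    while e > 0:
        if e & 1:              # current bit e_i is 1
            y = (y * s) % n
        s = (s * s) % n        # S always holds X^(2^i) mod N
        e >>= 1
    return y

def exp_left_to_right(x, e, n):
    """Y = X^E mod N, scanning exponent bits from e_{L-1} down to e_0."""
    y = 1 % n
    for i in range(e.bit_length() - 1, -1, -1):
        y = (y * y) % n        # square on every bit
        if (e >> i) & 1:       # multiply by X when the bit is 1
            y = (y * x) % n
    return y

print(exp_right_to_left(7, 560, 561), exp_left_to_right(7, 560, 561))
```

The right-to-left form keeps two running values (Y and S), so its squaring and multiplication are independent and can run in parallel in hardware; the left-to-right form needs only one running value.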
31Montgomery Modular Multiplication (1)
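$C = A \cdot B \bmod M$
A, B, M: k-bit numbers
Integer domain -> Montgomery domain
$A \rightarrow A' = A \cdot 2^k \bmod M$
$B \rightarrow B' = B \cdot 2^k \bmod M$
$C' = MP(A', B', M) = A' \cdot B' \cdot 2^{-k} \bmod M = (A \cdot 2^k) \cdot (B \cdot 2^k) \cdot 2^{-k} \bmod M = A \cdot B \cdot 2^k \bmod M$
$C = A \cdot B \rightarrow C' = C \cdot 2^k \bmod M$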
32Montgomery Modular Multiplication (2)
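Conversion to the Montgomery domain, $A \rightarrow A'$:
$A' = MP(A, 2^{2k} \bmod M, M)$
Conversion back to the integer domain, $C' \rightarrow C$:
$C = MP(C', 1, M)$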
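The Montgomery product and the two domain conversions can be sketched with a bit-serial (radix-2) loop. This is one common formulation and an assumption here, not necessarily the variant intended for the project; it requires M to be odd and A, B < M < 2^k. The function names `mp` and `mont_mul_mod` are hypothetical.

```python
def mp(a, b, m, k):
    """Montgomery product: a * b * 2^(-k) mod m, for odd m and a, b < m < 2^k."""
    s = 0
    for i in range(k):
        if (a >> i) & 1:       # add b when bit i of a is set
            s += b
        if s & 1:              # make s even by adding m (m is odd)
            s += m
        s >>= 1                # exact division by 2
    return s - m if s >= m else s

def mont_mul_mod(a, b, m, k):
    """Compute a * b mod m entirely with Montgomery products."""
    r2 = pow(2, 2 * k, m)      # constant 2^(2k) mod M
    a_m = mp(a, r2, m, k)      # A' = A * 2^k mod M
    b_m = mp(b, r2, m, k)      # B' = B * 2^k mod M
    c_m = mp(a_m, b_m, m, k)   # C' = A * B * 2^k mod M
    return mp(c_m, 1, m, k)    # C  = C' * 2^(-k) mod M

print(mont_mul_mod(20, 41, 117, 7))
```

In an exponentiation the two conversions are paid only once, at the start and the end, while every intermediate multiplication stays in the Montgomery domain.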
33Montgomery Modular Multiplication (3)
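[Figure: reduction of the 2k-bit product $X = A \cdot B$ with digits $x_{2n-1}, x_{2n-2}, \ldots, x_1, x_0$. The multiples $q_0 M$, $q_1 M b$, ... are added in turn so that the least significant digits $x_0, x_1, \ldots$ become 0; the remaining upper k bits form C.]
$C \cdot 2^k = X + zM$
$C \cdot 2^k \equiv X = AB$
$C \equiv AB \cdot 2^{-k}$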
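37Fast modular exponentiation using Chinese Remainder Theorem
$M = C^d \bmod N$
$C_P = C \bmod P$, $d_P = d \bmod (P-1)$
$C_Q = C \bmod Q$, $d_Q = d \bmod (Q-1)$
$M_P = C_P^{d_P} \bmod P$
$M_Q = C_Q^{d_Q} \bmod Q$
$M = (M_P \cdot R_Q + M_Q \cdot R_P) \bmod N$
where
$R_P = (P^{-1} \bmod Q) \cdot P = P^{Q-1} \bmod N$
$R_Q = (Q^{-1} \bmod P) \cdot Q = Q^{P-1} \bmod N$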
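The CRT recombination can be checked numerically with toy parameters. A minimal sketch: the function name `rsa_decrypt_crt` and the values P = 61, Q = 53, e = 17, d = 2753 are assumptions chosen for illustration and are far too small for real use.

```python
def rsa_decrypt_crt(c, d, p, q):
    """Compute C^d mod (P*Q) from two half-size exponentiations."""
    n = p * q
    d_p, d_q = d % (p - 1), d % (q - 1)
    m_p = pow(c % p, d_p, p)           # M_P = C_P^dP mod P
    m_q = pow(c % q, d_q, q)           # M_Q = C_Q^dQ mod Q
    r_p = pow(p, q - 1, n)             # R_P: 0 mod P, 1 mod Q
    r_q = pow(q, p - 1, n)             # R_Q: 1 mod P, 0 mod Q
    return (m_p * r_q + m_q * r_p) % n

p, q, e, d = 61, 53, 17, 2753
cipher = pow(65, e, p * q)
print(rsa_decrypt_crt(cipher, d, p, q), pow(cipher, d, p * q))
```

In practice $R_P$ and $R_Q$ depend only on the key, so they would be precomputed once rather than recomputed for each ciphertext as in this sketch.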
38Time of exponentiation without and with Chinese Remainder Theorem
SOFTWARE
Without CRT: $t_{EXP}(k) = c_s \cdot k^3$
With CRT: $t_{EXP\text{-}CRT}(k) \approx 2 \cdot c_s \cdot (k/2)^3 = \frac{1}{4}\, t_{EXP}(k)$
HARDWARE
Without CRT: $t_{EXP}(k) = c_h \cdot k^2$
With CRT: $t_{EXP\text{-}CRT}(k) \approx c_h \cdot (k/2)^2 = \frac{1}{4}\, t_{EXP}(k)$
39- Topic 4
- RSA Encryption Decryption
- with
- Word-Based
- Montgomery Multipliers
41Data dependency graph of a classical architecture by Tenca and Koc
42Data dependency graph of a new design from GWU and GMU
43Block diagram of the new architecture
44Block diagram of the main Processing Element
48- Topic 5
- p-1 Method of Factoring
49p-1 algorithm
- Inputs
- N: number to be factored
- a: arbitrary integer such that $\gcd(a, N) = 1$
- $B_1$: smoothness bound for Phase 1
- Outputs
- q: factor of N, $1 < q \le N$
- or FAIL
50p-1 algorithm Phase 1
precomputations
main computations
postcomputations
out of scope for this project
51p-1 Phase 1 Numerical example
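- $N = 1\,740\,719 = 1279 \cdot 1361$
- $a = 2$
- $B_1 = 20$
- $k = 2^4 \cdot 3^2 \cdot 5 \cdot 7 \cdot 11 \cdot 13 \cdot 17 \cdot 19 = 232\,792\,560$
- $q_0 = a^k \bmod N = 2^{232\,792\,560} \bmod 1\,740\,719 = 1\,003\,058$
- $q = \gcd(1\,003\,058 - 1;\ 1\,740\,719) = 1361$
- Why did the method work?
- $q - 1 = 1360 = 2^4 \cdot 5 \cdot 17 \mid k$
- $a^k \bmod q = a^{(q-1) \cdot m} \bmod q = 1$
- $q \mid a^k - 1$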
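Phase 1 can be sketched in software to reproduce the numerical example. This is a reference model only: the function names are hypothetical, and treating a trivial gcd (1 or N) as FAIL by returning `None` is an assumption of this sketch.

```python
from math import gcd

def primes_up_to(bound):
    """Return all primes p <= bound (sieve of Eratosthenes)."""
    is_prime = [True] * (bound + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(bound ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, bound + 1, p):
                is_prime[multiple] = False
    return [p for p, flag in enumerate(is_prime) if flag]

def smooth_exponent(b1):
    """k = product over primes p <= B1 of the largest power of p not exceeding B1."""
    k = 1
    for p in primes_up_to(b1):
        power = p
        while power * p <= b1:
            power *= p
        k *= power
    return k

def p_minus_1_phase1(n, a, b1):
    """Return a nontrivial factor of n, or None (FAIL)."""
    k = smooth_exponent(b1)           # precomputation
    q0 = pow(a, k, n)                 # main computation: a^k mod N
    q = gcd(q0 - 1, n)                # postcomputation
    return q if 1 < q < n else None

print(smooth_exponent(20), p_minus_1_phase1(1740719, 2, 20))
```

The method succeeds here because 1361 - 1 has only small prime-power factors, while 1279 - 1 = 2 · 3² · 71 contains the prime 71 > B1.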
52Design Methodology Options
53by Mike Babst DSPlogic
54- Methodology 1
- RTL VHDL
- Classical VHDL-based
- Design Methodology
55Structure of a Typical Digital System
Data Inputs
Control Inputs
Control Signals
Execution Unit (Datapath)
Control Unit (Control)
Data Outputs
Control Outputs
56Hardware Design with RTL VHDL
Interface
Pseudocode
Control Unit
Execution Unit
Block diagram
Block diagram
ASM
VHDL code
VHDL code
VHDL code
57Steps of the Design Process
- Text description
- Interface
- Pseudocode
- Block diagram of the Execution Unit
- Interface with the division into Execution Unit and Control Unit
- ASM chart and/or block diagram of the Control Unit
- RTL VHDL code
- Testbench
- Debugging
- Synthesis and implementation
- Experimental testing (not required in this course)
58Project 2 - Platform tools
- Target devices: Xilinx FPGAs
- Tools
- VHDL Simulation: Aldec Active HDL or Xilinx ModelSim
- VHDL Synthesis: Synplify Pro or Xilinx XST
- Implementation: Xilinx ISE or Xilinx WebPack
All tools available in ST 2, rooms 203 and 265. Xilinx tools available for free for home use. Aldec Active HDL student edition available for home use.
59- Methodology 2
- Graphical Data Flow Language
- DSPlogic RCToolbox
60- See the presentation by
- Mike Babst, PhD
- DSPlogic
- available through WebCT
61Project 2 - Platform tools
- Target devices Xilinx FPGAs
- Tools
- Design Entry Debugging
- DSPlogic RC Toolbox
- MathWorks Simulink
- MathWorks Matlab
- Synthesis and Implementation
- Xilinx System Generator
- Xilinx ISE
All tools available in ST 2, room 220.
62- Two hands-on sessions
- given by Dr. Babst
- during the first two weeks after
- the selection of the project
63Reconfigurable computers supported by DSPlogic toolset
Machine                    Released
Cray XD1 from Cray         2005
SGI Altix from SGI         2005
64What is a Reconfigurable Computer?
65- Methodology 3
- HLL Compilers
- Celoxica Handel C
66Design Flow
Executable Specification
Handel-C
VHDL
Synthesis
EDIF
EDIF
Place Route
67Handel-C / ANSI-C Comparisons
Handel-C Standard Library
Preprocessors, i.e., #define
ANSI-C Standard Library
Parallelism
Structures
Pointers
Arbitrary width variables
Side Effects ie. X Y
ANSI-C Constructs for, while, if, switch
Arrays
RAM, ROM
Bitwise logical operators
Signals
Channels
Recursion
Logical operators
Interfaces
Arithmetic operators
Enhanced bit manipulation
Functions
Floating Point
ANSI-C
HANDEL-C
68Handel-C Language (1)
- A subset of ANSI-C
- Sequential software style with a par construct to implement parallelism
- A channel (chan) statement allows for communication and synchronization between parallel branches
- Level of design abstraction is above RTL but below behavioral
69Handel-C Language (2)
- Each assignment and delay statement takes one clock cycle
- Automatic generation of the state machine from an algorithmic description of the circuit in terms of parallel and sequential blocks
- Automatic scheduling of parallel and sequential blocks, that is, the code following a group is scheduled only after that whole group has completed
70Handel-C Language (3)
- Automatic generation of clocks, clock enables and resets
- Combinational logic may be implemented using, for example, bus, port and signal types
- It is possible to design at a level where some Handel-C statements look similar to Verilog, but the overall program structure is different
71Platform tools HLL Compilers
- Target devices: Xilinx FPGAs
- Tools
- Design Entry and Debugging
- Celoxica DK4 Design Suite
- (integrated environment providing Handel-C compiler, debugging, simulation, and synthesis to EDIF and VHDL)
- Synthesis and Implementation
- Xilinx ISE
All tools available in ST 2, rooms 203 and 265.
72VHDL macro declaration in Handel-C
VHDL entity:
    ENTITY parmult IS
      PORT (
        clk : IN  std_logic;
        a   : IN  std_logic_vector(7 DOWNTO 0);
        b   : IN  std_logic_vector(7 DOWNTO 0);
        q   : OUT std_logic_vector(15 DOWNTO 0));
    END parmult;
Corresponding Handel-C interface declaration:
    interface parmult (unsigned 16 q)
      parmult_instance (unsigned 1 clk, unsigned 8 a, unsigned 8 b)
      with {busformat = "B(I)"};
73VHDL macro instantiation in Handel-C
    unsigned 8 x1, x2;
    unsigned resultX;
    interface parmult
      (unsigned 16 q)
      parmult_instance1
      (unsigned 1 clk = __clock,
       unsigned 8 a = x1,
       unsigned 8 b = x2)
      with {busformat = "B(I)"};
74Celoxica RC10 board supporting Handel C
libraries used in the GMU ECE 448 FPGA and ASIC
Design with VHDL
75Literature
- Additional literature with the detailed
- description of all algorithms available
- for each project.
76Project Organization
- 1-3 person teams allowed
- 2 person teams preferred
- Please submit, by Friday midnight at the latest, your
- - ranking of 4 topics
- - ranking of 3 design methodologies