Fast Modular Reduction - PowerPoint PPT Presentation

Slides: 15
Provided by: willhase

1
Fast Modular Reduction
  • Will Hasenplaugh
  • Gunnar Gaubatz
  • Vinodh Gopal
  • June 27, 2007

2
Modular Multiplication
  • Modular Multiplication is used in Public Key
    Cryptography
  • Diffie-Hellman and RSA
  • Prime-field Elliptic Curve Cryptography
  • Compute AB mod M where A, B and M are typically
    hundreds to thousands of bits
  • We present a variant of Barrett's Modular
    Reduction Algorithm which exploits Karatsuba
    Multiplication and Modular Folding
  • Analysis is software focused
  • We use an abstract processor to compare
    algorithms fairly
  • The native word size is w bits (w is a power of 2)
  • 1-cycle add and an m-cycle multiply
  • We present example data on an 8-bit processor
    with a 2-cycle multiplier
  • Atmel AVR series - representative of embedded
    handheld devices
  • Our algorithm is also applicable to hardware
    acceleration

3
Montgomery vs. Barrett
  • Word-Serial Montgomery
    • Pro
      • Regularity
      • Interleaved Multiply and Reduce
      • Low-Complexity Quotient Estimation
      • Right-to-Left computation leads to convenient
        hardware pipelines
    • Con
      • Transformation Overhead
      • $n^2$ complexity
  • Barrett
    • Pro
      • No Transformation Overhead
      • Large Digit Based Computation
      • Allows sub-$n^2$ multiplication techniques
      • Flexible Off the Shelf hardware
    • Con
      • Quotient Estimation requires a large digit
        multiplication
      • Left-to-Right computation is less convenient for
        hardware
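
The Montgomery properties listed above (transformation overhead, interleaved multiply and reduce, a quotient digit taken from the low word only, right-to-left processing) can be illustrated with a minimal Python sketch. It is not the presenters' implementation: the function names, the default word size `w=8`, and the use of Python big integers in place of word arrays are all assumptions made for illustration.

```python
def montgomery_params(m, w):
    """Precompute (num_words, m_neg_inv) for an odd modulus m and word size w."""
    num_words = -(-m.bit_length() // w)            # ceil(bits / w)
    m_neg_inv = (-pow(m, -1, 1 << w)) % (1 << w)   # -m^-1 mod 2^w (Python 3.8+)
    return num_words, m_neg_inv


def to_montgomery(x, m, w, num_words):
    """Transformation overhead: map x to x * R mod m with R = 2^(w * num_words)."""
    return (x << (w * num_words)) % m


def montgomery_mul(a_bar, b_bar, m, w, num_words, m_neg_inv):
    """Return a_bar * b_bar * R^-1 mod m, consuming a_bar one word at a time."""
    mask = (1 << w) - 1
    t = 0
    for i in range(num_words):                 # right-to-left over the words of a_bar
        a_i = (a_bar >> (w * i)) & mask
        t += a_i * b_bar                       # multiply step
        q = ((t & mask) * m_neg_inv) & mask    # quotient digit from the low word only
        t = (t + q * m) >> w                   # reduce step: the low word is now zero
    return t - m if t >= m else t


def mulmod_via_montgomery(a, b, m, w=8):
    """Compute a * b mod m for odd m, including both domain conversions."""
    num_words, m_neg_inv = montgomery_params(m, w)
    a_bar = to_montgomery(a, m, w, num_words)
    b_bar = to_montgomery(b, m, w, num_words)
    c_bar = montgomery_mul(a_bar, b_bar, m, w, num_words, m_neg_inv)
    # Multiplying by 1 strips the remaining factor of R.
    return montgomery_mul(c_bar, 1, m, w, num_words, m_neg_inv)
```

The conversions in `mulmod_via_montgomery` are the overhead that Barrett avoids; in an exponentiation they are paid once and amortized over many multiplications.
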

4
Barrett vs. Montgomery
  • Performance of $n^2$ Barrett approaches 2/3 of
    Montgomery
  • Quotient Estimation for Montgomery is amortized
    as operands grow

5
Karatsuba Multiplication
  • Recursive multiplication algorithm with
    $O(n^{1.585})$ complexity.
  • Schoolbook multiplication complexity scales as
    $O(n^2)$, but requires fewer additions per
    recursion.
  • $N = AB$
  • $A = a_1 2^n + a_0$
  • $B = b_1 2^n + b_0$
  • Schoolbook Multiplication:
  • $N = a_1 b_1 2^{2n} + (a_1 b_0 + a_0 b_1) 2^n + a_0 b_0$
  • Karatsuba Multiplication:
  • $N = a_1 b_1 2^{2n} + [(a_1 + a_0)(b_1 + b_0) - a_1 b_1 - a_0 b_0] 2^n + a_0 b_0$

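[Figure: Karatsuba dataflow. A and B are split into halves ($a_1$, $a_0$) and ($b_1$, $b_0$). Three products are formed: $a_1 b_1$, $a_0 b_0$ and $(a_1 + a_0)(b_1 + b_0)$. Subtracting $a_0 b_0$ and $a_1 b_1$ from the third product gives the middle term, and the three terms are combined into $N = AB$.]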
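
The Karatsuba split on this slide can be sketched recursively in Python. This is a minimal illustration rather than the presenters' code: the name `karatsuba`, the `cutoff_bits` parameter, and the use of Python's built-in multiply as a stand-in for the native word multiply are assumptions.

```python
def karatsuba(a, b, cutoff_bits=8):
    """Multiply non-negative integers with three half-size products per level."""
    assert cutoff_bits >= 4, "smaller cutoffs would not shrink the operands"
    if a.bit_length() <= cutoff_bits or b.bit_length() <= cutoff_bits:
        return a * b                        # base case: stand-in for a word multiply
    n = max(a.bit_length(), b.bit_length()) // 2
    mask = (1 << n) - 1
    a1, a0 = a >> n, a & mask               # A = a1 * 2^n + a0
    b1, b0 = b >> n, b & mask               # B = b1 * 2^n + b0
    high = karatsuba(a1, b1, cutoff_bits)   # a1 * b1
    low = karatsuba(a0, b0, cutoff_bits)    # a0 * b0
    # (a1 + a0)(b1 + b0) - a1*b1 - a0*b0 equals a1*b0 + a0*b1
    middle = karatsuba(a1 + a0, b1 + b0, cutoff_bits) - high - low
    return (high << (2 * n)) + (middle << n) + low
```

Note that the sums `a1 + a0` and `b1 + b0` can be one bit longer than a half, which is the carry growth the following slides deal with.
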
6
Recursive Karatsuba Decomposition
[Figure: A is split into $a_1$ and $a_0$; the sum $a_1 + a_0$ spills into an extra word, annotated < 1, < 2 and < 3 in the diagram.]
For k recursions, the extra word is < $\log_2 k$ bits.
Just one extra word on an 8-bit machine is
sufficient to handle multiplication of numbers up
to $2^{258}$ bits.
There are fewer particles in the universe than
that.
So, we probably won't need to rewrite this code.
7
Carry Handling
  • There is considerable overhead in the naïve
    implementation of Karatsuba.
  • At a recursion depth of 4, 20 of the multiplies
    are with sparsely populated extra words.
  • We turn sparsely populated multiplies into
    branches and adds.
  • $N = AB$
  • $A = a_h 2^n + a_l$
  • $B = b_h 2^n + b_l$
  • $a_h$ and $b_h$ are booleans
  • $N = a_h b_h 2^{2n} + (a_h b_l + b_h a_l) 2^n + a_l b_l$

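[Figure: multiplication with boolean high parts. The product $a_l b_l$ is the only full multiply. $b_l$ is added if $a_h = 1$, $a_l$ is added if $b_h = 1$, and 1 is added if $a_h = b_h = 1$, each at the matching offset, giving N.]
Each recursion is a conveniently sized multiply
-> no extra words.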
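
The idea of turning sparsely populated multiplies into branches and adds can be sketched as follows. The function name `mul_with_boolean_high_parts` is hypothetical, and Python integers stand in for the word arrays of a real implementation.

```python
def mul_with_boolean_high_parts(a, b, n):
    """Multiply (n+1)-bit operands whose top bit is a carry flag.

    Only one n-bit by n-bit multiply is performed; the contributions of
    the boolean high parts are handled with branches and shifted adds.
    """
    mask = (1 << n) - 1
    ah, al = a >> n, a & mask          # A = ah * 2^n + al, ah in {0, 1}
    bh, bl = b >> n, b & mask          # B = bh * 2^n + bl, bh in {0, 1}
    assert ah in (0, 1) and bh in (0, 1)
    result = al * bl                   # the single conveniently sized multiply
    if ah:
        result += bl << n              # ah * bl * 2^n
    if bh:
        result += al << n              # bh * al * 2^n
    if ah and bh:
        result += 1 << (2 * n)         # ah * bh * 2^(2n)
    return result
```
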
8
Karatsuba vs. Schoolbook Multiplication
9
Barrett's Algorithm
  • A, B and M are n-bit numbers. We seek to find
    $R = AB \bmod M$ using Barrett's Algorithm.
  • A total of 3 n-bit multiplies.

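[Figure: Barrett reduction dataflow. (1) $N = A \times B$, split into $N \bmod 2^n$ and $\lfloor N / 2^n \rfloor$. (2) $\lfloor N / 2^n \rfloor \times \mu$, of which $\lfloor \mu N / 2^{2n} \rfloor$ is kept. (3) $\lfloor \mu N / 2^{2n} \rfloor \times M$, which is subtracted to give R.]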
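
The three multiplies of Barrett's algorithm can be sketched in Python. The slide does not define µ; this sketch assumes the standard Barrett constant µ = floor(2^(2n) / M), precomputed once per modulus. The function names and the final correction loop are also assumptions, added because the quotient estimate can fall slightly short of the true quotient.

```python
def barrett_setup(m):
    """Precompute (n, mu) for modulus m, assuming mu = floor(2^(2n) / m)."""
    n = m.bit_length()
    mu = (1 << (2 * n)) // m
    return n, mu


def barrett_mulmod(a, b, m, n, mu):
    """Return a * b mod m for 0 <= a, b < m using three multiplies."""
    product = a * b                     # multiply 1: N = A * B
    q = ((product >> n) * mu) >> n      # multiply 2: quotient estimate, never too large
    r = product - q * m                 # multiply 3: subtract the estimated multiple of M
    while r >= m:                       # the estimate is low by at most a few multiples of M
        r -= m
    return r
```

A usage example under the same assumptions: `n, mu = barrett_setup(m)` is computed once, after which `barrett_mulmod(a, b, m, n, mu)` can be called repeatedly, for instance inside a modular exponentiation loop.
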
10
Barrett vs. Montgomery
11
Folding
  • We accelerate the reduction process by partially
    reducing N (= AB) with an inexpensive method
    called Folding

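[Figure: folding dataflow. $N = A \times B$ is split into $N \bmod 2^{3s}$ and $\lfloor N / 2^{3s} \rfloor$. The high part is multiplied by the precomputed constant $M' = 2^{3s} \bmod M$, and the product is added to the low part to give the partially reduced $N'$.]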
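
A single fold can be sketched in Python as below. The sketch assumes s = n/2 (so that 2^(3s) = 2^(1.5n), matching the fold point on the next slide), an even n, and a precomputed constant 2^(3s) mod M; the function names are hypothetical.

```python
def fold_constant(m, n):
    """Precompute 2^(3s) mod m with s = n / 2 (n assumed even)."""
    s = n // 2
    return pow(2, 3 * s, m)


def fold(product, m_prime, n):
    """Partially reduce a 2n-bit product to about 1.5n bits.

    The result is congruent to the input modulo m but is not fully
    reduced; a final reduction (for example Barrett) is still required.
    """
    k = 3 * (n // 2)
    low = product & ((1 << k) - 1)      # N mod 2^(3s)
    high = product >> k                 # floor(N / 2^(3s)), about n/2 bits
    return low + high * m_prime         # one (n/2)-bit by n-bit multiply
```

The saving comes from replacing part of the full-width quotient multiply with a half-width multiply by a constant that depends only on the modulus.
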
12
Iterative Folding
  • We can play the same trick again.
  • F times, in fact.

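[Figure: iterative folding dataflow. N is split at $2^{1.5n}$: $\lfloor N / 2^{1.5n} \rfloor \times M^{(1)}$ is added to $N \bmod 2^{1.5n}$ to give $N^{(1)}$. $N^{(1)}$ is split at $2^{1.25n}$ and its high part is multiplied by $M^{(2)}$ to give $N^{(2)}$, which is in turn split at $2^{1.125n}$.]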
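
Repeating the fold F times can be sketched as a loop over precomputed fold points. By analogy with the single fold, this sketch assumes the i-th constant is 2^((1 + 2^-i) n) mod M and that n is divisible by 2^F. The names are hypothetical, and Python's `%` operator stands in for the final Barrett step.

```python
def folding_schedule(m, n, folds):
    """Return [(k_i, 2^k_i mod m)] with k_i = n + n / 2^i for i = 1..folds."""
    assert n % (1 << folds) == 0, "n is assumed divisible by 2^folds"
    return [(n + (n >> i), pow(2, n + (n >> i), m)) for i in range(1, folds + 1)]


def iterative_fold(product, schedule):
    """Apply each fold in turn; every step preserves the value modulo m."""
    for k, m_i in schedule:
        low = product & ((1 << k) - 1)   # N^(i-1) mod 2^k_i
        high = product >> k              # floor(N^(i-1) / 2^k_i)
        product = low + high * m_i       # N^(i)
    return product


def folded_mulmod(a, b, m, n, folds=2):
    """Multiply, fold `folds` times, then finish with a full reduction."""
    schedule = folding_schedule(m, n, folds)
    partially_reduced = iterative_fold(a * b, schedule)
    return partially_reduced % m         # stand-in for the final Barrett reduction
```

Each fold shortens the value that the final reduction has to handle, at the cost of one progressively smaller multiply and one stored constant per fold.
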
13
Iterative Folding (F = 2)
14
Summary
  • This Fast Modular Reduction technique is 2x
    faster than Montgomery for RSA encryption with
    512- and 1024-bit keys.
  • As security requirements heighten, key sizes will
    grow to meet them and the asymptotic advantage of
    Karatsuba will continue to shine. We see a 3x
    and 4x advantage, respectively, for 2048- and
    4096-bit keys.
  • The speedup of a multiplier-bound, w-bit
    architecture is
  • Strong encryption on low-power handheld devices
    is challenging
  • Example: a 16 MHz 8-bit Atmel AVR computes a
    4096-bit RSA operation in almost 4 minutes with
    Montgomery, but we can do it in 1.