Title: Efficient Architectures for Elliptic Curve Cryptography Processors for RFID
1. Efficient Architectures for Elliptic Curve Cryptography Processors for RFID
- Lawrence Leinweber,
- Christos Papachristou,
- and Francis G. Wolff
Dept. of Electrical Engineering and Computer Science
2. Motivation
- Promiscuous RFID Tags Communicate Information to Any Reader
- There is a Need for Security and Privacy in RFID Tags
- Smaller, Faster Asymmetric Key Cryptographic Processors are Needed for RFID Tags and Other Systems
3. Galois Field Math
- Extension Fields: Polynomial Math, No Carries
- $GF(2^m)$: Even Coefficients → 0, Odd Coefficients → 1
- Reduction: $x^{20} + x^5 \equiv x^9 + x^5 + x^2 + 1 \pmod{x^{11} + x^2 + 1}$
- Sparse Polynomials: Trinomials and Pentanomials
- Addition
- Multiply
- Easy Squaring
- Inversion: $a^{q-2} \equiv a^{-1} \pmod{f}$
- Fermat's Little Theorem: $a^q \equiv a \pmod{f}$, $q = 2^m$
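The field operations listed on this slide can be sketched in a few lines. This is a minimal illustration, not the processor's datapath: polynomials over GF(2) are stored as Python integers (bit i is the coefficient of x^i), and the names `gf2_reduce`, `gf2_mul`, and `gf2_inv_fermat` are hypothetical. It reproduces the slide's reduction example and the Fermat-style inversion $a^{q-2}$.

```python
def gf2_reduce(a: int, f: int) -> int:
    """Reduce polynomial a modulo the field polynomial f (both as bit vectors)."""
    m = f.bit_length() - 1                 # degree of f
    while a.bit_length() > m:              # while deg(a) >= m
        a ^= f << (a.bit_length() - 1 - m) # XOR cancels the leading term, no carries
    return a

def gf2_mul(a: int, b: int, f: int) -> int:
    """Shift-and-XOR (carry-less) multiply followed by reduction modulo f."""
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a <<= 1
        b >>= 1
    return gf2_reduce(acc, f)

def gf2_inv_fermat(a: int, f: int) -> int:
    """Inverse as a^(q-2), q = 2^m, by square-and-multiply."""
    m = f.bit_length() - 1
    e = (1 << m) - 2
    result, base = 1, a
    while e:
        if e & 1:
            result = gf2_mul(result, base, f)
        base = gf2_mul(base, base, f)
        e >>= 1
    return result

F11 = (1 << 11) | (1 << 2) | 1             # x^11 + x^2 + 1, the slide's modulus
r = gf2_reduce((1 << 20) | (1 << 5), F11)  # x^20 + x^5
print(bin(r))                              # 0b1000100101, i.e. x^9 + x^5 + x^2 + 1

a = 0b1011                                 # arbitrary nonzero element
print(gf2_mul(a, gf2_inv_fermat(a, F11), F11))  # 1
```

Plain square-and-multiply needs on the order of m multiplications per inversion, which is why an addition-chain method is preferred in hardware.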
4. Elliptic Curves in $GF(2^m)$
- Simplified Equation: $y^2 + xy = x^3 + a_2 x^2 + a_6$
- Group Law, Point Addition: 3 collinear points, $P_1$, $P_2$, $-P_3$
- Additive Inverse: $-P = -(x, y) = (x, x + y)$
- Additive Identity: $P + (-P) = \infty$
- Point Addition, $P_3 = P_1 + P_2$, slope $\lambda = (y_1 + y_2) / (x_1 + x_2)$
- $x_3 = \lambda^2 + \lambda + a_2 + x_1 + x_2$, $y_3 = \lambda(x_1 + x_3) + y_1 + x_3$
- Point Doubling, if $P_1 = P_2$, slope $\lambda = x_1 + y_1 / x_1$
- Projective Coordinates: $P = (x, y, z) = (x/z, y/z)$
- Point Multiplication: $n$ Repeated Additions, $P = nP_1$
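The affine group law on this slide can be exercised on a toy curve. The sketch below is only an illustration under stated assumptions: the field is GF(2^4) with modulus x^4 + x + 1, the coefficients `A2` and `A6` are arbitrary hypothetical values, and all function names are invented for the example. It uses the slide's slope formulas for addition and doubling and brute-forces the point set.

```python
M = 4
F = 0b10011                    # x^4 + x + 1 (toy field, assumption)
A2, A6 = 0b1000, 0b1001        # hypothetical curve coefficients, A6 != 0
INF = None                     # point at infinity (additive identity)

def mul(a, b):
    """Multiply in GF(2^M): carry-less product, then reduce modulo F."""
    acc = 0
    for i in range(M):
        if (b >> i) & 1:
            acc ^= a << i
    for i in range(2 * M - 2, M - 1, -1):
        if (acc >> i) & 1:
            acc ^= F << (i - M)
    return acc

def inv(a):
    """Inverse as a^(2^M - 2), by repeated multiplication (fine for a toy field)."""
    r = 1
    for _ in range((1 << M) - 2):
        r = mul(r, a)
    return r

def on_curve(P):
    """Check y^2 + xy = x^3 + a2 x^2 + a6."""
    if P is INF:
        return True
    x, y = P
    x2 = mul(x, x)
    return mul(y, y) ^ mul(x, y) == mul(x2, x) ^ mul(A2, x2) ^ A6

def neg(P):
    """-(x, y) = (x, x + y)."""
    return INF if P is INF else (P[0], P[0] ^ P[1])

def add(P1, P2):
    """Affine addition and doubling with the slope formulas from the slide."""
    if P1 is INF:
        return P2
    if P2 is INF:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2:
        if y1 != y2 or x1 == 0:            # P2 = -P1 (covers the order-2 point)
            return INF
        lam = x1 ^ mul(y1, inv(x1))        # doubling slope
    else:
        lam = mul(y1 ^ y2, inv(x1 ^ x2))   # addition slope
    x3 = mul(lam, lam) ^ lam ^ A2 ^ x1 ^ x2
    y3 = mul(lam, x1 ^ x3) ^ y1 ^ x3
    return (x3, y3)

points = [INF] + [(x, y) for x in range(16) for y in range(16) if on_curve((x, y))]
P = points[1]
print(len(points), on_curve(add(P, P)), add(P, neg(P)) is INF)
```

Each affine addition costs one field inversion, which is the motivation for the projective coordinates mentioned above.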
5. Lopez-Dahab Add and Double
- 6 Multiplies
- 4 Squares
- 6 Registers (R6)
6. Modified Lopez-Dahab
- 7 Multiplies
- 4 Squares
- 5 Registers (R5)
7. Registers and Datapaths
8. High-Level Organization, ALU
9. Galois Field ALU
- Degree $m-1$ Polynomial, $m$-bit Operands
- Addition: Bit-Wise XOR
- Reduction: Wiring and XOR Gates
- Squaring
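The ALU operations above map onto very little logic, which a software model makes visible. In this sketch (names are hypothetical, and the NIST B-163 pentanomial $x^{163} + x^7 + x^6 + x^3 + 1$ is used as the example modulus), addition is a single XOR and squaring is a fixed bit-spreading followed by reduction; in hardware both the spreading and the reduction by a sparse polynomial come down to wiring and a few XOR gates.

```python
F163 = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1   # NIST B-163 field polynomial

def add(a: int, b: int) -> int:
    """Field addition: bit-wise XOR of the m-bit operands."""
    return a ^ b

def spread_bits(a: int) -> int:
    """Polynomial squaring over GF(2): bit i moves to bit 2i, zeros in between."""
    out, i = 0, 0
    while a:
        if a & 1:
            out |= 1 << (2 * i)
        a >>= 1
        i += 1
    return out

def reduce_mod(a: int, f: int) -> int:
    """Reduce modulo f by cancelling leading terms with shifted copies of f."""
    m = f.bit_length() - 1
    while a.bit_length() > m:
        a ^= f << (a.bit_length() - 1 - m)
    return a

def square(a: int, f: int = F163) -> int:
    """Field squaring: spread the bits, then reduce."""
    return reduce_mod(spread_bits(a), f)

print(bin(spread_bits(0b1011)))   # 0b1000101
print(square(1 << 82))            # x^164 reduces to x^8 + x^7 + x^4 + x = 402
```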
10. Galois Field Multiplier
- AND-XOR
- Bit-Serial vs. Digit-Serial, $w$ bits of 2nd Op
- Area × Time: $O(mw) \times O(m/w) = O(m^2)$
- Simultaneous Multiplication and Reduction
- Most-Significant-Digit First Better Than Least-First
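A behavioural model of a most-significant-digit-first, digit-serial multiplier may help make the area/time trade-off concrete. This is a sketch under assumptions, not the synthesized design: each loop iteration stands for one clock cycle that consumes `w` bits of the second operand and reduces immediately, and the function names and test operands are invented for the example.

```python
F163 = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1   # NIST B-163 field polynomial

def reduce_mod(a: int, f: int) -> int:
    """Reduce modulo f by cancelling leading terms with shifted copies of f."""
    m = f.bit_length() - 1
    while a.bit_length() > m:
        a ^= f << (a.bit_length() - 1 - m)
    return a

def clmul(a: int, b: int) -> int:
    """Carry-less multiply: AND each bit of b with a, XOR the partial products."""
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a <<= 1
        b >>= 1
    return acc

def mul_msd_first(a: int, b: int, f: int, w: int):
    """Digit-serial multiply, most-significant digit of b first.
    Returns (product, number of digit steps)."""
    m = f.bit_length() - 1
    steps = -(-m // w)                     # ceil(m / w)
    mask = (1 << w) - 1
    acc = 0
    for j in range(steps - 1, -1, -1):
        digit = (b >> (j * w)) & mask
        # Shift the accumulator by one digit, add a * digit, reduce in the same step.
        acc = reduce_mod((acc << w) ^ clmul(a, digit), f)
    return acc, steps

a = 0x1234567890ABCDEF1234567890ABCDEF12345678   # arbitrary test operands
b = 0x0FEDCBA9876543210FEDCBA9876543210FEDCBA9
for w in (1, 2, 4, 8, 16):
    product, steps = mul_msd_first(a, b, F163, w)
    print(w, steps, product == reduce_mod(clmul(a, b), F163))
```

The step count falls as m/w while the per-step partial product `a * digit` grows with w, which is the O(mw) × O(m/w) relationship on the slide.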
11. Key Control Logic
- Montgomery Ladder
- Maintain $P_k$, $P_{k+1}$
- For each key bit
- If key bit = 0,
- Double: $P_{2k} = 2P_k$
- Add: $P_{2k+1} = P_k + P_{k+1}$
- If key bit = 1,
- Add: $P_{2k+1} = P_k + P_{k+1}$
- Double: $P_{2k+2} = 2P_{k+1}$
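The ladder steps above can be sketched generically. To keep the example self-contained, the group is passed in as `add`, `double`, and `identity` callables (hypothetical parameter names), and plain integers under addition stand in for curve points so the invariant $P_{k+1} - P_k = P$ is easy to check. A real processor would plug in the projective add and double operations instead.

```python
def montgomery_ladder(n, P, add, double, identity):
    """Return n*P while maintaining the pair (P_k, P_{k+1})."""
    pk, pk1 = identity, P                          # k = 0: (0*P, 1*P)
    for i in range(n.bit_length() - 1, -1, -1):    # key bits, most significant first
        if (n >> i) & 1 == 0:
            pk, pk1 = double(pk), add(pk, pk1)     # (P_2k, P_2k+1)
        else:
            pk, pk1 = add(pk, pk1), double(pk1)    # (P_2k+1, P_2k+2)
    return pk

# Stand-in group: integers under addition, so n*P is ordinary multiplication.
int_add = lambda a, b: a + b
int_double = lambda a: 2 * a

print(montgomery_ladder(13, 7, int_add, int_double, 0))   # 91
```

Every key bit triggers exactly one add and one double; only the destination of each result depends on the bit, which is what allows the bit to act as a simple selector in the control logic.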
12. Inversion Control Logic
- Itoh-Tsujii Algorithm, Fermat's Little Theorem
- Define (definition not recovered from the slide), so
- (three equations not recovered from the slide)
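The equations on this slide did not survive extraction, so the following is the textbook formulation of Itoh-Tsujii inversion rather than a reconstruction of the slide. Assumptions: $\beta_k = a^{2^k - 1}$, the chaining rule $\beta_{j+k} = \beta_j^{2^k} \beta_k$, and $a^{-1} = \beta_{m-1}^2$ (which is $a^{2^m - 2}$, as Fermat's little theorem requires). All function names are hypothetical, and the NIST B-163 polynomial is used as the example field.

```python
F163 = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1   # NIST B-163 field polynomial

def mul(a: int, b: int, f: int) -> int:
    """GF(2^m) multiply: carry-less product, then reduce modulo f."""
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a <<= 1
        b >>= 1
    m = f.bit_length() - 1
    while acc.bit_length() > m:
        acc ^= f << (acc.bit_length() - 1 - m)
    return acc

def square(a: int, f: int) -> int:
    return mul(a, a, f)

def itoh_tsujii_inverse(a: int, f: int):
    """Return (a^-1, number of multiplications), for m >= 2 and a != 0."""
    m = f.bit_length() - 1
    beta, k = a, 1                         # beta_1 = a
    mults = 0
    for bit in bin(m - 1)[3:]:             # bits of m-1 after the leading 1
        t = beta
        for _ in range(k):                 # t = beta_k^(2^k)
            t = square(t, f)
        beta = mul(t, beta, f)             # beta_2k = beta_k^(2^k) * beta_k
        k *= 2
        mults += 1
        if bit == "1":
            beta = mul(square(beta, f), a, f)   # beta_(k+1) = beta_k^2 * a
            k += 1
            mults += 1
    return square(beta, f), mults          # a^-1 = beta_(m-1)^2

a = 0x1234567890ABCDEF1234567890ABCDEF12345678   # arbitrary nonzero element
a_inv, mults = itoh_tsujii_inverse(a, F163)
print(mul(a, a_inv, F163), mults)          # 1 9
```

For m = 163 this chain needs 9 multiplications plus 162 squarings, so a fast squarer carries most of the inversion.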
13. Synthesis Results
- 26 Test Vectors, $1 \le w \le 16$, NIST B-163
- R6 and R5 Designs
- $113 \le m \le 251$, $m$ prime
- $w \in \{1, 2, 4, 8, 16\}$
- Synopsys Design Compiler
- 250 nm TSMC, LEDA, low leakage cells
- 130 nm low-power IBM, ARM
- 90 nm high-vt IBM, ARM
- Area, Delay, Power (16 activity tests)
14. 250 nm Area and Energy vs. Degree
- Energy per Encryption (µJ)
15. 130 nm Area and Energy vs. Degree
- Energy per Encryption (µJ)
16. 90 nm Area and Energy vs. Degree
- Energy per Encryption (µJ)
17. Area (Gate Equiv.) vs. Degree
18. Time (Cycles) vs. Degree
19. Area and Time vs. Other Works
- Area comparison does not include memory devices
20. Other Elliptic Curve Processors
- [20], '02, Affine, EEA, LSB $w = 1$, no Square, no Montgomery
- [21], '03, Affine, EEA, MSB $w = 1$, no Square, 13 Regs
- [15], '06, 15 Muls, 10 Regs, MSB $w = 1$, no Square
- Projective Operations, Affine Result
- Lopez-Dahab Formulas, no Inversion
- [16], '06, 6 Muls, 8 Regs, MSD w1, Lopez-Dahab, no Inv.
- [17], '07, 8 Muls, 10 Regs, MSD w1, Lopez-Dahab, no Inv.
- [18], '08, 7 Muls, 6 Regs, MSD w1, Lopez-Dahab, no Inv.
- Shared z, Min. Datapaths, Mux Selected Key Bit, no Square
21. Registers, Multiplies vs. Others
22. Conclusions
- R6 Design: 6 Regs, 6 Muls By Data Flow Analysis
- R5 Design: 5 Regs, 7 Muls By Modifying Formulas
- Dedicated Squarer, Energy Savings by Resource Allocation
- Muxes, Hard-Coded Constants, Microcoding, Affine Result
- Synthesis Results in Area, Delay and Power
- Improvements Compared with Other Processors
- Difficulty with Gate Equivalent Area across Technology Scales