Title: Dynamic High-Performance Multi-Mode Architectures for AES Encryption
1 Dynamic High-Performance Multi-Mode Architectures for AES Encryption
- Eric Swankoski
- Naval Research Lab
- Vijay Narayanan
- Penn State University
2 Background and Motivation
- Bandwidth and throughput capabilities of modern optical networks are skyrocketing
- Protecting transmitted data is becoming more and more critical
- Current encryption architectures generally aren't capable of keeping up with high-speed environments
- Single-event upset (SEU) effects are rarely, if ever, considered
3 Plan of Attack: FPGA Encryption
- Algorithm: Advanced Encryption Standard (AES)
  - Supports multiple key lengths
  - Supports multiple encryption modes
  - Supports multiple levels of pipelining
- Target Architecture: Xilinx FPGAs
  - Can be adapted to ASIC devices
  - Virtex-II, Virtex-4
- Target Performance: 60 gigabits per second
  - Requires both inner-round and outer-round pipelining
4 The AES Algorithm
- 10 Rounds of Encryption for 128-bit operands
- Four basic operations
  - SubBytes
    - 8-bit substitution (16 parallel operations per round)
  - ShiftRows
    - Byte reordering and rotation (4 parallel operations per round)
  - MixColumns
    - Polynomial multiplication (4 parallel operations per round)
  - AddRoundKey
    - Simple 128-bit XOR
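As a rough illustration of the three non-substitution operations listed above, the following Python sketch applies ShiftRows, MixColumns, and AddRoundKey to a 16-byte state stored column by column. The function names and the list-of-bytes state layout are assumptions made for this example; the slides describe a hardware design, not this code. SubBytes is left out here.

```python
def xtime(b):
    """Multiply a byte by {02} in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b


def shift_rows(state):
    """Rotate row r of the column-major 4x4 state left by r bytes."""
    return [state[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]


def mix_column(col):
    """Multiply one 4-byte column by the fixed AES MixColumns matrix."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3),
        (xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3),
    ]


def mix_columns(state):
    """Apply mix_column to each of the four columns independently."""
    out = []
    for c in range(4):
        out += mix_column(state[4 * c:4 * c + 4])
    return out


def add_round_key(state, round_key):
    """XOR the 16-byte state with a 16-byte round key."""
    return [s ^ k for s, k in zip(state, round_key)]
```

Each column passes through `mix_column` without reference to the others, which corresponds to the "4 parallel operations per round" noted on the slide.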
5 Optimizing for Performance
- Exploit all possible parallelism
- Alternative byte substitution methods
  - 1 cycle for a lookup-based substitution
  - 5 cycles for a mathematical transformation
- Utilize pipelining
  - Outer-Round: 1 cycle per round
  - Inner-Round
    - 4 cycles per round (lookup-based byte substitution)
    - 8 cycles per round (pipelined byte substitution)
6 Combinatorial Byte Substitution
- Actual mathematical transformation
- Conventional implementation cannot be pipelined
  - Simple (atomic) 8x8 lookup table
- Smaller than lookup table
- Faster than lookup table
- Utilizes five-stage pipeline
- All internal operands are four bits wide
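To make the "actual mathematical transformation" concrete, here is a minimal Python sketch that computes the AES S-box as a multiplicative inverse in GF(2^8) followed by the standard affine transform. It works directly on 8-bit values and does not model the four-bit internal operands or the five-stage pipeline mentioned on the slide; the helper names are assumptions for this example.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result


def gf_inv(a):
    """Multiplicative inverse computed as a^254; 0 maps to 0, as AES requires."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(b, n):
    """Rotate an 8-bit value left by n bits."""
    return ((b << n) | (b >> (8 - n))) & 0xFF


def sbox(byte):
    """AES byte substitution: field inverse, then the affine transform."""
    inv = gf_inv(byte)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


# The lookup-table alternative is simply the same function tabulated once.
SBOX_TABLE = [sbox(b) for b in range(256)]
```

Tabulating `sbox` gives the 8x8 lookup table of the conventional approach, so the two substitution methods differ in implementation cost and pipelining, not in output.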
7 Encryption Round Diagram
- Atomic S-Box
  - 40 Pipeline Stages
- Combinatorial S-Box
  - 76 Pipeline Stages
  - Needs a constant stream to be effective
- Parallel Key Scheduling
  - No performance penalty
- Offline Key Scheduling
  - Precomputed keys can be stored in registers
8 Counter (CTR) Mode
- Effectively converts AES into a stream cipher
- High security similar to CBC
- Supports inner-round and outer-round pipelining
- No error propagation: errors are completely isolated
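The stream-cipher behaviour and error isolation described above can be sketched in a few lines of Python. The block function here is a stand-in built from SHA-256 (it is NOT AES), and the 8-byte nonce plus 8-byte counter layout is an assumption for this example; only the mode structure is the point.

```python
import hashlib


def toy_block(key, block):
    """Stand-in 16-byte keyed function (NOT AES): truncated SHA-256 of key || block."""
    return hashlib.sha256(key + block).digest()[:16]


def ctr_crypt(key, nonce, data):
    """Encrypt or decrypt by XORing data with keystream blocks E(nonce || counter)."""
    out = bytearray()
    for i in range(0, len(data), 16):
        counter_block = nonce + (i // 16).to_bytes(8, "big")
        keystream = toy_block(key, counter_block)
        out += bytes(d ^ k for d, k in zip(data[i:i + 16], keystream))
    return bytes(out)


key = bytes(16)
nonce = b"\x01" * 8
plaintext = bytes(range(64))
ciphertext = ctr_crypt(key, nonce, plaintext)

# Simulate a single-bit upset in the ciphertext (byte 20, inside block 1).
corrupted = bytearray(ciphertext)
corrupted[20] ^= 0x01
recovered = ctr_crypt(key, nonce, bytes(corrupted))
damaged = [i for i in range(len(plaintext)) if recovered[i] != plaintext[i]]
```

Because every keystream block depends only on the counter value, no block waits on another, which is what allows the cipher core to be pipelined. In this sketch `damaged` contains only the index of the byte that was hit.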
9 Cipher Block Chaining (CBC) Mode
- Most secure: no patterns are observed
- Cannot be pipelined
- 100% downstream corruption resulting from data loss or single-event upsets (SEUs) during encryption
- Errors are isolated during decryption
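The chaining dependency can be illustrated with a small Python sketch. The block cipher is a toy four-round Feistel network built on SHA-256 (a stand-in, NOT AES), and the `fault_at` parameter is a hypothetical hook that flips one bit of a ciphertext block as it is produced and fed back. The sketch compares ciphertext streams with and without that fault, and separately shows the effect of one ciphertext block corrupted before decryption.

```python
import hashlib


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key, rnd, half):
    """Feistel round function: truncated SHA-256 of key, round index, and half block."""
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]


def toy_encrypt(key, block):
    """Stand-in invertible 16-byte block cipher (4-round Feistel, NOT AES)."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right


def toy_decrypt(key, block):
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right


def cbc_encrypt(key, iv, blocks, fault_at=None):
    """CBC encryption; optionally flip one bit of block fault_at (simulated upset)."""
    prev, out = iv, []
    for n, p in enumerate(blocks):
        c = toy_encrypt(key, _xor(p, prev))
        if n == fault_at:
            c = bytes([c[0] ^ 0x01]) + c[1:]
        out.append(c)
        prev = c  # block n + 1 cannot start until this value exists
    return out


def cbc_decrypt(key, iv, blocks):
    prev, out = iv, []
    for c in blocks:
        out.append(_xor(toy_decrypt(key, c), prev))
        prev = c
    return out


key, iv = bytes(16), bytes(16)
plain = [bytes([n]) * 16 for n in range(6)]
clean = cbc_encrypt(key, iv, plain)

# Upset during encryption: every later ciphertext block differs from the clean run.
faulty = cbc_encrypt(key, iv, plain, fault_at=2)
diverged = [n for n in range(6) if clean[n] != faulty[n]]

# Corruption of one ciphertext block before decryption: damage stays local.
hit = list(clean)
hit[2] = bytes([hit[2][0] ^ 0x01]) + hit[2][1:]
damaged = [n for n in range(6) if cbc_decrypt(key, iv, hit)[n] != plain[n]]
```

Here `diverged` covers block 2 and everything after it, while `damaged` covers only blocks 2 and 3. The `prev = c` line is also the reason the mode cannot be pipelined: each block's input depends on the previous block's output.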
10 Electronic Codebook (ECB) Mode
- Supports full pipelining
- No error propagation: errors are completely isolated
- Least secure: identical input gives identical output
- Patterns observable in video and image data
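A short Python sketch shows why identical input blocks leak patterns under ECB but not under a counter-based mode. The block function is a SHA-256 stand-in (NOT AES), and the four "image rows" are made-up sample data for this example.

```python
import hashlib


def toy_block(key, block):
    """Stand-in 16-byte keyed function (NOT AES): truncated SHA-256 of key || block."""
    return hashlib.sha256(key + block).digest()[:16]


def ecb_encrypt(key, blocks):
    """Each block is transformed on its own, with no position or history input."""
    return [toy_block(key, b) for b in blocks]


def ctr_encrypt(key, nonce, blocks):
    """Each block is XORed with a keystream block derived from its position."""
    out = []
    for n, b in enumerate(blocks):
        keystream = toy_block(key, nonce + n.to_bytes(8, "big"))
        out.append(bytes(x ^ y for x, y in zip(b, keystream)))
    return out


key = bytes(16)
# Hypothetical image data: three identical white rows and one black row.
image_rows = [b"\xff" * 16, b"\xff" * 16, b"\x00" * 16, b"\xff" * 16]

ecb = ecb_encrypt(key, image_rows)
ctr = ctr_encrypt(key, b"\x00" * 8, image_rows)

ecb_distinct = len(set(ecb))  # repeated rows collapse to the same ciphertext
ctr_distinct = len(set(ctr))  # every position gets a different ciphertext
```

With this input the ECB output contains only two distinct blocks, mirroring the structure of the plaintext, while the counter-based output has four.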
11 Staggered CBC Mode
- Pipelined with Output Feedback
- Each encrypted block n depends on itself and the block (n - x), where x is the latency of the pipeline
- Maintains security while mitigating some error propagation problems
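One way to read the staggered dependency is as x interleaved chains, with block n chained to ciphertext block n - x. The Python sketch below models only that encryption-side dependency. The block function is a SHA-256 stand-in (NOT AES, and not invertible), seeding the first x blocks with x separate IVs is an assumption, and `fault_at` is a hypothetical hook for injecting a single-bit upset.

```python
import hashlib


def toy_block(key, block):
    """Stand-in 16-byte keyed function (NOT AES): truncated SHA-256 of key || block."""
    return hashlib.sha256(key + block).digest()[:16]


def scbc_encrypt(key, ivs, blocks, fault_at=None):
    """Chain block n to ciphertext block n - x, where x = len(ivs)."""
    history = list(ivs)  # history[n] is the chaining value for block n
    out = []
    for n, p in enumerate(blocks):
        chained = bytes(a ^ b for a, b in zip(p, history[n]))
        c = toy_block(key, chained)
        if n == fault_at:
            c = bytes([c[0] ^ 0x01]) + c[1:]  # simulated single-bit upset
        out.append(c)
        history.append(c)  # becomes the chaining value for block n + x
    return out


key = bytes(16)
plain = [bytes([0xA0 + n]) * 16 for n in range(12)]

# x = 4: a hypothetical pipeline latency of four blocks.
ivs = [bytes([i]) * 16 for i in range(4)]
clean = scbc_encrypt(key, ivs, plain)
faulty = scbc_encrypt(key, ivs, plain, fault_at=1)
diverged = [n for n in range(12) if clean[n] != faulty[n]]

# x = 1 reduces to ordinary serial chaining, where the fault reaches every later block.
serial_clean = scbc_encrypt(key, ivs[:1], plain)
serial_faulty = scbc_encrypt(key, ivs[:1], plain, fault_at=1)
serial_diverged = [n for n in range(12) if serial_clean[n] != serial_faulty[n]]
```

With x = 4 the fault at block 1 alters only blocks 1, 5, and 9, whereas with x = 1 it alters blocks 1 through 11. Because block n needs only block n - x, x blocks can be in flight at once, which is what lets the pipeline stay full.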
12 More Challenges
- Error-Tolerant Encryption
- Maintaining High Security
- Maintaining High Performance
13 Error-Tolerant Encryption
- Are errors acceptable?
  - Possibly, but better to assume not
- How do the multiple modes of encryption deal with upsets?
- Is there a benefit to triple modular redundancy (TMR)?
  - Is it what we expect?
14 Error-Tolerant Encryption
- CTR and ECB encryption isolate errors
  - Transmission integrity largely preserved even without SEU mitigation
- TMR can ensure 100% transmission integrity
- TMR REQUIRED for CBC encryption
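The voting step at the heart of triple modular redundancy can be sketched as a bitwise two-of-three majority over the outputs of three replicated units. This is a software illustration only; the function name and the sample block are assumptions, and the slides do not specify how the voter is built in hardware.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority over three equal-length byte strings."""
    return bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))


good = bytes.fromhex("00112233445566778899aabbccddeeff")

# One replica suffers a single-bit upset; the other two outvote it.
upset = bytearray(good)
upset[5] ^= 0x10
upset = bytes(upset)
voted = majority_vote(good, upset, good)

# If two replicas suffer the same upset, the voter passes the error through.
voted_double = majority_vote(upset, upset, good)
```

In the first case `voted` equals the fault-free block. The second case shows the limit of the scheme: the vote masks a fault only while at most one replica is wrong in any given bit position.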
15 Error-Tolerant Encryption
- Image 1: Error-Free Plaintext Image
  - Before Encryption / After Decryption
  - CTR, ECB, or CBC with mitigation
- Image 2: Decrypted Plaintext Image
  - One corrupted block
  - CTR or ECB without mitigation
- Image 3: Decrypted Plaintext Image
  - One block corrupted during encryption
  - CBC without mitigation
16 Maintaining High Security
- How do the multiple modes of encryption affect security?
- Is physical protection of the key necessary?
  - Depends on the environment
- How is throughput affected by increased security?
  - Hopefully, not at all
17 Maintaining High Security
- ECB-encrypted image has observable patterns
- CTR/CBC/SCBC encryption looks like random noise
18 Maintaining High Security
- Physical Key Protection
  - Not required in aerospace applications
- Power Analysis / Soft Attacks
  - Countermeasures not mode specific
- Throughput Effects
  - ECB and CTR far outperform CBC
  - Why is CBC an official mode?
19 System-Level Diagram
- Supports ECB, CTR, CBC, and SCBC modes
- Supports two types of TMR
  - System: triplicates all control, key hardware, and mode logic
  - Encryption: triplicates only encryption and key scheduling hardware
20 Performance Results: Virtex-4

| Byte Substitution | Key Scheduling | Area | Frequency | Throughput (CTR, ECB, SCBC) | Throughput (CBC) |
|---|---|---|---|---|---|
| ROM | Online | 3588 | 339.5 MHz | 43.5 Gbps | 1.088 Gbps |
| ROM | Offline | 2827 | 446.8 MHz | 57.2 Gbps | 1.430 Gbps |
| Combinatorial | Online | 13651 | 519.2 MHz | 66.5 Gbps | 700.0 Mbps |
| Combinatorial | Offline | 10912 | 519.2 MHz | 66.5 Gbps | 700.0 Mbps |

- Key Scheduling
  - Offline uses precomputed and stored keys (compile or design time)
  - Online uses dynamically computed keys (run time)
- Significant performance improvement for combinatorial byte substitution in pipelined mode
- Virtex-II Pro performs better with ROM implementation (56.42, 60.35 Gbps)
- Better CBC performance achieved through other architectures
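The pipelined throughput figures follow from simple arithmetic: a full pipeline completes one 128-bit block per clock. The Python sketch below reproduces that calculation. It also estimates the ROM rows' CBC throughput under the assumption that a feedback mode completes one block per full pass through the 40-stage pipeline listed for the atomic S-box; that link between the ROM rows and the 40-stage figure is an assumption of this sketch. The combinatorial CBC figure is not reproduced by dividing by the 76-stage depth, so the sketch does not attempt it.

```python
BLOCK_BITS = 128  # AES block size


def pipelined_gbps(freq_mhz):
    """Throughput when one 128-bit block completes on every clock cycle."""
    return BLOCK_BITS * freq_mhz / 1000.0


def feedback_gbps(freq_mhz, latency_cycles):
    """Throughput when each block must wait latency_cycles for the previous one."""
    return pipelined_gbps(freq_mhz) / latency_cycles


# Clock frequencies taken from the results table.
frequencies_mhz = {
    "ROM, online keys": 339.5,
    "ROM, offline keys": 446.8,
    "Combinatorial": 519.2,
}

for name, freq in frequencies_mhz.items():
    print(f"{name}: {pipelined_gbps(freq):.1f} Gbps pipelined")

# Assumed latency of 40 cycles for the ROM-based (atomic S-box) design.
rom_online_cbc = feedback_gbps(339.5, 40)
rom_offline_cbc = feedback_gbps(446.8, 40)
print(f"ROM, online keys, CBC estimate: {rom_online_cbc:.3f} Gbps")
print(f"ROM, offline keys, CBC estimate: {rom_offline_cbc:.3f} Gbps")
```

Under these assumptions the pipelined values round to the table's 43.5, 57.2, and 66.5 Gbps, and the two ROM CBC estimates land within rounding of the reported 1.088 and 1.430 Gbps.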
21 Lessons Learned
- Don't try to over-optimize FPGA code
  - Returns diminish quickly
  - Sometimes less is more
- Know your synthesis tool
  - Now why did it do THAT?
- Check your system's memory
  - RAM does fail at inopportune times
  - ESPECIALLY if it has a lifetime warranty
22 Lessons Learned
- Over-optimization
  - In a highly pipelined FPGA design, routing plays a MAJOR role in the clock frequency
    - 70-80% of the total delay
  - What would work in an ASIC (or in theory, or on paper) might actually make things worse
  - Manual floorplanning and place-and-route (PR) might help, but usually provide minimal (if any) improvement
  - Moral? Try reducing the pipeline depth as well as increasing it; it just might help!