Design Flow for HW/SW Acceleration Transparency in the ThumbPod Embedded System

1
Design Flow for HW/SW Acceleration Transparency
in the ThumbPod Embedded System
  • David Hwang, Patrick Schaumont, Yi Fan, Alireza
    Hodjat, Bo-Cheng Lai, Kazuo Sakiyama, Shenglin
    Yang, Ingrid Verbauwhede
  • 2003 DAC/ISSCC Student Design Contest, UCLA
    Electrical Engineering Department

2
Outline: Case Study
  • Introduction to ThumbPod
  • Design flow: HW/SW Acceleration Transparency
  • Interface Design
  • Interface Overhead
  • Implementation
  • Software and Simulation Flow
  • FPGA Prototype
  • Conclusions and Future work

3
Introduction to ThumbPod
  • Intelligent secure keychain device that
    recognizes owner biometrically
  • Components
    • Microcontroller
    • Fingerprint sensor
    • Biometric signal processing
    • Crypto-processor
    • LCD display
    • Memory
    • Communication: IR and USB
  • Applications
    • Secure credit cards, secure memory, access
      control, etc.

4
Complete secure system design!
  • Design a complete system at all levels of
    abstraction

  • Protocol: wireless authentication protocol design
  • Algorithm: embedded fingerprint matching algorithms
  • Architecture: embedded software stack of Java, JVM / KVM, C
  • Micro-architecture: crypto-processor and DFT co-processor design
  • Circuit: circuit techniques to combat power analysis attacks
    (not on prototype)

[Figure: layered system diagram. Surviving labels: Confidentiality, Integrity, Identification; SIM; Cipher Design, Biometrics; Java, JCA; JVM, KVM; CPU, Crypto, MEM; Vcc, D, Q, CLK]

5
Fingerprint Verification Protocol: Design and
Challenges
  • Verify identity of user of ThumbPod remotely
  • Strong user-ThumbPod identity tie prevents fraud
  • Complex computations
    • Cryptography
    • Fingerprint extraction signal processing
  • ... on a constrained device
    • Energy efficiency
    • Performance considerations
    • Form factor
  • Example: extraction of a 2800-b fingerprint
    template and key generation requires 200 million cycles
  • Java used for portability and security

6
Design Flow
[Figure: design flow from IDEA to FPGA prototype for concept demonstration (DAC University Booth)]

7
Conventional Design Flows
  • ASIC-based: GOOD high performance; BAD design time, power
    (FPGA)
  • Microprocessor-based: GOOD design time; BAD low performance

[Figure: the two flows separated by the HW/SW wall. Surviving language labels: C / Matlab, C]

8
Microprocessor-based Design
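[Figure: two microprocessor-based approaches, "Hardware acceleration" and "Improved compiler"; each compiles C for the µPROC. Surviving labels: C, COMPILE-os2, COMPILE, µPROC]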
  • Bridge the HW/SW wall
  • Add hardware modules to processor
  • Modify compiler for new instructions

9
Bridging the Wall
  • ASIC-based: GOOD high performance; BAD design time
  • Microprocessor-based: GOOD design time; BAD low performance

[Figure: the HW/SW wall between the two flows. Surviving language labels: C / Matlab, C, VHDL]

10
Our Flow: Functional HW/SW Acceleration
Transparency
[Figure: flow from Java source down to the processor. Surviving labels: Java, COMPILE, Bytecode, KVM, µPROC]

11
From Another Angle
[Figure: cycle bars for a Java hash( ) call. Surviving labels: Java cycles, C cycles, co-processor cycles]

12
Advantages and Disadvantages
  • Advantages
    • Performance
    • Recode only necessary functions
    • Design over multiple abstraction levels (Java, C,
      VHDL)
    • Original Java code changes minimally
    • Java function signature remains the same
    • Smooth migration from workstation to embedded
      platform
    • Incremental refinement
  • Disadvantages
    • Interface design
    • Interface overhead (cycle count)
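
The "function signature remains the same" point can be illustrated with a small C sketch. Everything named here (hash, hash_select_backend, the two backends, and the toy checksum standing in for a real hash) is hypothetical and not taken from the ThumbPod code; the only idea borrowed from the slide is that callers keep one entry point while the implementation underneath is swapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical backend type: every implementation level must match it. */
typedef uint32_t (*hash_backend_fn)(const uint8_t *data, size_t len);

/* Reference backend: a toy additive checksum, NOT a real hash function. */
uint32_t hash_software(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return sum;
}

/* Stand-in for an accelerated backend: same result, four bytes per step.
 * On real hardware this body would drive a co-processor instead. */
uint32_t hash_accelerated(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        sum += (uint32_t)data[i] + data[i + 1] + data[i + 2] + data[i + 3];
    }
    for (; i < len; i++) {
        sum += data[i];
    }
    return sum;
}

/* The backend currently used behind the stable entry point. */
static hash_backend_fn active_backend = hash_software;

/* Swap the implementation; callers of hash() are not recompiled or edited. */
void hash_select_backend(hash_backend_fn fn) {
    assert(fn != NULL);
    active_backend = fn;
}

/* Stable entry point: its signature never changes across refinements. */
uint32_t hash(const uint8_t *data, size_t len) {
    return active_backend(data, len);
}
```

A caller keeps using hash(data, len) before and after hash_select_backend(hash_accelerated); only the cycle count changes, which is the incremental refinement the slide describes.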

13
Interface Design and Overhead: AES Example
  • Interface Design
    • Java to C interface: Java/K Native Interface (JNI
      / KNI)
    • C to HW interface: GEZEL design environment
  • Interface Overhead
    • Cycles required for moving to lower abstraction
      layers
    • Overhead should not negate performance gains
  • AES Co-Processor Example
    • Advanced Encryption Standard (Rijndael)
    • 128-b data, 128-b key symmetric key cipher
    • Various cryptographic functions over GF(2^8)
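
As background for the GF(2^8) bullet, here is a minimal C sketch of the field multiplication that AES builds on. It uses the standard AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B); the function names gf256_xtime and gf256_mul are chosen for this note and say nothing about how the ThumbPod co-processor implements the operation in hardware.

```c
#include <stdint.h>

/* Multiply by x (0x02) in GF(2^8): shift left, then reduce with the
 * AES polynomial (low byte 0x1B) if the top bit was set. */
uint8_t gf256_xtime(uint8_t a) {
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0x00));
}

/* Shift-and-add multiplication in GF(2^8); addition in the field is XOR. */
uint8_t gf256_mul(uint8_t a, uint8_t b) {
    uint8_t product = 0;
    while (b != 0) {
        if (b & 1) {
            product ^= a;       /* add the current multiple of a */
        }
        a = gf256_xtime(a);     /* next multiple: a * x */
        b >>= 1;
    }
    return product;
}
```

The worked example in the AES standard, {57} times {83} giving {c1}, can be used to check the routine: gf256_mul(0x57, 0x83) returns 0xC1.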


14
AES Example: Interface Design

JAVA

    public final class RijndaelAlgorithm {
        static native int[] rijndael(int[] din, int[] key);
        public static void main(String[] args) {
            dout = rijndael(key, din);
            . . .
        }
    }

Java / C interface (C)

    void Java_RijndaelAlgorithm_rijndael(void) {
        . . .
        i1 = popStackAsType(ARRAY);
        i2 = popStackAsType(ARRAY);
        rijndael(i1->data, i2->data, result->data);
        pushStackAsType(ARRAY, result);
        . . .
    }

C / co-processor interface (ASM, co-processor instructions)

    void rijndael(int din[4], int key[4], int dout[4]) {
        asm(" mov %0, %%l0" : : "r" (key));
        asm(" ldd [%l0], %c0             ! upper word key
              ldd [%l0+8], %c2           ! lower word key
              cpop1 load_key %c0, %c2    ! load the key");
        asm(" cpop1 encrypt_ecb %c0, %c2 ! encrypt AES-ECB
              cpop1 read_output %c4, %c6 ! get output data");
        . . .
    }

15
AES Example: Interface Overhead

                  Java        C           Co-processor
                  (cycles)    (cycles)    (cycles)
Interface         -           367         892
AES               301,034     44,063      11
Total cycles      301,034     44,430      903
Improvement       -           6.8X        333X

  • Interface overhead dominates the co-processor path (892 of
    903 cycles), but the improvement over Java is still 333X
  • Separating data flow from control flow would improve this
    further
    • Currently, data flow and control flow are merged
  • Co-processors with direct memory access would
    reduce interface overhead
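
The totals and improvement factors on this slide can be re-derived from the per-row cycle counts. The C sketch below shows the arithmetic; the function names are made up for this note, and the only data assumed are the cycle counts quoted on the slide.

```c
#include <assert.h>

/* Cost of one call at a given level: compute cycles plus interface cycles. */
unsigned long total_cycles(unsigned long compute_cycles,
                           unsigned long iface_cycles) {
    return compute_cycles + iface_cycles;
}

/* Improvement factor of an accelerated implementation over a baseline. */
double speedup(unsigned long baseline_total, unsigned long accelerated_total) {
    assert(accelerated_total > 0);
    return (double)baseline_total / (double)accelerated_total;
}

/* Fraction of a call's cycles that is spent crossing the interface. */
double interface_share(unsigned long compute_cycles,
                       unsigned long iface_cycles) {
    unsigned long total = total_cycles(compute_cycles, iface_cycles);
    assert(total > 0);
    return (double)iface_cycles / (double)total;
}
```

With the slide's numbers, total_cycles(44063, 367) is 44,430 and total_cycles(11, 892) is 903; speedup(301034, 44430) is about 6.8 and speedup(301034, 903) is about 333. interface_share(11, 892) is close to 0.99, which is why reducing interface cost (for example through direct memory access) matters more than speeding up the 11-cycle core any further.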

16
System Improvements: Cycle Counts
CYCLES: 241,948,800
CYCLES: 159,259,730 (34% FEWER)
CYCLES: 127,846,828 (47% FEWER)
  • Other applications have different cycle profiles
    (e.g. streaming)
  • Functional acceleration must be considered in the context
    of the ENTIRE system HW/SW model
  • Profiling is important

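
The "fewer cycles" percentages follow directly from the three cycle counts. The C sketch below (function names are hypothetical) reproduces them and also expresses the same data as a whole-system speedup, which is useful for comparing against the 333X figure for the AES function alone.

```c
#include <assert.h>

/* Percentage of cycles saved relative to the baseline run. */
double cycle_reduction_percent(unsigned long baseline, unsigned long improved) {
    assert(baseline > 0 && improved <= baseline);
    return 100.0 * (double)(baseline - improved) / (double)baseline;
}

/* The same comparison expressed as a whole-system speedup factor. */
double system_speedup(unsigned long baseline, unsigned long improved) {
    assert(improved > 0);
    return (double)baseline / (double)improved;
}
```

cycle_reduction_percent(241948800UL, 159259730UL) is about 34 and cycle_reduction_percent(241948800UL, 127846828UL) is about 47. system_speedup(241948800UL, 127846828UL) is roughly 1.9, far below the per-function gain, which is the slide's point about judging acceleration in the context of the entire system.
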
17
Software and Simulation Flow
[Figure: software and simulation flow. Surviving label: KVM]
  • GEZEL is an interface mechanism to combine C and
    hardware
  • Three simulation platforms of the same system
  • Each platform corresponds to the addition of
    an abstraction level

18
FPGA Hardware Prototype
  • Xilinx Virtex-II FPGA
  • Embedded LEON 32-b SPARC processor
  • Memory-mapped co-processors on the AMBA APB bus
  • Two UARTs
    • Communication with server
  • Authentec CMOS fingerprint sensor

[Figure: prototype block diagram. Surviving labels: 32 MB SRAM, Xilinx Virtex-II FPGA, Mem. Controller, Boot PROM, AMBA AHB, Server, APB Bridge, LEON 32-b SPARC Proc., UART, APB, DFT Co-Proc., AES Co-Proc., Authentec AF-2]

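
To make "memory-mapped co-processors" concrete, here is a minimal C sketch of how software might drive such a block through a register window. The register layout, bit masks, and function names are all assumptions invented for this note; the slides do not give the actual register map of the AES or DFT co-processors. The base pointer is passed in as a parameter, so the same code can be exercised against an ordinary array on a workstation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register map, in 32-bit word offsets from the base address. */
enum {
    COPROC_REG_CONTROL  = 0,  /* write COPROC_CTRL_START to begin */
    COPROC_REG_STATUS   = 1,  /* COPROC_STATUS_DONE set when finished */
    COPROC_REG_DATA_IN  = 2,  /* four input words: offsets 2..5 */
    COPROC_REG_DATA_OUT = 6,  /* four output words: offsets 6..9 */
    COPROC_NUM_REGS     = 10
};

#define COPROC_CTRL_START  0x1u
#define COPROC_STATUS_DONE 0x1u

/* Copy one 128-bit block into the input registers. */
void coproc_write_block(volatile uint32_t *base, const uint32_t in[4]) {
    assert(base != NULL);
    for (int i = 0; i < 4; i++) {
        base[COPROC_REG_DATA_IN + i] = in[i];
    }
}

/* Kick off the operation by writing the start bit. */
void coproc_start(volatile uint32_t *base) {
    assert(base != NULL);
    base[COPROC_REG_CONTROL] = COPROC_CTRL_START;
}

/* Poll the status register; volatile forces a fresh bus read each call. */
int coproc_done(volatile uint32_t *base) {
    assert(base != NULL);
    return (base[COPROC_REG_STATUS] & COPROC_STATUS_DONE) != 0;
}

/* Copy one 128-bit result block out of the output registers. */
void coproc_read_block(volatile uint32_t *base, uint32_t out[4]) {
    assert(base != NULL);
    for (int i = 0; i < 4; i++) {
        out[i] = base[COPROC_REG_DATA_OUT + i];
    }
}
```

On a target, base would be the bus address assigned to the co-processor; every word moved this way costs bus cycles, which is one way interface overhead of the kind measured earlier can arise.
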
19
Prototype Setup
  • Working demo on an FPGA board (two ThumbPods
    shown) and PC connected over RS-232
  • Fingerprint algorithm has a 0.5% false reject rate
    and a 0.01% false acceptance rate

20
Conclusions and Future Work
  • ThumbPod is a secure embedded system with a user
    identity tie
  • HW/SW acceleration transparency used as embedded
    design flow
    • Interface-based design
    • AES co-processor performance gain of 333X
    • System cycle reduction of 47%
  • Seven graduate students in eight months
    from concept to demo
  • Future work
    • Build an actual ThumbPod
    • Low-power issues
    • Generalizing design flow