Title: Embedded Software Architecture for Low Power
2. NATURE: Non-Volatile Nanotube RAM based Field-Programmable Gate Arrays
- Wei Zhang, Niraj K. Jha and Li Shang
- Dept. of Electrical Engineering, Princeton University
- Dept. of Electrical and Computer Engineering, Queen's University
3. A Hybrid CMOS/NAnoTUbe REconfigurable Architecture
- Motivation
- Background on CNT and NRAM
- Architecture of NATURE
- Logic Folding
- Experimental Results
- Conclusions
4. Motivation
- Moore's Law: What's Next?
  - Carbon nanotubes (CNTs)
  - Nanowires
  - Single electron devices
  - ...
- Challenges in nano-circuits/architectures
  - Lack of a mature fabrication process
  - Defects and run-time failures
- Reconfigurable architectures, such as an FPGA, favored
  - Regular structures ease fabrication
  - Fault tolerance through reconfiguration
5. Motivation (Contd.)
- Problems of existing reconfigurable architectures
  - High reconfiguration time overhead
  - Low area efficiency
- Some recent works on programmable nanofabrics
  - Molecular logic array (Goldstein et al., ICCAD 2002)
  - Nanowire PLA (DeHon et al., FPGA 2004)
  - CMOS/nanowire hybrid architecture CMOL (Strukov et al., Nanotechnology 2005)
  - Fabrication problem not yet solved
6. Advantages of NATURE
- Hybrid design leverages beneficial aspects of both CMOS and CNT technologies
- NRAMs are distributed in NATURE to store multi-context reconfiguration bits
- Fine-grain reconfiguration (even cycle-by-cycle)
- Enables temporal logic folding
- Flexibility to perform area-performance trade-offs
- One-to-two orders of magnitude increase in logic density
[Figure: NATURE and its attributes: CMOS fabrication compatible, NRAM-based, run-time reconfiguration, temporal logic folding, logic density, design flexibility]
7. Background
- Carbon nanotube (CNT)
  - Metallic or semiconducting
  - Single-wall or multi-wall
  - Diameter: 1-100 nm
  - Length: up to millimeters
  - Ballistic transport
  - Excellent thermal conductivity
  - Very high current density
  - High chemical stability
  - Robust to environment
Source: Euronanotrade
8. Background (Contd.)
Source: Nantero
- Non-volatile nanotube random-access memory (NRAM)
  - Bistable on/off states are determined by whether the nanotube is mechanically bent or not
  - Fully CMOS-compatible manufacturing process
  - Prototype chip: 10 Gbit NRAM
  - Will be ready for the market in the near future
9. NRAMs
- Properties of NRAMs
  - Non-volatile
  - Similar speed to SRAM
  - Similar density to DRAM
  - Chemically and mechanically stable
- NATURE not tied to NRAMs
  - Phase change RAM
  - Magnetoresistive RAM
  - Ferroelectric RAM
10. Architecture of NATURE
- Island-style logic blocks (LBs) connected by various levels of interconnects
- An LB contains a super macroblock (SMB) and a local switch matrix
11. Architecture of a Super Macroblock (SMB)
- n1 macroblocks (MBs) comprise an SMB; here n1 = 4
12. Architecture of a Macroblock (MB)
- n2 logic elements (LEs) comprise an MB; here n2 = 4
13. Logic Element and Interconnect
- An LE implements a computation and contains
  - An m-input look-up table (LUT)
  - A flip-flop
  - A pass transistor
- Interconnect
  - Mixed wire segment scheme
  - 25%, 50% and 25% distribution for length-1, length-4 and long wires
  - Direct links from one LB to its 4 neighbors
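As a rough illustration of the LE described above, the following Python sketch models an m-input LUT feeding a flip-flop. The class name `LogicElement`, the helper `table_from_function`, and the bit ordering of the LUT index are assumptions made for this example; the pass transistor and the interconnect are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class LogicElement:
    """Hypothetical model of an LE: an m-input LUT followed by a flip-flop."""
    m: int
    truth_table: Sequence[int]  # 2**m entries; index = inputs read as a binary number
    ff_state: int = 0           # value currently held by the flip-flop

    def __post_init__(self) -> None:
        if len(self.truth_table) != 2 ** self.m:
            raise ValueError("truth table must have 2**m entries")

    def lut(self, inputs: Sequence[int]) -> int:
        """Combinational LUT output for the given input bits (first bit is the MSB)."""
        if len(inputs) != self.m:
            raise ValueError("expected exactly m input bits")
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return self.truth_table[index]

    def clock(self, inputs: Sequence[int]) -> int:
        """Latch the LUT output into the flip-flop and return the stored value."""
        self.ff_state = self.lut(inputs)
        return self.ff_state


def table_from_function(m: int, fn: Callable[..., int]) -> List[int]:
    """Build a 2**m-entry truth table from a Python Boolean function."""
    return [
        fn(*[(i >> (m - 1 - j)) & 1 for j in range(m)]) & 1
        for i in range(2 ** m)
    ]


# A 4-input LUT configured as a 4-way XOR (parity).
xor4 = LogicElement(4, table_from_function(4, lambda a, b, c, d: a ^ b ^ c ^ d))
```

Changing the contents of `truth_table` changes the Boolean function the LE implements, which is the property that reconfiguration relies on.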
14. Support for Reconfiguration
- Reconfiguration time is short: 160 ps
- Area overhead of NRAMs
  - k: number of reconfiguration sets per NRAM; assume k = 16
  - Area overhead: 20.5% per LB, assuming 100 nm technology for CMOS logic and nanotube length
  - Logic density = k (conf. copies) x area per configuration = 16 x (1 - 0.205) ≈ 12.75
  - Appropriate value for k obtained through design space exploration
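The logic density estimate above can be checked with a few lines of Python. The function name `logic_density_gain` is an assumption for this sketch; the inputs k = 16 and the 20.5% area overhead are the figures quoted on the slide. With exactly these inputs the product evaluates to about 12.72, slightly below the quoted 12.75, which may reflect rounding of the overhead figure.

```python
def logic_density_gain(k: int, area_overhead: float) -> float:
    """Logic density relative to a single-context LB.

    k             -- number of reconfiguration sets stored per NRAM
    area_overhead -- fraction of LB area taken by the NRAMs (0.205 for 20.5%)
    """
    if not 0.0 <= area_overhead < 1.0:
        raise ValueError("area_overhead must be a fraction in [0, 1)")
    return k * (1.0 - area_overhead)


# Values quoted on the slide: k = 16, 20.5% area overhead per LB.
gain = logic_density_gain(16, 0.205)
print(f"logic density gain for k=16: {gain:.2f}")
```

The slide gives the overhead only for k = 16, so the sketch does not sweep over other values of k.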
15. Temporal Logic Folding
- Basic idea: one can use NRAM-enabled run-time reconfiguration to realize different Boolean functions in the same logic element (LE) every few cycles
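The basic idea can be illustrated with a toy Python model: a single 2-input LE holds several stored configurations (standing in for the NRAM reconfiguration sets) and switches between them from cycle to cycle. The class `FoldedLE`, the three-gate function, and the way the previous result is fed back are assumptions chosen for illustration, not the mapping used by NATURE.

```python
from itertools import product

# 2-input truth tables, indexed by (x << 1) | y.
AND = [0, 0, 0, 1]
OR = [0, 1, 1, 1]
XOR = [0, 1, 1, 0]


class FoldedLE:
    """Hypothetical LE with several stored configurations and one flip-flop."""

    def __init__(self, contexts):
        self.contexts = contexts  # stored truth tables (stand-in for NRAM sets)
        self.active = 0           # index of the configuration currently loaded
        self.ff = 0               # flip-flop holding the previous result

    def reconfigure(self, index):
        self.active = index

    def step(self, x, y):
        """One clock cycle: evaluate the active LUT and latch the result."""
        self.ff = self.contexts[self.active][(x << 1) | y]
        return self.ff


def folded(a, b, c, d):
    """Compute ((a AND b) OR c) XOR d on ONE LE over three cycles."""
    le = FoldedLE([AND, OR, XOR])
    le.reconfigure(0)
    le.step(a, b)        # cycle 1: a AND b
    le.reconfigure(1)
    le.step(le.ff, c)    # cycle 2: (previous result) OR c
    le.reconfigure(2)
    return le.step(le.ff, d)  # cycle 3: (previous result) XOR d


def unfolded(a, b, c, d):
    """Reference: the same function as it would be spread over three LEs."""
    return ((a & b) | c) ^ d


# The folded and unfolded versions agree on all 16 input combinations.
all_match = all(folded(*v) == unfolded(*v) for v in product((0, 1), repeat=4))
```

The folded version uses one LE for three cycles, while the unfolded version would occupy three LEs for one pass; this is the area-for-time exchange that logic folding exposes.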
16. Example
- Without logic folding
  - Number of LEs: 6
  - Delay = 4 LE delays + interconnect delay
- With logic folding
  - Number of LEs: 2
  - Delay = 4 x clock_period
  - Clock period = LE delay + reconfiguration delay + interconnect delay
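The two delay expressions in this example can be evaluated side by side. In the sketch below, the 160 ps reconfiguration time, the LE counts (6 and 2) and the factor of 4 come from the slides; the LE delay and interconnect delay are hypothetical placeholder values, so the printed numbers are illustrative only.

```python
RECONFIG_PS = 160.0      # reconfiguration time quoted in the slides
LE_DELAY_PS = 400.0      # hypothetical LE delay (assumption)
INTERCONNECT_PS = 300.0  # hypothetical interconnect delay (assumption)


def delay_without_folding(n_le_delays, le_delay_ps, interconnect_ps):
    """Delay = n LE delays + interconnect delay."""
    return n_le_delays * le_delay_ps + interconnect_ps


def delay_with_folding(n_cycles, le_delay_ps, interconnect_ps,
                       reconfig_ps=RECONFIG_PS):
    """Delay = n cycles x (LE delay + reconfiguration + interconnect delay)."""
    clock_period_ps = le_delay_ps + reconfig_ps + interconnect_ps
    return n_cycles * clock_period_ps


unfolded_delay = delay_without_folding(4, LE_DELAY_PS, INTERCONNECT_PS)
folded_delay = delay_with_folding(4, LE_DELAY_PS, INTERCONNECT_PS)

# Area-delay products, using the LE counts from the example (6 vs 2).
unfolded_at = 6 * unfolded_delay
folded_at = 2 * folded_delay

print(f"without folding: 6 LEs, {unfolded_delay:.0f} ps, AT = {unfolded_at:.0f}")
print(f"with folding:    2 LEs, {folded_delay:.0f} ps, AT = {folded_at:.0f}")
```

Under these placeholder delays the folded mapping is slower but uses a third of the LEs, so its area-delay product is lower; different delay assumptions would shift the balance.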
17. Folding Levels
- Logic folding can be performed at different levels of granularity, providing flexibility to perform area-performance trade-offs
- A level-p folding implies reconfiguration of the LE after the execution of p LUT computations
[Figure: (a) level-1 folding; (b) level-2 folding]
18. Choosing the Folding Level
- As the folding level increases:
  - Clock period increases
  - Routing delay increases
  - Number of clock cycles decreases
  - Reconfiguration time decreases
  - Total delay typically decreases
  - Number of LEs increases
  - Area increases
- Advantages of logic folding
  - Significant flexibility for performing area-performance trade-offs
  - Ability to map much larger circuits using the same number of LEs
  - Significant improvement in the area/circuit delay product
  - Reduction in the need for global routing
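The trends listed above can be reproduced with a deliberately simple model. The sketch assumes an idealized circuit laid out as a grid of `DEPTH` LUT levels with `WIDTH` LUTs per level, a level-p clock period of p LE delays plus one reconfiguration and one interconnect delay, and p x `WIDTH` LEs for level-p folding. All of these, along with the delay values other than the 160 ps reconfiguration time, are assumptions for illustration; in particular, the model keeps the interconnect delay constant and so does not capture the growth in routing delay.

```python
import math
from typing import NamedTuple

RECONFIG_PS = 160.0  # reconfiguration time quoted in the slides


class Mapping(NamedTuple):
    label: str
    num_les: int
    clock_period_ps: float
    num_cycles: int
    total_delay_ps: float

    @property
    def area_time(self) -> float:
        """Area-time product, using the LE count as a proxy for area."""
        return self.num_les * self.total_delay_ps


def level_p_folding(depth, width, p, t_le, t_int, t_rc=RECONFIG_PS) -> Mapping:
    """Idealized level-p folding: reconfigure after every p LUT levels."""
    cycles = math.ceil(depth / p)
    clock = p * t_le + t_int + t_rc
    return Mapping(f"level-{p}", p * width, clock, cycles, cycles * clock)


def no_folding(depth, width, t_le, t_int) -> Mapping:
    """Whole circuit mapped at once: one pass, no reconfiguration."""
    delay = depth * t_le + t_int
    return Mapping("no folding", depth * width, delay, 1, delay)


# Hypothetical circuit shape and delays (assumptions, not measured values).
DEPTH, WIDTH = 16, 4
T_LE_PS, T_INT_PS = 400.0, 300.0

mappings = [level_p_folding(DEPTH, WIDTH, p, T_LE_PS, T_INT_PS) for p in (1, 2, 4)]
mappings.append(no_folding(DEPTH, WIDTH, T_LE_PS, T_INT_PS))

for m in mappings:
    print(f"{m.label:>10}: LEs={m.num_les:3d}  cycles={m.num_cycles:2d}  "
          f"clock={m.clock_period_ps:6.0f} ps  delay={m.total_delay_ps:6.0f} ps  "
          f"AT={m.area_time:8.0f}")
```

In this toy model, total delay falls and the LE count rises as the folding level increases, while level-1 folding gives the smallest area-time product, which is the same direction as the trends stated on the slide.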
19. Experimental Setup
- Instance of architecture: 4 MBs in an SMB, 4 LEs in an MB, and LEs contain a 4-input LUT
- Number of reconfiguration copies k varied in order to compare implementations corresponding to selected folding levels: level-1, level-2, level-4 and no logic folding
- Results based on 100 nm CMOS technology parameters
20. Experimental Results
Average area-time product advantage: 2X
Maximum area-time product advantage: 3X
21. Experimental Results (Contd.)
- 16-RCA: 16-bit ripple carry adder
- 16-CLA: 16-bit carry lookahead adder
- 16-CSA: 16-bit carry select adder
- 8-MUL: 8-bit multiplier
Average area-time product advantage: 13X
Maximum area-time product advantage: 35X
22. Experimental Results (Contd.)
- Flexibility in performing area-performance trade-off
- For the area-time (AT) product, the larger the circuit depth, the greater the advantage of level-1 folding relative to no folding
- For the 64-bit ripple-carry adder, this advantage is about 35X
- LE utilization and logic density very high, with a reduced need for a deep interconnect hierarchy
23. Conclusions
- NATURE: a novel high-performance run-time reconfigurable architecture
- Introduction of NRAMs into the architecture enables cycle-by-cycle reconfiguration and logic folding
- Choice of different folding levels allows the flexibility of performing area-performance trade-offs
- Logic density and area-time product improved significantly
- Can be very useful for cost-conscious embedded systems and future FPGA improvement