Title: ICCAD03 Review
Outline
- Overview
- Archive download URL
- Best paper award
- Paper from our group
- Interesting tutorial
- Paper in related areas
- Power and energy optimization
- Interconnect-centric SoC design
- Reliability issues
- Performance optimization
- Simulation at the nanometer scale
- Other areas in ICCAD
Archive Download URL
- Papers and presentation slides can be downloaded from
- http://www.iccad.com/archive.html
Best Paper Award
- 6C.1 - Noise Analysis for Optical Fiber Communication Systems
- Alper Demir
- KOC University, Sariyer-Istanbul, Turkey
- 8B.1 - Block-Based Static Timing Analysis with Uncertainty
- Anirudh Devgan, Chandramouli Kashyap
- IBM Research at Austin, IBM Microelectronics
Paper from Our Group
- 1A.1 - Adaptive Error Protection for Energy Efficiency
- Lin Li, N. Vijaykrishnan, Mahmut Kandemir, Mary Jane Irwin
- 3C.1 - Array Composition and Decomposition for Optimizing Embedded Applications
- Guilin Chen, Mahmut Kandemir, Ugur Sezer, Avanti Nadgir
Interesting Tutorial
- 2C.1 - Design and CAD Challenges in sub-90nm CMOS Technology
- Kerry Bernstein, Ching-Te Chuang, Rajiv V. Joshi, Ruchir Puri
- IBM T.J. Watson
- 11B.1 - Formal Methods for Dynamic Power Management
- Rajesh K. Gupta, Sandeep Shukla, Sandy Irani
- UCSD, UCI, and VT
2C.1 - Design and CAD Challenges in sub-90nm CMOS
Technology
- Introduction
- CMOS device scaling
- New devices for high-performance logic
- Planar device structures
- Partially-depleted (PD) SOI
- Fully-depleted (FD) SOI
- Strained-Si high-k gate
- Emerging technologies
- Double-gate MOSFETs
- 3D integration and interconnects
- Carbon Nanotube Transistor (CNT)
- Molecular computing
- CAD challenges
- Challenges of Advanced device technologies
- Major issues
- Power crisis
- Coping with Variability
11B.1 - Formal Methods for Dynamic Power Management
- Overview of the formal methods that have been explored for solving the system-level Dynamic Power Management (DPM) problem.
- Show how formal reasoning frameworks can unify apparently disparate DPM techniques.
- Approaches that treat the DPM problem as one of stochastic optimization with probabilistic guarantees on performance.
Power and Energy Optimization
- Using dynamic voltage scaling in embedded systems (Section 1B)
- Using software techniques in embedded systems (Section 3C)
- Energy issues in systems design (Section 7B)
- Power-aware design (Section 8C)
1B.1 - Generalized Network Flow Techniques for Dynamic Voltage Scaling in Hard Real-Time Systems
- Vishnu Swaminathan, Krishnendu Chakrabarty, ECE@Duke
- Energy consumption must be carefully balanced with real-time responsiveness in hard real-time systems.
- Presents an optimal offline dynamic voltage scaling (DVS) scheme for dynamic power management in such systems.
Generalized Network Flow Models for the DVS
problem
1B.2 - Approaching the Maximum Energy Saving on Embedded Systems with Multiple Voltages
- Shaoxiong Hua, Gang Qu, ECE@UMCP
- For a multiple-voltage DVS system to serve a set of applications (e_i, d_i, p_i), i = 1, 2, ..., n, without missing their deadlines:
- if the system has m voltages v_1, v_2, ..., v_m, determine the value of each v_i to minimize the energy consumption.
- determine m and the value of each v_i.
1B.2 - Approaching the Maximum Energy Saving on Embedded Systems with Multiple Voltages (Contd)
- Voltage set-up is the fundamental problem for a multiple-voltage DVS system.
- application-specific
- 2-voltage DVS system: analytic solutions and a linear search algorithm
- m-voltage DVS system: an analytic solution does not exist; an approximation method is used
- A multiple-voltage system can come very close to the maximal energy saving achievable by DVS.
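As a rough illustration of the 2-voltage case, the sketch below splits a task's cycles between a low and a high voltage so that the deadline is just met. It is not the paper's method: it assumes clock frequency proportional to V and energy per cycle proportional to V^2 (normalized units, threshold voltage ignored), and the function name and parameters are hypothetical.

```python
def split_cycles(cycles, deadline, v_low, v_high, k_freq=1.0):
    """Return (cycles_low, cycles_high, energy) for a two-voltage schedule.

    Assumptions (not from the paper): frequency f = k_freq * V and
    energy per cycle = V**2 in normalized units.
    """
    f_low, f_high = k_freq * v_low, k_freq * v_high
    if cycles / f_high > deadline:
        raise ValueError("deadline cannot be met even at the high voltage")
    if cycles / f_low <= deadline:
        # Everything fits at the low voltage.
        c_low = cycles
    else:
        # Solve c_low / f_low + (cycles - c_low) / f_high = deadline.
        c_low = (deadline - cycles / f_high) / (1.0 / f_low - 1.0 / f_high)
    c_high = cycles - c_low
    energy = c_low * v_low ** 2 + c_high * v_high ** 2
    return c_low, c_high, energy
```

For example, `split_cycles(100, 75, 1.0, 2.0)` runs half of the cycles at each voltage; under these assumptions that costs less energy than running all cycles at the high voltage.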
1B.3 - Combined Dynamic Voltage Scaling and Adaptive Body Biasing for Heterogeneous Distributed Real-Time Embedded Systems
- Le Yan, Jiong Luo, Niraj K. Jha, EE@Princeton
- New scheduling algorithm that combines DVS and
adaptive body biasing (ABB) to simultaneously
optimize both dynamic power consumption and
leakage power consumption for real-time
distributed embedded systems.
1B.3 - Combined Dynamic Voltage Scaling and Adaptive Body Biasing for Heterogeneous Distributed Real-Time Embedded Systems (Contd)
- A novel two-phase approach
- Phase I: Optimal tradeoff between supply and threshold voltages
- Phase II: Trade off energy consumption and clock period
1B.3 - Combined Dynamic Voltage Scaling and Adaptive Body Biasing for Heterogeneous Distributed Real-Time Embedded Systems (Contd)
Flowchart of the algorithm:
- Initializations
- Phase I
- Extensible tasks exist? (No: return; Yes: continue)
- Phase II: reference task = task with the highest energy_derivative
- Allocate slack to the reference task
- Allocate slack to each other task with energy_derivative higher than the reference level
- EST + WCET > LFT? (Yes: invalidate this slack allocation)
3C.3 - Energy Optimization of Distributed Embedded Processors by Combined Data Compression and Functional Partitioning
- Jinfeng Liu, Pai H. Chou, ECE@UCI
- Goal
- Energy minimization for distributed embedded processors
- Combined optimization
- Selection of optimal compression algorithm
- Functional partitioning
3C.4 - Energy-Aware Fault Tolerance in Fixed-Priority Real-Time Embedded Systems
- Ying Zhang, Krishnendu Chakrabarty, Vishnu Swaminathan, ECE@Duke
- Goal: low-power, fault-tolerant real-time systems
- Fault tolerance is achieved via checkpointing.
- Power management is carried out using dynamic voltage scaling (DVS).
7B.1 - A Game Theoretic Approach to Dynamic Energy Minimization in Wireless Transceivers
- Ali Iranli, Hanif E. Fatemi, Massoud Pedram, EE@USC
- A hierarchical formulation for energy optimization of wireless transceivers is proposed.
- A game theoretic approach to this energy minimization is proposed, which reduces energy consumption by 15% for a BER of 10^-5.
- The proposed hierarchical framework can be used in general for energy optimization of server-client systems.
7B.1 - A Game Theoretic Approach to Dynamic Energy Minimization in Wireless Transceivers (Contd)
Transceiver Energy Optimization
Stackelberg Game
7B.2 - Communication-Aware Task Scheduling and Voltage Selection for Total Systems Energy Minimization
- Girish V. Varatkar, Radu Marculescu, ECE@CMU
- Recent work in the embedded systems (ES) community: performance and energy are crucial!
- Voltage selection
- The task scheduling algorithm should use the foresight that voltage selection is going to follow the scheduling step.
- The schedule should provide the maximum slowing-down potential.
- This work brings the communication aspect into the picture.
- A communication-centric approach
- A voltage selection approach
7B.3 - LRU-SEQ: A Novel Replacement Policy for Transition Energy Reduction in Instruction Caches
- Praveen G. Kalla, Xiaobo Sharon Hu, Joerg Henkel, CSE@Notre Dame
- LRU to LRU-SEQ (Sequential LRU)
- Constraining sequential fetches to the same bank (same way) avoids bank transitions.
- It also increases the sleep time for the banks, overcoming break-even time requirements.
- The LRU nature has to be maintained, else associativity is lost (the hit ratio is affected).
- The distance between the last fetched line and the present line is a parameter that will affect the performance of this policy.
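To make "bank transitions" concrete, the sketch below counts how often consecutive instruction fetches are served by different ways. This is only a hypothetical metric for illustration, not the paper's replacement policy; the trace format (a list of way indices, one per fetch) and both example traces are assumptions.

```python
def count_way_transitions(way_trace):
    """Count consecutive fetch pairs that are served by different cache ways."""
    return sum(1 for prev, cur in zip(way_trace, way_trace[1:]) if prev != cur)


# Hypothetical trace: sequential lines scattered across four ways.
scattered = [0, 1, 2, 3, 0, 1]
# Hypothetical trace: the same fetches with sequential lines kept in one way.
grouped = [0, 0, 0, 0, 1, 1]
```

Here `count_way_transitions(scattered)` is larger than `count_way_transitions(grouped)`, which is the effect the policy aims for: fewer way changes mean fewer bank wake-ups and longer sleep intervals.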
7B.3 - LRU-SEQ: A Novel Replacement Policy for Transition Energy Reduction in Instruction Caches (Contd)
State Holder 1: P_way (entire cache)
State Holder 2: P_line (each cache way)
7B.4 - Compiler-Based Register Name Adjustment for Low-Power Embedded Processors
- Peter Petrov, Alex Orailoglu, CSE@UCSD
- Compiler-driven register name adjustment for low power was proposed.
- Register names are reassigned without incurring any performance or power overhead.
- No hardware support is required whatsoever.
- An efficient algorithm for register name adjustment was proposed, with an additional frequency-skew-enhancing phase.
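The idea of renaming registers to save power can be illustrated with a toy cost model. The sketch below assumes that the cost is the number of bit toggles in the register field between consecutive instructions, and it searches all permutations exhaustively. Both the cost model and the brute-force search are assumptions for illustration; the paper's algorithm is not reproduced here.

```python
from itertools import permutations


def toggles(seq):
    """Total bit flips between consecutive register numbers in seq."""
    return sum(bin(a ^ b).count("1") for a, b in zip(seq, seq[1:]))


def best_renaming(seq, num_regs):
    """Exhaustively search register permutations (tiny num_regs only)."""
    best_map, best_cost = None, None
    for perm in permutations(range(num_regs)):
        # perm[r] is the new name given to original register r.
        cost = toggles([perm[r] for r in seq])
        if best_cost is None or cost < best_cost:
            best_map, best_cost = perm, cost
    return best_map, best_cost
```

For the hypothetical access sequence `[1, 2, 1, 2, 1, 2]` with four registers, the original names toggle two bits per step, while a renaming that maps the two registers to names differing in a single bit halves that count.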
8C.1 - Leakage Power Optimization Techniques for Ultra Deep Sub-Micron Multi-Level Caches
- Nam S. Kim, David Blaauw, Trevor N. Mudge, EECS@UMICH
- Cost-effective number of VTHs for cache leakage reduction
- depends on the target access time, but 1 or 2 high VTHs are enough for leakage reduction
- Cache leakage
- another design constraint in processor design
- trade-off among delay / area / leakage
- Incorporating realistic cache miss statistics into the leakage optimization
8C.1 - Leakage Power Optimization Techniques for Ultra Deep Sub-Micron Multi-Level Caches (Contd)
Using high-k dielectric reduces gate-oxide leakage
ITRS 2002 projections with doubling of the number of transistors every two years
8C.1 - Leakage Power Optimization Techniques for Ultra Deep Sub-Micron Multi-Level Caches (Contd)
Figure: cache sub-bank organization
- Circuit model based on CACTI; 70nm Berkeley predictive technology model
- Components: decoder, word-line, memory cell, bit-line pair, sense-amp w/ I/O circuits, Abus buffer w/ repeater, Dbus buffer w/ repeater
- Threshold voltage labels: VTH1, VTH2, VTH3, VTH4
- Interconnect R/C annotated; repeaters used to minimize interconnect delay
8C.3 - Dynamic Platform Management for Configurable Platform-Based System-on-Chips
- Krishna Sekar, Kanishka Lahiri, Sujit Dey, ECE@UCSD
- Described design techniques for dynamically customizing a general-purpose configurable platform
- Dynamic platform management helps combine the benefits of general-purpose and application-specific approaches
- Benefits
- Improved application performance
- More efficient platform resource usage
- Improved energy efficiency
8C.3 - Dynamic Platform Management for Configurable Platform-Based System-on-Chips (Contd)
Figure: platform spectrum
- General-Purpose Processors, General Purpose Configurable Platforms, Domain Specific Platforms, ASIC / Custom SoC
- Improving flexibility, time-to-market, engineering cost, time-in-market
- Improving performance, power, size
Interconnect-Centric SoC Design
1A.2 - SAMBA-Bus: A High Performance Bus Architecture for System-on-Chips
- Ruibing Lu, Cheng-Kok Koh, ECE@Purdue
- Single Arbitration, Multiple Bus Accesses
- Automatically delivers multiple bus transactions
- High bandwidth
- Bus transactions can be performed even without an explicit bus access grant from the arbiter
- Communication latency increases only slightly even with high arbitration latency
1A.2 - SAMBA-Bus: A High Performance Bus Architecture for System-on-Chips (Contd)
Two sub-buses
1A.3 - The Y-Architecture for On-Chip Interconnect: Analysis and Methodology
- Hongyu Chen, Chung-Kuan Cheng, Andrew B. Kahng, et al., CSE@UCSD
- The Y-architecture for on-chip interconnect is based on pervasive use of 0-, 120-, and 240-degree oriented semi-global and global wiring.
- Communication capability (throughput of meshes) better than Manhattan architecture and X-architecture.
- Better total wire length compared to both H and X clock tree structures, and better path length compared to the H tree.
- Achieves 8.5% less IR drop than an equally-resourced power network in Manhattan architecture.
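As a small worked example of what the three wiring orientations imply for point-to-point routing, the sketch below compares the shortest path length between two points under Manhattan wiring and under wiring restricted to 0-, 120-, and 240-degree lines. The formula follows from decomposing the displacement along the two nearest allowed directions; it is a geometric illustration only and says nothing about the throughput or IR-drop results reported in the paper. The function names are hypothetical.

```python
import math


def manhattan_length(dx, dy):
    """Shortest path length with horizontal and vertical wires only."""
    return abs(dx) + abs(dy)


def y_length(dx, dy):
    """Shortest path length using only 0-, 120-, and 240-degree wires."""
    ax, ay = abs(dx), abs(dy)
    s3 = math.sqrt(3.0)
    if ay <= s3 * ax:
        # Displacement lies within 60 degrees of the horizontal axis:
        # combine a horizontal segment with one diagonal segment.
        return ax + ay / s3
    # Steeper than 60 degrees: combine the two diagonal directions.
    return 2.0 * ay / s3
```

For a displacement of (1, 1) the Y length is about 1.58 versus 2 for Manhattan wiring, while for a purely vertical displacement (0, 1) the Y length is longer (about 1.15 versus 1), so the benefit depends on direction.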
1A.3 - The Y-Architecture for On-Chip Interconnect: Analysis and Methodology (Contd)
7 x 7 meshes with different interconnect
architectures.
Reliability Issues
3B.4 - Vectorless Analysis of Supply Noise Induced Delay Variation
- Sanjay Pant, David Blaauw, Savithri Sundareswaran, UMICH, Motorola
- Power Supply Integrity Issues
- Functional Failure
- Voltage fluctuations inject noise in the circuit
- Performance Failure
- Gate delay is becoming increasingly sensitive to supply voltage
- 10% variation in supply can result in 30% delay increase
- Proposed Approach
- Vectorless
- Conservative in estimating worst-case drop/delay increase
- Takes into account both IR and L dI/dt drops
3B.4 - Vectorless Analysis of Supply Noise Induced Delay Variation (Contd)
- Voltage Drop Estimation
- Worst drop highly dependent on input vectors
- Slow simulation times allow only a few vectors to be tried
- Worst-Case Voltage Budget Analysis
- Highly conservative
- Worst-case drop is localized
- Ignores voltage shifts between distant driver-receiver pairs
3B.4 - Vectorless Analysis of Supply Noise Induced Delay Variation (Contd)
- Divide chip into blocks
- Compute unit pulse response
- Express delay/voltage using spatial/temporal superposition
- Formulate delay/voltage maximization as linear optimization
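The last two steps can be sketched as a tiny linear program. By superposition, the drop at a node is a weighted sum of block currents, and the worst case maximizes that sum under current constraints. The constraint set used here (a per-block peak current plus a chip-wide total) is an assumption for illustration, as are the function and parameter names; the paper's actual formulation may differ.

```python
def worst_case_drop(sensitivity, block_max, total_max):
    """Maximize sum(sensitivity[i] * current[i]) subject to
    0 <= current[i] <= block_max[i] and sum(current) <= total_max.

    With a single shared budget this linear program is solved exactly
    by a greedy fill in order of decreasing sensitivity.
    """
    currents = [0.0] * len(sensitivity)
    remaining = total_max
    order = sorted(range(len(sensitivity)),
                   key=lambda i: sensitivity[i], reverse=True)
    for i in order:
        if remaining <= 0 or sensitivity[i] <= 0:
            break
        currents[i] = min(block_max[i], remaining)
        remaining -= currents[i]
    drop = sum(s * c for s, c in zip(sensitivity, currents))
    return drop, currents
```

No input vectors appear anywhere: the bound depends only on the block sensitivities and the current constraints, which is what makes the approach vectorless and conservative.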
5B.2 - Fault-Tolerant Techniques for Ambient Intelligent Distributed Systems
- Diana Marculescu, ECE@CMU
- Novel techniques for harnessing redundancy as a way of increasing fault tolerance
- Assume a large number of networked devices
- Idle devices can act as surrogates for failing ones via application migration or remapping
- Scheduling techniques for optimizing system lifetime
- Determine optimal migration schedule, under realistic battery models
8C.2 - Dynamic Fault-Tolerance and Metrics for Battery Powered, Failure-Prone Systems
- Phillip Stanley-Marbell, Diana Marculescu, ECE@CMU
- Introduce the concept of adaptive fault-tolerance
management for failure-prone systems, and a
classification of local algorithms for achieving
system-wide reliability.
Performance Optimization
5B.1 - Cache Optimization For Embedded Processor Cores: An Analytical Approach
- Arijit Ghosh, Tony Givargis, CS@UCI
- An efficient algorithm to directly compute cache
parameters satisfying desired performance
criteria.
5B.3 - Performance Efficiency of Context-Flow System-On-Chip Platform
- Rami Beidas, Jianwen Zhu, ECE@Toronto
- A new programming model, called context-flow,
that is simple, safe, highly parallelizable yet
transparent to the underlying architectural
details.
Simulation at the Nanometer Scale
7A.1 - A Probabilistic-Based Design Methodology for Nano-Scale Computation
- Iris Bahar, Joseph Mundy, Jie Chen, Brown
- Based on Markov random fields
- Proposes a new architectural framework designed to handle faulty processes prevalent with nanoscale devices
- Dynamically defect tolerant
- Adapts to errors as a natural consequence of probability maximization
- Removes the need to actually detect faults
- Can handle both structure- and signal-based faults
7A.1 - A Probabilistic-Based Design Methodology for Nano-Scale Computation (Contd)
- Carbon Nanotubes (CNTs)
- Excellent conductors
- Diodes, FETs, and memory arrays using CNTs have been demonstrated
- Physical placement of CNTs is an issue
- Alumina substrates have been proposed to fabricate arrays of CNTs
7A.1 - A Probabilistic-Based Design Methodology for Nano-Scale Computation (Contd)
- Molecular devices
- Direct use of molecules and their electronic states
- Conduction achieved by changes in physical configuration or electronic state
- Diodes and memory have been demonstrated
Figure labels: additional electron; switch on
7A.1 - A Probabilistic-Based Design Methodology for Nano-Scale Computation (Contd)
- Quantum Cellular Automata (QCA)
- Based on local interaction of quantum dots arranged in cells
- Logic function is encoded into spatial patterns of the cells.
- Information propagates through chains of QCA devices
7A.2 - Modeling of Ballistic Carbon Nanotube Field Effect Transistors for Efficient Circuit Simulation
- Arijit Raychowdhury, Saibal Mukhopadhyay, Kaushik Roy, ECE@Purdue
- Circuit/SPICE-level model for ballistic CNFETs
- Removes the need for self-consistent solutions of Poisson's and Schrödinger's equations
- Proposed model closely replicates the self-consistent numerical simulations
- The model has been used to simulate simple adders/multipliers
7A.2 - Modeling of Ballistic Carbon Nanotube Field Effect Transistors for Efficient Circuit Simulation (Contd)
Carbon nanotubes are graphite sheets rolled in the form of tubes. They act as channel material for FETs.
Source: IBM
7A.2 - Modeling of Ballistic Carbon Nanotube Field Effect Transistors for Efficient Circuit Simulation (Contd)
Intrinsic CNT
7A.2 - Modeling of Ballistic Carbon Nanotube Field Effect Transistors for Efficient Circuit Simulation (Contd)
- Performance of CNFETs can be evaluated only through circuit simulations
- SPICE-compatible compact modeling is essential for circuit simulations
7A.3 - Circuit Simulation of Nanotechnology Devices with Non-Monotonic I-V Characteristics
- Jiayong Le, Larry Pileggi, Anirudh Devgan, ECE@CMU
- Describes a circuit-level simulator that can accommodate an important class of nanotechnology devices that are characterized by non-monotonic I-V characteristics.
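One reason non-monotonic I-V curves are awkward for a circuit simulator is that a simple circuit can have several DC operating points. The sketch below illustrates this with a hypothetical normalized cubic device curve in series with a resistor and a voltage source, and finds every intersection with the load line by scanning and bisection. The device curve, the function names, and the solution method are assumptions for illustration; this is not the simulator described in the paper.

```python
def device_current(v):
    """Hypothetical normalized I-V curve with a negative-resistance region."""
    return v ** 3 - 1.5 * v ** 2 + 0.6 * v


def operating_points(v_source, resistance, v_max=1.0, steps=301, tol=1e-9):
    """Find voltages where the device current equals the load-line current.

    Scans [0, v_max] for sign changes and refines each by bisection.
    A root that falls exactly on a scan grid point is not detected.
    """
    def mismatch(v):
        return device_current(v) - (v_source - v) / resistance

    roots = []
    step = v_max / steps
    for k in range(steps):
        lo, hi = k * step, (k + 1) * step
        f_lo, f_hi = mismatch(lo), mismatch(hi)
        if f_lo * f_hi >= 0:
            continue  # no strict sign change in this interval
        # Bisection: keep the half-interval that brackets the sign change.
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if f_lo * mismatch(mid) <= 0:
                hi = mid
            else:
                lo, f_lo = mid, mismatch(mid)
        roots.append(0.5 * (lo + hi))
    return roots
```

With `operating_points(1.0, 10.0)` the load line crosses the device curve three times, once on each rising branch and once in the negative-resistance region, so an iterative solver's answer would depend on its starting guess.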
Other Areas in ICCAD
- Placement, Routing, and Floorplanning
- Analog design and Methodology
- Verification
- Formal Verification
- Dynamic Verification
- Timing Analysis
- Delay and Signal Modeling
- Statistical Static Timing
- Retiming for Global Interconnects
Other Areas in ICCAD (Contd)
- CAD Algorithms for Emerging Technologies
- Reversible Logic Synthesis
- DNA Probe Array Layout
- MEMS
- Design for Customized Processors
- Synthesis
- Testing