Digital and Other ICs - PowerPoint PPT Presentation

Description:

Space Telemetry. Parallel conc. of 16-state conv. codes. 384kbps (rate ... Dead-zone estimation circuit with two voltage controlled delay lines. ...

Slides: 36
Provided by: bori99

Transcript and Presenter's Notes


1
Digital (and Other) ICs
  • Borivoje Nikolic
  • bora_at_eecs.berkeley.edu

2
Research Areas
  • Low-Density Parity-Check Decoders
  • PLLs
  • Low-Power Digital ICs
  • Power-performance optimization
  • Compensating the impact of variations
  • Working with advanced devices
  • Analog-to-Digital Converters

3
Outline
  • Low-Density Parity-Check Decoders
  • PLLs
  • Low-Power Digital ICs
  • Power-performance optimization
  • Compensating the impact of variations
  • Working with advanced devices
  • Analog-to-Digital Converters

4
Iterative Coding
  • Iterative decoders are a part of many new
    standards
  • Also 10G Ethernet, magnetic disk drive and tape
    storage,

5
Low-Density Parity-Check Codes
  • Low-density parity-check codes [Gallager 63]
  • Sparse binary parity-check matrix, H
  • Nullspace of H forms the set of codewords
  • Decoded using message-passing algorithms
  • Message-passing decoders
  • Low-density parity-check (LDPC) codes
  • Turbo-product codes are decoded similarly
[Figure label: interleaver]
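
The statement that the nullspace of H forms the set of codewords can be checked by brute force on a toy example. The sketch below is illustrative only: the 3x6 matrix `H` and the helper names `syndrome` and `codewords` are hypothetical and do not come from the slides, and enumeration is only practical for very short codes.

```python
from itertools import product

# Hypothetical 3x6 sparse parity-check matrix (illustration only).
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def syndrome(H, x):
    """Return H * x^T over GF(2), one parity-check result per row."""
    return [sum(h * b for h, b in zip(row, x)) % 2 for row in H]

def codewords(H):
    """Enumerate the nullspace of H by brute force (tiny block lengths only)."""
    n = len(H[0])
    return [x for x in product((0, 1), repeat=n) if not any(syndrome(H, x))]

words = codewords(H)
# Code rate = log2(number of codewords) / block length.
rate = (len(words).bit_length() - 1) / len(H[0])
```

A word is a codeword exactly when every parity check evaluates to zero; a message-passing decoder iterates toward such a word instead of enumerating all of them.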

6
Parallel LDPC Decoder Architecture
A. Blanksby and C. J. Howland, JSSC 2002
[Figure: processing elements PEv1, PEv2, PEv3, PEv4, ..., PEvN and PEc1, PEc2, ..., PEcM connected through an interconnect fabric]

7
Staggered Serial LDPC Decoder
E. Yeo et al., Globecom 2001
8
LDPC codes based on Galois Fields
  • Codes based on GF projections are low rate.
  • No cycles of length 4 (short loop)
  • Cyclic rows
  • e.g. (1023 x 1023) code has rate of 0.68
  • Column splitting
  • Each column in original matrix is split into four
  • Non-zero entries in original column are cycled
    through the 4 new columns
  • e.g. (1023 x 4092) code has rate of 0.75
  • Partial loss of regularity (cyclic structure)
  • Complex O(N^2) encoding
  • Puncturing
  • Truncate height of PC matrix
  • Columns in the maximum zero runlength region
    correspond to parity bit locations
  • Cyclic encoding using direct application of PC
    matrix now possible
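
Column splitting as described above can be sketched in a few lines. This is a minimal illustration under assumptions: the 8x8 circulant `H` is a made-up toy matrix (not one of the finite-geometry codes from the slide), and `split_columns` and `design_rate` are hypothetical helper names.

```python
def split_columns(H, q=4):
    """Split each column of H into q columns, cycling its nonzero entries."""
    rows, cols = len(H), len(H[0])
    out = [[0] * (cols * q) for _ in range(rows)]
    for c in range(cols):
        k = 0  # index of the next nonzero entry within this column
        for r in range(rows):
            if H[r][c]:
                out[r][c * q + k % q] = 1
                k += 1
    return out

def design_rate(H):
    """Rate if all rows are linearly independent (a lower bound otherwise)."""
    rows, cols = len(H), len(H[0])
    return 1 - rows / cols

# Toy cyclic matrix: every row is a right rotation of the first row.
first = [1, 1, 0, 1, 0, 1, 0, 0]
H = [first[-i:] + first[:-i] for i in range(8)]
H4 = split_columns(H, 4)
```

For a 1023 x 4092 matrix the same formula gives 1 - 1023/4092 = 0.75, which agrees with the quoted rate if the 1023 rows are linearly independent.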

Y. Kou et al., ISIT 2000
9
Shift register-based implementation
  • Staggered decoding.
  • Regularity of codes based on Finite Field
    geometries.

E. Yeo et al., Globecom 2001
10
4092-bit LDPC Decoder
1.8 million transistors
2.7 mm x 3.1 mm (10x smaller than a 1024-bit LDPC decoder)
1 GHz
Chip back in December
E. Yeo
11
Structured LDPCs
Bit node groups
Check node groups
Ed Liao
  • Construction based on Ramanujan graphs allows for
    hierarchical decomposition and good performance

12
LDPC Codes - Status
  • Two students graduated
  • Engling Yeo (Ph.D.): ST Microelectronics,
    Berkeley Lab
  • Ed Liao (M.S.): Qualcomm R&D, San Diego
  • Continuing investigation of variable rate,
    variable block size LDPC codes, based on
    structured constructions
  • BEE and ASIC implementations

13
Outline
  • Low-Density Parity-Check Decoders
  • PLLs
  • Low-Power Digital ICs
  • Power-performance optimization
  • Compensating the impact of variations
  • Working with advanced devices
  • Analog-to-Digital Converters

14
PLL Jitter Analysis
15
PLL Jitter Analysis
Adjusting the loop characteristics (ωN, ζ)
modulates the output jitter. There exists a
minimum that depends on the noise source
characteristics.
Problem: noise characteristics are NOT known a
priori!
Therefore, adaptive jitter optimization is
desirable!
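
Why a minimum exists can be illustrated with a textbook second-order loop. Everything in this sketch is an assumption, not the model from the slides: white reference noise of density `s_ref` is low-pass filtered by the loop, VCO noise with a `k_vco`/f^2 spectrum is high-pass filtered, and the closed-form terms come from integrating the standard second-order transfer functions. The constants are purely illustrative.

```python
import math

def output_jitter_var(w_n, zeta, s_ref, k_vco):
    """Output phase-noise variance for an assumed 2nd-order PLL model.

    The reference term grows with loop bandwidth (more input noise passes),
    while the VCO term shrinks with it (more VCO noise is corrected).
    """
    ref_part = s_ref * 0.5 * w_n * (zeta + 1.0 / (4.0 * zeta))
    vco_part = k_vco * math.pi ** 2 / (2.0 * zeta * w_n)
    return ref_part + vco_part

def best_bandwidth(zeta, s_ref, k_vco, grid):
    """Sweep candidate natural frequencies and return the lowest-jitter one."""
    return min(grid, key=lambda w: output_jitter_var(w, zeta, s_ref, k_vco))

# Log-spaced natural frequencies from 1e4 to 1e8 rad/s.
grid = [10 ** (e / 20) for e in range(80, 161)]
w_quiet_vco = best_bandwidth(0.707, s_ref=1e-12, k_vco=1e-3, grid=grid)
w_noisy_vco = best_bandwidth(0.707, s_ref=1e-12, k_vco=1e-1, grid=grid)
```

In this model the best natural frequency moves up when the VCO is noisier, so a loop tuned for one noise mix is suboptimal for another. That is the motivation for estimating jitter on chip and adapting the loop.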
16
PLL Circuit
17
Jitter Estimation
Signals track jitter boundaries
18
Implementation
Jitter Estimation
  • Circuit designed in 0.13 µm CMOS and taped out
  • Chip back in December, now in evaluation
  • Socrates Vamvakos

[Die photo labels: PLL, DL, Driver]
19
Outline
  • Low-Density Parity-Check Decoders
  • PLLs
  • Low-Power Digital ICs
  • Power-performance optimization
  • Compensating the impact of variations
  • Working with advanced devices
  • Analog-to-Digital Converters

20
Power is a Problem
  • If we continue doing business as usual, both
    dynamic and leakage power will be a problem

Chips are getting hot
and phones leaky!
  • Need to deliver maximum performance under power
    constraints

From S. Borkar, Intel
21
Optimizing Combinational Circuits
[Flow diagram labels: netlist; initial W, Vdd, Vth; static timer (C); Delay, Energy; W, Vdd, Vth; output]
  • OPTIMIZER (Matlab)
  • Minimize DELAY subject to
  • Maximum ENERGY

  • Generate energy-delay (E-D) tradeoffs for
    combinational blocks
  • Investigate the optimality of any given design
  • Optimize critical single-cycle blocks
  • Use inside microarchitecture optimizer
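
The "minimize delay subject to maximum energy" formulation can be sketched with a toy single-gate model and a grid search. This is not the Matlab optimizer or the static timer from the slide: the alpha-power-law delay expression, the capacitance numbers, the fixed Vth, and the function names are all assumptions chosen for illustration, and leakage energy is ignored.

```python
def delay(w, vdd, vth, c_load=10.0, alpha=1.3):
    """Alpha-power-law style gate delay in arbitrary units (assumed model)."""
    return (c_load + w) * vdd / (w * (vdd - vth) ** alpha)

def energy(w, vdd, c_load=10.0):
    """Switching energy of the gate's own capacitance plus its load."""
    return (c_load + w) * vdd ** 2

def minimize_delay(e_max, vth=0.3):
    """Grid search: smallest delay among (W, Vdd) points with energy <= e_max."""
    best = None
    for wi in range(1, 41):          # W from 0.5 to 20 (arbitrary units)
        w = 0.5 * wi
        for vi in range(5, 13):      # Vdd from 0.5 V to 1.2 V
            vdd = 0.1 * vi
            if energy(w, vdd) > e_max:
                continue
            d = delay(w, vdd, vth)
            if best is None or d < best[0]:
                best = (d, w, vdd)
    return best

# Sweeping the energy budget traces out one energy-delay tradeoff curve.
tradeoff = [(e, minimize_delay(e)) for e in (5.0, 10.0, 20.0, 40.0)]
```

Each entry pairs an energy budget with the best (delay, W, Vdd) found under it. A real flow would replace the grid search with a gradient-based or convex solver and the closed-form model with the static timer.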

22
Example: 64-bit CLA Adders
  • Wide adders are common in the critical paths of
    high performance microprocessors
  • Static adders are low power but slow
  • Domino logic is the choice for short cycle times
  • Setup
  • 0.13 µm, 6M, 1.2 V
  • Cout = 450 fF
  • Cin ? 150fF

Zlatanovici et al., ESSCIRC03
23
Adder Architecture in 90nm CMOS
[Figure: adder block diagram. Blocks: t, g gen; group4 gen, group16 gen, group64 gen; group4 prop, group16 prop; carry-in; sum select; sum gen; XOR (transmission gate); j, k (static). Signals: a[63:0], b[63:0], t[63:0], g[63:0], c1 to c4, pc1 to pc4, psel, H4, I4, H16, I16, H64, sum0[63:0], sum1[63:0]]
Critical path of five gate delays: 6.3 FO4 @
8.5 pJ/cycle
S. Kao, R. Zlatanovici
24
90nm Design
  • Finalize test strategy
  • Implement clock generator and assemble all
    circuits
  • Extract layout for timing and power verification
  • Tapeout Feb03

25
Optimizing Pipelined Circuits
  • Cycle boundaries: transparent latches

[Figure label: COMBINATIONAL LOGIC]
  • Grand goal: find the configuration (transistor
    sizes, cutset) achieving the shortest cycle time
    for a given power budget and pipeline depth

26
Using Posynomial Models: Results
  • Widening profile of NANDs and NORs, 3 cycles
  • Latches migrate towards the output as the power
    constraint tightens (start with the input)
  • Reverse direction of migration for narrowing
    profile
  • No migration for flat profile
  • Demonstrate on a floating-point unit
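
For reference, the general form of a posynomial and of the resulting optimization problem is given below. The symbols (x_i for sizing variables, T_cycle, P, P_max) are notation assumed here, not taken from the slides. The slides do not say how delay and power were actually modeled beyond "posynomial".

```latex
% A posynomial in positive variables x_1, ..., x_n:
f(x_1,\dots,x_n) = \sum_{k=1}^{K} c_k \prod_{i=1}^{n} x_i^{a_{ik}},
\qquad c_k > 0,\quad a_{ik} \in \mathbb{R},\quad x_i > 0

% With posynomial delay and power models, the sizing problem
% takes the form of a geometric program:
\min_{x > 0} \; T_{\text{cycle}}(x)
\quad \text{subject to} \quad P(x) \le P_{\max}
```

Geometric programs become convex after a logarithmic change of variables, which is why posynomial models are attractive for sizing problems of this kind.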

R. Zlatanovici
27
Micro-Architecture Optimization
[Figure labels: A, B: adders; input data rate f]
Optimal ELk/ESw is about 0.5 (all designs operate
at the throughput of the nominal design sized
for minimum delay under Vddmax and Vthref)
D. Markovic
28
Time-Mux SVD Example
[Figure: parallel architecture with four PE U? blocks; labels s1,w1 to s4,w4, rk1 to rk4, output y]
  • PE too fast
  • Large area

[Timing diagram over 0 to Tsymbol: PE-U?1 to PE-U?4, with wasted time marked]

Time-Mux Architecture
[Figure: a single PE U? block shared among s1,w1 to s4,w4, output y]
  • Some mux overhead
  • Large area reduction

29
Energy-Area Tradeoff
  • Top1 can be achieved with M5 (E < Eop1) or M3
    (E < Eop2)
  • Area (M5) = 3/5 Area (M3)

Energy-Area is a measure of the overall chip cost
30
Working With Advanced Devices
[Figure: cross-sections of three devices: Bulk MOSFET, Double-Gate (DG), Ultra-Thin Body (UTB); labels: Gate, Source, Drain, Buried Oxide, Tbody, Substrate]
[Bar charts comparing Bulk, DG, and UTB by changing VDD: Match Delay (Energy, fJ), Match Leakage (FO4, ps), Match Power (FO4, ps); extracted bar values, mapping to bars not recoverable: 12, 12, 7.2, 10.2, 9, 8.7, 7.9, 4.4, 3.3]
L. Chang, T.-J. King
31
FinFET SRAM Array
  • FinFET devices, Ldrawn = 50 nm, Leff = 20 nm
  • 1 metal layer, 0.35 µm technology
  • SRAM cell size: 5.75 x 4 µm
  • WL: poly
  • BL: M1 fin
  • 15x15 SRAM array
  • Static NAND decoder
  • Cross-coupled latch-based sense amp
  • Array size approx. 140 µm x 70 µm
  • Sematech run Jan04

R. Zlatanovici, S. Balasubramanian, with Prof.
T.-J. King
32
Advanced Devices
HP FinFET
LP FinFET
  • Back-Gated MOSFET: enhancement mode (BG-ENH), accumulation mode (BG-ACC)

[Plots comparing HP Fin, LP Fin, BG-ENH, and BG-ACC; axis labels: delay (ps), Frequency (GHz), Logic depth; application labels: DSP, embedded, uP]
J. Garrett, S. Balasubramanian, with T.-J. King
Adaptive VDD, Vth
33
Outline
  • Low-Density Parity-Check Decoders
  • PLLs
  • Low-Power Digital ICs
  • Power-performance optimization
  • Compensating the impact of variations
  • Working with advanced devices
  • Analog-to-Digital Converters

34
ADCs
  • Measured: 1.8-V, 14-b, 12-MS/s pipelined ADC in
    0.18-µm CMOS with 102-dB SFDR (Yun Chiu)
  • In design: 500-MS/s, 12-b, 1.2-V digitally
    background-calibrated ADC in 0.13-µm CMOS
  • After the lunch

35
Summary
  • LDPC decoder in testing (E. Yeo)
  • PLL in testing (S. Vamvakos)
  • Optimal power-performance tradeoffs, SVD (D.
    Markovic)
  • Power-performance optimal FPU (R. Zlatanovici)
  • Optimal 64-bit adder close to tapeout (S.Kao, R.
    Zlatanovici)
  • Adaptive VDD, VTh for low power (J. Garrett)
  • Power-performance optimization in synthesis flows
    (F. Sheikh)
  • Layout techniques to control variations (L.-T.
    Pang)