Title: Communication Latency Aware Low Power NoC Synthesis
1Communication Latency Aware Low Power NoC Synthesis
- Yuanfang Hu, Yi Zhu, Hongyu Chen, Ronald Graham, Chung-Kuan Cheng
- CSE, University of California, San Diego
- Synopsys, Inc.
2Outline
- Introduction
- Problem Statement
- Methodology
- Experiment Results
- Conclusions and Future Works
3Power and Latency in NoC
- NoC: a solution to efficient on-chip communication
- Power efficiency and communication latency are two important NoC design objectives
MIT Raw (0.18um, 300MHz): four 4x4 networks; average 7.2W, peak 14.8W; 36% of average chip power
Austin TRIPS (0.1um, 1GHz): four 5x5 networks; average 14.1W, peak 23.1W; 25% of average chip power
Data sources are from http://www.princeton.edu/peh/
4Topology Optimization
- NoC vs. traditional network topology design
  - Physical implementation aware: on-chip area becomes one of the critical constraints
- NoC topology has a large impact on power and latency
  - Different topologies lead to different numbers of hops, total wire length, flow distributions, etc.
Courtesy of Sergio Tota and Mario R. Casu at
Politecnico Di Torino
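The claim that topology changes the hop count can be checked on a small case. The sketch below compares a 4x4 mesh with a 4x4 torus by breadth-first search. The function names and the 4x4 size are choices made for this illustration, not part of the original slides.

```python
from collections import deque
from itertools import product


def grid_neighbors(n, wrap):
    """Adjacency for an n x n mesh (wrap=False) or torus (wrap=True)."""
    adj = {}
    for r, c in product(range(n), repeat=2):
        nbrs = set()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if wrap:
                rr, cc = rr % n, cc % n
            if 0 <= rr < n and 0 <= cc < n and (rr, cc) != (r, c):
                nbrs.add((rr, cc))
        adj[(r, c)] = nbrs
    return adj


def average_hops(adj):
    """Mean shortest-path hop count over all ordered pairs of distinct tiles."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


mesh_hops = average_hops(grid_neighbors(4, wrap=False))
torus_hops = average_hops(grid_neighbors(4, wrap=True))
print(f"4x4 mesh: {mesh_hops:.3f} hops, 4x4 torus: {torus_hops:.3f} hops")
```

The torus gets its lower hop count from wrap-around links, which are physically longer wires. That is the kind of tradeoff a topology-only comparison hides.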
5Wire Style Optimization
- Different wire styles have different properties,
in terms of power, latency, and routing area
6Motivation: An Example
Wire Style Library
Topology Library
Physical Synthesis
8NoC Design Constraints
- Bandwidth
  - Communication demands among tiles are satisfied
- Area
  - Physical implementation constraint: network routing area usage cannot exceed the available on-chip area resources
- Latency
  - Global average latency is less than a given budget
- Power
  - Total power consumption is less than a given budget
9Latency Constrained Low Power NoC Synthesis Problem
- Input
  - n x n tiles, a topology library, a library of interconnect wire styles
  - Communication demand matrix (src x dest)
- Objective
  - The power-efficient NoC topology and its physical implementation (wire styles and capacities)
- Constraints
  - Bandwidth requirements need to be satisfied
  - The wiring area cannot exceed the chip dimension
  - Global average latency cannot exceed the given budget
11Design Flow
12Multi-Commodity Flow (MCF) Approach
- MCF model in CAD
  - Global routing (Carden, TCAD96; Albrecht, TCAD01)
  - NoC mapping (Murali, DATE04)
- MCF is a static flow model
  - Does not consider queuing and contention impacts
- MCF vs. detailed simulation-based approach
  - A very fast evaluation method
  - Results need to be further verified by simulation
13Wire Style Representation in MCF
- Integrate multiple wire styles in MCF formulation
- Notations
  - Wire style parameters: (P_e, A_e, D_e)
  - Bandwidth d_i: communication demand for commodity i
  - Area A: routing area on vertical and horizontal dimension
  - Latency LT: global average latency budget
  - Flow f(p): flow amount on path p
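To show how these notations combine, here is a minimal sketch. It reads the three wire-style parameters as power, area, and delay per edge, and it takes "global average latency" to mean the flow-weighted mean of path delays. Both readings are assumptions, since the slide does not spell them out. The `Edge` class and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One link with its wire-style parameters (hypothetical units)."""
    power: float  # P_e: energy per bit carried on this edge
    area: float   # A_e: routing area per unit of bandwidth
    delay: float  # D_e: latency of traversing this edge


def path_delay(path):
    return sum(e.delay for e in path)


def path_power(path):
    return sum(e.power for e in path)


def total_power(flows):
    """flows: list of (path, f(p)) pairs; power scales with carried flow."""
    return sum(f * path_power(p) for p, f in flows)


def global_average_latency(flows):
    """Flow-weighted mean path delay, to be compared against the budget LT."""
    total_flow = sum(f for _, f in flows)
    return sum(f * path_delay(p) for p, f in flows) / total_flow


# Made-up numbers: a two-hop RC path and a one-hop express link.
rc = Edge(power=1.0, area=1.0, delay=2.0)
express = Edge(power=3.0, area=2.0, delay=1.0)
flows = [([rc, rc], 3.0), ([express], 1.0)]

avg_latency = global_average_latency(flows)
power = total_power(flows)
print(avg_latency, power)
```

Moving more flow onto the express link lowers the average latency but raises total power, which is the tradeoff the latency budget LT controls.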
14MCF Formulation
Objective: minimize power
Latency constraint
Bandwidth constraint
Area constraint
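The equations on this slide did not survive extraction. The block below is one plausible path-based form built from the notation of the previous slide. It is an assumption-laden sketch, not the authors' exact formulation. The symbols \mathcal{P}_i (candidate paths of commodity i) and C_k (edges crossing routing channel k) are introduced here for illustration only.

```latex
\begin{aligned}
\min_{f \ge 0}\quad
  & \sum_{i} \sum_{p \in \mathcal{P}_i} f(p) \sum_{e \in p} P_e \\
\text{s.t.}\quad
  & \sum_{i} \sum_{p \in \mathcal{P}_i} f(p) \sum_{e \in p} D_e
      \;\le\; LT \cdot \sum_{i} d_i
      && \text{(latency)} \\
  & \sum_{p \in \mathcal{P}_i} f(p) \;\ge\; d_i \quad \forall i
      && \text{(bandwidth)} \\
  & \sum_{e \in C_k} A_e \sum_{i} \sum_{p \in \mathcal{P}_i,\; p \ni e} f(p)
      \;\le\; A \quad \forall k
      && \text{(area)}
\end{aligned}
```

In this reading the latency constraint bounds the flow-weighted average path delay, and the area constraint is applied per routing channel rather than per edge.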
15Approximation MCF Algorithm
- Obtain (1+ε)-optimal solutions in polynomial time
- Based on LP primal-dual theory
- Core idea
  - Iteratively update primal and dual values till convergence
- Our algorithm is based on SODA02 by G. Karakostas
  - Use edge length to reflect the constraint violation
  - The worse the constraint violation, the larger the edge length
  - Route flows on shortest paths to best satisfy the constraints
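The length-update idea can be illustrated with a much simplified routine. This is not the Karakostas algorithm and carries no approximation guarantee. It handles only edge capacities (no power objective, latency, or area), routes each demand in equal slices, and grows the length of every used edge multiplicatively. The function names, the graph, and the parameters `eps` and `rounds` are hypothetical. The graph is assumed to connect every source to its destination.

```python
import heapq


def shortest_path(adj, length, src, dst):
    """Dijkstra over directed edges; returns the list of edges on the path."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v in adj.get(u, ()):
            nd = d + length[(u, v)]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    path, node = [], dst
    while node != src:
        path.append((prev[node], node))
        node = prev[node]
    return path[::-1]


def route_with_length_updates(capacity, demands, eps=0.1, rounds=20):
    """Route each demand in `rounds` equal slices on current shortest paths.

    After every slice, the length of each used edge grows by a factor of
    (1 + eps * slice / capacity), so heavily loaded edges look longer.
    """
    adj = {}
    for u, v in capacity:
        adj.setdefault(u, []).append(v)
    length = {e: 1.0 for e in capacity}
    load = {e: 0.0 for e in capacity}
    for _ in range(rounds):
        for src, dst, demand in demands:
            piece = demand / rounds
            for e in shortest_path(adj, length, src, dst):
                load[e] += piece
                length[e] *= 1.0 + eps * piece / capacity[e]
    return load


# Two parallel two-hop routes from s to t, each with capacity 1.0.
capacity = {("s", "a"): 1.0, ("a", "t"): 1.0, ("s", "b"): 1.0, ("b", "t"): 1.0}
load = route_with_length_updates(capacity, [("s", "t", 2.0)])
print(load)
```

Because a route becomes longer each time it is used, successive slices alternate between the two routes and the demand ends up split across both instead of overloading one.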
16Algorithm Flowchart
17Design Flow
18Topology Library
- We study topologies that have identical row and column connections
- Cover a lot of popular topologies such as mesh, torus, hypercube, octagon, twisted cube, etc.
- Users can add arbitrary topologies to the topology library
Number of topologies in topology library, generated based on Hu, ICCD05
19Power and Delay Library
- Wires
  - 0.18um tech node, min global pitch 1.44um
  - Table shows power and delay for unit wire length (2mm)
  - Power and delay of RC wires are proportional to wire length
  - Power and delay of T-line have setup cost: P(setup) = 4.4pJ/bit, D(setup) = 50ps
- Routers
  - 1GHz frequency, 4-flit buffer size, 128-bit flit size
  - Power model: Orion power simulator
  - Delay model: Peh, HPCA01
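The wire model described here can be sketched as a small cost function. Only the 2mm unit length and the T-line setup costs come from the slide. The per-unit power and delay values in the example calls are placeholders, because the table that held the real numbers is not in the text. Treating the T-line cost as setup plus a length-proportional term is also an assumption.

```python
UNIT_LENGTH_MM = 2.0  # unit wire length used in the slide's table

# Setup costs for transmission lines, as stated on the slide.
TLINE_SETUP_POWER_PJ = 4.4
TLINE_SETUP_DELAY_PS = 50.0


def wire_cost(length_mm, unit_power_pj, unit_delay_ps, is_tline=False):
    """Return (energy in pJ/bit, delay in ps) for one wire of the given length.

    unit_power_pj and unit_delay_ps are per-2mm values that the caller
    must supply from its own wire-style library.
    """
    units = length_mm / UNIT_LENGTH_MM
    power = units * unit_power_pj
    delay = units * unit_delay_ps
    if is_tline:
        # Assumed model: fixed setup cost plus a length-proportional part.
        power += TLINE_SETUP_POWER_PJ
        delay += TLINE_SETUP_DELAY_PS
    return power, delay


# Placeholder per-unit values, chosen only to exercise the function.
rc_power, rc_delay = wire_cost(8.0, unit_power_pj=1.0, unit_delay_ps=100.0)
tl_power, tl_delay = wire_cost(
    8.0, unit_power_pj=0.5, unit_delay_ps=20.0, is_tline=True
)
print(rc_power, rc_delay, tl_power, tl_delay)
```

With a setup cost, a T-line only pays off beyond some wire length, so the best wire style depends on how long each link of the chosen topology is.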
21Experiment Settings
- 0.18um technology node, 8x8 NoC with tile size 2x2mm
- Bandwidth requirements between any pair of tiles are 1Gb/s
- MCF approximation algorithm, error tolerance ε = 1%
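For a sense of problem size, the uniform traffic pattern can be written out as a demand matrix. The sketch assumes "any pair" means every ordered (source, destination) pair of distinct tiles, matching the src x dest demand matrix of the problem statement. The variable names are illustrative.

```python
N = 8               # 8x8 tiles
DEMAND_GBPS = 1.0   # demand between every ordered pair of distinct tiles

tiles = [(r, c) for r in range(N) for c in range(N)]
demand = {
    (src, dst): DEMAND_GBPS
    for src in tiles
    for dst in tiles
    if src != dst
}

num_commodities = len(demand)
total_demand = sum(demand.values())
print(num_commodities, total_demand)
```

Under that reading the MCF instance has 64 x 63 commodities, which gives a rough sense of why a fast approximation solver matters here.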
22(1) Power and Latency Tradeoffs
- Power and latency tradeoffs under different area constraints
- When area resource is abundant, sacrificing 2% of latency can bring up to 19.4% of power savings
23(2) Topology Selection
- Comparison of optimal topology with mesh, torus
and hypercube in terms of power latency product
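The comparison metric is simple to state in code. The sketch assumes that "improvement" means the relative reduction of the power-latency product against a baseline topology. The numbers are placeholders, not experimental results.

```python
def power_latency_product(power, latency):
    return power * latency


def improvement(baseline, optimized):
    """Fractional reduction of the optimized product relative to the baseline."""
    return 1.0 - optimized / baseline


# Placeholder values, not taken from the experiments.
baseline_plp = power_latency_product(power=10.0, latency=4.0)
optimized_plp = power_latency_product(power=8.0, latency=3.0)
gain = improvement(baseline_plp, optimized_plp)
print(f"{gain:.1%}")
```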
24(2) Topology Selection (Cont.)
- Optimal 8x8 topologies under various on-chip area
resources
25(3) MCF Algorithm Performance
- CPU time: comparison of CPLEX (a commercial LP solver) and our approximation MCF solver
26Conclusions and Future Works
- We study the latency constrained low power NoC synthesis problem, through a simultaneous optimization of topology and interconnect wire styles
- We implement an efficient approximation MCF algorithm
- Experiments show that for 8x8 NoC, using our power and delay parameters
  - Compared with mesh, torus and hypercube topologies, our optimized design can improve power latency product by up to 52.1%, 29.4%, and 35.6%, respectively
- Future directions
  - Latency constraint: global average latency constraint vs. pair-wise average latency constraint
  - FPGA routing architecture design
27Questions?
Thank You!
28(2) Topology Selection
- Comparison of optimal topology with mesh, torus
and hypercube in terms of power and latency
29Summary of Our Methodology
- The MCF model can quickly evaluate the minimum power consumption of a given NoC topology under given power and delay parameters
- We build a topology library and power and delay libraries, and use our methodology to find a solution to the NoC synthesis problem
- We emphasize the capability of our methodology
  - Users can set up their own power and delay libraries and topology library, and use our methodology to facilitate their design
30Capacity Constrained vs. Area Constrained
Our solution: the constraint is channel width instead of wire capacities
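A channel-width constraint can be checked by summing the pitch of every wire that crosses a channel. In the sketch below the 1.44um minimum global pitch and the 2mm tile size come from the earlier slides. The "wide" wire style, its pitch, and the use of one full tile edge as the channel width are assumptions made for illustration.

```python
def fits_channel(wire_counts, pitch_um, channel_width_um):
    """True if all wires crossing a channel fit side by side.

    wire_counts maps a wire-style name to the number of wires of that style;
    pitch_um maps the same names to the pitch each wire occupies.
    """
    used = sum(count * pitch_um[style] for style, count in wire_counts.items())
    return used <= channel_width_um


# "wide" is a hypothetical second style at twice the minimum pitch.
pitch_um = {"min_pitch": 1.44, "wide": 2.88}
channel_width_um = 2000.0  # one 2mm tile edge, assumed fully usable

max_min_pitch_wires = int(channel_width_um // pitch_um["min_pitch"])
ok = fits_channel({"min_pitch": 1000, "wide": 200}, pitch_um, channel_width_um)
print(max_min_pitch_wires, ok)
```

Because different wire styles share the same channel, a fixed per-edge capacity cannot express the tradeoff between many narrow wires and fewer wide ones. A width budget can.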
31Typical MCF Problem vs. Our MCF Problem
Our MCF formulation in NoC design
Typical MCF formulation
32Area Constrained MCF Algorithm
Length function with channel width constraint
Length function with edge capacity constraint
33Our NoC Model
- Tile based regular structure
- Every tile has its router for communicating with
other tiles