Title: NoC
2NoC
- General concepts
- Andreas Ehliar
- Per Karlström
3Outline
- Background
- Some Implementations
- Design Issues / Tools
- Example Application
- Conclusions
4Current situation
- Many transistors
- Hard to design
- IP cores
- Solve design issue
- Communication problem
- BUS TDM
- Can't handle many nodes
5Current Situation
6Current Situation
(Figure: five IP cores)
7NOC implementations
- SoCBUS
- xPipes
- Pleiades
- Eclipse
- (FPGA)
8SoCBUS
- Arbitrary topology
- Packet Connected Circuit (PCC)
- No need for buffers
- Payload transfer latency 1cc
- Mesochronous
- No asynchronous bridging
- Only retiming
- Timing per link
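The packet connected circuit (PCC) idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SoCBUS implementation: it assumes that a request locks each link along its route, that a blocked request releases what it has locked and is answered with a NACK, and that the source retries later. `Link`, `setup_circuit`, and `release_circuit` are hypothetical names.

```python
class Link:
    """One directed link between two switches; at most one circuit holds it."""

    def __init__(self, name):
        self.name = name
        self.owner = None  # id of the circuit currently holding the link


def setup_circuit(circuit_id, route):
    """Try to lock every link on the route, in order.

    Returns True (ACK) if the whole circuit was set up, False (NACK) if a
    link was busy. On failure, the links locked so far are released again.
    """
    locked = []
    for link in route:
        if link.owner is not None:
            for held in locked:
                held.owner = None
            return False
        link.owner = circuit_id
        locked.append(link)
    return True


def release_circuit(circuit_id, route):
    """Tear down a circuit after the payload has been transferred."""
    for link in route:
        if link.owner == circuit_id:
            link.owner = None


a, b, c = Link("a"), Link("b"), Link("c")
first = setup_circuit(1, [a, b])    # route is free: ACK
second = setup_circuit(2, [c, b])   # link b is busy: NACK, c is released
release_circuit(1, [a, b])
retry = setup_circuit(2, [c, b])    # succeeds after the teardown
print(first, second, retry)
```

Once a circuit is held end to end, the payload can stream through the switches without being stored in them, which is the property the slide points to with "No need for buffers".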
9SoCBUS
10xPipes
- Multi-GHz
- Heterogeneous
- Packet-switched
- Parameterizable components
- Compileable
- Wormhole switching
- Street sign routing
- Error detection
- Pipelined links
11xPipes
12Pleiades
- Platform
- Instantiation: Maia processor
- Heterogeneous FUs
- ALU
- Memories
- FPGAs
- MACs
- etc
- GALS
- Two level network
13Pleiades
14Eclipse
- Embedded chip level integrated parallel supercomputer
- 2D sparse mesh
- High bandwidth
- MTAC processors
- Multi Threaded Architecture with Chaining
- Thread Level Parallelism
15Eclipse
16FPGA
- Field Programmable Gate Array
- Archetype of future NoC
- Fine grained NoC
- Homogeneous blocks
- Heterogeneous links
17FPGA
18Different NoC
- Homogeneous
- In function
- Simplify design and floorplan
- Heterogeneous
- In function
- Better functionality and silicon usage
19Homogeneous NoC
(Figure: network of identical FU blocks)
20Heterogeneous NoC
(Figure: network mixing FU, MUL, DSP, and ALU blocks)
21Heterogeneous NoC
(Figure: network mixing DSP, FU, MUL, and ALU blocks)
22Quality of Service
- Guaranteed latency
- Guaranteed bandwidth
- Correctness
23Design Issues
- Physical layer
- Low swing
- Differential signaling
- Pseudo differential signaling
- Clocks (GALS, Mesochronous)
24Design Issues - Signaling
(Figure: voltage V versus time t)
25Design Issues - Clocking
(Figure: MEM, ALU, FPGA, and DSP blocks)
26Design Issues - Architecture
27Design Issues - Architecture
28Design Issues
- Data-link
- Error detection/correction
- Media access control
29Design Issues - Errors
(Figure: Cost plotted against Ne/Np for error detection and error correction)
30Design Issues
- Network layer
- Architecture
- Hierarchy
- Switching scheme
- Packet switched
- Circuit switched
31Design Issues
- Transport layer
- Connection-oriented/-less
- Flow control
- Packet segmentation / reassembly
- Reordering
32Design Issues - Flow Control
33Design Issues - Power Control
- Application Layer
- Power management
- Node/Network centric
- Power aware API
34Design Issues - Effect of Design
(Figure: design levels Silicon, Transistor, Gate, RTL, Architecture, Algorithm)
35Design Issues - Power Control
37Design Issues - Long Wires
- Solving the global interconnect mess
- Delay
- Bit errors
- Repeaters
- Clock domains
- Create one optimized solution that can be reused
38Design Issues - Long Wires
- Add flip flops to increase clock frequency
- What about ACKs?
(Figure: two NoC routers connected by a link)
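The ACK question can be made concrete with a small model. The sketch below is an assumption-laden illustration, not something taken from the slides: it assumes each inserted flip-flop adds one cycle in each direction, that the receiver needs one cycle to produce the ACK, and that the sender either waits for every ACK (stop-and-wait) or keeps a window of unacknowledged words in flight. The function names are hypothetical.

```python
def round_trip_cycles(stages):
    """Cycles from sending a word until its ACK is back at the sender.

    Assumed model: `stages` flip-flops on the forward wire, the same number
    on the return wire, plus one cycle in the receiver.
    """
    return 2 * stages + 1


def stop_and_wait_throughput(stages):
    """Words per cycle if the sender waits for each ACK before sending again."""
    return 1 / round_trip_cycles(stages)


def window_for_full_rate(stages):
    """Unacknowledged words the sender must allow to keep the link busy."""
    return round_trip_cycles(stages)


for stages in (0, 1, 2, 4):
    print(stages, stop_and_wait_throughput(stages), window_for_full_rate(stages))
```

Under this model every extra pipeline stage raises the clock frequency the wire can support but lengthens the ACK round trip, so a sender that waits for each ACK loses throughput unless it is allowed more words in flight.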
39Design Issues - Long Wires
- Add flip flops to increase clock frequency
- What about ACKs?
(Figure: two NoC routers connected by a link)
What about bit errors?
40Design Issues - Long Wires
- Bit errors on long wires will not be avoidable in the future
- Use error correcting codes
- Disadvantage: More wires
- Use parity bits to discover errors
- Resend damaged packets
- No longer possible to guarantee real-time performance
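The parity-and-resend option can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: a single even-parity bit per word, a test channel that corrupts one data bit on the first few transfers, and a sender that simply resends until the check passes. `make_flaky_channel` and `transfer_with_retry` are hypothetical helpers.

```python
def parity_bit(word):
    """Even parity: 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") & 1


def check(word, parity):
    """Receiver-side check: does the received parity match the word?"""
    return parity_bit(word) == parity


def make_flaky_channel(bad_transfers):
    """Channel that flips data bit 3 on its first `bad_transfers` uses."""
    state = {"count": 0}

    def channel(word, parity):
        state["count"] += 1
        if state["count"] <= bad_transfers:
            word ^= 1 << 3
        return word, parity

    return channel


def transfer_with_retry(word, channel, max_tries=4):
    """Resend until the parity check passes; returns (word, tries used)."""
    for attempt in range(1, max_tries + 1):
        rx_word, rx_parity = channel(word, parity_bit(word))
        if check(rx_word, rx_parity):
            return rx_word, attempt
    raise RuntimeError("link failed: too many damaged transfers")


result = transfer_with_retry(0xDEADBEEF, make_flaky_channel(2))
print(result)
```

The number of tries depends on how many transfers were damaged, which is why resending makes a latency guarantee hard to give. A single parity bit also misses any even number of flipped bits, so it only discovers some errors.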
41Design Issues - Long Wires
- Possibility to create heavily optimized solution
- Low voltage signaling
- Advanced symbol encoding/decoding
- Wave pipelining
42Design Issues - Long Wires
- High performance interconnect through wave pipelining
- Need very careful analysis
(Figure: four NoC routers)
43Design Issues - Long Wires
- Wave pipelining performance
- 3.45 GHz signaling on one bit line in 0.25 µm
- More energy efficient than regular pipeline
- Faster than regular pipeline
- Disadvantage
- Much harder to test/verify
44System design
- Typical tools
- Simulator
- Network generator
45System design
- What I would want
- Graphical frontend to design NoC
- C and RTL models of the finished NoC
- C API to create C level models of the NoC
- Mix C and RTL models in RTL simulator
- And of course...
46System design
(Figure: IP cores)
47Example Core Router
- SoCBUS Simulation
- Study of 16 port core router on a chip
- 16 x 10 Gigabit Ethernet Ports
- Prove feasibility of using SoCBUS
48Example Core Router
(Figure: router blocks IPP, FT, PB, CPU, MU, and OPP)
49Example Core Router
- IPP (Input Packet Processor)
- Receive packet from network
- Validate Packet/Filter packet
- Send lookup request to forwarding table
- Send packet to Packet Buffer
- FT (Forwarding Table)
- Get IP address from IPP
- Perform Lookup and send the output port to the packet buffer
- OPP (Output Packet Processor)
- Send packet to Network
50Example Core Router
- PB (Packet Buffer)
- Responsible for packet buffering
- Buffers packets until output port information is received from the forwarding table
- MU (Multicast Unit)
- Handle multicast packets
- CPU
51Example Core Router
- Data flow for a single packet
(Figure: Input Packet Processor, Forwarding Table, Packet Buffer, Output Packet Processor)
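The data flow for a single packet, as described for the IPP, FT, PB, and OPP, can be traced with a small Python sketch. It is an illustration only: the function name `forward_packet`, the packet fields, and the exact-match dictionary standing in for the forwarding table are assumptions, since the slides do not describe the lookup structure.

```python
def forward_packet(packet, forwarding_table):
    """Follow one packet through IPP, FT, PB, and OPP; return (port, trace)."""
    trace = ["IPP: receive packet from network"]

    # IPP validates/filters the packet before anything else happens.
    if not packet["valid"]:
        trace.append("IPP: packet filtered")
        return None, trace

    # IPP sends the lookup request to the FT and the packet to the PB.
    trace.append("IPP -> FT: lookup request")
    trace.append("IPP -> PB: packet")

    # FT looks up the destination. An exact-match dict stands in for the
    # real lookup structure (an assumption made for this sketch).
    port = forwarding_table[packet["dst"]]
    trace.append(f"FT -> PB: output port {port}")

    # PB held the packet until the port arrived; now it passes it on.
    trace.append(f"PB -> OPP[{port}]: packet")
    trace.append(f"OPP[{port}]: send packet to network")
    return port, trace


table = {"10.0.0.1": 3, "10.0.0.2": 7}   # hypothetical entries
port, trace = forward_packet({"dst": "10.0.0.2", "valid": True}, table)
print(port)
for step in trace:
    print(step)
```

Each packet causes one small lookup message to the forwarding table in addition to the packet transfer itself, so the forwarding table sees traffic from every input port.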
52Example Core Router
- Assumptions
- Each link can transfer 64 bits each clock cycle
- SoCBUS can be clocked at 1.2 GHz
- Packet buffers are large enough
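A quick calculation shows what these assumptions imply. The link width, clock, port count, and port rate come from the slides; the Ethernet framing constants (64-byte minimum frame, 8-byte preamble, 12-byte interframe gap) are standard values added here and are not stated in the presentation.

```python
LINK_WIDTH_BITS = 64       # from the slide
CLOCK_HZ = 1.2e9           # from the slide
PORTS = 16                 # from the slide
PORT_RATE_BPS = 10e9       # 10 Gigabit Ethernet

# Standard Ethernet framing, an addition not taken from the slides.
MIN_FRAME_BYTES = 64
PREAMBLE_BYTES = 8
INTERFRAME_GAP_BYTES = 12

# Raw bandwidth of one SoCBUS link and total traffic offered by all ports.
link_bps = LINK_WIDTH_BITS * CLOCK_HZ
aggregate_bps = PORTS * PORT_RATE_BPS

# Worst case: back-to-back minimum size packets on one port.
wire_bits = (MIN_FRAME_BYTES + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
packets_per_second = PORT_RATE_BPS / wire_bits
cycles_between_packets = CLOCK_HZ / packets_per_second
payload_cycles = MIN_FRAME_BYTES * 8 / LINK_WIDTH_BITS

print(f"link bandwidth: {link_bps / 1e9:.1f} Gbit/s")
print(f"aggregate port traffic: {aggregate_bps / 1e9:.0f} Gbit/s")
print(f"minimum size packets per port: {packets_per_second / 1e6:.2f} M/s")
print(f"clock cycles between packets: {cycles_between_packets:.2f}")
print(f"cycles to move one minimum frame: {payload_cycles:.0f}")
```

A single link has several times the bandwidth of one port, but all 16 ports together exceed one link, so shared resources such as the forwarding table deserve a closer look.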
53Example Core Router
- Results for Internet Mix packet sizes
54Example Core Router
- Results for minimum size packets
55Example Core Router
56Example Core Router
- Bottleneck in forwarding table access
- Current version of SoCBUS creates a virtual circuit for each request
- Proposal: Extend SoCBUS
- Reliable delivery of small (64 bit or less) packets without setting up a virtual circuit
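Why circuit setup hurts small requests can be shown with a rough efficiency estimate. The sketch assumes a 64-bit link that moves one word per cycle, and a setup cost of 10 cycles that is purely hypothetical, chosen only to show the trend; the real SoCBUS setup time is not given in the slides. `link_efficiency` is a hypothetical helper.

```python
def link_efficiency(payload_bits, setup_cycles, link_width_bits=64):
    """Fraction of the occupied cycles that actually carry payload."""
    payload_cycles = -(-payload_bits // link_width_bits)  # ceiling division
    return payload_cycles / (setup_cycles + payload_cycles)


SETUP_CYCLES = 10  # hypothetical value, for illustration only

lookup_request = link_efficiency(64, SETUP_CYCLES)        # one 64-bit word
full_packet = link_efficiency(1500 * 8, SETUP_CYCLES)     # 1500-byte packet
no_setup = link_efficiency(64, 0)                         # proposed extension

print(f"64-bit request with circuit setup: {lookup_request:.2f}")
print(f"1500-byte packet with circuit setup: {full_packet:.2f}")
print(f"64-bit request without setup: {no_setup:.2f}")
```

With any nonzero setup cost, a one-word lookup request spends most of its time on setup, while a long packet amortizes it. That is the motivation for delivering small packets without a virtual circuit.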
57Example Core Router
- Conclusion on this application example
- Initial concept seems to work in simulation
- Current work
- Master's thesis to test concept in an FPGA
58Our Reflections
- Many papers use routers for each connection core
- Not every IP core has to have a NoC Uplink
- Probably better to use local shared buses with a common NoC Uplink
- On the Internet, terminals are not connected directly to routers
- Hard to design a network if the traffic is unknown
59Our Reflections
- Research on how to improve NoCs can often be used to improve non-NoC based designs
- Communication over long distances
- Improved crossbars
- It will be hard to guarantee real-time performance on NoCs
60Conclusions
- NoC seems to be a reasonable tradeoff
- Similar to how standard cells make it easier to design chips
- No industry usage (yet?)
- As yet, no killer application has been demonstrated
- Next level of abstraction
- IP centric design
61Questions/Discussion
- Will future chips have communication patterns
favoring NoCs?
62References
- Networks on chips: a new SoC paradigm. Benini, L.; De Micheli, G. Computer, Volume 35, Issue 1, Jan. 2002, Pages 70-78
- Powering networks on chips. Benini, L.; De Micheli, G. System Synthesis, 2001. Proceedings. The 14th International Symposium on, 30 Sept.-3 Oct. 2001, Pages 33-38
- Addressing the system-on-a-chip interconnect woes through communication-based design. Sgroi, M.; Sheets, M.; Mihal, A.; Keutzer, K.; Malik, S.; Rabaey, J.; Sangiovanni-Vincentelli, A. Design Automation Conference, 2001. Proceedings, 18-22 June 2001, Pages 667-672
- On-chip networks: a scalable, communication-centric embedded system design paradigm. Henkel, J.; Wolf, W.; Chakradhar, S. VLSI Design, 2004. Proceedings. 17th International Conference on, 2004, Pages 845-851
- Design of a Core Router using the SoCBUS On-chip Network. Jimmy Svensson. LiTH-ISY-EX-04/3562-SE, LiTH
63References
- A scalable high-performance computing solution for networks on chips. Forsell, M. Micro, IEEE, Volume 22, Issue 5, Sept.-Oct. 2002, Pages 46-55
- Xpipes: a network-on-chip architecture for gigascale systems-on-chip. Bertozzi, D.; Benini, L. Circuits and Systems Magazine, IEEE, Volume 4, Issue 2, 2004, Pages 18-31
- xpipesCompiler: a tool for instantiating application specific networks on chip. Jalabert, A.; Murali, S.; Benini, L.; De Micheli, G. Design, Automation and Test in Europe Conference and Exhibition, 2004. Proceedings, Volume 2, 16-20 Feb. 2004, Pages 884-889 Vol.2
- A wave-pipelined on-chip interconnect structure for networks-on-chips. Jiang Xu; Wayne Wolf. High Performance Interconnects, 2003. Proceedings. 11th Symposium on, 20-22 Aug. 2003, Pages 10-14
- An on-chip network architecture for hard real time systems. Daniel Wiklund. LiU-TEK-LIC-2002:69, LiU