Title: Intel
1. Intel IXP2XXX Network Processor Architecture and Programming
Prof. Laxmi Bhuyan, Computer Science, UC Riverside
2. IXP2400
[Figure: IXP2400 block diagram. Eight MEv2 microengines (MEv2 1-8); Intel XScale core (32K IC, 32K DC) behind a gasket; DDR DRAM; two QDR SRAM channels (QDR SRAM 1-2, E/D Q, 18-bit data paths); Rbuf and Tbuf, 64 @ 128B each; SPI-3 or CSIX media interface (32b); PCI (64b, 66 MHz); hash unit (48/64/128); 16KB scratch; CSRs (Fast_wr, UART, Timers, GPIO, BootROM/Slow Port).]
3. IXP2400 Bandwidths
- 600 MHz Operation: 4.8 GOPs
- 2.5 Gb/s Full Duplex Media Interface
  - POS-PHY
  - Utopia
  - CSIX-L1
- 2.4 GB/s DDR Memory Bandwidth at 300 MT/s
- 1.6 GB/s QDR Memory Bandwidth with 200 MHz QDRII devices
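The headline figures above can be reproduced with simple arithmetic. The sketch below is a back-of-the-envelope check, not a statement of how Intel derived the numbers. It assumes one operation per microengine per clock, a 64-bit (8-byte) DDR data bus, and a QDR channel with separate read and write ports that each move 16 data bits on both clock edges. These widths are assumptions, not taken from the bullets, and all function names are illustrative.

```python
# Back-of-the-envelope check of the IXP2400 headline figures.
# ASSUMPTIONS (not stated in the bullets): one operation per microengine
# per clock, a 64-bit DDR data bus, and a QDR channel with independent
# read and write ports, each 16 data bits wide and double-pumped.

def peak_mops(microengines: int, clock_mhz: int) -> int:
    """Peak millions of operations per second across all microengines."""
    return microengines * clock_mhz

def ddr_mb_per_s(transfers_mt_s: int, bus_bytes: int = 8) -> int:
    """Peak DDR bandwidth in MB/s: transfers per second times bus width."""
    return transfers_mt_s * bus_bytes

def qdr_mb_per_s(clock_mhz: int, data_bytes: int = 2,
                 edges: int = 2, ports: int = 2) -> int:
    """Peak QDR bandwidth in MB/s, counting the read and write ports."""
    return clock_mhz * edges * data_bytes * ports

if __name__ == "__main__":
    print("GOPs:", peak_mops(8, 600) / 1000)
    print("DDR GB/s:", ddr_mb_per_s(300) / 1000)
    print("QDR GB/s:", qdr_mb_per_s(200) / 1000)
```

Under these assumptions the three results match the 4.8 GOPs, 2.4 GB/s and 1.6 GB/s figures quoted on the slide.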
4. IXP2400 Resources Summary
- Half Duplex OC-48 / 2.5 Gb/sec Network Processor
  - (8) Multi-Threaded Microengines
  - Intel XScale Core
  - Media / Switch Fabric Interface
  - PCI interface
  - 2 QDR SRAM interface controllers
  - 1 DDR SDRAM interface controller
  - 8 bit asynchronous port
    - Flash and CPU bus
- Additional integrated features
  - Hardware Hash Unit
  - 16 KByte Scratchpad Memory
  - Serial UART port
  - 8 general purpose I/O pins
  - Four 32-bit timers
  - JTAG Support
5. IXP2400 Full-Duplex OC-48 System Implementation
[Figure: IXP2400 full-duplex OC-48 system diagram, including SDRAM.]
6. IXP2400 Chaining
Glueless interface between IXP2400 devices using CSIX-L1
[Figure: a control plane processor attached over PCI 64/66 to three IXP2400 processors. The processors are connected by 2.5 Gb/s CSIX-L1 links, alongside a 2.5 Gb/s SPI-3 link. Each processor has DDR packet memory and two QDR SRAMs holding queues and tables.]
7. IXP2800
[Figure: IXP2800 block diagram. Sixteen MEv2 microengines (MEv2 1-16); Intel XScale core (32K IC, 32K DC) behind a gasket; three striped RDRAM channels (RDRAM 1-3); four QDR SRAM channels (QDR SRAM 1-4, E/D Q, 18-bit data paths); Rbuf and Tbuf, 64 @ 128B each; SPI-4 or CSIX media interface (16b); PCI (64b, 66 MHz); hash unit (48/64/128); 16KB scratch; CSRs (Fast_wr, UART, Timers, GPIO, BootROM/Slow Port).]
8. IXP2800 Bandwidths
- 1.4 GHz Operation: 20 GOPs
- 10 Gb/s Full Duplex Media Interface
  - SPI-4.2
  - CSIX-L1
- 1.9 GB/s QDR SRAM Memory Bandwidth/Channel
- 2.1 GB/s RDRAM Memory Bandwidth/Channel
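Because the memory figures above are quoted per channel, the aggregate peak depends on the channel count. A minimal sketch, assuming peak bandwidth simply adds across channels and using the channel counts from the IXP2800 resources summary (4 QDR SRAM controllers, 3 Rambus DRAM controllers); the function name is illustrative.

```python
# Aggregate peak memory bandwidth for the IXP2800, in MB/s.
# ASSUMPTION: per-channel peaks add linearly; sustained bandwidth in a
# real system will be lower.

QDR_MB_S_PER_CHANNEL = 1900     # 1.9 GB/s per QDR SRAM channel
RDRAM_MB_S_PER_CHANNEL = 2100   # 2.1 GB/s per RDRAM channel

def aggregate_mb_per_s(per_channel_mb_s: int, channels: int) -> int:
    """Sum the per-channel peak bandwidth over all channels."""
    return per_channel_mb_s * channels

qdr_total = aggregate_mb_per_s(QDR_MB_S_PER_CHANNEL, channels=4)
rdram_total = aggregate_mb_per_s(RDRAM_MB_S_PER_CHANNEL, channels=3)

if __name__ == "__main__":
    print("QDR SRAM total GB/s:", qdr_total / 1000)
    print("RDRAM total GB/s:", rdram_total / 1000)
```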
9. IXP2800 Resources Summary
- Half Duplex OC-192 / 10 Gb/sec Network Processor
  - (16) Multi-Threaded Microengines
  - Intel XScale Core
  - Media / Switch Fabric Interface
  - PCI interface
  - 4 QDR SRAM Interface Controllers
  - 3 Rambus DRAM Interface Controllers
  - 8 bit asynchronous port
    - Flash and CPU bus
- Additional integrated features
  - Hardware Hash Unit for generating 48-, 64-, or 128-bit adaptive polynomial hash keys
  - 16 KByte Scratchpad Memory
  - Serial UART port for debug
  - 8 general purpose I/O pins
  - Four 32-bit timers
  - JTAG Support
10. IXP2800 and IXP2400 Comparison
Feature | IXP2400 | IXP2800
Frequency | 600/400 MHz | 1.4/1.0 GHz / 650 MHz
DRAM Memory | 1 channel DDR DRAM, 150 MHz, up to 2 GB | 3 channels RDRAM, 800/1066 MHz, up to 2 GB
SRAM Memory | 2 channels QDR (or co-processor) | 4 channels QDR (or co-processor)
Media Interface | Separate 32-bit Tx and Rx, configurable to SPI-3, UTOPIA 3 or CSIX_L1 | Separate 16-bit Tx and Rx, configurable to SPI-4 P2 or CSIX_L1
Number of MicroEngines | 8 (MEv2) | 16 (MEv2)
Performance | Dual chip full duplex OC48 | Dual chip full duplex OC192
11. MicroEngine v2
[Figure: MEv2 block diagram. Inputs: D-Push bus, S-Push bus, and data from the next-neighbor microengine. Storage: control store (4K/8K instructions); local memory (640 words, addressed through LM Addr 0 and LM Addr 1, 2 per CTX); two banks of 128 GPRs; 128 next-neighbor registers; 128 S and 128 D transfer-in registers; 128 S and 128 D transfer-out registers. The 32-bit execution data path takes A and B operands (with Prev A and Prev B) and provides add/shift/logical, multiply, find first bit, a CRC unit with CRC remainder, a pseudo-random number source, and a CAM (TAGs 0-15, Lock 0-15, status and LRU logic, 6-bit). Also shown: other local CSRs, timers, timestamp, ALU_Out to the next neighbor, and the D-Pull and S-Pull buses.]
12. Microengine v2 Features, Part 1
- Clock Rates
  - IXP2400: 600/400 MHz
  - IXP2800: 1.4/1.0 GHz / 650 MHz
- Control Store
  - IXP2400: 4K instruction store
  - IXP2800: 8K instruction store
- Configurable to 4 or 8 threads
  - Each thread has its own program counter, registers, signal and wakeup events
  - Generalized Thread Signaling (15 signals per thread)
- Local Storage Options
  - 256 GPRs
  - 256 Transfer Registers
  - 128 Next Neighbor Registers
  - 640 32-bit words of local memory
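The thread count changes how much register state each thread can hold. The sketch below is illustrative only: it assumes the 256 GPRs are divided evenly among the active threads, which the slide implies ("each thread has its own ... registers") but does not spell out, and the function names are made up for this example.

```python
# Per-thread share of microengine storage.
# ASSUMPTION: GPRs are split evenly among the configured threads.

TOTAL_GPRS = 256
LOCAL_MEMORY_WORDS = 640   # 32-bit words
BYTES_PER_WORD = 4

def gprs_per_thread(threads: int) -> int:
    """GPRs available to each thread in 4- or 8-thread mode."""
    if threads not in (4, 8):
        raise ValueError("a microengine is configurable to 4 or 8 threads")
    return TOTAL_GPRS // threads

def local_memory_bytes() -> int:
    """Total local memory per microengine, in bytes."""
    return LOCAL_MEMORY_WORDS * BYTES_PER_WORD

if __name__ == "__main__":
    for n in (4, 8):
        print(n, "threads:", gprs_per_thread(n), "GPRs each")
    print("local memory:", local_memory_bytes(), "bytes")
```

Under that assumption, running fewer threads trades latency hiding for more registers per thread.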
13. Microengine v2 Features, Part 2
- CAM (Content Addressable Memory)
  - Performs parallel lookup on 16 32-bit entries
  - Reports a 9-bit lookup result
    - 4 State bits (software controlled, no impact to hardware)
    - Hit: entry number that hit; Miss: LRU entry
    - 4-bit index of CAM entry (Hit) or LRU (Miss)
  - Improves usage of multiple threads on same data
- CRC hardware
  - IXP2400: provides CRC_16, CRC_32
  - IXP2800: provides CRC_16, CRC_32, iSCSI, CRC_10 and CRC_5
  - Accelerates CRC computation for ATM AAL/SAR, ATM OAM and Storage applications
- Multiply hardware
  - Supports 8x24, 16x16 and 32x32
  - Accelerates metering in QoS algorithms
    - DiffServ, MPLS
- Pseudo Random Number generation
  - Accelerates RED, WRED algorithms
- 64-bit Time-stamp and 16-bit Profile count
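The CAM behaviour described above (16 entries, a hit that returns the matching entry and its state bits, a miss that returns the least recently used entry) can be modelled in software. This is a minimal sketch under stated assumptions, not the hardware's actual interface: the class and function names are hypothetical, the 9-bit packing (4 state bits, 1 hit bit, 4 index bits) is an illustrative layout rather than the documented register format, and the rule that a hit or a write makes an entry most recently used is assumed.

```python
# Software model of a 16-entry CAM with LRU replacement.
# ASSUMPTIONS: names and the 9-bit layout below are illustrative only.

def pack_result(hit: bool, index: int, state: int) -> int:
    """Pack into 9 bits: state[8:5], hit[4], index[3:0] (illustrative)."""
    return ((state & 0xF) << 5) | (int(hit) << 4) | (index & 0xF)

def unpack_result(result: int):
    """Return (hit, index, state) from a packed 9-bit result."""
    return bool((result >> 4) & 1), result & 0xF, (result >> 5) & 0xF

class Cam:
    ENTRIES = 16

    def __init__(self):
        self.tags = [None] * self.ENTRIES
        self.state = [0] * self.ENTRIES
        self.lru = list(range(self.ENTRIES))  # front = least recently used

    def _touch(self, index: int) -> None:
        # Mark an entry as most recently used.
        self.lru.remove(index)
        self.lru.append(index)

    def lookup(self, tag: int) -> int:
        """Hit: matching entry and its state. Miss: the LRU entry."""
        tag &= 0xFFFFFFFF
        for index, stored in enumerate(self.tags):
            if stored == tag:
                self._touch(index)
                return pack_result(True, index, self.state[index])
        return pack_result(False, self.lru[0], 0)

    def write(self, index: int, tag: int, state: int = 0) -> None:
        """Install a tag and its software-defined state bits."""
        self.tags[index] = tag & 0xFFFFFFFF
        self.state[index] = state & 0xF
        self._touch(index)

if __name__ == "__main__":
    cam = Cam()
    hit, index, _ = unpack_result(cam.lookup(0xC0A80001))
    if not hit:
        cam.write(index, 0xC0A80001, state=3)  # fill the LRU entry
    print(unpack_result(cam.lookup(0xC0A80001)))
```

A typical use is the one the slide hints at: several threads working on the same flow look up its key, and on a miss the returned LRU index tells the thread which entry to evict and refill.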
14. Intel XScale Core Overview
- High-performance, low-power, 32-bit embedded RISC processor
- Clock rate
  - IXP2400: 600 MHz
  - IXP2800: 700/500/325 MHz
- 32 Kbyte instruction cache
- 32 Kbyte data cache
- 2 Kbyte mini-data cache
- Write buffer
- Memory management unit
15. IXA Software Framework
[Figure: IXA software framework layers. Labels: External Processors; Control Plane Protocol Stacks; Control Plane PDK; OSSL; XScale Core Programming Model (Core Components, Core Component Infrastructure Library, Resource Manager Library); Microengine Programming Model (Microblocks, Microblock Infrastructure Library, Utility Library, Protocol Library, Hardware Abstraction Library).]
16. IXA Software Framework - Goals
- Accelerate software development for the IXP family of network processors
- Provide a simple and consistent infrastructure to write networking applications
- Enable reuse of components across applications
- Improve portability of code across the IXP family
17. Microengine Programming Model
18. Microblock Programming Model
- Data Plane Libraries
  - Libraries for commonly used functions
- Microblock Infrastructure Library
  - Used by the Microblocks and the DL to manage packet meta data and DL variables
- Microblocks
  - Enable development of modular code building blocks
  - Define the data flow model, common data structures, state sharing between code blocks etc.
  - Ensure consistency and improve reuse across the different reference applications
- Dispatch Loop (DL)
  - The glue code that binds Microblocks together to form a Microblock Group
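The relationship between microblocks, packet meta data, DL variables and the dispatch loop can be sketched as follows. This is a conceptual illustration in Python, not IXA SDK code: real microblocks are written in microengine assembly or C, and every name here (`Packet`, `DlState`, `next_block`, `classify`, `ipv4_forward`, the toy port selection) is hypothetical.

```python
# Conceptual sketch of a dispatch loop binding two microblocks.
# ASSUMPTION: all names and the toy forwarding rule are illustrative.
from dataclasses import dataclass, field

DROP = "drop"   # terminal targets understood by the dispatch loop
DONE = "done"

@dataclass
class Packet:
    data: bytes
    meta: dict = field(default_factory=dict)   # packet meta data

@dataclass
class DlState:
    next_block: str   # DL variable: which microblock runs next

def classify(pkt: Packet, dl: DlState) -> None:
    """Microblock 1: tag the packet and choose the next block."""
    is_ipv4 = len(pkt.data) > 0 and (pkt.data[0] >> 4) == 4
    pkt.meta["is_ipv4"] = is_ipv4
    dl.next_block = "ipv4_forward" if is_ipv4 else DROP

def ipv4_forward(pkt: Packet, dl: DlState) -> None:
    """Microblock 2: toy forwarding decision (last byte modulo 4)."""
    pkt.meta["out_port"] = pkt.data[-1] % 4
    dl.next_block = DONE

MICROBLOCKS = {"classify": classify, "ipv4_forward": ipv4_forward}

def dispatch_loop(pkt: Packet, first_block: str = "classify") -> str:
    """Glue code: run microblocks until one selects a terminal target."""
    dl = DlState(next_block=first_block)
    while dl.next_block not in (DROP, DONE):
        MICROBLOCKS[dl.next_block](pkt, dl)
    return dl.next_block

if __name__ == "__main__":
    pkt = Packet(bytes([0x45]) + bytes(18) + bytes([7]))
    print(dispatch_loop(pkt), pkt.meta)
```

The microblocks never call each other directly; each one only reads and writes the shared meta data and the DL variable, which is what keeps them independent and reusable.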
19. Microblocks
- A combined set of macros/functions that perform a data plane network processing function
- Each Microblock performs a major function on a packet
  - 5-Tuple Classification, IPv4 Forwarding, NAT
- Written independent of each other
  - Reusable across applications
- Use the infrastructure library
  - Access and modify packet meta data and DL variables
- Use data plane libraries
  - Hardware abstraction and code reusability
20. Microblock Architecture
[Figure: microblock architecture diagram showing Driver Microblocks and Packet Processing Microblocks.]