Title: THE DEVELOPMENT OF NETWORK PROCESSOR TECHNOLOGY
1. THE DEVELOPMENT OF NETWORK PROCESSOR TECHNOLOGY
- Adviser: Dr. Gaj
- Co-Adviser: Dr. Mark
2. Scope of Presentation
- 1. Introduction to NP
- 2. Evolution of NP development
- 3. IXP 1200 network processor
- 4. Adding security functionality to the network processor
- 5. Conclusion
3. Introduction
- Why a network processor?
- Widespread adoption of Internet technology
- Data explosion
- Need to send huge volumes of data over networks at high speed
- Evolution: CPU based -> ASIC based -> NP based
4. CPU Based Router
- A computer with multiple NICs installed
- Running software dedicated to routing
- PC with Linux -> small router
- Packet flow
- NIC1, buffer -> memory -> CPU, register, processing -> memory -> NIC2, buffer
5. CPU Based Router
- Good flexibility to program the instructions
- Update and upgrade through software
6. CPU Based Router - Drawback
- Cannot keep up with line speed
- In 1994-2000, network bandwidth grew from 622 Mbps to 10 Gbps
- CPU speed grew from 100 MHz to 2 GHz
- Rule of thumb: approximately 1 GHz of CPU to handle a 1 Gbps data rate
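The gap can be made concrete with the slide's own figures. The sketch below takes the rule of thumb (roughly 1 GHz of CPU per 1 Gbps of traffic) at face value; the function name and the assumption that the rule scales linearly are illustrative only.

```python
# Rule of thumb from the slide: about 1 GHz of CPU clock per 1 Gbps of traffic.
# Treating it as a linear ratio is an assumption made for illustration.
GHZ_PER_GBPS = 1.0


def required_cpu_ghz(line_rate_gbps: float) -> float:
    """Estimated CPU clock (GHz) needed to keep up with a line rate (Gbps)."""
    return line_rate_gbps * GHZ_PER_GBPS


line_rate_gbps = 10.0   # network bandwidth by 2000, per the slide
cpu_ghz = 2.0           # CPU clock by 2000, per the slide

shortfall = required_cpu_ghz(line_rate_gbps) / cpu_ghz
print(f"needed: {required_cpu_ghz(line_rate_gbps):.0f} GHz, "
      f"available: {cpu_ghz:.0f} GHz, shortfall: {shortfall:.0f}x")
```

Under this rule a 10 Gbps link would need about 10 GHz of CPU, five times what a 2 GHz processor offers.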
7. CPU Based Router - Drawback
- Demand for Quality of Service
- Internet banking and e-commerce require instant interaction vs. e-mail, WWW
- Rich user vs. poor user
- Video on demand vs. video conference
- Each traffic type has its own priority level
- Traffic management needs more complicated algorithms and higher processing speed
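One simple way to honor per-type priority levels is a strict-priority scheduler. The sketch below is only an illustration: the numeric priorities and the class name are hypothetical, since the slide names the traffic types but does not rank them.

```python
import heapq
from itertools import count

# Hypothetical priority levels (lower number = served first); the slide names
# the traffic types but does not assign numeric priorities.
PRIORITY = {
    "internet_banking": 0,
    "video_conference": 1,
    "video_on_demand": 2,
    "email": 3,
}


class PriorityScheduler:
    """Strict-priority packet scheduler; FIFO order within one priority level."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that preserves arrival order

    def enqueue(self, traffic_type, packet):
        heapq.heappush(self._heap, (PRIORITY[traffic_type], next(self._seq), packet))

    def dequeue(self):
        _, _, packet = heapq.heappop(self._heap)
        return packet

    def __len__(self):
        return len(self._heap)


sched = PriorityScheduler()
sched.enqueue("email", "mail-1")
sched.enqueue("video_on_demand", "vod-1")
sched.enqueue("internet_banking", "bank-1")
sched.enqueue("email", "mail-2")

order = [sched.dequeue() for _ in range(len(sched))]
print(order)  # ['bank-1', 'vod-1', 'mail-1', 'mail-2']
```

Every packet now costs a heap operation in addition to forwarding, which is the kind of extra per-packet work the slide refers to.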
8. CPU Based Router - Drawback
- PC is not optimized for network traffic
- PC is suited to big chunks of data: DMA, loading a file, doing I/O once in a while. Network traffic is I/O bound
- Voice streams use small 64-byte packets
- Not good for large numbers of small packets
- Spends much time idling during memory access (single thread)
- Not good for bit-level operations (using shift)
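The cost of small packets shows up as a per-packet cycle budget. The sketch below combines figures quoted on these slides (10 Gbps, 2 GHz, 64-byte packets); the function names are illustrative, and framing overhead is ignored as a simplifying assumption.

```python
def packets_per_second(line_rate_bps: float, packet_bytes: int) -> float:
    """Packet arrival rate when the link is saturated with equal-size packets."""
    return line_rate_bps / (packet_bytes * 8)


def cycles_per_packet(cpu_hz: float, line_rate_bps: float, packet_bytes: int) -> float:
    """CPU cycles available per packet before the next one arrives."""
    return cpu_hz / packets_per_second(line_rate_bps, packet_bytes)


# Figures from the slides: 10 Gbps line rate, 2 GHz CPU, 64-byte packets.
# Framing overhead (preamble, inter-frame gap) is ignored for simplicity.
pps = packets_per_second(10e9, 64)
budget = cycles_per_packet(2e9, 10e9, 64)
print(f"{pps:,.0f} packets/s, {budget:.1f} cycles per packet")
```

That is about 19.5 million packets per second and roughly 102 cycles per packet, so a single-threaded CPU that idles during memory access has very little budget left for actual processing.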
9. ASIC Based Router (Faster)
- The burden is distributed to each NIC
- Instructions are embedded in hardware to perform forwarding operations
10. ASIC Based Router - Drawback
- No re-programmability, not flexible
- Instructions are hardwired: difficult and costly to change for DiffServ, IPSec, IPv6
- Long development time
- 12-18 months
- Faster, more complicated applications need longer
- More complex design -> fabrication and verification
- Longer time to market, less competitive
11. ASIC Based Router - Drawback
- Layer 2 protocols are in flux
- Ethernet (LAN) standard is OK
- LAN -> VLAN (802.1Q)
- WAN vendors need to integrate them into a single multi-service product -> HDLC, Frame Relay, ATM, etc.
12. Network Processor
- What is an NP?
- A software-programmable device designed to process data packets at wire speed
- As flexible as a CPU
- As fast as an ASIC at wire speed
- Provides all the packet processing functions that the previous technologies provide
13. NP Architecture
- PPEs, Network Interface, Dedicated Hardware Unit, Control Processor, Memory Interface
14. NP Architecture (PPEs)
- Multiple Programmable Processing Engines (PPEs)
- More flexible than ASIC (hard-wired)
- Parallel processing, better than CPU
- The most widely adopted technology among vendors, e.g. micro-engines in the Intel IXP 1200, channel processors in the Motorola C-5, pico-engines in the IBM NP4GS3
15. NP Architecture (PPEs)
- Usually use a RISC, pipelined architecture
- Simplified instruction sets to reduce chip area
- Added bit manipulation functionality
- Topologies can be parallel, pipeline, and pool (see next page)
16. Figure: PPE topologies
17. NP Architecture (PPEs)
- Multiple hardware threads to hide memory access delay and accomplish more tasks
- Instant context swap (program counter, separate register sets for each context)
- Small internal program memory reduces instruction fetch time. Good: fast. Bad: the program cannot be too long or too complicated
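A back-of-the-envelope model shows why extra hardware threads hide memory delay. The sketch below is a simplification under stated assumptions (free context swaps, fully overlapping memory accesses); the cycle counts are made-up illustrative values, not figures from the slides.

```python
def engine_utilization(threads: int, compute_cycles: int, memory_cycles: int) -> float:
    """Fraction of time the engine does useful work.

    Simplified model (an assumption): every thread alternates between
    `compute_cycles` of work and a memory access of `memory_cycles`, context
    swaps are free, and memory accesses from different threads overlap fully.
    """
    busy = threads * compute_cycles
    period = compute_cycles + memory_cycles
    return min(1.0, busy / period)


for n in (1, 2, 4):
    print(n, "thread(s):", engine_utilization(n, compute_cycles=10, memory_cycles=30))
```

With these example numbers a single thread keeps the engine busy 25% of the time, two threads 50%, and four threads fully occupy it.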
18. NP Architecture - Network Interface
- Network Interface
- Connects to an external framer or MAC
- Framer: converts the bit stream to packet data
- MAC: Ethernet framer
- Framer or MAC can be built into the NP. Good: saves overall chip area. Bad: limits product flexibility
- Standards: UTOPIA level 2 and 3, SPI-3, SPI-4.2
19. NP Architecture - Dedicated Unit
- Dedicated Hardware Unit
- Offloads computationally intensive operations from the PPEs
- Lookup engines
- Queue management
- CRC calculation
- Security functions
- Trade-off: more dedicated units mean more chip area and higher cost
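CRC calculation is a good example of work that is cheap in dedicated hardware and expensive in software. The sketch below computes the standard CRC-32 used for the Ethernet frame check sequence one bit at a time, which is the inner loop a hardware unit removes from the PPEs; it is a software illustration, not a description of any particular NP's CRC unit.

```python
import zlib


def crc32_bitwise(data: bytes) -> int:
    """CRC-32 (the Ethernet frame check polynomial), computed one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # 0xEDB88320 is the bit-reflected form of polynomial 0x04C11DB7.
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


frame = b"123456789"
assert crc32_bitwise(frame) == zlib.crc32(frame)  # cross-check against zlib
print(hex(crc32_bitwise(frame)))  # 0xcbf43926
```

The loop runs eight shift-and-XOR steps per byte, so every byte of every packet costs a PPE many instructions unless a dedicated unit takes over.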
20. NP Architecture - Control Processor
- Control Processor
- Time-insensitive tasks, e.g. routing table updates, control and traffic management packets
- (How about time-sensitive tasks?)
- Exceptional packet processing, e.g. packets of unknown type
- System bring-up, reboot, system management
21. NP Architecture - Memory Interface
- SRAM
- Small and frequently accessed data, e.g. routing table, queuing information, packet pointers
- DRAM
- Large and rarely accessed data, e.g. packet data
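The split between the two memories can be illustrated with pointer queues. In the sketch below plain Python containers stand in for SRAM and DRAM, and all names are hypothetical: each packet is written once to the large memory, and only its small pointer moves between queues.

```python
from collections import deque

# Stand-ins for the two memories (plain Python containers, an assumption):
# "DRAM" holds the bulky packet bytes, "SRAM" holds small queues of pointers.
dram_packet_buffers = {}          # buffer index -> packet bytes
sram_rx_queue = deque()           # received, waiting for processing
sram_tx_queue = deque()           # processed, waiting for transmission
_next_buffer = 0


def receive(packet: bytes) -> int:
    """Store the packet once in DRAM and queue only its pointer in SRAM."""
    global _next_buffer
    index = _next_buffer
    _next_buffer += 1
    dram_packet_buffers[index] = packet
    sram_rx_queue.append(index)
    return index


def forward_next() -> int:
    """Move one packet from the RX queue to the TX queue by moving its pointer."""
    index = sram_rx_queue.popleft()
    sram_tx_queue.append(index)
    return index


receive(b"\x45" + bytes(63))      # a 64-byte packet
receive(b"\x45" + bytes(1499))    # a 1500-byte packet
forward_next()
print(list(sram_rx_queue), list(sram_tx_queue))  # [1] [0]
```

Queue operations touch only the small, frequently accessed pointers; the packet bytes are not copied when a packet changes queues.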
22. NP Advantages
- Less Time to Market (TTM)
- Software programmability: easy to implement, model, and sample the product
- Flexibility: easy to adapt to newer protocols, easy to add new functionality to an existing design
23. NP Advantages
- Longer Time in Market (TIM)
- New and critical functions can be added by re-programming the network processor within the device
- Can be upgraded via a software download to add new features and protocol support
24. NP Advantages
- Leverage 3rd-party development of applications
- 3rd-party vendors provide software packages and modules for commonly used applications
- Software reuse without the need to re-invent the wheel
- Creates a new industry
25. Intel IXP1200
26. Bit-level Operations
- How to set the 5th bit to one?
- CPU: Y | (1 << 5), two instructions
- NPU: with the help of the extra shifter, it can be done in one instruction
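The two steps a general-purpose CPU performs can be written out explicitly. The sketch below uses Python only to show the logic; the function name is illustrative, and bits are counted from 0 as in the slide's expression.

```python
def set_bit(value: int, bit: int) -> int:
    """Return `value` with bit number `bit` (counted from 0) set to one."""
    mask = 1 << bit        # step 1: shift a one into position
    return value | mask    # step 2: OR the mask into the value


y = 0b1000_0001
print(bin(set_bit(y, 5)))  # 0b10100001
```

An ALU with a shifter in front of one operand can fold the shift into the OR, which is how the NPU gets the same result in a single instruction.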
27. Microengine
28. Microengine
- 6 MEs x 4 threads = 24 threads
- ALU: addition, subtraction, logical operations. No multiplication, to save chip area
- 32-bit registers
- The power of the extra shifter, e.g. the TTL field: (FFFFFFFF) >> 24 = FF, FF - 1 = FE
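The TTL example can be spelled out as a shift followed by a subtraction. The sketch below assumes the 32-bit word is the third word of an IPv4 header, where TTL occupies the most significant byte; the function name is illustrative, and updating the header checksum, which a real forwarder must also do, is left out.

```python
def decrement_ttl(word: int) -> int:
    """Decrement the TTL held in the top byte of a 32-bit header word.

    In an IPv4 header the third 32-bit word holds TTL, protocol, and header
    checksum, with TTL in the most significant byte.
    """
    ttl = (word >> 24) & 0xFF          # shift the TTL down to the low byte
    if ttl == 0:
        raise ValueError("TTL already zero; packet should be dropped")
    new_ttl = ttl - 1
    return (word & 0x00FFFFFF) | (new_ttl << 24)


word = 0xFFFFFFFF                       # the example value from the slide
print(hex(word >> 24))                  # 0xff, TTL before
print(hex(decrement_ttl(word) >> 24))   # 0xfe, TTL after
```

On hardware with a shifter in the ALU data path, the shift and the arithmetic on the extracted field need not be separate instructions.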
29. Microengine
- Multiple threading
- 4 program counters
- Register sets can be divided into 4 parts for 4 threads; contexts swap in a single cycle
- Context swap can hide memory access latency
- The threads share the same instruction store; each thread can run the same or a different program, but the instruction store is limited (2048 instructions)
- The program in the instruction store is loaded by the StrongARM
30. Microengine
31. Microengine
- Separate register sets
- 3 types of 32-bit registers: 128 general-purpose registers, 64 SRAM transfer registers, 64 SDRAM transfer registers
- Read bus and write bus are separate, and so are the register sets: 32 for read, 32 for write, no addressing mode
- Can be addressed in global mode or thread-local mode
- Global mode -> shared variable, non-preemptive,
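The register counts translate into a per-thread budget. The sketch below simply divides each bank by the four threads of a microengine; the even split in thread-local mode is an assumption based on the earlier remark that the register sets can be divided into 4 parts.

```python
THREADS_PER_ME = 4

# Register counts per microengine, as listed on the slide.
registers = {
    "general_purpose": 128,
    "sram_transfer_read": 32,
    "sram_transfer_write": 32,
    "sdram_transfer_read": 32,
    "sdram_transfer_write": 32,
}

# Assumption: in thread-local mode each bank is split evenly across the threads.
per_thread = {name: total // THREADS_PER_ME for name, total in registers.items()}
print(per_thread)
```

Under that assumption each thread sees 32 general-purpose registers and 8 transfer registers of each kind.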
32. Memory Interface
33. Security functionality of network processor
- Why? More and more business/corporate and personal e-commerce transactions take place over the Internet
- Need data confidentiality and data integrity for information transmitted and received over the Internet and other networks
34. Security processor architecture
- For existing NP and network systems
- a. security co-processor architecture
- b. security accelerator architecture
- c. security in-line processor architecture
- For new design and development of NP networks
- a. network processor architecture with security functionality (on-chip core)
35. Security co-processor architecture
- The co-processor performs all security functions: protocol processing and encryption
37. Security accelerator architecture
- Used in conjunction with a host NPU
- Host NPU performs protocol processing, e.g. handshake, protocol header processing
- Security accelerator performs encryption of the payload
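The division of labor between host NPU and accelerator can be sketched as two functions. Everything below is hypothetical: the fixed header length, the function names, and above all the repeating-key XOR, which merely stands in for the accelerator's cipher and is not secure.

```python
from itertools import cycle

HEADER_LEN = 20  # hypothetical fixed header length for this toy packet format


def accelerator_encrypt(payload: bytes, key: bytes) -> bytes:
    """Stand-in for the security accelerator.

    A repeating-key XOR is used only to keep the sketch runnable; it is NOT
    a secure cipher and not what a real accelerator implements.
    """
    return bytes(b ^ k for b, k in zip(payload, cycle(key)))


def host_npu_process(packet: bytes, key: bytes) -> bytes:
    """Host NPU handles the header and hands only the payload to the accelerator."""
    header, payload = packet[:HEADER_LEN], packet[HEADER_LEN:]
    return header + accelerator_encrypt(payload, key)


key = b"\x5a\xa5"
packet = bytes(range(HEADER_LEN)) + b"secret payload"
protected = host_npu_process(packet, key)
print(protected[:HEADER_LEN] == packet[:HEADER_LEN])   # True: header left in clear
print(host_npu_process(protected, key) == packet)      # True: XOR is its own inverse
```

The point of the split is that only the payload bytes cross over to the accelerator, while header and handshake logic stay on the host NPU.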
39. In-line security processor architecture
- The architecture is referred to as a "bump in the wire"
- Placed before the NP to receive data packets on one side
- Encrypts data packets and sends them on the other side to the NP
- Hence, it duplicates most of the functions of the NP
41. On-chip security core architecture
- Includes security functionality in the NP architecture
- e.g. encryption of payload, protocol processing
- The IXP2850 implementation has 2 cryptography cores
- More efficient because it reduces the transfer of data packets back and forth between different memories
- Secures traffic on the fly with the cryptographic engine embedded in the NP
- Reduces real estate, power, and memory requirements
- On-chip core architecture is the future design for NP
43. Survey
44. Considerations for security functionality in network system
- 1. For existing NP and networks: co-processor, accelerator, and in-line architectures are more suitable and cost-effective
- 2. For relatively low data volume and moderate speed, the co-processor is cost-effective
- 3. For high traffic and data volume, the in-line architecture would be the most efficient and the easiest to integrate. However, cost will be high due to duplication of NP functionality
45. Considerations for security functionality in network system
- For new NP and network design
- 1. On-chip security core architecture gives the best efficiency, the lowest power consumption, and the smallest footprint
- 2. Depending on the demand for such performance, the cost of an on-chip security core NP should be affordable.
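The considerations on these two slides can be read as a small decision rule. The sketch below is one possible reading, not a recommendation from the presentation: the function name and the two boolean inputs are hypothetical, and the accelerator is left out because the slides do not single out a case for it.

```python
def suggest_security_architecture(new_design: bool, high_traffic: bool) -> str:
    """Map the considerations above onto a simple decision rule (a sketch)."""
    if new_design:
        # New NP and network designs: on-chip security core.
        return "on-chip security core"
    if high_traffic:
        # Existing systems with high traffic and data volume: in-line.
        return "in-line security processor"
    # Existing systems with low volume and moderate speed: co-processor.
    return "security co-processor"


print(suggest_security_architecture(new_design=False, high_traffic=False))
print(suggest_security_architecture(new_design=False, high_traffic=True))
print(suggest_security_architecture(new_design=True, high_traffic=True))
```

Cost is deliberately not modeled; the slides note that the in-line option is expensive and that on-chip pricing depends on demand.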
46. Conclusion
- The need for flexibility and speed resulted in a design shift to NP architecture
- The implementation of the IXP1200 demonstrated that NP architecture is efficient and can meet the demand for networking at wire speed.
- The design of NP architecture is still evolving. With the need for security functionality, an on-chip security core seems to be the way forward for future NP architecture, given its advantages in data efficiency, footprint, and minimal power consumption.
47. Thank You