1
Fundamental Architectural Considerations for
Network Processors
  • Review of a paper by
  • M. Peyravian and J. Calvignac
  • IBM Corporation, 2003

2
Introduction to Network Processors (NP)
  • Network Processors (NPs) are processors modified
    or optimized to support networking functions such
    as packet manipulation and deep packet inspection.
  • Tremendous advances in networking technology and
    the increased speed and bandwidth of networked
    devices have created an ever-increasing need for
    machines that can move packets around faster and
    more efficiently.
  • From a functionality point of view, network
    processors can be divided into two simple
    categories, namely
  • control-plane
  • data-plane

3
Control-Plane vs. Data-Plane
  • Control-plane processors have modest performance
    requirements, as they are helper processors used
    mostly to control the flow of traffic and enforce
    quality requirements without actually delving
    into the packet data itself.
  • Good examples are the Resource Reservation
    Protocol (RSVP) and the Open Shortest Path First
    (OSPF) protocol, which have been implemented on
    general-purpose processors such as the PowerPC.
  • Data-Plane Processors are primarily responsible
    for forwarding packets from a source to a
    destination.
  • Data-plane algorithms are best implemented on
    parallel processors, as network traffic exhibits
    a high degree of packet-level parallelism and
    data-plane tasks have short code paths.
  • Data-plane processors need to be performance
    optimized as they need to decode and move around
    large amounts of data to satisfy network Quality
    of Service requirements.
  • Most optimization research in network processors
    is aimed at data-plane processors.

4
Architecture Considerations
  • Most network processors use parallel processing
    through multiple Processing Engines (PEs), and
    their architectures can be divided into the
    following types
  • Parallel Architectures
  • Pipeline Stage Architectures

5
Parallel Model
  • Parallel-model NPs are typically scaled-down,
    modified RISC-based processors that contain
    bit-manipulation circuits to increase packet
    processing power.
  • They have small instruction and data caches and
    many of these PEs are fitted onto one NP chip to
    increase physical space efficiency.
  • In Parallel Mode, the Task Scheduler is
    responsible for assigning packets to PEs as they
    arrive at the network interfaces. It keeps track
    of which PEs are available and then assigns
    accordingly.
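A minimal sketch of this dispatch logic in C, assuming a scheduler that hands each arriving packet to the first idle PE; the names (NUM_PES, pe_busy, dispatch_packet) are illustrative and not from the paper:

#include <stdio.h>
#include <stdbool.h>

#define NUM_PES 8

static bool pe_busy[NUM_PES];  /* scheduler's view of which PEs are assigned */

/* Hand a packet to the first idle PE; return its index, or -1 if all
 * PEs are busy and the packet must wait. A real PE would clear its
 * busy flag when it finishes processing the packet. */
static int dispatch_packet(int packet_id)
{
    for (int pe = 0; pe < NUM_PES; pe++) {
        if (!pe_busy[pe]) {
            pe_busy[pe] = true;
            printf("packet %d -> PE %d\n", packet_id, pe);
            return pe;
        }
    }
    return -1;
}

int main(void)
{
    for (int id = 0; id < 10; id++)
        if (dispatch_packet(id) < 0)
            printf("packet %d queued\n", id);
    return 0;
}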

6
Pipeline Model
  • Pipeline-model NPs have multiple stages, each
    controlled by a PE, and each of these PEs is
    responsible for a certain kind of task.
  • Such a PE is referred to as a task-oriented
    processing engine.
  • Both types of NP architecture provide the same
    total processing time per packet, but they differ
    in the per-PE processing budget and the
    throughput requirement.
  • Parallel NPs have a much less stringent
    processing budget and can support
    higher-bandwidth networking appliances because of
    their design.
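A back-of-envelope illustration of that budget difference, under assumed figures (10 Gb/s line rate, 64-byte minimum packets, 8 PEs or stages) that do not come from the paper:

#include <stdio.h>

int main(void)
{
    const double line_rate_bps = 10e9;    /* 10 Gb/s link (assumed) */
    const double pkt_bits      = 64 * 8;  /* minimum-size packet */
    const int    n_units       = 8;       /* PEs (parallel) or stages (pipeline) */

    double pkt_time_ns = pkt_bits / line_rate_bps * 1e9;

    /* Parallel: a PE may spend n_units packet-arrival times on one packet.
     * Pipeline: every stage must finish within a single packet time. */
    printf("packet arrival time  : %.1f ns\n", pkt_time_ns);
    printf("parallel PE budget   : %.1f ns per packet\n", pkt_time_ns * n_units);
    printf("pipeline stage budget: %.1f ns per stage\n", pkt_time_ns);
    return 0;
}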

7
Memory Organization
  • The network processor memory holds
  • instruction code
  • control data and
  • the packets that need to be processed.
  • The instruction code represents the application
    programs that run on the processors and is stored
    in high-speed SRAMs with low-cycle access
    windows.
  • The amount of instruction memory depends on the
    kind of application that is slated to run on the
    NP. Lower-layer protocol processing requires just
    a few kilobytes of memory, while higher-layer
    deep-packet-processing techniques require several
    hundred kilobytes of instruction memory.
  • Control information, such as a routing table, is
    stored in the control memory, which is likewise
    implemented in high-speed SRAMs.
  • Packet data is stored in packet memory, which is
    implemented in large, low-cost DRAMs. This turns
    out to be the bottleneck in NP-based network
    appliances; the greater cost of SRAM versus DRAM
    is the limiting factor.
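Rough arithmetic behind the bottleneck claim, assuming (illustratively, not from the paper) a 10 Gb/s line rate and that every packet is written to packet memory once on arrival and read once on transmit:

#include <stdio.h>

int main(void)
{
    const double line_rate_gbps      = 10.0;  /* assumed line rate */
    const double accesses_per_packet = 2.0;   /* one write + one read */

    /* The DRAM must sustain a multiple of the line rate. */
    printf("required packet-memory bandwidth: %.0f Gb/s\n",
           line_rate_gbps * accesses_per_packet);
    return 0;
}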

8
3 Memory Models
  • NP memory can be structured in three different
    models, each with its own drawbacks and
    advantages. They are
  • Shared
  • Distributed
  • Hybrid
  • Shared memory is limited by scalability and is
    relatively slower in performance but offers a
    much simpler programming model.
  • The distributed model is exactly the opposite and
    offers much better scalability and performance,
    but is much more difficult to implement.
  • The Hybrid model is best suited for most
    applications and is much simpler to program.
  • In the hybrid model the PEs are partitioned into
    multiple clusters and within each cluster, the
    PEs have shared memory.
  • The instruction code is replicated throughout the
    model to avoid contention for instruction memory.
    PEs within a cluster also share control memory,
    since session information is cluster specific.
  • Other control information, such as route tables
    and global session control information, is shared
    across all clusters.
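One possible data layout for such a hybrid organization, as a C sketch; all sizes and names (PES_PER_CLUSTER, instr_mem, control_mem) are assumptions for illustration, not taken from the paper:

#include <stdio.h>
#include <stdint.h>

#define PES_PER_CLUSTER 4
#define NUM_CLUSTERS    4

struct pe_state {
    uint32_t regs[32];                 /* private per-PE state */
};

struct cluster {
    uint8_t instr_mem[16 * 1024];      /* instruction code, replicated
                                          per cluster to avoid contention */
    uint8_t control_mem[64 * 1024];    /* shared within the cluster,
                                          e.g. session tables */
    struct pe_state pes[PES_PER_CLUSTER];
};

/* Global control data (e.g. route tables) shared by all clusters. */
static uint8_t global_control_mem[1024 * 1024];
static struct cluster clusters[NUM_CLUSTERS];

int main(void)
{
    printf("per-cluster memory : %zu bytes\n", sizeof(struct cluster));
    printf("global control mem : %zu bytes\n", sizeof(global_control_mem));
    printf("clusters           : %d\n", NUM_CLUSTERS);
    return 0;
}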

9
Multithreading
  • The two main sources of latency in Network
    Processors are
  • Memory Accesses (which include both reads and
    writes)
  • Co-processor Accesses (which include the response
    time for requests issued by PEs)
  • Multithreading is a method of reducing latencies
    in NPs by allowing a PE to continue processing
    other packets while handling of the current
    packet is stalled for any reason, such as a
    memory access or another time-intensive task.
    There are two main approaches to implementing
    thread switching in NPs.
  • The set of registers which are active at a thread
    switch point are saved in memory and later
    restored when execution returns to the thread
    again. This can be counterproductive in some
    instances because the memory save and restore
    functions can take many processor cycles to
    complete.
  • A PE can contain one set of registers per thread
    and a single-stage PE can switch threads in one
    cycle since thread switching only requires
    pointing to one set of registers. In multi-stage
    pipeline based multithreading scenarios the
    pipeline might need to be cleared before thread
    switching can occur.
  • Multithreading increases the utilization of PEs
    but runs the risk of adding processor latency
    and, in turn, packet latency to the system. In
    real-time environments such as voice or video
    routing, latency can be a major problem and may
    be the more important factor to consider.
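A minimal sketch of the second approach above, one register set per thread, where a thread switch is just repointing to another register set rather than saving state to memory; names and sizes are illustrative:

#include <stdio.h>
#include <stdint.h>

#define NUM_THREADS 4
#define NUM_REGS    16

struct reg_set { uint32_t r[NUM_REGS]; };

static struct reg_set thread_regs[NUM_THREADS];  /* one register file per thread */
static int current = 0;                          /* index of the active thread */

/* On a stall (e.g. a memory access), switch threads by changing the
 * index; no save/restore of registers to memory is needed, so a
 * single-stage PE can switch in one cycle. */
static struct reg_set *switch_on_stall(void)
{
    current = (current + 1) % NUM_THREADS;       /* simple round robin */
    return &thread_regs[current];
}

int main(void)
{
    for (int i = 0; i < 6; i++)
        printf("stall -> now running thread %d\n",
               (int)(switch_on_stall() - thread_regs));
    return 0;
}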

10
Support for Traffic Management and Interface
Requirements
  • To help the NP interface with other parts of the
    network system, various external interfaces have
    been developed, and their structural and
    architectural implications have to be kept in
    mind. Common NP interfaces include serial
    interfaces, which can take a plug-in form. These
    include Fast Ethernet, Gigabit Ethernet, and the
    most popular of these, ATM serial-mode
    interfaces.

11
Support for Traffic Management and Interface
Requirements
  • The various traffic management techniques used in
    network appliances require different processor
    enhancement approaches. Network appliances can
    implement these techniques in software, use
    external devices, or depend on specialized
    hardware to perform these tasks. QoS support
    calls for a variety of flexible queuing schemes,
    such as
  • Priority Queuing
  • Round Robin Queuing
  • Weighted Fair Queuing.
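A minimal sketch of the first of these schemes, strict priority queuing, where the highest non-empty priority is always served first; round-robin and weighted fair queuing differ mainly in the selection rule. Queue sizes and names are illustrative:

#include <stdio.h>

#define NUM_PRIOS 4   /* priority 0 is highest */
#define QLEN      16

static int queues[NUM_PRIOS][QLEN];
static int count[NUM_PRIOS];

static void enqueue(int prio, int pkt)
{
    if (count[prio] < QLEN)
        queues[prio][count[prio]++] = pkt;  /* tail drop when full */
}

/* Always serve the highest non-empty priority first. */
static int dequeue(void)
{
    for (int p = 0; p < NUM_PRIOS; p++) {
        if (count[p] > 0) {
            int pkt = queues[p][0];
            for (int i = 1; i < count[p]; i++)    /* shift entries down; a */
                queues[p][i - 1] = queues[p][i];  /* ring buffer would avoid this */
            count[p]--;
            return pkt;
        }
    }
    return -1;  /* all queues empty */
}

int main(void)
{
    enqueue(2, 100); enqueue(0, 200); enqueue(1, 300);
    for (int i = 0; i < 3; i++)
        printf("dequeued packet %d\n", dequeue());  /* 200, 300, 100 */
    return 0;
}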

12
Network Processor Programming
  • The Parallel Architecture Programming Model
  • An important point in NP programming is that the
    parallel architecture, with its high data
    parallelism, can be exploited very effectively
    using the run-to-completion (RTC) programming
    model.
  • Under this model, the programmer writes software
    to run on only one thread, and this code is then
    simply replicated across the other threads.
  • This programming model is based on the Symmetric
    Multiprocessing (SMP) architecture, in which
    multiple PEs share the same memory. The PEs in
    this model form a pool of shared resources that
    are assigned work whenever they are found to be
    idle (see the sketch after this slide's bullets).
  • The Pipelined Architecture Programming Model
  • Distributed programming models have to be used to
    optimize categories of tasks and instructions in
    the pipeline model. This is a weaker programming
    model and leads to less efficient systems in most
    cases. The distributed architecture is also
    fragile, and its code is difficult to maintain
    and upgrade.
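A minimal sketch of the run-to-completion model described above for the parallel architecture: the whole fast path for a packet is one function, written once and run identically on every PE or thread over shared memory. The function and table names are illustrative:

#include <stdio.h>

struct packet { int dst; };

static int route_table[4] = { 10, 11, 12, 13 };  /* shared memory (SMP) */

/* The entire fast path for one packet runs here, start to finish,
 * with no hand-off to another stage; every PE executes this same
 * function on whichever packet it is assigned. */
static void process_packet(struct packet *p)
{
    int port = route_table[p->dst % 4];          /* shared-table lookup */
    printf("dst %d forwarded via port %d\n", p->dst, port);
}

int main(void)
{
    struct packet pkts[] = { {5}, {2}, {7} };
    for (int i = 0; i < 3; i++)
        process_packet(&pkts[i]);  /* each idle PE would pull the next
                                      packet and run this to completion */
    return 0;
}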

13
Conclusions
  • Network Processors are fast becoming a very
    important part of network appliances.
  • They need to keep up with the increasing demands
    on networks and network management appliances.
  • This paper looks at the various different NP
    architectures and gives a broad overview of
    existing technology, using examples from existing
    designs and future considerations.
  • This paper is meant more as a review of
    technology than a direction to be explored. It
    gives simple explanations of a very complex
    subject and is very easy to read.