Title: Design of a Diversified Router: Common Router Framework
1 Design of a Diversified Router: Common Router Framework
Jon Turner, John DeHart, Fred Kuhns
jon.turner_at_wustl.edu, jdd_at_arl.wustl.edu, fredk_at_arl.wustl.edu
http://www.arl.wustl.edu/arl
2 Outline
- JST's original slides
- Schedule
- Model
- Traffic Types
- IP Addressing
- Components
- Switch
- MetaLink Loopback Block
- LC
- Substrate Link Types
- Packet Formats
- Focus is on wired Ethernet
- LC Rx/Tx Design Implementation
- Common Router Framework (CRF)
- Functional Blocks for implementing a Router
- Framework is meant to define how we would support:
  - Multiple MetaRouters running on a shared NP Blade
  - A single MR running on a dedicated NP Blade could also use it.
3 Common Router Framework (CRF) Functional Blocks
[Figure: CRF functional blocks: Rx, DeMux, Parse, Lookup, HeaderFormat, Tx; MR-specific code MR-1 ... MR-n]
- Let's look at:
  - What data passes from block to block
  - What blocks touch the Buffer Descriptor
- For MR-specific code, an MR has memory access restricted to:
  - Packet Buffer
  - Control Block
  - Output Area
- We need a better definition of how an MR will use the Control Block and Output Area. Are they separate and distinct? What is each intended for?
4 Common Router Framework (CRF) Functional Blocks
[Figure: CRF functional blocks (Rx, DeMux, Parse, Lookup, HeaderFormat, Tx; MR-1 ... MR-n); labels: RBUF, Buf Handle (32b)]
- Rx
  - Function
    - Coordinate transfer of packets from RBUF to DRAM
  - Notes
    - We'll pass the Buffer Handle, which contains the SRAM address of the buffer descriptor.
    - From the SRAM address of the descriptor we can calculate the DRAM address of the buffer data.
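The address calculation mentioned in the Rx notes can be sketched as follows. This is only an illustration: the base addresses, the descriptor and buffer sizes, and the assumption that the low 24 bits of the handle carry the SRAM address are all hypothetical and are not given on the slides.

```python
# Hypothetical layout constants; none of these values come from the slides.
SRAM_DESC_BASE = 0x00100000    # assumed SRAM address of descriptor 0
DESC_SIZE = 32                 # assumed descriptor size in bytes (8 long words)
DRAM_BUF_BASE = 0x04000000     # assumed DRAM address of buffer 0
BUF_SIZE = 2048                # assumed fixed buffer size in bytes
HANDLE_ADDR_MASK = 0x00FFFFFF  # assumed: low 24 bits of the handle hold the SRAM address


def descriptor_addr(buf_handle):
    """Extract the SRAM address of the buffer descriptor from a 32-bit handle."""
    return buf_handle & HANDLE_ADDR_MASK


def dram_addr(buf_handle):
    """Map a buffer handle to the DRAM address of the buffer data.

    Descriptors and buffers are assumed to be laid out as parallel arrays,
    so descriptor i in SRAM corresponds to buffer i in DRAM.
    """
    delta = descriptor_addr(buf_handle) - SRAM_DESC_BASE
    if delta < 0 or delta % DESC_SIZE != 0:
        raise ValueError("handle does not point at a descriptor boundary")
    return DRAM_BUF_BASE + (delta // DESC_SIZE) * BUF_SIZE
```

With parallel arrays like this, passing only the 32-bit handle between blocks is enough for any block to find both the descriptor and the packet data.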
5 Common Router Framework (CRF) Functional Blocks
[Figure: CRF functional blocks (Rx, DeMux, Parse, Lookup, HeaderFormat, Tx; MR-1 ... MR-n); label: Buf Handle (32b)]
- DeMux
  - Function
    - Read Pkt Header from DRAM
    - Use VLAN from Ethernet header to determine destination MR in order to locate:
      - MR Parse code
      - MR-specific memory pointers
    - Write MR Id to Buffer Descriptor
    - Write VLAN to Buffer Descriptor
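As a rough illustration of the DeMux steps above, the sketch below pulls the VLAN ID out of an IEEE 802.1Q tagged Ethernet header and uses it to find the destination MR. The table contents, the dictionary-based descriptor, and the names `VLAN_TO_MR` and `demux` are hypothetical; only the 802.1Q header layout (TPID 0x8100 at byte 12, 12-bit VLAN ID in the low bits of the TCI) is standard.

```python
import struct

ETHERTYPE_VLAN = 0x8100  # IEEE 802.1Q tag protocol identifier

# Hypothetical table: VLAN ID -> per-MR information needed by later blocks.
VLAN_TO_MR = {
    100: {"mr_id": 1, "parse_code": "mr1_parse", "mem_ptr": 0x1000},
    200: {"mr_id": 2, "parse_code": "mr2_parse", "mem_ptr": 0x2000},
}


def demux(frame, descriptor):
    """Find the destination MR for a frame and record it in the descriptor.

    Returns the MR table entry, or None if the frame is untagged, too short,
    or carries a VLAN that no MR is bound to.
    """
    if len(frame) < 16:
        return None
    # Bytes 12-13 hold the TPID, bytes 14-15 the tag control information.
    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != ETHERTYPE_VLAN:
        return None
    vlan = tci & 0x0FFF  # low 12 bits of the TCI are the VLAN ID
    entry = VLAN_TO_MR.get(vlan)
    if entry is None:
        return None
    descriptor["mr_id"] = entry["mr_id"]  # "Write MR Id to Buffer Descriptor"
    descriptor["vlan"] = vlan             # "Write VLAN to Buffer Descriptor"
    return entry
```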
6 Common Router Framework (CRF) Functional Blocks
[Figure: CRF functional blocks (Rx, DeMux, Parse, Lookup, HeaderFormat, Tx; MR-1 ... MR-n); labels: Buf Handle (32b), DRAM Buf Ptr (32b), Buffer Offset (16b), MR Id (16b), Input MI (16b), MR Mem Ptr (32b), MR Lookup Key (16B)]
- Parse
  - Function
    - MR-specific header processing
    - Generate MR-specific lookup key (16 Bytes) from packet
    - Need CRF functionality to manage multiple MRs in shared PE.
  - Notes
    - Can Parse adjust the buffer/packet size and offset?
    - Can Parse do something like terminate a tunnel and strip off an outer header?
7 CRF Wrapper Around Parse
[Figure: CRF wrapper around Parse; label: MR Selector]
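One way to picture the MR Selector is as a dispatch table keyed by MR Id, with the CRF wrapper enforcing the 16-byte key size. The sketch below is an assumption about how such a wrapper could look; `PARSE_TABLE`, `crf_parse`, and the IPv4 key layout (source, destination, protocol, zero padding) are hypothetical and not taken from the slides.

```python
import struct

KEY_LEN = 16  # the slides specify a 16-byte MR-specific lookup key


def ipv4_parse(packet, offset):
    """Hypothetical MR-specific Parse for an IPv4 MR.

    Builds the key from the IPv4 source address, destination address and
    protocol field, zero padded to 16 bytes.
    """
    src, dst = struct.unpack_from("!4s4s", packet, offset + 12)
    proto = packet[offset + 9]
    return (src + dst + bytes([proto])).ljust(KEY_LEN, b"\x00")


# Hypothetical MR Selector table: MR Id -> MR-specific Parse routine.
PARSE_TABLE = {1: ipv4_parse}


def crf_parse(mr_id, packet, offset):
    """CRF wrapper: select the MR's Parse code and check what it returns."""
    handler = PARSE_TABLE.get(mr_id)
    if handler is None:
        return None  # no MR bound to this id
    key = handler(packet, offset)
    if len(key) != KEY_LEN:
        raise ValueError("MR Parse must return a 16-byte lookup key")
    return key
```

Keeping the size check in the wrapper, rather than in each MR's code, is one way the shared framework could protect the Lookup block from a misbehaving MR.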
8 Common Router Framework (CRF) Functional Blocks
[Figure: CRF functional blocks (Rx, DeMux, Parse, Lookup, HeaderFormat, Tx; MR-1 ... MR-n)]
- Lookup
  - Function
    - Perform lookup in TCAM based on MR Id and lookup key
  - Result
    - Output MI
    - QID
    - Stats index
    - MR-specific Lookup Result (flags, etc.?)
      - How wide can/should this be?
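To make the lookup step concrete, here is a small software model of a ternary match on the MR Id plus the 16-byte key. It is a sketch only: a real TCAM matches all entries in parallel, and the class name `SoftTcam`, the first-match priority rule, and the field names in `LookupResult` are assumptions made for illustration.

```python
from collections import namedtuple

# Result fields named after the slide; mr_flags stands in for the
# MR-specific part, whose width the slides leave open.
LookupResult = namedtuple("LookupResult", "output_mi qid stats_index mr_flags")


class SoftTcam:
    """Software stand-in for a TCAM: first matching entry wins."""

    def __init__(self):
        self.entries = []  # list of (value, mask, result) in priority order

    def add(self, mr_id, key, key_mask, result):
        """Install an entry; the 16-bit MR Id is always matched exactly."""
        value = int.from_bytes(mr_id.to_bytes(2, "big") + key, "big")
        mask = int.from_bytes(b"\xff\xff" + key_mask, "big")
        self.entries.append((value, mask, result))

    def lookup(self, mr_id, key):
        """Return the result of the first entry matching (MR Id, key)."""
        search = int.from_bytes(mr_id.to_bytes(2, "big") + key, "big")
        for value, mask, result in self.entries:
            if ((search ^ value) & mask) == 0:
                return result
        return None
```

Prefixing the key with the MR Id keeps the entries of different MRs from matching each other's packets even though they share one table.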
9 Common Router Framework (CRF) Functional Blocks
[Figure: CRF functional blocks (Rx, DeMux, Parse, Lookup, HeaderFormat, Tx; MR-1 ... MR-n)]
- Header Format
  - Function
    - MR-specific packet header formatting
    - Need CRF functionality to manage multiple MRs in shared PE.
10 CRF Wrapper Around Header Format
[Figure: CRF wrapper around Header Format; labels: MR Selector, Buffer Handle, QID, Buffer Offset]
- Buffer Offset gets written to the Buffer Descriptor. It may also cause size(s) in the Descriptor to be updated.
- Open questions:
  - What about trimming data?
  - What if the trimmed data is a whole buffer's worth, which would change the chaining?
  - Can they add/trim at either end?
11 Common Router Framework (CRF) Functional Blocks
[Figure: CRF functional blocks (Rx, DeMux, Parse, Lookup, HeaderFormat, Tx; MR-1 ... MR-n); labels: Buffer Handle (32b), Buf Handle (32b), QID (16b)]
- QM
  - Function
    - CRF queue management for Meta Interface queues
    - For performance reasons, QM may actually be implemented as multiple instances
      - Each instance on a separate ME would support a separate set of Meta Interfaces.
      - See next slide for more details
12 QM/Scheduler on Multiple MEs
[Figure: HeaderFormat, Input Hlpr (1 ME), multiple QM/Schd instances (1 ME each), Output Hlpr (1 ME), Tx; labels: Buf Handle (32b), Scratch Rings, NN Ring (twice)]
- QID (32b)
  - Reserved (8b)
  - QM ID (4b)
  - QID (20b): 1M queues per QM
- Input Hlpr would use QM ID to select the Scratch ring on which to put the request.
- Output Hlpr would process all Scratch rings coming from QM/Schd MEs and multiplex onto one NN ring to TX
- With 64 entries in Q-Array and 16 entries in CAM, max number of QM/Schds is probably 4 (2 bits).
  - We'll set aside 4 bits to give us flexibility in the future.
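The 32-bit QID layout above can be sketched as simple bit packing. The field widths come from the slide; placing Reserved in the top 8 bits, QM ID next, and the 20-bit queue number in the low bits is an assumption, as are the function names.

```python
QUEUE_BITS = 20  # per-QM queue number: 2**20 = 1M queues per QM
QM_ID_BITS = 4   # QM instance selector (only 2 bits needed for 4 QMs today)


def pack_qid(qm_id, queue):
    """Build the 32-bit QID word; the top 8 bits stay reserved (zero)."""
    if not 0 <= qm_id < (1 << QM_ID_BITS):
        raise ValueError("QM ID must fit in 4 bits")
    if not 0 <= queue < (1 << QUEUE_BITS):
        raise ValueError("queue number must fit in 20 bits")
    return (qm_id << QUEUE_BITS) | queue


def unpack_qid(word):
    """Split a 32-bit QID word into (qm_id, queue)."""
    qm_id = (word >> QUEUE_BITS) & ((1 << QM_ID_BITS) - 1)
    queue = word & ((1 << QUEUE_BITS) - 1)
    return qm_id, queue


def select_scratch_ring(word, rings):
    """What Input Hlpr would do: pick the scratch ring of the target QM."""
    qm_id, _ = unpack_qid(word)
    return rings[qm_id]
```

For example, `pack_qid(2, 5)` yields `0x200005`, and Input Hlpr would place that request on the ring for QM 2.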
13 Common Router Framework (CRF) Functional Blocks
[Figure: CRF functional blocks (Rx, DeMux, Parse, Lookup, HeaderFormat, Tx; MR-1 ... MR-n); labels: TBUF, Buffer Handle (32b)]
- Tx
  - Function
    - Coordinate transfer of packets from DRAM to TBUF
14 Common Router Framework (CRF) Functional Blocks
[Figure: CRF functional blocks (Rx, DeMux, Parse, Lookup, HeaderFormat, Tx; MR-1 ... MR-n)]
- What's missing?
  - Multicast
    - RFC 1812: "An IP router SHOULD support forwarding of IP multicast packets, ..."
    - Default IPv4 MR SHOULD support Multicast
    - It would be nice if our IPv4 MR was implemented using the CRF.
    - Thus, CRF SHOULD provide support for MRs that support Multicast.
    - Does PlanetLab make any use of Multicast?
  - Statistics/Monitoring
  - Exact definition of format for:
    - Data between blocks
    - Lookup result
  - Mapping of MR:MI to QID
    - Is queueing done strictly on a per MI basis?
15 Multicast Alternatives
- At least three options:
  - Force MRs that need Multicast to be Dedicated Blade MRs and do their own Multicast
    - For our short term goals this is probably sufficient and the best course.
    - Perhaps longer term we can look at adding it to the CRF
  - Treat as exception and send to XScale
  - Provide support in CRF for Multicast
    - Use Multi-Hit Lookup capability of the TCAM
    - MI bit mask defined in Lookup Result
      - Will put a bound on the number of MIs that can be supported on an MR because of the size of the lookup result.
      - Has issues of mapping bits in the bit mask to actual MIs.
    - Lookup Result contains an index into a table containing MI bit masks
    - Allow but do not force MRs to provide code to interpret Lookup Result.
      - This would also allow other possible extensions on an MR-specific basis
      - This carries with it the problem of bounding the execution time of the MR-specific code in the Lookup block. For general multicast, this could be a serious issue.
      - There are also issues with generating a QID based on an MI when the QID is not included in the Lookup Result.
- Other options?
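The two bit-mask variants listed above can be illustrated with a short sketch. The 16-bit mask width, the rule that bit i stands for MI i, and the contents of `MASK_TABLE` are all assumptions; the slides explicitly leave the bit-to-MI mapping as an open issue.

```python
MASK_WIDTH = 16  # assumed number of mask bits available in the Lookup Result


def mis_from_mask(mask, width=MASK_WIDTH):
    """Expand an MI bit mask into the list of output MIs (bit i -> MI i).

    Bits at or above `width` are ignored, which is the bound on the number
    of MIs per MR that the slide warns about.
    """
    return [mi for mi in range(width) if mask & (1 << mi)]


# Hypothetical table for the indirect variant: the Lookup Result carries
# only a small index, and the table holds the actual bit masks.
MASK_TABLE = [0b0000, 0b0101, 0b1110]


def mis_from_index(index):
    """Resolve a Lookup Result index into the list of output MIs."""
    return mis_from_mask(MASK_TABLE[index])
```

The indirect form keeps the Lookup Result narrow at the cost of one extra table read per multicast packet.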
16 Common Router Framework (CRF) Functional Blocks
[Figure: CRF functional blocks (Rx, DeMux, Parse, Lookup, HeaderFormat, Tx; MR-1 ... MR-n)]
- What's missing?
  - Multicast
  - Statistics/Monitoring
  - Exact definition of format for:
    - Data between blocks
    - Lookup result
  - Mapping of MR:MI to QID
    - Is queueing done strictly on a per MI basis?
17 Statistics and Monitoring
- MetaRouter Counters
  - What support is needed to allow MetaNets access to counters at individual MRs?
- Substrate Counters
  - Per Frame Counters
    - As part of design we have to enumerate these
    - They can probably be managed by the corresponding Micro Block, for example:
      - Number of Received Frames
        - RX Block counts each received frame, stores counter in SRAM
        - XScale can retrieve value from SRAM
      - Number of Transmitted Frames
        - TX Block counts each transmitted frame, stores counter in SRAM
        - XScale can retrieve value from SRAM
  - Dropped Frame Counters
    - Drops will probably occur at multiple places
      - RX: Badly formed Frame
      - QM: Queue Overflow
      - Lookup Result indicates frame is to be dropped
      - Etc.
  - Etc.
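A minimal model of the per-block substrate counters might look like the sketch below. The class name, the `(block, event)` keying, and the convention that drop events start with `drop_` are hypothetical; the dictionary simply stands in for counters that each micro block would keep in SRAM and that the XScale would read.

```python
from collections import Counter


class SubstrateCounters:
    """Stand-in for SRAM-resident counters, one per (block, event) pair."""

    def __init__(self):
        self.sram = Counter()

    def count(self, block, event, n=1):
        """Called by a micro block (e.g. RX, TX, QM) on the data path."""
        self.sram[(block, event)] += n

    def read(self, block, event):
        """Called by the control processor to retrieve a counter value."""
        return self.sram[(block, event)]

    def total_drops(self):
        """Sum drops across all the places where frames can be discarded."""
        return sum(value for (_, event), value in self.sram.items()
                   if event.startswith("drop_"))
```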
20 Common Router Framework (CRF) Functional Blocks
[Figure: CRF functional blocks (Rx, DeMux, Parse, Lookup, HeaderFormat, Tx; MR-1 ... MR-n)]
- What's missing?
  - Multicast
  - Statistics/Monitoring
  - Exact definition of format for:
    - Data between blocks
    - Lookup result
  - Mapping of MR:MI to QID
    - Is queueing done strictly on a per MI basis?
    - What is the limit on the number of queues and/or MIs?
21 Packet Buffer Descriptor Tradeoffs
- Why use a Buffer Descriptor at all?
  - QM needs something to link packets/buffers in queues
  - ME-to-ME communications costs vs. SRAM access costs
22 Packet Buffer Descriptor def
- Meta Data structure of Packet Buffers (LSB to MSB)
  - buffer_next (32 bits): Next Buffer Pointer (in a chain of buffers)
  - offset (16 bits): Offset to start of data in bytes
  - BufferSize (16 bits): Length of data in the current buffer in bytes
  - header_type (8 bits): Type of header at offset bytes into the buffer
  - rx_stat (4 bits): Receive status flags
  - free_list (4 bits): Freelist ID
  - packet_size (16 bits): Total packet size across multiple buffers
  - output_port (16 bits): Output Port on the egress processor
  - input_port (16 bits): Input Port on the ingress processor
  - nhid_type (4 bits): Nexthop ID type
  - reserved (4 bits): Reserved
  - fabric_port (8 bits): Output port for fabric indicating blade ID
  - nexthop_id (16 bits): NextHop IP ID
  - color (8 bits): QoS color
  - flow_id (24 bits): QoS flow ID or MPLS label/flow ID
  - reserved (16 bits): Reserved
  - class_id (16 bits): Class ID
  - packet_next (32 bits): Pointer to next packet (unused in cell mode)
23 Packet Buffer Descriptor Gets
- buffer_next: tx
- offset: rx, tx, fwd
- BufferSize: tx, fwd
- header_type: tx, fwd
- rx_stat: NONE
- free_list / packet_size: NONE
- output_port: qm(?), tx
- input_port: rx, fwd
- nhid_type: NONE
- fabric_port: qm(?), tx
- nexthop_id
- color
- flow_id
- class_id
- packet_next
24 Meta Data Caching
- Meta Data can be cached in one of three places:
  - SRAM Xfer Registers
  - DRAM Xfer Registers
  - GPR Registers
- Size of Meta Data Cache is controlled by define META_CACHE_SIZE
- Macro dl_meta_load_cache loads meta data cache
  - buffer_handle: buffer handle for which meta data is to be fetched
  - dl_meta: read transfer register prefix
    - Xbuf_alloc should be used to allocate the needed registers
  - signal_number
  - START_LW: starting long word for fetch
  - NUM_LW: number of long words to fetch
- Each microengine (microblock?) can use Meta Data Caching differently.
25 Meta Data Caching
- In the ipv4_v6_forwarder sample app:
  - dl_meta_load_cache() used in
    - Egress
      - ethernet_arp.uc
      - pkt_tx_16p.uc
      - statistics_util.uc
      - tx_helper.uc
    - Ingress
      - ethernet_arp.uc
      - pkt_tx_16p.uc
      - statistics_util.uc
      - tx_helper.uc
  - dl_meta_get_ used in
    - Egress
      - ethernet_arp.uc
      - pkt_tx_16p.uc
      - tx_helper.uc
    - Ingress
      - Ether.uc
26 Buffer Handle
27 Buffer Descriptor Usage
- Is there a different Buffer Descriptor definition for LC and PE?
- Will we support Multi-Buffer Packets?
  - If not, we do not need buffer_next (32b) or buffer_size (16b)
- QM uses packet_next for its packet chaining in qarray.
- Output Port and Input Port probably translate to TxMI and RxMI
- Next Hop fields (nhid_type (4b) and nexthop_id (16b)) probably can go away.
- QoS fields (color (8b) and flow_id (24b)) probably can go away.
- Two reserved fields (4b and 16b) can go away.
- class_id (16b) (virtual queue id?) can probably go away.
- fabric_port can probably go away.
28 Buffer Descriptor Usage
- PE Buffer Descriptor
  - MR_ID (16b)
  - TxMI (16b)
  - VLAN (16b)
  - buffer_next (32 bits): Next Buffer Pointer (in a chain of buffers)
  - offset (16 bits): Offset to start of data in bytes
  - BufferSize (16 bits): Length of data in the current buffer in bytes
  - header_type (8 bits): Type of header at offset bytes into the buffer
  - rx_stat (4 bits): Receive status flags
  - free_list (4 bits): Freelist ID
  - packet_size (16 bits): Total packet size across multiple buffers
  - output_port (16 bits): Output Port on the egress processor
  - input_port (16 bits): Input Port on the ingress processor
  - nhid_type (4 bits): Nexthop ID type
  - reserved (4 bits): Reserved
  - fabric_port (8 bits): Output port for fabric indicating blade ID
  - nexthop_id (16 bits): NextHop IP ID
  - color (8 bits): QoS color
29 Buffer Descriptor Usage
- PE Buffer Descriptor
  - LW0: buffer_next (32 bits): Next Buffer Pointer (in a chain of buffers)
  - LW1: offset (16 bits): Offset to start of data in bytes
  - LW1: BufferSize (16 bits): Length of data in the current buffer in bytes
  - LW2: reserved (8 bits): reserved/unused
  - LW2: reserved (4 bits): reserved/unused
  - LW2: free_list (4 bits): Freelist ID
  - LW2: packet_size (16 bits): Total packet size across multiple buffers
  - LW3: MR_ID (16 bits): Meta Router ID
  - LW3: TxMI (16 bits): Transmit Meta Interface
  - LW4: VLAN (16 bits): VLAN
  - LW4: reserved (16 bits): reserved/unused
  - LW5: reserved (32 bits): reserved/unused
  - LW6: reserved (32 bits): reserved/unused
  - LW7: packet_next (32 bits): Pointer to next packet (unused in cell mode)
- Leave multi-buffer fields there as a template for the dedicated blade implementation of a jumbo-frame MR.
  - Also reduces changes to Rx, Tx, and QM and reduces potential problems.
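The eight-long-word PE Buffer Descriptor above can be sketched as a pack/unpack pair. The field widths and LW assignments come from the slide; the bit positions inside each long word (fields listed first occupy the low-order bits, following the "LSB to MSB" note on the descriptor definition slide), the big-endian encoding, and the function names are assumptions.

```python
import struct


def _field(value, bits, shift):
    """Range-check a field and move it to its assumed bit position."""
    if not 0 <= value < (1 << bits):
        raise ValueError("field value does not fit in %d bits" % bits)
    return value << shift


def pack_pe_descriptor(buffer_next=0, offset=0, buffer_size=0, free_list=0,
                       packet_size=0, mr_id=0, tx_mi=0, vlan=0, packet_next=0):
    """Pack the PE Buffer Descriptor into 32 bytes (LW0..LW7)."""
    lw0 = _field(buffer_next, 32, 0)
    lw1 = _field(offset, 16, 0) | _field(buffer_size, 16, 16)
    # LW2 bits 0-11 are the two reserved fields (8b + 4b) and stay zero.
    lw2 = _field(free_list, 4, 12) | _field(packet_size, 16, 16)
    lw3 = _field(mr_id, 16, 0) | _field(tx_mi, 16, 16)
    lw4 = _field(vlan, 16, 0)  # upper 16 bits reserved
    lw7 = _field(packet_next, 32, 0)
    return struct.pack(">8I", lw0, lw1, lw2, lw3, lw4, 0, 0, lw7)


def unpack_pe_descriptor(raw):
    """Inverse of pack_pe_descriptor; returns a dict of field values."""
    lw = struct.unpack(">8I", raw)
    return {
        "buffer_next": lw[0],
        "offset": lw[1] & 0xFFFF,
        "buffer_size": lw[1] >> 16,
        "free_list": (lw[2] >> 12) & 0xF,
        "packet_size": lw[2] >> 16,
        "mr_id": lw[3] & 0xFFFF,
        "tx_mi": lw[3] >> 16,
        "vlan": lw[4] & 0xFFFF,
        "packet_next": lw[7],
    }
```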
30 Extra
- The next set of slides are for templates or extra information if needed
33 CRF Support for Multicast
[Figure: multicast support in the CRF; blocks: Parse, Lookup (Default/Unicast path, MR-Specific Path with MR Interp, Post Process), HeaderFormat; MR-1 ... MR-n]
34 CRF Support for Multicast
[Figure: Lookup block detail; blocks: Default path, MR Interp, MR-Specific Path, Post Process; labels: DRAM Buf Ptr, MR Id, MR Lookup Key, MR Ctrl Blk Ptr, MR Mem Ptr]
- We will need some kind of copy count, or a multicast bit and last copy bit, to let TX know when it can release the DRAM buffer that holds the packet.
35 CRF Support for Multicast
[Figure: Lookup block detail; blocks: Default path, MR Interp, MR-Specific Path, Post Process; labels: DRAM Buf Ptr, MR Id, MR Lookup Key, MR Specific Lookup Result, MR Ctrl Blk Ptr, MR Mem Ptr]
- We will need some kind of copy count, or a multicast bit and last copy bit, to let TX know when it can release the DRAM buffer that holds the packet.
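The copy-count variant of this idea can be sketched as a simple reference count that Tx decrements after each copy is sent. The class and function names are hypothetical, and the sketch ignores the atomicity a real multi-ME implementation would need when several Tx threads touch the same count.

```python
class PacketBuffer:
    """Stand-in for a DRAM packet buffer shared by several multicast copies."""

    def __init__(self, copies):
        if copies < 1:
            raise ValueError("a packet needs at least one copy")
        self.copies_left = copies  # set when the lookup fans the packet out
        self.released = False


def tx_done(buf):
    """Called by Tx after sending one copy.

    Returns True when this was the last copy and the buffer was released.
    """
    if buf.released:
        raise RuntimeError("buffer already released")
    buf.copies_left -= 1
    if buf.copies_left == 0:
        buf.released = True  # here the buffer would go back to its freelist
    return buf.released
```

A unicast packet is just the case `copies == 1`, so the same Tx code path could serve both.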
36 OLD
- The rest of these are old slides that should be deleted at some point.