Utilizing NIC's enhancements

Slides: 24
Provided by: CRU7
Learn more at: https://www.cs.usfca.edu

1
Utilizing NIC's enhancements
  • A look at how driver software needs to change
    when using newer features of our hardware

2
theory versus practice
  • The engineering designs one encounters in
    computer hardware components can be observed to
    undergo an evolution during successive
    iterations, from a scheme that embodies
    simplicity, purity, and symmetry at the outset,
    based upon what designers think will be the
    device's likely uses, to a conglomeration of
    disparate add-ons as actual practices dictate
    accommodations

3
backward compatibility
  • An historically important consideration in the
    marketing of computer hardware has been the need
    to maintain past functions in a transparent
    manner (i.e., no change is needed to run older
    software on newer equipment), while offering
    enhancements as options that can be selectively
    enabled

4
Example: Intel's x86
  • The current generation of Intel CPUs will still
    execute all of the software written for PCs a
    quarter-century ago, based on a small set of
    16-bit registers, a restricted set of
    instructions, and a one-megabyte memory-space,
    but is able, as an option, to use more and larger
    registers (64-bits), richer instruction-sets, and
    more memory

5
Gigabit NICs
  • Intel's network controller designs exhibit this
    same kind of evolution over time
  • The Legacy descriptor-formats are just one
    example of keeping prior-generation
    functionality: it's simple, it's pure (i.e.,
    not tied to any specific network-protocols, but
    emphasizing mechanism, not policy)
  • But now alternatives exist -- as options!

6
Legacy RX-Descriptors
  • Base-address (64-bits): the device-driver
    initializes this field with the physical address
    of a packet-buffer, and the network hardware does
    not ever modify it
  • Packet-length, Packet-checksum, status, errors,
    VLAN tag: the network controller will later
    write back values into all these fields when it
    has finished transferring a received packet's
    data into that packet-buffer

7
RxDesc Status-field
  • bit 0: DD = Descriptor Done (1=yes, 0=no); shows
    if the NIC is finished with the descriptor
  • bit 1: EOP = End Of Packet (1=yes, 0=no); shows
    if this descriptor is logically the last one
    for its packet
  • bit 2: IXSM = Ignore Checksum Indications
    (1=yes, 0=no)
  • bit 3: VP = VLAN Packet match (1=yes, 0=no)
  • bit 4: UDPCS = UDP Checksum calculated in packet
    (1=yes, 0=no)
  • bit 5: TCPCS = TCP Checksum calculated in packet
    (1=yes, 0=no)
  • bit 6: IPCS = IPv4 Checksum calculated on packet
    (1=yes, 0=no)
  • bit 7: PIF = Passed In-exact Filter (1=yes,
    0=no); shows if software must check
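The status bits on this slide can be tested with ordinary C bit masks. What follows is only a minimal sketch, assuming the bit positions given here (DD in bit 0 up through PIF in bit 7); the RXD_STAT_* names and the two helper functions are hypothetical and are not taken from the course driver.

```c
#include <stdint.h>

/* Bit masks for the legacy Rx-Descriptor status byte (hypothetical names) */
enum {
    RXD_STAT_DD    = 1 << 0,   /* Descriptor Done              */
    RXD_STAT_EOP   = 1 << 1,   /* End Of Packet                */
    RXD_STAT_IXSM  = 1 << 2,   /* Ignore Checksum Indications  */
    RXD_STAT_VP    = 1 << 3,   /* VLAN Packet match            */
    RXD_STAT_UDPCS = 1 << 4,   /* UDP Checksum calculated      */
    RXD_STAT_TCPCS = 1 << 5,   /* TCP Checksum calculated      */
    RXD_STAT_IPCS  = 1 << 6,   /* IPv4 Checksum calculated     */
    RXD_STAT_PIF   = 1 << 7    /* Passed In-exact Filter       */
};

/* Returns 1 when the NIC is done with the descriptor AND it ends a packet */
static int rx_packet_complete(uint8_t status)
{
    uint8_t need = RXD_STAT_DD | RXD_STAT_EOP;
    return (status & need) == need;
}

/* Returns 1 when the checksum indication bits may be consulted at all */
static int rx_checksum_info_valid(uint8_t status)
{
    return (status & RXD_STAT_DD) != 0 && (status & RXD_STAT_IXSM) == 0;
}
```

A driver's read() method could call rx_packet_complete() on each descriptor's status byte before copying any packet data to the user.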

8
RxDesc Errors-field
  • bit 0: CE = CRC Error or Alignment Error (check
    statistics registers to differentiate)
  • bit 1: SE = Symbol Error
  • bit 2: SEQ = Sequence Error
  • bits 3 and 4: reserved (0)
  • bit 5: TCPE = TCP/UDP Checksum Error
  • bit 6: IPE = IPv4 Checksum Error
  • bit 7: RXE = Rx Data Error
  • The SE and SEQ bits are relevant only while the
    NIC is operating in SerDes mode
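For a /proc display it can help to turn the errors byte into readable names. The sketch below assumes the bit positions described on this slide (CE in bit 0, RXE in bit 7, bits 3 and 4 reserved); the table rx_err_names and the function rx_errors_describe() are hypothetical helpers, not part of the course code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Name for each bit of the legacy errors byte; NULL marks a reserved bit */
static const char *rx_err_names[8] = {
    "CE", "SE", "SEQ", NULL, NULL, "TCPE", "IPE", "RXE"
};

/* Writes the space-separated names of all set error bits into 'out'
   and returns how many named bits were set */
static int rx_errors_describe(uint8_t errors, char *out, size_t outlen)
{
    int count = 0;

    if (outlen == 0)
        return 0;
    out[0] = '\0';
    for (int bit = 0; bit < 8; bit++) {
        if (!(errors & (1u << bit)) || rx_err_names[bit] == NULL)
            continue;
        if (count > 0)
            strncat(out, " ", outlen - strlen(out) - 1);
        strncat(out, rx_err_names[bit], outlen - strlen(out) - 1);
        count++;
    }
    return count;
}
```

An errors byte of zero yields an empty string, which is the normal case for a cleanly received packet.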

9
Extended RX-Descriptors
  • CPU writes this, NIC reads it: Base-address
    (64-bits), reserved (0)
  • NIC writes this, CPU reads it: MRQ (multiple
    receive queues), IP identification,
    Packet-checksum, Extended status, Extended
    errors, Packet-length, VLAN tag
  • The device-driver initializes the base-address
    field with the physical address of a
    packet-buffer, and it initializes the reserved
    field with a zero-value; the network hardware
    will later modify both fields
  • The network controller will write back the
    values for these fields when it has transferred
    a received packet's data into the packet-buffer

10
An alternative option
  • CPU writes this, NIC reads it: Base-address
    (64-bits), reserved (0)
  • NIC writes this, CPU reads it: MRQ (multiple
    receive queues), RSS Hash (Receive Side
    Scaling), Extended status, Extended errors,
    Packet-length, VLAN tag
  • Receive Side Scaling refers to an optional
    capability in the network controller to assist
    with routing of network packets to various CPUs
    within a modern multiprocessor system (See
    Section 3.2.13 in Intel's Software Developer's
    Manual)

11
Extended Rx-Status (20-bits)
  • bits 19..16: 0
  • bit 15: ACK = TCP ACK-Packet identification
  • bits 14..11: 0
  • bit 10: UDPV = Valid UDP checksum
  • bit 9: IPIV = Valid IP Identification
  • bit 8: 0
  • bits 7..0: PIF, IPCS, TCPCS, UDPCS, VP, IXSM,
    EOP, DD; these eight bits have the same meanings
    as in a Legacy Rx-Status byte (DD Descriptor
    Done, EOP End Of Packet, IXSM Ignore Checksum
    Indications, VP VLAN Packet match, UDPCS UDP
    Checksum calculated, TCPCS TCP Checksum
    calculated, IPCS IPv4 Checksum calculated, PIF
    Passed In-exact Filter)
  • The extra status-bits (ACK, UDPV, IPIV) provide
    additional hardware support to driver software
    for processing ethernet packets that conform to
    standard TCP/IP network protocols (with
    possibilities for future expansion)

12
Extended Rx-Errors (12 bits)
  • bit 11: RXE
  • bit 10: IPE
  • bit 9: TCPE
  • bits 8 and 7: 0
  • bit 6: SEQ
  • bit 5: SE
  • bit 4: CE
  • bits 3..0: 0
  • The upper eight bits (11..4) have the same
    meanings, and they occupy the same arrangement,
    as in the Legacy Rx-Errors byte
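The 20-bit status and 12-bit errors fields share one 32-bit word, so they can also be separated with shifts and masks instead of bitfields. This is a sketch only: it assumes the status occupies bits 0..19 and the errors occupy bits 20..31 of that word, and the macro and function names are hypothetical.

```c
#include <stdint.h>

#define RXDX_STATUS_BITS  20
#define RXDX_STATUS_MASK  ((1u << RXDX_STATUS_BITS) - 1)   /* 0x000FFFFF */

#define RXDX_STAT_DD    (1u << 0)    /* Descriptor Done             */
#define RXDX_STAT_EOP   (1u << 1)    /* End Of Packet               */
#define RXDX_STAT_IPIV  (1u << 9)    /* Valid IP Identification     */
#define RXDX_STAT_UDPV  (1u << 10)   /* Valid UDP checksum          */
#define RXDX_STAT_ACK   (1u << 15)   /* TCP ACK-Packet identified   */

/* Low 20 bits of the combined word: the Extended status */
static uint32_t rxdx_status(uint32_t staterr)
{
    return staterr & RXDX_STATUS_MASK;
}

/* High 12 bits of the combined word: the Extended errors */
static uint32_t rxdx_errors(uint32_t staterr)
{
    return staterr >> RXDX_STATUS_BITS;
}

/* The upper eight error bits are arranged like the Legacy errors byte */
static uint8_t rxdx_legacy_errors(uint32_t staterr)
{
    return (uint8_t)(rxdx_errors(staterr) >> 4);
}
```

Shifts and masks behave the same under every compiler, which is the main reason to consider them when bitfield layout is a concern.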

13
Main device-driver changes
  • If we want to utilize the NIC's Extended
    Receive Descriptor format, we will need several
    significant changes in our driver source-code and
    data-types:
  • Our module's initialization of base_address
    fields
  • Our new need for programming register RFCTL
  • Our typedef for the RX_DESCRIPTOR structs
  • Our get_info_rx() function for /proc/nicrx
    display
  • Our interrupt-handler's treatment of rxring
    entries

14
Use of C language union
  • Each Receive-Descriptor now has a dual
    identity, as far as the NIC is concerned:
  • one layout during its fetch from memory
  • another layout during write-back to memory
  • The C language provides a special type
    construction for accommodating this kind of
    programming situation; it's known as a union,
    and it requires a special syntax

15
Bitfields in C
  • Some of the fields in the Extended RX
    Descriptor do not align with the CPU's natural
    8-bit, 16-bit and 32-bit data-sizes
  • The C language provides bitfields for a
    situation like this (their layout in memory
    is implementation-defined)
  • Extended status: 20-bits; Extended errors:
    12-bits

16
Syntax for Rx-Descriptors
typedef struct {
    unsigned long long  base_address;
    unsigned long long  reserved;
} RX_DESC_FETCH;

typedef struct {
    unsigned int    mrq;
    unsigned short  ip_identification;
    unsigned short  packet_chksum;
    unsigned int    desc_status:20;
    unsigned int    desc_errors:12;
    unsigned short  packet_length;
    unsigned short  vlan_tag;
} RX_DESC_STORE;

typedef union {
    RX_DESC_FETCH   rxf;
    RX_DESC_STORE   rxs;
} RX_DESCRIPTOR;
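A user-space experiment can confirm that both layouts occupy the same 16 bytes and that a write-back destroys the fetch-time fields. The sketch repeats the slide's three typedefs so that it compiles on its own; fake_writeback() is a hypothetical stand-in for what the NIC does, and the sizes are what we would expect from gcc on x86, not a guarantee for every compiler.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned long long  base_address;
    unsigned long long  reserved;
} RX_DESC_FETCH;

typedef struct {
    unsigned int    mrq;
    unsigned short  ip_identification;
    unsigned short  packet_chksum;
    unsigned int    desc_status:20;
    unsigned int    desc_errors:12;
    unsigned short  packet_length;
    unsigned short  vlan_tag;
} RX_DESC_STORE;

typedef union {
    RX_DESC_FETCH   rxf;   /* layout the NIC fetches     */
    RX_DESC_STORE   rxs;   /* layout the NIC writes back */
} RX_DESCRIPTOR;

/* Hypothetical stand-in for the NIC: overwrite all 16 bytes of the
   descriptor with write-back information for one received packet */
static void fake_writeback(RX_DESCRIPTOR *d, unsigned short length)
{
    memset(d, 0, sizeof *d);
    d->rxs.desc_status = 0x3;        /* DD and EOP */
    d->rxs.packet_length = length;
}
```

After fake_writeback() the rxf.base_address member no longer holds the buffer's address, which is why the driver must keep track of its packet-buffers by some other means.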

17
RFCTL (0x5008)
The Receive Filter Control register
  • bits 31..16: reserved (0)
  • bit 15: EXTEN
  • bit 14: IPFRSP_DIS
  • bit 13: ACKD_DIS
  • bit 12: ACK_DIS
  • bit 11: IPv6XSUM_DIS
  • bit 10: IPv6_DIS
  • bits 9..8: NFS_VER
  • bit 7: NFSR_DIS
  • bit 6: NFSW_DIS
  • bits 5..1: iSCSI_DWC
  • bit 0: iSCSI_DIS
  • EXTEN (bit 15) = Extended Status Enable (1=yes,
    0=no); this enables the NIC to write back the
    Extended Status
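Turning on EXTEN is a read-modify-write of RFCTL, so that the register's other bits keep their values. The sketch below simulates this in user space: fake_regs, reg_read() and reg_write() are hypothetical stand-ins for the memory-mapped registers and for the kernel's ioread32() and iowrite32(), and the name E1000_RFCTL is assumed by analogy with the register names used elsewhere in these slides.

```c
#include <stdint.h>

#define E1000_RFCTL   0x5008        /* register offset, from the slide */
#define RFCTL_EXTEN   (1u << 15)    /* Extended Status Enable          */

/* Simulated register space (in a driver this would be the mapped io region) */
static uint32_t fake_regs[0x6000 / 4];

static uint32_t reg_read(unsigned int offset)
{
    return fake_regs[offset / 4];
}

static void reg_write(uint32_t value, unsigned int offset)
{
    fake_regs[offset / 4] = value;
}

/* Set EXTEN while preserving every other bit of RFCTL */
static void enable_extended_rx_descriptors(void)
{
    uint32_t rfctl = reg_read(E1000_RFCTL);

    rfctl |= RFCTL_EXTEN;
    reg_write(rfctl, E1000_RFCTL);
}
```

In a real module the same three steps would presumably run during initialization, before the receiver is enabled.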

18
Modifying my_read()
  • To implement use of Extended Receive
    Descriptors in our most recent character-mode
    device-driver (i.e., zerocopy.c), we need some
    changes in the read() method
  • Most obvious example: a packet-buffer's memory
    address can no longer be gotten from an
    Rx-Descriptor's base_address (which now gets
    overwritten by the NIC)
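One way around the overwritten base_address is to compute each buffer's location from the descriptor's index. The sketch assumes the packet-buffers sit contiguously right after the descriptor ring, one buffer per descriptor; the values chosen for N_RX_DESC and RX_BUFSIZ and both function names are hypothetical.

```c
#include <stddef.h>

#define N_RX_DESC     8      /* assumed number of descriptors in the ring  */
#define RX_BUFSIZ     2048   /* assumed size of each packet-buffer         */
#define RX_DESC_LEN   16     /* every Rx-Descriptor occupies 16 bytes      */

/* Byte offset, from the start of the ring, of the packet-buffer
   that is paired with descriptor number 'index' */
static size_t rx_buffer_offset(unsigned int index)
{
    return (size_t)RX_DESC_LEN * N_RX_DESC
         + (size_t)(index % N_RX_DESC) * RX_BUFSIZ;
}

/* Virtual address of that buffer, given the ring's virtual address */
static unsigned char *rx_buffer_virt(unsigned char *ring_base, unsigned int index)
{
    return ring_base + rx_buffer_offset(index);
}
```

With this arrangement the read() method needs only the index of the descriptor it is examining, never the descriptor's contents, to find the packet data.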

19
For our pseudo-file's sake
  • Also, our driver's read() function shouldn't
    prepare a current rx-descriptor for reuse, as it
    did in earlier drivers, since that would destroy
    all of the useful information which the NIC has
    just written into that descriptor
  • Instead, the preparation of a descriptor for
    reuse in a future packet-receive operation should
    be deferred, at least temporarily

20
OK, but then when?
  • We can reassign the duty to refresh some
    Rx-Descriptors for reuse to our driver's
    Interrupt Service Routine; specifically, at the
    point in time when an RXDMT0 event is signaled
    (Rx-Descriptor Min-Threshold)
  • It might be best to create a bottom half to
    take care of those re-initializations, but we
    haven't yet done that in our new prototype

21
Handling RXDMT0 interrupts
irqreturn_t my_isr( int irq, void *dev_id )
{
    int  intr_cause = ioread32( io + E1000_ICR );

    if ( intr_cause & (1<<4) )  // Rx-Descriptors Low
    {
        unsigned int  rx_buf = virt_to_phys( rxring ) + 16 * N_RX_DESC;
        unsigned int  rxtail = ioread32( io + E1000_RDT ), i, ba;

        // prepare the next eight Rx-Descriptors for reuse by the NIC
        for (i = 0; i < 8; i++)
        {
            ba = rx_buf + rxtail * RX_BUFSIZ;
            rxring[ rxtail ].base_address = ba;
            rxring[ rxtail ].reserved = 0LL;
            rxtail = (1 + rxtail) % N_RX_DESC;
        }
        // now give the NIC ownership of these reinitialized descriptors
        iowrite32( rxtail, io + E1000_RDT );
    }
    return IRQ_HANDLED;  // assumed; the slide does not show the end of the handler
}
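The wrap-around arithmetic in this handler can be tried out in user space without any hardware. This is a sketch under assumptions: the ring is an ordinary array, rx_buf is just a number standing for the physical address of the first packet-buffer, and rx_refresh(), rx_fetch_t and the constant values are hypothetical.

```c
#include <stdint.h>

#define N_RX_DESC      16     /* assumed ring size                    */
#define RX_BUFSIZ      2048   /* assumed packet-buffer size           */
#define REFRESH_COUNT  8      /* descriptors re-armed per RXDMT0 event */

typedef struct {
    uint64_t  base_address;
    uint64_t  reserved;
} rx_fetch_t;

/* Re-initializes REFRESH_COUNT descriptors beginning at 'tail', wrapping
   at the end of the ring; returns the new tail value, which a driver
   would then write to the RDT register */
static unsigned int rx_refresh(rx_fetch_t ring[], uint64_t rx_buf,
                               unsigned int tail)
{
    for (int i = 0; i < REFRESH_COUNT; i++) {
        ring[tail].base_address = rx_buf + (uint64_t)tail * RX_BUFSIZ;
        ring[tail].reserved = 0;
        tail = (tail + 1) % N_RX_DESC;
    }
    return tail;
}
```

Starting from tail 12 in a 16-entry ring, the loop touches entries 12 through 15 and then 0 through 3, so the new tail is 4.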

22
extended.c
  • Here's our revision of zerocopy.c, aimed at
    showing how we can incorporate use of the NIC's
    Extended Receive Descriptors
  • It appears to function exactly as before, until a
    user attempts to view the driver's
    Receive-Descriptor queue
  • cat /proc/nicrx
  • Then we are shown descriptors having two distinct
    formats (i.e., FETCH and STORE)

23
Demo: bitfield.c
  • Because the manner in which bitfields are
    handled in the C language varies with the
    particular C-compiler being used, we have created
    a short demo-program that shows us how our GNU
    C-compiler gcc handles the layout of bitfields
    within a C data-item

typedef struct {
    unsigned int  desc_status:20;  // bits 0..19
    unsigned int  desc_errors:12;  // bits 20..31
} RXD_ELT;
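The demo-program itself is not reproduced on this slide, but an experiment in the same spirit is easy to write. The sketch below copies an RXD_ELT into a plain 32-bit integer so the bit positions can be inspected; rxd_elt_image() is a hypothetical helper, and the layout described afterwards is what we would expect from gcc on little-endian x86, not something the C language promises.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned int  desc_status:20;  /* expected in bits 0..19  */
    unsigned int  desc_errors:12;  /* expected in bits 20..31 */
} RXD_ELT;

/* Returns the raw 32-bit image of an RXD_ELT holding the given values */
static uint32_t rxd_elt_image(unsigned int status, unsigned int errors)
{
    RXD_ELT   elt;
    uint32_t  raw;

    elt.desc_status = status;
    elt.desc_errors = errors;
    memcpy(&raw, &elt, sizeof raw);   /* copy the bytes, no reinterpretation */
    return raw;
}
```

Printing rxd_elt_image(0xFFFFF, 0) with the printf format "%08X" should show 000FFFFF on such a system, confirming that the first-declared bitfield occupies the low-order bits.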