Title: N/w Device Drivers
1. N/w Device Drivers
- Every n/w transaction is made through an interface (device).
- Interfaces are not accessed as files but by name, e.g. eth0.
- An interface can be hardware (e.g. eth0) or pure software (e.g. lo).
- A n/w interface is in charge of sending and receiving packets.
2. Configuring interfaces
The ifconfig program configures interface devices for use:
> ifconfig (DEVICE) (IPADDR) netmask (NMASK) broadcast (BCAST)
> ifconfig (DEVICE) up
> ifconfig (DEVICE) down
3. Configuring Routes
The route program is used to add routes for interfaces to the forwarding information base (FIB):
> route add -net (NETWORK) netmask (NMASK) dev (DEVICE)
> route add -host (IPADDR) (DEVICE)
4. Module Loading/Unloading
A n/w interface registers itself with the kernel through a data structure, i.e. struct net_device.
register_netdev(struct net_device *netdev)
  Invokes the init fn of the device and adds the structure to a global list of network devices (pointed to by dev_base).
unregister_netdev(struct net_device *netdev)
  Does the necessary cleanup and removes the structure from the global list.
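
The register/unregister behaviour described above can be modelled in user space. The following is a minimal sketch, not kernel code: `toy_net_device`, `toy_dev_base`, `toy_register_netdev`, `toy_unregister_netdev` and `toy_init_ok` are hypothetical names invented for illustration, and the duplicate-name check is an assumption about what a reasonable registration routine would do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct net_device: only the fields the slide mentions. */
struct toy_net_device {
    char name[16];
    int (*init)(struct toy_net_device *dev); /* driver init routine */
    struct toy_net_device *next;             /* next device in the global list */
};

/* Head of the global device list (plays the role of dev_base). */
static struct toy_net_device *toy_dev_base = NULL;

/* Model of register_netdev: run init, then append to the global list. */
int toy_register_netdev(struct toy_net_device *dev)
{
    struct toy_net_device **pp;

    assert(dev != NULL);
    if (dev->init && dev->init(dev) != 0)
        return -1;                       /* init failed: do not register */
    /* Walk to the end of the list, refusing a name that is already in use. */
    for (pp = &toy_dev_base; *pp; pp = &(*pp)->next)
        if (strcmp((*pp)->name, dev->name) == 0)
            return -1;
    dev->next = NULL;
    *pp = dev;
    return 0;
}

/* Model of unregister_netdev: unlink the device from the global list. */
int toy_unregister_netdev(struct toy_net_device *dev)
{
    struct toy_net_device **pp;

    assert(dev != NULL);
    for (pp = &toy_dev_base; *pp; pp = &(*pp)->next) {
        if (*pp == dev) {
            *pp = dev->next;
            dev->next = NULL;
            return 0;
        }
    }
    return -1;                           /* device was not registered */
}

/* Hypothetical init routine that always succeeds. */
int toy_init_ok(struct toy_net_device *dev)
{
    (void)dev;
    return 0;
}
```

Registering two devices leaves `toy_dev_base` pointing at the first, with its `next` field pointing at the second; unregistering the first moves the list head to the second.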
5. net_device structure
- Visible and non-visible parts.
- Visible parts
  - name[IFNAMSIZ]: name of the device
  - I/O specific fields: mem_end, mem_start, base_addr, irq, if_port, dma
  - next: pointer to the next device in the global list
  - int (*init)(): initialization routine for the driver
- Non-visible parts
  - mtu: Maximum Transfer Unit
  - tx_queue_len: max no. of frames that can be queued on the device
  - hard_header_len: hardware header length
  - dev_addr: hardware (MAC) address of the device
  - addr_len: length of the h/w address
  - void *priv: private data
6. net_device structure
[Figure: the global device list in kernel space. dev_base points to a chain of struct net_device entries (Name: lo, then Name: eth0). Each entry shows Qdisc, Open, Close, Init and hard_start_xmit fields and a Next pointer to the following entry.]
7. Device methods
Fundamental methods
int (*open)() - opens the interface (ifconfig dev up)
int (*stop)() - stops the interface (ifconfig dev down)
int (*hard_start_xmit)(struct sk_buff *, struct net_device *) - initiates transmission
int (*hard_header)() - builds the hardware header
struct net_device_stats *(*get_stats)() - gets the statistics of the interface
int (*set_config)() - changes the interface configuration
8. Device methods ...
Optional methods
int (*do_ioctl)() - performs interface-specific ioctl commands
int (*change_mtu)() - changes the MTU of the interface
int (*set_mac_address)() - changes the interface h/w address
9. Socket Buffers
N/w drivers deal with data buffers of type struct sk_buff:
struct net_device *dev, *rx_dev - devices sending and receiving the buffer
union h - transport layer header
union nh - network layer header
union mac - MAC layer header
unsigned char *head, *data, *tail, *end - pointers used to address the data
unsigned long len - length of the data buffer
struct sock *sk - socket that owns the buffer
struct timeval stamp - time the packet arrived
struct sk_buff *next, *prev - pointers to the next and previous packets
char cb[48] - control buffer
10. Socket buffers ...
11. sk_buff methods
struct sk_buff *alloc_skb(int len, int prio), *dev_alloc_skb(int len) - allocate a buffer; prio is GFP_ATOMIC or GFP_KERNEL
void kfree_skb(), dev_kfree_skb() - free the buffer
unsigned char *skb_put(struct sk_buff *, int len) - adds data to the end of the buffer
unsigned char *skb_push(struct sk_buff *, int len) - adds data to the start of the packet
int skb_tailroom(struct sk_buff *skb) - returns the available space after the data
int skb_headroom(struct sk_buff *skb) - returns the available space in front of the data
unsigned char *skb_pull(struct sk_buff *, int len) - removes data from the start of the packet
Length of a single skb: skb->len = skb->tail - skb->data
Size of an skb: skb->end - skb->head
Headroom: skb->data - skb->head
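
The pointer arithmetic behind these helpers can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel implementation: `toy_skb` and every `toy_skb_*` function are hypothetical names, the buffer is a fixed 128-byte array, and `toy_skb_reserve` models the kernel's `skb_reserve`, which the slide does not list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE 128

/* Toy stand-in for struct sk_buff: one fixed buffer and the four pointers. */
struct toy_skb {
    unsigned char buf[TOY_BUF_SIZE];
    unsigned char *head, *data, *tail, *end;
    unsigned long len;
};

void toy_skb_init(struct toy_skb *skb)
{
    skb->head = skb->data = skb->tail = skb->buf;
    skb->end = skb->buf + TOY_BUF_SIZE;
    skb->len = 0;
}

/* Space in front of the data: data - head. */
int toy_skb_headroom(const struct toy_skb *skb)
{
    return (int)(skb->data - skb->head);
}

/* Space after the data: end - tail. */
int toy_skb_tailroom(const struct toy_skb *skb)
{
    return (int)(skb->end - skb->tail);
}

/* Reserve headroom in an empty buffer before any data is added. */
void toy_skb_reserve(struct toy_skb *skb, int len)
{
    assert(len >= 0 && skb->len == 0 && len <= toy_skb_tailroom(skb));
    skb->data += len;
    skb->tail += len;
}

/* Grow the data area at the end; returns the start of the new space. */
unsigned char *toy_skb_put(struct toy_skb *skb, int len)
{
    unsigned char *old_tail = skb->tail;

    assert(len >= 0 && len <= toy_skb_tailroom(skb));
    skb->tail += len;
    skb->len += (unsigned long)len;
    return old_tail;
}

/* Grow the data area at the front; returns the new start of the data. */
unsigned char *toy_skb_push(struct toy_skb *skb, int len)
{
    assert(len >= 0 && len <= toy_skb_headroom(skb));
    skb->data -= len;
    skb->len += (unsigned long)len;
    return skb->data;
}

/* Drop len bytes from the front; returns the new start of the data. */
unsigned char *toy_skb_pull(struct toy_skb *skb, int len)
{
    assert(len >= 0 && (unsigned long)len <= skb->len);
    skb->data += len;
    skb->len -= (unsigned long)len;
    return skb->data;
}
```

A typical sequence is to reserve headroom, put the payload, push a header in front of it on the way down the stack, and pull that header off again on the way up. At every step `len` stays equal to `tail - data`.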
12. ioctls
ioctls are used to copy data from user space to kernel space and vice versa. Appropriate routines can be invoked from the ioctl fn.
int (*do_ioctl)(struct net_device *, struct ifreq *ifr, int cmd)
Each interface can define its own ioctl commands.
The ioctl implementation for sockets reserves 16 ioctls as private to the interface: SIOCDEVPRIVATE to SIOCDEVPRIVATE+15.
More ioctls can be virtually implemented through the ifr structure's ifru_data field, roughly as shown:
struct ifreq { char name[IFNAMSIZ]; char *data; }
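
A driver's ioctl handler typically switches on the command and uses the data pointer to move values in or out. The sketch below is a user-space model only: `toy_ifreq`, `toy_priv`, `toy_do_ioctl` and the two `TOY_*_COUNTER` commands are hypothetical, and the direct pointer dereferences stand in for the user/kernel copy routines that a real driver would have to use. The value 0x89F0 is the one Linux assigns to SIOCDEVPRIVATE; it is redefined here so the sketch compiles without kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Same value as SIOCDEVPRIVATE in Linux; redefined for portability. */
#define TOY_SIOCDEVPRIVATE 0x89F0

/* Hypothetical private commands for an imaginary driver. */
#define TOY_GET_COUNTER (TOY_SIOCDEVPRIVATE + 0)
#define TOY_SET_COUNTER (TOY_SIOCDEVPRIVATE + 1)

/* Rough model of struct ifreq: an interface name and a data pointer. */
struct toy_ifreq {
    char name[16];
    void *data;               /* stands in for the ifru_data field */
};

/* Hypothetical per-device private state. */
struct toy_priv {
    int counter;
};

/* Model of a driver's do_ioctl: dispatch on the private command number. */
int toy_do_ioctl(struct toy_priv *priv, struct toy_ifreq *ifr, int cmd)
{
    assert(priv != NULL && ifr != NULL);

    /* Only the 16 device-private commands are handled here. */
    if (cmd < TOY_SIOCDEVPRIVATE || cmd > TOY_SIOCDEVPRIVATE + 15)
        return -1;
    if (ifr->data == NULL)
        return -1;

    switch (cmd) {
    case TOY_GET_COUNTER:     /* "kernel" to "user" direction */
        *(int *)ifr->data = priv->counter;
        return 0;
    case TOY_SET_COUNTER:     /* "user" to "kernel" direction */
        priv->counter = *(const int *)ifr->data;
        return 0;
    default:                  /* private range, but not implemented */
        return -1;
    }
}
```

Because the data pointer can reference any structure, a single private command could carry its own sub-command field, which is how more than 16 operations can be multiplexed.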
13. Packet transmission
- int (*hard_start_xmit)(struct sk_buff *, struct net_device *)
- Initiates the transmission of a packet.
- Gets invoked by the dev_queue_xmit function, which is invoked by the higher layers.
- Should detect timeouts.
14. Packet Reception
- Occurs through an interrupt handler (unless the interface is pure s/w).
- Allocate an sk_buff buffer.
- Assign the dev and protocol fields and pass it to the link layer (netif_rx fn).
- The link layer puts it on the backlog queue and marks it for the next bottom-half run.
- The n/w bottom half passes the pkt to the protocol receive fn (IP layer).
- The IP layer either forwards it or sends it up to the transport layer.
15. How to trap a packet?
- Write a n/w module and register it.
- Store the eth0 net_device structure in a temp pointer.
- Replace the eth0 net_device with your net_device structure.
- In your device's hard_start_xmit fn, invoke the eth0 hard_start_xmit function.
- Replace the eth0 entries in the routing table with your device.
- Packets now go out through your interface, so you can do wonders with the packet before calling the actual hard_start_xmit fn!
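
The core of the trick is function-pointer interposition: save the original transmit routine, install a wrapper, and have the wrapper call the saved routine. The sketch below is a simplified user-space model that swaps only the transmit pointer rather than a whole net_device structure; `toy_dev`, `toy_packet`, `toy_install_trap` and the other `toy_*` names are hypothetical, and rewriting the `tos` field is just an arbitrary example of touching the packet.

```c
#include <assert.h>
#include <stddef.h>

/* Toy packet and device types; not the kernel's sk_buff or net_device. */
struct toy_packet {
    unsigned char tos;
    int len;
};

struct toy_dev {
    const char *name;
    int (*hard_start_xmit)(struct toy_packet *pkt, struct toy_dev *dev);
    int tx_packets;
};

/* Stand-in for the real driver's transmit routine. */
int toy_real_xmit(struct toy_packet *pkt, struct toy_dev *dev)
{
    (void)pkt;
    dev->tx_packets++;
    return 0;
}

/* Saved pointer to the original transmit routine. */
static int (*toy_saved_xmit)(struct toy_packet *, struct toy_dev *) = NULL;

/* Wrapper: inspect or modify the packet, then call the saved routine. */
int toy_trap_xmit(struct toy_packet *pkt, struct toy_dev *dev)
{
    assert(toy_saved_xmit != NULL);
    pkt->tos = 0x10;                     /* arbitrary example modification */
    return toy_saved_xmit(pkt, dev);
}

/* Install the wrapper, remembering the routine it replaces. */
void toy_install_trap(struct toy_dev *dev)
{
    /* Installing twice would make the wrapper call itself forever. */
    assert(dev->hard_start_xmit != toy_trap_xmit);
    toy_saved_xmit = dev->hard_start_xmit;
    dev->hard_start_xmit = toy_trap_xmit;
}

/* Restore the original routine. */
void toy_remove_trap(struct toy_dev *dev)
{
    assert(toy_saved_xmit != NULL);
    dev->hard_start_xmit = toy_saved_xmit;
    toy_saved_xmit = NULL;
}
```

The single saved pointer limits this model to one trapped device at a time; a per-device saved pointer would be needed to trap several.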
16. Linux QOS support
[Figure: packet path through the kernel, with blocks labeled Packet In, Input Demultiplexing, TCP/UDP, Forwarding, Output Queue, Traffic Control and Packet Out.]
17. QOS Support ...
- Traffic Control consists of queuing disciplines, classes and filters, which control the packets that are sent out.
- Queuing disciplines are at the heart of Linux traffic control.
- Each device has its own queuing discipline.
- Before any packet is transmitted out of the interface, queuing disciplines come into action.
18. Queuing Disciplines
Specifies the way the packets are going to be queued at the o/p interface.
(*enqueue)(struct sk_buff *, struct net_device *)
(*dequeue)(struct sk_buff *, struct net_device *)
Before calling the device hard_start_xmit, the packet is first queued using enqueue. dequeue determines the next packet to be transmitted.
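
The enqueue-then-dequeue flow can be sketched with a FIFO discipline built on a small ring buffer. This is a user-space illustration, not the kernel's qdisc code: all `toy_*` names are hypothetical, the signatures are simplified (each routine takes the queue itself), and `toy_xmit` stands in for the device's hard_start_xmit.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_QLEN 4                       /* plays the role of tx_queue_len */

struct toy_pkt {
    int id;
};

/* Toy queuing discipline: two operations plus FIFO storage. */
struct toy_qdisc {
    int (*enqueue)(struct toy_pkt *pkt, struct toy_qdisc *q);
    struct toy_pkt *(*dequeue)(struct toy_qdisc *q);
    struct toy_pkt *ring[TOY_QLEN];
    int head, count;
};

/* FIFO enqueue: append at the tail, or refuse when the queue is full. */
int toy_fifo_enqueue(struct toy_pkt *pkt, struct toy_qdisc *q)
{
    if (q->count == TOY_QLEN)
        return -1;                       /* queue full: packet is dropped */
    q->ring[(q->head + q->count) % TOY_QLEN] = pkt;
    q->count++;
    return 0;
}

/* FIFO dequeue: the oldest packet is the next one to transmit. */
struct toy_pkt *toy_fifo_dequeue(struct toy_qdisc *q)
{
    struct toy_pkt *pkt;

    if (q->count == 0)
        return NULL;
    pkt = q->ring[q->head];
    q->head = (q->head + 1) % TOY_QLEN;
    q->count--;
    return pkt;
}

void toy_fifo_init(struct toy_qdisc *q)
{
    q->enqueue = toy_fifo_enqueue;
    q->dequeue = toy_fifo_dequeue;
    q->head = 0;
    q->count = 0;
}

/* Hypothetical transmit routine that records the last packet sent. */
static int toy_last_sent = -1;

int toy_xmit(struct toy_pkt *pkt)
{
    toy_last_sent = pkt->id;
    return 0;
}

/* Model of the transmit path: queue first, then drain via dequeue.
 * Returns the number of packets handed to xmit, or -1 on a drop. */
int toy_queue_xmit(struct toy_pkt *pkt, struct toy_qdisc *q,
                   int (*xmit)(struct toy_pkt *))
{
    struct toy_pkt *next;
    int sent = 0;

    assert(q != NULL && xmit != NULL);
    if (q->enqueue(pkt, q) != 0)
        return -1;
    while ((next = q->dequeue(q)) != NULL) {
        xmit(next);
        sent++;
    }
    return sent;
}
```

Swapping in a different pair of enqueue/dequeue routines changes the scheduling policy without touching the transmit path, which is the point of keeping the discipline behind function pointers.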
19. Queuing Disciplines
Types of Queuing Disciplines
CBQ - Class Based Queuing
TBF - Token Bucket Filter
SFQ - Stochastic Fair Queuing
Priority - Priority based dequeuing
FIFO - First In First Out
RED - Random Early Detection
20. Classes, filters
- Filters are used to classify packets based on packet properties.
- When the enqueue fn is invoked, the filters are applied to identify the (best) class to which the packet belongs.
- The enqueue fn of the Qdisc owned by that class is then invoked.
- Classes and queues are tied together.
- Each class has a queue, which is FIFO by default.
- Classes are not supported by all Qdiscs, e.g. TBF.
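
The filter-then-class-queue flow can be sketched as follows. This is a user-space model under several assumptions: every `toy_*` name is hypothetical, filters are plain functions that return a class index or -1, each class owns a FIFO, the first matching filter wins, unmatched packets go to a default class, and dequeue serves the lowest-numbered non-empty class first (strict priority). The port number 5004 is an arbitrary example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_NCLASS   2
#define TOY_QLEN     8
#define TOY_NFILTERS 4

struct toy_pkt {
    unsigned short dst_port;
};

/* Each class owns its own FIFO queue. */
struct toy_class {
    struct toy_pkt *fifo[TOY_QLEN];
    int head, count;
};

/* A filter returns a class index, or -1 when it does not match. */
typedef int (*toy_filter)(const struct toy_pkt *pkt);

struct toy_classful_qdisc {
    struct toy_class classes[TOY_NCLASS];
    toy_filter filters[TOY_NFILTERS];
    int nfilters;
    int default_class;
};

void toy_qdisc_init(struct toy_classful_qdisc *q)
{
    memset(q, 0, sizeof *q);
    q->default_class = TOY_NCLASS - 1;   /* unmatched traffic: last class */
}

/* Hypothetical filter: packets to port 5004 belong to class 0. */
int toy_filter_audio(const struct toy_pkt *pkt)
{
    return pkt->dst_port == 5004 ? 0 : -1;
}

/* Apply the filters in order; the first valid match decides the class. */
int toy_classify(const struct toy_classful_qdisc *q, const struct toy_pkt *pkt)
{
    int i, cls;

    for (i = 0; i < q->nfilters; i++) {
        cls = q->filters[i](pkt);
        if (cls >= 0 && cls < TOY_NCLASS)
            return cls;
    }
    return q->default_class;
}

/* Enqueue: classify, then append to the chosen class's FIFO. */
int toy_enqueue(struct toy_classful_qdisc *q, struct toy_pkt *pkt)
{
    struct toy_class *c;

    assert(q != NULL && pkt != NULL);
    c = &q->classes[toy_classify(q, pkt)];
    if (c->count == TOY_QLEN)
        return -1;                       /* class queue full: drop */
    c->fifo[(c->head + c->count) % TOY_QLEN] = pkt;
    c->count++;
    return 0;
}

/* Dequeue: serve the lowest-numbered non-empty class first. */
struct toy_pkt *toy_dequeue(struct toy_classful_qdisc *q)
{
    int i;

    for (i = 0; i < TOY_NCLASS; i++) {
        struct toy_class *c = &q->classes[i];
        if (c->count > 0) {
            struct toy_pkt *pkt = c->fifo[c->head];
            c->head = (c->head + 1) % TOY_QLEN;
            c->count--;
            return pkt;
        }
    }
    return NULL;
}
```

With the audio filter installed, a packet for port 5004 that arrives after a packet for port 80 is still dequeued first, because it lands in the higher-priority class.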
21. Types of Filters
Different filter types are supported:
route - bases the decision on the route by which the packet will be routed
fw - bases the decision on how the firewall has marked the packet
u32 - bases the decision on fields within the packet
rsvp - bases the decision on the target
22. TC
Linux Traffic Controller - Alexey Kuznetsov
User-level program to create qdiscs, classes and filters. It uses netlink sockets to transfer data to and from kernel space.
tc qdisc add/del/get/change dev (DEVICE) handle (ID) QKIND
tc class add/del/get/change dev (DEVICE) classid (ID) QKIND
tc filter add/del/get/change dev (DEVICE) prio PRIO proto PROTO classid (ID) FTYPE
23. CBQ
[Figure: CBQ class tree. Root has children A and B. The leaf classes are labeled Audio (Prio 7) 10, Video (Prio 4) 20, Audio (Prio 7) 30 and Video (Prio 4) 40.]
24. CBQ
- CBQ is a scheduling mechanism to
  - provide link sharing between agencies that share the same physical link
  - provide a framework to differentiate traffic that has different priorities
- Main components are
  - Classifier: extracts flow information and puts the packet in the corresponding class
  - General Scheduler: aims to share the bandwidth among classes
  - Link Sharing Scheduler: aims to share bandwidth during congestion and distributes the excess bandwidth appropriately
  - Estimator: estimates whether each class is underlimit/overlimit
25. CBQ
[Figure: CBQ components between the input link and the output link: Classifier, General Scheduler, Link Sharing Scheduler and Estimator.]
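26. Qdisc_ops
[Figure: per-interface traffic control data structures. Dev_base leads to a Net_device entry (Name: eth0, Qdisc, Next ..). The diagram labels qdiscs with their Qdisc_ops, classes with their Qdisc_class_ops, a Filter Chain, and a qdisc used for Packet Storage.]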
27. Filter Storage
[Figure: PG_FilterStore. A chain of Filter Priority Queue Nodes (Priority 1, Priority 2, Priority 3), each recording its no. of filters, linked by Next, and holding Front and Back pointers to a list of Filter Element Nodes linked by Next. Filter elements store DestIP, SrcIP, SrcMask, DestMask, SrcPortMin, SrcPortMax, DstPortMin, DstPortMax, Protocol, Tos, Dscp etc.]