Title: Network Layer
1Network Layer
2Network Layer
- Introduction
- Datagram networks
- IP Internet Protocol
- Datagram format
- IPv4 addressing
- ICMP
- What's inside a router
- Routing algorithms
- Link state
- Distance Vector
- Hierarchical routing
- Routing in the Internet
- RIP
- OSPF
- BGP
- Broadcast and multicast routing
3Network layer
- transport segment from sending to receiving host
- on sending side, encapsulates segments into datagrams
- on receiving side, delivers segments to transport layer
- network layer protocols in every host, router
- router examines header fields in all IP datagrams passing through it
4Key Network-Layer Functions
- forwarding: move packets from router's input to appropriate router output
- routing: determine route taken by packets from source to dest.
- routing algorithms
- analogy:
- routing: process of planning trip from source to dest
- forwarding: process of getting through single interchange
5Interplay between routing and forwarding
7Datagram networks
- Connection-less service: no call setup at network layer
- routers: no state about end-to-end connections (why?)
- no network-level concept of "connection"
- packets forwarded using destination host address
- packets between same source-dest pair may take different paths
1. Send data
2. Receive data
9The Internet Network layer
- Host, router network layer functions
Transport layer TCP, UDP
Network layer
Link layer
physical layer
10The Internet Protocol (IP)
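[Figure: protocol stack (App, Transport, Network, Link). At the transport layer, TCP/UDP prepends a header (Hdr) to the data to form a TCP segment; at the network layer, IP prepends its own header to form an IP datagram.]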
11The Internet Protocol (IP)
- Characteristics of IP
- CONNECTIONLESS: mis-sequencing
- UNRELIABLE: may drop packets
- BEST EFFORT: drops packets, but only if necessary
- DATAGRAM: individually routed
- Architecture, links, topology: transparent
[Figure: Source H sends datagrams to Destination D across routers R1, R2, R3, R4]
12IP datagram format
[Figure: IP datagram header format; the upper-layer protocol field is 6 for TCP]
- how much overhead with TCP?
- 20 bytes of TCP
- 20 bytes of IP
- = 40 bytes + app layer overhead
13IP Fragmentation Reassembly
- Problem: A router may receive a packet larger than the maximum transmission unit (MTU) of the outgoing link.
- MTU: the max amount of data a link-layer frame can carry
- different link types, different MTUs
- E.g., Ethernet frames carry up to 1500 bytes; frames for some wide-area links carry no more than 576 bytes.
- Solution: large IP datagram divided (fragmented) within net
- one datagram becomes several datagrams
- reassembled only at final destination, why?
- IP header bits used to identify, order related fragments
[Figure: fragmentation: in, one large datagram; out, 3 smaller datagrams; reassembly at the destination]
14IP Fragmentation and Reassembly
- Example
- 4000 byte datagram
- MTU = 1500 bytes
- 1480 bytes in data field of each full-size fragment
- offset of the second fragment = 1480/8 = 185
Note: the offset value is specified in units of 8-byte chunks.
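The arithmetic of this example can be sketched in a few lines of Python. This is a minimal illustration, assuming a fixed 20-byte IP header with no options; the function name `fragment` and the tuple layout are hypothetical choices for this sketch, not part of any library.

```python
def fragment(total_len, mtu, header_len=20):
    """Split a datagram of total_len bytes so each fragment fits in mtu.

    Returns (fragment_length, offset_in_8_byte_units, more_fragments_flag)
    for each fragment. Assumes a 20-byte header without options.
    """
    payload = total_len - header_len
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    sent = 0
    while sent < payload:
        data = min(max_data, payload - sent)
        more = 1 if sent + data < payload else 0
        fragments.append((data + header_len, sent // 8, more))
        sent += data
    return fragments


frags = fragment(4000, 1500)
for length, offset, more in frags:
    print(f"length={length} offset={offset} more_fragments={more}")
```

For the 4000-byte datagram and 1500-byte MTU above, the sketch yields two 1500-byte fragments and a final 1040-byte fragment, with offsets 0, 185 and 370.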
15Fragmentation
- Fragments are re-assembled by the destination host, not by intermediate routers.
- To avoid fragmentation, hosts commonly use path MTU discovery to find the smallest MTU along the path.
- Path MTU discovery involves sending various size datagrams until they do not require fragmentation along the path.
- Most links use MTU >= 1500 bytes today.
- Try traceroute www.berkeley.edu 500 -F and traceroute www.berkeley.edu 1501 -F
- -F: set the "don't fragment" bit; an error is returned if the datagram is too long
- Bonus: Can you find a destination for which the path MTU < 1500 bytes?
16Network Layer
- Introduction
- Virtual circuit and datagram networks
- What's inside a router
- IP Internet Protocol
- Datagram format
- IPv4 addressing
- ICMP
- IPv6
- Routing algorithms
- Link state
- Distance Vector
- Hierarchical routing
- Routing in the Internet
- RIP
- OSPF
- BGP
- Broadcast and multicast routing
17IP Addresses
- IP (Version 4) addresses are 32 bits long
- Every interface has a unique IP address
- A computer might have two or more IP addresses
- A router has many IP addresses
- IP addresses are hierarchical
- They contain a network ID and a host ID
- E.g. SeattleU addresses start with 172.17
- IP addresses are assigned statically or dynamically (e.g. DHCP)
- IP (Version 6) addresses are 128 bits long
18IP Addresses
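Originally there were 5 classes:
Class  Leading bits  Remaining fields
A      0             Net ID (7 bits), Host-ID (24 bits)
B      10            Net ID (14 bits), Host-ID (16 bits)
C      110           Net ID (21 bits), Host-ID (8 bits)
D      1110          Multicast Group ID (28 bits)
E      11110         Reserved (27 bits)
[Figure: address line from 0 to 2^32-1 divided into the ranges for classes A, B, C, D]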
19IP Addresses: Examples
Class A address: www.mit.edu 18.181.0.31 (18 < 128 => Class A)
Class B address: www.seattleu.edu 172.17.72.14 (128 < 172 < 128+64 => Class B)
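The class test in these examples depends only on the first octet. A minimal sketch follows; the function name `address_class` is hypothetical, and addresses with first octet 240 and above are simply treated as Class E here.

```python
def address_class(dotted):
    """Return the historical address class from the first octet."""
    first = int(dotted.split(".")[0])
    if first < 128:   # leading bit 0
        return "A"
    if first < 192:   # leading bits 10 (128 + 64 = 192)
        return "B"
    if first < 224:   # leading bits 110
        return "C"
    if first < 240:   # leading bits 1110 (multicast)
        return "D"
    return "E"        # assumption: everything from 240 up is reserved


print(address_class("18.181.0.31"))   # www.mit.edu
print(address_class("172.17.72.14"))  # www.seattleu.edu
```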
20IP Addressing
- Problem
- Address classes were too rigid. For most organizations, Class C were too small and Class B too big. Led to inefficient use of address space, and a shortage of addresses.
- Organizations with internal routers needed to have a separate (Class C) network ID for each link.
- And then every other router in the Internet had to know about every network ID in every organization, which led to large address tables.
- Small organizations wanted Class B in case they grew to more than 255 hosts. But there were only about 16,000 Class B network IDs.
21IP Addressing
- Two solutions were introduced
- Subnetting within an organization to subdivide the organization's network ID.
- Classless Interdomain Routing (CIDR) in the Internet backbone was introduced in 1993 to provide more efficient and flexible use of IP address space.
- CIDR is also known as supernetting because subnetting and CIDR are basically the same idea.
22Subnetting
[Figure: a Class B address (leading bits 10, 14-bit Net ID, 16-bit Host-ID) assigned to a company is subdivided. At the site level, host IDs starting with 0000 and 1111 give a 20-bit Subnet ID and a 12-bit Subnet Host ID. At the department level, longer host-ID prefixes (e.g. 000000 and 1111011011) give a 22-bit Subnet ID with a 10-bit Subnet Host ID, or a 26-bit Subnet ID with a 6-bit Subnet Host ID.]
23Subnetting
- Subnetting is a form of hierarchical routing.
- Subnets are usually represented via an address plus a subnet mask or netmask.
- e.g.
- zhuy@cs1 /sbin/ifconfig eth0
- eth0 Link encap:Ethernet HWaddr 00:21:9B:8F:64:6D
- inet addr:10.124.72.20 Bcast:10.124.72.255 Mask:255.255.255.0
- Netmask 255.255.255.0: the first 24 bits are the subnet ID, and the last 8 bits are the host ID.
- Can also be represented by a prefix length, e.g. 171.64.15.0/24, or just 171.64.15/24.
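The relation between the netmask and the prefix length can be checked with Python's standard `ipaddress` module. This is a small sketch using the address and mask from the ifconfig output above; the variable names are arbitrary.

```python
import ipaddress

# Address and netmask as reported by ifconfig on the slide.
iface = ipaddress.ip_interface("10.124.72.20/255.255.255.0")

network = iface.network                    # subnet in prefix notation
prefix_len = network.prefixlen             # number of subnet-ID bits
broadcast = network.broadcast_address      # all host bits set to 1
host_id = int(iface.ip) & int(network.hostmask)  # low-order host bits

print(network, prefix_len, broadcast, host_id)
```

The netmask 255.255.255.0 and the prefix length /24 describe the same split: 24 subnet bits and 8 host bits.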
24IP Addressing
- IP address: 32-bit identifier for host, router interface
- interface: connection between host/router and physical link
- routers typically have multiple interfaces
- host typically has one interface
- IP addresses associated with each interface
- 223.1.1.1 = 11011111 00000001 00000001 00000001 (223 . 1 . 1 . 1)
[Figure: hosts and a router with interfaces 223.1.1.1, 223.1.1.3, 223.1.1.4, 223.1.2.9]
25Subnets
- IP address
- subnet part (high order bits)
- host part (low order bits)
- What's a subnet?
- device interfaces with same subnet part of IP address
- can physically reach each other without intervening router
[Figure: network consisting of 3 subnets, with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4 on the first, 223.1.2.1, 223.1.2.2, 223.1.2.9 on the second, and 223.1.3.1, 223.1.3.2, 223.1.3.27 on the third]
26Subnets
- Recipe
- To determine the subnets, detach each interface
from its host or router, creating islands of
isolated networks. Each isolated network is
called a subnet.
Subnet mask /24
27Subnets
[Figure: interconnected routers and hosts with interface addresses 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.6, 223.1.3.1, 223.1.3.2, 223.1.3.27, 223.1.7.0, 223.1.7.1, 223.1.8.0, 223.1.8.1, 223.1.9.1, 223.1.9.2]
28Classless Interdomain Routing (CIDR): Addressing
- The IP address space is broken into line segments.
- Subnet portion of address of arbitrary length
- Each line segment is described by a prefix.
- A prefix is of the form x/y where x indicates the prefix of all addresses in the line segment, and y indicates the length of the prefix in bits, so the segment contains 2^(32-y) addresses.
- e.g. The prefix 128.9/16 represents the line segment containing addresses in the range 128.9.0.0 ... 128.9.255.255.
[Figure: address line from 0 to 2^32-1 showing the segments 65/8 and 128.9/16 (starting at 128.9.0.0, 2^16 addresses long), with 128.9.16.14 falling inside 128.9/16]
29Classless Interdomain Routing (CIDR): Addressing
[Figure: address line from 0 to 2^32-1 with the segment 128.9/16 containing the address 128.9.16.14]
30Classless Interdomain Routing (CIDR): Addressing
- Prefix aggregation
- If a service provider serves two organizations with prefixes, it can (sometimes) aggregate them to form a shorter prefix. Other routers can refer to this shorter prefix, and so reduce the size of their address table.
- E.g. ISP serves 128.9.14.0/24 and 128.9.15.0/24, it can tell other routers to send it all packets belonging to the prefix 128.9.14.0/23.
- ISP Choice
- In principle, an organization can keep its prefix if it changes service providers.
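Prefix aggregation can be tried out with `ipaddress.collapse_addresses` from the Python standard library. The sketch below uses the two /24 prefixes from the slide; the second pair (128.9.15.0/24 and 128.9.16.0/24) is an added illustration of the "sometimes" caveat, since adjacent prefixes only merge when they share the shorter prefix.

```python
import ipaddress

# The two prefixes served by the ISP in the slide's example.
served = [
    ipaddress.ip_network("128.9.14.0/24"),
    ipaddress.ip_network("128.9.15.0/24"),
]
aggregated = list(ipaddress.collapse_addresses(served))
print(aggregated)

# Adjacent, but not aligned on a /23 boundary, so no shorter prefix
# covers exactly these two blocks.
unaligned = [
    ipaddress.ip_network("128.9.15.0/24"),
    ipaddress.ip_network("128.9.16.0/24"),
]
not_merged = list(ipaddress.collapse_addresses(unaligned))
print(not_merged)
```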
31Size of the Routing Table at the core of the
Internet
- Source: http://www.cidr-report.org/
32IP addresses how to get one?
- Q: How does host get IP address?
- hard-coded by system admin in a file
- Wintel: control-panel->network->configuration->tcp/ip->properties
- UNIX: /etc/rc.config
- DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server
- "plug-and-play"
33IP addresses how to get one?
- Q: How does network get subnet part of IP addr?
- A: gets allocated portion of its provider ISP's address space
ISP's block     11001000 00010111 00010000 00000000  200.23.16.0/20
Organization 0  11001000 00010111 00010000 00000000  200.23.16.0/23
Organization 1  11001000 00010111 00010010 00000000  200.23.18.0/23
Organization 2  11001000 00010111 00010100 00000000  200.23.20.0/23
...
Organization 7  11001000 00010111 00011110 00000000  200.23.30.0/23
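The allocation above amounts to cutting the ISP's /20 block into eight /23 blocks. A short sketch with the standard `ipaddress` module reproduces the list; the variable names are arbitrary.

```python
import ipaddress

isp_block = ipaddress.ip_network("200.23.16.0/20")

# Three extra prefix bits (20 -> 23) give 2**3 = 8 organization blocks.
org_blocks = list(isp_block.subnets(new_prefix=23))

for number, block in enumerate(org_blocks):
    print(f"Organization {number}: {block}")
```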
34Hierarchical addressing: route aggregation
Hierarchical addressing allows efficient advertisement of routing information
[Figure: Organization 0, Organization 1, Organization 2, ..., Organization 7 connect to Fly-By-Night-ISP, which advertises to the Internet: "Send me anything with addresses beginning 200.23.16.0/20". ISPs-R-Us advertises: "Send me anything with addresses beginning 199.31.0.0/16".]
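35Hierarchical addressing: more specific routes
ISPs-R-Us has a more specific route to Organization 1
[Figure: Organization 0, Organization 2, ..., Organization 7 connect to Fly-By-Night-ISP, which still advertises "Send me anything with addresses beginning 200.23.16.0/20". Organization 1 now connects to ISPs-R-Us, which advertises "Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23".]
Longest prefix match: when several advertised prefixes match a destination, routers forward on the longest (most specific) one.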
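How longest prefix match resolves the overlapping advertisements can be sketched as a linear scan over the advertised prefixes. This is only an illustration of the rule, not how a real router stores its table; `routes` and `next_hop` are hypothetical names, and the prefixes are the ones advertised on the slide.

```python
import ipaddress

# Advertisements seen by a router elsewhere in the Internet.
routes = [
    (ipaddress.ip_network("200.23.16.0/20"), "Fly-By-Night-ISP"),
    (ipaddress.ip_network("199.31.0.0/16"), "ISPs-R-Us"),
    (ipaddress.ip_network("200.23.18.0/23"), "ISPs-R-Us"),
]


def next_hop(dest):
    """Return the ISP whose matching prefix is longest, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [(net.prefixlen, isp) for net, isp in routes if addr in net]
    return max(matches)[1] if matches else None


print(next_hop("200.23.18.5"))  # inside Organization 1's /23
print(next_hop("200.23.20.1"))  # only the /20 matches
```

An address in 200.23.18.0/23 matches both the /20 and the /23, and the /23 wins, so traffic for Organization 1 reaches ISPs-R-Us.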
36IP addressing the last word...
- Q: How does an ISP get block of addresses?
- A: ICANN: Internet Corporation for Assigned Names and Numbers
- allocates addresses
- manages DNS
- assigns domain names, resolves disputes
37NAT Network Address Translation
[Figure: local network (e.g., home network) 10.0.0/24 with interfaces 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4, connected through a NAT router with address 138.76.29.7 to the rest of the Internet]
- Datagrams with source or destination in this network have 10.0.0/24 address for source, destination (as usual)
- All datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers
38NAT Network Address Translation
- Motivation: local network uses just one IP address as far as outside world is concerned
- no need to be allocated range of addresses from ISP: just one IP address is used for all devices
- can change addresses of devices in local network without notifying outside world
- can change ISP without changing addresses of devices in local network
- devices inside local net not explicitly addressable, visible by outside world (a security plus).
39NAT Network Address Translation
- Implementation: NAT router must
- outgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #)
- . . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr.
- remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
- incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table
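The three duties listed above (rewrite outgoing, remember the pair, rewrite incoming) can be sketched with two dictionaries. This is a toy model under stated assumptions: the class name `NatRouter`, its methods, and the choice of 5001 as the first allocated port are hypothetical, and the addresses are the ones used in this deck's NAT example.

```python
class NatRouter:
    """Toy NAT translation table keyed on (LAN address, port) pairs."""

    def __init__(self, wan_ip, first_port=5001):
        self.wan_ip = wan_ip
        self.next_port = first_port  # assumption: ports handed out in order
        self.lan_to_wan = {}         # (lan_ip, lan_port) -> wan_port
        self.wan_to_lan = {}         # wan_port -> (lan_ip, lan_port)

    def outgoing(self, src_ip, src_port):
        """Return the (NAT IP, new port) that replaces the source fields."""
        key = (src_ip, src_port)
        if key not in self.lan_to_wan:
            self.lan_to_wan[key] = self.next_port
            self.wan_to_lan[self.next_port] = key
            self.next_port += 1
        return (self.wan_ip, self.lan_to_wan[key])

    def incoming(self, dst_port):
        """Return the stored (LAN IP, port), or None if no mapping exists."""
        return self.wan_to_lan.get(dst_port)


nat = NatRouter("138.76.29.7")
print(nat.outgoing("10.0.0.1", 3345))  # source fields after translation
print(nat.incoming(5001))              # destination fields restored
```

A datagram arriving for a port with no table entry has no LAN destination, which is why hosts behind the NAT are not directly addressable from outside.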
40NAT Network Address Translation
NAT translation table
WAN side addr        LAN side addr
138.76.29.7, 5001    10.0.0.1, 3345
[Figure: local interfaces 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4 behind the NAT router with address 138.76.29.7]
3: Reply arrives, dest. address: 138.76.29.7, 5001
4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345
41NAT Network Address Translation
- 16-bit port-number field:
- 60,000 simultaneous connections with a single LAN-side address!
- NAT is controversial:
- routers should only process up to layer 3
- violates end-to-end argument
- NAT possibility must be taken into account by app designers, e.g., P2P applications
- address shortage should instead be solved by IPv6
43ICMP Internet Control Message Protocol
- used by hosts & routers to communicate network-level information
- error reporting: unreachable host, network, port, protocol
- echo request/reply (used by ping)
- network-layer "above" IP:
- ICMP msgs carried in IP datagrams
- ICMP message: type, code plus first 8 bytes of IP datagram causing error
Type  Code  Description
0     0     echo reply (ping)
3     0     dest. network unreachable
3     1     dest host unreachable
3     2     dest protocol unreachable
3     3     dest port unreachable
3     6     dest network unknown
3     7     dest host unknown
4     0     source quench (congestion control - not used)
8     0     echo request (ping)
9     0     route advertisement
10    0     router discovery
11    0     TTL expired
12    0     bad IP header
44An aside: Error Reporting (ICMP) and traceroute
- Internet Control Message Protocol
- Used by a router/end-host to report some types of error
- E.g. Destination Unreachable: packet can't be forwarded to/towards its destination.
- E.g. Time Exceeded: TTL reached zero, or fragment didn't arrive in time. Traceroute uses this error to its advantage.
- An ICMP message is an IP datagram, and is sent back to the source of the packet that caused the error.
45Traceroute and ICMP
- Source sends series of UDP segments to dest
- First has TTL = 1
- Second has TTL = 2, etc.
- Unlikely port number
- When nth datagram arrives to nth router
- Router discards datagram
- And sends to source an ICMP message (type 11, code 0)
- Message includes name of router & IP address
- When ICMP message arrives, source calculates RTT
- Traceroute does this 3 times
- Stopping criterion
- UDP segment eventually arrives at destination host
- Destination returns ICMP "port unreachable" packet (type 3, code 3)
- When source gets this ICMP, stops.
It would be fun to design a traceroute tool on your own!
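Before writing a real tool (which needs raw sockets and privileges), the probing logic can be modeled without any network access. The sketch below is a pure simulation: `simulate_traceroute` and the hop names R1, R2, R3, D are hypothetical, and each reply is represented only by its ICMP (type, code) pair.

```python
def simulate_traceroute(path, max_ttl=30):
    """Model TTL-limited probing along a known path.

    path lists the routers in order, followed by the destination host.
    Returns (ttl, node, (icmp_type, icmp_code)) for each probe sent.
    """
    replies = []
    for ttl, node in enumerate(path[:max_ttl], start=1):
        if ttl < len(path):
            # TTL reaches zero at an intermediate router: TTL expired.
            replies.append((ttl, node, (11, 0)))
        else:
            # Probe reaches the destination on an unlikely UDP port:
            # port unreachable, which is the stopping criterion.
            replies.append((ttl, node, (3, 3)))
    return replies


for ttl, node, icmp in simulate_traceroute(["R1", "R2", "R3", "D"]):
    print(ttl, node, icmp)
```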
47Router Architecture Overview
- Two key router functions
- run routing algorithms/protocol (RIP, OSPF, BGP)
- forwarding datagrams from incoming to outgoing
link
48Input Port Functions
- Physical layer: bit-level reception
- Data link layer: e.g., Ethernet
- Decentralized switching:
- given datagram dest., lookup output port using forwarding table in input port memory
- goal: complete input port processing at "line speed"
- queuing: if datagrams arrive faster than forwarding rate into switch fabric
49Three types of switching fabrics
50Switching Via Memory
- First generation routers
- traditional computers with switching under direct control of CPU
- packet copied to system's memory
- speed limited by memory bandwidth (2 bus crossings per datagram)
51Switching Via a Bus
- datagram from input port memory to output port memory via a shared bus
- bus contention: switching speed limited by bus bandwidth
- 1 Gbps bus, Cisco 1900: sufficient speed for access and enterprise routers (not regional or backbone)
52Switching Via An Interconnection Network
- overcome bus bandwidth limitations
- Banyan networks, other interconnection nets initially developed to connect processors in multiprocessor
- Advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric.
- Cisco 12000: switches Gbps through the interconnection network
53Output Ports
- Buffering required when datagrams arrive from fabric faster than the transmission rate
- Scheduling discipline chooses among queued datagrams for transmission
54Output port queueing
- buffering when arrival rate via switch exceeds output line speed
- queueing (delay) and loss due to output port buffer overflow!
55Input Port Queuing
- Fabric slower than input ports combined -> queueing may occur at input queues
- Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward
- queueing delay and loss due to input buffer overflow!
56How a Router Forwards Datagrams
[Figure: a router with numbered ports connected to neighbors R1, R2, R3, R4; e.g. 128.9.16.14 => Port 2]
Forwarding/routing table
Prefix         Next-hop      Port
65/8           128.17.16.1   3
128.9/16       128.17.14.1   2
128.9.16/20    128.17.14.1   2
128.9.19/24    128.17.10.1   7
128.9.25/24    128.17.14.1   2
128.9.176/20   128.17.20.1   1
142.12/19      128.17.16.1   3
57How a Router Forwards Datagrams
- Every datagram contains a destination address.
- The router determines the prefix to which the address belongs, and routes it to the "Network ID" that uniquely identifies a physical network.
- All hosts and routers sharing a Network ID share same physical network.
58Forwarding Datagrams
- Is the datagram for a host on a directly attached network?
- If no, consult forwarding table to find next-hop.
59Inside a router
[Figure: a router with four links; each link has an ingress side and an egress side, and a "Choose Egress" stage sits behind each ingress]
60Inside a router
[Figure: the same four-link router; the "Choose Egress" stage on Link 1 is expanded into a Forwarding Decision that consults a Forwarding Table]
61Forwarding in an IP Router
- Lookup packet DA in forwarding table.
- If known, forward to correct port.
- If unknown, drop packet.
- Decrement TTL, update header Checksum.
- Forward packet to outgoing interface.
- Transmit packet onto link.
Question: How is the address looked up in a real router?
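The "decrement TTL, update header checksum" step can be sketched on a raw 20-byte IPv4 header. This is a minimal illustration assuming a header without options; `build_header`, `checksum` and `forward` are hypothetical helper names, and the addresses are taken from elsewhere in this deck.

```python
import socket
import struct


def checksum(header):
    """IPv4 header checksum: complemented one's-complement sum of 16-bit words."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src, dst, ttl=64, proto=6, total_len=40):
    """Build a 20-byte header (no options) with a valid checksum."""
    h = bytearray(struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_len, 0, 0,
                              ttl, proto, 0,
                              socket.inet_aton(src), socket.inet_aton(dst)))
    h[10:12] = struct.pack("!H", checksum(bytes(h)))
    return bytes(h)


def forward(header):
    """Decrement TTL (byte 8) and recompute the checksum (bytes 10-11)."""
    if header[8] <= 1:
        raise ValueError("TTL expired: drop and send ICMP type 11, code 0")
    h = bytearray(header)
    h[8] -= 1
    h[10:12] = b"\x00\x00"                   # checksum field is zero while summing
    h[10:12] = struct.pack("!H", checksum(bytes(h)))
    return bytes(h)


hdr = build_header("10.0.0.1", "128.9.16.14")
out = forward(hdr)
print(hdr[8], out[8], checksum(out))         # a valid header sums to 0
```

Because the TTL changes at every hop, the checksum has to be updated at every hop as well, which is part of the per-packet cost of forwarding.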
62Making a Forwarding Decision: Class-based addressing
[Figure: the IP address space divided into Class A, Class B, Class C and D. The destination 212.17.9.4 is a Class C address, so its network ID 212.17.9.0 is found by exact match in the Class C routing table, giving Port 4.]
Exact Match: There are many well-known ways to find an exact match in a table.
63Direct Lookup
[Figure: the IP address is used directly as the memory address; the data stored at that location is the (next-hop, port)]
Problem: With 2^32 addresses, the memory would require 4 billion entries.
64Associative Lookups: Content addressable memory (CAM)
- Advantages
- Simple
- Disadvantages
- Slow
- High Power
- Small
- Expensive
[Figure: associative memory or CAM holding (network address, port number) entries; the 32-bit search data is compared with every entry in parallel, producing a Hit? signal and the port number]
65Hashed Lookups
[Figure: the 32-bit search data passes through a hashing function to produce a 16-bit memory address; the memory returns the data stored at that address]
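66Lookups Using Hashing: An example
[Figure: the 32-bit search data is hashed to a 16-bit memory address; each memory location points to a linked list of entries with the same hash key, and each entry holds search data and associated data; a Hit? signal reports whether a matching entry was found]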
67Lookups Using Hashing
- Advantages
- Simple
- Expected lookup time can be small
- Disadvantages
- Non-deterministic lookup time
- Inefficient use of memory
68Trees and Tries
[Figure: binary search tree; at each node the search branches left on < and right on >]
69Search Tries: Multiway tries reduce the number of memory references
[Figure: 16-ary search trie; each node has entries 0000 ... 1111, each holding a pointer to a child node or 0; example keys 000011110000 and 111111111111]
Question: Why not just keep increasing the degree of the trie?
70Classless Addressing: CIDR
[Figure: address line from 0 to 2^32-1 with the segment 128.9/16 containing the address 128.9.16.14]
Most specific route = "longest matching prefix"
Question: How can we look up addresses if they are not an exact match?
71Ternary CAMs
Associative Memory
Value       Mask              Port
10.1.1.32   255.255.255.255   1
10.1.1.0    255.255.255.0     2
10.1.3.0    255.255.255.0     3
10.1.0.0    255.255.0.0       4
10.0.0.0    255.0.0.0         4
[Figure: the matching rows feed a Priority Encoder, which outputs the port of the first matching row]
Note: Most specific routes appear closest to top of table
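The value/mask matching and the role of the priority encoder can be modeled in software. The sketch below is a sequential stand-in for what the hardware does in parallel, using the value, mask and port entries shown on this slide; `tcam` and `tcam_lookup` are hypothetical names.

```python
import ipaddress


def ip(dotted):
    """Convert dotted-decimal notation to a 32-bit integer."""
    return int(ipaddress.ip_address(dotted))


# (value, mask, port), most specific routes first, as on the slide.
tcam = [
    (ip("10.1.1.32"), ip("255.255.255.255"), 1),
    (ip("10.1.1.0"), ip("255.255.255.0"), 2),
    (ip("10.1.3.0"), ip("255.255.255.0"), 3),
    (ip("10.1.0.0"), ip("255.255.0.0"), 4),
    (ip("10.0.0.0"), ip("255.0.0.0"), 4),
]


def tcam_lookup(dest):
    """Return the port of the first (highest-priority) matching row."""
    d = ip(dest)
    # Hardware compares all rows at once; scanning in table order and
    # stopping at the first hit mimics the priority encoder.
    for value, mask, port in tcam:
        if (d & mask) == value:
            return port
    return None


print(tcam_lookup("10.1.1.32"), tcam_lookup("10.1.1.5"), tcam_lookup("10.1.200.1"))
```

Because rows are ordered from most to least specific, the first hit is also the longest matching prefix.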
72Longest prefix matches using Binary Tries
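Example Prefixes
a) 00001
b) 00010
c) 00011
d) 001
e) 0101
f) 011
g) 10
h) 1010
i) 111
j) 111100
k) 11110001
[Figure: binary trie built from these prefixes; each level branches on 0 or 1, and the nodes where prefixes a through k end are labeled]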
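A binary trie over these example prefixes can be sketched in a few lines. This is a minimal, unoptimized version; the `Node` class and the `build` and `longest_match` functions are hypothetical names, and addresses are written as bit strings for readability.

```python
class Node:
    def __init__(self):
        self.children = {}   # "0" or "1" -> Node
        self.label = None    # set if a prefix ends at this node


def build(prefixes):
    """Insert each (label, bit-string) prefix into a binary trie."""
    root = Node()
    for label, bits in prefixes.items():
        node = root
        for bit in bits:
            node = node.children.setdefault(bit, Node())
        node.label = label
    return root


def longest_match(root, address_bits):
    """Walk the trie bit by bit, remembering the last prefix seen."""
    node, best = root, None
    for bit in address_bits:
        node = node.children.get(bit)
        if node is None:
            break
        if node.label is not None:
            best = node.label
    return best


prefixes = {"a": "00001", "b": "00010", "c": "00011", "d": "001",
            "e": "0101", "f": "011", "g": "10", "h": "1010",
            "i": "111", "j": "111100", "k": "11110001"}
trie = build(prefixes)
print(longest_match(trie, "11110000"))  # passes i, ends on j
```

The lookup cost is one node visit per address bit, which is what motivates the multiway tries discussed earlier.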
73Lookup Performance Required
Line    Line Rate   Pktsize=40B   Pktsize=240B
T1      1.5 Mbps    4.68 Kpps     0.78 Kpps
OC3     155 Mbps    480 Kpps      80 Kpps
OC12    622 Mbps    1.94 Mpps     323 Kpps
OC48    2.5 Gbps    7.81 Mpps     1.3 Mpps
OC192   10 Gbps     31.25 Mpps    5.21 Mpps
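The packet rates in this table follow from dividing the line rate by the packet size in bits. A short sketch of that arithmetic, with `packets_per_second` as a hypothetical helper name:

```python
def packets_per_second(line_rate_bps, packet_bytes):
    """Packets per second needed to keep a link busy at a given packet size."""
    return line_rate_bps / (packet_bytes * 8)


line_rates = {"T1": 1.5e6, "OC3": 155e6, "OC12": 622e6,
              "OC48": 2.5e9, "OC192": 10e9}

for name, bps in line_rates.items():
    small = packets_per_second(bps, 40)
    large = packets_per_second(bps, 240)
    print(f"{name}: {small:,.0f} pps at 40B, {large:,.0f} pps at 240B")

# Time available per lookup at OC192 with 40-byte packets, in nanoseconds.
budget_ns = 1e9 / packets_per_second(10e9, 40)
```

At OC192 with 40-byte packets, each lookup has to complete in about 32 ns, which is why lookup structure and memory references matter.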
74Discussion
- Why was the Internet Protocol designed this way?
- Why connectionless, datagram, best-effort?
- Why not automatic retransmissions?
- Why fragmentation in the network?
- Must the Internet address be hierarchical?
- What address does a mobile host have?
- Are there other ways to design networks?