Title: Networking Tips
1. Networking Tips and Best Practices
- Terry Slattery
- Founder and CEO
2. Tech Tip Origins (Background)
- NetMRI - like an MRI for your network
- Based on real-world needs of network engineers
- Identify network hot spots (performance, errors, utilization)
- Operational correctness (HSRP, root bridge selection, duplex mismatch)
- Configuration management (what changed, comparison with network policies)
- System-level analysis
- Deployed in over 50 enterprise networks
- Data sources
- SNMP
- Command Line Interface (Telnet or SSH)
- Configurations
3. But Which Tech Tips?
- Surprising number of common problems
- Affect normal network operations
- Often not visible to element or event systems
- Correcting common problems
- Increases network efficiency and availability
- Especially important in redundant networks
- Reduces firefighting (fire codes vs fire alarms)
- Bounce around multiple tips
- Explain the problem
- Identify the impact
- How to detect the problem
- Fixes
4. Switch Port Duplex Mismatch
- The most common problem in switched networks
- Symptoms
- Errors increase with traffic load; throughput degrades
- Half duplex side sees late collisions
- Full duplex side sees FCS and alignment errors
- Often not apparent at low traffic volume (ping works)
- Detection
- CLI: show interface
- SNMP
- ifTable only includes basic error count
- EtherLike-MIB dot3StatsTable has the error breakdown (late collisions, FCS, alignment); ifXTable does not
- Warning: total errors != sum of individual errors
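
The symptom pattern on this slide can be turned into a simple heuristic. The following is a minimal sketch, assuming per-port error-counter deltas have already been collected by CLI or SNMP. The `PortErrors` fields and the `guess_duplex_side` helper are hypothetical names for illustration, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class PortErrors:
    """Error-counter deltas for one port over a polling interval (hypothetical fields)."""
    late_collisions: int = 0
    fcs_errors: int = 0
    alignment_errors: int = 0


def guess_duplex_side(errors: PortErrors) -> str:
    """Classify a port using the symptom pattern described on the slide."""
    if errors.late_collisions > 0:
        # Late collisions are seen on the half-duplex side of a mismatch.
        return "suspect: this side is half duplex"
    if errors.fcs_errors > 0 or errors.alignment_errors > 0:
        # FCS and alignment errors are seen on the full-duplex side.
        return "suspect: this side is full duplex"
    return "no mismatch symptoms"


print(guess_duplex_side(PortErrors(late_collisions=12)))
```

FCS errors alone can also come from a bad cable plant, so the result is a hint for where to look, not a diagnosis.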
5. Duplex Mismatch Performance
6. Switch Port Duplex Mismatch
- For duplex mismatch
- Auto/Auto both ends (good for workstations)
- Alternative: configure both the switch port and the server NIC
- Bad cable plant may be the problem
- Automated tools to identify errors from mismatches
- Recurring problem
- NICs get replaced
- OS upgrade or reload may reset NIC settings
- Devices move from port to port, switch to switch
7. Duplex Mismatch References
- Microsoft's recommendation
- Network switches and server network adapters have to have the duplex settings matched for communication to function correctly. Both must be set to full-duplex or half-duplex. They cannot be mismatched.
- For optimal performance, the network adapters in a file server should be in full-duplex mode and connected to a switch.
- Technical References
- An Introduction to Auto-Negotiation, Bill Bunch, Feb 1995
- Fluke Networks Auto Negotiation Poster
- Troubleshooting Cisco Catalyst Switches to NIC Compatibility Issues
8. Unidirectional Router Link
- History
- Early 90s
- Routing protocol: RIP
- Zero packets in one direction
- Priority queueing (before routing prioritization)
9. Unidirectional Router Link
- Symptom
- Zero traffic in one direction for many hours
- Enough traffic in the other direction to be of interest
- Detection
- Surprisingly frequent problem
- CLI: show interface and examine packet counts
- SNMP - ifTable packet counters
- Must still diagnose why
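
The detection rule on this slide (zero packets one way, meaningful traffic the other way) can be sketched as below. This is a minimal sketch, assuming packet-counter deltas taken over a window of many hours; the function name and the `min_interesting` threshold of 1000 packets are assumptions to tune, not values from the slide.

```python
def is_unidirectional(in_pkts_delta: int, out_pkts_delta: int,
                      min_interesting: int = 1000) -> bool:
    """Return True when one direction carried zero packets while the
    other carried enough traffic to be of interest.

    The deltas are the change in input/output packet counters over the
    observation window. `min_interesting` is an assumed threshold.
    """
    quiet_in = in_pkts_delta == 0 and out_pkts_delta >= min_interesting
    quiet_out = out_pkts_delta == 0 and in_pkts_delta >= min_interesting
    return quiet_in or quiet_out


print(is_unidirectional(0, 50_000))
```

A flagged link still has to be diagnosed by hand; the check only narrows down where to look.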
10. Unidirectional Router Link
- Static routes
- Use care when overriding the dynamic protocol
- Static routes can be a big hammer
- Over-use creates problems
- Policy routing or Traffic Engineering
- MPLS
- Multi-Topology Routing
- Hardware or cable plant failure
- UDLD helps avoid it at the link level
- Modern routing protocols require two-way adjacency
- ACL configuration mistake
- It happens more often than you'd think!
11. QoS Queue Drops
- Symptom
- Poor voice quality
- Apps with poor response time
- Packet loss during high network utilization
- High priority traffic starves low priority traffic
- The loss could be good or could be bad
- It depends: which queue?
- One Expedited Forwarding queue, typically VoIP
- Four Assured Forwarding classes, three drop probability queues in each class
- Are you monitoring drops?
- (Figure labels: G.711 good, 80 ms jitter, 10% packet loss)
12. QoS Queue Drops
- Design policies
- Know your traffic mix
- Proper allocation of traffic to queues
- Default service queue (Scavenger)
- Implementation
- Classification and marking near source
- Consistent queue policies
- Monitor queues
- Is the right thing happening?
- What adjustments are necessary?
- References
- Cisco At-A-Glance guides
- Networkers talks
13. 95th Percentile Interface Utilization
- The problem
- Averages may hide significant utilization problems
- 95% of the time, the usage is below this amount (5% of the time it's above)
- Sort data samples, discard top 5%, take value of next sample
- Average: 0.397 Mbps; 95th percentile: 0.752 Mbps
14. 95th Percentile Interface Utilization
- Need statistically significant number of samples
- 10 minute interval yields 144 samples per day
- Discard top 5% of the samples (7 samples)
- 7 samples x 10 minutes/sample = 70 minutes!
- Tradeoff: sampling interval vs sampling impact
- Definitely good for WAN (and many LAN) interfaces
- Identify high utilization during daily operations where the average is low
- Also good for identifying very low utilization links (e.g. unused WAN links)
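
The sort-and-discard procedure from these two slides is short enough to write out. This is a minimal sketch, assuming one day of utilization samples at a fixed interval; the function names are illustrative, and the number of discarded samples is rounded down, as in the 144-sample example (7 samples).

```python
def percentile_95(samples):
    """Sort the samples, discard the top 5%, return the next value."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples, reverse=True)
    discard = len(ordered) * 5 // 100  # 144 samples -> 7 discarded
    return ordered[discard]


def minutes_above(sample_count, interval_minutes=10):
    """Time covered by the discarded top 5% of samples."""
    return (sample_count * 5 // 100) * interval_minutes


day = list(range(1, 145))       # 144 synthetic samples: 1, 2, ..., 144
print(percentile_95(day))       # 137
print(minutes_above(len(day)))  # 70
```

With 10 minute samples, the interface can sit above the reported 95th percentile for 70 minutes a day, which is why the sampling interval matters.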
15. Excessive ICMP Packets
- ICMP destination or network unreachable
- Address scan by worms
- Application uses hard-coded addressing and server moved
- ICMP redirect
- Host with wrong subnet mask (sends to router vs direct)
- Old secondary subnet
- ICMP TTL exceeded
- Traceroute - this is ok, but count should be small (<850/day/router?)
- Routing loop indicator if high counts (>2000-4000?)
16. Excessive ICMP Packets
- How to troubleshoot?
- SNMP doesn't include source/destination addresses
- Protocol analyzer (site visit and tapping the network)
- Access list with logging
- Netflow?
- NAM blade in 6500s
- debug ip icmp works because ICMP is typically low rate
17. Root Bridge Selection
- Symptom
- A spanning tree becomes unstable after adding a switch
- Multiple root bridge changes during a short period
- Blocking/Listening/Learning/Forwarding cycles
- Default bridge priority is 32768
- Switch with the lowest MAC address is the tie-breaker
- War story (from Networkers 2005 discussion)
- 900 server data center
- Needed another port
- Added an old, small switch
- Result: unstable L2 infrastructure
- Why? It's preemptive
18. Root Bridge Selection
- Selecting a root bridge
- Stable switch w/ stable power
- Core, connected to routing infrastructure
- Sufficient memory and CPU
- Set bridge priority (low numbers win)
- Common recommendation: 8192
- Using 8190 protects against accidental addition of switch with 8192
- Backup root?
- Marginal usefulness in non-redundant designs
- Bridge priority 16384 (16380)
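
The election rule behind the war story (lowest priority wins, lowest MAC address breaks ties) can be illustrated with a few lines. This is a minimal sketch, not a spanning tree implementation; the switch names and MAC addresses are made up for the example.

```python
def elect_root(bridges):
    """Pick the root bridge: lowest (priority, MAC address) wins.

    `bridges` maps a switch name to (priority, "aa:bb:cc:dd:ee:ff").
    """
    def bridge_id(item):
        _name, (priority, mac) = item
        return (priority, int(mac.replace(":", ""), 16))

    return min(bridges.items(), key=bridge_id)[0]


# All defaults: the old switch has the lower MAC address and takes over.
bridges = {
    "core": (32768, "00:1a:2b:00:00:01"),
    "old-small": (32768, "00:00:0c:00:00:01"),
}
print(elect_root(bridges))  # old-small

# Core at 8190 still wins when someone adds a switch set to 8192.
bridges["core"] = (8190, "00:1a:2b:00:00:01")
bridges["accident"] = (8192, "00:00:0c:00:00:02")
print(elect_root(bridges))  # core
```

Because the election is preemptive, the winner changes as soon as a better bridge ID appears, which is what destabilized the data center in the war story.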
19. Tracking Configuration Changes
- Symptom
- I only changed one thing!
- What was working before is now failing
- 50-80% of network errors are due to configuration changes!
- Identifying configuration changes is important
- Compliance
- Sarbanes-Oxley (corp financial records)
- HIPAA (health records)
- GLBA (individual financial records)
- PCI (credit card numbers)
- Who changed it and why
- Change review
20. Tracking Configuration Changes
- What is a configuration change?
- IOS reports a config change whenever config terminal is entered, even if nothing was modified
- Grab the config and check for changes
- Archiving
- Keep all copies (disk space is inexpensive)
- Hourly check is likely sufficient
- Watch for debug sessions creating many revisions
- First step of Configuration Management DataBase (CMDB)
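
"Grab the config and check for changes" can be sketched as a diff of the newly collected configuration against the last archived copy. This is a minimal sketch, assuming the configurations are already available as text; the `VOLATILE_PREFIXES` list (lines that change without a real edit) and the function names are assumptions for illustration.

```python
import difflib

# Lines that can differ between collections without a real change.
# This list is an assumption; adjust it for the devices in use.
VOLATILE_PREFIXES = (
    "! Last configuration change",
    "! NVRAM config last updated",
    "ntp clock-period",
)


def normalize(config_text):
    """Drop blank and volatile lines so only real changes remain."""
    lines = [line.rstrip() for line in config_text.splitlines()]
    return [ln for ln in lines if ln and not ln.startswith(VOLATILE_PREFIXES)]


def config_diff(old_text, new_text):
    """Return unified-diff lines; an empty list means no real change."""
    return list(difflib.unified_diff(
        normalize(old_text), normalize(new_text),
        fromfile="previous", tofile="current", lineterm=""))


old = "! Last configuration change at 10:00\nhostname r1\n"
new = "! Last configuration change at 11:00\nhostname r1\nlogging 10.0.1.101\n"
for line in config_diff(old, new):
    print(line)
```

Archiving a new revision only when the diff is non-empty keeps debug sessions and timestamp-only updates from flooding the archive.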
21. Tracking Configuration Changes
22. Configuration Best Practices
- Is my network's configuration correct?
- Consistency is important
- Prevent security holes
- Avoid unintentional network outages
- Validate implementation of network policies
- Validate as-built against design
- Manual validation
- Doesn't scale to hundreds of routers and switches
- Error prone
- Configuration Management DataBase (CMDB)
- Archive of all configurations
- Periodic validation of network policy implementation
- (Will show NetMRI mechanism - use the tools you have)
23. Router Loopback Interface
- Best practice
- Configure a loopback interface for router management
- Tie BGP, VTY source, tunnels, and SNMP/syslog to loopback
- Use separate CIDR for loopbacks to simplify ACLs
- Configuration Policy Definition
- Policy: Cisco Router Loopback0 Policy
- Description: Check for a Loopback 0 interface
- Device-Filter: Vendor "Cisco" and Type "Router"
- Section: Loopback Interface
- Description: Ensure that the Loopback0 interface is defined
- Required
- interface Loopback0
- ip address 10\.1\.. 255.255.255.252
24. SNMP and Syslog
- Best practice
- Tie loopback interface to Syslog
- Use ACL to restrict access
- Allow access from adjacent routers/switches
- Configuration Policy Definition
- Section: SNMP and Syslog configuration
- Description: Policy for SNMP and Syslog configuration
- Required
- access-list 80 permit 10.0.1.101
- access-list 90 permit 10.0.1.101
- snmp-server community . RO 80
- snmp-server community . RW 90
- snmp-server host 10.0.1.101 .
- logging 10.0.1.101
- logging source-interface Loopback0
- logging facility syslog
- snmp-server enable traps tty
25. DNS, NTP, and Clock Configuration
- Best practice
- DNS and NTP servers on separate subnets
- Clock and logging timestamp configuration
- Configuration Policy Definition
- Section: NTP and Clock configuration
- Description: Check NTP servers and clock configuration.
- Required
- ip domain-name example.com
- ip name-server 10.0.1.100
- ip name-server 10.0.2.100
- ntp server 10.0.1.100
- ntp server 10.0.2.100
- clock timezone (EST|est) -5
- clock summer-time (EST5EDT|edt) recurring
- service timestamps debug datetime msec localtime
- service timestamps log datetime msec localtime
- Invalid
- ntp server .
- ip name-server .
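
A Required/Invalid policy like the ones on these slides can be checked mechanically. The sketch below is one possible interpretation, not the NetMRI mechanism: every Required pattern must match some configuration line, and any line that matches an Invalid pattern without also matching a Required pattern is flagged. The function name and the example regular expressions are assumptions.

```python
import re


def check_policy(config_text, required, invalid):
    """Return a list of policy problems (empty list means compliant)."""
    lines = [ln.strip() for ln in config_text.splitlines() if ln.strip()]
    required_res = [re.compile(p) for p in required]
    invalid_res = [re.compile(p) for p in invalid]
    problems = []
    for pattern, rx in zip(required, required_res):
        if not any(rx.fullmatch(ln) for ln in lines):
            problems.append(f"missing required line: {pattern}")
    for ln in lines:
        if any(rx.fullmatch(ln) for rx in required_res):
            continue  # explicitly allowed by a Required pattern
        if any(rx.fullmatch(ln) for rx in invalid_res):
            problems.append(f"invalid line: {ln}")
    return problems


# Hypothetical patterns modeled on the NTP and DNS section above.
required = [r"ntp server 10\.0\.1\.100", r"ntp server 10\.0\.2\.100"]
invalid = [r"ntp server .+", r"ip name-server .+"]
config = "ntp server 10.0.1.100\nntp server 192.168.9.9\n"
for problem in check_policy(config, required, invalid):
    print(problem)
```

Run against an archive of all configurations, a check like this replaces the manual validation that does not scale.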
26. VLAN Naming
- Symptom
- Multiple VLAN names in one spanning tree
- Not important for the switches (they use VLAN ID)
- Confusing for the network administrators
- Inconsistent naming
- Configuration error
- Other...
- 5 switches with one name
- 4 switches with another name
- The cause is?
- Two VLANs accidentally interconnected
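
Finding the "5 switches with one name, 4 with another" pattern is a grouping exercise. This is a minimal sketch, assuming an inventory of (switch, VLAN ID, VLAN name) tuples has already been collected and that VLAN ID is a good enough stand-in for "one spanning tree"; the function name and sample data are hypothetical.

```python
from collections import defaultdict


def inconsistent_vlan_names(inventory):
    """Return {vlan_id: {name: [switches]}} for VLAN IDs with more than one name."""
    names = defaultdict(lambda: defaultdict(list))
    for switch, vlan_id, name in inventory:
        names[vlan_id][name].append(switch)
    return {vid: dict(by_name)
            for vid, by_name in names.items() if len(by_name) > 1}


inventory = (
    [(f"sw{i}", 20, "ENGINEERING") for i in range(1, 6)]
    + [(f"sw{i}", 20, "FINANCE") for i in range(6, 10)]
    + [(f"sw{i}", 10, "MGMT") for i in range(1, 10)]
)
for vlan_id, by_name in inconsistent_vlan_names(inventory).items():
    for name, switches in by_name.items():
        print(vlan_id, name, len(switches))
```

A near-even split like this one points at two VLANs that were joined by accident rather than a single typo.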
27. Summary
- Common problems still abound
- Old fashioned routing and switching problems
- Combinations of problems create weird symptoms
- Reduce firefighting by proactive analysis and correction
- Traps and syslog are reactive
- System-level analysis is required (it's more than element mgmt)
- Automated tools are required
- Manual data collection doesn't scale
- Automate the tedious analysis
- Automation frees time to work on interesting problems
"Better, simpler and more automated tools are required." Jean-Pierre Garbani, Forrester Research, commenting on the evolution of the Event Management Market