1
Interpreting BGP Dynamics
Olaf Maennel, Technische Universität München
2
BGP
  • De facto standard inter-domain routing protocol
  • Path vector protocol
  • Policy routing protocol
  • Topology changes, etc. → BGP updates

[Figure: AS topology with AS 1, AS 3, AS 4, and AS 5, prefix p, and a BGP observation point; AS paths shown: 1, 3 1, and 5 3 1]
3
Locating routing instabilities
  • LOCATION: an AS edge, either internal to an
    AS (e.g., inside AS1: AS1-AS1) or a link between
    two ASes (e.g., external: AS1-AS2)
  • INSTABILITY: any change of BGP advertisement over
    a BGP session

4
Why identify locations of instabilities?
  • Instabilities can lead to:
    • Unreachability / poor performance
    • Route oscillation
    • BGP churn
    • Black holes
  • Identifying the location enables corrective action

5
Causes of instabilities
  • Possible causes for BGP instabilities:
    • BGP session availability: session establishment/teardown/reset
    • BGP session filters: BGP attribute or filter manipulation, misconfiguration
    • IGP cost change: IGP metric change, link or node failures

6
Outline
  • High-level approach
  • Limitations
  • Evaluation

8
High-level approach (I): Dimensions
  • Locate BGP instability by analyzing BGP updates
    along three dimensions:
    1. Time
    2. Views
    3. Prefixes

9
One trigger, multiple updates?!!
[Figure: topology with ISP A (Tier 1), ISP B (Tier 1), ISP D (Tier 2), ISP C, and Customer E]
12
BGP doing its job
16
Definition of terms
  • Echoes: multiple BGP updates for the same triggering event,
    on one peering session, for one prefix
  • Update bursts: updates grouped by same prefix / peer
    within a short time window
[Figure: timeline of BGP updates (echoes) for prefix A seen on peer P1]
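
A minimal sketch of the burst grouping defined above. It assumes updates arrive as (timestamp, peer, prefix) tuples and that a fixed quiet period of 120 seconds closes a burst; both the tuple layout and the timeout value are illustrative assumptions, not values from the talk.

```python
from collections import defaultdict

def group_into_bursts(updates, timeout=120.0):
    """Group updates into bursts per (peer, prefix).

    updates: iterable of (timestamp_seconds, peer, prefix) tuples (hypothetical layout).
    timeout: quiet period in seconds that ends a burst (assumed value).
    Returns {(peer, prefix): [[t1, t2, ...], ...]}, one inner list per burst.
    """
    bursts = defaultdict(list)
    for ts, peer, prefix in sorted(updates):
        key_bursts = bursts[(peer, prefix)]
        if key_bursts and ts - key_bursts[-1][-1] <= timeout:
            key_bursts[-1].append(ts)   # echo belonging to the current burst
        else:
            key_bursts.append([ts])     # quiet period elapsed: start a new burst
    return dict(bursts)

updates = [(0, "P1", "A"), (30, "P1", "A"), (55, "P1", "A"), (900, "P1", "A")]
bursts = group_into_bursts(updates)
```

Here the first three updates form one burst and the update at 900 seconds starts a second one.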
17
Digression: update burst duration
Convergence can take rather long
18
Time
[Figure: prefix p seen at BGP observation point A]
Same prefix, same observation point
19
Views
[Figure: prefix p seen at BGP observation points A and B]
Same prefix across observation points
20
Event
  • Captures update propagation
  • Clusters update bursts across observation
    points
  • Different timeout heuristics: relative, static,
    adaptive

[Figure: timeline of updates for prefix A seen on peers P1, P2, and P3, with the event duration and a quiet period marked]
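
A minimal sketch of event clustering with the static timeout variant only (the relative and adaptive variants are not shown). The (start, end, peer) burst layout and the 60-second timeout are assumptions made for illustration.

```python
def cluster_into_events(bursts, timeout=60.0):
    """Cluster update bursts for one prefix across observation points.

    bursts: list of (start, end, peer) tuples (hypothetical layout).
    timeout: static quiet period in seconds separating events (assumed value).
    Returns a list of events, each a dict with start, end, and the set of peers.
    """
    events = []
    for start, end, peer in sorted(bursts):
        if events and start - events[-1]["end"] <= timeout:
            ev = events[-1]                 # burst falls inside the current event
            ev["end"] = max(ev["end"], end)
            ev["peers"].add(peer)
        else:                               # quiet period: open a new event
            events.append({"start": start, "end": end, "peers": {peer}})
    return events

bursts = [(0, 55, "P1"), (10, 40, "P2"), (70, 90, "P3"), (600, 610, "P1")]
events = cluster_into_events(bursts)
```

The first three bursts merge into one event seen on P1, P2, and P3; the burst at 600 seconds follows a quiet period and forms its own event.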
21
Prefixes
[Figure: prefixes p and q seen at BGP observation points A and B]
Correlate across prefixes
22
High-level approach (II): Algorithms
  • UNION heuristic (Time)
  • INTERSECTION heuristic (Views)
  • GREEDY heuristic (Prefixes)

23
UNION heuristic
  • Routing instability → change from
    previous to new path:
    • previous best path no longer available
    • or new best path becomes available
  • → UNION of AS edges as candidates
  • Input: BGP updates
  • Output: update bursts with candidates
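
A minimal sketch of the UNION step, assuming each AS path is a list of AS numbers and that an AS edge is an adjacent pair on a path. Edges internal to an AS are left out to keep the example short; the function names are hypothetical.

```python
def path_edges(as_path):
    """AS edges (adjacent AS pairs) on one AS path, e.g. [5, 3, 1] -> {(5, 3), (3, 1)}."""
    return set(zip(as_path, as_path[1:]))

def union_candidates(paths_in_burst):
    """UNION heuristic: every AS edge on any path seen in one update burst,
    including the previous best path from before the burst."""
    candidates = set()
    for path in paths_in_burst:
        candidates |= path_edges(path)
    return candidates

# Previous path 5 3 1, new path 5 4 1 (made-up example paths).
cands = union_candidates([[5, 3, 1], [5, 4, 1]])
```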

24
UNION heuristic
[Figure: prefix p seen at BGP observation point A]
UNION of AS edges on paths as candidates
25
INTERSECTION heuristic
  • Routing instability → changes at multiple
    observation points
  • → INTERSECTION of candidate sets
  • Input: update bursts with candidates
  • Output: events with instability sets
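
A minimal sketch of the INTERSECTION step, assuming one candidate set of AS edges per observation point for the same event. The example sets are made up.

```python
def intersect_candidates(candidate_sets):
    """INTERSECTION heuristic: AS edges common to the candidate sets of all
    update bursts (one per observation point) that belong to the same event."""
    sets = [set(s) for s in candidate_sets]
    if not sets:
        return set()
    return set.intersection(*sets)

view_a = {(5, 3), (3, 1), (5, 4), (4, 1)}   # candidates at observation point A
view_b = {(7, 3), (3, 1), (7, 2), (2, 1)}   # candidates at observation point B
instability_set = intersect_candidates([view_a, view_b])
```

Only the edge (3, 1) is a candidate at both observation points, so it is the whole instability set for this event.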

26
INTERSECTION heuristic
[Figure: prefix p seen at BGP observation points A and B]
Changes observable at multiple observation points
28
GREEDY heuristic
  • Routing instability → changes for multiple
    prefixes
  • → identify correlated prefixes
  • Input: events with instability sets
  • Output: correlated events

29
GREEDY heuristic
  • Goal: distinguish between multiple simultaneous
    instabilities
  • Determine the most popular AS edge in the instability
    sets:
    • For all events and for each edge in the instability
      set: Counter[edge]++
    • Sort edges by counter values
    • Choose the edge with the largest counter value as
      candidate AS edge for the associated events
    • Remove these events from the input
    • Repeat
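
The steps above can be sketched as follows. The input layout (a dict from event id to a set of AS edges) and the tie-breaking rule are assumptions; the slides do not say how ties between equally popular edges are resolved.

```python
from collections import Counter

def greedy_correlate(instability_sets):
    """instability_sets: {event_id: set of AS edges} (hypothetical layout).
    Returns a list of (edge, [event_ids]) in the order the edges were chosen."""
    # Events with an empty instability set cannot be explained and are skipped.
    remaining = {ev: set(edges) for ev, edges in instability_sets.items() if edges}
    result = []
    while remaining:
        counter = Counter(edge for edges in remaining.values() for edge in edges)
        # Most popular edge; ties broken by edge order (assumed rule).
        best = min(counter, key=lambda e: (-counter[e], e))
        explained = sorted(ev for ev, edges in remaining.items() if best in edges)
        result.append((best, explained))
        for ev in explained:                # remove these events, then repeat
            del remaining[ev]
    return result

events = {
    "p": {(3, 1), (5, 3)},
    "q": {(3, 1), (4, 1)},
    "r": {(3, 1)},
    "s": {(7, 2)},
}
chosen = greedy_correlate(events)
```

In this made-up input, edge (3, 1) explains the events for p, q, and r in the first round, and (7, 2) explains the remaining event in the second.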

30
Prefixes
[Figure: prefixes p and q seen at BGP observation points A and B; candidates for prefix p highlighted]
31
Prefixes
[Figure: prefixes p and q seen at BGP observation points A and B; candidates for prefixes p and q highlighted]
32
Outline
  • High-level approach
  • Limitations
  • Evaluation

33
Problems with UNION heuristic
  • Location may not be in the UNION at all! → may
    lead to empty INTERSECTION
  • Size of candidate set may be large

34
Caution: induced updates
[Figure: topology of AS1 to AS8 with prefix p; path labels shown: 1, 21, 321, 71, 871]
Policy: AS 4 prefers the path via AS 3 over the path via AS 6!
35
Caution: induced updates
[Figure: topology of AS1 to AS8 with prefix p; path labels shown: 1, 71, 871, 321, 4321, 5871; paths at AS5 marked preferred and less preferred]
Link failure between AS 2 and AS 3
36
Caution: induced updates
[Figure: topology of AS1 to AS8 with prefix p; path labels shown: 1, 71, 871, 321, 4321, 5871, 5461; paths at AS5 marked preferred and less preferred]
Old path 5871, new path 5461, but the failure is
between AS 2 and AS 3
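
A small worked check of this caution, using the path labels 5871 and 5461 from the figure as the old and new path. The edge representation (adjacent AS pairs) is an assumption carried over for illustration.

```python
def path_edges(as_path):
    """AS edges (adjacent AS pairs) on one AS path."""
    return set(zip(as_path, as_path[1:]))

old_path = [5, 8, 7, 1]
new_path = [5, 4, 6, 1]
failed_link = (3, 2)   # link between AS 3 and AS 2

union = path_edges(old_path) | path_edges(new_path)
# Check both directions, since the link is undirected.
location_in_union = failed_link in union or failed_link[::-1] in union
```

`location_in_union` is False: neither the old nor the new path crosses the failed link, so the UNION of their edges cannot contain the true location.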
37
Reducing size of candidate set
  • Idea: exclude some ASes
    • e.g., the initial or final shared path segment
  • Narrows the candidate set, but may exclude the
    location

[Figure: paths to prefix p, with the shared initial and final segments marked "exclude"]
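
A minimal sketch of one such reduction, assuming the excluded part is every AS edge that lies inside the initial or final segment shared by the old and new path. This is one possible reading of the idea, not necessarily the exact rule used in the study.

```python
def path_edges(as_path):
    """AS edges (adjacent AS pairs) on one AS path."""
    return set(zip(as_path, as_path[1:]))

def common_prefix_len(a, b):
    """Number of leading elements that two sequences share."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def reduced_candidates(old_path, new_path):
    """UNION of edges minus edges inside the shared initial/final segments."""
    head = common_prefix_len(old_path, new_path)
    # Shared final segment, measured on what is left after the shared head.
    tail = common_prefix_len(old_path[head:][::-1], new_path[head:][::-1])
    shared = path_edges(old_path[:head])
    if tail:
        shared |= path_edges(old_path[-tail:])
    return (path_edges(old_path) | path_edges(new_path)) - shared

cands = reduced_candidates([6, 5, 4, 3, 1], [6, 5, 4, 2, 1])
```

Here the shared initial segment 6 5 4 removes the edges (6, 5) and (5, 4). If the instability were actually on one of those edges, the reduced set would miss it.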
38
Caution on excluding candidate ASes
[Figure: topology of AS1 to AS5 with prefix p; path labels shown: 1, 21, 31]
39
Caution on excluding candidate ASes
[Figure: topology of AS1 to AS5 with prefix p and a change of preference; path labels shown: 1, 21, 31, 421, 431, 5421]
Path change: AS 2 replaces AS 3, yet the cause
is AS 5
40
Good news
  • Accurate in simulations
  • Accurate when applied to real data
  • Some formal justification in paper

41
Outline
  • High-level approach
  • Limitations
  • Evaluation

42
Evaluation of methodology
  • Simulation setup:
    • Inferred AS topology from BGP data
    • Routescope simulator
  • Data analysis setup:
    • BGP routing table dumps and updates from RIPE,
      Routeviews, and Akamai
    • Over 1,100 BGP feeds / 650 ASes (some I-BGP)

43
Simulation
  • Topology:
    • Inferred AS topology
    • Single-node ASes
  • Policies:
    • Inferred AS relationships
    • Prefer customer routes over peers over upstreams
    • Predicted routes agree with high accuracy with
      actual routes
  • Link failures: randomly selected
  • Observation points: randomly selected

44
Simulations: UNION / INTERSECTION
[Figure: histogram of instability set size for several UNION heuristics; x-axis: size of instability set (# of AS-AS edges); y-axis: percentage of events]
Choice of heuristic matters
45
Simulations summary
  • The methodology never excludes the simulated
    failure location
  • Number of observation points matters
  • Average instability set sizes after
    intersection:
    • with only two obs.: ≤ 7 edges in 68%
    • with 10 obs.: ≤ 7 edges in 88%
  • Location of observation points matters (in
    AS hierarchy)

46
Data analysis: UNION / INTERSECTION
[Figure: histogram of instability set size for several UNION heuristics; x-axis: size of instability set (# of AS-AS edges); y-axis: percentage of events]
More aggressive heuristics are dangerous
47
UNION / INTERSECTION / GREEDY
  • Zipf's law seems to apply to the distribution of
    correlated events across prefixes
  • Single AS edge identified for 93.4% of prefixes
  • Three AS edges identified for 97.2% of prefixes
  • If restricted to at least 100 correlated
    prefixes:
    • Single AS edge identified for 96.3% of prefixes

48
Validation
  • Syslog data of a tier-1 ISP vs. GREEDY results
  • Cross-check: session reset on router → event
    within 5 minutes
  • Result:
    • Checked 35 events
    • Found 26 events → 74% of the events

49
Summary
  • Proposed methodology: Time → Views → Prefixes
  • Ideal-world study: simulation
    • UNION / INTERSECTION heuristics: ≤ 7 AS edges
      for 88% (10 obs.)
  • Real-world study: data analysis
    • UNION / INTERSECTION heuristics:
      • Beacons: ≤ 3 AS edges for 76% (2 obs.)
      • All prefixes: ≤ 5 AS edges for 90% (5 obs.)
    • UNION / INTERSECTION / GREEDY heuristic:
      • All prefixes: 1 AS edge for 93%
  • Successful validation on tier-1 syslog data

50
Ongoing work
  • Generate synthetic BGP traffic
    • with Alexander Tudor (Agilent Labs)
    • ease router testing by
      identifying a statistical profile of BGP
  • BGP alarm system
    • with Gert Doering (SpaceNet) and RIPE NCC
    • detect unwanted routing conditions and trigger
      alarms
    • integration of private AS monitors with RIPE's
      myASN project

51
Questions? Comments?!
Thanks!
52
What happened? :-)
[Figure: topology of AS1 to AS8 joined by CE-PE links and one peering link, with prefixes p and q and BGP observation points A and B]
Tier 1: AS 1 / AS 7 is doing TE on incoming routes
54
What happened? :-)
[Figure: topology of AS1 to AS8 joined by CE-PE links and one peering link, with prefixes p and q and BGP observation points A and B]
AS 2 added a new upstream, AS 3
55
What happened? :-)
[Figure: topology of AS1 to AS8 joined by CE-PE links and one peering link, with prefixes p and q and BGP observation points A and B]
AS 5 prefers the peering session / AS 7 the
shortest AS path
56
Additional slides:
A few thoughts about route flap damping
57
Interarrival time between echoes
Peers without MRAI: lots of echoes. With MRAI: doesn't prevent echoes.
58
Number of echoes in update bursts
Damping on peers? Without MRAI: 8.3; with MRAI: 2.4
59
Cisco's default damping parameters
60
Summary
  • Today's eBGP convergence depends on:
    • MRAI: a shorter MRAI leads to more echoes and
      thus to more damping, and to faster convergence if
      damping is not aggressive
    • Damping settings:
      • damping occurs for normal prefixes! (BGP path
        exploration may need 6 echoes, and depends on
        interconnectivity)
      • damping helps for unstable prefixes
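
A rough sketch of why echoes from one event can trigger damping. The constants are the commonly documented Cisco defaults (half-life 15 minutes, suppress limit 2000, 1000 penalty points per flap); they are not taken from these slides. Charging every echo the full flap penalty and spacing echoes 30 seconds apart are simplifying assumptions.

```python
HALF_LIFE = 15 * 60      # seconds; commonly documented Cisco default
SUPPRESS_LIMIT = 2000    # penalty at which the route is suppressed
FLAP_PENALTY = 1000      # assumed penalty charged for every echo

def first_suppressed_update(update_times):
    """Return the 1-based index of the update that pushes the penalty to the
    suppress limit, or None if the route is never suppressed."""
    penalty = 0.0
    last = None
    for i, t in enumerate(update_times, start=1):
        if last is not None:
            penalty *= 0.5 ** ((t - last) / HALF_LIFE)   # exponential decay
        penalty += FLAP_PENALTY
        last = t
        if penalty >= SUPPRESS_LIMIT:
            return i
    return None

echoes = [0, 30, 60, 90, 120, 150]   # six echoes, 30 s apart (assumed spacing)
idx = first_suppressed_update(echoes)
```

Under these assumptions the route is already suppressed at the third echo, well before a six-echo path exploration completes.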

61
Additional slides:
Convergence
62
Regarding BGP convergence
  • Timeout too small: can't capture all effects
  • Timeout too large: combines several instabilities
    in one burst

[Figure: timeline of updates for prefix A seen on peers P1, P2, and P3, with two instabilities marked]
63
Regarding BGP convergence
  • Timeout too small: can't capture all effects
  • Timeout too large: combines several instabilities
    in one burst

[Figure: timeline of updates for prefix A seen on peers P1, P2, and P3, with one failure marked]
64
Update burst duration
Convergence can take rather long
65
Number of updates in update bursts
Most bursts: only a few updates. Some bursts: a huge number of updates!
66
Interarrival time of update bursts
Time to next update burst: unpredictable
67
Convergence points on different peers
Do all peers converge at the same time?
  • Pick one prefix on one peer
  • Find other peers with active update bursts
  • Compute the time difference between convergence points
[Figure: timeline of updates for prefix A seen on peers P1, P2, and P3]
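
A minimal sketch of this comparison for one prefix. It assumes that a burst's convergence point is the time of its last update and that "active" means overlapping in time with the reference peer's burst; both readings, and the {peer: (start, end)} layout, are assumptions.

```python
def convergence_differences(bursts, ref_peer):
    """bursts: {peer: (start, end)} for one prefix (hypothetical layout).
    Returns {peer: end - ref_end} for peers whose burst overlaps the
    reference peer's burst."""
    ref_start, ref_end = bursts[ref_peer]
    diffs = {}
    for peer, (start, end) in bursts.items():
        if peer == ref_peer:
            continue
        if start <= ref_end and end >= ref_start:   # bursts overlap in time
            diffs[peer] = end - ref_end             # positive: converged later
    return diffs

bursts = {"P1": (0, 60), "P2": (10, 95), "P3": (40, 50), "P4": (500, 520)}
diffs = convergence_differences(bursts, "P1")
```

With these made-up bursts, P2 converges 35 seconds after P1, P3 converges 10 seconds before it, and P4 is ignored because its burst is not active at the same time.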
68
Time difference between convergence points
5% of prefixes with more/less specific update burst
69
Bursts observed on different peers
Update distribution: locally or globally visible
70
Additional slides:
BGP wedgies
71
Tim Griffin: BGP wedgies
[Figure: topology with ISP A (Tier 1), ISP B (Tier 1), ISP D (Tier 2), ISP C, and Customer E]
72
Tim Griffin: BGP wedgies
[Figure: topology with ISP A (Tier 1), ISP B (Tier 1), ISP D (Tier 2), ISP C, and Customer E; a primary link and a backup link are marked]
73
Desired situation!!!
[Figure: topology with ISP A (Tier 1), ISP B (Tier 1), ISP D (Tier 2), ISP C, and Customer E; a primary link and a backup link are marked]
74
AS path prepending???
[Figure: topology with ISP A (Tier 1), ISP B (Tier 1), ISP D (Tier 2), ISP C, and AS 2; AS paths shown: 2 and 2 2 2 2 2]
75
Policies with communities?!!
[Figure: topology with ISP A (Tier 1), ISP B (Tier 1), ISP D (Tier 2), ISP C, and AS 2; primary link marked; label: community sets local-preference]
76
Primary link fails
[Figure: topology with ISP A (Tier 1), ISP B (Tier 1), ISP D (Tier 2), ISP C, and AS 2]
77
Primary link recovers?!!
[Figure: topology with ISP A (Tier 1), ISP B (Tier 1), ISP D (Tier 2), ISP C, and AS 2]