Title: Declarative Routing: Extensible Routing with Declarative Queries
1 Declarative Routing: Extensible Routing with Declarative Queries
- UC Berkeley: Boon Thau Loo, Joseph M. Hellerstein, Ion Stoica
- Intel Research: Joseph M. Hellerstein
- Wisconsin-Madison: Raghu Ramakrishnan
- SIGCOMM '05
2 Outline
- Abstract
- System Model
- The Basics
- Challenges
- Expressiveness
- Security
- Optimization
- Stability and Robustness
- Evaluation
- Conclusion
3 Abstract (1)
- Problem: the Internet routing infrastructure is difficult to extend to support new applications.
- Prior approaches to this problem
  - New hard-coded routing protocols
    - Changing or upgrading a routing protocol requires access to every router, which is tedious.
  - Active networks allow new routing functionality to be deployed without direct access to routers.
    - Difficulties remain in router performance and in the security and reliability of the resulting infrastructure.
  - Overlay networks allow third parties to replace Internet routing with new, from-scratch implementations of routing functionality that run at the application layer.
    - This simply moves the problem from the network layer to the application layer, where third parties have control.
- This paper explores a new design point that strikes a better balance between the extensibility and robustness of a routing infrastructure.
  - Declarative routing: express routing protocols using a database query language.
4 Abstract (2)
- The idea is based on an observation
  - Recursive query languages studied in the deductive database literature are a natural fit for expressing routing protocols. Deductive database query languages focus on identifying recursive relationships among nodes of a graph, and are well suited for expressing paths among nodes in a network.
- Basic mechanism
  - A routing protocol is implemented by writing a simple query in a declarative query language such as Datalog, which is then executed in a distributed fashion at some or all of the nodes.
- Future applications
  - Individual end users could explicitly request routes with particular properties by submitting route construction queries to the network.
  - An administrator at an ISP might reconfigure the ISP's routers by issuing a query to the network.
  - The simplicity and safety of declarative routing offer benefits over the current, relatively fragile approaches to upgrading routers.
6 System Model (1)
- We model the routing infrastructure as a directed graph, where each link is associated with a set of parameters (e.g., loss rate, available bandwidth, delay).
- The nodes can be either IP routers or overlay nodes.
- Fully distributed implementation
  - Like traditional routers, the nodes maintain links to their neighbors, compute routes, and set up the forwarding state.
  - However, instead of running a traditional routing protocol, each node runs a general-purpose query processor.
7 System Model (2)
8 System Model (3)
- The query processor can read the neighbor table and install entries into the forwarding table.
  - This simple interface is the only interaction between the query processor and the core forwarding logic.
- Declarative query
  - Both routing protocols and route requests can be expressed as queries.
  - The results of the query are used to establish router forwarding state.
  - Base tuples: the local information that the node reads.
    - E.g., the link tuple
    - link(source, destination, ...), a copy of an entry in the neighbor table.
  - Derived tuples: the intermediate data generated during query execution.
    - E.g., the path tuple
    - path(source, destination, pathVector, cost)
- Lifetime of a query
  - Each query is accompanied by a specification of its lifetime.
  - During this period, neighbor table updates trigger the re-computation of some of the existing derived and result tuples.
10 The Basics (1)
- A Datalog program (a query) consists of a set of declarative rules.
- Rule form
  - <head> :- <body>
  - Variable names begin with an upper-case letter.
  - Function symbols, predicates, and constants begin with a lower-case letter.
- Example: Network-Reachability
  - NR1: path(S, D, P, C) :- link(S, D, C), P = f_concatPath(link(S, D, C), nil).
  - NR2: path(S, D, P, C) :- link(S, Z, C1), path(Z, D, P2, C2), C = C1 + C2, P = f_concatPath(link(S, Z, C1), P2).
  - Query: path(S, D, P, C)
  - Since both S and D are unbound variables, this query computes the full transitive closure, consisting of the paths between all pairs of reachable nodes.
- Pitfall
  - The above query will not terminate, because it generates path tuples with cycles.
  - Solution: add an extra predicate f_inPath(P2, S) = false to rule NR2.
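A minimal Python sketch of how NR1 and NR2 could be evaluated to a fixpoint on one machine, including the f_inPath check that makes the query terminate. The three-node topology, the function name reachability, and the representation of P as a tuple of node names (instead of a list of link tuples built by f_concatPath) are assumptions for illustration. The sketch does not model distributed execution.

```python
# Hypothetical topology: link(S, D, C) tuples forming a directed cycle.
links = {("a", "b", 1), ("b", "c", 2), ("c", "a", 4)}

def reachability(links):
    """Naive fixpoint evaluation of NR1/NR2 with the f_inPath cycle check."""
    # NR1: every link is a one-hop path; P is the vector of nodes visited.
    paths = {(s, d, (s, d), c) for (s, d, c) in links}
    while True:
        new = set()
        for (s, z, c1) in links:
            for (z2, d, p2, c2) in paths:
                # NR2: join link and path on Z, require f_inPath(P2, S) = false.
                if z == z2 and s not in p2:
                    new.add((s, d, (s,) + p2, c1 + c2))
        if new <= paths:  # fixpoint reached: no new path tuples were derived
            return paths
        paths |= new

paths = reachability(links)
for s, d, p, c in sorted(paths):
    print(s, d, "->".join(p), c)
```

Without the `s not in p2` test, the cycle a, b, c would keep producing longer paths with larger costs, so the loop would never reach a fixpoint.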
11 The Basics (2)
- Query Plan Generation
  - Before executing a query, we need to generate a query plan.
  - A query plan is a dataflow diagram consisting of relational operators connected by arrows that indicate the flow of tuples (Figure 2).
12 Query Plan
13 The Basics (3)
- Query Plan Distributed Execution
  - Upon receipt of the Datalog query, each node creates the query plan (Figure 3; ignore the initial query dissemination at first).
- Initial Query Dissemination
  - Flood
  - Piggy-back
    - Embed the query into the first data tuple sent to each neighboring node.
    - Optimization: nodes not involved in the query computation will not receive the query.
14 Query Plan Distributed Execution (1)
Each iteration represents the traversal of a cloud in Figure 2.
15 Query Plan Distributed Execution (2)
- After receiving a query, a node needs up to k iterations to converge to a steady state, where k is the diameter of the network.
- The total time taken for a query to converge is proportional to 2k.
  - The initial query takes up to k iterations to reach the farthest node from the query node.
16 The Basics (4)
- Distance Vector Protocol Expression
  - DV1: path(S, D, D, C) :- link(S, D, C).
  - DV2: path(S, D, Z, C) :- link(S, Z, C1), path(Z, D, W, C2), C = C1 + C2.
  - DV3: shortestCost(S, D, min<C>) :- path(S, D, Z, C).
  - DV4: nextHop(S, D, Z, C) :- path(S, D, Z, C), shortestCost(S, D, C).
  - Query: nextHop(S, D, Z, C)
- Count-to-Infinity problem
  - Addressed using split horizon, a method of preventing routing loops in a network.
  - The basic principle is simple: information about the routing for a particular packet is never sent back in the direction from which it was received.
- Modification
  - include(DV1, DV3, DV4)
  - DV2: path(S, D, Z, C) :- link(S, Z, C1), path(Z, D, W, C2), C = C1 + C2, W != S.
  - DV5: path(S, D, Z, ∞) :- link(S, Z, C1), path(Z, D, S, C2).
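The distance vector rules can be approximated by synchronous rounds in which every node combines its links with the best routes its neighbors currently hold. The sketch below is an illustration under assumptions, not the query processor's execution strategy: the function name distance_vector, the dictionary layout, the fixed number of rounds, and the exclusion of routes from a node to itself are all hypothetical. Routes whose next hop W equals S are skipped, which has the same effect on the minimum as DV5 assigning them cost ∞.

```python
INF = float("inf")

def distance_vector(links, rounds):
    """links maps (S, Z) -> cost; returns (S, D) -> (next hop Z, cost C)."""
    # DV1: a direct link is a path whose next hop is the destination itself.
    next_hop = {(s, d): (d, c) for (s, d), c in links.items()}
    for _ in range(rounds):
        best = dict(next_hop)
        for (s, z), c1 in links.items():
            for (z2, d), (w, c2) in next_hop.items():
                # DV2: join on Z. The test w == s is the split-horizon
                # condition W != S; d == s drops routes back to the source.
                if z2 != z or d == s or w == s:
                    continue
                # DV3/DV4: keep only the minimum-cost next hop per (S, D).
                if c1 + c2 < best.get((s, d), (None, INF))[1]:
                    best[(s, d)] = (z, c1 + c2)
        next_hop = best
    return next_hop

# Hypothetical bidirectional triangle with an expensive a-c link.
links = {("a", "b"): 1, ("b", "a"): 1, ("b", "c"): 1,
         ("c", "b"): 1, ("a", "c"): 5, ("c", "a"): 5}
routes = distance_vector(links, rounds=2)
print(routes[("a", "c")])
```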
18 Challenges
- Four challenges must be addressed to establish the feasibility of declarative routing
- Expressiveness
  - How can various routing policies be expressed? Are there any limitations?
- Security
  - Is it safe to execute queries issued by untrusted third parties?
- Efficiency
  - How can query plan generation be adapted or extended so that queries perform well in a large network?
  - How can the redundant work performed by concurrently issued routing queries be reduced?
- Stability and Robustness
  - Since the network is dynamic, how can the robustness and accuracy of long-term routes be maintained efficiently?
20 Expressiveness
- The main goals here are
  - To illustrate the natural connection between recursive queries and network routing.
  - To highlight the flexibility, ease of programming, and ease of reuse afforded by a query language.
- Routing protocols to be expressed
  - Best-Path Routing
  - Policy-Based Routing
  - Dynamic Source Routing
  - Link State
21 Expressiveness: Best-Path Routing
- Starts from the base rules of the earlier Network-Reachability example
  - NR1: path(S, D, P, C) :- link(S, D, C), P = f_concatPath(link(S, D, C), nil).
  - NR2: path(S, D, P, C) :- link(S, Z, C1), path(Z, D, P2, C2), C = f_compute(C1, C2), P = f_concatPath(link(S, Z, C1), P2).
  - BPR1: bestPathCost(S, D, AGG<C>) :- path(S, D, P, C).
  - BPR2: bestPath(S, D, P, C) :- path(S, D, P, C), bestPathCost(S, D, C).
  - Query: bestPath(S, D, P, C)
- If the best path is the shortest path, replace f_compute with f_sum and AGG with min.
  - Then add an extra condition f_inPath(P2, S) = false to rule NR2 to avoid cycles in paths.
- Additionally, QoS requirements can be specified with extra constraints.
  - Add an extra constraint C < k to rules NR1 and NR2 to restrict the set of paths to those with costs below a loss or latency threshold k.
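The Best-Path rules leave f_compute and AGG open, which can be mirrored by passing them as functions. The following is a sketch under assumptions: best_paths, its parameters, and the tuple-of-nodes path vector are hypothetical names and representations, evaluation is centralized, and AGG is modeled as a two-argument function such as Python's built-in min. The optional max_cost plays the role of the threshold k in the constraint C < k.

```python
def best_paths(links, f_compute, agg, max_cost=None):
    """Evaluate NR1/NR2 (with cycle check), then BPR1 and BPR2."""
    ok = lambda c: max_cost is None or c < max_cost  # QoS constraint C < k
    # NR1: one-hop paths that satisfy the constraint.
    paths = {(s, d, (s, d), c) for (s, d, c) in links if ok(c)}
    frontier = set(paths)
    while frontier:
        # NR2: extend only the newly derived paths by one link at the front.
        new = {(s, d, (s,) + p2, f_compute(c1, c2))
               for (s, z, c1) in links
               for (z2, d, p2, c2) in frontier
               if z == z2 and s not in p2}
        frontier = {t for t in new if ok(t[3])} - paths
        paths |= frontier
    # BPR1: bestPathCost(S, D, AGG<C>).
    best_cost = {}
    for (s, d, _p, c) in paths:
        key = (s, d)
        best_cost[key] = c if key not in best_cost else agg(best_cost[key], c)
    # BPR2: keep the paths whose cost equals the aggregate.
    return {(s, d, p, c) for (s, d, p, c) in paths if best_cost[(s, d)] == c}

# Shortest path: f_compute is a sum and AGG is min (hypothetical topology).
links = {("a", "b", 1), ("b", "c", 1), ("a", "c", 5)}
shortest = best_paths(links, lambda c1, c2: c1 + c2, min)
print(sorted(shortest))
```

Passing `max_cost=2` on the same topology removes both routes from a to c, since neither has a cost strictly below 2.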
22 Expressiveness: Policy-Based Routing
- Restricts the scope of routing by precluding paths that involve undesirable nodes.
  - include(NR1, NR2)
  - PBR1: permitPath(S, D, P, C) :- path(S, D, P, C), excludeNode(S, W), f_inPath(P, W) = false.
  - Query: permitPath(S, D, P, C).
- An additional table, excludeNode, is introduced.
  - excludeNode(S, W) represents that node S does not carry any traffic for node W. This table is stored at each node S.
- We can generate bestPath tuples that meet the above policy by adding BPR1 and BPR2.
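As a rough illustration of the policy filter, the sketch below applies an excludeNode table to already computed path tuples. It reads the policy as "drop any path of S that contains a node W listed for S", which is an interpretation and not a literal evaluation of the join in PBR1. The function name permit_paths and the dictionary form of excludeNode are assumptions.

```python
def permit_paths(paths, exclude_node):
    """paths: iterable of (S, D, P, C); exclude_node: dict S -> set of W."""
    return {(s, d, p, c) for (s, d, p, c) in paths
            # f_inPath(P, W) = false for every W that S excludes.
            if not (set(p) & exclude_node.get(s, set()))}

# Hypothetical path tuples from a to c, and a policy at a that excludes b.
paths = {("a", "c", ("a", "b", "c"), 2), ("a", "c", ("a", "c"), 5)}
permitted = permit_paths(paths, {"a": {"b"}})
print(permitted)
```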
23 Expressiveness: Dynamic Source Routing
- Flipping the order of path and link in the body of rule NR2, so that it uses left recursion, gives DSR.
  - include(NR1)
  - DSR1: path(S, D, P, C) :- path(S, Z, P1, C1), link(Z, D, C2), P = f_concatPath(P1, link(Z, D, C2)), C = C1 + C2.
  - Query: path(S, D, P, C).
- We can also generate bestPath tuples by adding BPR1 and BPR2.
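With left recursion, a path grows at its destination end, so discovery spreads outward from the source. A small sketch under assumptions: dsr_paths and its source parameter are hypothetical, the path vector is a tuple of node names, and the test `d not in p1` is a cycle check added here so that the loop terminates; it is not part of DSR1 as written.

```python
def dsr_paths(links, source):
    """Evaluate NR1 and DSR1 for paths that start at one source node."""
    # NR1, restricted to links leaving the source.
    paths = {(s, d, (s, d), c) for (s, d, c) in links if s == source}
    frontier = set(paths)
    while frontier:
        # DSR1: append link(Z, D, C2) to the end of an existing path(S, Z, ...).
        frontier = {(s, d, p1 + (d,), c1 + c2)
                    for (s, z, p1, c1) in frontier
                    for (z2, d, c2) in links
                    if z == z2 and d not in p1} - paths
        paths |= frontier
    return paths

links = {("a", "b", 1), ("b", "c", 2), ("c", "a", 4)}  # hypothetical topology
from_a = dsr_paths(links, "a")
print(sorted(from_a))
```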
24 Expressiveness: Link State
- Flood the links to all nodes in the network.
  - LS1: floodLink(S, S, D, C, S) :- link(S, D, C).
  - LS2: floodLink(M, S, D, C, N) :- link(N, M, C1), floodLink(N, S, D, C, W), M != W.
  - Query: floodLink(M, S, D, C, N)
- Once all the links are available at each node, a local version of the Best-Path query is executed at that node using the floodLink tuples.
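The flooding rules can be simulated centrally to check that every node ends up with the full link table. In this sketch, flood_links and known_links are hypothetical helper names, and a floodLink tuple (M, S, D, C, N) is read as "node M holds link(S, D, C), received from N", which is an interpretation of the rule heads.

```python
def flood_links(links):
    """Evaluate LS1/LS2 to a fixpoint over link(S, D, C) tuples."""
    # LS1: each node starts with its own outgoing links.
    flood = {(s, s, d, c, s) for (s, d, c) in links}
    frontier = set(flood)
    while frontier:
        # LS2: N forwards a link to neighbor M unless it came from M (M != W).
        frontier = {(m, s, d, c, n)
                    for (n, m, _c1) in links
                    for (n2, s, d, c, w) in frontier
                    if n == n2 and m != w} - flood
        flood |= frontier
    return flood

def known_links(flood, node):
    """Links available at one node once flooding has finished."""
    return {(s, d, c) for (m, s, d, c, _n) in flood if m == node}

# Hypothetical bidirectional line a - b - c.
links = {("a", "b", 1), ("b", "a", 1), ("b", "c", 1), ("c", "b", 1)}
flood = flood_links(links)
print(sorted(known_links(flood, "a")))
```

Each node could then run a local shortest-path computation over `known_links(flood, node)`.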
26 Security
- Security is a key concern with any extensible system.
- Bound on resource consumption
  - Queries written in Datalog have polynomial time and space complexity in the size of the input.
- Termination of an augmented Datalog query
  - With the addition of arbitrary functions, the time complexity of a Datalog program is no longer polynomial.
  - Several powerful static tests for termination exist [18].
- Side-effect-free language
  - A query takes a set of stored tables as input and produces a set of derived tables.
  - The execution is sandboxed within the query engine.
- As a result, Datalog eliminates many of the risks usually associated with extensible systems.
- Many other security issues remain (but they are orthogonal to network extensibility and are not addressed here)
  - Denial-of-service attacks
  - Compromised routers
28 Optimization (1)
- Pruning Unnecessary Paths
  - The inefficiency
    - Queries with aggregates start by enumerating all possible paths.
  - Solution
    - Use a query optimization technique, aggregate selections [25, 22].
  - E.g., Figure 3
    - By maintaining a min-so-far aggregate value for the current shortest path cost from node a to its destination nodes, we can avoid sending path tuples to neighbors when we know they cannot be part of the shortest path.
  - In general, aggregate selections are useful when the AGG function can be used to prune communication.
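The pruning idea can be sketched as follows: a path tuple is passed on only if it improves the min-so-far cost for its (S, D) pair, so paths that cannot be shortest are never enumerated. This is an illustration under assumptions and not the optimizer described in the cited work: shortest_costs_pruned is a hypothetical name, evaluation is centralized, costs are assumed positive, and only costs (not path vectors) are tracked.

```python
INF = float("inf")

def shortest_costs_pruned(links):
    """Shortest cost per (S, D) using a min-so-far value to prune tuples."""
    best = {}       # min-so-far aggregate per (S, D)
    frontier = []
    for (s, d, c) in links:
        if c < best.get((s, d), INF):
            best[(s, d)] = c
            frontier.append((s, d, c))
    while frontier:
        nxt = []
        for (z, d, c2) in frontier:
            if best[(z, d)] != c2:
                continue  # superseded by a cheaper tuple: do not send it on
            for (s, z2, c1) in links:
                if z2 != z or s == d:
                    continue
                c = c1 + c2
                # Aggregate selection: keep the tuple only if it improves
                # the current minimum for (S, D).
                if c < best.get((s, d), INF):
                    best[(s, d)] = c
                    nxt.append((s, d, c))
        frontier = nxt
    return best

# Hypothetical topology with a cycle and an expensive direct a-c link.
links = {("a", "b", 1), ("b", "c", 1), ("a", "c", 5), ("c", "a", 1)}
best = shortest_costs_pruned(links)
print(best[("a", "c")])
```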
29 Optimization (2)
- Subsets of Sources and Destinations (2 techniques)
- Magic Sets Rewrite
  - Limits query computation to the relevant portion (nodes) of the network.
  - E.g., if nodes b and c are the only nodes issuing the path query, then after the rewrite only nodes reachable from b and c participate in the query.
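The effect of the rewrite on participation can be illustrated by computing which nodes are reachable from the querying nodes. This sketch only derives that node set; it does not rewrite any rules, and the name magic_nodes and the example topology are assumptions.

```python
def magic_nodes(links, sources):
    """Nodes reachable from the query sources over link(S, D, C) tuples."""
    magic = set(sources)
    frontier = set(sources)
    while frontier:
        # A node joins the set if some node already in it has a link to it.
        frontier = {d for (s, d, _c) in links if s in frontier} - magic
        magic |= frontier
    return magic

# Hypothetical topology: e -> a -> b -> c -> d.
links = {("e", "a", 1), ("a", "b", 1), ("b", "c", 1), ("c", "d", 1)}
participants = magic_nodes(links, {"b", "c"})
print(sorted(participants))
```

Here nodes a and e can reach b and c but are not reachable from them, so they take no part in the query.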
30 Optimization (3)
- Left-Right Recursion Rewrite
  - Best-Path-Pair: generates best paths from magicSources nodes to magicDsts nodes (using left recursion)
31 Optimization (4)
- If all nodes are running the same query, use right recursion to directly utilize the path information sent by neighboring nodes.
- If only a small subset of nodes is issuing the same query, use left recursion to lower message overhead.
- Drawback of Best-Path-Pair
  - No sharing if magicSource(b) is added.
32 Optimization (5)
34 Stability and Robustness (1)
- Each query is accompanied by a lifetime. During this period, changes in the network might result in some re-computation.
- Continuous queries are used to re-compute results based on changes in the network.
- Each router is responsible for detecting changes to its local information and reporting these changes to its local query processor.
- E.g., consider the Network-Reachability query
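One way to picture a continuous query with a lifetime is an object that re-evaluates whenever the router reports a link change, until the lifetime expires. This is a sketch under assumptions: the class ContinuousQuery, its methods, and the injectable clock are hypothetical, and it re-runs the whole query on each update, which is a simplification and not an incremental re-computation of only the affected tuples.

```python
import time

class ContinuousQuery:
    """Re-evaluates a query on link updates until its lifetime expires."""

    def __init__(self, evaluate, links, lifetime_s, now=time.monotonic):
        self.evaluate = evaluate          # function: set of links -> result
        self.links = set(links)
        self.now = now                    # clock, injectable for testing
        self.expires = now() + lifetime_s
        self.result = evaluate(self.links)

    def on_link_update(self, added=(), removed=()):
        """Called by the router when its neighbor table changes."""
        if self.now() >= self.expires:
            return False                  # lifetime over: ignore the update
        self.links = (self.links - set(removed)) | set(added)
        self.result = self.evaluate(self.links)  # full re-evaluation
        return True

# Usage with a trivial query (one-hop reachability) and a fake clock.
clock = [0.0]
one_hop = lambda ls: {(s, d) for (s, d, _c) in ls}
q = ContinuousQuery(one_hop, {("a", "b", 1)}, lifetime_s=10,
                    now=lambda: clock[0])
applied = q.on_link_update(added=[("b", "c", 1)])
```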
35 Stability and Robustness (2)
37 Evaluation (Size of network)
38 Evaluation (All-Pairs Shortest Paths 1)
39 Evaluation (All-Pairs Shortest Paths 2)
- Two observations
  - The convergence latency of the Best-Path query is proportional to the network diameter, and the query converges in the same time as the path vector protocol.
  - Per-node communication overhead increases linearly with the number of nodes.
- Both observations are consistent with the scalability properties of the traditional distance vector and path vector protocols.
40 Evaluation (Source/Destination Queries 1)
41 Evaluation (Source/Destination Queries 2)
42 Evaluation (Mixed Query Workload)
44 Conclusion
- Observation
  - Finding a connection between two different domains can help address difficult problems in one by using techniques from the other, well-studied domain.