Application of AI and ML Techniques to Fault-Tolerant Routing

1
Application of AI- and ML-Techniques to
Fault-Tolerant Routing
Arjun Rao, CS 717, November 16 and 18, 2004
2
Papers Covered
  • [1] Loh, Peter K.K., "Artificial Intelligence
    Search Techniques as Fault-Tolerant Routing
    Strategies"
  • [2] Loh, Shaw, "A Genetic-Based Fault-Tolerant
    Routing Strategy for Multiprocessor Networks"

3
Papers Covered (cont.)
  • [3] Loh, Schröder, Hsu, "Fault-Tolerant Routing
    on Complete Josephus Cubes" (not AI-related but
    interesting nevertheless)
  • If time permits, also:
  • [4] Bradley, Tyrrell, "Immunotronics: Hardware
    Fault Tolerance Inspired by the Immune System"

4
The Problem of Routing
  • Communication between nodes
  • Servers
  • Microprocessors
  • Desire shortest, most efficient paths
  • Multiprocessor network topologies, e.g.
    hypercubes, Josephus cubes, etc.
  • Desire availability of paths
  • What to do when links/nodes fail?
  • How to remain (close to) optimal?

5
Intro to Fault-Tolerant Routing
  • Current algorithms adaptive but non-minimal
  • Misrouting
  • Routing strategies tied to specific topologies
  • k-ary n-cubes, meshes, etc.: regular structures
    and symmetry
  • Constrained by fault number and types
  • More general strategies vulnerable to deadlock
    and livelock

6
Turn Model (Glass, Ni)
  • Widest application scope
  • k-ary n-cubes, nD-meshes, torus geometries, etc.
  • West-First algorithm (on 2D-mesh)
  • Messages prevented from turning west again
  • Prevents cycles → prevents deadlocks
  • Routing along virtual channels in strictly
    decreasing or increasing order
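
The West-First rule above can be sketched for the minimal (shortest-path) case. This is an illustration only: it assumes nodes are (x, y) mesh coordinates with west as decreasing x and south as decreasing y, and the function name and direction labels are hypothetical rather than taken from the Glass and Ni paper.

```python
def west_first_directions(cur, dst):
    """Return the directions a packet may take next under West-First routing.

    Assumption: nodes are (x, y) coordinates on a 2D mesh, west = decreasing x,
    south = decreasing y. All westward hops are taken first; after that the
    packet chooses adaptively among east, north, and south and never turns
    west again.
    """
    (cx, cy), (dx, dy) = cur, dst
    if dx < cx:
        # Destination lies to the west: no choice, finish the west hops first.
        return ["W"]
    options = []
    if dx > cx:
        options.append("E")
    if dy > cy:
        options.append("N")
    if dy < cy:
        options.append("S")
    return options  # an empty list means cur == dst
```

Because a packet that has left the westward phase can never re-enter it, the turns that would close a cycle are excluded.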

7
Turn Model and Channel Numbering
8
Turn Model (cont.)
  • Three examples of routing
  • F = FAILURE
  • Full adaptation w/o deadlock and livelock
    requires more global info → more overhead
9
AI Search Techniques
  • Arbitrary topology → Search space
  • Search space → Search tree(s)
  • Adaptive but still non-minimal
  • Characteristic recursion impractical on
    loosely-coupled, distributed network

10
AI Logical Abstraction
  • Abstraction
  • S: Problem space
  • O: Set of objectives
  • P: Search paths
  • S = (O, P), where o_i ∈ O and p_j ∈ P; each p_j
    connects a tuple (o_k, o_l), k ≠ l
  • Abstraction used to model

11
Multiprocessor Network w/ Generic Topology
  • Network
  • N: Nodes
  • L: Links between nodes
  • G = (N, L), where n_i ∈ N and l_j ∈ L; each l_j
    connects a tuple (n_k, n_l), k ≠ l
  • Objective ↔ Node
  • Search path ↔ Link
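
A minimal sketch of the network abstraction G = (N, L) as a data structure. The class and method names are hypothetical; the only constraint carried over from the slide is that a link connects two distinct nodes (k ≠ l).

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """G = (N, L): a set of nodes and a set of undirected links."""
    nodes: set = field(default_factory=set)
    links: set = field(default_factory=set)  # each link is frozenset({n_k, n_l})

    def add_link(self, n_k, n_l):
        if n_k == n_l:
            raise ValueError("a link must connect two distinct nodes (k != l)")
        self.nodes.update((n_k, n_l))
        self.links.add(frozenset((n_k, n_l)))

    def neighbors(self, n):
        # Nodes reachable from n over a single link, in sorted order.
        return sorted(m for link in self.links if n in link for m in link if m != n)
```

Under the analogy, `neighbors(n)` plays the role of expanding an objective into the objectives reachable by one search path.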

12
Abstract Routing Model
  • Search attempt
  • Search(o_s, o_t): S × S → S′, where S = (O, P) and
    S′ = (O′, P′)
  • o_x, o_y ∈ O and o_x, o_y ∈ O′ ⇒ Successful search
  • o_x, o_y ∈ O and o_x ∈ O′, o_y ∉ O′ ⇒ Unsuccessful
    search
  • Routing attempt R
  • R(n_s, n_d): G × G → G′, where G = (N, L) and
    G′ = (N′, L′)
  • n_i, n_j ∈ N and n_i, n_j ∈ N′ ⇒ Complete route
  • n_i, n_j ∈ N and n_i ∈ N′, n_j ∉ N′ ⇒ Incomplete
    route

13
Routing Analogy
  • AI search equivalent to routing attempt
  • Successful search ↔ Route between source and
    destination nodes
  • Unsuccessful search ↔ Incomplete route to
    destination

14
Caveats of Analogy
  • No specific search algorithm → No routing
    strategy
  • No optimality constraints
  • Nothing about deadlocks/livelocks
  • Nothing about fault tolerance!!

15
Fault-Tolerant Routing Model
  • Model considers two aspects
  • Routing system configuration
  • Must be generic enough!
  • Message propagation protocols and policies
  • Following slides introduce what is needed for AI
    searches (w/ physical message backtracking)

16
FT Routing Model (cont.)
17
FT Routing Model (cont.)
  • Eager readership of input messages
  • Single input buffer to avoid polling
  • Multiple output buffers to accommodate different
    delivery rates
  • Router process
  • AI/FT routing strategy implemented here
  • Physical message backtracking → Increased message
    sizes
  • Increased message sizes/overhead → Requires
    communications router at each node

18
Communications Router
19
Communications Router (cont.)
  • Communication router constitutes router process
    and connections
  • Main components: LCM and CP
  • ROM: Stores link management and routing software
  • RAM: Stores routing table, link status table,
    associated link lists

20
CR Data Structure: Routing Table
21
CR Routing Table
  • For each node, up to n links
  • For each link
  • Connected: status OK and node ID of neighbor
  • Not connected: status NC and node ID -1
  • Link fault represented by timeout
  • Status reset to NC
  • Processor fault represented by timeouts in
    neighbors
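
The per-node table described above might be modelled as follows. This is a sketch under assumptions: the class name and methods are hypothetical, the sentinel -1 for an unconnected link is this sketch's reading of the slide, and clearing the neighbor ID on timeout is a choice made here, not something the slides state.

```python
OK, NC = "OK", "NC"
NO_NEIGHBOR = -1  # assumed sentinel ID for a link that is not connected


class LinkTable:
    """Routing-table rows for one node: a (status, neighbor ID) pair per link."""

    def __init__(self, n_links):
        self.entries = [(NC, NO_NEIGHBOR) for _ in range(n_links)]

    def connect(self, link, neighbor_id):
        self.entries[link] = (OK, neighbor_id)

    def on_timeout(self, link):
        # A link fault shows up as a timeout; the status is reset to NC.
        self.entries[link] = (NC, NO_NEIGHBOR)

    def usable_neighbors(self):
        return [nid for status, nid in self.entries if status == OK]
```

A processor fault needs no extra state in this model: each neighbor of the failed node sees a timeout on its own link and calls `on_timeout`.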

22
CR Data Structures: Link Status Table, Lists
23
Message Packets
  • Six fields
  • Router Control (4 bits): Type of message,
    including NORMAL and BACKTRACK
  • Destination Node ID (10 bits): Supports network
    of size up to 1024 nodes
  • Pending Nodes (20 bytes): Stack of node IDs that
    may receive packet but have not yet
  • Traversed Nodes (20 bytes): Stack of nodes
    traversed, with most recent on top

24
Message Packets (cont.)
  • Traversed Nodes Index (10 bits): Index to
    previous traversed nodes field. Supports
    simulation of physical message backtracking
  • Data Field (n-bit pointer): Points to
    information content of packet
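
The six header fields can be pictured as a small record. The sketch below is illustrative: the Python names are hypothetical, bit widths are not modelled apart from the 10-bit destination range, and the default of -1 for the index is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class RouterControl(Enum):
    NORMAL = 0
    BACKTRACK = 1


@dataclass
class Packet:
    destination: int                                  # Destination Node ID (10 bits)
    data: object = None                               # stands in for the data pointer
    router_control: RouterControl = RouterControl.NORMAL
    pending: list = field(default_factory=list)       # stack, top is the last item
    traversed: list = field(default_factory=list)     # stack, most recent on top
    traversed_index: int = -1                         # assumed: -1 means "none yet"

    def __post_init__(self):
        # A 10-bit ID addresses nodes 0..1023.
        if not 0 <= self.destination < 1024:
            raise ValueError("destination must fit in 10 bits")
```

Carrying both stacks in the header is what lets each router make its decision without recursion or shared state.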

25
(Finally) AI Search Strategies
  • Brute Force
  • Depth-First Search
  • Random Climbing
  • Heuristic
  • Hill Climbing
  • Best-First Search
  • A*

26
AI Search Strategies (cont.)
  • In presence of network faults
  • Prevent cycles → No deadlocks
  • Prevent more than two traversals of nodes/links →
    No livelocks and necessary for AI searches
  • Adaptations of search algorithms
  • Problems
  • Recursion? Nope (physical message backtracking, PMB)
  • Overhead? Fixed (Well, mostly)

27
Common Beginning
  • Extract header and disassemble it
  • IF Destination Node is reached, pass packet to
    host processor
  • ELSE
      • IF Router Control is BACKTRACK
          • IF top node of Pending Nodes is directly linked
              • Route packet to that node
              • Set Router Control to NORMAL
          • ELSE
              • Backtrack packet to previous node in
                Traversed Nodes
      • Pop current node ID from Pending Nodes
      • Push current node ID onto Traversed Nodes

28
Depth-First Search
  • Travel as far as possible
  • Do not consider alternative paths just yet
  • If fault or dead-end, backtrack to most recent
    possible path

29
DFS (cont.)
  • Following common beginning
  • Look for directly linked successor nodes
      • IF they are already traversed, ignore
      • ELSE IF they are in Pending Nodes, ignore
      • ELSE push them onto Pending Nodes
  • Read top node of Pending Nodes
      • IF directly linked (no fault), route packet to it
      • ELSE set BACKTRACK and route to last traversed
        node
  • END
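
The DFS steps above can be simulated centrally. This is a sketch, not the distributed router from the paper: it assumes the topology is a dict of neighbor lists, represents failed links as a set of `frozenset` pairs, and keeps the header stacks as local lists. All names are hypothetical.

```python
def dfs_route(adj, src, dst, faulty=frozenset()):
    """Simulate DFS routing with physical message backtracking.

    adj:    dict mapping each node to a list of its neighbors
    faulty: set of frozenset({a, b}) pairs marking failed links
    Returns every node the packet passes through, backtrack hops included,
    or None when the destination cannot be reached.
    """
    def linked(a, b):
        return b in adj[a] and frozenset((a, b)) not in faulty

    pending = []        # stack of candidate nodes not yet visited
    path = [src]        # current route; the top holds the packet
    visited = {src}
    hops = [src]
    while path[-1] != dst:
        cur = path[-1]
        # Push directly linked successors that are neither traversed nor pending.
        for nxt in adj[cur]:
            if linked(cur, nxt) and nxt not in visited and nxt not in pending:
                pending.append(nxt)
        if not pending:
            return None
        if linked(cur, pending[-1]):
            nxt = pending.pop()          # NORMAL: move forward
            visited.add(nxt)
            path.append(nxt)
        else:
            path.pop()                   # BACKTRACK: return to the previous node
            if not path:
                return None
        hops.append(path[-1])
    return hops
```

For example, with `adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}`, routing from 0 to 3 first explores the dead end at node 2 and then backtracks through 0.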

30
DFS Example
31
DFS Example (cont.)
32
Random Climbing
  • Following the common beginning
  • ELSE
  • Select a successor node randomly
  • Push unselected successor nodes onto Pending
    Nodes

33
Hill Climbing
  • Heuristic: Estimated remaining distance
  • Following common beginning
  • ELSE
  • Sort successor nodes according to est. remaining
    distance
  • Push sorted nodes onto Pending Nodes
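
The sort-then-push step might look like the sketch below. It assumes the top of the Pending Nodes stack is the end of a Python list, so successors are pushed farthest-first and the most promising one ends up on top. The function names and the Manhattan-distance heuristic are assumptions for illustration.

```python
def push_sorted_successors(pending, successors, dst, est_distance):
    """Push successors so the one with the smallest estimate is on top."""
    ordered = sorted(successors, key=lambda n: est_distance(n, dst), reverse=True)
    pending.extend(ordered)
    return pending


def manhattan(a, b):
    # Example heuristic for nodes given as (x, y) mesh coordinates.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])
```

Random Climbing differs only in how the next node is chosen: one successor is picked at random instead of by estimate.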

34
Best-First Search
  • Resumes partial routes not previously considered
  • Looks at immediate neighbors, neighbors of
    predecessors
  • Sorts by est. remaining distance
  • Leads to non-minimal routes!

35
Best-First Search (cont.)
  • ELSE
  • Push (directly linked successor nodes) onto
    Pending Nodes
  • Sort Pending Nodes according to est. remaining
    distance

36
A*
  • Two heuristics
  • Estimated remaining distance h
  • Path length traversed g
  • Partial paths sorted by f = g + h
  • When no faults, always finds minimal route

37
A* (cont.)
  • After current ID processing
  • Record path length traversed, g
  • ELSE
  • Calculate and store f for new successor nodes
  • Push them onto Pending Nodes sorted by f
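
For comparison, here is a compact centralized A* over a graph with failed links. It is a textbook formulation rather than the packet-carried version on the slides: the priority queue stands in for Pending Nodes sorted by f, every link is assumed to cost one hop, and all names are hypothetical.

```python
import heapq


def a_star(adj, src, dst, h, faulty=frozenset()):
    """Return a route from src to dst with the fewest hops, or None.

    adj:    dict mapping each node to a list of its neighbors
    h:      h(node, dst), the estimated remaining distance
    faulty: set of frozenset({a, b}) pairs marking failed links
    """
    frontier = [(h(src, dst), 0, src, [src])]   # entries are (f, g, node, path)
    best_g = {src: 0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == dst:
            return path
        for nxt in adj[node]:
            if frozenset((node, nxt)) in faulty:
                continue                        # skip failed links
            g2 = g + 1                          # path length traversed
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt, dst), g2, nxt, path + [nxt]))
    return None
```

With a heuristic that never overestimates, the first route popped at the destination is a shortest one among the links that remain usable.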

38
Performance Testing
  • Simulated 125-node multiprocessor network
  • Max 8 links per node (maps to many topologies)
  • Faulty links and processors
  • Pre-specified or dynamically generated
  • Testing
  • Messages between every pair of nodes
  • 20 trials at 0, 5, 10, 15, 20 faulty links
  • 125 × 125 × 20 × 6 = 1,875,000 tests (??)
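
The product on the slide can be checked directly. What the factor of 6 counts is not stated in the transcript, so the sketch below leaves it as an unnamed factor; the variable names are hypothetical.

```python
nodes = 125          # source and destination each range over all nodes
trials = 20          # trials per configuration
other_factor = 6     # taken from the slide as-is; its meaning is not given

total_tests = nodes * nodes * trials * other_factor
print(f"{total_tests:,}")
```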

39
Test Results
  • As faults increase, heuristic strategies fare
    better (esp. > 15)
  • A* is the best search technique but slow
  • Hill climbing and best-first search do not
    consider nodes traversed
  • Hill climbing considers only immediate neighbors

40
Test Results (cont.)
41
Main Point
  • Using AI search techniques, we abstract from
    routing in networks to searching in trees
    (topology-independent, quantity and type of
    faults irrelevant)

42
Next Paper
  • [1] Loh, Peter K.K., "Artificial Intelligence
    Search Techniques as Fault-Tolerant Routing
    Strategies"
  • [2] Loh, Shaw, "A Genetic-Based Fault-Tolerant
    Routing Strategy for Multiprocessor Networks"

43
Our Little Problem
  • AI search techniques are topology- and fault-type
    independent
  • ...but non-minimal routes are utilized
  • Follow-up work shows how genetic algorithms
    (combined with heuristics) can find minimal
    routes in presence of network faults

44
Genetic Algorithms Overview
  • Optimization strategy
  • Population of potential solutions evolves over a
    series of generations
  • Each element of the population is a chromosome;
    each unit of a chromosome is a gene
  • Chromosomes undergo crossover and mutation
  • Most fit chromosomes selected for next
    generation, based upon fitness function

45
Abstract Model
  • Same as before (including definitions of S and G)
  • Pure abstraction suffers from same caveats as
    before
  • Basic idea: Instead of AI search for adaptive
    route, optimize over population of routes to find
    best

46
Message Packets
  • Simplified version

47
Chromosome
  • Route ↔ Chromosome
  • Node on route ↔ Gene in chromosome
  • Length of route ↔ Size of chromosome
  • Chromosome size directly reflects routing
    performance!
  • Distance traversed basis of fitness

48
Population Creation
49
Mutation and Crossover
  • Mutation: Swap and/or shift
  • Normal crossover destroys routes, messes with
    source and destination; problem w/ different
    lengths
  • Use one-point random crossover
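
One way these operators could work on routes of different lengths is sketched below. The details are assumptions rather than the paper's definitions: swap mutation here exchanges two interior genes so the source and destination stay in place, and the crossover cuts both parents at a randomly chosen node they share, which is one common way to keep the children connected. It also assumes each node appears at most once per route.

```python
import random


def swap_mutation(route, rng=random):
    """Swap two interior genes, keeping source and destination fixed."""
    child = list(route)
    if len(child) < 4:
        return child                     # fewer than two interior genes
    i, j = rng.sample(range(1, len(child) - 1), 2)
    child[i], child[j] = child[j], child[i]
    return child


def one_point_crossover(a, b, rng=random):
    """Cut both parents at a shared interior node and exchange the tails."""
    common = [n for n in a[1:-1] if n in b[1:-1]]
    if not common:
        return list(a), list(b)          # no shared node: leave parents unchanged
    node = rng.choice(common)
    i, j = a.index(node), b.index(node)
    return a[:i] + b[j:], b[:j] + a[i:]
```

A swap can produce a route that uses a link the network does not have, so a real implementation would need to validate or repair mutated routes.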

50
Fitness Function
  • F = (D_max − D_route) / D_max + ε
  • D_max: Maximum distance between source and
    destination
  • D_route: Distance traveled by specific route
  • ε: Predefined value to ensure non-zero fitness
  • Higher value → More fit
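
A minimal sketch of the fitness computation, reading the slide's formula as F = (D_max − D_route) / D_max + ε. The default value of the constant is an arbitrary choice for illustration, and the parameter names are hypothetical.

```python
def fitness(d_route, d_max, epsilon=0.01):
    """Fitness of a route: shorter routes score higher.

    epsilon keeps the value non-zero when the route is as long as d_max.
    """
    return (d_max - d_route) / d_max + epsilon
```

For example, a route that uses the full maximum distance scores exactly `epsilon`, while a route half that long scores `0.5 + epsilon`.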

51
Selection Scheme
  • Roulette Wheel
  • Sum of fitness values × random value from [0, 1]
  • Select chromosomes with fitness greater than
    product
  • Tournament Selection
  • Most fit chromosomes selected
  • Stochastic Remainder
  • Probabilities used to select route
  • Which scheme has best performance selecting
    optimal route?
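
Roulette-wheel selection is sketched below in its standard cumulative-sum form, which may differ in detail from the variant used in the paper. The function name and the `rng` parameter are hypothetical; `rng` only needs a `random()` method returning a value in [0, 1).

```python
import random


def roulette_select(population, fitnesses, rng=random):
    """Pick one chromosome with probability proportional to its fitness."""
    threshold = rng.random() * sum(fitnesses)   # random value times sum of fitness
    running = 0.0
    for chromosome, fit in zip(population, fitnesses):
        running += fit
        if running > threshold:
            return chromosome
    return population[-1]                       # guard against round-off
```

Tournament selection would instead compare a few randomly drawn chromosomes and keep the fittest.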

52
Reroute
53
Genetic Hybrid Algorithm