Application Layer Multicast for Earthquake Early Warning Systems

1
Application Layer Multicast for Earthquake Early
Warning Systems
  • Valentina Bonsi - April 22, 2008

2
Agenda
  • Architecture of an Earthquake Early Warning
    System
  • Crisis Alert: early warning dissemination
  • Application Layer Multicast implementation
  • Enhancing reliability of ALM

3
Earthquake Early Warning System
  • Earthquake assessment center
  • determine magnitude and location of the event
  • predict the peak ground motion expected in the
    region around the event
  • Input: sensors (seismometers etc.); detection and
    assessment are the focus of the ElarmS system
    (www.elarms.com)
  • Output: warnings disseminated to schools,
    municipalities, transportation authorities,
    companies and the public via the Crisis Alert
    system
4
Crisis Alert current implementation
  • Notification can be triggered automatically when
    a message from the Earthquake assessment center
    is received
  • Dedicated engine to process early warning
    policies
  • No human interaction required
  • Dissemination modalities
  • CREW protocol
  • Application Layer Multicast (Scribe on Pastry,
    http://research.microsoft.com/~antr/scribe/)
  • Organization-based dissemination: no
    dissemination to the public

5
Application Layer Multicast: the Scribe
implementation
Layer stack: the Scribe application-level
multicast infrastructure runs on top of Pastry
(object location and routing for P2P systems),
which runs over the Internet.
6
Pastry
  • Pastry node
  • device connected to the Internet and running the
    Pastry node SW
  • Assigned a unique 128-bit id
  • Node state

Leaf set: |L| nodes, the |L|/2 numerically closest
larger NodeIds and the |L|/2 numerically closest
smaller NodeIds
Routing table: ⌈log_{2^b} N⌉ rows of 2^b − 1
entries each; entries in row i share the first i
digits with the current node's id, and each entry
is the closest existing node with the appropriate
prefix
Neighborhood set: |M| nodes that are physically
closest to the local node
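The node state above can be sketched as a small data structure. This is an illustrative model only (the `PastryState` class and its field names are hypothetical, not the real Pastry implementation); ids are strings of base-2^b digits, here with b = 4, i.e. hex:

```python
# Illustrative model of a Pastry node's state (hypothetical names).
from dataclasses import dataclass, field

@dataclass
class PastryState:
    node_id: str                                        # id as a base-2^b digit string
    b: int = 4                                          # digits are base 2^b (hex here)
    leaf_set: list = field(default_factory=list)        # |L| numerically closest ids
    neighborhood: list = field(default_factory=list)    # |M| physically closest nodes
    routing_table: list = field(init=False)             # one row per digit, 2^b - 1 entries

    def __post_init__(self):
        rows = len(self.node_id)                        # ~ ceil(log_{2^b} N) useful rows
        self.routing_table = [[None] * (2 ** self.b - 1) for _ in range(rows)]

    def shared_prefix_len(self, other_id: str) -> int:
        """How many leading digits other_id shares with this node's id."""
        n = 0
        for a, c in zip(self.node_id, other_id):
            if a != c:
                break
            n += 1
        return n

state = PastryState(node_id="65a1fc")
print(len(state.routing_table), len(state.routing_table[0]))  # 6 rows, 15 entries
print(state.shared_prefix_len("65b2aa"))                      # shares "65" -> 2
```

The shared-prefix length is what selects the routing-table row during routing on the next slide.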
7
Pastry routing
  • Algorithm
  • Check if the message key falls in the leaf-set
    range → forward the message to the numerically
    closest leaf (the destination)
  • Otherwise, find a node in the routing table that
    shares with the key one more digit than the
    current node does → forward the message to that
    node
  • Otherwise, find a node that shares with the key
    the same number of digits as the current node but
    is numerically closer to the key → forward the
    message to that node
  • It always converges
  • Number of routing steps
  • Expected: ⌈log_{2^b} N⌉
  • Worst case (many simultaneous node failures): N
  • Delivery guaranteed unless ⌊|L|/2⌋ nodes with
    consecutive ids fail simultaneously
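The three routing cases can be sketched as follows. The data model is a simplification assumed for illustration (`known_nodes` stands in for the routing table); ids are fixed-length lowercase hex strings, so lexicographic order matches numeric order:

```python
# Minimal sketch of Pastry next-hop selection (hypothetical helper names).

def prefix_len(a: str, b: str) -> int:
    """Number of leading digits a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(key: str, me: str, leaf_set: list, known_nodes: list) -> str:
    dist = lambda n: abs(int(n, 16) - int(key, 16))
    # Case 1: key falls in the leaf-set range -> numerically closest node
    candidates = leaf_set + [me]
    if min(candidates) <= key <= max(candidates):
        return min(candidates, key=dist)
    # Case 2: a known node shares one more digit with the key than we do
    p = prefix_len(key, me)
    longer = [n for n in known_nodes if prefix_len(key, n) > p]
    if longer:
        return min(longer, key=dist)
    # Case 3: same prefix length, but numerically closer to the key
    closer = [n for n in known_nodes
              if prefix_len(key, n) == p and dist(n) < dist(me)]
    return min(closer, key=dist) if closer else me   # converges: me is closest

print(next_hop("d467", "65a1", ["64f0", "6701"], ["d13d", "9a2b"]))  # d13d
```

In the example, the key shares no prefix with the local id, so case 2 fires and the message hops to the node whose id already matches the first digit of the key.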

8
Pastry self-organization and adaptation
  • Node arrival
  • New node X must know a nearby Pastry node A
  • X asks A to forward a JOIN message → forwarded to
    Z, the node with the id closest to X's
  • All nodes that have seen the JOIN send X their
    state tables
  • X initializes its state with the collected info
    (leaf set from Z, neighborhood set from A)
  • X notifies every node in its state tables
  • Node departure
  • Leaf node failure: detected during routing
  • Contact the live node in L with the largest
    index on the side of the failed node X (smaller
    or greater than itself)
  • Choose the new leaf among X.L
  • Neighbor node failure: detected during periodic
    contact
  • Contact the other members in M
  • Choose the closest neighbor to add
  • Routing table node failure: detected during
    routing
  • Contact other members in the same routing table
    row and ask for a node in the same position as
    the failed one
  • If necessary, continue the search among nodes in
    the next routing table row
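The leaf-set repair step might look like the following toy sketch (the data model is assumed: `leaf_sets_of` stands in for contacting the helper node and receiving its leaf set over the network):

```python
# Sketch of leaf-set repair: when leaf `failed` dies, ask the live leaf
# farthest out on the failed node's side for its leaf set, and pick a
# replacement from it. Ids are fixed-length hex strings.

def repair_leaf_set(my_id, leaf_set, failed, leaf_sets_of):
    # live leaves on the same side (smaller or greater) as the failed node
    side = [n for n in leaf_set
            if n != failed and (n > my_id) == (failed > my_id)]
    # the live node with the largest index on that side
    helper = max(side, key=lambda n: abs(int(n, 16) - int(my_id, 16)))
    # choose the new leaf among the helper's own leaf set
    candidates = [n for n in leaf_sets_of[helper]
                  if n not in leaf_set and n not in (my_id, failed)]
    replacement = min(candidates, key=lambda n: abs(int(n, 16) - int(my_id, 16)))
    return replacement, sorted([n for n in leaf_set if n != failed] + [replacement])

# node "80" loses leaf "a0" and asks "b0", its farthest live leaf on that side
repl, leaves = repair_leaf_set("80", ["60", "70", "a0", "b0"], "a0",
                               {"b0": ["90", "a8", "c0", "d0"]})
print(repl, leaves)  # 90 ['60', '70', '90', 'b0']
```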

9
Scribe
  • Event notification infrastructure for topic-based
    publish-subscribe applications
  • Scribe system: network of Pastry nodes running
    the Scribe SW
  • Topic creation
  • A node sends a CREATE message with the topic id
    as destination → Pastry delivers it to the node
    with the node id numerically closest to the topic
    id (the rendez-vous point)

10
Scribe membership management
  • Node subscription
  • Node sends a SUBSCRIBE message with key topicId
  • Pastry routes the message toward the rendez-vous
    point, but each node along the route
  • checks whether it is already a forwarder for the
    topic
  • if not, it adds the topic to its children table
    and sends its own SUBSCRIBE message for the same
    topic
  • adds the sender to the children list for the
    topic and stops the original message
  • Node unsubscription
  • Node unsubscribes locally and checks if there are
    children for the topic
  • If not, it sends an UNSUBSCRIBE message to its
    parent (possible only after having received at
    least one event from the parent)
  • Disseminating a message
  • If the publisher has the rendez-vous IP address,
    it sends a PUBLISH message to it directly
  • Otherwise it uses Pastry to locate it, by
    sending a PUBLISH message with topicId as key
  • The rendez-vous node disseminates the message
    along the tree
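How a node on the route handles an incoming SUBSCRIBE can be sketched like this (the `handle_subscribe` function and dict-based node model are hypothetical; `forward` stands in for Pastry routing toward the rendez-vous point):

```python
# Sketch of per-node SUBSCRIBE handling in Scribe (hypothetical data model).

def handle_subscribe(node, topic_id, sender, forward):
    children = node.setdefault("children", {})
    if topic_id in children:
        # already a forwarder: adopt the sender and absorb the message
        children[topic_id].add(sender)
    else:
        # not yet a forwarder: record the sender, then subscribe ourselves
        children[topic_id] = {sender}
        forward(topic_id, node["id"])    # our own SUBSCRIBE toward the root

sent = []
node = {"id": "B"}
handle_subscribe(node, "quakes", "A", lambda t, s: sent.append((t, s)))
handle_subscribe(node, "quakes", "C", lambda t, s: sent.append((t, s)))
print(sorted(node["children"]["quakes"]), sent)  # ['A', 'C'] [('quakes', 'B')]
```

Note that only the first subscription propagates toward the rendez-vous point; the second is absorbed locally, which is what keeps the tree shallow.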

11
Scribe Reliability
  • Scribe provides reliable, ordered delivery if
    the TCP connections between nodes in the tree do
    not fail
  • Repairing the multicast tree
  • Children periodically confirm their interest in
    the topic by sending a message to their parents
  • Parents send heartbeats to children; when a child
    does not receive heartbeats, it sends a SUBSCRIBE
    message with topicId as key. Pastry routes the
    message to the rendez-vous point, rebuilding the
    tree
  • Root failure
  • Root state is replicated in the k closest nodes
  • When children detect the failure, a new tree is
    built with one of these nodes as root

12
Scribe experimental evaluation
  • Setup
  • Simulator that models propagation delay but not
    queuing, packet losses or cross traffic
  • 100,000 nodes
  • 1,500 groups of different sizes (11 to 100,000)
  • Delay penalty (compared with IP multicast)
  • Max delay penalty: 1.69 for 50% of groups, 4.26
    max
  • Avg delay penalty: 1.68 for 50% of groups, 2 max
  • Node stress
  • Mean number of children tables per node: 2.4 (max
    40)
  • Mean number of entries per node: 6.2 (max 1059)
  • Link stress (compared with IP multicast)
  • Mean number of messages per link: Scribe 2.4, IP
    0.7
  • Max number of messages per link: Scribe 4031, IP
    950
  • Scalability with many small groups (50,000 nodes,
    30,000 groups of 11 nodes each)
  • Mean number of entries per node: 21.2; naïve
    multicast 6.6
  • Problem: many long paths with no branches are
    created → algorithm for collapsing the tree,
    removing nodes that are not members and have only
    one child per table

13
Reliability of ALM
  • In the ALM implementation we consider, nodes that
    subscribe to a topic are organized in a tree
  • If one of the nodes fails, the whole sub-tree
    rooted at that node will be unreachable until the
    tree is rebuilt
  • Impossible to guarantee that nodes will receive
    the early warning, since rebuilding the tree can
    take a few seconds → can be too late
  • Idea: build a graph instead of a tree, where the
    subscription of each node is maintained in k
    nodes
  • If we assume a single-failure model and k = 2, in
    case of a node failure the sub-tree rooted at
    that node will be notified by the other node that
    maintains those subscriptions while the graph is
    repaired

14
Membership management
  • Node subscription
  • Node sends one SUBSCRIBE message to the topic id
    and waits for k subscribe acks
  • The node that receives the SUBSCRIBE message
    performs the Scribe subscription and disseminates
    k−1 SUBSCRIBE messages to its replicas
  • The subscribing node can receive
  • SUBSCRIBE ACK → add the new parent
  • SUBSCRIBE FAILED → impossible to add another
    parent in the current situation (retry when a
    node joins or leaves)
  • SUBSCRIBE LOST → re-send the SUBSCRIBE message
  • Node unsubscription
  • Same as Scribe, but the UNSUBSCRIBE message is
    sent to all parents
  • Message dissemination
  • Same as Scribe, but each node maintains a cache of
    sent messages in order to avoid re-multicasting
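The subscriber-side handling of the three replies can be sketched as a small state machine (the message names come from the slide; the `handle_reply` function and dict-based state are a hypothetical model):

```python
# Sketch of subscriber-side reply handling: collect up to k parents,
# mark a retry on FAILED, retransmit on LOST.

K = 2   # number of nodes each subscription is maintained in

def handle_reply(state, reply, sender, resend):
    if reply == "SUBSCRIBE_ACK":
        state["parents"].add(sender)      # add the new parent
    elif reply == "SUBSCRIBE_FAILED":
        state["retry_later"] = True       # retry when a node joins or leaves
    elif reply == "SUBSCRIBE_LOST":
        resend()                          # re-send the SUBSCRIBE message
    state["done"] = len(state["parents"]) >= K

resent = []
state = {"parents": set(), "retry_later": False, "done": False}
handle_reply(state, "SUBSCRIBE_ACK", "n42", lambda: None)
handle_reply(state, "SUBSCRIBE_LOST", None, lambda: resent.append("SUBSCRIBE"))
handle_reply(state, "SUBSCRIBE_ACK", "n43", lambda: None)
print(state["done"], sorted(state["parents"]), resent)
# True ['n42', 'n43'] ['SUBSCRIBE']
```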

15
Multicast graph maintenance
  • Repairing the multicast graph
  • Children periodically check that one of their
    parents is alive and confirm their interest in
    the topic by sending a SUBSCRIPTION CONFIRM
    message
  • When a child detects a parent's failure, it sends
    a new SUBSCRIBE message
  • Root failure
  • Root state is replicated in the k closest nodes
  • If the root node fails, messages with key topic
    id are routed to one of the root replicas
    (because it now has the closest id to the topic)
  • Children of the root will detect the failure and
    send a new SUBSCRIBE message

16
Graph-based ALM issues
  • Dissemination paths contain loops → cache sent
    messages
  • The cache needs a unique id per message, e.g.
    hash(source, id)
  • How to choose replica nodes
  • Pastry routing delivers a message to the node
    with the id closest to the message destination →
    each subscription is maintained in the node that
    would have been the parent in the tree structure
    Scribe generates, and in the k−1 nodes with ids
    closest to it
  • The root is replicated in the same way → if it
    fails, one of the k−1 replicas will become the
    new root and will already have the list of
    subscribed nodes
  • The position of the root is dynamic: every time a
    new node is added, it can become the new
    rendez-vous point if its id is the closest to the
    topic id
  • Issues can arise in this case, because
    potentially only the old rendez-vous point will
    subscribe to the new one → if the old rendez-vous
    point fails, the dissemination stops
  • 2 possible solutions
  • Impose additional constraints on the fan-out of
    nodes (h children)
  • Fix the rendez-vous point to be a stable node
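The duplicate-suppression cache from the first bullet can be sketched as follows. The id scheme (hash of source plus a per-source sequence number) is an assumption for illustration; any scheme that is unique per message and identical on all copies would do:

```python
# Sketch of the sent-message cache that stops loops in the graph:
# a node re-multicasts only the first copy of a message it sees.
import hashlib

def message_id(source: str, seq: int) -> str:
    # hypothetical id scheme: hash of source and per-source sequence number
    return hashlib.sha1(f"{source}:{seq}".encode()).hexdigest()

class Node:
    def __init__(self):
        self.seen = set()        # cache of already-forwarded message ids
        self.delivered = []

    def receive(self, msg_id, payload):
        if msg_id in self.seen:
            return False          # duplicate arriving via another parent: drop
        self.seen.add(msg_id)
        self.delivered.append(payload)
        return True               # first copy: deliver and re-multicast

n = Node()
mid = message_id("root", 7)
print(n.receive(mid, "EEW alert"), n.receive(mid, "EEW alert"))  # True False
```

With k = 2 parents, each node normally receives every message twice; the cache ensures it is delivered and re-multicast exactly once.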

17
Testing plan
  • Reliability
  • Message dissemination delay in normal conditions
    and with failures
  • Failure models
  • Independent failures
  • Geographical failures
  • Scalability
  • The set of EEW receivers includes schools,
    transportation companies, cities
  • Test with an increasing number of receivers for
    each organization