Geographic Hash Table - PowerPoint PPT Presentation

About This Presentation
Title:

Geographic Hash Table

Description:

Title: Slide 1 Author: mborkow Last modified by: Department of CSE, BUET Created Date: 3/1/2004 10:35:15 PM Document presentation format: On-screen Show – PowerPoint PPT presentation

Slides: 48
Provided by: mbo125

Transcript and Presenter's Notes

Title: Geographic Hash Table


1
Geographic Hash Table
  • S. Ratnasamy, B. Karp, S. Shenker, D. Estrin, R.
    Govindan, L. Yin and F. Yu

2
Overview
  • Data-centric storage
  • Data is stored inside the network; each name
    corresponds to a location in space
  • All data with the same name is stored at the
    same sensor network location
  • E.g., an elephant sighting
  • Why data-centric storage?
  • Energy efficiency
  • Robustness against mobility and node failures
  • Scalability

3
Keywords and Terminology
  • Observations
  • Low-level readings from sensors
  • E.g., detailed temperature readings
  • Events
  • Predefined constellations of low-level
    observations
  • E.g., temperature greater than 75 °F
  • Queries
  • Used to elicit information from the sensor network

4
Performance Metric: Total Usage / Hotspot Usage
  • Use communication as a cost function for energy
    consumption
  • Total usage
  • Total number of packets sent in the
    sensor network
  • Hotspot usage
  • The maximal number of packets sent by a
    particular sensor node
  • Costs used in the evaluation
  • Message flooding cost: O(n)
  • Point-to-point routing cost: O(√n)
  • n is the number of nodes
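
The two metrics on this slide can be illustrated with a minimal sketch. It assumes per-node packet counts are already available as a dictionary; the node names and counts are invented for illustration and do not come from the presentation.

```python
def usage_metrics(packets_sent):
    """Return (total usage, hotspot usage) for a sensor network.

    packets_sent maps a node id to the number of packets that node sent.
    """
    # Total usage: every packet sent anywhere in the network.
    total_usage = sum(packets_sent.values())
    # Hotspot usage: the load on the single busiest node.
    hotspot_usage = max(packets_sent.values(), default=0)
    return total_usage, hotspot_usage


# Hypothetical counts, for illustration only.
packets_sent = {"a": 12, "b": 3, "c": 40, "d": 7}
total, hotspot = usage_metrics(packets_sent)
print(total, hotspot)  # 62 40
```

Total usage tracks the energy spent by the network as a whole, while hotspot usage tracks how quickly the most heavily loaded node would drain its battery.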

5
Alternative Storage Schemes
  • External Storage (ES)
  • Events propagated and stored at an external
    location
  • Local Storage (LS)
  • Events stored locally at the detecting node
  • Queries are flooded to all nodes and the events
    are sent back
  • Data Centric Storage (DCS)
  • Data for an event stored within the sensor
    network
  • Queries are directed to the node that stores the
    data

6
External Storage (ES)
[Figure: an event is sent to external storage]
7
Local Storage (LS)
[Figure: events stay at the detecting nodes; queries are flooded to all the nodes]
8
Why do we need DCS?
  • Scalability
  • Robustness against node failures and node
    mobility
  • Energy efficiency

9
Design Criteria: Scalability, Robustness
  • Node failures
  • Topology changes
  • System scales to a large number of nodes
  • Energy constraints
  • Persistence
  • A (k,v) pair must remain available to queries,
    despite sensor node failures and changes in
    sensor network topology
  • Consistency
  • A query for k must be routed correctly to a node
    where (k,v) pairs are stored; if these nodes
    change, they should do so consistently
  • Scaling in database size
  • Topological generality: the system should work well
    on a large number of topologies

10
Assumptions in DCS
  • Large Scale networks whose approximate
    geographic boundaries are known
  • Nodes have short range communication and are
    within the radio range of several other nodes
  • Nodes know their own locations by GPS or some
    localization scheme
  • Communication to the outside world takes place by
    one or more access points

11
Data Centric Storage
  • Relevant Data are stored by name at nodes
    within the Sensor network
  • All data with the same general name will be
    stored at the same sensor-net node.
  • e.g. (elephant sightings)
  • Queries for data with a particular name are then
    sent directly to the node storing those named
    data

12
Data centric Storage
[Figure: an elephant sighting stored at a node inside the network. Source: lass.cs.umass.edu]
13
Geographic Hash Table
  • Events are named with keys and both the storage
    and the retrieval are performed using keys
  • GHT provides (key, value) based associative memory

14
Geographic Hash Table Operations
  • GHT supports two operations
  • Put(k,v): stores v (observed data) according
    to the key k
  • Get(k): retrieves whatever value is
    associated with key k
  • Hash function
  • Hashes the key into geographic
    coordinates
  • Put() and Get() operations on the same
    key k hash k to the same location
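
A minimal sketch of the idea on this slide: the key is hashed into coordinates, and Put() and Get() meet because they compute the same coordinates. The choice of SHA-256, the 100 x 100 area, and the in-memory `storage` dictionary are assumptions made for illustration; the slides do not say which hash function GHT uses, and in a real deployment the value would be routed to a sensor node rather than kept in a local table.

```python
import hashlib


def geo_hash(key, width, height):
    """Map a key to an (x, y) point inside a width-by-height area."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Turn two separate 8-byte slices of the digest into fractions
    # between 0 and 1, then scale them to the deployment area.
    x_frac = int.from_bytes(digest[:8], "big") / 2**64
    y_frac = int.from_bytes(digest[8:16], "big") / 2**64
    return (x_frac * width, y_frac * height)


# Toy associative memory: values are filed under the hashed location.
storage = {}


def put(key, value, width=100.0, height=100.0):
    storage.setdefault(geo_hash(key, width, height), []).append(value)


def get(key, width=100.0, height=100.0):
    return storage.get(geo_hash(key, width, height), [])


put("elephant", "sighting at 09:14")
put("elephant", "sighting at 11:02")
print(get("elephant"))
```

Because the hash is deterministic, any node can compute where "elephant" data lives without asking a directory service.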

15
Storing Data in GHT
[Figure: Put(elephant, data) is routed to location (12,24), because Hash(elephant) = (12,24). Source: lass.cs.umass.edu]
16
Retrieving data in GHT
[Figure: Get(elephant) is routed to location (12,24), because Hash(elephant) = (12,24)]
17
Geographic Hash Table
[Figure: Node A and Node B]
18
Algorithms Used By GHT
  • GHT uses GPSR (Greedy Perimeter Stateless
    Routing) for routing
  • Peer-to-peer lookup system
  • (Each data object is associated with a key, and
    each node in the system is responsible for
    storing a certain range of keys)

19
Algorithm (Contd)
  • GPSR: packets are marked with the position of
    their destination, and each node is aware of its
    own position
  • Greedy forwarding algorithm
  • Perimeter forwarding algorithm
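
The greedy half of GPSR can be sketched as follows. This is a simplified illustration, not GPSR itself: it assumes each node knows its neighbors' positions as (x, y) tuples, and it only signals when greedy forwarding gets stuck instead of implementing perimeter mode. The function name and return convention are hypothetical.

```python
import math


def greedy_next_hop(current, neighbors, destination):
    """Pick the neighbor closest to the destination.

    Returns None when no neighbor is closer to the destination than
    the current node, which is the point where a GPSR-style protocol
    would fall back to perimeter forwarding.
    """
    best = min(neighbors, key=lambda p: math.dist(p, destination), default=None)
    if best is None:
        return None
    if math.dist(best, destination) >= math.dist(current, destination):
        return None  # local maximum: greedy forwarding cannot make progress
    return best


# A neighbor at (2, 0) is closer to (5, 0) than the current node at (0, 0).
print(greedy_next_hop((0, 0), [(1, 1), (2, 0)], (5, 0)))
# No neighbor of (4, 0) is closer to (5, 0), so greedy forwarding is stuck.
print(greedy_next_hop((4, 0), [(0, 0), (4, 5)], (5, 0)))
```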

[Figure: two forwarding examples between nodes A and B]
20
GPSR Right-Hand Rule In Perimeter Forwarding
[Figure: right-hand rule example with nodes x, y, z and edges numbered 1 to 3]
21
Home Node and Home perimeter
  • Home node: the node geographically nearest to the
    destination coordinates of the packet
  • Serves as the rendezvous point for Get() and
    Put() operations on the same key
  • In GHT a packet is not addressed to a specific
    node but only to a specific location
  • Use GPSR to find the home node
  • Use only the perimeter mode of GPSR to find the
    home perimeter
  • Home perimeter: the perimeter that encloses the
    destination
  • Start from the home node, and use perimeter mode
    to make a cycle and return to the home node
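
The definition of the home node can be made concrete with a short sketch. It assumes a global view of all node positions, which no sensor node has in practice; in GHT the same node is discovered in a distributed way by GPSR. The node names and coordinates are invented for illustration.

```python
import math


def home_node(nodes, location):
    """Return the name of the node geographically nearest to location.

    nodes maps a node name to its (x, y) position.
    """
    return min(nodes, key=lambda name: math.dist(nodes[name], location))


# Hypothetical deployment; (12, 24) is the hashed location used in the
# elephant example earlier in the presentation.
nodes = {"A": (10, 22), "B": (30, 30), "C": (0, 0)}
print(home_node(nodes, (12, 24)))  # A
```

Since both Put() and Get() for a key are addressed to the same location, they end up at the same home node without either side knowing that node's identity in advance.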

22
Problems
  • Robustness could be affected
  • Nodes could move (which node is then the home node?)
  • Node failures can occur
  • New nodes can be deployed
  • Not scalable
  • Storage capacity of the home nodes
  • Bottleneck at home nodes

23
Solutions to the problems
  • Perimeter refresh protocol
  • Mostly addresses the robustness issue
  • Structured replication
  • Addresses the scalability issue
  • How to handle storage of many events

24
Perimeter refresh protocol
  • Replicates stored data for key k at nodes around
    the location to which k hashes
  • Stores a copy of the key-value pair at each node
    on the home perimeter
  • Each node on the perimeter is called a replica
    node
  • How do you ensure consistency and persistence?
  • A node becomes the home node if a packet for a
    particular key arrives at that node
  • The perimeter refresh protocol periodically
    sends out refresh packets
  • After a time period Th, the home node generates a
    refresh packet that contains the data for that key
  • The packet is forwarded on the home perimeter in
    the same way as Get() and Put()
  • The refresh packet will take a tour of the home
    perimeter regardless of changes in the network
    topology since the key's insertion
  • This property maintains the perimeter

25
Perimeter Refresh Protocol
  • How do you guard against node failures?
  • When a replica node receives a refresh packet it
    did not originate, it caches the data in the
    refresh and sets up a takeover timer Tt
  • The timer is reset each time a refresh from another
    node arrives
  • If the timer expires, the replica node initiates a
    refresh packet addressed to the key's hashed
    location
  • Note: that node does not itself determine a
    new home node. GHT routing causes the refresh
    to reach the new home node

26
Perimeter Refresh Protocol
[Figure: nodes A to F around location L. A is marked home; D and E are marked replica.]
Assume key k hashes to location L. A is closest
to L, so it becomes the home node.
27
Perimeter Refresh Protocol
[Figure: nodes B to F around location L after A has failed. F is marked home; B, C, D, and E are marked replica.]
Suppose node A dies.
28
Time Specifications
  • Refresh time (Th)
  • Takeover time (Tt)
  • Death time (Td)
  • General rule
  • Td > Th and Tt > Th
  • In GHT: Td = 3Th and Tt = 2Th
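
The timer relationships above can be sketched as a small bookkeeping class for a replica node. The class and method names are hypothetical. The takeover check follows the behavior described on the preceding slides; the slides name the death timer but do not define it, so the sketch assumes Td bounds how long a cached pair is kept without hearing any refresh.

```python
class ReplicaTimers:
    """Track refresh arrivals for the keys cached at one replica node."""

    def __init__(self, th):
        self.th = th          # refresh interval Th
        self.tt = 2 * th      # takeover timeout, Tt = 2Th
        self.td = 3 * th      # death timeout, Td = 3Th (semantics assumed)
        self.last_refresh = {}  # key -> time the last refresh was heard

    def on_refresh(self, key, now):
        # A refresh from another node resets both timers for this key.
        self.last_refresh[key] = now

    def should_take_over(self, key, now):
        # True once Tt has passed with no refresh: the replica should
        # send its own refresh toward the key's hashed location.
        last = self.last_refresh.get(key)
        return last is not None and now - last >= self.tt

    def is_dead(self, key, now):
        # True once Td has passed with no refresh: the cached pair is
        # assumed stale and can be dropped.
        last = self.last_refresh.get(key)
        return last is not None and now - last >= self.td


timers = ReplicaTimers(th=10)  # 10 s, the refresh interval in the simulation setup
timers.on_refresh("elephant", now=0)
print(timers.should_take_over("elephant", now=15))  # False
print(timers.should_take_over("elephant", now=20))  # True
```

Choosing Tt and Td larger than Th means a single late refresh does not trigger a takeover or a deletion.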

29
Characteristics Of Refresh Packet
  • Refresh packet is addressed to the hashed
    location of the key
  • Every Th seconds the home node generates a
    refresh packet
  • The refresh packet contains the data stored for
    the key and is routed exactly as Get() and Put()
    operations
  • Refresh packet always travels along the home
    perimeter

30
Structured Replication
  • If too many events are detected, the home node
    becomes a communication hotspot
  • Structured replication is used to address the
    scaling problem
  • Hierarchical decomposition of the key space
  • Event names have a certain hierarchy depth

31
Structured Replication
32
Structured Replication
  • A node that detects a new event stores that
    event at its closest mirror
  • This is easily computable
  • This reduces the storage cost, but increases the
    query cost
  • GHT has to route the queries to all mirror nodes
  • Queries are routed recursively
  • First route the query to the root, then to the
    first-level and then to the second-level mirrors
  • Structured replication becomes more useful for
    frequently detected events
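
One way to see why the closest mirror is easy to compute is the sketch below. It assumes the hierarchical decomposition splits the area into a 2^d by 2^d grid for depth d, with one mirror per cell at the same offset the root has within its own cell. That grid layout is an assumption for illustration; the slides do not spell out the decomposition, and the function names are hypothetical.

```python
import math


def mirror_points(root, width, height, depth):
    """Return the root location and its mirrors for a given depth.

    Assumes a 2**depth by 2**depth grid over the area, with each mirror
    placed at the root's offset inside its own grid cell.
    """
    cells = 2 ** depth
    cell_w, cell_h = width / cells, height / cells
    off_x, off_y = root[0] % cell_w, root[1] % cell_h
    return [
        (i * cell_w + off_x, j * cell_h + off_y)
        for i in range(cells)
        for j in range(cells)
    ]


def closest_mirror(node_pos, root, width, height, depth):
    """Pick the mirror nearest to the detecting node."""
    return min(
        mirror_points(root, width, height, depth),
        key=lambda m: math.dist(m, node_pos),
    )


# A key hashed to (3, 3) in a 100 x 100 area, with depth 1.
print(mirror_points((3, 3), 100, 100, 1))
print(closest_mirror((80, 10), (3, 3), 100, 100, 1))
```

Under this layout a store travels only to a nearby mirror, while a query has to visit every point returned by `mirror_points`, which is the storage versus query trade-off the slide describes.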

33
Evaluation
  • Simulation to test if the protocol is functioning
    correctly
  • Done in the ns-2 network simulator using an IEEE
    802.11 MAC
  • This is a well-known event-driven simulator for
    ad hoc networks
  • Larger-scale simulations for the comparative
    study were done with a custom simulator

34
Comparative Study
  • Simulation compares the following schemes
  • External Storage (ES)
  • Local Storage (LS)
  • Normal DCS: a query returns a separate message
    for each detected event
  • Summarized DCS (S-DCS): a query returns a single
    message regardless of the number of detected
    events
  • Structured Replication DCS (SR-DCS): assumes an
    optimal level of SR
  • Comparison based on cost
  • Comparison based on total usage and hot-spot
    usage

35
Assumptions in comparison
  • Asymptotic costs of O(n) for floods and O(√n) for
    point-to-point routing
  • Event locations are distributed randomly
  • Event locations are not known in advance
  • No more than one query for each event type
  • (Q Queries in total)
  • Assume access points to be the most heavily used
    area of the sensor network

36
Comparison based on Hot-spot/Total Usage
  • n: number of nodes
  • T: number of event types
  • Q: number of event types queried for
  • Dtotal: total number of detected events
  • DQ: number of detected events for queries
37
DCS TYPES
  • Normal DCS: a query returns a separate message for
    each detected event
  • Summarized DCS: a query returns a single message
    regardless of the number of detected events
  • (A summary is usually preferred)

38
Comparison Study contd..
[Table: total and hot-spot usage for ES, LS, and DCS; the cell values did not survive in the transcript]
39
Observations from the Comparison
  • DCS is preferable only in cases where
  • The sensor network is large
  • There are many detected events and not all event
    types are queried
  • Dtotal >> max(DQ, Q)

40
Simulations
  • To check the Robustness of GHT
  • To compare the Storage methods in terms of total
    and hot spot usage

41
Simulation Setup
  • Simulator: ns-2
  • Node density: 1 node / 256 m²
  • Radio range: 40 m
  • Number of nodes: 50, 100, 150, 200
  • Mobility rate: 0, 0.1, 1 m/s
  • Query generation rate: 2 queries per second
  • Event types: 20
  • Events detected: 10 per type
  • Refresh interval: 10 s

42
Performance metrics
  • Availability of stored data to queriers
  • (In terms of success rate)
  • Load placed on the nodes participating in GHT
    (hotspot usage)

43
Simulation Results for Robustness
  • GHT offers perfect availability of stored events
    in the static case
  • It offers high availability when nodes are
    subject to mobility and failures

44
Simulation Results under varying Q
Number of nodes is constant: n = 10000
45
Simulation Results under varying n
Number of queries: Q = 50
46
Simulation Results for comparison of the three storage methods
  • S-DCS has low hot-spot usage under varying Q
  • S-DCS has the lowest hot-spot usage under
    varying n

47
Conclusion
  • Data centric storage entails naming of data and
    storing data at nodes within the sensor network
  • GHT hashes keys (event names) into geographic
    coordinates and stores each key-value pair at the
    sensor node geographically nearest to the hash
  • GHT uses the perimeter refresh protocol and
    structured replication to enhance robustness and
    scalability
  • DCS is useful in large sensor networks where there
    are many detected events but not all event types
    are queried