Title: P2P Discovery of Computational Resources for Grid Applications
1. P2P Discovery of Computational Resources for Grid Applications
- Adeep S. Cheema (Microsoft)
- Moosa Muhammad (Motorola)
- Indranil Gupta (UIUC)
Dept. of Computer Science, University of Illinois at Urbana-Champaign (UIUC)
2. Grid vs. P2P: Complementary Areas
- Grid computing (great applications)
  - Dedicated H/W Grids
  - SETI@Home
  - Background Grids
  - Emphasis and hype so far on deployability, not on scale and reliability
- Peer-to-Peer (P2P) computing (great technologies)
  - Decentralized (no server), scalable (1000s of nodes), reliable (network loss, node failure)
  - Emphasis and hype so far on scale and reliability, not on (legal) deployability
- Foster et al. and Ledlie et al. have called for a convergence of these two areas
- Eliminate the hype, increase the user base
3. Setting: Background Grids
- Background Grid: a group of Grid clusters where jobs run in the background on each machine
- Scale Grid-job CPU utilization down/up according to the CPU free-time percentage and RAM free percentage
- Conservative approach
- Makes sense in undergraduate labs: better use of resources without disturbing users (CSIL Lab at CS.UIUC)
- Problem? Resource discovery
- Sample query: find me a machine with > 1.4 GHz Intel P4, > 40% CPU idle, > 512 MB RAM free, running Linux
- Do this in a Grid network with (tens of) 1000s of hosts
- With each host's CPU-free and RAM-free changing all the time
4. Requirements and Idea
- Need a solution that is:
  - Scalable: supports a large pool of diverse resources
  - Efficient and fully decentralized: low background bandwidth, quick query times, quick updates
  - Robust: supports frequent updates and frequent host failures, gives accurate replies to queries
- Basic idea: adapt P2P technologies to provide a substrate that Background Grid applications can build on top of
- But the P2P technology must be modified by adding expressive naming of resources
5. One Second... Any Off-the-Shelf Solutions?
- Existing P2P technologies
  - DHTs (Distributed Hash Tables): Pastry, Chord, CAN, Kelips
  - Support efficient resource discovery, but require unique names for resources
  - Also support range queries (find u2.mp3)
  - Problem: no expressive naming of resources
    - How do you name a resource that is "2.3 GHz P4, 30% CPU idle, 128 MB RAM free" for efficient querying, and for frequent and efficient updates (e.g., change CPU idle to 40%)?
- Existing Grid resource discovery
  - GRIP (Globus)
  - Matchmaking (Condor)
  - Ontology-based (Iamnitchi and Foster)
  - Problem: scalability and reliability not (really) addressed
6. Background Grids: Workload Traces
An aside
- Workstations in the CSIL (undergraduate) lab at CS/UIUC
- 6 candidate machines, monitored every minute, for a 3-week period; over 3000 machine-hours of traces
- Monitored parameters: CPU idle, RAM free, disk space free
7. Short-Term Behavior
- Bursty CPU utilization
- Interspersed with long periods of inactivity
- Low average load
8. Long-Term Behavior
- Temporal and diurnal patterns
- Maintenance-related patterns
9. Background Grids: Behavior Summary
- CPU utilization: bursty and dynamic, and low over the long term
- RAM free: varies a reasonable amount over time
- Disk space free: linear with a small negative slope and occasional small positive jumps
- Because RAM-free and disk-free vary over time, Background Grids make sense!
10. Resource Tuples
Back to our problem
- <ip, port, cpu-speed, tot-ram, cpu-idle, ram-free, disk-free>
- Insert, update, and query in a distributed manner
- Maintain the resource tuples among the hosts themselves (cooperative, no server, decentralized)
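As a concrete illustration of such a tuple and of the kind of query it must answer, here is a minimal sketch; the class and field names are hypothetical, not from the paper:

```python
from dataclasses import dataclass

@dataclass
class ResourceTuple:
    """One host's advertised resources:
    <ip, port, cpu-speed, tot-ram, cpu-idle, ram-free, disk-free>."""
    ip: str
    port: int
    cpu_speed_mhz: int    # static: fixed configuration
    tot_ram_mb: int       # static
    cpu_idle_pct: float   # dynamic: changes continuously
    ram_free_mb: int      # dynamic
    disk_free_mb: int     # dynamic

    def matches(self, min_cpu_speed=0, min_cpu_idle=0.0, min_ram_free=0):
        """Check this tuple against a sample query's thresholds."""
        return (self.cpu_speed_mhz >= min_cpu_speed
                and self.cpu_idle_pct >= min_cpu_idle
                and self.ram_free_mb >= min_ram_free)

# Sample query: > 1.4 GHz, > 40% CPU idle, > 512 MB RAM free
host = ResourceTuple("10.0.0.7", 4160, 1400, 1024, 45.0, 600, 20_000)
print(host.matches(min_cpu_speed=1400, min_cpu_idle=40.0, min_ram_free=512))  # True
```

The hard part, addressed next, is answering such a query without any central server holding all the tuples.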
11. Pastry: A Crash Course
- Pastry [Rowstron and Druschel 01]: a P2P resource discovery system
- Each host is mapped to a unique identifier (nodeID), derived by hashing the host's IP address (SHA-1 or MD5) → load balancing and scale
- All nodeIDs are located on a logical ring
- Each host knows its next few successors and predecessors on the ring
- Each host also knows a few other far-away hosts on the ring
- A message for a destination nodeID is routed through these neighbors; routing to any host takes O(log N) logical hops in a system with N hosts
- Each resource has a unique name (resourceID) in the nodeID space, derived by hashing the resource's name (SHA-1 or MD5) → load balancing and scale
- The host with the nodeID closest to the resourceID is responsible for the resource
- resourceIDs can be inserted, queried, and deleted
- Maintenance of neighbors at each host is autonomous, through a heartbeating protocol that handles arrival, departure, and failure of nodes
- Problem for our resource tuples <ip, port, cpu, ram, disk>: Pastry assumes each resource has a unique, static, one-dimensional name
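The hash-to-ring mapping and closest-node responsibility rule can be sketched as follows; this is a toy 32-bit ring (real Pastry uses 128-bit IDs and prefix routing), and the function names are ours, not Pastry's API:

```python
import hashlib

RING_BITS = 32          # toy ring size; real Pastry IDs are 128-bit
RING = 1 << RING_BITS

def sha1_id(name: str) -> int:
    """Map a name (a host's IP, or a resource name) onto the ring via SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest()[:4], "big")

def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two IDs on the logical ring (wraps around)."""
    d = abs(a - b)
    return min(d, RING - d)

def responsible_node(resource_id: int, node_ids: list[int]) -> int:
    """The host whose nodeID is closest to the resourceID stores the resource."""
    return min(node_ids, key=lambda n: ring_distance(n, resource_id))

nodes = [sha1_id(f"10.0.0.{i}") for i in range(1, 6)]
owner = responsible_node(sha1_id("some-resource"), nodes)
```

In the real system, of course, `responsible_node` is not computed from a global list; each lookup is routed hop by hop through neighbors in O(log N) steps.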
12. Expressive Naming of Grid Resources
- Split a resource tuple into two parts:
  - A static part: fixed configuration
    - CPU type and speed, total RAM
  - A dynamic part: continuously changing parameters
    - CPU idle, RAM free, disk free
- Can be extended so the split is different for different resource tuples
13. Expressive Naming of Grid Resources (2)
- For a given resource tuple, derive the Pastry resourceID by:
  - Hashing the static part of the resource tuple
  - Appending the dynamic part of the resource tuple verbatim (un-hashed)
14. Ringing Resources
(Figure: resourceID layout, a hashed static part followed by the un-hashed dynamic part)
- Why?
  - Retain load balancing yet optimize for querying
  - A given host's resource tuple will lie in a vicinity within the Pastry ring
  - As the host's behavior changes over time, its resource tuple spans an arc
  - If the dynamic attributes are kept in the same order, one can also search by a dynamic attribute, given the static attributes
15. Ringing Resources (2)
(Figure: resourceID layout, a hashed static part followed by the un-hashed dynamic part)
- The static part consists of attributes that each have a discrete range, and usually a finite one, e.g., CPU speed is a multiple of 100 MHz and < 4 GHz
- The dynamic part:
  - Contains attributes with continuous ranges
  - Is derived by encoding all dynamic attributes into the 32 dynamic bits
  - E.g., truncation; other encodings in the paper
16. When a Host Joins the Background Grid
- Two steps:
  - Convert the resource tuple into a resourceID
  - Insert the resourceID into the Pastry system
- That's it!
17. When a Host's Condition Changes
- E.g., CPU idle changes
- Update the resourceID in the Pastry system:
  - Delete the old resourceID from Pastry
  - Insert the new resourceID into Pastry
  - Uses a single UPDATE message (since the old and new locations are close by)
- How often to update? Updates are initiated by resources either:
  - Periodically, at rate URATE, or
  - When there is a significant change in the dynamic part
    - Parameterized as UCHANGE(resource): UCHANGE(CPU) = 8.7 means update only when CPU idle has changed by more than 8.7% since the last update
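The change-triggered policy reduces to a simple threshold test against the last value a host actually reported; a minimal sketch (function names are ours):

```python
def should_update(last_reported_idle, current_idle, uchange_cpu=8.7):
    """Report an update only when CPU idle has moved by more than
    UCHANGE(CPU) percentage points since the last *reported* value."""
    return abs(current_idle - last_reported_idle) > uchange_cpu

# A host tracks the last value it pushed into Pastry, not the last sample:
last = 50.0
for sample in [53.0, 57.0, 60.0, 41.0]:
    if should_update(last, sample):
        # delete old resourceID, insert new one (single UPDATE message,
        # since the old and new IDs are close by on the ring)
        last = sample
```

Comparing against the last reported value (rather than the previous sample) is what prevents a slow drift from silently accumulating past the threshold without ever triggering an update.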
18. How to Discover a Resource
- E.g., find me a machine with > 1.4 GHz Intel P4, > 40% CPU idle, > 512 MB RAM free, running Linux
- Three alternative approaches:
  - Single-shot: send out one Pastry lookup for one resourceID that satisfies the query parameters. Works well if the frequency of updates and the number of resources are high.
  - Recursive: send out one Pastry lookup with a TTL (time-to-live) field, with recursive tuning of the lookup's resourceID along the way.
  - Parallel search: initiate multiple Pastry lookups, each for a different resourceID that satisfies the query. Works well if resources are limited or contention is high.
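How the single-shot and parallel strategies pick their lookup targets can be sketched as follows, reusing the toy encoding from slide 15 where CPU idle sits in the top 7 of the 32 dynamic bits (an assumption for illustration; the recursive variant's in-flight retuning is omitted):

```python
def single_shot(static_bits, min_idle_pct):
    """One lookup: a resourceID whose dynamic field sits exactly at the
    query's CPU-idle threshold; tuples satisfying the query lie just
    after it on the ring."""
    return (static_bits << 32) | (min_idle_pct << 25)

def parallel(static_bits, min_idle_pct, fanout=4):
    """Several concurrent lookups spread over the satisfying range
    [min_idle, 100], to catch sparsely distributed resources."""
    step = max(1, (100 - min_idle_pct) // fanout)
    return [(static_bits << 32) | (p << 25)
            for p in range(min_idle_pct, 101, step)][:fanout]
```

Because all targets share the hashed static prefix, every lookup lands in the same arc of the ring; parallel search just probes that arc at several points instead of one.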
19. Experimental Results: A. Update Frequency
- 20,000 total hosts; 1000 hosts (only) in Pastry
- Update threshold of 10% change; 1 query per time unit
- All experiments use the CSIL workload traces shown earlier
- The total number of updates (in a simulation run) is low for an update threshold > 10%
20. B. Scalability
- 25% of the hosts stored 70% of the data (skewed distribution)
- The most-loaded host held 0.55% of the load, compared to a system-wide average of 0.1% (scalable)
- The bandwidth consumed scales with the number of host resources injected
21. C. Short-Term Query Performance
- A single-shot search (for different CPU idle values) returns a sufficient number of results!
- Search bandwidth goes down quickly with the time difference between queries
22. D. Long-Term Query Performance
- A sample Grid application requires 12 hosts of specific configurations for a 10-hour period (horizontal blue line)
- Overprovisioning: the brown curve shows total resources; the yellow curve shows extra resources (wastage)
- P2P resource-discovery-based: pink curve
  - Grid application demand is always satisfied, with no wastage due to overprovisioning
23. Summary
- Grid applications need substrates that provide both reliability and scalability (1000s of nodes)
- P2P technologies exist, but need to be adapted to build such substrates
- This paper:
  - Background Grids: use idle CPU, free RAM, etc.
  - Resource discovery: answer queries looking for such resources
  - Solution: adapt the Pastry P2P overlay by using expressive naming of resources
  - Traces collected from the CS undergraduate lab at UIUC
  - The resulting substrate is robust, scalable, and reliable, and beats overprovisioning for long-running Grid applications
24. Questions and Queries?