Title: Slicing with SHARP
1. Slicing with SHARP
- Jeff Chase
- Duke University
2. Federated Resource Sharing
- How do we safely share/exchange resources across domains?
  - Administrative, policy, security, trust domains or sites
- Sharing arrangements form dynamic Virtual Organizations.
- Physical and logical resources (data sets, services, applications).
3. Location-Independent Services
(Figure: clients reach a dynamic server set through request routing under varying load.)
- The last decade has yielded immense progress on building and deploying large-scale adaptive Internet services.
  - Dynamic replica placement and coordination
  - Reliable distributed systems and clusters
  - P2P, indirection, etc.
- Example services: caching network or CDN, replicated Web service, curated data, batch computation, wide-area storage, file sharing.
4. Managing Services
(Figure: a service manager sits between clients and servers, using sensor/actuator feedback control.)
- A service manager adapts the service to changes in load and resource status.
  - E.g., gain or lose a server, instantiate replicas, etc.
  - The service manager itself may be decentralized.
- The service may have contractual targets for service quality: Service Level Agreements or SLAs.
5. An Infrastructure Utility
(Figure: services draw on a shared resource pool. Benefits: resource efficiency, surge protection, robustness, pay as you grow, economy of scale, geographic dispersion.)
- In a utility/grid, the service obtains resources from a shared pool.
  - Third-party hosting providers: a resource market.
- Instantiate the service wherever resources are available and demand exists.
- Consensus: we need predictable performance and SLAs.
6. The Slice Abstraction
- Resources are owned by sites.
  - E.g., node, cell, cluster
- Sites are pools of raw resources.
  - E.g., CPU, memory, I/O, net
- A slice is a partition, bundle, or subset of resources.
- The system hosts application services as guests, each running within a slice.
(Figure: services S1 and S2 run in slices drawn from sites A, B, and C.)
7. Slicing for SLAs
- Performance of an application depends on the resources devoted to it [Muse01].
- Slices act as containers with dedicated bundles of resources bound to the application.
  - Distributed virtual machine / computer
- The service manager determines the desired slice configuration to meet performance goals.
  - May instantiate multiple instances of a service, e.g., in slices sized for different user communities.
- Services may support some SLA management internally, if necessary.
8. Example: Cluster Batch Pools in Cluster-on-Demand (COD)
- Partition a cluster into isolated virtual clusters.
- The virtual cluster owner has exclusive control over its servers.
- Assign nodes to virtual clusters according to load, contracts, and resource usage policies.
- Example service: a simple wrapper for SGE batch scheduler middleware to assess load and obtain/release nodes (a minimal sketch follows this slide).
(Figure: service managers (e.g., SGE) send requests to the COD cluster, which issues grants.)
Dynamic Virtual Clusters in a Grid Site Manager (HPDC 2003), with Laura Grit, David Irwin, Justin Moore, and Sara Sprenkle.
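The wrapper-style service manager above amounts to a feedback loop: poll the batch queue, size the node allocation to the backlog, and ask the site for more nodes or release the excess. The sketch below is illustrative only; CODClient, its request_nodes/release_nodes calls, the pending_jobs probe, and jobs_per_node are hypothetical stand-ins, not COD's actual interface.

# A minimal sketch of a service-manager control loop, assuming a
# hypothetical COD negotiation client and a probe of the SGE queue
# backlog (e.g., by parsing qstat output).
import time

class CODClient:
    """Hypothetical stub for the site manager's negotiation interface."""
    def request_nodes(self, count): ...
    def release_nodes(self, count): ...

def pending_jobs():
    """Hypothetical probe of the SGE queue backlog."""
    return 0

def control_loop(cod, nodes_held=0, jobs_per_node=4, poll_secs=60):
    while True:
        backlog = pending_jobs()
        target = (backlog + jobs_per_node - 1) // jobs_per_node  # nodes needed
        if target > nodes_held:
            cod.request_nodes(target - nodes_held)   # grow the virtual cluster
        elif target < nodes_held:
            cod.release_nodes(nodes_held - target)   # shrink it
        nodes_held = target
        time.sleep(poll_secs)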
9. A Note on Virtualization
- Ideology for the future Grid: adapt the grid environment to the service rather than the service to the grid.
- Enable user control over the application/OS environment.
- Instantiate a complete environment down to the metal.
- Don't hide the OS; it's just another replaceable component.
- Requires/leverages new "underware" for instantiation.
  - Virtual machines (Xen, Collective, JVM, etc.)
  - Net-booted physical machines (Oceano, UDC, COD)
- Innovate below the OS and alongside it (infrastructure services for the control plane).
10. SHARP
- Secure Highly Available Resource Peering
- Interfaces and services for external control of federated utility sites (e.g., clusters).
- A common substrate for extensible policies/structures for resource management and distributed trust management.
- Flexible on-demand computing for a site, and flexible peering for federations of sites.
  - From PlanetLab to the New Grid
- Use it to build a resource dictatorship or a barter economy, or anything in between.
  - Different policies/structures may coexist in different partitions of a shared resource pool.
11. Goals
- The question addressed by SHARP: how do the service managers get their slices?
- How does the system implement and enforce policies for allocating slices?
  - Predictable performance under changing conditions.
  - Establish priorities under local or global constraint.
  - Preserve resource availability across failures.
  - Enable and control resource usage across boundaries of trust or ownership (peering).
  - Balance global coordination with local control.
- Extensible, pluggable, dynamic, decentralized.
12. Non-goals
- SHARP does NOT:
  - define a policy or style of resource exchange
    - e.g., barter, purchase, or central control (e.g., PLC)
  - care how services are named or instantiated
  - understand the SLA requirements or specific resource needs of any application service
  - define an ontology to describe resources
  - specify mechanisms for resource control/policing.
13. Resource Leases
(Figure: the S1 service manager sends a request to the Site A authority, which returns a grant in the form of a signed lease.)
<lease>
  <issuer> A's public key </issuer>
  <signed_part>
    <holder> S1's public key </holder>
    <rset> resource description </rset>
    <start_time> </start_time>
    <end_time> </end_time>
    <sn> unique ID at Site A </sn>
  </signed_part>
  <signature> A's signature </signature>
</lease>
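The lease above is a record signed by the site authority: anyone holding it can check the signed_part against the issuer's public key. A minimal sketch of that check, using Ed25519 from the Python 'cryptography' package as a stand-in for SHARP's signed-XML machinery (the flat JSON fields are illustrative):

# Minimal sketch: sign and verify a lease-like record.
import json
from cryptography.hazmat.primitives.asymmetric import ed25519

site_a = ed25519.Ed25519PrivateKey.generate()      # Site A's keypair

signed_part = json.dumps({
    "holder": "S1's public key",
    "rset": "resource description",
    "start_time": None, "end_time": None,
    "sn": 1,                                        # unique ID at Site A
}).encode()

lease = {"issuer": site_a.public_key(),
         "signed_part": signed_part,
         "signature": site_a.sign(signed_part)}

# verify() raises InvalidSignature if the signed_part was altered.
lease["issuer"].verify(lease["signature"], lease["signed_part"])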
14. Agents (Brokers)
(Figure: the agent is interposed between the S1 service manager and the Site A authority; requests flow toward the site through the agent, and grants flow back.)
- Introduce the agent as an intermediary/middleman.
- Factor policy out of the site authority.
- The site delegates control over its resources to the agent.
- The agent implements a provisioning/allocation policy for the resources under its control.
15. Leases vs. Tickets
- The site authority retains ultimate control over its resources: only the authority can issue leases.
  - Leases are hard contracts for concrete resources.
- Agents deal in tickets.
  - Tickets are soft contracts for abstract resources.
  - E.g., "You have a claim on 42 units of resource type 7 at 3PM for two hours, maybe." (also signed XML)
- Tickets may be oversubscribed as a policy to improve resource availability and/or resource utilization (see the sketch after this slide).
  - The subscription degree gives configurable assurance spanning a continuum from a hint to a hard reservation.
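A minimal sketch of the subscription-degree knob, assuming a hypothetical Agent class that tracks its holdings and outstanding promises; the names and the simple accounting are illustrative, not SHARP's policy code:

class Agent:
    def __init__(self, holdings_units, degree=1.0):
        self.holdings = holdings_units   # units the agent actually holds
        self.degree = degree             # 1.0 = hard reservation; >1.0 = oversubscribed
        self.promised = 0                # units promised in outstanding tickets

    def issue_ticket(self, units):
        """Record the promise if it fits the oversubscription budget."""
        if self.promised + units <= self.holdings * self.degree:
            self.promised += units
            return True
        return False

# degree = 1.0 behaves like a hard reservation; larger degrees trade a
# higher chance of rejection at redeem time for better utilization.
a = Agent(holdings_units=100, degree=1.2)
assert a.issue_ticket(80) and a.issue_ticket(40)   # 120 <= 100 * 1.2
assert not a.issue_ticket(10)                      # would exceed the budget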
16. Service Instantiation
(Figure, steps 1-7: (1) request and (2) grant ticket between the Agent and the Site; (3) request and (4) grant ticket between the Service Manager and the Agent; (5) redeem ticket and (6) grant lease between the Service Manager and the Site; (7) the Site instantiates the service in a virtual machine.)
Like an airline ticket, a SHARP ticket must be
redeemed for a lease (boarding pass) before the
holder can occupy a slice.
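A minimal sketch of this flow with in-memory stand-ins for the three actors. Real SHARP exchanges signed XML over the network and validates claim chains at redeem time; none of that machinery is shown, and all names here are illustrative:

class Site:
    def __init__(self):
        self.next_sn = 0

    def grant_ticket(self, holder, units):            # steps 1-2 (to the agent)
        self.next_sn += 1
        return {"holder": holder, "units": units, "sn": self.next_sn}

    def redeem(self, ticket, holder):                  # steps 5-6
        # A real authority validates the claim chain and checks for
        # conflicts before converting the soft claim into a hard lease.
        lease = {"holder": holder, "units": ticket["units"]}
        self.instantiate(lease)                        # step 7
        return lease

    def instantiate(self, lease):
        print(f"booting VM(s) for {lease['holder']}: {lease['units']} units")

class Agent:
    def __init__(self, site):
        self.ticket = site.grant_ticket("agent", units=40)

    def grant_ticket(self, holder, units):             # steps 3-4
        # A real agent subdivides and re-signs its ticket (delegation).
        return {"holder": holder, "units": units, "parent": self.ticket["sn"]}

site = Site()
agent = Agent(site)                    # site delegates a ticket to the agent
t = agent.grant_ticket("S1", units=8)  # service manager obtains a ticket
lease = site.redeem(t, "S1")           # redeem for a lease; the VM boots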
17Ticket Delegation
Transfer of resources, e.g., as a result of a
peering agreement or an economic transaction.
The site has transitive trust in the delegate.
Agents may subdivide their tickets and delegate
them to other entities in a cryptographically
secure way. Secure ticket delegation is the basis
for a resource economy. Delegation is
accountable if an agent promises the same
resources to multiple receivers, it may/will be
caught.
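A minimal sketch of a delegation chain as a list of signed claims, where each link is signed by the holder named in the previous link. It uses Ed25519 from the Python 'cryptography' package; the flat JSON body is a simplification of SHARP's signed XML, and resource subdivision and terms are omitted:

import json
from cryptography.hazmat.primitives.asymmetric import ed25519
from cryptography.hazmat.primitives import serialization

def raw_pub(priv):
    return priv.public_key().public_bytes(
        serialization.Encoding.Raw, serialization.PublicFormat.Raw)

def sign_claim(issuer_priv, holder_pub, units):
    body = json.dumps({"holder": holder_pub.hex(), "units": units}).encode()
    return {"issuer": raw_pub(issuer_priv), "body": body,
            "sig": issuer_priv.sign(body)}

def verify_chain(chain):
    """Each link must carry a valid signature, and each link's issuer
    must be the holder named by the previous link."""
    prev_holder = chain[0]["issuer"]        # the site anchors the chain
    for c in chain:
        ed25519.Ed25519PublicKey.from_public_bytes(c["issuer"]).verify(
            c["sig"], c["body"])            # raises InvalidSignature if tampered
        if c["issuer"] != prev_holder:
            return False
        prev_holder = bytes.fromhex(json.loads(c["body"])["holder"])
    return True

site, agent, s1 = (ed25519.Ed25519PrivateKey.generate() for _ in range(3))
chain = [sign_claim(site, raw_pub(agent), 40),   # site delegates 40 units to the agent
         sign_claim(agent, raw_pub(s1), 8)]      # agent delegates 8 of them to S1
print(verify_chain(chain))                       # True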
18Peering
Sites may delegate resource shares to multiple
agents. E.g., Let my friends at UNC use 20 of
my site this week and 80 next weekend. UNC can
allocate their share to their local users
according to their local policies. Allocate the
rest to my local users according to my policies.
Note tickets issued at UNC self-certify their
users.
19. A SHARP Ticket
<ticket>
  <subticket>
    <issuer> A's public key </issuer>
    <signed_part>
      <principal> B's public key </principal>
      <agent_address> XML-RPC redeem_ticket() </agent_address>
      <rset> resource description </rset>
      <start_time> </start_time>
      <end_time> </end_time>
      <sn> unique ID at Agent A </sn>
    </signed_part>
    <signature> A's signature </signature>
  </subticket>
  <subticket>
    <issuer> B's public key </issuer>
    <signed_part> </signed_part>
    <signature> B's signature </signature>
  </subticket>
</ticket>
20. Tickets are Chains of Claims
(Figure: two linked claim records. Claim a: claimID a, holder A, with a.rset and a.term. Claim b: claimID b, holder B, issuer A, parent a, with b.rset and b.term.)
21. A Claim Tree
- The set of active claims for a site forms a claim tree.
- The site authority maintains the claim tree over the redeemed claims.
(Figure: a claim tree rooted at an anchor of 40 units, subdivided into claims of 25 and 8 units and then into final claims of 9, 10, 3, and 3 units; a ticket T corresponds to one chain of claims from the anchor down to a final claim.)
22Ticket Distribution Example
A 40
- Site transfers 40 units to Agent A
B 8
D 3
E 3
C 25
H 7
- B and C further subdivide resources
F 9
resource space
- C oversubscribes its holdings in granting to H,
creating potential conflict
conflict
G 10
t0
t
time
23. Detecting Conflicts
(Figure: the same allocation plotted against resource space (site capacity 100, A's share 40) and time from t0. C's children F (9), G (10), and H (7) sum to 26 units against C's 25, exposing the conflict.)
24. Conflict and Accountability
- Oversubscription may be a deliberate strategy, an accident, or a malicious act.
- Site authorities serve oversubscribed tickets FCFS; conflicting tickets are rejected (a minimal sketch follows this slide).
  - Balance resource utilization against conflict rate.
- The authority can identify the accountable agent and issue a cryptographically secure proof of its guilt.
  - The proof is independently verifiable by a third party, e.g., a reputation service or a court of law.
- The customer must obtain a new ticket for equivalent resources, using its proof of rejection.
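A minimal sketch, not SHARP's implementation, of FCFS conflict detection at the authority: a claim is rejected once its siblings have exhausted the parent's units, and the parent's issuer is the accountable party. The dict-based claims and the Authority class are illustrative only:

from collections import defaultdict

class Authority:
    def __init__(self):
        self.redeemed = defaultdict(int)   # parent claimID -> units already leased

    def redeem(self, claim):
        """claim: {'id', 'parent', 'parent_units', 'units', 'issuer'}."""
        used = self.redeemed[claim["parent"]]
        if used + claim["units"] > claim["parent_units"]:
            # Conflict: the issuer promised more than it held. A real
            # authority returns a signed rejection naming the issuer.
            return ("reject", claim["issuer"])
        self.redeemed[claim["parent"]] += claim["units"]
        return ("lease", claim["units"])

auth = Authority()
print(auth.redeem({"id": "F", "parent": "C", "parent_units": 25, "units": 9,  "issuer": "C"}))
print(auth.redeem({"id": "G", "parent": "C", "parent_units": 25, "units": 10, "issuer": "C"}))
print(auth.redeem({"id": "H", "parent": "C", "parent_units": 25, "units": 7,  "issuer": "C"}))
# F and G get leases (9 + 10 = 19 <= 25); H is rejected and C is accountable.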
25. Agent as Arbiter
- Agents implement local policies for apportioning resources to competing customers. Examples:
  - Authenticated client identity determines priority.
  - Sell tickets to the highest bidder.
  - Meet long-term contractual obligations; sell the excess.
26. Agent as Aggregator
- Agents may aggregate resources from multiple sites.
- Example: PlanetLab Central.
  - Index by location and resource attributes.
  - Local policies match requests to resources.
- Services may obtain bundles of resources across a federation.
27. Division of Knowledge and Function
- Service Manager (knows the application): instantiate the app, monitor behavior, SLA/QoS mapping, acquire contracts, renew/release.
- Agent/Broker (guesses global status): availability of resources, what kind, how much, where (site grain); how much to expose about resource types? About proximity?
- Authority (knows local status): resource status, configuration, placement, topology, instrumentation, thermals, etc.
28. Issues and Ongoing Work
- SHARP combines resource discovery and brokering in a unified framework.
- The configurable overbooking degree allows a continuum.
- Many possibilities exist for SHARP agent/broker representations, cooperation structures, allocation policies, and discovery/negotiation protocols.
- Accountability is a fundamental property needed in many other federated contexts.
  - Generalize accountable claim/command exchange to other infrastructure services and applications.
- Bidding strategies and feedback control.
29. Conclusion
- Think of PlanetLab as an evolving prototype for planetary-scale on-demand utility computing.
  - Focusing on app services that are light resource consumers but inhabit many locations: a network testbed.
  - It's growing organically, like the early Internet.
  - Rough consensus and (almost) working code.
- PlanetLab is a compelling incubator and testbed for utility/grid computing technologies.
- SHARP is a flexible framework for utility resource management.
  - But it's only a framework.
30. Performance Summary
- SHARP prototype complete and running across PlanetLab.
- Complete performance evaluation in the paper.
- 1.2s end-to-end time to:
  - request a resource 3 peering hops away
  - obtain and validate tickets, hop-by-hop
  - redeem the ticket for a lease
  - instantiate a virtual machine at the remote site
- Oversubscription allows flexible control over resource utilization versus the rate of ticket rejection.
31. Related Work
- Resource allocation/scheduling mechanisms
  - Resource containers, cluster reserves, Overbooking
- Cryptographic capabilities
  - Taos, CRISIS, PolicyMaker, SDSI/SPKI
- Lottery ticket inflation
  - Issuing more tickets decreases the value of existing tickets.
- Computational economies
  - Amoeba, Spawn, Muse, Millenium, eOS
- Self-certifying trust delegation
  - PolicyMaker, PGP, SFS
32. Physical cluster
(Figure: COD servers backed by a configuration database; network boot, automatic configuration, and resource negotiation carve the physical cluster into dynamic virtual clusters via database-driven network install.)
33. SHARP
- Framework for distributed resource management, resource control, and resource sharing across sites and trust domains.
- Challenge: maintain local autonomy -> SHARP approach: sites are the ultimate arbiters of local resources; decentralized protocol.
- Challenge: resource availability in the presence of agent failures -> SHARP approach: tickets time out (leases); controlled oversubscription.
- Challenge: malicious actors -> SHARP approach: signed tickets; audit the full chain of transfers.
34. Claim Delegation and Validity
(Figure: the claim tree at time t0. The anchor holds 40 units; claim delegation subdivides it into claims of 25 and 8 units and then into final and future claims of 9, 10, 7, 3, and 3 units. A ticket T names one chain of claims from the anchor down. Claim records: claim a has claimID a, holder A, issuer A, no parent, and a.rset/term; claim b has claimID b, holder B, issuer A, parent a, and b.rset/term; claim c has claimID c, holder C, issuer B, parent b, and c.rset/term.)

subclaim(claim c, claim p) ≡ c.issuer = p.holder ∧ c.parent = p.claimID ∧ contains(p.rset, c.rset) ∧ subinterval(c.term, p.term)

contains(rset p, rset c) ≡ c.type = p.type ∧ c.count ≤ p.count

subinterval(term c, term p) ≡ p.start ≤ c.start ≤ p.end ∧ p.start ≤ c.end ≤ p.end

ticket(c0, ..., cn) ≡ anchor(c0) ∧ (∀ i = 0..n-1: subclaim(c_{i+1}, c_i))

anchor(claim a) ≡ a.issuer = a.holder ∧ a.parent = null
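A minimal executable rendering of these predicates, using plain Python records in place of SHARP's signed XML claims. Field names follow the slide; signature verification and claim tree bookkeeping are omitted, and the example values are illustrative:

from dataclasses import dataclass
from typing import Optional

@dataclass
class Rset:
    type: int
    count: int

@dataclass
class Term:
    start: float
    end: float

@dataclass
class Claim:
    claimID: str
    holder: str
    issuer: str
    parent: Optional[str]
    rset: Rset
    term: Term

def contains(p: Rset, c: Rset) -> bool:
    return c.type == p.type and c.count <= p.count

def subinterval(c: Term, p: Term) -> bool:
    return p.start <= c.start <= p.end and p.start <= c.end <= p.end

def subclaim(c: Claim, p: Claim) -> bool:
    return (c.issuer == p.holder and c.parent == p.claimID
            and contains(p.rset, c.rset) and subinterval(c.term, p.term))

def anchor(a: Claim) -> bool:
    return a.issuer == a.holder and a.parent is None

def ticket(chain: list) -> bool:
    return anchor(chain[0]) and all(
        subclaim(chain[i + 1], chain[i]) for i in range(len(chain) - 1))

# Example: site A anchors 40 units of type 7, delegates 25 to B, who delegates 7 to C.
a = Claim("a", "A", "A", None, Rset(7, 40), Term(0, 10))
b = Claim("b", "B", "A", "a",  Rset(7, 25), Term(0, 10))
c = Claim("c", "C", "B", "b",  Rset(7, 7),  Term(2, 8))
assert ticket([a, b, c])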
35. Mixed-Use Clustering
(Figure: a physical cluster partitioned into virtual clusters, e.g., a BioGeometry batch pool, a SIMICS/Arch batch pool, Internet Web/P2P emulations, a student semester project, and somebody's buggy hacked OS.)
- Vision: issue leases on isolated partitions of the shared cluster for different uses, with push-button Web-based control over software environments, user access, file volumes, and DNS names.
36. Grids are federated utilities
- Grids should preserve the control and isolation benefits of private environments.
- There's a threshold of comfort that we must reach before grids become truly practical.
  - Users need service contracts.
  - Protect users from the grid (security cuts both ways).
- Many dimensions:
  - decouple Grid support from the application environment
  - decentralized trust and accountability
  - data privacy
  - dependability, survivability, etc.
37. COD and Related Systems
- Other cluster managers based on database-driven PXE installs:
  - Oceano hosts Web services under dynamic load.
  - NPACI Rocks configures Linux compute clusters.
  - Netbed/Emulab configures static clusters for emulation experiments.
(Figure: the systems compared along axes such as dynamic clusters, OS-agnostic, flexible configuration, and open source.)
- COD addresses hierarchical dynamic resource management in mixed-use clusters with pluggable middleware (multigrid).
38. Dynamic Virtual Clusters
(Figure: the COD manager, backed by the COD database and a Web interface, negotiates with pluggable service managers and configures Virtual Cluster 1, Virtual Cluster 2, and a reserve pool (powered off).)
- Allocate resources in units of raw servers.
- Database-driven network install.
- Pluggable service managers: batch schedulers (SGE, PBS), Grid PoP, Web services, etc.
39Enabling Technologies
DHCP host configuration
Linux driver modules, Red Hat Kudzu, partition
handling
DNS, NIS, etc. user/group/mount configuration
NFS etc. network storage automounter
PXE network boot
IP-enabled power units
Power APM, ACPI, Wake-on-LAN
Ethernet VLANs
40. Recent Papers on Utility/Grid Computing
- Managing Energy and Server Resources in Hosting Centers. ACM Symposium on Operating Systems Principles, 2001.
- Dynamic Virtual Clusters in a Grid Site Manager. IEEE High-Performance Distributed Computing, 2003.
- Model-Based Resource Provisioning for a Web Service Utility. USENIX Symposium on Internet Technologies and Systems, 2003.
- An Architecture for Secure Resource Peering. ACM Symposium on Operating Systems Principles, 2003.
- Balancing Risk and Reward in a Market-Based Task Manager. IEEE High-Performance Distributed Computing, 2004.
- Designing for Disasters. USENIX Symposium on File and Storage Technologies, 2004.
- Comparing PlanetLab and Globus Resource Management Solutions. IEEE High-Performance Distributed Computing, 2004.
- Interposed Proportional Sharing for a Storage Service Utility. ACM Symposium on Measurement and Modeling of Computer Systems, 2004.