Title: Virtual Appliances for Scientific Applications
1 Virtual Appliances for Scientific Applications
- Kate Keahey
- keahey_at_mcs.anl.gov
- Argonne National Laboratory
- University of Chicago
2 The Grid Metaphor
What happens if a power station fails?
How do we store energy?
How do we charge for energy?
What elements make for a safe and efficient
power Grid?
How do we ensure quality of service?
How do we reliably deliver energy?
How do we make sure that supply meets demand?
3 Computational Grids
How can we manage different computing
environments?
What is the unit of resource usage?
How can we use Grid resources as easily
and intuitively as we use electrical power today?
How can we negotiate for computation?
How can we ensure that disk, CPUs, network are
all available?
4 Provisioning Critical Resources
5 Quality of Service
- Issues of control
- Trust management
- Dynamic relationships
- Protocols to negotiate SLA-based relationships
- Enforcement tools
- What worked
- Coarse-grained sharing for relatively tight-knit
communities with strong incentives to collaborate
- Non-critical needs
- Informal relationships
- What proved difficult
- Formal sharing for loosely knit communities
6 Quality of Life
- Lots of heterogeneous resources, none of them
good for my application
- Consistent environment
- Short-term leasing
- Changing configuration quickly, quick turnaround
- Some examples
- Support for legacy physics applications
- Unusual platforms needed by ornithologists
- Climate scientists need very consistent
configurations
- What worked
- Access to resources with standardized
configuration
- Tightly-knit communities
- Everything else proved difficult
7 Workspaces for Grid Computing
- Virtual Workspace
- Environment definition
- Resource allocation
- The GT4 Virtual Workspace Service (VWS)
- Allows an authorized client to deploy and manage
workspaces on demand
- GT4 WSRF-based protocol set, leverages multiple
GT services
- Multiple back-ends possible, currently using Xen
- http://workspace.globus.org
Paper: Virtual Workspaces: Achieving Quality of
Service and Quality of Life in the Grid,
Scientific Programming Journal
8 Workspace Service
[Diagram: Trusted Computing Base (TCB) containing the VWS Service, the VWS node, an image node, and pool nodes]
The VWS manages a set of nodes inside the Trusted
Computing Base (TCB), typically a cluster. This is
called the node pool.
The workspace service has a WSRF frontend that
allows users to deploy and manage virtual
workspaces.
Each node must have a VMM (Xen) installed, along
with the workspace backend (software that
manages individual nodes).
VM images are staged to a designated image
node inside the TCB.
9 Deploying Workspaces
[Diagram: VWS Service, image node, and pool nodes]
- Workspace Deployment Request
- Workspace metadata
- Describes the workspace
- Contextualization information (IP,
security, partitions, etc.)
- Resource Allocation
- Specifies availability, CPU, disk, memory,
nodes, etc.
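The two parts of a deployment request listed above can be sketched as plain data structures. This is only an illustration: the class names, field names, and default values below are hypothetical and are not the actual VWS schema or its WSRF messages.

```python
from dataclasses import dataclass, field


@dataclass
class WorkspaceMetadata:
    """Describes the workspace and its contextualization (hypothetical fields)."""
    name: str
    image: str                      # image identifier, assumed to live on the image node
    ip_address: str                 # contextualization: network identity
    partitions: list[str] = field(default_factory=list)


@dataclass
class ResourceAllocation:
    """What the workspace asks of the node pool (hypothetical fields)."""
    cpus: int
    memory_mb: int
    disk_gb: int
    nodes: int = 1
    duration_min: int = 60          # requested availability window


@dataclass
class DeploymentRequest:
    metadata: WorkspaceMetadata
    allocation: ResourceAllocation

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the request is well-formed."""
        problems = []
        if not self.metadata.image:
            problems.append("no image specified")
        for name in ("cpus", "memory_mb", "disk_gb", "nodes", "duration_min"):
            if getattr(self.allocation, name) <= 0:
                problems.append(f"{name} must be positive")
        return problems


request = DeploymentRequest(
    WorkspaceMetadata(name="star-worker", image="star.img", ip_address="192.0.2.10"),
    ResourceAllocation(cpus=2, memory_mb=2048, disk_gb=10),
)
print(request.validate())
```

Keeping the environment definition (metadata) separate from the resource allocation mirrors the split on the slide: the same workspace description could, in principle, be deployed with different allocations.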
10 Interacting with Workspaces
[Diagram: Trusted Computing Base (TCB) containing the VWS Service, an image node, and pool nodes]
The workspace service publishes information on
each workspace as standard WSRF
Resource Properties.
Users can query those properties to find
out information about their workspace (e.g. what
IP the workspace was bound to) as well as manage
the resources a workspace was assigned.
Users can interact directly with their workspaces
the same way they would with a physical machine.
11 The Case of OSG Edge Services
12 OSG Edge Services
- Requirements
- Edge Services are VO-specific
- Resource usage negotiation and enforcement
- Features
- IP address management
- Host certificates for Edge Services, naming
issues
- Resource allocation (re)negotiation
- Integration into the local infrastructure
- Challenges
- Image configuration and maintenance
- Fine-grain resource usage enforcement
- Running out of public IPs
Paper: Division of Labor: Tools for Growth and
Scalability of Grids, ICSOC 2006
13 The Case of the OSG Virtual Cluster
[Diagram: VWS Service, image node, and pool nodes]
14 OSG Virtual Cluster
- Requirements
- Leasing/Glide-ins resource allocation for
VO-specific computation
- Short execution time, workflows
- Scientific gateways
- Features
- Describing and managing aggregate workspaces
- Application-specific configuration on the fly
- Challenges
- Integration with local scheduling infrastructure
Paper: Virtual Clusters for Grid Communities,
CCGrid 2006 (TR2005)
15 The Case of the STAR Application
[Diagram labels: STAR, no STAR, GRAM, VWS]
16 STAR Application
- Requirements
- Hard-to-install legacy applications
- Consistent environment requirements
- Features
- Image size (6-10 GB), 8 min deployment time
- Image Caching
- Challenges
- Integration with local scheduling infrastructure
Presentation: Virtual Workspace Appliances, SC06
17 The Case of the Alice Application
- Requirements
- Pull-based computing model
- Features
- Partition management
- Blank partitions
- Partition sharing between workspaces
- Capability matching
- Workspace descriptions
- Factory prerequisites
- Ongoing effort
18 Moving Forward
- Deployment: a chicken-and-egg problem
- The Chicken: overcoming Xenophobia
- Hypervisor installations are invasive
- Security: the cure or the disease?
- Infrastructure: scheduling, etc.
- Incentives
- The Egg: users
- Where do I get an image from?
- VO administrators
- How do we describe, identify, query for images?
- Integrated vision of knitting multiple resources
together
19 Overall Approach
Appliance Producer
Appliance Management
Appliance Deployment
20 Deployment (1)
- Matching Appliances to Resources
- Appliance meta-data
- VM image?
- What VMM, architecture, etc.
- Resource characteristics
- What kind of appliances am I willing to deploy?
- Workspace Service
- Workspace meta-data
- VWS Factory pre-conditions
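Matching appliances to resources, as outlined above, amounts to comparing what an appliance requires (VMM, architecture, and so on) with what a resource is willing to deploy. A minimal sketch follows; the record fields, site names, and values are assumptions made for illustration and do not reflect the actual workspace metadata or VWS factory pre-condition formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appliance:
    """Appliance metadata (hypothetical fields)."""
    name: str
    vmm: str            # e.g. "xen"
    architecture: str   # e.g. "x86_64"
    memory_mb: int


@dataclass(frozen=True)
class Resource:
    """Resource characteristics: what this site is willing to deploy (hypothetical)."""
    name: str
    vmms: frozenset
    architectures: frozenset
    free_memory_mb: int


def can_deploy(appliance: Appliance, resource: Resource) -> bool:
    """True when the resource satisfies every requirement of the appliance."""
    return (
        appliance.vmm in resource.vmms
        and appliance.architecture in resource.architectures
        and appliance.memory_mb <= resource.free_memory_mb
    )


def match(appliance: Appliance, resources: list) -> list:
    """Names of all resources able to host the appliance, in input order."""
    return [r.name for r in resources if can_deploy(appliance, r)]


# Made-up sites and appliance, for illustration only.
sites = [
    Resource("site-a", frozenset({"xen"}), frozenset({"x86_64"}), 4096),
    Resource("site-b", frozenset({"xen"}), frozenset({"i386"}), 8192),
    Resource("site-c", frozenset({"vmware"}), frozenset({"x86_64"}), 8192),
]
star = Appliance("star", vmm="xen", architecture="x86_64", memory_mb=2048)
print(match(star, sites))
```

Here only `site-a` offers both the required VMM and architecture with enough free memory.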
21 Deployment (2)
- Establishing trust in an appliance
- Assert appliance properties, sign them to the
image
- Direct or indirect assertion
- Trust the process, not just the person
- Probe appliances
Presentation: Making your workspace secure:
establishing trust with VMs in the Grid, SC05
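One way to read "assert appliance properties, sign them to the image" is to bind a set of property claims to a digest of the image and sign the result, so that a verifier can detect either a modified image or modified claims. The sketch below uses an HMAC with a shared key purely as a stand-in; a Grid deployment would more plausibly use public-key signatures tied to certificates. The function names and property fields are hypothetical.

```python
import hashlib
import hmac
import json


def make_assertion(image_bytes: bytes, properties: dict, key: bytes) -> dict:
    """Bind property claims to the image digest and sign the combined statement."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    statement = json.dumps(
        {"image_sha256": digest, "properties": properties}, sort_keys=True
    )
    signature = hmac.new(key, statement.encode(), hashlib.sha256).hexdigest()
    return {"statement": statement, "signature": signature}


def verify(image_bytes: bytes, assertion: dict, key: bytes) -> bool:
    """Check the signature first, then check that the claims refer to this image."""
    expected = hmac.new(
        key, assertion["statement"].encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, assertion["signature"]):
        return False
    claimed = json.loads(assertion["statement"])["image_sha256"]
    return claimed == hashlib.sha256(image_bytes).hexdigest()


# Stand-in image contents and key, for illustration only.
key = b"demo-key"
assertion = make_assertion(b"image-bytes", {"os": "linux", "patched": True}, key)
print(verify(b"image-bytes", assertion, key))
```

Verification fails if the image changes after signing or if the assertion was produced with a different key.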
22 Deployment (3)
- Adapting appliances for deployment
- IP address delivery
- Generating certificates
- Making an appliance work within a specific
deployment framework (contextualization)
- Virtual clusters
- Application-level configuration
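Contextualization for a virtual cluster can be pictured as filling in deployment-specific values, such as delivered IP addresses, after the generic image has been built. The sketch below generates per-node settings and a shared hosts file so that every node can resolve its peers. The function name, the output keys, and the addresses are illustrative assumptions, not the mechanism the workspace service uses.

```python
from string import Template

# One line of a hosts file; filled in once per cluster member.
HOSTS_LINE = Template("$ip $hostname\n")


def contextualize(cluster: list) -> dict:
    """Build per-node context for a list of (hostname, ip) pairs.

    Every node receives its own identity plus a hosts file listing
    all members of the virtual cluster.
    """
    hosts_file = "".join(
        HOSTS_LINE.substitute(ip=ip, hostname=hostname) for hostname, ip in cluster
    )
    return {
        hostname: {"hostname": hostname, "ip": ip, "hosts_file": hosts_file}
        for hostname, ip in cluster
    }


# Made-up cluster using documentation-range addresses.
context = contextualize([("head", "192.0.2.1"), ("worker1", "192.0.2.2")])
print(context["worker1"]["hosts_file"], end="")
```

The same image can then serve as both head node and worker; only the injected context differs.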
23 Producing Appliances
- Configuration for the masses
- The profile of an appliance configurer has
changed
- Building appliances incrementally
- Appliance attestation
- Functionality testing
- Trust the process, not just the person
24 Managing Appliances
- Security updates
- Security RSS Feed
- Bugtraq, US-CERT Security Advisories
- Will the system still work?
- Functionality testing
- Component dependencies
25 Appliance Layers
- Layered Appliance
- A set of interdependent layers
- Appliance layers
- Less data needs to travel
- More flexible
- Faster deployment
- Trust management
- Collaborative aspects of configuration
Customization Layer
Application Layer
VO Layer
System Layer
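The claim that layering means less data needs to travel can be illustrated with a small calculation: if a site already caches some layers, only the missing ones are transferred. The layer names follow the stack above; the digests and sizes are made-up numbers, and the caching scheme itself is an assumption of this sketch.

```python
# Bottom-to-top order of the layer stack.
LAYER_ORDER = ["system", "vo", "application", "customization"]


def layers_to_transfer(appliance: dict, cached: set) -> list:
    """Return (layer, size_mb) for each layer whose digest is not cached.

    appliance maps layer name -> (digest, size_mb); cached is the set of
    digests already present at the deployment site.
    """
    return [
        (layer, appliance[layer][1])
        for layer in LAYER_ORDER
        if appliance[layer][0] not in cached
    ]


def transfer_mb(appliance: dict, cached: set) -> int:
    """Total size in MB that still has to be moved to the site."""
    return sum(size for _, size in layers_to_transfer(appliance, cached))


# Hypothetical appliance: digests and sizes are illustrative only.
appliance = {
    "system": ("sys-v1", 4000),
    "vo": ("vo-v1", 500),
    "application": ("app-v2", 1500),
    "customization": ("custom-v7", 20),
}
site_cache = {"sys-v1", "vo-v1"}
print(layers_to_transfer(appliance, site_cache), transfer_mb(appliance, site_cache))
```

With the system and VO layers cached, only the application and customization layers move, which is where the faster deployment comes from.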
26 Virtual Organizations
grid-proxy-init
myVO.org
Sharing resources: images, hardware, networks,
storage facilities, security context
27 Conclusions
- We need languages and protocols to describe,
discover and name appliances
- Growing role of a VO
- Configuration management
- Virtual networks and namespaces
- Beyond a security context
- Sustainable deployment model
- How does producing, deploying and managing
appliances work together?
28 Credits
- Workspace team
- Tim Freeman, Borja Sotomayor
- Guest appearances
- Rick Bradshaw, Predrag Buncic, Narayan Desai,
Abhishek Rana, Frank Siebenlist, Doug Olson,
Frank Wuerthwein and others.