1
Virtualization in MetaSystems
  • Vaidy Sunderam
  • Emory University, Atlanta, USA
  • vss@emory.edu

2
Credits and Acknowledgements
  • Distributed Computing Laboratory, Emory
    University
  • Dawid Kurzyniec, Piotr Wendykier, David DeWolfs,
    Dirk Gorissen, Maciej Malawski, Vaidy Sunderam
  • Collaborators
  • Oak Ridge Labs (A. Geist, C. Engelmann, J. Kohl)
  • Univ. Tennessee (J. Dongarra, G. Fagg, E.
    Gabriel)
  • Sponsors
  • U. S. Department of Energy
  • National Science Foundation
  • Emory University

3
Virtualization
  • Fundamental and universal concept in CS, but
    receiving renewed, explicit recognition
  • Machine level
  • Single OS image: Virtuozzo, VServers, Zones
  • Full virtualization: VMware, VirtualPC, QEMU
  • Para-virtualization: UML, Xen (Ian Pratt et al.,
    cl.cam.ac.uk)
  • Consolidate under-utilized resources, avoid
    downtime, load-balance, enforce security policy
  • Parallel distributed computing
  • Software systems: PVM, MPICH, grid toolkits and
    systems
  • Consolidate under-utilized resources, avoid
    downtime, load-balance, enforce security policy,
    aggregate resources

4
Virtualization in PVM
  • Historical perspective: PVM 1.0, 1989

5
Key PVM Abstractions
  • Programming model
  • Timeshared, multiprogrammed virtual machine
  • Two-level process space
  • Functional name + ordinal number
  • Flat, open, reliable messaging substrate
  • Heterogeneous messages and data representation
  • Multiprocessor emulation
  • Processor/process decoupling
  • Dynamic addition/deletion of processors
  • Raw nodes projected
  • Transparently
  • Or with exposure of heterogeneous attributes

6
Parallel Distributed Computing
  • Multiprocessor systems
  • Parallel distributed memory computing
  • Stable and mainstream: SPMD, MPI
  • Issues relatively clear: performance
  • Platforms and applications correspondingly
    tightly coupled

7
Parallel Distributed Computing
  • Metacomputing and grids
  • Platforms
  • Parallelism
  • Possibly within components, but mostly loose
    concurrency or pipelining between components
    (PVM 2-level model)
  • Grids: resource virtualization across multiple
    admin domains
  • Moved to explicit focus on service orientation
  • Wrap applications as services, compose
    applications into workflows, deploy on
    service-oriented infrastructure
  • Motivation: service/resource coupling
  • Provider provides both resource and service:
    virtualized access

8
Virtualization in PDC
  • What can/should be virtualized?
  • Raw resource
  • CPU: process/task instantiation -> staging,
    security, etc.
  • Storage: e.g. network file system over GMail
  • Data: value-added or processed
  • Service
  • Define interface and input-output behavior
  • Service provider must operate the service
  • Communication
  • Interaction paradigm with strong/adequate
    semantics
  • Key capability
  • Configurable/reconfigurable resource, service,
    and communication

9
The Harness II Project
  • Theme
  • Virtualized abstractions for critical aspects of
    parallel distributed computing, implemented as
    pluggable modules (including programming
    systems)
  • Major project components
  • Fault-tolerant MPI: specification, libraries
  • Container/component infrastructure: C-kernel, H2O
  • Communication framework: RMIX
  • Programming systems
  • FT-MPI + H2O, MOCCA (CCA + H2O), PVM

10
Harness II
  • Aggregation for Concurrent High Performance
    Computing
  • Hosting layer
  • Collection of H2O kernels
  • Flexible/lightweight middleware
  • Equivalent to a Distributed Virtual Machine
  • But only on the client side
  • DVM pluglets responsible for
  • (Co-)allocation/brokering
  • Naming/discovery
  • Failures/migration/persistence
  • Programming environments: FT-MPI, CCA, paradigm
    frameworks, distributed numerical libraries

11
H2O Middleware Abstraction
  • Providers own resources
  • Independently make them available over the
    network
  • Clients discover, locate, and utilize resources
  • Resource sharing occurs between a single provider
    and a single client
  • Relationships may be tailored as appropriate
  • Including identity formats, resource allocation,
    compensation agreements
  • Clients can themselves be providers
  • Cascading pairwise relationships may be formed

12
H2O Framework
  • Resources provided as services
  • Service: active software component exposing
    functionality of the resource
  • May represent added value
  • Runs within a provider's container (execution
    context)
  • May be deployed by any authorized party:
    provider, client, or third-party reseller
  • Provider specifies policies
  • Authentication/authorization
  • Actors -> kernel/pluglet
  • Decoupling
  • Providers/providers/clients

13
Example usage scenarios
  • Resource: computational service
  • Reseller deploys software component into
    provider's container
  • Reseller notifies the client about the offered
    computational service
  • Client utilizes the service
  • Resource: raw CPU power (deployment sketched
    below)
  • Client gathers application components
  • Client deploys components into providers'
    containers
  • Client executes distributed application utilizing
    providers' CPU power
  • Resource: legacy application
  • Provider deploys the service
  • Provider stores the information about the service
    in a registry
  • Client discovers the service
  • Client accesses legacy application through the
    service
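A minimal sketch of the raw-CPU-power scenario. All names below (Kernel, connect, deploy, Solver) are invented stand-ins for illustration; the actual H2O client API is not shown on this slide and may differ.

// Hypothetical client-side deployment sketch; Kernel, connect, deploy,
// and Solver are illustrative names, not the real H2O client API.
interface Kernel {
    Object deploy(String codebaseUrl, String className) throws Exception;
}
interface Solver {
    double solve(double[] data);
}
public class DeployScenario {
    // Illustrative stub; a real client would authenticate against the kernel
    static Kernel connect(String kernelUrl, String user, String password) {
        throw new UnsupportedOperationException("illustrative stub");
    }
    public static void main(String[] args) throws Exception {
        Kernel k = connect("https://provider.example.org/kernel", "user", "secret");
        // Client deploys a component into the provider's container...
        Solver s = (Solver) k.deploy("https://repo.example.org/solver.jar",
                                     "org.example.SolverPluglet");
        // ...then executes the application using the provider's CPU power
        System.out.println(s.solve(new double[] { 1.0, 2.0, 3.0 }));
    }
}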

14
Model and Implementation
  • H2O nomenclature
  • container = kernel
  • component = pluglet
  • Object-oriented model; Java- and C-based
    implementations
  • Pluglet: remotely accessible object
  • Must implement Pluglet interface, may implement
    Suspendible interface
  • Used by kernel to signal/trigger pluglet state
    changes
  • Model
  • Implement (or wrap) service as a pluglet to be
    deployed on kernel(s)

(Diagram: clients invoke pluglets through functional
interfaces, e.g. StockQuote)

Interface StockQuote:
  double getStockQuote()

Interface Pluglet:
  void init(ExecutionContext cxt)
  void start()
  void stop()
  void destroy()

Interface Suspendible:
  void suspend()
  void resume()
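A minimal pluglet sketch following the interfaces above. ExecutionContext is stubbed here so the fragment is self-contained (in H2O it is supplied by the kernel), and the returned quote is an invented placeholder.

// Minimal pluglet sketch; ExecutionContext is stubbed for self-containment
// and the quote value is a placeholder, not real service logic.
interface ExecutionContext { }
interface Pluglet {
    void init(ExecutionContext cxt);
    void start();
    void stop();
    void destroy();
}
interface StockQuote {
    double getStockQuote();
}
public class StockQuotePluglet implements Pluglet, StockQuote {
    private volatile boolean running;
    public void init(ExecutionContext cxt) { /* acquire resources, read config */ }
    public void start()   { running = true;  }   // kernel signals activation
    public void stop()    { running = false; }   // kernel signals deactivation
    public void destroy() { /* release resources */ }
    public double getStockQuote() {              // the functional interface
        if (!running) throw new IllegalStateException("pluglet not started");
        return 42.0;                             // placeholder quote
    }
}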
15
Accessing Virtualized Services
  • Request-response: ideally suited, but
  • Stateful service access must be supported
  • Efficiency issues, concurrent access
  • Asynchronous access for compute-intensive
    services
  • Semantics of cancellation and error handling
  • Many approaches focus on performance alone and
    ignore semantic issues
  • Solution
  • Enhanced procedure call/method invocation
  • Well-understood paradigm; extend it to better
    suit access to metacomputing services (sketched
    below)
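The extended-call idea can be pictured with standard Java futures. This sketch uses only java.util.concurrent to illustrate the paradigm; it is not the RMIX API, and expensiveService is an invented stand-in.

// Asynchronous call semantics sketched with plain java.util.concurrent;
// illustrates the paradigm only, not the RMIX API.
import java.util.concurrent.*;

public class AsyncCallSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // Issue the "remote" call asynchronously; the caller keeps working
        Future<Double> pending = pool.submit(() -> expensiveService(3.0));
        try {
            // Collect the result later, bounding how long we will wait
            double result = pending.get(5, TimeUnit.SECONDS);
            System.out.println("result = " + result);
        } catch (TimeoutException e) {
            // Cancellation is best effort: the server may or may not observe
            // the interrupt -- exactly the issue raised for stateful services
            pending.cancel(true);
        } finally {
            pool.shutdown();
        }
    }
    static double expensiveService(double x) throws InterruptedException {
        Thread.sleep(100);   // stand-in for a compute-intensive service
        return x * x;
    }
}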

16
The RMIX layer
  • H2O built on top of RMIX communication substrate
  • Provides a flexible p2p communication layer for
    H2O applications
  • Enables various message-layer protocols within a
    single, provider-based framework library
  • Adopting common RMI semantics
  • Enables high performance and interoperability
  • Easy porting between protocols, dynamic protocol
    negotiation
  • Offers a flexible communication model, but
    retains RMI simplicity
  • Extended with asynchronous and one-way calls
  • Issues: consistency, ordering, exceptions,
    cancellation

(Diagram: RPC clients, Web Services, Java applications,
and SOAP clients reach the H2O kernel through RMIX,
which runs over networking layers speaking RPC, IIOP,
JRMP, SOAP, ...)
17
RMIX Overview
  • Extensible RMI framework
  • Client and provider APIs
  • uniform access to communication capabilities
  • supplied by pluggable provider implementations
  • Multiple protocols supported
  • JRMPX, ONC-RPC, SOAP
  • Configurable and flexible
  • Protocol switching
  • Asynchronous invocation

18
RMIX Abstractions
  • Uniform interface and API
  • Protocol switching
  • Protocol negotiation
  • Various protocol stacks for different situations
  • SOAP: interoperability
  • SSL: security
  • ARPC, custom (Myrinet, Quadrics): efficiency
  • Asynchronous access to virtualized remote
    resources (sketched below)
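One way to picture provider-based protocol switching: a registry of pluggable providers, with the stack selected to match a client requirement. Every name here (ProtocolProvider, supports, the registry contents) is invented for illustration and is not the real RMIX SPI.

// Hypothetical sketch of pluggable-provider protocol switching;
// names are invented, not the RMIX service-provider interface.
import java.util.*;

public class ProtocolSwitchSketch {
    interface ProtocolProvider {
        String name();                    // e.g. "SOAP", "JRMPX", "ONC-RPC"
        boolean supports(String need);    // e.g. "interop", "ssl", "fast"
    }
    static ProtocolProvider provider(String name, String... needs) {
        Set<String> offered = new HashSet<>(Arrays.asList(needs));
        return new ProtocolProvider() {
            public String name() { return name; }
            public boolean supports(String need) { return offered.contains(need); }
        };
    }
    public static void main(String[] args) {
        List<ProtocolProvider> registry = Arrays.asList(
            provider("SOAP", "interop"),
            provider("SSL-JRMPX", "ssl"),
            provider("ARPC", "fast"));
        // Negotiate: pick the first stack matching the client's requirement
        String need = "ssl";
        registry.stream()
                .filter(p -> p.supports(need))
                .findFirst()
                .ifPresent(p -> System.out.println("negotiated: " + p.name()));
    }
}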

19
Asynchronous RMIX
  • Parameter marshalling
  • Data consistency
  • Also an issue in PVM, MPI, etc.
  • Exceptions/cancellation
  • Critical for stateful servers
  • Conservative vs. best effort
  • Other issues
  • Execution order
  • Security
  • Virtualizing communications
  • Performance/familiarity vs. semantic issues

20
Programming Models: CCA and H2O
  • Common Component Architecture
  • Component standard for HPC
  • Uses and provides ports described in SIDL
  • Support for scientific data types
  • Existing tightly coupled (CCAFFEINE) and loosely
    coupled, distributed (XCAT) frameworks
  • H2O
  • Well matched to CCA model

21
MOCCA implementation in H2O
  • Each component runs in a separate pluglet (port
    model sketched below)
  • Thanks to H2O kernel security mechanisms,
    multiple components may run without interfering
  • Two-level builder hierarchy
  • ComponentID = pluglet URI
  • MOCCA_Light: pure Java implementation (no SIDL)
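A trimmed sketch of the CCA uses/provides port model that MOCCA maps onto pluglets. The interfaces below are simplified stand-ins for the gov.cca API, and the integrator component is an invented example.

// Simplified CCA-style component; interfaces are trimmed stand-ins for
// the gov.cca API, and the midpoint integrator is illustrative only.
public class CcaSketch {
    interface Port { }
    interface Services {                     // stand-in for gov.cca.Services
        void addProvidesPort(Port port, String name, String type);
        void registerUsesPort(String name, String type);
        Port getPort(String name);
    }
    interface Component {                    // stand-in for gov.cca.Component
        void setServices(Services services);
    }
    interface IntegratorPort extends Port {  // illustrative scientific port
        double integrate(double lo, double hi);
    }
    // Under MOCCA, each such component would run in its own pluglet
    static class MidpointIntegrator implements Component, IntegratorPort {
        public void setServices(Services s) {
            s.addProvidesPort(this, "integrator", "IntegratorPort");
        }
        public double integrate(double lo, double hi) {
            return (hi - lo) * f((lo + hi) / 2.0);  // one-point midpoint rule
        }
        static double f(double x) { return x * x; }
    }
    public static void main(String[] args) {
        MidpointIntegrator c = new MidpointIntegrator();
        System.out.println(c.integrate(0.0, 1.0));  // prints 0.25
    }
}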

22
Performance: Small Data Packets
  • Factors
  • SOAP header overhead in XCAT
  • Connection pools in RMIX

23
Large Data Packets
  • Encoding (binary vs. base64)
  • CPU saturation on Gigabit LAN (serialization)
  • Variance caused by Java garbage collection

24
Use Case 2: H2O and FT-MPI
  • Overall scheme
  • H2O framework installed on computational nodes,
    or cluster front-ends
  • Pluglet for startup, event notification, node
    discovery
  • FT-MPI native communication (also MPICH)
  • Major value added
  • FT-MPI need not be installed anywhere on the
    computing nodes
  • It is staged just-in-time before program
    execution
  • Likewise, application binaries and data need not
    be present on computing nodes
  • The system must be able to stage them in a secure
    manner

25
Staging FT-MPI runtime with H2O
  • FT-MPI runtime library and daemons
  • Staged from a repository (e.g. a Web server) to
    the computational node upon the user's request
  • Automatic platform type detection: appropriate
    binary files are downloaded from the repository
    as needed (sketched below)
  • Allows users to run fault-tolerant MPI programs
    on machines where FT-MPI is not pre-installed
  • No login account is needed to do so; H2O
    credentials are used instead
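A sketch of the staging step: detect the platform via system properties, then fetch the matching binary. The repository URL, file layout, and daemon name (ftmpid) are invented for illustration.

// Just-in-time staging sketch; repository URL and layout are hypothetical.
import java.io.InputStream;
import java.net.URL;
import java.nio.file.*;

public class StageBinary {
    public static void main(String[] args) throws Exception {
        // Automatic platform type detection, as described on the slide
        String platform = System.getProperty("os.name").toLowerCase()
                        + "-" + System.getProperty("os.arch");
        URL repo = new URL("https://repo.example.org/ftmpi/" + platform + "/ftmpid");
        Path target = Paths.get("ftmpid");
        // Download the platform-specific binary from the repository
        try (InputStream in = repo.openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        target.toFile().setExecutable(true);
        System.out.println("staged " + repo + " -> " + target.toAbsolutePath());
    }
}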

26
Launching FT-MPI applications with H2O
  • Staging applications from a network repository
  • Uses URL code base to refer to a remotely stored
    application
  • Platform-specific binary transparently uploaded
    to a computational node upon client request
  • Separation of roles
  • Application developer bundles the application and
    puts it into a repository
  • The end-user launches the application, unaware of
    heterogeneity

27
Interconnecting heterogeneous clusters
  • Private, non-routable networks
  • Communication proxies on cluster front-ends route
    data streams
  • Local (intra-cluster) channels not affected
  • Nodes use virtual addresses at the IP level,
    resolved by the proxy (a minimal relay is
    sketched below)
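A minimal relay sketch of the front-end proxy idea: accept a connection on the public interface and copy bytes to a node on the private network. Addresses and ports are invented, and the virtual-to-private address resolution is reduced to a fixed mapping; half-close handling and multiple concurrent pairs are omitted.

// Simplified front-end relay; addresses are illustrative and the
// virtual-address resolution is reduced to one hard-coded mapping.
import java.io.*;
import java.net.*;

public class RelayProxy {
    public static void main(String[] args) throws Exception {
        int listenPort = 9000;             // public side of the front-end
        String privateHost = "10.0.0.5";   // private cluster node
        int privatePort = 9001;
        try (ServerSocket server = new ServerSocket(listenPort)) {
            while (true) {
                Socket outer = server.accept();
                Socket inner = new Socket(privateHost, privatePort);
                pump(outer, inner);        // outer -> inner
                pump(inner, outer);        // inner -> outer
            }
        }
    }
    static void pump(Socket from, Socket to) {
        new Thread(() -> {
            try (InputStream in = from.getInputStream();
                 OutputStream out = to.getOutputStream()) {
                in.transferTo(out);        // copy the byte stream
            } catch (IOException ignored) { }
        }).start();
    }
}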

28
Initial experimental results
  • Proxied connection versus direct connection
  • A standard FT-MPI throughput benchmark was used
  • Within a Gigabit Ethernet cluster, proxies retain
    65% of throughput

29
Summary
  • Virtualization in PDC
  • Devising appropriate abstractions
  • Balance pragmatics and performance vs. model
    cleanliness
  • The Harness II Project
  • H2O kernel
  • Reconfigurability by clients/third parties is
    very valuable
  • RMIX communications framework
  • High-level abstractions for control
    communication (native data communication)
  • Multiple programming model overlays
  • CCA, FT-MPI, PVM
  • Concurrent computing environments on demand