Grid-Computing with NetSolve

1
Grid-Computing with NetSolve
  • Henri Casanova

2
Grid Computing with NetSolve
  • NetSolve introduction, history and overview
  • NetSolve collaborations
  • NetSolve and the Grid?
  • NetSolve and Scheduling on the Grid
  • Conclusion

3
NetSolve: Introduction
  • Developed at the University of Tennessee and the
    Oak Ridge National Lab.
  • Project leaders: Jack Dongarra, Henri Casanova
  • Source freely available

http://www.cs.utk.edu/netsolve
4
NetSolve Genesis
  • Started as an RPC system for Matlab (Cleve
    Moler)
  • Each host must have a Matlab license
  • Limited to Matlab's functions (no LAPACK)
  • Other similar projects (Multi-Matlab, etc.)
  • Rapidly distanced itself from Matlab
  • Netlib repository: free software
  • Matlab people not too interested!
  • Easier to develop outside Matlab (v4.2)!

5
NetSolve Genesis
  • First objectives
  • Run-time system that provides access to freely
    available software running on computational
    servers
  • Easy to use for domain scientists
  • Easy to add new software to the servers
  • Easy to deploy (light-weight)
  • Multiple user interfaces
  • RPC programming model

6
NetSolve: Brief History
  • Jan. 1996: v1.0 (UNIX)
  • Jan. 1997: v1.1 (UNIX)
  • Sep. 1998: v1.2 (UNIX/Win32)
  • Java interface
  • Complete rewrite
  • Mathematica interface

7
NetSolve Overview
8
NetSolve: The Server
  • Daemon running on a computational server
  • The server can be a single host, cluster, Condor cluster, or MPP
  • Provides access to problems that can be solved
    using pre-installed software
  • Implements basic access control mechanisms
  • Monitors its host's workload when possible
  • Reports to agent(s)

9
NetSolve: The Agent
  • Daemon running on any host
  • Maintains information on the available servers
  • Gathers workload and network measurements
  • Decides how to map tasks to resources
  • There can be multiple agents
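
The agent's mapping decision can be illustrated with a minimal sketch. It assumes, hypothetically, that servers are ranked by an estimated completion time that adds data transfer time to compute time on the free share of the CPU. The `Server` fields, the cost formula, and the numbers are illustrative assumptions, not NetSolve's actual policy.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    mflops: float          # peak speed reported by the server (assumed metric)
    workload: float        # fraction of the CPU already busy, 0.0 to <1.0
    bandwidth_mb_s: float  # measured bandwidth to the client, in MB/s
    latency_s: float       # measured latency to the client, in seconds


def estimated_time(server, data_mb, work_mflop):
    """Estimated completion time: transfer time plus compute time on the free CPU share."""
    transfer = server.latency_s + data_mb / server.bandwidth_mb_s
    compute = work_mflop / (server.mflops * (1.0 - server.workload))
    return transfer + compute


def rank_servers(servers, data_mb, work_mflop):
    """Return servers sorted from best to worst estimated completion time."""
    return sorted(servers, key=lambda s: estimated_time(s, data_mb, work_mflop))


# Hypothetical server pool: raw speed alone does not decide the ranking.
servers = [
    Server("fast_but_far", mflops=1000.0, workload=0.0, bandwidth_mb_s=1.0, latency_s=0.1),
    Server("slow_but_near", mflops=200.0, workload=0.0, bandwidth_mb_s=100.0, latency_s=0.001),
    Server("fast_but_busy", mflops=1000.0, workload=0.9, bandwidth_mb_s=100.0, latency_s=0.001),
]
ranking = rank_servers(servers, data_mb=50.0, work_mflop=2000.0)
best = ranking[0]
```

With these made-up numbers the slower but nearby server is ranked first, which is why the agent needs both workload and network measurements.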

10
NetSolve: agent/server
  • Non-hierarchical (no master agent)
  • Makes it easy to deploy
  • Any agent/server can be stopped/restarted safely
  • Multiple institutions can contribute
  • System can be started on an intranet or Internet
  • System can be open, private or controlled
  • Simple failure detection/restart mechanism

11
NetSolve: How it works
[Diagram labels: NetSolve server daemon, register, client stubs, NetSolve problem description files, computational modules, Java applet]
  • Problem description files
  • Clients download stubs at run-time
  • Problem description files are portable
  • Java applet to generate them

12
Available software
  • BLAS
  • LAPACK
  • ScaLAPACK
  • ItPack
  • PETSc
  • Aztec
  • FitPack
  • FFTPack
  • NAG software
  • Minpack
  • QMR
  • ARPACK
  • ImageVision
  • MCell

plus software added by users
13
NetSolve: The Client
Multiple interfaces
  • Matlab, Mathematica
  • C, Fortran
  • Perl
  • Java API, Java GUI
  • MS Excel (in progress)

All interfaces implement same basic mechanisms
14
NetSolve: Matlab interface
>> netsolve_init
>> netsolve
/LinearAlgebra/L3/dmatmul
/LinearAlgebra/L3/linsol
/ImageProcessing/Vision/filter
>> netsolve('linsol')
solves Ax=b.
2 inputs: matrix A, matrix b
1 output: matrix x
15
NetSolve: Matlab interface
Synchronous call
>> load('a')
>> load('b')
>> x = netsolve('linsol',a,b)
x = 12.326 23.432 ...
>> y = netsolve('linsol',aa,b)
y = 31.234 -0.323 ...
16
NetSolve: Matlab interface
Asynchronous call
>> load('a'); load('A'); load('b')
>> r1 = netsolve_nb('linsol',a,b)
r1 = 0
>> r2 = netsolve_nb('linsol',A,b)
r2 = 1
>> netsolve_nb('status')
request 0: done
request 1: still pending
>> x = netsolve_nb('wait',0)
x = 1.234 -4.534 ...
>> y = netsolve_nb('probe',1)
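
The non-blocking pattern above (submit, get a request id back, then probe or wait) can be sketched in a few lines. This is only an analogy built on Python's standard thread pool: `slow_solve`, `send`, `probe`, and `wait` are hypothetical names chosen for illustration, not part of any NetSolve interface.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_solve(a, b, delay):
    """Hypothetical stand-in for a remote solver: solves the scalar equation a * x = b."""
    time.sleep(delay)
    return b / a


executor = ThreadPoolExecutor(max_workers=2)
requests = []  # a request id is an index into this list


def send(a, b, delay=0.0):
    """Submit a request and return its id immediately (non-blocking)."""
    requests.append(executor.submit(slow_solve, a, b, delay))
    return len(requests) - 1


def probe(request_id):
    """Return True if the request has completed, without blocking."""
    return requests[request_id].done()


def wait(request_id):
    """Block until the request completes and return its result."""
    return requests[request_id].result()


r1 = send(2.0, 8.0)             # returns request id 0 right away
r2 = send(4.0, 2.0, delay=0.2)  # returns request id 1 while the work is still running
x = wait(r1)
y = wait(r2)
executor.shutdown()
```

The client is free to do other work between `send` and `wait`, which is the point of the asynchronous call.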
17
NetSolve: Fortran interface
      parameter (MAX = 100)
      double precision A(MAX,MAX), B(MAX)
      integer IPIV(MAX), N, INFO, LWORK
      integer NSINFO
C     Local LAPACK call
      call DGESV(N,1,A,MAX,IPIV,B,MAX,INFO)
C     Same solve through NetSolve
      call NETSL('DGESV()',NSINFO,
     &           N,1,A,MAX,IPIV,B,MAX,INFO)
18
NetSolve: Parallel libraries
  • NetSolve user is unaware of parallel processing
  • NetSolve takes care of starting the message-passing
    system, distributing the data, and returning
    the results.

19
NetSolve and Condor
[Diagram label: Condor pool (U. of Wisconsin)]
20
NetSolve and Ninf
[Diagram labels: NetSolve network, ADAPTOR (Java), Ninf network, Ninf MetaServer, ETL (Tsukuba, Japan)]
21
NetSolve and the Grid
  • Emergence of the Grid vision
  • Can NetSolve be part of the Grid?
  • A somewhat different philosophy
  • Grid: high-performance resources, large-scale apps.
  • Global infrastructure
  • NetSolve: various resources, small-scale apps.
  • More limited deployment

22
(No Transcript)
23
NetSolve Grid Middleware
Q: What does NetSolve need to become Grid
middleware?
A: NetSolve provides the right level of
abstraction for computational services, but it
needs a few more features:
  • More low-level control over jobs
  • - non-location transparent interface
  • - stop jobs, no automatic restart of
    jobs
  • Ways to query the system's topology
  • Better network/CPU/memory load sensors
  • Ways to manage remote storage for data and
    executables
  • Security
  • General job-launching facility (batch systems)
  • Interface to a global information directory
    service

24
NetSolve on the Grid
Minor software enhancements + NWS + Globus seem
to do the trick
Minor software enhancements
  • int netsl("128.34.45.43:linsol()", ...)
  • /* performs no automatic
    resubmission */
  • int netsl_kill(int request_id)
  • /* terminates a job */
  • void netsl_info()
  • /* returns static and dynamic
    information */
Use of the Network Weather Service (Rich Wolski,
U. of Tenn.)
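
To illustrate what a forecasting service of this kind provides to a scheduler, here is a minimal sketch of one common approach: keep several simple predictors, score each against the measurement history, and forecast with the one that has erred least so far. The predictor set and the function name `forecast` are assumptions made for illustration and are not taken from the NWS code.

```python
from statistics import mean, median

# Candidate predictors, each mapping a non-empty history to a guess for the next value.
PREDICTORS = {
    "last_value": lambda history: history[-1],
    "running_mean": lambda history: mean(history),
    "median_of_last_5": lambda history: median(history[-5:]),
}


def forecast(measurements):
    """Pick the predictor with the lowest cumulative absolute error on past data.

    Returns (predictor name, forecast for the next measurement).
    """
    errors = {name: 0.0 for name in PREDICTORS}
    for i in range(1, len(measurements)):
        history = measurements[:i]
        for name, predict in PREDICTORS.items():
            errors[name] += abs(predict(history) - measurements[i])
    best = min(errors, key=errors.get)
    return best, PREDICTORS[best](measurements)


# Hypothetical CPU-load trace with one transient spike.
trace = [1.0, 1.0, 9.0, 1.0, 1.0, 1.0]
chosen, next_value = forecast(trace)
```

On this trace the median-based predictor wins because it ignores the spike, whereas the last-value and mean predictors are both thrown off by it.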
25
NetSolve and Globus
Many parts of Globus seem to provide exactly
what's needed
  • GRAM: job launching
  • MDS: information service (NWS-fed)
  • GSI: security
  • GASS: remote storage
  • HBM: liveness
  • GEM? Nexus??

Risks
  • Light-weight aspect of NetSolve lost?
  • What if Globus fails? (even though NT :) )
  • What if some site just does not want Globus installed?
  • Developing on top of Globus is no easy task at
    the moment
26
NetSolve-Globus Design
Goals
  • Maintain both modes of execution
  • Isolate Globus-specific parts

[Diagram labels: Globus gatekeeper, NetSolve server daemon, standard NetSolve protocol, Globus NetSolve protocol, GRAM, local disk, GSI, GASS storage (GEM?), HBM, computational modules, NetSolve agent, NetSolve sub-tree, MDS]
27
NetSolve-Globus Status
  • Proxy architecture in place
  • First experiments with Globus underway
  • Transparent access: Globus just looks
    like more resources if you have a
    certificate.
  • (Agent talks to MDS?)
  • Issues
  • Globus shortcomings
  • (GASS, map-files, MDS non-dist., ...)
  • Globus stability and deployment?

28
NetSolve: a research vehicle
Global deployment is an important
issue, but NetSolve is also a great research tool
for experimenting in Grid environments right now.
Besides, some applications want the Grid NOW!
(after all the hype)
29
Performance on the Grid?
  • The power-Grid analogy breaks
    down for performance.
  • Scheduling seems to be the answer.
  • Globus is a low-level infrastructure and
    does not provide scheduling facilities
  • Hence projects like AppLeS (Prof. Fran Berman)

[Diagram labels: AppLeS scheduling agent, application-level information, static Grid information, dynamic Grid information, performance]
30
Scheduling Research with NetSolve?
As middleware, NetSolve provides the ideal level
of abstraction for doing research on
Grid scheduling for several classes of
applications. Even simple RPC-style
applications can prove challenging to schedule on
the Grid. That research could then in turn be
deployed with NetSolve on the Grid.
Bottom line: how to do AppLeS within NetSolve?
31
Scheduling in NetSolve
Issues
  • NetSolve's programming model is very general and
    NetSolve interfaces to arbitrary software
  • Very little application-level
    information.
  • Hence, the scheduler in the agent is primitive
    as of now.

Solution
  • Consider classes of applications
  • Build NetSolve-based frameworks for the
    applications
  • With focus on a built-in scheduler

32
First approach: Task farming
  • New call in NetSolve for independent tasks
  • Opportunity to experiment with scheduling
  • Bypassing the agent for decision making
  • (agent becomes an information service)
  • Preliminary experimental results satisfactory

netsl_farm("i=1,100", "linsol", <arrays of pointers>)
- Work queue scheduling
- Queue size dynamically tuned according to available resources
- Implemented with NetSolve internals (before
  new non-location transparent interface)
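
The work-queue idea behind the farming call can be sketched as follows. This is a local simulation under stated assumptions: threads stand in for servers, `farm` is a hypothetical helper rather than the `netsl_farm` implementation, and the dynamic tuning of the queue size is not modeled.

```python
import queue
import threading


def farm(task_inputs, solve, num_servers):
    """Run solve() on every input through a shared work queue.

    Results are returned in the same order as the inputs.
    """
    inputs = list(task_inputs)
    work = queue.Queue()
    for index, item in enumerate(inputs):
        work.put((index, item))
    results = [None] * len(inputs)

    def server_loop():
        # Each simulated server keeps pulling tasks until the queue is empty,
        # so faster servers naturally end up processing more tasks.
        while True:
            try:
                index, item = work.get_nowait()
            except queue.Empty:
                return
            results[index] = solve(item)

    threads = [threading.Thread(target=server_loop) for _ in range(num_servers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# 100 independent tasks, mirroring the "i=1,100" range in the call above.
squares = farm(range(1, 101), lambda i: i * i, num_servers=4)
```

Because idle servers pull the next task themselves, no per-task mapping decision is needed, which matches the note that the agent is bypassed for decision making.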
33
Target Application: MCell
  • MCell: 3-D Monte-Carlo simulation of
    neurotransmitter release in between cells.
  • Developed at Salk Institute, Cornell U.
  • Fits the farming semantics and needs NetSolve

[Diagram labels: list of seeds, agent, input files, input scripts, MCell script, NetSolve servers, output files]
34
Farming's shortcomings
  • NetSolve's farming interface is very general
  • Fails to capture application idiosyncrasies

[Diagram labels: Input Files; Task 1 (repeated)]
Taking advantage of file sharing is paramount!
Need for more evolved scheduling facilities
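
Why file sharing matters can be shown with a small sketch. It assumes a hypothetical greedy rule: send a task to a server that already holds its input file, and only otherwise pay for a transfer to the least-loaded server. The function `assign_tasks` and its data layout are illustrative assumptions, not a NetSolve or AppLeS algorithm.

```python
def assign_tasks(tasks, servers):
    """Greedy, file-aware placement.

    tasks is a list of (task_id, input_file) pairs. Returns (placement, transfers),
    where placement maps task_id to a server and transfers counts file copies.
    """
    files_on = {s: set() for s in servers}
    load = {s: 0 for s in servers}
    placement = {}
    transfers = 0
    for task_id, input_file in tasks:
        holders = [s for s in servers if input_file in files_on[s]]
        if holders:
            # Reuse a copy that is already in place.
            target = min(holders, key=lambda s: load[s])
        else:
            # No copy anywhere yet: ship the file to the least-loaded server.
            target = min(servers, key=lambda s: load[s])
            files_on[target].add(input_file)
            transfers += 1
        placement[task_id] = target
        load[target] += 1
    return placement, transfers


tasks = [(1, "a.dat"), (2, "a.dat"), (3, "b.dat"), (4, "b.dat")]
placement, transfers = assign_tasks(tasks, ["s1", "s2"])
```

Here four tasks need only two file transfers, whereas a scheduler that ignores file locations could move a file for every task. The rule deliberately trades some load balance for fewer transfers.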
35
Grid-Scheduling Templates
  • Shortcomings of AppLeS
  • Next generation of schedulers
  • Idea: - class of applications
  • - common scheduler
  • - ready-to-use

  • First template: Parameter Sweep Applications
  • (MCell, INS2D, ...)

- large number of independent tasks
- inputs from files or command line arguments
- output to files
- user provides an executable
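
A parameter sweep as characterized above reduces to generating one independent command line per combination of parameter values. The sketch below shows that step only; the executable name `./simulate`, the flag style, and the helper `sweep_tasks` are hypothetical and chosen purely for illustration.

```python
import itertools
import shlex


def sweep_tasks(executable, parameters):
    """Build one command line per combination of parameter values.

    parameters maps a parameter name to the list of values to sweep over.
    """
    names = sorted(parameters)
    tasks = []
    for values in itertools.product(*(parameters[n] for n in names)):
        args = [executable]
        for name, value in zip(names, values):
            args += [f"--{name}", str(value)]
        tasks.append(shlex.join(args))
    return tasks


# Hypothetical sweep: 2 input files x 3 random seeds = 6 independent tasks.
tasks = sweep_tasks("./simulate", {"seed": [1, 2, 3], "input": ["a.mdl", "b.mdl"]})
```

Each resulting string is a self-contained task, so the list can be handed directly to a farming scheduler.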
36
Template structure
A basis for a generic Template structure
[Diagram labels: application-specific interfaces; file describing the entire application; standard interface; data structures describing the application-level information; scheduler; NWS (measurements/predictions); data structures describing the environment; requests for computation resources; monitoring; Grid middleware: NetSolve, ...; the Grid: Globus, ...]
37
Initial implementation
  • First prototype completed last week
    (non-NetSolve)
  • NetSolve code being finished
  • Need for a remote storage infrastructure
  • - GASS too crude for now (no name space)
  • - IBP too early
  • - At the moment, simulation with NetSolve!
  • Scheduler: early prototypes
  • Use of NWS as a plug-in service
  • Great infrastructure to do scheduling research

38
Short- and Long-Term Goals
Short-term
  • Large MCell run with large Condor pools
  • (first complete bio-chemical model of a cell)
  • - scheduling over Condor?
  • - storage infrastructure?
  • Deployment of INS2D in production env.
  • - Globus?

Long-term
  • General template structure
  • New PS applications (Monte-Carlo)
  • New template for new applications

39
Conclusion
  • The Grid is an exciting playground
  • - A lot of things are needed
  • - Many research groups involved
  • - Collaborations difficult but getting
    there
  • - A lot of interest from domain scientists
  • - Still in need of a sociological model
  • - Middleware is a key component.
  • Scheduling is the key to the Grid becoming a
    reality,
  • i.e. usable by more than a selected set of
    people,
  • not in demo mode