The Inferno Grid (and the Reading Campus Grid) - PowerPoint PPT Presentation

1
The Inferno Grid(and the Reading Campus Grid)
  • Jon Blower
  • Reading e-Science Centre
  • Many others in the School of Systems Engineering
    and IT Services

http://www.resc.rdg.ac.uk    resc@rdg.ac.uk
2
Introduction
  • Reading is in the early stages of Campus Grid
    construction
  • Currently consists of two flocked Condor pools
  • More of which later
  • Also experimenting with the Inferno Grid
  • Condor-like system for pooling ordinary desktops
  • Although (like Condor) it could be used for more
    than this
  • The Inferno Grid is commercial software but free
    to UK e-Science community
  • Secure, low maintenance, firewall-friendly
  • Perhaps not (yet) as feature-rich as Condor

3
The Inferno operating system
  • The Inferno Grid is based upon the Inferno OS
  • Inferno OS is built from the ground up for
    distributed computing
  • Mature technology, good pedigree (Bell Labs;
    Pike, Ritchie)
  • Extremely lightweight (about 1 MB RAM) so can run
    as an emulated application, identically, on
    multiple platforms (Linux, Windows, etc.)
  • Hence it is a powerful base for Grid middleware
  • Everything in Inferno is represented as a file or
    set of files
  • cf. /dev/mouse in Unix
  • So to create a distributed system, you just have
    to know how to share files; Inferno uses a
    protocol called Styx for this
  • Inferno OS is released under Liberal Licence
    (free and open source) for non-commercial use
  • Can run applications in the host OS (Linux,
    Windows etc)
  • Secure certificate-based authentication, plus
    strong encryption built-in at OS level
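
The "everything is a file" idea above can be sketched in a few lines of plain Python (not Inferno/Limbo): a service exposes its state as named files whose contents are generated on read, so a client needs only file operations to use it. The class and file names here are illustrative, not Inferno's actual namespace.

```python
# Minimal sketch of a synthetic filesystem: reading a "file"
# invokes a generator function, the way reading a device file
# in Inferno or Unix (cf. /dev/mouse) returns live state.
import time

class SyntheticFS:
    """Maps file names to functions that produce their contents."""
    def __init__(self):
        self._files = {}

    def serve(self, name, generator):
        self._files[name] = generator

    def read(self, name):
        # Each read re-runs the generator, so contents are live.
        return self._files[name]()

fs = SyntheticFS()
fs.serve("/dev/time", lambda: str(time.time_ns()))
fs.serve("/dev/sysname", lambda: "worker01")

print(fs.read("/dev/sysname"))  # -> worker01
```

Sharing such a namespace over Styx is then what turns local services into a distributed system.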

4
The Inferno Grid
  • Built as an application in the Inferno OS
  • Hence uses OSs built-in security and ease of
    distribution
  • Can run under all platforms that Inferno OS runs
    on
  • Essentially high-throughput computing cf. Condor
  • Created by Vita Nuova (http://www.vitanuova.com)
  • Free academic licence, but also used for real
  • Evotec OAI (speeds up drug discovery): 90%
    utilisation of machines
  • Major government department modelling disease
    spread in mammals
  • Other major company (can't say more!)
  • University installations at Reading and York
  • (At AHM2004, an Inferno Grid was created from
    scratch easily)

5
Layered architecture (top to bottom):
  • Inferno Grid software (a Limbo program)
  • Inferno OS (virtual OS)
  • Host OS (Windows, Linux, Mac OS X, Solaris,
    FreeBSD)
(Can also run Inferno native on bare hardware.)
Could write all applications in Limbo (Inferno's
own language) and run on all platforms, guaranteed!
6
Inferno Grid system overview
  • Matches jobs submitted to abilities of worker
    nodes
  • The whole show is run by a scheduler machine
  • Jobs are ordinary Windows/Linux/Mac executables
  • Process is different from that of Condor
  • Unless Condor has changed/is changing
  • In Condor, workers run daemon processes that wait
    for jobs to be sent to them
  • i.e. scheduler-push
  • Requires incoming ports to be open on each worker
    node
  • In the Inferno Grid, workers dial into the
    scheduler and ask "have you got any work for me?"
  • i.e. worker-pull or "labour exchange"
  • No incoming ports need to be open
  • Doesn't poll; uses persistent connections
  • Studies have shown this to be more efficient (not
    sure which ones!)
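
The worker-pull model above can be sketched as follows. This is an illustrative Python toy, not the Inferno Grid API: the scheduler only ever answers requests, and the worker makes outgoing calls, so no incoming ports are needed on the worker side.

```python
# Sketch of "labour exchange" scheduling: workers pull jobs from
# the scheduler instead of the scheduler pushing jobs to workers.
import queue

class Scheduler:
    def __init__(self):
        self.jobs = queue.Queue()

    def submit(self, job):
        self.jobs.put(job)

    def next_job(self):
        """Answer a worker's 'have you got any work for me?'"""
        try:
            return self.jobs.get_nowait()
        except queue.Empty:
            return None  # nothing to do right now

def worker_pull(scheduler, run):
    # Over a real persistent connection the worker would block
    # here rather than poll; this toy just drains the queue.
    results = []
    while (job := scheduler.next_job()) is not None:
        results.append(run(job))
    return results

sched = Scheduler()
sched.submit("job-a")
sched.submit("job-b")
print(worker_pull(sched, lambda j: j.upper()))  # -> ['JOB-A', 'JOB-B']
```

In the scheduler-push model the direction of the first call is reversed, which is exactly why each Condor worker must accept incoming connections.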

7
Architecture
  • Worker firewalls: no incoming ports open; single
    outgoing port open (to a fixed, known server)
  • Workers can be in different admin domains
  • Workers can connect and disconnect at will
  • Job submission is via the supplied GUI; could
    create other apps (command-line, Web interface)
  • Scheduler firewall: single incoming port open
  • Scheduler listens for job submissions and for
    workers reporting for duty
8
Job Administration
9
Node Administration
10
Pros and Cons
  • Pros
  • Easy to install and maintain
  • Good security
  • See next slide
  • Industry quality
  • Cons
  • Small user base and not-great documentation
  • Hence a learning curve
  • Doesn't have all of Condor's features
  • E.g. migration, MPI universe, reducing impact on
    primary users
  • No Globus integration yet
  • But probably not hard to write a JobManager for
    Inferno?
  • Security mechanism is Inferno's own
  • But might see other mechanisms in Inferno in
    future
  • Question over scalability (100s of machines,
    fine; 1000s, not sure)
  • Inferno Grids don't flock yet

11
Security and impact on primary users
  • Only one incoming port on the scheduler needs to
    be open through the firewall
  • Nothing runs as root
  • All connections in the Inferno Grid can be
    authenticated and encrypted
  • Public-key certificates for auth, variety of
    encryption algs
  • Cert. usage is transparent; the user is not aware
    it's there
  • Similar to SSL in principle
  • Can set up worker nodes to only run certain jobs
  • So can prevent arbitrary code from being run
  • Doesn't have all of Condor's options for pausing
    jobs on keyboard press, etc.
  • Runs jobs under low priority
  • But could set up so that workers don't ask for
    work if they are loaded
  • But what happens to a job that has already
    started?
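
Two of the policies above are easy to picture as code. This is a hypothetical sketch, not the Inferno Grid's real configuration: a worker with a job whitelist (preventing arbitrary code) that only asks for work when the local load is low (reducing impact on the primary user).

```python
# Illustrative worker-side policies; names and threshold are
# assumptions, not part of the actual Inferno Grid software.

ALLOWED_JOBS = {"render_frame", "fold_protein"}  # pre-approved executables

def may_run(job_name):
    # Whitelist check: reject anything not explicitly allowed.
    return job_name in ALLOWED_JOBS

def should_request_work(load_average, threshold=0.5):
    # Only dial the scheduler when the machine looks idle.
    return load_average < threshold

print(may_run("fold_protein"), may_run("rm_everything"))    # -> True False
print(should_request_work(0.2), should_request_work(2.0))   # -> True False
```

Note the open question on the slide remains: these checks happen before a job starts; they say nothing about a job already running when the primary user returns.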

12
Other points
  • Slow-running tasks are reallocated until whole
    job is finished.
  • Could fairly easily write different front-ends
    for Inferno Grid for job submission and
    monitoring
  • Don't have to use the supplied GUI
  • ReSC's JStyx library could be used to write a
    Java GUI or JSP
  • In fact, code base is small enough to make
    significant customisation realistic
  • Customise worker node behaviour
  • Flocking probably not hard to do
  • Schedulers could exchange jobs
  • Or workers could know about more than one
    scheduler
  • Inferno OS can be used to very easily create a
    distributed data store
  • This data store can link directly with the
    Inferno Grid
  • Caveat: we haven't really used this in anger yet!
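
The reallocation point at the top of this slide can be sketched simply: a task that is still running past some deadline is reissued to another worker, and the job is done when every task has finished somewhere. The function and deadline below are illustrative assumptions.

```python
# Sketch of slow-task reallocation (speculative re-execution):
# pick out running tasks that have exceeded the deadline so the
# scheduler can hand them to another worker.
def tasks_to_reissue(status, elapsed, deadline=60.0):
    """status: task -> 'done' | 'running'
    elapsed: task -> seconds since the task started."""
    return [t for t, s in status.items()
            if s == "running" and elapsed[t] > deadline]

status  = {"t1": "done", "t2": "running", "t3": "running"}
elapsed = {"t1": 30.0,   "t2": 90.0,      "t3": 10.0}
print(tasks_to_reissue(status, elapsed))  # -> ['t2']
```

Whichever copy of a reissued task finishes first counts, so one slow machine cannot hold up the whole job.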

13
Building an Inferno Grid in this room
  • These are conservative estimates (I think)
  • Install scheduler (Linux machine): 10 minutes
  • Install worker node software (Windows): 2
    minutes each
  • Run a toy job and monitor it: within 15 minutes
    of the start
  • Set up Inferno Certificate Authority: 1 minute
  • Provide Inferno certificates to all worker nodes:
    2 minutes per node
  • Provide Inferno certs to users and admins: 2
    minutes each
  • Fully-secured (small) Inferno Grid up and ready
    in an hour or two
  • If you know what you're doing! (remember that the
    docs aren't so good)
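
The per-step minutes above turn into a quick back-of-envelope calculation (the numbers come straight from this slide; the example sizes are made up):

```python
# Setup-time estimate for a small, fully-secured Inferno Grid,
# using the per-step minutes quoted on this slide.
def setup_minutes(n_workers, n_users_admins):
    scheduler = 10                 # install scheduler (Linux)
    workers   = 2 * n_workers      # worker software, 2 min each
    ca        = 1                  # Inferno Certificate Authority
    certs     = 2 * n_workers      # certificate per worker node
    users     = 2 * n_users_admins # cert per user/admin
    return scheduler + workers + ca + certs + users

print(setup_minutes(20, 5))  # -> 101 minutes, i.e. "an hour or two"
```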

14
Reading Campus Grid so far
  • Collaboration between School of Systems
    Engineering, IT Services and e-Science Centre
  • Haven't had as much time as we'd like to
    investigate the Inferno Grid
  • But have an embryonic Campus Grid of two
    flocked Condor pools
  • Although both are at Reading, they come under
    different admin domains
  • Getting them to share data space was challenging,
    and firewalls caused initial problems
  • (Incidentally, the Inferno Grid had no problems
    at all crossing the domains)
  • Small number of users running MPI and batch jobs
  • Animal and Microbial Sciences, Environmental
    Systems Science Centre
  • Ran demo project for University
  • A heroic effort at the moment, but we are
    trying to secure funding

15
Novel features of RCG
  • Problem: most machines are Windows but most
    people want a *nix environment for scientific
    programs
  • Diskless Condor
  • Windows machines reboot into Linux overnight
  • Loads Linux from a network-shared disk image
  • Uses networked resources only (zero impact on
    hard drive)
  • In the morning, reboots back into Windows
  • Looking into CoLinux (www.colinux.org)
  • Free VM technology for running Linux under
    Windows
  • Early days, but initial look is promising.

16
Future work
  • Try to get funding!
  • Intention is to make CG key part of campus
    infrastructure
  • IT Services are supportive
  • Installation of SRB for distributed data store
  • Add clusters/HPC resources to Campus Grid
  • Working towards NGS compatibility

17
Conclusions
  • Inferno Grid has lots of good points, especially
    in terms of security, ease of installation and
    maintenance
  • Should be attractive to IT Services
  • We haven't used it in anger yet, but it is used
    successfully by others (in academia, industry and
    government)
  • Caveat: these people tend to run a single app (or
    a small number of apps) rather than general code
  • Doesn't have all of Condor's features
  • We don't want to fragment effort or become
    marginalised
  • Would be great to see the good features of
    Inferno appear in Condor, esp. the worker-pull
    mechanism