Title: The Ibis e-Science Software Framework
1The Ibis e-Science Software Framework
- Henri Bal
- High Performance Distributed Computing group
- Department of Computer Science
- VU University, Amsterdam, The Netherlands
- Frank J. Seinstra, Jason Maassen, Niels Drost
- Netherlands eScience Center
2Introduction
- Distributed systems continue to change
- Clusters, grids, clouds, mobile devices
- Distributed applications continue to change
- e-Science, web, pervasive applications
- Distributed programming continues to be
notoriously difficult
3Distributed Systems 1980s: Multiple PCs on a (local) network
- Networks of Workstations (NOWs)
- Collections of Workstations (COWs)
- Processor pools
- Condor pools
- Clusters
4Distributed Systems 1990s: Sharing wide-area resources
- Metacomputing (Smarr & Catlett, CACM)
- Flocking Condor (Epema)
- DAS (Distributed ASCI Supercomputer)
- Grid Blueprint (Foster & Kesselman)
- Desktop grids, SETI@home
5Distributed Systems 2000s
- Cloud computing
- Pay-on-demand
- Virtualization
- Hardware diversity /heterogeneous computing
- Green IT
- The Networked World
- Sensor networks
- Smart phones
6Our approach
- Study fundamental underlying problems
- hand-in-hand with realistic applications
- integrate solutions in one system: Ibis!
[Figure labels: User, Distributed Systems]
7Ibis History
- Started as NWO project (2002)
- VL-e (2003-2009)
- EU (JavaGAT, XtreemOS, Contrail)
- VU grant Frank Seinstra (2008-2012)
- Currently: COMMIT and the Netherlands eScience Center
8- COMMIT is a public-private research community solving grand challenges in information and communication science, shaping tomorrow's society.
- COMMIT has 15 projects and 200 people in 80 organisations such as universities, TNO, Thales, Logica, Philips, AMC, and SMEs like DevLab, Hyves, Waag.
- COMMIT cooperates closely with EIT ICT-Labs.
- COMMIT delivers science, disseminates its results, measures its impact, generates synergy.
10Outline
- Problem Solving vs. System Fighting
- Jungle Computing
- Example applications
- Computational Astrophysics
- Multimedia Content Analysis
- The Ibis Software Framework
- The 3 Common Uses of Ibis
- Master Key, Glue, HPC
- Some current work
11- Ibis: Problem Solving vs. System Fighting
12A Random Example: Supernova Detection
- DACH 2008, Japan
- Distributed multi-cluster system
- Heterogeneous
- Distributed database (image pairs)
- Large vs small databases/images
- Partial replication
- Image-pair comparison given (in C)
- Find all supernova candidates
- Task 1: As fast as possible
- Task 2: The same, under system crashes
13Problem Solving vs. System Fighting
- All participating teams struggled (1 month)
- Middleware instabilities
- Connectivity problems
- Load balancing
- But not the Ibis team
- Winner (by far) in both categories
- Note: many Japanese teams with years of experience
- Hardware, middleware, network, C-code, image data
- Focus on problem solving, not system fighting
- incl. opening of black-box C-code
14Ibis Results: Awards & Prizes
1st Prize DACH 2008 - BS
1st Prize DACH 2008 - FT
AAAI-VC 2007 Most Visionary Research Award
WebPIE: A Web-Scale Parallel Inference Engine. J. Urbani, S. Kotoulas, J. Maassen, N. Drost, F.J. Seinstra, F. van Harmelen, and H.E. Bal
1st Prize SCALE 2008
3rd Prize ISWC 2008
1st Prize SCALE 2010
- Many domains: data/compute intensive, real-time, ...
- Winner of the Sustainability Award in the Enlighten Your Research (EYR) competition, 7 Dec. 2011 (Frank Seinstra)
15Ibis Users
and many more
17Jungle Computing (Frank Seinstra)
- Worst case computing as required by end-users
- Distributed
- Heterogeneous
- Hierarchical (incl. multi-/many-cores)
18Why Jungle Computing?
- Scientists are often forced to use a wide variety of resources simultaneously to solve computational problems, e.g. due to:
- Desire for scalability
- Distributed nature of (input) data
- Software heterogeneity (e.g. mix of C/MPI and CUDA)
- Ad hoc hardware availability
- Energy consumption (use most energy-efficient resource)
- Note: most users do not need the worst-case jungle
- Ibis aims to apply to any subset
19Example Application Domains
- Computational Astrophysics (Leiden)
- AMUSE multi-model / multi-kernel simulations
- Simulating the Universe on an Intercontinental Grid
- Portegies Zwart et al (IEEE Computer, Aug 2010)
- Climate Modeling (Utrecht)
- CPL multi-model / multi-kernel simulations
- Atmosphere, ocean, source rock formation, ...
- hardware (potentially) very diverse
- high resolution > speed and scalability
20- Domain Example 1
- Computational Astrophysics
21Domain Example 1: Computational Astrophysics
Demonstrated live at SC11, Nov 12-18, 2011, Seattle, USA
22Domain Example 1: Computational Astrophysics
- The AMUSE system (Leiden University)
- Early Star Cluster Evolution, including gas
- Gravitational dynamics (N-body): GPU / GPU-cluster
- Stellar evolution: Beowulf cluster / Cloud
- Hydro-dynamics, Radiative transport: Supercomputer
[Figure: AMUSE coupling gravitational dynamics, hydro-dynamics, stellar evolution, and radiative transport]
23Domain Example 1: Computational Astrophysics
Demonstrated live at SC11, Nov 12-18, 2011, Seattle, USA
24- Domain Example 2
- Multimedia Content Analysis
25Multimedia Content Analysis (MMCA)
- Aim
- Automatic extraction of semantic concepts from image sets and video streams
- Depending on specific problem and size of data set
- May take hours, days, weeks, months, years
26Multimedia Content Analysis (MMCA)
- Applications in (a.o.)
- Remote Sensing
- Security / Surveillance
- Medical Imaging
- Document Analysis
- Multimedia Systems
- Astronomy
- Application types
- Real-time vs. off-line
- Fine-grained vs. coarse-grained
- Data-intensive / compute-intensive /
information-intensive
27Domain Example 2: Color-based Object Recognition by a Grid-connected Robot Dog
Seinstra et al (IEEE Multimedia, Oct-Dec 2007)
Seinstra et al (AAAI'07 Most Visionary Research Award)
28Successful
- but many fundamental problems unsolved!
- Scaling up to very large systems
- Platform independence
- Middleware independence
- Connectivity (a.o. firewalls, ...)
- Fault-tolerance
- Software support tool(s) urgently needed!
- Jungle-aware, transparent, efficient
- No progress until discovery of Ibis
29- The Ibis Software Framework
30The Ibis Software Framework
- Offers all functionality to efficiently and transparently implement and run Jungle Computing applications
- Designed for dynamic / hostile environments
- Modular and flexible
- Allows replacement of Ibis components by external ones, including native code
- Open source
- Download: http://www.cs.vu.nl/ibis/
31Ibis Design
- Applications need functionality for
- Programming (as in programming languages)
- Deployment (as in operating systems)
32Ibis Software Stack
33JavaGAT
- Java Grid Application Toolkit
- High-level API for developing (Grid) applications independently of the underlying (Grid) middleware
- Use (Grid) services: file copy, resource discovery, job submission, ...
- Developed in EU GridLab project
- Thilo Kielmann, Rob van Nieuwpoort
- SAGA API standardized by OGF
- Simple API for Grid Applications (a.o. with LSU)
- SAGA on top of JavaGAT
34Zorilla
- A prototype P2P middleware
- A Zorilla system consists of a collection of nodes, connected by a P2P network
- Each node independently implements all middleware functionality
- No central components
- Supports fault-tolerance and malleability
- Easily combines resources in multiple
administrative domains
35IbisDeploy
36Ibis Portability Layer (IPL)
- Java-centric run-anywhere communication library
- Sent along with your application
- MPI for the Grid
- Supports fault-tolerance and malleability
- Resource tracking (Join-Elect-Leave model)
- Open-world / Closed-world
- Efficient
- Highly optimized object serialization
- Can use optimized native libraries (e.g. MPI,
Infiniband)
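The Join-Elect-Leave model named above can be illustrated with a minimal in-memory sketch. This is not the IPL API: the class `MembershipRegistry`, its method names, and the node names are hypothetical, and a real registry would be a networked service rather than a local object. The sketch only shows the three operations: nodes join at any time (open world), one member wins a named election, and a departing member vacates the elections it had won.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical in-memory registry illustrating join / elect / leave.
// It is NOT the IPL API; all names here are assumptions.
class MembershipRegistry {
    private final Set<String> members = new LinkedHashSet<>();
    private final Map<String, String> elections = new HashMap<>();

    // A node joins the pool; in an open world this may happen at any time.
    synchronized void join(String node) {
        members.add(node);
    }

    // A node leaves (or is declared dead); elections it had won become vacant.
    synchronized void leave(String node) {
        members.remove(node);
        elections.values().removeIf(winner -> winner.equals(node));
    }

    // The first member to stand for an election wins it; later callers
    // get the same winner back. Non-members cannot win.
    synchronized Optional<String> elect(String election, String candidate) {
        if (!members.contains(candidate)) {
            return Optional.ofNullable(elections.get(election));
        }
        return Optional.of(elections.computeIfAbsent(election, e -> candidate));
    }

    synchronized int size() {
        return members.size();
    }
}

class JoinElectLeaveDemo {
    public static void main(String[] args) {
        MembershipRegistry registry = new MembershipRegistry();
        registry.join("node-A");
        registry.join("node-B");
        System.out.println(registry.elect("master", "node-A")); // node-A wins
        System.out.println(registry.elect("master", "node-B")); // node-A is still master
        registry.leave("node-A");                                // the master disappears
        System.out.println(registry.elect("master", "node-B")); // node-B takes over
    }
}
```

Re-running the election after a member leaves is one simple way to express fault-tolerance and malleability on top of such a registry.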
37SmartSockets
- Robust connection setup
- Always a connection, in 30 different scenarios
Problems: Firewalls, Network Address Translation (NAT), Non-routed networks, Multi-homing
38Ibis Programming Models
- IPL-based programming models, a.o.
- Satin
- A divide-and-conquer model
- MPJ
- The MPI binding for Java
- RMI
- Object-oriented Remote Procedure Call
- Jorus
- A user-transparent parallel model for multimedia applications
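As an illustration of the divide-and-conquer model listed for Satin, the sketch below sums an array by recursively splitting it in two. It deliberately uses the standard JDK fork/join framework as a stand-in, so it is not Satin code and says nothing about Satin's own interface. The class names and the `THRESHOLD` value are assumptions chosen for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Divide-and-conquer sum using the JDK fork/join framework.
// A stand-in for the model only; this is NOT Satin's API.
class RangeSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // below this size, solve sequentially
    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    RangeSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        RangeSum left = new RangeSum(data, from, mid);
        RangeSum right = new RangeSum(data, mid, to);
        left.fork();                     // divide: the left half may run on another worker
        long rightSum = right.compute(); // solve the right half in this thread
        return left.join() + rightSum;   // conquer: wait for the left half and combine
    }

    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new RangeSum(data, 0, data.length));
    }
}

class DivideAndConquerDemo {
    public static void main(String[] args) {
        long[] data = new long[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(RangeSum.sum(data)); // sum of 1..100000
    }
}
```

The JDK pool balances work across the cores of one machine; balancing the same kind of task tree across clusters is the harder problem a jungle-aware model has to address.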
39- The 3 Common Uses of Ibis
40Ibis as Master Key (or Passepartout)
- Use JavaGAT to access any system
- Develop/run applications independently of available middlewares
- JavaGAT adaptors required for each middleware
- Intelligent dispatching even allows for transparent use of multiple middlewares
- Example: file copy
- JavaGAT vs. Globus
- Simple, portable, ...
- SAGA API standardized
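The adaptor idea behind the "master key" can be sketched as follows: one small interface, one implementation per middleware, and a dispatcher that tries the adaptors in turn until one succeeds. This is a hypothetical illustration, not the JavaGAT API. The names `FileCopyAdaptor` and `FileCopyDispatcher`, the stub adaptors, and the URI schemes are all assumptions, and the dispatching in JavaGAT itself may well be more elaborate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical adaptor interface: one implementation per middleware.
// NOT the JavaGAT API; all names here are assumptions.
interface FileCopyAdaptor {
    String name();
    void copy(String source, String destination) throws Exception;
}

// Tries each adaptor in order until one succeeds.
class FileCopyDispatcher {
    private final List<FileCopyAdaptor> adaptors;

    FileCopyDispatcher(List<FileCopyAdaptor> adaptors) {
        this.adaptors = adaptors;
    }

    // Returns the name of the adaptor that performed the copy.
    String copy(String source, String destination) {
        List<String> failures = new ArrayList<>();
        for (FileCopyAdaptor adaptor : adaptors) {
            try {
                adaptor.copy(source, destination);
                return adaptor.name();
            } catch (Exception e) {
                failures.add(adaptor.name() + ": " + e.getMessage());
            }
        }
        throw new IllegalStateException("no adaptor could copy " + source + " " + failures);
    }
}

class DispatchDemo {
    // Stub adaptor that only accepts sources with the given URI scheme.
    static FileCopyAdaptor stub(String name, String scheme) {
        return new FileCopyAdaptor() {
            public String name() {
                return name;
            }

            public void copy(String source, String destination) throws Exception {
                if (!source.startsWith(scheme + "://")) {
                    throw new Exception("unsupported scheme");
                }
                // A real adaptor would call its middleware here.
            }
        };
    }

    public static void main(String[] args) {
        FileCopyDispatcher dispatcher = new FileCopyDispatcher(
                List.of(stub("gridftp-stub", "gsiftp"), stub("ssh-stub", "ssh")));
        // The first adaptor rejects the URI, so the second one handles it.
        System.out.println(dispatcher.copy("ssh://host-a/data.bin", "ssh://host-b/data.bin"));
    }
}
```

The application only sees `copy(source, destination)`; which middleware actually moved the file is decided at run time.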
41Ibis as Glue
- Use IPL + SmartSockets, generally for wide-area communication
- Linking up separate activities of an application
- Activities often largely independent tasks implemented in any popular language or model (e.g. C/MPI, CUDA, Fortran, Java)
- Each typically running on a single GPU/node/Cluster/Cloud/...
- Automatically circumvent connectivity problems
- Example
With SmartSockets
No SmartSockets
42Ibis as HPC Solution
- Use Ibis as replacement for e.g. C/MPI code
- Benefits
- (better) portability
- malleability (open world)
- fault-tolerance
- (run-time) task migration
- Downside
- requires recoding
- Comparable speedups
43MMCA Situation in 2004/2005
[Diagram labels: SSH; Parallel Horus Client; Parallel Horus Server; Sockets + SSH Tunneling; C/MPI]
- Code pre-installed at each cluster site
- Unstable / faulty communication
- Connectivity problems
- Execution on each cluster by hand
44Phase 1: Ibis as Master Key (2006)
[Diagram labels: JavaGAT + IbisDeploy; Parallel Horus Client; Parallel Horus Server; Sockets + SSH Tunneling; C/MPI]
- Code pre-installed at each cluster site
- Unstable / faulty communication
- Connectivity problems
- Execution on each cluster by hand
45Phase 2: Ibis as Glue (2006/2007)
[Diagram labels: JavaGAT + IbisDeploy; Parallel Horus Client; Parallel Horus Server; IPL + SmartSockets; C/MPI]
- Code pre-installed at each cluster site
- Unstable / faulty communication
- Connectivity problems
- Execution on each cluster by hand
46Phase 3: Ibis as HPC Solution (2008)
[Diagram labels: JavaGAT + IbisDeploy; Parallel Jorus Client; Parallel Jorus Server; IPL + SmartSockets; Ibis/Java]
- Code pre-installed at each cluster site
- Unstable / faulty communication
- Connectivity problems
- Execution on each cluster by hand
47Master Key, Glue, HPC
- Step-wise conversion to 100% Ibis / Java
- Phase 1: JavaGAT as Master Key
- Phase 2: IPL + SmartSockets as Glue
- Phase 3: Ibis as HPC Solution
- After each phase a fully functional, working solution was available!
- Eventual result
- wall-socket computing from a memory stick
- Remember the Promise of the Grid?
- Awards at AAAI 2007 and CCGrid 2008
49Other current PhD projects using Ibis
- Distributed reasoning over semantic web data
- WebPIE: Parallel reasoner on Web scale
- Written in Java, uses Hadoop (MapReduce)
- Graph applications (HiPG)
- E.g. for bioinformatics applications
- http://www.graph500.org/
- Games and distributed model checking
- Deal with large state space
- GreenClouds
- Distributed smart phone applications
50Green Clouds
- NWO Smart Energy Systems project with Univ. of Amsterdam (Cees de Laat) and SARA
- How to map high-performance applications onto hybrid distributed computing systems, taking both performance and energy consumption into account
- System-level approach to reduce HPC energy consumption
51DAS-4 infrastructure for Green IT
- Sites: VU (74), UvA/MultimediaN (16/36), TU Delft (32), ASTRON (23), Leiden (16)
- Interconnect: SURFnet6, 10 Gb/s lambdas
- Dual quad-core Xeon E5620
- Various accelerators (GPUs, multicores, ...)
- Scientific Linux
- Built by ClusterVision
52Main ideas
- Adapt resources to application needs dynamically, accounting for computational and energy efficiency
- Using Ibis malleability support
- Exploit hardware diversity
- Graphics Processing Units (GPUs) have much higher FLOPS/Watt for many applications
- Use optical and photonic networks
- Build a knowledge base: semantic infrastructure description
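A minimal sketch of the FLOPS/Watt idea above: among the resources that are fast enough for the application, pick the one with the best performance per watt. The `Resource` class, the selection rule, and every number in `main` are made-up assumptions for illustration. They are not measurements from DAS-4 and not how the Green Clouds project actually schedules work.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical resource description; all figures used below are invented.
class Resource {
    final String name;
    final double gflops; // sustained performance for this application
    final double watts;  // power draw while running it

    Resource(String name, double gflops, double watts) {
        this.name = name;
        this.gflops = gflops;
        this.watts = watts;
    }

    double gflopsPerWatt() {
        return gflops / watts;
    }
}

class EnergyAwareSelector {
    // Picks the most energy-efficient resource that still meets a minimum speed.
    static Optional<Resource> pick(List<Resource> candidates, double minGflops) {
        return candidates.stream()
                .filter(r -> r.gflops >= minGflops)
                .max(Comparator.comparingDouble(Resource::gflopsPerWatt));
    }

    public static void main(String[] args) {
        List<Resource> pool = List.of(
                new Resource("cpu-node", 80.0, 200.0),  // 0.4 GFLOPS/W
                new Resource("gpu-node", 500.0, 400.0), // 1.25 GFLOPS/W
                new Resource("low-power", 10.0, 15.0)); // about 0.67 GFLOPS/W
        System.out.println(pick(pool, 50.0).map(r -> r.name).orElse("none"));
    }
}
```

With malleability support, such a choice need not be made once at start-up; it could be re-evaluated while the application runs.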
53Computation Offloading Framework
- Runs on Android, integrates with Eclipse
- Multiple implementations of compute intensive parts
- Remote and local implementation bundled together
- Deals with network connectivity issues (Ibis SmartSockets)
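The bullet about bundling a remote and a local implementation can be sketched in plain Java: both versions of a compute-intensive part sit behind one call, and the local version is the fallback when the remote one cannot be reached. This is not the framework's actual API. `Offloadable`, the `networkAvailable` flag, and the simulated failing remote call are assumptions, and the Android and Eclipse integration is left out entirely.

```java
import java.util.function.Function;

// Hypothetical wrapper bundling a local and a remote implementation of the
// same computation. NOT the offloading framework's API.
class Offloadable<I, O> {
    private final Function<I, O> local;
    private final Function<I, O> remote;

    Offloadable(Function<I, O> local, Function<I, O> remote) {
        this.local = local;
        this.remote = remote;
    }

    // Prefers the remote implementation; falls back to the local one when
    // there is no network or the remote call fails.
    O run(I input, boolean networkAvailable) {
        if (networkAvailable) {
            try {
                return remote.apply(input);
            } catch (RuntimeException e) {
                // Connection lost or server unavailable: fall through to local.
            }
        }
        return local.apply(input);
    }
}

class OffloadDemo {
    // Local implementation of a compute-heavy function (iterative Fibonacci).
    static long fibLocal(int n) {
        long a = 0;
        long b = 1;
        for (int i = 0; i < n; i++) {
            long next = a + b;
            a = b;
            b = next;
        }
        return a;
    }

    public static void main(String[] args) {
        // The "remote" side is simulated by a function that always fails.
        Offloadable<Integer, Long> fib = new Offloadable<Integer, Long>(
                OffloadDemo::fibLocal,
                n -> {
                    throw new IllegalStateException("remote server unreachable");
                });
        System.out.println(fib.run(30, true)); // computed locally after the remote failure
    }
}
```

Because the fallback lives inside the wrapper, the calling code does not change when connectivity comes and goes.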
54Computation Offloading
[Diagram labels: Activity, Stub, Proxy, Local, Remote]
55Conclusions
- Ibis enables problem solving (avoids system fighting)
- Successfully applied in many domains
- Astronomy, multimedia analysis, climate modeling, remote sensing, semantic web, medical imaging, ...
- Data intensive, compute intensive, real-time
- Open source, download
- www.cs.vu.nl/ibis/
56Conclusions (2)
- Jungle Computing is hard
- High-Performance Jungle Computing even harder
- Research into efficient and green Jungle-aware programming models has only just begun
- Ibis provides the basic functionality to efficiently and transparently overcome most Jungle Computing complexities
57Acknowledgements
- Ceriel Jacobs
- Roelof Kemp
- Timo van Kessel
- Thilo Kielmann
- Ela Krepska
- Maarten van Meersbergen
- Rob van Nieuwpoort
- Nick Palmer
- Kees van Reeuwijk
- Jacopo Urbani
- Kees Verstoep
- Ben van Werkhoven
- Gosia Wrzesinska