Title: Université Paris Sud, LRI
1- Université Paris Sud, LRI
- Parallel Architecture Team
- Cluster and Grid group
- Cecile Germain, Gilles Fedak, Franck Cappello
2 Outline
- Introducing XtremWeb Team
- Several approaches of Metacomputing
- Motivations and goals of XW (Franck)
- Inside XW (Gilles)
- Scientific aspects of XW (Cecile)
- Conclusion
3 Introducing XtremWeb Team
Members (still increasing)
- F. Cappello, CR
- C. Germain, MC
- O. Richard, MC (collaborator)
- G. Fedak, Ph.D. student
- V. Neri, Engineer
Location
- In short: France, Paris area
Hardware resources
- 1 Myrinet platform: 8 multiprocessor nodes (2 procs/node)
- 1 simulation platform: 20 multiprocessor nodes (2 procs/node)
- 1 XtremWeb platform: 8 multiprocessor nodes (to be installed)
Collaborations
- RWCP, IBM Watson, Compaq, ENS Lyon (LIP)
Publications
- Supercomputing, HPCA, PACT, ICPP, Europar, Grid, PPL, FGCS, etc.
4 Three approaches to Metacomputing
- Metacomputing (GRID): building an infrastructure that enables the cooperation of (large) devices (supercomputers, databases, telescopes, virtual reality centers) across wide area networks for a single application.
- Supercomputing Portals: providing a unique interface for accessing a (large) variety of geographically distributed supercomputers.
- Global Computing: using a very large number of PCs connected to the Internet.
5 Metacomputing
- International Projects
- Globus (infrastructure software)
- Netsolve (scheduling libraries)
- Legion (interface between middleware and OO applications)
- Ninf
- Data Grid (European project)
- Eurogrid (European forum)
- NCG (Nation wide Computational Grid) NASA.
- ...
- International Workshop on Global and Cluster Computing (WGCC'2000)
- http://pdplab.trc.rwcp.or.jp/pdperf/wgcc2000.html
6 Supercomputing Portals
- Projects
- Unicore (infrastructure). European
- Academic Projects
- Industrial Projects (Artabel, CS)
7 Global Computing: very first experiments
- Well known Applications
- RC5 and DES code cracking
- Finding Mersenne prime numbers
- RSA code cracking
- SETI@home (exploring signals of the universe)
- First results 1996-1997
- Number of participants: 250, 3500, 14000 PCs in 1997
- 35 k persons in the SETI@home mailing list in 1997 (2 million participants currently)
8 Global Computing: currently building very large platforms (commercial issues)
- International projects
- SETI@home (dedicated to one application)
- Entropia (infrastructure) US
- Andrew Chien, Larry Smarr
- Distributed.net (infrastructure) US
- Nimrod-G (cycle stealing) Australia
- XtremWeb (infrastructure) Europe
- Typical number of participants: 100 K, 1 M
9 Motivations for Global Computing (1)
Project AUGER: understanding very high energy cosmic rays (10^20 eV). Physicists are unable to reproduce them on Earth. 1 ray per century per km^2. Possible origin: galaxy collisions.
-> Building 2 very large detectors, in South and North America
-> Simulating a huge number of rays entering the atmosphere (air showers) and comparing them with the detectors' measurements
Some applications need extraordinary computing power to carry out a very large set of independent calculations
10 Motivations for Global Computing (2)
- Today
- A very large number of resources are connected to the Internet
- The number of accessible computing resources is several million
- Tomorrow
- Post-PC era: a very large number of mobile devices (phone, palmtop PC, route planner, etc.). About 1 billion connected computing resources
- Mobile objects will use wireless communication (Palm VII). They should be much more available for Global Computing than PCs connected via phone lines and modems.
Use idle resources to build a Very Large Parallel Computer
11 Motivations for Global Computing (3)
- Today
- Cycle stealing is widely used by industry for simulations
- Most cycle stealing systems (Condor, Mosix, Glunix, etc.) have been designed for local area networks (a single administration domain)
- Tomorrow
- Cycle stealing will concern the workstations and servers of multi-site companies. The geographically distributed resources will be managed as a single set of resources
Cycle stealing across the Internet
12 Motivations for Global Computing (4)
- A new Parallel Architecture
- A very large number of resources (>> parallel computers)
- Very poor communication performance
- Common issues with Distributed systems
- Load balancing
- Fault tolerance
- New issues
- How to program this new architecture?
- Application domains (EP, multi-parameter simulations, porting well-known high performance applications)
- Performance evaluation (benchmarks, measures, parameters)
- Performance models (algorithmic: BSP-like; performance: LogP-like)
- New algorithms
- Economic model
A very large field for academic research in Computer Science
13 XtremWeb objective (1): A platform to investigate Global Computing system issues
Using PCs connected to the Internet during their idle time
[Figure: XtremWeb Work Server (LRI), x 1000 volunteer PCs connected through the Internet, XtremWeb Result Collector (LRI?)]
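The organisation shown on this slide (a work server distributing tasks, volunteer PCs computing during their idle time, a result collector gathering outputs) can be illustrated with a small in-process simulation. This is only a sketch under stated assumptions, not XtremWeb's actual code: the names WorkServer, ResultCollector, volunteer_loop and run_simulation are hypothetical, threads stand in for volunteer PCs, and the task is a trivial squaring function.

```python
import queue
import threading


class WorkServer:
    """Hypothetical work server: hands out independent tasks on request."""

    def __init__(self, tasks):
        self._tasks = queue.Queue()
        for task in tasks:
            self._tasks.put(task)

    def request_work(self):
        """Return the next (task_id, payload) pair, or None when no work is left."""
        try:
            return self._tasks.get_nowait()
        except queue.Empty:
            return None


class ResultCollector:
    """Hypothetical result collector: stores results keyed by task id."""

    def __init__(self):
        self._lock = threading.Lock()
        self.results = {}

    def submit(self, task_id, value):
        with self._lock:
            self.results[task_id] = value


def volunteer_loop(server, collector, compute):
    """A simulated volunteer PC pulls work until the server has none left."""
    while True:
        task = server.request_work()
        if task is None:
            return
        task_id, payload = task
        collector.submit(task_id, compute(payload))


def run_simulation(n_tasks=100, n_volunteers=8):
    """Run n_tasks squaring tasks on n_volunteers simulated volunteers."""
    server = WorkServer((i, i) for i in range(n_tasks))
    collector = ResultCollector()
    volunteers = [
        threading.Thread(target=volunteer_loop,
                         args=(server, collector, lambda x: x * x))
        for _ in range(n_volunteers)
    ]
    for v in volunteers:
        v.start()
    for v in volunteers:
        v.join()
    return collector.results
```

In this sketch the volunteers pull work from the server rather than the server pushing work to them; that choice is an assumption made here for simplicity, since a pull model lets a volunteer simply stop asking when its owner needs the machine back.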
14 XtremWeb objective (2): Meeting the general properties of a Global Computing system
- Scalability: such a system must scale up to 100 k or even 1 M machines
- Heterogeneity: workers may have different hardware and OS
- Dynamicity: the number of workers evolves continuously
- Availability: the resource owner should be able to define sharing policies for its resource
- Fault tolerance: the system (and maybe the applications) must be able to work normally even if some fundamental elements are defective
- Usability: the system must be easy to program and maintain
- Security: the system must be secure for the workers, the servers and the [...]. A malicious worker should not be able to corrupt the application. A malicious agent should not be viewed as a regular XtremWeb-approved server
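One common way to keep a malicious worker from corrupting an application is to run each task on several workers and accept a value only when a strict majority agree. The slides do not say which mechanism XtremWeb uses, so the sketch below is an illustration under that assumption only; the function name majority_result and the worker identifiers are hypothetical.

```python
from collections import Counter


def majority_result(replies):
    """Return the value reported by a strict majority of workers, else None.

    `replies` maps a worker id to the value that worker returned for one
    task. Values must be hashable.
    """
    if not replies:
        return None
    # most_common(1) yields the (value, vote_count) pair with the most votes.
    value, votes = Counter(replies.values()).most_common(1)[0]
    return value if votes * 2 > len(replies) else None
```

For example, with replies {"pc-1": 42, "pc-2": 42, "pc-3": 7} the function returns 42, while a one-against-one disagreement returns None and the task would have to be rescheduled. The price of this approach is that every task is computed several times.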
15 XtremWeb objective (3): Specific properties
- Multi-application
- High performance
16 XtremWeb objective (4): Growth
1. LRI -> 100 machines -> September 2000
2. Université Paris Sud -> 2 000 machines -> 1/2 2001
3. Universities/High Schools -> 10 000 machines -> end of 2001 ???
4. Foreign universities -> 20 000 machines -> ???
5. Volunteer PCs -> 100 000 machines -> ???
We are participating in a very large project gathering 19 research centers around the world. This project should gather about 5 000 machines.
17 Conclusion
- The only European Global Computing project (to our knowledge)
- Aims: design, performance, and theoretical issues of Global Computing
- Objective: 100 000 machines (personal devices)
- We are seeking very large applications
- We are seeking international partners (offering their resources during idle time and looking for an academic Global Computing platform)
18 IEEE CCGrid 2001
16 - 18 May 2001 in Brisbane, Australia
- Papers Due: 4 November 2000
- Notification of Acceptance: 20 December 2000
- Camera Ready Papers Due: 24 January 2001
Session: Global Computing on personal devices
- Web/grid computing middleware, environments and toolkits
- Programming models, languages, scripting tools
- Processing algorithms for typical web/grid configurations (large number of processors, limited bandwidth, occasional faults)
- Resource management, reservation and scheduling
- Performance evaluation and modelling of grid systems
- Web/grid security, management and monitoring
- Compute driven and I/O (storage) driven web/grid applications (scientific, engineering, and business)
- Integration of legacy systems via the web
- Web/grid services and economies of computational grids
Session Co-Chairs
- Franck Cappello, CNRS, Universite Paris-Sud, France, fci_at_lri.lri.fr
- Spyros Lalis, Institute for Computer Science, Foundation for Research and Technology, Greece, lalis_at_ics.forth.gr