Using Distributed Computing in Computational Fluid Dynamics - PowerPoint PPT Presentation

Description:

Title * QADPZ * An Open System for Distributed Computing Author: PINGY Last modified by: PINGY Created Date: 1/15/2003 11:26:07 PM Document presentation format – PowerPoint PPT presentation

Slides: 36
Provided by: PING115

Transcript and Presenter's Notes

Title: Using Distributed Computing in Computational Fluid Dynamics


1
Using Distributed Computing in Computational
Fluid Dynamics
Norwegian University of Science and Technology
  • Zoran Constantinescu,
  • Jens Holmen, Pavel Petrovic

13-15 May 2003, Moscow
2
outline
  • need more processing power: simulations, data
    processing, complex algorithms
  • existing systems: supercomputers, clusters of PCs,
    distributed computing
  • QADPZ project: description, advantages,
    applications, current status
  • CFD application: Navier-Stokes solver
  • existing solver using Diffpack and mpich
  • small MPI replacement
  • results and comparison
  • future work

3
problems
  • larger and larger amounts of data are generated
    every day (simulations, measurements, etc.)
  • software applications used for handling this
    data are requiring more and more CPU power
  • simulation, visualization, data processing
  • complex algorithms, e.g. evolutionary algorithms
  • large populations, evaluations very time
    consuming

→ need for parallel processing
4
solution 1
  • use parallel supercomputers
  • access to tens/hundreds of CPUs
  • e.g. NOTUR/NTNU: embla (512 CPUs), gridur (384 CPUs)
  • high speed interconnect, shared memory
  • usually for batch mode processing
  • very expensive (price, maintenance, upgrade)
  • these CPUs are not so powerful anymore
  • e.g. 500 MHz RISC vs. today's 2.5 GHz Pentium4

5
solution 2
  • use clusters of PCs (Beowulf)
  • network of personal computers
  • usually running Linux operating system
  • powerful CPUs (Pentium3/4, Athlon)
  • high speed networking (100 Mbps, 1 Gbps, Myrinet)
  • much cheaper than supercomputers
  • still quite expensive (install, upgrade,
    maintenance)
  • trade higher availability and/or greater
    performance for lower cost

6
solution 3
  • use distributed computing
  • using existing networks of workstations
  • (PCs from labs and offices, connected by LAN)
  • usually running Windows or Linux operating system
  • (also MacOS, Solaris, IRIX, BSD, etc.)
  • powerful CPUs (Pentium3/4, AMD Athlon)
  • high speed networking (100 Mbps)
  • already installed computers: very cheap
  • already existing maintenance
  • easy to have a network of tens/hundreds of
    computers

7
distributed computing
  • specialized applications run on each individual
    computer from the LAN
  • they talk to one or more central servers
  • download a task, solve it, and send back results
  • more suited (easier) for task-parallel
    applications
  • (where the application can be decomposed into
    independent tasks)
  • can also be used for data-parallel applications
  • the number of available CPUs is more dynamic
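
The "download a task, solve it, send back results" cycle above can be sketched in a few lines. This is only an illustration of the task-parallel pattern, using threads inside one process in place of networked computers; the names `run_master`, `worker`, and `solve` are hypothetical and are not part of QADPZ.

```python
import queue
import threading

def run_master(tasks, n_workers, solve):
    """Hand out independent tasks to workers and collect the results."""
    pending = queue.Queue()
    for task_id, payload in enumerate(tasks):
        pending.put((task_id, payload))
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task_id, payload = pending.get_nowait()  # "download a task"
            except queue.Empty:
                return                                   # no work left
            value = solve(payload)                       # "solve it"
            with lock:
                results[task_id] = value                 # "send back results"

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Results are returned in task order, whichever worker computed them.
    return [results[i] for i in range(len(tasks))]

# Ten independent tasks shared by three workers.
squares = run_master(list(range(10)), n_workers=3, solve=lambda x: x * x)
```

Because workers pull tasks when they are free, the same loop keeps working if the number of workers changes between runs, which matches the point about a dynamic number of CPUs.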

8
existing systems
  • SETI@home
  • search for extraterrestrial intelligence
  • analysis of data from radio telescopes
  • client application is very specialized
  • using the Internet to connect clients to the
    server, and to download/upload a task/results
  • no framework for other type of applications
  • no source code available

9
existing systems
  • distributed.net
  • one of the largest "computers" in the world
    (20 TFlops)
  • used for solving computational challenges
  • RC5 cryptography, Optimal Golomb ruler
  • client application is very specialized
  • using the Internet to connect clients to the
    server
  • no framework for other applications
  • no source code available

10
existing systems
  • Condor project (Univ. of Wisconsin)
  • more research oriented computational projects
  • more advanced features, user applications
  • very difficult to install, problems with some
    OSes
  • (started from a Unix environment)
  • restrictive license (closed system)
  • other commercial projects: Entropia, Parabon
  • expensive for research, only certain OSes

11
a new system
  • QADPZ project (NTNU)
  • initial application domains: large-scale
    visualization, genetic algorithms, neural
    networks
  • prototype in early 2001, but abandoned (too
    visualization-oriented)
  • started in July 2001, first release v0.1 in Aug
    2001
  • we are now close to a new release, v0.8 (May 2003)
  • system independent of any specific application
    domain
  • open source project on SourceForge.net

12
http://qadpz.sourceforge.net
13
QADPZ description
  • QADPZ project (NTNU)
  • similar in many ways to Condor (submit computing
    tasks to idle computers running in a network)
  • easy to install, use, and maintain
  • modular and extensible
  • open source project, implemented in C++
  • support for many OSes (Linux, Windows, Unix, ...)
  • support for multiple users, encryption
  • logging and statistics

14
QADPZ architecture
client ↔ master ↔ slave

  • master: management of slaves, scheduling tasks,
    monitoring tasks, controlling tasks
  • slave: background process, reports status,
    downloads tasks/data, executes tasks
  • client: user interface, submits tasks/data

15
communication
data flow
control flow

repository
  • external server: web, ftp, ...
  • internal: lightweight web server

16
the client
  • is the interface to the system
  • describes the job (project file), prepares task code,
    prepares input files, submits the tasks
  • communicates with the master
  • two modes:
  • interactive: stays connected to the master and waits
    for the results
  • batch: detaches from the master and gets results
    later; the master will keep all the messages

17
jobs, tasks, subtasks
  • job
  • consists of groups of tasks executed sequentially
    or in parallel
  • a task can consist of subtasks (the same
    executable is run, but with different input
    data)
  • each task is submitted individually, the user
    specifies which OS and min. resource requirements
    (disk, mem)
  • the master allocates the most suitable slave for
    executing the tasks and notifies the client
  • when task is finished, results are stored as
    specified in the project and the client is
    notified
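
The allocation step above (the user states the OS and minimum disk and memory, the master picks a slave) can be sketched as a filter followed by a ranking. The slides do not say how "most suitable" is decided, so the rule used here (the idle slave with the most free memory) is an assumption, and the `Slave`, `Task`, and `pick_slave` names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Slave:
    name: str
    os: str
    free_disk_mb: int
    free_mem_mb: int
    busy: bool = False

@dataclass
class Task:
    os: str
    min_disk_mb: int
    min_mem_mb: int

def pick_slave(task: Task, slaves: List[Slave]) -> Optional[Slave]:
    """Return an idle slave that meets the task's requirements, or None."""
    candidates = [
        s for s in slaves
        if not s.busy
        and s.os == task.os
        and s.free_disk_mb >= task.min_disk_mb
        and s.free_mem_mb >= task.min_mem_mb
    ]
    if not candidates:
        return None
    # Assumed policy: prefer the candidate with the most free memory.
    return max(candidates, key=lambda s: s.free_mem_mb)

slaves = [
    Slave("lab-01", "win32", free_disk_mb=2000, free_mem_mb=128),
    Slave("node-07", "linux", free_disk_mb=5000, free_mem_mb=512),
    Slave("node-08", "linux", free_disk_mb=5000, free_mem_mb=1024, busy=True),
]
chosen = pick_slave(Task(os="linux", min_disk_mb=100, min_mem_mb=256), slaves)
```

Here `chosen` is `node-07`: `node-08` has more memory but is busy, and `lab-01` runs a different OS.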

18
the tasks
  • normal binary executable code
  • no modifications required
  • must be compiled for each of the used platforms
  • normal interpreted program
  • shell script, Perl, Python
  • Java program
  • requires interpreter/VM on each slave
  • dynamically loaded slave library (our API)
  • better performance
  • more flexibility

19
the master
  • keeps account of all existing slaves (status,
    specifications)
  • usually one master is enough
  • more can be used if there are too many slaves
    (communication protocol allows one master to act
    as another client, but this is not fully
    implemented yet)
  • keeps account of all submitted jobs/tasks
  • keeps a list of accepted users (based on
    username/passwd)
  • gathers statistics about slaves, tasks
  • can optionally run the internal web server for
    the data and code repository

20
the slave

21
OO design
22
communication
  • layered communication protocol
  • exchanged messages are in XML (with
    compression and encryption)
  • uses UDP with a reliable layer on top
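
A minimal sketch of the message layer described above, assuming a simple XML envelope compressed with zlib. The element and attribute names (`message`, `type`) and the helper functions are invented for illustration and do not reflect the actual QADPZ message format; encryption and the reliable layer over UDP are left out.

```python
import zlib
import xml.etree.ElementTree as ET

def encode_message(msg_type, fields):
    """Build an XML message and compress it into a datagram payload."""
    root = ET.Element("message", {"type": msg_type})  # hypothetical schema
    for key, value in fields.items():
        ET.SubElement(root, key).text = str(value)
    return zlib.compress(ET.tostring(root, encoding="utf-8"))

def decode_message(datagram):
    """Decompress a payload and return (type, fields) with string values."""
    root = ET.fromstring(zlib.decompress(datagram))
    return root.get("type"), {child.tag: child.text for child in root}

datagram = encode_message("task_done", {"task_id": 7, "status": "ok"})
kind, fields = decode_message(datagram)
```

After the round trip all field values are strings, so a receiver would need to convert `task_id` back to an integer itself.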

23
configuration
  • slave: qadpz_slave (daemon), slave.cfg
  • master: qadpz_master (daemon), master.cfg;
    qadpz_admin, users.txt, privkey
  • client: qadpz_run, client.cfg, pubkey

Linux, Win32 (9x, 2K, XP), SunOS, IRIX, FreeBSD,
Darwin, MacOSX
24
installation
  • one of our computer labs (Rose lab)
  • 80 PCs: Pentium3, 733 MHz, 128 MBytes
  • dual boot Win2000 and FreeBSD
  • running for several months
  • when a student logs into the computer, the
    slave running on that computer is set to
    disabled mode (no new computing tasks are
    accepted, any current task is killed and/or
    restarted on another computer)
  • our Beowulf cluster (ClustIS)
  • 40 nodes: AMD Athlon, 1466 MHz, 512/1024 MBytes
  • 100 Mbps network, Linux
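
The lab policy described above (a slave stops accepting work when a student logs in, and its running task goes elsewhere) can be sketched as a small state holder. This is an assumption about how such behaviour could be modelled, not the QADPZ slave code; `LabSlave` and its methods are hypothetical names.

```python
from collections import deque

class LabSlave:
    """Toy model of a slave that yields to an interactive user."""

    def __init__(self, name):
        self.name = name
        self.enabled = True
        self.current_task = None

    def accept(self, task):
        """Take a task only when enabled and idle."""
        if not self.enabled or self.current_task is not None:
            return False
        self.current_task = task
        return True

    def user_logged_in(self, pending):
        """Disable the slave and hand any running task back for rescheduling."""
        self.enabled = False
        if self.current_task is not None:
            pending.appendleft(self.current_task)  # restart it on another computer
            self.current_task = None

    def user_logged_out(self):
        self.enabled = True

pending = deque()
pc = LabSlave("lab-01")
started = pc.accept("task-A")
pc.user_logged_in(pending)        # a student sits down at the PC
refused = pc.accept("task-B")     # disabled: no new tasks are accepted
```

The interrupted task is placed at the front of the pending queue so that it is the next one handed out.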

25
status
26
CFD application
  • existing solver for the incompressible
    Navier-Stokes equations
  • using the projection method, the equations for the
    velocity and pressure are solved sequentially at
    each time step
  • using the Galerkin finite element method to
    discretize in space
  • the pressure Poisson equation is preconditioned
    with a mixed additive/multiplicative domain
    decomposition method: one iteration of
    overlapping Schwarz combined with a coarse grid
    correction
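
For orientation, one common textbook form of the projection method is written out here. The slides do not say which variant the solver uses, so the explicit, non-incremental (Chorin-type) scheme and the symbols (velocity u, pressure p, density ρ, kinematic viscosity ν, time step Δt, intermediate velocity u*) are assumptions.

```latex
\begin{aligned}
\frac{\mathbf{u}^{*} - \mathbf{u}^{n}}{\Delta t}
  &= -(\mathbf{u}^{n}\cdot\nabla)\,\mathbf{u}^{n} + \nu\,\nabla^{2}\mathbf{u}^{n}
  && \text{(intermediate velocity)} \\
\nabla^{2} p^{n+1}
  &= \frac{\rho}{\Delta t}\,\nabla\cdot\mathbf{u}^{*}
  && \text{(pressure Poisson equation)} \\
\mathbf{u}^{n+1}
  &= \mathbf{u}^{*} - \frac{\Delta t}{\rho}\,\nabla p^{n+1}
  && \text{(projection onto } \nabla\cdot\mathbf{u}^{n+1} = 0\text{)}
\end{aligned}
```

Taking the divergence of the third line and requiring a divergence-free result gives the second line, which is why the velocity and pressure can be solved one after the other at each time step. The Poisson solve in the second line is the step that the domain decomposition preconditioner targets.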

27
NS solver
  • developed in C++ using the Diffpack object-oriented
    numerical library for differential equations
  • used the parallel version based on standard MPI
    (Message Passing Interface) for communication
  • we used mpich
  • tested on our 40-node PC cluster

28
MPI implementation
  • needed to replace the MPI calls used by Diffpack
  • a subset of the MPI standard
  • MPI_Init, MPI_Finalize
  • MPI_Comm_size, MPI_Comm_rank
  • MPI_Send, MPI_Recv
  • MPI_Isend, MPI_Irecv
  • MPI_Bcast, MPI_Wait
  • MPI_Reduce, MPI_Allreduce
  • MPI_Barrier
  • MPI_Wtime
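
One reason a small MPI replacement is feasible is that the collective calls in this list can be layered on the point-to-point ones. The sketch below shows that idea with threads and queues inside one process: a sum reduction and a broadcast are built from send/receive, and an all-reduce is built from those two. It is a toy illustration, not the replacement library from the talk; `TinyComm` and the function names are hypothetical, and real MPI implementations use more efficient tree-based algorithms.

```python
import queue
import threading

class TinyComm:
    """Toy in-process communicator; each rank runs in its own thread."""

    def __init__(self, size):
        self.size = size
        # One mailbox per (source, destination) pair keeps messages ordered.
        self.box = {(s, d): queue.Queue()
                    for s in range(size) for d in range(size)}

    def send(self, value, src, dest):
        self.box[(src, dest)].put(value)

    def recv(self, src, dest):
        return self.box[(src, dest)].get()  # blocks until a message arrives

def reduce_sum(comm, rank, value, root=0):
    """Sum the values of all ranks on the root; other ranks get None."""
    if rank != root:
        comm.send(value, rank, root)
        return None
    total = value
    for src in range(comm.size):
        if src != root:
            total += comm.recv(src, root)
    return total

def bcast(comm, rank, value, root=0):
    """Copy the root's value to every rank."""
    if rank == root:
        for dest in range(comm.size):
            if dest != root:
                comm.send(value, root, dest)
        return value
    return comm.recv(root, rank)

def allreduce_sum(comm, rank, value):
    """All-reduce expressed as a reduce to rank 0 followed by a broadcast."""
    return bcast(comm, rank, reduce_sum(comm, rank, value))

def run(size):
    comm = TinyComm(size)
    out = [None] * size

    def body(rank):
        out[rank] = allreduce_sum(comm, rank, rank + 1)

    threads = [threading.Thread(target=body, args=(r,)) for r in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out

totals = run(4)  # every rank ends with 1 + 2 + 3 + 4 -> [10, 10, 10, 10]
```

With the collectives defined this way, only the send and receive primitives need a network transport underneath, which keeps the amount of code to replace small.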

29
test case
  • oscillating flow around a fixed cylinder
  • 3D domain with 8-node isoparametric elements
  • coarse grid: 2 184 nodes, 1 728 elements
  • fine grid: 81 600 nodes, 76 800 elements
  • 10 time steps
  • running time: 30 minutes with 2 CPUs
  • (AMD Athlon 1466 MHz)

30
coarse grid
31
running time
32
monitor
33
future work
  • local caching of binaries on the slaves
  • different scheduling protocols on master
  • dynamic load balancing of tasks
  • connect to the Grid
  • web interface to the client
  • creating jobs easier, with input data
  • starting/stopping jobs
  • monitoring execution of jobs
  • easy access to the output of execution
  • should decrease learning effort for using the
    system

34
the work
  • cooperation between
  • Norwegian Univ. of Science and Technology
  • Division of Intelligent Systems (DIS)
  • http://dis.idi.ntnu.no
  • SINTEF
  • http://www.sintef.no

35
thank you
questions?
http://qadpz.sourceforge.net