Title: Climate Simulation using Ninf-G on the ApGrid Testbed
1Climate Simulation using Ninf-G on the ApGrid Testbed
- Yoshio Tanaka, Hiroshi Takemiya
- Kazuyuki Shudo, Satoshi Sekiguchi
- Grid Technology Research Center, AIST
2Elements of this DEMO
- Application: Climate Simulation
- Originally developed by Dr. Tanaka (U. of Tsukuba)
- Portal: Grid PSE Builder
- Any Unix-command application can be integrated into a Web portal
- Middleware used for the implementation of the Grid-enabled climate simulation: Ninf-G
- GridRPC middleware based on the Globus Toolkit, which is used for gridifying the original (sequential) application
- Testbed: ApGrid Testbed
- International Grid testbed over the Asia Pacific region
3Application Climate Simulation
- Goal
- Long term, global climate simulation
- Winding of Jet-Stream
- Blocking phenomenon of high atmospheric pressure
- Barotropic S-Model
- Climate simulation model proposed by Prof. Tanaka
- Simple and precise
- Modeling complicated 3D turbulence as a horizontal one
- Keep high precision over long periods
- Taking a statistical ensemble mean
- several hundred simulations
- Introducing perturbation at every time step
- Typical parameter survey
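The ensemble procedure above (many independent runs, a perturbation at every time step, then a mean over the members) can be sketched in a few lines of C. This is only an illustration under stated assumptions: `toy_step` is a hypothetical stand-in for the S-model solver, and the random number generator, seeds, and all function names are invented for the sketch rather than taken from the original program.

```c
/* Hypothetical stand-in for the real model: one relaxation step
   toward 1.0 plus an externally supplied perturbation. */
static double toy_step(double x, double perturbation)
{
    const double dt = 0.1;
    return x + dt * (1.0 - x) + perturbation;
}

/* Small linear congruential generator so that each ensemble member
   is reproducible from its seed. Returns a value in [-0.5, 0.5). */
static double next_uniform(unsigned long *state)
{
    *state = (*state * 1103515245UL + 12345UL) & 0x7fffffffUL;
    return (double)*state / 2147483648.0 - 0.5;
}

/* Run one ensemble member, introducing a perturbation at every step. */
double run_member(unsigned long seed, int steps, double amplitude)
{
    unsigned long state = seed;
    double x = 0.0;
    for (int t = 0; t < steps; t++) {
        x = toy_step(x, amplitude * next_uniform(&state));
    }
    return x;
}

/* Statistical ensemble mean over `members` independent runs.
   Each member is independent of the others, which is what makes
   the workload a typical parameter survey. */
double ensemble_mean(int members, int steps, double amplitude)
{
    double sum = 0.0;
    for (int m = 0; m < members; m++) {
        sum += run_member((unsigned long)(m + 1), steps, amplitude);
    }
    return members > 0 ? sum / members : 0.0;
}
```

Because the members never exchange data, the loop in `ensemble_mean` is the part that could be farmed out to remote servers, with only the final averaging left on the client.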
4PSE Grid PSE Builder
- Generates a Web interface for running a Unix-command application.
- Write an interface description using XML.
<application>
  <appname>ls</appname>
  <argspec>/bin/ls option width</argspec>
  <arglist>
    <args use="required">
      <title>option</title>
      <radio name="option">
        <option value="-a">do not hide entries</option>
5PSE Grid PSE Builder (contd)
(Diagram) Components:
- Client authentication
- Grid PSE Core
- SignOn/SignOff
- Job Control: submission/query/cancel
- Job Queuing Manager
- Signing Server
- globusrun
- Accounting DB (PostgreSQL): accounting information
6Middleware Ninf-G (GridRPC System)
- Utilization of remote supercomputers
- The user calls remote procedures (remote libraries) over the Internet
- Results are notified back to the user
- Large-scale computing utilizing multiple supercomputers on the Grid
7Middleware Ninf-G (contd)
- RPC library on the Grid
- Built on top of Globus Toolkit
- MDS: managing stub information
- GRAM: invocation of server programs
- GSI: secure communication between a client and a server
- Simple and easy-to-use programming interface
- Hiding complicated mechanism of the grid
- Providing RPC semantics
/* sequential search */
for (i = start; i < end; i++)
    SDP_search(argv[1], i, &value[i]);

/* parallel search using async. call */
grpc_function_handle_init(&hdl, ..., "SDP/search");
for (i = start; i < end; i++)
    grpc_call_async(&hdl, argv[1], i, &value[i]);
8Testbed ApGrid Testbed
http://www.apgrid.org/
9Ninfy the original (seq.) climate simulation
- Dividing a program into two parts as a client-server system
- Client
- Pre-processing: reading input data
- Post-processing: averaging results of ensembles
- Server
- Climate simulation, visualization
(Diagram) S-model program flow: Reading data -> Solving Equations (several instances side by side) -> Averaging results -> Visualize
10Testbed
- UME Cluster (AIST)
- jobmanager-grd, (40cpu 20cpu)
- AMATA Cluster (KU)
- jobmanager-sqms, 6cpu
- Galley Cluster (Doshisha U.)
- jobmanager-pbs, 10cpu
- Gideon Cluster (HKU)
- jobmanager-pbs, 15cpu
- PRESTO Cluster (TITECH)
- jobmanager-pbs, 4cpu
- VENUS Cluster (KISTI)
- jobmanager-pbs, 16cpu
- ASE Cluster (NCHC)
- jobmanager-fork, 2cpu
11Climate Simulation
(Diagram) Components:
- Client
- Server
- Front node: public IP; Globus (gatekeeper; jobmanager: pbs, grd, sqms); NAT
- Backend nodes: private IP or public IP; Globus SDK; Ninf-G Lib
12Lessons Learned
- Difficulties caused by the bottom-up approach and by problems with installing the Globus Toolkit.
- Most resources are not dedicated to the ApGrid Testbed.
- Each site's policy should be respected.
- There were some requirements to modify software configurations, environments, etc.
- Upgrading the Globus Toolkit (GT1.1.4 -> GT2.0 -> GT2.2)
- Applying patches, installing additional packages
- Building bundles using other flavors
- Different users have different requirements for the Globus Toolkit.
- Middleware developers need the newest version.
- Application developers are satisfied with a stable (older) version.
- It is not easy to keep up with the frequent version upgrades of the Globus Toolkit.
- The ApGrid software package should solve some of these problems.
13Lessons Learned (contd)
- Problems in scalability
- Initialization of function handles
- Initialization of a function handle takes several seconds to several tens of seconds
- Overhead caused by contacting the gatekeeper (GSI authentication) and invoking a jobmanager
- Overhead caused by MDS lookup
- The current Ninf-G implementation needs to contact the gatekeeper once per function handle, initializing handles one by one
- Although Globus GRAM can invoke multiple jobs with a single contact to the gatekeeper, the GRAM API is not sufficient to control each job individually.
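To see why one-by-one handle initialization limits scalability, a rough cost model may help. The sketch below simply multiplies the number of handles by a per-handle cost; the function names are hypothetical, and any concrete figure fed into it (for example 5 seconds per handle, which falls within the "several seconds" range quoted above) is an assumption rather than a measurement from the testbed.

```c
/* Total setup time when every function handle is initialized
   one after another (hypothetical cost model, not Ninf-G code). */
double sequential_init_seconds(int handles, double seconds_per_handle)
{
    if (handles <= 0) {
        return 0.0;
    }
    return handles * seconds_per_handle;
}

/* Fraction of the total wall time spent on handle initialization
   when the computation itself takes compute_seconds. */
double init_overhead_fraction(int handles, double seconds_per_handle,
                              double compute_seconds)
{
    double init = sequential_init_seconds(handles, seconds_per_handle);
    double total = init + compute_seconds;
    return total > 0.0 ? init / total : 0.0;
}
```

Under these assumptions, 100 handles at 5 seconds each would cost 500 seconds before any simulation starts, so the setup cost grows linearly with the number of servers used.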
14Lessons Learned (contd)
- We observed that Ninf-G apps did not work correctly due to unexpected cluster configurations
- GSI authentication failed when establishing connections for file transfers using GASS.
- Backend nodes do not have host certs.
- Due to the configuration of the local scheduler (PBS), Ninf-G executables were not activated.
- Example
- PBS jobmanager on a 16-node cluster
- grpc_call is called 16 times on the cluster. The app developer expected 16 Ninf-G executables to be invoked simultaneously.
- The PBS Queue Manager configuration set the maximum number of simultaneous job invocations per user to 9
- 9 Ninf-G executables were launched; the other 7 were not activated
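The arithmetic of the PBS example can be checked with a small sketch. The helper names are hypothetical, and capping the number of outstanding calls at the queue limit is only one possible client-side workaround, not something the slides prescribe.

```c
/* Executables that start right away under a per-user limit on
   simultaneously running jobs. */
int activated_now(int requested, int max_running)
{
    if (requested < 0 || max_running < 0) {
        return 0;
    }
    return requested < max_running ? requested : max_running;
}

/* Executables that are submitted but not activated immediately. */
int left_waiting(int requested, int max_running)
{
    if (requested < 0) {
        return 0;
    }
    return requested - activated_now(requested, max_running);
}

/* Number of rounds needed if the client never keeps more than
   max_running calls outstanding. Returns -1 when no job can run. */
int waves_needed(int requested, int max_running)
{
    if (requested <= 0) {
        return 0;
    }
    if (max_running <= 0) {
        return -1;
    }
    return (requested + max_running - 1) / max_running;
}
```

With the values from the example, `activated_now(16, 9)` gives 9 and `left_waiting(16, 9)` gives 7, matching the observed behavior; a client that knew the limit could issue the 16 calls in two rounds instead.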
15Special Thanks (for technical support) to
- Kasetsart University (Thailand)
- Sugree Phatanapherom
- Doshisha University (Japan)
- Yusuke Tanimura
- University of Hong Kong (Hong Kong)
- CHEN Lin, Elaine
- KISTI (Korea)
- Gee-Bum Koo, Jae-Hyuck
- Tokyo Institute of Technology (Japan)
- Kenichiro Shirose
- NCHC (Taiwan)
- Julian Yu-Chung Chen
- AIST (Japan)
- Grid Support Team
- APAN
- HK, TW, JP