Title: Science Grid Program NAREGI
1. Science Grid Program NAREGI and Cyber Science Infrastructure
November 1, 2007
Kenichi Miura, Ph.D.
Information Systems Architecture Research Division
Center for Grid Research and Development
National Institute of Informatics, Tokyo, Japan
2. Outline
- 1. National Research Grid Initiative (NAREGI)
- 2. Cyber Science Infrastructure (CSI)
3. National Research Grid Initiative (NAREGI) Project: Overview
- Originally started as an R&D project funded by MEXT (FY2003-FY2007)
- 2 billion yen (approx. US$17M) budget in FY2003
- Collaboration of national laboratories, universities, and industry in the R&D activities (IT and nano-science applications)
- Project redirected as a part of the Next Generation Supercomputer Development Project (FY2006-)
MEXT: Ministry of Education, Culture, Sports, Science and Technology
4. National Research Grid Initiative (NAREGI) Project: Goals
- (1) To develop a Grid software system (R&D in Grid middleware and upper layers) as the prototype of a future Grid infrastructure for scientific research in Japan
- (2) To provide a testbed to prove that a high-end Grid computing environment (100 Tflop/s expected by 2007) can be practically utilized by the nano-science research community over SuperSINET (now SINET3)
- (3) To participate in international collaboration and interoperability efforts (U.S., Europe, Asia-Pacific), e.g., GIN
- (4) To contribute to standardization activities, e.g., OGF
5. Organization of NAREGI
[Organization chart; entities and relationship labels]
- Ministry of Education, Culture, Sports, Science and Technology (MEXT)
- Center for Grid Research and Development (National Institute of Informatics): operation and collaboration; Cyber Science Infrastructure (CSI)
- Industrial Association for Promotion of Supercomputing Technology: collaboration
- Coordination and Operation Committee: deployment to the Computing and Communication Centers (7 national universities), etc.
- Joint R&D: Grid Technology Research Center (AIST), JAEA; TiTech, Kyushu-U, Osaka-U, Kyushu-Tech., Fujitsu, Hitachi, NEC
- SINET3: utilization
- Computational Nano Center (Institute for Molecular Science): collaboration; joint research with ITBL
- R&D on Grand Challenge Problems for Grid Applications (ISSP, Tohoku-U, AIST, Inst. Chem. Research, KEK, etc.): joint research
6. NAREGI Software Stack
7. VO and Resources in Beta 2
Decoupling VOs and Resource Providers (Centers)
[Diagram: research organizations (RO1, RO2, RO3) run VOs (VO-RO1, VO-RO2, VO-APL1, VO-APL2), each with its own VOMS, Information Service (IS), and Super Scheduler (SS); clients access the Grid through their VO. Resource providers (Grid Center@RO1, Grid Center@RO2) expose resources (A.RO1 ... N.RO1, a.RO2 ... n.RO2) under local policies stating which VOs they accept.]
8. WP-2: Grid Programming - GridRPC/Ninf-G2 (AIST/GTRC)
GridRPC
- Programming model using RPC on the Grid
- High-level, tailored for scientific computing (cf. SOAP-RPC)
- GridRPC API standardization by the GGF GridRPC WG
Ninf-G Version 2
- A reference implementation of the GridRPC API
- Implemented on top of Globus Toolkit 2.0 (3.0 experimental)
- Provides C and Java APIs (a minimal C client sketch follows the call sequence below)
[Diagram: Ninf-G call sequence]
- Server side: the numerical library is described in an IDL file; the IDL compiler generates the remote executable and an interface-information LDIF file, which is registered with MDS.
- Client side: 1. interface request to MDS; 2. interface reply (interface information retrieved); 3. invoke the remote executable via GRAM (fork); 4. the remote executable connects back to the client.
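The following is a minimal client sketch using the standard GridRPC C API that Ninf-G implements. The configuration file name, the remote function name "lib/mmul", and its argument list are hypothetical placeholders, not taken from the slides.

```c
/* Minimal GridRPC client sketch (GGF GridRPC C API as implemented by Ninf-G).
 * "client.conf", "lib/mmul" and its signature are hypothetical placeholders. */
#include <stdio.h>
#include "grpc.h"

#define N 64

int main(int argc, char *argv[])
{
    grpc_function_handle_t handle;
    static double a[N * N], b[N * N], c[N * N];
    int i, n = N;

    for (i = 0; i < N * N; i++) { a[i] = 1.0; b[i] = 2.0; }   /* dummy input */

    /* Read the client configuration (server host, protocol settings). */
    if (grpc_initialize("client.conf") != GRPC_NO_ERROR) {
        fprintf(stderr, "grpc_initialize failed\n");
        return 1;
    }

    /* Bind a handle to a remote executable registered on the default server;
     * argument types and in/out modes come from the IDL description. */
    grpc_function_handle_default(&handle, "lib/mmul");

    /* Synchronous remote call: inputs are shipped to the server, the remote
     * executable is launched via GRAM, and outputs are shipped back. */
    if (grpc_call(&handle, a, b, c, n) != GRPC_NO_ERROR)
        fprintf(stderr, "grpc_call failed\n");

    grpc_function_handle_destruct(&handle);
    grpc_finalize();
    return 0;
}
```

The asynchronous variants defined in the same API (grpc_call_async / grpc_wait) allow many such calls to be farmed out in parallel, which is the typical task-parallel usage pattern for GridRPC.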
9. WP-2: Grid Programming - GridMPI (AIST and U. of Tokyo)
GridMPI is a library that enables MPI communication between parallel systems in a Grid environment (see the sketch below). This makes possible:
- Jobs with huge data sizes that cannot be executed on a single cluster system
- Multi-physics jobs in heterogeneous CPU architecture environments
- Interoperability
  - IMPI (Interoperable MPI) compliant communication protocol
  - Strict adherence to the MPI standard in the implementation
- High performance
  - Simple implementation
  - Built-in wrapper to vendor-provided MPI libraries
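Because GridMPI strictly follows the MPI standard, an ordinary MPI program needs no source changes to run with its ranks co-allocated across clusters at different sites. The toy reduction below is a generic illustration of such a program, not a NAREGI-specific example.

```c
/* Minimal MPI program sketch.  Under GridMPI the ranks of this job may be
 * spread over clusters at different sites, linked by the IMPI protocol,
 * without any change to the source code. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    double local, total;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank contributes a partial result; the reduction crosses site
     * boundaries transparently when the job spans multiple clusters. */
    local = (double)(rank + 1);
    MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d ranks = %.1f\n", size, total);

    MPI_Finalize();
    return 0;
}
```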
10. WP-3: User-Level Grid Tools and PSE
- Grid PSE
  - Deployment of applications on the Grid
  - Support for execution of deployed applications
- Grid Workflow
  - Workflow language independent of specific Grid middleware
  - GUI in task-flow representation
- Grid Visualization
  - Remote visualization of massive data distributed over the Grid
  - General Grid services for visualization
11. [Image-only slide; no transcript]
12. Workflow-based Grid FMO Simulations of Proteins
[Workflow diagram: input data feeds monomer calculations; fragment data and density exchange feed dimer calculations; a total-energy calculation and visualization close the workflow. The monomer/dimer jobs and their data components are distributed over NII resources (njs_png20xx nodes) and IMS resources (dpcd05x nodes).]
By courtesy of Prof. Aoyagi (Kyushu Univ.)
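As background for the total-energy step above: in the two-body FMO scheme, the monomer (fragment) energies and dimer (fragment-pair) energies computed in the earlier workflow stages are combined as follows. This is the textbook form of the expansion, not a formula taken from the slide.

```latex
E_{\mathrm{FMO}} \approx \sum_{I} E_{I} + \sum_{I>J} \left( E_{IJ} - E_{I} - E_{J} \right)
```

Here E_I is the energy of fragment I and E_IJ the energy of the fragment pair (I, J), which is why the workflow fans out into many independent monomer and dimer jobs before the final summation.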
13. Scenario for Multi-site MPI Job Execution
[Diagram: (a) application registration via the PSE, (b) deployment, (c) workflow editing with the Workflow Tool (WFT); (1) submission of the workflow and input files; (3) negotiation between the Super Scheduler and the Distributed Information Service (resource query); (4) reservation; (5) sub-job agreement; (6) co-allocation and submission to GridVM and the local schedulers at Site A, Site B (SMP machine), and Site C (PC cluster); the RISM and FMO sub-jobs communicate via MPI through an IMPI server; (10) monitoring and accounting; output files go to Grid Visualization. A CA issues certificates at each site.]
14. Adaptation of Nano-science Applications to the Grid Environment
[Diagram: coupled simulation of electronic structure in solutions. RISM (Reference Interaction Site Model, solvent distribution analysis) and FMO (Fragment Molecular Orbital method, electronic structure analysis) run at NII and IMS, exchanging data with transformation between different meshes over SINET3, using GridMPI (and MPICH-G2/Globus).]
15. NAREGI Application: Nanoscience Simulation Scheme
By courtesy of Prof. Aoyagi (Kyushu Univ.)
16. Collaboration in the Data Grid Area
- High-energy physics (GIN)
  - KEK
  - EGEE
- Astronomy
  - National Astronomical Observatory (Virtual Observatory)
- Bio-informatics
  - BioGrid Project
17. NAREGI Data Grid Environment
[Diagram: a Grid workflow (Job 1 ... Job n) coupled to the Data Grid components (Data 1 ... Data n)]
- Data Access Management: import data into the workflow; place and register data on the Grid
- Metadata Construction: assign metadata to data; Grid-wide DB querying over the metadata
- Data Resource Management and Grid-wide File System: store data into distributed file nodes
18. Roadmap of NAREGI Grid Middleware
[Roadmap chart: transition from a UNICORE-based R&D framework to an OGSA/WSRF-based R&D framework]
- Prototyping of NAREGI middleware components; application of the component technologies to nano applications and evaluation
- α version (internal); evaluation of the α version on the NII-IMS testbed
- Development and integration of the β version; β1 version release; β2 version limited distribution
- Deployment of the β version; evaluation by IMS and other collaborating institutes; middleware evaluation on the NAREGI wide-area testbed
- Development of OGSA-based middleware; Version 1.0 release; verification and evaluation of Version 1.0
- Midpoint evaluation
19. Highlights of the NAREGI β Release (2005-06)
- Resource and execution management
  - GT4/WSRF-based OGSA-EMS incarnation (the first such incarnation in the world, as of the α version)
  - Job management, brokering, reservation-based co-allocation, monitoring, accounting
  - Network traffic measurement and control
- Security
  - Production-quality CA (NAREGI operates a production-level CA in the APGrid PMA)
  - VOMS/MyProxy-based identity, security, monitoring, and accounting
- Data Grid
  - WSRF-based grid-wide data sharing with Gfarm (grid-wide seamless data access)
- Grid-ready programming libraries
  - Standards-compliant GridMPI (MPI-2) and GridRPC (high-performance communication)
  - Bridge tools for different types of applications in a concurrent job (support for data format exchange)
- User tools
  - Web-based portal
  - Workflow tool with NAREGI-WFML
  - WS-based application contents and deployment service (a reference implementation of OGSA-ACS)
  - Large-scale interactive Grid visualization
20. NAREGI Version 1.0
Operability, robustness, maintainability (to be developed in FY2007)
- More flexible scheduling methods
  - Reservation-based scheduling
  - Coexistence with locally scheduled jobs
  - Support of non-reservation-based scheduling
  - Support of bulk submission for parameter-sweep type jobs
- Improvement in maintainability
  - More systematic logging using the Information Service (IS)
- Easier installation procedure
  - apt-rpm
  - VM
21. Science Grid NAREGI: Middleware Version 1.0 Architecture
22. Network Topology of SINET3
- SINET3 has 63 edge nodes and 12 core nodes (75 layer-1 switches and 12 IP routers).
- It deploys Japan's first 40 Gbps (STM-256) lines between Tokyo, Nagoya, and Osaka.
- The backbone links form three loops to enable quick service recovery against network failures and efficient use of the network bandwidth.
[Map: edge nodes (edge L1 switches) connected at 1-20 Gbps; core nodes (core L1 switch plus IP router) connected at 10-40 Gbps; international links to Hong Kong (622 Mbps), Singapore (622 Mbps), Los Angeles, and New York (2.4 Gbps and 10 Gbps lines).]
23. NAREGI Phase 1 Testbed
About 3,000 CPUs and 17 Tflops in total, connected over SINET3 (10 Gbps MPLS):
- Center for Grid R&D (NII): 5 Tflops
- Computational Nano-science Center (IMS): 10 Tflops
- TiTech Campus Grid
- Osaka Univ. BioGrid
- AIST SuperCluster
- Kyushu Univ. small test application clusters
24. Computer System for Grid Software Infrastructure R&D: Center for Grid Research and Development (5 Tflop/s, 700 GB)
[System diagram; components grouped by their labeled specifications]
- File server (PRIMEPOWER 900 + ETERNUS3000 + ETERNUS LT160): 1 node / 8 CPUs (SPARC64 V, 1.3 GHz), memory 16 GB, storage 10 TB, back-up max. 36.4 TB
- High-performance distributed-memory compute server (PRIMERGY RX200), intra-network A: 128 CPUs (Xeon, 3.06 GHz) plus control node, memory 130 GB, storage 9.4 TB, InfiniBand 4X (8 Gbps)
- High-performance distributed-memory compute server (PRIMERGY RX200): 128 CPUs (Xeon, 3.06 GHz) plus control node, memory 65 GB, storage 9.4 TB, InfiniBand 4X (8 Gbps)
- SMP compute server (PRIMEPOWER HPC2500), intra-network B: 1 node (UNIX, SPARC64 V, 1.3 GHz / 64 CPUs), memory 128 GB, storage 441 GB
- Distributed-memory compute servers (Express5800, two units): 128 CPUs (Xeon, 2.8 GHz) plus control node each, memory 65 GB, storage 4.7 TB, GbE (1 Gbps)
- SMP compute server (SGI Altix3700): 1 node (Itanium2, 1.3 GHz / 32 CPUs), memory 32 GB, storage 180 GB
- SMP compute server (IBM pSeries 690): 1 node (POWER4, 1.3 GHz / 32 CPUs), memory 64 GB, storage 480 GB
- Distributed-memory compute servers (HPC LinuxNetworx, two units): 128 CPUs (Xeon, 2.8 GHz) plus control node each, memory 65 GB, storage 4.7 TB, GbE (1 Gbps)
- L3 switches at 1 Gbps (upgradable to 10 Gbps); external network connection to SINET3
25. Computer System for Nano-Application R&D: Computational Nano-science Center (10 Tflop/s, 5 TB)
[System diagram]
- SMP compute server: 5.4 TFLOPS, 16-way x 50 nodes (POWER4, 1.7 GHz), multi-stage crossbar network, memory 3072 GB, storage 2.2 TB, with a front-end server
- Distributed-memory compute server (4 units): 5.0 TFLOPS, 818 CPUs (Xeon, 3.06 GHz) plus control nodes, Myrinet2000 (2 Gbps), memory 1.6 TB, storage 1.1 TB per unit, with a front-end server
- File server: 16 CPUs (SPARC64 GP, 675 MHz), memory 8 GB, storage 30 TB, back-up 25 TB
- CA/RA server; firewall and VPN; L3 switch at 1 Gbps (upgradable to 10 Gbps); connection to the Center for Grid R&D via SINET3
26. Future Direction of NAREGI Grid Middleware
Center for Grid Research and Development (National Institute of Informatics)
- Productization of general-purpose Grid middleware for scientific computing, feeding the Science Grid environment of the Cyber Science Infrastructure (CSI) and Grid middleware for large computer centers
- Grid middleware research areas: resource management in the Grid environment, Grid programming environment, Grid application environment, Data Grid environment, high-performance and secure Grid networking
- Evaluation of the Grid system with nano applications
- Personnel training (IT and application engineers)
- Contribution to the international scientific community and to standardization
Computational Nano-science Center (Institute for Molecular Science)
- Grid-enabled nano applications; computational methods for nanoscience using the latest Grid technology
- Progress in the latest research and development (nano, biotechnology); new methodology for computational science
Industry
- Requirements from industry with regard to the Science Grid for industrial applications: large-scale computation, high-throughput computation
- Solicited research proposals from industry to evaluate applications (Industrial Committee for Supercomputing Promotion)
- Use in industry (new intellectual product development); vitalization of industry
27. Outline
- 1. National Research Grid Initiative (NAREGI)
- 2. Cyber Science Infrastructure (CSI)
28. Cyber Science Infrastructure: Background
- A new information infrastructure is needed in order to boost today's advanced scientific research, integrating information resources and systems:
  - Supercomputers and high-performance computing
  - Software
  - Databases and digital contents such as e-journals
  - Human resources and the research processes themselves
- Comparable efforts abroad: U.S.A.: Cyberinfrastructure (CI); Europe: EU e-Infrastructure (EGEE, DEISA, ...)
- A breakthrough in research methodology is required in various fields such as nano-science/technology and bioinformatics/life sciences
  - the key to industry/academia cooperation: from "Science" to "Intellectual Production"
- An advanced information infrastructure for research will be the key to international cooperation and competitiveness in future science and engineering areas
=> A new comprehensive framework of information infrastructure in Japan: the Cyber Science Infrastructure
29. Cyber-Science Infrastructure for R&D
[Diagram: components of the Cyber-Science Infrastructure (CSI)]
- NII-REO (Repository of Electronic Journals and Online Publications)
- GeNii (Global Environment for Networked Intellectual Information)
- Virtual labs and live collaborations
- NAREGI outputs: deployment of the NAREGI middleware
- UPKI: national research PKI infrastructure
- International infrastructural collaboration
- Industry/societal feedback
- Restructuring of university IT research resources; extensive on-line publication of results
30. Structure of CSI and Role of the Grid Operation Center (GOC)
[Diagram: organization of CSI within the National Institute of Informatics]
- Center for Grid Research and Development: R&D and operational collaboration, R&D/support to operations, planning/collaboration; academic contents service
- GOC (Grid Operation Center): deployment and operations of the NAREGI middleware, technical support, operations of the CA, VO and user administration, user training, feedback to the R&D group
- Working groups: WG for Grid Middleware, WG for Inter-university PKI (UPKI system), WG for Networking (networking infrastructure, SINET3), each covering planning, operations, and support
- e-Science community VOs: peta-scale system VO, university/national supercomputing center VOs, research community VOs
- International collaboration: EGEE, TeraGrid, DEISA, OGF, etc.
31. Cyber Science Infrastructure
32. Expansion Plan of the NAREGI Grid
[Diagram: layers federated by the NAREGI Grid middleware]
- Petascale computing environment
- National supercomputer Grid (Tokyo, Kyoto, Nagoya, ...)
- Domain-specific research organizations (IMS, KEK, NAOJ, ...)
- Domain-specific research communities
- Departmental computing resources
- Laboratory-level PC clusters
- Interoperability (GIN, EGEE, TeraGrid, etc.)
33. Cyberinfrastructure (NSF)
- Four important areas (2006-2010): high-performance computing; data, data analysis, and visualization; virtual organizations for distributed communities; learning and workforce development
- Track 1: petascale system (NCSA), leadership-class machine, > 1 Pflops
- Track 2: NSF supercomputer centers (SDSC, NCSA, PSC; TACC, UTK/ORNL from FY2009), national scale, > 500 Tflops
- Local: 50-500 Tflops
- Network infrastructure: TeraGrid
- Slogan: "Deep, Wide, Open"
34. EU's e-Infrastructure (HET)
- Tier 1: European HPC center(s), > 1 Pflops; PACE petascale project (2009?); DEISA
- Tier 2: national/regional centers with Grid collaboration, 10-100 Tflops; EGEE / EGI
- Tier 3: local centers
- Network infrastructure: GEANT2
HET: HPC in Europe Task Force. PACE: Partnership for Advanced Computing in Europe. DEISA: Distributed European Infrastructure for Supercomputing Applications. EGEE: Enabling Grids for E-sciencE. EGI: European Grid Initiative.
35. Summary
- The NAREGI Grid middleware will enable seamless federation of heterogeneous computational resources.
- Computations in nano-science/technology applications over the Grid are to be promoted, including participation from industry.
- The NAREGI Grid middleware is to be adopted as one of the important components of the new Japanese Cyber Science Infrastructure framework.
- NAREGI is planned to provide the access and computational infrastructure for the Next Generation Supercomputer System.
36. Thank you!
http://www.naregi.org