Title: OBIGrid: A New Infrastructure for Bionetwork Research
1OBIGrid: A New Infrastructure for Bionetwork Research
- Akihiko Konagaya
- Project Director
- Bioinformatics Group, RIKEN GSC
- Visiting Professor
- Tokyo Institute of Technology
2Outline
- RIKEN Genomic Sciences Center (GSC)
- Bionetworks and Phenomics
- Open Bioinformatics Grid (OBIGrid)
- Scalable Computing
- Distributed Resource Sharing
- VO for Human Collaboration
Grid as a Next Generation Infrastructure For
Bioinformatics Research
3RIKEN Genomic Sciences Center (GSC)
- Opened in 1998
- Moved to Yokohama in Oct. 2000
- Six Research Groups (>500, >60 M)
- Human Genome Sequencing (since 1998)
- Mouse cDNA Sequencing (since 1998)
- Protein Structure Analysis (since 1998)
- Mouse Mutagenesis (since 1999)
- Plant Mutagenesis (since 1999)
- Bioinformatics (since 2000)
4Strategic Technology Domain
Modeling and Simulation From Molecular to Cell
Information Integration from Genome to Phenome
Grid
High Performance Computing (PC-cluster, FPGA, MDM)
5Bioinformatics Group Organization
Biomedical Knowledge Discovery Team
Genomics Knowledge Base Research Team
- Knowledge Extraction from DBs
- Immuno-Informatics
- Genome-Phenome Integration
- Bioinformatics Tools
Computational Proteomics Team
High Performance Biocomputing Team
- 1 Peta Flops Molecular Dynamics Machine
- Open Bioinformatics Grid (OBIGrid)
- Signal Transduction Modeling
- Whole Cell Simulation
Population and Quantitative Genomics Team
6Published Papers
Toyoda, T., Mochizuki, Y., Konagaya, A. GSCOPE: A Clipped Fisheye Viewer for Biomolecular Network Graphs. Bioinformatics, 19, 437-438, 2003.
Toyoda, T., Konagaya, A. KnowledgeEditor: a new tool for interactive modeling and analyzing biological pathways based on microarray data. Bioinformatics, 19, 433-434, 2003.
Nagashima, T., Silva, D., Socha, L., Petrovsky, N., Suzuki, H., Saito, R., Kasukawa, T., Kurochkin, I.V., Konagaya, A., Schönbach, C. Inferring higher functional information for RIKEN mouse full-length cDNA clones with FACTS. Genome Res., 13(6b), 1520-1533, 2003.
Suenaga, A., Hatakeyama, M., Ichikawa, M., Yu, X., Futatsugi, N., Narumi, T., Fukui, K., Terada, T., Taiji, M., Shirouzu, M., Yokoyama, S., Konagaya, A. Molecular Dynamics, Free Energy and SPR Analyses of the Interactions between SH2 Domain of Growth Factor Receptor Binding Protein 2 and Phosphotyrosyl Peptides. Biochemistry-U.S., 42, 5195-5200, 2003.
Hatakeyama, M., Kimura, S., Naka, T., Kawasaki, T., Yumoto, N., Ichikawa, M., Kim, J-H., Saito, K., Saeki, M., Shirouzu, M., Yokoyama, S., Konagaya, A. A computational model on the modulation of MAPK and Akt pathways in heregulin induced ErbB signaling. Biochemical Journal, 373, 451-463, 2003.
7http://gscope.gsc.riken.go.jp/
Network Microarray Viewer, Clustering Software
Genetics Mapping Software
Omic Space Viewer
TraitMap Database
Integrated Database
1) Tetsuro Toyoda, Yoshiki Mochizuki and Akihiko Konagaya. Gscope: A Clipped Fisheye Viewer Effective for Highly Complicated Biomolecular Network Graphs. Bioinformatics, 19, 437-438, 2003.
2) Tetsuro Toyoda and Akihiko Konagaya. KnowledgeEditor: a new tool for interactive modeling and analyzing biological pathways based on microarray data. Bioinformatics, 19, 433-434, 2003.
8FACTS: Knowledge Extraction from Literature DBs
http://facts.gsc.riken.go.jp/
Nagashima, T., Schoenbach, C. et al. Inferring higher functional information for RIKEN mouse full-length cDNA clones with FACTS. Genome Research, 13(6b), 1520-1533 (2003)
9Network Simulation
https://access.obigrid.org/yagns/
Finding a new ErbB4 signal cascade with 68 parameters (43 unknown parameters are estimated by simulation)
Experimental results: HRG at 10 nM, 1 nM, and 0.1 nM
Simulation results: HRG (red: 10 nM; green: 1 nM; blue: 0.1 nM)
M. Hatakeyama, S. Kimura, et al. A computational model on the modulation of MAPK and Akt pathways in heregulin induced ErbB signaling. Biochem J., 373, 451-463, 2003.
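Estimating unknown parameters by simulation can be illustrated with a deliberately small sketch. The one-variable activation model below, its rate constants, and the grid search are assumptions made for illustration only; they are not the 68-parameter ErbB model or the estimation method of the cited paper. The three doses are the HRG concentrations named on this slide, and the "observed" curves are synthetic.

```python
def simulate(k_on, k_off, ligand, steps=200, dt=0.05):
    """Euler integration of dx/dt = k_on*ligand*(1 - x) - k_off*x with x(0) = 0."""
    x = 0.0
    trace = []
    for _ in range(steps):
        x += dt * (k_on * ligand * (1.0 - x) - k_off * x)
        trace.append(x)
    return trace


def squared_error(simulated, observed):
    """Sum of squared differences between two time courses."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


def estimate_k_on(observed, k_off, doses, candidates):
    """Return the candidate k_on with the lowest total error over all doses."""
    def cost(k_on):
        return sum(squared_error(simulate(k_on, k_off, d), observed[d]) for d in doses)
    return min(candidates, key=cost)


DOSES = [10.0, 1.0, 0.1]          # nM, the three HRG doses on the slide
TRUE_K_ON, K_OFF = 0.3, 0.5       # hypothetical values used to make synthetic data

# Synthetic "experimental" time courses generated from the hypothetical true value.
observed = {d: simulate(TRUE_K_ON, K_OFF, d) for d in DOSES}

# Each candidate is scored independently, so the search parallelizes trivially.
candidates = [i / 100 for i in range(1, 101)]
best = estimate_k_on(observed, K_OFF, DOSES, candidates)
print(best)
```

Because every candidate is evaluated by an independent simulation, this kind of search is a natural fit for distributing over many nodes.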
10Molecular Simulation
Free Energy Calculation of Grb2 with ErbB1 and ErbB4 Peptides
pY0992, pY1045, pY1056, pY1068, pY1086, pY1148, pY1173, pY1188, pY1242
Red: phosphotyrosyl peptide (ErbB1 and ErbB4)
Blue: Grb2
PDB ID: 1ZFP (Grb2 SH2 domain pY1068 complex)
Correlation between Measured and Calculated Free Energies
Hydrogen Bonding Patterns
A. Suenaga, et al. Molecular Dynamics, Free Energy and SPR Analyses of the Interactions between SH2 Domain of Growth Factor Receptor Binding Protein 2 and Phosphotyrosyl Peptides. Biochemistry (U.S.), Vol.42, No.18, 5195-5200, 2003.
11Bionetworks and Phenomics
12Bionetworks and Phenomics
RIKEN GSC Symposium
June 20th, 2003, 10:00-17:30, Koryuto Hall, Yokohama Institute
Organized by Bioinformatics Group, RIKEN
GSC and SIGMBI, Japanese Society of Artificial
Intelligence
13Knowledge Integration in Omic Space (Wada, 2003)
OMIC SPACE
14Precise 3D Measurement with X-ray, Laser, and Color Images (Toyoda, 2003)
15Example: 3D distribution of trichomes (Col-0)
16Mapping Trait to Network
http://gscope.gsc.riken.go.jp
Pathway View
Trait Map View
17Maze on a Jigsaw Puzzle
Phenome
Genome
Biological Data
18Equipment for New Quest
High Performance Computers
Data, Knowledge and Tools
Collaboration of Human Experts
The illustrations are quoted from the following sites:
www.dnr.state.wi.us/org/aw/air/ed/educatio.htm
www.mtnbrook.k12.al.us/academy/2ndgrade/mtn/map.htm
19Needs of High Performance Computing
- Increase of Genome Sequence Information
- Combinatorial Increase of Search Space
- Genome, Transcriptome, Proteome, ..., Phenome
- Computer Simulation and Unknown Parameter Estimation
20Needs of Resource Sharing
- Biological Databases (Unigene, TrEMBL,...)
- Bioinformatics Tools (BLAST, HMMER, ...)
- Programming Language (Bioperl, Biojava, ...)
21Needs of Human Collaboration
22Grid for Bioinformatics
- Effective for Embarrassingly Parallel Computation
  - Homology Search, Motif Search,
  - Unknown Parameter Estimation for Cellular Models
  - etc.
- Distributed Resource Sharing among Organizations
  - Web Services, Workflow and Computational Pipeline,
  - Autonomous Database Update,
  - etc.
- "Ba" or Field for Human Collaboration
  - Group Work for Genome Annotation, Whole Cell Simulation,
  - Collaboration between Biologists and Computer Scientists,
  - etc.
23Open Bioinformatics Grid (OBIGrid)
24Overview of OBIGrid
Started in April, 2002
Objectives: Establishment of Grid for Bioinformatics
Technologies: Virtual Private Network (VPN), Globus Toolkit, Bioinformatics Frameworks
Organized by: Initiative for Parallel Bioinformatics (IPAB); High Performance Biocomputing Committee of Genome Information Science in MEXT
Current Status: Aca. 13, Ent. 9, Nat. Res. 5; Nodes 291; CPUs 490
25Bioinformatics Applications
26OBIGrid Web Site
http://www.obigrid.org
27Scalable Computing: Open Bioinformatics Environment (OBIEnv) by JAIST, GSC, DDBJ et al.
28Typical Computing in Bioinformatics
[Figure: a Job of 1000 tasks (Task 1, Task 2, ..., Task 999, Task 1000) divided into four groups (Tasks 1-250, 251-500, 501-750, 751-1000), each paired with a DB and Software]
A great many similar tasks, independent of each other ...
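The division of one job into groups of independent tasks can be sketched as follows. This is a minimal illustration that assumes contiguous, nearly equal groups; the function name `split_tasks` is hypothetical and is not part of OBIEnv.

```python
def split_tasks(tasks, n_groups):
    """Split a task list into n_groups contiguous, nearly equal chunks."""
    base, extra = divmod(len(tasks), n_groups)
    chunks, start = [], 0
    for i in range(n_groups):
        # The first `extra` groups take one additional task each.
        size = base + (1 if i < extra else 0)
        chunks.append(tasks[start:start + size])
        start += size
    return chunks


tasks = list(range(1, 1001))               # Task 1 ... Task 1000, as on the slide
chunks = split_tasks(tasks, 4)
ranges = [(c[0], c[-1]) for c in chunks]   # first and last task id in each group
print(ranges)
```

Since the tasks do not depend on each other, each group can run on a different node against its own copy of the database and software.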
29OBIEnv Overview
[Figure: OBIEnv architecture. Labels: OBIEnv User; Job (List of Tasks); Job Dispatcher (obidispatch); Divided Jobs; Node Search; Set of Nodes; Node; Results; P2P Server; Environment Information Server; PostgreSQL; Reporting Environmental Information; Environment Scanner (obiregist); DB; HW; SW; Globus Tool Kit; Local Authentication; List of OBIEnv Users; Temporary Work Area for Job Execution; Unauthorized Local Users]
Transferred and updated by the obiupdate command
30Database Transfer (obiupdate)
[Figure: database transfer. Labels: mirror; indexing; rsync; DB; SSE; Perl; GNU; BLAST; ...]
SSE: Standard Software Environment
Multiple primary servers are allowed.
31Parallel Job Execution
[Figure: parallel job execution. Labels: Job (Task List); Job Dispatcher (obidispatch); Environment Information Server; Set of Nodes; node query "Nodes with TrEMBL and BLAST?"; tasks "blast Q1 genbank", "blast Q2 genbank", ..., "blast Q10 genbank"; task pairs Q1,Q2 / Q3,Q4 / Q5,Q6 / Q7,Q8 / Q9,Q10 sent to five nodes labeled TrEMBL]
Tasks are independent of each other.
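The flow on this slide (ask which nodes hold the required database and program, then hand each node a share of the task list) can be sketched as below. The `Node` record, `find_nodes`, and `dispatch` are hypothetical names for illustration; they do not reproduce the real obidispatch command or the Environment Information Server schema.

```python
from dataclasses import dataclass, field
from math import ceil


@dataclass
class Node:
    """One registry record: what a node reports about its environment."""
    name: str
    databases: set = field(default_factory=set)
    software: set = field(default_factory=set)


def find_nodes(registry, database, program):
    """Select the nodes that hold both the requested database and program."""
    return [n for n in registry if database in n.databases and program in n.software]


def dispatch(tasks, nodes):
    """Assign contiguous blocks of tasks to the selected nodes."""
    if not nodes:
        raise ValueError("no node satisfies the job requirements")
    per_node = ceil(len(tasks) / len(nodes))
    return {
        node.name: tasks[i * per_node:(i + 1) * per_node]
        for i, node in enumerate(nodes)
    }


# Five nodes with TrEMBL and BLAST, plus one node that lacks TrEMBL.
registry = [Node(f"node{i}", {"TrEMBL"}, {"BLAST"}) for i in range(1, 6)]
registry.append(Node("node6", {"Unigene"}, {"BLAST"}))

queries = [f"Q{i}" for i in range(1, 11)]   # Q1 ... Q10
plan = dispatch(queries, find_nodes(registry, "TrEMBL", "BLAST"))
print(plan)
```

With ten queries and five matching nodes, each node receives two queries, matching the pairs shown in the figure.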
32OBIEnv Sample Script
Task File: get files in FASTA format and execute BLAST in parallel
List of entries (for queries)
Job Dispatcher
33OBIEnv Demo
34Remote Resource Sharing: Scalable Genome Database (OBISgd) by JAIST and DDBJ
35Typical Database Access in Bioinformatics
[Figure: two styles of database access, Web Services and Mirroring. Labels: App1; App2; Site A; Site B]
36OBISgd Overview
https://access.obigrid.org/jaist/xml-ddbj/index.html
[Figure: OBISgd architecture. Labels: JAIST; DDBJ; Genome DBs; Indexing; Search index (four copies); Parallel Search Engine; Entry Data; Genome Database Navigator]
Entry Retrieval Service in XML format by SOAP Server
- Scalability
- Quick Response
- Consistency
37Obtained Entry Information
getXML_DDBJ_Entry with AB000100
Entry Click
DDBJ SOAP Server
Java
XML Image
XSLT
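As a rough sketch of what a SOAP request for one entry could look like, the snippet below builds a SOAP 1.1 envelope for the `getXML_DDBJ_Entry` operation and the accession AB000100 named on this slide. The service namespace `urn:example:ddbj` and the parameter name `accession` are assumptions for illustration; the real DDBJ service definition is not given in the slides, and no request is actually sent.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:ddbj"                         # hypothetical service namespace


def build_request(operation, accession):
    """Build a SOAP 1.1 request envelope as a string (nothing is sent)."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    # "accession" is an assumed parameter name, not taken from the service.
    ET.SubElement(call, "accession").text = accession
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request("getXML_DDBJ_Entry", "AB000100")
print(request_xml)
```

On the slide, the XML returned by the server is then rendered for display with XSLT; that step is not shown here.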
38Database Federation and Computational Pipeline
[Figure: Database Federation (Web Services) and Computational Pipeline. Data layers: Genome, Transcriptome, Proteome, Metabolome, Phenome. Applications: App1, App2, App3, App4, App5]
39VO for Human Collaboration: Thermus thermophilus Cyber Outlet (OBITco) by GSC and Osaka Univ.
40Virtual Organization on Grid
[Figure labels: A; B; C; D; Project; VO on Grid]
A VO provides a boundary for knowledge sharing that extends across geographical and organizational limits.
41Thermus Thermophilus Cyber Outlet (OBITco)
Access Manager
SSL connection
OBIGrid network
Thermus server
Knowledge Management System on OBIGrid, designed
for sharing experimental data, computation
results, genome annotation, etc.
42Thermus thermophilus on Open Bioinformatics Grid
(OBIGrid)
https://access.obigrid.org/thermus/
by RIKEN GSC Bioinformatics Group
431. Annotation Viewer: Shows ORFs predicted by several gene-finding programs. Registered members can access this page and add or update annotation information. The information consists of DNA/DNA identity, protein/protein identity, source organisms, EC number, domains, predictions for transmembrane regions, GO terms, etc. The annotation status of each ORF is monitored, and when, by whom, and how it was changed is checked automatically.
442. Microarray analysis package: Management of ongoing microarray experimental data. Researchers from different organizations can access and download the microarray data in a secure environment.
Microarray TT number vs. array spot table: ORFs and their locations on the chip can be checked instantly (upper Fig.). Easy exchange of data and information between wet-lab and informatics researchers (lower Fig.).
Microarray examination result files
2D expression analysis by IBM Japan
453. Pathway Viewer: Metabolic pathways in Thermus thermophilus. ORFs predicted in Thermus are shown in blue. The EC numbers link to LinkDB at Genome.ad.jp for detailed descriptions of the enzymes.
Pathway viewer by GSCope
GSCope Java applet version by RIKEN GSC.
http://gscope.gsc.riken.go.jp
464. Reference library
References on Thermus thermophilus to date are
registered.
47Open Bioinformatics Grid (OBIGrid)
A real, practical Grid system designed for:
- Scalable Computing
- Resource Sharing
- Human Collaboration
Feel free to contact us: info_at_obigrid.org
48Contribution and Collaboration (OBIGrid)
(titles omitted)
Staff: Kyoko Hirukawa (JAIST), Hiroko Furuno, Sonoko Endo (GSC)
VPN/Globus net: Hiroyuki Umeda (IBM)
OBIEnv/OBISgd: Kenji Sato, Shinichi Tsuji, Yasuhiko Nakajima (HNES)
OBIMde: Makoto Taiji, Tetsu Narumi, Atsushi Suenaga, Noriyuki Futatsugi (GSC)
OBITco/OBIYagns: Fumikazu Konishi, Akinobu Fukuzaki, Mariko Hatakeyama, Kaori Ide (GSC); Seiki Kuramitsu, Shigeyuki Yokoyama, Ryoji Masui, Noriko Nakai (Structurome); Shuhei Kimura (GSC), Takuji Kawasaki (FUJI RIC)
High-Performance Biocomputing Committee (HPBC): Takahiro Koita (OSU), Hideo Matsuda (Osaka Univ.), Tanaka Koji (TIT), Masahiro Okamoto (Kyushu Univ.), Tsuneo Nakanishi (Kyushu Univ.), Akira Fukuda (Kyushu Univ.), Satoru Miyazaki (DDBJ), Asao Fujiyama (NII), Hiroyuki Kurata (KIT), Yutaka Akiyama (AIST), Tsutomu Maruyama (Tsukuba Univ.), Toshinori Endo (TMU), Satoshi Matsuoka (TIT), Masahiro Yamamura (TIT), Hideki Nakada (AIST, TIT), Shinichi Morishita (Tokyo Univ.), Toshihisa Takagi (Tokyo Univ.), Hiroshi Someya (ISM), Tomoyuki Hiroyasu (Doshisha Univ.), Satoshi Ono (Tokushima Univ.), Kenji Ono (JAIST), Xavier Defago (JAIST), Sashio Hayashi (JAIST), Tomoyuki Yamamoto (JAIST), Akihiko Konagaya (GSC, TIT), Fumikazu Konishi (GSC), Morikazu Nakamura (Univ. of the Ryukyu)
Site Contribution from Initiative for Parallel Bioinformatics (IPAB)
- NIPPON SHINYAKU Co., Ltd.
- DDBJ
- NEC Software Hokuriku, Ltd.
- IBM Japan, Ltd.
- NEC Corporation
- Japan Science and Technology Corporation
- MITSUBISHI RESEARCH INSTITUTE, Inc.
- RIKEN Yokohama Institute
- Mitsui Knowledge Industry
- Hewlett-Packard Japan, Ltd.
- FUJI Research Institute
- SUMISHO Electronics Co., Ltd.
- NIPPON TELEGRAPH AND TELEPHONE WEST Corporation
Leader: Akihiko Konagaya (RIKEN, TIT)