Title: K. SEKAR, Ph.D.
K. SEKAR, Ph.D.
BIOINFORMATICS CENTRE
SUPERCOMPUTER EDUCATION AND RESEARCH
INDIAN INSTITUTE OF SCIENCE
BANGALORE 560 012, INDIA
sekar_at_physics.iisc.ernet.in
Voice (91)-80-2932469 FAX (91)-80-3600683 (91)-80-3601409 (91)-80-3600551
APPLICATIONS OF DATA MINING
Abstract
Bioinformatics is one of the fastest growing
interdisciplinary areas in the biological
sciences and has grown so rapidly that we need
powerful tools to organize and analyze the data.
An overview will be presented on the general
features of data mining tools and techniques and
their applications.
Bioinformatics is the fashionable new name for
the field previously called computational
biology. The name is preferred by many because it
puts the emphasis on data storage and analysis
rather than on the biology, and the field is
really data driven.
The term Bioinformatics is used to encompass
almost all computer applications in the biological
sciences, but it was originally coined in the
mid-1980s for the analysis of biological sequence
data.
The quantity of known sequence data outweighs
that of protein structural data and, by virtue of
the genome projects, sequence databases are
doubling in size every year.
A key challenge of bioinformatics is to analyze
the wealth of sequence data in order to
understand the amassed information in terms of
protein structure, function and evolution.
Wherever possible, a range of different methods
should be used, and the results should be combined
with all available biological information.
Refers to database-like activities involving
persistent sets of data that are maintained in a
consistent state over essentially indefinite
periods of time.
Encompasses the use of algorithmic tools to
facilitate biological database analyses.
Comprises the entire collection of information
management systems, analysis tools and
communication networks supporting biology.
DATA MINING
Data mining is defined as the exploration and
analysis, by automatic and semi-automatic means,
of large quantities of data in order to discover
meaningful patterns and rules.
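To make the idea of discovering patterns by automatic means concrete, here is a minimal sketch in Python. It is not one of the tools described in this talk: the function name frequent_kmers, the parameters k and min_count, and the three short sequences are all assumptions made up for illustration. It simply reports every short substring that occurs in at least a chosen number of sequences.

```python
from collections import Counter


def frequent_kmers(sequences, k, min_count):
    """Return k-length substrings found in at least min_count sequences."""
    counts = Counter()
    for seq in sequences:
        # A set ensures each sequence contributes at most once per substring.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(kmers)
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


# Toy, made-up sequences used only for illustration.
sequences = ["MKVLAAGT", "TTKVLAGG", "AKVLSSAA"]
patterns = frequent_kmers(sequences, k=3, min_count=3)
print(patterns)
```

Real pattern discovery on sequence data would have to cope with mismatches, gaps and far larger inputs; the sketch only shows the basic shape of the task.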
The central challenge is to derive maximum
results from the wealth of data. This can be
achieved by establishing and maintaining
databases and by providing search and analysis
tools to interpret the data.
DATABASE
A database is a collection of quantitative data
resulting from experimental measurements or
observations in various fields of science.
Recently, interest in databases has been kindled
through international efforts to organize and
analyze the data and to update the knowledge.
A database is essentially just a store of
information. It is usually in the form of simple
files (just a flat file, say). You can put
information into this store or retrieve it from
the store.
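The store-and-retrieve idea can be sketched with a plain tab-separated flat file. This is only an illustration under assumptions: the file name sequences.tsv, the helper names store and retrieve, and the record contents are hypothetical and do not describe any database mentioned in this talk.

```python
import os
import tempfile


def store(path, key, value):
    """Append one tab-separated record to the flat file."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(f"{key}\t{value}\n")


def retrieve(path, key):
    """Scan the flat file and return the last value stored under key."""
    found = None
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            record_key, _, value = line.rstrip("\n").partition("\t")
            if record_key == key:
                found = value
    return found


# Hypothetical identifiers and sequences, written to a temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "sequences.tsv")
store(db_path, "seq1", "MKVLAAGT")
store(db_path, "seq2", "TTKVLAGG")
result = retrieve(db_path, "seq2")
print(result)
```

Every lookup here reads the whole file, which is acceptable for a small flat file and is the main reason larger collections add indexes or dedicated search tools.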
Derived Database
One of the greatest challenges in database
research is to analyze a database in depth and
create derived databases that meet users' needs
without compromising the sustainability and
quality of the existing database. Creating such a
database is expected to dramatically reduce the
workload of the user community, and it will serve
as a highly focused database.
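As a rough illustration of deriving a focused database without touching the existing one, the sketch below copies the matching records out of a parent collection. The function name derive_database, the field names and every record value are assumptions invented for this example; they are not taken from the databases listed in this talk.

```python
def derive_database(parent, predicate):
    """Build a new, smaller database from records that satisfy predicate.

    The parent list is only read; each selected record is copied.
    """
    return [dict(record) for record in parent if predicate(record)]


# Hypothetical records; field names and values are invented for illustration.
parent_db = [
    {"id": "entry1", "family": "lipase", "resolution": 1.8},
    {"id": "entry2", "family": "lysozyme", "resolution": 2.5},
    {"id": "entry3", "family": "lipase", "resolution": 3.1},
]

lipase_db = derive_database(
    parent_db,
    lambda r: r["family"] == "lipase" and r["resolution"] <= 2.0,
)
print(lipase_db)
```

Copying the selected records keeps the parent database unchanged, so the derived set can be curated or extended independently.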
Packages developed at the
Bioinformatics Centre, Raman Building
Indian Institute of Science, Bangalore 560 012
Principal Investigator: Dr. K. Sekar
E-mail: sekar_at_physics.iisc.ernet.in
Search Engines
144.16.71.10/psst    Protein Sequence Search Tool
144.16.71.2/bsdd     Biomolecules Segment Display Device
144.16.71.10/msgs    Motif Search in Genome Sequence
144.16.71.2/ssep     Secondary Structural Elements in Protein
Programmers: 1. S.Saravanan 2. A.Ajmal Khan
3. C.K.Rajesh 4. T.Kamaraj 5. P.Selvarani
6. V.Shanthi 7. S.Sirajuddin Sheik
Databases with Search Facility
144.16.71.2/lsdb     Lipase Structural Database
144.16.71.2/lysdb    Lysozyme Structural Database
144.16.71.2/asdb     3D-Amylase Database
144.16.71.2/gsdb     Globin Structural Database
Programmers: 1. C.K.Rajesh 2. T.Kamaraj
3. P.Sundararajan 4. P.Selvarani 5. V.Shanthi
6. A.S.Zahir Hussain 7. S.Sirajuddin Sheik
Software for Structure Analysis and Manipulation
144.16.71.146/cap          Conformation Angles Package
144.16.71.146/rp           Ramachandran Plot
144.16.71.146/wap          Water Analysis Package
144.16.71.146/sem          Symmetry Equivalent Molecules Generator
144.16.71.146/pdbgoodies   PDBGOODIES
144.16.71.10/gpsm          Geometrical Parameters for Small Molecules
144.16.71.146/mbd          Measurability of Bijvoet Difference
144.16.71.146/dtf          Distribution of Temperature Factor
Programmers: 1. C.K.Rajesh 2. T.Kamaraj
3. P.Sundararajan 4. P.Selvarani 5. V.Shanthi
6. S.Sirajuddin Sheik
Present Programmers: S.S. Sheik, S. Das, V.G.
Vijay, J.J. Lakshmi, Ch. K. Kumar, C.C. Lingam,
K.S. Mohan, S.A. Fernando, S.K. Raja
Protein Sequence Search Tool (PSST 1.1)
S.Saravanan, A.Ajmul Khan and K.Sekar
CURR.SCIENCE, (2000) 550-552

PDB Goodies: A Web based GUI to manipulate
Protein Data Bank files
A.S.Z.Hussain, V.Shanthi, S.S.Sheik,
J.Jeyakanthan, P.Selvarani and K.Sekar
ACTA CRYST. (2002), D58, 1385-1386

Ramachandran Plot (RP)
S.Sheik, P.Sundararajan, A.S.Z.Hussain and K.Sekar
BIOINFORMATICS (2002) 18, 1548-1549

Water Analysis Package (WAP)
V.Shanthi, C.K.Rajesh, J.Jayalakshmi, V.G.Vijay
and K.Sekar
J.APPL.CRYST. (2003) 36, 167-168
CADB: Conformation Angles Data Base of proteins
S.S. Sheik, P. Ananthalakshmi, G. Ramya Bhargavi
and K. Sekar
NUCL. ACIDS RES., (2003) 448-451

SSEP: Secondary Structural Elements of Proteins
V. Shanthi, P. Selvarani, Ch. K. Kumar,
C.S. Mohire and K. Sekar
NUCL. ACIDS RES., (2003) (in the press)

SEM: Symmetry Equivalent Molecules: A web based
GUI to generate and visualize the macromolecules
A.S.Z. Hussain, V. Shanthi, Ch. K. Kumar,
C.K. Rajesh, S.S. Sheik and K. Sekar
NUCL. ACIDS RES., (2003) (in the press)

Biomolecules Segment Display Device (BSDD)
P.Selvarani, V.Shanthi, C.K.Rajesh, S.Saravanan
and K.Sekar
J.MOL. GRAPHICS MODELLING (2003) (in the press)
Our sincere thanks to
Department of Biotechnology, Ministry of Science
and Technology, Govt. of India, India
Jai Vigyan National Science Foundation, Govt. of
India, India
Acknowledgements
Professor M. Vijayan
Professor N. Balakrishnan
Professor S.M. Rao
Professor S. Ramakumar
Colleagues and Friends
President of Nagoya University
Professor Takashi Yamane
Other members, Biotechnology and Biomaterial
Science
THANK YOU