Biological Language Modeling Toolkit

About This Presentation
Title:

Biological Language Modeling Toolkit

Description:

... Toolkit 'Graphing Utilities' by: Danny Lam. Overview. BLMT ... Current/Future Work. Generate graphing utility for every tool on the BLMT website. Questions? ...

Slides: 18
Provided by: computi209
Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes

Title: Biological Language Modeling Toolkit


1
Biological Language Modeling Toolkit Graphing
Utilities
  • by Danny Lam

2
Overview
  • BLMT
  • Example: computes association measures in protein
    sequences
  • Graphing Utilities
  • Display how well the association measures or
    other data line up with (known or surmised) feature boundaries
  • Step 1: Automatic extraction of feature
    boundaries from given source files
  • Step 2: Plot data along with feature positions
    along a sequence

3
BLMT Mutual Information
  • Mutual Information -> Computes "mutual
    information", which is a measure of association
    between adjacent amino acids.
  • Input: amino acid sequence file(s)
  • (e.g.) Swiss-Prot (SW) datasets
  • Output: file.mi.out.av ->
  • first column is position in sequence
  • second column is mutual information value
    associated with that position
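
As a minimal sketch of the two-column output format described above, the MATLAB snippet below parses position/value pairs and finds the position with the highest mutual information. The sample numbers are made up for illustration, and the text is held in a string rather than read from a real file.mi.out.av file so that the snippet runs on its own.

```matlab
% Hypothetical contents of a file.mi.out.av file: position, MI value.
% The numbers are invented for illustration only.
txt = sprintf('1 0.12\n2 -0.05\n3 0.31\n');

% sscanf fills a 2-by-N matrix column by column; transpose to N-by-2.
data = sscanf(txt, '%d %f', [2 Inf])';
pos = data(:, 1);   % first column: position in sequence
mi  = data(:, 2);   % second column: mutual information value

% Report the position with the largest MI value.
[peak, idx] = max(mi);
fprintf('Highest MI %.2f at position %d\n', peak, pos(idx));
```

For a real output file, the same format string could be applied to the file contents, for example via fscanf on an opened file handle.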

4
Feature Positions
  • Extract feature position information (via
    Swiss-Prot)
  • Extracellular (EC),
  • Cytoplasmic (CP),
  • Helices (H)
  • --> label where the EC, CP, and H regions are in
    the sequence.

5
DR   PROSITE; PS00238; OPSIN; 1.
KW   Photoreceptor; Retinal protein; Transmembrane; Glycoprotein; Vision;
KW   Phosphorylation; Lipoprotein; Palmitate; G-protein coupled receptor;
KW   Acetylation; Retinitis pigmentosa; Disease mutation.
FT   DOMAIN        1     36       EXTRACELLULAR.
FT   TRANSMEM     37     61       1 (POTENTIAL).
FT   DOMAIN       62     73       CYTOPLASMIC.
FT   TRANSMEM     74     98       2 (POTENTIAL).
FT   DOMAIN       99    113       EXTRACELLULAR.
FT   TRANSMEM    114    133       3 (POTENTIAL).
FT   DOMAIN      134    152       CYTOPLASMIC.
FT   TRANSMEM    153    176       4 (POTENTIAL).
FT   DOMAIN      177    202       EXTRACELLULAR.
FT   TRANSMEM    203    230       5 (POTENTIAL).
FT   DOMAIN      231    252       CYTOPLASMIC.
FT   TRANSMEM    253    276       6 (POTENTIAL).
FT   DOMAIN      277    284       EXTRACELLULAR.
FT   TRANSMEM    285    309       7 (POTENTIAL).
FT   DOMAIN      310    348       CYTOPLASMIC.
FT   MOD_RES       1      1       ACETYLATION (BY SIMILARITY).
FT   CARBOHYD      2      2       N-LINKED (GLCNAC...) (BY SIMILARITY).
FT   CARBOHYD     15     15       N-LINKED (GLCNAC...) (BY SIMILARITY).
FT   DISULFID    110    187       BY SIMILARITY.
FT   BINDING     296    296       RETINAL CHROMOPHORE.
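
The automatic extraction step could look roughly like the sketch below, which pulls EC, CP, and H ranges out of FT lines such as those in the excerpt above. This is not the toolkit's actual extractor: the regular expression, the variable names, and the choice to treat TRANSMEM entries as the helix (H) regions are all assumptions made for illustration.

```matlab
% A few FT lines taken from the excerpt above.
ft = { ...
    'FT   DOMAIN        1     36       EXTRACELLULAR.', ...
    'FT   TRANSMEM     37     61       1 (POTENTIAL).', ...
    'FT   DOMAIN       62     73       CYTOPLASMIC.', ...
    'FT   TRANSMEM     74     98       2 (POTENTIAL).'};

% Each result holds one [start end] row per region.
ec = zeros(0, 2); cp = zeros(0, 2); h = zeros(0, 2);
for k = 1:numel(ft)
    % Tokens: feature key, start, end, free-text description.
    tok = regexp(ft{k}, '^FT\s+(\S+)\s+(\d+)\s+(\d+)\s*(.*)$', 'tokens', 'once');
    if isempty(tok), continue; end
    range = [str2double(tok{2}), str2double(tok{3})];
    if strcmp(tok{1}, 'TRANSMEM')
        h(end+1, :) = range;      % assumption: TRANSMEM marks a helix
    elseif strcmp(tok{1}, 'DOMAIN') && strncmp(tok{4}, 'EXTRACELLULAR', 13)
        ec(end+1, :) = range;
    elseif strcmp(tok{1}, 'DOMAIN') && strncmp(tok{4}, 'CYTOPLASMIC', 11)
        cp(end+1, :) = range;
    end
end
disp(ec); disp(cp); disp(h);
```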
8
Problems/Solution
  • Problems
  • - Making even one subplot graph in MATLAB requires
    custom program code.
  • - Generating multiple subplots together is more
    tedious still: a waste of time and effort.
  • Solution
  • - A clear interface that generates subplot graphs
    for you without hand-writing tedious MATLAB code.

9
[a1,b1] = textread('test.out', '%d %f');
hold on
subplot(1,1,1)
hold on
hh1 = plot(a1, b1, 'linewidth', 2.5);
hold on
ylabel('yaxis', 'fontsize', 16, 'Color', 'k', 'fontweight', 'bold')
set(hh1, 'MarkerSize', 5)
set(gca, 'YLim', [-1, 3])
set(gca, 'ytick', [-.6, -.2, .2])
xdash = [NaN,62,73,NaN,134,152,NaN,231,252,NaN,310,348];   % cp
ydash = (-.2)*(ones(size(xdash)));
line(xdash, ydash, 'color', 'y', 'linewidth', 3)
xdash = [1,36,NaN,99,113,NaN,177,202,NaN,277,284,NaN];     % ec
ydash = (-.2)*(ones(size(xdash)));
line(xdash, ydash, 'color', 'r', 'linewidth', 3)
hold on
xlabel('x_axis', 'fontsize', 16, 'Color', 'k')
print -dpsc -r0 sample
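
The xdash vectors in the slide's code separate each feature range with NaN, which makes line() draw one disconnected bar per range. The sketch below shows one way such a vector might be built from a list of [start end] rows instead of being typed by hand. The variable names are hypothetical, and the NaN is placed after each range here, as in the slide's EC vector (the slide's CP vector starts with a NaN instead).

```matlab
% CP ranges as [start end] rows (values from the slide's xdash vector).
cp = [62 73; 134 152; 231 252; 310 348];

% Append a NaN column, then read row by row: s1 e1 NaN s2 e2 NaN ...
% line() breaks the drawn line at every NaN, giving one bar per range.
seg = [cp, NaN(size(cp, 1), 1)]';
xdash = seg(:)';
ydash = -0.2 * ones(size(xdash));   % constant height, as in the slide

figure('Visible', 'off');           % no window needed for the sketch
line(xdash, ydash, 'Color', 'y', 'LineWidth', 3);
```

The same construction could be reused for the EC and H ranges with a different color for each.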
10
Design Capabilities
  • Access multiple mutual information output
    datasets
  • Display combination of EC/CP/H position
    information on MI datasets (color coded)
  • Specify range (Y limits) and naming conventions
    (X axis)
  • Output into convenient picture files (e.g., a .tiff
    file).

11
Subplotter
  • Version 1 (in-house use only)
  • - Initially the program takes as input:
  • --> .SW file (EC/CP/H)
  • --> .m file (MATLAB file that code will be
    generated in)

12
Subplotter (Version 1)
How many output files to textread: 1
What is the file to be textread into matlab program
output file 1: opsdh_1gpcr.out
How many TOTAL subplots do you request? 1
13
Subplotter (Version 1)
Subplot(1,1,1)
Which file do you want results to be graphed on this subplot?
0: opsdh_1gpcr.out
Make selection (0): 0
How many items (EC,CP,H) do you want plotted (1, 2, 3 = GPCR, 4 = Loops)?
--> 3
14
Subplotter (Version 1)
Specify Y-Axis Label? (y/n) n
Y-Axis Label: GPCR
Specify YLim? (y,n) n
Give name to X-Axis: sample
Give name to .tiff file for output (no extension!): sample
Matlab Program completed! wait ...
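
To illustrate how answers like those in the session above could be turned into generated MATLAB code, here is a minimal sketch. It is not Subplotter's implementation: the opt struct, its field names, and the exact set of emitted lines are assumptions, and the generated lines are printed rather than written to a .m file.

```matlab
% Answers from the session above, held in a hypothetical struct.
opt.datafile = 'opsdh_1gpcr.out';
opt.ylabel   = 'GPCR';
opt.xlabel   = 'sample';
opt.tiffname = 'sample';

% Build the generated program as a cell array of code lines.
% Doubled quotes and %% are escapes for the generated text.
code = {};
code{end+1} = sprintf('[a1,b1] = textread(''%s'', ''%%d %%f'');', opt.datafile);
code{end+1} = 'subplot(1,1,1)';
code{end+1} = 'plot(a1, b1, ''linewidth'', 2.5)';
code{end+1} = sprintf('ylabel(''%s'')', opt.ylabel);
code{end+1} = sprintf('xlabel(''%s'')', opt.xlabel);
code{end+1} = sprintf('print(''-dtiff'', ''%s'')', opt.tiffname);

% A real tool would write these lines to the .m file instead.
fprintf('%s\n', code{:});
```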
15
Subplotter (Version 1)
16
Current/Future Work
  • Generate a graphing utility for every tool on the
    BLMT website.

17
Questions?