Cerebral: Visualizing Multiple Experimental Conditions on a Graph with Biological Context

1
Cerebral: Visualizing Multiple Experimental
Conditions on a Graph with Biological Context
  • M.Sc Thesis Presentation
  • Aaron Barsky
  • Supervisor: Tamara Munzner
  • May 21st, 2008

2
Cerebral
  • Collaboration with systems biologists
  • Hancock innate immunity laboratory
  • Integrate
  • System model (Graph)
  • Experimental measurements

3
Outline
  • Cellular systems biology
  • Design decisions
  • Related work
  • Constrained simulated annealing graph layout
  • Interactive data exploration video
  • Conclusions / Future work

4
Biomolecular interactions are selective
  • Cell densely packed with biomolecules
  • Interactions rare
  • Model interactions as a graph

Image from Nature Publishing group
5
Systems biology model
  • Graph G = (V, E)
  • V: proteins, genes, DNA, RNA, tRNA, etc.
  • E: interacting molecules
  • Reactions, information exchange, controls
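
The graph model on this slide can be sketched as a small data structure. This is only an illustration, not Cerebral's or Cytoscape's actual data model: the class name `InteractionGraph`, its methods, and the example edges are assumptions made for the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class InteractionGraph:
    """G = (V, E): nodes are biomolecules, edges are interactions (hypothetical class)."""

    nodes: dict[str, str] = field(default_factory=dict)        # name -> molecule type
    edges: set[frozenset[str]] = field(default_factory=set)    # undirected pairs

    def add_node(self, name: str, kind: str) -> None:
        self.nodes[name] = kind

    def add_edge(self, a: str, b: str) -> None:
        # Every edge must connect two members of V.
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both endpoints must already be in V")
        self.edges.add(frozenset((a, b)))

    def degree(self, name: str) -> int:
        return sum(1 for edge in self.edges if name in edge)


# Illustrative fragment only; the edges are placeholders, not curated data.
g = InteractionGraph()
for molecule in ("TLR4", "TIRAP", "MyD88"):
    g.add_node(molecule, "protein")
g.add_edge("TLR4", "TIRAP")
g.add_edge("TIRAP", "MyD88")
```

Storing each edge as a `frozenset` keeps the graph undirected, so adding the same interaction in either order does not create a duplicate.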

6
Graph summarizes extensive lab work
  • Graphs extracted from database
  • Each edge summarizes experimental evidence
  • TIRAP: an adapter molecule in the Toll signaling
    pathway. Horng T, Barton GM, Medzhitov R.
  • Mal (MyD88-adapter-like) is required for
    Toll-like receptor-4 signal transduction.
    Fitzgerald KA, Palsson-McDermott EM, Bowie AG,
    Jefferies CA, Mansell AS, Brady G, Brint E, Dunne
    A, Gray P, Harte MT, McMurray D, Smith DE, Sims
    JE, Bird TA, O'Neill LA.

7
Models are dynamic
  • No official summary source
  • Changes with each publication
  • Exploration of diagrams useful
  • Choose scope to manage complexity
  • Reactions associated with a biomolecule
  • Reactions associated with a system
  • Reactions associated with an organism

8
TLR4 context: E = 74, V = 54
9
Immune system context: E = 1263, V = 760
10
Immune system context: E = 1263, V = 760
11
Human cell (E = 50,000, V = 10,000)
12
Model interprets experiments. Experiments refine
model.
  • Systems biologists
  • Conduct experiments on cells
  • Interpret results in current model
  • Propose modifications to the model

13
Microarray experiments measure gene expression
level
  • Cells express genes to create proteins
  • Proteins are specialized tools
  • Associate real value with each graph node

Image: www.immb.forth.gr
14
LL-37 Example
  • Suspect LL-37 helps reduce inflammation
  • Drug not part of model
  • Conduct experiment
  • Treat some cells with LL-37
  • Controls: untreated
  • Expose all cells to bacteria
  • Measure gene expression over 4 time points

15
Experiment results
  • Context: TLR4 immune response

16
Cerebral
17
Thesis contributions
  • Cerebral: interactive exploration tool
  • Views multiple experimental conditions
  • In context of graph model
  • Facilitates comparison between pairs of
    conditions
  • Graph layout algorithm
  • Uses biological meta-data

18
Outline
  • Cellular systems biology
  • Design decisions
  • Related work
  • Constrained simulated annealing graph layout
  • Interactive data exploration video
  • Conclusions / Future work

19
Many visualization options
  • Input
  • Graph G = (V, E)
  • Descriptive meta-data for each v in V
  • Labels, biological attributes
  • Sets of experimental results
  • Multiple float values associated with each v in V
  • Output
  • Graphical representation

20
Our choices
  • Graph layout guided by biological meta-data
  • Small multiple views for experimental conditions
  • Parallel coordinates for a measurement-driven
    view

21
Traditional graph layout
  • Given graph G = (V, E)
  • Create layout in 2D plane

Circular (Six and Tollis, 1999)
Force-directed (Fruchterman and Reingold, 1991)
Hierarchical (Sugiyama 1989)
22
Good layout criteria
  • Short edges
  • Minimal edge crossings
  • Minimal node-edge overlap
  • Compactness
  • Symmetry
  • Empirical Evaluation of Aesthetics-based Graph
    Layout (Purchase, 2002)
  • Many criteria NP hard

23
Biologists found existing layouts unsuitable
  • Generic layout criteria form unexpected groupings
  • "That's weird. Why is that transcription factor
    beside that cell surface protein?"
  • Biologists want graph layout to encode
    biological structure

24
Biological cells divided by membranes
Image courtesy of Dr. G. Weaver
  • Interactions generally occur within a
    compartment
  • Crossing membranes interesting

25
Hand-drawn diagrams
  • Cellular location encoded spatially

26
Cerebral spatial encoding
  • Similar to hand-drawn
  • Spatial position reveals
  • Location in cell
  • Function

27
Small multiple views for experimental conditions
  • One graph instance per condition
  • Each graph coloured according to the condition
  • Tufte, 1990

28
Animation over time
29
Visual memory poor
  • Matthew Plumlee and Colin Ware. Zooming versus
    multiple window interfaces: Cognitive costs of
    visual comparisons. ACM Transactions on
    Computer-Human Interaction, 13(2):179-209, 2006.
  • Barbara Tversky, Julie Bauer Morrison, and
    Mireille Betrancourt. Animation: can it
    facilitate? International Journal of
    Human-Computer Studies, 57(4):247-262, 2002.

30
Embedded glyphs
  • Embed multiple conditions as a chart in the node
  • Good detail in local view
  • Westenberg 2008

31
Glyphs invisible in global view
  • Westenberg, 2008

32
Saraiya study
  • Purvi Saraiya, Peter Lee, and Chris North.
    Visualization of graphs with associated
    timeseries data, 2005.
  • Compared 4 interfaces for analyzing expression
    data in a graph
  • Animated coloured nodes outperformed glyphs
  • Multiple linked views improve accuracy
  • Aim to do better by distributing over space vs.
    over time

33
Parallel coordinates for a measurement-driven view
  • Each experimental condition is an axis
  • Each node in the graph is a line

34
Clusters indicate similar function?
  • Data driven hypothesis
  • Same pattern of gene expression implies same
    role in cell
  • Parallel coordinates alone untrustworthy
  • Data noisy
  • Different clustering algorithm, different
    results

35
Linked graph and clustering aid exploration
36
Outline
  • Cellular systems biology
  • Design decisions
  • Related work
  • Constrained simulated annealing graph layout
  • Interactive data exploration video
  • Conclusions / Future work

37
Related Work
  • Systems biology graph viewers
  • Constrained graph layout

38
Systems biology graph visualization systems
  • Cytoscape (Shannon et al. 2003)
  • VisANT (Hu et al. 2004)
  • GeneSpring (Silicon Genetics)
  • GenMAPP (Dahlquist et al. 2002)
  • Graph layout without biological context
  • Overlay only a single condition at a time

VisANT (Hu et al. 2004)
39
Multiple coordinated views
  • Multiple linked views of experiment data
  • HCE (Seo and Shneiderman 2002)
  • Spotfire (TIBCO Spotfire)
  • No graph view

40
Constrained graph drawing
  • Force directed
  • Contain with repulsive walls
  • Graph drawing by force directed placement
    (Fruchterman and Reingold, 1991)
  • A constrained, force-directed layout algorithm
    for biological pathways (Genc and Dogrusoz 2004)
  • Force balancing a challenge
  • Parameter tweaking
  • Brittle

41
Quadratic programming
  • Numerical approaches
  • Constrained graph layout (He, Marriott 1998)
  • IPSep-CoLa: An incremental procedure for
    separation constraint layout of graphs (Dwyer,
    Marriott 2006)
  • Handles separation constraints
  • Ideal edge length parameter
  • Requires tweaking
  • Does not optimize edge crossings

42
Graph layout with simulated annealing
  • Simple and flexible
  • Drawing graphs nicely using simulated annealing
    (Davidson and Harel 1996)
  • Automatic drawing of biological networks using
    cross cost and subcomponent data (Kato and
    Nagasaki 2005)
  • Historically slow: O(V^3)
  • Our system, Cerebral: expected O(E√V)

43
Outline
  • Cellular systems biology
  • Design decisions
  • Related work
  • Constrained simulated annealing graph layout
  • Interactive data exploration video
  • Conclusions / Future work

44
Graph layout constraints
  • Restrict node placement to band according to
    subcellular localization
  • Cluster activated proteins by function
  • Optimize edge length, crossings, etc.

45
Simulated annealing search
  • Choose a random graph layout
  • Repeat until cool
    • Repeat O(N) times
      • Move a random node to a new position
      • Score new position with evaluation function
      • If improved
        • accept change
      • Else
        • accept change with a probability that falls
          as the temperature drops
    • Reduce temperature
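
The search loop on this slide can be sketched as follows. This is a generic simulated-annealing sketch, not Cerebral's implementation: the acceptance rule exp(-delta / t) is the standard Metropolis criterion and is assumed here because the slide does not preserve the exact formula. The cooling rate, the 4×4 grid, and the toy cost function are also assumptions.

```python
import math
import random


def anneal(layout, cost, free_cells, rng, t=1.0, t_min=0.01, cooling=0.8, moves_per_node=30):
    """Anneal a node -> grid-cell mapping; return the best layout seen and its cost."""
    layout = dict(layout)
    current = cost(layout)
    best, best_cost = dict(layout), current
    nodes = sorted(layout)
    while t > t_min:                                  # repeat until cool
        for _ in range(moves_per_node * len(nodes)):  # O(N) moves per temperature
            options = free_cells(layout)
            if not options:
                break
            node = rng.choice(nodes)
            old = layout[node]
            layout[node] = rng.choice(options)        # move a random node
            delta = cost(layout) - current            # score the new position
            # Accept moves that are not worse; accept worse ones with a
            # probability that shrinks as the temperature t drops (assumed rule).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current += delta
                if current < best_cost:
                    best, best_cost = dict(layout), current
            else:
                layout[node] = old                    # reject: undo the move
        t *= cooling                                  # reduce temperature
    return best, best_cost


# Toy problem: a 4-node path placed on a 4x4 grid, scored by total L1 edge length.
EDGES = [("a", "b"), ("b", "c"), ("c", "d")]


def total_edge_length(layout):
    return sum(abs(layout[u][0] - layout[v][0]) + abs(layout[u][1] - layout[v][1])
               for u, v in EDGES)


def free_cells(layout):
    used = set(layout.values())
    return [(x, y) for x in range(4) for y in range(4) if (x, y) not in used]


start = {"a": (0, 0), "b": (3, 3), "c": (0, 3), "d": (3, 0)}
start_cost = total_edge_length(start)
best, best_cost = anneal(start, total_edge_length, free_cells, random.Random(0))
```

Because moves are drawn only from unoccupied cells, two nodes can never share a cell, which mirrors the "no node overlaps" property described later in the talk.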
46
Adapting SA to constraints
  • Hard constraints
  • Layer by subcellular localization
  • Soft constraints
  • Minimize
  • Edge length
  • Edge-edge crossings
  • Node-edge crossings
  • Distance to biologically similar neighbours

Extracellular
Plasma membrane
Cytoplasm
Nucleus
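
One way to read the split between hard and soft constraints is sketched below. The hard constraint limits which grid rows a node may occupy, while the soft constraints are combined into a single score to minimize. The row ranges in `BANDS`, the weights, and all function names are hypothetical. The slides do not state how Cerebral weights its terms.

```python
# Hypothetical assignment of grid rows to subcellular bands, top to bottom.
BANDS = {
    "extracellular": range(0, 2),
    "plasma membrane": range(2, 3),
    "cytoplasm": range(3, 6),
    "nucleus": range(6, 8),
}


def allowed_cells(localization, width):
    """Hard constraint: candidate cells are restricted to the node's own band."""
    return [(x, y) for y in BANDS[localization] for x in range(width)]


def hard_violations(layout, localization_of):
    """Nodes sitting outside their band; empty if candidates come from allowed_cells."""
    return sorted(n for n, (_, y) in layout.items()
                  if y not in BANDS[localization_of[n]])


def soft_cost(terms, weights):
    """Soft constraints: weighted sum of penalty terms, lower is better."""
    return sum(weights[name] * value for name, value in terms.items())


localization_of = {"ligand": "extracellular", "receptor": "plasma membrane",
                   "kinase": "cytoplasm", "tf": "nucleus"}
layout = {"ligand": (1, 0), "receptor": (1, 2), "kinase": (1, 4), "tf": (4, 1)}
violations = hard_violations(layout, localization_of)

# Placeholder penalty values and weights, for illustration only.
terms = {"edge_length": 10, "edge_crossings": 2,
         "node_edge_crossings": 0, "function_distance": 6}
weights = {"edge_length": 1, "edge_crossings": 5,
           "node_edge_crossings": 5, "function_distance": 1}
score = soft_cost(terms, weights)
```

Generating moves only from `allowed_cells` means the hard constraint never needs to be scored during annealing. Only the soft terms enter the evaluation function.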
47
Soft constraint violation evaluation is frequent
  • Must be efficient
  • Innermost loop
  • 50 cooling cycles × 30N nodes
  • = 1500N evaluations

48
Discretization key to efficiency
  • Limit node positions to grid centers
  • Uniform grid (Akman et al., 1989)

49
Clean, regular layouts
  • Room for labels

50
No node overlaps
  • Overlaps in dense areas of force directed
    algorithms
  • Cerebral: impossible by construction
  • No cost to evaluate

51
Calculations with L1 distance
  • Manhattan or L1 distance
  • Cheap, integer only arithmetic
  • Measures
  • Edge length
  • Distance to functional neighbours

52
High speed edge crossing count estimation
  • Edge crossing count could be very expensive
  • Count everything: O(E^2) or O(E log E)
  • Count just the moved node: O(deg(n) · E)
  • Over 98% of unoptimized time
  • Cerebral: good estimate
  • O(√V)
  • Integer only

53
Quickly find cells with modified Bresenham's
algorithm
  • Modified Bresenham's
  • Green: classic
  • Purple: additional corners
  • Update grid cell each time edge is moved

54
Track edges in each grid cell
  • Cell stores edge count
  • Add up edges in cells a line passes through
  • No expensive line intersection tests
  • Upper bound estimate

55
High angular resolution aids reading of edge
crossings
  • Exact intersection test: 3 crossings
  • Approx. intersection test: 12 crossings

56
Cerebral: TLR4 (E = 74, V = 57), Time: 2.9 sec
57
Force-directed: TLR4, Time: < 1 sec
58
IPSep-CoLa, Time: 1.3 sec
59
Cerebral: innate immunity (E = 1263, V = 760),
Time: 62 sec
60
Force-directed, Time: 64 sec
61
IPSep-CoLa, Time: 296 sec
62
Cerebral graph layout
  • Shows subcellular localization through spatial
    positioning
  • Groups response proteins by biological function
  • Requires no user specified parameters
  • Runs in expected time O(E√V)
  • A few minutes for 1000 nodes and edges

63
Outline
  • Cellular systems biology
  • Design decisions
  • Related work
  • Constrained simulated annealing graph layout
  • Interactive data exploration
  • Conclusions / Future work

64
Usable, but complex
65
Video
66
Outline
  • Cellular systems biology
  • Design decisions
  • Related work
  • Constrained simulated annealing graph layout
  • Interactive data exploration video
  • Conclusions / Future work

67
Released in two stages
  • Cerebral 1.0 (2007)
  • Biologically based graph drawing only
  • Announced in a Bioinformatics Application Note
  • Cerebral 2.0 (now)
  • Multiple experiment viewing with small multiple
    and parallel coordinate views

68
Implemented as a plugin for Cytoscape
  • Cytoscape (Shannon et al. 2003)
  • Open source systems biology tool
  • Provides model/attribute management
  • Replaced standard graph renderer
  • Biologically based graph layout
  • Added small multiple and parallel coordinate views

69
Biologists are using Cerebral
  • Published Cerebral-created diagrams to
    communicate results
  • Matthew D Dyer, T. M Murali, and Bruno W Sobral.
    The landscape of human proteins interacting with
    viruses and other pathogens. PLoS Pathogens,
    4(2):e32, 2008.

70
  • Liqun He et al. The glomerular transcriptome and
    a predicted protein-protein interaction network.
    Journal of the American Society of Nephrology,
    19(2):260-268, 2008.

71
InnateDB links to Cerebral
  • Integrated as a visualization system for InnateDB

72
Future Work
73
Visual scaling
  • Human cell (V = 10,000, E = 50,000), Time: 6199 sec

74
Support Organelles
  • Membrane-bound regions within Cytoplasm
  • Mitochondria
  • Lysosomes
  • Vesicles

Mitochondria
Vesicles
Cytoplasm
75
Biology-specific clustering
  • Replace k-means with biology-specific clustering
    algorithm

76
Indicate data-mining bias
  • Exploration without hypothesis
  • Is pattern a signal?
  • Measure significance of pattern given
  • Graph size
  • Connectivity of members
  • Number of experiments

77
Conclusion
  • Cerebral
  • Visualizes experimental data from multiple
    conditions simultaneously
  • Allows interactive exploration of the data
  • Uses biological meta-data to guide the graph
    layout

78
Acknowledgements
  • Funding
  • Agilent Technologies
  • Robert Kincaid
  • Hancock lab members
  • Jennifer Gardy, David Lynn, Bob Hancock
  • Supervisor
  • Tamara Munzner
  • Information visualization group
  • Stephen Ingram, Peter McLachlan, Dan
    Archambault, Heidi Lam, James Slack, and Ciaran
    Llachlan Leavitt

79
Yeast cell cycle data
  • Measure gene expression levels at 24 time points
    in yeast
  • Through a cell division cycle
  • Observe

80
(No Transcript)
81
(No Transcript)
82
(No Transcript)
83
(No Transcript)
84
Principles of perception
  • Group objects by colour/shape/size
  • Group by rows, not columns
  • Automatically by visual system
  • Information Visualization: Perception for
    Design. Ware (2004)

85
Spatial position overrules colour/shape/size
groupings
  • Automatically view 3 groups of mixed objects
  • With effort can group by
  • Size
  • Shape
  • Colour

86
Connectedness even stronger than position
  • Each group would be perceived differently without
    the connecting line
  • Information Visualization: Perception for
    Design. Ware (2004)

87
(No Transcript)