Title: Computational Modeling For Understanding Biological Pathways
1Computational Modeling For Understanding
Biological Pathways
- Lisa Tucker-Kellogg
- CS3108A Computational Thinking
- 30 Sept 2008
2 -- Pat Philips, Microsoft Research
3TOPICS and sources
- Computational Thinking
- Quotes from Jeannette Wing
- Modeling and Simulation
- What is modeling
- Formalisms must match the data and the goal.
- Much of this material is quoted or adapted from Wikipedia
- My Research
- Modeling Biological Signaling Pathways
- Fallacies in Probabilistic Reasoning
- Why we need models for simple scenarios
- Much material due to Norman Fenton and Yuval Shahar
- Additional sources are written on the individual
slides
4Computational Thinking
- doing arithmetic, solving mathematical
equations by sheer bulldozing power, is not the
most significant . Computers are thinking aids
of enormous potentialities. Merely having them
around is enough to change the way we think, to
force investigators in all fields to think
through their problems along new lines. We are at
the beginning of a trend that is certain to bring
machines which not only learn, but which will
accelerate the rate at which we ourselves learn.
The revolution to come is difficult to appreciate
fully. We only know that science, government, and
industry will change swiftly and radically in the
years ahead. -- The Thinking Machine, 1962.
5Futurism
- People tend to over-estimate how much change will occur in 1-3 years.
- People tend to under-estimate how much change will occur in 10-30 years.
- The information you learn now might not be useful for long, but the way you learn to think will endure.
6Coursework is an artificial workload
- It's fantastically beneficial and worth every cent you're paying, but it's artificial.
- Real jobs are often repetitive, boring gruntwork.
- Real problems often come with longer time frames, greater risk, and more broadly delegated roles.
- Real jobs are sometimes better grounded in the philosophical principles of the field.
7The grounding of different fields
- Mathematics: what can I prove, given my axioms?
- Engineering: what can I build and how does it behave?
- Experimental Natural Science: design controlled experiments
- Observational Natural Science: choose what to observe
8Computational Thinking
- Computer science interacts with almost every
other discipline on campus. Computational
biology, computational chemistry, computational
design, computational finance, computational
linguistics, computational logic, computational
mechanics, computational neuroscience,
computational physics, and computational and
statistical learning. Computer science is not
just about programming, but about thinking. Our
long-term vision is to make computational
thinking commonplace for everyone, not just
computer scientists.
-- Jeannette Wing (CMU)
9- Robotics: CS, Mechanical Engineering, Electrical Engineering
- Language Technologies: CS, Linguistics
- Human-Computer Interaction: CS, Design, Psychology
- Automated Learning and Discovery: CS, Statistics
- Software: CS, Public Policy Management
10Computational Thinking
- Computational thinking builds on the power and
limits of computing processes, whether they are
executed by a human or by a machine.
Computational methods and models give us the
courage to solve problems and design systems that
no one of us would be capable of tackling alone.
11Computational Modeling
- Computer models have some of the characteristics
of mental modeling as well as some of the
characteristics of math modeling and the types of
modeling done in other disciplines. If a problem
lends itself to computer modeling, then the
computer may well be able to carry out the steps
(procedures, symbol manipulations) needed to
solve the problem.
From www.iae-pedia.org
12Modeling
- A model is a physical, mathematical, or logical representation of a system of entities, phenomena, or processes. Basically a model is a simplified abstract view of the complex reality. It may focus on particular views, enforcing the "divide and conquer" principle for a compound problem.
- A model is a formalized interpretation which deals with empirical entities, phenomena, and physical processes in a logical, mathematical, or systematic way.
- A model is also a way in which the human thought processes can be amplified. Models that are rendered in software allow scientists to leverage computational power to simulate, visualize, manipulate and gain intuition about the entity, phenomenon or process being represented.
13Modeling
- Scientific modeling is the process of generating
abstract, conceptual, graphical, and/or
mathematical models. Science offers a growing
collection of methods, techniques and theory
about all kinds of specialized scientific
modeling.
- Modeling is an essential and inseparable part of all scientific activity, and many scientific disciplines have their own ideas about specific types of modeling.
14Modeling
- Modeling is a comparatively new area of activity
involving the marriage of ideas from various
disciplines.
- Modeling techniques include statistical methods, computer simulation, system identification, and sensitivity analysis. An important issue is to understand the underlying dynamics of a complex system (chaos theory).
- Must assess whether the assumptions of a model are correct and complete. Does a model reflect reality? Deal with divergences between theory and data.
15Modeling
- One of the main aims of scientific modelling,
according to Silvert (2001), is to apply
quantitative reasoning to observations about the
world, in the hope of seeing aspects that may
have escaped the notice of others.
- Typical steps:
- characterize the system,
- make some assumptions about how it works,
- translate these into equations and a simulation,
- validate the results or check the model's predictions.
16Generating a Model
- Typically a model will refer only to some aspects
of the phenomenon in question, and two models of
the same phenomenon may be essentially different.
This may be due to differing requirements of the
model's end users or to conceptual or aesthetic
differences by the modelers and decisions made
during the modeling process.
- Users of a model need to understand the model's original purpose and the assumptions that it is based on.
17Assumptions and Scope
- Identify abstract principles that are
approximately true (or true enough for modeling
purposes).
- Remove issues that are higher-level and bigger than you are willing to take on (such as organism-wide cancer from a medical viewpoint). Remove issues that are too low-level (exact details of variant forms, or exact details of how the DNA unwinds, if all you care about is the behavior of a whole organ).
- Computational thinking: what information will suffice? What information is missing? What is happening that doesn't have a cause?
18Ways to Validate a Model
- Ability to explain past observations
- Ability to predict future observations
- Ability to control events
- Cost of use, especially in combination with other models
- Refutability, enabling estimation of the degree of confidence in the model
- Simplicity, or even aesthetic appeal
19Computational Modeling
- A computational model is a program that attempts to simulate an abstract model of a particular system.
- Computational models can be mathematical models, or can take any other executable form.
- Computational models can also use external inputs, with only part of the system being modeled, as in flight simulators.
20Computational Modeling
- What is the point of modeling and simulation?
- Don't want to repeat a prediction that is already essentially obvious.
- Don't want to build a model of something that's so poorly understood that you don't really know anything yet.
- Need to find a middle ground.
- In biology that's particularly hard because biologists informally have a lot of intuitive knowledge that helps them get the individual results.
- Can you formalize the published and unpublished knowledge?
21Choice of Formalism
- Determines what information you can represent in your modeling.
- Represent: encode, capture.
- Affects input data, output results, and everything else between.
- Key questions for choosing formalisms: amount of detail, scale(s), dimensions of variation.
- Scope
- Should we build complex models that incorporate everything we know?
- Should we build simple models that incorporate only what we're sure of?
22Spectrum of Computational Mining / Modeling Methods
[Figure: a spectrum of methods running from SPECIFIED to ABSTRACTED. Methods shown: differential equations, Markov chains, Bayesian networks, Boolean/fuzzy logic models, statistical mining. Levels of description shown: mechanisms (including molecular structure-based computation), influences and logic, components and relationships.]
Appropriate approach depends on question and data
23Adapted from Nature Cell Biology 8, 1195-1203 (2006), Aldridge et al.
24Why we need mathematical models to understand
biological systems
Weng, Bhalla and Iyengar, Science. 284, 92
(1999).
26Pathway in diagram form
from Nature 407, 789-795
27Pathway knowledge
Time-series measurements of concentrations
Reaction equations with unknown parameters
Parameter estimation
Computational model of pathway dynamics
Biological Predictions
28Semi-precise format
29To be 100% precise we need each step to have a reaction equation with defined kinetics
- Mass action (reversible)
- caspase-3 + IAP ⇌ caspase-3:IAP, with rate constants kforward and kbackward
- Irreversible mass action
- Ligand + Receptor → Ligand:Receptor, with rate constant kLR
- Michaelis-Menten
- procaspase-3 → caspase-3, catalyzed by caspase-8. Two parameters, k and Km.
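The three kinetic forms named on this slide can be written as small rate functions. This is a minimal sketch, not the model from the talk: the function names and the numeric inputs are hypothetical, and the Michaelis-Menten form assumes the common convention that k multiplies the enzyme (caspase-8) concentration.

```python
def mass_action_reversible(k_forward, k_backward, a, b, ab):
    """Net rate of A + B <-> AB: forward binding minus backward unbinding."""
    return k_forward * a * b - k_backward * ab


def mass_action_irreversible(k, a, b):
    """Rate of A + B -> AB with a single rate constant."""
    return k * a * b


def michaelis_menten(k, km, enzyme, substrate):
    """Rate of substrate -> product catalyzed by an enzyme.

    Assumed form: k * [enzyme] * [substrate] / (Km + [substrate]).
    """
    return k * enzyme * substrate / (km + substrate)


# Hypothetical concentrations and constants, in arbitrary units.
print(mass_action_reversible(2.0, 1.0, 3.0, 4.0, 5.0))
print(mass_action_irreversible(2.0, 3.0, 4.0))
print(michaelis_menten(k=1.0, km=5.0, enzyme=2.0, substrate=5.0))
```

Each function returns a reaction velocity, which is the quantity that enters the differential equations for the concentrations.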
30Reaction equations are interchangeable with differential equations that calculate concentrations over time
- Receptor + Ligand → Ligand:Receptor, with rate constant kLR
- d[LR]/dt = kLR [L][R]
- d[L]/dt = -kLR [L][R]
- d[R]/dt = -kLR [L][R]
- kLR [L][R] = velocity
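To see how these equations "calculate concentrations over time", the sketch below integrates them with the forward Euler method. The rate constant, initial concentrations, and step size are hypothetical values chosen for illustration, and Euler integration is only the simplest possible solver, not necessarily what a real pathway model would use.

```python
def simulate_binding(k_lr, ligand0, receptor0, t_end, dt=0.001):
    """Forward-Euler integration of L + R -> LR with velocity k_lr*[L]*[R]."""
    ligand, receptor, complex_lr = ligand0, receptor0, 0.0
    trajectory = [(0.0, ligand, receptor, complex_lr)]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        velocity = k_lr * ligand * receptor
        ligand -= velocity * dt       # d[L]/dt  = -kLR [L][R]
        receptor -= velocity * dt     # d[R]/dt  = -kLR [L][R]
        complex_lr += velocity * dt   # d[LR]/dt = +kLR [L][R]
        trajectory.append((i * dt, ligand, receptor, complex_lr))
    return trajectory


# Hypothetical values: twice as much ligand as receptor.
trajectory = simulate_binding(k_lr=1.0, ligand0=2.0, receptor0=1.0, t_end=5.0)
t_final, ligand, receptor, complex_lr = trajectory[-1]
print(f"t={t_final:.1f}  [L]={ligand:.3f}  [R]={receptor:.3f}  [LR]={complex_lr:.3f}")
```

Because every step removes the same amount from L and R that it adds to LR, the totals [L] + [LR] and [R] + [LR] stay constant, which is a quick sanity check on any simulation of this reaction.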
31Pathway knowledge
Time-series measurements of concentrations
Reaction equations with unknown parameters
Parameter estimation
Computational model of pathway dynamics
Biological Predictions
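The parameter estimation step in this workflow can be illustrated on the simplest case, a single unknown rate constant. The sketch below is an assumption-laden toy: the "measurements" are synthetic (generated from a known constant so the fit can be checked), the candidate grid is hypothetical, and real pathway models usually need more sophisticated optimizers than a grid search.

```python
def complex_curve(k_lr, ligand0, receptor0, times, dt=0.001):
    """Return [LR] at each requested time for L + R -> LR (forward Euler)."""
    ligand, receptor, complex_lr = ligand0, receptor0, 0.0
    step, values = 0, []
    for target in times:
        target_step = int(round(target / dt))
        while step < target_step:
            velocity = k_lr * ligand * receptor
            ligand -= velocity * dt
            receptor -= velocity * dt
            complex_lr += velocity * dt
            step += 1
        values.append(complex_lr)
    return values


def sum_squared_error(k_lr, times, observed, ligand0, receptor0):
    """Mismatch between simulated and observed [LR] time series."""
    predicted = complex_curve(k_lr, ligand0, receptor0, times)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


# Synthetic time-series "measurements" generated with a known constant.
times = [0.5, 1.0, 1.5, 2.0]
true_k = 0.8
observed = complex_curve(true_k, 2.0, 1.0, times)

# Grid search: choose the candidate with the smallest mismatch to the data.
candidates = [round(0.1 * i, 1) for i in range(1, 21)]
best_k = min(candidates,
             key=lambda k: sum_squared_error(k, times, observed, 2.0, 1.0))
print("estimated kLR:", best_k)
```

With noisy real data the minimum would not be exactly zero, and several parameter values might fit almost equally well, which is one reason the choice of time points matters.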
32How much ligand is needed to cause activation of
caspase-8?
33What part of this research is within a
traditional discipline?
- Mathematical equations for the pathway
- Defining the format and parameters of a model
- Experiments for calibrating the model
- Parameter Estimation Method
- Choose the best number for each parameter
- Simulate pathway behaviors
- Biological Experiments
Computer Science
Biology
34Impact of modeling according to (Bentele et al., 2004)
- Communicate interpretations and causes
- Focus debate and avoid pointless argument
- Fewer experiments needed to pinpoint responsible regulatory mechanism.
- Probe different scenarios / hypotheses.
- They feel they couldn't have gotten the result without both model and experiment.
35Benefits of Modeling
- Larger pathways become very complex
- Human intuition isn't good at intuiting dynamics
- Steady states are much easier.
- Help design more efficient experiments.
- Such as choosing timepoints.
- Bird's-eye view shows possibilities you haven't considered
36Benefits of Modeling
- Once you set up the model, each thing you try with it gives instantaneous results, for free
- And you can try changing ANYTHING.
- Each wet experiment requires a lot of resources.
- Explore a huge number of perturbations
- Each perturbation can be an experimental condition.
- Each condition can be a hypothesis, and each simulation shows the implications of that hypothesis.
37Mental benefits of modeling
- Quantitative awareness
- For example about amount of uncertainty
- Visualize all concentration curves
- Can trace causes, not just correlations
- Many people swear by it
- We limit our thinking unconsciously, especially about unmeasurable things
- Complementary to human discussions
- Competitive advantage?
38My Research
- Using differential equations to model the
dynamics of signaling pathways.
39Nature Cell Biology 8, 1195-1203 (2006), Aldridge et al.
40My Research
[Pathway diagram. Species shown: PI3K (active, inact, membr), PIP2, PIP3, PDK1 (active, membr, inact), Akt (cyto, membr), p-Akt, PP2A, RedEnz, PTEN, PTENox, DDC, DPI, NOX, catalase, SOD, O2-, H2O2.]
41[Figure. Labels shown: O2-, PDK1_inactive, Akt_p.]
42What's my modeling for?
- A model cannot be proven correct.
- Modeling cannot disprove anything.
- A model can suggest a view of reality
- Suggest experiments
- Suggest interpretations of datasets
- Suggest implications
- Predict the results of experiments never
performed before
43What's my modeling for?
- Proof of Plausibility
- Accounting for the effects of a drug
- Suggesting a missing link
- Hypothesis Management
- Suggesting experiments to differentiate
- Disambiguate Communication
- Meta-study of Existing Reports
- Understand complex dynamics
44Proof of Plausibility
- Drug D causes some mild, subtle upstream effects e, f, g, and a massively important downstream effect Z.
- D causes large changes in the localization of protein e, but the data is noisy. Protein f shows 15% increased expression, not so big. Etc.
- GOAL / QUESTION TO ANSWER: Are the effects e, f, g sufficient to explain the big downstream effect Z?
- Or should we look for D having other effects?
45Proof of Plausibility
- Build a set of differential equations starting from reactions D → e, D → f, D → g and ending with the reactions that create Z.
- Simulate the system computationally. Are the effects e, f, g sufficient to explain the downstream effect Z?
- E.g., modeling can provide mathematical evidence that D → e and D → f are enough to explain 100% of Z, without needing g.
46Hypothesis Management
- Rather than doing experiments to confirm or refute the most likely hypothesis:
- Provide a systematic list of hypotheses
- According to whatever criteria (e.g. willing to measure)
- Simulate each hypothesis
- Automatically filter the results, looking for a good match to data, a conclusive result that depends on the hypothesis.
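The loop described here (list hypotheses, simulate each, filter by match to data) can be sketched in a few lines. Everything below is hypothetical: the one-variable model dZ/dt = production - decay * Z, the three named hypotheses, the "observed" value, and the tolerance are stand-ins chosen only to show the shape of the procedure.

```python
def simulate_z(production_rate, decay_rate, t_end, dt=0.01):
    """Euler-integrate dZ/dt = production_rate - decay_rate * Z from Z(0) = 0."""
    z = 0.0
    for _ in range(int(round(t_end / dt))):
        z += (production_rate - decay_rate * z) * dt
    return z


# Each hypothesis is one candidate set of assumptions about the system.
hypotheses = {
    "H1: slow production": {"production_rate": 0.5, "decay_rate": 1.0},
    "H2: fast production": {"production_rate": 2.0, "decay_rate": 1.0},
    "H3: fast production, fast decay": {"production_rate": 2.0, "decay_rate": 4.0},
}

observed_z = 2.0   # hypothetical late-time measurement of Z
tolerance = 0.1    # hypothetical acceptable mismatch

predictions = {name: simulate_z(t_end=10.0, **params)
               for name, params in hypotheses.items()}
consistent = [name for name, z in predictions.items()
              if abs(z - observed_z) <= tolerance]
print(consistent)
```

In this toy, H1 and H3 predict the same late-time value of Z, so this single measurement could never tell them apart; distinguishing them would need a different experiment, such as an early time point.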
47[Pathway diagram annotated with "Too presumptuous?" and "Wrong! What's better?", with several links marked "?". Species shown: O2-, inactiveReducingEnzyme, ReducingEnzyme (oxidized, active), PPaseNHEox, PPaseNHE1 (active), inactiveNHE1, NHE1 (active), PPaseAkt, PPaseAktox, inactivePDK1, Aktcytosol, PIP3, PIP3PDK1 (active), Akt-P, Aktmembrane.]
48Hypothetical influences of O2- on Akt lifecycle
[Diagram. Labels shown: expression, activation, NHE1, PIP3, via PTEN, Akt_m, PDK1, p-Akt_m, via ERM scaffolding, Akt_cytosol, reduced p-Akt_cytosol, p-Akt_cytosol with disulfide, PP2A, S-NO activation, thioredoxin, and one link marked "?".]
49Hypothesis Management
- Suppose we have a set of several very closely related hypotheses about the connectivity in a signaling pathway.
- Experimental measurements seem to give approximately the same results for each case.
- Design a combination of experimental perturbations that will maximize the differences between the behaviors of the system under the different hypotheses.
- Modeling was successful for suggesting a combination of knockout and si-RNA perturbations for highlighting the phenomenon we wanted to study.
50Disambiguate Communication
- Certain types of complex concepts are hard to express in text.
- High-dimensional or dynamic or nested
- If your data provides evidence for something complex happening, how do you communicate this interpretation unambiguously to your audience?
- Even if you can construct the logic in your head, your reviewers and editors and audience need to be able to follow your logic, reproducibly.
51Meta-Study of Existing Studies
- (Future project, not succeeded yet).
- Basic idea
- Everybody accepts that A → B → C → D.
- 4 papers show that A → X in some cell types.
- 2 papers show that X → Y in some cell types.
- 8 papers show that Y → D in some cell types.
- Use modeling to show what fraction of the A → D effect could theoretically be explained by A → X → Y → D. (Which might shock people.)
52Example: a study of dynamics
- AIM 1
- Characterize dynamic behavior of urokinase-mediated plasmin, in silico and in vitro.
- AIM 2
- Model the role of plasmin dynamics in liver fibrosis
- AIM 3
- Experimentally validate model predictions
[Diagram labels: Urokinase, TSP1, Plasmin, LTGF-ß1, TGF-ß1]
53Detail about first aim
- CONCEPT: The mechanism of activation of plasmin from urokinase allows for bistability in plasmin.
- 1a Model development: Define the differential equations and the parameters.
- 1b Model simulation: Simulate the model in MATLAB, under different initial conditions.
- Phase plane and nullcline analysis: Steady state analysis of the reduced system of ODEs.
- Bifurcation analysis: Analyzing the steady state behavior of the system with changes in parameters.
- Experimental validation: Experimental validation of bistable behavior.
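Bistability means that the same equations settle into different steady states depending on where the system starts. The sketch below shows this with a generic one-variable toy, dx/dt = x(1 - x)(x - a). It is not the urokinase/plasmin model from this project (whose equations are not given on the slides), it uses Python rather than MATLAB, and the threshold a = 0.3 and the initial conditions are hypothetical.

```python
def simulate_toy_bistable(x0, threshold=0.3, t_end=50.0, dt=0.01):
    """Euler-integrate dx/dt = x * (1 - x) * (x - threshold).

    Steady states are x = 0 and x = 1 (stable) and x = threshold (unstable).
    """
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += x * (1.0 - x) * (x - threshold) * dt
    return x


# Same model, different initial conditions on either side of the threshold.
low_start = simulate_toy_bistable(x0=0.2)
high_start = simulate_toy_bistable(x0=0.4)
print(f"start 0.2 -> {low_start:.4f}")
print(f"start 0.4 -> {high_start:.4f}")
```

Scanning the threshold parameter and recording where the steady states appear or disappear would be a crude version of the bifurcation analysis mentioned above.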
54Then study larger implications
56Common Human Errors in Prob/Stat Reasoning
- Slide Credits
- Norman Fenton's BBN web site: http://www.dcs.qmul.ac.uk/norman
- Yuval Shahar's class on medical decision-making: http://www.ise.bgu.ac.il/courses/mdss/
57Probabilistic Reasoning
- Probabilistic reasoning is used extensively in computer science, especially in the design of algorithms and in artificial intelligence, and CompSci is one of the few undergraduate degrees outside math that teaches probabilistic inference.
- The following fallacies are to help us appreciate the importance of using formal methods and not just common sense and gut instincts for solving problems.
- These are the sorts of things where humans tend to intuit the wrong conclusions, and where automated reasoning about outcomes, or simulations of outcomes, can be particularly important.
58 Here is a list of names from Hollywood movies
- Tom Hanks
- Ruth Gordon
- George Clooney
- Meg Ryan
- Natalie Wood
- Jackie Chan
- Judy Davis
- Bruce Willis
- Arnold Schwarzenegger
- Lauren Bacall
59Memory Test
- How many names were in the list?
- How many were male?
- How many were female?
- When given an equal number of each gender in the list, if the men in the list are more famous than the women in the list, then subjects usually think the list has more men than women.
- Increased familiarity makes some names more retrievable.
60Search Effects
- Think of English words that have at least three letters.
- For a word picked at random that contains the letter "r":
- Is it more likely that the word would start with the letter "r", or more likely that "r" would be the third letter?
- "r" is more frequently found in the third position of English words than in the first position. Subjects usually choose the reverse.
- It is easier to search for words by their first letter than by their third, so the first case is more available, despite being less numerous.
61- You are given the following information:
- Ken is an extremely athletic-looking young man who drives a fast car and has an attractive girlfriend.
- Is Ken more likely to be a professional football player or a nurse?
62Representativeness
- We often judge whether object X belongs to class Y by how representative X is of class Y
- Despite differences in the prior probability of each profession, subjects usually order the probability of potential occupations by similarity to representatives.
- Prob(famous | popular) ≠ Prob(popular | famous)
- Prior probabilities of diseases are often ignored when the patient seems to fit the description of a rare disease.
- E.g., Eastern Equine Encephalitis
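A short Bayes' rule calculation shows why the prior matters in the football-player-or-nurse question. All numbers below are hypothetical, invented only to illustrate the arithmetic: the ratio of nurses to professional footballers and the chance that each fits the description are assumptions, not data.

```python
def posterior_a(prior_a, prior_b, likelihood_a, likelihood_b):
    """P(A | description) when A and B are the only two options considered.

    likelihood_x is P(description | X); priors are P(X) before the description.
    """
    joint_a = prior_a * likelihood_a
    joint_b = prior_b * likelihood_b
    return joint_a / (joint_a + joint_b)


# Hypothetical: 100 nurses for every professional footballer,
# and the description fits 80% of footballers but only 5% of nurses.
p_footballer = posterior_a(prior_a=1 / 101, prior_b=100 / 101,
                           likelihood_a=0.80, likelihood_b=0.05)
print(f"P(footballer | description) = {p_footballer:.3f}")
print(f"P(nurse | description)      = {1 - p_footballer:.3f}")
```

Even though the description is sixteen times more typical of a footballer under these assumptions, the nurse remains the more probable answer because nurses are so much more numerous.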
63- Steve is very shy and withdrawn, invariably helpful, but with little interest in people, or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail.
- Is Steve a farmer, a librarian, a physician, an airline pilot, or a salesman?
- Farmers are half the world's population. How many librarians are there?
64- Write down the last two digits of your student ID number on a piece of paper.
- Or passport number if you don't have a student ID number.
- Digits only. Please ignore characters.
65Anchoring
- Please write down your estimate:
- What percentage of Singapore households own at least one iPod?
- Anchoring describes the phenomenon where previous estimates (including random or uncorrelated numbers) bias later estimates and revisions.
- Are your ID numbers correlated with your predictions of iPod ownership?
66Insufficient Adjustment Anchoring
- Anchoring occurs even when initial estimates (e.g., percentage of African nations in the UN) were explicitly made at random by spinning a wheel!
- Anchoring may occur due to incomplete calculation, as when two groups of high-school students were asked to estimate
- the expression 8x7x6x5x4x3x2x1 (median answer 2250)
- or the expression 1x2x3x4x5x6x7x8 (median answer 512).
- Both groups fell far short of the true value, 40,320, and the group that started from the small numbers fell shortest.
- Anchoring occurs even with outrageously extreme anchors (Quattrone et al., 1984)
- Anchoring occurs even when experts (real-estate agents) estimate real-estate prices (Northcraft and Neale, 1987)
67Illusory Correlations
- The probability that two events will co-occur
- is often judged not on the number of co-occurrences (which would be correct),
- but mistakenly on the strength/degree of their association in the instances when they do co-occur, even if the number of co-occurrences is small.
68- People expect random sequences to be representatively random even locally
- E.g., they consider a coin-toss run of HTHTTH to be more likely than HHHTTT or HHHHTH
- The Gambler's Fallacy
- the mistaken belief that past events will affect future events when dealing with random activities
- After there has been a run of reds in roulette, people think black is more (or less) likely to be next.
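The coin-toss claim is easy to check exactly: for a fair coin with independent tosses, every specific sequence of six tosses has the same probability. The sketch below assumes a fair coin (an assumption, stated in the default argument) and uses exact fractions to avoid rounding.

```python
from fractions import Fraction


def sequence_probability(sequence, p_heads=Fraction(1, 2)):
    """Probability of one exact sequence of independent coin tosses."""
    probability = Fraction(1)
    for toss in sequence:
        probability *= p_heads if toss == "H" else 1 - p_heads
    return probability


sequences = ["HTHTTH", "HHHTTT", "HHHHTH"]
probabilities = {s: sequence_probability(s) for s in sequences}
for s, p in probabilities.items():
    print(s, p)
```

The "random-looking" sequence HTHTTH is exactly as likely as the "patterned" ones, because each toss contributes a factor of 1/2 regardless of what came before.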
69The Illusion of Validity
- A good match between input information and outcome causes people to over-estimate their confidence in a prediction.
- Internal consistency of input pattern increases confidence
- Redundant, correlated data increases people's confidence in a prediction.
- A series of B's seems more predictive of a final grade-point average than a set of A's and C's
70Misconceptions of Regression
- People tend to ignore the phenomenon of regression towards the mean
- Correlation between parents' and children's heights or IQ.
- People expect predicted outcomes to be as representative of the input as possible.
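Regression towards the mean can be seen in a small simulation. The sketch below is illustrative only: heights are expressed as standardized scores, the parent-child correlation of 0.5 is a hypothetical value, and the random seed is fixed so the run is repeatable.

```python
import random
import statistics


def simulate_pairs(n, correlation, seed=0):
    """Generate (parent, child) standardized scores with the given correlation."""
    rng = random.Random(seed)
    noise_scale = (1 - correlation ** 2) ** 0.5
    pairs = []
    for _ in range(n):
        parent = rng.gauss(0, 1)
        child = correlation * parent + noise_scale * rng.gauss(0, 1)
        pairs.append((parent, child))
    return pairs


pairs = simulate_pairs(n=50_000, correlation=0.5)

# Select unusually tall parents (more than 1 standard deviation above average).
tall = [(parent, child) for parent, child in pairs if parent > 1.0]
mean_parent = statistics.fmean(parent for parent, _ in tall)
mean_child = statistics.fmean(child for _, child in tall)
print(f"tall parents average {mean_parent:.2f} SD above the mean")
print(f"their children average {mean_child:.2f} SD above the mean")
```

The children of tall parents are still taller than average, but less extreme than their parents, which is the pattern intuition tends to miss when it expects outcomes to be as representative of the input as possible.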
71Prep for Exam
- Suppose performing well on an exam requires that I set my alarm properly (95% probable), that I wake up when my alarm sounds (90% probable), that I don't have a headache (95% probable), that my bus isn't late (80% probable), and that I'm not sitting next to somebody who taps their pencil (80% probable).
- Probability of performing well is 52%, assuming the five events are independent: 0.95 x 0.90 x 0.95 x 0.80 x 0.80 ≈ 0.52.
- People tend to overestimate the probability of conjunctive events
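The 52% figure can be reproduced by multiplying the individual probabilities, under the assumption that the five events are independent. The dictionary keys below are just labels invented for readability.

```python
import math

# Probability of each requirement being met, taken from the slide.
step_probabilities = {
    "alarm set properly": 0.95,
    "wake up when alarm sounds": 0.90,
    "no headache": 0.95,
    "bus not late": 0.80,
    "no pencil tapping nearby": 0.80,
}

# Assuming independence, the probability that all hold is the product.
p_all = math.prod(step_probabilities.values())
print(f"P(all five hold) = {p_all:.4f}")
```

Every individual step is at least 80% likely, yet the chance that all of them hold together is only about one in two, which is why conjunctions are so easy to overestimate.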