Evolving L-Systems to Capture Protein Structure Native Conformations

1
Evolving L-Systems to Capture Protein
Structure Native Conformations
  • Gabi Escuela (1), Gabriela Ochoa (2) and Natalio
    Krasnogor (3)
  • (1, 2) Department of Computer Science, Universidad
    Simon Bolivar, Caracas, Venezuela
  • (1) gabiescuela_at_netuno.net.ve, (2) gabro_at_ldc.usb.ve
  • (3) School of Computer Science and I.T., University
    of Nottingham
  • Natalio.Krasnogor_at_nottingham.ac.uk

2
Content
  • Proteins
  • Protein Structure Prediction (PSP)
  • The HP model
  • EA approaches to PSP: current encoding
  • L-Systems
  • Why a grammatical encoding?
  • Methods and Results
  • Discussion and Future Work

3D structure of myoglobin, showing coloured alpha
helices.
3
Proteins
  • Linear chains of 30-400 units from 20 different
    amino acids
  • Fold into a unique functional structure: the native
    state or tertiary structure
  • Show repeated substructures: alpha helices and
    beta sheets
1A8M 3-D Structure
4
Protein Structure Prediction (PSP)
  • Goal: determine the 3D structure of proteins
    from their amino acid sequences
  • Strategy: find an amino acid chain's state of
    minimum energy
  • A solution will have practical consequences in
    medicine, drug development and agriculture

5
The 2D HP Model
  • The hydrophobic effect is the main force governing
    folding
  • A protein is a string q over {H, P}; each letter of
    q has to be placed on a vertex of a given lattice L
    (at each point turn 90° left or right, or continue
    ahead)
  • Scoring function: adds -1 for each contact
    between two Hs that are adjacent in the lattice but
    not consecutive in q

2 amino acid types: hydrophobic (H) and polar or
hydrophilic (P)
HPHPPHHPHPPHPHHPPHPH
Square Lattice
9 H-H contacts, score -9
  • Objective: find the organization (embedding) of q
    in L of minimum score (maximum contacts)
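
The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `embed` and `hp_score`, the choice of U as +y and R as +x, and the decision to reject folds that revisit a lattice site are all assumptions. The fold string is the absolute-encoding example that appears later in the deck for the same 20-residue sequence.

```python
# Unit steps for an absolute move string on the 2D square lattice
# (axis orientation is an assumption of this sketch).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def embed(moves, start=(0, 0)):
    """Return the lattice coordinates visited by an absolute move string."""
    x, y = start
    coords = [(x, y)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def hp_score(sequence, moves):
    """Score a fold: -1 per H-H lattice contact not consecutive in the chain."""
    coords = embed(moves)
    if len(coords) != len(sequence):
        raise ValueError("a chain of n residues needs n - 1 moves")
    if len(set(coords)) != len(coords):
        raise ValueError("fold is not self-avoiding")
    index_at = {c: i for i, c in enumerate(coords)}
    score = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                score -= 1
    return score


print(hp_score("HPHPPHHPHPPHPHHPPHPH", "RDDLULDLDLUURULURRD"))  # -9
```

The result agrees with the 9 H-H contacts quoted for this sequence on the square lattice.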

6
EA approaches to PSP: Current (Direct) Encoding
  • EAs and other stochastic methods: global
    optimization of a suitable energy function
  • Encoding: Cartesian coordinates, distance
    geometries, internal coordinates
  • Absolute: structure encoded as a string of
    symbols. For example, in the 2D square lattice:
  • s is a string over {Up, Down, Left, Right}
  • Relative: each move is interpreted in terms of
    the previous one
  • s is a string over {Forward, TurnLeft, TurnRight}

7
Protein: HPHPPHHPHPPHPHHPPHPH (L = 20)
Absolute Encoding
RDDLULDLDLUURULURRD (L = 19)
First position is fixed
Relative Encoding
RFRRLLRLRRFRLLRRFR (L = 18)
First and second positions are fixed
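
The two encodings describe the same fold, which a short conversion makes concrete. The sketch below is illustrative only: the name `relative_to_absolute` is made up, and it assumes that the fixed first move is R and that directions are ordered clockwise (U, R, D, L), so a right turn advances one step in that order.

```python
# Absolute directions in clockwise order: a right turn is +1, a left turn is -1.
COMPASS = "URDL"
TURN = {"F": 0, "R": 1, "L": -1}


def relative_to_absolute(relative, first_move="R"):
    """Expand relative moves (F, L, R) into absolute moves (U, D, L, R)."""
    heading = COMPASS.index(first_move)
    absolute = [first_move]
    for symbol in relative:
        heading = (heading + TURN[symbol]) % 4
        absolute.append(COMPASS[heading])
    return "".join(absolute)


print(relative_to_absolute("RFRRLLRLRRFRLLRRFR"))  # RDDLULDLDLUURULURRD
```

Under these assumptions the 18-symbol relative string expands to exactly the 19-symbol absolute string shown on this slide.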
8
L-Systems (Lindenmayer, 1968)
  • A model of morphogenesis, based on formal
    grammars
  • Rewriting: define complex objects by replacing
    parts of a simple object using a set of
    productions.
  • Symbols: F, f, +, -, [, ]
  • Axiom (S)
  • Production (replacement) rules

Figure: example production rules (r1, r2) and the resulting derivation from the start string through steps 1, 2 and 3.
9
Why a Grammatical Encoding?
  • Specifies how to construct the phenotype
  • Can achieve greater scalability through
    self-similar and hierarchical structure
  • Proteins exhibit a high degree of regularity and
    repeated motifs
  • Current encoding may not be suitable for
    crossover and building block transfer between
    individuals

Protein Structure
3D L-System
10
Method
  • Proof of principle: can a folded protein be
    captured (encoded) by an L-system?
  • How to find that L-system: an EA is used to evolve
    an L-system that captures a folded protein
    (inverse problem)

Input: folded structure in relative
coordinates, RFRRLLRLRRFRLLRRFR
EA
Output: L-system that, once derived, will
produce the target string RFRRLLRLRRFRLLRRFR
Axiom: 01F; Rules: 0 → RFR1, 1 → 2L2, 2 → R0L
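
The derivation of this example can be traced with a small sketch. It reads the rules on this slide as 0 → RFR1, 1 → 2L2 and 2 → R0L. Two things are assumptions of the sketch rather than statements from the slides: rewriting stops as soon as the pruned string is at least as long as the target, and the pruned string is then truncated to the target length. The helper names `rewrite`, `prune` and `derive` are illustrative.

```python
TERMINALS = set("FLR")


def rewrite(string, rules):
    """Apply one parallel rewriting step; symbols without a rule are copied."""
    return "".join(rules.get(symbol, symbol) for symbol in string)


def prune(string):
    """Drop non-terminal symbols, leaving only relative moves."""
    return "".join(s for s in string if s in TERMINALS)


def derive(axiom, rules, target_length, max_steps=10):
    """Rewrite until the pruned string is long enough, then truncate it.

    Both the stopping rule and the truncation are assumptions of this sketch.
    """
    string = axiom
    for _ in range(max_steps):
        if len(prune(string)) >= target_length:
            break
        string = rewrite(string, rules)
    return prune(string)[:target_length]


axiom = "01F"
rules = {"0": "RFR1", "1": "2L2", "2": "R0L"}
target = "RFRRLLRLRRFRLLRRFR"
print(derive(axiom, rules, len(target)) == target)  # True
```

With these rules, three rewriting steps give a pruned string of 20 symbols whose first 18 symbols are the target fold.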
11
Proposed Grammatical Encoding
  • D0L-system (deterministic and context free)
  • Alphabet: Σ = Σt ∪ Σnt
  • Σt = {F, L, R}: terminal symbols (relative coord.);
    Σnt = {0, 1, 2, ..., m-1}: non-terminal symbols
    (rewriting rules), m = max. number of rules
  • Axiom: a ∈ Σ*
  • Rewriting rules: i → wi, where i ∈ Σnt and wi ∈
    Σ*

Example: axiom R2; rules 0 → R03F, 1 → R01L,
2 → F310, 3 → LRL3
12
Evolutionary Algorithm
  • Generational, with rank-based selection
  • Randomly generated initial population
  • Preset maximum number of rules
  • Axiom and rules: randomly generated strings of
    preset maximum length
  • Genetic operators:
  • Uniform-like (homologous) recombination (rate
    1.0): complete production rules are interchanged
  • Per-symbol mutation in both axioms and rules
    (deletion (30%), insertion (10%),
    modification (60%))
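
The two genetic operators could look roughly like the sketch below. Several details are assumptions, since the slides do not give them: the per-symbol mutation rate (`rate`), the equal chance of inheriting a rule from either parent, truncation to `max_length` after insertions, and the treatment of the 30/10/60 figures as percentages. The function names are illustrative, and the parent rule "3" → "F" in the usage lines is a made-up filler.

```python
import random

TERMINALS = "FLR"


def recombine(parent_a, parent_b, rng):
    """Uniform-like recombination: each complete rule comes from either parent."""
    child = {}
    for key in parent_a:
        source = parent_a if rng.random() < 0.5 else parent_b
        child[key] = source[key]
    return child


def mutate(string, alphabet, rate, rng, max_length):
    """Per-symbol mutation: 30% deletion, 10% insertion, 60% modification."""
    out = []
    for symbol in string:
        if rng.random() >= rate:
            out.append(symbol)  # symbol left untouched
            continue
        kind = rng.choices(["delete", "insert", "modify"], weights=[30, 10, 60])[0]
        if kind == "insert":
            out.extend([symbol, rng.choice(alphabet)])
        elif kind == "modify":
            out.append(rng.choice(alphabet))
        # "delete" appends nothing.
    return "".join(out)[:max_length]


rng = random.Random(0)
alphabet = TERMINALS + "0123"
a = {"0": "R03F", "1": "R01L", "2": "F310", "3": "LRL3"}
b = {"0": "RFR1", "1": "2L2", "2": "R0L", "3": "F"}  # rule "3" is a filler
child = recombine(a, b, rng)
child = {k: mutate(w, alphabet, 0.1, rng, 8) for k, w in child.items()}
print(child)
```

Exchanging whole rules keeps each production intact, which is the property the slide attributes to the homologous recombination. The axiom would be mutated with the same `mutate` function.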

13
Derivation, and Fitness Function
  • Derivation: from genotype (axiom and rules) to
    phenotype (folded structure)
  • Post-processing: pruning of non-terminal symbols
  • Fitness calculation: number of matches between
    the target string and the solution. Min. 0, max.
    length of the desired folding.
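
A minimal sketch of such a fitness function follows. It assumes that "matches" means position-by-position agreement between the pruned solution and the target, which the slides do not spell out; extra symbols beyond the target length are ignored and missing symbols count as mismatches.

```python
def fitness(solution, target):
    """Count positions where the solution agrees with the target fold."""
    # zip stops at the shorter string, so the score never exceeds len(target).
    return sum(1 for s, t in zip(solution, target) if s == t)


target = "RFRRLLRLRRFRLLRRFR"
print(fitness(target, target))  # 18
print(fitness("RFL", target))   # 2
```

The value ranges from 0 to the length of the desired folding, as stated above.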

14
Results (1)
15
Results (2)
Evolutionary progression towards the target
structure
16
Discussion
  • The proposed EA discovered L-systems that capture
    a target folding under the HP model in 2D
    lattices
  • We are not solving the PSP yet, but ...
  • We are proposing a novel and potentially useful
    generative encoding for evolutionary approaches
    to PSP

17
Future work
  • Incorporate problem knowledge about secondary
    structures (beta turn, beta sheet, alpha helix)
  • Explore longer chains and 3D lattices