Dimitris Papamichail - PowerPoint PPT Presentation

1
Design tools for Synthetic Virology
  • Dimitris Papamichail
  • Assistant Professor
  • Computer Science Department
  • University of Miami

2
Our group
  • I have been working on design and synthesis of
    genomic sequences in collaboration with:
  • Professor Steven Skiena
  • Computer Science Department
  • Stony Brook University
  • Professor Eckard Wimmer
  • Department of Molecular Genetics and Microbiology
  • Stony Brook University
  • Other members: Rob Coleman, Steffen Mueller,
    Bruce Futcher

3
Initial motive: Vaccine Design
  • Question: How can we rapidly create a vaccine for
    a new viral disease?
  • Motivation: Mutated lethal viruses,
    bio-terrorism, cheap synthesis technologies
  • Input: The genome of a virus
  • Output: Our design of a "better" virus to serve
    as a vaccine candidate
  • Aim: Redesign life (if you assume viruses are
    alive)

4
Our designs
  • So far, our group has designed, synthesized, and
    evaluated (or is still evaluating):
  • Four new variants of poliovirus
  • - Two codon optimized designs
  • - Two codon pair optimized designs
  • Several optimized flu virus segments
  • Recently a couple of bacterial genes
  • A couple of artificial constructs for hypothesis
    testing.

5
Novel sequence design
Our goal
  • Debilitate virus genome translation /
    replication.
  • Make a virus that is difficult to revert and
    genetically stable, because its phenotype is the
    cumulative result of many mutations, each with a
    small effect.
  • Optimization problem at hand
  • Select one (or a few) of the 7.9 x 10^442 possible
    synonymous encodings for a poliovirus capsid gene
    of 881 amino acids (compared to an estimated 1.3
    x 10^79 atoms in the known universe) that serve
    our design criteria.
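
The size of this search space comes from multiplying, position by position, the number of codons that can encode each amino acid. A minimal Python sketch of that count, assuming the standard genetic code; the function names and the toy peptide are illustrative and do not come from the slides:

```python
import math

# Codons per amino acid in the standard genetic code (61 sense codons in total).
DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

def count_encodings(protein):
    """Exact number of synonymous DNA encodings of a protein string."""
    return math.prod(DEGENERACY[aa] for aa in protein)

def log10_encodings(protein):
    """Order of magnitude of the count, convenient for long proteins."""
    return sum(math.log10(DEGENERACY[aa]) for aa in protein)

# Toy peptide Met-Lys-Arg: 1 * 2 * 6 encodings.
print(count_encodings("MKR"))
```

Applied to a real capsid protein sequence, `log10_encodings` would give the exponent directly without building a several-hundred-digit integer.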

6
Novel sequence design
First two designs: Codon (de-)optimized
  • We tested the hypothesis that underrepresented
    codons reduce translation efficiency by creating
    a novel polio capsid design (PV-AB) which:
  • Encoded the same amino-acid sequence
  • Used only the least frequent codon for each
    amino-acid in human brain specific genes (and in
    human tissues in general).
  • Total number of silent mutations: 680
  • We also created another design (PV-SD) which
    maximized the Hamming distance of the capsid
    encoding from the original encoding, while
    keeping the same codon frequency distribution.
  • Total number of silent mutations: 934
  • Why alter only the capsid coding region?
  • No cis-acting structural RNA elements
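
The codon de-optimization idea behind PV-AB can be sketched in a few lines of Python: for every amino acid, pick the synonymous codon with the lowest usage frequency and substitute it everywhere. The codon table below is a toy subset and the usage frequencies are hypothetical placeholders, not the brain-specific values used for the actual design:

```python
# Toy codon table and usage frequencies (hypothetical values, for illustration only).
CODON_TO_AA = {"GCT": "A", "GCC": "A", "GCG": "A", "CTG": "L", "TTA": "L"}
USAGE = {"GCT": 0.26, "GCC": 0.40, "GCG": 0.11, "CTG": 0.41, "TTA": 0.07}

def rarest_codons(codon_to_aa, usage):
    """Map each amino acid to its least frequently used codon."""
    rarest = {}
    for codon, aa in codon_to_aa.items():
        if aa not in rarest or usage[codon] < usage[rarest[aa]]:
            rarest[aa] = codon
    return rarest

def deoptimize(codons, codon_to_aa, usage):
    """Re-encode a codon list using only the rarest synonymous codons."""
    rarest = rarest_codons(codon_to_aa, usage)
    return [rarest[codon_to_aa[c]] for c in codons]

def base_changes(a, b):
    """Number of nucleotide positions that differ between two codon lists."""
    return sum(x != y for ca, cb in zip(a, b) for x, y in zip(ca, cb))

wild = ["GCC", "CTG", "GCT", "TTA"]
design = deoptimize(wild, CODON_TO_AA, USAGE)
print(design, base_changes(wild, design))
```

The protein is unchanged by construction, since every substitution stays within one amino acid's codon family.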

7
Novel sequence design
Experiments
  • The shuffled polio design (PV-SD) translates
    relatively well and is as potent in killing mice
    as the wild type.
  • The brain-hostile design (PV-AB) translates
    minimally, but use of smaller segments leads to
    attenuated strains.

8
Novel sequence design
Methods
  • To achieve maximum Hamming distance without
    altering the codon distribution, we used maximum
    weight bipartite matching between codon positions
    and codons, with each edge weighted by the number
    of bases changed.
  • Restriction sites were inserted uniquely
    (inserted in specific areas and then eliminated
    everywhere else).
  • Certain regions were locked to preserve secondary
    structure.
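
The matching step can be illustrated on a toy input. Because a position can only receive a codon for its own amino acid, the problem splits into one small assignment problem per amino acid. The Python sketch below solves each of them by exhaustive search over permutations, which is a stand-in for a real maximum weight bipartite matching algorithm and only practical for tiny examples; restriction sites and locked regions are ignored, and the codon table is a toy subset:

```python
from collections import defaultdict
from itertools import permutations

# Toy subset of the codon table (assumption: only amino acids used below).
CODON_TO_AA = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
}

def hamming(a, b):
    """Number of differing bases between two codons."""
    return sum(x != y for x, y in zip(a, b))

def max_distance_shuffle(codons):
    """Reassign the existing codons to positions of the same amino acid so
    that the total number of changed bases is maximal."""
    positions_by_aa = defaultdict(list)
    for i, codon in enumerate(codons):
        positions_by_aa[CODON_TO_AA[codon]].append(i)

    result = list(codons)
    for positions in positions_by_aa.values():
        pool = [codons[i] for i in positions]  # codon multiset to preserve
        # Exhaustive search over assignments (brute-force matching).
        best = max(
            set(permutations(pool)),
            key=lambda perm: sum(hamming(codons[i], c)
                                 for i, c in zip(positions, perm)),
        )
        for i, c in zip(positions, best):
            result[i] = c
    return result

wild = ["GCT", "CTG", "GCC", "TTA", "GCT", "CTG"]
shuffled = max_distance_shuffle(wild)
distance = sum(hamming(a, b) for a, b in zip(wild, shuffled))
print(shuffled, distance)
```

For gene-sized inputs the exhaustive search would be replaced by a polynomial-time assignment solver with the same edge weights.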

9
Novel sequence design
Codon pair bias
  • According to Hatfield et al., another source of
    translation (in)efficiency is the codon pair
    bias.
  • We quantified the bias with the following score
  • Many viruses actually use codon pairs that are
    overrepresented in human genes to encode their
    own genes.
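
The score formula is not reproduced in this transcript. As an assumption, the Python sketch below uses one common way to quantify codon pair bias: the natural log of the observed count of a codon pair divided by the count expected from the frequencies of its two codons and of the encoded amino acid pair. The function name, the toy table and the two toy "genes" are hypothetical:

```python
import math
from collections import Counter

def codon_pair_scores(genes, codon_to_aa):
    """Log observed/expected ratio for every codon pair seen in `genes`.

    Assumed expectation for pair (a, b) encoding amino acids (x, y):
        N_a * N_b / (N_x * N_y) * N_xy
    """
    codon_n, aa_n = Counter(), Counter()
    pair_n, aa_pair_n = Counter(), Counter()
    for gene in genes:
        codons = [gene[i:i + 3] for i in range(0, len(gene) - 2, 3)]
        for c in codons:
            codon_n[c] += 1
            aa_n[codon_to_aa[c]] += 1
        for a, b in zip(codons, codons[1:]):
            pair_n[a, b] += 1
            aa_pair_n[codon_to_aa[a], codon_to_aa[b]] += 1

    scores = {}
    for (a, b), observed in pair_n.items():
        x, y = codon_to_aa[a], codon_to_aa[b]
        expected = codon_n[a] * codon_n[b] / (aa_n[x] * aa_n[y]) * aa_pair_n[x, y]
        scores[a, b] = math.log(observed / expected)
    return scores

TABLE = {"GCT": "A", "GCC": "A", "CTG": "L", "TTA": "L"}  # toy subset
scores = codon_pair_scores(["GCTCTG", "GCCTTA"], TABLE)
print(scores)
```

A positive score marks a pair seen more often than expected (overrepresented), a negative score an underrepresented pair.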

10
Novel sequence design
Codon pair bias
  • Codon pair bias also seems to be significantly
    correlated among related eukaryotes.

11
Novel sequence design
Another two designs: Codon pair (de-)optimized
  • We designed two polio capsid sequences, one
    maximizing the use of codon pairs that are
    over-represented in human genes and one
    maximizing the use of under-represented pairs.

12
Novel sequence design
Our designs
  • The design consisted of the following steps:
  • Same codon frequency distribution
  • Optimized codon pair score
  • Restriction site uniqueness and elimination
  • Local Secondary structure folding energy
    restriction
  • Splice site elimination
  • Goals achieved with simulated annealing,
    optimization passes and manual intervention.
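
A minimal Python sketch of the simulated annealing part, under simplifying assumptions: the only move is swapping two synonymous codons (which keeps both the protein and the codon frequency distribution fixed), the objective is the sum of pair scores over adjacent codons, and the restriction site, folding energy and splice site criteria are left out. The pair scores, cooling schedule and parameter names are hypothetical:

```python
import math
import random

def total_score(codons, pair_score):
    """Sum of pair scores over adjacent codons; unknown pairs count as 0."""
    return sum(pair_score.get((a, b), 0.0) for a, b in zip(codons, codons[1:]))

def anneal(codons, codon_to_aa, pair_score, steps=2000, t0=1.0, seed=0):
    """Maximize total_score by swapping synonymous codons between positions."""
    rng = random.Random(seed)
    cur = list(codons)
    cur_s = total_score(cur, pair_score)
    best, best_s = cur[:], cur_s
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-9  # linear cooling (assumption)
        i, j = rng.sample(range(len(cur)), 2)
        # Only swap different codons that encode the same amino acid.
        if cur[i] == cur[j] or codon_to_aa[cur[i]] != codon_to_aa[cur[j]]:
            continue
        cur[i], cur[j] = cur[j], cur[i]
        new_s = total_score(cur, pair_score)
        if new_s >= cur_s or rng.random() < math.exp((new_s - cur_s) / temp):
            cur_s = new_s  # accept the move
            if cur_s > best_s:
                best, best_s = cur[:], cur_s
        else:
            cur[i], cur[j] = cur[j], cur[i]  # reject: undo the swap
    return best, best_s

TABLE = {"GCT": "A", "GCC": "A", "CTG": "L", "TTA": "L"}  # toy subset
SCORES = {("GCT", "CTG"): 1.0, ("GCC", "TTA"): 1.0, ("CTG", "GCC"): 0.5}
start = ["GCC", "CTG", "GCT", "TTA"]
best, best_s = anneal(start, TABLE, SCORES)
print(best, best_s)
```

To push toward underrepresented pairs instead, the same routine can be run on negated scores.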

13
Novel sequence design
Codon pair optimization problem
  • Our problem: a variant of TSP (Travelling
    Salesperson Problem)
  • This variant is solvable in polynomial time, but
    with O(n^65) complexity, using dynamic
    programming.
  • We have 20 countries (amino-acids), 64 cities
    (codons), each country has from 1 to 6 cities.
  • We will make n visits to the countries
    (amino-acid chain).
  • Each time we visit a country we can visit only
    one city (select the codon to code for an amino
    acid).
  • The total number of times we visit each city is
    fixed (codon distribution).
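
The dynamic program behind this formulation can be sketched on a toy instance. The state is the number of visits still owed to each city (codon) plus the city visited last; since there is one counter per codon, the number of states grows polynomially in n with a degree tied to the number of codons. This is an illustrative Python sketch with hypothetical pair scores, not the authors' implementation:

```python
from functools import lru_cache

def best_encoding(protein, codon_counts, codon_to_aa, pair_score):
    """Exact maximum of the summed pair score over all encodings of `protein`
    that use each codon exactly codon_counts[codon] times.
    Returns (score, codons), or None if no such encoding exists."""
    codons = sorted(codon_counts)
    start = tuple(codon_counts[c] for c in codons)

    @lru_cache(maxsize=None)
    def solve(pos, remaining, last):
        if pos == len(protein):
            return 0.0, ()
        best = None
        for k, c in enumerate(codons):
            if remaining[k] == 0 or codon_to_aa[c] != protein[pos]:
                continue
            nxt = remaining[:k] + (remaining[k] - 1,) + remaining[k + 1:]
            sub = solve(pos + 1, nxt, c)
            if sub is None:
                continue
            gain = pair_score.get((last, c), 0.0) if last else 0.0
            cand = (gain + sub[0], (c,) + sub[1])
            if best is None or cand[0] > best[0]:
                best = cand
        return best

    return solve(0, start, None)

TABLE = {"GCT": "A", "GCC": "A", "CTG": "L", "TTA": "L"}  # toy subset
SCORES = {("GCT", "CTG"): 1.0, ("GCC", "TTA"): 1.0, ("CTG", "GCC"): 0.5}
COUNTS = {"GCT": 1, "GCC": 1, "CTG": 1, "TTA": 1}
print(best_encoding("ALAL", COUNTS, TABLE, SCORES))
```

On realistic gene lengths the state space is far too large to enumerate this way, which is consistent with the use of heuristics such as simulated annealing for the actual designs.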

14
Novel sequence design
Results
  • How do the new codon pair biased designs behave?
  • maxP1 (using overrepresented codon pairs, 566
    mutations) translates as well as the wild type.
  • minP1 (using underrepresented codon pairs, 631
    mutations) translates poorly.
  • Details on the results can be found in our June
    27, 2008 Science publication
  • Currently we are also investigating other
    signals, such as CpG dinucleotide content, which
    are inherent in such biased constructs.

15
Sequence design tools
  • Tools that already exist for sequence design:
  • GeMS
  • CAD-PAM
  • Gene2oligo
  • DNAWorks
  • GeneDesign
  • GenoCAD
  • Most of these tools offer:
  • Oligo-design
  • Restriction site creation/elimination
  • Codon usage alteration
  • Oligo-design is the process of breaking the
    sequence into oligos (short sequence fragments),
    which can be used to self-assemble through PCR
    cycles into synthons (or the whole sequence, if
    small enough).
  • We treat this process as a black box provided
    by synthesis companies.

16
Our goal
  • Efficient Algorithm Design for coding sequence
    alterations.
  • Techniques for multi-objective optimization in
    genomic sequence design.
  • Expand the knowledge base of systematic sequence
    bias in genomic sequences
  • Incorporation of pathways, transcriptional
    control, boolean logic and other complex criteria
    in novel genomic sequence design.
  • We aim to create a system conceptually built
    around constraints instead of sequences.
  • The gene/genome designer will work on the level
    of specifying characteristics of the desired
    gene/genome (amino acid sequences,
    codon/codon-pair distribution, distribution of
    restriction sites, RNA secondary structure
    constraints, incorporation/elimination of
    patterns, etc.) and the gene editor will
    algorithmically design a DNA sequence realizing
    these constraints.
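
One way to picture a constraint-centred design is to treat each requirement as a named predicate over candidate DNA sequences and let a search routine look for a sequence that violates none of them. The Python sketch below is purely illustrative: the constraint names, the pattern "CCTG", the GC range and the exhaustive search are assumptions and do not describe SeEd or any existing tool:

```python
from itertools import product

CODON_TO_AA = {"GCT": "A", "GCC": "A", "CTG": "L", "TTA": "L"}  # toy subset

def encodes(protein):
    """Constraint: the DNA must translate to `protein`."""
    def check(dna):
        codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
        return "".join(CODON_TO_AA.get(c, "?") for c in codons) == protein
    return check

def avoids(pattern):
    """Constraint: `pattern` must not occur anywhere in the DNA."""
    return lambda dna: pattern not in dna

def gc_between(lo, hi):
    """Constraint: GC fraction must lie in [lo, hi]."""
    return lambda dna: lo <= sum(b in "GC" for b in dna) / len(dna) <= hi

def violated(dna, constraints):
    """Names of the constraints that `dna` fails, in declaration order."""
    return [name for name, check in constraints.items() if not check(dna)]

def design(protein, constraints):
    """First synonymous encoding satisfying every constraint (brute force)."""
    choices = [sorted(c for c, aa in CODON_TO_AA.items() if aa == residue)
               for residue in protein]
    for combo in product(*choices):
        dna = "".join(combo)
        if not violated(dna, constraints):
            return dna
    return None

spec = {
    "encodes AL": encodes("AL"),
    "avoids CCTG": avoids("CCTG"),
    "GC in [0.3, 0.7]": gc_between(0.3, 0.7),
}
print(violated("GCCCTG", spec), design("AL", spec))
```

The designer only edits `spec`; the search strategy behind `design` can then be swapped for matching, annealing or dynamic programming without touching the constraints.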

17
Sequence design tools
SeEd (Sequence Editor)

18
  • Thank you!

19
  • Questions