Title: From NMR data to structure deposition: The ExtendNMR software pipeline
1Title
From NMR data to structure deposition: The Extend-NMR software pipeline
2The Extend-NMR Pipeline
TopSpin, MDD, PRODECOMP
CcpNmr Analysis, Auremol
ARIA, ISD, HADDOCK
CING
CCPN Data Model
CcpNmr FormatConverter
Reference Data
AutoDep
NMRStar 3.1
Legacy Formats
3Data Modelling
Chain Code
Residue 3 letter code Seq number
Atom Name Element
Coordinate X Y Z
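The hierarchy named on this slide (chain, residue, atom, coordinate) can be sketched as nested records. This is only an illustration of the idea: the class and field names below are hypothetical and are not the CCPN Data Model API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Atom:
    name: str                                   # atom name, e.g. "CA"
    element: str                                # element symbol, e.g. "C"
    coord: Optional[Tuple[float, float, float]] = None  # X, Y, Z


@dataclass
class Residue:
    code3: str                                  # three-letter code, e.g. "GLY"
    seq_number: int                             # sequence number
    atoms: List[Atom] = field(default_factory=list)


@dataclass
class Chain:
    code: str                                   # chain code, e.g. "A"
    residues: List[Residue] = field(default_factory=list)

    def find_atom(self, seq_number, atom_name):
        """Return the matching Atom, or None if it is not present."""
        for residue in self.residues:
            if residue.seq_number == seq_number:
                for atom in residue.atoms:
                    if atom.name == atom_name:
                        return atom
        return None


# Toy example with made-up coordinates.
chain = Chain("A", [Residue("GLY", 1, [Atom("CA", "C", (1.0, 2.0, 3.0))])])
```

Keeping each level as its own record means a program can ask for "atom CA of residue 1 in chain A" without parsing a fixed-column file.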
4CCPN Interface Schemes
Via format conversion (Example HADDOCK)
Application
Formatted File
Proprietary Memory
CCPN Project
In-memory conversion (Example ARIA2)
Custom conversion
Application
Proprietary Model
CCPN Data Model
CCPN Project
Direct API access (Example PRODECOMP)
Application
CCPN Project
CCPN Data Model CcpNmr Functions
5Extend-NMR
6Collaborations outside Extend-NMR
- Europe
- Hartmut Oschkinat
- Solid State in CcpNmr Analysis - Available, growing
- Marcus Zweckstetter
- MARS - Available soon
- PALES - Planned
- Martin Blackledge
- MECCANO - Available
- America
- Miguel Llinas
- CLOUDS, BACUS - Available
- Guy Montelione (with NESGC)
- RPF - Available
- AutoAssign/Structure, PDB Harvest - Planned
7Title
From Bruker Spectrometer To Processed Data
Peter-René Steiner
Software Department, Bruker BioSpin GmbH, Germany
8From Bruker Spectrometer to Processed Data
- Bruker BioSpin
- industrial partner within ExtendNMR project, cooperation with
- CCPN team: data model
- Other partners: e.g. fast methods, proteins
- MDD, PRODECOMP, Auremol
9From Bruker Spectrometer to Processed Data
Interfacing PRODECOMP
- Pulse, AU programs available
- Dialog-based parameter setup
- PRODECOMP scripts provided by M. Billeter group
- Called by TopSpin, calculates shapes
- Shapes stored as pseudo-1D data
- Displayed in TopSpin
- (working prototype, not released)
10From Bruker Spectrometer to Processed Data
Interfacing MDD
- Support for non-uniform data sampling in Bruker parameters
- Pulse, AU programs available
- MDD library provided by V. Orekhov group
- Called by TopSpin, performs data decomposition
- Result: conventional nD data set
- (in progress, not released)
Parameter Settings
Pulse Programming
11From Bruker Spectrometer to Processed Data
Supporting the CCPN Data Model
- Bruker DVD installs CCPN API library
- TopSpin creates directory structure with CCPN project
- Experiment description
- nD peak lists
- Released with TS2.1, TS2.5P
- Can launch ExtendNMR GUI -> integrates with pipeline
- Work continues, e.g. NUS, MDD description
12From Bruker Spectrometer to Processed Data
Easy integration in TopSpin
13From Bruker Spectrometer to Processed Data
Easy integration in TopSpin
14From Bruker Spectrometer to Processed Data
Demonstration
15Title
16Projections
[Figure (a): projection axes w1, w2, wHN with projected frequencies W1 = f(W, W ...) and W2 = f(W, W ...)]
17Input Projections
TOPSPIN (wNwHa/b , wHN)
CCPN (wNwCwHa/b , wHN)
Azurin, 128 a.a., 30 min / plane
18Output Components, Shapes
Component m (Gly!), Component n; sequential components
Shape axes (all shapes 128 points):
N: 136.0 - 95.1 ppm
CO: 183.6 - 167.2 ppm
Ca/b(i-1): 86.5 - 4.1 ppm
Ha/b(i-1): 8.85 - 0.55 ppm
Ca, Cb: 86.5 - 4.1 ppm
Ha, Hb: 8.85 - 0.55 ppm
19Reconstruction 2 components
Component n Ca(i-1) Ha(i-1)
Component m (Gly!) Ca(i) Ha(i)
20Assignments
atom     chemical shift
N 2      120.8
HN 2     8.87
Ca 2     60.2
Ha 2     4.49
Cb 2     33.7
Hb 2     2.18
N 3      120.5
...
[Figure: sequence; colours: component chains; side chains: Ala, Ser, Thr; spheres: Gly]
213D Structure
HNOE shapes (from 4D NOESY)
w1 (ppm)   w2 (ppm)   atom1   atom2    vol.   dist. (Å)
10.63      5.74       HN 47   Ha 112   15     4.1
10.63      4.70       HN 47   Ha 48    10     4.4
...
22Title
23Multi-Dimensional Decomposition (MDD)
S = \sum_b F1_b \otimes F2_b \otimes F3_b
[Figure: 3D spectrum S with axes F1, F2, F3]
Assumption: NMR signal is completely defined by its line shapes in all spectral dimensions
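Under the assumption stated on this slide, a 3D spectrum can be written as a sum over components, each being the outer product of three 1D shapes. The sketch below only illustrates that forward model with toy numbers. The function names are hypothetical, and it does not perform the decomposition itself (finding the shapes from data), which is what mddNMR does.

```python
def outer3(f1, f2, f3):
    """Outer (tensor) product of three 1D shapes as a nested list."""
    return [[[a * b * c for c in f3] for b in f2] for a in f1]


def reconstruct(components):
    """Sum the outer products of all components into one 3D array."""
    n1, n2, n3 = (len(shape) for shape in components[0])
    spectrum = [[[0.0] * n3 for _ in range(n2)] for _ in range(n1)]
    for f1, f2, f3 in components:
        term = outer3(f1, f2, f3)
        for i in range(n1):
            for j in range(n2):
                for k in range(n3):
                    spectrum[i][j][k] += term[i][j][k]
    return spectrum


# Two toy components, each a triple of 1D shapes (made-up numbers).
components = [
    ([0.0, 1.0, 0.0], [1.0, 0.0], [0.0, 2.0]),
    ([1.0, 0.0, 0.0], [0.0, 1.0], [3.0, 0.0]),
]
spectrum = reconstruct(components)
```

The components hold 3 + 2 + 2 numbers each, while the full array holds 3 x 2 x 2 = 12. That gap widens quickly as dimensions are added.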
24MddNMR
- Input: one or several N-D spectra in time or frequency domains
- Output: set of components in USF3 format, a compact and convenient presentation of N-D spectra
- Non-uniform sampling for fast and optimal data acquisition
- HD-spectroscopy, or co-processing of several spectra, for maximal sensitivity and resolution
- Integration with TopSpin (Bruker), CCPNMR, nmrPipe, VnmrJ (Varian)
25Example: VDAC - integral membrane ion channel
4D NUS-MDD Methyl NOESY, Bruker 900 MHz, mddNMR software.
- 13% NUS schedule
- 6 days on 900 MHz Bruker
- mddNMR software
- Full experiment 46 days
Hiller et al, 2008, Science, 321, 1206-10
26MDD and HD - spectroscopy
Input example: seven 3D spectra (shapes per experiment)
3D HNCO: FN, FH, FC
3D intraHNCA: FN, FH, FCA
3D intraHNCB: FN, FH, FCB
3D HN(CO)CA: FN, FH, FCA1
3D CBCA(CO)NH: FN, FH, FCA1CB1
3D H(CCO)NH: FN, FH, FC-TOCSY
3D NOESY-HSQC: FN, FH, FNOESY
Result: components of 9-dimensional hyper spectrum with shapes FN, FH, FC, FCA, FCB, FCA1, ...
The FN and FH shapes serve for binding of hyper-components over the experiments.
27Unified Spectra Format (USF3)
How do we deal with a high resolution 9D spectrum? In regular full layout it is ca. 10^9 Tb
28Unified Spectra Format (USF3)
Component from 9D HD spectrum (Ubiquitin R72)
In USF3, spectrum is stored and handled as a set
of components, i.e. 1D shapes. We need to deal
only with 1 MB.
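The contrast between a full 9D grid and a set of 1D shapes can be checked with rough arithmetic. The grid size, component count and 4-byte storage below are illustrative assumptions, not values from the slides, so the result is an order-of-magnitude estimate only.

```python
BYTES_PER_POINT = 4  # assumed single-precision storage


def full_grid_bytes(points_per_dim, n_dims):
    """Storage for a regular full grid: every combination of points."""
    return BYTES_PER_POINT * points_per_dim ** n_dims


def component_bytes(n_components, points_per_dim, n_dims):
    """Storage for components: one 1D shape per dimension per component."""
    return BYTES_PER_POINT * n_components * n_dims * points_per_dim


# Illustrative assumptions: 200 points per dimension, 9 dimensions,
# 100 components (roughly one per residue of a small protein).
points, dims, n_components = 200, 9, 100
full_tb = full_grid_bytes(points, dims) / 1e12
comp_mb = component_bytes(n_components, points, dims) / 1e6
print(f"full grid: {full_tb:.2e} TB, components: {comp_mb:.2f} MB")
```

Full-grid storage grows as the number of points raised to the power of the dimensionality. Component storage grows only linearly with the number of dimensions.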
29Unified Spectra Format (USF3)
- USF3
- all spectra representation: regular, projections, MDD, PRODECOMP, etc.
- compact storage and easy handling of deconvoluted spectra
- efficient analysis and automation
30Unified Spectra Format (USF3)
Any projection is produced on the fly. Example: Ca(i-1)-Ca(i) HD projection
31MddNMR integration with CCPNMR and TopSpin
NUS schedule generator
NUS table
Spectrometer
TopSpinTM
MDD
USF3
CCPNMR Analysis: viewing and analyzing ND/HD spectra
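As a rough illustration of the "NUS schedule generator" step in this workflow, the sketch below draws a uniform random subset of the indirect-dimension grid. Uniform random sampling is a simplification chosen here for brevity. It is not claimed to be the scheme used by mddNMR or TopSpin, and the function name and parameters are hypothetical.

```python
import itertools
import random


def nus_schedule(grid_shape, fraction, seed=0):
    """Pick a random subset of indirect-dimension grid points.

    grid_shape: number of increments in each indirect dimension.
    fraction:   fraction of the full grid to measure (0 < fraction <= 1).
    Returns a sorted list of index tuples (the NUS table).
    """
    full_grid = list(itertools.product(*(range(n) for n in grid_shape)))
    n_points = max(1, round(fraction * len(full_grid)))
    rng = random.Random(seed)  # fixed seed makes the schedule reproducible
    return sorted(rng.sample(full_grid, n_points))


# Example: measure 13% of a 10 x 10 x 10 grid of indirect increments.
schedule = nus_schedule((10, 10, 10), 0.13)
```

Measurement time scales with the number of sampled points, so a schedule covering a small fraction of the grid shortens the experiment by roughly the same factor.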
32Thank you
CCPNMR. Example: reconstruction of hncoca, ubiquitin
33Title
Auremol
34AUREMOL Overview
- Top Down Approach
- RELAX
- ASSIGN
- KNOWNOE
- REFINE
- RFAC
35AUREMOL Routines accessible from outside
- Peak picking (PP) routines
- Threshold Based PP
- peak intensity threshold
- Adaptive PP
- dynamic threshold depending on local noise level
- Bayesian PP
- knowledge based
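The difference between the first two routines can be shown on a 1D trace. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the local noise estimate (median absolute intensity in a window) is an illustrative choice, not the AUREMOL estimator. Bayesian picking is not covered.

```python
from statistics import median


def local_maxima(signal):
    """Indices of points strictly higher than both neighbours."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] > signal[i - 1] and signal[i] > signal[i + 1]]


def threshold_pick(signal, threshold):
    """Threshold-based picking: one global intensity threshold."""
    return [i for i in local_maxima(signal) if signal[i] > threshold]


def adaptive_pick(signal, window=3, factor=4.0):
    """Adaptive picking: the threshold follows the local noise level."""
    picked = []
    for i in local_maxima(signal):
        lo, hi = max(0, i - window), min(len(signal), i + window + 1)
        noise = median(abs(v) for v in signal[lo:hi])
        if signal[i] > factor * noise:
            picked.append(i)
    return picked


# Quiet region with one real peak (index 3), then a noisy region.
signal = [0.1, -0.1, 0.1, 1.0, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1,
          2.0, -2.0, 2.0, -2.0, 2.0, -2.0, 2.0, -2.0]
fixed = threshold_pick(signal, 0.5)
adaptive = adaptive_pick(signal)
```

With a global threshold, the noise maxima in the noisy region pass alongside the real peak. The adaptive version rejects them because the local noise level there is high.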
36AUREMOL Bayesian signal recognition
Different signal classes -> different distributions of specific properties (line shape, line width, intensity)
Probability for cross peak j
37AUREMOL Bayesian signal recognition
Automated class recognition: signal (green), noise (red)
Results for a 2D-NOESY
38AUREMOL KNOWNOE
Automated structure determination from NOESY spectra using knowledge based volume distributions
- 2D NOESY spectra
- 3D NOESY spectra
- Automated shift optimization (SHIFTOPT)
- Assignment by KNOWNOE
Stable results without initial cross peak assignment and also with incomplete chemical shift tables (35%)
39AUREMOL KNOWNOE
- >200,000 high resolution probability distributions
- Based on 1107 structures (970 X-ray, 137 NMR)
- Up to threefold assignment ambiguities can be resolved
40AUREMOL Iterative structure determination
- Start with backbone assignment of HN and Ha atoms: 36.5% (163/447)
- Experimentally determined h-bonds ( 26).
- Backbone dihedral angles generated from backbone chemical shifts using TALOS ( 50)
- Extended start conformation
- 2D 1H NOE spectrum
- 10 iterations using all steps following backbone assignment (RELAX, ASSIGN, KNOWNOE, REFINE, CNS, RFAC)
HPr H15A from S. aureus.
41Title
ARIA
42ARIA
- Ambiguous Restraints for Iterative Assignment [1]
- Automated NOE assignment and structure calculation
- Iterative NOE assignment (ADR, violation analysis, network anchoring [2])
- Distance calibration with spin-diffusion correction [3]
- Complexes, symmetric homo-dimers [4]
- Water refinement [5]
1. Nilges et al. JMB, 1995; Nilges et al. JMB, 1997; Linge et al. Bioinformatics, 2003; Rieping et al. Bioinformatics, 2007
2. Herrmann et al. JMB, 2002
3. Linge et al. J Mag Res, 2004
4. Nilges et al. Proteins, 1993; Bardiaux et al. Proteins, 2008
5. Linge et al. Proteins, 2003
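ADR in the list above stands for ambiguous distance restraint. In the ARIA literature an ADR is usually evaluated as an r^-6 summed effective distance over all candidate atom pairs. The function below is a minimal sketch of that formula with a hypothetical function name. It is not ARIA's own code.

```python
def adr_distance(distances):
    """Effective distance of an ambiguous restraint: (sum d_i^-6)^(-1/6).

    distances: distances (in Angstrom) for every candidate assignment
    of one NOE cross peak.
    """
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)


# One short contribution dominates: the long candidates barely matter.
d_eff = adr_distance([3.0, 8.0, 10.0])
```

Because of the r^-6 weighting, the effective distance is never longer than the shortest candidate. A restraint is therefore satisfied as soon as any one of the possible assignments is close enough.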
43CCPN Analysis - ARIA
44http://aria.pasteur.fr
45Title
ISD
46What is ISD?
- Inferential structure determination (ISD) uses inference to determine an NMR structure from experimental data and general prior knowledge on biomolecules
- It has no free parameters and provides you with the uncertainty of your structure, in the sense of an error bar
- It uses available information in an optimal way (increases structural precision and accuracy)
Rieping W, Habeck M, Nilges M. (2005) Science 309(5732):303
47ISD Overview
- Inference uses rules of probability theory to determine probability distributions for all unknowns
- Requirements
- A theory to calculate the ideal data from a structure
- An error distribution to describe deviations from the experiment
- Prior information about structure (force field, etc)
- Most probable structure and uncertainty
- Theory parameters
- Measure of data quality
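The three requirements listed above map onto the terms of Bayes' theorem. In the notation assumed here (not taken from the slides), X is the structure, \theta collects the theory and error parameters, and D is the data. Independent priors on X and \theta are also an assumption of this sketch.

```latex
p(X, \theta \mid D) \;\propto\; p(D \mid X, \theta)\; \pi(X)\; \pi(\theta)
```

The likelihood p(D | X, \theta) combines the theory that predicts ideal data with the error distribution, and \pi(X) carries the prior structural information such as the force field. The spread of the posterior over X is what gives the structure its error bar.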
48ISD Features
- Markov chain Monte Carlo sampler (replica-exchange algorithm)
- Parallel calculations on Linux cluster
- Future developments
- Using chemical shifts
- Using prior knowledge about protein conformations from the databases
- Combining NMR with other experimental techniques
- (see Poster 18)
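The replica-exchange idea can be illustrated with the swap test from textbook parallel tempering, where replicas differ only in inverse temperature. ISD's actual replica parameters are not described on the slides, so this standard form is a stand-in, and the function name is hypothetical.

```python
import math


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging two replicas.

    beta_i, beta_j:     inverse temperatures of replicas i and j.
    energy_i, energy_j: current energies of their configurations.
    """
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    return min(1.0, math.exp(exponent))


# Cold replica (beta=1.0) holds the higher-energy state: always swap.
p_accept = swap_probability(1.0, 0.5, 10.0, 8.0)
# Cold replica already holds the lower-energy state: swap only sometimes.
p_maybe = swap_probability(1.0, 0.5, 8.0, 10.0)
```

Each replica runs its own Markov chain between swap attempts. That independence is what makes the method easy to spread over the machines of a cluster.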
49ISD Extend-NMR integration
- Start an ISD simulation
- (a) directly from a CCPN project using the Extend-NMR GUI
- (b) from the command line allowing ISD to retrieve data from a CCPN project
- Data imported from a CCPN project
- sequence, NOEs, RDCs, J couplings, distances, dihedral angles
- Data exported to the CCPN project
- Probabilistic structure ensemble
- ISD project settings (bookkeeping)
50ISD GUI Demonstration
- General setting: simulation names, path names, etc.
- Molecules and structures: sequence, initial conformation, etc.
- Experimental data: e.g. distance restraint list generated by ARIA
- Replica settings: communication method, list of machines
- Analyses: simulation report, etc.
ISD homepage: http://www.bioc.cam.ac.uk/isd
51Title
HADDOCK
A.M.J.J. Bonvin, C. Dominguez, R. Boelens, S.J. de Vries, M. van Dijk, M. Krezminski, V. Hsu, A. Thureau, T. Wassenaar, A.D.J. van Dijk.
NMR Spectroscopy, Bijvoet Center for Biomolecular Research, Utrecht University, The Netherlands
52Studying complexes by means of docking
- Experimental structure determination of complexes remains challenging both for NMR and X-ray crystallography
- Let's assume you are NOT able to solve the structure of the complex!
- You are not lost: macromolecular docking is the process of predicting the structure of the complex from its separate constituents.
- Let's predict the complex using HADDOCK (High Ambiguity Driven DOCKing)
53Requirements: input structures
Protein: NMR
Canonical B-DNA: modeled
54Requirements: restraints
NMR cross-saturation
55Docking protocol
56HADDOCK integration CCPN
57The Haddock WebPortal
- Accessible from
- www.enmr.eu
- www.haddocking.eu
- www.haddocking.org
- Registration required (but free for non profit)
- Four interfaces
- Easy: simple docking from list of residues to define interfaces
- Expert: more control on docking parameters. Allows the user to input restraint files (e.g. NOEs, Hbonds, dihedral angles)
- GURU: full control, additional support for RDCs and diffusion anisotropy restraints
- Parameter file upload
58Title
CING
59CING Philosophy
- User friendly interface to WHAT IF/QUEEN/PROCHECK/Aqua/SHIFTX/Wattos/DSSP/.. results and reports.
- Residue oriented.
- Integrated analysis.
- Validation and data together.
- Hyperlinked HTML.
- Color-coded (red, orange, green) (ROG-score).
- Automated export to multiple formats.
- API to data and validation results.
60CING ROG color coding
red 34 (16%), orange 57 (27%), green 117 (56%)
1Y4O
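The bracketed values on this slide are consistent with each count expressed as a percentage of the total. The short check below recomputes them. Reading the counts as residues is an assumption based on CING being residue oriented.

```python
# Counts per ROG colour as given on the slide.
counts = {"red": 34, "orange": 57, "green": 117}
total = sum(counts.values())

# Percentage of the total for each colour, rounded to whole numbers.
percent = {colour: round(100 * n / total) for colour, n in counts.items()}
print(total, percent)
```

The rounded values sum to 99 rather than 100 because each one is rounded independently.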
61CING Data flow
62CING checks
- Correction of minor errors e.g. nomenclature.
- Validation of resonance assignments.
- Validation of experimental restraints.
- Validation of stereochemical quality.
- Validation of structural quality.
- Analysis of structural results.
63CING CCPN Integration
- CCPN 2.0 framework
- Mutual CING/CCPN references in respective databases
- IUPAC
- CING validation results stored into CCPN framework
- Direct mapping of CING/CCPN objects
64CING server (iCing)
http://proteins.dyndns.org/CING
http://proteins.dyndns.org/iCing
65Title
AutoDep
66CCPN-based deposition system for joint
deposition at PDBe and BMRB
[Flow diagram, steps 1-6: Start CCPN Project; Add BMRB Entry object; EBI/PDBe AutoDep server; Curated by PDBe; Output in NMR-Star; BMRB; Curated by BMRB]
[Legend: Software, Depositor, Annotator]
67Getting your project ready
- Start with a CCPN project (needs to be API v2; can use upgrade server if you have a v1 project)
- Add a BMRB Entry object to your project: can use FormatConverter or Entry Completion Interface (ECI)
68Choosing data to deposit using an Entry object I
- It is necessary to add an Entry to a CCPN project. In this Entry, it is possible to associate data that you wish to submit for deposition from your CCPN project.
- Includes
- PDB title
- PDB keywords
- Laboratory address
- Author information
- Publications
- Software
69Choosing data to deposit using an Entry object II
- Also
- Molecular information
- Associated publications
- NMR experiments/data
- Restraints used to calculate the structural ensemble
- Structures
- For complete projects, very little user input is required as most of the deposition information is automatically extracted.
70Submitting your project to the EBI server
- Send to EBI/PDBe AutoDep server
- This then gets curated by PDBe curators for structural information
- CCPN project gets updated with curated information
- Project gets written out in NMR-Star format and automatically gets sent to BMRB for curation of NMR data
71AutoDep web interface I
72AutoDep web interface II
73ADIT NMR
74Title
Final Remarks
75Extend-NMR Dissemination plans
- Release of a DVD early next year
- Keystone Frontiers of NMR Meeting
- Publication in J. Biomol. NMR
- Presentations/workshops in European NMR Centres in the 1st half of 2009
- Please let us know if you would like to organise a meeting of NMR groups in your country/region