Macromolecular Structure Database group

1
Dimitris Dimitropoulos: Chemistry, the PDB and MSDchem
2
The chemical database
3
MSDchem ligand dictionary
  • Complete, clean, up-to-date collection of all
    the chemical species and small molecules in the
    PDB
  • A ligand in MSDchem is a complete, distinct
    stereoisomer of a chemical compound
  • Atoms and element types
  • Bonds and bond orders
  • Stereo configuration of atoms and bonds in cases
    of stereo-isomers (R/S E/Z)
  • Atom names and coordinates are not fundamental
    properties

4
Role in the MSD database
  • An integral component in the core of MSD
    database
  • Relational reference from entities where a
    molecule or atom name is used in the PDB (protein
    residues and atoms)
  • It is not possible for an ATOM or HETATM line such as
        HETATM 4342 C2 PLA 86 14.227 11.195 -8.256 1.00 67.95 C
    to be loaded if the PLA ligand is not defined
    or does not include a C2 atom.
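
The loading rule above can be sketched as a lookup against the ligand dictionary. This is only an illustration, not the MSD loader: the `LIGAND_ATOMS` dictionary, its contents, and the function name `can_load` are hypothetical. The line is split on whitespace because the record is shown here with collapsed spacing, whereas a real PDB file uses fixed columns.

```python
# Hypothetical ligand dictionary: ligand code -> set of atom names it defines.
# The atom names listed for PLA are placeholders for illustration only.
LIGAND_ATOMS = {
    "PLA": {"C1", "C2", "C3"},
}


def can_load(line, ligand_atoms=LIGAND_ATOMS):
    """Return True if the ligand and the atom named on the line are both defined."""
    fields = line.split()
    if len(fields) < 4 or fields[0] not in ("ATOM", "HETATM"):
        return False
    atom_name, ligand_code = fields[2], fields[3]
    # An undefined ligand yields an empty set, so the membership test fails.
    return atom_name in ligand_atoms.get(ligand_code, set())


line = "HETATM 4342 C2 PLA 86 14.227 11.195 -8.256 1.00 67.95 C"
print(can_load(line))                   # PLA is defined and includes C2
print(can_load(line, {"PLA": {"C1"}}))  # PLA is defined but has no C2 atom
print(can_load(line, {}))               # PLA is not defined at all
```

In a relational database the same rule would presumably be enforced by a foreign-key reference rather than by application code.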

5
Chemistry and PDB
  • Eliminate chemical inconsistencies from new PDB
    entries
  • Structure and derived properties of a ligand
    apply automatically to residues and bound
    molecules that reference it
  • The basic structure is carefully determined
    during curation, and a rich set of derived
    attributes is calculated for each ligand
  • Graph isomorphism is being applied to check the
    consistency of the PDB, taking stereo-configuration
    into account
  • Old legacy PDB entries are chemically
    corrected when loaded in the MSD database
  • In thousands of cases errors are identified and
    corrected, most of the time involving
    inconsistent naming or a different
    stereo-configuration
  • Exchanged in cooperation with RCSB and the wwPDB
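
The graph-isomorphism check mentioned above can be sketched with a small backtracking matcher. This is a simplified illustration under stated assumptions, not the MSD implementation: stereo-configuration is reduced to a per-atom R/S label that must match, and the function name, the data layout, and the example atom names are all hypothetical.

```python
def find_isomorphism(atoms_a, bonds_a, atoms_b, bonds_b):
    """Return a mapping of atom names a -> b, or None if the graphs differ.

    atoms: dict name -> (element, stereo label or None)
    bonds: dict frozenset({name1, name2}) -> bond order
    """
    if len(atoms_a) != len(atoms_b) or len(bonds_a) != len(bonds_b):
        return None
    names_a = list(atoms_a)

    def extend(mapping):
        if len(mapping) == len(names_a):
            return dict(mapping)
        a = names_a[len(mapping)]
        for b in atoms_b:
            # Candidate must be unused and carry the same element and stereo label.
            if b in mapping.values() or atoms_a[a] != atoms_b[b]:
                continue
            # Every bond (or absence of one) to already mapped atoms must agree.
            consistent = all(
                bonds_a.get(frozenset((a, pa))) == bonds_b.get(frozenset((b, pb)))
                for pa, pb in mapping.items()
            )
            if consistent:
                mapping[a] = b
                result = extend(mapping)
                if result is not None:
                    return result
                del mapping[a]
        return None

    return extend({})


# Dictionary definition of a small fragment with one stereo centre (CA).
ref_atoms = {"N": ("N", None), "CA": ("C", "S"), "CB": ("C", None),
             "C": ("C", None), "O": ("O", None)}
ref_bonds = {frozenset(("N", "CA")): 1, frozenset(("CA", "CB")): 1,
             frozenset(("CA", "C")): 1, frozenset(("C", "O")): 2}

# Same chemistry as found in an entry, but with different atom names.
entry_atoms = {"N1": ("N", None), "C2": ("C", "S"), "C3": ("C", None),
               "C1": ("C", None), "O1": ("O", None)}
entry_bonds = {frozenset(("N1", "C2")): 1, frozenset(("C2", "C3")): 1,
               frozenset(("C2", "C1")): 1, frozenset(("C1", "O1")): 2}

mapping = find_isomorphism(ref_atoms, ref_bonds, entry_atoms, entry_bonds)
print(mapping)  # atom-name correspondence, usable for renaming

# Flipping the stereo label of the centre makes the match fail.
wrong_atoms = dict(entry_atoms, C2=("C", "R"))
print(find_isomorphism(ref_atoms, ref_bonds, wrong_atoms, entry_bonds))
```

A successful match yields the atom-name correspondence needed to fix inconsistent naming, while a failed match flags a different structure or stereo-configuration.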

6
More than just the PDB codes
  • All variants are modelled as separate,
    inter-related ligands, and the appropriate one is
    referenced
  • No distinction is made in the PDB between ribo-
    and deoxyribonucleotides (all are identified
    with the same residue name i.e., A, C, G, T, U,
    I)
  • Modified nucleic acids are given as A, etc.,
    regardless of modification
  • No distinction between different topological
    variants (12 different variants can be found for
    HIS in PDB)

7
Derived information
  • External scientific software (CACTVS, VEGA,
    CORINA, ACD-labs, CCP4, OELIB), together with
    in-house development, has been used to derive:
  • Stereochemistry (R/S, E/Z)

DCM: C4' S, C3' R, C1' S
DCF: C4' R, C3' S, C1' R
  • SMILES and detailed GIFs
  • Systematic IUPAC names

THIOALANINE (ALT)
SMILES: CC(N)C(O)=S
Stereo SMILES: C[C@H](N)C(O)=S
Systematic name: (2S)-2-aminopropanethioic O-acid
8
Derived information
  • Fingerprints
  • A bit string in hexadecimal form that indicates
    the presence or absence of segments from
    predefined lists
  • Useful for fast search and classification
  • Different libraries of predefined lists can be
    set
  • Currently calculated for the CACTVS library (500
    segments)
  • Molecule Segments

BitString: 1 0 1 0 1 0
Fingerprint: 2A
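
The step from bit string to hexadecimal fingerprint in the example above can be reproduced with a short sketch. The function name `fingerprint_hex` is hypothetical, and reading the first bit as the most significant one is an assumption chosen because it matches the slide's example.

```python
def fingerprint_hex(bits):
    """Pack presence/absence bits (first bit most significant) into a hex string."""
    value = int("".join(str(b) for b in bits), 2)
    width = (len(bits) + 3) // 4  # number of hex digits needed
    return format(value, "0{}X".format(width))


print(fingerprint_hex([1, 0, 1, 0, 1, 0]))  # 2A
```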
10
Search options
  • By ligand code
  • By ligand name or synonym
  • By formula or formula range
  • By non stereo substructure
  • By non stereo superstructure
  • By exact stereo or non stereo structure
  • By fingerprint similarity
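
Fingerprint-based searching can be sketched as follows. The slides do not say which similarity measure MSDchem uses, so the Tanimoto coefficient here is an assumption, as are the function names. The subset test is a common pre-screen for substructure searches: every segment bit set in the query must also be set in a candidate before a full graph match is attempted.

```python
def popcount(x):
    """Count the bits set in a non-negative integer."""
    return bin(x).count("1")


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two hexadecimal fingerprints (assumed measure)."""
    a, b = int(fp_a, 16), int(fp_b, 16)
    union = popcount(a | b)
    return popcount(a & b) / union if union else 1.0


def may_contain(query_fp, target_fp):
    """Pre-screen: True if every bit set in the query is also set in the target."""
    q, t = int(query_fp, 16), int(target_fp, 16)
    return (q & t) == q


print(tanimoto("2A", "2E"))     # 0.75
print(may_contain("2A", "2E"))  # True: target has all query segments
print(may_contain("2E", "2A"))  # False: target lacks one query segment
```

Swapping the arguments of `may_contain` gives the corresponding screen for a superstructure search.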

14
Results of "is superstructure of" search
Click on EAA
15
EAA details
3-chloro-phenol
16
Results Viewers
17
PDB residue KWT
<chemComp>
  <code>KWT</code>
  <name>(1S,6BR,9AS,11R,11BR)-9A,11B-DIMETHYL-1-[(METHYLOXY)METHYL]-3,6,9-TRIOXO-1,6,6B,7,8,9,9A,10,11,11B-DECAHYDRO-3H-FURO[4,3,2-DE]INDENO[4,5-H][2]BENZOPYRAN-11-YL ACETATE</name>
  <nAtomsAll>55</nAtomsAll>
  <nAtomsNh>31</nAtomsNh>
  <overallCharge>0</overallCharge>
  <stereoSmiles>COC[C@H]1OC(=O)c2coc3C(=O)C4=C([C@@H](C[C@@]5(C)[C@H]4CCC5=O)OC(C)=O)[C@]1(C)c23</stereoSmiles>
  <systematicName>(1S,6bR,9aS,11R,11bR)-1-(methoxymethyl)-9a,11b-dimethyl-3,6,9-trioxo-1,6,6b,7,8,9,9a,10,11,11b-decahydro-3H-furo[4,3,2-de]indeno[4,5-h]isochromen-11-yl acetate</systematicName>
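
A record like the one above can be read with a standard XML parser. The sketch below uses an abbreviated copy that keeps only the short fields. The closing `</chemComp>` tag is assumed, since the transcript ends before it, and reading `nAtomsNh` as the non-hydrogen atom count is an interpretation of the tag name rather than something the slide states.

```python
import xml.etree.ElementTree as ET

# Abbreviated record: only the short fields from the slide are kept, and the
# closing </chemComp> tag is an assumption (the transcript is cut off).
record = """
<chemComp>
  <code>KWT</code>
  <nAtomsAll>55</nAtomsAll>
  <nAtomsNh>31</nAtomsNh>
  <overallCharge>0</overallCharge>
</chemComp>
"""

root = ET.fromstring(record)
code = root.findtext("code")
n_all = int(root.findtext("nAtomsAll"))
n_heavy = int(root.findtext("nAtomsNh"))  # assumed: non-hydrogen atoms
charge = int(root.findtext("overallCharge"))

# Hydrogen count follows from the two atom totals under that assumption.
n_hydrogen = n_all - n_heavy
print(code, n_all, n_heavy, n_hydrogen, charge)
```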
18
Future targets
  • Identify and model protein inhibitors as ligands
  • Pre-classify functional groups for ligands and
    ligand atoms based on substructure fragments
  • Optimise and boost the performance of
    substructure searches
  • Enhance visualisation and integration with other
    MSD tools