QSAR Application Toolbox: Step 12: Building a QSAR model

Slides: 52
Provided by: schu3

1
QSAR Application Toolbox
Step 12: Building a QSAR model
2
Objectives
  • This presentation demonstrates building a QSAR
    model for predicting the acute toxicity of
    aldehydes to Tetrahymena pyriformis.
    Specifically, the presentation addresses:
      • predicting acute toxicity for a target chemical
      • building a QSAR model based on the prediction
      • applying the model to other aldehydes
      • exporting the predictions to a file.

3
The Exercise
  • This exercise includes the following steps:
      • select a target chemical: Furfural, CAS 98-01-1
      • extract available experimental results
      • search for analogues
      • estimate the 48-h IGC50 for Tetrahymena
        pyriformis by using trend analysis
      • improve the data set by either
          • subcategorising by protein binding
            mechanisms, or
          • assessing the difference between outliers
            and the target chemical
      • evaluate and save the model
      • use the model to display its training set,
        visualize its applicability domain and perform
        predictions.

4
Chemical Input
  • After launching the Toolbox, select the Flexible
    Track.
  • This takes you to the first module, which is
    Chemical input.
  • Enter the target chemical by its CAS number
    (98-01-1)
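
A CAS number ends in a check digit, so a typed number can be sanity-checked before it is entered. The sketch below is not part of the Toolbox; `format_cas` and `is_valid_cas` are hypothetical helper names used only for illustration.

```python
def format_cas(digits: str) -> str:
    """Insert hyphens into an unhyphenated CAS number, e.g. '98011' -> '98-01-1'."""
    digits = digits.replace("-", "")
    if not digits.isdigit() or not 5 <= len(digits) <= 10:
        raise ValueError(f"not a CAS number: {digits!r}")
    return f"{digits[:-3]}-{digits[-3:-1]}-{digits[-1]}"


def is_valid_cas(cas: str) -> bool:
    """Check the CAS check digit.

    The last digit must equal the weighted sum of the other digits
    (weights 1, 2, 3, ... counted from the right) modulo 10.
    """
    digits = cas.replace("-", "")
    if not digits.isdigit() or not 5 <= len(digits) <= 10:
        return False
    body, check = digits[:-1], int(digits[-1])
    total = sum(weight * int(d) for weight, d in enumerate(reversed(body), start=1))
    return total % 10 == check


print(format_cas("98011"))      # hyphenated form of the furfural CAS number
print(is_valid_cas("98-01-1"))  # check-digit test for furfural
```

For furfural the weighted sum is 1·1 + 2·0 + 3·8 + 4·9 = 61, and 61 mod 10 = 1 matches the final digit.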

5
Select target chemical: Furfural, CAS 98-01-1
6
Substance Information
7
Profiling the Target Chemical
  • Select the Profiling methods you wish to use by
    clicking on the box before the name of the
    profiler.
  • For this example check all mechanistic methods.
  • Click on Apply.

8
Profiling
9
Target interaction with proteins
Double-clicking shows the profiling scheme.
The chemical could interact with protein by
Schiff-base formation.
10
Target interaction with proteins
11
Endpoints
  • Endpoints refer to the electronic process of
    retrieving the environmental fate, ecotoxicity
    and toxicity data that are stored in the Toolbox
    database.
  • Data gathering can be executed in a global
    fashion (i.e., collecting all data of all
    endpoints) or on a more narrowly defined basis
    (e.g., collecting data for a single or limited
    number of endpoints).
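
The difference between global and narrowly defined data gathering can be pictured as a filter on the endpoint path. This is a minimal sketch, not the Toolbox's internal mechanism: the record layout, the function name `gather`, the second endpoint path and all values are hypothetical placeholders.

```python
def gather(records, endpoint_path=None):
    """Return all records (global gathering) or only those whose endpoint
    path starts with the given path (narrowly defined gathering)."""
    if endpoint_path is None:
        return list(records)
    prefix = tuple(endpoint_path)
    return [r for r in records if r["path"][:len(prefix)] == prefix]


# Hypothetical records; the values are placeholders, not Toolbox data.
records = [
    {"path": ("Ecotoxicological Information", "Aquatic Toxicity", "Protozoa",
              "Tetrahymena pyriformis", "IGC50", "48 h"), "value": 1.0},
    {"path": ("Environmental Fate", "Biodegradation"), "value": 2.0},
]

print(len(gather(records)))  # global: every record
print(len(gather(records, ["Ecotoxicological Information", "Aquatic Toxicity"])))
```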

12
Extracting endpoint values
13
Redundancy table: reports for the same endpoint
values across databases
14
Reproducing endpoint value
In this exercise we will build a QSAR model to
estimate the following endpoint:
Ecotoxicological Information > Aquatic Toxicity >
Protozoa > Tetrahymena pyriformis > IGC50 > 48 h
15
Defining a Category
The initial search for analogues is based on
structural similarity; in this example, the US
EPA categorization is used.
16
Category Definition
17
Set Category Name
18
Analogues
  • The data is automatically collated.
  • Based on the defined category (Aldehydes US EPA
    categorisation) 274 analogues have been
    identified.
  • These 274 compounds along with the target
    chemical form a category (Aldehydes), which can
    be used for data gap filling (see next slide).

19
Analogues
20
Extracting experimental results for analogues
  • Highlight the 274 Aldehydes (US EPA
    categorisation).
  • A window entitled "Read Data?" appears
    (see next slide).
  • Click OK.

21
Extracting experimental results for analogues
22
Extracting experimental results for analogues
23
Applying Trend-analysis
  • Move to the module Filling data gap.
  • Open the data tree to:
      • Ecotoxicological information
      • Protozoa
      • Tetrahymena pyriformis
      • IGC50
      • 48 h
  • Highlight the data endpoint box under the target
    chemical.
  • It already contains an experimental result, which
    we are going to reproduce by trend analysis.
  • Next, with the trend analysis box highlighted,
    click Apply (see next slide).

24
Apply Trend-analysis
25
Results of Trend-analysis
26
Interpreting the Trend-analysis
  • The resulting plot shows the available
    experimental results of all analogues (Y axis)
    against the default descriptor, Log Kow (X
    axis).
  • The RED dot represents the target chemical.
  • The BLUE dots represent the experimental results
    available for the analogues.
  • The GREEN dots represent the analogues belonging
    to a different subcategory (see following slides).
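
The idea behind the plot can be sketched as a straight-line fit of the endpoint against Log Kow, followed by reading off the value at the target's Log Kow. The presentation does not describe how the Toolbox fits the trend, so ordinary least squares is an assumption here, and the data points, the toxicity scale and the target descriptor value are made up for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up illustration data, NOT Toolbox results.
log_kow = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
toxicity = [-0.40, -0.05, 0.32, 0.61, 1.02, 1.30]  # hypothetical log(1/IGC50)

intercept, slope = fit_line(log_kow, toxicity)
target_log_kow = 0.8  # hypothetical descriptor value for a target chemical
prediction = intercept + slope * target_log_kow

print(f"toxicity = {intercept:.3f} + {slope:.3f} * logKow")
print(f"predicted value at logKow {target_log_kow}: {prediction:.3f}")
```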

27
An Accurate Trend Analysis of the Data set (1)
  • In this example, the mechanistic properties of
    the analogues are not consistent.
  • Subcategorization can be performed based on
    protein binding mechanisms. This is the second
    stage of analogue search - requiring the same
    interaction mechanism.
  • Acute effects are indeed associated with the
    interaction of chemicals with the lipid cell
    membrane, i.e. with protein binding.
  • Chemicals with a different protein binding
    mechanism compared to the target chemical will be
    removed.

28
Subcategorization
  • To improve the data by subcategorizing, follow
    these steps:
      • Click on Subcategor.
      • Select Protein binding from the Grouping
        methods list.
      • All chemicals which have a potential protein
        binding mechanism different from the target
        chemical are highlighted (GREEN dots).
      • Click on Remove.
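
The effect of these steps can be sketched as a filter that keeps only analogues whose protein binding profile matches the target's. This is an illustration under simplifying assumptions: each chemical is given a single profile label, and the records and the function name `subcategorize` are hypothetical.

```python
from collections import Counter


def subcategorize(target_profile, analogues):
    """Split analogues into those sharing the target's profile and those that differ."""
    kept = [a for a in analogues if a["protein_binding"] == target_profile]
    removed = [a for a in analogues if a["protein_binding"] != target_profile]
    return kept, removed


target_profile = "Schiff-base formation"
analogues = [  # hypothetical records; names and profiles are placeholders
    {"name": "analogue A", "protein_binding": "Schiff-base formation"},
    {"name": "analogue B", "protein_binding": "Michael type nucleophilic addition"},
    {"name": "analogue C", "protein_binding": "No binding"},
    {"name": "analogue D", "protein_binding": "Schiff-base formation"},
]

kept, removed = subcategorize(target_profile, analogues)
print([a["name"] for a in kept])                            # stay in the category
print(Counter(a["protein_binding"] for a in removed))       # removed, by mechanism
```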

29
Subcategorization
30
Result after Subcategorization
31
An Accurate Trend Analysis of the Data set (2)
  • The chemicals which differ from the target are:
      • Michael type nucleophilic addition (23)
      • No binding (48)
      • Nucleophilic addition to azomethynes (1)
      • Nucleophilic substitution of haloaromatics (1)
  • Another way of refining the data set is to ask
    what makes the obvious outliers different from
    the target.

32
Subcategorization
  • Right-click on any of the outlying results from
    the analogues (BLUE dots)
  • Select Differences to target from the menu
  • Select Protein binding from the Grouping methods
    list
  • Click on Remove (see next slide)

33
Subcategorization
34
Result after Subcategorization
35
QSAR Model evaluation
  • To assess the model accuracy, use:
      • Adequacy (predictions after leave-one-out)
      • Statistics
      • Cumulative frequency
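
Leave-one-out means that each chemical is predicted by a model fitted to all the other chemicals. The sketch below shows the idea for a straight-line model; it is not the Toolbox's implementation. The data are made up, and the cross-validated Q² uses one common definition (1 minus PRESS over the total sum of squares).

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def leave_one_out(x, y):
    """Predict each point from a model fitted without that point."""
    predictions = []
    for i in range(len(x)):
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(intercept + slope * x[i])
    return predictions


def q_squared(y, loo_predictions):
    """Cross-validated Q2 = 1 - PRESS / total sum of squares."""
    mean_y = sum(y) / len(y)
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, loo_predictions))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - press / ss_total


# Made-up illustration data, NOT Toolbox results.
log_kow = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
toxicity = [-0.40, -0.05, 0.32, 0.61, 1.02, 1.30]

loo = leave_one_out(log_kow, toxicity)
print(f"Q2 (leave-one-out) = {q_squared(toxicity, loo):.3f}")
```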

36
QSAR Model evaluation
37
QSAR Model evaluation
38
QSAR Model evaluation
The residuals, abs(obs - predicted), for 95% of
the analogues are comparable with the variation of
the experimental data.
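
A cumulative frequency of absolute residuals answers the question "what fraction of chemicals is predicted within a given error?". The sketch below is an illustration with made-up numbers; the threshold of 0.3 stands in for the experimental variation and is an assumption, not a value from the presentation.

```python
def cumulative_frequency(observed, predicted):
    """Sorted absolute residuals paired with the cumulative fraction of chemicals."""
    residuals = sorted(abs(o - p) for o, p in zip(observed, predicted))
    n = len(residuals)
    return [(r, (i + 1) / n) for i, r in enumerate(residuals)]


def fraction_within(observed, predicted, threshold):
    """Fraction of chemicals whose absolute residual does not exceed the threshold."""
    residuals = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(r <= threshold for r in residuals) / len(residuals)


# Made-up illustration data, NOT Toolbox results.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 2.0, 2.5, 4.2]

for residual, fraction in cumulative_frequency(observed, predicted):
    print(f"|residual| <= {residual:.2f}: {fraction:.0%} of chemicals")
print(fraction_within(observed, predicted, threshold=0.3))
```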
39
Saving the Derived QSAR Model
  • To save the new regression model, follow these
    steps:
      • Click on the Save model button
      • Enter the model name Acute tox
      • Click on OK, and
      • Accept the value

40
QSAR Model evaluation
41
Apply QSAR model
  • The derived model can be used to:
      • List training set chemicals
          • Right-click on the QSAR model Acute tox
          • Select training set from the context menu
      • Visualize whether a chemical is in the
        applicability domain of the model
          • In the data matrix, highlight the empty
            cell of one of the analogues (e.g. chemical
            no. 2 in the matrix) for the endpoint 48-h
            IGC50 Tetrahymena pyriformis
          • Right-click on the QSAR model Acute tox
          • Select Display domain
      • Perform predictions for the chemicals in the
        matrix
          • Right-click on the QSAR model Acute tox
          • Select Predict endpoint and All Chemicals
            in domain

42
Apply QSAR model: Training set
43
Apply QSAR model: Visualize whether a chemical is
in the applicability domain of the model
  • The chemical is an aldehyde, as required by the
    model. It can react with protein by Schiff-base
    formation and does not react with protein by any
    of the eliminated mechanisms:
      • Michael-type nucleophilic addition
      • No binding
      • Nucleophilic addition to azomethynes
      • Nucleophilic substitution of haloaromatics
  • Another requirement is for Log Kow to be > 0.3210
    and < 4.75. This last requirement is slightly
    violated (Log Kow = 4.87), and therefore the
    chemical is outside the applicability domain
    of the model.
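
The domain rules listed on this slide can be written down as a simple check. This is a sketch of the stated rules only, not the Toolbox's own domain logic; the function name `in_domain` and its arguments are hypothetical, while the Log Kow limits and mechanism names are taken from the slide.

```python
LOG_KOW_MIN = 0.3210  # lower Log Kow limit stated on the slide
LOG_KOW_MAX = 4.75    # upper Log Kow limit stated on the slide
REQUIRED_MECHANISM = "Schiff-base formation"
ELIMINATED_MECHANISMS = {
    "Michael-type nucleophilic addition",
    "No binding",
    "Nucleophilic addition to azomethynes",
    "Nucleophilic substitution of haloaromatics",
}


def in_domain(is_aldehyde, mechanisms, log_kow):
    """Return (inside, reasons); reasons lists every violated requirement."""
    reasons = []
    if not is_aldehyde:
        reasons.append("not an aldehyde")
    if REQUIRED_MECHANISM not in mechanisms:
        reasons.append("no Schiff-base formation")
    for mechanism in sorted(ELIMINATED_MECHANISMS & set(mechanisms)):
        reasons.append(f"eliminated mechanism: {mechanism}")
    if not LOG_KOW_MIN < log_kow < LOG_KOW_MAX:
        reasons.append(f"Log Kow {log_kow} outside ({LOG_KOW_MIN}, {LOG_KOW_MAX})")
    return not reasons, reasons


# The chemical discussed on the slide: an aldehyde with Log Kow = 4.87.
inside, reasons = in_domain(True, {"Schiff-base formation"}, 4.87)
print(inside, reasons)
```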

44
Apply QSAR model: Visualize whether a chemical is
in the applicability domain of the model
45
Apply QSAR model: Perform predictions
46
Apply QSAR model: Perform predictions
47
Export QSAR results
  • The predictions for the chemicals in the matrix
    can be exported into a text file.
  • In the data tree right-click on 48 h (for the
    endpoint IGC50 for Tetrahymena pyriformis) and
    select Export endpoint data from the menu.

48
Export QSAR results
Right-click
49
Export QSAR results
50
Export QSAR results
51
Export QSAR results
  • The resulting text file can be loaded into a
    spreadsheet and further analysed.
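
Besides a spreadsheet, the exported file can be read with a short script. The presentation does not show the file layout, so the sketch below assumes a tab-delimited file with a header row; the column names, the sample rows and the function name `load_export` are hypothetical.

```python
import csv
import io


def load_export(handle, delimiter="\t"):
    """Read a delimited text export into a list of dicts keyed by column header."""
    return list(csv.DictReader(handle, delimiter=delimiter))


# Hypothetical sample standing in for an exported file; values are placeholders.
sample = (
    "Name\tLog Kow\tPredicted\n"
    "chemical 1\t1.20\t0.10\n"
    "chemical 2\t2.50\t0.85\n"
)

rows = load_export(io.StringIO(sample))
for row in rows:
    print(row["Name"], float(row["Predicted"]))
```

For a file on disk, the same function could be called as `load_export(open(path, newline=""))`, with `path` pointing at the exported text file and `delimiter` adjusted to whatever separator the export actually uses.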