Title: Organic pollutants environmental fate: modeling and prediction
Organic pollutants environmental fate modeling
and prediction of global persistence by
molecular descriptors
P. Gramatica, F. Consolaro and M. Pavan
QSAR Research Unit, Dept. of Structural and Functional Biology, University of Insubria, Varese, Italy
e-mail: paola.gramatica_at_uninsubria.it
Web: http://fisio.dipbsf.uninsubria.it/dbsf/qsar/QSAR.html
INTRODUCTION
The persistence of organic compounds in the various environmental compartments is governed mainly by the rates at which they are removed by chemical and/or physical processes. Half-life in each compartment is the most commonly used criterion for studying persistence, but such data are available for only a few organic compounds, vary greatly among compartments and depend on laboratory tests. As most literature data are reported as ranges of values, we used the average values as input data in QSPR studies. Validated OLS regression models have been developed using different theoretical molecular descriptors to predict mean half-life values in the atmosphere, soil, surface water and groundwater for more than 90 supposed POPs of different chemical classes (pesticides, PAH, PCB, etc.). All the regression models have been strongly validated, and the reliability of the predicted data has been checked by the leverage procedure. These predicted values are obviously not the real half-lives but reasonable estimates, which have been used simultaneously in Principal Component Analysis to produce useful indexes of POP persistence: PC1 as a global persistence index and PC2 as a compartment-related persistence index. These two indexes have also been modeled, allowing fast screening and ranking of organic compounds by their persistence.
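The OLS fit and the leverage check mentioned in the introduction can be illustrated with a minimal sketch. It is not the authors' code: the function name, the synthetic descriptor matrix and the warning threshold h* = 3p'/n (p' = number of model parameters including the intercept) are assumptions made for illustration; the poster does not state which cut-off was used.

```python
import numpy as np

def fit_ols_with_leverage(X, y):
    """Fit y = b0 + X b by ordinary least squares and return the
    coefficients, the leverage of each compound and a warning threshold."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])          # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # OLS coefficients
    # Leverage = diagonal of the hat matrix A (A'A)^-1 A'
    xtx_inv = np.linalg.pinv(A.T @ A)
    leverage = np.einsum("ij,jk,ik->i", A, xtx_inv, A)
    h_star = 3.0 * A.shape[1] / n                 # assumed cut-off 3p'/n
    return coef, leverage, h_star

# Synthetic descriptors (illustrative only, not data from the poster).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 2))
X_demo[0] = [8.0, -8.0]                           # structurally extreme compound
y_demo = 1.0 + 2.0 * X_demo[:, 0] - 1.0 * X_demo[:, 1]

coef, leverage, h_star = fit_ols_with_leverage(X_demo, y_demo)
outside_domain = np.flatnonzero(leverage > h_star)
print("coefficients:", coef)
print("compounds with leverage above h*:", outside_domain)
```

A prediction for a compound whose leverage exceeds the threshold is an extrapolation of the model and would be treated as less reliable.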
DATA SET
Our data set consists of 33 organic pollutants, mainly supposed POPs, for which half-life values in air, surface water, groundwater and soil have been collected from Howard [2] and Rodan [3]. It must be emphasized that these values are subject to considerable variation, so presenting a single value is over-simplistic; for this reason we considered the mean value of the half-life range. The mean values of the reported ranges were always transformed to logarithmic units to linearize the experimental range of variation.
[2] Howard, P.H. et al. Handbook of Environmental Degradation Rates (1991).
[3] Rodan, B.D. et al. Screening for Persistent Organic Pollutants: Techniques To Provide a Scientific Basis for POPs Criteria in International Negotiations. Environ. Sci. Technol., 33(20), 3482-3488 (1999).
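The averaging and logarithmic transformation described in the data set section can be sketched as follows. The arithmetic mean, the base-10 logarithm, the unit (hours) and the two example ranges are assumptions for illustration; the poster does not specify them, and the values are not taken from the data set.

```python
import math

def log_mean_half_life(low_hours, high_hours):
    """Return log10 of the arithmetic mean of a reported half-life range."""
    if low_hours <= 0 or high_hours <= 0:
        raise ValueError("half-lives must be positive")
    mean_hours = (low_hours + high_hours) / 2.0
    return math.log10(mean_hours)

# Hypothetical half-life ranges in hours (illustrative only).
ranges = {
    "compound_A": (10.0, 1000.0),
    "compound_B": (24.0, 240.0),
}
log_values = {name: log_mean_half_life(low, high)
              for name, (low, high) in ranges.items()}

for name, value in log_values.items():
    print(f"{name}: log mean half-life = {value:.3f}")
```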
MOLECULAR DESCRIPTORS
The molecular structure has been represented by a wide set of molecular descriptors (about 170) calculated by the software DRAGON 1.0 of R. Todeschini (http://www.disat.unimib.it/chm):
- mono-dimensional: counts and fragments descriptors
- two-dimensional: topological descriptors
- three-dimensional: 3D-WHIM (Weighted Holistic Invariant Molecular) [1]
[1] R. Todeschini and P. Gramatica, 3D-modelling and prediction by WHIM descriptors. Part 5. Theory development and chemical meaning of the WHIM descriptors, Quant. Struct.-Act. Relat., 16 (1997) 113-119.
Can half-life ranges be usefully modeled and
predicted by QSPR with theoretical molecular descriptors?
Log h.l. air - 16.95 1.46 nR07 3.51 BAL
0.73 UI 15.82 E1e
nR07: number of rings with 7 atoms; BAL: Balaban index; UI: unsaturation index; E1e: directional WHIM
Principal Component Analysis on QSPR-predicted half-life data (cumulative E.V. 78.6%)
Principal Component Analysis on experimental plus QSPR-predicted half-life data (cumulative E.V. 79.5%)
[Figure labels: SOLUBLES and VOLATILES; SORBED; PERSISTENCE]
PC1 (overall persistence index): 9.22 3.14 AAC - 6.32 E2s 17.49 E1e 0.16 Tm
n = 91; R2 = 85.1; Q2LOO = 82.6; Q2LMO = 82.2; s = 0.565; F 86 = 122.878; SDEC = 0.549; SDEP = 0.595
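The fitting and validation statistics reported with the models (R2, Q2LOO, SDEC, SDEP) can be computed as in the sketch below. The definitions used here are the common ones and are assumed rather than taken from the poster: SDEC = sqrt(RSS/n), SDEP = sqrt(PRESS/n) and Q2LOO = 1 - PRESS/TSS, with PRESS obtained by refitting the model n times, each time leaving one compound out. The data are synthetic and the function name is hypothetical.

```python
import numpy as np

def validation_statistics(X, y):
    """Return R2, leave-one-out Q2, SDEC and SDEP for an OLS model."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])          # add intercept column

    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ coef) ** 2)             # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)             # total sum of squares

    press = 0.0                                   # predictive residual SS
    for i in range(n):
        keep = np.arange(n) != i                  # leave compound i out
        coef_i, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ coef_i) ** 2

    return {
        "R2": 1.0 - rss / tss,
        "Q2_LOO": 1.0 - press / tss,
        "SDEC": np.sqrt(rss / n),
        "SDEP": np.sqrt(press / n),
    }

# Synthetic example (illustrative only, not data from the poster).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(40, 3))
y_demo = 0.5 + X_demo @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=40)

stats = validation_statistics(X_demo, y_demo)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

A small gap between R2 and Q2LOO, and between SDEC and SDEP, indicates that the model is not fitted at the expense of its predictive ability.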
PC1 and PC2 scores as persistence indexes
The PC scores have been used as indexes of POP persistence: PC1 (EV 51.2%) as a global persistence index and PC2 (EV 28.3%) as a compartment-related persistence index. These two indexes have also been modeled by molecular descriptors selected by Genetic Algorithm, with satisfactory predictive power; this allows fast screening and ranking of organic compounds by their persistence. The data predicted by this QSPR approach, which is based on a few descriptors of the molecular structure, could be usefully applied in modelling the environmental fate of organic pollutants, including chemicals not yet synthesised.
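The use of PC scores as persistence indexes can be sketched as follows: a table of log half-lives (one row per compound, one column per compartment) is autoscaled, decomposed by PCA, and the compounds are ranked by their PC1 score. Autoscaling, the SVD-based implementation, the function name and the synthetic data are assumptions for illustration and are not taken from the poster.

```python
import numpy as np

def pca_scores(data, n_components=2):
    """Autoscale the columns, run PCA by SVD and return the scores and
    the percentage of variance explained by each retained component."""
    data = np.asarray(data, dtype=float)
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = 100.0 * s**2 / np.sum(s**2)
    scores = z @ vt[:n_components].T
    return scores, explained[:n_components]

# Synthetic log half-lives for 25 compounds in four compartments
# (air, surface water, groundwater, soil); illustrative only.
rng = np.random.default_rng(2)
common = rng.normal(size=(25, 1))                 # shared persistence trend
log_half_lives = common + 0.5 * rng.normal(size=(25, 4))

scores, explained = pca_scores(log_half_lives)
# The sign of a principal component is arbitrary, so the direction of
# "more persistent" must be checked against the loadings before ranking.
ranking = np.argsort(-scores[:, 0])
print("explained variance (%):", np.round(explained, 1))
print("compounds ordered by PC1 score:", ranking[:5])
```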
PC2 (media persistence index): 10.31 8.29 IDE 0.48 G2p 9.93 E1p 5.46 Ks 0.09 Ve
n = 91; R2 = 78.9; Q2LOO = 75.1; Q2LMO = 74.5; s = 0.502; F6,85 = 63.762; SDEC = 0.485; SDEP = 0.527