ABSTRACT
With the aim of obtaining reliable estimates of Estrogen Receptor (ER) binding for diverse classes of compounds, a weight of evidence approach using estimates from a suite of in silico models was assessed. The predictivity of a simple Majority Consensus of (Q)SAR models was assessed using a test set of compounds with experimental Relative Binding Affinity (RBA) data. Molecular docking was also carried out and the binding energies of these compounds to the ERα receptor were determined. For a few selected compounds, including a known full agonist and antagonist, the intrinsic activity was determined using low-mode molecular dynamics methods. Individual (Q)SAR model predictivity varied, as expected, with some models showing high sensitivity, others higher specificity. However, the Majority Consensus (Q)SAR prediction showed a high accuracy and reasonably balanced sensitivity and specificity. Molecular docking provided quantitative information on strength of binding to the ERα receptor. For the 50 highest binding affinity compounds with positive RBA experimental values, just 5 of them were predicted to be non-binders by the Majority QSAR Consensus. Furthermore, agonist-specific assay experimental values for these 5 compounds were negative, which indicates that they may be ER antagonists. We also showed different scenarios of combining (Q)SAR results with Molecular docking classification of ER binding based on cut-off values of binding energies, providing a rational combined strategy to maximize terms of toxicological interest.
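The majority-consensus idea described above can be sketched as a simple vote over binary binder/non-binder calls from the individual (Q)SAR models. This is a minimal sketch only: the function name, the tie-breaking rule (a tie counts as a binder, a conservative choice) and the example votes are assumptions for illustration and are not taken from the study.

```python
from typing import Sequence


def majority_consensus(votes: Sequence[bool], tie_is_binder: bool = True) -> bool:
    """Combine per-model calls (True = binder, False = non-binder) by majority vote.

    The tie-breaking rule is a hypothetical choice, not one stated in the abstract.
    """
    if not votes:
        raise ValueError("at least one model prediction is required")
    positives = sum(1 for v in votes if v)
    negatives = len(votes) - positives
    if positives == negatives:
        return tie_is_binder
    return positives > negatives


# Hypothetical calls from three models for one compound.
consensus_call = majority_consensus([True, True, False])
```

Treating ties as binders favours sensitivity over specificity; the opposite choice can be made by passing `tie_is_binder=False`.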
Subjects
Estrogen Receptor alpha/metabolism , Estrogens/metabolism , Humans , Molecular Docking Simulation , Molecular Dynamics Simulation , Protein Binding/physiology , Quantitative Structure-Activity Relationship
ABSTRACT
Ecotoxicological safety assessment of chemicals requires toxicity data on multiple species, despite the general desire to minimize animal testing. Predictive models, specifically machine learning (ML) methods, are one of the tools capable of resolving this apparent contradiction, as they allow toxicity patterns to be generalized across chemicals and species. However, despite the availability of large public toxicity datasets, the data are highly sparse, which complicates model development. The aim of this study is to provide insights into how ML can predict toxicity using a large but sparse dataset. We developed models to predict LC50 values, based on experimental LC50 data covering 2431 organic chemicals and 1506 aquatic species from the ECOTOX database. Several well-known ML techniques were evaluated and a new ML model was developed, inspired by recommender systems. This new model is a simple linear model that learns low-rank interactions between species and chemicals using factorization machines. We evaluated the predictive performance of the developed models in two validation settings: 1) predicting unseen chemical-species pairs, and 2) predicting unseen chemicals. The results of this study show that ML models can accurately predict LC50 values in both validation settings. Moreover, we show that the novel factorization machine approach can match well-tuned, complex ML approaches.
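To make the low-rank interaction idea concrete, the sketch below evaluates the standard second-order factorization machine prediction for a single feature vector. The abstract does not give the exact formulation or feature encoding used, so the equation (the usual textbook form), the function name `fm_predict` and the suggested encoding (one-hot species concatenated with chemical descriptors) are all assumptions.

```python
import numpy as np


def fm_predict(x: np.ndarray, w0: float, w: np.ndarray, V: np.ndarray) -> float:
    """Second-order factorization machine prediction for one feature vector.

    x  : (n,) features, e.g. one-hot species joined with chemical descriptors
         (an assumed encoding, not taken from the study)
    w0 : global bias
    w  : (n,) linear weights
    V  : (n, k) low-rank factors; features i and j interact with weight V[i] @ V[j]
    """
    linear = w0 + float(w @ x)
    xv = x @ V                   # (k,) sum over i of V[i, f] * x[i]
    x2v2 = (x ** 2) @ (V ** 2)   # (k,) sum over i of V[i, f]^2 * x[i]^2
    # Identity that yields all pairwise terms (i < j) in O(n * k) time.
    pairwise = 0.5 * float(np.sum(xv ** 2 - x2v2))
    return linear + pairwise
```

Because every pairwise weight is the dot product of two rank-k factor rows, an interaction between a species and a chemical that never co-occur in the training data can still be estimated, which is what makes this family of models attractive for sparse data.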
Subjects
Machine Learning , Quantitative Structure-Activity Relationship , Animals , Ecotoxicology
ABSTRACT
There are various types of hepatic steatosis, of which non-alcoholic fatty liver disease, which may be caused by exposure to chemicals and environmental pollutants, is the most prevalent, representing a potential major health risk. QSAR modelling has the potential to provide a rapid and cost-effective method to identify compounds which may trigger steatosis. Although models exist to predict key molecular initiating events of steatosis, such as nuclear receptor binding, we are aware of no models to predict the apical effect, steatosis. In this study, we describe the development of a QSAR model to predict steatosis using freely available machine learning tools. It was built using a dataset of 207 pharmaceuticals and pesticides which were identified as steatotic or non-steatotic from existing data from in vivo human and animal studies. The best-performing model, developed using the linear discriminant analysis module in TANAGRA and based on four chemical descriptors, had an accuracy of 70%, a sensitivity of 66% and a specificity of 74%. The expansion of the steatosis dataset to other chemical types, to enable the development of further models, would be of benefit in the identification of compounds with a range of mechanisms of action contributing to steatosis.
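For readers less familiar with the three performance figures quoted above, the following minimal sketch shows how accuracy, sensitivity and specificity follow from a binary confusion matrix. The function name and the example counts are invented for illustration; they are not the confusion matrix of the 207-compound dataset, which the abstract does not report.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: steatotic compounds predicted correctly / incorrectly
    tn/fp: non-steatotic compounds predicted correctly / incorrectly
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,   # true positive rate
        "specificity": tn / negatives,   # true negative rate
    }


# Hypothetical counts, chosen only to show the arithmetic.
metrics = binary_metrics(tp=8, fn=2, tn=6, fp=4)
```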
Subjects
Machine Learning , Non-alcoholic Fatty Liver Disease/metabolism , Algorithms , Environmental Pollutants/chemistry , Environmental Pollutants/toxicity , Humans , Non-alcoholic Fatty Liver Disease/chemically induced , Quantitative Structure-Activity Relationship
ABSTRACT
Three existing models and one newly developed model for the prediction of ready biodegradability of organic compounds are evaluated by comparing the descriptors they use and the consistency of their predictions when applied to the set of High Production Volume Chemicals (HPVC) in the European Union. Linear regression models developed for the OECD showed the best performance in the external validation (84.7% correct), although comparison with the other three models is flawed because of the class specificity of these models. With these models, 567 of the 894 compounds could be predicted in the validation. The multivariate statistical model showed the best combination of external validation performance (82.7% correct) and breadth of applicability. The evaluation of the predictions of the models for the HPVC shows that all models are highly consistent in their prediction of not-ready biodegradability, but much less consistency is seen in the prediction of ready biodegradability. This is consistent with the observation that all four models perform better in their predictions of not-ready biodegradability.
Subjects
Environmental Biodegradation , Environmental Pollutants/metabolism , Organic Compounds/metabolism , Analysis of Variance , Linear Models , Biological Models , Reproducibility of Results
ABSTRACT
A structural analysis of the substrate specificity of hydrolytic dehalogenases originating from three different bacterial isolates has been performed using the multiple computer-automated structure evaluation methodology. This methodology identifies structural fragments in substrate molecules that either activate or deactivate biological processes. The analysis presented in this contribution is based on newly measured dehalogenation data combined with data from the literature (91 substrates). The enzymes under study represent different specificity classes of haloalkane dehalogenases (haloalkane dehalogenase from Xanthobacter autotrophicus GJ10, Rhodococcus erythropolis Y2, and Sphingomonas paucimobilis UT26). Three sets of structural rules have been identified to explain their substrate specificity and to predict activity for untested substrates. Predictions of activity and inactivity based on the structural rules from this analysis were provided for those compounds that were not yet tested experimentally. Predictions were also made for the compounds with available experimental data not used for the model construction (i.e., the external validation set). Correct predictions were obtained for 28 of 30 compounds in the validation set. Incorrect predictions were noted for two substrates outside the chemical domain of the set of compounds for which the structural rules were generated. A mechanistic interpretation of the structural rules generated provided a fundamental understanding of the structure-specificity relationships for the family of haloalkane dehalogenases.
Subjects
Environmental Pollutants/metabolism , Halogens/metabolism , Hydrolases/metabolism , Theoretical Models , Environmental Biodegradation , Forecasting , Reference Values , Rhodococcus/enzymology , Sphingomonas/enzymology , Structure-Activity Relationship , Xanthobacter/enzymology
ABSTRACT
In this study, a systematic analysis of the predictive capabilities of models built with backpropagation neural networks (BPNN) is made to corroborate the hypothesis that a BPNN is capable of modeling the interaction terms in group contribution models without explicitly adding these as descriptors. The data used for comparison are the reactivities of 275 organic compounds towards the atmospheric OH radical. This dataset was selected because of its internal consistency, reliability and relatively large size. While training the network, the minimal Mean Squared Error (MSE) on a test set was used as the stopping criterion. This avoids overfitting on the training data and is most likely to give the best generalizing network. A network trained with a designed training and test set is compared with networks trained on randomly constructed training and test sets. The BPNN model based on the designed training and test set not only gives the best model, but also the best predictive performance on an external validation set, compared both to linear models built with the same training and validation sets and to BPNN models based on randomly constructed training and test sets. The performance of the designed BPNN model is comparable to that of an existing model which includes interaction terms.
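The stopping rule described above, keeping the network state at the minimum test-set MSE, can be sketched as follows. The abstract states only that the minimal test-set MSE was the criterion; the `patience` parameter, the function name and the example error curve are assumptions added to make the sketch runnable.

```python
from typing import Iterable, Tuple


def early_stopping_epoch(test_mse_per_epoch: Iterable[float], patience: int = 5) -> Tuple[int, float]:
    """Return (epoch, mse) of the lowest test-set MSE seen before stopping.

    Training is considered finished once `patience` consecutive epochs pass
    without a new minimum (patience is a hypothetical setting, not from the study).
    """
    best_epoch, best_mse = -1, float("inf")
    for epoch, mse in enumerate(test_mse_per_epoch):
        if mse < best_mse:
            best_epoch, best_mse = epoch, mse
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    if best_epoch < 0:
        raise ValueError("no test-set MSE values were provided")
    return best_epoch, best_mse


# Hypothetical test-set error curve: improves, then rises as the network overfits.
curve = [1.0, 0.6, 0.4, 0.35, 0.37, 0.40, 0.45, 0.5]
stop_epoch, stop_mse = early_stopping_epoch(curve, patience=3)
```

In a real training loop the weights would be saved whenever a new minimum is found and restored after stopping, so the returned epoch identifies the network that is kept.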
Subjects
Environmental Pollution/statistics & numerical data , Computer Neural Networks , Environmental Pollutants , Hydroxyl Radical/chemistry , Kinetics , Statistical Models , Regression Analysis , Structure-Activity Relationship
ABSTRACT
Existing models for the reductive dehalogenation reaction under environmentally relevant conditions use Hammett and Taft coefficients as descriptors. Drawbacks of these descriptors are the limited possibilities for interpretation in terms of reaction mechanisms, and the limited availability of these descriptors for more "exotic" substituents. Therefore, in this study new descriptors are tested, using semi-empirical molecular orbital calculations. These descriptors are based on the energetic and electronic properties of the reaction sites and should be able to account for the systematics of the rate constants in a better way than substituent coefficient models. This approach is expected to give reliable estimates of the rate constants even for compounds containing less common structural features. Several relationships for a series of halogenated aromatics are presented here, relating the experimental rate constants to, among others, the calculated activation energy of the rate-limiting step in the reductive dehalogenation process. Results show that semi-empirical molecular orbital descriptors are capable of describing the reaction kinetics within a homologous series of compounds. All descriptors can be explained in terms of reaction mechanisms, thus corroborating the hypothesis about the mechanisms taking place in the environment.
Subjects
Environmental Pollutants , Halogens/chemistry , Chemical Models , Chemical Phenomena , Physical Chemistry , Chlorobenzenes/chemistry , Free Radicals/chemistry , Iodine Compounds/chemistry , Kinetics , Molecular Structure , Oxidation-Reduction , Porphyrins/chemistry , Structure-Activity Relationship , Thermodynamics
ABSTRACT
A project for the development of Structure-Activity Relationships for Biodegradation is presented. The aim of the project is to assemble sets of structural rules governing the potential microbial degradability of (classes of) chemicals. These rules will provide tools to take into account the biodegradation aspects of a product, and of all precursors in the production process, early in product development. The modeling concept is to take all experimental biodegradation data available and combine structural trends in the data with mechanistic information from degradation pathways. The rules that are derived should give insight into the possibility of biodegradation for specific classes of chemicals, thereby revealing why a compound is biodegradable or not. For the class of imidazole derivatives such rules are derived, and a model degradation mechanism is proposed in analogy to the urocanate hydratase mechanism from histidine metabolism. The model is validated using 12 imidazole compounds, which are all predicted correctly to be poorly biodegradable. It is demonstrated that both data analysis and information on enzymatic reaction mechanisms are necessary to yield valid Structure-Biodegradation Relationships.
Subjects
Imidazoles/metabolism , Chemical Models , Bacteria , Environmental Biodegradation , Forecasting , Imidazoles/chemistry , Structure-Activity Relationship , Urocanate Hydratase/pharmacology
ABSTRACT
The kinetics of the reductive transformation of a set of 17 halogenated aliphatic hydrocarbons in anaerobic sediment-water mixtures are examined using different QSAR methods. Statistical experimental design in combination with multivariate chemical characterization of the compounds was used to select a representative training and validation set. The aim of the QSARs is to generate predictions for priority setting and risk assessment purposes, and to better understand the kinetics of the dehalogenation of aliphatic hydrocarbons. The first QSAR was constructed with multiple linear regression using readily available descriptors. Subsequently, a multivariate QSAR was constructed using the partial least squares (PLS) method with 36 (physico)-chemical descriptors. Finally, a transition state approach was used in which quantum chemically calculated activation energies for the transition state of the most probable reaction mechanism are used to model the reaction rate constants k. Because of the relatively small size of the training set (10 compounds), the linear regression QSAR using multiple descriptors does not show good predictive capabilities on the validation set. The PLS relationship and the transition state QSAR are both capable of generating predictions of rate constants within one order of magnitude. Moreover, the transition state QSAR closely follows, and thus corroborates, the assumed reaction mechanism for reductive dehalogenation. Predictions for 23 untested halogenated aliphatics are given and compared using both the PLS and the transition state model.
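The "within one order of magnitude" criterion mentioned above can be read as a difference of at most 1 between the base-10 logarithms of the predicted and observed rate constants. The sketch below encodes that reading; the function name and the example rate constants are hypothetical and do not come from the study.

```python
import math


def within_one_order(predicted_k: float, observed_k: float) -> bool:
    """True if predicted and observed rate constants differ by at most a factor of 10."""
    if predicted_k <= 0 or observed_k <= 0:
        raise ValueError("rate constants must be positive")
    return abs(math.log10(predicted_k) - math.log10(observed_k)) <= 1.0


# Hypothetical rate constants (same arbitrary units), for illustration only.
close_enough = within_one_order(predicted_k=2e-3, observed_k=1e-2)
too_far = within_one_order(predicted_k=1e-5, observed_k=1e-2)
```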
ABSTRACT
Developmental toxicity testing according to the globally standardized OECD 414 protocol is an important basis for decisions on classification and labeling of developmental toxicants in the European Union (EU). This test requires relatively large animal numbers, given that parental and offspring generations are involved. In vitro assay designs and systems biology paradigms are being developed to reduce animal use and to improve prediction of human hazard. Such approaches could benefit from the long-term experience with animal protocols and more specifically from information on the relevance of effects observed in these tests for developmental toxicity. Therefore, we have analyzed relative parameter sensitivity in 22 publicly available developmental toxicity studies, representing about one third of all classified developmental toxicants under European legislation. Maternal and fetal weight effects and fetal survival were the most often affected parameters at the developmental Lowest Observed Adverse Effect Level (dLOAEL), followed by skeletal malformations. Specific end points such as cleft palate were observed in fewer studies at the dLOAEL but, if observed, may have been crucial in classification and labeling decisions. These results are similar to those of earlier studies using different selections of chemicals, indicating that classified developmental toxicants have a pattern of effects at the dLOAEL similar to that of chemicals in general. These findings are discussed within the perspective of the development of innovative alternative approaches to developmental hazard assessment.
Subjects
Maternal-Fetal Exchange , Teratogens/toxicity , Drug-Induced Abnormalities/etiology , Animals , Body Weight/drug effects , Embryonic Development/drug effects , Female , Fetal Development/drug effects , Fetal Resorption/chemically induced , Organ Size/drug effects , Pregnancy , Toxicity Tests , Uterus/drug effects , Uterus/growth & development
ABSTRACT
The multi-generation reproductive toxicity study (OECD TG 416 and USEPA 870.3800) has been extensively used internationally to assess the adverse effects of substances on reproduction. Recently the necessity of producing a second generation to assess the potential for human health risks has been questioned. The present standardized retrospective analysis of the impact of the second generation on overall study outcome combines earlier analyses and includes 498 rat multi-generation studies representing 438 different tested substances. Detailed assessment of study reports revealed no critical differences in sensitivities between the generations on the basis of a consideration of all endpoints evaluated. This analysis indicates that the second generation mating and offspring will very rarely provide critical information. These findings are consistent with the conclusions of previous retrospective analyses conducted by RIVM, USEPA and PMRA and support adoption of the proposed OECD extended one-generation reproductive toxicity study protocol in regulatory risk assessment testing strategies.