Search | VHL Regional Portal

CoMPARA: Collaborative Modeling Project for Androgen Receptor Activity.

Mansouri, Kamel; Kleinstreuer, Nicole; Abdelaziz, Ahmed M; Alberga, Domenico; Alves, Vinicius M; Andersson, Patrik L; Andrade, Carolina H; Bai, Fang; Balabin, Ilya; Ballabio, Davide; Benfenati, Emilio; Bhhatarai, Barun; Boyer, Scott; Chen, Jingwen; Consonni, Viviana; Farag, Sherif; Fourches, Denis; García-Sosa, Alfonso T; Gramatica, Paola; Grisoni, Francesca; Grulke, Chris M; Hong, Huixiao; Horvath, Dragos; Hu, Xin; Huang, Ruili; Jeliazkova, Nina; Li, Jiazhong; Li, Xuehua; Liu, Huanxiang; Manganelli, Serena; Mangiatordi, Giuseppe F; Maran, Uko; Marcou, Gilles; Martin, Todd; Muratov, Eugene; Nguyen, Dac-Trung; Nicolotti, Orazio; Nikolov, Nikolai G; Norinder, Ulf; Papa, Ester; Petitjean, Michel; Piir, Geven; Pogodin, Pavel; Poroikov, Vladimir; Qiao, Xianliang; Richard, Ann M; Roncaglioni, Alessandra; Ruiz, Patricia; Rupakheti, Chetan; Sakkiah, Sugunadevi.

Environ Health Perspect ; 128(2): 27002, 2020 Feb.

Article in English | MEDLINE | ID: mdl-32074470

ABSTRACT

BACKGROUND: Endocrine-disrupting chemicals (EDCs) are xenobiotics that mimic the interaction of natural hormones and alter synthesis, transport, or metabolic pathways. The prospect of EDCs causing adverse health effects in humans and wildlife has led to the development of scientific and regulatory approaches for evaluating bioactivity. This need is being addressed using high-throughput screening (HTS) in vitro approaches and computational modeling. OBJECTIVES: In support of the Endocrine Disruptor Screening Program, the U.S. Environmental Protection Agency (EPA) led two worldwide consortiums to virtually screen chemicals for their potential estrogenic and androgenic activities. Here, we describe the Collaborative Modeling Project for Androgen Receptor Activity (CoMPARA) efforts, which follow the steps of the Collaborative Estrogen Receptor Activity Prediction Project (CERAPP). METHODS: The CoMPARA list of screened chemicals built on CERAPP's list of 32,464 chemicals to include additional chemicals of interest, as well as simulated ToxCast™ metabolites, totaling 55,450 chemical structures. Computational toxicology scientists from 25 international groups contributed 91 predictive models for binding, agonist, and antagonist activity predictions. Models were underpinned by a common training set of 1,746 chemicals compiled from a combined data set of 11 ToxCast™/Tox21 HTS in vitro assays. RESULTS: The resulting models were evaluated using curated literature data extracted from different sources. To overcome the limitations of single-model approaches, CoMPARA predictions were combined into consensus models that provided an average predictive accuracy of approximately 80% for the evaluation set.
DISCUSSION: The strengths and limitations of the consensus predictions were discussed with example chemicals; then, the models were implemented into the free and open-source OPERA application to enable screening of new chemicals with a defined applicability domain and accuracy assessment. This implementation was used to screen the entire EPA DSSTox database of ~875,000 chemicals, and their predicted AR activities have been made available on the EPA CompTox Chemicals dashboard and National Toxicology Program's Integrated Chemical Environment. https://doi.org/10.1289/EHP5580.
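
The abstract above does not say how the individual model predictions were combined, so the following is only a minimal sketch of one common option, an unweighted majority vote over binary calls. The function name `consensus_call`, the 0.5 threshold, and the example predictions are all hypothetical and are not taken from CoMPARA.

```python
from typing import Dict, List


def consensus_call(predictions: List[int]) -> Dict[str, float]:
    """Combine binary calls (1 = active, 0 = inactive) from several models.

    Assumption: simple unweighted majority vote; the published consensus
    scheme may differ (for example, by weighting models by their accuracy).
    """
    if not predictions:
        raise ValueError("need at least one model prediction")
    fraction_active = sum(predictions) / len(predictions)
    call = 1 if fraction_active >= 0.5 else 0
    # Concordance: share of models that agree with the consensus call.
    concordance = fraction_active if call == 1 else 1.0 - fraction_active
    return {
        "call": call,
        "fraction_active": fraction_active,
        "concordance": concordance,
    }


# Hypothetical calls from five models for a single chemical.
example = consensus_call([1, 1, 0, 1, 0])
print(example)
```

Reporting the concordance next to the call gives a rough indication of how much the contributing models disagree for a given chemical.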

Subjects

Computer Simulation; Endocrine Disruptors; Androgens; Databases, Factual; High-Throughput Screening Assays; Humans; Receptors, Androgen; United States; United States Environmental Protection Agency

Open-source QSAR models for pKa prediction using multiple machine learning approaches.

Mansouri, Kamel; Cariello, Neal F; Korotcov, Alexandru; Tkachenko, Valery; Grulke, Chris M; Sprankle, Catherine S; Allen, David; Casey, Warren M; Kleinstreuer, Nicole C; Williams, Antony J.

J Cheminform ; 11(1): 60, 2019 Sep 18.

Article in English | MEDLINE | ID: mdl-33430972

ABSTRACT

BACKGROUND: The logarithmic acid dissociation constant pKa reflects the ionization of a chemical, which affects lipophilicity, solubility, protein binding, and ability to pass through the plasma membrane. Thus, pKa affects chemical absorption, distribution, metabolism, excretion, and toxicity properties. Multiple proprietary software packages exist for the prediction of pKa, but to the best of our knowledge no free and open-source programs exist for this purpose. Using a freely available data set and three machine learning approaches, we developed open-source models for pKa prediction. METHODS: The experimental strongest acidic and strongest basic pKa values in water for 7912 chemicals were obtained from DataWarrior, a freely available software package. Chemical structures were curated and standardized for quantitative structure-activity relationship (QSAR) modeling using KNIME, and a subset comprising 79% of the initial set was used for modeling. To evaluate different approaches to modeling, several datasets were constructed based on different processing of chemical structures with acidic and/or basic pKas. Continuous molecular descriptors, binary fingerprints, and fragment counts were generated using PaDEL, and pKa prediction models were created using three machine learning methods: (1) support vector machines (SVM) combined with k-nearest neighbors (kNN), (2) extreme gradient boosting (XGB), and (3) deep neural networks (DNN). RESULTS: The three methods delivered comparable performances on the training and test sets with a root-mean-squared error (RMSE) around 1.5 and a coefficient of determination (R²) around 0.80. Two commercial pKa predictors from ACD/Labs and ChemAxon were used to benchmark the three best models developed in this work, and performance of our models compared favorably to the commercial products.
CONCLUSIONS: This work provides multiple QSAR models to predict the strongest acidic and strongest basic pKas of chemicals, built using publicly available data, and provided as free and open-source software on GitHub.
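
The two performance statistics quoted in the abstract above (RMSE and R²) can be sketched as follows. This is a minimal illustration using the standard textbook definitions; the abstract does not state the exact formulas used, and the function names and sample pKa values are hypothetical.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-squared error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted pKa values, for illustration only.
observed_pka = [4.8, 9.2, 3.1, 10.5]
predicted_pka = [5.0, 8.5, 4.0, 10.0]
print(rmse(observed_pka, predicted_pka))
print(r_squared(observed_pka, predicted_pka))
```

RMSE is expressed in the same units as the endpoint, so a value around 1.5 corresponds to a typical error of roughly 1.5 pKa units.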

OPERA models for predicting physicochemical properties and environmental fate endpoints.

Mansouri, Kamel; Grulke, Chris M; Judson, Richard S; Williams, Antony J.

J Cheminform ; 10(1): 10, 2018 Mar 08.

Article in English | MEDLINE | ID: mdl-29520515

ABSTRACT

The collection of chemical structure information and associated experimental data for quantitative structure-activity/property relationship (QSAR/QSPR) modeling is facilitated by an increasing number of public databases containing large amounts of useful data. However, the performance of QSAR models highly depends on the quality of the data and modeling methodology used. This study aims to develop robust QSAR/QSPR models for chemical properties of environmental interest that can be used for regulatory purposes. This study primarily uses data from the publicly available PHYSPROP database consisting of a set of 13 common physicochemical and environmental fate properties. These datasets have undergone extensive curation using an automated workflow to select only high-quality data, and the chemical structures were standardized prior to calculation of the molecular descriptors. The modeling procedure was developed based on the five Organization for Economic Cooperation and Development (OECD) principles for QSAR models. A weighted k-nearest neighbor approach was adopted using a minimum number of required descriptors calculated using PaDEL, an open-source software package. Genetic algorithms selected only the most pertinent and mechanistically interpretable descriptors (2-15, with an average of 11 descriptors). The sizes of the modeled datasets varied from 150 chemicals for biodegradability half-life to 14,050 chemicals for logP, with an average of 3222 chemicals across all endpoints. The optimal models were built on randomly selected training sets (75%) and validated using fivefold cross-validation (CV) and test sets (25%). The CV Q² of the models varied from 0.72 to 0.95, with an average of 0.86, and the test-set R² varied from 0.71 to 0.96, with an average of 0.82. Modeling and performance details are described in QSAR model reporting format and were validated by the European Commission's Joint Research Center to be OECD compliant.
All models are freely available as an open-source, command-line application called OPEn structure-activity/property Relationship App (OPERA). OPERA models were applied to more than 750,000 chemicals to produce freely available predicted data on the U.S. Environmental Protection Agency's CompTox Chemistry Dashboard.
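
The weighted k-nearest neighbor approach mentioned in the abstract above can be illustrated with a minimal sketch. It assumes Euclidean distance in descriptor space and inverse-distance weights; the abstract does not specify OPERA's actual distance metric or weighting scheme, and the function name, descriptor values, and property values below are hypothetical.

```python
import math
from typing import List, Sequence


def weighted_knn_predict(
    train_x: List[Sequence[float]],
    train_y: List[float],
    query: Sequence[float],
    k: int = 3,
) -> float:
    """Predict a property as the inverse-distance-weighted mean of the
    k nearest training chemicals in descriptor space (assumed scheme)."""
    if k < 1 or k > len(train_x) or len(train_x) != len(train_y):
        raise ValueError("invalid k or mismatched training data")
    # Pair each training chemical's distance to the query with its value.
    neighbors = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    # An exact descriptor match would give an infinite weight,
    # so return that neighbor's value directly.
    for distance, value in neighbors:
        if distance == 0.0:
            return value
    weights = [1.0 / distance for distance, _ in neighbors]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, neighbors))
    return weighted_sum / sum(weights)


# Hypothetical one-descriptor training set, for illustration only.
descriptors = [(0.0,), (1.0,), (2.0,), (10.0,)]
values = [0.0, 1.0, 2.0, 10.0]
print(weighted_knn_predict(descriptors, values, (0.5,), k=2))
```

Closer neighbors dominate the prediction under this weighting, which also makes the nearest-neighbor distances a natural ingredient for an applicability-domain check.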
