1.
SAR QSAR Environ Res ; 32(2): 111-131, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33461329

ABSTRACT

This paper analyses available experimental data and builds predictive models for the binding affinity of molecules to two nuclear receptors involved in endocrine disruption (ED): the oestrogen receptor (ER) and the androgen receptor (AR). The ED-relevant data were retrieved from multiple sources, including the CERAPP, CoMPARA, and Tox21 projects as well as the ChEMBL and PubChem databases. Data analysis using generative topographic mapping revealed low agreement between experimental values from different sources. The collected data were used to train both classification models for ER and AR binding activities and regression models for relative binding affinity (RBA) and median inhibition concentration (IC50). These models displayed relatively poor performance in classification (sensitivities ER = 0.34, AR = 0.49) and in regression (the determination coefficient r² for the RBA and IC50 models in external validation varied from 0.44 to 0.76). Our analysis demonstrates that the models' low performance resulted from misinterpreted experimental endpoints or wrongly reported values, thus confirming the observations reported in the CERAPP and CoMPARA studies. The developed models and the collected data sets, comprising 6215 (ER) and 3789 (AR) unique compounds, are freely available.


Subject(s)
Endocrine Disruptors/chemistry , Quantitative Structure-Activity Relationship , Receptors, Androgen/chemistry , Receptors, Estrogen/chemistry , Humans , Models, Theoretical
2.
SAR QSAR Environ Res ; 31(9): 655-675, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32799684

ABSTRACT

We report new consensus models estimating acute toxicity for algae, Daphnia and fish endpoints. We assembled a large collection of 3680 unique public compounds, each annotated with at least one experimental value for the given endpoint. Support Vector Machine models were internally and externally validated following the OECD principles. Reasonable predictive performance was achieved (RMSEext = 0.56-0.78), in line with that of state-of-the-art models. The known structural alerts are compared with an analysis of the atomic contributions to these models obtained using the ISIDA/ColorAtom utility. The models were benchmarked against existing tools on a set of compounds considered more representative of, and relevant to, the chemical space of the current chemical industry. Our model achieved some of the best accuracy and data coverage. Nevertheless, performance on industrial data was noticeably lower than on public data, indicating that existing models fail to meet industrial needs. The final models were therefore updated with new industrial compounds, extending their applicability domain and their relevance in an industrial context. The generated models and the collected public data are made freely available.


Subject(s)
Daphnia/drug effects , Fishes , Microalgae/drug effects , Quantitative Structure-Activity Relationship , Toxicity Tests, Acute , Water Pollutants, Chemical/toxicity , Animals , Support Vector Machine
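
As an illustration of the consensus modelling and external RMSE mentioned in the abstract of record 2, the following is a minimal sketch. It assumes that the consensus is an unweighted mean of the individual models' predictions, which the abstract does not state; the function names and toy values are hypothetical and are not taken from the paper.

```python
import math

def consensus(predictions):
    """Average per-compound predictions across models.

    Assumption: an unweighted arithmetic mean; the abstract does not
    specify how the consensus is actually formed.
    """
    n_models = len(predictions)
    return [sum(column) / n_models for column in zip(*predictions)]

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Toy data (hypothetical): two models, three external-set compounds.
model_a = [1.0, 2.0, 3.0]
model_b = [2.0, 2.0, 5.0]
observed = [1.5, 3.0, 4.0]

cons = consensus([model_a, model_b])
error = rmse(observed, cons)
print(cons, round(error, 3))
```

An RMSEext value such as those quoted in the abstract would be obtained the same way, using only compounds that were held out from training.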
3.
SAR QSAR Environ Res ; 31(7): 493-510, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32588650

ABSTRACT

The evaluation of the persistence of chemicals in environmental media (water, soil, sediment) is included in European regulations, in the context of the Persistence, Bioaccumulation and Toxicity (PBT) assessment. In silico predictions are valuable alternatives for compound screening and prioritization. However, existing prediction tools have limitations: narrow applicability domains due to their relatively small training sets, and a lack of medium-specific models. A dataset of 1579 unique compounds has been collected by merging several persistence data sources, each compound being annotated with at least one experimental dissipation half-life value for the given environmental medium. This dataset was used to train binary classification models discriminating persistent from non-persistent (P/nP) compounds based on REACH half-life thresholds for the sediment, water and soil compartments. Models were built using ISIDA (In SIlico design and Data Analysis) fragment descriptors and support vector regression, random forest and naïve Bayesian machine-learning methods. All models performed satisfactorily: the sediment model performed best (BAext = 0.91), followed by water (BAext = 0.77) and soil (BAext = 0.76). The latter suffers from low detection of persistent ('P') compounds (Snext = 0.50), reflecting discrepancies in reported half-life measurements among the different data sources. The generated models and the collected data are made publicly available.


Subject(s)
Environmental Pollutants/pharmacology , Quantitative Structure-Activity Relationship , Bayes Theorem , Computer Simulation , Environmental Pollutants/chemistry , Half-Life , Models, Chemical , Support Vector Machine
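
The abstract of record 3 reports balanced accuracy (BA) and sensitivity (Sn) for persistent/non-persistent classifiers. The sketch below shows how these metrics are conventionally computed from true and predicted labels; the function names and the toy labels are illustrative assumptions, not data from the paper.

```python
def sensitivity(y_true, y_pred, positive="P"):
    """Fraction of truly positive compounds that are predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn)

def specificity(y_true, y_pred, positive="P"):
    """Fraction of truly negative compounds that are predicted negative."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    return tn / (tn + fp)

def balanced_accuracy(y_true, y_pred, positive="P"):
    """Mean of sensitivity and specificity."""
    return 0.5 * (sensitivity(y_true, y_pred, positive)
                  + specificity(y_true, y_pred, positive))

# Toy labels (hypothetical): 4 persistent ("P") and 6 non-persistent ("nP").
y_true = ["P", "P", "P", "P", "nP", "nP", "nP", "nP", "nP", "nP"]
y_pred = ["P", "P", "nP", "nP", "nP", "nP", "nP", "nP", "nP", "P"]

sn = sensitivity(y_true, y_pred)
ba = balanced_accuracy(y_true, y_pred)
print(sn, round(ba, 3))
```

Because BA averages sensitivity and specificity, a model can miss many persistent compounds (low Sn) while its BA still looks moderate if most non-persistent compounds are classified correctly.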
4.
SAR QSAR Environ Res ; 31(3): 171-186, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31858821

ABSTRACT

The European Registration, Evaluation, Authorization and Restriction of Chemicals (REACH) Regulation requires marketed chemicals to be evaluated for Ready Biodegradability (RB), and considers in silico prediction a valid alternative to experimental testing. However, currently available models may not be suitable for predicting compounds of industrial interest, owing to limited accuracy and restricted applicability domains. In this work, we present a new and extended RB dataset (2830 compounds), obtained by merging several public data sources. It was used to train classification models, which were externally validated and benchmarked against existing tools on a set of 316 compounds from the industrial context. The new models showed good performance in terms of predictive power (balanced accuracy (BA) = 0.74-0.79) and data coverage (83-91%). The Generative Topographic Mapping approach identified several chemotypes and structural motifs unique to the industrial dataset, highlighting the chemical classes for which currently available models may give less reliable predictions. Finally, public and industrial data were merged into a global dataset containing 3146 compounds. This is the largest dataset reported in the literature so far, and it covers some chemotypes absent from the public data. Thus, the predictive model developed on the global dataset has a larger applicability domain than existing ones.


Subject(s)
Databases, Chemical , Environmental Pollutants/chemistry , Models, Chemical , Algorithms , Benchmarking , Biodegradation, Environmental , Computer Simulation , Databases, Chemical/standards , Quantitative Structure-Activity Relationship , Reproducibility of Results
5.
SAR QSAR Environ Res ; 30(7): 507-524, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31244346

ABSTRACT

The bioconcentration factor (BCF), a key parameter required by the REACH regulation, estimates the tendency of a xenobiotic to concentrate inside living organisms. In silico methods can be valid alternatives to costly measurements. However, in the industrial context, these theoretical approaches may fail to predict BCF with reasonable accuracy. We analyzed whether models built on public data alone perform adequately when challenged to predict industrial compounds. A new set of 1129 compounds has been collected by merging publicly available datasets. Generative Topographic Mapping was employed to compare this chemical space with a set of new compounds from industry. Some new chemotypes absent from the training set (such as siloxanes) have been detected. A new BCF model has been built using ISIDA (In SIlico design and Data Analysis) fragment descriptors with support vector regression and random forest machine-learning methods. It has been externally validated on (i) data collected from the literature and (ii) industrial data. The latter also served as a benchmark for the freely available tools VEGA, EPISuite, TEST and OPERA. The new model performs comparably to existing ones (RMSE of 0.58 log BCF units) but benefits from an extended applicability, covering the chemical space of the industrial set (78% data coverage).


Subject(s)
Computer Simulation , Quantitative Structure-Activity Relationship , Water Pollutants, Chemical/chemistry , Xenobiotics/chemistry , Animals , Food Chain , Machine Learning , Support Vector Machine , Water Pollutants, Chemical/metabolism , Xenobiotics/metabolism
6.
SAR QSAR Environ Res ; 30(12): 879-897, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31607169

ABSTRACT

We report predictive models of acute oral systemic toxicity, a follow-up of our previous work in the framework of the NICEATM project. The work includes an update of the original models through the addition of new data, and an external validation of the models using a dataset relevant to the chemical industry context. A regression model for LD50 and a multi-class classification model for toxicity classes according to the Globally Harmonized System categories were prepared. ISIDA descriptors were used to encode molecular structures. The machine-learning algorithms included support vector machine (SVM), random forest (RF) and naïve Bayesian. Selected individual models were combined into a consensus. The different datasets were compared using the generative topographic mapping approach. The NICEATM datasets were found to lack some chemotypes relevant to the chemical industry. The new models, trained on enlarged data sets, have applicability domains (AD) sufficiently large to accommodate industrial compounds. The fraction of compounds inside the models' AD increased from 58% (NICEATM model) to 94% (new model). Enlarging the training sets improved the models' predictive performance: RMSE values decreased from 0.56 to 0.47 and balanced accuracies increased from 0.69 to 0.71 for the NICEATM and new models, respectively.


Subject(s)
Animal Testing Alternatives/methods , Models, Theoretical , Toxicity Tests, Acute/methods , Administration, Oral , Animal Testing Alternatives/standards , Animals , Computer Simulation , Consensus , Databases, Chemical , Machine Learning , Quantitative Structure-Activity Relationship , Rats , Reproducibility of Results , Toxicity Tests, Acute/standards
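
The abstract of record 6 quotes the fraction of compounds falling inside a model's applicability domain (AD). The sketch below illustrates how such a coverage figure can be computed, assuming a simple bounding-box AD on descriptor values. This AD definition is chosen purely for illustration; the abstract does not say which AD definition the authors used, and all names and values here are hypothetical.

```python
def bounding_box(train):
    """Per-descriptor minimum and maximum over the training set."""
    columns = list(zip(*train))
    lower = [min(c) for c in columns]
    upper = [max(c) for c in columns]
    return lower, upper

def inside(x, lower, upper):
    """True if every descriptor of x lies within the training range."""
    return all(lo <= v <= hi for v, lo, hi in zip(x, lower, upper))

def coverage(test, lower, upper):
    """Fraction of test compounds inside the applicability domain."""
    return sum(inside(x, lower, upper) for x in test) / len(test)

# Toy descriptor vectors (hypothetical): 3 training and 4 test compounds.
train = [[0, 1], [2, 3], [1, 5]]
test = [[1, 2], [3, 2], [0, 5], [2, 6]]

lower, upper = bounding_box(train)
cov = coverage(test, lower, upper)
print(lower, upper, cov)
```

Under this assumed definition, adding training compounds can only widen the box, which is one way to see why enlarged training sets tend to raise data coverage.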