1.
J Cheminform ; 16(1): 35, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38528548

ABSTRACT

Natural products are a diverse class of compounds with promising biological properties, such as high potency and excellent selectivity. However, they have different structural motifs than typical drug-like compounds, e.g., a wider range of molecular weight, multiple stereocenters and a higher fraction of sp3-hybridized carbons. This makes the encoding of natural products via molecular fingerprints difficult, thus restricting their use in cheminformatics studies. To tackle this issue, we explored over 30 years of research to systematically evaluate which molecular fingerprint provides the best performance on the natural product chemical space. We considered 20 molecular fingerprints from four different sources, which we then benchmarked on over 100,000 unique natural products from the COCONUT (COlleCtion of Open Natural prodUcTs) and CMNPD (Comprehensive Marine Natural Products Database) databases. Our analysis focused on the correlation between different fingerprints and their classification performance on 12 bioactivity prediction datasets. Our results show that different encodings can provide fundamentally different views of the natural product chemical space, leading to substantial differences in pairwise similarity and performance. While Extended Connectivity Fingerprints are the de facto option for encoding drug-like compounds, other fingerprints matched or outperformed them for bioactivity prediction of natural products. These results highlight the need to evaluate multiple fingerprinting algorithms for optimal performance and suggest new areas of research. Finally, we provide an open-source Python package for computing all molecular fingerprints considered in the study, as well as data and scripts necessary to reproduce the results, at https://github.com/dahvida/NP_Fingerprints.
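To illustrate how two encodings can give different pairwise similarities for the same pair of molecules, here is a minimal sketch using Tanimoto similarity on sets of "on" bit indices. The bit indices and the names `encoding_1` and `encoding_2` are invented for illustration and do not come from the study or its package.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' bit indices."""
    if not fp_a and not fp_b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical on-bit indices for the same two molecules under two encodings.
encoding_1 = {"mol_a": {1, 4, 9, 16}, "mol_b": {1, 4, 9, 25}}
encoding_2 = {"mol_a": {2, 3, 5}, "mol_b": {5, 7, 11}}

sim_1 = tanimoto(encoding_1["mol_a"], encoding_1["mol_b"])  # 3 shared of 5 total
sim_2 = tanimoto(encoding_2["mol_a"], encoding_2["mol_b"])  # 1 shared of 5 total
print(f"encoding 1: {sim_1:.2f}, encoding 2: {sim_2:.2f}")
```

The same pair looks fairly similar under one encoding and dissimilar under the other, which is the kind of disagreement the abstract describes.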

2.
Toxics ; 12(1)2024 Jan 19.
Article in English | MEDLINE | ID: mdl-38276722

ABSTRACT

Cardiovascular disease is a leading global cause of mortality. The potential cardiotoxic effects of chemicals from different classes, such as environmental contaminants, pesticides, and drugs, can significantly contribute to effects on health. The same chemical can induce cardiotoxicity in different ways, following various Adverse Outcome Pathways (AOPs). In addition, the potential synergistic effects between chemicals further complicate the issue. In silico methods have become essential for tackling the problem from different perspectives, reducing the need for traditional in vivo testing, and saving valuable resources in terms of time and money. Artificial intelligence (AI) and machine learning (ML) are among today's advanced approaches for evaluating chemical hazards. They can serve, for instance, as a first-tier component of Integrated Approaches to Testing and Assessment (IATA). This study employed ML and AI to assess interactions between chemicals and specific biological targets within the AOP networks for cardiotoxicity, starting with molecular initiating events (MIEs) and progressing through key events (KEs). We explored methods to encode chemical information in a suitable way for ML and AI. We started with commonly used approaches in Quantitative Structure-Activity Relationship (QSAR) methods, such as molecular descriptors and different types of fingerprints. We then increased the complexity of encoders, incorporating graph-based methods, auto-encoders, and character embeddings employed in natural language processing. We also developed a multimodal neural network architecture, capable of considering the complementary nature of different chemical representations simultaneously. The potential of this approach, compared to more conventional architectures designed to handle a single encoder, becomes apparent when the amount of data increases.

3.
Food Res Int ; 171: 113036, 2023 09.
Article in English | MEDLINE | ID: mdl-37330849

ABSTRACT

The capacity to discriminate safe from dangerous compounds has played an important role in the evolution of species, including human beings. Highly evolved senses such as taste allow humans to navigate and survive in the environment through information that reaches the brain as electrical pulses. Specifically, taste receptors provide multiple bits of information about the substances that are introduced orally. These substances can be pleasant or not according to the taste responses that they trigger. Tastes have been classified into basic (sweet, bitter, umami, sour and salty) or non-basic (astringent, chilling, cooling, heating, pungent), while some compounds are considered multitastes, taste modifiers or tasteless. Classification-based machine learning approaches are useful tools for developing predictive mathematical relationships that predict the taste class of new molecules from their chemical structure. This work reviews the history of multicriteria quantitative structure-taste relationship modelling, starting from the first ligand-based (LB) classifier proposed in 1980 by Lemont B. Kier and concluding with the most recent studies published in 2022.


Subjects
Taste Buds , Taste , Humans , Taste/physiology , Taste Perception
4.
Front Plant Sci ; 13: 1033308, 2022.
Article in English | MEDLINE | ID: mdl-36531358

ABSTRACT

Bitter pit (BP) is one of the most relevant post-harvest disorders for the apple industry worldwide and is often related to calcium (Ca) deficiency at the calyx end of the fruit. It occurs along with an imbalance with other minerals, such as potassium (K). Although the K/Ca ratio is considered a valuable indicator of BP, the levels of these elements vary widely within the fruit, between fruits of the same plant, and between plants and orchards. Prediction systems based on the content of elements in fruit have a high variability because they are determined in samples composed of various fruits. With X-ray fluorescence (XRF) spectrometry, it is possible to characterize non-destructively the signal intensity for several mineral elements at a given position in individual fruit; thus, the complete signal of the mineral composition can be used to build a predictive model of the incidence of bitter pit. Therefore, it was hypothesized that, using a multivariate modeling approach, elements other than K and Ca could be found that improve the current BP prediction capability. Two studies were carried out. In the first, an experiment was conducted to determine the K/Ca ratio and the whole XRF spectrum of a balanced sample of affected and non-affected 'Granny Smith' apples. In the second, apples of three cultivars ('Granny Smith', 'Brookfield' and 'Fuji') were harvested from two commercial orchards to evaluate the use of XRF to predict BP. With data from the first study, a multivariate classification system was trained (balanced database of healthy and BP fruit, consisting of 176 from each group), and the model was then applied in the second study to fruit from two orchards with a history of BP. Results show that when dimensionality reduction was performed on the XRF spectra (1.5-8 keV) of 'Granny Smith' apples, comparing fruit with and without BP, four other elements (i.e., Cl, Si, P, and S), along with K and Ca, were found to be deterministic. However, the PCA revealed that classification between samples (BP vs. non-BP fruit) was not possible by univariate analysis (individual elements or the K/Ca ratio). Therefore, a multivariate classification approach was applied, and the classification measures (sensitivity, specificity, and balanced precision) of the PLS-DA models for all cultivars evaluated ('Granny Smith', 'Fuji' and 'Brookfield'), on the full training samples and with both validation procedures (Venetian and Monte Carlo), ranged from 0.76 to 0.92. The results of this work indicate that using this technology at the individual fruit level is essential to understand the factors that determine this disorder and can improve BP prediction of intact fruit.
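For readers unfamiliar with the classification measures mentioned above, the following is a minimal sketch of how sensitivity and specificity are computed for a two-class problem (BP vs. non-BP fruit), together with their mean. It assumes that the third measure in the abstract corresponds to the mean of sensitivity and specificity, usually called balanced accuracy; the abstract does not define it. The labels are invented, not data from the study.

```python
def classification_measures(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, mean of the two) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)  # fraction of affected fruit correctly flagged
    specificity = tn / (tn + fp)  # fraction of healthy fruit correctly cleared
    return sensitivity, specificity, (sensitivity + specificity) / 2


# Invented example: 1 = fruit with bitter pit, 0 = healthy fruit.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
sens, spec, balanced = classification_measures(y_true, y_pred)
print(sens, spec, balanced)
```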

5.
J Agric Food Chem ; 70(51): 16347-16357, 2022 Dec 28.
Article in English | MEDLINE | ID: mdl-36512435

ABSTRACT

A Box-Behnken experimental design was implemented in model wine (MW) to clarify the impact of copper, iron, and oxygen in the photo-degradation of riboflavin (RF) and methionine (Met) by means of response surface methodology (RSM). Analogous experiments were undertaken in MW containing caffeic acid or catechin. The results evidenced the impact of copper, iron, and oxygen in the photo-induced reaction between RF and Met. In particular, considering a number of volatile sulfur compounds (VSCs) that act as markers of light-struck taste (LST), both transition metals can favor VSC formation, which was shown for the first time for iron. Oxygen in combination can also affect the concentration of VSCs, and a lower content of VSCs was revealed in the presence of phenols, especially caffeic acid. The perception of "cabbage" sensory character indicative of LST can be related to the transition metals as well as to the different phenols, with potentially strong prevention by phenolic acids.
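As a sketch of what a three-factor Box-Behnken design looks like, the code below generates the coded runs (-1, 0, +1) for three factors. Mapping the factors to copper, iron, and oxygen follows the abstract, but the number of center points (`n_center=3`) and the function name are assumptions; the abstract does not report the actual run table.

```python
from itertools import combinations, product


def box_behnken_three_factors(n_center: int = 3) -> list[tuple[int, int, int]]:
    """Coded runs of a three-factor Box-Behnken design.

    Each pair of factors is varied over (-1, +1) while the third factor is
    held at its mid level (0); center points are appended at the end.
    """
    runs = []
    for i, j in combinations(range(3), 2):
        for a, b in product((-1, 1), repeat=2):
            point = [0, 0, 0]
            point[i], point[j] = a, b
            runs.append(tuple(point))
    runs.extend([(0, 0, 0)] * n_center)
    return runs


factors = ("copper", "iron", "oxygen")  # factor order is an assumption
design = box_behnken_three_factors()
for run in design:
    print(dict(zip(factors, run)))
```

With three center points this gives 15 runs, none of which sits at a corner of the cube where all three factors are at an extreme simultaneously.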


Subjects
Methionine , Wine , Wine/analysis , Copper , Oxygen , Sulfur Compounds , Racemethionine , Iron , Phenols/analysis , Riboflavin , Caffeic Acids/pharmacology
6.
Molecules ; 27(18)2022 Sep 08.
Article in English | MEDLINE | ID: mdl-36144564

ABSTRACT

Mass spectrometry (MS) is widely used for the identification of chemical compounds by matching the experimentally acquired mass spectrum against a database of reference spectra. However, this approach suffers from a limited coverage of the existing databases causing a failure in the identification of a compound not present in the database. Among the computational approaches for mining metabolite structures based on MS data, one option is to predict molecular fingerprints from the mass spectra by means of chemometric strategies and then use them to screen compound libraries. This can be carried out by calibrating multi-task artificial neural networks from large datasets of mass spectra, used as inputs, and molecular fingerprints as outputs. In this study, we prepared a large LC-MS/MS dataset from an on-line open repository. These data were used to train and evaluate deep-learning-based approaches to predict molecular fingerprints and retrieve the structure of unknown compounds from their LC-MS/MS spectra. Effects of data sparseness and the impact of different strategies of data curing and dimensionality reduction on the output accuracy have been evaluated. Moreover, extensive diagnostics have been carried out to evaluate modelling advantages and drawbacks as a function of the explored chemical space.


Subjects
Neural Networks, Computer , Tandem Mass Spectrometry , Chromatography, Liquid/methods , Databases, Factual , Tandem Mass Spectrometry/methods
7.
Food Chem (Oxf) ; 4: 100090, 2022 Jul 30.
Article in English | MEDLINE | ID: mdl-35415670

ABSTRACT

The purpose of this work is the creation of a chemical database named ChemTastesDB that includes both organic and inorganic tastants. The creation, curation pipeline and main features of the database are described in detail. The database includes 2944 verified and curated compounds divided into nine classes, which comprise the five basic tastes (sweet, bitter, umami, sour and salty) along with four additional categories: tasteless, non-sweet, multitaste and miscellaneous. ChemTastesDB provides the following information for each tastant: name, PubChem CID, CAS registry number, canonical SMILES, taste class and references to the scientific sources from which data were retrieved. The molecular structure of each chemical in the HyperChem (.hin) format is also made available. In addition, molecular fingerprints were used for characterizing and analyzing the chemical space of tastants by means of unsupervised machine learning. ChemTastesDB constitutes a useful tool for the scientific community to expand the information on taste molecules and to assist in silico studies for the taste prediction of unevaluated and as yet unsynthesized compounds, as well as the analysis of the relationships between molecular structure and taste. The database is freely accessible at https://doi.org/10.5281/zenodo.5747393.

8.
J Pharm Biomed Anal ; 214: 114724, 2022 May 30.
Article in English | MEDLINE | ID: mdl-35303646

ABSTRACT

Heparin has been used successfully as a clinical antithrombotic for almost one century. Its isolation from animal sources (mostly porcine intestinal mucosa) involves multistep purification processes starting from the slaughterhouse (as mucosa) to the pharmaceutical plant (as the API). This complex supply chain increases the risk of contamination and adulteration, mainly with non-porcine ruminant material. The structural similarity of heparins from different origins, the natural variability of the heparin within samples from each source, as well as the structural changes induced by manufacturing processes, require increasingly sophisticated methods capable of detecting low levels of contamination. The application of suitable multivariate classification approaches to API 1H NMR spectra serves as a rapid and reliable tool for product authentication and the detection of contaminants. Soft Independent Modeling of Class Analogies (SIMCA), Discriminant Analysis (DA), Partial Least Squares Discriminant Analysis (PLS-DA) and local classification methods (kNN, BNN and N3) were tested on about one hundred certified heparin samples produced by 14 different manufacturers, revealing that PLS-DA provided the best discrimination of contaminated batches, with a balanced accuracy of 97%.


Subjects
Heparin , Ruminants , Animals , Discriminant Analysis , Heparin/analysis , Least-Squares Analysis , Magnetic Resonance Spectroscopy/methods , Pharmaceutical Preparations , Swine
9.
Regul Toxicol Pharmacol ; 129: 105109, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34968630

ABSTRACT

Several public efforts are aimed at discovering patterns or classifiers in the high-dimensional bioactivity space that predict tissue, organ or whole animal toxicological endpoints. The current study sought to assess and compare the predictions of the Globally Harmonized System (GHS) categories and Dangerous Goods (DG) classifications based on Lethal Dose (LD50) from several available tools (ACD/Labs, Leadscope, T.E.S.T., CATMoS, CaseUltra). External validation was done using a dataset of 375 substances to demonstrate their predictive capacity. All models showed very good performance for identifying non-toxic compounds, which would be useful for DG classification, developing or triaging new chemicals, prioritizing existing chemicals for more detailed and rigorous toxicity assessments, and assessing non-active pharmaceutical intermediates. This would ultimately reduce animal use and improve risk assessments. Category-to-category prediction was not optimal, mainly due to the tendency to overpredict the outcome and the general limitations of acute oral toxicity (AOT) in vivo studies. Although overprediction does not specifically pose a risk to human health, it can impact transport and material packaging requirements. Performance for compounds with LD50 ≤ 300 mg/kg (approx. 5% of the dataset) was the poorest among all groups and could potentially be improved by including expert review and read-across to similar substances.


Subjects
Models, Biological , Toxicity Tests, Acute/methods , Toxicity Tests, Acute/standards , Administration, Oral , Animal Testing Alternatives , Computer Simulation , Dose-Response Relationship, Drug , Lethal Dose 50 , Reproducibility of Results , Structure-Activity Relationship
10.
Molecules ; 28(1)2022 Dec 25.
Article in English | MEDLINE | ID: mdl-36615358

ABSTRACT

According to the 2021 World Drug Report, around 275 million people use drugs of abuse, and 36 million people suffer from addiction, fostering a thriving market for illicit substances. In Italy, 30,083 people were reported to the Judicial Authority for offenses in violation of the Italian Law D.P.R. 309/1990. These offenses are sentenced after a qualitative-quantitative analysis of seized materials. Given the large quantity of seized drugs and the need to perform accurate analytical determinations, Italian forensic laboratories struggle to complete analyses in a short time, delaying the entire reporting process needed to achieve sentencing. For this purpose, a UHPLC-MS/MS-based platform was developed at the University of Milano-Bicocca to support law-enforcement authorities. Software was designed to easily manage street seizure acquisition, documentation registration, and sampling. A sensitive UHPLC-MS/MS method was fully validated for the quantification of the traditional illicit substances (cocaine, heroin, 6-MAM, morphine, amphetamine, methamphetamine, MDMA, ketamine, GHB, GBL, LSD, trans-∆9-THC, and THCA) at the ppb level. The final report is relayed to the Prefecture in 3-4 days, or within 24 h for urgent requests. The platform allows for semi-automatic data handling, which minimizes erroneous results and supports accurate report generation through standardized procedures.


Subjects
Illicit Drugs , Methamphetamine , Humans , Tandem Mass Spectrometry/methods , Chromatography, High Pressure Liquid , Illicit Drugs/analysis , Amphetamine , Substance Abuse Detection/methods
11.
Molecules ; 26(23)2021 Nov 30.
Article in English | MEDLINE | ID: mdl-34885837

ABSTRACT

Neural networks are rapidly gaining popularity in chemical modeling and Quantitative Structure-Activity Relationship (QSAR) thanks to their ability to handle multitask problems. However, outcomes of neural networks depend on the tuning of several hyperparameters, whose small variations can often strongly affect their performance. Hence, optimization is a fundamental step in training neural networks although, in many cases, it can be very expensive from a computational point of view. In this study, we compared four of the most widely used approaches for tuning hyperparameters, namely, grid search, random search, tree-structured Parzen estimator, and genetic algorithms, on three multitask QSAR datasets. We focused mainly on parsimonious optimization, considering not only the performance of the neural networks but also the computational time required. Furthermore, since the optimization approaches do not directly provide information about the influence of hyperparameters, we applied experimental design strategies to determine their effects on the neural network performance. We found that genetic algorithms, tree-structured Parzen estimator, and random search require on average 0.08% of the hours required by grid search; in addition, tree-structured Parzen estimator and genetic algorithms provide better results than random search.
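The cost difference between grid search and random search can be sketched with a toy example. Everything below is hypothetical: the search space, the `toy_score` function (a stand-in for validation performance that would normally require training a network), and the trial budget are not taken from the study.

```python
import itertools
import random

# Hypothetical hyperparameter space: 3 x 4 x 3 = 36 combinations.
search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "hidden_units": [16, 32, 64, 128],
    "dropout": [0.0, 0.25, 0.5],
}


def toy_score(params):
    """Stand-in for validation performance; higher is better."""
    return -(
        abs(params["learning_rate"] - 0.01) * 10
        + abs(params["hidden_units"] - 64) / 64
        + abs(params["dropout"] - 0.25)
    )


def grid_search(space, score):
    """Evaluate every combination; cost grows multiplicatively with each axis."""
    names = list(space)
    candidates = [dict(zip(names, values))
                  for values in itertools.product(*space.values())]
    return max(candidates, key=score), len(candidates)


def random_search(space, score, n_trials, seed=0):
    """Evaluate a fixed budget of randomly sampled combinations."""
    rng = random.Random(seed)
    candidates = [{name: rng.choice(options) for name, options in space.items()}
                  for _ in range(n_trials)]
    return max(candidates, key=score), n_trials


grid_best, grid_evals = grid_search(search_space, toy_score)
rand_best, rand_evals = random_search(search_space, toy_score, n_trials=10)
print(grid_evals, grid_best)
print(rand_evals, rand_best)
```

Random search caps the number of evaluations regardless of how many hyperparameters are added, at the price of possibly missing the best grid point; the toy setup is not meant to reproduce the time savings reported in the abstract.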

14.
Environ Health Perspect ; 129(4): 47013, 2021 04.
Article in English | MEDLINE | ID: mdl-33929906

ABSTRACT

BACKGROUND: Humans are exposed to tens of thousands of chemical substances that need to be assessed for their potential toxicity. Acute systemic toxicity testing serves as the basis for regulatory hazard classification, labeling, and risk management. However, it is cost- and time-prohibitive to evaluate all new and existing chemicals using traditional rodent acute toxicity tests. In silico models built using existing data facilitate rapid acute toxicity predictions without using animals. OBJECTIVES: The U.S. Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) Acute Toxicity Workgroup organized an international collaboration to develop in silico models for predicting acute oral toxicity based on five different end points: Lethal Dose 50 (LD50) value, U.S. Environmental Protection Agency hazard (four) categories, Globally Harmonized System for Classification and Labeling hazard (five) categories, very toxic chemicals (LD50 ≤ 50 mg/kg), and nontoxic chemicals (LD50 > 2,000 mg/kg). METHODS: An acute oral toxicity data inventory for 11,992 chemicals was compiled, split into training and evaluation sets, and made available to 35 participating international research groups that submitted a total of 139 predictive models. Predictions that fell within the applicability domains of the submitted models were evaluated using external validation sets. These were then combined into consensus models to leverage strengths of individual approaches. RESULTS: The resulting consensus predictions, which leverage the collective strengths of each individual model, form the Collaborative Acute Toxicity Modeling Suite (CATMoS). CATMoS demonstrated high performance in terms of accuracy and robustness when compared with in vivo results. DISCUSSION: CATMoS is being evaluated by regulatory agencies for its utility and applicability as a potential replacement for in vivo rat acute oral toxicity studies. 
CATMoS predictions for more than 800,000 chemicals have been made available via the National Toxicology Program's Integrated Chemical Environment tools and data sets (ice.ntp.niehs.nih.gov). The models are also implemented in a free, standalone, open-source tool, OPERA, which allows predictions of new and untested chemicals to be made. https://doi.org/10.1289/EHP8495.
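To make the LD50-based end points concrete, the sketch below bins an oral LD50 value into five hazard bins and applies the two binary cutoffs quoted in the abstract (≤ 50 mg/kg for very toxic, > 2,000 mg/kg for nontoxic). The intermediate cutoffs (5, 50, 300, 2,000 mg/kg) are the GHS acute oral toxicity limits as commonly stated; treating everything above 2,000 mg/kg as a single fifth bin is an assumption about how the "five categories" were defined, and the function names are hypothetical.

```python
def ghs_oral_bin(ld50_mg_per_kg: float) -> int:
    """Map an oral LD50 (mg/kg) to a bin from 1 (most toxic) to 5 (least toxic)."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be positive")
    for bin_number, upper_limit in enumerate((5, 50, 300, 2000), start=1):
        if ld50_mg_per_kg <= upper_limit:
            return bin_number
    return 5  # above 2,000 mg/kg; assumed to be one bin here


def is_very_toxic(ld50_mg_per_kg: float) -> bool:
    """Binary end point from the abstract: LD50 <= 50 mg/kg."""
    return ld50_mg_per_kg <= 50


def is_nontoxic(ld50_mg_per_kg: float) -> bool:
    """Binary end point from the abstract: LD50 > 2,000 mg/kg."""
    return ld50_mg_per_kg > 2000


for value in (3, 50, 300, 301, 2500):
    print(value, ghs_oral_bin(value), is_very_toxic(value), is_nontoxic(value))
```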


Subjects
Government Agencies , Animals , Computer Simulation , Rats , Toxicity Tests, Acute , United States , United States Environmental Protection Agency
15.
Biointerphases ; 15(6): 061004, 2020 11 16.
Article in English | MEDLINE | ID: mdl-33198474

ABSTRACT

The advantages of applying multivariate analysis to mass spectrometry imaging (MSI) data have been thoroughly demonstrated in recent decades. The identification and visualization of complex relationships between pixels in a hyperspectral data set can provide unique insights into the underlying surface chemistry. It is now recognized that most MSI data contain nonlinear relationships, which has led to increased application of machine learning approaches. Previously, we exemplified the use of the self-organizing map (SOM), a type of artificial neural network, for analyzing time-of-flight secondary ion mass spectrometry (TOF-SIMS) hyperspectral images. Recently, we developed a novel methodology, SOM-relational perspective mapping (RPM), which incorporates the algorithm RPM to improve visualization of the SOM for 2D TOF-SIMS images. Here, we use SOM-RPM to characterize and interpret 3D TOF-SIMS depth profile data, voxel-by-voxel. An organic Irganox™ multilayer standard sample was depth profiled using TOF-SIMS, and SOM-RPM was used to create 3D similarity maps of the depth-profiled sample, in which the mass spectral similarity of individual voxels is modeled with color similarity. We used this similarity map to segment the data into spatial features, demonstrating that the unsupervised method meaningfully differentiated between Irganox-3114 and Irganox-1010 nanometer-thin multilayer films. The method also identified unique clusters at the surface associated with environmental exposure and sample degradation. Key fragment ions characteristic of each cluster were identified, tying clusters to their underlying chemistries. SOM-RPM has the demonstrable ability to reduce vast data sets to simple 3D visualizations that can be used for clustering data and visualizing the complex relationships within.


Subjects
Machine Learning , Spectrometry, Mass, Secondary Ion/methods , Butylated Hydroxytoluene/chemistry , Hyperspectral Imaging
16.
Toxicol Appl Pharmacol ; 407: 115244, 2020 11 15.
Article in English | MEDLINE | ID: mdl-32961130

ABSTRACT

Nuclear receptors (NRs) are key regulators of human health and constitute a relevant target for medicinal chemistry applications as well as for toxicological risk assessment. Several open databases dedicated to small molecules that modulate NRs exist; however, depending on their final aim (i.e., adverse effect assessment or drug design), these databases contain a different amount and type of annotated molecules, along with a different distribution of experimental bioactivity values. Stemming from these considerations, in this work we aim to provide a unified dataset, NURA (NUclear Receptor Activity) dataset, collecting curated information on small molecules that modulate NRs, to be intended for both pharmacological and toxicological applications. NURA contains bioactivity annotations for 15,247 molecules and 11 selected NRs, and it was obtained by integrating and curating data from toxicological and pharmacological databases (i.e., Tox21, ChEMBL, NR-DBIND and BindingDB). Our results show that NURA dataset is a useful tool to bridge the gap between toxicology- and medicinal-chemistry-related databases, as it is enriched in terms of number of molecules, structural diversity and covered atomic scaffolds compared to the single sources. To the best of our knowledge, NURA dataset is the most exhaustive collection of small molecules annotated for their modulation of the chosen nuclear receptors. NURA dataset is intended to support decision-making in pharmacology and toxicology, as well as to contribute to data-driven applications, such as machine learning. The dataset and the data curation pipeline can be downloaded free of charge on Zenodo at the following DOI: https://doi.org/10.5281/zenodo.3991561.


Subjects
Databases, Factual , Receptors, Cytoplasmic and Nuclear/drug effects , Chemistry, Pharmaceutical/methods , Computer Simulation , Data Collection , Data Interpretation, Statistical , Drug Evaluation, Preclinical , Humans , In Vitro Techniques , Models, Molecular , Small Molecule Libraries , Software , Toxicology/methods
17.
Anal Chem ; 92(15): 10450-10459, 2020 08 04.
Article in English | MEDLINE | ID: mdl-32614172

ABSTRACT

We present an optimization of the toroidal self-organizing map (SOM) algorithm for the accurate visualization of hyperspectral data. This represents a significant advancement on our previous work, in which we demonstrated the use of toroidal SOMs for the visualization of time-of-flight secondary ion mass spectrometry (ToF-SIMS) imaging data. We have previously shown that the toroidal SOM can be used, unsupervised, to produce a multicolor similarity map of the analysis area, in which pixels with similar mass spectra are assigned a similar color. Here, we use an additional algorithm, relational perspective mapping (RPM), to produce more accurate visualizations of hyperspectral data. The SOM output is used as an input for the RPM algorithm, which is a nonlinear dimensionality reduction technique designed to produce a two-dimensional map of high-dimensional data. Using the topological information provided by the SOM, RPM provides complementary distance information. The result is a color scheme that more accurately reflects the local spectral distances between pixels in the data. We exemplify SOM-RPM using ToF-SIMS imaging data from a mouse tumor tissue section. The similarity maps produced are compared with those produced by two leading hyperspectral visualization techniques in the field of mass spectrometry imaging: t-distributed stochastic neighborhood embedding (t-SNE) and uniform manifold approximation and projection (UMAP). We evaluate the performance of each technique both qualitatively and quantitatively, investigating the correlations between distances in the models and distances in the data. SOM-RPM is demonstrably highly competitive with t-SNE and UMAP, according to our evaluations. Furthermore, the use of a neural network offers distinct advantages in data characterization, which we discuss. 
We also show how spectra extracted from regions of interest identified by SOM-RPM can be further analyzed using linear discriminant analysis for the validation and characterization of the surface chemistry.

18.
Anal Chem ; 92(9): 6587-6597, 2020 05 05.
Article in English | MEDLINE | ID: mdl-32233419

ABSTRACT

Combinatorial approaches to materials discovery offer promising potential for the rapid development of novel polymer systems. Polymer microarrays enable the high-throughput comparison of material physical and chemical properties-such as surface chemistry and properties like cell attachment or protein adsorption-in order to identify correlations that can progress materials development. A challenge for this approach is to accurately discriminate between highly similar polymer chemistries or identify heterogeneities within individual polymer spots. Time-of-flight secondary ion mass spectrometry (ToF-SIMS) offers unique potential in this regard, capable of describing the chemistry associated with the outermost layer of a sample with high spatial resolution and chemical sensitivity. However, this comes at the cost of generating large scale, complex hyperspectral imaging data sets. We have demonstrated previously that machine learning is a powerful tool for interpreting ToF-SIMS images, describing a method for color-tagging the output of a self-organizing map (SOM). This reduces the entire hyperspectral data set to a single reconstructed color similarity map, in which the spectral similarity between pixels is represented by color similarity in the map. Here, we apply the same methodology to a ToF-SIMS image of a printed polymer microarray for the first time. We report complete, single-pixel molecular discrimination of the 70 unique homopolymer spots on the array while also identifying intraspot heterogeneities thought to be related to intermixing of the polymer and the pHEMA coating. In this way, we show that the SOM can identify layers of similarity and clusters in the data, both with respect to polymer backbone structures and their individual side groups. 
Finally, we relate the output of the SOM analysis with fluorescence data from polymer-protein adsorption studies, highlighting how polymer performance can be visualized within the context of the global topology of the data set.

19.
J Chem Inf Model ; 60(3): 1215-1223, 2020 03 23.
Article in English | MEDLINE | ID: mdl-32073844

ABSTRACT

Consensus strategies have been widely applied in many different scientific fields, based on the assumption that the fusion of several sources of information increases the outcome reliability. Despite the widespread application of consensus approaches, their advantages in quantitative structure-activity relationship (QSAR) modeling have not been thoroughly evaluated, mainly due to the lack of appropriate large-scale data sets. In this study, we evaluated the advantages and drawbacks of consensus approaches compared to single classification QSAR models. To this end, we used a data set of three properties (androgen receptor binding, agonism, and antagonism) for approximately 4000 molecules with predictions performed by more than 20 QSAR models, made available in a large-scale collaborative project. The individual QSAR models were compared with two consensus approaches, majority voting and the Bayes consensus with discrete probability distributions, in both protective and nonprotective forms. Consensus strategies proved to be more accurate and to better cover the analyzed chemical space than individual QSARs on average, thus motivating their widespread application for property prediction. Scripts and data to reproduce the results of this study are available for download.
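Majority voting, one of the two consensus approaches named above, can be sketched in a few lines. This is an illustrative version only: the handling of missing predictions (`None`), the tie rule, and the `min_fraction` parameter are assumptions and are not claimed to match the protective or nonprotective forms used in the study.

```python
from collections import Counter


def majority_vote(predictions, min_fraction=0.5):
    """Consensus label from individual model predictions.

    `None` marks a model that gives no prediction (e.g. a molecule outside its
    applicability domain). The consensus is returned only if the most common
    label exceeds `min_fraction` of the available votes; otherwise None.
    """
    votes = [p for p in predictions if p is not None]
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > min_fraction else None


# Invented predictions from four models for one molecule (1 = active).
print(majority_vote([1, 1, 0, None]))               # two of three votes agree
print(majority_vote([1, 0]))                        # tie, no consensus
print(majority_vote([1, 1, 0], min_fraction=0.75))  # stricter agreement required
```

Because models that abstain are simply skipped, the consensus can still label a molecule that some individual models cannot, which is one way a consensus may cover more of the chemical space than any single model.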


Subjects
Quantitative Structure-Activity Relationship , Bayes Theorem , Consensus , Reproducibility of Results
20.
Environ Health Perspect ; 128(2): 27002, 2020 02.
Article in English | MEDLINE | ID: mdl-32074470

ABSTRACT

BACKGROUND: Endocrine disrupting chemicals (EDCs) are xenobiotics that mimic the interaction of natural hormones and alter synthesis, transport, or metabolic pathways. The prospect of EDCs causing adverse health effects in humans and wildlife has led to the development of scientific and regulatory approaches for evaluating bioactivity. This need is being addressed using high-throughput screening (HTS) in vitro approaches and computational modeling. OBJECTIVES: In support of the Endocrine Disruptor Screening Program, the U.S. Environmental Protection Agency (EPA) led two worldwide consortiums to virtually screen chemicals for their potential estrogenic and androgenic activities. Here, we describe the Collaborative Modeling Project for Androgen Receptor Activity (CoMPARA) efforts, which follows the steps of the Collaborative Estrogen Receptor Activity Prediction Project (CERAPP). METHODS: The CoMPARA list of screened chemicals built on CERAPP's list of 32,464 chemicals to include additional chemicals of interest, as well as simulated ToxCast™ metabolites, totaling 55,450 chemical structures. Computational toxicology scientists from 25 international groups contributed 91 predictive models for binding, agonist, and antagonist activity predictions. Models were underpinned by a common training set of 1,746 chemicals compiled from a combined data set of 11 ToxCast™/Tox21 HTS in vitro assays. RESULTS: The resulting models were evaluated using curated literature data extracted from different sources. To overcome the limitations of single-model approaches, CoMPARA predictions were combined into consensus models that provided averaged predictive accuracy of approximately 80% for the evaluation set. 
DISCUSSION: The strengths and limitations of the consensus predictions were discussed with example chemicals; then, the models were implemented into the free and open-source OPERA application to enable screening of new chemicals with a defined applicability domain and accuracy assessment. This implementation was used to screen the entire EPA DSSTox database of ∼875,000 chemicals, and their predicted AR activities have been made available on the EPA CompTox Chemicals dashboard and National Toxicology Program's Integrated Chemical Environment. https://doi.org/10.1289/EHP5580.


Subjects
Computer Simulation , Endocrine Disruptors , Androgens , Databases, Factual , High-Throughput Screening Assays , Humans , Receptors, Androgen , United States , United States Environmental Protection Agency