Results 1 - 17 of 17
1.
J Cheminform ; 16(1): 20, 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38383444

ABSTRACT

REINVENT 4 is a modern open-source generative AI framework for the design of small molecules. The software utilizes recurrent neural networks and transformer architectures to drive molecule generation. These generators are seamlessly embedded within the general machine learning optimization algorithms, transfer learning, reinforcement learning and curriculum learning. REINVENT 4 enables and facilitates de novo design, R-group replacement, library design, linker design, scaffold hopping and molecule optimization. This contribution gives an overview of the software and describes its design. Algorithms and their applications are discussed in detail. REINVENT 4 is a command line tool which reads a user configuration in either TOML or JSON format. The aim of this release is to provide reference implementations for some of the most common algorithms in AI based molecule generation. An additional goal with the release is to create a framework for education and future innovation in AI based molecular design. The software is available from https://github.com/MolecularAI/REINVENT4 and released under the permissive Apache 2.0 license. Scientific contribution. The software provides an open-source reference implementation for generative molecular design where the software is also being used in production to support in-house drug discovery projects. The publication of the most common machine learning algorithms in one code and full documentation thereof will increase transparency of AI and foster innovation, collaboration and education.

2.
Nat Protoc ; 18(7): 1981-2013, 2023 07.
Article in English | MEDLINE | ID: mdl-37344608

ABSTRACT

In image-based profiling, software extracts thousands of morphological features of cells from multi-channel fluorescence microscopy images, yielding single-cell profiles that can be used for basic research and drug discovery. Powerful applications have been proven, including clustering chemical and genetic perturbations on the basis of their similar morphological impact, identifying disease phenotypes by observing differences in profiles between healthy and diseased cells and predicting assay outcomes by using machine learning, among many others. Here, we provide an updated protocol for the most popular assay for image-based profiling, Cell Painting. Introduced in 2013, it uses six stains imaged in five channels and labels eight diverse components of the cell: DNA, cytoplasmic RNA, nucleoli, actin, Golgi apparatus, plasma membrane, endoplasmic reticulum and mitochondria. The original protocol was updated in 2016 on the basis of several years' experience running it at two sites, after optimizing it by visual stain quality. Here, we describe the work of the Joint Undertaking for Morphological Profiling Cell Painting Consortium, to improve upon the assay via quantitative optimization by measuring the assay's ability to detect morphological phenotypes and group similar perturbations together. The assay gives very robust outputs despite various changes to the protocol, and two vendors' dyes work equivalently well. We present Cell Painting version 3, in which some steps are simplified and several stain concentrations can be reduced, saving costs. Cell culture and image acquisition take 1-2 weeks for typically sized batches of ≤20 plates; feature extraction and data analysis take an additional 1-2 weeks. This protocol is an update to Nat. Protoc. 11, 1757-1774 (2016): https://doi.org/10.1038/nprot.2016.105.


Subjects
Cell Culture Techniques , Computer-Assisted Image Processing , Computer-Assisted Image Processing/methods , Fluorescence Microscopy , Mitochondria , Software
3.
Front Pharmacol ; 14: 1116081, 2023.
Article in English | MEDLINE | ID: mdl-36817116

ABSTRACT

Uncontrolled angiogenesis is a common denominator underlying many deadly and debilitating diseases such as myocardial infarction, chronic wounds, cancer, and age-related macular degeneration. As the current range of FDA-approved angiogenesis-based medicines are far from meeting clinical demands, the vast reserve of natural products from traditional Chinese medicine (TCM) offers an alternative source for developing pro-angiogenic or anti-angiogenic modulators. Here, we investigated 100 traditional Chinese medicine-derived individual metabolites which had reported gene expression in MCF7 cell lines in the Gene Expression Omnibus (GSE85871). We extracted literature angiogenic activities for 51 individual metabolites, and subsequently analysed their predicted targets and differentially expressed genes to understand their mechanisms of action. The angiogenesis phenotype was used to generate decision trees for rationalising the poly-pharmacology of known angiogenesis modulators such as ferulic acid and curculigoside and validated by an in vitro endothelial tube formation assay and a zebrafish model of angiogenesis. Moreover, using an in silico model we prospectively examined the angiogenesis-modulating activities of the remaining 49 individual metabolites. In vitro, tetrahydropalmatine and 1 beta-hydroxyalantolactone stimulated, while cinobufotalin and isoalantolactone inhibited endothelial tube formation. In vivo, ginsenosides Rb3 and Rc, 1 beta-hydroxyalantolactone and surprisingly cinobufotalin, restored angiogenesis against PTK787-induced impairment in zebrafish. In the absence of PTK787, deoxycholic acid and ursodeoxycholic acid did not affect angiogenesis. 
Despite some limitations, these results suggest further refinements of in silico prediction combined with biological assessment will be a valuable platform for accelerating the research and development of natural products from traditional Chinese medicine and understanding their mechanisms of action, and also for other traditional medicines for the prevention and treatment of angiogenic diseases.

4.
J Cheminform ; 13(1): 62, 2021 Aug 19.
Article in English | MEDLINE | ID: mdl-34412708

ABSTRACT

Measurements of protein-ligand interactions have reproducibility limits due to experimental errors. Any model based on such assays will consequently have these unavoidable errors influencing its performance, which should ideally be factored into modelling and output predictions, such as the actual standard deviation of experimental measurements (σ) or the associated comparability of activity values between the aggregated heterogeneous activity units (i.e., Ki versus IC50 values) during dataset assimilation. However, experimental errors are usually a neglected aspect of model generation. In order to improve upon the current state-of-the-art, we herein present a novel approach toward predicting protein-ligand interactions using a Probabilistic Random Forest (PRF) classifier. The PRF algorithm was applied toward in silico protein target prediction across ~ 550 tasks from ChEMBL and PubChem. Predictions were evaluated by taking into account various scenarios of experimental standard deviations in both training and test sets and performance was assessed using fivefold stratified shuffled splits for validation. The largest benefit in incorporating the experimental deviation in PRF was observed for data points close to the binary threshold boundary, when such information was not considered in any way in the original RF algorithm. For example, in cases when σ ranged between 0.4-0.6 log units and when ideal probability estimates were between 0.4-0.6, the PRF outperformed RF with a median absolute error margin of ~ 17%. In comparison, the baseline RF outperformed PRF for cases with high confidence to belong to the active class (far from the binary decision threshold), although the RF models gave errors smaller than the experimental uncertainty, which could indicate that they were overtrained and/or over-confident. 
Finally, the PRF models trained with putative inactives decreased the performance compared to PRF models without putative inactives and this could be because putative inactives were not assigned an experimental pXC50 value, and therefore they were considered inactives with a low uncertainty (which in practice might not be true). In conclusion, PRF can be useful for target prediction models in particular for data where class boundaries overlap with the measurement uncertainty, and where a substantial part of the training data is located close to the classification threshold.
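
Note: the idea of folding the experimental standard deviation (σ) into the class label can be illustrated with a short sketch. It assumes Gaussian measurement error around the reported pXC50 value and turns a measurement into a probability of lying above the activity threshold. The function name and the Gaussian assumption are illustrative and are not taken from the abstract or from the published PRF code.

```python
import math


def probability_active(pxc50: float, threshold: float, sigma: float) -> float:
    """Probability that the true activity is at or above the threshold.

    Assumes (for illustration only) that the measured pXC50 carries
    Gaussian error with standard deviation `sigma`, in log units.
    """
    if sigma <= 0:
        # No uncertainty: fall back to a hard binary label.
        return 1.0 if pxc50 >= threshold else 0.0
    # Normal CDF evaluated at the distance from the threshold.
    z = (pxc50 - threshold) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


if __name__ == "__main__":
    # A measurement exactly on the threshold gets a label of 0.5,
    # whereas one far above it approaches 1.0.
    for value in (5.0, 6.0, 6.3, 7.0):
        print(value, round(probability_active(value, 6.0, 0.5), 3))
```

Under this assumption, measurements within roughly one σ of the threshold receive soft labels near 0.5, which is the region where the abstract reports the largest benefit of PRF over RF.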

5.
J Chem Inf Model ; 61(3): 1444-1456, 2021 03 22.
Article in English | MEDLINE | ID: mdl-33661004

ABSTRACT

The understanding of the mechanism-of-action (MoA) of compounds and the prediction of potential drug targets play an important role in small-molecule drug discovery. The aim of this work was to compare chemical and cell morphology information for bioactivity prediction. The comparison was performed using bioactivity data from the ExCAPE database, image data (in the form of CellProfiler features) from the Cell Painting data set (the largest publicly available data set of cell images with ∼30,000 compound perturbations), and extended connectivity fingerprints (ECFPs) using the multitask Bayesian matrix factorization (BMF) approach Macau. We found that the BMF Macau and random forest (RF) performance were overall similar when ECFPs were used as compound descriptors. However, BMF Macau outperformed RF in 159 out of 224 targets (71%) when image data were used as compound information. Using BMF Macau, 100 (corresponding to about 45%) and 90 (about 40%) of the 224 targets were predicted with high predictive performance (AUC > 0.8) with ECFP data and image data as side information, respectively. There were targets better predicted by image data as side information, such as ß-catenin, and others better predicted by fingerprint-based side information, such as proteins belonging to the G-protein-Coupled Receptor 1 family, which could be rationalized from the underlying data distributions in each descriptor domain. In conclusion, both cell morphology changes and chemical structure information contain information about compound bioactivity, which is also partially complementary, and can hence contribute to in silico MoA analysis.
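
Note: the AUC > 0.8 criterion above refers to the area under the ROC curve. A minimal sketch of how that quantity can be computed from prediction scores is given here, using the pairwise-ranking definition (the fraction of active/inactive pairs in which the active compound scores higher). The function name is hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(scores, labels):
    """ROC AUC as the probability that a random positive outranks a random negative.

    `labels` holds 1 for active and 0 for inactive; ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```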


Subjects
Drug Discovery , Proteins , Bayes Theorem , Computer Simulation , Factual Databases
6.
J Cereb Blood Flow Metab ; 41(4): 874-885, 2021 04.
Article in English | MEDLINE | ID: mdl-32281457

ABSTRACT

Functional magnetic resonance imaging (fMRI) is an extensively used method for the investigation of normal and pathological brain function. In particular, fMRI has been used to characterize spatiotemporal hemodynamic response to pharmacological challenges as a non-invasive readout of neuronal activity. However, the mechanisms underlying regional signal changes are yet unclear. In this study, we use a meta-analytic approach to converge data from microdialysis experiments with relative cerebral blood volume (rCBV) changes following acute administration of neuropsychiatric drugs in adult male rats. At whole-brain level, the functional response patterns show very weak correlation with neurochemical alterations, while for numerous brain areas a strong positive correlation with noradrenaline release exists. At a local scale of individual brain regions, the rCBV response to neurotransmitters is anatomically heterogeneous and, importantly, based on a complex interplay of different neurotransmitters that often exert opposing effects, thus providing a mechanism for regulating and fine tuning hemodynamic responses in specific regions.


Subjects
Brain Chemistry/drug effects , Cerebrovascular Circulation/drug effects , Hemodynamics/drug effects , Psychotropic Drugs/pharmacology , Animals , Humans , Magnetic Resonance Imaging , Microdialysis
7.
Drug Discov Today ; 26(2): 474-489, 2021 02.
Article in English | MEDLINE | ID: mdl-33253918

ABSTRACT

Machine learning and artificial intelligence are increasingly being applied to the drug-design process as a result of the development of novel algorithms, growing access, the falling cost of computation and the development of novel technologies for generating chemically and biologically relevant data. There has been recent progress in fields such as molecular de novo generation, synthetic route prediction and, to some extent, property predictions. Despite this, most research in these fields has focused on improving the accuracy of the technologies, rather than on quantifying the uncertainty in the predictions. Uncertainty quantification will become a key component in autonomous decision making and will be crucial for integrating machine learning and chemistry automation to create an autonomous design-make-test-analyse cycle. This review covers the empirical, frequentist and Bayesian approaches to uncertainty quantification, and outlines how they can be used for drug design. We also outline the impact of uncertainty quantification on decision making.


Subjects
Drug Design , Uncertainty , Algorithms , Artificial Intelligence , Automation , Bayes Theorem , Humans , Machine Learning
8.
J Chem Inf Model ; 60(10): 4546-4559, 2020 10 26.
Article in English | MEDLINE | ID: mdl-32865408

ABSTRACT

In the context of bioactivity prediction, the question of how to calibrate a score produced by a machine learning method into a probability of binding to a protein target is not yet satisfactorily addressed. In this study, we compared the performance of three such methods, namely, Platt scaling (PS), isotonic regression (IR), and Venn-ABERS predictors (VA), in calibrating prediction scores obtained from ligand-target prediction comprising the Naïve Bayes, support vector machines, and random forest (RF) algorithms. Calibration quality was assessed on bioactivity data available at AstraZeneca for 40 million data points (compound-target pairs) across 2112 targets and performance was assessed using stratified shuffle split (SSS) and leave 20% of scaffolds out (L20SO) validation. VA achieved the best calibration performances across all machine learning algorithms and cross validation methods tested and also the lowest (best) Brier score loss (mean squared difference between the outputted probability estimates assigned to a compound and the actual outcome). In comparison, the PS and IR methods can actually degrade the assigned probability estimates, particularly for the RF for SSS and during L20SO. Sphere exclusion, a method to sample additional (putative) inactive compounds, was shown to inflate the overall Brier score loss performance, through the artificial requirement for inactive molecules to be dissimilar to active compounds, but was shown to result in overconfident estimators. VA was able to successfully calibrate the probability estimates for even small calibration sets. The multiprobability values (lower and upper probability boundary intervals) were shown to produce large discordance for test set molecules that are neither very similar nor very dissimilar to the active training set, which were hence difficult to predict, suggesting that multiprobability discordance can be used as an estimate for target prediction uncertainty. 
Overall, we were able to show in this work that VA scaling of target prediction models is able to improve probability estimates in all testing instances and is currently being applied for in-house approaches.
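
Note: the Brier score loss described above (the mean squared difference between the probability estimate and the actual outcome) can be written in a few lines. The sketch also includes a helper for the width of a lower/upper probability pair, which is one simple way to express the multiprobability discordance mentioned in the abstract. Both function names are hypothetical, and the width measure is an assumption rather than the authors' stated definition.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    Lower is better; 0.0 means every prediction was certain and correct.
    """
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


def interval_width(lower: float, upper: float) -> float:
    """Width of a (lower, upper) probability pair, used here as a proxy for discordance."""
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    return upper - lower


if __name__ == "__main__":
    print(brier_score([0.9, 0.2, 0.6], [1, 0, 1]))
    print(interval_width(0.35, 0.70))
```

A constant prediction of 0.5 scores 0.25 regardless of the outcomes, which makes a convenient reference point when reading reported Brier scores.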


Subjects
Machine Learning , Support Vector Machine , Bayes Theorem , Ligands , Probability
9.
J Cheminform ; 11(1): 36, 2019 May 31.
Article in English | MEDLINE | ID: mdl-31152262

ABSTRACT

Despite the increasing knowledge in both the chemical and biological domains, the assimilation and exploration of heterogeneous datasets, encoding information about the chemical, bioactivity and phenotypic properties of compounds, remains a challenge due to the requirement for overlap between chemicals assayed across the spaces. Here, we have constructed a novel dataset, larger than we have used in prior work, comprising 579 acute oral toxic compounds and 1427 non-toxic compounds derived from regulatory GHS information, along with their corresponding molecular and protein target descriptors and qHTS in vitro assay readouts from the Tox21 project. We found no clear association between the results of a FAFDrugs4 toxicophore screen and the acute oral toxicity classifications for our compound set; and a screen using a subset of the ToxAlerts toxicophores was also of limited utility, with only slight enrichment toward the toxic set (odds ratio of 1.48). We then investigated to what degree toxic and non-toxic compounds could be separated in each of the spaces, to compare their potential contribution to further analyses. Using an LDA projection, we found the largest degree of separation using chemical descriptors (Cohen's d of 1.95) and the lowest degree of separation between toxicity classes using qHTS descriptors (Cohen's d of 0.67). To compare the predictivity of the feature spaces for the toxicity endpoint, we next trained Random Forest (RF) acute oral toxicity classifiers on either molecular, protein target or qHTS descriptors. RFs trained on molecular and protein target descriptors were most predictive, with ROC AUC values of 0.80-0.92 and 0.70-0.85, respectively, across three test sets. RFs trained on both chemical and protein target descriptors combined exhibited similar predictive performance to the single-domain models (ROC AUC of 0.80-0.91). 
Model interpretability was improved by the inclusion of protein target descriptors, which allow the identification of specific targets (e.g. Retinal dehydrogenase) with literature links to toxic modes of action (e.g. oxidative stress). The dataset compiled in this study has been made available for future application.
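
Note: two effect-size measures are quoted above, the odds ratio for toxicophore enrichment and Cohen's d for class separation. The sketch below shows the textbook forms of both. Cohen's d uses the pooled sample standard deviation, and the odds ratio takes the four cells of a 2x2 contingency table. Function and argument names are hypothetical, and the abstract does not state which variant of Cohen's d was used.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def odds_ratio(toxic_flagged, toxic_unflagged, nontoxic_flagged, nontoxic_unflagged):
    """Odds of a structural alert among toxic compounds relative to non-toxic ones."""
    return (toxic_flagged * nontoxic_unflagged) / (toxic_unflagged * nontoxic_flagged)


if __name__ == "__main__":
    # Toy values only; these are not the study's data.
    print(cohens_d([3, 4, 5], [1, 2, 3]))
    print(odds_ratio(10, 5, 5, 10))
```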

10.
Nat Commun ; 9(1): 4699, 2018 11 08.
Article in English | MEDLINE | ID: mdl-30410047

ABSTRACT

Neuropsychiatric disorders are the third leading cause of global disease burden. Current pharmacological treatment for these disorders is inadequate, with often insufficient efficacy and undesirable side effects. One reason for this is that the links between molecular drug action and neurobehavioral drug effects are elusive. We use a big data approach from the neurotransmitter response patterns of 258 different neuropsychiatric drugs in rats to address this question. Data from experiments comprising 110,674 rats are presented in the Syphad database [ www.syphad.org ]. Chemoinformatics analyses of the neurotransmitter responses suggest a mismatch between the current classification of neuropsychiatric drugs and spatiotemporal neurotransmitter response patterns at the systems level. In contrast, predicted drug-target interactions reflect more appropriately brain region related neurotransmitter response. In conclusion, the neurobiological mechanisms of neuropsychiatric drugs are not well reflected by their current classification or their chemical similarity, but can be better captured by molecular drug-target interactions.


Subjects
Antipsychotic Agents/pharmacology , Neurotransmitter Agents/metabolism , Animals , Brain/metabolism , Computer Simulation , Databases as Topic , Sprague-Dawley Rats , Wistar Rats
11.
Front Pharmacol ; 9: 613, 2018.
Article in English | MEDLINE | ID: mdl-29942259

ABSTRACT

In silico protein target deconvolution is frequently used for mechanism-of-action investigations; however, existing protocols usually do not predict compound functional effects, such as activation or inhibition, upon binding to their protein counterparts. This study is hence concerned with including functional effects in target prediction. To this end, we assimilated a bioactivity training set for 332 targets, comprising 817,239 active data points with unknown functional effect (binding data) and 20,761,260 inactive compounds, along with 226,045 activating and 1,032,439 inhibiting data points from functional screens. Chemical space analysis of the data first showed some separation between compound sets (binding and inhibiting compounds were more similar to each other than both binding and activating or activating and inhibiting compounds), providing a rationale for implementing functional prediction models. We employed three different architectures to predict functional response, ranging from simplistic random forest models ('Arch1') to cascaded models which use separate binding and functional effect classification steps ('Arch2' and 'Arch3'), differing in the way training sets were generated. Fivefold stratified cross-validation showed that cascading predictions provide superior precision and recall based on an internal test set. We next prospectively validated the architectures using a temporal set of 153,467 in-house data points (after a 4-month interim from initial data extraction). Results showed that Arch3 performed with the highest target class averaged precision and recall scores of 71% and 53%, which we attribute to the use of inactive background sets. Distance-based applicability domain (AD) analysis showed that Arch3 provides superior extrapolation into novel areas of chemical space, and thus, based on the results presented here, we propose it as the most suitable architecture for the functional effect prediction of small molecules. 
We finally conclude including functional effects could provide vital insight in future studies, to annotate cases of unanticipated functional changeover, as outlined by our CHRM1 case study.

12.
Bioinformatics ; 34(1): 72-79, 2018 01 01.
Article in English | MEDLINE | ID: mdl-28961699

ABSTRACT

Motivation: In silico approaches often fail to utilize bioactivity data available for orthologous targets due to insufficient evidence highlighting the benefit for such an approach. Deeper investigation into orthologue chemical space and its influence toward expanding compound and target coverage is necessary to improve the confidence in this practice. Results: Here we present analysis of the orthologue chemical space in ChEMBL and PubChem and its impact on target prediction. We highlight the number of conflicting bioactivities between human and orthologues is low and annotations are overall compatible. Chemical space analysis shows orthologues are chemically dissimilar to human with high intra-group similarity, suggesting they could effectively extend the chemical space modelled. Based on these observations, we show the benefit of orthologue inclusion in terms of novel target coverage. We also benchmarked predictive models using a time-series split and also using bioactivities from Chemistry Connect and HTS data available at AstraZeneca, showing that orthologue bioactivity inclusion statistically improved performance. Availability and implementation: Orthologue-based bioactivity prediction and the compound training set are available at www.github.com/lhm30/PIDGINv2. Contact: ab454@cam.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , Computer Simulation , Drug Discovery/methods , Proteins/metabolism , Amino Acid Sequence Homology , Animals , Humans , Ligands , Biological Models , Proteins/drug effects
13.
J Chem Inf Model ; 57(3): 468-483, 2017 03 27.
Article in English | MEDLINE | ID: mdl-28257573

ABSTRACT

One important, yet poorly understood, concept of Traditional Chinese Medicine (TCM) is that of the hot, cold, and neutral nature of its bioactive principles. To advance the field, in this study, we analyzed compound-nature pairs from TCM on a large scale (>23 000 structures) via chemical space visualizations to understand its physicochemical domain and in silico target prediction to understand differences related to their modes-of-action (MoA) against proteins. We found that overall TCM natures spread into different subclusters with specific molecular patterns, as opposed to forming coherent global groups. Compounds associated with cold nature had a lower clogP and contained more aliphatic rings than the other groups and were found to control detoxification, heat-clearing, and heart development processes, and to have sedative function, associated with "Mental and behavioural disorders" diseases. In contrast, compounds associated with hot nature were on average of lower molecular weight, had more aromatic ring systems than other groups, and frequently seemed to control body temperature, have cardio-protection function, improve fertility and sexual function, and represent excitatory or activating effects, associated with "endocrine, nutritional and metabolic diseases" and "diseases of the circulatory system". Compounds associated with neutral nature had a higher polar surface area and contained more cyclohexene moieties than other groups and seem to be related to memory function, suggesting that their nature may be a useful guide for their utility in neural degenerative diseases. We were hence able to elucidate the difference between different nature classes in TCM on the molecular level, and on a large data set, for the first time, thereby helping a better understanding of TCM nature theory and bridging the gap between traditional medicine and our current understanding of the human body.


Subjects
Computer Simulation , Traditional Chinese Medicine , Molecular Targeted Therapy
14.
Curr Pharm Des ; 22(46): 6918-6927, 2016.
Article in English | MEDLINE | ID: mdl-27784247

ABSTRACT

Cancer cell line panels have proved useful disease models to, among others, identify genomic markers of drug sensitivity and to develop new anticancer drugs. The increasing availability of in vitro sensitivity and cell line profiling data sets raises the question of whether this information could be used, and to what extent, to predict the activity of drugs in cancer cell lines and, ultimately, in patients' tumors. Drug sensitivity prediction embraces those approaches aiming at predicting in vitro drug activity on cancer cell lines by integrating genomic and/or chemical information using machine learning models. In this review, we summarize the cytotoxicity assays generally used to determine in vitro activity on cultured cell lines, and revisit the drug sensitivity prediction studies that have leveraged chemical and cell line profiling data from the NCI60, Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (GDSC) projects. A section outlining current limitations and future perspectives in the field closes the review.


Subjects
Antineoplastic Agents/pharmacology , Neoplasms/drug therapy , Antineoplastic Agents/chemistry , Tumor Cell Line , Cell Proliferation/drug effects , Cell Survival/drug effects , Humans , Neoplasms/genetics , Neoplasms/pathology
15.
ACS Chem Biol ; 11(11): 3007-3023, 2016 11 18.
Article in English | MEDLINE | ID: mdl-27571164

ABSTRACT

While mechanisms of cytotoxicity and cytostaticity have been studied extensively from the biological side, relatively little is currently understood regarding areas of chemical space leading to cytotoxicity and cytostasis in large compound collections. Predicting and rationalizing potential adverse mechanism-of-actions (MoAs) of small molecules is however crucial for screening library design, given the link of even low level cytotoxicity and adverse events observed in man. In this study, we analyzed results from a cell-based cytotoxicity screening cascade, comprising 296 970 nontoxic, 5784 cytotoxic and cytostatic, and 2327 cytostatic-only compounds evaluated on the THP-1 cell-line. We employed an in silico MoA analysis protocol, utilizing 9.5 million active and 602 million inactive bioactivity points to generate target predictions, annotate predicted targets with pathways, and calculate enrichment metrics to highlight targets and pathways. Predictions identify known mechanisms for the top ranking targets and pathways for both phenotypes after review and indicate that while processes involved in cytotoxicity versus cytostaticity seem to overlap, differences between both phenotypes seem to exist to some extent. Cytotoxic predictions highlight many kinases, including the potentially novel cytotoxicity-related target STK32C, while cytostatic predictions outline targets linked with response to DNA damage, metabolism, and cytoskeletal machinery. Fragment analysis was also employed to generate a library of toxicophores to improve general understanding of the chemical features driving toxicity. We highlight substructures with potential kinase-dependent and kinase-independent mechanisms of toxicity. We also trained a cytotoxic classification model on proprietary and public compound readouts, and prospectively validated these on 988 novel compounds comprising difficult and trivial testing instances, to establish the applicability domain of models. 
The proprietary model performed with precision and recall scores of 77.9% and 83.8%, respectively. The MoA results and top ranking substructures with accompanying MoA predictions are available as a platform to assess screening collections.
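
Note: precision and recall, as reported for the classification model above, can be computed from paired predicted and observed labels as sketched here. The function name is hypothetical, and returning 0.0 when a denominator is empty is a convention chosen for this sketch.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for binary labels where 1 marks the positive class."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p == 1 and a == 1)
    false_pos = sum(1 for p, a in pairs if p == 1 and a == 0)
    false_neg = sum(1 for p, a in pairs if p == 0 and a == 1)
    # Precision: fraction of predicted positives that are correct.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: fraction of actual positives that were recovered.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


if __name__ == "__main__":
    print(precision_recall([1, 1, 0, 0, 1], [1, 0, 1, 0, 1]))
```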


Subjects
Cell Cycle/drug effects , Cell Survival/drug effects , High-Throughput Screening Assays/methods , Cell Line , Humans
16.
ACS Omega ; 1(6): 1412-1424, 2016 Dec 31.
Article in English | MEDLINE | ID: mdl-30023509

ABSTRACT

The epidermal growth factor receptor (EGFR) is a validated therapeutic target for triple-negative breast cancer (TNBC). In the present study, we synthesize novel adamantanyl-based thiadiazolyl pyrazoles by introducing the adamantane ring to thiazolopyrazoline. On the basis of loss of cell viability in TNBC cells, 4-(adamantan-1-yl)-2-(3-(2,4-dichlorophenyl)-5-phenyl-4,5-dihydro-1H-pyrazol-1-yl)thiazole (APP) was identified as a lead compound. Using a Parzen-Rosenblatt Window classifier, APP was predicted to target the EGFR protein, and the same was confirmed by surface plasmon resonance. Further analysis revealed that APP suppressed the phosphorylation of EGFR at Y992, Y1045, Y1068, Y1086, Y1148, and Y1173 in TNBC cells. APP also inhibited the phosphorylation of ERK at Y204 and of STAT3 at Y705, implying that APP downregulates the activity of EGFR downstream effectors. Small interfering RNA mediated depletion of EGFR expression prevented the effect of APP in BT549 and MDA-MB-231 cells, indicating that APP specifically targets the EGFR. Furthermore, APP modulated the expression of the proteins involved in cell proliferation and survival. In addition, APP altered the expression of epithelial-mesenchymal transition related proteins and suppressed the invasion of TNBC cells. Hence, we report a novel and specific inhibitor of the EGFR signaling cascade.

17.
J Cheminform ; 7: 51, 2015.
Article in English | MEDLINE | ID: mdl-26500705

ABSTRACT

BACKGROUND: In silico analyses are increasingly being used to support mode-of-action investigations; however many such approaches do not utilise the large amounts of inactive data held in chemogenomic repositories. The objective of this work is concerned with the integration of such bioactivity data in the target prediction of orphan compounds to produce the probability of activity and inactivity for a range of targets. To this end, a novel human bioactivity data set was constructed through the assimilation of over 195 million bioactivity data points deposited in the ChEMBL and PubChem repositories, and the subsequent application of a sphere-exclusion selection algorithm to oversample presumed inactive compounds. RESULTS: A Bernoulli Naïve Bayes algorithm was trained using the data and evaluated using fivefold cross-validation, achieving a mean recall and precision of 67.7 and 63.8 % for active compounds and 99.6 and 99.7 % for inactive compounds, respectively. We show the performances of the models are considerably influenced by the underlying intraclass training similarity, the size of a given class of compounds, and the degree of additional oversampling. The method was also validated using compounds extracted from WOMBAT producing average precision-recall AUC and BEDROC scores of 0.56 and 0.85, respectively. Inactive data points used for this test are based on presumed inactivity, producing an approximated indication of the true extrapolative ability of the models. A distance-based applicability domain analysis was also conducted; indicating an average Tanimoto Coefficient distance of 0.3 or greater between a test and training set can be used to give a global measure of confidence in model predictions. A final comparison to a method trained solely on active data from ChEMBL performed with precision-recall AUC and BEDROC scores of 0.45 and 0.76. 
CONCLUSIONS: The inclusion of inactive data for model training produces models with superior AUC and improved early recognition capabilities, although the results from internal and external validation of the models show differing performance between the breadth of models. The realised target prediction protocol is available at https://github.com/lhm30/PIDGIN. Graphical abstract: The inclusion of large scale negative training data for in silico target prediction improves the precision and recall AUC and BEDROC scores for target models.
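
Note: the distance-based applicability domain analysis above relies on Tanimoto distances between test and training compounds. A minimal sketch follows, assuming fingerprints are represented as sets of on-bit indices. The function names and the nearest-neighbour summary are illustrative and do not reproduce the PIDGIN implementation.

```python
def tanimoto_similarity(fp_a: set, fp_b: set) -> float:
    """Shared on-bits divided by the union of on-bits (0.0 if both fingerprints are empty)."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return len(fp_a & fp_b) / len(union)


def nearest_training_distance(query: set, training: list) -> float:
    """Tanimoto distance (1 - similarity) from a query to its closest training compound."""
    if not training:
        raise ValueError("training set is empty")
    return min(1.0 - tanimoto_similarity(query, fp) for fp in training)


if __name__ == "__main__":
    training_fps = [{1, 2, 3, 8}, {2, 3, 4}, {10, 11}]
    print(nearest_training_distance({1, 2, 3}, training_fps))
```

Averaging this distance over a test set gives one global summary of how far the test compounds sit from the training data.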
