ABSTRACT
Explaining predictions for drug repositioning with biological knowledge graphs is a challenging problem. Graph completion methods using symbolic reasoning predict drug treatments and associated rules to generate evidence representing the therapeutic basis of the drug. Yet the vast number of generated paths that are biologically irrelevant or not mechanistically meaningful within the context of disease biology can limit utility. We use a reinforcement learning-based knowledge graph completion model combined with an automatic filtering approach that produces the most relevant rules and biological paths explaining the predicted drug's therapeutic connection to the disease. In this work, we validate the approach against preclinical experimental data for Fragile X syndrome, demonstrating strong correlation between automatically extracted paths and experimentally derived transcriptional changes of selected genes and pathways for the predicted drugs Sulindac and Ibudilast. Additionally, we show that it reduces the number of generated paths in two case studies: by 85% for cystic fibrosis and by 95% for Parkinson's disease.
Subject(s)
Drug Discovery , Drug Repositioning , Parkinson Disease , Humans , Drug Discovery/methods , Parkinson Disease/drug therapy , Parkinson Disease/genetics , Drug Repositioning/methods , Cystic Fibrosis/drug therapy , Cystic Fibrosis/genetics , Sulindac/pharmacology , Sulindac/therapeutic use , Animals , Algorithms
ABSTRACT
The molecular chaperone heat shock protein 90 (HSP90) works in concert with co-chaperones to stabilize its client proteins, which include multiple drivers of oncogenesis and malignant progression. Pharmacologic inhibitors of HSP90 have been observed to exert a wide range of effects on the proteome, including depletion of client proteins, induction of heat shock proteins, dissociation of co-chaperones from HSP90, disruption of client protein signaling networks, and recruitment of the protein ubiquitylation and degradation machinery, suggesting widespread remodeling of cellular protein complexes. However, proteomics studies to date have focused on inhibitor-induced changes in total protein levels, often overlooking protein complex alterations. Here, we use size-exclusion chromatography in combination with mass spectrometry (SEC-MS) to characterize the early changes in native protein complexes following treatment with the HSP90 inhibitor tanespimycin (17-AAG) for 8 h in the HT29 colon adenocarcinoma cell line. After confirming the signature cellular response to HSP90 inhibition (e.g., induction of heat shock proteins, decreased total levels of client proteins), we were surprised to find only modest perturbations to the global distribution of protein elution profiles in inhibitor-treated HT29 cells at this relatively early time-point. Similarly, co-chaperones that co-eluted with HSP90 displayed no clear difference between control and treated conditions. However, two distinct analysis strategies identified multiple inhibitor-induced changes, including known and unknown components of the HSP90-dependent proteome. We validate two of these, the actin-binding protein Anillin and the mitochondrial isocitrate dehydrogenase 3 complex, as novel HSP90 inhibitor-modulated proteins. We present this dataset as a resource for the HSP90, proteostasis, and cancer communities (https://www.bioinformatics.babraham.ac.uk/shiny/HSP90/SEC-MS/), laying the groundwork for future mechanistic and therapeutic studies related to HSP90 pharmacology. Data are available via ProteomeXchange with identifier PXD033459.
Subject(s)
Adenocarcinoma , Antineoplastic Agents , Colonic Neoplasms , Humans , Proteome/metabolism , Adenocarcinoma/drug therapy , Colonic Neoplasms/drug therapy , HSP90 Heat-Shock Proteins , Molecular Chaperones , Antineoplastic Agents/pharmacology , Mass Spectrometry , Chromatography, Gel
ABSTRACT
We hypothesize that the study of acute protein perturbation in signal transduction by targeted anticancer drugs can predict drug sensitivity of these agents used as single agents and rational combination therapy. We assayed dynamic changes in 52 phosphoproteins caused by an acute exposure (1 hour) to clinically relevant concentrations of seven targeted anticancer drugs in 35 non-small cell lung cancer (NSCLC) cell lines and 16 samples of NSCLC cells isolated from pleural effusions. We studied drug sensitivities across 35 cell lines and synergy of combinations of all drugs in six cell lines (252 combinations). We developed orthogonal machine-learning approaches to predict drug response and rational combination therapy. Our methods predicted the most and least sensitive quartiles of drug sensitivity with an AUC of 0.79 and 0.78, respectively, whereas predictions based on mutations in three genes commonly known to predict response to the drug studied, for example, EGFR, PIK3CA, and KRAS, did not predict sensitivity (AUC of 0.5 across all quartiles). The machine-learning predictions of combinations that were compared with experimentally generated data showed a bias to the highest quartile of Bliss synergy scores (P = 0.0243). We confirmed feasibility of running such assays on 16 patient samples of freshly isolated NSCLC cells from pleural effusions. We have provided proof of concept for novel methods of using acute ex vivo exposure of cancer cells to targeted anticancer drugs to predict response as single agents or combinations. These approaches could complement current approaches using gene mutations/amplifications/rearrangements as biomarkers and demonstrate the utility of proteomics data to inform treatment selection in the clinic.
Subject(s)
Antineoplastic Agents , Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Pleural Effusion , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Artificial Intelligence , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/metabolism , Humans , Lung Neoplasms/drug therapy , Lung Neoplasms/genetics , Lung Neoplasms/metabolism , Mutation
ABSTRACT
canSAR (http://cansar.icr.ac.uk) is the largest, public, freely available, integrative translational research and drug discovery knowledgebase for oncology. canSAR integrates vast multidisciplinary data from across genomic, protein, pharmacological, drug and chemical data with structural biology, protein networks and more. It also provides unique data, curation and annotation and crucially, AI-informed target assessment for drug discovery. canSAR is widely used internationally by academia and industry. Here we describe significant developments and enhancements to the data, web interface and infrastructure of canSAR in the form of the new implementation of the system: canSARblack. We demonstrate new functionality in aiding translation hypothesis generation and experimental design, and show how canSAR can be adapted and utilised outside oncology.
Subject(s)
Computational Biology/methods , Databases, Genetic , Drug Discovery/methods , Knowledge Bases , Neoplasms/genetics , Translational Research, Biomedical/methods , Antineoplastic Agents/chemistry , Antineoplastic Agents/therapeutic use , Data Mining/methods , Genomics/methods , Humans , Internet , Medical Oncology/methods , Molecular Structure , Neoplasms/metabolism , Proteomics/methods , User-Computer Interface
ABSTRACT
PURPOSE: Martsolf (MS) and Warburg micro syndromes (WARBM) are rare autosomal recessive inherited allelic disorders, which share similar clinical features including microcephaly, intellectual disability, brain malformations, ocular abnormalities, and spasticity. Here, we reveal the functional effects of novel mutations in RAB3GAP1 in a Turkish female patient with MS and two siblings with WARBM. We also present a review of MS patients as well as all reported RAB3GAP1 pathogenic mutations in the literature. METHODS: We present a female with an MS phenotype and two siblings with WARBM who have more severe phenotypes. We utilized whole-exome sequencing to identify the molecular basis of these syndromes and confirmed suspected variants by Sanger sequencing. Quantitative (q) RT-PCR analysis was carried out to reveal the effect of the novel splice site mutation detected in the MS patient. RESULTS: We found a novel homozygous c.2607-1G>C splice site mutation in intron 22 of RAB3GAP1 in the MS patient and a novel homozygous c.2187_2188delinsCT, p.(Met729_Lys730delinsIleTer) mutation in exon 19 of RAB3GAP1 in the WARBM patients. We showed exon skipping in the MS patient by Sanger sequencing and gel electrophoresis. qRT-PCR analysis demonstrated reduced expression of RAB3GAP1 in the patient with the c.2607-1G>C splice site mutation compared to a healthy control individual. CONCLUSION: We have studied two novel RAB3GAP1 mutations in two different phenotypes: an MS-associated novel splice site mutation and a WARBM1-associated novel deletion-insertion mutation. Our findings suggest that this splice site mutation is responsible for the milder phenotype and that the deletion-insertion mutation presented here is associated with the severe phenotype.
Subject(s)
Abnormalities, Multiple/genetics , Abnormalities, Multiple/pathology , Alternative Splicing , Cataract/congenital , Cornea/abnormalities , Hypogonadism/genetics , Hypogonadism/pathology , Intellectual Disability/genetics , Intellectual Disability/pathology , Microcephaly/genetics , Microcephaly/pathology , Mutation , Optic Atrophy/genetics , Optic Atrophy/pathology , rab3 GTP-Binding Proteins/genetics , Cataract/genetics , Cataract/pathology , Child , Cornea/pathology , Female , Homozygote , Humans , INDEL Mutation , Male , Pedigree , Phenotype , Siblings , Turkey
ABSTRACT
canSAR (http://cansar.icr.ac.uk) is a public, freely available, integrative translational research and drug discovery knowledgebase. canSAR informs researchers to help solve key bottlenecks in cancer translation and drug discovery. It integrates genomic, protein, pharmacological, drug and chemical data with structural biology, protein networks and unique, comprehensive and orthogonal 'druggability' assessments. canSAR is widely used internationally by academia and industry. Here we describe major enhancements to canSAR including new and expanded data. We also describe the first components of canSARblack, an advanced, responsive, multi-device compatible redesign of canSAR with a question-led interface.
Subject(s)
Antineoplastic Agents , Databases, Pharmaceutical , Drug Discovery , Knowledge Bases , Antineoplastic Agents/chemistry , Antineoplastic Agents/therapeutic use , Humans , Neoplasms/drug therapy , Neoplasms/genetics , Protein Conformation , Protein Interaction Mapping , Translational Research, Biomedical , User-Computer Interface
ABSTRACT
Demonstrating intracellular protein target engagement is an essential step in the development and progression of new chemical probes and potential small molecule therapeutics. However, this can be particularly challenging for poorly studied and noncatalytic proteins, as robust proximal biomarkers are rarely known. To confirm that our recently discovered chemical probe 1 (CCT251236) binds the putative transcription factor regulator pirin in living cells, we developed a heterobifunctional protein degradation probe. Focusing on linker design and physicochemical properties, we generated a highly active probe 16 (CCT367766) in only three iterations, validating our efficient strategy for degradation probe design against nonvalidated protein targets.
Subject(s)
Prion Proteins/metabolism , Proteolysis/drug effects , Cell Line , Cell Survival , Models, Molecular , Protein Conformation
ABSTRACT
DNA methylation is an important epigenetic phenomenon that plays a key role in the regulation of expression. Most of the studies on the topic of methylation's role in cancer mechanisms include analyses based on differential methylation, with the integration of expression information as supporting evidence. In the present study, we sought to identify methylation-driven patterns by also integrating protein-protein interaction information. We performed integrative analyses of DNA methylation, expression, SNP and copy number data on paired samples from six different cancer types. As a result, we found that genes that show a methylation change larger than 32.2% may influence cancer-related genes via fewer interaction steps and with much higher percentages compared with genes showing a methylation change less than 32.2%. Additionally, we investigated whether there were shared cancer mechanisms among different cancer types. Specifically, five cancer types shared a change in AGTR1 and IGF1 genes, which implies that there may be similar underlying disease mechanisms among these cancers. Additionally, when the focus was placed on distinctly altered genes within each cancer type, we identified various cancer-specific genes that are also supported in the literature and may play crucial roles as therapeutic targets. Overall, our novel graph-based approach for identifying methylation-driven patterns will improve our understanding of the effects of methylation on cancer progression and lead to improved knowledge of cancer etiology.
Subject(s)
DNA Methylation , Epigenesis, Genetic , Gene Expression Regulation, Neoplastic , Neoplasms/genetics , Computational Biology/methods , CpG Islands , Databases, Genetic , Genes, Tumor Suppressor , Humans , Metabolic Networks and Pathways , Neoplasms/metabolism , Oncogenes , Protein Interaction Mapping , Transcriptome
ABSTRACT
With ongoing developments in technology, changes in DNA methylation levels have become widely used to study cancer biology. Previous studies report that DNA methylation affects gene expression in a direct manner, most probably by blocking gene regulatory regions. In this study, we have studied the interplay between methylation and expression to improve our knowledge of cancer aetiology. For this purpose, we have investigated which genomic regions are of higher importance; the first exon, 5'UTR and 200 bp near the transcription start sites are proposed as being more crucial than other genomic regions. Furthermore, we have searched for a valid methylation level change threshold, and as a result, a 25% methylation change in the previously determined genomic regions showed the highest inverse correlation with expression data. As a final step, we have examined the commonly affected genes and pathways by integrating methylation and expression information. Remarkably, the GPR115 gene and ErbB signalling pathway were found to be significantly altered for all cancer types in our analysis. Overall, combining methylation and expression information and identifying commonly affected genes and pathways in a variety of cancer types revealed new insights into cancer disease mechanisms. Moreover, compared to previous methylation-based studies, we have identified more important genomic regions and have defined a methylation change threshold level in order to obtain more reliable results. In addition to the novel analysis framework that involves the analysis of four different cancer types, our study exposes essential information regarding the contribution of methylation changes and their impact on cancer disease biology, which may facilitate the identification of new drug targets.
Subject(s)
DNA Methylation/genetics , Epigenesis, Genetic , Molecular Targeted Therapy , Neoplasms/genetics , ErbB Receptors/biosynthesis , ErbB Receptors/genetics , Gene Expression Regulation, Neoplastic , Humans , Neoplasms/drug therapy , Neoplasms/pathology , Promoter Regions, Genetic , Receptors, G-Protein-Coupled/biosynthesis , Receptors, G-Protein-Coupled/genetics , Signal Transduction/genetics , Transcription Initiation Site
ABSTRACT
BACKGROUND: Recently, a wide range of diseases have been associated with changes in DNA methylation levels, which play a vital role in gene expression regulation. With ongoing developments in technology, attempts to understand disease mechanisms have benefited greatly from epigenetics and transcriptomics studies. In this work, we have used expression and methylation data of thyroid carcinoma as a case study and explored how to optimally incorporate expression and methylation information into the disease study when both data types are available. Moreover, we have also investigated whether there are important post-translational modifiers which could drive critical insights into thyroid cancer genetics. RESULTS: In this study, we have conducted a threshold analysis for varying methylation levels to identify whether setting a methylation level threshold increases the performance of functional enrichment. Moreover, in order to decide on the best-performing analysis strategy, we have performed a data integration analysis comparing 10 different analysis strategies. As a result, combining methylation with expression and using genes with more than 15% methylation change led to an optimal detection rate of thyroid cancer-associated pathways in the top 20 functional enrichment results. Furthermore, pooling the data from different experiments increased analysis confidence by improving the data range. Consequently, we have identified 207 transcription factors and 245 post-translational modifiers with more than 15% methylation change which may be important in understanding the underlying mechanisms of thyroid cancer. CONCLUSION: While only expression or only methylation information would not reveal both primary and secondary mechanisms involved in the disease state, combining expression and methylation led to a better detection of thyroid cancer-related genes and pathways that are found in the recent literature. Moreover, focusing on genes that have a certain level of methylation change improved the functional enrichment results, revealing the core pathways involved in disease development, such as endocytosis, apoptosis, glutamatergic synapse, MAPK, ErbB, TGF-beta and Toll-like receptor pathways. Overall, in addition to a novel analysis framework, our study reveals important thyroid cancer-related mechanisms and secondary molecular alterations, and contributes to better knowledge of thyroid cancer aetiology.
Subject(s)
DNA Methylation , Gene Regulatory Networks , Thyroid Neoplasms/genetics , Transcriptome , Computational Biology/methods , Databases, Genetic , Gene Expression Regulation, Neoplastic , Humans , Sequence Analysis, RNA , Transcription Factors/genetics
ABSTRACT
We report an association between a new causative gene and spastic paraplegia, which is a genetically heterogeneous disorder. Clinical phenotyping of one consanguineous family was followed by combined homozygosity mapping and whole-exome sequencing analysis. Three patients from the same family shared common features of progressive complicated spastic paraplegia. They shared a single homozygous stretch area on chromosome 6. Whole-exome sequencing revealed a homozygous mutation (c.853_871del19) in the gene coding the kinesin light chain 4 protein (KLC4). Meanwhile, the unaffected parents and two siblings were heterozygous and one sibling was homozygous wild type. The 19 bp deletion in exon 6 generates a stop codon and thus a truncated messenger RNA and protein. The association of a KLC4 mutation with spastic paraplegia identifies a new locus for the disease.
Subject(s)
Base Sequence , Exons , Genes, Recessive , Genetic Diseases, Inborn/genetics , Microtubule-Associated Proteins/genetics , Paraplegia/genetics , Quantitative Trait, Heritable , Sequence Deletion , Codon, Terminator/genetics , Exome , Female , Humans , Kinesins , Male
ABSTRACT
UNLABELLED: Due to the big data produced by next-generation sequencing studies, there is an evident need for methods to extract the valuable information gathered from these experiments. In this work, we propose GeneCOST, a novel scoring-based method to evaluate every gene for its disease association. Without any prior filtering or prior knowledge, we assign a disease likelihood score to each gene according to its variations. Then, we rank all genes based on frequency, conservation, pedigree and detailed variation information to find the causative reason for the disease state. We demonstrate the usage of GeneCOST with public and real-life Mendelian disease cases including recessive, dominant, compound heterozygous and sporadic models. As a result, we were able to identify the causative reason behind the disease state in the top rankings of our list, showing that this novel prioritization framework provides a powerful environment for analysis in genetic disease studies as an alternative to filtering-based approaches. AVAILABILITY AND IMPLEMENTATION: GeneCOST software is freely available at www.igbam.bilgem.tubitak.gov.tr/en/softwares/genecost-en/index.html. CONTACT: buozer@gmail.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Disease/genetics , Genetic Association Studies/methods , Software , Family , Humans , Mutation/genetics
ABSTRACT
Recently, the rapid advance in genome sequencing technology has led to the production of a huge amount of sensitive genomic data. However, the increasing number of genetic tests poses a serious privacy challenge, as genomic data is the ultimate source of identity for humans. Privacy threats and possible solutions regarding undesired access to genomic data have lately been discussed; however, it is challenging to apply the proposed solutions to real-life problems due to the complex nature of security definitions. In this review, we have categorized pre-existing problems and corresponding solutions in a more understandable and convenient way. We have also included the open privacy problems that come with each genomic data processing procedure. We believe our classification of genome-associated privacy problems will pave the way for linking real-life problems with previously proposed methods.