ABSTRACT
Mutations in splicing factor 3B subunit 1 (SF3B1) frequently occur in patients with chronic lymphocytic leukemia (CLL) and myelodysplastic syndromes (MDS). These mutations have different effects on disease prognosis, with a beneficial effect in MDS and a worse prognosis in CLL patients. A full-length transcriptome approach can expand our knowledge on SF3B1 mutation effects on RNA splicing and its contribution to patient survival and treatment options. We applied long-read transcriptome sequencing (LRTS) to 44 MDS and CLL patients, as well as two pairs of isogenic cell lines with and without SF3B1 mutations, and found >60% of novel isoforms. Splicing alterations were largely shared between cancer types and specifically affected the usage of introns and 3' splice sites. Our data highlighted a constrained window at canonical 3' splice sites in which dynamic splice site switches occurred in SF3B1-mutated patients. Using transcriptome-wide RNA binding maps and molecular dynamics simulations, we showed multimodal SF3B1 binding at 3' splice sites and predicted reduced RNA binding at the second binding pocket of SF3B1K700E. Our work presents the hitherto most complete LRTS study of the SF3B1 mutation in CLL and MDS and provides a resource to study aberrant splicing in cancer. Moreover, we showed that different disease prognosis most likely results from the different cell types expanded during carcinogenesis rather than different mechanisms of action of the mutated SF3B1. These results have important implications for understanding the role of SF3B1 mutations in hematological malignancies and other related diseases.
ABSTRACT
The Long-read RNA-Seq Genome Annotation Assessment Project Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. Using different protocols and sequencing platforms, the consortium generated over 427 million long-read sequences from complementary DNA and direct RNA datasets, encompassing human, mouse and manatee species. Developers utilized these data to address challenges in transcript isoform detection, quantification and de novo transcript detection. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. Incorporating additional orthogonal data and replicate samples is advised when aiming to detect rare and novel transcripts or using reference-free approaches. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.
Subject(s)
Gene Expression Profiling , RNA-Seq , Humans , Animals , Mice , RNA-Seq/methods , Gene Expression Profiling/methods , Transcriptome , Sequence Analysis, RNA/methods , Molecular Sequence Annotation/methods
ABSTRACT
The Rab-GTPase-activating protein (RabGAP) TBC1D4 (AS160) represents a key component in the regulation of glucose transport into skeletal muscle and white adipose tissue (WAT) and is therefore crucial during the development of insulin resistance and type 2 diabetes. Increased daily activity has been shown to be associated with improved postprandial hyperglycemia in allele carriers of a loss-of-function variant in the human TBC1D4 gene. Using conventional Tbc1d4-deficient mice (D4KO) fed a high-fat diet, we show that moderate endurance exercise training leads to substantially improved glucose and insulin tolerance and enhanced expression levels of markers for mitochondrial activity and browning in WAT from D4KO animals. Importantly, in vivo and ex vivo analyses of glucose uptake revealed increased glucose clearance in interscapular brown adipose tissue and WAT from trained D4KO mice. Thus, chronic exercise is able to overcome the genetically induced insulin resistance caused by Tbc1d4 depletion. Gene variants in TBC1D4 may be relevant in future precision medicine as determinants of exercise response.
Subject(s)
Adipose Tissue, White , GTPase-Activating Proteins , Insulin Resistance , Mice, Knockout , Physical Conditioning, Animal , Insulin Resistance/genetics , Insulin Resistance/physiology , GTPase-Activating Proteins/genetics , GTPase-Activating Proteins/metabolism , Animals , Mice , Physical Conditioning, Animal/physiology , Adipose Tissue, White/metabolism , Diet, High-Fat , Male , Adipose Tissue, Brown/metabolism , Muscle, Skeletal/metabolism , Glucose/metabolism , Mice, Inbred C57BL
ABSTRACT
Computational drug sensitivity models have the potential to improve therapeutic outcomes by identifying targeted drug components that are tailored to the transcriptomic profile of a given primary tumor. The SMILES representation of molecules that is used by state-of-the-art drug-sensitivity models is not conducive for neural networks to generalize to new drugs, in part because the distance between atoms does not generally correspond to the distance between their representation in the SMILES strings. Graph-attention networks, on the other hand, are high-capacity models that require large training-data volumes which are not available for drug-sensitivity estimation. We develop a modular drug-sensitivity graph-attentional neural network. The modular architecture allows us to separately pre-train the graph encoder and graph-attentional pooling layer on related tasks for which more data are available. We observe that this model outperforms reference models for the use cases of precision oncology and drug discovery; in particular, it is better able to predict the specific interaction between drug and cell line that is not explained by the general cytotoxicity of the drug and the overall survivability of the cell line. The complete source code is available at https://zenodo.org/doi/10.5281/zenodo.8020945. All experiments are based on the publicly available GDSC data.
ABSTRACT
Patients affected by neurofibromatosis type 1 (NF1) frequently show muscle weakness with unknown etiology. Here we show that, in mice, Neurofibromin 1 (Nf1) is not required in muscle fibers, but specifically in early postnatal myogenic progenitors (MPs), where Nf1 loss led to cell cycle exit and differentiation blockade, depleting the MP pool and resulting in reduced myonuclear accretion as well as reduced muscle stem cell numbers. This was caused by precocious induction of stem cell quiescence coupled to metabolic reprogramming of MPs impinging on glycolytic shutdown, which was conserved in muscle fibers. We show that a Mek/Erk/NOS pathway hypersensitizes Nf1-deficient MPs to Notch signaling; consequently, early postnatal Notch pathway inhibition ameliorated premature quiescence, metabolic reprogramming and muscle growth. This reveals an unexpected role of Ras/Mek/Erk signaling supporting postnatal MP quiescence in concert with Notch signaling, which is controlled by Nf1 safeguarding coordinated muscle growth and muscle stem cell pool establishment. Furthermore, our data suggest transmission of metabolic reprogramming across cellular differentiation, affecting fiber metabolism and function in NF1.
Subject(s)
Neurofibromatosis 1 , Neurofibromin 1 , Mice , Humans , Animals , Neurofibromin 1/genetics , Neurofibromin 1/metabolism , Neurofibromatosis 1/genetics , Neurofibromatosis 1/metabolism , Signal Transduction/physiology , MAP Kinase Signaling System , Mitogen-Activated Protein Kinase Kinases/metabolism
ABSTRACT
The Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. The consortium generated over 427 million long-read sequences from cDNA and direct RNA datasets, encompassing human, mouse, and manatee species, using different protocols and sequencing platforms. These data were utilized by developers to address challenges in transcript isoform detection and quantification, as well as de novo transcript isoform identification. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. When aiming to detect rare and novel transcripts or when using reference-free approaches, incorporating additional orthogonal data and replicate samples is advised. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.
ABSTRACT
MOTIVATION: Long-read transcriptome sequencing (LRTS) has the potential to enhance our understanding of alternative splicing. The complexity of this process requires versatile computational tools that can accommodate the various stages of the workflow with maximum flexibility. RESULTS: We introduce IsoTools, a Python-based LRTS analysis framework that offers a wide range of functionality for transcriptome reconstruction and quantification of transcripts. Furthermore, we integrate a graph-based method for identifying alternative splicing events and a statistical approach based on the beta-binomial distribution for detecting differential events. To demonstrate the effectiveness of our methods, we applied IsoTools to PacBio LRTS data of human hepatocytes treated with the histone deacetylase inhibitor valproic acid. Our results indicate that LRTS can provide valuable insights into alternative splicing, particularly in terms of complex and differential splicing patterns, in comparison to short-read RNA-seq. AVAILABILITY AND IMPLEMENTATION: IsoTools is available on GitHub and PyPI, and its documentation, including tutorials, CLI, and API references, can be found at https://isotools.readthedocs.io/.
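As a rough illustration of the beta-binomial idea mentioned in this abstract, the sketch below computes the beta-binomial log-probability of observing k reads supporting one splicing alternative out of n reads covering the event. It is not the IsoTools API: the function names `betabinom_logpmf`, `betabinom_pmf` and `group_loglik`, and the parameters `alpha` and `beta`, are hypothetical and chosen only for this example. A differential test could, under these assumptions, compare the likelihood of one shared (alpha, beta) fit against separate fits per sample group.

```python
from math import exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_logpmf(k: int, n: int, alpha: float, beta: float) -> float:
    """Log-probability of k supporting reads out of n under a beta-binomial model."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)


def betabinom_pmf(k: int, n: int, alpha: float, beta: float) -> float:
    """Probability on the natural scale."""
    return exp(betabinom_logpmf(k, n, alpha, beta))


def group_loglik(counts, alpha: float, beta: float) -> float:
    """Summed log-likelihood of (k, n) pairs from one group of samples."""
    return sum(betabinom_logpmf(k, n, alpha, beta) for k, n in counts)
```

The beta-binomial is used instead of a plain binomial because it allows the underlying inclusion proportion to vary between samples, which absorbs overdispersion across replicates.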
Subject(s)
Alternative Splicing , Transcriptome , Humans , Workflow , Gene Expression Profiling , RNA Splicing , High-Throughput Nucleotide Sequencing , Sequence Analysis, RNA/methods
ABSTRACT
Large-scale databases that report the inhibitory capacities of many combinations of candidate drug compounds and cultivated cancer cell lines have driven the development of preclinical drug-sensitivity models based on machine learning. However, cultivated cell lines have devolved from human cancer cells over years or even decades under selective pressure in culture conditions. Moreover, models that have been trained on in vitro data cannot account for interactions with other types of cells. Drug-response data that are based on patient-derived cell cultures, xenografts, and organoids, on the other hand, are not available in the quantities that are needed to train high-capacity machine-learning models. We found that pre-training deep neural network models of drug sensitivity on in vitro drug-sensitivity databases before fine-tuning the model parameters on patient-derived data improves the models' accuracy and improves the biological plausibility of the features, compared to training only on patient-derived data. From our experiments, we can conclude that pre-trained models outperform models that have been trained on the target domains in the vast majority of cases.
ABSTRACT
BACKGROUND: Epirubicin (EPI) is an important anticancer drug that is well known for its cardiotoxic side effect. Studying epigenetic modifications such as DNA methylation can help to understand the EPI-related toxic mechanisms in cardiac tissue. In this study, we analyzed the DNA methylation profile in a relevant human cell model and inspected the expression of differentially methylated genes at the transcriptome level to understand how changes in DNA methylation could affect gene expression in relation to EPI-induced cardiotoxicity. METHODS: Human cardiac microtissues were exposed to either therapeutic or toxic (IC20) EPI doses for 2 weeks. The DNA and RNA were collected from microtissues in triplicate at 2, 8, 24, 72, 168, 240, and 336 hours of exposure. Methylated DNA immunoprecipitation-sequencing (MeDIP-seq) analysis was used to detect DNA methylation levels in EPI-treated and control samples. The MeDIP-seq data were analyzed and processed using the QSEA package with a recently published workflow. RNA sequencing (RNA-seq) was used to measure global gene expression in the same samples. RESULTS: After processing the MeDIP-seq data, we detected 35, 37, and 15 candidate genes that showed strong methylation alterations in all EPI-treated, EPI therapeutic dose-treated, and EPI toxic dose-treated samples, respectively, compared to control. For several genes, gene expression changed in a manner consistent with regulation by DNA methylation. CONCLUSIONS: The observed DNA methylation modifications provide further insights into the EPI-induced cardiotoxicity. Multiple differentially methylated genes under EPI treatment, such as SMARCA4, PKN1, RGS12, DPP9, NCOR2, SDHA, POLR2A, and AGPAT3, have been implicated in different cardiac dysfunction mechanisms. Together with other differentially methylated genes, these genes can be candidates for further investigations of EPI-related toxic mechanisms.
Data Repository: The data were generated by the HeCaToS project and are available from BioStudies (http://www.ebi.ac.uk/biostudies) under accession numbers S-HECA433 and S-HECA434 for the MeDIP-seq data and S-HECA11 for the RNA-seq data. The R code is available on GitHub (https://github.com/NhanNguyen000/MeDIP).
Subject(s)
Cardiotoxicity , DNA Methylation , Cardiotoxicity/genetics , DNA , DNA Helicases , Epirubicin/toxicity , Humans , Nuclear Proteins , Sequence Analysis, DNA , Transcription Factors
ABSTRACT
Computational drug sensitivity models have the potential to improve therapeutic outcomes by identifying targeted drug components that are likely to achieve the highest efficacy for a cancer cell line at hand at a therapeutic dose. State-of-the-art drug sensitivity models use regression techniques to predict the inhibitory concentration of a drug for a tumor cell line. This regression objective is not directly aligned with either of these principal goals of drug sensitivity models: We argue that drug sensitivity modeling should be seen as a ranking problem with an optimization criterion that quantifies a drug's inhibitory capacity for the cancer cell line at hand relative to its toxicity for healthy cells. We derive an extension to the well-established drug sensitivity regression model PaccMann that employs a ranking loss and focuses on the ratio of inhibitory concentration and therapeutic dosage range. We find that the ranking extension significantly enhances the model's capability to identify the most effective anticancer drugs for unseen tumor cell profiles based on in vitro data.
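To make the contrast between a regression objective and a ranking objective concrete, the following is a minimal sketch of a generic pairwise hinge ranking loss. It is not the loss used in the PaccMann extension, whose exact form is not given in the abstract. The function name `pairwise_hinge_ranking_loss`, the `margin` parameter, and the idea of using a per-drug relevance value (for instance one derived from the ratio of inhibitory concentration to therapeutic dose) are assumptions made for illustration.

```python
def pairwise_hinge_ranking_loss(scores, relevance, margin=1.0):
    """Mean hinge penalty over all drug pairs (i, j) where drug i should rank above drug j.

    scores:    model outputs for the drugs on one cell line (higher = ranked higher)
    relevance: target desirability of each drug (higher = should be ranked higher)
    """
    total = 0.0
    pairs = 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if relevance[i] > relevance[j]:
                # Penalize unless drug i outscores drug j by at least `margin`.
                total += max(0.0, margin - (scores[i] - scores[j]))
                pairs += 1
    return total / pairs if pairs else 0.0
```

Unlike a squared-error objective, this loss is zero whenever the drugs are ordered correctly with sufficient margin, regardless of how far the individual scores are from the measured values.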
ABSTRACT
Mutations in splicing factor genes have a severe impact on the survival of cancer patients. Splicing factor 3b subunit 1 (SF3B1) is one of the most frequently mutated genes in chronic lymphocytic leukemia (CLL); patients carrying these mutations have a poor prognosis. Since the splicing machinery and the epigenome are closely interconnected, we investigated whether these alterations may affect the epigenomes of CLL patients. While an overall hypomethylation during CLL carcinogenesis has been observed, the interplay between the epigenetic stage of the originating B cells and SF3B1 mutations, and the subsequent effect of the mutations on methylation alterations in CLL, have not been investigated. We profiled the genome-wide DNA methylation patterns of 27 CLL patients with and without SF3B1 mutations and identified local decreases in methylation levels in SF3B1mut CLL patients at 67 genomic regions, mostly in proximity to telomeric regions. These differentially methylated regions (DMRs) were enriched in gene bodies of cancer-related signaling genes, e.g., NOTCH1, HTRA3, and BCL9L. In our study, SF3B1 mutations exclusively emerged in two out of three epigenetic stages of the originating B cells. However, not all the DMRs could be associated with the methylation programming of B cells during development, suggesting that mutations in SF3B1 cause additional epigenetic aberrations during carcinogenesis.
Subject(s)
Biomarkers, Tumor/genetics , DNA Methylation , Gene Expression Regulation, Leukemic , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Mutation , Phosphoproteins/genetics , RNA Splicing Factors/genetics , Epigenesis, Genetic , Humans , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Prognosis
ABSTRACT
BACKGROUND: There is a huge body of scientific literature describing the relation between tumor types and anti-cancer drugs. The vast amount of scientific literature makes it impossible for researchers and physicians to extract all relevant information manually. METHODS: In order to cope with the large amount of literature we applied an automated text mining approach to assess the relations between the 30 most frequent cancer types and 270 anti-cancer drugs. We applied two different approaches, a classical text mining based on named entity recognition and an AI-based approach employing word embeddings. The consistency of literature mining results was validated with 3 independent methods: first, using data from FDA approvals, second, using experimentally measured IC-50 cell line data and third, using clinical patient survival data. RESULTS: We demonstrated that the automated text mining was able to successfully assess the relation between cancer types and anti-cancer drugs. All validation methods showed a good correspondence between the results from literature mining and independent confirmatory approaches. The relations between the most frequent cancer types and the drugs employed for their treatment were visualized in a large heatmap. All results are accessible in an interactive web-based knowledge base using the following link: https://knowledgebase.microdiscovery.de/heatmap . CONCLUSIONS: Our approach is able to assess the relations between compounds and cancer types in an automated manner. Both cancer types and compounds could be grouped into different clusters. Researchers can use the interactive knowledge base to inspect the presented results and follow their own research questions, for example the identification of novel indication areas for known drugs.
Subject(s)
Antineoplastic Agents , Neoplasms , Data Mining , Humans , Knowledge Bases , Neoplasms/drug therapy , Publications
ABSTRACT
Genetic predisposition affects the penetrance of tumor-initiating mutations, such as APC mutations that stabilize β-catenin and cause intestinal tumors in mice and humans. However, the mechanisms involved in genetically predisposed penetrance are not well understood. Here, we analyzed tumor multiplicity and gene expression in tumor-prone Apc Min/+ mice on highly variant C57BL/6J (B6) and PWD/Ph (PWD) genetic backgrounds. (B6 × PWD) F1 APC Min offspring mice were largely free of intestinal adenoma, and several chromosome substitution (consomic) strains carrying single PWD chromosomes on the B6 genetic background displayed reduced adenoma numbers. Multiple dosage-dependent modifier loci on PWD chromosome 5 each contributed to tumor suppression. Activation of β-catenin-driven and stem cell-specific gene expression in the presence of Apc Min or following APC loss remained moderate in intestines carrying PWD chromosome 5, suggesting that PWD variants restrict adenoma initiation by controlling stem cell homeostasis. Gene expression of modifier candidates and DNA methylation on chromosome 5 were predominantly cis controlled and largely reflected parental patterns, providing a genetic basis for inheritance of tumor susceptibility. Human SNP variants of several modifier candidates were depleted in colorectal cancer genomes, suggesting that similar mechanisms may also affect the penetrance of cancer driver mutations in humans. Overall, our analysis highlights the strong impact that multiple genetic variants acting in networks can exert on tumor development. SIGNIFICANCE: These findings in mice show that, in addition to accidental mutations, cancer risk is determined by networks of individual gene variants.
Subject(s)
Cell Transformation, Neoplastic/pathology , Colorectal Neoplasms/prevention & control , Genes, APC , Intestines/pathology , Mutation , Wnt Proteins/metabolism , beta Catenin/metabolism , Animals , Cell Transformation, Neoplastic/genetics , Cell Transformation, Neoplastic/metabolism , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Genetic Predisposition to Disease , Male , Mice , Mice, Inbred C57BL , Wnt Proteins/genetics , beta Catenin/genetics
ABSTRACT
Uncovering cellular responses from heterogeneous genomic data is crucial for molecular medicine, in particular for drug safety. This can be realized by integrating the molecular activities in networks of interacting proteins. As proof-of-concept we challenge network modeling with time-resolved proteome, transcriptome and methylome measurements in iPSC-derived human 3D cardiac microtissues to elucidate adverse mechanisms of anthracycline cardiotoxicity measured with four different drugs (doxorubicin, epirubicin, idarubicin and daunorubicin). Dynamic molecular analysis at in vivo drug exposure levels reveals a network of 175 disease-associated proteins and identifies common modules of anthracycline cardiotoxicity in vitro, related to mitochondrial and sarcomere function as well as remodeling of extracellular matrix. These in vitro-identified modules are transferable and are evaluated with biopsies of cardiomyopathy patients. This study, to our knowledge the most comprehensive on anthracycline cardiotoxicity, demonstrates a reproducible workflow for molecular medicine and serves as a template for detecting adverse drug responses from complex omics data.
Subject(s)
Metabolome , Models, Biological , Proteome , Transcriptome , Epigenesis, Genetic , Gene Expression Profiling/methods , Gene Expression Regulation , Gene Regulatory Networks , Humans , Metabolomics/methods , Mitochondria/genetics , Mitochondria/metabolism , Proteomics/methods , Sarcomeres/genetics , Sarcomeres/metabolism , Signal Transduction
ABSTRACT
Current molecular tumor diagnostics encompass panel sequencing to detect mutations, copy number alterations, and rearrangements. However, tumor suppressor genes can also be inactivated by methylation within their promoter region. These epigenetic alterations are so far rarely assessed in the clinical setting. Therefore, we established the AllCap protocol facilitating the combined detection of mutations and DNA methylation at the coding and promoter regions of 342 DNA repair genes in one experiment. We demonstrate the use of the protocol by applying it to ovarian cancer cell lines with different responsiveness to poly(ADP-ribose) polymerase inhibition. BRCA1, ATM, ATR, and EP300 mutations and methylation of the BRCA1 promoter were detected as potential predictors for therapy response. The required amount of input DNA was optimized, and the application to formalin-fixed, paraffin-embedded tissue samples was verified to improve the clinical applicability. Thus, by adding DNA methylation values to panel resequencings, the AllCap assay will add another important level of information to clinical tests and will improve stratification of patients for systemic therapies.
Subject(s)
Cell Survival/drug effects , DNA Methylation/drug effects , Poly(ADP-ribose) Polymerase Inhibitors/pharmacology , BRCA1 Protein/genetics , Cell Line, Tumor , Cell Survival/genetics , DNA Methylation/genetics , DNA Mutational Analysis , E1A-Associated p300 Protein/genetics , Female , Humans , Ovarian Neoplasms/genetics , Poly (ADP-Ribose) Polymerase-1/genetics , Promoter Regions, Genetic/genetics , Temozolomide/pharmacology
ABSTRACT
BACKGROUND: Non-small cell lung cancer (NSCLC) is the most common cause of cancer-related deaths worldwide and is primarily treated with radiation, surgery, and platinum-based drugs like cisplatin and carboplatin. The major challenge in the treatment of NSCLC patients is intrinsic or acquired resistance to chemotherapy. Molecular markers predicting the outcome of the patients are urgently needed. METHODS: Here, we employed patient-derived xenografts (PDXs) to detect predictive methylation biomarkers for platin-based therapies. We used MeDIP-Seq to generate genome-wide DNA methylation profiles of 22 PDXs, their parental primary NSCLC, and their corresponding normal tissues and complemented the data with gene expression analyses of the same tissues. Candidate biomarkers were validated with quantitative methylation-specific PCRs (qMSP) in an independent cohort. RESULTS: Comprehensive analyses revealed that differential methylation patterns are highly similar, enriched in PDXs and lung tumor-specific when comparing differences in methylation between PDXs versus primary NSCLC. We identified a set of 40 candidate regions with methylation correlated to carboplatin response and corresponding inverse gene expression pattern even before therapy. This analysis led to the identification of a promoter CpG island methylation of LDL receptor-related protein 12 (LRP12) associated with increased resistance to carboplatin. Validation in an independent patient cohort (n = 35) confirmed that LRP12 methylation status is predictive for therapeutic response of NSCLC patients to platin therapy with a sensitivity of 80% and a specificity of 84% (p < 0.01). Similarly, we find a shorter survival time for patients with LRP12 hypermethylation in the TCGA data set for NSCLC (lung adenocarcinoma). 
CONCLUSIONS: Using an epigenome-wide sequencing approach, we find differential methylation patterns from primary lung cancer and PDX-derived cancers to be very similar, albeit with a lower degree of differential methylation in primary tumors. We identify LRP12 DNA methylation as a powerful predictive marker for carboplatin resistance. These findings outline a platform for the identification of epigenetic therapy resistance biomarkers based on PDX NSCLC models.
Subject(s)
Biomarkers, Tumor/genetics , Carboplatin/therapeutic use , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , DNA Methylation/genetics , Epigenomics , Low Density Lipoprotein Receptor-Related Protein-1/genetics , Xenograft Model Antitumor Assays , Animals , Biomarkers, Tumor/metabolism , Carboplatin/pharmacology , Disease-Free Survival , Drug Resistance, Neoplasm/genetics , Genes, Tumor Suppressor , Genome, Human , Humans , Low Density Lipoprotein Receptor-Related Protein-1/metabolism , Lung Neoplasms/genetics , Mice, Nude , Promoter Regions, Genetic , Treatment Outcome
ABSTRACT
Genomic sequencing has driven precision-based oncology therapy; however, the genetic drivers of many malignancies remain unknown or non-targetable, so alternative approaches to the identification of therapeutic leads are necessary. Ependymomas are chemotherapy-resistant brain tumours, which, despite genomic sequencing, lack effective molecular targets. Intracranial ependymomas are segregated on the basis of anatomical location (supratentorial region or posterior fossa) and further divided into distinct molecular subgroups that reflect differences in the age of onset, gender predominance and response to therapy. The most common and aggressive subgroup, posterior fossa ependymoma group A (PF-EPN-A), occurs in young children and appears to lack recurrent somatic mutations. Conversely, posterior fossa ependymoma group B (PF-EPN-B) tumours display frequent large-scale copy number gains and losses but have favourable clinical outcomes. More than 70% of supratentorial ependymomas are defined by highly recurrent gene fusions in the NF-κB subunit gene RELA (ST-EPN-RELA), and a smaller number involve fusion of the gene encoding the transcriptional activator YAP1 (ST-EPN-YAP1). Subependymomas, a distinct histologic variant, can also be found within the supratentorial and posterior fossa compartments, and account for the majority of tumours in the molecular subgroups ST-EPN-SE and PF-EPN-SE. Here we describe mapping of active chromatin landscapes in 42 primary ependymomas in two non-overlapping primary ependymoma cohorts, with the goal of identifying essential super-enhancer-associated genes on which tumour cells depend. Enhancer regions revealed putative oncogenes, molecular targets and pathways; inhibition of these targets with small molecule inhibitors or short hairpin RNA diminished the proliferation of patient-derived neurospheres and increased survival in mouse models of ependymomas.
Through profiling of transcriptional enhancers, our study provides a framework for target and drug discovery in other cancers that lack known genetic drivers and are therefore difficult to treat.
Subject(s)
Enhancer Elements, Genetic/genetics , Ependymoma/drug therapy , Ependymoma/genetics , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks/genetics , Molecular Targeted Therapy , Oncogenes/genetics , Transcription Factors/metabolism , Animals , Base Sequence , Ependymoma/classification , Ependymoma/pathology , Female , Humans , Mice , Precision Medicine , RNA Interference , Xenograft Model Antitumor Assays
ABSTRACT
Genome-wide enrichment of methylated DNA followed by sequencing (MeDIP-seq) offers a reasonable compromise between experimental costs and genomic coverage. However, the computational analysis of these experiments is complex, and quantification of the enrichment signals in terms of absolute levels of methylation requires specific transformation. In this work, we present QSEA, Quantitative Sequence Enrichment Analysis, a comprehensive workflow for the modelling and subsequent quantification of MeDIP-seq data. As the central part of the workflow we have developed a Bayesian statistical model that transforms the enrichment read counts to absolute levels of methylation and, thus, enhances interpretability and facilitates comparison with other methylation assays. We suggest several calibration strategies for the critical parameters of the model, either using additional data or fairly general assumptions. By comparing the results with bisulfite sequencing (BS) validation data, we show the improvement of QSEA over existing methods. Additionally, we generated a clinically relevant benchmark data set consisting of methylation enrichment experiments (MeDIP-seq), BS-based validation experiments (Methyl-seq) as well as gene expression experiments (RNA-seq) derived from non-small cell lung cancer patients, and show that the workflow retrieves well-known lung tumour methylation markers that are causative for gene expression changes, demonstrating the applicability of QSEA for clinical studies. QSEA is implemented in R and available from the Bioconductor repository 3.4 (www.bioconductor.org/packages/qsea).
Subject(s)
DNA Methylation , Genomics/methods , Sequence Analysis, DNA/methods , Animals , Bayes Theorem , Gene Expression Regulation , Humans , Lung Neoplasms/genetics , Mice , Promoter Regions, Genetic , Sulfites , Workflow
ABSTRACT
ConsensusPathDB consists of a comprehensive collection of human (as well as mouse and yeast) molecular interaction data integrated from 32 different public repositories and a web interface featuring a set of computational methods and visualization tools to explore these data. This protocol describes the use of ConsensusPathDB (http://consensuspathdb.org) with respect to the functional and network-based characterization of biomolecules (genes, proteins and metabolites) that are submitted to the system either as a priority list or together with associated experimental data such as RNA-seq. The tool reports interaction network modules, biochemical pathways and functional information that are significantly enriched by the user's input, applying computational methods for statistical over-representation, enrichment and graph analysis. The results of this protocol can be observed within a few minutes, even with genome-wide data. The resulting network associations can be used to interpret high-throughput data mechanistically, to characterize and prioritize biomarkers, to integrate different omics levels, to design follow-up functional assay experiments and to generate topology for kinetic models at different scales.
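The statistical over-representation analysis mentioned in this abstract is commonly based on the hypergeometric distribution. The sketch below shows that generic calculation and does not reproduce ConsensusPathDB's own implementation or interface. The function name `hypergeom_sf` and its parameter names are hypothetical.

```python
from math import comb


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) for a hypergeometric variable X.

    N: size of the background (e.g. all annotated genes)
    K: background genes that belong to the pathway
    n: size of the submitted gene list
    k: submitted genes that belong to the pathway
    """
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)
```

A small returned probability suggests that the overlap between the submitted list and the pathway is larger than expected by chance. In practice the values for many pathways would also need a multiple-testing correction.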
Subject(s)
Genomics/methods , Metabolic Networks and Pathways , Protein Interaction Maps , Proteins/genetics , Proteins/metabolism , Algorithms , Animals , Databases, Genetic , Gene Ontology , Genome , Humans , Internet , Metabolomics/methods , Mice , Software , User-Computer Interface , Yeasts
ABSTRACT
DNA enrichment followed by sequencing (DNA-IP seq) is a versatile tool in molecular biology with a wide variety of applications. Computational analysis of differential DNA enrichment between conditions is important for identifying epigenetic alterations in disease compared to healthy controls and for revealing dynamic epigenetic modifications throughout normal and distorted cell differentiation and development. We present a protocol for genome-wide comparative analysis of DNA-IP sequencing data to identify statistically significant differential sequencing coverage between two conditions by considering variation across replicates. The protocol provides a detailed description for the comparative analysis of DNA-IP sequencing data including basic data processing, quality controls, and identification of differential enrichment using the Bioconductor package "MEDIPS".