Search | VHL Regional Portal

1.

WNT signalling control by KDM5C during development affects cognition.

Karwacki-Neisius, Violetta; Jang, Ahram; Cukuroglu, Engin; Tai, Albert; Jiao, Alan; Predes, Danilo; Yoon, Joon; Brookes, Emily; Chen, Jiekai; Iberg, Aimee; Halbritter, Florian; Õunap, Katrin; Gecz, Jozef; Schlaeger, Thorsten M; Ho Sui, Shannan; Göke, Jonathan; He, Xi; Lehtinen, Maria K; Pomeroy, Scott L; Shi, Yang.

Nature ; 627(8004): 594-603, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38383780

ABSTRACT

Although KDM5C is one of the most frequently mutated genes in X-linked intellectual disability1, the exact mechanisms that lead to cognitive impairment remain unknown. Here we use human patient-derived induced pluripotent stem cells and Kdm5c knockout mice to conduct cellular, transcriptomic, chromatin and behavioural studies. KDM5C is identified as a safeguard to ensure that neurodevelopment occurs at an appropriate timescale, the disruption of which leads to intellectual disability. Specifically, there is a developmental window during which KDM5C directly controls WNT output to regulate the timely transition of primary to intermediate progenitor cells and consequently neurogenesis. Treatment with WNT signalling modulators at specific times reveals that only a transient alteration of the canonical WNT signalling pathway is sufficient to rescue the transcriptomic and chromatin landscapes in patient-derived cells and to induce these changes in wild-type cells. Notably, WNT inhibition during this developmental period also rescues behavioural changes of Kdm5c knockout mice. Conversely, a single injection of WNT3A into the brains of wild-type embryonic mice causes anxiety and memory alterations. Our work identifies KDM5C as a crucial sentinel for neurodevelopment and sheds new light on KDM5C mutation-associated intellectual disability. The results also increase our general understanding of memory and anxiety formation, with the identification of WNT functioning in a transient nature to affect long-lasting cognitive function.

Subject(s)

Cognition , Embryo, Mammalian , Embryonic Development , Histone Demethylases , Wnt Signaling Pathway , Animals , Humans , Mice , Anxiety , Chromatin/drug effects , Chromatin/genetics , Chromatin/metabolism , Embryo, Mammalian/metabolism , Gene Expression Profiling , Histone Demethylases/genetics , Histone Demethylases/metabolism , Induced Pluripotent Stem Cells/cytology , Induced Pluripotent Stem Cells/metabolism , Intellectual Disability/genetics , Memory , Mice, Knockout , Mutation , Neurogenesis/genetics , Wnt Signaling Pathway/drug effects

2.

Flexiplex: a versatile demultiplexer and search tool for omics data.

Cheng, Oliver; Ling, Min Hao; Wang, Changqing; Wu, Shuyi; Ritchie, Matthew E; Göke, Jonathan; Amin, Noorul; Davidson, Nadia M.

Bioinformatics ; 40(3), 2024 Mar 04.

Article in English | MEDLINE | ID: mdl-38379414

ABSTRACT

MOTIVATION: The process of analyzing high-throughput sequencing data often requires the identification and extraction of specific target sequences. This could include tasks such as identifying cellular barcodes and UMIs in single-cell data, and specific genetic variants for genotyping. However, existing tools that perform these functions are often task-specific, such as only demultiplexing barcodes for a dedicated type of experiment, or are not tolerant to noise in the sequencing data. RESULTS: To overcome these limitations, we developed Flexiplex, a versatile and fast sequence searching and demultiplexing tool for omics data, which is based on the Levenshtein distance and thus allows imperfect matches. We demonstrate Flexiplex's application on three use cases: identifying cell-line-specific sequences in Illumina short-read single-cell data, and discovering and demultiplexing cellular barcodes from noisy long-read single-cell RNA-seq data. We show that Flexiplex achieves an excellent balance of accuracy and computational efficiency compared to leading task-specific tools. AVAILABILITY AND IMPLEMENTATION: Flexiplex is available at https://davidsongroup.github.io/flexiplex/.

Subject(s)

Search Engine , Software , Sequence Analysis, DNA , High-Throughput Nucleotide Sequencing , Electronic Data Processing
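
The Flexiplex abstract above says the tool is based on the Levenshtein distance so that imperfect matches are allowed. The following is a minimal sketch of that general idea, not Flexiplex's actual implementation: the function names `best_match` and `assign_barcode`, the `max_edits` parameter, and the tie-handling rule are all hypothetical choices made for illustration.

```python
def best_match(pattern: str, read: str) -> tuple[int, int]:
    """Return (edit_distance, end_index) of the best approximate
    occurrence of `pattern` anywhere inside `read`.

    Illustrative only: a textbook semi-global Levenshtein search,
    not code taken from Flexiplex.
    """
    # prev[i] = fewest edits aligning pattern[:i] to a substring of the
    # read that ends at the previous read position.
    prev = list(range(len(pattern) + 1))
    best = (prev[-1], 0)
    for j, base in enumerate(read, start=1):
        curr = [0]  # an occurrence may start at any read position for free
        for i, p in enumerate(pattern, start=1):
            curr.append(min(
                prev[i] + 1,                # extra base in the read
                curr[i - 1] + 1,            # pattern base missing from the read
                prev[i - 1] + (p != base),  # match or substitution
            ))
        if curr[-1] < best[0]:
            best = (curr[-1], j)
        prev = curr
    return best


def assign_barcode(read: str, barcodes: list[str], max_edits: int = 1):
    """Assign `read` to the closest barcode, or None if no barcode is
    within `max_edits` edits or the two closest barcodes tie (hypothetical rule)."""
    hits = sorted((best_match(bc, read)[0], bc) for bc in barcodes)
    if not hits or hits[0][0] > max_edits:
        return None
    if len(hits) > 1 and hits[1][0] == hits[0][0]:
        return None  # ambiguous assignment
    return hits[0][1]
```

Allowing the match to start anywhere in the read (the zero in the first cell of each column) is what turns a plain edit-distance computation into a search, so a barcode can be found even when its position in a noisy read is not known in advance.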

3.

Systematic assessment of long-read RNA-seq methods for transcript identification and quantification.

Pardo-Palacios, Francisco J; Wang, Dingjie; Reese, Fairlie; Diekhans, Mark; Carbonell-Sala, Sílvia; Williams, Brian; Loveland, Jane E; De María, Maite; Adams, Matthew S; Balderrama-Gutierrez, Gabriela; Behera, Amit K; Gonzalez, Jose M; Hunt, Toby; Lagarde, Julien; Liang, Cindy E; Li, Haoran; Jerryd Meade, Marcus; Moraga Amador, David A; Prjibelski, Andrey D; Birol, Inanc; Bostan, Hamed; Brooks, Ashley M; Hasan Çelik, Muhammed; Chen, Ying; Du, Mei R M; Felton, Colette; Göke, Jonathan; Hafezqorani, Saber; Herwig, Ralf; Kawaji, Hideya; Lee, Joseph; Liang Li, Jian; Lienhard, Matthias; Mikheenko, Alla; Mulligan, Dennis; Ming Nip, Ka; Pertea, Mihaela; Ritchie, Matthew E; Sim, Andre D; Tang, Alison D; Kei Wan, Yuk; Wang, Changqing; Wong, Brandon Y; Yang, Chen; Barnes, If; Berry, Andrew; Capella, Salvador; Dhillon, Namrita; Fernandez-Gonzalez, Jose M; Ferrández-Peral, Luis.

bioRxiv ; 2023 Jul 27.

Article in English | MEDLINE | ID: mdl-37546854

ABSTRACT

The Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. The consortium generated over 427 million long-read sequences from cDNA and direct RNA datasets, encompassing human, mouse, and manatee species, using different protocols and sequencing platforms. These data were utilized by developers to address challenges in transcript isoform detection and quantification, as well as de novo transcript isoform identification. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. When aiming to detect rare and novel transcripts or when using reference-free approaches, incorporating additional orthogonal data and replicate samples is advised. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.

4.

Context-aware transcript quantification from long-read RNA-seq data with Bambu.

Chen, Ying; Sim, Andre; Wan, Yuk Kei; Yeo, Keith; Lee, Joseph Jing Xian; Ling, Min Hao; Love, Michael I; Göke, Jonathan.

Nat Methods ; 20(8): 1187-1195, 2023 08.

Article in English | MEDLINE | ID: mdl-37308696

ABSTRACT

Most approaches to transcript quantification rely on fixed reference annotations; however, the transcriptome is dynamic and, depending on the context, such static annotations contain inactive isoforms for some genes, whereas they are incomplete for others. Here we present Bambu, a method that performs machine-learning-based transcript discovery to enable quantification specific to the context of interest using long-read RNA-sequencing. To identify novel transcripts, Bambu estimates the novel discovery rate, which replaces arbitrary per-sample thresholds with a single, interpretable, precision-calibrated parameter. Bambu retains the full-length and unique read counts, enabling accurate quantification in the presence of inactive isoforms. Compared to existing methods for transcript discovery, Bambu achieves greater precision without sacrificing sensitivity. We show that context-aware annotations improve quantification for both novel and known transcripts. We apply Bambu to quantify isoforms from repetitive HERVH-LTR7 retrotransposons in human embryonic stem cells, demonstrating the ability for context-specific transcript expression analysis.

Subject(s)

Gene Expression Profiling , Transcriptome , Humans , RNA-Seq , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Protein Isoforms/genetics

5.

Author Correction: Genomic basis for RNA alterations in cancer.

Calabrese, Claudia; Davidson, Natalie R; Demircioglu, Deniz; Fonseca, Nuno A; He, Yao; Kahles, André; Lehmann, Kjong-Van; Liu, Fenglin; Shiraishi, Yuichi; Soulette, Cameron M; Urban, Lara; Greger, Liliana; Li, Siliang; Liu, Dongbing; Perry, Marc D; Xiang, Qian; Zhang, Fan; Zhang, Junjun; Bailey, Peter; Erkek, Serap; Hoadley, Katherine A; Hou, Yong; Huska, Matthew R; Kilpinen, Helena; Korbel, Jan O; Marin, Maximillian G; Markowski, Julia; Nandi, Tannistha; Pan-Hammarström, Qiang; Pedamallu, Chandra Sekhar; Siebert, Reiner; Stark, Stefan G; Su, Hong; Tan, Patrick; Waszak, Sebastian M; Yung, Christina; Zhu, Shida; Awadalla, Philip; Creighton, Chad J; Meyerson, Matthew; Ouellette, B F Francis; Wu, Kui; Yang, Huanming; Brazma, Alvis; Brooks, Angela N; Göke, Jonathan; Rätsch, Gunnar; Schwarz, Roland F; Stegle, Oliver; Zhang, Zemin.

Nature ; 614(7948): E37, 2023 Feb.

Article in English | MEDLINE | ID: mdl-36697831

6.

Detection of m6A from direct RNA sequencing using a multiple instance learning framework.

Hendra, Christopher; Pratanwanich, Ploy N; Wan, Yuk Kei; Goh, W S Sho; Thiery, Alexandre; Göke, Jonathan.

Nat Methods ; 19(12): 1590-1598, 2022 12.

Article in English | MEDLINE | ID: mdl-36357692

ABSTRACT

RNA modifications such as m6A methylation form an additional layer of complexity in the transcriptome. Nanopore direct RNA sequencing can capture this information in the raw current signal for each RNA molecule, enabling the detection of RNA modifications using supervised machine learning. However, experimental approaches provide only site-level training data, whereas the modification status for each single RNA molecule is missing. Here we present m6Anet, a neural-network-based method that leverages the multiple instance learning framework to specifically handle missing read-level modification labels in site-level training data. m6Anet outperforms existing computational methods, shows accuracy similar to that of experimental approaches, and generalizes with high accuracy to different cell lines and species without retraining model parameters. In addition, we demonstrate that m6Anet captures the underlying read-level stoichiometry, which can be used to approximate differences in modification rates. Overall, m6Anet offers a tool to capture the transcriptome-wide identification and quantification of m6A from a single run of direct RNA sequencing.

Subject(s)

Nanopore Sequencing , RNA , RNA/genetics , RNA/metabolism , Sequence Analysis, RNA/methods , Methylation , Transcriptome
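
The m6Anet abstract above describes multiple instance learning, where labels exist only per site while predictions are made per read. One common way to connect the two levels is a "noisy-OR" pooling step, sketched below. This is an illustration under stated assumptions, not the published m6Anet code: the function names, the 0.5 cut-off, and the use of a thresholded fraction as a stoichiometry proxy are hypothetical.

```python
import math


def site_probability(read_probs: list[float]) -> float:
    """Noisy-OR pooling: probability that at least one read covering
    the site is modified, treating reads as independent.

    Assumed pooling rule for illustration; the abstract does not
    specify how read-level scores are combined.
    """
    return 1.0 - math.prod(1.0 - p for p in read_probs)


def modified_fraction(read_probs: list[float], cutoff: float = 0.5) -> float:
    """Rough stoichiometry proxy: share of reads whose read-level
    probability exceeds `cutoff` (hypothetical threshold)."""
    if not read_probs:
        return 0.0
    return sum(p > cutoff for p in read_probs) / len(read_probs)
```

With this pooling, a site-level label can supervise a model that only emits read-level probabilities: the loss is computed on the pooled value, and the per-read outputs remain available afterwards for estimating how many molecules carry the modification.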

7.

JAFFAL: detecting fusion genes with long-read transcriptome sequencing.

Davidson, Nadia M; Chen, Ying; Sadras, Teresa; Ryland, Georgina L; Blombery, Piers; Ekert, Paul G; Göke, Jonathan; Oshlack, Alicia.

Genome Biol ; 23(1): 10, 2022 01 06.

Article in English | MEDLINE | ID: mdl-34991664

ABSTRACT

In cancer, fusions are important diagnostic markers and targets for therapy. Long-read transcriptome sequencing allows the discovery of fusions with their full-length isoform structure. However, due to higher sequencing error rates, fusion finding algorithms designed for short reads do not work. Here we present JAFFAL, to identify fusions from long-read transcriptome sequencing. We validate JAFFAL using simulations, cell lines, and patient data from Nanopore and PacBio. We apply JAFFAL to single-cell data and find fusions spanning three genes demonstrating transcripts detected from complex rearrangements. JAFFAL is available at https://github.com/Oshlack/JAFFA/wiki .

Subject(s)

High-Throughput Nucleotide Sequencing , Transcriptome , Algorithms , Gene Fusion , Humans , Sequence Analysis, DNA

8.

Beyond sequencing: machine learning algorithms extract biology hidden in Nanopore signal data.

Wan, Yuk Kei; Hendra, Christopher; Pratanwanich, Ploy N; Göke, Jonathan.

Trends Genet ; 38(3): 246-257, 2022 03.

Article in English | MEDLINE | ID: mdl-34711425

ABSTRACT

Nanopore sequencing provides signal data corresponding to the nucleotide motifs sequenced. Through machine learning-based methods, these signals are translated into long-read sequences that overcome the read size limit of short-read sequencing. However, analyzing the raw nanopore signal data provides many more opportunities beyond just sequencing genomes and transcriptomes: algorithms that use machine learning approaches to extract biological information from these signals allow the detection of DNA and RNA modifications, the estimation of poly(A) tail length, and the prediction of RNA secondary structures. In this review, we discuss how developments in machine learning methodologies contributed to more accurate basecalling and lower error rates, and how these methods enable new biological discoveries. We argue that direct nanopore sequencing of DNA and RNA provides a new dimensionality for genomics experiments and highlight challenges and future directions for computational approaches to extract the additional information provided by nanopore signal data.

Subject(s)

Nanopore Sequencing , Nanopores , Algorithms , Genomics , High-Throughput Nucleotide Sequencing/methods , Machine Learning , Sequence Analysis, DNA/methods

9.

Epigenetic promoter alterations in GI tumour immune-editing and resistance to immune checkpoint inhibition.

Sundar, Raghav; Huang, Kie-Kyon; Kumar, Vikrant; Ramnarayanan, Kalpana; Demircioglu, Deniz; Her, Zhisheng; Ong, Xuewen; Bin Adam Isa, Zul Fazreen; Xing, Manjie; Tan, Angie Lay-Keng; Tai, David Wai Meng; Choo, Su Pin; Zhai, Weiwei; Lim, Jia Qi; Das Thakur, Meghna; Molinero, Luciana; Cha, Edward; Fasso, Marcella; Niger, Monica; Pietrantonio, Filippo; Lee, Jeeyun; Jeyasekharan, Anand D; Qamra, Aditi; Patnala, Radhika; Fabritius, Arne; De Simone, Mark; Yeong, Joe; Ng, Cedric Chuan Young; Rha, Sun Young; Narita, Yukiya; Muro, Kei; Guo, Yu Amanda; Skanderup, Anders Jacobsen; So, Jimmy Bok Yan; Yong, Wei Peng; Chen, Qingfeng; Göke, Jonathan; Tan, Patrick.

Gut ; 71(7): 1277-1288, 2022 07.

Article in English | MEDLINE | ID: mdl-34433583

ABSTRACT

OBJECTIVES: Epigenomic alterations in cancer interact with the immune microenvironment to dictate tumour evolution and therapeutic response. We aimed to study the regulation of the tumour immune microenvironment through epigenetic alternate promoter use in gastric cancer and to expand our findings to other gastrointestinal tumours. DESIGN: Alternate promoter burden (APB) was quantified using a novel bioinformatic algorithm (proActiv) to infer promoter activity from short-read RNA sequencing and samples categorised into APBhigh, APBint and APBlow. Single-cell RNA sequencing was performed to analyse the intratumour immune microenvironment. A humanised mouse cancer in vivo model was used to explore dynamic temporal interactions between tumour kinetics, alternate promoter usage and the human immune system. Multiple cohorts of gastrointestinal tumours treated with immunotherapy were assessed for correlation between APB and treatment outcomes. RESULTS: APBhigh gastric cancer tumours expressed decreased levels of T-cell cytolytic activity and exhibited signatures of immune depletion. Single-cell RNA sequencing analysis confirmed distinct immunological populations and lower T-cell proportions in APBhigh tumours. Functional in vivo studies using 'humanised mice' harbouring an active human immune system revealed distinct temporal relationships between APB and tumour growth, with APBhigh tumours having almost no human T-cell infiltration. Analysis of immunotherapy-treated patients with GI cancer confirmed resistance of APBhigh tumours to immune checkpoint inhibition. APBhigh gastric cancer exhibited significantly poorer progression-free survival compared with APBlow (median 55 days vs 121 days, HR 0.40, 95% CI 0.18 to 0.93, p=0.032). CONCLUSION: These findings demonstrate an association between alternate promoter use and the tumour microenvironment, leading to immune evasion and immunotherapy resistance.

Subject(s)

Gastrointestinal Neoplasms , Stomach Neoplasms , Animals , Epigenesis, Genetic , Epigenomics , Gastrointestinal Neoplasms/genetics , Gastrointestinal Neoplasms/therapy , Humans , Immune Checkpoint Inhibitors/pharmacology , Immune Checkpoint Inhibitors/therapeutic use , Immunotherapy , Mice , Stomach Neoplasms/drug therapy , Stomach Neoplasms/therapy , Tumor Microenvironment

10.

Antisense RNAs Influence Promoter Usage of Their Counterpart Sense Genes in Cancer.

Bellido Molias, Fernando; Sim, Andre; Leong, Ka Wai; An, Omer; Song, Yangyang; Ng, Vanessa Hui En; Lim, Max Wei Jie; Ying, Chen; Teo, Jasmin Xin Jia; Göke, Jonathan; Chen, Leilei.

Cancer Res ; 81(23): 5849-5861, 2021 12 01.

Article in English | MEDLINE | ID: mdl-34649947

ABSTRACT

Multiple noncoding natural antisense transcripts (ncNAT) are known to modulate key biological events such as cell growth or differentiation. However, the actual impact of ncNATs on cancer progression remains largely unknown. In this study, we identified a complete list of differentially expressed ncNATs in hepatocellular carcinoma. Among them, a previously undescribed ncNAT, HNF4A-AS1L, suppressed cancer cell growth by regulating its sense gene HNF4A, a well-known cancer driver, through a promoter-specific mechanism. HNF4A-AS1L selectively activated the HNF4A P1 promoter via HNF1A, which upregulated expression of tumor suppressor P1-driven isoforms, while having no effect on the oncogenic P2 promoter. RNA-seq data from 23 tissue and cancer types identified approximately 100 ncNATs whose expression correlated specifically with the activity of one promoter of their associated sense gene. Silencing of two of these ncNATs, ENSG00000259357 and ENSG00000255031 (antisense to CERS2 and CHKA, respectively), altered the promoter usage of CERS2 and CHKA. Altogether, these results demonstrate that promoter-specific regulation is a mechanism used by ncNATs for context-specific control of alternative isoform expression of their counterpart sense genes. SIGNIFICANCE: This study characterizes a previously unexplored role of ncNATs in regulation of isoform expression of associated sense genes, highlighting a mechanism of alternative promoter usage in cancer.

Subject(s)

Carcinoma, Hepatocellular/pathology , Choline Kinase/metabolism , Hepatocyte Nuclear Factor 4/metabolism , Liver Neoplasms/pathology , Membrane Proteins/metabolism , Promoter Regions, Genetic , RNA, Antisense/genetics , Sphingosine N-Acyltransferase/metabolism , Tumor Suppressor Proteins/metabolism , Animals , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/metabolism , Choline Kinase/antagonists & inhibitors , Choline Kinase/genetics , Gene Expression Regulation, Neoplastic , Hepatocyte Nuclear Factor 4/antagonists & inhibitors , Hepatocyte Nuclear Factor 4/genetics , Humans , Liver Neoplasms/genetics , Liver Neoplasms/metabolism , Male , Membrane Proteins/antagonists & inhibitors , Membrane Proteins/genetics , Mice , Mice, SCID , Prognosis , Sphingosine N-Acyltransferase/antagonists & inhibitors , Sphingosine N-Acyltransferase/genetics , Tumor Cells, Cultured , Tumor Suppressor Proteins/antagonists & inhibitors , Tumor Suppressor Proteins/genetics , Xenograft Model Antitumor Assays

11.

Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers.

Wratten, Laura; Wilm, Andreas; Göke, Jonathan.

Nat Methods ; 18(10): 1161-1168, 2021 10.

Article in English | MEDLINE | ID: mdl-34556866

ABSTRACT

The rapid growth of high-throughput technologies has transformed biomedical research. With the increasing amount and complexity of data, scalability and reproducibility have become essential not just for experiments, but also for computational analysis. However, transforming data into information involves running a large number of tools, optimizing parameters, and integrating dynamically changing reference data. Workflow managers were developed in response to such challenges. They simplify pipeline development, optimize resource usage, handle software installation and versions, and run on different compute platforms, enabling workflow portability and sharing. In this Perspective, we highlight key features of workflow managers, compare commonly used approaches for bioinformatics workflows, and provide a guide for computational and noncomputational users. We outline community-curated pipeline initiatives that enable novice and experienced users to perform complex, best-practice analyses without having to manually assemble workflows. In sum, we illustrate how workflow managers contribute to making computational analysis in biomedical research shareable, scalable, and reproducible.

Subject(s)

Biomedical Research/methods , Biomedical Research/standards , Computational Biology/methods , Workflow , Reproducibility of Results

12.

Integrative analysis of epigenetics data identifies gene-specific regulatory elements.

Schmidt, Florian; Marx, Alexander; Baumgarten, Nina; Hebel, Marie; Wegner, Martin; Kaulich, Manuel; Leisegang, Matthias S; Brandes, Ralf P; Göke, Jonathan; Vreeken, Jilles; Schulz, Marcel H.

Nucleic Acids Res ; 49(18): 10397-10418, 2021 10 11.

Article in English | MEDLINE | ID: mdl-34508352

ABSTRACT

Understanding how epigenetic variation in non-coding regions is involved in distal gene-expression regulation is an important problem. Regulatory regions can be associated with genes using large-scale datasets of epigenetic and expression data. However, for regions of complex epigenomic signals and enhancers that regulate many genes, it is difficult to understand these associations. We present StitchIt, an approach to dissect epigenetic variation in a gene-specific manner for the detection of regulatory elements (REMs) without relying on peak calls in individual samples. StitchIt segments epigenetic signal tracks over many samples to generate the location and the target genes of a REM simultaneously. We show that this approach leads to a more accurate and refined REM detection compared to standard methods even on heterogeneous datasets, which are challenging to model. Also, StitchIt REMs are highly enriched in experimentally determined chromatin interactions and expression quantitative trait loci. We validated several newly predicted REMs using CRISPR-Cas9 experiments, thereby demonstrating the reliability of StitchIt. StitchIt is able to dissect regulation in superenhancers and predicts thousands of putative REMs that go unnoticed using peak-based approaches, suggesting that a large part of the regulome might be uncharted water.

Subject(s)

Chromatin/metabolism , Data Analysis , Enhancer Elements, Genetic , Epigenesis, Genetic , Gene Expression Regulation , Human Umbilical Vein Endothelial Cells , Humans

13.

Identification of differential RNA modifications from nanopore direct RNA sequencing with xPore.

Pratanwanich, Ploy N; Yao, Fei; Chen, Ying; Koh, Casslynn W Q; Wan, Yuk Kei; Hendra, Christopher; Poon, Polly; Goh, Yeek Teck; Yap, Phoebe M L; Chooi, Jing Yuan; Chng, Wee Joo; Ng, Sarah B; Thiery, Alexandre; Goh, W S Sho; Göke, Jonathan.

Nat Biotechnol ; 39(11): 1394-1402, 2021 11.

Article in English | MEDLINE | ID: mdl-34282325

ABSTRACT

RNA modifications, such as N6-methyladenosine (m6A), modulate functions of cellular RNA species. However, quantifying differences in RNA modifications has been challenging. Here we develop a computational method, xPore, to identify differential RNA modifications from nanopore direct RNA sequencing (RNA-seq) data. We evaluate our method on transcriptome-wide m6A profiling data, demonstrating that xPore identifies positions of m6A sites at single-base resolution, estimates the fraction of modified RNA species in the cell and quantifies the differential modification rate across conditions. We apply xPore to direct RNA-seq data from six cell lines and multiple myeloma patient samples without a matched control sample and find that many m6A sites are preserved across cell types, whereas a subset exhibit significant differences in their modification rates. Our results show that RNA modifications can be identified from direct RNA-seq data with high accuracy, enabling analysis of differential modifications and expression from a single high-throughput experiment.

Subject(s)

Nanopore Sequencing , Nanopores , High-Throughput Nucleotide Sequencing , Humans , RNA/genetics , RNA/metabolism , RNA Processing, Post-Transcriptional/genetics , Sequence Analysis, RNA/methods , Transcriptome/genetics

14.

Long-read transcriptome sequencing reveals abundant promoter diversity in distinct molecular subtypes of gastric cancer.

Huang, Kie Kyon; Huang, Jiawen; Wu, Jeanie Kar Leng; Lee, Minghui; Tay, Su Ting; Kumar, Vikrant; Ramnarayanan, Kalpana; Padmanabhan, Nisha; Xu, Chang; Tan, Angie Lay Keng; Chan, Charlene; Kappei, Dennis; Göke, Jonathan; Tan, Patrick.

Genome Biol ; 22(1): 44, 2021 01 22.

Article in English | MEDLINE | ID: mdl-33482911

ABSTRACT

BACKGROUND: Deregulated gene expression is a hallmark of cancer; however, most studies to date have analyzed short-read RNA sequencing data with inherent limitations. Here, we combine PacBio long-read isoform sequencing (Iso-Seq) and Illumina paired-end short-read RNA sequencing to comprehensively survey the transcriptome of gastric cancer (GC), a leading cause of global cancer mortality. RESULTS: We performed full-length transcriptome analysis across 10 GC cell lines covering four major GC molecular subtypes (chromosomal unstable, Epstein-Barr positive, genome stable and microsatellite unstable). We identify 60,239 non-redundant full-length transcripts, of which > 66% are novel compared to current transcriptome databases. Novel isoforms are more likely to be cell line and subtype specific, expressed at lower levels with larger number of exons, with longer isoform/coding sequence lengths. Most novel isoforms utilize an alternate first exon, and compared to other alternative splicing categories, are expressed at higher levels and exhibit higher variability. Collectively, we observe alternate promoter usage in 25% of detected genes, with the majority (84.2%) of known/novel promoter pairs exhibiting potential changes in their coding sequences. Mapping these alternate promoters to TCGA GC samples, we identify several cancer-associated isoforms, including novel variants of oncogenes. Tumor-specific transcript isoforms tend to alter protein coding sequences to a larger extent than other isoforms. Analysis of outcome data suggests that novel isoforms may impart additional prognostic information. CONCLUSIONS: Our results provide a rich resource of full-length transcriptome data for deeper studies of GC and other gastrointestinal malignancies.

Subject(s)

Stomach Neoplasms/genetics , Stomach Neoplasms/metabolism , Transcriptome , Adaptor Proteins, Signal Transducing , Alternative Splicing , Cell Line, Tumor , Exons , Gene Expression Profiling , Genome , Humans , Open Reading Frames , Protein Isoforms , Sequence Analysis, RNA

15.

Genomic basis for RNA alterations in cancer.

Calabrese, Claudia; Davidson, Natalie R; Demircioglu, Deniz; Fonseca, Nuno A; He, Yao; Kahles, André; Lehmann, Kjong-Van; Liu, Fenglin; Shiraishi, Yuichi; Soulette, Cameron M; Urban, Lara; Greger, Liliana; Li, Siliang; Liu, Dongbing; Perry, Marc D; Xiang, Qian; Zhang, Fan; Zhang, Junjun; Bailey, Peter; Erkek, Serap; Hoadley, Katherine A; Hou, Yong; Huska, Matthew R; Kilpinen, Helena; Korbel, Jan O; Marin, Maximillian G; Markowski, Julia; Nandi, Tannistha; Pan-Hammarström, Qiang; Pedamallu, Chandra Sekhar; Siebert, Reiner; Stark, Stefan G; Su, Hong; Tan, Patrick; Waszak, Sebastian M; Yung, Christina; Zhu, Shida; Awadalla, Philip; Creighton, Chad J; Meyerson, Matthew; Ouellette, B F Francis; Wu, Kui; Yang, Huanming; Brazma, Alvis; Brooks, Angela N; Göke, Jonathan; Rätsch, Gunnar; Schwarz, Roland F; Stegle, Oliver; Zhang, Zemin.

Nature ; 578(7793): 129-136, 2020 02.

Article in English | MEDLINE | ID: mdl-32025019

ABSTRACT

Transcript alterations often result from somatic changes in cancer genomes1. Various forms of RNA alterations have been described in cancer, including overexpression2, altered splicing3 and gene fusions4; however, it is difficult to attribute these to underlying genomic changes owing to heterogeneity among patients and tumour types, and the relatively small cohorts of patients for whom samples have been analysed by both transcriptome and whole-genome sequencing. Here we present, to our knowledge, the most comprehensive catalogue of cancer-associated gene alterations to date, obtained by characterizing tumour transcriptomes from 1,188 donors of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA)5. Using matched whole-genome sequencing data, we associated several categories of RNA alterations with germline and somatic DNA alterations, and identified probable genetic mechanisms. Somatic copy-number alterations were the major drivers of variations in total gene and allele-specific expression. We identified 649 associations of somatic single-nucleotide variants with gene expression in cis, of which 68.4% involved associations with flanking non-coding regions of the gene. We found 1,900 splicing alterations associated with somatic mutations, including the formation of exons within introns in proximity to Alu elements. In addition, 82% of gene fusions were associated with structural variants, including 75 of a new class, termed 'bridged' fusions, in which a third genomic location bridges two genes. We observed transcriptomic alteration signatures that differ between cancer types and have associations with variations in DNA mutational signatures. This compendium of RNA alterations in the genomic context provides a rich resource for identifying genes and mechanisms that are functionally implicated in cancer.

Subject(s)

Gene Expression Regulation, Neoplastic , Neoplasms/genetics , RNA/genetics , DNA Copy Number Variations , DNA, Neoplasm , Genome, Human , Genomics , Humans , Transcriptome

16.

Multiple Myeloma DREAM Challenge reveals epigenetic regulator PHF19 as marker of aggressive disease.

Mason, Mike J; Schinke, Carolina; Eng, Christine L P; Towfic, Fadi; Gruber, Fred; Dervan, Andrew; White, Brian S; Pratapa, Aditya; Guan, Yuanfang; Chen, Hongjie; Cui, Yi; Li, Bailiang; Yu, Thomas; Chaibub Neto, Elias; Mavrommatis, Konstantinos; Ortiz, Maria; Lyzogubov, Valeriy; Bisht, Kamlesh; Dai, Hongyue Y; Schmitz, Frank; Flynt, Erin; Danziger, Samuel A; Ratushny, Alexander; Dalton, William S; Goldschmidt, Hartmut; Avet-Loiseau, Herve; Samur, Mehmet; Hayete, Boris; Sonneveld, Pieter; Shain, Kenneth H; Munshi, Nikhil; Auclair, Daniel; Hose, Dirk; Morgan, Gareth; Trotter, Matthew; Bassett, Douglas; Goke, Jonathan; Walker, Brian A; Thakurta, Anjan; Guinney, Justin.

Leukemia ; 34(7): 1866-1874, 2020 07.

Article in English | MEDLINE | ID: mdl-32060406

ABSTRACT

While the past decade has seen meaningful improvements in clinical outcomes for multiple myeloma patients, a subset of patients does not benefit from current therapeutics for unclear reasons. Many gene expression-based models of risk have been developed, but each model uses a different combination of genes and often involves assaying many genes, making them difficult to implement. We organized the Multiple Myeloma DREAM Challenge, a crowdsourced effort to develop models of rapid progression in newly diagnosed myeloma patients and to benchmark these against previously published models. This effort led to more robust predictors and found that incorporating specific demographic and clinical features improved gene expression-based models of high risk. Furthermore, post-challenge analysis identified a novel expression-based risk marker, PHF19, which has recently been found to have an important biological role in multiple myeloma. Lastly, we show that a simple four-feature predictor composed of age, ISS, and expression of PHF19 and MMSET performs similarly to more complex models with many more gene expression features included.

Subject(s)

Biomarkers, Tumor/metabolism , Clinical Trials as Topic/statistics & numerical data , DNA-Binding Proteins/metabolism , Epigenesis, Genetic , Gene Expression Regulation, Neoplastic , Models, Statistical , Multiple Myeloma/pathology , Transcription Factors/metabolism , Biomarkers, Tumor/genetics , Cell Cycle , Cell Proliferation , DNA-Binding Proteins/genetics , Databases, Factual , Datasets as Topic , Humans , Multiple Myeloma/genetics , Multiple Myeloma/metabolism , Transcription Factors/genetics , Tumor Cells, Cultured

17.

A Chemically Defined Feeder-free System for the Establishment and Maintenance of the Human Naive Pluripotent State.

Szczerbinska, Iwona; Gonzales, Kevin Andrew Uy; Cukuroglu, Engin; Ramli, Muhammad Nadzim Bin; Lee, Bertha Pei Ge; Tan, Cheng Peow; Wong, Cheng Kit; Rancati, Giulia Irene; Liang, Hongqing; Göke, Jonathan; Ng, Huck-Hui; Chan, Yun-Shen.

Stem Cell Reports ; 13(4): 612-626, 2019 10 08.

Article in English | MEDLINE | ID: mdl-31522974

ABSTRACT

The distinct states of pluripotency in the pre- and post-implantation embryo can be captured in vitro as naive and primed pluripotent stem cell cultures, respectively. The study and application of the naive state remain hampered, particularly in humans, partly because current culture protocols rely on extraneous undefined factors such as feeders. Here we performed a small-molecule screen to identify compounds that facilitate chemically defined establishment and maintenance of human feeder-independent naive embryonic (FINE) stem cells. The expression profile in genic and repetitive elements of FINE cells resembles the 8-cell-to-morula stage in vivo, and only differs from feeder-dependent naive cells in genes involved in cell-cell/cell-matrix interactions. FINE cells offer several technical advantages, such as increased amenability to transfection and a longer period of genomic stability, compared with feeder-dependent cells. Thus, FINE cells will serve as an accessible and useful system for scientific and translational applications of naive pluripotent stem cells.

Subject(s)

Cell Culture Techniques , Cell Self Renewal/drug effects , Pluripotent Stem Cells/cytology , Pluripotent Stem Cells/drug effects , Biomarkers , Cell Survival/drug effects , Dasatinib/pharmacology , Drug Discovery/methods , Feeder Cells , High-Throughput Screening Assays , Human Embryonic Stem Cells/cytology , Human Embryonic Stem Cells/metabolism , Humans , Imidazoles/pharmacology , Pluripotent Stem Cells/metabolism , Pyrimidines/pharmacology , Small Molecule Libraries

18.

A Pan-cancer Transcriptome Analysis Reveals Pervasive Regulation through Alternative Promoters.

Demircioglu, Deniz; Cukuroglu, Engin; Kindermans, Martin; Nandi, Tannistha; Calabrese, Claudia; Fonseca, Nuno A; Kahles, André; Lehmann, Kjong-Van; Stegle, Oliver; Brazma, Alvis; Brooks, Angela N; Rätsch, Gunnar; Tan, Patrick; Göke, Jonathan.

Cell ; 178(6): 1465-1477.e17, 2019 09 05.

Article in English | MEDLINE | ID: mdl-31491388

ABSTRACT

Most human protein-coding genes are regulated by multiple, distinct promoters, suggesting that the choice of promoter is as important as its level of transcriptional activity. However, while a global change in transcription is recognized as a defining feature of cancer, the contribution of alternative promoters still remains largely unexplored. Here, we infer active promoters using RNA-seq data from 18,468 cancer and normal samples, demonstrating that alternative promoters are a major contributor to context-specific regulation of transcription. We find that promoters are deregulated across tissues, cancer types, and patients, affecting known cancer genes and novel candidates. For genes with independently regulated promoters, we demonstrate that promoter activity provides a more accurate predictor of patient survival than gene expression. Our study suggests that a dynamic landscape of active promoters shapes the cancer transcriptome, opening new diagnostic avenues and opportunities to further explore the interplay of regulatory mechanisms with transcriptional aberrations in cancer.

Subject(s)

Computational Biology/methods , Gene Expression Regulation, Neoplastic/genetics , Neoplasms/genetics , Promoter Regions, Genetic/genetics , Transcriptome/genetics , Databases, Genetic , Humans , RNA-Seq/methods

19.

Pathogenic Epigenetic Consequences of Genetic Alterations in IDH-Wild-Type Diffuse Astrocytic Gliomas.

Ohka, Fumiharu; Shinjo, Keiko; Deguchi, Shoichi; Matsui, Yusuke; Okuno, Yusuke; Katsushima, Keisuke; Suzuki, Miho; Kato, Akira; Ogiso, Noboru; Yamamichi, Akane; Aoki, Kosuke; Suzuki, Hiromichi; Sato, Shinya; Arul Rayan, Nirmala; Prabhakar, Shyam; Göke, Jonathan; Shimamura, Teppei; Maruyama, Reo; Takahashi, Satoru; Suzumura, Akio; Kimura, Hiroshi; Wakabayashi, Toshihiko; Zong, Hui; Natsume, Atsushi; Kondo, Yutaka.

Cancer Res ; 79(19): 4814-4827, 2019 Oct 01.

Article in English | MEDLINE | ID: mdl-31431463

ABSTRACT

Gliomas are classified by combining histopathologic and molecular features, including isocitrate dehydrogenase (IDH) status. Although IDH-wild-type diffuse astrocytic glioma (DAG) shows a more aggressive phenotype than IDH-mutant type, lack of knowledge regarding relevant molecular drivers for this type of tumor has hindered the development of therapeutic agents. Here, we examined human IDH-wild-type DAGs and a glioma mouse model with a mosaic analysis with double markers (MADM) system, which concurrently lacks p53 and NF1 and spontaneously develops tumors highly comparable with human IDH-wild-type DAG without characteristic molecular features of glioblastoma (DAG-nonMF). During tumor formation, enhancer of zeste homolog (EZH2) and the other polycomb repressive complex 2 (PRC2) components were upregulated even at an early stage of tumorigenesis, together with an increased number of genes with H3K27me3 or H3K27me3 and H3K4me3 bivalent modifications. Among the epigenetically dysregulated genes, frizzled-8 (Fzd8), which is known to be a cancer- and stem cell reprogramming-related gene, was gradually silenced during tumorigenesis. Genetic and pharmacologic inhibition of EZH2 in MADM mice showed reactivation of aberrant H3K27me3 target genes, including Fzd8, together with significant reduction of tumor size. Our study clarifies a pathogenic molecular pathway of IDH-wild-type DAG-nonMF that depends on EZH2 activity and provides a strong rationale for targeting EZH2 as a promising therapeutic approach for this type of glioma. SIGNIFICANCE: EZH2 is involved in the generation of IDH-wild-type diffuse astrocytic gliomas and is a potential therapeutic target for this type of glioma. GRAPHICAL ABSTRACT: http://cancerres.aacrjournals.org/content/canres/79/19/4814/F1.large.jpg.

Subject(s)

Astrocytoma/genetics , Astrocytoma/pathology , Enhancer of Zeste Homolog 2 Protein/metabolism , Epigenesis, Genetic/genetics , Animals , Astrocytoma/metabolism , Enhancer of Zeste Homolog 2 Protein/genetics , Humans , Isocitrate Dehydrogenase/genetics , Mice , Mice, Transgenic

20.

Zscan4c activates endogenous retrovirus MERVL and cleavage embryo genes.

Zhang, Weiyu; Chen, Fuquan; Chen, Ruiqing; Xie, Dan; Yang, Jiao; Zhao, Xin; Guo, Renpeng; Zhang, Yongwang; Shen, Yang; Göke, Jonathan; Liu, Lin; Lu, Xinyi.

Nucleic Acids Res ; 47(16): 8485-8501, 2019 09 19.

Article in English | MEDLINE | ID: mdl-31304534

ABSTRACT

Endogenous retroviruses (ERVs) contribute to ~10 percent of the mouse genome. They are often silenced in differentiated somatic cells but differentially expressed at various embryonic developmental stages. A minority of mouse embryonic stem cells (ESCs), like 2-cell cleavage embryos, highly express ERV MERVL. However, the role of ERVs and the mechanism of their activation in these cells are still poorly understood. In this study, we investigated the regulation and function of the stage-specific expressed ERVs, with a particular focus on the totipotency marker MT2/MERVL. We show that the transcription factor Zscan4c functions as an activator of MT2/MERVL and 2-cell/4-cell embryo genes. Zinc finger domains of Zscan4c play an important role in this process. In addition, Zscan4c interacts with MT2 and regulates MT2-nearby 2-cell/4-cell genes through promoting enhancer activity of MT2. Furthermore, MT2 activation is accompanied by enhanced H3K4me1, H3K27ac, and H3K14ac deposition on MT2. Zscan4c also interacts with the GBAF chromatin remodelling complex through its SCAN domain to further activate MT2 enhancer activity. Taken together, we delineate a previously unrecognized regulatory axis in which Zscan4c interacts with and activates MT2/MERVL loci and their nearby genes through epigenetic regulation.

Subject(s)

Endogenous Retroviruses/genetics , Gene Expression Regulation, Developmental , Genome , Histones/metabolism , Retroelements , Transcription Factors/genetics , Animals , Chromatin/chemistry , Chromatin/metabolism , Embryo, Mammalian , Endogenous Retroviruses/metabolism , Enhancer Elements, Genetic , Epigenesis, Genetic , Gene Expression Profiling , Gene Ontology , Histones/genetics , Mice , Molecular Sequence Annotation , Mouse Embryonic Stem Cells/cytology , Mouse Embryonic Stem Cells/metabolism , Transcription Factors/metabolism

