ABSTRACT
Alternative splicing (AS) is prevalent in cancer, generating an extensive but largely unexplored repertoire of novel immunotherapy targets. We describe Isoform peptides from RNA splicing for Immunotherapy target Screening (IRIS), a computational platform capable of discovering AS-derived tumor antigens (TAs) for T cell receptor (TCR) and chimeric antigen receptor T cell (CAR-T) therapies. IRIS leverages large-scale tumor and normal transcriptome data and incorporates multiple screening approaches to discover AS-derived TAs with tumor-associated or tumor-specific expression. In a proof-of-concept analysis integrating transcriptomics and immunopeptidomics data, we showed that hundreds of IRIS-predicted TCR targets are presented by human leukocyte antigen (HLA) molecules. We applied IRIS to RNA-seq data of neuroendocrine prostate cancer (NEPC). From 2,939 NEPC-associated AS events, IRIS predicted 1,651 epitopes from 808 events as potential TCR targets for two common HLA types (A*02:01 and A*03:01). A more stringent screening test prioritized 48 epitopes from 20 events with "neoantigen-like" NEPC-specific expression. Predicted epitopes are often encoded by microexons of ≤30 nucleotides. To validate the immunogenicity and T cell recognition of IRIS-predicted TCR epitopes, we performed in vitro T cell priming in combination with single-cell TCR sequencing. Seven TCRs transduced into human peripheral blood mononuclear cells (PBMCs) showed high activity against individual IRIS-predicted epitopes, providing strong evidence of isolated TCRs reactive to AS-derived peptides. One selected TCR showed efficient cytotoxicity against target cells expressing the target peptide. Our study illustrates the contribution of AS to the TA repertoire of cancer cells and demonstrates the utility of IRIS for discovering AS-derived TAs and expanding cancer immunotherapies.
Subjects
Neoplasms , RNA Precursors , Male , Humans , RNA Precursors/metabolism , Alternative Splicing , Mononuclear Leukocytes/metabolism , T-Cell Antigen Receptors , T-Lymphocyte Epitopes , Immunotherapy , Neoplasm Antigens , Peptides/metabolism , Neoplasms/genetics , Neoplasms/therapy
ABSTRACT
Pre-mRNA alternative splicing is a prevalent mechanism for diversifying eukaryotic transcriptomes and proteomes. Regulated alternative splicing plays a role in many biological processes, and dysregulated alternative splicing is a feature of many human diseases. Short-read RNA sequencing (RNA-seq) is now the standard approach for transcriptome-wide analysis of alternative splicing. Since 2011, our laboratory has developed and maintained Replicate Multivariate Analysis of Transcript Splicing (rMATS), a computational tool for discovering and quantifying alternative splicing events from RNA-seq data. Here we provide a protocol for the contemporary version of rMATS, rMATS-turbo, a fast and scalable re-implementation that maintains the statistical framework and user interface of the original rMATS software, while incorporating a revamped computational workflow with a substantial improvement in speed and data storage efficiency. The rMATS-turbo software scales up to massive RNA-seq datasets with tens of thousands of samples. To illustrate the utility of rMATS-turbo, we describe two representative application scenarios. First, we describe a broadly applicable two-group comparison to identify differential alternative splicing events between two sample groups, including both annotated and novel alternative splicing events. Second, we describe a quantitative analysis of alternative splicing in a large-scale RNA-seq dataset (~1,000 samples), including the discovery of alternative splicing events associated with distinct cell states. We detail the workflow and features of rMATS-turbo that enable efficient parallel processing and analysis of large-scale RNA-seq datasets on a compute cluster. We anticipate that this protocol will help the broad user base of rMATS-turbo make the best use of this software for studying alternative splicing in diverse biological systems.
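As a rough illustration of what "quantifying" an alternative splicing event can mean in a two-group comparison, the sketch below computes a length-normalized exon inclusion level from read counts and a simple difference of group means. The function names and input layout are hypothetical and chosen only for this example; this is not the rMATS-turbo code, and the difference of means shown here is purely descriptive and does not stand in for the software's statistical framework.

```python
from statistics import mean


def inclusion_level(inc_reads, skip_reads, inc_len, skip_len):
    """Length-normalized inclusion level for one event in one sample.

    inc_reads / skip_reads: reads supporting the inclusion / skipping isoform.
    inc_len / skip_len: effective lengths of the two isoform-specific regions
    (assumed to be supplied by the caller).
    Returns None when no reads support either isoform.
    """
    if inc_len <= 0 or skip_len <= 0:
        raise ValueError("effective lengths must be positive")
    inc = inc_reads / inc_len
    skip = skip_reads / skip_len
    total = inc + skip
    if total == 0:
        return None
    return inc / total


def mean_inclusion_difference(group1, group2, inc_len, skip_len):
    """Difference of mean inclusion levels (group1 minus group2).

    Each group is a list of (inc_reads, skip_reads) tuples, one per replicate.
    Replicates without any supporting reads are ignored.
    """
    def group_mean(samples):
        levels = [inclusion_level(i, s, inc_len, skip_len) for i, s in samples]
        levels = [x for x in levels if x is not None]
        if not levels:
            raise ValueError("no replicate has supporting reads")
        return mean(levels)

    return group_mean(group1) - group_mean(group2)


if __name__ == "__main__":
    # Hypothetical counts for one exon-skipping event, two replicates per group.
    g1 = [(30, 10), (15, 5)]
    g2 = [(10, 10), (5, 5)]
    print(mean_inclusion_difference(g1, g2, inc_len=2, skip_len=1))
```

Normalizing by effective length matters because the inclusion isoform usually offers more positions for a supporting read to land on than the skipping isoform, so raw read counts alone would overstate inclusion.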
Subjects
Alternative Splicing , RNA , Humans , RNA/genetics , RNA-Seq , RNA Splicing , Software , RNA Sequence Analysis/methods , Multivariate Analysis
ABSTRACT
Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discovery and quantification of transcript isoforms from error-prone long reads. ESPRESSO jointly considers alignments of all long reads aligned to a gene and uses error profiles of individual reads to improve the identification of splice junctions and the discovery of their corresponding transcript isoforms. On both a synthetic spike-in RNA sample and human RNA samples, ESPRESSO outperforms multiple contemporary tools in not only transcript isoform discovery but also transcript isoform quantification. In total, we generated and analyzed ~1.1 billion nanopore RNA-seq reads covering 30 human tissue samples and three human cell lines. ESPRESSO and its companion dataset provide a useful resource for studying the RNA repertoire of eukaryotic transcriptomes.
Subjects
RNA , Transcriptome , Humans , RNA/genetics , RNA-Seq , RNA Sequence Analysis , Protein Isoforms/genetics , Gene Expression Profiling
ABSTRACT
Long-read RNA sequencing (RNA-seq) is a powerful technology for transcriptome analysis, but the relatively low throughput of current long-read sequencing platforms limits transcript coverage. One strategy for overcoming this bottleneck is targeted long-read RNA-seq for preselected gene panels. We present TEQUILA-seq, a versatile, easy-to-implement, and low-cost method for targeted long-read RNA-seq utilizing isothermally linear-amplified capture probes. When performed on the Oxford nanopore platform with multiple gene panels of varying sizes, TEQUILA-seq consistently and substantially enriches transcript coverage while preserving transcript quantification. We profile full-length transcript isoforms of 468 actionable cancer genes across 40 representative breast cancer cell lines. We identify transcript isoforms enriched in specific subtypes and discover novel transcript isoforms in extensively studied cancer genes such as TP53. Among cancer genes, tumor suppressor genes (TSGs) are significantly enriched for aberrant transcript isoforms targeted for degradation via mRNA nonsense-mediated decay, revealing a common RNA-associated mechanism for TSG inactivation. TEQUILA-seq reduces the per-reaction cost of targeted capture by 2-3 orders of magnitude, as compared to a standard commercial solution. TEQUILA-seq can be broadly used for targeted sequencing of full-length transcripts in diverse biomedical research settings.
Subjects
Gene Expression Profiling , High-Throughput Nucleotide Sequencing , High-Throughput Nucleotide Sequencing/methods , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , RNA/genetics , Protein Isoforms/genetics , Transcriptome/genetics
ABSTRACT
Cancer transcriptomes frequently exhibit RNA dysregulation. As the resulting aberrant transcripts may be translated into cancer-specific proteins, there is growing interest in exploiting RNA dysregulation as a source of tumor antigens (TAs) and thus novel immunotherapy targets. Recent advances in high-throughput technologies and rapid accumulation of multiomic cancer profiling data in public repositories have provided opportunities to systematically characterize RNA dysregulation in cancer and identify antigen targets for immunotherapy. However, given the complexity of cancer transcriptomes and proteomes, important conceptual and technological challenges exist. Here, we highlight the expanding repertoire of TAs arising from RNA dysregulation and introduce multiomic and big data strategies for identifying optimal immunotherapy targets. We discuss extant barriers for translating these targets into effective therapies as well as the implications for future research.
Subjects
Neoplasms , RNA , Humans , Immunotherapy , Neoplasms/genetics , Neoplasms/therapy , Proteome , RNA/genetics , Transcriptome
ABSTRACT
Circular RNAs (circRNAs) have emerged as an important class of functional RNA molecules. Short-read RNA sequencing (RNA-seq) is a widely used strategy to identify circRNAs. However, an inherent limitation of short-read RNA-seq is that it does not experimentally determine the full-length sequences and exact exonic compositions of circRNAs. Here, we report isoCirc, a strategy for sequencing full-length circRNA isoforms, using rolling circle amplification followed by nanopore long-read sequencing. We describe an integrated computational pipeline to reliably characterize full-length circRNA isoforms using isoCirc data. Using isoCirc, we generate a comprehensive catalog of 107,147 full-length circRNA isoforms across 12 human tissues and one human cell line (HEK293), including 40,628 isoforms ≥500 nt in length. We identify widespread alternative splicing events within the internal part of circRNAs, including 720 retained intron events corresponding to a class of exon-intron circRNAs (EIciRNAs). Collectively, isoCirc and the companion dataset provide a useful strategy and resource for studying circRNAs in human transcriptomes.