Results 1 - 11 of 11
1.
Nat Methods ; 21(7): 1349-1363, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38849569

ABSTRACT

The Long-read RNA-Seq Genome Annotation Assessment Project Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. Using different protocols and sequencing platforms, the consortium generated over 427 million long-read sequences from complementary DNA and direct RNA datasets, encompassing human, mouse and manatee species. Developers utilized these data to address challenges in transcript isoform detection, quantification and de novo transcript detection. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. Incorporating additional orthogonal data and replicate samples is advised when aiming to detect rare and novel transcripts or using reference-free approaches. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.


Subjects
Gene Expression Profiling , RNA-Seq , Humans , Animals , Mice , RNA-Seq/methods , Gene Expression Profiling/methods , Transcriptome , Sequence Analysis, RNA/methods , Molecular Sequence Annotation/methods
2.
Cell Syst ; 14(12): 1103-1112.e6, 2023 Dec 20.
Article in English | MEDLINE | ID: mdl-38016465

ABSTRACT

The sequence in the 5' untranslated regions (UTRs) is known to affect mRNA translation rates. However, the underlying regulatory grammar remains elusive. Here, we propose MTtrans, a multi-task translation rate predictor capable of learning common sequence patterns from datasets across various experimental techniques. The core premise is that common motifs are more likely to be genuinely involved in translation control. MTtrans outperforms existing methods in both accuracy and the ability to capture transferable motifs across species, highlighting its strength in identifying evolutionarily conserved sequence motifs. Our independent fluorescence-activated cell sorting coupled with deep sequencing (FACS-seq) experiment validates the impact of most motifs identified by MTtrans. Additionally, we introduce "GRU-rewiring," a technique to interpret the hidden states of the recurrent units. Gated recurrent unit (GRU)-rewiring allows us to identify regulatory element-enriched positions and examine the local effects of 5' UTR mutations. MTtrans is a powerful tool for deciphering the translation regulatory motifs.


Subjects
Regulatory Sequences, Nucleic Acid , 5' Untranslated Regions/genetics , Conserved Sequence
3.
RNA ; 29(12): 1839-1855, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37816550

ABSTRACT

The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, limitations, and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for continuous extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies, while the containers and reproducible workflows could easily be deployed and extended to evaluate new methods or data sets.


Subjects
Benchmarking , RNA , RNA/genetics , RNA-Seq , Polyadenylation , Sequence Analysis, RNA/methods
4.
bioRxiv ; 2023 Jun 26.
Article in English | MEDLINE | ID: mdl-37425672

ABSTRACT

The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, and limitations and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for seamless extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies. Furthermore, the containers and reproducible workflows generated in the course of this project can be seamlessly deployed and extended in the future to evaluate new methods or datasets.

5.
Nat Methods ; 20(8): 1187-1195, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37308696

ABSTRACT

Most approaches to transcript quantification rely on fixed reference annotations; however, the transcriptome is dynamic and, depending on the context, such static annotations contain inactive isoforms for some genes, whereas they are incomplete for others. Here we present Bambu, a method that performs machine-learning-based transcript discovery to enable quantification specific to the context of interest using long-read RNA-sequencing. To identify novel transcripts, Bambu estimates the novel discovery rate, which replaces arbitrary per-sample thresholds with a single, interpretable, precision-calibrated parameter. Bambu retains the full-length and unique read counts, enabling accurate quantification in the presence of inactive isoforms. Compared to existing methods for transcript discovery, Bambu achieves greater precision without sacrificing sensitivity. We show that context-aware annotations improve quantification for both novel and known transcripts. We apply Bambu to quantify isoforms from repetitive HERVH-LTR7 retrotransposons in human embryonic stem cells, demonstrating the ability for context-specific transcript expression analysis.


Subjects
Gene Expression Profiling , Transcriptome , Humans , RNA-Seq , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Protein Isoforms/genetics
6.
Nat Methods ; 19(12): 1590-1598, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36357692

ABSTRACT

RNA modifications such as m6A methylation form an additional layer of complexity in the transcriptome. Nanopore direct RNA sequencing can capture this information in the raw current signal for each RNA molecule, enabling the detection of RNA modifications using supervised machine learning. However, experimental approaches provide only site-level training data, whereas the modification status for each single RNA molecule is missing. Here we present m6Anet, a neural-network-based method that leverages the multiple instance learning framework to specifically handle missing read-level modification labels in site-level training data. m6Anet outperforms existing computational methods, shows similar accuracy as experimental approaches, and generalizes with high accuracy to different cell lines and species without retraining model parameters. In addition, we demonstrate that m6Anet captures the underlying read-level stoichiometry, which can be used to approximate differences in modification rates. Overall, m6Anet offers a tool to capture the transcriptome-wide identification and quantification of m6A from a single run of direct RNA sequencing.
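
As a rough illustration of the multiple instance learning idea described in this abstract, the sketch below pools hypothetical per-read modification probabilities into one site-level probability using noisy-OR pooling, and derives a crude stoichiometry estimate from the same reads. The pooling rule, the function names, and the 0.5 threshold are assumptions made for illustration; the abstract does not specify them, and this is not presented as m6Anet's implementation.

```python
from math import prod


def site_probability(read_probs):
    """Noisy-OR pooling: the site is called modified if at least one read is.

    P(site modified) = 1 - prod(1 - p_i) over the read-level probabilities p_i.
    """
    if not read_probs:
        raise ValueError("need at least one read-level probability")
    if any(not 0.0 <= p <= 1.0 for p in read_probs):
        raise ValueError("probabilities must lie in [0, 1]")
    return 1.0 - prod(1.0 - p for p in read_probs)


def estimated_stoichiometry(read_probs, threshold=0.5):
    """Fraction of reads whose probability reaches the (assumed) threshold."""
    if not read_probs:
        raise ValueError("need at least one read-level probability")
    return sum(p >= threshold for p in read_probs) / len(read_probs)


if __name__ == "__main__":
    reads = [0.9, 0.1, 0.8, 0.2]  # hypothetical per-read probabilities
    print(site_probability(reads))
    print(estimated_stoichiometry(reads))
```

The point of the sketch is that only the pooled, site-level value needs a training label, while the per-read values remain available afterwards for estimating how many molecules carry the modification.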


Subjects
Nanopore Sequencing , RNA , RNA/genetics , RNA/metabolism , Sequence Analysis, RNA/methods , Methylation , Transcriptome
7.
Trends Genet ; 38(3): 246-257, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34711425

ABSTRACT

Nanopore sequencing provides signal data corresponding to the nucleotide motifs sequenced. Through machine learning-based methods, these signals are translated into long-read sequences that overcome the read size limit of short-read sequencing. However, analyzing the raw nanopore signal data provides many more opportunities beyond just sequencing genomes and transcriptomes: algorithms that use machine learning approaches to extract biological information from these signals allow the detection of DNA and RNA modifications, the estimation of poly(A) tail length, and the prediction of RNA secondary structures. In this review, we discuss how developments in machine learning methodologies contributed to more accurate basecalling and lower error rates, and how these methods enable new biological discoveries. We argue that direct nanopore sequencing of DNA and RNA provides a new dimensionality for genomics experiments and highlight challenges and future directions for computational approaches to extract the additional information provided by nanopore signal data.


Subjects
Nanopore Sequencing , Nanopores , Algorithms , Genomics , High-Throughput Nucleotide Sequencing/methods , Machine Learning , Sequence Analysis, DNA/methods
8.
Nat Biotechnol ; 39(11): 1394-1402, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34282325

ABSTRACT

RNA modifications, such as N6-methyladenosine (m6A), modulate functions of cellular RNA species. However, quantifying differences in RNA modifications has been challenging. Here we develop a computational method, xPore, to identify differential RNA modifications from nanopore direct RNA sequencing (RNA-seq) data. We evaluate our method on transcriptome-wide m6A profiling data, demonstrating that xPore identifies positions of m6A sites at single-base resolution, estimates the fraction of modified RNA species in the cell and quantifies the differential modification rate across conditions. We apply xPore to direct RNA-seq data from six cell lines and multiple myeloma patient samples without a matched control sample and find that many m6A sites are preserved across cell types, whereas a subset exhibit significant differences in their modification rates. Our results show that RNA modifications can be identified from direct RNA-seq data with high accuracy, enabling analysis of differential modifications and expression from a single high-throughput experiment.


Subjects
Nanopore Sequencing , Nanopores , High-Throughput Nucleotide Sequencing , Humans , RNA/genetics , RNA/metabolism , RNA Processing, Post-Transcriptional/genetics , Sequence Analysis, RNA/methods , Transcriptome/genetics
9.
STAR Protoc ; 2(1): 100255, 2021 Mar 19.
Article in English | MEDLINE | ID: mdl-33490975

ABSTRACT

The CRISPR-Cas system coupled with Combinatorial Genetics En Masse (CombiGEM) enables systematic analysis of high-order genetic perturbations that are important for understanding biological processes and discovering therapeutic target combinations. Here, we present detailed steps and technical considerations for building multiplexed guide RNA libraries and carrying out a combinatorial CRISPR screen in mammalian cells. We also present an analytical pipeline, CombiPIPE, for mapping two- and three-way genetic interactions. For complete details on the use and execution of this protocol, please refer to Zhou et al. (2020).


Subjects
CRISPR-Cas Systems , Genetic Testing , Animals , Cell Line , Humans , RNA, Guide, Kinetoplastida/genetics , RNA, Guide, Kinetoplastida/metabolism
10.
Methods Mol Biol ; 2199: 3-12, 2021.
Article in English | MEDLINE | ID: mdl-33125641

ABSTRACT

Exploring how mutations can be combined to optimize protein functions is important to guide protein engineering. Given the vast combinatorial space of changing multiple amino acids, identifying the top-performing variants from a large number of mutants might not be possible without a high-throughput gene assembly and screening strategy. Here we describe the CombiSEAL platform, a strategy that allows for modularization of any protein sequence into multiple segments for mutagenesis and barcoding, and seamless single-pot ligations of different segments to generate a library of combination mutants linked with concatenated barcodes at one end. By reading the barcodes using next-generation sequencing, activities of each protein variant during the protein selection process can be easily tracked in a high-throughput manner. CombiSEAL not only allows the identification of better protein variants but also enables systematic analyses that distinguish the beneficial, deleterious, and neutral effects of combining different mutations on protein functions.
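
The barcode-tracking step described in this abstract can be pictured with a minimal sketch: each sequenced read carries one barcode per protein segment, concatenated in order, and decoding the barcodes identifies which combination of mutations the read represents. The barcode sequences, mutation labels, fixed barcode length, and function names below are all hypothetical and chosen only for illustration; they are not taken from the CombiSEAL protocol.

```python
from collections import Counter

# Hypothetical lookup tables, one per protein segment: barcode -> mutation label.
SEGMENT_BARCODES = [
    {"AAAA": "WT", "CCCC": "A12V"},
    {"GGGG": "WT", "TTTT": "K45R"},
]
BARCODE_LEN = 4  # assumed fixed length of each segment barcode


def decode(concatenated):
    """Map a concatenated barcode to a tuple of per-segment mutation labels.

    Returns None if the length is wrong or any segment barcode is unknown.
    """
    if len(concatenated) != BARCODE_LEN * len(SEGMENT_BARCODES):
        return None
    labels = []
    for i, table in enumerate(SEGMENT_BARCODES):
        chunk = concatenated[i * BARCODE_LEN:(i + 1) * BARCODE_LEN]
        if chunk not in table:
            return None
        labels.append(table[chunk])
    return tuple(labels)


def count_variants(reads):
    """Count reads per decoded variant, skipping reads that fail to decode."""
    counts = Counter()
    for read in reads:
        variant = decode(read)
        if variant is not None:
            counts[variant] += 1
    return counts


if __name__ == "__main__":
    reads = ["AAAAGGGG", "CCCCTTTT", "CCCCTTTT", "ACGTACGT"]
    print(count_variants(reads))
```

Comparing such counts before and after a selection step is one simple way to follow how each combination behaves, under the assumption that read counts reflect variant abundance.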


Subjects
High-Throughput Nucleotide Sequencing , Mutagenesis , Protein Engineering , Recombinant Proteins/genetics
11.
Cell Rep ; 32(6): 108020, 2020 Aug 11.
Article in English | MEDLINE | ID: mdl-32783942

ABSTRACT

We present a CRISPR-based multi-gene knockout screening system and toolkits for extensible assembly of barcoded high-order combinatorial guide RNA libraries en masse. We apply this system for systematically identifying not only pairwise but also three-way synergistic therapeutic target combinations and successfully validate double- and triple-combination regimens for suppression of cancer cell growth and protection against Parkinson's disease-associated toxicity. This system overcomes the practical challenges of experimenting on a large number of high-order genetic and drug combinations and can be applied to uncover the rare synergistic interactions between druggable targets.


Subjects
CRISPR-Cas Systems , Drug Combinations , Drug Delivery Systems/methods , High-Throughput Screening Assays/methods , Animals , Antineoplastic Agents/pharmacology , Drosophila melanogaster , Gene Knockout Techniques , HEK293 Cells , Humans , Mice , Neoplasms/drug therapy , Parkinson Disease/drug therapy , RNA, Guide, Kinetoplastida