1.
Bioinformatics ; 39(7), 2023 Jul 01.
Article in English | MEDLINE | ID: mdl-37432342

ABSTRACT

MOTIVATION: Alternative splicing (AS) of introns from pre-mRNA produces diverse sets of transcripts across cell types and tissues, but is also dysregulated in many diseases. Alignment-free computational methods have greatly accelerated the quantification of mRNA transcripts from short RNA-seq reads, but they inherently rely on a catalog of known transcripts and might miss novel, disease-specific splicing events. By contrast, alignment of reads to the genome can effectively identify novel exonic segments and introns. Event-based methods then count how many reads align to predefined features. However, alignment is more expensive to compute and constitutes a bottleneck in many AS analysis methods.

RESULTS: Here, we propose fortuna, a method that guesses novel combinations of annotated splice sites to create transcript fragments. It then pseudoaligns reads to fragments using kallisto and efficiently derives counts of the most elementary splicing units from kallisto's equivalence classes. These counts can be directly used for AS analysis or summarized to larger units as used by other widely applied methods. In experiments on synthetic and real data, fortuna was around 7× faster than traditional align-and-count approaches, and was able to analyze almost 300 million reads in just 15 min when using four threads. It mapped reads containing mismatches more accurately across novel junctions and found more reads supporting aberrant splicing events in patients with autism spectrum disorder than existing methods. We further used fortuna to identify novel, tissue-specific splicing events in Drosophila.

AVAILABILITY AND IMPLEMENTATION: fortuna source code is available at https://github.com/canzarlab/fortuna.
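The idea of pairing annotated splice sites into combinations that the annotation does not contain can be illustrated with a toy sketch. This is not fortuna's implementation: the function name `candidate_introns`, the coordinate convention, and the optional `max_len` filter are all assumptions made for illustration only.

```python
from itertools import product


def candidate_introns(donors, acceptors, annotated, max_len=None):
    """Return (donor, acceptor) pairs that are absent from the annotation.

    Hypothetical helper: positions are plain integers on one strand of one
    gene, and a pair is valid only if the acceptor lies downstream of the donor.
    """
    known = set(annotated)
    novel = []
    for donor, acceptor in product(sorted(donors), sorted(acceptors)):
        if acceptor <= donor:
            continue  # an intron must end after it starts
        if max_len is not None and acceptor - donor > max_len:
            continue  # optional length cap (an assumption, not from the source)
        if (donor, acceptor) not in known:
            novel.append((donor, acceptor))
    return novel


# Two annotated introns; the only unannotated pairing skips the middle exon.
donors = [100, 300]
acceptors = [200, 400]
annotated = [(100, 200), (300, 400)]
novel = candidate_introns(donors, acceptors, annotated)
print(novel)
```

In this toy gene the single novel pair joins the first donor to the last acceptor, which corresponds to an exon-skipping event that a transcript catalog built only from the annotated introns would not contain.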


Subjects
Autism Spectrum Disorder , Humans , Sequence Analysis, RNA/methods , RNA Splicing , Alternative Splicing , Software
2.
Acta Biotheor ; 71(1): 5, 2023 Jan 25.
Article in English | MEDLINE | ID: mdl-36695929

ABSTRACT

In this work we propose the partial Wiener index as one possible measure of branching in phylogenetic evolutionary trees. We establish the connection between the generalized Robinson-Foulds (RF) metric for measuring the similarity of phylogenetic trees and partial Wiener indices by expressing the number of conflicting pairs of edges in the generalized RF metric in terms of partial Wiener indices. To do so we compute the minimum and maximum value of the partial Wiener index [Formula: see text], where [Formula: see text] is a binary rooted tree with root [Formula: see text] and [Formula: see text] leaves. Moreover, under the Yule probabilistic model, we show how to compute the expected value of [Formula: see text]. As a direct consequence, we give exact formulas for the upper bound and the expected number of conflicting pairs. By doing so we provide a better theoretical understanding of the computational complexity of the generalized RF metric.
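For orientation, the classical Wiener index of a tree is the sum of shortest-path distances over all unordered pairs of vertices. The sketch below computes that classical quantity only; it does not implement the partial Wiener index or the generalized RF metric, whose definitions are not given in the abstract. The function name `wiener_index` and the adjacency-dictionary representation are assumptions.

```python
from collections import deque


def wiener_index(adj):
    """Classical Wiener index: sum of distances over unordered vertex pairs.

    `adj` maps each vertex to a list of its neighbours (undirected graph).
    """
    total = 0
    for src in adj:
        # Breadth-first search gives shortest-path distances from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every pair was counted once from each endpoint


# Rooted binary tree with two leaves (a "cherry"): root r, leaves a and b.
cherry = {"r": ["a", "b"], "a": ["r"], "b": ["r"]}

# Rooted binary tree with three leaves: r -> a, x and x -> b, c.
caterpillar = {
    "r": ["a", "x"],
    "a": ["r"],
    "x": ["r", "b", "c"],
    "b": ["x"],
    "c": ["x"],
}

print(wiener_index(cherry), wiener_index(caterpillar))
```

Running one breadth-first search per vertex takes quadratic time on a tree, which is adequate for small examples; linear-time methods for trees exist but are omitted here for brevity.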


Subjects
Algorithms , Biological Evolution , Animals , Phylogeny
3.
Algorithms Mol Biol ; 19(1): 21, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38863064

ABSTRACT

Metric multidimensional scaling is one of the classical methods for embedding data into low-dimensional Euclidean space. It creates the low-dimensional embedding by approximately preserving the pairwise distances between the input points. However, current state-of-the-art approaches only scale to a few thousand data points. For larger data sets such as those occurring in single-cell RNA sequencing experiments, the running time becomes prohibitively large and thus alternative methods such as PCA are widely used instead. Here, we propose a simple neural network-based approach for solving the metric multidimensional scaling problem that is orders of magnitude faster than previous state-of-the-art approaches, and hence scales to data sets with up to a few million cells. At the same time, it provides a non-linear mapping between high- and low-dimensional space that can place previously unseen cells in the same embedding.
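The phrase "approximately preserving the pairwise distances" can be made concrete with a stress function that compares distances before and after embedding. The sketch below uses the raw (unnormalized) stress, which is one common formulation and an assumption here; the abstract does not state which objective the neural network optimizes. The name `raw_stress` is hypothetical.

```python
import math
from itertools import combinations


def raw_stress(high, low):
    """Sum of squared differences between high- and low-dimensional distances.

    `high` and `low` are equally long sequences of coordinate tuples;
    point i in `low` is the embedding of point i in `high`.
    """
    total = 0.0
    for i, j in combinations(range(len(high)), 2):
        d_high = math.dist(high[i], high[j])  # Euclidean distance in input space
        d_low = math.dist(low[i], low[j])     # Euclidean distance in the embedding
        total += (d_high - d_low) ** 2
    return total


# Three collinear 3-D points can be embedded on a line without distortion.
high = [(0, 0, 0), (3, 4, 0), (6, 8, 0)]
exact = [(0,), (5,), (10,)]
collapsed = [(0,), (0,), (0,)]

print(raw_stress(high, exact), raw_stress(high, collapsed))
```

Evaluating this sum over all pairs costs quadratic time in the number of points, which hints at why methods that work with the full pairwise objective struggle beyond a few thousand points.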
