Results 1 - 2 of 2
1.
Genome Biol ; 25(1): 130, 2024 05 21.
Article in English | MEDLINE | ID: mdl-38773520

ABSTRACT

Bulk DNA sequencing of multiple samples from the same tumor is becoming common, yet most methods to infer copy-number aberrations (CNAs) from this data analyze individual samples independently. We introduce HATCHet2, an algorithm to identify haplotype- and clone-specific CNAs simultaneously from multiple bulk samples. HATCHet2 extends the earlier HATCHet method by improving identification of focal CNAs and introducing a novel statistic, the minor haplotype B-allele frequency (mhBAF), that enables identification of mirrored-subclonal CNAs. We demonstrate HATCHet2's improved accuracy using simulations and a single-cell sequencing dataset. HATCHet2 analysis of 10 prostate cancer patients reveals previously unreported mirrored-subclonal CNAs affecting cancer genes.


Subjects
Algorithms; DNA Copy Number Variations; Haplotypes; Prostatic Neoplasms; Humans; Prostatic Neoplasms/genetics; Male; Sequence Analysis, DNA/methods; Neoplasms/genetics; Gene Frequency; Single-Cell Analysis
2.
PLoS One ; 14(8): e0221068, 2019.
Article in English | MEDLINE | ID: mdl-31437182

ABSTRACT

Clustering homologous sequences based on their similarity is a problem that appears in many bioinformatics applications. The fact that sequences cluster is ultimately the result of their phylogenetic relationships. Despite this observation and the natural ways in which a tree can define clusters, most applications of sequence clustering do not use a phylogenetic tree and instead operate on pairwise sequence distances. Due to advances in large-scale phylogenetic inference, we argue that tree-based clustering is under-utilized. We define a family of optimization problems that, given an arbitrary tree, return the minimum number of clusters such that all clusters adhere to constraints on their heterogeneity. We study three specific constraints, limiting (1) the diameter of each cluster, (2) the sum of its branch lengths, or (3) chains of pairwise distances. These three problems can be solved in time that increases linearly with the size of the tree, and for two of the three criteria, the algorithms have been known in the theoretical computer science literature. We implement these algorithms in a tool called TreeCluster, which we test on three applications: OTU clustering for microbiome data, HIV transmission clustering, and divide-and-conquer multiple sequence alignment. We show that, by using tree-based distances, TreeCluster generates more internally consistent clusters than alternatives and improves the effectiveness of downstream applications. TreeCluster is available at https://github.com/niemasd/TreeCluster.
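
As an illustration of constraint (1) in the abstract above, the following is a minimal Python sketch of a greedy bottom-up partition that limits each cluster's leaf-to-leaf diameter. It is not TreeCluster's code or interface: the function name `max_diameter_clusters`, the dictionary tree representation (`children`), and the toy tree are all assumptions made for this example. The sketch sorts the children of each node, so it runs in linear time only when the number of children per node is bounded (for instance, a binary tree).

```python
from collections import defaultdict


def max_diameter_clusters(children, root, threshold):
    """Cut edges greedily, bottom-up, so that no cluster contains two
    leaves farther apart than `threshold`. Returns a list of leaf sets.

    `children` maps a node to a list of (child, branch_length) pairs;
    nodes absent from the mapping are leaves (hypothetical representation).
    """
    # Iterative pre-order: every node appears after its parent.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(child for child, _ in children.get(node, []))

    reach = {}   # distance from a node to its farthest still-connected leaf
    cut = set()  # children whose edge to the parent has been cut
    for node in reversed(order):  # children are processed before parents
        kids = children.get(node, [])
        if not kids:
            reach[node] = 0.0
            continue
        # Depth of each child "arm", deepest first.
        arms = sorted(((reach[c] + w, c) for c, w in kids),
                      key=lambda arm: arm[0], reverse=True)
        i = 0
        # While the two deepest remaining arms form a path that is too
        # long, cut the deepest one and re-check.
        while i + 1 < len(arms) and arms[i][0] + arms[i + 1][0] > threshold:
            cut.add(arms[i][1])
            i += 1
        reach[node] = arms[i][0]

    # Label connected components top-down and collect their leaves.
    cluster_of = {root: 0}
    clusters = defaultdict(set)
    next_id = 1
    for node in order:
        kids = children.get(node, [])
        if not kids:
            clusters[cluster_of[node]].add(node)
        for child, _ in kids:
            if child in cut:
                cluster_of[child] = next_id
                next_id += 1
            else:
                cluster_of[child] = cluster_of[node]
    return list(clusters.values())


# Toy tree ((A:1,B:1):1,(C:1,D:1):1); node names are made up.
toy_tree = {
    "root": [("AB", 1.0), ("CD", 1.0)],
    "AB": [("A", 1.0), ("B", 1.0)],
    "CD": [("C", 1.0), ("D", 1.0)],
}
print(max_diameter_clusters(toy_tree, "root", threshold=2.5))
```

With a threshold of 2.5 the toy tree splits into the leaf sets {A, B} and {C, D}, because the path from A to C has length 4; raising the threshold to 4 keeps all four leaves in one cluster.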


Subjects
Algorithms; Computational Biology/statistics & numerical data; HIV/genetics; Microbiota/genetics; Phylogeny; Sequence Alignment/statistics & numerical data; Base Sequence; Cluster Analysis; Computational Biology/methods; HIV/classification; HIV Infections/epidemiology; HIV Infections/transmission; HIV Infections/virology; Humans; Software