Search | BVS - Ministry of Health

Reconstruction of Viral Variants via Monte Carlo Clustering.

Juyal, Akshay; Hosseini, Roya; Novikov, Daniel; Grinshpon, Mark; Zelikovsky, Alex.

J Comput Biol; 30(9): 1009-1018, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37695837

ABSTRACT

Identifying viral variants through clustering is essential for understanding the composition and structure of viral populations within and between hosts, which play a crucial role in disease progression and epidemic spread. This article proposes and validates novel Monte Carlo (MC) methods for clustering aligned viral sequences by minimizing either entropy or Hamming distance from consensuses. We validate these methods on four benchmarks: two SARS-CoV-2 interhost data sets and two HIV intrahost data sets. A parallelized version of our tool is scalable to very large data sets. We show that both entropy and Hamming distance-based MC clusterings discern the meaningful information from sequencing data. The proposed clustering methods consistently converge to similar clusterings across different runs. Finally, we show that MC clustering improves reconstruction of intrahost viral population from sequencing data.

Subjects

COVID-19, Humans, COVID-19/genetics, SARS-CoV-2/genetics, Benchmarking, Cluster Analysis, Disease Progression
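
The abstract of this record describes Monte Carlo clustering of aligned sequences by minimizing Hamming distance from cluster consensuses. The abstract gives no algorithmic detail, so the following is only a minimal sketch of that general idea, not the authors' tool: it proposes random single-sequence reassignments and keeps those that do not increase the total distance. The function names (consensus, hamming, cost, mc_cluster), the greedy acceptance rule, and the parameters k, steps, and seed are all assumptions made for illustration.

```python
import random
from collections import Counter


def consensus(seqs):
    """Column-wise majority character of equal-length aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def cost(seqs, labels, k):
    """Total Hamming distance of every sequence to its cluster consensus."""
    total = 0
    for c in range(k):
        members = [s for s, lab in zip(seqs, labels) if lab == c]
        if members:  # empty clusters contribute nothing
            cons = consensus(members)
            total += sum(hamming(s, cons) for s in members)
    return total


def mc_cluster(seqs, k, steps=2000, seed=0):
    """Hypothetical Monte Carlo search: random moves, greedy acceptance."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in seqs]  # random initial clustering
    best = cost(seqs, labels, k)
    for _ in range(steps):
        i = rng.randrange(len(seqs))  # pick a random sequence
        old, new = labels[i], rng.randrange(k)
        if new == old:
            continue
        labels[i] = new  # tentatively move it to another cluster
        c = cost(seqs, labels, k)
        if c <= best:
            best = c  # keep moves that do not worsen the objective
        else:
            labels[i] = old  # otherwise undo the move
    return labels, best


if __name__ == "__main__":
    toy = ["AAAA", "AAAA", "AAAT", "CCCC", "CCCC", "CCCG"]  # made-up data
    print(mc_cluster(toy, k=2))
```

An entropy-based variant could be sketched the same way by replacing cost with a sum of per-column Shannon entropies within each cluster; recomputing the full cost at every step is kept here for clarity rather than speed.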

Scalable Reconstruction of SARS-CoV-2 Phylogeny with Recurrent Mutations.

Novikov, Daniel; Knyazev, Sergey; Grinshpon, Mark; Icer, Pelin; Skums, Pavel; Zelikovsky, Alex.

J Comput Biol; 28(11): 1130-1141, 2021 Nov.

Article in English | MEDLINE | ID: mdl-34698524

ABSTRACT

This article presents a novel scalable character-based phylogeny algorithm for dense viral sequencing data called SPHERE (Scalable PHylogEny with REcurrent mutations). The algorithm is based on an evolutionary model where recurrent mutations are allowed, but backward mutations are prohibited. The algorithm creates rooted character-based phylogeny trees, wherein all leaves and internal nodes are labeled by observed taxa. We show that SPHERE phylogeny is more stable than Nextstrain's, and that it accurately infers known transmission links from the early pandemic. SPHERE is a fast algorithm that can process >200,000 sequences in <2 hours, which offers a compact phylogenetic visualization of Global Initiative on Sharing All Influenza Data (GISAID).

Subjects

Mutation, Phylogeny, SARS-CoV-2/genetics, Algorithms, COVID-19/transmission, COVID-19/virology, Genetic Databases, Humans
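
The evolutionary model in the abstract of this record (recurrent mutations allowed, backward mutations prohibited, every node labeled by an observed taxon) can be illustrated with a toy parent-selection rule. This is not the SPHERE algorithm, whose details the abstract does not give; it is a hedged sketch that assumes each taxon is represented as a set of mutations relative to an observed reference, and that a taxon's parent is the observed taxon with the largest mutation set strictly contained in its own. The function name build_tree and the mutation labels m1 to m3 are hypothetical.

```python
def build_tree(taxa):
    """Map each taxon to a parent taxon (the root maps to None).

    taxa: dict of name -> frozenset of mutations relative to the root.
    Assumes distinct mutation sets and an observed root with no mutations.
    """
    roots = [name for name, muts in taxa.items() if not muts]
    if len(roots) != 1:
        raise ValueError("expected exactly one taxon with no mutations")
    root = roots[0]
    parent = {root: None}
    for name, muts in taxa.items():
        if name == root:
            continue
        # No backward mutations: a parent may only carry a proper subset
        # of the child's mutations.
        candidates = [p for p, pm in taxa.items() if pm < muts]
        # Prefer the closest ancestor (fewest new mutations on the edge);
        # ties are broken by name to keep the result deterministic.
        parent[name] = max(candidates, key=lambda p: (len(taxa[p]), p))
    return parent


if __name__ == "__main__":
    toy = {  # made-up data; m3 arises independently on two branches
        "ref": frozenset(),
        "A": frozenset({"m1"}),
        "B": frozenset({"m1", "m3"}),
        "C": frozenset({"m2"}),
        "D": frozenset({"m2", "m3"}),
    }
    print(build_tree(toy))
```

In the toy input, taxa B and D both carry m3 but descend from different parents, which is how a recurrent mutation shows up under this rule. Because every parent has strictly fewer mutations than its child, the result cannot contain a cycle.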
