A binning tool to reconstruct viral haplotypes from assembled contigs.
Chen, Jiao; Shang, Jiayu; Wang, Jianrong; Sun, Yanni.
Affiliation
  • Chen J; Computer Science and Engineering, Michigan State University, East Lansing, 48824, USA.
  • Shang J; Electrical Engineering, City University of Hong Kong, Hong Kong, China.
  • Wang J; Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, 48824, USA.
  • Sun Y; Electrical Engineering, City University of Hong Kong, Hong Kong, China. yannisun@cityu.edu.hk.
BMC Bioinformatics ; 20(1): 544, 2019 Nov 04.
Article in English | MEDLINE | ID: mdl-31684876
ABSTRACT

BACKGROUND:

Infections by RNA viruses such as influenza and HIV still pose a serious threat to human health despite extensive research on viral diseases. One challenge in developing effective prevention and treatment strategies is high intra-species genetic diversity. Because different strains may have different biological properties, characterizing this genetic diversity is important for vaccine and drug design. Next-generation sequencing technology enables comprehensive characterization of both known and novel strains and has been widely adopted for sequencing viral populations. However, genome-scale reconstruction of haplotypes remains a challenging problem. In particular, haplotype assembly programs often produce contigs rather than full genomes. Because a mutation in one gene can mask the phenotypic effects of a mutation at another locus, these contigs still need to be clustered into genome-scale haplotypes.

RESULTS:

We developed a contig binning tool, VirBin, which clusters contigs into groups so that each group represents a haplotype. Commonly used features based on sequence composition and contig coverage cannot effectively distinguish viral haplotypes, because the haplotypes are highly similar in sequence and sequencing coverage of RNA viruses is heterogeneous. VirBin applies prototype-based clustering to regions that are more likely to contain mutations specific to a haplotype. The tool was tested on multiple simulated sequencing data sets with different haplotype abundance distributions and contig sizes, and also on mock quasispecies sequencing data. Benchmarks against other contig binning tools demonstrated the superior sensitivity and precision of VirBin in contig binning for viral haplotype reconstruction.
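
As a rough illustration of what "prototype-based clustering" means in general, the sketch below runs plain k-means (one common prototype-based method) on a single made-up numeric feature per region. It is not VirBin's algorithm: the abstract does not say which features or which clustering variant VirBin uses, and the function name `kmeans_1d`, the region names, and the feature values are all hypothetical.

```python
def kmeans_1d(values, k, max_iters=100):
    """Plain k-means (Lloyd's algorithm) on scalar values.

    Returns (labels, prototypes). Each prototype is the mean of the
    values assigned to it. Illustrative only; not VirBin's method.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")

    # Deterministic initialisation: pick k evenly spaced sorted values.
    ordered = sorted(values)
    n = len(ordered)
    prototypes = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]

    labels = []
    for _ in range(max_iters):
        # Assignment step: each value goes to its nearest prototype.
        labels = [
            min(range(k), key=lambda j: abs(v - prototypes[j]))
            for v in values
        ]
        # Update step: each prototype becomes the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else prototypes[j])
        if updated == prototypes:  # converged
            break
        prototypes = updated
    return labels, prototypes


# Hypothetical per-region feature values (assumed for illustration).
regions = {
    "region_a": 0.52, "region_b": 0.48,
    "region_c": 0.31, "region_d": 0.29,
    "region_e": 0.21, "region_f": 0.19,
}
labels, prototypes = kmeans_1d(list(regions.values()), k=3)
for name, label in zip(regions, labels):
    print(name, "-> group", label)
```

Here each group is summarized by a single prototype value, and regions are assigned to whichever prototype is closest; in a binning setting each resulting group would correspond to one candidate haplotype.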

CONCLUSIONS:

In this work, we presented VirBin, a new contig binning tool for distinguishing contigs from different viral haplotypes with high sequence similarity. It competes favorably with other tools on viral contig binning. The source code is available at https://github.com/chjiao/VirBin .

Full text: 1 Database: MEDLINE Main subject: RNA Viruses / Computational Biology Type of study: Evaluation_studies Limits: Humans Language: English Year of publication: 2019 Document type: Article