GRASShopPER-An algorithm for de novo assembly based on GPU alignments.
Swiercz, Aleksandra; Frohmberg, Wojciech; Kierzynka, Michal; Wojciechowski, Pawel; Zurkowski, Piotr; Badura, Jan; Laskowski, Artur; Kasprzak, Marta; Blazewicz, Jacek.
Affiliations
  • Swiercz A; Institute of Computing Science, Poznan University of Technology, Poznan, Poland.
  • Frohmberg W; Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland.
  • Kierzynka M; European Centre for Bioinformatics and Genomics, Poznan, Poland.
  • Wojciechowski P; Institute of Computing Science, Poznan University of Technology, Poznan, Poland.
  • Zurkowski P; European Centre for Bioinformatics and Genomics, Poznan, Poland.
  • Badura J; Institute of Computing Science, Poznan University of Technology, Poznan, Poland.
  • Laskowski A; European Centre for Bioinformatics and Genomics, Poznan, Poland.
  • Kasprzak M; Poznan Supercomputing and Networking Center, Poznan, Poland.
  • Blazewicz J; Institute of Computing Science, Poznan University of Technology, Poznan, Poland.
PLoS One ; 13(8): e0202355, 2018.
Article in English | MEDLINE | ID: mdl-30114279
ABSTRACT
Next-generation sequencers produce billions of short DNA sequences in a massively parallel manner, which makes accurate de novo reconstruction of a genome sequence from these short sequences a great computational challenge. Here, we propose the GRASShopPER assembler, which follows the overlap-layout-consensus approach. It uses an efficient GPU implementation for sequence alignment during the graph construction stage and a greedy hyper-heuristic algorithm at the fork detection stage. A two-part fork detection method allows us to identify repeated fragments of a genome and to reconstruct them without misassemblies. The assemblies of data sets from the bacterium Candidatus Microthrix, the nematode Caenorhabditis elegans, and human chromosome 14 were evaluated with the gold-standard tool QUAST. In comparison with other assemblers, GRASShopPER produced contigs that covered the largest part of the genomes while keeping good values of other metrics, e.g., NG50 and misassembly rate.
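The abstract names NG50 as one of the evaluation metrics. As a rough illustration only, the following Python sketch computes NG50 under its usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the reference genome length. The function name `ng50` and the toy contig lengths are assumptions made for this example; they are not taken from GRASShopPER or QUAST.

```python
from typing import Iterable, Optional


def ng50(contig_lengths: Iterable[int], genome_length: int) -> Optional[int]:
    """Return NG50 for a set of contig lengths and a reference genome length.

    Contigs are visited from longest to shortest; the result is the length
    of the contig at which the running total first reaches half of the
    genome length. Returns None if the contigs never reach that threshold.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        # Integer comparison avoids rounding issues for odd genome lengths.
        if 2 * covered >= genome_length:
            return length
    return None


if __name__ == "__main__":
    # Hypothetical contig lengths and genome length, for illustration only.
    toy_contigs = [80, 70, 50, 40, 30, 20]
    print(ng50(toy_contigs, genome_length=400))
```

Unlike N50, which is computed against the total assembly length, NG50 is computed against the reference genome length, so assemblies of different total size can be compared on the same scale.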
Subjects

Full text: 1 | Database: MEDLINE | Main subject: Algorithms / Sequence Analysis, DNA / High-Throughput Nucleotide Sequencing | Study type: Evaluation_studies / Prognostic_studies | Limits: Animals / Humans | Language: En | Publication year: 2018 | Document type: Article
