KAnalyze: a fast versatile pipelined k-mer toolkit.

Audano, Peter; Vannberg, Fredrik

Affiliation

Audano P; School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.
Vannberg F; School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.

Bioinformatics; 30(14): 2070-2, 2014 Jul 15.

Article in English | MEDLINE | ID: mdl-24642064

ABSTRACT

MOTIVATION:

Converting nucleotide sequences into short overlapping fragments of uniform length, k-mers, is a common step in many bioinformatics applications. While existing software packages count k-mers, few are optimized for speed, offer an application programming interface (API) or a graphical interface, or contain features that make them extensible and maintainable. We designed KAnalyze to compete with the fastest k-mer counters, to produce reliable output and to support future development efforts through well-architected, documented and testable code. Currently, KAnalyze can output k-mer counts in a sorted tab-delimited file or stream k-mers as they are read. KAnalyze can process large datasets with 2 GB of memory. This project is implemented in Java 7, and the command line interface (CLI) is designed to integrate into pipelines written in any language.
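To make the k-mer definition above concrete, the following is a minimal Python sketch of counting overlapping k-mers in a single sequence and printing them as sorted tab-delimited lines. It is not KAnalyze's implementation or API: the function names `count_kmers` and `format_sorted_tsv`, the `k-mer<TAB>count` column order, and the choice to skip windows containing non-ACGT characters are all assumptions made for illustration.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping k-mer in one nucleotide sequence.

    Hypothetical helper for illustration; not part of KAnalyze.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    sequence = sequence.upper()
    counts = Counter()
    # Slide a window of width k across the sequence one base at a time.
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        # Assumption of this sketch: windows with ambiguous bases are skipped.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def format_sorted_tsv(counts):
    """Render counts as lines of 'k-mer<TAB>count', sorted by k-mer."""
    return "\n".join(f"{kmer}\t{count}" for kmer, count in sorted(counts.items()))


if __name__ == "__main__":
    print(format_sorted_tsv(count_kmers("ACGTACGT", 3)))
```

For the sequence `ACGTACGT` with k = 3, the windows are ACG, CGT, GTA, TAC, ACG and CGT, so ACG and CGT are each counted twice and GTA and TAC once. An in-memory counter like this only suits small inputs; handling large datasets within a fixed memory budget, as the abstract describes, requires a different design.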

RESULTS:

As a k-mer counter, KAnalyze outperforms Jellyfish, DSK and a pipeline built on Perl and Linux utilities. Through extensive unit and system testing, we have verified that KAnalyze produces the correct k-mer counts over multiple datasets and k-mer sizes.

AVAILABILITY AND IMPLEMENTATION:

KAnalyze is available on SourceForge: https://sourceforge.net/projects/kanalyze/.

Subjects

Sequence Analysis, DNA/methods; Software; Algorithms; Chromosomes, Human, Pair 1/chemistry; Humans

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Software / Sequence Analysis, DNA Limits: Humans Language: En Year of publication: 2014 Document type: Article