Accelerating minimap2 for long-read sequencing applications on modern CPUs.
Nat Comput Sci; 2(2): 78-83, 2022 Feb.
Article | En | MEDLINE | ID: mdl-38177520
ABSTRACT
Long-read sequencing is now routinely used at scale for genomics and transcriptomics applications. Mapping long reads or a draft genome assembly to a reference sequence is often one of the most time-consuming steps in these applications. Here we present techniques to accelerate minimap2, a widely used software tool for this task. We present multiple optimizations using single-instruction multiple-data parallelization, efficient cache utilization and a learned index data structure to accelerate the three main computational modules of minimap2: seeding, chaining and pairwise sequence alignment. These optimizations reduce the end-to-end mapping time of minimap2 by up to 1.8-fold while maintaining identical output.
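The abstract names a learned index data structure but does not describe how it is built, so the following is only a generic sketch of the idea and not the authors' implementation. It assumes a non-empty sorted list of unique integer keys (standing in for seed hash values), fits a single linear model that predicts each key's position, and records the worst prediction error so a lookup only has to search a small window. The class name `LinearLearnedIndex` and its methods are hypothetical.

```python
from bisect import bisect_left


class LinearLearnedIndex:
    """Toy learned index over sorted, unique integer keys (illustrative only)."""

    def __init__(self, sorted_keys):
        # Assumption: sorted_keys is non-empty, sorted ascending, without duplicates.
        self.keys = list(sorted_keys)
        n = len(self.keys)
        # Least-squares fit of position ~ slope * key + intercept.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var_k = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (p - mean_p) for p, k in enumerate(self.keys))
        self.slope = cov / var_k if var_k else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error over the stored keys bounds the search window.
        self.max_err = max(abs(self._predict(k) - p) for p, k in enumerate(self.keys))

    def _predict(self, key):
        # Model's guess for the position of key, rounded to an integer.
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key in the sorted list, or -1 if it is absent."""
        n = len(self.keys)
        guess = self._predict(key)
        # Clamp the window [guess - max_err, guess + max_err] to valid indices.
        lo = max(0, min(n, guess - self.max_err))
        hi = max(0, min(n, guess + self.max_err + 1))
        # Binary search only inside the bounded window.
        i = bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else -1


keys = [3, 8, 15, 21, 40, 41, 97, 120]
index = LinearLearnedIndex(keys)
positions = [index.lookup(k) for k in keys]
```

The design trades a hash probe or a full binary search for one arithmetic prediction plus a search confined to `2 * max_err + 1` slots; the better the model fits the key distribution, the smaller that window becomes.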
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Language: En
Journal: Nat Comput Sci
Year: 2022
Document type: Article
Country of affiliation: India