MinimapR: A parallel alignment tool for the analysis of large-scale third-generation sequencing data.
Wang, Zihang; Cui, Yingbo; Peng, Shaoliang; Liao, Xiangke; Yu, Yangbo.
Affiliation
  • Wang Z; College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.
  • Cui Y; School of Computer, National University of Defense Technology, Changsha, China. Electronic address: yingbocui@nudt.edu.cn.
  • Peng S; College of Computer Science and Electronic Engineering, Hunan University, Changsha, China. Electronic address: slpeng@hnu.edu.cn.
  • Liao X; School of Computer, National University of Defense Technology, Changsha, China.
  • Yu Y; College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.
Comput Biol Chem ; 99: 107735, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35850048
ABSTRACT
The development of third-generation sequencing technology has significantly changed genomics. Compared with second-generation sequencing methods, third-generation technologies produce reads roughly 100 times longer, which reveal new genomic variations and fill long-standing gaps in the human reference genome. However, the excessive length and high error rate of these reads greatly increase the data volume and the cost of alignment. Traditional data analysis platforms and serial sequence alignment methods cannot deal effectively with large-scale long-read alignment. There is a critical need for a novel data analysis platform that can deliver fast alignment of large-scale sequences. High-performance computing platforms, and efficient, scalable algorithms built on them, have significant potential to impact sequence analysis approaches. This paper presents minimapR, a multi-level parallel long-read alignment tool based on minimap2, a popular third-generation read aligner. MinimapR is built on Ray, a new high-performance distributed framework. Ray integrates fully with the Python environment and can be easily installed with pip. MinimapR can utilize the power of multiple computing nodes, significantly accelerating alignment without sacrificing sensitivity. The minimapR tool was tested on 64 nodes and demonstrated a 50-fold increase in speed with 78% parallel efficiency. The source code and user manual of minimapR are freely available at https://github.com/Geehome/minimapR.
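The efficiency figure in the abstract can be checked with a short calculation. The sketch below assumes the conventional definition of parallel efficiency (speedup divided by the number of nodes); the abstract does not state which definition the authors used, and the function name `parallel_efficiency` is illustrative only, not part of minimapR.

```python
def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Return speedup divided by node count (1.0 means perfect linear scaling).

    Assumes the conventional definition; the abstract does not spell it out.
    """
    if nodes <= 0:
        raise ValueError("nodes must be a positive integer")
    return speedup / nodes


# Figures reported in the abstract: 50-fold speedup on 64 nodes.
efficiency = parallel_efficiency(speedup=50.0, nodes=64)
print(f"{efficiency:.0%}")  # prints 78%
```

Under this definition, 50 / 64 = 0.78125, which is consistent with the reported 78%.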

Full text: 1 Database: MEDLINE Main subject: Software / High-Throughput Nucleotide Sequencing Limit: Humans Language: En Year of publication: 2022 Document type: Article