Study of the error correction capability of multiple sequence alignment algorithm (MAFFT) in DNA storage.
Xie, Ranze; Zan, Xiangzhen; Chu, Ling; Su, Yanqing; Xu, Peng; Liu, Wenbin.
Affiliation
  • Xie R; Institution of Computational Science and Technology, Guangzhou University, Guangzhou, 510006, China.
  • Zan X; Institution of Computational Science and Technology, Guangzhou University, Guangzhou, 510006, China.
  • Chu L; Institution of Computational Science and Technology, Guangzhou University, Guangzhou, 510006, China.
  • Su Y; Institution of Computational Science and Technology, Guangzhou University, Guangzhou, 510006, China.
  • Xu P; Institution of Computational Science and Technology, Guangzhou University, Guangzhou, 510006, China. gdxupeng@gzhu.edu.cn.
  • Liu W; Institution of Computational Science and Technology, Guangzhou University, Guangzhou, 510006, China. wbliu6910@gzhu.edu.cn.
BMC Bioinformatics ; 24(1): 111, 2023 Mar 23.
Article in En | MEDLINE | ID: mdl-36959531
Synchronization (insertion-deletion) errors remain a major challenge for reliable information retrieval in DNA storage. Unlike traditional error correction codes (ECC), which add redundancy to the stored information, multiple sequence alignment (MSA) addresses this problem by searching for conserved subsequences across the sequenced reads. In this paper, we conduct a comprehensive simulation study of the error correction capability of a typical MSA algorithm, MAFFT. Our results reveal that its capability exhibits a phase transition at an error rate of around 20%. Below this critical value, increasing the sequencing depth eventually allows it to approach complete recovery. Above it, performance plateaus at a poor level. Given a reasonable sequencing depth (≤ 70), MSA can achieve complete recovery in the low error regime and effectively correct 90% of the errors in the medium error regime. In addition, MSA is robust to imperfect clustering. It can also be combined with other means such as ECC, repeated markers, or any other code constraints. Furthermore, by selecting an appropriate sequencing depth, this strategy can achieve an optimal trade-off between cost and reading speed. MSA could be a competitive alternative for future DNA storage.
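To illustrate how conserved positions across aligned reads can correct insertions, deletions, and substitutions, the following is a minimal sketch of a column-wise majority vote over reads that have already been aligned. It is not the authors' pipeline: the function name `consensus`, the toy reads, and the assumption that an aligner such as MAFFT has already padded the reads to equal length with `-` gap characters are all illustrative.

```python
from collections import Counter


def consensus(aligned_reads):
    """Column-wise majority vote over equal-length aligned reads.

    Assumption: reads were aligned beforehand and '-' marks a gap.
    Columns whose most common symbol is a gap are dropped from the output.
    """
    if not aligned_reads:
        return ""
    width = len(aligned_reads[0])
    if any(len(read) != width for read in aligned_reads):
        raise ValueError("aligned reads must all have the same length")

    recovered = []
    for column in zip(*aligned_reads):
        # Most frequent symbol in this column (ties are not handled specially).
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != "-":
            recovered.append(symbol)
    return "".join(recovered)


# Hypothetical toy cluster: five noisy copies of one short strand, already aligned.
aligned = [
    "ACGT-ACGT",  # error-free read
    "ACGTTACGT",  # one inserted base
    "ACG--ACGT",  # one deleted base
    "ACGT-ACCT",  # one substituted base
    "ACGT-ACGT",  # error-free read
]

print(consensus(aligned))
```

In this toy cluster each error affects only one read per column, so the vote is not swayed by it. With fewer reads or a higher error rate, the majority in a column can itself be wrong, which is consistent with the dependence on sequencing depth and error rate described in the abstract.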

Full text: 1 | Database: MEDLINE | Main subject: Algorithms / DNA | Study type: Prognostic_studies | Language: En | Publication year: 2023 | Document type: Article