Explorer: efficient DNA coding by De Bruijn graph toward arbitrary local and global biochemical constraints.
Brief Bioinform
; 25(5)2024 Jul 25.
Article
en En
| MEDLINE
| ID: mdl-39073829
ABSTRACT
With the exponential growth of digital data, there is a pressing need for innovative storage media and techniques. DNA molecules, due to their stability, storage capacity, and density, offer a promising solution for information storage. However, DNA storage also faces numerous challenges, such as complex biochemical constraints and encoding efficiency. This paper presents Explorer, a high-efficiency DNA coding algorithm based on the De Bruijn graph, which leverages its capability to characterize local sequences. Explorer enables coding under various biochemical constraints, such as homopolymers, GC content, and undesired motifs. This paper also introduces Codeformer, a fast decoding algorithm based on the transformer architecture, to further enhance decoding efficiency. Numerical experiments indicate that, compared with other advanced algorithms, Explorer not only achieves stable encoding and decoding under various biochemical constraints but also increases the encoding efficiency and bit rate by ¿10%. Additionally, Codeformer demonstrates the ability to efficiently decode large quantities of DNA sequences. Under different parameter settings, its decoding efficiency exceeds that of traditional algorithms by more than two-fold. When Codeformer is combined with Reed-Solomon code, its decoding accuracy exceeds 99%, making it a good choice for high-speed decoding applications. These advancements are expected to contribute to the development of DNA-based data storage systems and the broader exploration of DNA as a novel information storage medium.
Palabras clave
Texto completo:
1
Colección:
01-internacional
Banco de datos:
MEDLINE
Asunto principal:
Algoritmos
/
ADN
Idioma:
En
Revista:
Brief Bioinform
/
Brief. bioinform
/
Briefings in bioinformatics
Asunto de la revista:
BIOLOGIA
/
INFORMATICA MEDICA
Año:
2024
Tipo del documento:
Article
País de afiliación:
China