1.
Small Methods ; 5(5): e2001094, 2021 05.
Article in English | MEDLINE | ID: mdl-34928102

ABSTRACT

Synthetic DNA has recently risen as a viable alternative for long-term digital data storage. To ensure that information is safely recovered after storage, it is essential to appropriately preserve the physical DNA molecules encoding the data. While preservation of biological DNA has been studied previously, synthetic DNA differs in that it is typically much shorter in length, it has different sequence profiles with fewer, if any, repeats (or homopolymers), and it has different contaminants. In this paper, nine different methods used to preserve data files encoded in synthetic DNA are evaluated by accelerated aging of nearly 29 000 DNA sequences. In addition to a molecular count comparison, the DNA is also sequenced and analyzed after aging. These findings show that errors and erasures are stochastic and show no practical distribution difference between preservation methods. Finally, the physical density of these methods is compared and the trade-offs between stability and density are discussed.


Subjects
DNA/chemistry , Base Sequence , DNA/metabolism , Half-Life , High-Throughput Nucleotide Sequencing , Magnetite Nanoparticles/chemistry , Polymerase Chain Reaction , DNA Sequence Analysis , Temperature , Time Factors , Trehalose/chemistry
2.
Nat Commun ; 11(1): 3264, 2020 06 29.
Article in English | MEDLINE | ID: mdl-32601272

ABSTRACT

DNA has recently emerged as an attractive medium for archival data storage. Recent work has demonstrated proof-of-principle prototype systems; however, very uneven (biased) sequencing coverage has been reported, which indicates inefficiencies in the storage process. Deviations from the average coverage in the sequence copy distribution can cause either wasteful provisioning in sequencing or an excessive number of missing sequences. Here, we use millions of unique sequences from a DNA-based digital data archival system to study the oligonucleotide copy unevenness problem and show that the two paramount sources of bias are the synthesis and amplification (PCR) processes. Based on these findings, we develop a statistical model for each molecular process as well as the overall process. We further use our model to explore the trade-offs between synthesis bias, storage physical density, logical redundancy, and sequencing redundancy, providing insights for engineering efficient, robust DNA data storage systems.


Subjects
Information Storage and Retrieval , DNA Sequence Analysis , Bias , Theoretical Models , DNA Sequence Analysis/methods , DNA Sequence Analysis/statistics & numerical data
4.
Nat Biotechnol ; 36(3): 242-248, 2018 03.
Article in English | MEDLINE | ID: mdl-29457795

ABSTRACT

Synthetic DNA is durable and can encode digital data with high density, making it an attractive medium for data storage. However, recovering stored data on a large scale currently requires all the DNA in a pool to be sequenced, even if only a subset of the information needs to be extracted. Here, we encode and store 35 distinct files (over 200 MB of data) in more than 13 million DNA oligonucleotides, and show that we can recover each file individually and with no errors, using a random access approach. We design and validate a large library of primers that enable individual recovery of all files stored within the DNA. We also develop an algorithm that greatly reduces the sequencing read coverage required for error-free decoding by maximizing information from all sequence reads. These advances demonstrate a viable, large-scale system for DNA data storage and retrieval.


Subjects
DNA/genetics , Information Storage and Retrieval , DNA Sequence Analysis/methods , Algorithms , High-Throughput Nucleotide Sequencing