SPLASH: A statistical, reference-free genomic algorithm unifies biological discovery.
Cell; 186(25): 5440-5456.e26, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38065078
Today's genomics workflows typically require alignment to a reference sequence, which limits discovery. We introduce a unifying paradigm, SPLASH (Statistically Primary aLignment Agnostic Sequence Homing), which directly analyzes raw sequencing data, using a statistical test to detect a signature of regulation: sample-specific sequence variation. SPLASH detects many types of variation and can be efficiently run at scale. We show that SPLASH identifies complex mutation patterns in SARS-CoV-2, discovers regulated RNA isoforms at the single-cell level, detects the vast sequence diversity of adaptive immune receptors, and uncovers biology in non-model organisms undocumented in their reference genomes: geographic and seasonal variation and diatom association in eelgrass, an oceanic plant impacted by climate change, and tissue-specific transcripts in octopus. SPLASH is a unifying approach to genomic analysis that enables expansive discovery without metadata or references.
Full text: 1
Databases: MEDLINE
Main subject: Algorithms / Genomics
Limits: Humans
Language: English
Journal: Cell
Year: 2023
Document type: Article
Affiliation country: United States