Pangenome graph layout by Path-Guided Stochastic Gradient Descent.
Heumos, Simon; Guarracino, Andrea; Schmelzle, Jan-Niklas M; Li, Jiajie; Zhang, Zhiru; Hagmann, Jörg; Nahnsen, Sven; Prins, Pjotr; Garrison, Erik.
Affiliation
  • Heumos S; Quantitative Biology Center (QBiC), University of Tübingen, Tübingen 72076, Germany.
  • Guarracino A; Biomedical Data Science, Department of Computer Science, University of Tübingen, Tübingen 72076, Germany.
  • Schmelzle JM; Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA.
  • Li J; Genomics Research Centre, Human Technopole, Milan 20157, Italy.
  • Zhang Z; Department of Computer Engineering, School of Computation, Information and Technology (CIT), Technical University of Munich, Munich 80333, Germany.
  • Hagmann J; School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA.
  • Nahnsen S; School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA.
  • Prins P; School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA.
  • Garrison E; Computomics GmbH, Eisenbahnstr. 1, 72072 Tübingen, Germany.
bioRxiv; 2023 Oct 17.
Article in En | MEDLINE | ID: mdl-37790531
ABSTRACT
Motivation:

The increasing availability of complete genomes demands models to study genomic variability within entire populations. Pangenome graphs capture the full genomic similarity and diversity between multiple genomes. In order to understand them, we need to see them. For visualization, we need a human-readable graph layout: a graph embedding in low-dimensional (e.g. two-dimensional) depictions. Due to a pangenome graph's potentially excessive size, this is a significant challenge.

Results:

In response, we introduce a novel graph layout algorithm: Path-Guided Stochastic Gradient Descent (PG-SGD). PG-SGD uses the genomes, represented in the pangenome graph as paths, as an embedded positional system to sample genomic distances between pairs of nodes. This avoids the quadratic cost seen in previous versions of graph drawing by Stochastic Gradient Descent (SGD). We show that our implementation efficiently computes low-dimensional layouts of gigabase-scale pangenome graphs, unveiling their biological features.
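
The path-guided sampling idea can be illustrated with a toy sketch. This is not the ODGI implementation: the function names (`path_positions`, `pg_sgd_layout`), the single 2D point per node, the uniform sampling of step pairs, the 1/d² weighting, and the exponential learning-rate schedule are all assumptions made for illustration. The only element taken from the abstract is that target distances come from nucleotide offsets along paths, so no all-pairs shortest-path matrix over the graph is needed.

```python
import math
import random


def path_positions(path, node_len):
    """Nucleotide offset at which each step of a path starts."""
    offsets, pos = [], 0
    for node in path:
        offsets.append(pos)
        pos += node_len[node]
    return offsets


def pg_sgd_layout(paths, node_len, iters=30, eps=0.1, seed=0):
    """Toy path-guided SGD layout (hypothetical sketch, one 2D point per node).

    paths    : list of paths, each a list of node ids (a genome walk)
    node_len : dict mapping node id -> sequence length in nucleotides (> 0)
    """
    rng = random.Random(seed)
    xy = {n: [rng.random(), rng.random()] for n in node_len}
    offsets = [path_positions(p, node_len) for p in paths]
    usable = [k for k, p in enumerate(paths) if len(p) >= 2]
    updates_per_iter = 10 * sum(len(p) for p in paths)

    # Assumed schedule: learning rate decays exponentially from the square of
    # the largest possible target distance to eps times the square of the
    # smallest one.
    d_max = max(offsets[k][-1] for k in usable)
    d_min = min(node_len.values())
    eta_max, eta_min = float(d_max) ** 2, eps * float(d_min) ** 2
    lam = math.log(eta_max / eta_min) / max(iters - 1, 1)

    for t in range(iters):
        eta = eta_max * math.exp(-lam * t)
        for _ in range(updates_per_iter):
            # Sample one path, then two distinct steps on that path.
            k = rng.choice(usable)
            i, j = rng.sample(range(len(paths[k])), 2)
            a, b = paths[k][i], paths[k][j]
            d = abs(offsets[k][i] - offsets[k][j])  # target distance
            if a == b or d == 0:
                continue  # a path may revisit a node; skip self-pairs
            dx = xy[a][0] - xy[b][0]
            dy = xy[a][1] - xy[b][1]
            mag = math.hypot(dx, dy) or 1e-9
            mu = min(1.0, eta / (d * d))  # capped step size, weight 1/d^2
            r = mu * (mag - d) / (2.0 * mag)
            # Move both endpoints symmetrically toward the target distance.
            xy[a][0] -= r * dx
            xy[a][1] -= r * dy
            xy[b][0] += r * dx
            xy[b][1] += r * dy
    return xy
```

For example, calling `pg_sgd_layout([[1, 2, 3, 4, 5]], {n: 10 for n in range(1, 6)})` lays out a single linear path of five 10-nucleotide nodes. Each update touches only one sampled pair, so the work per iteration scales with the number of path steps rather than with the number of node pairs.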

Availability:

We integrated PG-SGD into ODGI, which is released as free software under the MIT open source license. Source code is available at https://github.com/pangenome/odgi.

Full text: 1 Collection: 01-international Database: MEDLINE Language: En Journal: BioRxiv Year: 2023 Document type: Article Country of affiliation: Germany

...