Scalable and efficient DNA sequencing analysis on different compute infrastructures aiding variant discovery.
Hanssen, Friederike; Garcia, Maxime U; Folkersen, Lasse; Pedersen, Anders Sune; Lescai, Francesco; Jodoin, Susanne; Miller, Edmund; Seybold, Matthias; Wacker, Oskar; Smith, Nicholas; Gabernet, Gisela; Nahnsen, Sven.
Affiliation
  • Hanssen F; Quantitative Biology Center, Eberhard-Karls University of Tübingen, Otfried-Müller Str. 37, Tübingen 72076, Baden-Württemberg, Germany.
  • Garcia MU; Department of Computer Science, Eberhard-Karls University of Tübingen, 72076 Baden-Württemberg, Germany.
  • Folkersen L; M3 Research Center, University Hospital, Otfried-Müller Str. 37, Tübingen 72076, Baden-Württemberg, Germany.
  • Pedersen AS; Cluster of Excellence iFIT (EXC 2180) 'Image-Guided and Functionally Instructed Tumor Therapies', Eberhard-Karls University of Tübingen, Tübingen 72076, Baden-Württemberg, Germany.
  • Lescai F; Seqera Labs, Carrer de Marià Aguilò, 28, Barcelona 08005, Spain.
  • Jodoin S; Barntumörbanken, Department of Oncology-Pathology, Karolinska Institutet, BioClinicum, Visionsgatan 4, Solna 17164, Sweden.
  • Miller E; National Genomics Infrastructure, SciLifeLab, Tomtebodavägen 23, Solna 17165, Sweden.
  • Seybold M; Nucleus Genomics, 584 Broadway, New York, 10012 NY, USA.
  • Wacker O; National Genome Center Denmark, Ørestads Boulevard 5, Copenhagen 2300, Denmark.
  • Smith N; Department of Biology and Biotechnology "L. Spallanzani", University of Pavia, via Ferrata, 9, Pavia, 27100 PV, Italy.
  • Gabernet G; Quantitative Biology Center, Eberhard-Karls University of Tübingen, Otfried-Müller Str. 37, Tübingen 72076, Baden-Württemberg, Germany.
  • Nahnsen S; M3 Research Center, University Hospital, Otfried-Müller Str. 37, Tübingen 72076, Baden-Württemberg, Germany.
NAR Genom Bioinform; 6(2): lqae031, 2024 Jun.
Article in En | MEDLINE | ID: mdl-38666213
ABSTRACT
DNA variation analysis has become indispensable in many aspects of modern biomedicine, most prominently in the comparison of normal and tumor samples. Thousands of samples are collected in local sequencing efforts and public databases, requiring highly scalable, portable, and automated workflows for streamlined processing. Here, we present nf-core/sarek 3, a well-established, comprehensive variant calling and annotation pipeline for germline and somatic samples. It is suitable for any genome with a known reference. We present a full rewrite of the original pipeline, showing a significant reduction of storage requirements by using the CRAM format and of runtime by increasing intra-sample parallelization. Together, these lead to a 70% cost reduction in commercial clouds, enabling users to do large-scale and cross-platform data analysis while keeping costs and CO2 emissions low. The code is available at https://nf-co.re/sarek.

Full text: 1 Collection: 01-international Database: MEDLINE Language: En Journal: NAR Genom Bioinform Year: 2024 Document type: Article Country of affiliation: Germany
