Streamlining data-intensive biology with workflow systems.
Reiter, Taylor; Brooks, Phillip T; Irber, Luiz; Joslin, Shannon E K; Reid, Charles M; Scott, Camille; Brown, C Titus; Pierce-Ward, N Tessa.
Affiliations
  • Reiter T; Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA.
  • Brooks PT; Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA.
  • Irber L; Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA.
  • Joslin SEK; Department of Animal Science, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA.
  • Reid CM; Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA.
  • Scott C; Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA.
  • Brown CT; Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA.
  • Pierce-Ward NT; Department of Population Health and Reproduction, University of California, Davis, 1 Shields Avenue, Davis, CA 95616, USA.
Gigascience; 10(1), 2021 Jan 13.
Article in English | MEDLINE | ID: mdl-33438730
ABSTRACT
As the scale of biological data generation has increased, the bottleneck of research has shifted from data generation to analysis. Researchers commonly need to build computational workflows that include multiple analytic tools and require incremental development as experimental insights demand tool and parameter modifications. These workflows can produce hundreds to thousands of intermediate files and results that must be integrated for biological insight. Data-centric workflow systems that internally manage computational resources, software, and conditional execution of analysis steps are reshaping the landscape of biological data analysis and empowering researchers to conduct reproducible analyses at scale. Adoption of these tools can facilitate and expedite robust data analysis, but knowledge of these techniques is still lacking. Here, we provide a series of strategies for leveraging workflow systems with structured project, data, and resource management to streamline large-scale biological analysis. We present these practices in the context of high-throughput sequencing data analysis, but the principles are broadly applicable to biologists working beyond this field.

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Software / Computational Biology | Language: En | Publication year: 2021 | Document type: Article
