Your browser doesn't support javascript.
loading
OVarFlow: a resource optimized GATK 4 based Open source Variant calling workFlow.
Bathke, Jochen; Lühken, Gesine.
Afiliação
  • Bathke J; Institute of Animal Breeding and Genetics, Justus Liebig University Gießen, Ludwigstraße 21, 35390, Gießen, Germany. jochen.bathke@agrar.uni-giessen.de.
  • Lühken G; Institute of Animal Breeding and Genetics, Justus Liebig University Gießen, Ludwigstraße 21, 35390, Gießen, Germany.
BMC Bioinformatics ; 22(1): 402, 2021 Aug 13.
Article em En | MEDLINE | ID: mdl-34388963
ABSTRACT

BACKGROUND:

The advent of next generation sequencing has opened new avenues for basic and applied research. One application is the discovery of sequence variants causative of a phenotypic trait or a disease pathology. The computational task of detecting and annotating sequence differences of a target dataset between a reference genome is known as "variant calling". Typically, this task is computationally involved, often combining a complex chain of linked software tools. A major player in this field is the Genome Analysis Toolkit (GATK). The "GATK Best Practices" is a commonly referred recipe for variant calling. However, current computational recommendations on variant calling predominantly focus on human sequencing data and ignore ever-changing demands of high-throughput sequencing developments. Furthermore, frequent updates to such recommendations are counterintuitive to the goal of offering a standard workflow and hamper reproducibility over time.

RESULTS:

A workflow for automated detection of single nucleotide polymorphisms and insertion-deletions offers a wide range of applications in sequence annotation of model and non-model organisms. The introduced workflow builds on the GATK Best Practices, while enabling reproducibility over time and offering an open, generalized computational architecture. The workflow achieves parallelized data evaluation and maximizes performance of individual computational tasks. Optimized Java garbage collection and heap size settings for the GATK applications SortSam, MarkDuplicates, HaplotypeCaller, and GatherVcfs effectively cut the overall analysis time in half.

CONCLUSIONS:

The demand for variant calling, efficient computational processing, and standardized workflows is growing. The Open source Variant calling workFlow (OVarFlow) offers automation and reproducibility for a computationally optimized variant calling task. By reducing usage of computational resources, the workflow removes prior existing entry barriers to the variant calling field and enables standardized variant calling.
Assuntos
Palavras-chave

Texto completo: 1 Base de dados: MEDLINE Assunto principal: Software / Sequenciamento de Nucleotídeos em Larga Escala Tipo de estudo: Guideline / Prognostic_studies Limite: Humans Idioma: En Revista: BMC Bioinformatics Assunto da revista: INFORMATICA MEDICA Ano de publicação: 2021 Tipo de documento: Article País de afiliação: Alemanha

Texto completo: 1 Base de dados: MEDLINE Assunto principal: Software / Sequenciamento de Nucleotídeos em Larga Escala Tipo de estudo: Guideline / Prognostic_studies Limite: Humans Idioma: En Revista: BMC Bioinformatics Assunto da revista: INFORMATICA MEDICA Ano de publicação: 2021 Tipo de documento: Article País de afiliação: Alemanha