Omics Analysis and Quality Control Pipelines in a High-Performance Computing Environment.
Ricke, Darrell O; Ng, Derek; Michaleas, Adam; Fremont-Smith, Philip.
Affiliation
  • Ricke DO; Massachusetts Institute of Technology Lincoln Laboratory, Lexington, Massachusetts, USA.
  • Ng D; Massachusetts Institute of Technology Lincoln Laboratory, Lexington, Massachusetts, USA.
  • Michaleas A; Massachusetts Institute of Technology Lincoln Laboratory, Lexington, Massachusetts, USA.
  • Fremont-Smith P; Massachusetts Institute of Technology Lincoln Laboratory, Lexington, Massachusetts, USA.
OMICS ; 27(11): 519-525, 2023 Nov.
Article in En | MEDLINE | ID: mdl-37943668
ABSTRACT
Data quality is often an overlooked feature in the analysis of omics data. This is particularly relevant in studies of chemical and pathogen exposures that can modify an individual's epigenome and transcriptome with persistence over time. Portable quality control (QC) pipelines for multiple different omics datasets are therefore needed. To meet this need, portable quality assurance (QA) metrics, metric acceptability criteria, and pipelines to compute these metrics were developed and consolidated into one framework for 12 different omics assays. Performance of these QA metrics and pipelines was evaluated on human data generated by the Defense Advanced Research Projects Agency (DARPA) Epigenetic CHaracterization and Observation (ECHO) program. Twelve analytical pipelines were developed, leveraging standard tools where possible. These QC pipelines were containerized using Singularity to ensure portability and scalability. Datasets for these 12 omics assays were analyzed and results were summarized. The quality thresholds and metrics used were described. We found that these pipelines enabled early identification of lower-quality datasets, datasets with insufficient reads that required additional sequencing, and experimental protocols needing refinement. These omics data analysis and QC pipelines are available as open-source resources for the omics and life sciences communities, as reported and discussed in this article.
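As an illustration only, the idea of checking computed QA metrics against acceptability criteria can be sketched as below. This is a minimal sketch and not the authors' implementation: the metric names (`total_reads`, `fraction_mapped`), the threshold values, and the sample data are all hypothetical assumptions, since the abstract does not state the actual metrics or thresholds.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Criterion:
    """One acceptability criterion: a metric must be at least `minimum`."""
    metric: str
    minimum: float


# Hypothetical criteria; the real metrics and thresholds are defined in the article.
CRITERIA = [
    Criterion("total_reads", 20_000_000),
    Criterion("fraction_mapped", 0.80),
]


def evaluate(metrics: Dict[str, float], criteria: List[Criterion] = CRITERIA) -> List[str]:
    """Return the names of metrics that are missing or below their minimum."""
    failures = []
    for criterion in criteria:
        value = metrics.get(criterion.metric)
        if value is None or value < criterion.minimum:
            failures.append(criterion.metric)
    return failures


# Hypothetical per-dataset metrics, as a QC pipeline might report them.
datasets = {
    "sample_A": {"total_reads": 35_000_000, "fraction_mapped": 0.93},
    "sample_B": {"total_reads": 8_000_000, "fraction_mapped": 0.91},
}

# Keep only datasets that fail at least one criterion.
flagged = {}
for name, metrics in datasets.items():
    failed = evaluate(metrics)
    if failed:
        flagged[name] = failed

print(flagged)
```

In this sketch, a dataset flagged on `total_reads` would correspond to the abstract's case of insufficient reads needing additional sequencing.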

Full text: 1 Database: MEDLINE Main subject: Software / High-Throughput Nucleotide Sequencing Language: En Publication year: 2023 Document type: Article
