Filtering procedures for untargeted LC-MS metabolomics data.
Schiffman, Courtney; Petrick, Lauren; Perttula, Kelsi; Yano, Yukiko; Carlsson, Henrik; Whitehead, Todd; Metayer, Catherine; Hayes, Josie; Rappaport, Stephen; Dudoit, Sandrine.
Affiliation
  • Schiffman C; Division of Biostatistics, UC Berkeley, Berkeley, 94720, USA. courtneys@berkeley.edu.
  • Petrick L; The Senator Frank R. Lautenberg Environmental Health Sciences Laboratory, Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, USA.
  • Perttula K; Center for Integrative Research on Childhood Leukemia and the Environment, UC Berkeley, Berkeley, 94720, USA.
  • Yano Y; Division of Environmental Health Sciences, UC Berkeley, Berkeley, 94720, USA.
  • Carlsson H; Division of Environmental Health Sciences, UC Berkeley, Berkeley, 94720, USA.
  • Whitehead T; Division of Environmental Health Sciences, UC Berkeley, Berkeley, 94720, USA.
  • Metayer C; Division of Epidemiology, UC Berkeley, Berkeley, 94720, USA.
  • Hayes J; Center for Integrative Research on Childhood Leukemia and the Environment, UC Berkeley, Berkeley, 94720, USA.
  • Rappaport S; Division of Epidemiology, UC Berkeley, Berkeley, 94720, USA.
  • Dudoit S; Center for Integrative Research on Childhood Leukemia and the Environment, UC Berkeley, Berkeley, 94720, USA.
BMC Bioinformatics; 20(1): 334, 2019 Jun 14.
Article in En | MEDLINE | ID: mdl-31200644
ABSTRACT

BACKGROUND:

Untargeted metabolomics datasets contain large proportions of uninformative features that can impede subsequent statistical analysis such as biomarker discovery and metabolic pathway analysis. Thus, there is a need for versatile and data-adaptive methods for filtering data prior to investigating the underlying biological phenomena. Here, we propose a data-adaptive pipeline for filtering metabolomics data that are generated by liquid chromatography-mass spectrometry (LC-MS) platforms. Our data-adaptive pipeline includes novel methods for filtering features based on blank samples, proportions of missing values, and estimated intra-class correlation coefficients.
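As a rough illustration of the three criteria named above (blank samples, proportion of missing values, and intra-class correlation), the following Python sketch filters a single feature. It is not the authors' R implementation: the cutoffs `min_blank_ratio`, `max_missing`, and `min_icc` are hypothetical fixed placeholders, whereas the proposed pipeline chooses its thresholds data-adaptively. The feature layout (a dict with `blanks`, `samples`, and `replicates`) and the use of a one-way random-effects ICC estimate are likewise assumptions made for this example.

```python
from statistics import mean


def missing_fraction(samples):
    """Fraction of biological samples where the feature was not detected (None)."""
    return sum(v is None for v in samples) / len(samples)


def blank_ratio(samples, blanks):
    """Mean detected abundance in biological samples over mean abundance in blanks."""
    detected = [v for v in samples if v is not None]
    if not detected:
        return 0.0
    blank_mean = mean(blanks)
    return float("inf") if blank_mean == 0 else mean(detected) / blank_mean


def icc_oneway(groups):
    """One-way random-effects ICC estimate for equal-sized replicate groups.

    Each inner list holds the replicate measurements of one subject.
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), with k replicates per subject.
    """
    n, k = len(groups), len(groups[0])
    grand = mean(v for g in groups for v in g)
    msb = k * sum((mean(g) - grand) ** 2 for g in groups) / (n - 1)
    msw = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n * (k - 1))
    denom = msb + (k - 1) * msw
    return 0.0 if denom == 0 else (msb - msw) / denom


def keep_feature(feature, min_blank_ratio=2.0, max_missing=0.3, min_icc=0.5):
    """Apply the three filters in sequence; all cutoffs are illustrative placeholders."""
    if blank_ratio(feature["samples"], feature["blanks"]) < min_blank_ratio:
        return False  # abundance not clearly above background in blanks
    if missing_fraction(feature["samples"]) > max_missing:
        return False  # detected in too few biological samples
    return icc_oneway(feature["replicates"]) >= min_icc  # reproducible across replicates


# Hypothetical features: one reproducible signal and one noisy, blank-like feature.
signal = {
    "blanks": [1.0, 1.0],
    "samples": [10.0, 12.0, None, 14.0],
    "replicates": [[10.0, 10.2], [12.0, 11.8], [14.0, 14.2]],
}
noise = {
    "blanks": [5.0, 5.0],
    "samples": [5.0, None, None, 6.0],
    "replicates": [[5.0, 9.0], [6.0, 2.0], [7.0, 4.0]],
}

print(keep_feature(signal), keep_feature(noise))
```

In this toy input the first feature passes all three checks and the second fails at the blank comparison. A data-adaptive variant would replace the fixed cutoffs with values estimated from the dataset itself.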

RESULTS:

Using metabolomics datasets that were generated in our laboratory from samples of human blood, as well as two public LC-MS datasets, we compared our data-adaptive filtering method with traditional methods that rely on non-method-specific thresholds. The data-adaptive approach outperformed traditional approaches in terms of removing noisy features and retaining high-quality, biologically informative ones. The R code for running the data-adaptive filtering method is provided at https://github.com/courtneyschiffman/Metabolomics-Filtering.

CONCLUSIONS:

Our proposed data-adaptive filtering pipeline is intuitive and effectively removes uninformative features from untargeted metabolomics datasets. It is particularly relevant for interrogation of biological phenomena in data derived from complex matrices associated with biospecimens.

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Tandem Mass Spectrometry / Metabolomics Limit: Humans Language: En Publication year: 2019 Document type: Article