ABSTRACT
BACKGROUND: Chromatographic peakpicking remains a significant bottleneck in automated LC-MS workflows. Uncontrolled false discovery rates and the lack of manually calibrated quality metrics force researchers to evaluate individual peaks visually, which consumes large amounts of time and undermines replicability. The problem is worse in noisy environmental datasets and for novel separation methods such as hydrophilic interaction columns in metabolomics, creating demand for a simple, intuitive, and robust metric of peak quality. RESULTS: Here, we manually labeled four HILIC oceanographic particulate metabolite datasets to assess the performance of individual peak quality metrics. We used these datasets to construct a predictive model calibrated to the likelihood that visual inspection by an MS expert would include a given mass feature in the downstream analysis. We implemented two novel peak quality metrics, a custom signal-to-noise metric and a test of similarity to a bell curve, both calculated from the raw data in the extracted ion chromatogram, and found that these outperformed existing measurements of peak quality. A simple logistic regression model built on two metrics reduced the fraction of false positives in the analysis from 70-80% to 1-5% and showed minimal overfitting when applied to novel datasets. We then explored how this quality thresholding affects the conclusions of the downstream analysis and found that, while depth explained only 10% of the variance in the default output of the peakpicker, it explained approximately 40% of the variance when the dataset was restricted to high-quality peaks. CONCLUSIONS: We conclude that the poor performance of peakpicking algorithms significantly reduces the power of both univariate and multivariate statistical analyses to detect environmental differences. We demonstrate that simple models built on intuitive metrics and derived from the raw data are more robust and can outperform more complex models when applied to new data. Finally, we show that in properly curated datasets, depth is a major driver of variability in the marine microbial metabolome, and we identify several interesting metabolite trends for future investigation.
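The approach summarized above (two peak quality metrics computed from the raw extracted ion chromatogram, fed into a logistic regression) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the bell-curve test is approximated as a Pearson correlation with a fixed-width Gaussian, the signal-to-noise metric as peak height over the residual standard deviation around that Gaussian, and the helper names (`bell_curve`, `peak_features`, `toy_trace`, `fit_logistic`, `predict`) are hypothetical. The toy traces stand in for manually labeled peaks.

```python
import math

def bell_curve(n):
    """Idealized Gaussian over n scans (assumption: peak spans about +/- 3 sigma)."""
    center, sigma = (n - 1) / 2, n / 6
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n)]

def _moments(intensities):
    """Means, covariance sum, and variance sums of the trace against the bell curve."""
    bell = bell_curve(len(intensities))
    n = len(intensities)
    mx, mb = sum(intensities) / n, sum(bell) / n
    cov = sum((x - mx) * (b - mb) for x, b in zip(intensities, bell))
    vx = sum((x - mx) ** 2 for x in intensities)
    vb = sum((b - mb) ** 2 for b in bell)
    return bell, mx, mb, cov, vx, vb

def bell_similarity(intensities):
    """Pearson correlation between the raw trace and the idealized bell curve."""
    _, _, _, cov, vx, vb = _moments(intensities)
    return cov / math.sqrt(vx * vb) if vx > 0 and vb > 0 else 0.0

def signal_to_noise(intensities):
    """Peak height over the SD of residuals around a least-squares scaled bell curve."""
    bell, mx, mb, cov, _, vb = _moments(intensities)
    slope = cov / vb
    intercept = mx - slope * mb
    resid = [x - (slope * b + intercept) for x, b in zip(intensities, bell)]
    sd = math.sqrt(sum(r * r for r in resid) / len(resid))
    return (max(intensities) - min(intensities)) / max(sd, 1e-9)

def peak_features(intensities):
    """Two-element feature vector: bell similarity and log10 signal-to-noise."""
    return [bell_similarity(intensities),
            math.log10(max(signal_to_noise(intensities), 1e-9))]

def toy_trace(amplitude, ripple, n=21):
    """Hypothetical test trace: a Gaussian peak plus an alternating ripple as noise."""
    return [amplitude * b + ripple * (1 + (-1) ** i)
            for i, b in enumerate(bell_curve(n))]

def predict(weights, features):
    """Logistic model: probability that an expert would keep the feature."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    if z >= 0:  # numerically stable sigmoid
        return 1 / (1 + math.exp(-z))
    e = math.exp(z)
    return e / (1 + e)

def fit_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit intercept and slopes by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(features[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(features, labels):
            err = predict(weights, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad)]
    return weights
```

In use, one would compute `peak_features` for each labeled mass feature, fit once, and then keep only features whose predicted probability exceeds a chosen threshold. The threshold trades retained true peaks against false positives and would need to be chosen against labeled data.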
Subjects
Metabolomics, Software, Metabolomics/methods, Liquid Chromatography/methods, Metabolome, Mass Spectrometry/methods
ABSTRACT
Stomata are epidermal valves that facilitate gas exchange between plants and their environment. Stomatal patterning is regulated by the EPIDERMAL PATTERNING FACTOR (EPF) family of secreted peptides: EPF1 enforces stomatal spacing, whereas EPIDERMAL PATTERNING FACTOR-LIKE9 (EPFL9), also known as Stomagen, promotes stomatal development. It remains unknown, however, over what distance these signaling peptides act. Using Cre-lox recombination-based mosaic sectors that overexpress either EPF1 or Stomagen in Arabidopsis cotyledons, we reveal the range within the epidermis and across cell layers over which these peptides influence patterning. To determine their effective ranges quantitatively, we developed a computational pipeline, SPACE (stomata patterning autocorrelation on epidermis), that describes probabilistic two-dimensional stomatal distributions based on spatial autocorrelation statistics used in astrophysics. The SPACE analysis shows that, whereas both peptides act locally, the inhibitor EPF1 exerts longer-range effects than the activator Stomagen. Furthermore, local perturbation of stomatal development has little influence on global two-dimensional stomatal patterning. Our findings conclusively demonstrate the nature and extent of EPF peptides as non-cell-autonomous local signals and provide a means for quantitative characterization of complex spatial patterns in development. This article has an associated 'The people behind the papers' interview.
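The kind of spatial autocorrelation statistic mentioned above can be illustrated with a two-point correlation function, which compares how often pairs of points occur at a given separation against a random placement in the same field. The sketch below is a generic illustration under stated assumptions, not the SPACE pipeline: it uses the simple DD/RR - 1 estimator with a uniform random comparison set, ignores edge corrections, and the function names `pair_counts` and `two_point_correlation` are hypothetical.

```python
import math
import random

def pair_counts(points, edges):
    """Count point pairs whose separation falls in each [edges[k], edges[k+1]) bin."""
    counts = [0] * (len(edges) - 1)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            for k in range(len(counts)):
                if edges[k] <= d < edges[k + 1]:
                    counts[k] += 1
                    break
    return counts

def two_point_correlation(points, width, height, edges, n_random=400, seed=0):
    """Estimate xi(r) = DD/RR - 1 against uniform random points in the same field.

    xi < 0 means fewer pairs than random at that separation (spacing or inhibition);
    xi > 0 means more pairs than random (clustering).
    """
    rng = random.Random(seed)
    randoms = [(rng.uniform(0, width), rng.uniform(0, height))
               for _ in range(n_random)]
    dd = pair_counts(points, edges)
    rr = pair_counts(randoms, edges)
    n_dd = len(points) * (len(points) - 1) / 2    # total data pairs
    n_rr = n_random * (n_random - 1) / 2          # total random pairs
    return [(d / n_dd) / (r / n_rr) - 1 if r > 0 else float("nan")
            for d, r in zip(dd, rr)]

# Toy field: 100 points on a regular grid with unit spacing in a 10 x 10 area,
# so no two points lie closer than one unit apart.
grid = [(i + 0.5, j + 0.5) for i in range(10) for j in range(10)]
xi = two_point_correlation(grid, 10, 10, edges=[0, 0.5, 1.5, 2.5])
```

For the regularly spaced toy field, the first bin contains no pairs at all, so the estimate there is exactly -1, which is the signature expected of an enforced minimum spacing. The separation at which the estimate returns toward zero gives a rough measure of how far such a spacing effect extends.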