Search | BVS Bolivia

GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies.

Wei, Runmin; Wang, Jingye; Jia, Erik; Chen, Tianlu; Ni, Yan; Jia, Wei.

PLoS Comput Biol ; 14(1): e1005973, 2018 01.

Article in English | MEDLINE | ID: mdl-29385130

ABSTRACT

Left-censored missing values are common in targeted metabolomics datasets and can be treated as missing not at random (MNAR). Improper handling of missing values adversely affects subsequent statistical analyses. However, few imputation methods have been developed for, or applied to, MNAR data in metabolomics, so a practical left-censored missing value imputation method is urgently needed. We developed an iterative, Gibbs sampler based left-censored missing value imputation approach (GSimp). We compared GSimp with three other imputation methods on two real-world targeted metabolomics datasets and one simulated dataset using our imputation evaluation pipeline. The results show that GSimp outperforms the other imputation methods in terms of imputation accuracy, observation distribution, univariate and multivariate analyses, and statistical sensitivity. Additionally, a parallel version of GSimp was developed for large-scale metabolomics datasets. The R code for GSimp, the evaluation pipeline, a tutorial, and the real-world and simulated targeted metabolomics datasets are available at: https://github.com/WandeRum/GSimp.
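The idea of iteratively sampling left-censored values can be illustrated with a small sketch. The code below is not the GSimp implementation (that is the R code at the URL above); it is a simplified, Gibbs-style loop written under stated assumptions: a single fully observed predictor x, ordinary least squares as the prediction model, regression parameters re-estimated rather than sampled, and a known detection limit lod. All function names are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def draw_truncated_normal(mu, sigma, upper, rng):
    """Draw from N(mu, sigma^2) restricted to values <= upper (inverse-CDF method)."""
    dist = NormalDist(mu, sigma)
    p_upper = dist.cdf(upper)
    # Clamp the probability away from 0 and 1 so inv_cdf stays defined.
    p = min(max(rng.random() * p_upper, 1e-12), 1 - 1e-12)
    # The final min() guards the edge case where clamping pushed the draw above upper.
    return min(dist.inv_cdf(p), upper)


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sd = (sum(r * r for r in resid) / max(len(x) - 2, 1)) ** 0.5
    return a, b, max(sd, 1e-6)  # keep sigma strictly positive


def impute_left_censored(x, y, lod, n_iter=50, seed=0):
    """Fill None entries of y (assumed to lie below lod) by iterative sampling."""
    rng = random.Random(seed)
    missing = [i for i, v in enumerate(y) if v is None]
    # Assumption: start from half the detection limit.
    filled = [lod / 2 if v is None else v for v in y]
    for _ in range(n_iter):
        # Step 1: refit the prediction model on the currently completed data.
        a, b, sd = fit_line(x, filled)
        # Step 2: redraw each missing value from its conditional distribution,
        # truncated so that it never exceeds the detection limit.
        for i in missing:
            filled[i] = draw_truncated_normal(a + b * x[i], sd, lod, rng)
    return filled


# Illustrative data (made up): the two lowest values of y fell below lod = 3.0.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [None, None, 3.1, 4.2, 4.9, 6.1, 7.0, 8.2]
completed = impute_left_censored(x, y, lod=3.0)
```

The truncation step is what distinguishes this from ordinary regression imputation: every imputed value is constrained to lie at or below the detection limit, which matches the left-censored assumption.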

Subject(s)

Computational Biology/methods ; Data Interpretation, Statistical ; Metabolomics/methods ; Programming Languages ; Algorithms ; Bile Acids and Salts/chemistry ; Computer Simulation ; Databases, Factual ; Fatty Acids, Nonesterified/chemistry ; Fatty Acids, Nonesterified/metabolism ; Humans ; Limit of Detection ; Mass Spectrometry ; Models, Statistical ; Multivariate Analysis ; Principal Component Analysis ; Probability ; Software ; Stochastic Processes

Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data.

Wei, Runmin; Wang, Jingye; Su, Mingming; Jia, Erik; Chen, Shaoqiu; Chen, Tianlu; Ni, Yan.

Sci Rep ; 8(1): 663, 2018 01 12.

Article in English | MEDLINE | ID: mdl-29330539

ABSTRACT

Missing values are widespread in mass spectrometry (MS) based metabolomics data. Various methods have been applied to handle missing values, but the choice of method can significantly affect subsequent data analyses. Typically, there are three types of missing values: missing not at random (MNAR), missing at random (MAR), and missing completely at random (MCAR). Our study comprehensively compared eight imputation methods (zero, half minimum (HM), mean, median, random forest (RF), singular value decomposition (SVD), k-nearest neighbors (kNN), and quantile regression imputation of left-censored data (QRILC)) for different types of missing values using four metabolomics datasets. Normalized root mean squared error (NRMSE) and NRMSE-based sum of ranks (SOR) were applied to evaluate imputation accuracy. Principal component analysis (PCA)/partial least squares (PLS)-Procrustes analysis was used to evaluate the overall sample distribution. Student's t-test followed by correlation analysis was conducted to evaluate the effects on univariate statistics. Our findings demonstrated that RF performed best for MCAR/MAR and that QRILC was the favored method for left-censored MNAR. Finally, we proposed a comprehensive strategy and developed a publicly accessible web tool for applying missing value imputation in metabolomics ( https://metabolomics.cc.hawaii.edu/software/MetImp/ ).
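To make the accuracy metric concrete, the sketch below computes an NRMSE for two of the simple methods named above, half minimum (HM) and mean imputation, on a toy left-censored column. The abstract does not state how the error is normalized; dividing the mean squared error over the missing entries by the variance of the true values is one common choice and is an assumption here, as are the function names and the data.

```python
from statistics import fmean, pvariance


def half_minimum_impute(values):
    """Replace None with half of the smallest observed value (HM imputation)."""
    fill = min(v for v in values if v is not None) / 2
    return [fill if v is None else v for v in values]


def mean_impute(values):
    """Replace None with the mean of the observed values."""
    fill = fmean([v for v in values if v is not None])
    return [fill if v is None else v for v in values]


def nrmse(true_vals, imputed_vals, missing_idx):
    """Root of (MSE over the missing entries / variance of the true values).

    Assumption: normalization by the population variance of the true column.
    """
    sq_err = [(true_vals[i] - imputed_vals[i]) ** 2 for i in missing_idx]
    return (fmean(sq_err) / pvariance(true_vals)) ** 0.5


# Toy data (made up): the two smallest values are masked, mimicking left-censoring.
truth = [1.0, 2.0, 4.0, 8.0, 16.0]
masked = [None, None, 4.0, 8.0, 16.0]
missing_idx = [i for i, v in enumerate(masked) if v is None]

nrmse_hm = nrmse(truth, half_minimum_impute(masked), missing_idx)
nrmse_mean = nrmse(truth, mean_impute(masked), missing_idx)
print(f"HM: {nrmse_hm:.3f}  mean: {nrmse_mean:.3f}")
```

On this toy column the half-minimum fill gives a lower NRMSE than the mean fill, because the masked values all sit below the observed ones. That is a property of the example, not a reproduction of the study's results.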

Subject(s)

Mass Spectrometry/methods ; Metabolomics/methods ; Cluster Analysis ; Computational Biology/methods ; Least-Squares Analysis ; Principal Component Analysis
