DIMA: Data-Driven Selection of an Imputation Algorithm.

Egert, Janine; Brombacher, Eva; Warscheid, Bettina; Kreutz, Clemens

Affiliation

Egert J; Institute of Medical Biometry and Statistics (IMBI), Institute of Medicine and Medical Center Freiburg, 79104 Freiburg im Breisgau, Germany.
Brombacher E; Centre for Integrative Biological Signalling Studies (CIBSS), Albert-Ludwigs-Universität Freiburg, 79104 Freiburg, Germany.
Warscheid B; Institute of Medical Biometry and Statistics (IMBI), Institute of Medicine and Medical Center Freiburg, 79104 Freiburg im Breisgau, Germany.
Kreutz C; Centre for Integrative Biological Signalling Studies (CIBSS), Albert-Ludwigs-Universität Freiburg, 79104 Freiburg, Germany.

J Proteome Res; 20(7): 3489-3496, 2021 07 02.

Article in English | MEDLINE | ID: mdl-34062065

ABSTRACT

Imputation is a prominent strategy when dealing with missing values (MVs) in proteomics data analysis pipelines. However, the performance of different imputation methods is difficult to assess and varies strongly depending on data characteristics. To overcome this issue, we present the concept of a data-driven selection of an imputation algorithm (DIMA). The performance and broad applicability of DIMA are demonstrated on 142 quantitative proteomics data sets from the PRoteomics IDEntifications (PRIDE) database and on simulated data consisting of 5-50% MVs with different proportions of missing not at random and missing completely at random values. DIMA reliably suggests a high-performing imputation algorithm, which is always among the three best algorithms and results in a root mean square error difference (ΔRMSE) ≤ 10% in 80% of the cases. The DIMA implementation is available in MATLAB at github.com/kreutz-lab/OmicsData and in R at github.com/kreutz-lab/DIMAR.
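The general idea of choosing an imputation method by its error on artificially hidden values can be illustrated with a minimal Python sketch. This is not the DIMA implementation, and this record does not describe DIMA's actual procedure. The toy matrix, the three candidate imputers, the random masking scheme, and all function names below are assumptions made purely for illustration.

```python
import math
import random
from statistics import mean, median

# Hypothetical candidate imputers (not DIMA's algorithm set).
# Each takes a matrix (list of rows) with None for missing values.

def impute_row_mean(data):
    """Fill each None with the mean of the observed values in its row."""
    out = []
    for row in data:
        fill = mean(v for v in row if v is not None)
        out.append([fill if v is None else v for v in row])
    return out

def impute_row_min(data):
    """Fill each None with the smallest observed value in its row."""
    out = []
    for row in data:
        fill = min(v for v in row if v is not None)
        out.append([fill if v is None else v for v in row])
    return out

def impute_col_median(data):
    """Fill each None with the median of the observed values in its column."""
    fills = [median(v for v in col if v is not None) for col in zip(*data)]
    return [[fills[j] if v is None else v for j, v in enumerate(row)]
            for row in data]

def mask_observed(data, fraction, rng):
    """Hide a random fraction of observed entries (assumed masking scheme).

    Every row and column keeps at least one observed value, so the simple
    imputers above always have something to work with.
    """
    masked = [list(row) for row in data]
    positions = [(i, j) for i, row in enumerate(data)
                 for j, v in enumerate(row) if v is not None]
    rng.shuffle(positions)
    n_hide = max(1, int(fraction * len(positions)))
    hidden = []
    for i, j in positions:
        if len(hidden) == n_hide:
            break
        row_left = sum(v is not None for v in masked[i]) - 1
        col_left = sum(r[j] is not None for r in masked) - 1
        if row_left >= 1 and col_left >= 1:
            masked[i][j] = None
            hidden.append((i, j))
    return masked, hidden

def rmse(truth, imputed, hidden):
    """Root mean square error over the artificially hidden positions."""
    errors = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in hidden]
    return math.sqrt(mean(errors))

def select_imputer(data, candidates, fraction=0.2, seed=0):
    """Return the candidate with the lowest RMSE on hidden values, plus all scores."""
    masked, hidden = mask_observed(data, fraction, random.Random(seed))
    scores = {name: rmse(data, fn(masked), hidden)
              for name, fn in candidates.items()}
    return min(scores, key=scores.get), scores

CANDIDATES = {
    "row_mean": impute_row_mean,
    "row_min": impute_row_min,
    "col_median": impute_col_median,
}

# Toy log-intensity matrix (rows: proteins, columns: samples); values are made up.
toy = [
    [20.1, 20.4, 19.8, 20.0],
    [25.3, None, 25.1, 25.6],
    [18.2, 18.0, None, 17.9],
    [22.7, 22.9, 22.5, 22.8],
    [None, 16.4, 16.1, 16.3],
    [21.0, 21.2, 20.9, None],
]

best, scores = select_imputer(toy, CANDIDATES)
print(best, scores)
```

Masking uniformly at random only mimics values that are missing completely at random. A selection meant to reflect missing not at random values would need a masking scheme that depends on intensity, which this sketch leaves out.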

Subjects

Algorithms; Proteomics; Databases, Factual; Humans

Keywords

accuracy; imputation; mass spectrometry; missing values; proteomics

Full text: 1 | Database: MEDLINE | Main subject: Algorithms / Proteomics | Type of study: Prognostic_studies | Limits: Humans | Language: En | Journal: J Proteome Res | Journal subject: BIOCHEMISTRY | Year of publication: 2021 | Document type: Article | Country of affiliation: Germany
