Methylation data imputation performances under different representations and missingness patterns.
Lena, Pietro Di; Sala, Claudia; Prodi, Andrea; Nardini, Christine.
Affiliation
  • Lena PD; Department of Computer Science and Engineering, University of Bologna, Mura Anteo Zamboni 7, Bologna, Italy. pietro.dilena@unibo.it.
  • Sala C; Department of Physics and Astronomy, University of Bologna, Viale Berti Pichat 6/2, Bologna, Italy.
  • Prodi A; Smart Cities Living Lab, ISOF CNR, Via P. Gobetti, 101, Bologna, Italy.
  • Nardini C; Istituto per le Applicazioni del Calcolo Mauro Picone, CNR, Via dei Taurini, 19, Roma, Italy. christine.nardini@cnr.it.
BMC Bioinformatics ; 21(1): 268, 2020 Jun 29.
Article in English | MEDLINE | ID: mdl-32600298
ABSTRACT

BACKGROUND:

High-throughput technologies enable the cost-effective collection and analysis of DNA methylation data throughout the human genome. This naturally entails the management of missing values, which can complicate the analysis of the data. Several general and specific imputation methods are suitable for DNA methylation data. However, there are no detailed studies of their performance under different missing data mechanisms (completely at random, at random, or not at random) and different representations of DNA methylation levels (β-value and M-value).
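For readers unfamiliar with the two representations, the sketch below shows the commonly used relationship between them, in which the M-value is the base-2 logit of the β-value. The function names `beta_to_m` and `m_to_beta` and the clipping constant `eps` are illustrative assumptions and are not taken from the article.

```python
import math


def beta_to_m(beta: float, eps: float = 1e-6) -> float:
    """Convert a beta-value in [0, 1] to an M-value (base-2 logit)."""
    # Clip away from 0 and 1 so the logarithm stays finite.
    # The value of eps is an assumed choice, not one from the article.
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m: float) -> float:
    """Convert an M-value back to a beta-value in (0, 1)."""
    return 1.0 / (1.0 + 2.0 ** (-m))


# A half-methylated site maps to M = 0; values near 0 or 1 map to
# large negative or positive M-values respectively.
m_mid = beta_to_m(0.5)
m_high = beta_to_m(0.8)
beta_back = m_to_beta(m_high)
```

Because the transform stretches the extremes of the β-value range, an imputation error of a given size near 0 or 1 on the β scale corresponds to a much larger error on the M scale.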

RESULTS:

We make an extensive analysis of the imputation performance of seven imputation methods on simulated missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR) methylation data. We further consider imputation performance on the popular β- and M-value representations of methylation levels. Overall, β-values enable better imputation performance than M-values. Imputation accuracy is lower for mid-range β-values and generally higher for values at the extremes of the β-value range. The distribution of MAR values is on average denser in the mid-range than the expected β-value distribution. As a consequence, MAR values are on average harder to impute.
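As a rough illustration of how such an evaluation can be set up, the sketch below masks entries of a toy β-value matrix completely at random, fills them in, and scores the result only on the masked entries. Everything in it is an assumption made for illustration: the toy data, the 10% missingness rate, the helper names, and the per-row mean imputation, which is a simple placeholder and not claimed to be one of the seven methods evaluated in the article.

```python
import math
import random


def mask_mcar(matrix, rate, rng):
    """Hide each entry independently with probability `rate` (MCAR)."""
    return [[None if rng.random() < rate else v for v in row] for row in matrix]


def impute_row_mean(masked):
    """Placeholder imputation: fill gaps with the mean of the observed row values."""
    imputed = []
    for row in masked:
        observed = [v for v in row if v is not None]
        # Fall back to 0.5 if a row has no observed value at all (assumed choice).
        fill = sum(observed) / len(observed) if observed else 0.5
        imputed.append([fill if v is None else v for v in row])
    return imputed


def rmse_on_missing(truth, masked, imputed):
    """Root mean squared error computed only over the masked entries."""
    errors = [
        (t - i) ** 2
        for t_row, m_row, i_row in zip(truth, masked, imputed)
        for t, m, i in zip(t_row, m_row, i_row)
        if m is None
    ]
    return math.sqrt(sum(errors) / len(errors)) if errors else 0.0


rng = random.Random(0)
# Toy data: 50 rows (sites) by 20 columns (samples), drawn from a
# U-shaped Beta(0.5, 0.5) distribution as a stand-in for real beta-values.
truth = [[rng.betavariate(0.5, 0.5) for _ in range(20)] for _ in range(50)]
masked = mask_mcar(truth, rate=0.1, rng=rng)
imputed = impute_row_mean(masked)
error = rmse_on_missing(truth, masked, imputed)
print(f"RMSE on masked entries: {error:.3f}")
```

MAR and MNAR settings would need a different masking function, one in which the probability of hiding an entry depends on other observed values or on the hidden value itself.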

CONCLUSIONS:

The results of the analysis provide guidelines for the most suitable imputation approaches for DNA methylation data under different representations of DNA methylation levels and different missing data mechanisms.

Full text: 1 Databases: MEDLINE Main subject: DNA Methylation Limit: Humans Language: En Journal: BMC Bioinformatics Journal subject: MEDICAL INFORMATICS Year: 2020 Document type: Article Affiliation country: Italy