Results 1 - 5 of 5
1.
Biom J ; 66(4): e2300090, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38813859

ABSTRACT

Linear regression (LR) is widely used in data analysis for continuous outcomes in biomedicine and epidemiology. Despite its popularity, LR is incompatible with missing data, which frequently occur in the health sciences. For parameter estimation, this shortcoming is usually resolved by complete-case analysis or imputation. Both workarounds, however, are inadequate for prediction, since they either fail to predict on incomplete records or ignore the missingness-induced reduction in prediction accuracy and rely on (unrealistic) assumptions about the missingness mechanism. Here, we derive the adaptive predictor-set linear model (aps-lm), which can make predictions for incomplete data without the need for imputation. It is derived by using a predictor-selection operation, the Moore-Penrose pseudoinverse, and the reduced QR decomposition. aps-lm is an LR generalization that inherently handles missing values. It is applied to a reference data set, where complete predictors and outcome are available, and yields a set of privacy-preserving parameters. In a second stage, these are shared for making predictions of the outcome on external data sets with missing entries for predictors, without imputation. Moreover, aps-lm computes prediction errors that account for the pattern of missing values, even under extreme missingness. We benchmark aps-lm in a simulation study. aps-lm showed greater prediction accuracy and reduced bias compared to popular imputation strategies under a wide range of scenarios, including variation of sample size, goodness of fit, missing value type, and covariance structure. Finally, as a proof of principle, we apply aps-lm in the context of epigenetic aging clocks, linear models that predict a person's biological age from epigenetic data, with promising clinical applications.
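The two-stage idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' aps-lm implementation: in place of their predictor-selection operation and reduced QR decomposition, it assumes that the shared parameters are simply the mean vector and covariance matrix of the complete reference data, and it rebuilds the linear predictor from whichever predictors are observed in a new record. The function names `fit_reference` and `predict_incomplete` are hypothetical.

```python
import numpy as np

def fit_reference(X, y):
    """Stage 1 (assumed form): summarise complete reference data
    by the joint mean vector and covariance matrix of (X, y)."""
    Z = np.column_stack([X, y])
    return Z.mean(axis=0), np.cov(Z, rowvar=False)

def predict_incomplete(x, mean, cov):
    """Stage 2: predict y for one record x in which missing
    predictors are encoded as NaN. Returns the prediction and the
    residual variance for that specific missingness pattern."""
    p = len(x)                          # number of predictors
    obs = np.flatnonzero(~np.isnan(x))  # indices of observed predictors
    mu_x, mu_y = mean[:p], mean[p]
    if obs.size == 0:
        # Nothing observed: fall back to the reference mean of y.
        return mu_y, cov[p, p]
    S_oo = cov[np.ix_(obs, obs)]        # covariance among observed predictors
    S_oy = cov[obs, p]                  # covariance of observed predictors with y
    # The pseudoinverse also copes with collinear observed predictors.
    beta = np.linalg.pinv(S_oo) @ S_oy
    pred = mu_y + beta @ (x[obs] - mu_x[obs])
    resid_var = cov[p, p] - S_oy @ beta
    return pred, resid_var
```

In this sketch a record with fewer observed predictors never gets a smaller residual variance than the same record with more of them observed, which mirrors the abstract's point that prediction error should depend on the missingness pattern.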


Subjects
Biometry, Linear Models, Biometry/methods, Humans
2.
Forensic Sci Int Genet ; 65: 102878, 2023 07.
Article in English | MEDLINE | ID: mdl-37116245

ABSTRACT

Tobacco smoking is a frequent habit, sustained by more than 1.3 billion people in 2020, and the leading preventable factor for health risk and premature mortality worldwide. In the forensic context, predicting smoking habits from biological samples may allow broadening DNA phenotyping. In this study, we aimed to implement previously published smoking habit classification models based on blood DNA methylation at 13 CpGs. First, we developed a matching lab tool based on bisulfite conversion and multiplex PCR followed by amplification-free library preparation and targeted paired-end massively parallel sequencing (MPS). Analysis of six technical duplicates revealed high reproducibility of methylation measurements (Pearson correlation of 0.983). Artificially methylated standards uncovered marker-specific amplification bias, which we corrected via bi-exponential models. We then applied our MPS tool to 232 blood samples from Europeans of a wide age range, of which 90 were current, 71 former and 71 never smokers. On average, we obtained 189,000 reads/sample and 15,000 reads/CpG, without marker drop-out. Methylation distributions per smoking category roughly corresponded to previous microarray analysis, showcasing large inter-individual variation but with technology-driven bias. Methylation at 11 out of 13 smoking-CpGs correlated with daily cigarettes in current smokers, while only one was weakly correlated with time since cessation in former smokers. Interestingly, eight smoking-CpGs correlated with age, and one displayed weak but significant sex-associated methylation differences. Using bias-uncorrected MPS data, smoking habits were predicted relatively accurately using both the two-category (current/non-current) and three-category (never/former/current) models, but bias correction resulted in worse prediction performance for both models. Finally, to account for technology-driven variation, we built new, joint models with inter-technology corrections, which resulted in improved prediction results for both models, with or without PCR bias correction (e.g. MPS cross-validation F1-score > 0.8; 2-categories). Overall, our novel assay takes us one step closer towards the forensic application of viable smoking habit prediction from blood traces. However, future research is needed towards forensically validating the assay, especially in terms of sensitivity. We also need to shed further light on the employed biomarkers, particularly on the mechanisms, tissue specificity and putative confounders of smoking epigenetic signatures.
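For readers unfamiliar with the F1-score reported for the two-category model, the following Python sketch shows how it is computed for a single positive class. The labels are toy values invented for illustration, not data from the study, and the abstract does not state which averaging scheme was used. The function name `f1_score_binary` is hypothetical.

```python
def f1_score_binary(y_true, y_pred, positive):
    """F1 = harmonic mean of precision and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy labels (illustrative only): "current" vs "non-current" smokers.
y_true = ["current", "current", "non-current", "non-current", "current", "non-current"]
y_pred = ["current", "non-current", "non-current", "non-current", "current", "current"]
f1 = f1_score_binary(y_true, y_pred, positive="current")
print(round(f1, 3))
```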


Subjects
DNA Methylation, Smoking, Humans, Reproducibility of Results, Smoking/genetics, Polymerase Chain Reaction, High-Throughput Nucleotide Sequencing, CpG Islands/genetics
3.
Genome Biol ; 22(1): 274, 2021 09 21.
Article in English | MEDLINE | ID: mdl-34548083

ABSTRACT

BACKGROUND: Illumina DNA methylation microarrays enable epigenome-wide analysis and are widely used for the discovery of novel DNA methylation variation in health and disease. However, the microarrays' probe design cannot fully consider the vast human genetic diversity, leading to genetic artifacts. Distinguishing genuine from artifactual genetic influence is of particular relevance in the study of DNA methylation heritability and methylation quantitative trait loci. But despite its importance, current strategies to account for genetic artifacts are lagging due to a limited mechanistic understanding of how such artifacts operate. RESULTS: To address this, we develop and benchmark UMtools, an R-package containing novel methods for the quantification and qualification of genetic artifacts based on fluorescence intensity signals. With our approach, we model and validate known SNPs/indels on a genetically controlled dataset of monozygotic twins, and we estimate minor allele frequency from DNA methylation data and empirically detect variants not included in dbSNP. Moreover, we identify examples where genetic artifacts interact with each other or with imprinting, X-inactivation, or tissue-specific regulation. Finally, we propose a novel strategy based on co-methylation that can discern between genetic artifacts and genuine genomic influence. CONCLUSIONS: We provide an atlas to navigate through the huge diversity of genetic artifacts encountered on DNA methylation microarrays. Overall, our study sets the ground for a paradigm shift in the study of the genetic component of epigenetic variation in DNA methylation microarrays.


Subjects
Artifacts, DNA Methylation, Oligonucleotide Array Sequence Analysis, Software, Fluorescent Dyes, Humans, INDEL Mutation, Introns, Single Nucleotide Polymorphism, Quantitative Trait Loci, Monozygotic Twins/genetics
4.
Aging (Albany NY) ; 13(5): 6442-6458, 2021 03 11.
Article in English | MEDLINE | ID: mdl-33744870

ABSTRACT

Although DNA methylation variation of autosomal CpGs provides robust age-predictive biomarkers, no male-specific age predictor based on Y-CpGs exists yet. Since sex chromosomes play an important role in aging, a Y-chromosome-based age predictor would allow studying male-specific aging effects and would also be useful in forensics. Here, we used blood-based DNA methylation microarray data of 1,057 males from six cohorts aged 15-87 and identified 75 Y-CpGs with an interquartile range of ≥0.1. Of these, 22 and six were significantly hyper- and hypomethylated with age (p(cor)<0.05, Bonferroni), respectively. Amongst several machine learning algorithms, a model based on support vector machines with radial kernel performed best in male-specific age prediction. We achieved a mean absolute deviation (MAD) between true and predicted age of 7.54 years (cor=0.81, validation) when using all 75 Y-CpGs, and a MAD of 8.46 years (cor=0.73, validation) based on the most predictive 19 Y-CpGs. The accuracies of both age predictors did not worsen with increased age, in contrast to autosomal CpG-based age predictors that are known to predict age with reduced accuracy in the elderly. Overall, we introduce the first-of-its-kind male-specific epigenetic age predictor for future applications in aging research and forensics.
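The two accuracy measures quoted in this abstract, the mean absolute deviation (MAD) between true and predicted age and their correlation, can be computed as in the short Python sketch below. The ages are toy values invented for illustration and are unrelated to the study's cohorts. The function names are hypothetical, and Pearson correlation is an assumption, since the abstract only writes "cor".

```python
import numpy as np

def mean_absolute_deviation(true_age, predicted_age):
    """Average absolute difference between predicted and true age, in years."""
    true_age = np.asarray(true_age, dtype=float)
    predicted_age = np.asarray(predicted_age, dtype=float)
    return float(np.mean(np.abs(predicted_age - true_age)))

def pearson_correlation(true_age, predicted_age):
    """Pearson correlation between true and predicted age (assumed measure)."""
    return float(np.corrcoef(true_age, predicted_age)[0, 1])

# Toy ages in years (illustrative only).
true_age = [20, 35, 50, 65]
predicted_age = [25, 30, 58, 61]

mad = mean_absolute_deviation(true_age, predicted_age)
cor = pearson_correlation(true_age, predicted_age)
print(mad)  # absolute errors are 5, 5, 8 and 4, so this prints 5.5
```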


Subjects
Aging/genetics, Human Y Chromosomes, DNA Methylation, Adolescent, Adult, Aged, Aged 80 and over, Algorithms, CpG Islands, Genetic Epigenesis, Humans, Machine Learning, Male, Middle Aged, Genetic Models, Support Vector Machine, Young Adult
5.
Genome Biol ; 22(1): 18, 2021 01 05.
Article in English | MEDLINE | ID: mdl-33402197

ABSTRACT

BACKGROUND: Although the genomes of monozygotic twins are practically identical, their methylomes may evolve divergently throughout their lifetime as a consequence of factors such as the environment or aging. Particularly for young and healthy monozygotic twins, DNA methylation divergence, if any, may be restricted to stochastic processes occurring post-twinning during embryonic development and early life. However, to what extent such stochastic mechanisms can systematically provide a stable source of inter-individual epigenetic variation remains uncertain. RESULTS: We enriched for inter-individual stochastic variation by using an equivalence testing-based statistical approach on whole blood methylation microarray data from healthy adolescent monozygotic twins. As a result, we identified 333 CpGs displaying similarly large methylation variation between monozygotic co-twins and unrelated individuals. Although their methylation variation surpasses measurement error and is stable on a short timescale, susceptibility to aging is apparent in the long term. Additionally, 46% of these CpGs were replicated in adipose tissue. The identified sites are significantly enriched at the clustered protocadherin loci, known for stochastic methylation in developing neurons. We also confirmed an enrichment in monozygotic twin DNA methylation discordance at these loci in whole genome bisulfite sequencing data from blood and adipose tissue. CONCLUSIONS: We have isolated a component of stochastic methylation variation, distinct from genetic influence, measurement error, and epigenetic drift. Biomarkers enriched in this component may serve in the future as the basis for universal epigenetic fingerprinting, relevant for instance in the discrimination of monozygotic twin individuals in forensic applications, which is currently impossible with standard DNA profiling.


Subjects
DNA Methylation, Genetic Epigenesis, Monozygotic Twins/genetics, Adolescent, Adult, Aged, Aged 80 and over, Child, CpG Islands, Female, Human Genome, Humans, Male, Middle Aged, Young Adult