1.
Bioinformatics ; 38(16): 3950-3957, 2022 08 10.
Article in English | MEDLINE | ID: mdl-35771651

ABSTRACT

MOTIVATION: Data normalization is an essential step to reduce technical variation within and between arrays. Due to the different karyotypes and the effects of X chromosome inactivation, females and males exhibit distinct methylation patterns on sex chromosomes; thus, it is a significant challenge to normalize sex chromosome data without introducing bias. Existing methods do not provide unbiased solutions for normalizing sex chromosome data; they usually process autosomal and sex chromosomes indiscriminately. RESULTS: Here, we demonstrate that ignoring this sex difference introduces artificial sex bias, especially for thousands of autosomal CpGs. We present a novel two-step strategy (interpolatedXY) to address this issue, which is applicable to all quantile-based normalization methods. With this strategy, the autosomal CpGs are first normalized independently by conventional methods, such as funnorm or dasen; then the corrected methylation values of sex chromosome-linked CpGs are estimated as the weighted average of their nearest neighbors on autosomes. The proposed two-step strategy can also be applied to other non-quantile-based normalization methods, as well as other array-based data types. Moreover, we propose a useful concept, the sex explained fraction of variance, to quantitatively measure the normalization effect. AVAILABILITY AND IMPLEMENTATION: The proposed methods are available by calling the function 'adjustedDasen' or 'adjustedFunnorm' in the latest wateRmelon package (https://github.com/schalkwyk/wateRmelon), with methods compatible with all the major workflows, including minfi. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
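
The two-step idea described in this abstract can be illustrated with a minimal Python sketch. It is not the wateRmelon implementation: plain quantile normalization stands in for funnorm or dasen, and "nearest neighbors" is assumed to mean the two autosomal probes whose raw values bracket the sex-chromosome probe within the same sample. The function names `quantile_normalize` and `interpolate_from_autosomes` are hypothetical.

```python
from bisect import bisect_left


def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (one list of values per sample).

    Stand-in for step 1 (autosomes only); assumed here, not taken from the paper.
    """
    n_probes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Target distribution: mean of the k-th smallest value across samples.
    target = [sum(s[k] for s in sorted_samples) / len(samples)
              for k in range(n_probes)]
    normalized = []
    for sample in samples:
        order = sorted(range(n_probes), key=sample.__getitem__)
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = target[rank]
        normalized.append(out)
    return normalized


def interpolate_from_autosomes(raw_auto, norm_auto, raw_sex):
    """Step 2 (assumed form): map each raw sex-chromosome value onto the
    corrected scale as a weighted average of the two autosomal probes whose
    raw values bracket it in the same sample."""
    pairs = sorted(zip(raw_auto, norm_auto))
    xs = [raw for raw, _ in pairs]
    ys = [norm for _, norm in pairs]
    corrected = []
    for value in raw_sex:
        i = bisect_left(xs, value)
        if i == 0:                 # below every autosomal value
            corrected.append(ys[0])
        elif i == len(xs):         # above every autosomal value
            corrected.append(ys[-1])
        else:
            span = xs[i] - xs[i - 1]
            weight = (value - xs[i - 1]) / span if span > 0 else 0.0
            corrected.append((1 - weight) * ys[i - 1] + weight * ys[i])
    return corrected


# Toy data: two samples, three autosomal probes and two sex-linked probes each.
raw_auto = [[0.1, 0.5, 0.9], [0.2, 0.6, 1.0]]
raw_sex = [[0.3, 0.95], [0.4, 0.1]]
norm_auto = quantile_normalize(raw_auto)
norm_sex = [interpolate_from_autosomes(a, n, s)
            for a, n, s in zip(raw_auto, norm_auto, raw_sex)]
print(norm_auto)
print(norm_sex)
```

Because the sex-linked probes never enter the target distribution, they cannot pull the autosomal values of one sex toward the other, which is the bias the abstract describes.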


Subject(s)
DNA Methylation , Sexism , Female , Male , Humans , Oligonucleotide Array Sequence Analysis/methods , Protein Processing, Post-Translational
2.
BMC Genomics ; 22(1): 484, 2021 Jun 28.
Article in English | MEDLINE | ID: mdl-34182928

ABSTRACT

BACKGROUND: Sex is an important covariate of epigenome-wide association studies due to its strong influence on DNA methylation patterns across numerous genomic positions. Nevertheless, many samples on the Gene Expression Omnibus (GEO) lack a sex annotation or are incorrectly labelled. Considering the influence that sex imposes on DNA methylation patterns, it is necessary to ensure that methods for filtering poor samples and checking sex assignment are accurate and widely applicable. RESULTS: Here we present a novel method to predict sex using only DNA methylation beta values, which can be readily applied to almost all DNA methylation datasets of different formats (raw IDATs or text files with only signal intensities) uploaded to GEO. We identified 4345 significantly (p < 0.01) sex-associated CpG sites present on both 450K and EPIC arrays, and constructed a sex classifier based on the first two principal components of the DNA methylation data of sex-associated probes mapped on sex chromosomes. The proposed method is constructed using whole blood samples and exhibits good performance across a wide range of tissues. We further demonstrate that our method can be used to identify samples with sex chromosome aneuploidy; this function was validated with five Turner syndrome cases and one Klinefelter syndrome case. CONCLUSIONS: The proposed sex classifier can be used not only for sex prediction but also to identify samples with sex chromosome aneuploidy, and it is freely and easily accessible by calling the 'estimateSex' function from the newest wateRmelon Bioconductor package ( https://github.com/schalkwyk/wateRmelon ).
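
As a rough illustration of a classifier built on the first two principal components, the Python sketch below fits the components on labelled reference samples and assigns a new sample to the nearest class centroid in component space. The abstract does not state the decision rule, so the nearest-centroid step is an assumption, as are the power-iteration PCA, the toy beta values, and the function names `fit_sex_classifier` and `predict_sex`. This is not the 'estimateSex' implementation.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def leading_axis(rows, rng, iterations=100):
    """Leading principal axis of already-centered rows, by power iteration."""
    start = [rng.gauss(0.0, 1.0) for _ in rows[0]]
    start_norm = math.sqrt(dot(start, start))
    axis = [v / start_norm for v in start]
    for _ in range(iterations):
        scores = [dot(row, axis) for row in rows]
        candidate = [sum(s * row[j] for s, row in zip(scores, rows))
                     for j in range(len(axis))]
        norm = math.sqrt(dot(candidate, candidate))
        if norm == 0.0:
            break
        axis = [c / norm for c in candidate]
    return axis


def fit_sex_classifier(reference, labels, n_components=2, seed=0):
    """reference: one row of beta values per sample (sex-associated probes only)."""
    rng = random.Random(seed)
    n = len(reference)
    means = [sum(col) / n for col in zip(*reference)]
    centered = [[v - m for v, m in zip(row, means)] for row in reference]
    axes, residual = [], centered
    for _ in range(n_components):
        axis = leading_axis(residual, rng)
        axes.append(axis)
        # Deflation: remove the variance already captured by this axis.
        residual = [[v - dot(row, axis) * a for v, a in zip(row, axis)]
                    for row in residual]
    centroids = {}
    for label in set(labels):
        members = [[dot(row, ax) for ax in axes]
                   for row, lab in zip(centered, labels) if lab == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return means, axes, centroids


def predict_sex(model, sample):
    """Assumed rule: nearest class centroid in the space of the fitted components."""
    means, axes, centroids = model
    centered = [v - m for v, m in zip(sample, means)]
    scores = [dot(centered, ax) for ax in axes]
    return min(centroids,
               key=lambda lab: sum((s - c) ** 2
                                   for s, c in zip(scores, centroids[lab])))


# Toy reference: three X-linked probes followed by two Y-linked probes.
reference = [
    [0.50, 0.48, 0.52, 0.20, 0.22],
    [0.52, 0.50, 0.47, 0.18, 0.21],
    [0.49, 0.53, 0.50, 0.21, 0.19],
    [0.10, 0.12, 0.09, 0.80, 0.78],
    [0.11, 0.08, 0.12, 0.82, 0.79],
    [0.09, 0.10, 0.11, 0.79, 0.81],
]
labels = ["F", "F", "F", "M", "M", "M"]
model = fit_sex_classifier(reference, labels)
print(predict_sex(model, [0.51, 0.49, 0.50, 0.20, 0.20]))
print(predict_sex(model, [0.10, 0.10, 0.10, 0.80, 0.80]))
```

A sample that falls far from both centroids would be a candidate for closer inspection, for example for the sex chromosome aneuploidies mentioned in the abstract; this sketch does not attempt that step.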


Subject(s)
DNA Methylation , Genomics , Aneuploidy , CpG Islands , Humans , Sex Chromosomes/genetics
3.
Sci Adv ; 7(16), 2021 04.
Article in English | MEDLINE | ID: mdl-33853786

ABSTRACT

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) macrodomain within the nonstructural protein 3 counteracts host-mediated antiviral adenosine diphosphate-ribosylation signaling. This enzyme is a promising antiviral target because catalytic mutations render viruses nonpathogenic. Here, we report a massive crystallographic screening and computational docking effort, identifying new chemical matter primarily targeting the active site of the macrodomain. Crystallographic screening of 2533 diverse fragments resulted in 214 unique macrodomain-binders. An additional 60 molecules were selected from docking more than 20 million fragments, of which 20 were crystallographically confirmed. X-ray data collection to ultra-high resolution and at physiological temperature enabled assessment of the conformational heterogeneity around the active site. Several fragment hits were confirmed by solution binding using three biophysical techniques (differential scanning fluorimetry, homogeneous time-resolved fluorescence, and isothermal titration calorimetry). The 234 fragment structures explore a wide range of chemotypes and provide starting points for development of potent SARS-CoV-2 macrodomain inhibitors.


Subject(s)
Catalytic Domain/physiology , Protein Binding/physiology , Viral Nonstructural Proteins/metabolism , Catalytic Domain/genetics , Crystallography, X-Ray , Humans , Models, Molecular , Molecular Docking Simulation , Protein Conformation , SARS-CoV-2/genetics , SARS-CoV-2/physiology , Viral Nonstructural Proteins/genetics , COVID-19 Drug Treatment
4.
bioRxiv ; 2020 Nov 24.
Article in English | MEDLINE | ID: mdl-33269349

ABSTRACT

The SARS-CoV-2 macrodomain (Mac1) within the non-structural protein 3 (Nsp3) counteracts host-mediated antiviral ADP-ribosylation signalling. This enzyme is a promising antiviral target because catalytic mutations render viruses non-pathogenic. Here, we report a massive crystallographic screening and computational docking effort, identifying new chemical matter primarily targeting the active site of the macrodomain. Crystallographic screening of diverse fragment libraries resulted in 214 unique macrodomain-binding fragments, out of 2,683 screened. An additional 60 molecules were selected from docking over 20 million fragments, of which 20 were crystallographically confirmed. X-ray data collection to ultra-high resolution and at physiological temperature enabled assessment of the conformational heterogeneity around the active site. Several crystallographic and docking fragment hits were validated for solution binding using three biophysical techniques (DSF, HTRF, ITC). Overall, the 234 fragment structures presented explore a wide range of chemotypes and provide starting points for development of potent SARS-CoV-2 macrodomain inhibitors.

5.
PLoS Genet ; 16(10): e1009035, 2020 10.
Article in English | MEDLINE | ID: mdl-33048947

ABSTRACT

Epidemiological research suggests that paternal obesity may increase the risk of fathering small for gestational age offspring. Studies in non-human mammals indicate that such associations could be mediated by DNA methylation changes in spermatozoa that influence offspring development in utero. Human obesity is associated with differential DNA methylation in peripheral blood. It is unclear, however, whether this differential DNA methylation is reflected in spermatozoa. We profiled genome-wide DNA methylation using the Illumina MethylationEPIC array in a cross-sectional study of matched human blood and sperm from lean (discovery n = 47; replication n = 21) and obese (n = 22) males to analyse tissue covariation of DNA methylation, and identify obesity-associated methylomic signatures. We found that DNA methylation signatures of human blood and spermatozoa are highly discordant, and methylation levels are correlated at only a minority of CpG sites (~1%). At the majority of these sites, DNA methylation appears to be influenced by genetic variation. Obesity-associated DNA methylation in blood was not generally reflected in spermatozoa, and obesity was not associated with altered covariation patterns or accelerated epigenetic ageing in the two tissues. However, one cross-tissue obesity-specific hypermethylated site (cg19357369; chr4:2429884; P = 8.95 × 10⁻⁸; 2% DNA methylation difference) was identified, warranting replication and further investigation. When compared to a wide range of human somatic tissue samples (n = 5,917), spermatozoa displayed differential DNA methylation across pathways enriched in transcriptional regulation. Overall, human sperm displays a unique DNA methylation profile that is highly discordant to, and practically uncorrelated with, that of matched peripheral blood. We observed that obesity was only nominally associated with differential DNA methylation in sperm, and therefore suggest that spermatozoal DNA methylation is an unlikely mediator of intergenerational effects of metabolic traits.


Subject(s)
DNA Methylation/genetics , Epigenome/genetics , Obesity/genetics , Spermatozoa/metabolism , Adolescent , Adult , Body Mass Index , Child , Child, Preschool , CpG Islands/genetics , DNA Replication/genetics , Epigenesis, Genetic/genetics , Gene Expression Profiling , Gene Expression Regulation/genetics , Genome, Human/genetics , Gestational Age , Humans , Infant , Infant, Newborn , Male , Middle Aged , Obesity/blood , Obesity/epidemiology , Obesity/pathology , Polymorphism, Single Nucleotide/genetics , Spermatozoa/growth & development , Spermatozoa/immunology , Young Adult
6.
BMC Genomics ; 20(1): 366, 2019 May 14.
Article in English | MEDLINE | ID: mdl-31088362

ABSTRACT

BACKGROUND: There has been a steady increase in the number of studies aiming to identify DNA methylation differences associated with complex phenotypes. Many of the challenges of epigenetic epidemiology regarding study design and interpretation have been discussed in detail; however, there are analytical concerns that are outstanding and require further exploration. In this study we seek to address three analytical issues. First, we quantify the multiple testing burden and propose a standard statistical significance threshold for identifying DNA methylation sites that are associated with an outcome. Second, we establish whether linear regression, the chosen statistical tool for the majority of studies, is appropriate and whether it is biased by the underlying distribution of DNA methylation data. Finally, we assess the sample size required for adequately powered DNA methylation association studies. RESULTS: We quantified DNA methylation in the Understanding Society cohort (n = 1175), a large population-based study, using the Illumina EPIC array to assess the statistical properties of DNA methylation association analyses. By simulating null DNA methylation studies, we generated the distribution of p-values expected by chance and calculated the 5% family-wise error for EPIC array studies to be 9 × 10⁻⁸. Next, we tested whether the assumptions of linear regression are violated by DNA methylation data and found that the majority of sites do not satisfy the assumption of normal residuals. Nevertheless, we found no evidence that this bias influences analyses by increasing the likelihood of affected sites to be false positives. Finally, we performed power calculations for EPIC-based DNA methylation studies, demonstrating that existing studies with data on ~1000 samples are adequately powered to detect small differences at the majority of sites. CONCLUSION: We propose that a significance threshold of P < 9 × 10⁻⁸ adequately controls the false positive rate for EPIC array DNA methylation studies. Moreover, our results indicate that linear regression is a valid statistical methodology for DNA methylation studies, despite the fact that the data do not always satisfy the assumptions of this test. These findings have implications for epidemiological-based studies of DNA methylation and provide a framework for the interpretation of findings from current and future studies.
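
The general recipe of deriving a family-wise error threshold from simulated null studies can be sketched in Python as follows. For each null study, keep the smallest p-value; the 5% family-wise threshold is then the value that only 5% of those minima fall at or below. The sketch assumes independent tests with uniform null p-values, which real methylation sites are not (they are correlated, which is why an empirical threshold can differ from a Bonferroni one). The function names `simulate_null_min_p` and `fwer_threshold` and the toy sizes are hypothetical.

```python
import random


def simulate_null_min_p(n_tests, n_studies, seed=1):
    """Smallest p-value from each simulated null study.

    Assumption: tests are independent, so null p-values are drawn as
    independent uniforms. A real analysis would instead recompute
    p-values on the observed data with a simulated or permuted outcome.
    """
    rng = random.Random(seed)
    return [min(rng.random() for _ in range(n_tests))
            for _ in range(n_studies)]


def fwer_threshold(min_p_values, alpha=0.05):
    """P-value cutoff at which a fraction alpha of null studies
    would contain at least one 'significant' site."""
    ordered = sorted(min_p_values)
    index = max(int(alpha * len(ordered)) - 1, 0)
    return ordered[index]


# Toy scale: 500 tests per study, 1000 null studies.
minima = simulate_null_min_p(n_tests=500, n_studies=1000)
threshold = fwer_threshold(minima, alpha=0.05)
print(threshold)
print(0.05 / 500)  # Bonferroni value for comparison under independence
```

Under the independence assumption the empirical threshold should land close to the Bonferroni value; with correlated sites the effective number of tests is smaller, so the empirical threshold is less stringent than Bonferroni.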


Subject(s)
DNA Methylation , Epigenomics/methods , Oligonucleotide Array Sequence Analysis/methods , CpG Islands , Epigenesis, Genetic , Genome-Wide Association Study , Humans , Linear Models
7.
Bioinformatics ; 35(6): 981-986, 2019 03 15.
Article in English | MEDLINE | ID: mdl-30875430

ABSTRACT

MOTIVATION: The datasets generated by DNA methylation analyses are getting bigger. With the release of the HumanMethylationEPIC micro-array and datasets containing thousands of samples, analyses of these large datasets using R are becoming impractical due to large memory requirements. As a result there is an increasing need for computationally efficient methodologies to perform meaningful analysis on high dimensional data. RESULTS: Here we introduce the bigmelon R package, which provides a memory efficient workflow that enables users to perform the complex, large scale analyses required in epigenome wide association studies (EWAS) without the need for large RAM. Building on top of the CoreArray Genomic Data Structure file format and libraries packaged in the gdsfmt package, we provide a practical workflow that facilitates the reading-in, preprocessing, quality control and statistical analysis of DNA methylation data. We demonstrate the capabilities of the bigmelon package using a large dataset consisting of 1193 human blood samples from the Understanding Society: UK Household Longitudinal Study, assayed on the EPIC micro-array platform. AVAILABILITY AND IMPLEMENTATION: The bigmelon package is available on Bioconductor (http://bioconductor.org/packages/bigmelon/). The Understanding Society dataset is available at https://www.understandingsociety.ac.uk/about/health/data upon request. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Methylation , Software , Genomics , Humans , Longitudinal Studies , Workflow
8.
Am J Hum Genet ; 103(5): 654-665, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30401456

ABSTRACT

Characterizing the complex relationship between genetic, epigenetic, and transcriptomic variation has the potential to increase understanding about the mechanisms underpinning health and disease phenotypes. We undertook a comprehensive analysis of common genetic variation on DNA methylation (DNAm) by using the Illumina EPIC array to profile samples from the UK Household Longitudinal study. We identified 12,689,548 significant DNA methylation quantitative trait loci (mQTL) associations (p < 6.52 × 10⁻¹⁴) occurring between 2,907,234 genetic variants and 93,268 DNAm sites, including a large number not identified by previous DNAm-profiling methods. We demonstrate the utility of these data for interpreting the functional consequences of common genetic variation associated with > 60 human traits by using summary-data-based Mendelian randomization (SMR) to identify 1,662 pleiotropic associations between 36 complex traits and 1,246 DNAm sites. We also use SMR to characterize the relationship between DNAm and gene expression and thereby identify 6,798 pleiotropic associations between 5,420 DNAm sites and the transcription of 1,702 genes. Our mQTL database and SMR results are available via a searchable online database as a resource to the research community.


Subject(s)
DNA Methylation/genetics , DNA/genetics , Epigenesis, Genetic/genetics , Gene Expression/genetics , Genetic Variation/genetics , Quantitative Trait Loci/genetics , Transcriptome/genetics , Genome-Wide Association Study/methods , Humans , Longitudinal Studies , Phenotype , Quantitative Trait, Heritable , Transcription, Genetic/genetics