Search | BVS - MINISTÉRIO DA SAÚDE

InterpolatedXY: a two-step strategy to normalize DNA methylation microarray data avoiding sex bias.

Wang, Yucheng; Gorrie-Stone, Tyler J; Grant, Olivia A; Andrayas, Alexandria D; Zhai, Xiaojun; McDonald-Maier, Klaus D; Schalkwyk, Leonard C.

Bioinformatics ; 38(16): 3950-3957, 2022 08 10.

Article in English | MEDLINE | ID: mdl-35771651

ABSTRACT

MOTIVATION: Data normalization is an essential step to reduce technical variation within and between arrays. Due to the different karyotypes and the effects of X chromosome inactivation, females and males exhibit distinct methylation patterns on sex chromosomes; thus, it poses a significant challenge to normalize sex chromosome data without introducing bias. Currently, existing methods do not provide unbiased solutions to normalize sex chromosome data; usually, they simply process autosomal and sex chromosomes indiscriminately. RESULTS: Here, we demonstrate that ignoring this sex difference will lead to introducing artificial sex bias, especially for thousands of autosomal CpGs. We present a novel two-step strategy (interpolatedXY) to address this issue, which is applicable to all quantile-based normalization methods. By this new strategy, the autosomal CpGs are first normalized independently by conventional methods, such as funnorm or dasen; then the corrected methylation values of sex chromosome-linked CpGs are estimated as the weighted average of their nearest neighbors on autosomes. The proposed two-step strategy can also be applied to other non-quantile-based normalization methods, as well as other array-based data types. Moreover, we propose a useful concept: the sex explained fraction of variance, to quantitatively measure the normalization effect. AVAILABILITY AND IMPLEMENTATION: The proposed methods are available by calling the function 'adjustedDasen' or 'adjustedFunnorm' in the latest wateRmelon package (https://github.com/schalkwyk/wateRmelon), with methods compatible with all the major workflows, including minfi. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
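The second step described in the abstract above (estimating corrected sex-chromosome values from the nearest autosomal neighbors) can be illustrated with a minimal Python sketch. This is not the wateRmelon implementation: the function name `interpolate_sex_probes`, the toy intensities, and the use of plain linear interpolation as the "weighted average" are all assumptions made for illustration.

```python
import numpy as np

def interpolate_sex_probes(raw_auto, norm_auto, raw_sex):
    """Map raw sex-chromosome values through the raw -> normalized
    relationship observed on autosomal probes of the same sample.

    Hypothetical helper for illustration only.
    """
    raw_auto = np.asarray(raw_auto, dtype=float)
    norm_auto = np.asarray(norm_auto, dtype=float)
    order = np.argsort(raw_auto)
    # np.interp takes a distance-weighted average of the two nearest
    # autosomal neighbours; values outside the autosomal range are
    # clamped to the end points.
    return np.interp(raw_sex, raw_auto[order], norm_auto[order])

# Toy data: autosomal values before and after a conventional normalization.
raw_auto = np.array([100.0, 200.0, 400.0, 800.0])
norm_auto = np.array([120.0, 210.0, 390.0, 760.0])

# Sex-chromosome values were left out of step one and are corrected here.
corrected = interpolate_sex_probes(raw_auto, norm_auto, [150.0, 600.0, 1000.0])
print(corrected)
```

Because the sex-chromosome probes never enter the normalization itself, they cannot distort the autosomal reference distribution, which is the point the abstract makes about avoiding artificial sex bias.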

Subjects

DNA Methylation; Sexism; Female; Male; Humans; Oligonucleotide Array Sequence Analysis/methods; Protein Processing, Post-Translational

DNA methylation-based sex classifier to predict sex and identify sex chromosome aneuploidy.

Wang, Yucheng; Hannon, Eilis; Grant, Olivia A; Gorrie-Stone, Tyler J; Kumari, Meena; Mill, Jonathan; Zhai, Xiaojun; McDonald-Maier, Klaus D; Schalkwyk, Leonard C.

BMC Genomics ; 22(1): 484, 2021 Jun 28.

Article in English | MEDLINE | ID: mdl-34182928

ABSTRACT

BACKGROUND: Sex is an important covariate of epigenome-wide association studies due to its strong influence on DNA methylation patterns across numerous genomic positions. Nevertheless, many samples on the Gene Expression Omnibus (GEO) frequently lack a sex annotation or are incorrectly labelled. Considering the influence that sex imposes on DNA methylation patterns, it is necessary to ensure that methods for filtering poor samples and checking of sex assignment are accurate and widely applicable. RESULTS: Here we present a novel method to predict sex using only DNA methylation beta values, which can be readily applied to almost all DNA methylation datasets of different formats (raw IDATs or text files with only signal intensities) uploaded to GEO. We identified 4345 significantly (p<0.01) sex-associated CpG sites present on both 450K and EPIC arrays, and constructed a sex classifier based on the first two principal components of the DNA methylation data of sex-associated probes mapped on sex chromosomes. The proposed method is constructed using whole blood samples and exhibits good performance across a wide range of tissues. We further demonstrated that our method can be used to identify samples with sex chromosome aneuploidy; this function is validated by five Turner syndrome cases and one Klinefelter syndrome case. CONCLUSIONS: The proposed sex classifier can be used not only for sex prediction but also to identify samples with sex chromosome aneuploidy, and it is freely and easily accessible by calling the 'estimateSex' function from the newest wateRmelon Bioconductor package ( https://github.com/schalkwyk/wateRmelon ).
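As a rough illustration of the idea in the abstract above (separating samples by the first two principal components of sex-chromosome probes), the following Python sketch projects simulated beta values onto their leading components. It is not the 'estimateSex' function: the helper name `first_two_pcs`, the simulated group means, and the probe count are hypothetical, and the real classifier relies on the sex-associated probes selected in the paper.

```python
import numpy as np

def first_two_pcs(betas):
    """Project a samples x probes matrix of beta values onto its
    first two principal components (hypothetical helper)."""
    centered = betas - betas.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Simulated X-linked probes: the two groups differ in mean methylation.
# The means 0.5 and 0.15 are illustrative values, not taken from the paper.
rng = np.random.default_rng(1)
n_probes = 50
group_a = rng.normal(0.50, 0.02, size=(10, n_probes))
group_b = rng.normal(0.15, 0.02, size=(10, n_probes))

scores = first_two_pcs(np.vstack([group_a, group_b]))
pc1 = scores[:, 0]
print(np.sign(pc1))
```

In this toy setting the two groups fall on opposite sides of zero along the first component. The sign of a principal component is arbitrary, so any real decision rule has to be anchored to reference samples of known sex.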

Subjects

DNA Methylation; Genomics; Aneuploidy; CpG Islands; Humans; Sex Chromosomes/genetics

Fragment binding to the Nsp3 macrodomain of SARS-CoV-2 identified through crystallographic screening and computational docking.

Schuller, Marion; Correy, Galen J; Gahbauer, Stefan; Fearon, Daren; Wu, Taiasean; Díaz, Roberto Efraín; Young, Iris D; Carvalho Martins, Luan; Smith, Dominique H; Schulze-Gahmen, Ursula; Owens, Tristan W; Deshpande, Ishan; Merz, Gregory E; Thwin, Aye C; Biel, Justin T; Peters, Jessica K; Moritz, Michelle; Herrera, Nadia; Kratochvil, Huong T; Aimon, Anthony; Bennett, James M; Brandao Neto, Jose; Cohen, Aina E; Dias, Alexandre; Douangamath, Alice; Dunnett, Louise; Fedorov, Oleg; Ferla, Matteo P; Fuchs, Martin R; Gorrie-Stone, Tyler J; Holton, James M; Johnson, Michael G; Krojer, Tobias; Meigs, George; Powell, Ailsa J; Rack, Johannes Gregor Matthias; Rangel, Victor L; Russi, Silvia; Skyner, Rachael E; Smith, Clyde A; Soares, Alexei S; Wierman, Jennifer L; Zhu, Kang; O'Brien, Peter; Jura, Natalia; Ashworth, Alan; Irwin, John J; Thompson, Michael C; Gestwicki, Jason E; von Delft, Frank.

Sci Adv ; 7(16), 2021 04.

Article in English | MEDLINE | ID: mdl-33853786

ABSTRACT

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) macrodomain within the nonstructural protein 3 counteracts host-mediated antiviral adenosine diphosphate-ribosylation signaling. This enzyme is a promising antiviral target because catalytic mutations render viruses nonpathogenic. Here, we report a massive crystallographic screening and computational docking effort, identifying new chemical matter primarily targeting the active site of the macrodomain. Crystallographic screening of 2533 diverse fragments resulted in 214 unique macrodomain-binders. An additional 60 molecules were selected from docking more than 20 million fragments, of which 20 were crystallographically confirmed. X-ray data collection to ultra-high resolution and at physiological temperature enabled assessment of the conformational heterogeneity around the active site. Several fragment hits were confirmed by solution binding using three biophysical techniques (differential scanning fluorimetry, homogeneous time-resolved fluorescence, and isothermal titration calorimetry). The 234 fragment structures explore a wide range of chemotypes and provide starting points for development of potent SARS-CoV-2 macrodomain inhibitors.

Subjects

Catalytic Domain/physiology; Protein Binding/physiology; Viral Nonstructural Proteins/metabolism; Catalytic Domain/genetics; Crystallography, X-Ray; Humans; Models, Molecular; Molecular Docking Simulation; Protein Conformation; SARS-CoV-2/genetics; SARS-CoV-2/physiology; Viral Nonstructural Proteins/genetics; COVID-19 Drug Treatment

Fragment Binding to the Nsp3 Macrodomain of SARS-CoV-2 Identified Through Crystallographic Screening and Computational Docking.

Schuller, Marion; Correy, Galen J; Gahbauer, Stefan; Fearon, Daren; Wu, Taiasean; Díaz, Roberto Efraín; Young, Iris D; Martins, Luan Carvalho; Smith, Dominique H; Schulze-Gahmen, Ursula; Owens, Tristan W; Deshpande, Ishan; Merz, Gregory E; Thwin, Aye C; Biel, Justin T; Peters, Jessica K; Moritz, Michelle; Herrera, Nadia; Kratochvil, Huong T; Aimon, Anthony; Bennett, James M; Neto, Jose Brandao; Cohen, Aina E; Dias, Alexandre; Douangamath, Alice; Dunnett, Louise; Fedorov, Oleg; Ferla, Matteo P; Fuchs, Martin; Gorrie-Stone, Tyler J; Holton, James M; Johnson, Michael G; Krojer, Tobias; Meigs, George; Powell, Ailsa J; Rangel, Victor L; Russi, Silvia; Skyner, Rachael E; Smith, Clyde A; Soares, Alexei S; Wierman, Jennifer L; Zhu, Kang; Jura, Natalia; Ashworth, Alan; Irwin, John; Thompson, Michael C; Gestwicki, Jason E; von Delft, Frank; Shoichet, Brian K; Fraser, James S.

bioRxiv ; 2020 Nov 24.

Article in English | MEDLINE | ID: mdl-33269349

ABSTRACT

The SARS-CoV-2 macrodomain (Mac1) within the non-structural protein 3 (Nsp3) counteracts host-mediated antiviral ADP-ribosylation signalling. This enzyme is a promising antiviral target because catalytic mutations render viruses non-pathogenic. Here, we report a massive crystallographic screening and computational docking effort, identifying new chemical matter primarily targeting the active site of the macrodomain. Crystallographic screening of diverse fragment libraries resulted in 214 unique macrodomain-binding fragments, out of 2,683 screened. An additional 60 molecules were selected from docking over 20 million fragments, of which 20 were crystallographically confirmed. X-ray data collection to ultra-high resolution and at physiological temperature enabled assessment of the conformational heterogeneity around the active site. Several crystallographic and docking fragment hits were validated for solution binding using three biophysical techniques (DSF, HTRF, ITC). Overall, the 234 fragment structures presented explore a wide range of chemotypes and provide starting points for development of potent SARS-CoV-2 macrodomain inhibitors.

The DNA methylome of human sperm is distinct from blood with little evidence for tissue-consistent obesity associations.

Åsenius, Fredrika; Gorrie-Stone, Tyler J; Brew, Ama; Panchbhaya, Yasmin; Williamson, Elizabeth; Schalkwyk, Leonard C; Rakyan, Vardhman K; Holland, Michelle L; Marzi, Sarah J; Williams, David J.

PLoS Genet ; 16(10): e1009035, 2020 10.

Article in English | MEDLINE | ID: mdl-33048947

ABSTRACT

Epidemiological research suggests that paternal obesity may increase the risk of fathering small for gestational age offspring. Studies in non-human mammals indicate that such associations could be mediated by DNA methylation changes in spermatozoa that influence offspring development in utero. Human obesity is associated with differential DNA methylation in peripheral blood. It is unclear, however, whether this differential DNA methylation is reflected in spermatozoa. We profiled genome-wide DNA methylation using the Illumina MethylationEPIC array in a cross-sectional study of matched human blood and sperm from lean (discovery n = 47; replication n = 21) and obese (n = 22) males to analyse tissue covariation of DNA methylation, and identify obesity-associated methylomic signatures. We found that DNA methylation signatures of human blood and spermatozoa are highly discordant, and methylation levels are correlated at only a minority of CpG sites (~1%). At the majority of these sites, DNA methylation appears to be influenced by genetic variation. Obesity-associated DNA methylation in blood was not generally reflected in spermatozoa, and obesity was not associated with altered covariation patterns or accelerated epigenetic ageing in the two tissues. However, one cross-tissue obesity-specific hypermethylated site (cg19357369; chr4:2429884; P = 8.95 × 10⁻⁸; 2% DNA methylation difference) was identified, warranting replication and further investigation. When compared to a wide range of human somatic tissue samples (n = 5,917), spermatozoa displayed differential DNA methylation across pathways enriched in transcriptional regulation. Overall, human sperm displays a unique DNA methylation profile that is highly discordant to, and practically uncorrelated with, that of matched peripheral blood. We observed that obesity was only nominally associated with differential DNA methylation in sperm, and therefore suggest that spermatozoal DNA methylation is an unlikely mediator of intergenerational effects of metabolic traits.

Subjects

DNA Methylation/genetics; Epigenome/genetics; Obesity/genetics; Spermatozoa/metabolism; Adolescent; Adult; Body Mass Index; Child; Child, Preschool; CpG Islands/genetics; DNA Replication/genetics; Epigenesis, Genetic/genetics; Gene Expression Profiling; Gene Expression Regulation/genetics; Genome, Human/genetics; Gestational Age; Humans; Infant; Infant, Newborn; Male; Middle Aged; Obesity/blood; Obesity/epidemiology; Obesity/pathology; Polymorphism, Single Nucleotide/genetics; Spermatozoa/growth & development; Spermatozoa/immunology; Young Adult

Guidance for DNA methylation studies: statistical insights from the Illumina EPIC array.

Mansell, Georgina; Gorrie-Stone, Tyler J; Bao, Yanchun; Kumari, Meena; Schalkwyk, Leonard S; Mill, Jonathan; Hannon, Eilis.

BMC Genomics ; 20(1): 366, 2019 May 14.

Article in English | MEDLINE | ID: mdl-31088362

ABSTRACT

BACKGROUND: There has been a steady increase in the number of studies aiming to identify DNA methylation differences associated with complex phenotypes. Many of the challenges of epigenetic epidemiology regarding study design and interpretation have been discussed in detail, however there are analytical concerns that are outstanding and require further exploration. In this study we seek to address three analytical issues. First, we quantify the multiple testing burden and propose a standard statistical significance threshold for identifying DNA methylation sites that are associated with an outcome. Second, we establish whether linear regression, the chosen statistical tool for the majority of studies, is appropriate and whether it is biased by the underlying distribution of DNA methylation data. Finally, we assess the sample size required for adequately powered DNA methylation association studies. RESULTS: We quantified DNA methylation in the Understanding Society cohort (n = 1175), a large population based study, using the Illumina EPIC array to assess the statistical properties of DNA methylation association analyses. By simulating null DNA methylation studies, we generated the distribution of p-values expected by chance and calculated the 5% family-wise error for EPIC array studies to be 9 × 10⁻⁸. Next, we tested whether the assumptions of linear regression are violated by DNA methylation data and found that the majority of sites do not satisfy the assumption of normal residuals. Nevertheless, we found no evidence that this bias influences analyses by increasing the likelihood of affected sites to be false positives. Finally, we performed power calculations for EPIC based DNA methylation studies, demonstrating that existing studies with data on ~1000 samples are adequately powered to detect small differences at the majority of sites. CONCLUSION: We propose that a significance threshold of P < 9 × 10⁻⁸ adequately controls the false positive rate for EPIC array DNA methylation studies. Moreover, our results indicate that linear regression is a valid statistical methodology for DNA methylation studies, despite the fact that the data do not always satisfy the assumptions of this test. These findings have implications for epidemiological-based studies of DNA methylation and provide a framework for the interpretation of findings from current and future studies.
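The simulation-based threshold mentioned in the abstract above follows a general recipe: run many null studies, record the smallest p-value in each, and take the 5th percentile of those minima as the 5% family-wise error threshold. The Python sketch below shows that recipe on toy data. It assumes independent uniform p-values and a hypothetical helper `fwer_threshold`; real methylation sites are correlated, so a faithful null would be simulated on the actual data matrix, and the sketch does not reproduce the published value.

```python
import numpy as np

def fwer_threshold(min_pvalues, alpha=0.05):
    """Return the p-value cut-off that the smallest p-value of a null
    study falls below in only `alpha` of simulations (hypothetical helper)."""
    return float(np.quantile(min_pvalues, alpha))

rng = np.random.default_rng(0)
n_sims, n_sites = 2000, 1000  # toy sizes, far smaller than an EPIC array

# Under the null, each site's p-value is uniform on [0, 1]; keep the
# minimum across sites for every simulated study.
min_p = rng.uniform(size=(n_sims, n_sites)).min(axis=1)

threshold = fwer_threshold(min_p)
print(f"5% family-wise threshold for {n_sites} independent tests: {threshold:.2e}")
```

With independent tests the estimate lands close to the Bonferroni value of 0.05 divided by the number of sites. Correlation between sites reduces the effective number of tests, which is why an empirical threshold can be less stringent than a Bonferroni correction over every probe on the array.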

Subjects

DNA Methylation; Epigenomics/methods; Oligonucleotide Array Sequence Analysis/methods; CpG Islands; Epigenesis, Genetic; Genome-Wide Association Study; Humans; Linear Models

Bigmelon: tools for analysing large DNA methylation datasets.

Gorrie-Stone, Tyler J; Smart, Melissa C; Saffari, Ayden; Malki, Karim; Hannon, Eilis; Burrage, Joe; Mill, Jonathan; Kumari, Meena; Schalkwyk, Leonard C.

Bioinformatics ; 35(6): 981-986, 2019 03 15.

Article in English | MEDLINE | ID: mdl-30875430

ABSTRACT

MOTIVATION: The datasets generated by DNA methylation analyses are getting bigger. With the release of the HumanMethylationEPIC micro-array and datasets containing thousands of samples, analyses of these large datasets using R are becoming impractical due to large memory requirements. As a result there is an increasing need for computationally efficient methodologies to perform meaningful analysis on high dimensional data. RESULTS: Here we introduce the bigmelon R package, which provides a memory efficient workflow that enables users to perform the complex, large scale analyses required in epigenome wide association studies (EWAS) without the need for large RAM. Building on top of the CoreArray Genomic Data Structure file format and libraries packaged in the gdsfmt package, we provide a practical workflow that facilitates the reading-in, preprocessing, quality control and statistical analysis of DNA methylation data. We demonstrate the capabilities of the bigmelon package using a large dataset consisting of 1193 human blood samples from the Understanding Society: UK Household Longitudinal Study, assayed on the EPIC micro-array platform. AVAILABILITY AND IMPLEMENTATION: The bigmelon package is available on Bioconductor (http://bioconductor.org/packages/bigmelon/). The Understanding Society dataset is available at https://www.understandingsociety.ac.uk/about/health/data upon request. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

DNA Methylation; Software; Genomics; Humans; Longitudinal Studies; Workflow

Leveraging DNA-Methylation Quantitative-Trait Loci to Characterize the Relationship between Methylomic Variation, Gene Expression, and Complex Traits.

Hannon, Eilis; Gorrie-Stone, Tyler J; Smart, Melissa C; Burrage, Joe; Hughes, Amanda; Bao, Yanchun; Kumari, Meena; Schalkwyk, Leonard C; Mill, Jonathan.

Am J Hum Genet ; 103(5): 654-665, 2018 11 01.

Article in English | MEDLINE | ID: mdl-30401456

ABSTRACT

Characterizing the complex relationship between genetic, epigenetic, and transcriptomic variation has the potential to increase understanding about the mechanisms underpinning health and disease phenotypes. We undertook a comprehensive analysis of common genetic variation on DNA methylation (DNAm) by using the Illumina EPIC array to profile samples from the UK Household Longitudinal study. We identified 12,689,548 significant DNA methylation quantitative trait loci (mQTL) associations (p < 6.52 × 10⁻¹⁴) occurring between 2,907,234 genetic variants and 93,268 DNAm sites, including a large number not identified by previous DNAm-profiling methods. We demonstrate the utility of these data for interpreting the functional consequences of common genetic variation associated with >60 human traits by using summary-data-based Mendelian randomization (SMR) to identify 1,662 pleiotropic associations between 36 complex traits and 1,246 DNAm sites. We also use SMR to characterize the relationship between DNAm and gene expression and thereby identify 6,798 pleiotropic associations between 5,420 DNAm sites and the transcription of 1,702 genes. Our mQTL database and SMR results are available via a searchable online database as a resource to the research community.

Subjects

DNA Methylation/genetics; DNA/genetics; Epigenesis, Genetic/genetics; Gene Expression/genetics; Genetic Variation/genetics; Quantitative Trait Loci/genetics; Transcriptome/genetics; Genome-Wide Association Study/methods; Humans; Longitudinal Studies; Phenotype; Quantitative Trait, Heritable; Transcription, Genetic/genetics
