ABSTRACT
BACKGROUND: Studies on the relationship between renal function and the human plasma proteome have identified several potential biomarkers. However, investigations have been conducted largely in European populations, and causality of the associations between plasma proteins and kidney function has never been addressed. METHODS: A cross-sectional study of 993 plasma proteins among 2882 participants in four studies of European and admixed ancestries (KORA, INTERVAL, HUNT, QMDiab) identified transethnic associations between eGFR/CKD and proteomic biomarkers. For the replicated associations, two-sample bidirectional Mendelian randomization (MR) was used to investigate potential causal relationships. Publicly available datasets and transcriptomic data from independent studies were used to examine the association between gene expression in kidney tissue and eGFR. RESULTS: In total, 57 plasma proteins were associated with eGFR, including one novel protein. Of these, 23 were additionally associated with CKD. The strongest inferred causal effect was the positive effect of eGFR on testican-2, in line with the known biological role of this protein and the expression of its protein-coding gene (SPOCK2) in renal tissue. We also observed suggestive evidence of an effect of melanoma inhibitory activity (MIA), carbonic anhydrase III, and cystatin-M on eGFR. CONCLUSIONS: In a discovery-replication setting, we identified 57 proteins transethnically associated with eGFR. The revealed causal relationships are an important stepping stone in establishing testican-2 as a clinically relevant physiological marker of kidney disease progression, and point to additional proteins warranting further investigation.
ABSTRACT
BACKGROUND: The metabolic syndrome (MetS), defined by the simultaneous clustering of cardio-metabolic risk factors, is a significant public health burden with an estimated worldwide prevalence of 25%. The pathogenesis of MetS is not entirely clear, and the use of molecular-level data could help uncover common pathogenic pathways behind the observed clustering. METHODS: Using a highly multiplexed aptamer-based affinity proteomics platform, we examined associations between plasma proteins and prevalent and incident MetS in the KORA cohort (n = 998) and replicated our results for prevalent MetS in the HUNT3 study (n = 923). We applied logistic regression models adjusted for age, sex, smoking status, and physical activity. We used the bootstrap ranking algorithm of least absolute shrinkage and selection operator (LASSO) to select a predictive model from the incident MetS-associated proteins and used the area under the curve (AUC) to assess its performance. Finally, we investigated the causal effect of the replicated proteins on MetS using two-sample Mendelian randomization. RESULTS: Prevalent MetS was associated with 116 proteins, of which 53 replicated in HUNT. These included previously reported proteins like leptin, and new proteins like NTR domain-containing protein 2 and endoplasmic reticulum protein 29. Incident MetS was associated with 14 proteins in KORA, of which 13 overlapped with the prevalent MetS-associated proteins, with soluble advanced glycosylation end product-specific receptor (sRAGE) being unique to incident MetS. The LASSO selected an eight-protein predictive model with an AUC of 0.75 (95% CI = 0.71-0.79) in KORA. Mendelian randomization suggested causal effects of three proteins on MetS, namely apolipoprotein E2 (APOE2) (Wald-Ratio = -0.12, Wald-p = 3.63e-13), apolipoprotein B (APOB) (Wald-Ratio = -0.09, Wald-p = 2.54e-04) and proto-oncogene tyrosine-protein kinase receptor (RET) (Wald-Ratio = 0.10, Wald-p = 5.40e-04).
CONCLUSIONS: Our findings offer new insights into the plasma proteome underlying MetS and identify new protein associations. We reveal possible causal effects of APOE2, APOB and RET on MetS. Our results highlight protein candidates that could potentially serve as targets for prevention and therapy.
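The Wald ratios reported in this abstract come from single-instrument Mendelian randomization. As a minimal sketch of how such an estimate is commonly computed (the function name, argument names and example numbers are hypothetical, and the standard error uses a first-order approximation that ignores uncertainty in the SNP-exposure effect, which may differ from what the authors did):

```python
import math


def wald_ratio(beta_outcome, beta_exposure, se_outcome):
    """Wald ratio estimate for a single genetic instrument.

    beta_exposure: SNP effect on the exposure (e.g. a plasma protein level)
    beta_outcome:  SNP effect on the outcome (e.g. MetS, on the log-odds scale)
    se_outcome:    standard error of beta_outcome
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_outcome / beta_exposure
    # First-order approximation: uncertainty in beta_exposure is ignored.
    se = se_outcome / abs(beta_exposure)
    z = estimate / se
    # Two-sided p-value under a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


# Illustrative (made-up) summary statistics, not values from the study.
estimate, se, p_value = wald_ratio(beta_outcome=-0.03, beta_exposure=0.25, se_outcome=0.01)
print(estimate, se, p_value)
```

A negative ratio, as reported for APOE2 and APOB, would correspond to higher genetically predicted protein levels being associated with lower MetS risk.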
Subject(s)
Blood Proteins/analysis , Metabolic Syndrome/blood , Proteome , Proteomics , Adult , Aged , Aged, 80 and over , Apolipoprotein B-100/blood , Apolipoprotein B-100/genetics , Apolipoprotein E2/blood , Apolipoprotein E2/genetics , Biomarkers/blood , Blood Proteins/genetics , Cardiometabolic Risk Factors , Cross-Sectional Studies , Female , Germany/epidemiology , Humans , Incidence , Male , Mendelian Randomization Analysis , Metabolic Syndrome/diagnosis , Metabolic Syndrome/epidemiology , Metabolic Syndrome/genetics , Middle Aged , Norway/epidemiology , Predictive Value of Tests , Prevalence , Prospective Studies , Proto-Oncogene Mas , Proto-Oncogene Proteins c-ret/blood , Proto-Oncogene Proteins c-ret/genetics , Risk Assessment
ABSTRACT
Epigenetic regulation of cellular function provides a mechanism for rapid organismal adaptation to changes in health, lifestyle and environment. Associations of cytosine-guanine di-nucleotide (CpG) methylation with clinical endpoints that overlap with metabolic phenotypes suggest a regulatory role for these CpG sites in the body's response to disease or environmental stress. We previously identified 20 CpG sites in an epigenome-wide association study (EWAS) with metabolomics that were also associated in recent EWASs with diabetes-, obesity-, and smoking-related endpoints. To elucidate the molecular pathways that connect these potentially regulatory CpG sites to the associated disease or lifestyle factors, we conducted a multi-omics association study including 2474 mass-spectrometry-based metabolites in plasma, urine and saliva, 225 NMR-based lipid and metabolite measures in blood, 1124 blood-circulating proteins using aptamer technology, 113 plasma protein N-glycans and 60 IgG-glycans, using 359 samples from the multi-ethnic Qatar Metabolomics Study on Diabetes (QMDiab). We report 138 multi-omics associations at these CpG sites, including diabetes biomarkers at the diabetes-associated TXNIP locus, and smoking-specific metabolites and proteins at multiple smoking-associated loci, including AHRR. Mendelian randomization suggests a causal effect of metabolite levels on methylation of obesity-associated CpG sites, i.e. of glycerophospholipid PC(O-36:5), glycine and a very low-density lipoprotein (VLDL-A) on the methylation of the obesity-associated CpG loci DHCR24, MYO5C and CPT1A, respectively. Taken together, our study suggests that multi-omics-associated CpG methylation can provide functional read-outs for the underlying regulatory response mechanisms to disease or environmental insults.
Subject(s)
CpG Islands , DNA Methylation , Glucose Metabolism Disorders/genetics , Obesity/genetics , Tobacco Smoking/genetics , Basic Helix-Loop-Helix Transcription Factors/genetics , Carrier Proteins/genetics , Computational Biology/methods , Epigenesis, Genetic , Female , Genetic Association Studies/methods , Genome, Human , Genome-Wide Association Study/methods , Humans , Lipids/blood , Male , Metabolome , Repressor Proteins/genetics
ABSTRACT
BACKGROUND: Family-based designs, from twin studies to isolated populations with their complex genealogical data, are a valuable resource for genetic studies of heritable molecular biomarkers. Existing software for family-based studies has mainly focused on facilitating association between response phenotypes and genetic markers, and no user-friendly tools are currently available to straightforwardly extend association studies in related samples to large datasets of generic quantitative data, such as those generated by current -omics technologies. RESULTS: We developed PopPAnTe, a user-friendly Java program, which evaluates the association of quantitative data in related samples. Additionally, PopPAnTe implements data pre- and post-processing, region-based testing, and empirical assessment of associations. CONCLUSIONS: PopPAnTe is an integrated and flexible framework for pairwise association testing in related samples with a large number of predictors and response variables. It works either with family data of any size and complexity or, when the genealogical information is unknown, with genetic similarity information between individuals, such as that inferred from genome-wide genetic data. It can therefore be particularly useful in facilitating usage of biobank data collections from population isolates when extensive genealogical information is missing.
Subject(s)
Genomics/methods , Pedigree , Software , Epigenesis, Genetic , Female , Gene Expression Profiling , Genetics, Population , Humans , Male , Twins/genetics
ABSTRACT
Thousands of proteins circulate in the bloodstream; identifying those which associate with weight and intervention-induced weight loss may help explain mechanisms of diseases associated with adiposity. We aimed to identify consistent protein signatures of weight loss across independent studies capturing changes in body mass index (BMI). We analysed proteomic data from studies implementing caloric restriction (Diabetes Remission Clinical trial) and bariatric surgery (By-Band-Sleeve), using SomaLogic and Olink Explore1536 technologies, respectively. Linear mixed models were used to estimate the effect of the interventions on circulating proteins. Twenty-three proteins were altered in a consistent direction after both bariatric surgery and caloric restriction, suggesting that these proteins are modulated by weight change, independent of intervention type. We also integrated Mendelian randomisation (MR) estimates of the effect of BMI on proteins measured by SomaLogic from a UK blood donor cohort as a third line of causal evidence. These MR estimates provided further corroborative evidence for a role of BMI in regulating the levels of six proteins including alcohol dehydrogenase-4, nogo receptor and interleukin-1 receptor antagonist protein. These results indicate the importance of triangulation in interrogating causal relationships; further study into the role of proteins modulated by weight in disease is now warranted.
Subject(s)
Bariatric Surgery , Proteome , Humans , Body Mass Index , Caloric Restriction , Proteomics , Weight Loss/physiology
ABSTRACT
Type 2 diabetes (T2D) has a heterogeneous etiology influencing its progression, treatment, and complications. A data-driven cluster analysis in European individuals with T2D previously identified four subtypes: severe insulin-deficient (SIDD), severe insulin-resistant (SIRD), mild obesity-related (MOD), and mild age-related (MARD) diabetes. Here, the clustering approach was applied to individuals with T2D from the Qatar Biobank and validated in an independent set. Cluster-specific signatures of circulating metabolites and proteins were established, revealing subtype-specific molecular mechanisms, including activation of the complement system with features of autoimmune diabetes and reduced 1,5-anhydroglucitol in SIDD, impaired insulin signaling in SIRD, and elevated leptin and fatty acid binding protein levels in MOD. The MARD cluster was the healthiest, with metabolomic and proteomic profiles most similar to the controls. We have translated the T2D subtypes to an Arab population and identified distinct molecular signatures to further our understanding of the etiology of these subtypes.
Subject(s)
Diabetes Mellitus, Type 1 , Diabetes Mellitus, Type 2 , Humans , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Proteomics , Arabs , Insulin
ABSTRACT
Protein biomarkers have been identified across many age-related morbidities. However, characterising epigenetic influences could further inform disease predictions. Here, we leverage epigenome-wide data to study links between the DNA methylation (DNAm) signatures of the circulating proteome and incident diseases. Using data from four cohorts, we trained and tested epigenetic scores (EpiScores) for 953 plasma proteins, identifying 109 scores that explained between 1% and 58% of the variance in protein levels after adjusting for known protein quantitative trait loci (pQTL) genetic effects. By projecting these EpiScores into an independent sample (Generation Scotland; n = 9537) and relating them to incident morbidities over a follow-up of 14 years, we uncovered 137 EpiScore-disease associations. These associations were largely independent of immune cell proportions, common lifestyle and health factors, and biological aging. Notably, we found that our diabetes-associated EpiScores highlighted previous top biomarker associations from proteome-wide assessments of diabetes. These EpiScores for protein levels can therefore be a valuable resource for disease prediction and risk stratification.
Although our genetic code does not change throughout our lives, our genes can be turned on and off as a result of epigenetics. Epigenetics can track how the environment and even certain behaviors add small chemical markers to, or remove them from, the DNA that makes up the genome. The type and location of these markers may affect whether genes are active or silent, that is, whether the protein coded for by that gene is being produced or not. One common epigenetic marker is known as DNA methylation. DNA methylation has been linked to the levels of a range of proteins in our cells and the risk people have of developing chronic diseases. Blood samples can be used to determine the epigenetic markers a person has on their genome and to study the abundance of many proteins. Gadd, Hillary, McCartney, Zaghlool et al. studied the relationships between DNA methylation and the abundance of 953 different proteins in blood samples from individuals in the German KORA cohort and the Scottish Lothian Birth Cohort 1936. They then used machine learning to analyze the relationship between epigenetic markers found in people's blood and the abundance of proteins, obtaining epigenetic scores or 'EpiScores' for each protein. They found 109 proteins for which DNA methylation patterns explained between 1% and 58% of the variation in protein levels. Integrating the 'EpiScores' with 14 years of medical records for more than 9000 individuals from the Generation Scotland study revealed 130 connections between EpiScores for proteins and a future diagnosis of common adverse health outcomes. These included diabetes, stroke, depression, various cancers, and inflammatory conditions such as rheumatoid arthritis and inflammatory bowel disease. Age-related chronic diseases are a growing issue worldwide and place pressure on healthcare systems. They also severely reduce quality of life for individuals over many years.
This work shows how epigenetic scores based on protein levels in the blood could predict a person's risk of several of these diseases. In the case of type 2 diabetes, the EpiScore results replicated previous research linking protein levels in the blood to future diagnosis of diabetes. Protein EpiScores could therefore allow researchers to identify people with the highest risk of disease, making it possible to intervene early and prevent these people from developing chronic conditions as they age.
Subject(s)
Cardiovascular Diseases/diagnosis , DNA Methylation/genetics , Diabetes Mellitus/diagnosis , Epigenomics/methods , Neoplasms/diagnosis , Proteome/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Aging , Biomarkers , Epigenesis, Genetic , Female , Humans , Life Style , Male , Middle Aged , Risk Factors , Scotland , Young Adult
ABSTRACT
Blood circulating proteins are confounded readouts of the biological processes that occur in different tissues and organs. Many proteins have been linked to complex disorders and are also under substantial genetic control. Here, we investigate the associations between over 1000 blood circulating proteins and body mass index (BMI) in three studies including over 4600 participants. We show that BMI is associated with widespread changes in the plasma proteome. We observe 152 replicated protein associations with BMI. 24 proteins also associate with a genome-wide polygenic score (GPS) for BMI. These proteins are involved in lipid metabolism and inflammatory pathways impacting clinically relevant pathways of adiposity. Mendelian randomization suggests a bi-directional causal relationship of BMI with LEPR/LEP, IGFBP1, and WFIKKN2, a protein-to-BMI relationship for AGER, DPT, and CTSA, and a BMI-to-protein relationship for another 21 proteins. Combined with animal model and tissue-specific gene expression data, our findings suggest potential therapeutic targets further elucidating the role of these proteins in obesity associated pathologies.
Subject(s)
Obesity/metabolism , Proteome/metabolism , Adult , Aged , Body Mass Index , Female , Humans , Lipid Metabolism/genetics , Lipid Metabolism/physiology , Male , Mendelian Randomization Analysis , Middle Aged , Obesity/genetics , Proteomics/methods
ABSTRACT
The increasing prevalence of type 2 diabetes poses a major challenge to societies worldwide. Blood-based factors like serum proteins are in contact with every organ in the body to mediate global homeostasis and may thus directly regulate complex processes such as aging and the development of common chronic diseases. We applied a data-driven proteomics approach, measuring serum levels of 4,137 proteins in 5,438 elderly Icelanders, and identified 536 proteins associated with prevalent and/or incident type 2 diabetes. We validated a subset of the observed associations in an independent case-control study of type 2 diabetes. These protein associations provide novel biological insights into the molecular mechanisms that are dysregulated prior to and following the onset of type 2 diabetes and can be detected in serum. A bidirectional two-sample Mendelian randomization analysis indicated that serum changes of at least 23 proteins are downstream of the disease or its genetic liability, while 15 proteins were supported as having a causal role in type 2 diabetes.
Subject(s)
Diabetes Mellitus, Type 2/genetics , Aged , Aged, 80 and over , Case-Control Studies , Diabetes Mellitus, Type 2/blood , Female , Genetic Predisposition to Disease/genetics , Genotype , Humans , Male , Mendelian Randomization Analysis , Polymorphism, Single Nucleotide/genetics
ABSTRACT
DNA methylation and blood circulating proteins have been associated with many complex disorders, but the underlying disease-causing mechanisms often remain unclear. Here, we report an epigenome-wide association study of 1123 proteins from 944 participants of the KORA population study and replication in a multi-ethnic cohort of 344 individuals. We identify 98 CpG-protein associations (pQTMs) at a stringent Bonferroni level of significance. Overlapping associations with transcriptomics, metabolomics, and clinical endpoints suggest implication of processes related to chronic low-grade inflammation, including a network involving methylation of NLRC5, a regulator of the inflammasome, and associated pQTMs implicating key proteins of the immune system, such as CD48, CD163, CXCL10, CXCL11, LAG3, FCGR3B, and B2M. Our study links DNA methylation to disease endpoints via intermediate proteomics phenotypes and identifies correlative networks that may eventually be targeted in a personalized approach of chronic low-grade inflammation.
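A Bonferroni threshold of the kind mentioned in this abstract divides the family-wise error rate by the number of tests performed. The following is a minimal sketch only: the number of CpG sites and the error rate are not given in the abstract, so `n_cpgs` and `alpha` below are hypothetical placeholders, and the helper names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls the family-wise error rate."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_pairs(p_values, alpha, n_tests):
    """Keep (cpg, protein) pairs whose p-value passes the Bonferroni threshold.

    p_values: dict mapping (cpg_id, protein_id) -> p-value
    """
    threshold = bonferroni_threshold(alpha, n_tests)
    return {pair: p for pair, p in p_values.items() if p < threshold}


n_proteins = 1123   # from the abstract
n_cpgs = 450_000    # hypothetical placeholder, not stated in the abstract
alpha = 0.05        # hypothetical family-wise error rate
print(bonferroni_threshold(alpha, n_proteins * n_cpgs))
```

With every CpG site tested against every protein, the number of tests is the product of the two counts, which is why epigenome-wide thresholds of this kind are so stringent.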
Subject(s)
Blood Proteins/genetics , Inflammation/genetics , Adult , Aged , Aged, 80 and over , Chemokine CXCL10/genetics , Cohort Studies , CpG Islands , DNA Methylation , Epigenome , Epigenomics , Female , GPI-Linked Proteins/genetics , Germany , Humans , Intracellular Signaling Peptides and Proteins/genetics , Male , Middle Aged , Proteomics , Receptors, IgG/genetics
ABSTRACT
Suppressing glutaminolysis does not always induce cancer cell death in glutamine-dependent tumors because cells may switch to alternative energy sources. To reveal compensatory metabolic pathways, we investigated the metabolome-wide cellular response to inhibited glutaminolysis in cancer cells. Glutaminolysis inhibition with C.968 suppressed cell proliferation but was insufficient to induce cancer cell death. We found that lipid catabolism was activated as a compensation for glutaminolysis inhibition. Accelerated lipid catabolism, together with oxidative stress induced by glutaminolysis inhibition, triggered autophagy. Simultaneously inhibiting glutaminolysis and either beta oxidation with trimetazidine or autophagy with chloroquine induced cancer cell death. Here we identified metabolic escape mechanisms contributing to cancer cell survival under treatment, and we suggest a potentially translational strategy for combined cancer therapy, given that chloroquine is an FDA-approved drug. Our findings are the first to show the efficiency of combined inhibition of glutaminolysis and beta oxidation as a potential anti-cancer strategy, and they add to the evidence that combined inhibition of glutaminolysis and autophagy may be effective in glutamine-addicted cancers.
Subject(s)
Antineoplastic Combined Chemotherapy Protocols/pharmacology , Autophagy/drug effects , Glutamine/metabolism , Lipolysis/drug effects , Neoplasms/pathology , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Apoptosis/drug effects , Benzophenanthridines/pharmacology , Benzophenanthridines/therapeutic use , Cell Line, Tumor , Cell Proliferation/drug effects , Chloroquine/pharmacology , Chloroquine/therapeutic use , Glutaminase/antagonists & inhibitors , Glutaminase/metabolism , Humans , Metabolomics , Neoplasms/drug therapy , Neoplasms/metabolism , Oxidative Stress/drug effects
ABSTRACT
BACKGROUND: The prevalence of type 2 diabetes (T2D) and obesity has dramatically increased within a few generations, reaching epidemic levels. In addition to genetic risk factors, epigenetic mechanisms triggered by a changing environment are being investigated for their role in the pathogenesis of these complex diseases. Epigenome-wide association studies (EWASs) have revealed significant associations of T2D, obesity, and BMI with DNA methylation. However, populations from the Middle East, where T2D and obesity rates are highest worldwide, have not been investigated so far. METHODS: We performed the first EWAS in an Arab population with T2D and BMI and attempted to replicate 47 EWAS associations previously reported in Caucasians. We used the Illumina Infinium HumanMethylation450 BeadChip to quantify DNA methylation in whole blood DNA from 123 subjects of 15 multigenerational families from Qatar. To investigate the effect of differing genetic background and environment on the epigenetic associations, we further assessed the effect of replicated loci in 810 twins from the UK. RESULTS: Our EWAS suggested a novel association between T2D and cg06721411 (DQX1; p value = 1.18 × 10^-9). We replicated in the Qatari population seven CpG associations with BMI (SOCS3, p value = 3.99 × 10^-6; SREBF1, p value = 4.33 × 10^-5; SBNO2, p value = 5.87 × 10^-5; CPT1A, p value = 7.99 × 10^-5; PRR5L, p value = 1.85 × 10^-4; cg03078551, intergenic region on chromosome 17, p value = 1.00 × 10^-3; LY6G6E, p value = 1.10 × 10^-3) and one with T2D (TXNIP, p value = 2.46 × 10^-5). All the associations were further confirmed in the UK cohort for both BMI and T2D. Meta-analysis increased the significance of the observed associations and revealed strong heterogeneity of the effect sizes (apart from CPT1A), although associations at these loci showed concordant direction in the two populations. CONCLUSIONS: Our study replicated eight known CpG associations with T2D or BMI in an Arab population.
Heterogeneity of the effects at all loci except CPT1A between the Qatari and UK studies suggests that the underlying mechanisms might depend on genetic background and environmental pressure. Our EWAS results provide a basis for comparison with other ethnicities.
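The abstract does not state which meta-analysis model was used. As an illustration of how combining two cohorts can increase significance while still revealing heterogeneity, here is a minimal sketch of an inverse-variance fixed-effect meta-analysis with Cochran's Q and the I² statistic. The function name and the example effect sizes are hypothetical, not values from the study.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis with heterogeneity statistics.

    betas: per-study effect sizes
    ses:   per-study standard errors
    Returns (pooled_beta, pooled_se, cochran_q, i_squared).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations of study effects from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: share of total variation attributed to between-study heterogeneity.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, pooled_se, q, i_squared


# Made-up effect sizes for one CpG in two cohorts with the same direction of effect.
pooled, pooled_se, q, i2 = fixed_effect_meta(betas=[0.1, 0.3], ses=[0.1, 0.1])
print(pooled, pooled_se, q, i2)
```

The pooled standard error is smaller than either study's, which is how the combined association can become more significant even when the two effect sizes differ in magnitude.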
Subject(s)
Arabs/genetics , Body Mass Index , Diabetes Mellitus, Type 2/genetics , Adolescent , Adult , Aged , Aged, 80 and over , CpG Islands/genetics , DNA Methylation , Diabetes Mellitus, Type 2/ethnology , Diseases in Twins/genetics , Epigenesis, Genetic , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , Obesity/ethnology , Obesity/genetics , Oligonucleotide Array Sequence Analysis , Qatar/epidemiology , United Kingdom/epidemiology , Young Adult
ABSTRACT
BACKGROUND: Environmentally influenced phenotypes, such as obesity and insulin resistance, can be transmitted over multiple generations. Epigenetic modifications, such as methylation of DNA cytosine-guanine (CpG) pairs, may be carriers of inherited information. At the population level, the methylation state of such "heritable" CpG sites is expected to follow a trimodal distribution, and their mode of inheritance should be Mendelian. METHODS: Using the Illumina Infinium 450 K DNA methylation array, we determined DNA CpG methylation in blood cells from a family cohort of 123 individuals of Arab ethnicity, including 18 elementary father-mother-child trios. We asked whether Mendelian inheritance of CpG methylation is observed and, most importantly, whether it is independent of any genetic signals. Using 40× whole genome sequencing, we therefore excluded all CpG sites with possibly confounding genetic variants (SNPs) within the binding regions of the Illumina probes. RESULTS: We identified a total of 955 CpG sites that displayed a trimodal distribution and confirmed trimodality in a study of 1805 unrelated Caucasians. Of the 955 CpG sites, 99.9% followed a strict Mendelian pattern of inheritance and had no SNP within +/-110 nucleotides of the CpG site by design. However, in 97% of these cases a distal cis-acting SNP within a +/-1 Mbp window was found that explained the observed CpG distribution, excluding the hypothesis of epigenetic inheritance for these clear-cut trimodal sites. Using power analysis, we showed that in 46% of all cases, the closest CpG-associated SNP was located more than 1000 bp from the CpG site. CONCLUSIONS: Our findings suggest that CpG methylation is maintained over larger genomic distances. Furthermore, nearly half of the SNPs associated with these trimodal sites were also associated with the expression of nearby genes (P = 4.08 × 10^-6), implying a regulatory effect of these trimodal CpG sites.
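To make the idea of Mendelian inheritance at a trimodal CpG site concrete, the sketch below discretizes methylation beta values into zero, one or two methylated alleles and checks whether a child's state is compatible with one allele from each parent. The cut-offs (0.25 and 0.75), the function names and the example values are hypothetical; the abstract does not describe how the authors called the three modes.

```python
def methylation_state(beta, low=0.25, high=0.75):
    """Map a methylation beta value (0..1) to 0, 1 or 2 methylated alleles.

    The cut-offs are illustrative assumptions, not taken from the study.
    """
    if beta < low:
        return 0
    if beta > high:
        return 2
    return 1


# Alleles (0 = unmethylated, 1 = methylated) a parent in each state can transmit.
TRANSMITTABLE = {0: {0}, 1: {0, 1}, 2: {1}}


def is_mendelian(father, mother, child):
    """True if the child's state can arise from one allele of each parent."""
    return any(
        a + b == child
        for a in TRANSMITTABLE[father]
        for b in TRANSMITTABLE[mother]
    )


def mendelian_fraction(trios):
    """Fraction of (father_beta, mother_beta, child_beta) trios that are consistent."""
    trios = list(trios)
    consistent = sum(
        is_mendelian(*(methylation_state(beta) for beta in trio)) for trio in trios
    )
    return consistent / len(trios)


# Made-up beta values for three father-mother-child trios at one CpG site.
example_trios = [(0.05, 0.05, 0.10), (0.90, 0.10, 0.50), (0.90, 0.10, 0.05)]
print(mendelian_fraction(example_trios))
```

In the third example trio a fully methylated father and an unmethylated mother cannot have an unmethylated child, so that trio counts as a violation.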
Subject(s)
CpG Islands , DNA Methylation , Heredity , Mendelian Randomization Analysis/methods , Sequence Analysis, DNA/methods , Adult , Epigenesis, Genetic , Female , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide , Qatar/ethnology , Young Adult
ABSTRACT
BACKGROUND: Modification of DNA by methylation of cytosines at CpG dinucleotides is a widespread phenomenon that leads to changes in gene expression, thereby influencing and regulating many biological processes. Recent technical advances in the genome-wide determination of single-base DNA methylation enabled epigenome-wide association studies (EWASs). Early EWASs established robust associations of age and gender with the degree of CpG methylation at specific sites. Other studies uncovered associations with cigarette smoking. However, so far these studies were mainly conducted in Caucasians, raising the question of whether these findings can also be extrapolated to other populations. RESULTS: Here, we present an EWAS with age, gender, and smoking status in a family study of 123 individuals of Arab descent. We determined DNA methylation at over 450,000 CpG sites using the Illumina Infinium HumanMethylation450 BeadChip, applied state-of-the-art data processing protocols, including correction for blood cell type heterogeneity and hidden confounders, and eliminated probes containing SNPs at the targeted CpG site using 40× whole-genome sequencing data. Using this approach, we could replicate the leading published EWAS associations with age, gender and smoking, and recovered hallmarks of gender-specific epigenetic changes. Interestingly, we could even replicate the recently reported precise prediction of chronological age based on the methylation of only a few selected CpG sites. CONCLUSION: Our study supports the view that, when applied with state-of-the-art protocols to account for all potential confounders, DNA methylation arrays represent powerful tools for EWAS with more complex phenotypes that can also be successfully applied to non-Caucasian populations.
ABSTRACT
Dynamic Causal Modeling (DCM) can be used to quantify cognitive function in individuals as effective connectivity. However, ambiguity among subjects in the number and location of discernible active regions prevents all candidate models from being compared in all subjects, precluding the use of DCM as an individual cognitive phenotyping tool. This paper proposes a solution to this problem by treating missing regions in the first-level analysis as missing data, and performing estimation of the time course associated with any missing region using one of four candidate methods: zero-filling, average-filling, noise-filling using a fixed stochastic process, or noise-filling using a process estimated by expectation-maximization. The effect of this estimation scheme was analyzed by treating it as a preprocessing step to DCM and observing the resulting effects on model evidence. Simulation studies show that estimation using expectation-maximization yields the highest classification accuracy using a simple loss function and the highest model evidence, relative to the other methods. This result held for various dataset sizes and varying numbers of model choices. In real data, application to Go/No-Go and Simon tasks allowed computation of signals from the missing nodes and the consequent computation of model evidence in all subjects, compared to 62 and 48 percent, respectively, if no preprocessing was performed. These results demonstrate the face validity of the preprocessing scheme and open the possibility of using single-subject DCM as an individual cognitive phenotyping tool.
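The three simpler filling strategies named in this abstract can be sketched as follows. This is an illustration under stated assumptions only: the abstract does not define them precisely, so "average-filling" is assumed here to mean the point-wise mean of the subject's observed regions, and the "fixed stochastic process" is assumed to be white Gaussian noise with a chosen standard deviation. All function names are hypothetical, and the expectation-maximization variant is not shown.

```python
import random


def zero_fill(n_timepoints):
    """Replace the missing region's time course with zeros."""
    return [0.0] * n_timepoints


def average_fill(observed_regions):
    """Point-wise mean across the observed regions' time courses.

    Assumption: averaging is done across regions within one subject.
    observed_regions: list of equal-length lists of floats.
    """
    n_regions = len(observed_regions)
    return [sum(values) / n_regions for values in zip(*observed_regions)]


def noise_fill(n_timepoints, sigma=1.0, seed=0):
    """White Gaussian noise with fixed standard deviation (assumed process)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n_timepoints)]


# Made-up time courses for two regions that were found in a subject.
observed = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
print(zero_fill(3))
print(average_fill(observed))
print(noise_fill(3, sigma=0.5))
```

Any of these surrogate time courses would then stand in for the missing region so that the same set of candidate models can be fitted and compared in every subject.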