ABSTRACT
We deployed the Blended Genome Exome (BGE), a DNA library blending approach that generates low-pass whole genome (1-4× mean depth) and deep whole exome (30-40× mean depth) data in a single sequencing run. This technology is cost-effective, empowers most genomic discoveries possible with deep whole genome sequencing, and provides an unbiased method to capture the diversity of common SNP variation across the globe. To evaluate this new technology at scale, we applied BGE to sequence >53,000 samples from the Populations Underrepresented in Mental Illness Association Studies (PUMAS) Project, which included participants across African, African American, and Latin American populations. We evaluated the accuracy of BGE imputed genotypes against raw genotype calls from the Illumina Global Screening Array. All PUMAS cohorts had R² concordance ≥95% among SNPs with MAF≥1%, and concordance never fell below 90% R² for SNPs with MAF<1%. Furthermore, concordance rates among local ancestries within two recently admixed cohorts were consistent among SNPs with MAF≥1%, with only minor deviations in SNPs with MAF<1%. We also benchmarked the discovery capacity of BGE to access protein-coding copy number variants (CNVs) against deep whole genome data, finding that deletions and duplications spanning at least 3 exons had a positive predictive value of ~90%. Our results demonstrate BGE scalability and efficacy in capturing SNPs, indels, and CNVs in the human genome at 28% of the cost of deep whole-genome sequencing. BGE is poised to enhance access to genomic testing and empower genomic discoveries, particularly in underrepresented populations.
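The two accuracy metrics quoted in this abstract can be illustrated with a minimal Python sketch. It assumes that concordance is summarized as the squared Pearson correlation between imputed dosages and array genotype calls, and that positive predictive value is TP / (TP + FP); the abstract does not spell out the exact computation or the binning by allele frequency. The function names and the toy numbers are hypothetical and are not data from the study.

```python
def squared_correlation(x, y):
    """Squared Pearson correlation (R^2) between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when one sequence has no variance")
    return (sxy * sxy) / (sxx * syy)


def positive_predictive_value(true_positives, false_positives):
    """Fraction of called variants that are confirmed by the truth set."""
    called = true_positives + false_positives
    if called == 0:
        raise ValueError("no calls were made")
    return true_positives / called


# Toy data (hypothetical): array genotypes coded 0/1/2 and imputed dosages
# for the same SNP in eight samples.
array_genotypes = [0, 1, 2, 0, 1, 2, 0, 1]
imputed_dosages = [0.1, 0.9, 1.8, 0.0, 1.2, 2.0, 0.2, 1.0]

r2 = squared_correlation(array_genotypes, imputed_dosages)
ppv = positive_predictive_value(true_positives=90, false_positives=10)
print(f"R^2 = {r2:.3f}, PPV = {ppv:.2f}")
```

In practice R² would be computed per SNP and then averaged within allele-frequency bins, which is one plausible reading of the MAF≥1% and MAF<1% figures above.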
ABSTRACT
Genetic susceptibility to metabolic-associated fatty liver disease (MAFLD) is complex and poorly characterized. Accurate characterization of the genetic background of hepatic fat content would provide insights into disease etiology and causality of risk factors. We performed genome-wide association studies (GWASs) on two noninvasive definitions of hepatic fat content: magnetic resonance imaging proton density fat fraction (MRI-PDFF) in 16,050 participants and fatty liver index (FLI) in 388,701 participants from the United Kingdom (UK) Biobank (UKBB). Heritability, genetic overlap, and similarity between hepatic fat content phenotypes were analyzed, and replicated in 10,398 participants from the University Medical Center Groningen (UMCG) Genetics Lifelines Initiative (UGLI). Meta-analysis of GWASs of MRI-PDFF in UKBB revealed five statistically significant loci, including two novel genomic loci harboring CREB3L1 (rs72910057-T, P = 5.40E-09) and GCM1 (rs1491489378-T, P = 3.16E-09), respectively, as well as three previously reported loci: PNPLA3, TM6SF2, and APOE. GWAS of FLI in UKBB identified 196 genome-wide significant loci, of which 49 were replicated in UGLI, with top signals in ZPR1 (P = 3.35E-13) and FTO (P = 2.11E-09). Statistically significant genetic correlation (rg) between MRI-PDFF (UKBB) and FLI (UGLI) GWAS results was found (rg = 0.5276, P = 1.45E-03). Novel MRI-PDFF genetic signals (CREB3L1 and GCM1) were replicated in the FLI GWAS. We identified two novel genes for MRI-PDFF and 49 replicable loci for FLI. Despite a difference in hepatic fat content assessment between MRI-PDFF and FLI, a substantially similar genetic architecture was found. FLI is identified as an easy and reliable approach to study hepatic fat content at the population level.
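The abstract does not state how the fatty liver index is calculated. As a minimal sketch, the Python example below uses the commonly cited formulation of Bedogni et al. (2006), which combines triglycerides (mg/dL), body mass index (kg/m²), gamma-glutamyl transferase (U/L), and waist circumference (cm) in a logistic score scaled to 0-100. That this exact formulation and these units were used in the study is an assumption, and the function name and input values are hypothetical.

```python
from math import exp, log


def fatty_liver_index(triglycerides_mg_dl, bmi, ggt_u_l, waist_cm):
    """Fatty liver index (0-100), Bedogni et al. 2006 formulation (assumed)."""
    if min(triglycerides_mg_dl, bmi, ggt_u_l, waist_cm) <= 0:
        raise ValueError("all inputs must be positive")
    linear_score = (
        0.953 * log(triglycerides_mg_dl)
        + 0.139 * bmi
        + 0.718 * log(ggt_u_l)
        + 0.053 * waist_cm
        - 15.745
    )
    # Logistic transform of the linear score, scaled to a 0-100 range.
    return 100.0 / (1.0 + exp(-linear_score))


# Hypothetical participant, not taken from the study.
example_fli = fatty_liver_index(
    triglycerides_mg_dl=150, bmi=30, ggt_u_l=40, waist_cm=100
)
print(f"FLI = {example_fli:.1f}")
```

Because every input is a routine clinical measurement, a score of this kind can be computed for far more participants than MRI-PDFF, which is consistent with the sample sizes reported above (388,701 versus 16,050).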
Subject(s)
Genetic Predisposition to Disease , Genome-Wide Association Study , Liver , Humans , Female , Male , Risk Factors , Genetic Predisposition to Disease/genetics , Liver/diagnostic imaging , Liver/metabolism , Liver/pathology , Middle Aged , Polymorphism, Single Nucleotide/genetics , Magnetic Resonance Imaging , Non-alcoholic Fatty Liver Disease/genetics , Non-alcoholic Fatty Liver Disease/diagnostic imaging , Adult , Aged , Fatty Liver/genetics , Fatty Liver/diagnostic imaging
ABSTRACT
Although the impact of host genetics on gut microbial diversity and the abundance of specific taxa is well established (refs. 1-6), little is known about how host genetics regulates the genetic diversity of gut microorganisms. Here we conducted a meta-analysis of associations between human genetic variation and gut microbial structural variation in 9,015 individuals from four Dutch cohorts. Strikingly, the presence rate of a structural variation segment in Faecalibacterium prausnitzii that harbours an N-acetylgalactosamine (GalNAc) utilization gene cluster is higher in individuals who secrete the type A oligosaccharide antigen terminating in GalNAc, a feature that is jointly determined by human ABO and FUT2 genotypes, and we could replicate this association in a Tanzanian cohort. In vitro experiments demonstrated that GalNAc can be used as the sole carbohydrate source for F. prausnitzii strains that carry the GalNAc-metabolizing pathway. Further in silico and in vitro studies demonstrated that other ABO-associated species can also utilize GalNAc, particularly Collinsella aerofaciens. The GalNAc utilization genes are also associated with the host's cardiometabolic health, particularly in individuals with mucosal A-antigen. Together, the findings of our study demonstrate that genetic associations across the human genome and bacterial metagenome can provide functional insights into the reciprocal host-microbiome relationship.
Subject(s)
Bacteria , Gastrointestinal Microbiome , Host Microbial Interactions , Metagenome , Humans , Acetylgalactosamine/metabolism , Bacteria/classification , Bacteria/genetics , Bacteria/isolation & purification , Cohort Studies , Computer Simulation , Faecalibacterium prausnitzii/genetics , Gastrointestinal Microbiome/genetics , Genome, Human/genetics , Genotype , Host Microbial Interactions/genetics , In Vitro Techniques , Metagenome/genetics , Multigene Family , Netherlands , Tanzania
ABSTRACT
The c.40_42delAGA variant in the phospholamban gene (PLN) has been associated with dilated and arrhythmogenic cardiomyopathy, with up to 70% of carriers experiencing a major cardiac event by age 70. However, there are carriers who remain asymptomatic at older ages. To understand the mechanisms behind this incomplete penetrance, we evaluated potential phenotypic and genetic modifiers in 74 PLN:c.40_42delAGA carriers identified in 36,339 participants of the Lifelines population cohort. Asymptomatic carriers (N = 48) showed shorter QRS duration (−5.73 ms, q value = 0.001) compared to asymptomatic non-carriers, an effect we could replicate in two independent cohorts. Furthermore, symptomatic carriers showed a higher correlation (Pearson r = 0.17) between polygenic predisposition to higher QRS (PGS_QRS) and QRS (p value = 1.98 × 10⁻⁸), suggesting that the effect of the genetic variation on cardiac rhythm might be increased in symptomatic carriers. Our results allow for improved clinical interpretation for asymptomatic carriers, while our approach could guide future studies on genetic diseases with incomplete penetrance.
Subject(s)
Cardiomyopathies , Humans , Aged , Mutation , Cardiomyopathies/diagnosis , Cardiomyopathies/genetics , Calcium-Binding Proteins/genetics , Genotype
ABSTRACT
Host genetics are known to influence the gut microbiome, yet their role remains poorly understood. To robustly characterize these effects, we performed a genome-wide association study of 207 taxa and 205 pathways representing microbial composition and function in 7,738 participants of the Dutch Microbiome Project. Two robust, study-wide significant (P < 1.89 × 10⁻¹⁰) signals near the LCT and ABO genes were found to be associated with multiple microbial taxa and pathways and were replicated in two independent cohorts. The LCT locus associations seemed modulated by lactose intake, whereas those at ABO could be explained by participant secretor status determined by their FUT2 genotype. Twenty-two other loci showed suggestive evidence (P < 5 × 10⁻⁸) of association with microbial taxa and pathways. At a more lenient threshold, the number of loci we identified strongly correlated with trait heritability, suggesting that much larger sample sizes are needed to elucidate the remaining effects of host genetics on the gut microbiome.
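The two-tier significance scheme described in this abstract can be sketched in a few lines of Python. The thresholds (1.89 × 10⁻¹⁰ for study-wide significance, 5 × 10⁻⁸ for suggestive evidence) are taken from the text; the function name, labels, and example P values are hypothetical. The abstract does not say how the study-wide threshold was derived. The ratio of the two thresholds is roughly 264.6, which would be consistent with a Bonferroni-style correction over about that many effective tests, but that reading is an assumption.

```python
STUDY_WIDE_THRESHOLD = 1.89e-10  # reported in the abstract
SUGGESTIVE_THRESHOLD = 5e-8      # reported in the abstract


def classify_association(p_value):
    """Label a GWAS P value using the two thresholds quoted in the abstract."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("P value must lie in [0, 1]")
    if p_value < STUDY_WIDE_THRESHOLD:
        return "study-wide significant"
    if p_value < SUGGESTIVE_THRESHOLD:
        return "suggestive"
    return "not significant"


# Ratio of the thresholds; interpreting it as a number of effective tests
# is an assumption, not something the abstract states.
implied_tests = SUGGESTIVE_THRESHOLD / STUDY_WIDE_THRESHOLD

# Hypothetical P values, not results from the study.
for p in (1e-12, 1e-9, 1e-3):
    print(p, classify_association(p))
print(f"threshold ratio = {implied_tests:.1f}")
```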
Subject(s)
ABO Blood-Group System/genetics , Bacterial Physiological Phenomena , Gastrointestinal Microbiome , Gastrointestinal Tract/microbiology , Genetic Variation , Host Microbial Interactions , Lactase/genetics , Bifidobacterium/physiology , Diet , Fucosyltransferases/genetics , Genome, Human , Genome-Wide Association Study , Humans , Metabolic Networks and Pathways , Metagenome , Multifactorial Inheritance , Netherlands , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Sodium Chloride, Dietary , Triglycerides/blood , Galactoside 2-alpha-L-fucosyltransferase
ABSTRACT
Biobanks facilitate genome-wide association studies (GWASs), which have mapped genomic loci across a range of human diseases and traits. However, most biobanks are primarily composed of individuals of European ancestry. We introduce the Global Biobank Meta-analysis Initiative (GBMI), a collaborative network of 23 biobanks from 4 continents representing more than 2.2 million consented individuals with genetic data linked to electronic health records. GBMI meta-analyzes summary statistics from GWASs generated using harmonized genotypes and phenotypes from member biobanks for 14 exemplar diseases and endpoints. This strategy validates that GWASs conducted in diverse biobanks can be integrated despite heterogeneity in case definitions, recruitment strategies, and baseline characteristics. This collaborative effort improves GWAS power for diseases, benefits understudied diseases, and improves risk prediction while also enabling the nomination of disease genes and drug candidates by incorporating gene and protein expression data and providing insight into the underlying biology of human diseases and traits.
ABSTRACT
Epidemiological and genetic studies on COVID-19 are currently hindered by inconsistent and limited testing policies to confirm SARS-CoV-2 infection. Recently, it was shown that it is possible to predict COVID-19 cases using cross-sectional self-reported disease-related symptoms. Here, we demonstrate that this COVID-19 prediction model has reasonable and consistent performance across multiple independent cohorts and that our attempt to improve upon this model did not result in improved predictions. Using the existing COVID-19 prediction model, we then conducted a GWAS on the predicted phenotype using a total of 1,865 predicted cases and 29,174 controls. While we did not find any common, large-effect variants that reached genome-wide significance, we did observe suggestive genetic associations at two SNPs (rs11844522, p = 1.9 × 10⁻⁷; rs5798227, p = 2.2 × 10⁻⁷). Explorative analyses furthermore suggest that genetic variants associated with other viral infectious diseases do not overlap with COVID-19 susceptibility and that severity of COVID-19 may have a different genetic architecture compared to COVID-19 susceptibility. This study represents a first effort that uses a symptom-based predicted phenotype as a proxy for COVID-19 in our pursuit of understanding the genetic susceptibility of the disease. We conclude that the inclusion of symptom-based predicted cases could be a useful strategy in a scenario of limited testing, either during the current COVID-19 pandemic or any future viral outbreak.
Subject(s)
COVID-19/pathology , Genetic Predisposition to Disease , Area Under Curve , COVID-19/genetics , COVID-19/virology , Cross-Sectional Studies , Genome-Wide Association Study , Humans , Phenotype , Polymorphism, Single Nucleotide , ROC Curve , SARS-CoV-2/isolation & purification
ABSTRACT
PURPOSE: The Lifelines COVID-19 cohort was set up to assess the psychological and societal impacts of the COVID-19 pandemic and investigate potential risk factors for COVID-19 within the Lifelines prospective population cohort. PARTICIPANTS: Participants were recruited from the 140 000 eligible participants of Lifelines and the Lifelines NEXT birth cohort, who are all residents of the three northern provinces of the Netherlands. Participants filled out detailed questionnaires about their physical and mental health and experiences on a weekly basis starting in late March 2020, and the cohort consists of everyone who filled in at least one questionnaire in the first 8 weeks of the project. FINDINGS TO DATE: >71 000 unique participants responded to the questionnaires at least once during the first 8 weeks, with >22 000 participants responding to seven questionnaires. Compiled questionnaire results are continuously updated and shared with the public through the Corona Barometer website. Early results included a clear signal that younger people living alone were experiencing greater levels of loneliness due to lockdown, and subsequent results showed the easing of anxiety as lockdown was eased in June 2020. FUTURE PLANS: Questionnaires were sent on a (bi)weekly basis starting in March 2020 and on a monthly basis starting July 2020, with plans for new questionnaire rounds to continue through 2020 and early 2021. Questionnaire frequency can be increased again for subsequent waves of infections. Cohort data will be used to address how the COVID-19 pandemic developed in the northern provinces of the Netherlands, which environmental and genetic risk factors predict disease susceptibility and severity and the psychological and societal impacts of the crisis. Cohort data are linked to the extensive health, lifestyle and sociodemographic data held for these participants by Lifelines, a 30-year project that started in 2006, and to data about participants held in national databases.
Subject(s)
COVID-19/psychology , Pandemics , Adult , Anxiety , Communicable Disease Control , Female , Humans , Loneliness , Male , Middle Aged , Netherlands/epidemiology , Prospective Studies , Quality of Life , Surveys and Questionnaires
ABSTRACT
Coronavirus disease 2019 (COVID-19) shows a wide variation in expression and severity of symptoms, from very mild or no symptoms, to flu-like symptoms, and in more severe cases, to pneumonia, acute respiratory distress syndrome, and even death. Large differences in outcome have also been observed between males and females. The causes for this variability are likely to be multifactorial, and to include genetics. The SARS-CoV-2 virus responsible for the infection depends on two human genes: the human receptor angiotensin converting enzyme 2 (ACE2) for cell invasion, and the serine protease TMPRSS2 for S protein priming. Genetic variation in these two genes may thus modulate an individual's genetic predisposition to infection and virus clearance. While genetic data on COVID-19 patients is being gathered, we carried out a phenome-wide association scan (PheWAS) to investigate the role of these genes in other human phenotypes in the general population. We examined 178 quantitative phenotypes including cytokines and cardio-metabolic biomarkers, as well as usage of 58 medications in 36,339 volunteers from the Lifelines population cohort, in relation to 1,273 genetic variants located in or near ACE2 and TMPRSS2. While none reached our threshold for significance, we observed several interesting suggestive associations. For example, single nucleotide polymorphisms (SNPs) near the TMPRSS2 gene were associated with thrombocyte count (p = 1.8 × 10⁻⁵). SNPs within the ACE2 gene were associated with (1) the use of angiotensin II receptor blocker (ARB) combination therapies (p = 5.7 × 10⁻⁴), an association that is significantly stronger in females (p_diff = 0.01), and (2) the use of non-steroidal anti-inflammatory and antirheumatic products (p = 5.5 × 10⁻⁴). While these associations need to be confirmed in larger sample sizes, they suggest that these variants could play a role in diseases such as thrombocytopenia, hypertension, and chronic inflammation that are often observed in the more severe COVID-19 cases. Further investigation of these genetic variants in the context of COVID-19 is thus promising for better understanding of disease variability. Full results are available at https://covid19research.nl.