ABSTRACT
Type 2 diabetes (T2D) is a heterogeneous disease that develops through diverse pathophysiological processes [1,2] and molecular mechanisms that are often specific to cell type [3,4]. Here, to characterize the genetic contribution to these processes across ancestry groups, we aggregate genome-wide association study data from 2,535,601 individuals (39.7% not of European ancestry), including 428,452 cases of T2D. We identify 1,289 independent association signals at genome-wide significance (P < 5 × 10⁻⁸) that map to 611 loci, of which 145 loci are, to our knowledge, previously unreported. We define eight non-overlapping clusters of T2D signals that are characterized by distinct profiles of cardiometabolic trait associations. These clusters are differentially enriched for cell-type-specific regions of open chromatin, including pancreatic islets, adipocytes, endothelial cells and enteroendocrine cells. We build cluster-specific partitioned polygenic scores [5] in a further 279,552 individuals of diverse ancestry, including 30,288 cases of T2D, and test their association with T2D-related vascular outcomes. Cluster-specific partitioned polygenic scores are associated with coronary artery disease, peripheral artery disease and end-stage diabetic nephropathy across ancestry groups, highlighting the importance of obesity-related processes in the development of vascular outcomes. Our findings show the value of integrating multi-ancestry genome-wide association study data with single-cell epigenomics to disentangle the aetiological heterogeneity that drives the development and progression of T2D. This might offer a route to optimize global access to genetically informed diabetes care.
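As a rough illustration of what a cluster-specific partitioned polygenic score involves, the sketch below sums weight × risk-allele dosage separately within each cluster of variants. The variant IDs, weights, dosages and cluster names are hypothetical toy values, not data from the study, and the study's actual scoring pipeline may differ.

```python
from typing import Dict


def partitioned_scores(
    dosages: Dict[str, float],
    weights: Dict[str, float],
    cluster_of: Dict[str, str],
) -> Dict[str, float]:
    """Return one score per cluster: the sum of weight * dosage over that cluster's variants."""
    scores: Dict[str, float] = {}
    for variant, dosage in dosages.items():
        # Skip variants that have no weight or no cluster assignment.
        if variant not in weights or variant not in cluster_of:
            continue
        cluster = cluster_of[variant]
        scores[cluster] = scores.get(cluster, 0.0) + weights[variant] * dosage
    return scores


# Hypothetical toy inputs: risk-allele dosages (0, 1 or 2) for one individual,
# per-variant weights, and a non-overlapping assignment of variants to clusters.
dosages = {"v1": 2, "v2": 1, "v3": 0, "v4": 1}
weights = {"v1": 0.10, "v2": 0.05, "v3": 0.20, "v4": 0.08}
cluster_of = {"v1": "beta_cell", "v2": "obesity", "v3": "beta_cell", "v4": "obesity"}

scores = partitioned_scores(dosages, weights, cluster_of)
print(scores)
```

Because each variant belongs to exactly one cluster, the per-cluster scores add up to the overall (unpartitioned) score for the same variants and weights.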
Subject(s)
Diabetes Mellitus, Type 2 , Disease Progression , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Adipocytes/metabolism , Chromatin/genetics , Chromatin/metabolism , Coronary Artery Disease/complications , Coronary Artery Disease/genetics , Diabetes Mellitus, Type 2/classification , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/pathology , Diabetes Mellitus, Type 2/physiopathology , Diabetic Nephropathies/complications , Diabetic Nephropathies/genetics , Endothelial Cells/metabolism , Enteroendocrine Cells , Epigenomics , Genetic Predisposition to Disease/genetics , Islets of Langerhans/metabolism , Multifactorial Inheritance/genetics , Peripheral Arterial Disease/complications , Peripheral Arterial Disease/genetics , Single-Cell Analysis
ABSTRACT
Genome-wide association studies (GWAS) have laid the foundation for investigations into the biology of complex traits, drug development and clinical guidelines. However, the majority of discovery efforts are based on data from populations of European ancestry [1-3]. In light of the differential genetic architecture that is known to exist between populations, bias in representation can exacerbate existing disease and healthcare disparities. Critical variants may be missed if they have a low frequency or are completely absent in European populations, especially as the field shifts its attention towards rare variants, which are more likely to be population-specific [4-10]. Additionally, effect sizes and the risk prediction scores derived from them in one population may not accurately extrapolate to other populations [11,12]. Here we demonstrate the value of diverse, multi-ethnic participants in large-scale genomic studies. The Population Architecture using Genomics and Epidemiology (PAGE) study conducted a GWAS of 26 clinical and behavioural phenotypes in 49,839 non-European individuals. Using strategies tailored for analysis of multi-ethnic and admixed populations, we describe a framework for analysing diverse populations, identify 27 novel loci and 38 secondary signals at known loci, as well as replicate 1,444 GWAS catalogue associations across these traits. Our data show evidence of effect-size heterogeneity across ancestries for published GWAS associations, substantial benefits for fine-mapping using diverse cohorts and insights into clinical implications. In the United States, where minority populations have a disproportionately higher burden of chronic conditions [13], the lack of representation of diverse populations in genetic research will result in inequitable access to precision medicine for those with the highest burden of disease. We strongly advocate for continued, large genome-wide efforts in diverse populations to maximize genetic discovery and reduce health disparities.
Subject(s)
Asian People/genetics , Black People/genetics , Genome-Wide Association Study/methods , Hispanic or Latino/genetics , Minority Groups , Multifactorial Inheritance/genetics , Women's Health , Body Height/genetics , Cohort Studies , Female , Genetics, Medical/methods , Health Equity/trends , Health Status Disparities , Humans , Male , United States
ABSTRACT
Platelets play a key role in thrombosis and hemostasis. Platelet count (PLT) and mean platelet volume (MPV) are highly heritable quantitative traits, with hundreds of genetic signals previously identified, mostly in European ancestry populations. We here utilize whole genome sequencing (WGS) from NHLBI's Trans-Omics for Precision Medicine initiative (TOPMed) in a large multi-ethnic sample to further explore common and rare variation contributing to PLT (n = 61 200) and MPV (n = 23 485). We identified and replicated secondary signals at MPL (rs532784633) and PECAM1 (rs73345162), both more common in African ancestry populations. We also observed rare variation in Mendelian platelet-related disorder genes influencing variation in platelet traits in TOPMed cohorts (not enriched for blood disorders). For example, association of GP9 with lower PLT and higher MPV was partly driven by a pathogenic Bernard-Soulier syndrome variant (rs5030764, p.Asn61Ser), and the signals at TUBB1 and CD36 were partly driven by loss of function variants not annotated as pathogenic in ClinVar (rs199948010 and rs571975065). However, residual signal remained for these gene-based signals after adjusting for lead variants, suggesting that additional variants in Mendelian genes with impacts in general population cohorts remain to be identified. Gene-based signals were also identified at several genome-wide association study identified loci for genes not annotated for Mendelian platelet disorders (PTPRH, TET2, CHEK2), with somatic variation driving the result at TET2. These results highlight the value of WGS in populations of diverse genetic ancestry to identify novel regulatory and coding signals, even for well-studied traits like platelet traits.
Subject(s)
Genome-Wide Association Study , Precision Medicine , Blood Platelets , Humans , National Heart, Lung, and Blood Institute (U.S.) , Phenotype , Polymorphism, Single Nucleotide , Precision Medicine/methods , United States
ABSTRACT
Given the coronavirus disease 2019 (COVID-19) pandemic, investigations into host susceptibility to infectious diseases and downstream sequelae have never been more relevant. Pneumonia is a lung disease that can cause respiratory failure and hypoxia and is a common complication of infectious diseases, including COVID-19. Few genome-wide association studies (GWASs) of host susceptibility and severity of pneumonia have been conducted. We performed GWASs of pneumonia susceptibility and severity in the Vanderbilt University biobank (BioVU) with linked electronic health records (EHRs), including Illumina Expanded Multi-Ethnic Global Array (MEGAEX)-genotyped European ancestry (EA, n = 69,819) and African ancestry (AA, n = 15,603) individuals. Two regions of large effect were identified: the CFTR locus in EA (rs113827944; OR = 1.84, p value = 1.2 × 10⁻³⁶) and HBB in AA (rs334 [p.Glu7Val]; OR = 1.63, p value = 3.5 × 10⁻¹³). Mutations in these genes cause cystic fibrosis (CF) and sickle cell disease (SCD), respectively. After removing individuals diagnosed with CF and SCD, we assessed heterozygosity effects at our lead variants. Further GWASs after removing individuals with CF uncovered an additional association in R3HCC1L (rs10786398; OR = 1.22, p value = 3.5 × 10⁻⁸), which was replicated in two independent datasets: UK Biobank (n = 459,741) and 7,985 non-overlapping BioVU subjects, who are genotyped on arrays other than MEGAEX. This variant was also validated in GWASs of COVID-19 hospitalization and lung function. Our results highlight the importance of the host genome in infectious disease susceptibility and severity and offer crucial insight into genetic effects that could potentially influence severity of COVID-19 sequelae.
Subject(s)
COVID-19/complications , COVID-19/genetics , Host-Pathogen Interactions/genetics , Pneumonia, Viral/complications , Pneumonia, Viral/genetics , Bronchitis/genetics , COVID-19/pathology , COVID-19/physiopathology , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Databases, Genetic , Electronic Health Records , Female , Genome-Wide Association Study , Genotype , Hemoglobins/genetics , Humans , Inpatients , Linkage Disequilibrium , Male , Outpatients , Pneumonia, Viral/pathology , Pneumonia, Viral/physiopathology , Polymorphism, Single Nucleotide/genetics , Principal Component Analysis , Pulmonary Disease, Chronic Obstructive/genetics , Reproducibility of Results , United Kingdom
ABSTRACT
Whole-genome sequencing (WGS), a powerful tool for detecting novel coding and non-coding disease-causing variants, has largely been applied to clinical diagnosis of inherited disorders. Here we leveraged WGS data in up to 62,653 ethnically diverse participants from the NHLBI Trans-Omics for Precision Medicine (TOPMed) program and assessed statistical association of variants with seven red blood cell (RBC) quantitative traits. We discovered 14 single variant-RBC trait associations at 12 genomic loci, which have not been reported previously. Several of the RBC trait-variant associations (RPN1, ELL2, MIDN, HBB, HBA1, PIEZO1, and G6PD) were replicated in independent GWAS datasets imputed to the TOPMed reference panel. Most of these discovered variants are rare/low frequency, and several are observed disproportionately among non-European ancestry (African, Hispanic/Latino, or East Asian) populations. We identified a 3 bp indel p.Lys2169del (g.88717175_88717177TCT[4]) (common only in the Ashkenazi Jewish population) of PIEZO1, a gene responsible for the Mendelian red cell disorder hereditary xerocytosis (MIM: 194380), associated with higher mean corpuscular hemoglobin concentration (MCHC). In stepwise conditional analysis and in gene-based rare variant aggregated association analysis, we identified several of the variants in HBB, HBA1, TMPRSS6, and G6PD that represent the carrier state for known coding, promoter, or splice site loss-of-function variants that cause inherited RBC disorders. Finally, we applied base and nuclease editing to demonstrate that the sentinel variant rs112097551 (nearest gene RPN1) acts through a cis-regulatory element that exerts long-range control of the gene RUVBL1, which is essential for hematopoiesis. Together, these results demonstrate the utility of WGS in ethnically diverse population-based samples and gene editing for expanding knowledge of the genetic architecture of quantitative hematologic traits and suggest a continuum between complex trait and Mendelian red cell disorders.
Subject(s)
Erythrocytes/metabolism , Erythrocytes/pathology , Genome-Wide Association Study , National Heart, Lung, and Blood Institute (U.S.)/organization & administration , Phenotype , Adult , Aged , Chromosomes, Human, Pair 16/genetics , Datasets as Topic , Female , Gene Editing , Genetic Variation/genetics , HEK293 Cells , Humans , Male , Middle Aged , Quality Control , Reproducibility of Results , United States
ABSTRACT
Many common and rare variants associated with hematologic traits have been discovered through imputation on large-scale reference panels. However, the majority of genome-wide association studies (GWASs) have been conducted in Europeans, and determining causal variants has proved challenging. We performed a GWAS of total leukocyte, neutrophil, lymphocyte, monocyte, eosinophil, and basophil counts generated from 109,563,748 variants in the autosomes and the X chromosome in the Trans-Omics for Precision Medicine (TOPMed) program, which included data from 61,802 individuals of diverse ancestry. We discovered and replicated 7 leukocyte trait associations, including (1) the association between a chromosome X, pseudo-autosomal region (PAR), noncoding variant located between cytokine receptor genes (CSF2RA and CRLF2) and lower eosinophil count; and (2) associations between single variants found predominantly among African Americans at the S1PR3 (9q22.1) and HBB (11p15.4) loci and monocyte and lymphocyte counts, respectively. We further provide evidence indicating that the newly discovered eosinophil-lowering chromosome X PAR variant might be associated with reduced susceptibility to common allergic diseases such as atopic dermatitis and asthma. Additionally, we found a burden of very rare FLT3 (13q12.2) variants associated with monocyte counts. Together, these results emphasize the utility of whole-genome sequencing in diverse samples in identifying associations missed by European-ancestry-driven GWASs.
Subject(s)
Asthma/epidemiology , Biomarkers/metabolism , Dermatitis, Atopic/epidemiology , Leukocytes/pathology , Polymorphism, Single Nucleotide , Pulmonary Disease, Chronic Obstructive/epidemiology , Quantitative Trait Loci , Asthma/genetics , Asthma/metabolism , Asthma/pathology , Dermatitis, Atopic/genetics , Dermatitis, Atopic/metabolism , Dermatitis, Atopic/pathology , Genetic Predisposition to Disease , Genome, Human , Genome-Wide Association Study , Humans , National Heart, Lung, and Blood Institute (U.S.) , Phenotype , Prognosis , Proteome/analysis , Proteome/metabolism , Pulmonary Disease, Chronic Obstructive/genetics , Pulmonary Disease, Chronic Obstructive/metabolism , Pulmonary Disease, Chronic Obstructive/pathology , United Kingdom/epidemiology , United States/epidemiology , Whole Genome Sequencing
ABSTRACT
AIMS/HYPOTHESIS: We examined the contribution of rare HNF1A variants to type 2 diabetes risk and age of diagnosis, and the extent to which their impact is affected by overall genetic susceptibility, across three ancestry groups. METHODS: Using exome sequencing data of 160,615 individuals of the UK Biobank and 18,797 individuals of the BioMe Biobank, we identified 746 carriers of rare functional HNF1A variants (minor allele frequency ≤1%), of which 507 carry variants in the functional domains. We calculated polygenic risk scores (PRSs) based on genome-wide association study summary statistics for type 2 diabetes, and examined the association of HNF1A variants and PRS with risk of type 2 diabetes and age of diagnosis. We also tested whether the PRS affects the association between HNF1A variants and type 2 diabetes risk by including an interaction term. RESULTS: Rare HNF1A variants that are predicted to impair protein function are associated with increased risk of type 2 diabetes in individuals of European ancestry (OR 1.46, p=0.049), particularly when the variants are located in the functional domains (OR 1.89, p=0.002). No association was observed for individuals of African ancestry (OR 1.10, p=0.60) or Hispanic-Latino ancestry (OR 1.00, p=1.00). Rare functional HNF1A variants were associated with an earlier age at diagnosis in the Hispanic-Latino population (β=-5.0 years, p=0.03), and this association was marginally more pronounced for variants in the functional domains (β=-5.59 years, p=0.03). No associations were observed for other ancestries (African ancestry β=-2.7 years, p=0.13; European ancestry β=-3.5 years, p=0.20). A higher PRS was associated with increased odds of type 2 diabetes in all ancestries (OR 1.61-2.11, p<10⁻⁵) and an earlier age at diagnosis in individuals of African ancestry (β=-1.4 years, p=3.7 × 10⁻⁶) and Hispanic-Latino ancestry (β=-2.4 years, p<2 × 10⁻¹⁶). Furthermore, a higher PRS exacerbated the effect of the functional HNF1A variants on type 2 diabetes in the European ancestry population (p for interaction=0.037). CONCLUSIONS/INTERPRETATION: We show that rare functional HNF1A variants, in particular those located in the functional domains, increase the risk of type 2 diabetes, at least among individuals of European ancestry. Their effect is even more pronounced in individuals with a high polygenic susceptibility. Our analyses highlight the importance of the location of functional variants within a gene and an individual's overall polygenic susceptibility, and emphasise the need for more genetic data in non-European populations.
Subject(s)
Diabetes Mellitus, Type 2 , Humans , Diabetes Mellitus, Type 2/genetics , Genome-Wide Association Study , Hepatocyte Nuclear Factor 1-alpha/genetics
ABSTRACT
Circulating levels of adiponectin, an adipocyte-secreted protein associated with cardiovascular and metabolic risk, are highly heritable. To gain insights into the biology that regulates adiponectin levels, we performed an exome array meta-analysis of 265,780 genetic variants in 67,739 individuals of European, Hispanic, African American, and East Asian ancestry. We identified 20 loci associated with adiponectin, including 11 that had been reported previously (p < 2 × 10⁻⁷). Comparison of exome array variants to regional linkage disequilibrium (LD) patterns and prior genome-wide association study (GWAS) results detected candidate variants (r² > 0.60) spanning as much as 900 kb. To identify potential genes and mechanisms through which the previously unreported association signals act to affect adiponectin levels, we assessed cross-trait associations, expression quantitative trait loci in subcutaneous adipose, and biological pathways of nearby genes. Eight of the nine loci were also associated (p < 1 × 10⁻⁴) with at least one obesity or lipid trait. Candidate genes include PRKAR2A, PTH1R, and HDAC9, which have been suggested to play roles in adipocyte differentiation or bone marrow adipose tissue. Taken together, these findings provide further insights into the processes that influence circulating adiponectin levels.
Subject(s)
Adiponectin/genetics , Adipose Tissue/pathology , Exome/genetics , Genetic Predisposition to Disease , Lipids/analysis , Obesity/etiology , Polymorphism, Single Nucleotide , Adipose Tissue/metabolism , Adolescent , Adult , Black or African American/genetics , Aged , Aged, 80 and over , Female , Hispanic or Latino/genetics , Humans , Male , Middle Aged , Obesity/pathology , Phenotype , Quantitative Trait Loci , White People/genetics , Young Adult
ABSTRACT
Despite the dramatic underrepresentation of non-European populations in human genetics studies, researchers continue to exclude participants of non-European ancestry, as well as variants rare in European populations, even when these data are available. This practice perpetuates existing research disparities and can lead to important and large effect size associations being missed. Here, we conducted genome-wide association studies (GWAS) of 31 serum and urine biomarker quantitative traits in African (n = 9354), East Asian (n = 2559), and South Asian (n = 9823) ancestry UK Biobank (UKBB) participants. We adjusted for all known GWAS catalog variants for each trait, as well as novel signals identified in a recent European ancestry-focused analysis of UKBB participants. We identify 7 novel signals in African ancestry and 2 novel signals in South Asian ancestry participants (p < 1.61E-10). Many of these signals are highly plausible, including a cis pQTL for the gene encoding gamma-glutamyl transferase and PIEZO1 and G6PD variants with impacts on HbA1c through likely erythrocytic mechanisms. This work illustrates the importance of using the genetic data we already have in diverse populations, with novel discoveries possible in even modest sample sizes.
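The significance threshold quoted above (p < 1.61E-10) has the size one would get from a Bonferroni-style correction across the 31 traits. The short sketch below reproduces that arithmetic under the assumption, not stated in the abstract, that a base threshold of 5 × 10⁻⁹ was divided by the number of traits.

```python
# Assumed base genome-wide threshold; the abstract does not state this value.
BASE_ALPHA = 5e-9
# Number of biomarker traits analysed, as given in the abstract.
N_TRAITS = 31

# Bonferroni-style correction: divide the base threshold by the number of traits.
threshold = BASE_ALPHA / N_TRAITS
print(f"{threshold:.2e}")  # prints 1.61e-10
```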
Subject(s)
Biological Specimen Banks/statistics & numerical data , Biomarkers/metabolism , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide , Quantitative Trait Loci/genetics , Alleles , Asian People/genetics , Biomarkers/blood , Biomarkers/urine , Black People/genetics , Female , Gene Frequency , Genetic Predisposition to Disease/ethnology , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/statistics & numerical data , Genotype , Humans , Male , Phenotype , United Kingdom , White People/genetics
ABSTRACT
E-selectin mediates the rolling of circulating leukocytes during inflammatory processes. Previous genome-wide association studies in European and Asian individuals have identified the ABO locus associated with E-selectin levels. Using Trans-Omics for Precision Medicine whole genome sequencing data in 2249 African Americans (AAs) from the Jackson Heart Study, we examined genome-wide associations with soluble E-selectin levels. In addition to replicating known signals at ABO, we identified a novel association of a common loss-of-function, missense variant in Fucosyltransferase 6 (FUT6; rs17855739, p.Glu274Lys, P = 9.02 × 10⁻²⁴) with higher soluble E-selectin levels. This variant is considerably more common in populations of African ancestry compared to non-African ancestry populations. We replicated the association of FUT6 p.Glu274Lys with higher soluble E-selectin in an independent population of 748 AAs from the Women's Health Initiative and identified an additional pleiotropic association with vitamin B12 levels. Despite the broad role of both selectins and fucosyltransferases in various inflammatory, immune and cancer-related processes, we were unable to identify any additional disease associations of the FUT6 p.Glu274Lys variant in an electronic medical record-based phenome-wide association scan of over 9000 AAs.
Subject(s)
Black or African American/genetics , E-Selectin/genetics , Fucosyltransferases/genetics , Adult , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Male , Polymorphism, Single Nucleotide , Whole Genome Sequencing/methods
ABSTRACT
Atrial fibrillation (AF) is a common cardiac arrhythmia and a major risk factor for stroke, heart failure, and premature death. The pathogenesis of AF remains poorly understood, which contributes to the current lack of highly effective treatments. To understand the genetic variation and biology underlying AF, we undertook a genome-wide association study (GWAS) of 6,337 AF individuals and 61,607 AF-free individuals from Norway, including replication in an additional 30,679 AF individuals and 278,895 AF-free individuals. Through genotyping and dense imputation mapping from whole-genome sequencing, we tested almost nine million genetic variants across the genome and identified seven risk loci, including two novel loci. One novel locus (lead single-nucleotide variant [SNV] rs12614435; p = 6.76 × 10⁻¹⁸) comprised intronic and several highly correlated missense variants situated in the I-, A-, and M-bands of titin, which is the largest protein in humans and responsible for the passive elasticity of heart and skeletal muscle. The other novel locus (lead SNV rs56202902; p = 1.54 × 10⁻¹¹) covered a large, gene-dense chromosome 1 region that has previously been linked to cardiac conduction. Pathway and functional enrichment analyses suggested that many AF-associated genetic variants act through a mechanism of impaired muscle cell differentiation and tissue formation during fetal heart development.
Subject(s)
Atrial Fibrillation/genetics , Genetic Loci , Genetic Predisposition to Disease , Genome-Wide Association Study , Heart/embryology , Regulatory Sequences, Nucleic Acid/genetics , Humans , Inheritance Patterns/genetics , Multifactorial Inheritance/genetics , Organ Specificity/genetics , Physical Chromosome Mapping , Quantitative Trait Loci/genetics , Reproducibility of Results , Risk Factors
Subject(s)
Loss of Function Mutation , Transcobalamins/genetics , Vitamin B 12/blood , Vitamin B Deficiency/genetics , Black or African American/genetics , Female , Genome-Wide Association Study , Humans , Male , Polymorphism, Single Nucleotide , Vitamin B Deficiency/blood
ABSTRACT
Whole genome sequences (WGS) enable discovery of rare variants which may contribute to missing heritability of coronary artery disease (CAD). To measure their contribution, we apply the GREML-LDMS-I approach to WGS of 4949 cases and 17,494 controls of European ancestry from the NHLBI TOPMed program. We estimate CAD heritability at 34.3% assuming a prevalence of 8.2%. Ultra-rare (minor allele frequency ≤ 0.1%) variants with low linkage disequilibrium (LD) score contribute ~50% of the heritability. We also investigate CAD heritability enrichment using a diverse set of functional annotations: i) constraint; ii) predicted protein-altering impact; iii) cis-regulatory elements from a cell-specific chromatin atlas of the human coronary; and iv) annotation principal components representing a wide range of functional processes. We observe marked enrichment of CAD heritability for most functional annotations. These results reveal the predominant role of ultra-rare variants in low LD on the heritability of CAD. Moreover, they highlight several functional processes including cell type-specific regulatory mechanisms as key drivers of CAD genetic risk.
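Reporting heritability "assuming a prevalence of 8.2%" suggests an estimate on the liability scale. The sketch below shows the commonly used conversion from an observed-scale (case/control) estimate to the liability scale, which rescales by the population prevalence and the proportion of cases in the sample. Whether the authors applied exactly this step is an assumption, and the function name is hypothetical. Only the case and control counts and the prevalence come from the abstract.

```python
from statistics import NormalDist


def observed_to_liability(h2_obs: float, prevalence: float, case_fraction: float) -> float:
    """Convert observed-scale heritability to the liability scale.

    Uses h2_liab = h2_obs * [K(1-K)]^2 / (z^2 * P(1-P)), where K is the
    population prevalence, P the proportion of cases in the sample, and z the
    standard normal density at the liability threshold for prevalence K.
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - prevalence)  # liability threshold
    z = std_normal.pdf(threshold)                     # normal density at the threshold
    k, p = prevalence, case_fraction
    return h2_obs * (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))


# Sample composition and prevalence as given in the abstract.
cases, controls = 4949, 17494
case_fraction = cases / (cases + controls)

# Multiplicative factor applied to any observed-scale estimate in this setting.
factor = observed_to_liability(1.0, 0.082, case_fraction)
print(f"case fraction = {case_fraction:.3f}, scale factor = {factor:.3f}")
```

The conversion is linear in the observed-scale estimate, so the printed factor is all that is needed to move between the two scales for this sample.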
Subject(s)
Coronary Artery Disease , Genetic Predisposition to Disease , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Humans , Coronary Artery Disease/genetics , Male , Female , Gene Frequency , Genome-Wide Association Study , White People/genetics , Case-Control Studies , Whole Genome Sequencing , Genetic Variation , Middle Aged
ABSTRACT
Most genome-wide association studies (GWAS) of major depression (MD) have been conducted in samples of European ancestry. Here we report a multi-ancestry GWAS of MD, adding data from 21 cohorts with 88,316 MD cases and 902,757 controls to previously reported data. This analysis used a range of measures to define MD and included samples of African (36% of effective sample size), East Asian (26%) and South Asian (6%) ancestry and Hispanic/Latin American participants (32%). The multi-ancestry GWAS identified 53 significantly associated novel loci. For loci from GWAS in European ancestry samples, fewer than expected were transferable to other ancestry groups. Fine mapping benefited from additional sample diversity. A transcriptome-wide association study identified 205 significantly associated novel genes. These findings suggest that, for MD, increasing ancestral and global diversity in genetic studies may be particularly important to ensure discovery of core genes and inform about transferability of findings.
Subject(s)
Depressive Disorder, Major , Genome-Wide Association Study , Humans , Genetic Predisposition to Disease , Depressive Disorder, Major/genetics , Depression , Chromosome Mapping , Polymorphism, Single Nucleotide/genetics
ABSTRACT
BACKGROUND: The branched chain amino acids (BCAA) leucine, isoleucine, and valine are essential nutrients that have been associated with diabetes, cancers, and cardiovascular diseases. Observational studies suggest that BCAAs exert homogeneous phenotypic effects, but these findings are inconsistent with results from experimental human and animal studies. METHODS: Hypothesizing that inconsistencies between observational and experimental BCAA studies reflect bias from shared lifestyle and genetic factors in observational studies, we used data from the UK Biobank and applied multivariable Mendelian randomization causal inference methods designed to address these biases. RESULTS: In n = 97,469 participants of European ancestry (mean age = 56.7 years; 54.1% female), we estimate distinct and often opposing total causal effects for each BCAA. For example, of the 117 phenotypes with evidence of a statistically significant total causal effect for at least one BCAA, almost half (44%, n = 52) are associated with only one BCAA. These 52 associations include total causal effects of valine on diabetic eye disease [odds ratio = 1.51, 95% confidence interval (CI) = 1.31, 1.76], valine on albuminuria (odds ratio = 1.14, 95% CI = 1.08, 1.20), and isoleucine on angina (odds ratio = 1.17, 95% CI = 1.31, 1.76). CONCLUSIONS: Our results suggest that the observational literature provides a flawed picture of BCAA phenotypic effects that is inconsistent with experimental studies and could mislead efforts developing novel therapeutics. More broadly, these findings motivate the development and application of causal inference approaches that enable 'omics studies conducted in observational settings to account for the biasing effects of shared genetic and lifestyle factors.
The three branched chain amino acids (BCAAs) leucine, isoleucine, and valine are important building blocks of muscle proteins that are obtained from the diet. Many studies in human populations have examined whether BCAAs affect health and disease. These human studies report results that are inconsistent with results from highly controlled animal studies. Because interest in the therapeutic targeting of BCAAs is growing, we wanted to better understand these discrepancies. Briefly, we used data from a large database that captured many diseases (e.g., cardiovascular disease, cancers, and respiratory disease) and new statistical methods. Our results showed that discrepancies between human studies and animal studies may reflect errors in the ways human studies were designed and conducted. As a result, these human studies may provide a flawed picture of BCAA effects that could mislead efforts developing novel therapeutics.
ABSTRACT
INTRODUCTION: The independent and causal cardiovascular disease risk factor lipoprotein(a) (Lp(a)) is elevated in >1.5 billion individuals worldwide, but studies have prioritised European populations. METHODS: Here, we examined how ancestrally diverse studies could clarify Lp(a)'s genetic architecture, inform efforts examining application of Lp(a) polygenic risk scores (PRS), enable causal inference and identify unexpected Lp(a) phenotypic effects using data from African (n=25 208), East Asian (n=2895), European (n=362 558), South Asian (n=8192) and Hispanic/Latino (n=8946) populations. RESULTS: Fourteen genome-wide significant loci with numerous population-specific signals of large effect were identified that enabled construction of Lp(a) PRS of moderate (R²=15% in East Asians) to high (R²=50% in Europeans) accuracy. For all populations, PRS showed promise as a 'rule out' for elevated Lp(a) because certainty of assignment to the low-risk threshold was high (88.0%-99.9%) across PRS thresholds (80th-99th percentile). Causal effects of increased Lp(a) with increased glycated haemoglobin were estimated for Europeans (p value = 1.4 × 10⁻⁶), although inverse effects in Africans and East Asians suggested the potential for heterogeneous causal effects. Finally, Hispanic/Latinos were the only population in which known associations with coronary atherosclerosis and ischaemic heart disease were identified in external testing of Lp(a) PRS phenotypic effects. CONCLUSIONS: Our results emphasise the merits of prioritising ancestral diversity when addressing Lp(a) evidence gaps.
Subject(s)
Coronary Artery Disease , Myocardial Ischemia , Humans , Lipoprotein(a)/genetics , Evidence Gaps , Risk Factors , Coronary Artery Disease/diagnosis , Coronary Artery Disease/epidemiology , Coronary Artery Disease/genetics
ABSTRACT
Ever larger Structural Variant (SV) catalogs highlighting the diversity within and between populations help researchers better understand the links between SVs and disease. The identification of SVs from DNA sequence data is non-trivial and requires a balance between comprehensiveness and precision. Here we present a catalog of 355,667 SVs (59.34% novel) across autosomes and the X chromosome (50bp+) from 138,134 individuals in the diverse TOPMed consortium. We describe our methodologies for SV inference resulting in high variant quality and >90% allele concordance compared to long-read de-novo assemblies of well-characterized control samples. We demonstrate utility through significant associations between SVs and various important cardio-metabolic and hematologic traits. We have identified 690 SV hotspots and deserts and those that potentially impact the regulation of medically relevant genes. This catalog characterizes SVs across multiple populations and will serve as a valuable tool to understand the impact of SV on disease development and progression.