ABSTRACT
Tobacco and alcohol use are heritable behaviours associated with 15% and 5.3% of worldwide deaths, respectively, due largely to broad increased risk for disease and injury [1-4]. These substances are used across the globe, yet genome-wide association studies have focused largely on individuals of European ancestries [5]. Here we leveraged global genetic diversity across 3.4 million individuals from four major clines of global ancestry (approximately 21% non-European) to power the discovery and fine-mapping of genomic loci associated with tobacco and alcohol use, to inform function of these loci via ancestry-aware transcriptome-wide association studies, and to evaluate the genetic architecture and predictive power of polygenic risk within and across populations. We found that increases in sample size and genetic diversity improved locus identification and fine-mapping resolution, and that a large majority of the 3,823 associated variants (from 2,143 loci) showed consistent effect sizes across ancestry dimensions. However, polygenic risk scores developed in one ancestry performed poorly in others, highlighting the continued need to increase sample sizes of diverse ancestries to realize any potential benefit of polygenic prediction.
Subject(s)
Alcohol Drinking , Genetic Predisposition to Disease , Genetic Variation , Internationality , Multifactorial Inheritance , Tobacco Use , Humans , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Genome-Wide Association Study/methods , Multifactorial Inheritance/genetics , Risk Factors , Tobacco Use/genetics , Alcohol Drinking/genetics , Transcriptome , Sample Size , Genetic Loci/genetics , Europe/ethnology
ABSTRACT
Precision medicine initiatives across the globe have led to a revolution of repositories linking large-scale genomic data with electronic health records, enabling genomic analyses across the entire phenome. Many of these initiatives focus solely on research insights, leading to limited direct benefit to patients. We describe the biobank at the Colorado Center for Personalized Medicine (CCPM Biobank) that was jointly developed by the University of Colorado Anschutz Medical Campus and UCHealth to serve as a unique, dual-purpose research and clinical resource accelerating personalized medicine. This living resource currently has more than 200,000 participants with ongoing recruitment. We highlight the clinical, laboratory, regulatory, and HIPAA-compliant informatics infrastructure along with our stakeholder engagement, consent, recontact, and participant engagement strategies. We characterize aspects of genetic and geographic diversity unique to the Rocky Mountain region, the primary catchment area for CCPM Biobank participants. We leverage linked health and demographic information of the CCPM Biobank participant population to demonstrate the utility of the CCPM Biobank to replicate complex trait associations in the first 33,674 genotyped individuals across multiple disease domains. Finally, we describe our current efforts toward return of clinical genetic test results, including high-impact pathogenic variants and pharmacogenetic information, and our broader goals as the CCPM Biobank continues to grow. Bringing clinical and research interests together fosters unique clinical and translational questions that can be addressed from the large EHR-linked CCPM Biobank resource within a HIPAA- and CLIA-certified environment.
Subject(s)
Learning Health System , Precision Medicine , Humans , Biological Specimen Banks , Colorado , Genomics
ABSTRACT
Many common and rare variants associated with hematologic traits have been discovered through imputation on large-scale reference panels. However, the majority of genome-wide association studies (GWASs) have been conducted in Europeans, and determining causal variants has proved challenging. We performed a GWAS of total leukocyte, neutrophil, lymphocyte, monocyte, eosinophil, and basophil counts generated from 109,563,748 variants in the autosomes and the X chromosome in the Trans-Omics for Precision Medicine (TOPMed) program, which included data from 61,802 individuals of diverse ancestry. We discovered and replicated 7 leukocyte trait associations, including (1) the association between a chromosome X, pseudo-autosomal region (PAR), noncoding variant located between cytokine receptor genes (CSF2RA and CRLF2) and lower eosinophil count; and (2) associations between single variants found predominantly among African Americans at the S1PR3 (9q22.1) and HBB (11p15.4) loci and monocyte and lymphocyte counts, respectively. We further provide evidence indicating that the newly discovered eosinophil-lowering chromosome X PAR variant might be associated with reduced susceptibility to common allergic diseases such as atopic dermatitis and asthma. Additionally, we found a burden of very rare FLT3 (13q12.2) variants associated with monocyte counts. Together, these results emphasize the utility of whole-genome sequencing in diverse samples in identifying associations missed by European-ancestry-driven GWASs.
Subject(s)
Asthma/epidemiology , Biomarkers/metabolism , Atopic Dermatitis/epidemiology , Leukocytes/pathology , Single Nucleotide Polymorphism , Chronic Obstructive Pulmonary Disease/epidemiology , Quantitative Trait Loci , Asthma/genetics , Asthma/metabolism , Asthma/pathology , Atopic Dermatitis/genetics , Atopic Dermatitis/metabolism , Atopic Dermatitis/pathology , Genetic Predisposition to Disease , Human Genome , Genome-Wide Association Study , Humans , National Heart, Lung, and Blood Institute (U.S.) , Phenotype , Prognosis , Proteome/analysis , Proteome/metabolism , Chronic Obstructive Pulmonary Disease/genetics , Chronic Obstructive Pulmonary Disease/metabolism , Chronic Obstructive Pulmonary Disease/pathology , United Kingdom/epidemiology , United States/epidemiology , Whole Genome Sequencing
ABSTRACT
BACKGROUND: While numerous genetic loci associated with atopic dermatitis (AD) have been discovered, to date, work leveraging the combined burden of AD risk variants across the genome to predict disease risk has been limited. OBJECTIVES: This study aims to determine whether polygenic risk scores (PRSs) relying on genetic determinants for AD provide useful predictions for disease occurrence and severity. It also explicitly tests the value of including genome-wide association studies of related allergic phenotypes and known FLG loss-of-function (LOF) variants. METHODS: AD PRSs were constructed for 1619 European American individuals from the Atopic Dermatitis Research Network using an AD training dataset and an atopic training dataset including AD, childhood onset asthma, and general allergy. Additionally, whole genome sequencing data were used to explore genetic scoring specific to FLG LOF mutations. RESULTS: Genetic scores derived from the AD-only genome-wide association studies were predictive of AD cases (PRSAD: odds ratio [OR], 1.70; 95% CI, 1.49-1.93). Accuracy was first improved when PRSs were built off the larger atopy genome-wide association studies (PRSAD+: OR, 2.16; 95% CI, 1.89-2.47) and further improved when including FLG LOF mutations (PRSAD++: OR, 3.23; 95% CI, 2.57-4.07). Importantly, while all 3 PRSs correlated with AD severity, the best prediction was from PRSAD++, which distinguished individuals with severe AD from control subjects with OR of 3.86 (95% CI, 2.77-5.36). CONCLUSIONS: This study demonstrates how PRSs for AD that include genetic determinants across atopic phenotypes and FLG LOF variants may be a promising tool for identifying individuals at high risk for developing disease and specifically severe disease.
Subject(s)
Atopic Dermatitis/genetics , Filaggrin Proteins/genetics , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Infant , Linkage Disequilibrium , Loss of Function Mutation , Male , Phenotype
ABSTRACT
BACKGROUND: Genetic ancestry plays a role in asthma health disparities. OBJECTIVE: Our aim was to evaluate the impact of ancestry on and identify genetic variants associated with asthma, total serum IgE level, and lung function. METHODS: A total of 436 Peruvian children (aged 9-19 years) with asthma and 291 without asthma were genotyped by using the Illumina Multi-Ethnic Global Array. Genome-wide proportions of indigenous ancestry populations from continental America (NAT) and European ancestry from the Iberian populations in Spain (IBS) were estimated by using ADMIXTURE. We assessed the relationship between ancestry and the phenotypes and performed a genome-wide association study. RESULTS: The mean ancestry proportions were 84.7% NAT (case patients, 84.2%; controls, 85.4%) and 15.3% IBS (15.8%; 14.6%). With adjustment for asthma, NAT was associated with higher total serum IgE levels (P < .001) and IBS was associated with lower total serum IgE levels (P < .001). NAT was associated with higher FEV1 percent predicted values (P < .001), whereas IBS was associated with lower FEV1 values in the controls but not in the case patients. The HLA-DR/DQ region on chromosome 6 (Chr6) was strongly associated with total serum IgE (rs3135348; P = 3.438 × 10^-10) and was independent of an association with the haplotype HLA-DQA1~HLA-DQB1:04.01~04.02 (P = 1.55 × 10^-5). For lung function, we identified a locus (rs4410198; P = 5.536 × 10^-11) mapping to Chr19, near a cluster of zinc finger interacting genes that colocalizes to the long noncoding RNA CTD-2537I9.5. This novel locus was replicated in an independent sample of pediatric case patients with asthma with similar admixture from Brazil (P = .005). CONCLUSION: This study confirms the role of HLA in atopy, and identifies a novel locus mapping to a long noncoding RNA for lung function that may be specific to children with NAT.
Subject(s)
Asthma/genetics , Genotype , Immunoglobulin E/metabolism , Indigenous Peoples , Lung/metabolism , Adolescent , Americas , Asthma/epidemiology , Child , Cohort Studies , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , HLA-DQ Antigens/metabolism , Humans , Lung/immunology , Male , Peru/epidemiology , Single Nucleotide Polymorphism , Long Noncoding RNA/genetics , Spain , Young Adult
ABSTRACT
BACKGROUND: Total serum IgE (tIgE) is an important intermediate phenotype of allergic disease. Whole genome genetic association studies across ancestries may identify important determinants of IgE. OBJECTIVE: We aimed to increase understanding of genetic variants affecting tIgE production across the ancestry and allergic disease spectrum by leveraging data from the National Heart, Lung and Blood Institute Trans-Omics for Precision Medicine program; the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA); and the Atopic Dermatitis Research Network (N = 21,901). METHODS: We performed genome-wide association within strata of study, disease, and ancestry groups, and we combined results via a meta-regression approach that models heterogeneity attributable to ancestry. We also tested for association between HLA alleles called from whole genome sequence data and tIgE, assessing replication of associations in HLA alleles called from genotype array data. RESULTS: We identified 6 loci at genome-wide significance (P < 5 × 10^-9), including 4 loci previously reported as genome-wide significant for tIgE, as well as new regions in chr11q13.5 and chr15q22.2, which were also identified in prior genome-wide association studies of atopic dermatitis and asthma. In the HLA allele association study, HLA-A*02:01 was associated with decreased tIgE level (Pdiscovery = 2 × 10^-4; Preplication = 5 × 10^-4; Pdiscovery+replication = 4 × 10^-7), and HLA-DQB1*03:02 was strongly associated with decreased tIgE level in Hispanic/Latino ancestry populations (PHispanic/Latino discovery+replication = 8 × 10^-8). CONCLUSION: We performed the largest genome-wide association study and HLA association study of tIgE focused on ancestrally diverse populations and found several known tIgE and allergic disease loci that are relevant in non-European ancestry populations.
Subject(s)
Asthma/genetics , Atopic Dermatitis/genetics , Ethnicity , Genotype , HLA-A2 Antigen/genetics , HLA-DQ beta-Chains/genetics , Adolescent , Adult , Aged , Child , Preschool Child , Female , Gene Frequency , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Immunoglobulin E/blood , Male , Middle Aged , National Heart, Lung, and Blood Institute (U.S.) , United States , Whole Genome Sequencing , Young Adult
ABSTRACT
BACKGROUND: Eczema herpeticum (EH) is a rare complication of atopic dermatitis (AD) caused by disseminated herpes simplex virus (HSV) infection. The role of rare and/or deleterious genetic variants in disease etiology is largely unknown. This study aimed to identify genes that harbor damaging genetic variants associated with HSV infection in AD with a history of recurrent eczema herpeticum (ADEH+). METHODS: Whole genome sequencing (WGS) was performed on 49 recurrent ADEH+ (≥3 EH episodes), 491 AD without a history of eczema herpeticum (ADEH-) and 237 non-atopic control (NA) subjects. Variants were annotated, and a gene-based approach (SKAT-O) was used to identify genes harboring damaging genetic variants associated with ADEH+. Genes identified through WGS were studied for effects on HSV responses and keratinocyte differentiation. RESULTS: Eight genes were identified in the comparison of recurrent ADEH+ to ADEH- and NA subjects: SIDT2, CLEC7A, GSTZ1, TPSG1, SP110, RBBP8NL, TRIM15, and FRMD3. Silencing SIDT2 and RBBP8NL in normal human primary keratinocytes (NHPKs) led to significantly increased HSV-1 replication. SIDT2-silenced NHPKs had decreased gene expression of IFNk and IL1b in response to HSV-1 infection. RBBP8NL-silenced NHPKs had decreased gene expression of IFNk, but increased IL1b. Additionally, silencing SIDT2 and RBBP8NL inhibited gene expression of the keratinocyte differentiation markers keratin 10 (KRT10) and loricrin (LOR). CONCLUSION: SIDT2 and RBBP8NL participate in the keratinocyte response to HSV-1 infection. SIDT2 and RBBP8NL also regulate expression of the keratinocyte differentiation genes KRT10 and LOR.
Subject(s)
Atopic Dermatitis , Human Herpesvirus 1 , Kaposi Varicelliform Eruption , Nucleotide Transport Proteins , Atopic Dermatitis/genetics , Glutathione Transferase , Human Herpesvirus 1/genetics , Humans , Kaposi Varicelliform Eruption/genetics , Mutation , Whole Genome Sequencing
ABSTRACT
BACKGROUND: Asthma is a complex chronic inflammatory disease of the airways. Association studies between HLA and asthma were first reported in the 1970s, and yet, the precise role of HLA alleles in asthma is not fully understood. Numerous genome-wide association studies were recently conducted on asthma, but were always limited to simple genetic markers (single nucleotide polymorphisms) and not complex HLA gene polymorphisms (alleles/haplotypes), therefore not capturing the biological relevance of this complex locus for asthma pathogenesis. OBJECTIVE: To run the first HLA-centric association study with asthma and specific asthma-related phenotypes in a large cohort of African-ancestry individuals. METHODS: We collected high-density genomics data for the Consortium on Asthma among African-ancestry Populations in the Americas (N = 4993) participants. Using computer-intensive machine-learning attribute bagging methods to infer HLA alleles, and Easy-HLA to infer HLA 5-gene haplotypes, we conducted a high-throughput HLA-centric association study of asthma susceptibility and total serum IgE (tIgE) levels in subjects with and without asthma. RESULTS: Among the 1607 individuals with asthma, 972 had available tIgE levels, with a mean tIgE level of 198.7 IU/mL. We could not identify any association with asthma susceptibility. However, we showed that HLA-DRB1*09:01 was associated with increased tIgE levels (P = 8.5 × 10^-4; weighted effect size, 0.51 [0.15-0.87]). CONCLUSIONS: We identified for the first time an HLA allele associated with tIgE levels in African-ancestry individuals with asthma. Our report emphasizes that by leveraging powerful computational machine-learning methods, specific/extreme phenotypes, and population diversity, we can explore HLA gene polymorphisms in depth and reveal the full extent of complex disease associations.
Subject(s)
Alleles , Black or African American/genetics , HLA-DRB1 Chains/genetics , Immunoglobulin E/immunology , Single Nucleotide Polymorphism , Asthma , Female , HLA-DRB1 Chains/immunology , Humans , Male
ABSTRACT
MOTIVATION: Sample source, procurement process and other technical variations introduce batch effects into genomics data. Algorithms to remove these artifacts enhance differences between known biological covariates, but also carry potential concern of removing intragroup biological heterogeneity and thus any personalized genomic signatures. As a result, accurate identification of novel subtypes from batch-corrected genomics data is challenging using standard algorithms designed to remove batch effects for class comparison analyses. Nor can batch effects be corrected reliably in future applications of genomics-based clinical tests, in which the biological groups are by definition unknown a priori. RESULTS: Therefore, we assess the extent to which various batch correction algorithms remove true biological heterogeneity. We also introduce an algorithm, permuted-SVA (pSVA), using a new statistical model that is blind to biological covariates to correct for technical artifacts while retaining biological heterogeneity in genomic data. This algorithm facilitated accurate subtype identification in head and neck cancer from gene expression data in both formalin-fixed and frozen samples. When applied to predict Human Papillomavirus (HPV) status, pSVA improved cross-study validation even if the sample batches were highly confounded with HPV status in the training set. AVAILABILITY AND IMPLEMENTATION: All analyses were performed using R version 2.15.0. The code and data used to generate the results of this manuscript are available from https://sourceforge.net/projects/psva.
Subject(s)
Algorithms , Genomics/methods , Head and Neck Neoplasms/genetics , Papillomavirus Infections/diagnosis , Artifacts , Computational Biology/methods , Head and Neck Neoplasms/virology , Humans , Statistical Models , Reproducibility of Results , Software
Subject(s)
Mucosa-Associated Lymphoid Tissue Lymphoma Translocation 1 Protein/genetics , Peanut Hypersensitivity/genetics , Allergens/immunology , Arachis/immunology , Child , Female , Genetic Loci , Genetic Predisposition to Disease , Humans , Immunoglobulin E/immunology , Male , Risk
ABSTRACT
CONTEXT: Thyroid nodule ultrasound-based risk stratification schemas rely on the presence of high-risk sonographic features. However, some malignant thyroid nodules have benign appearance on thyroid ultrasound. New methods for thyroid nodule risk assessment are needed. OBJECTIVE: We investigated polygenic risk score (PRS) accounting for inherited thyroid cancer risk combined with ultrasound-based analysis for improved thyroid nodule risk assessment. METHODS: The convolutional neural network classifier was trained on thyroid ultrasound still images and cine clips from 621 thyroid nodules. Phenome-wide association study (PheWAS) and PRS PheWAS were used to optimize PRS for distinguishing benign and malignant nodules. PRS was evaluated in 73 346 participants in the Colorado Center for Personalized Medicine Biobank. RESULTS: When the deep learning model output was combined with thyroid cancer PRS and genetic ancestry estimates, the area under the receiver operating characteristic curve (AUROC) of the benign vs malignant thyroid nodule classifier increased from 0.83 to 0.89 (DeLong, P value = .007). The combined deep learning and genetic classifier achieved a clinically relevant sensitivity of 0.95, 95% CI [0.88-0.99], specificity of 0.63 [0.55-0.70], and positive and negative predictive values of 0.47 [0.41-0.58] and 0.97 [0.92-0.99], respectively. AUROC improvement was consistent in European ancestry-stratified analysis (0.83 and 0.87 for deep learning and deep learning combined with PRS classifiers, respectively). Elevated PRS was associated with a greater risk of thyroid cancer structural disease recurrence (ordinal logistic regression, P value = .002). CONCLUSION: Augmenting ultrasound-based risk assessment with PRS improves diagnostic accuracy.
Subject(s)
Thyroid Neoplasms , Thyroid Nodule , Humans , Thyroid Nodule/diagnostic imaging , Thyroid Nodule/genetics , Genetic Risk Score , Sensitivity and Specificity , Local Neoplasm Recurrence , Thyroid Neoplasms/diagnostic imaging , Thyroid Neoplasms/genetics , Ultrasonography/methods
ABSTRACT
Asthma has striking disparities across ancestral groups, but the molecular underpinning of these differences is poorly understood and minimally studied. A goal of the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) is to understand multi-omic signatures of asthma focusing on populations of African ancestry. RNASeq and DNA methylation data are generated from nasal epithelium including cases (current asthma, N = 253) and controls (never-asthma, N = 283) from 7 different geographic sites to identify differentially expressed genes (DEGs) and gene networks. We identify 389 DEGs; the top DEG, FN1, was downregulated in cases (q = 3.26 × 10^-9) and encodes fibronectin, which plays a role in wound healing. The top three gene expression modules implicate networks related to immune response (CEACAM5; p = 9.62 × 10^-16 and CPA3; p = 2.39 × 10^-14) and wound healing (FN1; p = 7.63 × 10^-9). Multi-omic analysis identifies FKBP5, a co-chaperone of glucocorticoid receptor signaling known to be involved in drug response in asthma, whose nasal epithelium gene expression is likely regulated by methylation and is associated with increased use of inhaled corticosteroids. This work reveals molecular dysregulation on three axes (increased Th2 inflammation, decreased capacity for wound healing, and impaired drug response) that may play a critical role in asthma within the African Diaspora.
Subject(s)
Asthma , Black People , DNA Methylation , Nasal Mucosa , Tacrolimus Binding Proteins , Humans , Asthma/genetics , Asthma/metabolism , Nasal Mucosa/metabolism , Tacrolimus Binding Proteins/genetics , Tacrolimus Binding Proteins/metabolism , Female , Male , Black People/genetics , Adult , Gene Regulatory Networks , Fibronectins/metabolism , Fibronectins/genetics , Case-Control Studies , Gene Expression Regulation , Middle Aged , Multiomics
ABSTRACT
Anthropometric traits, measuring body size and shape, are highly heritable and significant clinical risk factors for cardiometabolic disorders. These traits have been extensively studied in genome-wide association studies (GWASs), with hundreds of genome-wide significant loci identified. We performed a whole-exome sequence analysis of the genetics of height, body mass index (BMI) and waist/hip ratio (WHR). We meta-analyzed single-variant and gene-based associations of whole-exome sequence variation with height, BMI, and WHR in up to 22,004 individuals, and we assessed replication of our findings in up to 16,418 individuals from 10 independent cohorts from Trans-Omics for Precision Medicine (TOPMed). We identified four trait associations with single-nucleotide variants (SNVs; two for height and two for BMI) and replicated the LECT2 gene association with height. Our expression quantitative trait locus (eQTL) analysis within previously reported GWAS loci implicated CEP63 and RFT1 as potential functional genes for known height loci. We further assessed enrichment of SNVs, which were monogenic or syndromic variants within loci associated with our three traits. This led to the significant enrichment results for height, whereas we observed no Bonferroni-corrected significance for all SNVs. With a sample size of ~20,000 whole-exome sequences in our discovery dataset, our findings demonstrate the importance of genomic sequencing in genetic association studies, yet they also illustrate the challenges in identifying effects of rare genetic variants.
Subject(s)
Exome , Genome-Wide Association Study , Humans , Exome/genetics , Body Mass Index , Quantitative Trait Loci/genetics , Anthropometry , Intercellular Signaling Peptides and Proteins , Cell Cycle Proteins
ABSTRACT
Most transcriptome-wide association studies (TWASs) so far focus on European ancestry and lack diversity. To overcome this limitation, we aggregated genome-wide association study (GWAS) summary statistics, whole-genome sequences and expression quantitative trait locus (eQTL) data from diverse ancestries. We developed a new approach, TESLA (multi-ancestry integrative study using an optimal linear combination of association statistics), to integrate an eQTL dataset with a multi-ancestry GWAS. By exploiting shared phenotypic effects between ancestries and accommodating potential effect heterogeneities, TESLA improves power over other TWAS methods. When applied to tobacco use phenotypes, TESLA identified 273 new genes, up to 55% more compared with alternative TWAS methods. These hits and subsequent fine mapping using TESLA point to target genes with biological relevance. In silico drug-repurposing analyses highlight several drugs with known efficacy, including dextromethorphan and galantamine, and new drugs such as muscle relaxants that may be repurposed for treating nicotine addiction.
Subject(s)
Drug Repositioning , Transcriptome , Humans , Transcriptome/genetics , Genome-Wide Association Study/methods , Tobacco Use , Biology , Single Nucleotide Polymorphism/genetics , Genetic Predisposition to Disease
ABSTRACT
Obesity is a major public health crisis associated with high mortality rates. Previous genome-wide association studies (GWAS) investigating body mass index (BMI) have largely relied on imputed data from European individuals. This study leveraged whole-genome sequencing (WGS) data from 88,873 participants from the Trans-Omics for Precision Medicine (TOPMed) Program, of which 51% were of non-European population groups. We discovered 18 BMI-associated signals (P < 5 × 10^-9). Notably, we identified and replicated a novel low-frequency single nucleotide polymorphism (SNP) in MTMR3 that was common in individuals of African descent. Using a diverse study population, we further identified two novel secondary signals in known BMI loci and pinpointed two likely causal variants in the POC5 and DMD loci. Our work demonstrates the benefits of combining WGS and diverse cohorts in expanding the current catalog of variants and genes that confer risk for obesity, bringing us one step closer to personalized medicine.
ABSTRACT
Common genetic variants explain less variation in complex phenotypes than inferred from family-based studies, and there is a debate on the source of this 'missing heritability'. We investigated the contribution of rare genetic variants to tobacco use with whole-genome sequences from up to 26,257 unrelated individuals of European ancestries and 11,743 individuals of African ancestries. Across four smoking traits, single-nucleotide-polymorphism-based heritability ([Formula: see text]) was estimated from 0.13 to 0.28 (s.e., 0.10-0.13) in European ancestries, with 35-74% of it attributable to rare variants with minor allele frequencies between 0.01% and 1%. These heritability estimates are 1.5-4 times higher than past estimates based on common variants alone and accounted for 60% to 100% of our pedigree-based estimates of narrow-sense heritability ([Formula: see text], 0.18-0.34). In the African ancestry samples, [Formula: see text] was estimated from 0.03 to 0.33 (s.e., 0.09-0.14) across the four smoking traits. These results suggest that rare variants are important contributors to the heritability of smoking.
Subject(s)
Genome-Wide Association Study , Single Nucleotide Polymorphism , Gene Frequency , Single Nucleotide Polymorphism/genetics , Phenotype , Smoking/genetics
ABSTRACT
Biobanks facilitate genome-wide association studies (GWASs), which have mapped genomic loci across a range of human diseases and traits. However, most biobanks are primarily composed of individuals of European ancestry. We introduce the Global Biobank Meta-analysis Initiative (GBMI), a collaborative network of 23 biobanks from 4 continents representing more than 2.2 million consented individuals with genetic data linked to electronic health records. GBMI meta-analyzes summary statistics from GWASs generated using harmonized genotypes and phenotypes from member biobanks for 14 exemplar diseases and endpoints. This strategy validates that GWASs conducted in diverse biobanks can be integrated despite heterogeneity in case definitions, recruitment strategies, and baseline characteristics. This collaborative effort improves GWAS power for diseases, benefits understudied diseases, and improves risk prediction while also enabling the nomination of disease genes and drug candidates by incorporating gene and protein expression data and providing insight into the underlying biology of human diseases and traits.
ABSTRACT
BACKGROUND: Although epigenetic mechanisms are important risk factors for allergic disease, few studies have evaluated DNA methylation differences associated with atopic dermatitis (AD), and none has focused on AD with eczema herpeticum (ADEH+). We aimed to determine how methylation varies in AD individuals with/without EH and associated traits. We modeled differences in genome-wide DNA methylation in whole blood cells from 90 ADEH+, 83 ADEH-, and 84 non-atopic, healthy control subjects, replicating in 36 ADEH+, 53 ADEH-, and 55 non-atopic healthy control subjects. We adjusted for cell-type composition in our models and used genome-wide and candidate-gene approaches. RESULTS: We replicated one CpG which was significantly differentially methylated by severity, with suggestive replication at four others showing differential methylation by phenotype or severity. Without adjusting for eosinophil content, we identified 490 significantly differentially methylated CpGs (ADEH+ vs healthy controls, genome-wide). Many of these were associated with severity measures, especially eosinophil count (431/490 sites). CONCLUSIONS: We identified a CpG in IL4 associated with serum tIgE levels, supporting a role for Th2 immune-mediating mechanisms in AD. Changes in eosinophil level, a measure of disease severity, are associated with methylation changes, providing a potential mechanism for phenotypic changes in immune response-related traits.
Subject(s)
DNA Methylation , Atopic Dermatitis/genetics , Interleukin-4/genetics , Kaposi Varicelliform Eruption/genetics , Case-Control Studies , CpG Islands , Atopic Dermatitis/immunology , Eosinophils/immunology , Genetic Epigenesis , Female , Genome-Wide Association Study , Humans , Immunoglobulin E/metabolism , Kaposi Varicelliform Eruption/immunology , Male , Severity of Illness Index , Th2 Cells/immunology
ABSTRACT
In the version of this article initially published, the statement "there are no pan-genomes for any other animal or plant species" was incorrect. The statement has been corrected to "there are no reported pan-genomes for any other animal species, to our knowledge." We thank David Edwards for bringing this error to our attention. The error has been corrected in the HTML and PDF versions of the article.
ABSTRACT
We used a deeply sequenced dataset of 910 individuals, all of African descent, to construct a set of DNA sequences that is present in these individuals but missing from the reference human genome. We aligned 1.19 trillion reads from the 910 individuals to the reference genome (GRCh38), collected all reads that failed to align, and assembled these reads into contiguous sequences (contigs). We then compared all contigs to one another to identify a set of unique sequences representing regions of the African pan-genome missing from the reference genome. Our analysis revealed 296,485,284 bp in 125,715 distinct contigs present in the populations of African descent, demonstrating that the African pan-genome contains ~10% more DNA than the current human reference genome. Although the functional significance of nearly all of this sequence is unknown, 387 of the novel contigs fall within 315 distinct protein-coding genes, and the rest appear to be intergenic.