ABSTRACT
Genome-wide association studies have identified >250 genetic variants associated with coronary artery disease (CAD), but the causal variants, genes and molecular mechanisms remain unknown at most loci. We performed pooled CRISPR screens to test the impact of sequences at or near CAD-associated genetic variants on vascular endothelial cell functions. Using CRISPR knockout, inhibition and activation, we targeted 1998 variants at 83 CAD loci to assess their effect on three adhesion proteins (E-selectin, ICAM1, VCAM1) and three key endothelial functions (nitric oxide and reactive oxygen species production, calcium signalling). At a false discovery rate ≤10%, we identified significant CRISPR perturbations near 42 variants located within 26 CAD loci. We used base editing to validate a putative causal variant in the promoter of the FES gene. Although a few of the loci include genes previously characterized in endothelial cells (e.g. AIDA, ARHGEF26, ADAMTS7), most are implicated in endothelial dysfunction for the first time. Detailed characterization of one of these new loci implicated the RNA helicase DHX38 in vascular endothelial cell senescence. While promising, our results also highlighted several limitations in using CRISPR perturbations to functionally dissect GWAS loci, including an unknown false negative rate and potential off-target effects.
Subject(s)
Coronary Artery Disease , Humans , Coronary Artery Disease/genetics , Coronary Artery Disease/metabolism , Genome-Wide Association Study , Quantitative Trait Loci , Endothelial Cells/metabolism , Clustered Regularly Interspaced Short Palindromic Repeats , Single Nucleotide Polymorphism/genetics , Genetic Predisposition to Disease , RNA Splicing Factors/genetics , DEAD-box RNA Helicases/genetics
ABSTRACT
Several of the complications observed in sickle cell disease (SCD) are influenced by variation in hematologic traits (HT), such as fetal hemoglobin (HbF) level and neutrophil count. Previous large-scale genome-wide association studies carried out in largely healthy individuals have identified thousands of variants associated with HT, which have then been used to develop multi-ancestry polygenic trait scores (PTS). Here, we tested whether these PTS associate with HT in SCD patients and if they can improve statistical models associated with SCD-related complications. In 2,056 SCD patients, we found that the PTS predicted less HT variance than in non-SCD individuals of African ancestry. This was particularly striking at the Duffy/DARC locus, where we observed an epistatic interaction between the SCD genotype and the Duffy null variant (rs2814778) that led to a two-fold weaker effect on neutrophil count. PTS for these HT which are measured as part of routine practice were not associated with complications in SCD. In contrast, we found that a simple PTS for HbF that includes only six variants explained a large fraction of the phenotypic variation (20.5-27.1%), associated with acute chest syndrome and stroke risk, and improved the statistical modeling of the vaso-occlusive crisis rate. Using Mendelian randomization, we found that increasing HbF by 4.8% reduces stroke risk by 39% (P=0.0006). Taken together, our results highlight the importance of validating PTS in large diseased populations before proposing their implementation in the context of precision medicine initiatives.
Subject(s)
Sickle Cell Anemia , Stroke , Humans , Multifactorial Inheritance , Genome-Wide Association Study , Sickle Cell Anemia/genetics , Sickle Cell Anemia/complications , Genotype , Fetal Hemoglobin/genetics
ABSTRACT
Most loci identified by GWASs have been found in populations of European ancestry (EUR). In trans-ethnic meta-analyses for 15 hematological traits in 746,667 participants, including 184,535 non-EUR individuals, we identified 5,552 trait-variant associations at p < 5 × 10(-9), including 71 novel associations not found in EUR populations. We also identified 28 additional novel variants in ancestry-specific, non-EUR meta-analyses, including an IL7 missense variant in South Asians associated with lymphocyte count in vivo and IL-7 secretion levels in vitro. Fine-mapping prioritized variants annotated as functional and generated 95% credible sets that were 30% smaller when using the trans-ethnic as opposed to the EUR-only results. We explored the clinical significance and predictive value of trans-ethnic variants in multiple populations and compared genetic architecture and the effect of natural selection on these blood phenotypes between populations. Altogether, our results for hematological traits highlight the value of a more global representation of populations in genetic studies.
Subject(s)
Asian People/genetics , Missense Mutation/genetics , Single Nucleotide Polymorphism/genetics , White People/genetics , Genetics , Genome-Wide Association Study/methods , HEK293 Cells , Humans , Interleukin-7/genetics , Phenotype
ABSTRACT
Blood cells play essential roles in human health, underpinning physiological processes such as immunity, oxygen transport, and clotting, which when perturbed cause a significant global health burden. Here we integrate data from UK Biobank and a large-scale international collaborative effort, including data for 563,085 European ancestry participants, and discover 5,106 new genetic variants independently associated with 29 blood cell phenotypes covering a range of variation impacting hematopoiesis. We holistically characterize the genetic architecture of hematopoiesis, assess the relevance of the omnigenic model to blood cell phenotypes, delineate relevant hematopoietic cell states influenced by regulatory genetic variants and gene networks, identify novel splice-altering variants mediating the associations, and assess the polygenic prediction potential for blood traits and clinical disorders at the interface of complex and Mendelian genetics. These results show the power of large-scale blood cell trait GWAS to interrogate clinically meaningful variants across a wide allelic spectrum of human variation.
Subject(s)
Genetic Predisposition to Disease/genetics , Multifactorial Inheritance/genetics , Female , Gene Regulatory Networks/genetics , Genome-Wide Association Study/methods , Hematopoiesis/genetics , Humans , Male , Phenotype , Single Nucleotide Polymorphism/genetics
ABSTRACT
BACKGROUND: Genome-wide association studies (GWAS) have identified hundreds of loci associated with coronary artery disease (CAD) and blood pressure (BP) or hypertension. Many of these loci are not linked to traditional risk factors, nor do they include obvious candidate genes, complicating their functional characterization. We hypothesize that many GWAS loci associated with vascular diseases modulate endothelial functions. Endothelial cells play critical roles in regulating vascular homeostasis, such as roles in forming a selective barrier, inflammation, hemostasis, and vascular tone, and endothelial dysfunction is a hallmark of atherosclerosis and hypertension. To test this hypothesis, we generate an integrated map of gene expression, open chromatin regions, and 3D interactions in resting and TNFα-treated human endothelial cells. RESULTS: We show that genetic variants associated with CAD and BP are enriched in open chromatin regions identified in endothelial cells. We identify physical loops by Hi-C and link open chromatin peaks that include CAD or BP SNPs with the promoters of genes expressed in endothelial cells. This analysis highlights 991 combinations of open chromatin regions and gene promoters that map to 38 CAD and 92 BP GWAS loci. We validate one CAD locus by engineering a deletion of the TNFα-sensitive regulatory element using CRISPR/Cas9 and measuring the effect on the expression of the novel CAD candidate gene AIDA. CONCLUSIONS: Our data support an important role played by genetic variants acting in the vascular endothelium to modulate inter-individual risk in CAD and hypertension.
Subject(s)
Coronary Artery Disease/genetics , Phospholipid Transfer Proteins/genetics , CRISPR-Cas Systems , Endothelial Cells/metabolism , Epigenomics , Genome-Wide Association Study , Humans , Transcriptional Regulatory Elements , Transcriptome
ABSTRACT
Body-fat distribution is a risk factor for adverse cardiovascular health consequences. We analyzed the association of body-fat distribution, assessed by waist-to-hip ratio adjusted for body mass index, with 228,985 predicted coding and splice site variants available on exome arrays in up to 344,369 individuals from five major ancestries (discovery) and 132,177 European-ancestry individuals (validation). We identified 15 common (minor allele frequency, MAF ≥5%) and nine low-frequency or rare (MAF <5%) coding novel variants. Pathway/gene set enrichment analyses identified lipid particle, adiponectin, abnormal white adipose tissue physiology and bone development and morphology as important contributors to fat distribution, while cross-trait associations highlight cardiometabolic traits. In functional follow-up analyses, specifically in Drosophila RNAi-knockdowns, we observed a significant increase in the total body triglyceride levels for two genes (DNAH10 and PLXND1). We implicate novel genes in fat distribution, stressing the importance of interrogating low-frequency and protein-coding variants.
Subject(s)
Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Homeostasis/genetics , Lipids/genetics , Proteins/genetics , Animals , Body Fat Distribution/methods , Body Mass Index , Case-Control Studies , Drosophila/genetics , Exome/genetics , Female , Gene Frequency/genetics , Genome-Wide Association Study/methods , Humans , Male , Risk Factors , Waist-Hip Ratio/methods
ABSTRACT
BACKGROUND: Macrophage cholesterol efflux to high-density lipoproteins (HDLs) is the first step of reverse cholesterol transport. The cholesterol efflux capacity (CEC) of HDL particles is a protective risk factor for coronary artery disease independent of HDL cholesterol levels. Using a genome-wide association study approach, we aimed to identify pathways that regulate CEC in humans. METHODS AND RESULTS: We measured CEC in 5293 French Canadians. We tested the genetic association between 4 CEC measures and genotypes at >9 million common autosomal DNA sequence variants. These analyses yielded 10 genome-wide significant signals (P<6.25×10(-9)) representing 7 loci. Five of these loci harbor genes with important roles in lipid biology (CETP, LIPC, LPL, APOA1/C3/A4/A5, and APOE/C1/C2/C4). Except for the APOE/C1/C2/C4 variant (rs141622900, P nonadjusted=1.0×10(-11); P adjusted=8.8×10(-9)), the association signals disappear when correcting for HDL cholesterol and triglyceride levels. The additional 2 significant signals were near the PPP1CB/PLB1 and RBFOX3/ENPP7 genes. In secondary analyses, we considered candidate functional variants for 58 genes implicated in HDL biology, as well as 239 variants associated with blood lipid levels and/or coronary artery disease risk by genome-wide association study. These analyses identified 27 significant CEC associations, implicating 5 additional loci (GCKR, LIPG, PLTP, PPARA, and TRIB1). CONCLUSIONS: Our genome-wide association study identified common genetic variation at the APOE/C1/C2/C4 locus as a major determinant of CEC that acts largely independently of HDL cholesterol. We predict that HDL-based therapies aiming at increasing CEC will be modulated by changes in the expression of apolipoproteins in this gene cluster.
Subject(s)
Apolipoproteins C/genetics , Apolipoproteins E/genetics , HDL Cholesterol/metabolism , Cholesterol/metabolism , Coronary Artery Disease/genetics , Macrophages/metabolism , Aged , Apolipoprotein C-I/genetics , Apolipoprotein C-II/genetics , Canada , Case-Control Studies , Coronary Artery Disease/metabolism , Female , Genetic Variation , Genome-Wide Association Study , Humans , Male , Middle Aged
ABSTRACT
BACKGROUND: Genome-wide association studies (GWAS) have identified a variant (rs9349379) at the phosphatase and actin regulator 1 (PHACTR1) locus that is associated with coronary artery disease (CAD). The same variant is also an expression quantitative trait locus (eQTL) for PHACTR1 in human coronary arteries (hCA). Here, we sought to characterize PHACTR1 splicing pattern in atherosclerosis-relevant human cells. We also explored how rs9349379 modulates the expression of the different PHACTR1 splicing isoforms. METHODS: We combined rapid amplification of cDNA ends (RACE) with next-generation long-read DNA sequencing to discover all PHACTR1 transcripts in many human tissues and cell types. We measured PHACTR1 transcripts by qPCR to identify transcript-specific eQTLs. RESULTS: We confirmed a brain-specific long transcript, a short transcript expressed in monocytes and four intermediate transcripts that are different due to alternative splicing of two in-frame exons. In contrast to a previous report, we confirmed that the PHACTR1 protein is present in vascular smooth muscle cells. In 158 hCA from our collection and the GTEx dataset, rs9349379 was only associated with the expression levels of the intermediate PHACTR1 transcripts. CONCLUSIONS: Our comprehensive transcriptomic profiling of PHACTR1 indicates that this gene encodes six main transcripts. Five of them are expressed in hCA, where atherosclerotic plaques develop. In this tissue, genotypes at rs9349379 are associated with the expression of the intermediate transcripts, but not the immune-specific short transcript. This result suggests that rs9349379 may in part influence CAD by modulating the expression of intermediate PHACTR1 transcripts in endothelial or vascular smooth muscle cells found in hCA.
Subject(s)
Alternative Splicing , Atherosclerosis/genetics , Atherosclerosis/pathology , Gene Expression Regulation , Microfilament Proteins/genetics , Vascular Smooth Muscle/metabolism , Quantitative Trait Loci , Cultured Cells , Humans , Vascular Smooth Muscle/cytology , Protein Isoforms
ABSTRACT
Height is a highly heritable, classic polygenic trait with approximately 700 common associated variants identified through genome-wide association studies so far. Here, we report 83 height-associated coding variants with lower minor-allele frequencies (in the range of 0.1-4.8%) and effects of up to 2 centimetres per allele (such as those in IHH, STC2, AR and CRISPLD2), greater than ten times the average effect of common variants. In functional follow-up studies, rare height-increasing alleles of STC2 (giving an increase of 1-2 centimetres per allele) compromised proteolytic inhibition of PAPP-A and increased cleavage of IGFBP-4 in vitro, resulting in higher bioavailability of insulin-like growth factors. These 83 height-associated variants overlap genes that are mutated in monogenic growth disorders and highlight new biological candidates (such as ADAMTS3, IL11RA and NOX4) and pathways (such as proteoglycan and glycosaminoglycan synthesis) involved in growth. Our results demonstrate that sufficiently large sample sizes can uncover rare and low-frequency variants of moderate-to-large effect associated with polygenic human phenotypes, and that these variants implicate relevant genes and pathways.
Subject(s)
Body Height/genetics , Gene Frequency/genetics , Genetic Variation/genetics , ADAMTS Proteins/genetics , Adult , Alleles , Cell Adhesion Molecules/genetics , Female , Human Genome/genetics , Glycoproteins/genetics , Glycoproteins/metabolism , Glycosaminoglycans/biosynthesis , Hedgehog Proteins/genetics , Humans , Intercellular Signaling Peptides and Proteins/genetics , Intercellular Signaling Peptides and Proteins/metabolism , Interferon Regulatory Factors/genetics , Interleukin-11 Receptor alpha Subunit/genetics , Male , Multifactorial Inheritance/genetics , NADPH Oxidase 4 , NADPH Oxidases/genetics , Phenotype , Pregnancy-Associated Plasma Protein-A/metabolism , Procollagen N-Endopeptidase/genetics , Proteoglycans/biosynthesis , Proteolysis , Androgen Receptors/genetics , Somatomedins/metabolism
ABSTRACT
Genome-wide association studies (GWAS) have had tremendous success in the identification of common DNA sequence variants associated with complex human diseases and traits. However, because of their design, GWAS are largely inappropriate to characterize the role of rare and low-frequency DNA variants on human phenotypic variation. Rarer genetic variation is geographically more restricted, supporting the need for local whole-genome sequencing (WGS) efforts to study these variants in specific populations. Here, we present the first large-scale low-pass WGS of the French-Canadian population. Specifically, we sequenced at ~5.6× coverage the whole genome of 1970 French Canadians recruited by the Montreal Heart Institute Biobank and identified 29 million bi-allelic variants (31 % novel), including 19 million variants with a minor allele frequency (MAF) <0.5 %. Genotypes from the WGS data are highly concordant with genotypes obtained by exome array on the same individuals (99.8 %), even when restricting this analysis to rare variants (MAF <0.5 %, 99.9 %) or heterozygous sites (98.9 %). To further validate our data set, we showed that we can effectively use it to replicate several genetic associations with myocardial infarction risk and blood lipid levels. Furthermore, we analyze the utility of our WGS data set to generate a French-Canadian-specific imputation reference panel and to infer population structure in the Province of Quebec. Our results illustrate the value of low-pass WGS to study the genetics of human diseases in the founder French-Canadian population.
Subject(s)
Exome/genetics , Inborn Genetic Diseases/genetics , Genetic Variation , High-Throughput Nucleotide Sequencing , Canada , Gene Frequency , Inborn Genetic Diseases/epidemiology , Human Genome , Genotype , Humans , Phenotype , Quebec
ABSTRACT
OBJECTIVE: Coronary artery disease (CAD), including myocardial infarction (MI), is the main cause of death in the world. Genome-wide association studies have identified dozens of single nucleotide polymorphisms (SNPs) associated with CAD/MI. One of the most robust CAD/MI genetic associations is with intronic SNPs in the gene PHACTR1 on chromosome 6p24. How these PHACTR1 SNPs influence CAD/MI risk, and whether PHACTR1 itself is the causal gene at the locus, is currently unknown. APPROACH AND RESULTS: Using genetic fine-mapping and DNA resequencing experiments, we prioritized an intronic SNP (rs9349379) in PHACTR1 as the causal variant. We showed that this variant is an expression quantitative trait locus for PHACTR1 expression in human coronary arteries. Experiments in endothelial cell extracts confirmed that alleles at rs9349379 are differentially bound by the transcription factors myocyte enhancer factor-2. We engineered a deletion of this myocyte enhancer factor-2-binding site using CRISPR/Cas9 genome-editing methodology. Heterozygous endothelial cells carrying this deletion express 35% less PHACTR1. Finally, we found no evidence that PHACTR1 expression levels are induced when stimulating human endothelial cells with vascular endothelial growth factor, tumor necrosis factor-α, or shear stress. CONCLUSIONS: Our results establish a link between intronic SNPs in PHACTR1, myocyte enhancer factor-2 binding, and transcriptional functions at the locus, PHACTR1 expression levels in coronary arteries and CAD/MI risk. Because PHACTR1 SNPs are not associated with the traditional risk factors for CAD/MI (eg, blood lipids or pressure, diabetes mellitus), our results suggest that PHACTR1 may influence CAD/MI risk through as yet unknown mechanisms in the vascular endothelium.
Subject(s)
Human Chromosome Pair 6/genetics , Coronary Vessels/metabolism , MEF2 Transcription Factors/metabolism , Microfilament Proteins/metabolism , Myocardial Infarction/genetics , Single Nucleotide Polymorphism , Alleles , Endothelial Cells/metabolism , Vascular Endothelium/metabolism , Genome-Wide Association Study , Humans , Myocardial Infarction/metabolism , Umbilicus/blood supply , Veins
ABSTRACT
BACKGROUND: Dilated cardiomyopathy (DCM) is a major cause of heart failure that may require heart transplantation. Approximately one third of DCM cases are familial. Next-generation DNA sequencing of large panels of candidate genes (ie, targeted sequencing) or of the whole exome can rapidly and economically identify pathogenic mutations in familial DCM. METHODS: We recruited 64 individuals from 26 DCM families followed at the Montreal Heart Institute Cardiovascular Genetic Center and sequenced the whole exome of 44 patients and 2 controls. Both affected and unaffected family members underwent genotyping for segregation analysis. RESULTS: We found 2 truncating mutations in BAG3 in 4 DCM families (15%) and confirmed segregation with disease status by linkage (log of the odds [LOD] score = 3.8). BAG3 nonsense mutations conferred a worse prognosis as evidenced by a younger age of clinical onset (37 vs 48 years for carriers and noncarriers respectively; P = 0.037). We also found truncating mutations in TTN in 5 families (19%). Finally, we identified potential pathogenic mutations for 9 DCM families in 6 candidate genes (DSP, LMNA, MYH7, MYPN, RBM20, and TNNT2). We still need to confirm several of these mutations by segregation analysis. CONCLUSIONS: Screening an extended panel of 41 candidate genes allowed us to identify probable pathogenic mutations in 69% of families with DCM in our cohort of mostly French-Canadian patients. We confirmed the prevalence of TTN nonsense mutations in DCM. Furthermore, to our knowledge, we are the first to present an association between nonsense mutations in BAG3 and early-onset DCM.
Subject(s)
Signal Transducing Adaptor Proteins/genetics , Apoptosis Regulatory Proteins/genetics , Dilated Cardiomyopathy/genetics , Nonsense Codon , DNA/genetics , Signal Transducing Adaptor Proteins/metabolism , Adult , Age of Onset , Apoptosis Regulatory Proteins/metabolism , Canada/epidemiology , Dilated Cardiomyopathy/ethnology , Dilated Cardiomyopathy/metabolism , DNA Mutational Analysis , Female , France/ethnology , Genetic Linkage , Genotype , Heterozygote , Humans , Male , Middle Aged , Pedigree , Phenotype
ABSTRACT
Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ~2,000, ~3,700 and ~9,500 SNPs explained ~21%, ~24% and ~29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.
Subject(s)
Body Height/genetics , Genetic Variation/genetics , Single Nucleotide Polymorphism/genetics , White People/genetics , Adult , Analysis of Variance , Population Genetics , Genome-Wide Association Study/methods , Humans , Oligonucleotide Array Sequence Analysis
ABSTRACT
Characterization of the epigenome promises to yield the functional elements buried in the human genome sequence, thus helping to annotate non-coding DNA polymorphisms with regulatory functions. Here, we develop two novel strategies to combine epigenomic data with transcriptomic profiles in humans or mice to prioritize potential candidate SNPs associated with lipid levels by genome-wide association study (GWAS). First, after confirming that lipid-associated loci that are also expression quantitative trait loci (eQTL) in human livers are enriched for ENCODE regulatory marks in the human hepatocellular HepG2 cell line, we prioritize candidate SNPs based on the number of these marks that overlap the variant position. This method recognized the known SORT1 rs12740374 regulatory SNP associated with LDL-cholesterol, and highlighted candidate functional SNPs at 15 additional lipid loci. In the second strategy, we combine ENCODE chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq) data and liver expression datasets from knockout mice lacking specific transcription factors. This approach identified SNPs in specific transcription factor binding sites that are located near target genes of these transcription factors. We show that FOXA2 transcription factor binding sites are enriched at lipid-associated loci and experimentally validate that alleles of one such proxy SNP located near the FOXA2 target gene BIRC5 show allelic differences in FOXA2-DNA binding and enhancer activity. These methods can be used to generate testable hypotheses for many non-coding SNPs associated with complex diseases or traits.
Subject(s)
Epigenomics/methods , Genetic Association Studies/methods , Lipids/chemistry , Liver/metabolism , Transcription Factors/genetics , Alleles , Animals , Binding Sites , Chromatin Immunoprecipitation/methods , Chromosome Mapping , Human Genome , Hep G2 Cells , Hepatocyte Nuclear Factor 3-beta/genetics , Hepatocyte Nuclear Factor 3-beta/metabolism , Hepatocyte Nuclear Factor 4/genetics , Hepatocyte Nuclear Factor 4/metabolism , High-Throughput Nucleotide Sequencing/methods , Humans , K562 Cells , Mice , Knockout Mice , Phenotype , Single Nucleotide Polymorphism , Protein Binding/genetics , Quantitative Trait Loci , Transcription Factors/metabolism , Transcriptome/genetics
ABSTRACT
Hematological traits are important clinical parameters. To test the effects of rare and low-frequency coding variants on hematological traits, we analyzed hemoglobin concentration, hematocrit levels, white blood cell (WBC) counts and platelet counts in 31,340 individuals genotyped on an exome array. We identified several missense variants in CXCR2 associated with reduced WBC count (gene-based P = 2.6 × 10(-13)). In a separate family-based resequencing study, we identified a CXCR2 frameshift mutation in a pedigree with congenital neutropenia that abolished ligand-induced CXCR2 signal transduction and chemotaxis. We also identified missense or splice-site variants in key hematopoiesis regulators (EPO, TFR2, HBB, TUBB1 and SH2B3) associated with blood cell traits. Finally, we were able to detect associations between a rare somatic JAK2 mutation (encoding p.Val617Phe) and platelet count (P = 3.9 × 10(-22)) as well as hemoglobin concentration (P = 0.002), hematocrit levels (P = 9.5 × 10(-7)) and WBC count (P = 3.1 × 10(-5)). In conclusion, exome arrays complement genome-wide association studies in identifying new variants that contribute to complex human traits.
Subject(s)
Hemoglobins/genetics , Leukocyte Count , Neutropenia/congenital , Platelet Count , Interleukin-8B Receptors/genetics , Adult , Aged , Chemotaxis , Congenital Bone Marrow Failure Syndromes , Exome , Female , Frameshift Mutation , Genome-Wide Association Study , Genotype , Hematocrit , Hematopoiesis , Humans , Janus Kinase 2/genetics , Male , Middle Aged , Missense Mutation , Neutropenia/genetics , Pedigree
ABSTRACT
Genome-wide association studies and follow-up meta-analyses in Crohn's disease (CD) and ulcerative colitis (UC) have recently identified 163 disease-associated loci that meet genome-wide significance for these two inflammatory bowel diseases (IBD). These discoveries have already had a tremendous impact on our understanding of the genetic architecture of these diseases and have directed functional studies that have revealed some of the biological functions that are important to IBD (e.g. autophagy). Nonetheless, these loci can only explain a small proportion of disease variance (~14% in CD and 7.5% in UC), suggesting that not only are additional loci to be found but that the known loci may contain high effect rare risk variants that have gone undetected by GWAS. To test this, we have used a targeted sequencing approach in 200 UC cases and 150 healthy controls (HC), all of French Canadian descent, to study 55 genes in regions associated with UC. We performed follow-up genotyping of 42 rare non-synonymous variants in independent case-control cohorts (totaling 14,435 UC cases and 20,204 HC). Our results confirmed significant association to rare non-synonymous coding variants in both IL23R and CARD9, previously identified from sequencing of CD loci, as well as identified a novel association in RNF186. With the exception of CARD9 (OR = 0.39), the rare non-synonymous variants identified were of moderate effect (OR = 1.49 for RNF186 and OR = 0.79 for IL23R). RNF186 encodes a protein with a RING domain having predicted E3 ubiquitin-protein ligase activity and two transmembrane domains. Importantly, the disease-coding variant is located in the ubiquitin ligase domain. Finally, our results suggest that rare variants in genes identified by genome-wide association in UC are unlikely to contribute significantly to the overall variance for the disease. Rather, these are expected to help focus functional studies of the corresponding disease loci.
Subject(s)
CARD Signaling Adaptor Proteins/genetics , Ulcerative Colitis/genetics , Crohn Disease/genetics , Genome-Wide Association Study , Interleukin Receptors/genetics , Ubiquitin-Protein Ligases/genetics , Canada , Ulcerative Colitis/pathology , Crohn Disease/pathology , Ethnicity , Genetic Predisposition to Disease , High-Throughput Nucleotide Sequencing , Humans , Single Nucleotide Polymorphism
ABSTRACT
BACKGROUND: Familial history is a strong risk factor for coronary artery disease (CAD), especially for early-onset myocardial infarction (MI). Several genes and chromosomal regions have been implicated in the genetic cause of coronary artery disease/MI, mostly through the discovery of familial mutations implicated in hyper-/hypocholesterolemia by linkage studies and single nucleotide polymorphisms by genome-wide association studies. Except for a few examples (eg, PCSK9), the role of low-frequency genetic variation (minor allele frequency [MAF] ≈0.1%-5%) on MI/coronary artery disease predisposition has not been extensively investigated. METHODS AND RESULTS: We selected 68 candidate genes and sequenced their exons (394 kb) in 500 early-onset MI cases and 500 matched controls, all of French-Canadian ancestry, using solution-based capture in pools of nonindexed DNA samples. In these regions, we identified 1852 single nucleotide variants (695 novel) and captured 85% of the variants with MAF≥1% found by the 1000 Genomes Project in European-ancestry individuals. Using gene-based association testing, we prioritized for follow-up 29 low-frequency variants in 8 genes and attempted to genotype them for replication in 1594 MI cases and 2988 controls from 2 French-Canadian panels. Our pilot association analysis of low-frequency variants in 68 candidate genes did not identify genes with large effect on MI risk in French Canadians. CONCLUSIONS: We have optimized a strategy, applicable to all complex diseases and traits, to discover efficiently and cost-effectively DNA sequence variants in large populations. Resequencing endeavors to find low-frequency variants implicated in common human diseases are likely to require very large sample size.
Subject(s)
Myocardial Infarction/genetics, White People/genetics, Adult, Aged, Canada, Cohort Studies, Exons, Female, Gene Frequency, Genetic Linkage, Genome-Wide Association Study, Genotype, Humans, Male, Middle Aged, Single Nucleotide Polymorphism, Risk Factors, DNA Sequence Analysis
ABSTRACT
More than 1,000 susceptibility loci have been identified through genome-wide association studies (GWAS) of common variants; however, the specific genes and full allelic spectrum of causal variants underlying these findings have not yet been defined. Here we used pooled next-generation sequencing to study 56 genes from regions associated with Crohn's disease in 350 cases and 350 controls. Through follow-up genotyping of 70 rare and low-frequency protein-altering variants in nine independent case-control series (16,054 Crohn's disease cases, 12,153 ulcerative colitis cases and 17,575 healthy controls), we identified four additional independent risk factors in NOD2, two additional protective variants in IL23R, a highly significant association with a protective splice variant in CARD9 (P < 1 × 10⁻¹⁶, odds ratio ≈ 0.29) and additional associations with coding variants in IL18RAP, CUL2, C1orf106, PTPN22 and MUC19. We extend the results of successful GWAS by identifying new, rare and probably functional variants that could aid functional experiments and predictive models.
Subject(s)
Genome-Wide Association Study, Inflammatory Bowel Diseases/genetics, DNA Sequence Analysis, Case-Control Studies, Cell Line, Genetic Predisposition to Disease, Humans, NOD2 Signaling Adaptor Protein/genetics, RNA Splicing, Interleukin Receptors/genetics
ABSTRACT
Red blood cell, white blood cell, and platelet measures, including their count, sub-type and volume, are important diagnostic and prognostic clinical parameters for several human diseases. To identify novel loci associated with hematological traits, and to compare the architecture of these phenotypes between ethnic groups, the CARe Project genotyped 49,094 single nucleotide polymorphisms (SNPs) that capture variation in ~2,100 candidate genes in DNA of 23,439 Caucasians and 7,112 African Americans from five population-based cohorts. We found strong novel associations between erythrocyte phenotypes and the glucose-6-phosphate dehydrogenase (G6PD) A-allele in African Americans (rs1050828, P<2.0×10⁻¹³, T-allele associated with lower red blood cell count, hemoglobin, and hematocrit, and higher mean corpuscular volume), and between platelet count and a SNP at the tropomyosin-4 (TPM4) locus (rs8109288, P=3.0×10⁻⁷ in Caucasians; P=3.0×10⁻⁷ in African Americans, T-allele associated with lower platelet count). We strongly replicated many genetic associations with blood cell phenotypes previously established in Caucasians. A common variant of the α-globin (HBA2-HBA1) locus was associated with red blood cell traits in African Americans, but not in Caucasians (rs1211375, P<7×10⁻⁸, A-allele associated with lower hemoglobin, mean corpuscular hemoglobin, and mean corpuscular volume). Our results show similarities but also differences in the genetic regulation of hematological traits in European- and African-derived populations, and highlight the role of natural selection in shaping these differences.
Subject(s)
Black or African American/genetics, Blood Cell Count, Genetic Association Studies, Genetic Loci, Heritable Quantitative Trait, White People/genetics, alpha-Globulins/genetics, Cohort Studies, Erythrocyte Count, Erythrocyte Indices/genetics, Erythrocytes/cytology, Erythrocytes/enzymology, Female, Glucosephosphate Dehydrogenase/genetics, Hemoglobins/genetics, Humans, Male, Middle Aged, Platelet Count, Single Nucleotide Polymorphism/genetics
ABSTRACT
BACKGROUND: Polymerase chain reaction (PCR) remains a simple, flexible, and inexpensive method for enriching genomic regions of interest for next-generation sequencing. A major challenge in using PCR this way is generating the very large number of functional primers needed to produce usable amplicons. For instance, an exon-only re-sequencing project targeting 100 genes, each with 10 exons, requires 1,000 primer pairs. In practice the task is often more complex, because each gene may have several isoforms and large exons must be divided to maintain the desired amplicon size. Given only a list of gene names, our program Optimus Primer (OP) automatically takes all these variables into account and generates primers without requiring genome coordinates. More importantly, unlike other primer design programs, OP runs Primer3 iteratively, allowing up to four rounds of primer design. Through a single interface, the user can specify up to four different design parameters with different stringencies, thus increasing the probability that a functional PCR primer pair will be designed for every region of interest in a single pass of the pipeline.
FINDINGS: To demonstrate the effectiveness of the program, we designed PCR primers against 77 genes located in loci associated with ulcerative colitis as part of a candidate gene re-sequencing experiment. We achieved an experimental success rate of 93% (472 of 508 amplicons spanning the exonic regions of the 77 genes). Moreover, by automatically passing amplicons that failed primer design through three additional iterations of design parameters, we obtained 170 additional successful primer pairs, 34% more in a single pass of OP than by conventional methods.
CONCLUSION: With only a gene list and PCR parameters, a user can produce hundreds of PCR primer designs for regions of interest with a high probability of success in a very short amount of time. Optimus Primer is an essential tool for researchers who want to pursue PCR-based enrichment strategies for next-generation re-sequencing applications. The program is available at http://op.pgx.ca.
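The iterative fallback described in the Optimus Primer abstract can be illustrated with a minimal sketch. This is not the program's actual code: `design_iteratively`, `toy_design`, and the `gc_min`/`gc_max` keys are hypothetical names chosen for illustration, and the toy design function only stands in for a real call to Primer3. The sketch shows the core idea, which is that only regions failing under one parameter set are retried under the next, less stringent one.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types for illustration: a parameter set is a plain dict,
# and a primer pair is a (forward, reverse) tuple of strings.
ParamSet = Dict[str, float]
PrimerPair = Tuple[str, str]
DesignFn = Callable[[str, ParamSet], Optional[PrimerPair]]


def design_iteratively(
    regions: List[str],
    param_sets: List[ParamSet],
    design_fn: DesignFn,
) -> Tuple[Dict[str, Tuple[int, PrimerPair]], List[str]]:
    """Try each parameter set in order; only failed regions move to the next round.

    Returns a mapping region -> (round index, primer pair) and the list of
    regions that failed under every parameter set.
    """
    designed: Dict[str, Tuple[int, PrimerPair]] = {}
    pending = list(regions)
    for round_index, params in enumerate(param_sets):
        still_failing = []
        for region in pending:
            pair = design_fn(region, params)
            if pair is None:
                still_failing.append(region)
            else:
                designed[region] = (round_index, pair)
        pending = still_failing
        if not pending:
            break
    return designed, pending


def toy_design(region: str, params: ParamSet) -> Optional[PrimerPair]:
    """Toy stand-in for a primer designer (assumption, not Primer3).

    "Succeeds" only when the region's GC fraction lies inside the allowed
    window, and returns placeholder strings rather than real primers.
    """
    gc = (region.count("G") + region.count("C")) / len(region)
    if params["gc_min"] <= gc <= params["gc_max"]:
        return (region[:4], region[-4:])
    return None


# Two parameter sets, strict first and relaxed second.
example_params = [
    {"gc_min": 0.35, "gc_max": 0.65},
    {"gc_min": 0.15, "gc_max": 0.85},
]
example_regions = ["ATGCATGCAT", "GGGCGCGCAT", "ATATATATAT"]

designed, failed = design_iteratively(example_regions, example_params, toy_design)
print(designed)
print(failed)
```

Passing the design function in as an argument keeps the retry logic independent of any particular primer design tool, so the same loop would apply whether one, two, or four parameter sets are supplied.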