Abstract
High-throughput proteomics platforms measuring thousands of proteins in plasma combined with genomic and phenotypic information have the power to bridge the gap between the genome and diseases. Here we performed association studies of Olink Explore 3072 data generated by the UK Biobank Pharma Proteomics Project1 on plasma samples from more than 50,000 UK Biobank participants with phenotypic and genotypic data, stratifying on British or Irish, African and South Asian ancestries. We compared the results with those of a SomaScan v4 study on plasma from 36,000 Icelandic people2, for 1,514 of whom Olink data were also available. We found modest correlation between the two platforms. Although cis protein quantitative trait loci were detected for a similar absolute number of assays on the two platforms (2,101 on Olink versus 2,120 on SomaScan), the proportion of assays with such supporting evidence for assay performance was higher on the Olink platform (72% versus 43%). A considerable number of proteins had genomic associations that differed between the platforms. We provide examples where differences between platforms may influence conclusions drawn from the integration of protein levels with the study of diseases. We demonstrate how leveraging the diverse ancestries of participants in the UK Biobank helps to detect novel associations and refine genomic location. Our results show the value of the information provided by the two most commonly used high-throughput proteomics platforms and demonstrate the differences between them that at times provide useful complementarity.
Subjects
Blood Proteins , Disease Susceptibility , Genomics , Genotype , Phenotype , Proteomics , Humans , Africa/ethnology , South Asia/ethnology , Biological Specimen Banks , Blood Proteins/analysis , Blood Proteins/genetics , Datasets as Topic , Human Genome/genetics , Iceland/ethnology , Ireland/ethnology , Plasma/chemistry , Proteome/analysis , Proteome/genetics , Proteomics/methods , Quantitative Trait Loci , United Kingdom
Abstract
Detailed knowledge of how diversity in the sequence of the human genome affects phenotypic diversity depends on a comprehensive and reliable characterization of both sequences and phenotypic variation. Over the past decade, insights into this relationship have been obtained from whole-exome sequencing or whole-genome sequencing of large cohorts with rich phenotypic data1,2. Here we describe the analysis of whole-genome sequencing of 150,119 individuals from the UK Biobank3. This constitutes a set of high-quality variants, including 585,040,410 single-nucleotide polymorphisms, representing 7.0% of all possible human single-nucleotide polymorphisms, and 58,707,036 indels. This large set of variants allows us to characterize selection based on sequence variation within a population through a depletion rank score of windows along the genome. Depletion rank analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UK Biobank: a large British Irish cohort, a smaller African cohort and a South Asian cohort. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large-scale whole-genome sequencing studies. Using this formidable new resource, we provide several examples of trait associations for rare variants with large effects not found previously through studies based on whole-exome sequencing and/or imputation.
Subjects
Biological Specimen Banks , Genetic Databases , Genetic Variation , Human Genome , Genomics , Whole Genome Sequencing , Africa/ethnology , Asia/ethnology , Cohort Studies , Conserved Sequence , Exons/genetics , Human Genome/genetics , Haplotypes/genetics , Humans , INDEL Mutation , Ireland/ethnology , Microsatellite Repeats , Single Nucleotide Polymorphism/genetics , United Kingdom
Abstract
BACKGROUND: In 2021, the American College of Medical Genetics and Genomics (ACMG) recommended reporting actionable genotypes in 73 genes associated with diseases for which preventive or therapeutic measures are available. Evaluations of the association of actionable genotypes in these genes with life span are currently lacking. METHODS: We assessed the prevalence of coding and splice variants in genes on the ACMG Secondary Findings, version 3.0 (ACMG SF v3.0), list in the genomes of 57,933 Icelanders. We assigned pathogenicity to all reviewed variants using reported evidence in the ClinVar database, the frequency of variants, and their associations with disease to create a manually curated set of actionable genotypes (variants). We assessed the relationship between these genotypes and life span and further examined the specific causes of death among carriers. RESULTS: Through manual curation of 4405 sequence variants in the ACMG SF v3.0 genes, we identified 235 actionable genotypes in 53 genes. Of the 57,933 participants, 2306 (4.0%) carried at least one actionable genotype. We found shorter median survival among persons carrying actionable genotypes than among noncarriers. Specifically, we found that carrying an actionable genotype in a cancer gene was associated with survival that was 3 years shorter than that among noncarriers, with causes of death among carriers attributed primarily to cancer-related conditions. Furthermore, we found evidence of association between carrying an actionable genotype in certain genes in the cardiovascular disease group and a reduced life span. CONCLUSIONS: On the basis of the ACMG SF v3.0 guidelines, we found that approximately 1 in 25 Icelanders carried an actionable genotype and that carrying such a genotype was associated with a reduced life span. (Funded by deCODE Genetics-Amgen.).
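The carrier frequency quoted in the abstract above can be checked with a few lines of arithmetic. This is only an illustrative sketch using the two counts stated in the abstract; the variable names are ours and are not part of the study.

```python
# Counts quoted in the abstract above (variable names are illustrative).
participants = 57_933   # genomes assessed
carriers = 2_306        # carry at least one actionable genotype

prevalence = carriers / participants   # fraction of participants who are carriers
one_in_n = participants / carriers     # the same figure expressed as "1 in N"

print(f"prevalence = {prevalence:.1%}, roughly 1 in {one_in_n:.0f}")
```

Both forms agree with the abstract's "4.0%" and "approximately 1 in 25".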
Subjects
Disease , Genomics , Longevity , Humans , Alleles , Genetic Testing , Genetic Variation , Genotype , Iceland/epidemiology , Longevity/genetics , Disease/genetics , Cardiovascular Diseases/genetics , Neoplasms/genetics
Abstract
Mosaic loss of chromosome Y (LOY) in circulating white blood cells is the most common form of clonal mosaicism1-5, yet our knowledge of the causes and consequences of this is limited. Here, using a computational approach, we estimate that 20% of the male population represented in the UK Biobank study (n = 205,011) has detectable LOY. We identify 156 autosomal genetic determinants of LOY, which we replicate in 757,114 men of European and Japanese ancestry. These loci highlight genes that are involved in cell-cycle regulation and cancer susceptibility, as well as somatic drivers of tumour growth and targets of cancer therapy. We demonstrate that genetic susceptibility to LOY is associated with non-haematological effects on health in both men and women, which supports the hypothesis that clonal haematopoiesis is a biomarker of genomic instability in other tissues. Single-cell RNA sequencing identifies dysregulated expression of autosomal genes in leukocytes with LOY and provides insights into why clonal expansion of these cells may occur. Collectively, these data highlight the value of studying clonal mosaicism to uncover fundamental mechanisms that underlie cancer and other ageing-related diseases.
Subjects
Chromosome Deletion , Human Y Chromosomes/genetics , Genetic Predisposition to Disease/genetics , Genomic Instability/genetics , Leukocytes/pathology , Mosaicism , Adult , Aged , Computational Biology , Genetic Databases , Female , Genetic Markers/genetics , Humans , Male , Middle Aged , Neoplasms/genetics , United Kingdom
Abstract
The characterization of mutational processes that generate sequence diversity in the human genome is of paramount importance both to medical genetics and to evolutionary studies. To understand how the age and sex of transmitting parents affect de novo mutations, here we sequence 1,548 Icelanders, their parents, and, for a subset of 225, at least one child, to 35× genome-wide coverage. We find 108,778 de novo mutations, both single nucleotide polymorphisms and indels, and determine the parent of origin of 42,961. The number of de novo mutations from mothers increases by 0.37 per year of age (95% CI 0.32-0.43), a quarter of the 1.51 per year from fathers (95% CI 1.45-1.57). The number of clustered mutations increases faster with the mother's age than with the father's, and the genomic span of maternal de novo mutation clusters is greater than that of paternal ones. The types of de novo mutation from mothers change substantially with age, with a 0.26% (95% CI 0.19-0.33%) decrease in cytosine-phosphate-guanine to thymine-phosphate-guanine (CpG>TpG) de novo mutations and a 0.33% (95% CI 0.28-0.38%) increase in C>G de novo mutations per year, respectively. Remarkably, these age-related changes are not distributed uniformly across the genome. A striking example is a 20 megabase region on chromosome 8p, with a maternal C>G mutation rate that is up to 50-fold greater than the rest of the genome. The age-related accumulation of maternal non-crossover gene conversions also mostly occurs within these regions. Increased sequence diversity and linkage disequilibrium of C>G variants within regions affected by excess maternal mutations indicate that the underlying mutational process has persisted in humans for thousands of years. Moreover, the regional excess of C>G variation in humans is largely shared by chimpanzees, less by gorillas, and is almost absent from orangutans. 
This demonstrates that sequence diversity in humans results from evolving interactions between age, sex, mutation type, and genomic location.
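The per-year rates reported above lend themselves to a small worked example. The sketch below uses only the two point estimates from the abstract; the helper `extra_dnms` and the assumption of a strictly linear age effect are ours, made for illustration.

```python
# Point estimates quoted in the abstract (de novo mutations per year of parental age).
maternal_per_year = 0.37
paternal_per_year = 1.51

# The abstract describes the maternal rate as "a quarter" of the paternal rate.
ratio = maternal_per_year / paternal_per_year

def extra_dnms(years_older: float, rate_per_year: float) -> float:
    """Expected additional de novo mutations, assuming a linear age effect."""
    return years_older * rate_per_year

print(f"maternal/paternal ratio = {ratio:.3f}")
print(f"10 extra paternal years: {extra_dnms(10, paternal_per_year):.1f} more mutations")
print(f"10 extra maternal years: {extra_dnms(10, maternal_per_year):.1f} more mutations")
```

Under that linear assumption, a ten-year difference in the father's age contributes about four times as many additional mutations as the same difference in the mother's age.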
Subjects
Aging/genetics , Germ-Line Mutation/genetics , Maternal Age , Mutagenesis , Parents , Paternal Age , Adolescent , Adult , Aged , Animals , Child , Human Chromosome Pair 8/genetics , Molecular Evolution , Female , GC Rich Sequence , Human Genome/genetics , Gorilla gorilla/genetics , Humans , INDEL Mutation , Iceland , Linkage Disequilibrium/genetics , Male , Middle Aged , Mutation Rate , Pan troglodytes/genetics , Single Nucleotide Polymorphism , Pongo/genetics , Young Adult
Abstract
Epidemiological and genetic association studies show that genetics play an important role in the attainment of education. Here, we investigate the effect of this genetic component on the reproductive history of 109,120 Icelanders and the consequent impact on the gene pool over time. We show that an educational attainment polygenic score, POLYEDU, constructed from results of a recent study is associated with delayed reproduction (P < 10^-100) and fewer children overall. The effect is stronger for women and remains highly significant after adjusting for educational attainment. Based on 129,808 Icelanders born between 1910 and 1990, we find that the average POLYEDU has been declining at a rate of ~0.010 standard units per decade, which is substantial on an evolutionary timescale. Most importantly, because POLYEDU only captures a fraction of the overall underlying genetic component, the latter could be declining at a rate that is two to three times faster.
Subjects
Educational Status , Genetic Variation , Adolescent , Adult , Female , Fertility , Human Genome , Genotype , Humans , Iceland , Intelligence , Male , Young Adult
Abstract
Common sequence variants at the haptoglobin gene (HP) have been associated with blood lipid levels. Through whole-genome sequencing of 8,453 Icelanders, we discovered a splice donor founder mutation in HP (NM_001126102.1:c.190+1G>C, minor allele frequency = 0.56%). This mutation occurs on the HP1 allele of the common copy number variant in HP and leads to a loss of function of HP1. It associates with lower levels of haptoglobin (P = 2.1 × 10^-54), higher levels of non-high density lipoprotein cholesterol (β = 0.26 mmol/l, P = 2.6 × 10^-9) and greater risk of coronary artery disease (odds ratio = 1.30, 95% confidence interval: 1.10-1.54, P = 0.0024). Through haplotype analysis and with RNA sequencing, we provide evidence of a causal relationship between one of the two haptoglobin isoforms, namely Hp1, and lower levels of non-HDL cholesterol. Furthermore, we show that the HP1 allele associates with various other quantitative biological traits.
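The odds ratio, confidence interval and P value reported above for coronary artery disease are related through the usual normal approximation on the log-odds scale. The sketch below shows that relationship. It is a generic textbook calculation, not the authors' method, and the function name `p_from_or_ci` is hypothetical.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def p_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float):
    """Approximate the standard error and two-sided P value from an OR and its 95% CI.

    Assumes the interval is symmetric on the log-odds scale (Wald-type interval).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return se, z, p

# Values quoted in the abstract for coronary artery disease.
se, z, p = p_from_or_ci(1.30, 1.10, 1.54)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, P = {p:.4f}")
```

The recovered P value is of the same order as the reported 0.0024. Exact agreement is not expected because the published odds ratio and interval are rounded.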
Subjects
Coronary Artery Disease/genetics , Haptoglobins/genetics , Adult , Alleles , Base Sequence , Coronary Artery Disease/metabolism , DNA Copy Number Variations/genetics , Female , Gene Frequency/genetics , Genetic Association Studies/methods , Genetic Variation , Haptoglobins/metabolism , Humans , Iceland , Lipids/blood , Lipids/genetics , Lipoproteins/genetics , Male , Mutation , Odds Ratio , RNA Splice Sites/genetics , Risk Factors
Abstract
Clonal hematopoiesis (CH) arises when a substantial proportion of mature blood cells is derived from a single dominant hematopoietic stem cell lineage. Somatic mutations in candidate driver (CD) genes are thought to be responsible for at least some cases of CH. Using whole-genome sequencing of 11 262 Icelanders, we found 1403 cases of CH by using barcodes of mosaic somatic mutations in peripheral blood, whether or not they have a mutation in a CD gene. We find that CH is very common in the elderly, trending toward inevitability. We show that somatic mutations in TET2, DNMT3A, ASXL1, and PPM1D are associated with CH at high significance. However, known CD mutations were evident in only a fraction of CH cases. Nevertheless, the highly prevalent CH we detect associates with increased mortality rates, risk for hematological malignancy, smoking behavior, telomere length, Y-chromosome loss, and other phenotypic characteristics. Modeling suggests some CH cases could arise in the absence of CD mutations as a result of neutral drift acting on a small population of active hematopoietic stem cells. Finally, we find a germline deletion in intron 3 of the telomerase reverse transcriptase (TERT) gene that predisposes to CH (rs34002450; P = 7.4 × 10^-12; odds ratio, 1.37).
Subjects
DNA (Cytosine-5-)-Methyltransferases/genetics , DNA-Binding Proteins/genetics , Hematopoiesis , Hematopoietic Stem Cells/cytology , Mutation , Protein Phosphatase 2C/genetics , Proto-Oncogene Proteins/genetics , Repressor Proteins/genetics , Adult , Age Factors , Aged , Aged, 80 and over , Clone Cells , DNA Methyltransferase 3A , Dioxygenases , Female , Hematologic Neoplasms/epidemiology , Hematologic Neoplasms/genetics , Hematopoietic Stem Cells/metabolism , Humans , Male , Middle Aged , Risk Factors
Abstract
Transcriptional and splicing anomalies have been observed in intron 8 of the CASP8 gene (encoding procaspase-8) in association with cutaneous basal-cell carcinoma (BCC) and linked to a germline SNP rs700635. Here, we show that the rs700635[C] allele, which is associated with increased risk of BCC and breast cancer, is protective against prostate cancer [odds ratio (OR) = 0.91, P = 1.0 × 10^-6]. rs700635[C] is also associated with failures to correctly splice out CASP8 intron 8 in breast and prostate tumours and in corresponding normal tissues. Investigation of rs700635[C] carriers revealed that they have a human-specific short interspersed element-variable number of tandem repeat-Alu (SINE-VNTR-Alu), subfamily-E retrotransposon (SVA-E) inserted into CASP8 intron 8. The SVA-E shows evidence of prior activity, because it has transduced some CASP8 sequences during subsequent retrotransposition events. Whole-genome sequence (WGS) data were used to tag the SVA-E with a surrogate SNP rs1035142[T] (r^2 = 0.999), which showed associations both with the splicing anomalies (P = 6.5 × 10^-32) and with protection against prostate cancer (OR = 0.91, P = 3.8 × 10^-7).
Subjects
Breast Neoplasms/genetics , Basal Cell Carcinoma/genetics , Caspase 8/genetics , Prostatic Neoplasms/genetics , RNA Splicing , Retroelements , Skin Neoplasms/genetics , Adult , Aged , Aged, 80 and over , Alleles , Base Sequence , Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Basal Cell Carcinoma/metabolism , Basal Cell Carcinoma/pathology , Caspase 8/metabolism , Female , Genome-Wide Association Study , Humans , Introns , Male , Middle Aged , Molecular Sequence Data , Odds Ratio , Single Nucleotide Polymorphism , Prostatic Neoplasms/metabolism , Prostatic Neoplasms/pathology , Prostatic Neoplasms/prevention & control , Protective Factors , Skin Neoplasms/metabolism , Skin Neoplasms/pathology
Abstract
Common human diseases result from the interplay of many genes and environmental factors. Therefore, a more integrative biology approach is needed to unravel the complexity and causes of such diseases. To elucidate the complexity of common human diseases such as obesity, we have analysed the expression of 23,720 transcripts in large population-based blood and adipose tissue cohorts comprehensively assessed for various phenotypes, including traits related to clinical obesity. In contrast to the blood expression profiles, we observed a marked correlation between gene expression in adipose tissue and obesity-related traits. Genome-wide linkage and association mapping revealed a highly significant genetic component to gene expression traits, including a strong genetic effect of proximal (cis) signals, with 50% of the cis signals overlapping between the two tissues profiled. Here we demonstrate an extensive transcriptional network constructed from the human adipose data that exhibits significant overlap with similar network modules constructed from mouse adipose data. A core network module in humans and mice was identified that is enriched for genes involved in the inflammatory and immune response and has been found to be causally associated to obesity-related traits.
Subjects
Gene Expression Profiling , Gene Expression Regulation/genetics , Obesity/genetics , Adipose Tissue/metabolism , Adolescent , Adult , Aged , Aged, 80 and over , Animals , Blood/metabolism , Body Mass Index , Cohort Studies , Female , Human Genome , Humans , Iceland , Lod Score , Male , Mice , Middle Aged , Single Nucleotide Polymorphism/genetics , Quantitative Trait Loci/genetics , Sample Size , Waist-Hip Ratio , White People/genetics
Abstract
BACKGROUND: Long-read sequencing can enable the detection of base modifications, such as CpG methylation, in single molecules of DNA. The most commonly used methods for long-read sequencing are nanopore sequencing, developed by Oxford Nanopore Technologies (ONT), and single-molecule real-time (SMRT) sequencing, developed by Pacific Biosciences (PacBio). In this study, we systematically compare the performance of CpG methylation detection from long-read sequencing. RESULTS: We demonstrate that CpG methylation detection from 7179 nanopore-sequenced DNA samples is highly accurate and consistent with 132 oxidative bisulfite-sequenced (oxBS) samples, isolated from the same blood draws. We introduce quality filters for CpGs that further enhance the accuracy of CpG methylation detection from nanopore-sequenced DNA, while removing at most 30% of CpGs. We evaluate the per-site performance of CpG methylation detection across different genomic features and CpG methylation rates and demonstrate how the latest R10.4 flowcell chemistry and base-calling algorithms improve methylation detection from nanopore sequencing. Additionally, we show how the methylation detection of 50 SMRT-sequenced genomes compares to nanopore sequencing and oxBS. CONCLUSIONS: This study provides the first systematic comparison of CpG methylation detection tools for long-read sequencing methods. We compare two commonly used computational methods for the detection of CpG methylation in a large number of nanopore genomes, including samples sequenced using the latest R10.4 nanopore flowcell chemistry and 50 SMRT-sequenced samples. We provide insights into the strengths and limitations of each sequencing method as well as recommendations for standardization and evaluation of tools designed for genome-scale modified base detection using long-read sequencing.
Subjects
DNA Methylation , Human Genome , Humans , DNA Sequence Analysis/methods , High-Throughput Nucleotide Sequencing/methods , DNA
Abstract
Gene promoter and enhancer sequences are bound by transcription factors and are depleted of methylated CpG sites (cytosines preceding guanines in DNA). The absence of methylated CpGs in these sequences typically correlates with increased gene expression, indicating a regulatory role for methylation. We used nanopore sequencing to determine haplotype-specific methylation rates of 15.3 million CpG units in 7,179 whole-blood genomes. We identified 189,178 methylation depleted sequences where three or more proximal CpGs were unmethylated on at least one haplotype. A total of 77,789 methylation depleted sequences (~41%) associated with 80,503 cis-acting sequence variants, which we termed allele-specific methylation quantitative trait loci (ASM-QTLs). RNA sequencing of 896 samples from the same blood draws used to perform nanopore sequencing showed that the ASM-QTL, that is, DNA sequence variability, drives most of the correlation found between gene expression and CpG methylation. ASM-QTLs were enriched 40.2-fold (95% confidence interval 32.2, 49.9) among sequence variants associating with hematological traits, demonstrating that ASM-QTLs are important functional units in the noncoding genome.
Subjects
CpG Islands , DNA Methylation , Quantitative Trait Loci , Humans , Genetic Promoter Regions , Haplotypes , Alleles , Gene Expression Regulation , Genetic Variation , Nanopore Sequencing/methods , Human Genome
Abstract
Clonal hematopoiesis (CH) arises when a substantial proportion of mature blood cells is derived from a single hematopoietic stem cell lineage. Using whole-genome sequencing of 45,510 Icelandic and 130,709 UK Biobank participants combined with a mutational barcode method, we identified 16,306 people with CH. Prevalence approaches 50% in elderly participants. Smoking demonstrates a dosage-dependent impact on risk of CH. CH associates with several smoking-related diseases. Contrary to published claims, we find no evidence that CH is associated with cardiovascular disease. We provide evidence that CH is driven by genes that are commonly mutated in myeloid neoplasia and implicate several new driver genes. The presence and nature of a driver mutation alters the risk profile for hematological disorders. Nevertheless, most CH cases have no known driver mutations. A CH genome-wide association study identified 25 loci, including 19 not implicated previously in CH. Splicing, protein and expression quantitative trait loci were identified for CD164 and TCL1A.
Subjects
Clonal Hematopoiesis , Genome-Wide Association Study , Humans , Aged , Clonal Hematopoiesis/genetics , Hematopoiesis/genetics , Mutation/genetics , Hematopoietic Stem Cells/metabolism
Abstract
Migraine is a complex neurovascular disease with a range of severity and symptoms, yet mostly studied as one phenotype in genome-wide association studies (GWAS). Here we combine large GWAS datasets from six European populations to study the main migraine subtypes, migraine with aura (MA) and migraine without aura (MO). We identified four new MA-associated variants (in PRRT2, PALMD, ABO and LRRK2) and classified 13 MO-associated variants. Rare variants with large effects highlight three genes. A rare frameshift variant in brain-expressed PRRT2 confers large risk of MA and epilepsy, but not MO. A burden test of rare loss-of-function variants in SCN11A, encoding a neuron-expressed sodium channel with a key role in pain sensation, shows strong protection against migraine. Finally, a rare variant with cis-regulatory effects on KCNK5 confers large protection against migraine and brain aneurysms. Our findings offer new insights with therapeutic potential into the complex biology of migraine and its subtypes.
Subjects
Epilepsy , Migraine Disorders , Migraine with Aura , Humans , Genome-Wide Association Study , Migraine Disorders/genetics , Migraine with Aura/genetics , Phenotype
Abstract
Back pain is a common and debilitating disorder with largely unknown underlying biology. Here we report a genome-wide association study of back pain using diagnoses assigned in clinical practice: dorsalgia (119,100 cases, 909,847 controls) and intervertebral disc disorder (IDD) (58,854 cases, 922,958 controls). We identify 41 variants at 33 loci. The most significant association (OR_IDD = 0.92, P = 1.6 × 10^-39; OR_dorsalgia = 0.92, P = 7.2 × 10^-15) is with a 3' UTR variant (rs1871452-T) in CHST3, encoding a sulfotransferase enzyme expressed in intervertebral discs. The largest effects on IDD are conferred by rare (MAF = 0.07-0.32%) loss-of-function (LoF) variants in SLC13A1, encoding a sodium-sulfate co-transporter (LoF burden OR = 1.44, P = 3.1 × 10^-11); these variants also associate with reduced serum sulfate. Genes implicated by this study are involved in cartilage and bone biology, as well as neurological and inflammatory processes.
Subjects
Intervertebral Disc Degeneration/genetics , Intervertebral Disc Displacement/genetics , Intervertebral Disc/metabolism , Sodium Sulfate Cotransporter/genetics , Sodium Sulfate Cotransporter/metabolism , Sulfates/metabolism , 3' Untranslated Regions , Bone and Bones/metabolism , Genome-Wide Association Study , Humans , Symporters/genetics , Symporters/metabolism
Abstract
Despite the important role that monozygotic twins have played in genetics research, little is known about their genomic differences. Here we show that monozygotic twins differ on average by 5.2 early developmental mutations and that approximately 15% of monozygotic twins have a substantial number of these early developmental mutations specific to one of them. Using the parents and offspring of twins, we identified pre-twinning mutations. We observed instances where a twin was formed from a single cell lineage in the pre-twinning cell mass and instances where a twin was formed from several cell lineages. CpG>TpG mutations increased in frequency with embryonic development, coinciding with an increase in DNA methylation. Our results indicate that allocations of cells during development shape genomic differences between monozygotic twins.
Subjects
Human Genome , Germ Cells/metabolism , Monozygotic Twins/genetics , Embryonic Development/genetics , Female , Gene Frequency/genetics , Humans , Male , Mosaicism , Mutation/genetics , Zygote/metabolism
Abstract
PURPOSE: The aim of the Copenhagen Hospital Biobank-Cardiovascular Disease Cohort (CHB-CVDC) is to establish a cohort that can accelerate our understanding of CVD initiation and progression by jointly studying genetics, diagnoses, treatments and risk factors. PARTICIPANTS: The CHB-CVDC is a large genomic cohort of patients with CVD. CHB-CVDC currently includes 96 308 patients. The cohort is part of CHB, initiated in 2009 in the Capital Region of Denmark. CHB is continuously growing with ~40 000 samples/year. Patients in CHB were included in CHB-CVDC if they were above 18 years of age and assigned at least one cardiovascular diagnosis. Additionally, up to 110 000 blood donors can be analysed jointly with CHB-CVDC. Linkage with the Danish National Health Registries, Electronic Patient Records, and Clinical Quality Databases allows up to 41 years of medical history. All individuals are genotyped using the Infinium Global Screening Array from Illumina and imputed using a reference panel consisting of whole-genome sequence data from 8429 Danes along with 7146 samples from North-Western Europe. Currently, 39 539 of the patients are deceased. FINDINGS TO DATE: Here, we demonstrate the utility of the cohort by showing concordant effects between known variants and selected CVDs, that is, >93% concordance for coronary artery disease, atrial fibrillation, heart failure and cholesterol measurements and 85% concordance for hypertension. Furthermore, we evaluated multiple study designs and the validity of using Danish blood donors as part of CHB-CVDC. Lastly, CHB-CVDC has already made major contributions to studies of sick sinus syndrome and the role of phytosterols in development of atherosclerosis. FUTURE PLANS: In addition to genetics, electronic patient records and national socioeconomic and health registries extensively characterise each patient in CHB-CVDC and provide a promising framework for improved understanding of risk and protective variants. 
We aim to include other measurable biomarkers, for example proteins, in CHB-CVDC, making it a platform for multiomics cardiovascular studies.
Subjects
Cardiovascular Diseases , Heart Diseases , Biological Specimen Banks , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/genetics , Cohort Studies , Hospitals , Humans
Abstract
The plasma proteome can help bridge the gap between the genome and diseases. Here we describe genome-wide association studies (GWASs) of plasma protein levels measured with 4,907 aptamers in 35,559 Icelanders. We found 18,084 associations between sequence variants and levels of proteins in plasma (protein quantitative trait loci; pQTL), of which 19% were with rare variants (minor allele frequency (MAF) < 1%). We tested plasma protein levels for association with 373 diseases and other traits and identified 257,490 associations. We integrated pQTL and genetic associations with diseases and other traits and found that 12% of 45,334 lead associations in the GWAS Catalog are with variants in high linkage disequilibrium with pQTL. We identified 938 genes encoding potential drug targets with variants that influence levels of possible biomarkers. Combining proteomics, genomics and transcriptomics, we provide a valuable resource that can be used to improve understanding of disease pathogenesis and to assist with drug discovery and development.
Subjects
Blood Proteins/genetics , Disease/genetics , Proteome/genetics , Biomarkers/blood , Blood Proteins/metabolism , Female , Gene Frequency , Genetic Variation , Genome-Wide Association Study , Humans , Male , Middle Aged , Quantitative Trait Loci
Abstract
The success of genome-wide association studies (GWAS) in identifying common, low-penetrance variant-cancer associations for the past decade is undisputed. However, discovering additional high-penetrance cancer mutations in unknown cancer predisposing genes requires detection of variant-cancer association of ultra-rare coding variants. Consequently, large-scale next-generation sequence data with associated phenotype information are needed. Here, we used genotype data on 166,281 Icelanders, of whom 49,708 were whole-genome sequenced, and 408,595 individuals from the UK Biobank, of whom 41,147 were whole-exome sequenced, to test for association between loss-of-function burden in autosomal genes and basal cell carcinoma (BCC), the most common cancer in Caucasians. A total of 25,205 BCC cases and 683,058 controls were tested. Rare germline loss-of-function variants in PTPN14 conferred substantial risks of BCC (OR, 8.0; P = 1.9 × 10^-12), with a quarter of carriers getting BCC before age 70 and over half in their lifetime. Furthermore, common variants at the PTPN14 locus were associated with BCC, suggesting PTPN14 as a new, high-impact BCC predisposition gene. A follow-up investigation of 24 cancers and three benign tumor types showed that PTPN14 loss-of-function variants are associated with high risk of cervical cancer (OR, 12.7; P = 1.6 × 10^-4) and low age at diagnosis. Our findings, using power-increasing methods with high-quality rare variant genotypes, highlight future prospects for new discoveries on carcinogenesis. SIGNIFICANCE: This study identifies the tumor-suppressor gene PTPN14 as a high-impact BCC predisposition gene and indicates that inactivation of PTPN14 by germline sequence variants may also lead to increased risk of cervical cancer.