ABSTRACT
Rare copy-number variants (rCNVs) include deletions and duplications that occur infrequently in the global human population and can confer substantial risk for disease. In this study, we aimed to quantify the properties of haploinsufficiency (i.e., deletion intolerance) and triplosensitivity (i.e., duplication intolerance) throughout the human genome. We harmonized and meta-analyzed rCNVs from nearly one million individuals to construct a genome-wide catalog of dosage sensitivity across 54 disorders, which defined 163 dosage sensitive segments associated with at least one disorder. These segments were typically gene dense and often harbored dominant dosage sensitive driver genes, which we were able to prioritize using statistical fine-mapping. Finally, we designed an ensemble machine-learning model to predict probabilities of dosage sensitivity (pHaplo & pTriplo) for all autosomal genes, which identified 2,987 haploinsufficient and 1,559 triplosensitive genes, including 648 that were uniquely triplosensitive. This dosage sensitivity resource will provide broad utility for human disease research and clinical genetics.
Subjects
DNA Copy Number Variations, Human Genome, DNA Copy Number Variations/genetics, Gene Dosage, Haploinsufficiency/genetics, Humans
ABSTRACT
X-linked Dystonia-Parkinsonism (XDP) is a Mendelian neurodegenerative disease that is endemic to the Philippines and is associated with a founder haplotype. We integrated multiple genome and transcriptome assembly technologies to narrow the causal mutation to the TAF1 locus, which included a SINE-VNTR-Alu (SVA) retrotransposition into intron 32 of the gene. Transcriptome analyses identified decreased expression of the canonical cTAF1 transcript among XDP probands, and de novo assembly across multiple pluripotent stem-cell-derived neuronal lineages discovered aberrant TAF1 transcription that involved alternative splicing and intron retention (IR) in proximity to the SVA that was anti-correlated with overall TAF1 expression. CRISPR/Cas9 excision of the SVA rescued this XDP-specific transcriptional signature and normalized TAF1 expression in probands. These data suggest an SVA-mediated aberrant transcriptional mechanism associated with XDP and may provide a roadmap for layered technologies and integrated assembly-based analyses for other unsolved Mendelian disorders.
Subjects
Dystonic Disorders/genetics, X-Linked Genetic Diseases/genetics, Human Genome, Transcriptome/genetics, Alternative Splicing/genetics, Alu Elements/genetics, Base Sequence, CRISPR-Cas Systems/genetics, Cohort Studies, Family, Female, Genetic Loci, Haplotypes/genetics, High-Throughput Nucleotide Sequencing, Histone Acetyltransferases/genetics, Histone Acetyltransferases/metabolism, Humans, Induced Pluripotent Stem Cells/metabolism, Introns/genetics, Male, Minisatellite Repeats/genetics, Genetic Models, Nerve Degeneration/genetics, Nerve Degeneration/pathology, Neural Stem Cells/metabolism, Neurons/metabolism, Messenger RNA/genetics, Messenger RNA/metabolism, Short Interspersed Nucleotide Elements, TATA-Binding Protein Associated Factors/genetics, TATA-Binding Protein Associated Factors/metabolism, Transcription Factor TFIID/genetics, Transcription Factor TFIID/metabolism
ABSTRACT
The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders1-4, but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.
Subjects
Human Genome, Genomics, Genetic Models, Mutation, Humans, Access to Information, Genetic Databases, Datasets as Topic, Gene Frequency, Human Genome/genetics, Mutation/genetics, Genetic Selection
ABSTRACT
Genome-wide association studies (GWAS) have identified thousands of noncoding loci that are associated with human diseases and complex traits, each of which could reveal insights into the mechanisms of disease1. Many of the underlying causal variants may affect enhancers2,3, but we lack accurate maps of enhancers and their target genes to interpret such variants. We recently developed the activity-by-contact (ABC) model to predict which enhancers regulate which genes and validated the model using CRISPR perturbations in several cell types4. Here we apply this ABC model to create enhancer-gene maps in 131 human cell types and tissues, and use these maps to interpret the functions of GWAS variants. Across 72 diseases and complex traits, ABC links 5,036 GWAS signals to 2,249 unique genes, including a class of 577 genes that appear to influence multiple phenotypes through variants in enhancers that act in different cell types. In inflammatory bowel disease (IBD), causal variants are enriched in predicted enhancers by more than 20-fold in particular cell types such as dendritic cells, and ABC achieves higher precision than other regulatory methods at connecting noncoding variants to target genes. These variant-to-function maps reveal an enhancer that contains an IBD risk variant and that regulates the expression of PPIF to alter the membrane potential of mitochondria in macrophages. Our study reveals principles of genome regulation, identifies genes that affect IBD and provides a resource and generalizable strategy to connect risk variants of common diseases to their molecular and cellular functions.
Subjects
Genetic Enhancer Elements/genetics, Genetic Predisposition to Disease, Genetic Variation/genetics, Human Genome/genetics, Genome-Wide Association Study, Inflammatory Bowel Diseases/genetics, Cell Line, Human Chromosomes Pair 10/genetics, Cyclophilins/genetics, Dendritic Cells, Female, Humans, Macrophages/metabolism, Male, Mitochondria/metabolism, Organ Specificity/genetics, Phenotype
ABSTRACT
Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.
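The fold comparisons quoted for the ASD cohort can be checked with simple arithmetic. The sketch below is illustrative only: it uses the diagnostic yields stated in the abstract, and the dictionary and function names are hypothetical helpers, not part of the study's analysis.

```python
# Diagnostic yields (%) in ASD probands, as reported in the abstract.
# The names below are hypothetical and used only for illustration.
asd_yield_percent = {
    "GS": 7.8,
    "CMA": 4.3,
    "ES": 2.7,
    "ES + CNV calling": 7.4,
}


def gs_fold_over(test_name):
    """Return the ratio of the GS yield to the yield of another test."""
    return asd_yield_percent["GS"] / asd_yield_percent[test_name]


for name in ("CMA", "ES", "ES + CNV calling"):
    print(f"GS vs {name}: {gs_fold_over(name):.2f}-fold")
```

The ratios for CMA and ES come out just under 2 and just under 3, in line with the abstract's "almost 2-fold" and "3-fold" wording, while the ratio against ES with CNV calling is close to 1.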
Subjects
Autism Spectrum Disorder, Female, Pregnancy, Humans, Autism Spectrum Disorder/diagnosis, Autism Spectrum Disorder/genetics, First Pregnancy Trimester, Prenatal Ultrasonography, Chromosome Mapping, Exome
ABSTRACT
Structural variants (SVs) rearrange large segments of DNA1 and can have profound consequences in evolution and human disease2,3. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD)4 have become integral in the interpretation of single-nucleotide variants (SNVs)5. However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage6. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings7. This SV resource is freely distributed via the gnomAD browser8 and will have broad utility in population genetics, disease-association studies, and diagnostic screening.
Subjects
Disease/genetics, Genetic Variation, Medical Genetics/standards, Population Genetics/standards, Human Genome/genetics, Female, Genetic Testing, Genotyping Techniques, Humans, Male, Middle Aged, Mutation, Single Nucleotide Polymorphism/genetics, Racial Groups/genetics, Reference Standards, Genetic Selection, Whole Genome Sequencing
ABSTRACT
Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes1. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.
Subjects
Exome/genetics, Essential Genes/genetics, Genetic Variation/genetics, Human Genome/genetics, Adult, Brain/metabolism, Cardiovascular Diseases/genetics, Cohort Studies, Genetic Databases, Female, Genetic Predisposition to Disease/genetics, Genome-Wide Association Study, Humans, Loss of Function Mutation/genetics, Male, Mutation Rate, Proprotein Convertase 9/genetics, Messenger RNA/genetics, Reproducibility of Results, Exome Sequencing, Whole Genome Sequencing
ABSTRACT
Chromosome 16p11.2 reciprocal genomic disorder, resulting from recurrent copy-number variants (CNVs), involves intellectual disability, autism spectrum disorder (ASD), and schizophrenia, but the responsible mechanisms are not known. To systemically dissect molecular effects, we performed transcriptome profiling of 350 libraries from six tissues (cortex, cerebellum, striatum, liver, brown fat, and white fat) in mouse models harboring CNVs of the syntenic 7qF3 region, as well as cellular, transcriptional, and single-cell analyses in 54 isogenic neural stem cell, induced neuron, and cerebral organoid models of CRISPR-engineered 16p11.2 CNVs. Transcriptome-wide differentially expressed genes were largely tissue-, cell-type-, and dosage-specific, although more effects were shared between deletion and duplication and across tissue than expected by chance. The broadest effects were observed in the cerebellum (2,163 differentially expressed genes), and the greatest enrichments were associated with synaptic pathways in mouse cerebellum and human induced neurons. Pathway and co-expression analyses identified energy and RNA metabolism as shared processes and enrichment for ASD-associated, loss-of-function constraint, and fragile X messenger ribonucleoprotein target gene sets. Intriguingly, reciprocal 16p11.2 dosage changes resulted in consistent decrements in neurite and electrophysiological features, and single-cell profiling of organoids showed reciprocal alterations to the proportions of excitatory and inhibitory GABAergic neurons. Changes both in neuronal ratios and in gene expression in our organoid analyses point most directly to calretinin GABAergic inhibitory neurons and the excitatory/inhibitory balance as targets of disruption that might contribute to changes in neurodevelopmental and cognitive function in 16p11.2 carriers. 
Collectively, our data indicate the genomic disorder involves disruption of multiple contributing biological processes and that this disruption has relative impacts that are context specific.
Subjects
Autism Spectrum Disorder, Chromosome Disorders, Intellectual Disability, Animals, Autism Spectrum Disorder/genetics, Calbindin 2/genetics, Cerebral Cortex, Chromosome Deletion, Chromosome Disorders/genetics, Human Chromosomes Pair 16/genetics, DNA Copy Number Variations, Genomics, Humans, Intellectual Disability/genetics, Mice, Neurons, RNA
ABSTRACT
Virtually all genome sequencing efforts in national biobanks, complex and Mendelian disease programs, and medical genetic initiatives are reliant upon short-read whole-genome sequencing (srWGS), which presents challenges for the detection of structural variants (SVs) relative to emerging long-read WGS (lrWGS) technologies. Given this ubiquity of srWGS in large-scale genomics initiatives, we sought to establish expectations for routine SV detection from this data type by comparison with lrWGS assembly, as well as to quantify the genomic properties and added value of SVs uniquely accessible to each technology. Analyses from the Human Genome Structural Variation Consortium (HGSVC) of three families captured ~11,000 SVs per genome from srWGS and ~25,000 SVs per genome from lrWGS assembly. Detection power and precision for SV discovery varied dramatically by genomic context and variant class: 9.7% of the current GRCh38 reference is defined by segmental duplication (SD) and simple repeat (SR), yet 91.4% of deletions that were specifically discovered by lrWGS localized to these regions. Across the remaining 90.3% of reference sequence, we observed extremely high (93.8%) concordance between technologies for deletions in these datasets. In contrast, lrWGS was superior for detection of insertions across all genomic contexts. Given that non-SD/SR sequences encompass 95.9% of currently annotated disease-associated exons, improved sensitivity from lrWGS to discover novel pathogenic deletions in these currently interpretable genomic regions is likely to be incremental. However, these analyses highlight the considerable added value of assembly-based lrWGS to create new catalogs of insertions and transposable elements, as well as disease-associated repeat expansions in genomic sequences that were previously recalcitrant to routine assessment.
Subjects
Human Genome/genetics, Genomic Structural Variation, Genomics/methods, Goals, Whole Genome Sequencing/methods, Whole Genome Sequencing/standards, DNA Copy Number Variations, Exons/genetics, Humans, Research Design, Genomic Segmental Duplications, Sequence Alignment
ABSTRACT
MOTIVATION: Pathogenic copy-number variants (CNVs) can cause a heterogeneous spectrum of rare and severe disorders. However, most CNVs are benign and are part of natural variation in human genomes. CNV pathogenicity classification, genotype-phenotype analyses, and therapeutic target identification are challenging and time-consuming tasks that require the integration and analysis of information from multiple scattered sources by experts. RESULTS: Here, we introduce the CNV-ClinViewer, an open-source web application for clinical evaluation and visual exploration of CNVs. The application enables real-time interactive exploration of large CNV datasets in a user-friendly interface and facilitates semi-automated clinical CNV interpretation following the ACMG guidelines by integrating the ClassifyCNV tool. In combination with clinical judgment, the application enables clinicians and researchers to formulate novel hypotheses and guide their decision-making process. Subsequently, the CNV-ClinViewer enhances patient care for clinical investigators and translational genomic research for basic scientists. AVAILABILITY AND IMPLEMENTATION: The web application is freely available at https://cnv-ClinViewer.broadinstitute.org and the open-source code can be found at https://github.com/LalResearchGroup/CNV-clinviewer.
Subjects
DNA Copy Number Variations, Software, Humans, Genomics, Phenotype, Human Genome
ABSTRACT
OBJECTIVE: Identification of genetic risk factors for Parkinson disease (PD) has to date been primarily limited to the study of single nucleotide variants, which only represent a small fraction of the genetic variation in the human genome. Consequently, causal variants for most PD risk are not known. Here we focused on structural variants (SVs), which represent a major source of genetic variation in the human genome. We aimed to discover SVs associated with PD risk by performing the first large-scale characterization of SVs in PD. METHODS: We leveraged a recently developed computational pipeline to detect and genotype SVs from 7,772 Illumina short-read whole genome sequencing samples. Using this set of SV variants, we performed a genome-wide association study using 2,585 cases and 2,779 controls and identified SVs associated with PD risk. Furthermore, to validate the presence of these variants, we generated a subset of matched whole-genome long-read sequencing data. RESULTS: We genotyped and tested 3,154 common SVs, representing over 412 million nucleotides of previously uncatalogued genetic variation. Using long-read sequencing data, we validated the presence of three novel deletion SVs that are associated with risk of PD from our initial association analysis, including a 2 kb intronic deletion within the gene LRRN4. INTERPRETATION: We identified three SVs associated with genetic risk of PD. This study represents the most comprehensive assessment of the contribution of SVs to the genetic risk of PD to date. ANN NEUROL 2023;93:1012-1022.
Subjects
Genome-Wide Association Study, Parkinson Disease, Humans, Parkinson Disease/genetics, Human Genome, Whole Genome Sequencing, Genotype
ABSTRACT
Huntington's disease pathogenesis involves a genetic gain-of-function toxicity mechanism triggered by the expanded HTT CAG repeat. Current therapeutic efforts aim to suppress expression of total or mutant huntingtin, though the relationship of huntingtin's normal activities to the gain-of-function mechanism and what the effects of huntingtin-lowering might be are unclear. Here, we have re-investigated a rare family segregating two presumed HTT loss-of-function (LoF) variants associated with the developmental disorder, Lopes-Maciel-Rodan syndrome (LOMARS), using whole-genome sequencing of DNA from cell lines, in conjunction with analysis of mRNA and protein expression. Our findings correct the muddled annotation of these HTT variants, reaffirm they are the genetic cause of the LOMARS phenotype and demonstrate that each variant is a huntingtin hypomorphic mutation. The NM_002111.8: c.4469+1G>A splice donor variant results in aberrant (exon 34) splicing and severely reduced mRNA, whereas, surprisingly, the NM_002111.8: c.8157T>A NP_002102.4: Phe2719Leu missense variant results in abnormally rapid turnover of the Leu2719 huntingtin protein. Thus, although rare and subject to an as yet unknown LoF intolerance at the population level, bona fide HTT LoF variants can be transmitted by normal individuals leading to severe consequences in compound heterozygotes due to huntingtin deficiency.
Subjects
Gene Expression Regulation, Huntingtin Protein/genetics, Mutation, Neurodevelopmental Disorders/genetics, Amino Acid Sequence, Cell Line, Child, Preschool Child, Female, Humans, Huntingtin Protein/chemistry, Huntingtin Protein/metabolism, Loss of Function Mutation, Male, Missense Mutation, Neurodevelopmental Disorders/metabolism, Pedigree, Phenotype, RNA Splicing, Messenger RNA/genetics, Messenger RNA/metabolism, Sequence Alignment, DNA Sequence Analysis
ABSTRACT
The 6%-9% risk of an untoward outcome previously established by Warburton for prenatally detected de novo balanced chromosomal rearrangements (BCRs) does not account for long-term morbidity. We performed long-term follow-up (mean 17 years) of a registry-based nationwide cohort of 41 individuals carrying a prenatally detected de novo BCR with normal first trimester screening/ultrasound scan. We observed a significantly higher frequency of neurodevelopmental and/or neuropsychiatric disorders than in a matched control group (19.5% versus 8.3%, p = 0.04), which was increased to 26.8% upon clinical follow-up. Chromosomal microarray of 32 carriers revealed no pathogenic imbalances, illustrating a low prognostic value when fetal ultrasound scan is normal. In contrast, mate-pair sequencing revealed disrupted genes (ARID1B, NPAS3, CELF4), regulatory domains of known developmental genes (ZEB2, HOXC), and complex BCRs associated with adverse outcomes. Seven unmappable autosomal-autosomal BCRs with breakpoints involving pericentromeric/heterochromatic regions may represent a low-risk group. We performed independent phenotype-aware and blinded interpretation, which accurately predicted benign outcomes (specificity = 100%) but demonstrated relatively low sensitivity for prediction of the clinical outcome in affected carriers (sensitivity = 45%-55%). This sensitivity emphasizes the challenges associated with prenatal risk prediction for long-term morbidity in the absence of phenotypic data given the still immature annotation of the morbidity genome and poorly understood long-range regulatory mechanisms. In conclusion, we upwardly revise the previous estimates of Warburton to a morbidity risk of 27% and recommend sequencing of the chromosomal breakpoints as the first-tier diagnostic test in pregnancies with a de novo BCR.
Subjects
Chromosome Aberrations, Prenatal Diagnosis/methods, Chromosome Breakpoints, Cohort Studies, Conserved Sequence/genetics, Molecular Evolution, Female, Human Genome, Humans, Karyotyping, Pregnancy, Long Noncoding RNA/genetics, Risk Factors, DNA Sequence Analysis, Time Factors
ABSTRACT
Autism is a multifactorial neurodevelopmental disorder affecting more males than females; consequently, under a multifactorial genetic hypothesis, females are affected only when they cross a higher biological threshold. We hypothesize that deleterious variants at conserved residues are enriched in severely affected patients arising from female-enriched multiplex families with severe disease, enhancing the detection of key autism genes in modest numbers of cases. Here we show the use of this strategy by identifying missense and dosage sequence variants in the gene encoding the adhesive junction-associated δ-catenin protein (CTNND2) in female-enriched multiplex families and demonstrating their loss-of-function effect by functional analyses in zebrafish embryos and cultured hippocampal neurons from wild-type and Ctnnd2 null mouse embryos. Finally, through gene expression and network analyses, we highlight a critical role for CTNND2 in neuronal development and an intimate connection to chromatin biology. Our data contribute to the understanding of the genetic architecture of autism and suggest that genetic analyses of phenotypic extremes, such as female-enriched multiplex families, are of innate value in multifactorial disorders.
Subjects
Autistic Disorder/genetics, Autistic Disorder/metabolism, Brain/metabolism, Catenins/deficiency, Catenins/genetics, Animals, Brain/embryology, Catenins/metabolism, Cultured Cells, Chromatin/genetics, Chromatin/metabolism, DNA Copy Number Variations/genetics, Mammalian Embryo/cytology, Mammalian Embryo/metabolism, Exome/genetics, Female, Gene Expression, Developmental Gene Expression Regulation, Hippocampus/pathology, Humans, Male, Mice, Genetic Models, Multifactorial Inheritance/genetics, Missense Mutation, Nerve Net, Neurons/cytology, Neurons/metabolism, Sex Characteristics, Zebrafish/embryology, Zebrafish/genetics, Zebrafish/metabolism, Delta Catenin
ABSTRACT
BACKGROUND: The relative prevalence and clinical importance of monogenic mutations related to familial hypercholesterolemia and of high polygenic score (cumulative impact of many common variants) pathways for early-onset myocardial infarction remain uncertain. Whole-genome sequencing enables simultaneous ascertainment of both monogenic mutations and polygenic score for each individual. METHODS: We performed deep-coverage whole-genome sequencing of 2081 patients from 4 racial subgroups hospitalized in the United States with early-onset myocardial infarction (age ≤55 years) recruited with a 2:1 female-to-male enrollment design. We compared these genomes with those of 3761 population-based control subjects. We first identified individuals with a rare, monogenic mutation related to familial hypercholesterolemia. Second, we calculated a recently developed polygenic score of 6.6 million common DNA variants to quantify the cumulative susceptibility conferred by common variants. We defined high polygenic score as the top 5% of the control distribution because this cutoff has previously been shown to confer similar risk to that of familial hypercholesterolemia mutations. RESULTS: The mean age of the 2081 patients presenting with early-onset myocardial infarction was 48 years, and 66% were female. A familial hypercholesterolemia mutation was present in 36 of these patients (1.7%) and was associated with a 3.8-fold (95% CI, 2.1-6.8; P<0.001) increased odds of myocardial infarction. Of the patients with early-onset myocardial infarction, 359 (17.3%) carried a high polygenic score, associated with a 3.7-fold (95% CI, 3.1-4.6; P<0.001) increased odds. Mean estimated untreated low-density lipoprotein cholesterol was 206 mg/dL in those with a familial hypercholesterolemia mutation, 132 mg/dL in those with high polygenic score, and 122 mg/dL in those in the remainder of the population. 
Although associated with increased risk in all racial groups, high polygenic score demonstrated the strongest association in white participants (P for heterogeneity=0.008). CONCLUSIONS: Both familial hypercholesterolemia mutations and high polygenic score are associated with a >3-fold increased odds of early-onset myocardial infarction. However, high polygenic score has a 10-fold higher prevalence among patients presenting with early-onset myocardial infarction. CLINICAL TRIAL REGISTRATION: URL: https://www.clinicaltrials.gov. Unique identifier: NCT00597922.
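The "10-fold higher prevalence" statement follows directly from the carrier counts given in the RESULTS. As a minimal sketch, assuming only the counts reported in the abstract (36 and 359 carriers among 2081 patients), the arithmetic can be reproduced as follows; the variable and function names are illustrative and do not come from the study.

```python
# Carrier counts among patients with early-onset myocardial infarction,
# as reported in the abstract. Names are illustrative assumptions.
TOTAL_PATIENTS = 2081
FH_MUTATION_CARRIERS = 36
HIGH_POLYGENIC_SCORE_CARRIERS = 359


def prevalence(carriers, total):
    """Return the fraction of patients who are carriers."""
    return carriers / total


fh_prevalence = prevalence(FH_MUTATION_CARRIERS, TOTAL_PATIENTS)
pgs_prevalence = prevalence(HIGH_POLYGENIC_SCORE_CARRIERS, TOTAL_PATIENTS)
fold_difference = pgs_prevalence / fh_prevalence

print(f"Familial hypercholesterolemia mutation: {fh_prevalence:.1%}")
print(f"High polygenic score: {pgs_prevalence:.1%}")
print(f"Fold difference in prevalence: {fold_difference:.1f}")
```

This compares prevalence among patients only. It is not the odds ratio, which additionally requires the control counts that the abstract does not report.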