ABSTRACT
The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders [1-4], but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.
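To make "depletion of variation" concrete, the following minimal sketch scores a genomic window by comparing the observed variant count with the count expected under a mutational model. The function name, the z-score form (expected minus observed, scaled by the square root of expected) and the example counts are illustrative assumptions, not the published Gnocchi implementation.

```python
import math


def depletion_z(observed: int, expected: float) -> float:
    """Hypothetical depletion score for one genomic window.

    Positive values mean fewer variants were observed than the
    mutational model expected, i.e. the window looks constrained.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    return (expected - observed) / math.sqrt(expected)


# Made-up counts: two windows with the same expectation.
depleted = depletion_z(observed=64, expected=100.0)
neutral = depletion_z(observed=100, expected=100.0)
print(depleted, neutral)
```

Under these assumed numbers the first window scores higher than the second, which is the sense in which a constrained region is "depleted" of variation.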
Subjects
Human Genome , Genomics , Genetic Models , Mutation , Humans , Access to Information , Genetic Databases , Datasets as Topic , Gene Frequency , Human Genome/genetics , Mutation/genetics , Genetic Selection
ABSTRACT
Human microbiome research is an actively developing area of inquiry, with ramifications for our lifestyles, our interactions with microbes, and how we treat disease. Advances depend on carefully executed, controlled, and reproducible studies. Here, we provide a Primer for researchers from diverse disciplines interested in conducting microbiome research. We discuss factors to be considered in the design, execution, and data analysis of microbiome studies. These recommendations should help researchers to enter and contribute to this rapidly developing field.
Subjects
Microbiological Techniques , Microbiota , Animals , Archaea/classification , Archaea/genetics , Archaea/isolation & purification , Bacteria/classification , Bacteria/genetics , Bacteria/isolation & purification , Guidelines as Topic , Humans , Polymerase Chain Reaction , Ribotyping
ABSTRACT
Host genetics and the gut microbiome can both influence metabolic phenotypes. However, whether host genetic variation shapes the gut microbiome and interacts with it to affect host phenotype is unclear. Here, we compared microbiotas across >1,000 fecal samples obtained from the TwinsUK population, including 416 twin pairs. We identified many microbial taxa whose abundances were influenced by host genetics. The most heritable taxon, the family Christensenellaceae, formed a co-occurrence network with other heritable Bacteria and with methanogenic Archaea. Furthermore, Christensenellaceae and its partners were enriched in individuals with low body mass index (BMI). An obese-associated microbiome was amended with Christensenella minuta, a cultured member of the Christensenellaceae, and transplanted to germ-free mice. C. minuta amendment reduced weight gain and altered the microbiome of recipient mice. Our findings indicate that host genetics influence the composition of the human gut microbiome and can do so in ways that impact host metabolism.
Subjects
Bacteria/classification , Bacteria/isolation & purification , Feces/microbiology , Microbiota , Animals , Bacteria/metabolism , Body Mass Index , Female , Gastrointestinal Tract/microbiology , Germ-Free Life , Humans , Male , Mice , Obesity/microbiology , Dizygotic Twins , Monozygotic Twins
ABSTRACT
Copy number variants (CNVs) are significant contributors to the pathogenicity of rare genetic diseases and, with innovative methods, can now reliably be identified from exome sequencing. Challenges still remain in accurate classification of CNV pathogenicity. CNV calling using GATK-gCNV was performed on exomes from a cohort of 6,633 families (15,759 individuals) with heterogeneous phenotypes and variable prior genetic testing collected at the Broad Institute Center for Mendelian Genomics of the Genomics Research to Elucidate the Genetics of Rare Diseases consortium and analyzed using the seqr platform. The addition of CNV detection to exome analysis identified causal CNVs for 171 families (2.6%). The estimated sizes of CNVs ranged from 293 bp to 80 Mb. The causal CNVs consisted of 140 deletions, 15 duplications, 3 suspected complex structural variants (SVs), 3 insertions, and 10 complex SVs, the latter two groups being identified by orthogonal confirmation methods. To classify CNV variant pathogenicity, we used the 2020 American College of Medical Genetics and Genomics/ClinGen CNV interpretation standards and developed additional criteria to evaluate allelic and functional data as well as variants on the X chromosome to further advance the framework. We interpreted 151 CNVs as likely pathogenic/pathogenic and 20 CNVs as high-interest variants of uncertain significance. Calling CNVs from existing exome data increases the diagnostic yield for individuals undiagnosed after standard testing approaches, providing a higher-resolution alternative to arrays at a fraction of the cost of genome sequencing. Our improvements to the classification approach advance the systematic framework to assess the pathogenicity of CNVs.
Subjects
DNA Copy Number Variations , Exome Sequencing , Exome , Rare Diseases , Humans , DNA Copy Number Variations/genetics , Rare Diseases/genetics , Rare Diseases/diagnosis , Exome/genetics , Male , Female , Cohort Studies , Genetic Testing/methods
ABSTRACT
CHASERR encodes a human long noncoding RNA (lncRNA) adjacent to CHD2, a coding gene in which de novo loss-of-function variants cause developmental and epileptic encephalopathy. Here, we report our findings in three unrelated children with a syndromic, early-onset neurodevelopmental disorder, each of whom had a de novo deletion in the CHASERR locus. The children had severe encephalopathy, shared facial dysmorphisms, cortical atrophy, and cerebral hypomyelination, a phenotype that is distinct from the phenotypes of patients with CHD2 haploinsufficiency. We found that the CHASERR deletion results in increased CHD2 protein abundance in patient-derived cell lines and increased expression of the CHD2 transcript in cis. These findings indicate that CHD2 has bidirectional dosage sensitivity in human disease, and we recommend that other lncRNA-encoding genes be evaluated, particularly those upstream of genes associated with mendelian disorders. (Funded by the National Human Genome Research Institute and others.)
Subjects
Neurodevelopmental Disorders , Long Noncoding RNA , Preschool Child , Female , Humans , Infant , Male , Brain/pathology , Brain/diagnostic imaging , Brain/metabolism , DNA-Binding Proteins/analysis , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Gene Deletion , Haploinsufficiency , Neurodevelopmental Disorders/diagnosis , Neurodevelopmental Disorders/genetics , Neurodevelopmental Disorders/pathology , Phenotype , Long Noncoding RNA/genetics , Sequence Deletion
ABSTRACT
Underrepresented populations are often excluded from genomic studies owing in part to a lack of resources supporting their analyses. The 1000 Genomes Project (1kGP) and Human Genome Diversity Project (HGDP), which have recently been sequenced to high coverage, are valuable genomic resources because of the global diversity they capture and their open data sharing policies. Here, we harmonized a high-quality set of 4094 whole genomes from 80 populations in the HGDP and 1kGP with data from the Genome Aggregation Database (gnomAD) and identified over 153 million high-quality SNVs, indels, and SVs. We performed a detailed ancestry analysis of this cohort, characterizing population structure and patterns of admixture across populations, analyzing site frequency spectra, and measuring variant counts at global and subcontinental levels. We also show substantial added value from this data set compared with the prior versions of the component resources, typically combined via liftOver and variant intersection; for example, we catalog millions of new genetic variants, mostly rare, compared with previous releases. In addition to unrestricted individual-level public release, we provide detailed tutorials for conducting many of the most common quality-control steps and analyses with these data in a scalable cloud-computing environment and publicly release this new phased joint callset for use as a haplotype resource in phasing and imputation pipelines. This jointly called reference panel will serve as a key resource to support research of diverse ancestry populations.
Subjects
Genetic Databases , Human Genome , Humans , Human Genome Project , High-Throughput Nucleotide Sequencing/methods , Genetic Variation , Genomics/methods
ABSTRACT
Many of the immune and metabolic changes occurring during normal pregnancy also describe metabolic syndrome. Gut microbiota can cause symptoms of metabolic syndrome in nonpregnant hosts. Here, to explore their role in pregnancy, we characterized fecal bacteria of 91 pregnant women of varying prepregnancy BMIs and gestational diabetes status and their infants. Similarities between infant-mother microbiotas increased with children's age, and the infant microbiota was unaffected by mother's health status. Gut microbiota changed dramatically from first (T1) to third (T3) trimesters, with vast expansion of diversity between mothers, an overall increase in Proteobacteria and Actinobacteria, and reduced richness. T3 stool showed strongest signs of inflammation and energy loss; however, microbiome gene repertoires were constant between trimesters. When transferred to germ-free mice, T3 microbiota induced greater adiposity and insulin insensitivity compared to T1. Our findings indicate that host-microbial interactions that impact host metabolism can occur and may be beneficial in pregnancy.
Subjects
Feces/microbiology , Gastrointestinal Tract/microbiology , Metagenome , Pregnancy , Actinobacteria/isolation & purification , Animals , Female , Germ-Free Life , Humans , Infant , Metabolic Syndrome/microbiology , Mice , Proteobacteria/isolation & purification
ABSTRACT
DNA sample contamination is a major issue in clinical and research applications of whole-genome and -exome sequencing. Even modest levels of contamination can substantially affect the overall quality of variant calls and lead to widespread genotyping errors. Currently, popular tools for estimating the contamination level use short-read data (BAM/CRAM files), which are expensive to store and manipulate and often not retained or shared widely. We propose a metric to estimate DNA sample contamination from variant-level whole-genome and -exome sequence data called CHARR, contamination from homozygous alternate reference reads, which leverages the infiltration of reference reads within homozygous alternate variant calls. CHARR uses a small proportion of variant-level genotype information and thus can be computed from single-sample gVCFs or callsets in VCF or BCF formats, as well as efficiently stored variant calls in Hail VariantDataset format. Our results demonstrate that CHARR accurately recapitulates results from existing tools with substantially reduced costs, improving the accuracy and efficiency of downstream analyses of ultra-large whole-genome and exome sequencing datasets.
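The core idea of using reference reads that "infiltrate" homozygous alternate calls can be sketched as follows. This is a simplified illustration, not the published CHARR definition: the function name, the `(genotype, ref_reads, alt_reads)` input layout and the example read counts are assumptions, and the real metric may apply allele-frequency normalization and site filters that are not described in the abstract.

```python
from typing import Iterable, Tuple


def ref_read_fraction_in_hom_alt(calls: Iterable[Tuple[str, int, int]]) -> float:
    """Average fraction of reference reads across homozygous alternate calls.

    Each call is (genotype, ref_reads, alt_reads). In an uncontaminated
    sample, a "1/1" genotype should carry almost no reference reads, so a
    higher average hints at contamination.
    """
    fractions = []
    for genotype, ref_reads, alt_reads in calls:
        depth = ref_reads + alt_reads
        # Only homozygous alternate calls with coverage are informative.
        if genotype == "1/1" and depth > 0:
            fractions.append(ref_reads / depth)
    if not fractions:
        raise ValueError("no usable homozygous alternate calls")
    return sum(fractions) / len(fractions)


# Made-up variant-level data; the heterozygous call is ignored.
example_calls = [("1/1", 0, 30), ("1/1", 3, 27), ("0/1", 15, 15), ("1/1", 1, 19)]
print(ref_read_fraction_in_hom_alt(example_calls))
```

Because only genotypes and per-allele read depths are needed, a calculation of this kind can run on variant-level files without access to the underlying reads, which is the cost advantage the abstract describes.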
Subjects
DNA , Trout , Humans , Animals , DNA Sequence Analysis/methods , Genotype , Homozygote , High-Throughput Nucleotide Sequencing/methods , Software
ABSTRACT
The body's microbiome, composed of microbial cells that number in the trillions, is involved in human health and disease in ways that are just starting to emerge. The microbiome is assembled at birth, develops with its host, and is greatly influenced by environmental factors such as diet and other exposures. Recently, a role for human genetic variation has emerged as also influential in accounting for interpersonal differences in microbiomes. Thus, human genes may influence health directly or by promoting a beneficial microbiome. Studies of the heritability of gut microbiotas reveal a subset of microbes whose abundances are partly genetically determined by the host. However, the use of genome-wide association studies (GWASs) to identify human genetic variants associated with microbiome phenotypes has proven challenging. Studies to date are small by GWAS standards, and cross-study comparisons are hampered by differences in analytical approaches. Nevertheless, associations between microbes or microbial genes and human genes have emerged that are consistent between human populations. Most notably, higher levels of beneficial gut bacteria called Bifidobacteria are associated with the human lactase nonpersister genotype, which typically confers lactose intolerance, in several different human populations. It is time for the microbiome to be incorporated into studies that quantify interactions among genotype, environment, and the microbiome in order to predict human disease susceptibility.
Subjects
Amyotrophic Lateral Sclerosis/genetics , Gastrointestinal Microbiome/physiology , Human Genome , Lactose Intolerance/genetics , Obesity/genetics , Schizophrenia/genetics , Amyotrophic Lateral Sclerosis/metabolism , Amyotrophic Lateral Sclerosis/microbiology , Amyotrophic Lateral Sclerosis/pathology , Bifidobacterium/growth & development , Bifidobacterium/metabolism , Diet/methods , Gastrointestinal Tract/microbiology , Genetic Variation , Genome-Wide Association Study , Genotype , Human Genetics , Humans , Lactose Intolerance/metabolism , Lactose Intolerance/microbiology , Lactose Intolerance/pathology , Obesity/metabolism , Obesity/microbiology , Obesity/pathology , Phenotype , Heritable Quantitative Trait , Schizophrenia/metabolism , Schizophrenia/microbiology , Schizophrenia/pathology
ABSTRACT
JAG2 encodes the Notch ligand Jagged2. The conserved Notch signaling pathway contributes to the development and homeostasis of multiple tissues, including skeletal muscle. We studied an international cohort of 23 individuals with genetically unsolved muscular dystrophy from 13 unrelated families. Whole-exome sequencing identified rare homozygous or compound heterozygous JAG2 variants in all 13 families. The identified bi-allelic variants include 10 missense variants that disrupt highly conserved amino acids, a nonsense variant, two frameshift variants, an in-frame deletion, and a microdeletion encompassing JAG2. Onset of muscle weakness occurred from infancy to young adulthood. Serum creatine kinase (CK) levels were normal or mildly elevated. Muscle histology was primarily dystrophic. MRI of the lower extremities revealed a distinct, slightly asymmetric pattern of muscle involvement with cores of preserved and affected muscles in quadriceps and tibialis anterior, in some cases resembling patterns seen in POGLUT1-associated muscular dystrophy. Transcriptome analysis of muscle tissue from two participants suggested misregulation of genes involved in myogenesis, including PAX7. In complementary studies, Jag2 downregulation in murine myoblasts led to downregulation of multiple components of the Notch pathway, including Megf10. Investigations in Drosophila suggested an interaction between Serrate and Drpr, the fly orthologs of JAG1/JAG2 and MEGF10, respectively. In silico analysis predicted that many Jagged2 missense variants are associated with structural changes and protein misfolding. In summary, we describe a muscular dystrophy associated with pathogenic variants in JAG2 and evidence suggests a disease mechanism related to Notch pathway dysfunction.
Subjects
Jagged-2 Protein/genetics , Muscular Dystrophies/genetics , Adolescent , Adult , Amino Acid Sequence , Animals , Cell Line , Child , Preschool Child , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Female , Glucosyltransferases/genetics , Haplotypes/genetics , Humans , Jagged-1 Protein/genetics , Jagged-2 Protein/chemistry , Jagged-2 Protein/deficiency , Jagged-2 Protein/metabolism , Male , Membrane Proteins/genetics , Mice , Middle Aged , Molecular Models , Muscles/metabolism , Muscles/pathology , Muscular Dystrophies/pathology , Myoblasts/metabolism , Myoblasts/pathology , Pedigree , Phenotype , Notch Receptors/metabolism , Signal Transduction , Exome Sequencing , Young Adult
ABSTRACT
Reference population databases are an essential tool in variant and gene interpretation. Their use guides the identification of pathogenic variants amidst the sea of benign variation present in every human genome, and supports the discovery of new disease-gene relationships. The Genome Aggregation Database (gnomAD) is currently the largest and most widely used publicly available collection of population variation from harmonized sequencing data. The data is available through the online gnomAD browser (https://gnomad.broadinstitute.org/) that enables rapid and intuitive variant analysis. This review provides guidance on the content of the gnomAD browser, and its usage for variant and gene interpretation. We introduce key features including allele frequency, per-base expression levels, constraint scores, and variant co-occurrence, alongside guidance on how to use these in analysis, with a focus on the interpretation of candidate variants and novel genes in rare disease.
Subjects
Rare Diseases , Software , Genetic Databases , Gene Frequency , Humans , Rare Diseases/genetics
ABSTRACT
The intestinal tract is inhabited by a large and diverse community of microbes collectively referred to as the gut microbiota. While the gut microbiota provides important benefits to its host, especially in metabolism and immune development, disturbance of the microbiota-host relationship is associated with numerous chronic inflammatory diseases, including inflammatory bowel disease and the group of obesity-associated diseases collectively referred to as metabolic syndrome. A primary means by which the intestine is protected from its microbiota is via multi-layered mucus structures that cover the intestinal surface, thereby allowing the vast majority of gut bacteria to be kept at a safe distance from epithelial cells that line the intestine. Thus, agents that disrupt mucus-bacterial interactions might have the potential to promote diseases associated with gut inflammation. Consequently, it has been hypothesized that emulsifiers, detergent-like molecules that are a ubiquitous component of processed foods and that can increase bacterial translocation across epithelia in vitro, might be promoting the increase in inflammatory bowel disease observed since the mid-twentieth century. Here we report that, in mice, relatively low concentrations of two commonly used emulsifiers, namely carboxymethylcellulose and polysorbate-80, induced low-grade inflammation and obesity/metabolic syndrome in wild-type hosts and promoted robust colitis in mice predisposed to this disorder. Emulsifier-induced metabolic syndrome was associated with microbiota encroachment, altered species composition and increased pro-inflammatory potential. Use of germ-free mice and faecal transplants indicated that such changes in microbiota were necessary and sufficient for both low-grade inflammation and metabolic syndrome. These results support the emerging concept that perturbed host-microbiota interactions resulting in low-grade inflammation can promote adiposity and its associated metabolic effects. 
Moreover, they suggest that the broad use of emulsifying agents might be contributing to an increased societal incidence of obesity/metabolic syndrome and other chronic inflammatory diseases.
Subjects
Colitis/chemically induced , Colitis/microbiology , Diet/adverse effects , Emulsifying Agents/adverse effects , Gastrointestinal Tract/drug effects , Gastrointestinal Tract/microbiology , Metabolic Syndrome/chemically induced , Metabolic Syndrome/microbiology , Adiposity/drug effects , Animals , Carboxymethylcellulose Sodium/administration & dosage , Carboxymethylcellulose Sodium/adverse effects , Colitis/pathology , Emulsifying Agents/administration & dosage , Feces/microbiology , Female , Gastrointestinal Tract/pathology , Germ-Free Life , Inflammation/chemically induced , Inflammation/microbiology , Inflammation/pathology , Intestinal Mucosa/drug effects , Intestinal Mucosa/microbiology , Intestinal Mucosa/pathology , Male , Metabolic Syndrome/pathology , Mice , Microbiota/drug effects , Obesity/chemically induced , Obesity/microbiology , Obesity/pathology , Polysorbates/administration & dosage , Polysorbates/adverse effects
ABSTRACT
We describe an infant with a phenotype typical of early onset Marfan syndrome whose genetic evaluation, including Sanger sequencing and deletion/duplication testing of FBN1 and exome sequencing, was negative. Ultimately, genome sequencing revealed a deletion missed on prior testing, demonstrating the unique utility of genome sequencing for molecular genetic diagnosis.
Subjects
Fibrillin-1/genetics , Marfan Syndrome/diagnosis , Marfan Syndrome/genetics , DNA Sequence Analysis , Exome , Fatal Outcome , Gene Deletion , Gene Dosage , Genetic Variation , Human Genome , Humans , Infant , Male , Phenotype , Polymerase Chain Reaction
ABSTRACT
OBJECTIVE: Proton pump inhibitors (PPIs) are drugs used to suppress gastric acid production and treat GI disorders such as peptic ulcers and gastro-oesophageal reflux. They have been considered low risk, have been widely adopted, and are often over-prescribed. Recent studies have identified an increased risk of enteric and other infections with their use. Small studies have identified possible associations between PPI use and GI microbiota, but this has yet to be carried out on a large population-based cohort. DESIGN: We investigated the association between PPI usage and the gut microbiome using 16S ribosomal RNA amplification from faecal samples of 1827 healthy twins, replicating results within unpublished data from an interventional study. RESULTS: We identified a significantly lower abundance in gut commensals and lower microbial diversity in PPI users, with an associated significant increase in the abundance of oral and upper GI tract commensals. In particular, significant increases were observed in Streptococcaceae. These associations were replicated in an independent interventional study and in a paired analysis between 70 monozygotic twin pairs who were discordant for PPI use. We propose that the observed changes result from the removal of the low pH barrier between upper GI tract bacteria and the lower gut. CONCLUSIONS: Our findings describe a significant impact of PPIs on the gut microbiome and should caution against over-use of PPIs, and warrant further investigation into the mechanisms and their clinical consequences.
Subjects
Gastrointestinal Microbiome/drug effects , Proton Pump Inhibitors/pharmacology , Adult , Aged , Aged 80 and over , Female , Humans , Male , Middle Aged , Upper Gastrointestinal Tract , Young Adult
ABSTRACT
BACKGROUND: Host genetics is one of several factors known to shape human gut microbiome composition; however, the physiological processes underlying the heritability are largely unknown. Inter-individual differences in host factors secreted into the gut lumen may lead to variation in microbiome composition. One such factor is the ABO antigen. This molecule is not only expressed on the surface of red blood cells, but is also secreted from mucosal surfaces in individuals containing an intact FUT2 gene (secretors). Previous studies report differences in microbiome composition across ABO and secretor genotypes. However, due to methodological limitations, the specific bacterial taxa involved remain unknown. RESULTS: Here, we sought to determine the relationship of the microbiota to ABO blood group and secretor status in a large panel of 1503 individuals from a cohort of twins from the United Kingdom. Contrary to previous reports, robust associations between either ABO or secretor phenotypes and gut microbiome composition were not detected. Overall community structure, diversity, and the relative abundances of individual taxa were not significantly associated with ABO or secretor status. Additionally, joint-modeling approaches were unsuccessful in identifying combinations of taxa that were predictive of ABO or secretor status. CONCLUSIONS: Despite previous reports, the taxonomic composition of the microbiota does not appear to be strongly associated with ABO or secretor status in 1503 individuals from the United Kingdom. These results highlight the importance of replicating microbiome-associated traits in large, well-powered cohorts to ensure results are robust.
Subjects
ABO Blood-Group System/immunology , Biodiversity , Gastrointestinal Microbiome , Twins , ABO Blood-Group System/genetics , Adult , Aged , Female , Genotype , Humans , Male , Middle Aged , Phenotype , United Kingdom
ABSTRACT
The human gut microbiota harbors three main groups of H(2)-consuming microbes: methanogens including the dominant archaeon, Methanobrevibacter smithii, a polyphyletic group of acetogens, and sulfate-reducing bacteria. Defining their roles in the gut is important for understanding how hydrogen metabolism affects the efficiency of fermentation of dietary components. We quantified methanogens in fecal samples from 40 healthy adult female monozygotic (MZ) and 28 dizygotic (DZ) twin pairs, analyzed bacterial 16S rRNA datasets generated from their fecal samples to identify taxa that co-occur with methanogens, sequenced the genomes of 20 M. smithii strains isolated from families of MZ and DZ twins, and performed RNA-Seq of a subset of strains to identify their responses to varied formate concentrations. The concordance rate for methanogen carriage was significantly higher for MZ versus DZ twin pairs. Co-occurrence analysis revealed 22 bacterial species-level taxa positively correlated with methanogens: all but two were members of the Clostridiales, with several being, or related to, known hydrogen-producing and -consuming bacteria. The M. smithii pan-genome contains 987 genes conserved in all strains, and 1,860 variably represented genes. Strains from MZ and DZ twin pairs had a similar degree of shared genes and SNPs, and were significantly more similar than strains isolated from mothers or members of other families. The 101 adhesin-like proteins (ALPs) in the pan-genome (45 ± 6 per strain) exhibit strain-specific differences in expression and responsiveness to formate. We hypothesize that M. smithii strains use their different repertoires of ALPs to create diversity in their metabolic niches, by allowing them to establish syntrophic relationships with bacterial partners with differing metabolic capabilities and patterns of co-occurrence.
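A concordance rate of the kind compared between MZ and DZ twin pairs can be computed as in the sketch below. The abstract does not state which definition was used, so the pairwise definition here (pairs in which both twins carry methanogens, divided by pairs in which at least one does), the function name and the example data are all assumptions made for illustration.

```python
from typing import List, Tuple


def pairwise_concordance(pairs: List[Tuple[bool, bool]]) -> float:
    """Pairwise concordance for a binary trait across twin pairs.

    Each pair is (twin_a_is_carrier, twin_b_is_carrier). Pairs where
    neither twin is a carrier are uninformative and are excluded.
    """
    both = sum(1 for a, b in pairs if a and b)
    at_least_one = sum(1 for a, b in pairs if a or b)
    if at_least_one == 0:
        raise ValueError("no pair contains a carrier")
    return both / at_least_one


# Made-up carriage data for four MZ and four DZ pairs.
mz_pairs = [(True, True), (True, True), (True, False), (False, False)]
dz_pairs = [(True, False), (True, True), (False, True), (False, False)]
print(pairwise_concordance(mz_pairs), pairwise_concordance(dz_pairs))
```

A higher value for MZ than for DZ pairs, as in this toy data, is the pattern the abstract reports for methanogen carriage; a real analysis would also test whether the difference is statistically significant.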
Subjects
Bacterial Adhesins/genetics , Gastrointestinal Tract/microbiology , Archaeal Genome , Methanobrevibacter/genetics , Twins , Adult , Base Sequence , Female , Formates/analysis , Humans , Metagenomics , Methanobrevibacter/metabolism , Molecular Sequence Data , Single Nucleotide Polymorphism/genetics , 16S Ribosomal RNA/genetics , DNA Sequence Analysis , Species Specificity
ABSTRACT
Incomplete penetrance, or absence of disease phenotype in an individual with a disease-associated variant, is a major challenge in variant interpretation. Studying individuals with apparent incomplete penetrance can shed light on underlying drivers of altered phenotype penetrance. Here, we investigate clinically relevant variants from ClinVar in 807,162 individuals from the Genome Aggregation Database (gnomAD), demonstrating improved representation in gnomAD version 4. We then conduct a comprehensive case-by-case assessment of 734 predicted loss-of-function (pLoF) variants in 77 genes associated with severe, early-onset, highly penetrant haploinsufficient disease. We identified explanations for the presumed lack of disease manifestation in 701 of the variants (95%). Individuals with unexplained lack of disease manifestation in this set of disorders rarely occur, underscoring the need for, and power of, the deep case-by-case assessment presented here to minimize false assignments of disease risk, particularly in unaffected individuals with higher rates of secondary properties that result in rescue.