ABSTRACT
Pathogenic variants in RAD51C confer an elevated risk of breast and ovarian cancer, while individuals homozygous for specific RAD51C alleles may develop Fanconi anemia. Using saturation genome editing (SGE), we functionally assess 9,188 unique variants, including >99.5% of all possible coding sequence single-nucleotide alterations. By computing changes in variant abundance and applying Gaussian mixture modeling (GMM), we functionally classify 3,094 variants as disruptive and use clinical truth sets to reveal an accuracy/concordance of variant classification of >99.9%. Cell fitness was the primary assay readout, allowing us to observe a phenomenon in which specific missense variants exhibit distinct depletion kinetics, potentially suggesting that they represent hypomorphic alleles. We further explored our exhaustive functional map, revealing critical residues on the RAD51C structure and resolving variants found in cancer-segregating kindreds. Furthermore, through interrogation of UK Biobank and a large multi-center ovarian cancer cohort, we find significant associations between SGE-depleted variants and cancer diagnoses.
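The classification step described above can be illustrated with a minimal sketch. It assumes each variant has a single abundance-change score and fits a two-component, one-dimensional Gaussian mixture by expectation-maximisation. The scores are synthetic, and the function names, the choice of two components, and the fixed iteration count are illustrative assumptions rather than the study's pipeline.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def fit_two_component_gmm(scores, iterations=200):
    """Fit a 1-D, two-component Gaussian mixture by EM (hypothetical helper).

    Component 0 is initialised at the lowest score, component 1 at the highest.
    """
    lo, hi = min(scores), max(scores)
    means = [lo, hi]
    spread = (hi - lo) / 4 or 1.0
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in scores:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: update weights, means and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, scores)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, scores)) / nk
            sds[k] = max(math.sqrt(var), 1e-3)
            weights[k] = nk / len(scores)
    return means, sds, weights


def posterior_disruptive(x, means, sds, weights):
    """Posterior probability that score x belongs to the low-score component."""
    p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
    return p[0] / sum(p)


# Synthetic scores (assumption): 300 neutral variants near 0, 100 depleted near -2.
random.seed(0)
scores = [random.gauss(0.0, 0.3) for _ in range(300)]
scores += [random.gauss(-2.0, 0.4) for _ in range(100)]
means, sds, weights = fit_two_component_gmm(scores)
```

A variant would then be called disruptive when `posterior_disruptive` exceeds a chosen cut-off; that cut-off is a design choice the abstract does not specify.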
Subject(s)
DNA-Binding Proteins , Gene Editing , Ovarian Neoplasms , Humans , Female , Gene Editing/methods , DNA-Binding Proteins/metabolism , DNA-Binding Proteins/genetics , Ovarian Neoplasms/genetics , Breast Neoplasms/genetics , Alleles , CRISPR-Cas Systems/genetics
ABSTRACT
Variable levels of gene expression between tissues complicate the use of RNA sequencing of patient biosamples to delineate the impact of genomic variants. Here, we describe a gene- and tissue-specific metric to inform the feasibility of RNA sequencing. This overcomes limitations of using expression values alone as a metric to predict RNA-sequencing utility. We have derived a metric, minimum required sequencing depth (MRSD), that estimates the depth of sequencing required from RNA sequencing to achieve user-specified sequencing coverage of a gene, transcript, or group of genes. We applied MRSD across four human biosamples: whole blood, lymphoblastoid cell lines (LCLs), skeletal muscle, and cultured fibroblasts. MRSD has high precision (90.1%-98.2%) and overcomes transcript region-specific sequencing biases. Applying MRSD scoring to established disease gene panels shows that fibroblasts, of these four biosamples, are the optimum source of RNA for 63.1% of gene panels. Using this approach, up to 67.8% of the variants of uncertain significance in ClinVar that are predicted to impact splicing could be assayed by RNA sequencing in at least one of the biosamples. We demonstrate the utility and benefits of MRSD as a metric to inform functional assessment of splicing aberrations, in particular in the context of Mendelian genetic disorders to improve diagnostic yield.
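The idea of a depth estimate driven by a user-specified coverage target can be sketched as follows. This is a simplified illustration, not the published MRSD formula: it assumes that read support at each splice junction scales linearly with total sequencing depth, and the function name, parameters, and example numbers are all hypothetical.

```python
import math


def required_depth_millions(junction_reads, sample_depth_millions,
                            desired_reads=8, proportion=0.75):
    """Estimate the depth (millions of reads) needed so that `proportion`
    of a gene's junctions reach `desired_reads` supporting reads.

    junction_reads: read counts per splice junction in one control sample.
    sample_depth_millions: sequencing depth of that control sample.
    Assumes coverage scales linearly with depth (a simplification).
    """
    ranked = sorted(junction_reads, reverse=True)
    # The k-th best-covered junction is the limiting one for this proportion.
    k = max(1, math.ceil(proportion * len(ranked)))
    limiting = ranked[k - 1]
    if limiting == 0:
        return math.inf  # no depth reaches the target under this model
    return desired_reads * sample_depth_millions / limiting


# Hypothetical gene with four junctions in a control sequenced to 50M reads.
depth_75 = required_depth_millions([40, 20, 10, 5], 50)
depth_100 = required_depth_millions([40, 20, 10, 5], 50, proportion=1.0)
```

Requiring every junction to be covered (`proportion=1.0`) makes the least-covered junction the limiting one, so the estimate rises accordingly.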
Subject(s)
Genetic Diseases, Inborn/genetics , RNA Splicing , RNA, Messenger/genetics , Sequence Analysis, RNA/statistics & numerical data , Software , B-Lymphocytes/metabolism , B-Lymphocytes/pathology , Blood Cells/metabolism , Blood Cells/pathology , Cell Line , Fibroblasts/metabolism , Fibroblasts/pathology , Genetic Diseases, Inborn/classification , Genetic Diseases, Inborn/metabolism , Genetic Diseases, Inborn/pathology , Genetic Variation , Humans , Muscle, Skeletal/metabolism , Muscle, Skeletal/pathology , RNA, Messenger/metabolism , Research Design , Exome Sequencing/statistics & numerical data
ABSTRACT
BACKGROUND: The 2015 American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) variant classification framework specifies that case-control observations can be scored as 'strong' evidence (PS4) towards pathogenicity. METHODS: We developed the PS4-likelihood ratio calculator (PS4-LRCalc) for quantitative evidence assignment based on the observed variant frequencies in cases and controls. Binomial likelihoods are computed for two models, each defined by prespecified OR thresholds. Model 1 represents the hypothesis of association between variant and phenotype (eg, OR≥5) and model 2 represents the hypothesis of non-association (eg, OR≤1). RESULTS: PS4-LRCalc enables continuous quantitation of evidence for variant classification expressed as a likelihood ratio (LR), which can be log-converted into log LR (evidence points). Using PS4-LRCalc, observed data can be used to quantify evidence towards either pathogenicity or benignity. Variants can also be evaluated against models of different penetrance. The approach is applicable to balanced data sets generated for more common phenotypes and smaller data sets more typical in very rare disease variant evaluation. CONCLUSION: PS4-LRCalc enables flexible evidence quantitation on a continuous scale for observed case-control data. The converted LR is amenable to incorporation into the now widely used 2018 updated Bayesian ACMG/AMP framework.
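A likelihood ratio of this kind can be sketched as below. This is one possible formulation and not necessarily the one PS4-LRCalc implements: it conditions on the total number of carriers, treats the variant as rare so that the odds of a carrier being a case are approximately OR × (cases/controls), and evaluates each model at a single point value (OR = 5 and OR = 1) rather than over a range. The log base of 2.08 used to convert the LR into evidence points is an assumption drawn from the supporting-level odds of the Bayesian ACMG/AMP framework. All function names are hypothetical.

```python
import math


def binomial_likelihood(k, n, p):
    """Probability of k successes in n trials with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def case_fraction(odds_ratio, n_cases, n_controls):
    """Expected fraction of carriers who are cases (rare-variant approximation)."""
    return odds_ratio * n_cases / (odds_ratio * n_cases + n_controls)


def likelihood_ratio(case_carriers, control_carriers, n_cases, n_controls,
                     or_assoc=5.0, or_null=1.0):
    """LR of the association model (model 1) against the non-association model (model 2)."""
    total = case_carriers + control_carriers
    l1 = binomial_likelihood(case_carriers, total,
                             case_fraction(or_assoc, n_cases, n_controls))
    l2 = binomial_likelihood(case_carriers, total,
                             case_fraction(or_null, n_cases, n_controls))
    return l1 / l2


def evidence_points(lr, base=2.08):
    """Log-convert an LR into evidence points (base 2.08 is an assumption)."""
    return math.log(lr) / math.log(base)


# Hypothetical data: 10 carriers in 1,000 cases, 2 carriers in 1,000 controls.
lr_path = likelihood_ratio(10, 2, 1000, 1000)
# Reversed counts give an LR below 1, i.e. evidence towards benignity.
lr_benign = likelihood_ratio(2, 10, 1000, 1000)
points = evidence_points(lr_path)
```

Because the LR is continuous, the same function yields evidence in either direction, which mirrors the abstract's point that observed data can support pathogenicity or benignity.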
Subject(s)
Genetic Variation , Humans , Likelihood Functions , Case-Control Studies , Phenotype , Penetrance , Genetic Predisposition to Disease
ABSTRACT
PURPOSE: Small cell carcinoma of the ovary, hypercalcemic type (SCCOHT) is an extremely rare, highly aggressive cancer (mean age of onset, 24 years). Nearly all cases are associated with somatic or germline pathogenic variants (GPVs) in SMARCA4. Early bilateral oophorectomy is recommended for unaffected females with a SMARCA4 GPV. However, the penetrance of SMARCA4 GPVs for SCCOHT is highly uncertain and subject to ascertainment bias. METHODS: Leveraging the early-onset, sex-specific, highly morbid nature of SCCOHT, we hypothesized that the penetrance for SCCOHT could be quantified from the deficit in SMARCA4 GPVs in females compared to males in UK Biobank, a population cohort for which recruitment was restricted to those aged 40-69. We also analyzed pedigrees ascertained internationally by the Montreal-based SCCOHT-SMARCA4 Registry. RESULTS: We observed SMARCA4 GPVs in 8/210,182 (0.0038%) female and 18/179,210 (0.0100%) male participants in UK Biobank (p = 0.028), representing a male:female odds ratio of 2.64 (95%CI 1.09-7.02), implying a penetrance of 62% for SCCOHT (given absence of other SMARCA4-related female-specific early morbid diseases). A deficit of GPVs in females in UK Biobank was also demonstrated for BRCA1 and TP53. CONCLUSION: Our findings support bilateral oophorectomy in early adulthood as a rational choice for at-risk females with SMARCA4 GPVs.
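The reported odds ratio and penetrance can be reproduced from the carrier counts with a few lines of arithmetic. The sketch below assumes that penetrance is taken as the fraction of female carriers missing relative to males, that is 1 − 1/OR; the abstract does not state the formula, so this relationship and the variable names are assumptions, although they recover the published figures.

```python
def odds(carriers, total):
    """Odds of carrying a variant: carriers divided by non-carriers."""
    return carriers / (total - carriers)


# Counts reported for UK Biobank in the abstract.
female_carriers, female_total = 8, 210_182
male_carriers, male_total = 18, 179_210

# Male:female odds ratio for carrying a SMARCA4 GPV.
odds_ratio = odds(male_carriers, male_total) / odds(female_carriers, female_total)

# Assumed relationship: the female deficit reflects carriers lost to early disease.
penetrance = 1 - 1 / odds_ratio

print(f"odds ratio = {odds_ratio:.2f}, implied penetrance = {penetrance:.0%}")
```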
ABSTRACT
PURPOSE: Current practice is to report and manage likely pathogenic/pathogenic variants in a given cancer susceptibility gene (CSG) as though having equivalent penetrance, despite increasing evidence of inter-variant variability in risk associations. Using existing variant interpretation approaches, largely based on full-penetrance models, variants where reduced penetrance is suspected may be classified inconsistently and/or as variants of uncertain significance (VUS). We aimed to develop a national consensus approach for such variants within the Cancer Variant Interpretation Group UK (CanVIG-UK) multidisciplinary network. METHODS: A series of surveys and live polls were conducted during and between CanVIG-UK monthly meetings on various scenarios potentially indicating reduced penetrance. These informed the iterative development of a framework for the classification of variants of reduced penetrance by the CanVIG-UK Steering and Advisory Group (CStAG) working group. RESULTS: CanVIG-UK recommendations for amendment of the 2015 ACMG/AMP variant interpretation framework were developed for variants where (A) active evidence suggests a reduced penetrance effect size (e.g. from case-control or segregation data), and (B) a reduced penetrance effect is inferred from weaker/potentially inconsistent observed data. CONCLUSIONS: CanVIG-UK propose a framework for the classification of variants of reduced penetrance in high-penetrance genes. These principles, whilst developed for CSGs, are potentially applicable to other clinical contexts.
ABSTRACT
Vestibular schwannomas are benign nerve sheath tumours that arise on the vestibulocochlear nerves. Vestibular schwannomas are known to occur in the context of the tumour predisposition syndromes NF2-related and LZTR1-related schwannomatosis. However, the majority of vestibular schwannomas present sporadically without identification of germline pathogenic variants. To identify novel genetic associations with risk of vestibular schwannoma development, we conducted a genome-wide association study in a cohort of 911 sporadic vestibular schwannoma cases collated from the neurofibromatosis type 2 genetic testing service in the north-west of England, UK and 5500 control samples from the UK Biobank resource. One risk locus reached genome-wide significance in our association analysis (9p21.3, rs1556516, P = 1.47 × 10⁻¹³, odds ratio = 0.67, allele frequency = 0.52). 9p21.3 is a hotspot for genome-wide association study signals, and a number of genes are localized to this region, notably CDKN2B-AS1 and CDKN2A/B, also referred to as the INK4 locus. Dysregulation of gene products within the INK4 locus has been associated with multiple pathologies, and the genes in this region have been observed to directly impact the expression of one another. Recurrent associations of the INK4 locus with components of well-described oncogenic pathways provide compelling evidence that the 9p21.3 region is truly associated with risk of vestibular schwannoma tumorigenesis.
Subject(s)
Neurilemmoma , Neurofibromatoses , Neurofibromatosis 2 , Neuroma, Acoustic , Skin Neoplasms , Humans , Neuroma, Acoustic/genetics , Genome-Wide Association Study , Neurilemmoma/genetics , Neurilemmoma/pathology , Neurofibromatoses/genetics , Skin Neoplasms/genetics , Neurofibromatosis 2/genetics , Transcription Factors/genetics
ABSTRACT
BACKGROUND: It is proposed that, through restriction to individuals delineated as high risk, polygenic risk scores (PRSs) might enable more efficient targeting of existing cancer screening programmes and enable extension into new age ranges and disease types. To address this proposition, we present an overview of the performance of PRS tools (ie, models and sets of single nucleotide polymorphisms) alongside harms and benefits of PRS-stratified cancer screening for eight example cancers (breast, prostate, colorectal, pancreas, ovary, kidney, lung, and testicular cancer). METHODS: For this modelling analysis, we used age-stratified cancer incidences for the UK population from the National Cancer Registration Dataset (2016-18) and published estimates of the area under the receiver operating characteristic curve for current, future, and optimised PRS for each of the eight cancer types. For each of five PRS-defined high-risk quantiles (ie, the top 50%, 20%, 10%, 5%, and 1%) and according to each of the three PRS tools (ie, current, future, and optimised) for the eight cancers, we calculated the relative proportion of cancers arising, the odds ratios of a cancer arising compared with the UK population average, and the lifetime cancer risk. We examined maximal attainable rates of cancer detection by age stratum from combining PRS-based stratification with cancer screening tools and modelled the maximal impact on cancer-specific survival of hypothetical new UK programmes of PRS-stratified screening. FINDINGS: The PRS-defined high-risk quintile (20%) of the population was estimated to capture 37% of breast cancer cases, 46% of prostate cancer cases, 34% of colorectal cancer cases, 29% of pancreatic cancer cases, 26% of ovarian cancer cases, 22% of renal cancer cases, 26% of lung cancer cases, and 47% of testicular cancer cases. 
Extending UK screening programmes to a PRS-defined high-risk quintile including people aged 40-49 years for breast cancer, 50-59 years for colorectal cancer, and 60-69 years for prostate cancer has the potential to avert, respectively, a maximum of 102, 188, and 158 deaths annually. Unstratified screening of the full population aged 48-49 years for breast cancer, 58-59 years for colorectal cancer, and 68-69 years for prostate cancer would use equivalent resources and avert, respectively, an estimated maximum of 80, 155, and 95 deaths annually. These maximal modelled numbers will be substantially attenuated by incomplete population uptake of PRS profiling and cancer screening, interval cancers, non-European ancestry, and other factors. INTERPRETATION: Under favourable assumptions, our modelling suggests modest potential efficiency gain in cancer case detection and deaths averted for hypothetical new PRS-stratified screening programmes for breast, prostate, and colorectal cancer. Restriction of screening to high-risk quantiles means many or most incident cancers will arise in those assigned as being low-risk. To quantify real-world clinical impact, costs, and harms, UK-specific cluster-randomised trials are required. FUNDING: The Wellcome Trust.
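The link between a PRS tool's area under the curve (AUC) and the share of cancers arising in a high-risk quantile can be sketched under a standard binormal assumption. The sketch assumes that scores are normally distributed with equal variance in people who do and do not develop the cancer, and that the cancer is rare enough for the population score distribution to approximate that of non-cases. The function name and the AUC values used below are illustrative and are not taken from the study.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def cases_captured(auc, top_fraction):
    """Proportion of cases falling in the top `top_fraction` of PRS scores.

    Binormal model: non-cases ~ N(0, 1), cases ~ N(d, 1), with
    AUC = Phi(d / sqrt(2)). Both arguments must lie strictly between 0 and 1.
    """
    # Separation between case and non-case score means implied by the AUC.
    d = math.sqrt(2) * STANDARD_NORMAL.inv_cdf(auc)
    # Score cut-off defining the high-risk group in the (non-case) population.
    threshold = STANDARD_NORMAL.inv_cdf(1 - top_fraction)
    # Fraction of the case distribution lying above that cut-off.
    return 1 - STANDARD_NORMAL.cdf(threshold - d)


# Illustrative AUC values (assumptions) for a high-risk quintile.
uninformative = cases_captured(0.50, 0.20)
moderate = cases_captured(0.64, 0.20)
```

An uninformative score (AUC 0.5) captures only the 20% of cases expected by chance, and even a moderately discriminating score leaves most cases outside the high-risk quintile, which is consistent with the abstract's caution that many incident cancers will arise in those assigned as low-risk.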
Subject(s)
Breast Neoplasms , Colorectal Neoplasms , Prostatic Neoplasms , Testicular Neoplasms , Male , Humans , Early Detection of Cancer , Risk Factors , Breast Neoplasms/diagnosis , Breast Neoplasms/epidemiology , Breast Neoplasms/genetics , Prostatic Neoplasms/diagnosis , Prostatic Neoplasms/epidemiology , Prostatic Neoplasms/genetics , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/epidemiology , Colorectal Neoplasms/genetics , United Kingdom/epidemiology , Genetic Predisposition to Disease
ABSTRACT
Missense variants in the NF2 gene result in variable NF2 disease presentation. Clinical classification of missense variants often represents a challenge, due to lack of evidence for pathogenicity and function. This study provides a summary of NF2 missense variants, with variant classifications based on currently available evidence. NF2 missense variants were collated from pathology-associated databases and existing literature. Association for Clinical Genomic Sciences Best Practice Guidelines (2020) were followed in the application of evidence for variant interpretation and classification. The majority of NF2 missense variants remain classified as variants of uncertain significance. However, NF2 missense variants identified in gnomAD occurred at a consistent rate across the gene, while variants compiled from pathology-associated databases displayed differing rates of variation by exon of NF2. The highest rate of NF2 disease-associated variants was observed in exon 7, while lower rates were observed toward the C-terminus of the NF2 protein, merlin. Further phenotypic information associated with variants, alongside variant-specific functional analysis, is necessary for more definitive variant interpretation. Our data identified differences in frequency of NF2 missense variants by exon between gnomAD population data and NF2 disease-associated variants, suggesting a potential genotype-phenotype correlation; further work is necessary to substantiate this.
Subject(s)
Genes, Neurofibromatosis 2 , Neurofibromin 2 , Genetic Association Studies , Genomics , Humans , Mutation, Missense , Neurofibromin 2/genetics
ABSTRACT
The craniofacial disorder mandibulofacial dysostosis Guion-Almeida type is caused by haploinsufficiency of the U5 snRNP gene EFTUD2/SNU114. However, it is unclear how reduced expression of this core pre-mRNA splicing factor leads to craniofacial defects. Here we use a CRISPR-Cas9 nickase strategy to generate a human EFTUD2-knockdown cell line and show that reduced expression of EFTUD2 leads to diminished proliferative ability of these cells, increased sensitivity to endoplasmic reticulum (ER) stress and the mis-expression of several genes involved in the ER stress response. RNA-Seq analysis of the EFTUD2-knockdown cell line revealed transcriptome-wide changes in gene expression, with an enrichment for genes associated with processes involved in craniofacial development. Additionally, our RNA-Seq data identified widespread mis-splicing in EFTUD2-knockdown cells. Analysis of the functional and physical characteristics of mis-spliced pre-mRNAs highlighted conserved properties, including length and splice site strengths, of retained introns and skipped exons in our disease model. We also identified enriched processes associated with the affected genes, including cell death, cell and organ morphology and embryonic development. Together, these data support a model in which EFTUD2 haploinsufficiency leads to the mis-splicing of a distinct subset of pre-mRNAs with a widespread effect on gene expression, including altering the expression of ER stress response genes and genes involved in the development of the craniofacial region. The increased burden of unfolded proteins in the ER resulting from mis-splicing would exceed the capacity of the defective ER stress response, inducing apoptosis in cranial neural crest cells that would result in craniofacial abnormalities during development.
Subject(s)
Mandibulofacial Dysostosis/genetics , Peptide Elongation Factors/genetics , Ribonucleoprotein, U5 Small Nuclear/genetics , CRISPR-Cas Systems , Cell Proliferation/genetics , Craniofacial Abnormalities/genetics , Endoplasmic Reticulum Stress/genetics , Exons , Gene Expression/genetics , Gene Expression Regulation, Developmental/genetics , HEK293 Cells , Haploinsufficiency/genetics , Humans , Introns , Mutation , Peptide Elongation Factors/metabolism , Phenotype , RNA Precursors/metabolism , RNA Splicing/genetics , Ribonucleoprotein, U5 Small Nuclear/metabolism , Sequence Analysis, RNA/methods , Spliceosomes/genetics
ABSTRACT
Many variants that we inherit from our parents or acquire de novo or somatically are rare, limiting the precision with which we can associate them with disease. We performed exhaustive saturation genome editing (SGE) of BAP1, the disruption of which is linked to tumorigenesis and altered neurodevelopment. We experimentally characterized 18,108 unique variants, of which 6,196 were found to have abnormal functions, and then used these data to evaluate phenotypic associations in the UK Biobank. We also characterized variants in a large population-ascertained tumor collection, in cancer pedigrees and ClinVar, and explored the behavior of cancer-associated variants compared to that of variants linked to neurodevelopmental phenotypes. Our analyses demonstrated that disruptive germline BAP1 variants were significantly associated with higher circulating levels of the mitogen IGF-1, suggesting a possible pathological mechanism and therapeutic target. Furthermore, we built a variant classifier with >98% sensitivity and specificity and quantified evidence strengths to aid precision variant interpretation.
Subject(s)
Gene Editing , Germ-Line Mutation , Tumor Suppressor Proteins , Ubiquitin Thiolesterase , Humans , Germ-Line Mutation/genetics , Ubiquitin Thiolesterase/genetics , Tumor Suppressor Proteins/genetics , Gene Editing/methods , Neoplasms/genetics , Genetic Predisposition to Disease , Pedigree , Female , Male
ABSTRACT
Aicardi-Goutières syndrome (AGS1-9) is a genetically determined encephalopathy that falls under the type I interferonopathy disease class, characterized by excessive type I interferon (IFN-I) activity coupled with upregulation of IFN-stimulated genes (ISGs), which can be explained by the vital role these proteins play in self/non-self discrimination. To date, few mouse models fully replicate the vast clinical phenotypes observed in AGS patients. Therefore, we investigated the use of zebrafish as an alternative species for generating a clinically relevant model of AGS. Using CRISPR-Cas9 technology, we generated a stable mutant zebrafish line recapitulating AGS5, which arises from recessive mutations in SAMHD1. The resulting homozygous mutant zebrafish larvae possess a number of neurological phenotypes, exemplified by variable but increased expression of several ISGs in the head region, a significant increase in brain cell death, microcephaly and locomotion deficits. A link between IFN-I signaling and cholesterol biosynthesis has been highlighted by others, but not previously implicated in the type I interferonopathies. Through assessment of neurovascular integrity and qPCR analysis we identified a significant dysregulation of cholesterol biosynthesis in the zebrafish model. Furthermore, dysregulation of cholesterol biosynthesis gene expression was also observed through RNA sequencing analysis of AGS patient whole blood. From this novel finding, we hypothesize that cholesterol dysregulation may play a role in AGS disease pathophysiology. Further experimentation will lend critical insight into the molecular pathophysiology of AGS and the potential links involving aberrant type I IFN signaling and cholesterol dysregulation.
Subject(s)
Autoimmune Diseases of the Nervous System , Interferon Type I , Nervous System Malformations , Animals , Mice , Autoimmune Diseases of the Nervous System/genetics , Autoimmune Diseases of the Nervous System/metabolism , Interferon Type I/genetics , Interferon Type I/metabolism , Nervous System Malformations/genetics , Nervous System Malformations/metabolism , SAM Domain and HD Domain-Containing Protein 1/genetics , Zebrafish/genetics , Zebrafish/metabolism
ABSTRACT
The craniofacial developmental disorder Burn-McKeown Syndrome (BMKS) is caused by biallelic variants in the pre-messenger RNA splicing factor gene TXNL4A/DIB1. The majority of affected individuals with BMKS have a 34 base pair deletion in the promoter region of one allele of TXNL4A combined with a loss-of-function variant on the other allele, resulting in reduced TXNL4A expression. However, it is unclear how reduced expression of this ubiquitously expressed spliceosome protein results in craniofacial defects during development. Here we reprogrammed peripheral blood mononuclear cells from a BMKS patient and her unaffected mother into induced pluripotent stem cells (iPSCs) and differentiated the iPSCs into induced neural crest cells (iNCCs), the key cell type required for correct craniofacial development. BMKS patient-derived iPSCs proliferated more slowly than both mother- and unrelated control-derived iPSCs, and RNA-Seq analysis revealed significant differences in gene expression and alternative splicing. Patient iPSCs displayed defective differentiation into iNCCs compared to maternal and unrelated control iPSCs, in particular a delay in undergoing an epithelial-to-mesenchymal transition (EMT). RNA-Seq analysis of differentiated iNCCs revealed widespread gene expression changes and mis-splicing in genes relevant to craniofacial and embryonic development that highlight a dampened response to WNT signalling, the key pathway activated during iNCC differentiation. Furthermore, we identified the mis-splicing of exon 4 of TCF7L2, a key gene in the WNT pathway, as a potential cause of the downregulated WNT response in patient cells. Additionally, mis-spliced genes shared common sequence properties such as length, branch point to 3' splice site (BPS-3'SS) distance and splice site strengths, suggesting that splicing of particular subsets of genes is particularly sensitive to changes in TXNL4A expression.
Together, these data provide the first insight into how reduced TXNL4A expression in BMKS patients might compromise splicing and NCC function, resulting in defective craniofacial development in the embryo.
Subject(s)
Alternative Splicing , Choanal Atresia/pathology , Deafness/congenital , Gene Expression Regulation, Developmental , Heart Defects, Congenital/pathology , Induced Pluripotent Stem Cells/cytology , Models, Biological , Ribonucleoprotein, U5 Small Nuclear/deficiency , Spliceosomes/physiology , Apoptosis , Cell Differentiation , Cellular Reprogramming Techniques , Choanal Atresia/genetics , Clone Cells , Deafness/genetics , Deafness/pathology , Epithelial-Mesenchymal Transition , Exons/genetics , Face/embryology , Facies , Female , Head/embryology , Heart Defects, Congenital/genetics , Humans , Neural Crest/cytology , Promoter Regions, Genetic/genetics , RNA Splice Sites , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ribonucleoprotein, U5 Small Nuclear/genetics , Sequence Deletion , Transcription Factor 7-Like 2 Protein/genetics , Wnt Signaling Pathway
ABSTRACT
Defects in pre-mRNA splicing are frequently a cause of Mendelian disease. Despite the advent of next-generation sequencing, allowing a deeper insight into a patient's variant landscape, the ability to characterize variants causing splicing defects has not progressed with the same speed. To address this, recent years have seen a sharp spike in the number of splice prediction tools leveraging machine learning approaches, leaving clinical geneticists with a plethora of choices for in silico analysis. In this review, some basic principles of machine learning are introduced in the context of genomics and splicing analysis. A critical comparative approach is then used to describe seven recent machine learning-based splice prediction tools, revealing highly diverse approaches and common caveats. We find that, although great progress has been made in producing specific and sensitive tools, there is still much scope for personalized approaches to prediction of variant impact on splicing. Such approaches may increase diagnostic yields and underpin improvements to patient care.