ABSTRACT
Structural variants (SVs) rearrange large segments of DNA [1] and can have profound consequences in evolution and human disease [2,3]. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) [4] have become integral in the interpretation of single-nucleotide variants (SNVs) [5]. However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage [6]. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings [7]. This SV resource is freely distributed via the gnomAD browser [8] and will have broad utility in population genetics, disease-association studies, and diagnostic screening.
Subjects
Disease/genetics , Genetic Variation , Medical Genetics/standards , Population Genetics/standards , Human Genome/genetics , Female , Genetic Testing , Genotyping Techniques , Humans , Male , Middle Aged , Mutation , Single Nucleotide Polymorphism/genetics , Racial Groups/genetics , Reference Standards , Genetic Selection , Whole Genome Sequencing
ABSTRACT
Mitral valve prolapse (MVP) is a common cardiac valve disease that affects nearly 1 in 40 individuals. It can manifest as mitral regurgitation and is the leading indication for mitral valve surgery. Despite a clear heritable component, the genetic aetiology leading to non-syndromic MVP has remained elusive. Four affected individuals from a large multigenerational family segregating non-syndromic MVP underwent capture sequencing of the linked interval on chromosome 11. We report a missense mutation in the DCHS1 gene, the human homologue of the Drosophila cell polarity gene dachsous (ds), that segregates with MVP in the family. Morpholino knockdown of the zebrafish homologue dachsous1b resulted in a cardiac atrioventricular canal defect that could be rescued by wild-type human DCHS1, but not by DCHS1 messenger RNA with the familial mutation. Further genetic studies identified two additional families in which a second deleterious DCHS1 mutation segregates with MVP. Both DCHS1 mutations reduce protein stability as demonstrated in zebrafish, cultured cells and, notably, in mitral valve interstitial cells (MVICs) obtained during mitral valve repair surgery of a proband. Dchs1(+/-) mice had prolapse of thickened mitral leaflets, which could be traced back to developmental errors in valve morphogenesis. DCHS1 deficiency in MVP patient MVICs, as well as in Dchs1(+/-) mouse MVICs, results in altered migration and cellular patterning, supporting these processes as aetiological underpinnings for the disease. Understanding the role of DCHS1 in mitral valve development and MVP pathogenesis holds potential for therapeutic insights for this very common disease.
Subjects
Cadherins/genetics , Cadherins/metabolism , Mitral Valve Prolapse/genetics , Mitral Valve Prolapse/pathology , Mutation/genetics , Animals , Body Patterning/genetics , Cadherin Related Proteins , Cadherins/deficiency , Cell Movement/genetics , Human Chromosome Pair 11/genetics , Female , Humans , Male , Mice , Mitral Valve/abnormalities , Mitral Valve/embryology , Mitral Valve/pathology , Mitral Valve/surgery , Pedigree , Phenotype , Protein Stability , Messenger RNA/genetics , Zebrafish/genetics , Zebrafish Proteins/genetics , Zebrafish Proteins/metabolism
ABSTRACT
Copy-number variants (CNVs) have been the predominant focus of genetic studies of structural variation, and chromosomal microarray (CMA) for genome-wide CNV detection is the recommended first-tier genetic diagnostic screen in neurodevelopmental disorders. We compared CNVs observed by CMA to the structural variation detected by whole-genome large-insert sequencing in 259 individuals diagnosed with autism spectrum disorder (ASD) from the Simons Simplex Collection. These analyses revealed a diverse landscape of complex duplications in the human genome. One remarkably common class of complex rearrangement, which we term dupINVdup, involves two closely located duplications ("paired duplications") that flank the breakpoints of an inversion. This complex variant class is cryptic to CMA, but we observed it in 8.1% of all subjects. We also detected other paired-duplication signatures and duplication-mediated complex rearrangements in 15.8% of all ASD subjects. Breakpoint analysis showed that the predominant mechanism of formation of these complex duplication-associated variants was microhomology-mediated repair. On the basis of the striking prevalence of dupINVdups in this cohort, we explored the landscape of all inversion variation among the 235 highest-quality libraries and found abundant complexity among these variants: only 39.3% of inversions were canonical, or simple, inversions without additional rearrangement. Collectively, these findings indicate that dupINVdups, as well as other complex duplication-associated rearrangements, represent relatively common sources of genomic variation that is cryptic to population-based microarray and low-depth whole-genome sequencing. They also suggest that paired-duplication signatures detected by CMA warrant further scrutiny in genetic diagnostic testing given that they might mark complex rearrangements of potential clinical relevance.
Subjects
Pervasive Child Development Disorders/genetics , Chromosome Inversion/genetics , DNA Copy Number Variations/genetics , Genetic Markers/genetics , Genomic Segmental Duplications/genetics , Cohort Studies , DNA Repair/genetics , Gene Library , Humans
ABSTRACT
Structural variation (SV) is a significant component of the genetic etiology of both neurodevelopmental and psychiatric disorders; however, routine guidelines for clinical genetic screening have been established only in the former category. Genome-wide chromosomal microarray (CMA) can detect genomic imbalances such as copy-number variants (CNVs), but balanced chromosomal abnormalities (BCAs) still require karyotyping for clinical detection. Moreover, submicroscopic BCAs and subarray threshold CNVs are intractable, or cryptic, to both CMA and karyotyping. Here, we performed whole-genome sequencing using large-insert jumping libraries to delineate both cytogenetically visible and cryptic SVs in a single test among 30 clinically referred youth representing a range of severe neuropsychiatric conditions. We detected 96 SVs per person on average that passed filtering criteria above our highest-confidence resolution (6,305 bp) and an additional 111 SVs per genome below this resolution. These SVs rearranged 3.8 Mb of genomic sequence and resulted in 42 putative loss-of-function (LoF) or gain-of-function mutations per person. We estimate that 80% of the LoF variants were cryptic to clinical CMA. We found myriad complex and cryptic rearrangements, including a "paired" duplication (360 kb, 169 kb) that flanks a 5.25 Mb inversion that appears in 7 additional cases from clinical CNV data among 47,562 individuals. Following convergent genomic profiling of these independent clinical CNV data, we interpreted three SVs to be of potential clinical significance. These data indicate that sequence-based delineation of the full SV mutational spectrum warrants exploration in youth referred for neuropsychiatric evaluation and clinical diagnostic SV screening more broadly.
Subjects
Age of Onset , Chromosome Aberrations , Human Chromosomes/genetics , DNA Copy Number Variations/genetics , Mental Disorders/genetics , Neurodegenerative Diseases/genetics , Comparative Genomic Hybridization , Human Genome , Humans , Mental Disorders/epidemiology , Microarray Analysis , Neurodegenerative Diseases/epidemiology , Phenotype , United States/epidemiology
ABSTRACT
Natural killer (NK) cells are a promising alternative therapeutic platform to CAR T cells given their favorable safety profile and potent killing ability. However, CAR NK cells suffer from limited persistence in vivo, which is, in part, thought to be the consequence of limited cytokine signaling. To address this challenge, we developed an innovative high-throughput screening strategy to identify CAR endodomains that could drive enhanced persistence while maintaining potent cytotoxicity. We uncovered a family of TRAF-binding endodomains that outperform benchmarks in primary NK cells along dimensions of persistence and cytotoxicity, even in low IL-2 conditions. This work highlights the importance of cell-type-specific cell therapy engineering and unlocks a wide range of high-throughput molecular engineering avenues in NK cells.
ABSTRACT
Recent spatial gene expression technologies enable comprehensive measurement of transcriptomic profiles while retaining spatial context. However, existing analysis methods do not address the limited resolution of the technology or use the spatial information efficiently. Here, we introduce BayesSpace, a fully Bayesian statistical method that uses the information from spatial neighborhoods for resolution enhancement of spatial transcriptomic data and for clustering analysis. We benchmark BayesSpace against current methods for spatial and non-spatial clustering and show that it improves identification of distinct intra-tissue transcriptional profiles from samples of the brain, melanoma, invasive ductal carcinoma and ovarian adenocarcinoma. Using immunohistochemistry and an in silico dataset constructed from scRNA-seq data, we show that BayesSpace resolves tissue structure that is not detectable at the original resolution and identifies transcriptional heterogeneity inaccessible to histological analysis. Our results illustrate BayesSpace's utility in facilitating the discovery of biological insights from spatial transcriptomic datasets.
Subjects
Single-Cell Analysis , Transcriptome , Bayes Theorem , Cluster Analysis , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Transcriptome/genetics
ABSTRACT
Structural variants (SVs) contribute to many disorders, yet functionally annotating them remains a major challenge. Here, we integrate SVs with RNA-sequencing from human post-mortem brains to quantify their dosage and regulatory effects. We show that genic and regulatory SVs exist at significantly lower frequencies than intergenic SVs. The functional impact of copy number variants (CNVs) stems from both the proportion of genic and regulatory content altered and the loss-of-function intolerance of the gene. We train a linear model to predict expression effects of rare CNVs and use it to annotate regulatory disruption of CNVs from 14,891 independent genome-sequenced individuals. Pathogenic deletions implicated in neurodevelopmental disorders show significantly more extreme regulatory disruption scores and, if rank ordered, would be prioritized higher than when using frequency or length alone. This work shows the deleteriousness of regulatory SVs, particularly those altering CTCF sites, and provides a simple approach for functionally annotating the regulatory consequences of CNVs.
Subjects
Brain/metabolism , DNA Copy Number Variations , Gene Expression Regulation , Genetic Variation , Human Genome/genetics , Autopsy/methods , Brain/pathology , Female , Gene Expression Profiling/methods , Humans , Male , Neurodevelopmental Disorders/genetics , RNA Sequence Analysis/methods
ABSTRACT
Genomic association studies of common or rare protein-coding variation have established robust statistical approaches to account for multiple testing. Here we present a comparable framework to evaluate rare and de novo noncoding single-nucleotide variants, insertion/deletions, and all classes of structural variation from whole-genome sequencing (WGS). Integrating genomic annotations at the level of nucleotides, genes, and regulatory regions, we define 51,801 annotation categories. Analyses of 519 autism spectrum disorder families did not identify association with any categories after correction for 4,123 effective tests. Without appropriate correction, biologically plausible associations are observed in both cases and controls. Despite excluding previously identified gene-disrupting mutations, coding regions still exhibited the strongest associations. Thus, in autism, the contribution of de novo noncoding variation is probably modest in comparison to that of de novo coding variants. Robust results from future WGS studies will require large cohorts and comprehensive analytical strategies that consider the substantial multiple-testing burden.
Subjects
Autism Spectrum Disorder/genetics , Genetic Predisposition to Disease/genetics , INDEL Mutation/genetics , Single Nucleotide Polymorphism/genetics , Protein Isoforms/genetics , Female , Genome/genetics , Genome-Wide Association Study/methods , Humans , Male
ABSTRACT
BACKGROUND: Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies. RESULTS: We sequenced 689 participants with autism spectrum disorder (ASD) and other developmental abnormalities to construct a genome-wide map of large SV. Using long-insert jumping libraries at 105X mean physical coverage and linked-read whole-genome sequencing from 10X Genomics, we document seven major SV classes at ~5 kb SV resolution. Our results encompass 11,735 distinct large SV sites, 38.1% of which are novel and 16.8% of which are balanced or complex. We characterize 16 recurrent subclasses of complex SV (cxSV), revealing that: (1) cxSV are larger and rarer than canonical SV; (2) each genome harbors 14 large cxSV on average; (3) 84.4% of large cxSVs involve inversion; and (4) most large cxSV (93.8%) have not been delineated in previous studies. Rare SVs are more likely to disrupt coding and regulatory non-coding loci, particularly when truncating constrained and disease-associated genes. We also identify multiple cases of catastrophic chromosomal rearrangements known as chromoanagenesis, including somatic chromoanasynthesis, and extreme balanced germline chromothripsis events involving up to 65 breakpoints and 60.6 Mb across four chromosomes, further defining rare categories of extreme cxSV. CONCLUSIONS: These data provide a foundational map of large SV in the morbid human genome and demonstrate a previously underappreciated abundance and diversity of cxSV that should be considered in genomic studies of human disease.