Results 1 - 20 of 24
1.
bioRxiv ; 2024 May 03.
Article in English | MEDLINE | ID: mdl-38645134

ABSTRACT

Missense variants can have a range of functional impacts depending on factors such as the specific amino acid substitution and location within the gene. To interpret their deleteriousness, studies have sought to identify regions within genes that are specifically intolerant of missense variation [1-12]. Here, we leverage the patterns of rare missense variation in 125,748 individuals in the Genome Aggregation Database (gnomAD) [13] against a null mutational model to identify transcripts that display regional differences in missense constraint. Missense-depleted regions are enriched for ClinVar [14] pathogenic variants, de novo missense variants from individuals with neurodevelopmental disorders (NDDs) [15,16], and complex trait heritability. Following ClinGen calibration recommendations for the ACMG/AMP guidelines, we establish that regions with less than 20% of their expected missense variation achieve moderate support for pathogenicity. We create a missense deleteriousness metric (MPC) that incorporates regional constraint and outperforms other deleteriousness scores at stratifying case and control de novo missense variation, with a strong enrichment in NDDs. These results provide additional tools to aid in missense variant interpretation.
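
As a rough illustration of the 20% threshold mentioned in this abstract, the Python sketch below computes an observed/expected missense ratio for a region and compares it with a cutoff. The function names, the example counts, and the plain ratio (with no confidence interval or statistical test) are assumptions made for illustration; they are not the authors' implementation.

```python
def missense_oe(observed: int, expected: float) -> float:
    """Return the observed/expected ratio of missense variants in a region."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


def is_strongly_depleted(observed: int, expected: float, threshold: float = 0.2) -> bool:
    """True when the region shows less than `threshold` of its expected variation.

    The 0.2 default mirrors the 20% figure in the abstract; everything else
    here is a hypothetical simplification.
    """
    return missense_oe(observed, expected) < threshold


# Hypothetical region: 3 missense variants observed where 20 were expected.
ratio = missense_oe(3, 20.0)
depleted = is_strongly_depleted(3, 20.0)
```

A region with half of its expected variation (for example, 10 observed against 20 expected) would not pass this check.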

3.
Nat Genet ; 56(1): 152-161, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38057443

ABSTRACT

Recessive diseases arise when both copies of a gene are impacted by a damaging genetic variant. When a patient carries two potentially causal variants in a gene, accurate diagnosis requires determining that these variants occur on different copies of the chromosome (that is, are in trans) rather than on the same copy (that is, in cis). However, current approaches for determining phase, beyond parental testing, are limited in clinical settings. Here we developed a strategy for inferring phase for rare variant pairs within genes, leveraging genotypes observed in the Genome Aggregation Database (v2, n = 125,748 exomes). Our approach estimates phase with 96% accuracy, both in trio data and in patients with Mendelian conditions and presumed causal compound heterozygous variants. We provide a public resource of phasing estimates for coding variants and counts per gene of rare variants in trans that can aid interpretation of rare co-occurring variants in the context of recessive disease.


Subjects
Exome, High-Throughput Nucleotide Sequencing, Humans, Exome/genetics, Exome Sequencing, Genotype
4.
Nature ; 625(7993): 92-100, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38057664

ABSTRACT

The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders [1-4], but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.


Subjects
Human Genome, Genomics, Genetic Models, Mutation, Humans, Access to Information, Genetic Databases, Datasets as Topic, Gene Frequency, Human Genome/genetics, Mutation/genetics, Genetic Selection
5.
bioRxiv ; 2023 Aug 21.
Article in English | MEDLINE | ID: mdl-36993580

ABSTRACT

Recessive diseases arise when both the maternal and the paternal copies of a gene are impacted by a damaging genetic variant in the affected individual. When a patient carries two different potentially causal variants in a gene for a given disorder, accurate diagnosis requires determining that these two variants occur on different copies of the chromosome (i.e., are in trans) rather than on the same copy (i.e., in cis). However, current approaches for determining phase, beyond parental testing, are limited in clinical settings. We developed a strategy for inferring phase for rare variant pairs within genes, leveraging genotypes observed in exome sequencing data from the Genome Aggregation Database (gnomAD v2, n = 125,748). When applied to trio data where phase can be determined by transmission, our approach estimates phase with 95.7% accuracy and remains accurate even for very rare variants (allele frequency < 1 × 10⁻⁴). We also correctly phase 95.9% of variant pairs in a set of 293 patients with Mendelian conditions carrying presumed causal compound heterozygous variants. We provide a public resource of phasing estimates from gnomAD, including phasing estimates for coding variants across the genome and counts per gene of rare variants in trans, that can aid interpretation of rare co-occurring variants in the context of recessive disease.
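
The abstract does not spell out the inference procedure. One common way to estimate phase from population genotypes is to fit haplotype frequencies with an expectation-maximization (EM) loop over the 3×3 table of genotype counts for the two variants, and then compare the trans and cis configurations for double heterozygotes. The Python sketch below assumes that approach; the function name, the table layout, and the fixed iteration count are hypothetical choices for illustration, and it should not be read as the authors' exact method.

```python
def estimate_p_trans(n, iterations=100):
    """Estimate P(trans) for two variants from a 3x3 genotype count table.

    n[i][j] is the number of individuals carrying i alternate alleles at
    variant A and j alternate alleles at variant B (i, j in 0..2).
    Returns 0.5 (uninformative) if either variant is never observed.
    """
    total_haplotypes = 2 * sum(sum(row) for row in n)
    if total_haplotypes == 0:
        raise ValueError("genotype table is empty")

    # Haplotype counts from genotypes whose phase is unambiguous.
    # Lowercase = reference allele, uppercase = alternate allele.
    base = {
        "ab": 2 * n[0][0] + n[0][1] + n[1][0],
        "aB": 2 * n[0][2] + n[0][1] + n[1][2],
        "Ab": 2 * n[2][0] + n[1][0] + n[2][1],
        "AB": 2 * n[2][2] + n[1][2] + n[2][1],
    }
    double_hets = n[1][1]  # the only phase-ambiguous genotype

    p_cis = 0.5  # initial guess for P(cis) among double heterozygotes
    for _ in range(iterations):
        # Split double heterozygotes between cis (AB + ab) and trans (Ab + aB)
        # according to the current estimate, then re-estimate frequencies.
        freq = {
            "ab": (base["ab"] + double_hets * p_cis) / total_haplotypes,
            "AB": (base["AB"] + double_hets * p_cis) / total_haplotypes,
            "aB": (base["aB"] + double_hets * (1 - p_cis)) / total_haplotypes,
            "Ab": (base["Ab"] + double_hets * (1 - p_cis)) / total_haplotypes,
        }
        cis = freq["AB"] * freq["ab"]
        trans = freq["Ab"] * freq["aB"]
        if cis + trans == 0:
            break  # one of the variants is absent; nothing to update
        p_cis = cis / (cis + trans)
    return 1 - p_cis
```

In this sketch, two variants that are only ever seen in different carriers yield an estimate near 1 (likely in trans), while two variants that always appear together as double heterozygotes yield an estimate near 0 (likely in cis).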

6.
Nat Genet ; 54(5): 541-547, 2022 05.
Article in English | MEDLINE | ID: mdl-35410376

ABSTRACT

We report results from the Bipolar Exome (BipEx) collaboration analysis of whole-exome sequencing of 13,933 patients with bipolar disorder (BD) matched with 14,422 controls. We find an excess of ultra-rare protein-truncating variants (PTVs) in patients with BD among genes under strong evolutionary constraint in both major BD subtypes. We find enrichment of ultra-rare PTVs within genes implicated from a recent schizophrenia exome meta-analysis (SCHEMA; 24,248 cases and 97,322 controls) and among binding targets of CHD8. Genes implicated from genome-wide association studies (GWASs) of BD, however, are not significantly enriched for ultra-rare PTVs. Combining gene-level results with SCHEMA, AKAP11 emerges as a definitive risk gene (odds ratio (OR) = 7.06, P = 2.83 × 10⁻⁹). At the protein level, AKAP-11 interacts with GSK3B, the hypothesized target of lithium, a primary treatment for BD. Our results lend support to BD's polygenicity, demonstrating a role for rare coding variation as a significant risk factor in BD etiology.


Subjects
Bipolar Disorder, Schizophrenia, A Kinase Anchor Proteins/genetics, Bipolar Disorder/genetics, Exome/genetics, Genetic Predisposition to Disease, Genome-Wide Association Study, Humans, Schizophrenia/genetics, Exome Sequencing
7.
Hum Mutat ; 43(6): 698-707, 2022 06.
Article in English | MEDLINE | ID: mdl-35266241

ABSTRACT

Exome and genome sequencing have become the tools of choice for rare disease diagnosis, leading to large amounts of data available for analyses. To identify causal variants in these datasets, powerful filtering and decision support tools that can be efficiently used by clinicians and researchers are required. To address this need, we developed seqr - an open-source, web-based tool for family-based monogenic disease analysis that allows researchers to work collaboratively to search and annotate genomic callsets. To date, seqr is being used in several research pipelines and one clinical diagnostic lab. In our own experience through the Broad Institute Center for Mendelian Genomics, seqr has enabled analyses of over 10,000 families, supporting the diagnosis of more than 3,800 individuals with rare disease and discovery of over 300 novel disease genes. Here, we describe a framework for genomic analysis in rare disease that leverages seqr's capabilities for variant filtration, annotation, and causal variant identification, as well as support for research collaboration and data sharing. The seqr platform is available as open source software, allowing low-cost participation in rare disease research, and a community effort to support diagnosis and gene discovery in rare disease.


Subjects
Genomics, Rare Diseases, Exome, Humans, Internet, Rare Diseases/diagnosis, Rare Diseases/genetics, Software
8.
Cell Genom ; 2(9): 100168, 2022 Sep 14.
Article in English | MEDLINE | ID: mdl-36778668

ABSTRACT

Genome-wide association studies have successfully discovered thousands of common variants associated with human diseases and traits, but the landscape of rare variations in human disease has not been explored at scale. Exome-sequencing studies of population biobanks provide an opportunity to systematically evaluate the impact of rare coding variations across a wide range of phenotypes to discover genes and allelic series relevant to human health and disease. Here, we present results from systematic association analyses of 4,529 phenotypes using single-variant and gene tests of 394,841 individuals in the UK Biobank with exome-sequence data. We find that the discovery of genetic associations is tightly linked to frequency and is correlated with metrics of deleteriousness and natural selection. We highlight biological findings elucidated by these data and release the dataset as a public resource alongside the Genebass browser for rapidly exploring rare-variant association results.

9.
Hum Mutat ; 43(8): 1012-1030, 2022 08.
Article in English | MEDLINE | ID: mdl-34859531

ABSTRACT

Reference population databases are an essential tool in variant and gene interpretation. Their use guides the identification of pathogenic variants amidst the sea of benign variation present in every human genome, and supports the discovery of new disease-gene relationships. The Genome Aggregation Database (gnomAD) is currently the largest and most widely used publicly available collection of population variation from harmonized sequencing data. The data is available through the online gnomAD browser (https://gnomad.broadinstitute.org/) that enables rapid and intuitive variant analysis. This review provides guidance on the content of the gnomAD browser, and its usage for variant and gene interpretation. We introduce key features including allele frequency, per-base expression levels, constraint scores, and variant co-occurrence, alongside guidance on how to use these in analysis, with a focus on the interpretation of candidate variants and novel genes in rare disease.


Subjects
Rare Diseases, Software, Genetic Databases, Gene Frequency, Humans, Rare Diseases/genetics
14.
Nature ; 581(7809): 444-451, 2020 05.
Article in English | MEDLINE | ID: mdl-32461652

ABSTRACT

Structural variants (SVs) rearrange large segments of DNA [1] and can have profound consequences in evolution and human disease [2,3]. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) [4] have become integral in the interpretation of single-nucleotide variants (SNVs) [5]. However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage [6]. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings [7]. This SV resource is freely distributed via the gnomAD browser [8] and will have broad utility in population genetics, disease-association studies, and diagnostic screening.


Subjects
Disease/genetics, Genetic Variation, Medical Genetics/standards, Population Genetics/standards, Human Genome/genetics, Female, Genetic Testing, Genotyping Techniques, Humans, Male, Middle Aged, Mutation, Single Nucleotide Polymorphism/genetics, Racial Groups/genetics, Reference Standards, Genetic Selection, Whole Genome Sequencing
15.
Nature ; 581(7809): 452-458, 2020 05.
Article in English | MEDLINE | ID: mdl-32461655

ABSTRACT

The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD) [1], we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the 'proportion expressed across transcripts', which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project [2] and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases. Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.
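
A minimal sketch of what a 'proportion expressed across transcripts' value could look like for one variant in one tissue, assuming it is the share of a gene's total transcript expression that comes from the transcripts on which the variant carries the annotation of interest. The function name, the transcript identifiers, and the expression values are hypothetical, and details such as normalization and averaging across tissues are left out because the abstract does not give them.

```python
def proportion_expressed(transcript_expression, annotated_transcripts):
    """Share of a gene's expression carried by the annotated transcripts.

    transcript_expression: dict mapping transcript ID -> expression level
        (for example, TPM) in a single tissue.
    annotated_transcripts: set of transcript IDs on which the variant has
        the annotation of interest (for example, pLoF).
    """
    total = sum(transcript_expression.values())
    if total == 0:
        return 0.0  # gene not expressed in this tissue
    annotated = sum(
        level
        for transcript, level in transcript_expression.items()
        if transcript in annotated_transcripts
    )
    return annotated / total


# Hypothetical gene with one dominant and one minor isoform.
expression = {"tx_major": 90.0, "tx_minor": 10.0}
weak = proportion_expressed(expression, {"tx_minor"})
strong = proportion_expressed(expression, {"tx_major", "tx_minor"})
```

Under this reading, a pLoF call that only affects the minor isoform gets a low value, which is the kind of variant the abstract describes as falling in a weakly expressed region.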


Subjects
Disease/genetics, Haploinsufficiency/genetics, Loss of Function Mutation/genetics, Molecular Sequence Annotation, Genetic Transcription, Transcriptome/genetics, Autism Spectrum Disorder/genetics, Datasets as Topic, Developmental Disabilities/genetics, Exons/genetics, Female, Genotype, Humans, Intellectual Disability/genetics, Male, Molecular Sequence Annotation/standards, Poisson Distribution, Messenger RNA/analysis, Messenger RNA/genetics, Rare Diseases/diagnosis, Rare Diseases/genetics, Reproducibility of Results, Exome Sequencing
16.
Nature ; 581(7809): 434-443, 2020 05.
Article in English | MEDLINE | ID: mdl-32461654

ABSTRACT

Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes [1]. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.


Subjects
Exome/genetics, Essential Genes/genetics, Genetic Variation/genetics, Human Genome/genetics, Adult, Brain/metabolism, Cardiovascular Diseases/genetics, Cohort Studies, Genetic Databases, Female, Genetic Predisposition to Disease/genetics, Genome-Wide Association Study, Humans, Loss of Function Mutation/genetics, Male, Mutation Rate, Proprotein Convertase 9/genetics, Messenger RNA/genetics, Reproducibility of Results, Exome Sequencing, Whole Genome Sequencing
17.
Nucleic Acids Res ; 45(D1): D840-D845, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899611

ABSTRACT

Worldwide, hundreds of thousands of humans have had their genomes or exomes sequenced, and access to the resulting data sets can provide valuable information for variant interpretation and understanding gene function. Here, we present a lightweight, flexible browser framework to display large population datasets of genetic variation. We demonstrate its use for exome sequence data from 60 706 individuals in the Exome Aggregation Consortium (ExAC). The ExAC browser provides gene- and transcript-centric displays of variation, a critical view for clinical applications. Additionally, we provide a variant display, which includes population frequency and functional annotation data as well as short read support for the called variant. This browser is open-source, freely available at http://exac.broadinstitute.org, and has already been used extensively by clinical laboratories worldwide.


Subjects
Computational Biology/methods, Genetic Databases, Exome, Genomics/methods, Web Browser, Genome-Wide Association Study/methods, Humans, Software, User-Computer Interface
18.
ACS Chem Biol ; 10(7): 1684-93, 2015 Jul 17.
Article in English | MEDLINE | ID: mdl-25856271

ABSTRACT

Within a superfamily, functionally diverged metalloenzymes often favor different metals as cofactors for catalysis. One hypothesis is that incorporation of alternative metals expands the catalytic repertoire of metalloenzymes and provides evolutionary springboards toward new catalytic functions. However, there is little experimental evidence that incorporation of alternative metals changes the activity profile of metalloenzymes. Here, we systematically investigate how metals alter the activity profiles of five functionally diverged enzymes of the metallo-β-lactamase (MBL) superfamily. Each enzyme was reconstituted in vitro with six different metals, Cd²⁺, Co²⁺, Fe²⁺, Mn²⁺, Ni²⁺, and Zn²⁺, and assayed against eight catalytically distinct hydrolytic reactions (representing native functions of MBL enzymes). We reveal that each enzyme metal isoform has a significantly different activity level for native and promiscuous reactions. Moreover, metal preferences for native versus promiscuous activities are not correlated and, in some cases, are mutually exclusive; only particular metal isoforms disclose cryptic promiscuous activities but often at the expense of the native activity. For example, the L1 B3 β-lactamase displays a 1000-fold catalytic preference for Zn²⁺ over Ni²⁺ for its native activity but exhibits promiscuous thioester, phosphodiester, phosphotriester, and lactonase activity only with Ni²⁺. Furthermore, we find that the five MBL enzymes exist as an ensemble of various metal isoforms in vivo, and this heterogeneity results in an expanded activity profile compared to a single metal isoform. Our study suggests that promiscuous activities of metalloenzymes can stem from an ensemble of metal isoforms in the cell, which could facilitate the functional divergence of metalloenzymes.


Subjects
Alteromonas/enzymology, Escherichia coli/enzymology, Metals/metabolism, Pseudomonas aeruginosa/enzymology, Salmonella/enzymology, beta-Lactamases/metabolism, Alteromonas/chemistry, Escherichia coli/chemistry, Hydrolysis, Metals/chemistry, Molecular Models, Protein Isoforms/chemistry, Protein Isoforms/metabolism, Pseudomonas aeruginosa/chemistry, Salmonella/chemistry, beta-Lactamases/chemistry
19.
Structure ; 23(3): 571-583, 2015 Mar 03.
Article in English | MEDLINE | ID: mdl-25684576

ABSTRACT

Mycobacterium tuberculosis (Mtb) uses the ESX-1 type VII secretion system to export virulence proteins across its lipid-rich cell wall, which helps permeabilize the host's macrophage phagosomal membrane, facilitating the escape and cell-to-cell spread of Mtb. ESX-1 membranolytic activity depends on a set of specialized secreted Esp proteins, the structure and specific roles of which are not currently understood. Here, we report the X-ray and electron microscopic structures of the ESX-1-secreted EspB. We demonstrate that EspB adopts a PE/PPE-like fold that mediates oligomerization with apparent heptameric symmetry, generating a barrel-shaped structure with a central pore that we propose contributes to the macrophage killing functions of EspB. Our structural data also reveal unexpected direct interactions between the EspB bipartite secretion signal sequence elements that form a unified aromatic surface. These findings provide insight into how specialized proteins encoded within the ESX-1 locus are targeted for secretion, and for the first time indicate an oligomerization-dependent role for Esp virulence factors.


Subjects
Bacterial Proteins/chemistry, Bacterial Secretion Systems/chemistry, Mycobacterium smegmatis/chemistry, Mycobacterium tuberculosis/chemistry, Amino Acid Sequence, Bacterial Proteins/physiology, Bacterial Secretion Systems/physiology, Biological Transport, X-Ray Crystallography, Hydrogen Bonding, Hydrophobic and Hydrophilic Interactions, Molecular Models, Molecular Sequence Data, Quaternary Protein Structure, Secondary Protein Structure
20.
Proc Natl Acad Sci U S A ; 112(6): E576-85, 2015 Feb 10.
Article in English | MEDLINE | ID: mdl-25624472

ABSTRACT

Unique to Gram-positive bacteria, wall teichoic acids are anionic glycopolymers cross-stitched to a thick layer of peptidoglycan. The polyol phosphate subunits of these glycopolymers are decorated with GlcNAc sugars that are involved in phage binding, genetic exchange, host antibody response, resistance, and virulence. The search for the enzymes responsible for GlcNAcylation in Staphylococcus aureus has recently identified TarM and TarS with respective α- and β-(1-4) glycosyltransferase activities. The stereochemistry of the GlcNAc attachment is important in balancing biological processes, such that the interplay of TarM and TarS is likely important for bacterial pathogenicity and survival. Here we present the crystal structure of TarM in an unusual ternary-like complex consisting of a polymeric acceptor substrate analog, UDP from a hydrolyzed donor, and an α-glyceryl-GlcNAc product formed in situ. These structures support an internal nucleophilic substitution-like mechanism, lend new mechanistic insight into the glycosylation of glycopolymers, and reveal a trimerization domain with a likely role in acceptor substrate scaffolding.


Subjects
Bacterial Proteins/chemistry, Bacterial Proteins/metabolism, Cell Wall/enzymology, Glycosyltransferases/metabolism, Molecular Models, Staphylococcus aureus/enzymology, Teichoic Acids/metabolism, Bacterial Proteins/genetics, Molecular Cloning, Crystallization, Enzyme Stability, Glycosyltransferases/chemistry, Glycosyltransferases/genetics, Mass Spectrometry, Metals/analysis, Biomolecular Nuclear Magnetic Resonance, Polymerization, Protein Conformation