Results 1 - 20 of 61
1.
Cell ; 187(9): 2336-2341.e5, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38582080

ABSTRACT

The Genome Aggregation Database (gnomAD), widely recognized as the gold-standard reference map of human genetic variation, has largely overlooked tandem repeat (TR) expansions, despite the fact that TRs constitute ∼6% of our genome and are linked to over 50 human diseases. Here, we introduce TR-gnomAD (https://wlcb.oit.uci.edu/TRgnomAD), a biobank-scale reference of 0.86 million TRs derived from 338,963 whole-genome sequencing (WGS) samples of diverse ancestries (39.5% non-European samples). TR-gnomAD offers critical insights into ancestry-specific disease prevalence using disparities in TR unit number frequencies among ancestries. Moreover, TR-gnomAD is able to differentiate common, presumably benign TR expansions, which are prevalent in TR-gnomAD, from potentially pathogenic TR expansions, which are found more frequently in disease groups than within TR-gnomAD. Together, TR-gnomAD is an invaluable resource for researchers and physicians to interpret TR expansions in individuals with genetic diseases.


Subjects
Genome, Human; Tandem Repeat Sequences; Humans; Tandem Repeat Sequences/genetics; Whole Genome Sequencing; Databases, Genetic; DNA Repeat Expansion/genetics; Genome-Wide Association Study
2.
Am J Hum Genet ; 110(9): 1496-1508, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37633279

ABSTRACT

Predicted loss of function (pLoF) variants are often highly deleterious and play an important role in disease biology, but many pLoF variants may not result in loss of function (LoF). Here we present a framework that advances interpretation of pLoF variants in research and clinical settings by considering three categories of LoF evasion: (1) predicted rescue by secondary sequence properties, (2) uncertain biological relevance, and (3) potential technical artifacts. We also provide recommendations on adjustments to ACMG/AMP guidelines' PVS1 criterion. Applying this framework to all high-confidence pLoF variants in 22 genes associated with autosomal-recessive disease from the Genome Aggregation Database (gnomAD v.2.1.1) revealed predicted LoF evasion or potential artifacts in 27.3% (304/1,113) of variants. The major reasons were location in the last exon, in a homopolymer repeat, in a low proportion expressed across transcripts (pext) scored region, or the presence of cryptic in-frame splice rescues. Variants predicted to evade LoF or to be potential artifacts were enriched for ClinVar benign variants. PVS1 was downgraded in 99.4% (162/163) of pLoF variants predicted as likely not LoF/not LoF, with 17.2% (28/163) downgraded as a result of our framework, adding to previous guidelines. Variant pathogenicity was affected (mostly from likely pathogenic to VUS) in 20 (71.4%) of these 28 variants. This framework guides assessment of pLoF variants beyond standard annotation pipelines and substantially reduces false positive rates, which is key to ensure accurate LoF variant prediction in both a research and clinical setting.


Subjects
Inheritance Patterns; Humans; Exons; Uncertainty
3.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36528806

ABSTRACT

Determining the pathogenicity and functional impact (i.e. gain-of-function; GOF or loss-of-function; LOF) of a variant is vital for unraveling the genetic level mechanisms of human diseases. To provide a 'one-stop' framework for the accurate identification of pathogenicity and functional impact of variants, we developed a two-stage deep-learning-based computational solution, termed VPatho, which was trained using a total of 9619 pathogenic GOF/LOF and 138 026 neutral variants curated from various databases. A total number of 138 variant-level, 262 protein-level and 103 genome-level features were extracted for constructing the models of VPatho. The development of VPatho consists of two stages: (i) a random under-sampling multi-scale residual neural network (ResNet) with a newly defined weighted-loss function (RUS-Wg-MSResNet) was proposed to predict variants' pathogenicity on the gnomAD_NV + GOF/LOF dataset; and (ii) an XGBOD model was constructed to predict the functional impact of the given variants. Benchmarking experiments demonstrated that RUS-Wg-MSResNet achieved the highest prediction performance with the weights calculated based on the ratios of neutral versus pathogenic variants. Independent tests showed that both RUS-Wg-MSResNet and XGBOD achieved outstanding performance. Moreover, assessed using variants from the CAGI6 competition, RUS-Wg-MSResNet achieved superior performance compared to state-of-the-art predictors. The fine-trained XGBOD models were further used to blind test the whole LOF data downloaded from gnomAD and accordingly, we identified 31 nonLOF variants that were previously labeled as LOF/uncertain variants. As an implementation of the developed approach, a webserver of VPatho is made publicly available at http://csbio.njust.edu.cn/bioinf/vpatho/ to facilitate community-wide efforts for profiling and prioritizing the query variants with respect to their pathogenicity and functional impact.
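The idea of weighting the loss by the neutral-to-pathogenic ratio can be illustrated with a generic class-weighted binary cross-entropy. This is only a minimal sketch under stated assumptions: it is not the RUS-Wg-MSResNet loss defined by the authors, the function names `class_weights` and `weighted_bce` are hypothetical, and the only values taken from the abstract are the two training-set counts (138 026 neutral, 9619 pathogenic).

```python
import math


def class_weights(n_neutral, n_pathogenic):
    """Weight the minority (pathogenic, label 1) class by the class ratio.

    Assumption: a plain neutral/pathogenic ratio; the paper's exact
    weighting scheme is not given in the abstract.
    """
    return {0: 1.0, 1: n_neutral / n_pathogenic}


def weighted_bce(y_true, y_prob, weights, eps=1e-12):
    """Mean binary cross-entropy with a per-class weight on each sample."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        sample_loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * sample_loss
    return total / len(y_true)


# Counts reported in the abstract; the ratio is roughly 14.35.
weights = class_weights(138026, 9619)
```

With these weights, a misclassified pathogenic variant costs about 14 times as much as a misclassified neutral one, which offsets the imbalance of the training set.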


Subjects
Deep Learning; Humans; Gain of Function Mutation; Genome
4.
Am J Hum Genet ; 108(7): 1270-1282, 2021 07 01.
Article in English | MEDLINE | ID: mdl-34157305

ABSTRACT

Publicly available genetic summary data have high utility in research and the clinic, including prioritizing putative causal variants, polygenic scoring, and leveraging common controls. However, summarizing individual-level data can mask population structure, resulting in confounding, reduced power, and incorrect prioritization of putative causal variants. This limits the utility of publicly available data, especially for understudied or admixed populations where additional research and resources are most needed. Although several methods exist to estimate ancestry in individual-level data, methods to estimate ancestry proportions in summary data are lacking. Here, we present Summix, a method to efficiently deconvolute ancestry and provide ancestry-adjusted allele frequencies (AFs) from summary data. Using continental reference ancestry, African (AFR), non-Finnish European (EUR), East Asian (EAS), Indigenous American (IAM), South Asian (SAS), we obtain accurate and precise estimates (within 0.1%) for all simulation scenarios. We apply Summix to gnomAD v.2.1 exome and genome groups and subgroups, finding heterogeneous continental ancestry for several groups, including African/African American (∼84% AFR, ∼14% EUR) and American/Latinx (∼4% AFR, ∼5% EAS, ∼43% EUR, ∼46% IAM). Compared to the unadjusted gnomAD AFs, Summix's ancestry-adjusted AFs more closely match respective African and Latinx reference samples. Even on modern, dense panels of summary statistics, Summix yields results in seconds, allowing for estimation of confidence intervals via block bootstrap. Given an accompanying R package, Summix increases the utility and equity of public genetic resources, empowering novel research opportunities.
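The mixture model behind this kind of ancestry deconvolution can be sketched in a few lines. The sketch assumes that each observed allele frequency is a convex combination of the reference-ancestry allele frequencies and estimates the proportions by least squares under the constraints that they are non-negative and sum to 1. The optimizer here is plain projected gradient descent, chosen for brevity; it is not claimed to be what the Summix R package uses. All function names are hypothetical and the data are simulated.

```python
import random


def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_k >= 0, sum(p) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        t = (cumulative - 1.0) / i
        if u_i - t > 0:
            theta = t  # keep the threshold from the last valid index
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(observed, reference, steps=1500):
    """Least-squares ancestry proportions from summary allele frequencies.

    observed:  one allele frequency per SNP.
    reference: one row per SNP, one reference-ancestry frequency per column.
    """
    n = len(observed)
    k = len(reference[0])
    lr = 1.0 / (2.0 * k)  # safe step size because frequencies lie in [0, 1]
    pi = [1.0 / k] * k    # start from equal proportions
    for _ in range(steps):
        grad = [0.0] * k
        for obs, row in zip(observed, reference):
            resid = sum(p * r for p, r in zip(pi, row)) - obs
            for j in range(k):
                grad[j] += 2.0 * resid * row[j] / n
        pi = project_to_simplex([p - lr * g for p, g in zip(pi, grad)])
    return pi


def simulate(true_pi, n_snps=200, seed=0):
    """Simulated (not real) reference frequencies and a noise-free mixture."""
    rng = random.Random(seed)
    reference = [[rng.random() for _ in true_pi] for _ in range(n_snps)]
    observed = [sum(p * r for p, r in zip(true_pi, row)) for row in reference]
    return observed, reference
```

On a noise-free simulated mixture, `estimate_proportions(*simulate([0.84, 0.14, 0.02]))` should recover the input proportions closely; real summary data would add sampling noise and call for the block-bootstrap confidence intervals the abstract mentions.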


Subjects
Data Interpretation, Statistical; Metagenomics/methods; Pedigree; Racial Groups/genetics; Alleles; Computer Simulation; Gene Frequency; Humans; Inheritance Patterns; Software
5.
BMC Genomics ; 24(1): 12, 2023 Jan 10.
Article in English | MEDLINE | ID: mdl-36627554

ABSTRACT

BACKGROUND: COVID-19 caused by the SARS-CoV-2 infection may result in various disease symptoms and severity, ranging from asymptomatic, through mildly symptomatic, up to very severe and even fatal cases. Although environmental, clinical, and social factors play important roles in both susceptibility to the SARS-CoV-2 infection and progress of COVID-19 disease, it is becoming evident that both pathogen and host genetic factors are important too. In this study, we report findings from whole-exome sequencing (WES) of 27 individuals who died due to COVID-19, especially focusing on frequencies of DNA variants in genes previously associated with the SARS-CoV-2 infection and the severity of COVID-19. RESULTS: We selected the risk DNA variants/alleles or target genes using four different approaches: 1) aggregated GWAS results from the GWAS Catalog; 2) selected publications from PubMed; 3) the aggregated results of the Host Genetics Initiative database; and 4) a commercial DNA variant annotation/interpretation tool providing its own knowledgebase. We divided these variants/genes into those reported to influence the susceptibility to the SARS-CoV-2 infection and those influencing the severity of COVID-19. Based on the above, we compared the frequencies of alleles found in the fatal COVID-19 cases to the frequencies identified in two population control datasets (non-Finnish European population from the gnomAD database and genomic frequencies specific for the Slovak population from our own database). When compared to both control population datasets, our analyses indicated a trend of higher frequencies of severe COVID-19 associated risk alleles among fatal COVID-19 cases. This trend reached statistical significance specifically when using the HGI-derived variant list. We also analysed other approaches to WES data evaluation, demonstrating its utility as well as limitations. 
CONCLUSIONS: Although our results proved the likely involvement of host genetic factors pointed out by previous studies looking into severity of COVID-19 disease, careful considerations of the molecular-testing strategies and the evaluated genomic positions may have a strong impact on the utility of genomic testing.


Subjects
COVID-19; Humans; COVID-19/genetics; SARS-CoV-2; Exome Sequencing; Alleles; DNA
6.
Am J Hum Genet ; 107(3): 487-498, 2020 09 03.
Article in English | MEDLINE | ID: mdl-32800095

ABSTRACT

The aggregation and joint analysis of large numbers of exome sequences has recently made it possible to derive estimates of intolerance to loss-of-function (LoF) variation for human genes. Here, we demonstrate strong and widespread coupling between genic LoF intolerance and promoter CpG density across the human genome. Genes downstream of the most CpG-rich promoters (top 10% CpG density) have a 67.2% probability of being highly LoF intolerant, using the LOEUF metric from gnomAD. This is in contrast to 7.4% of genes downstream of the most CpG-poor (bottom 10% CpG density) promoters. Combining promoter CpG density with exonic and promoter conservation explains 33.4% of the variation in LOEUF, and the contribution of CpG density exceeds the individual contributions of exonic and promoter conservation. We leverage this to train a simple and easily interpretable predictive model that outperforms other existing predictors and allows us to classify 1,760 genes (which are currently unascertained in gnomAD) as highly LoF intolerant or not. These predictions have the potential to aid in the interpretation of novel variants in the clinical setting. Moreover, our results reveal that high CpG density is not merely a generic feature of human promoters but is preferentially encountered at the promoters of the most selectively constrained genes, calling into question the prevailing view that CpG islands are not subject to selection.
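One common way to quantify the CpG density of a promoter sequence is the observed-to-expected CpG ratio. The sketch below assumes that classic definition, (number of CpG dinucleotides × sequence length) / (number of C × number of G); the abstract does not state which definition the authors used, so this is an illustration rather than their method. The function name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio of a DNA sequence.

    Assumption: the classic observed/expected definition; returns 0.0
    when the sequence has no C or no G, so the ratio stays defined.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    n_cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if n_c == 0 or n_g == 0:
        return 0.0
    return n_cpg * len(seq) / (n_c * n_g)
```

Ranking promoters by such a score and taking the top and bottom deciles mirrors the "top 10%" and "bottom 10%" comparison described in the abstract.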


Subjects
CpG Islands/genetics; Genome, Human/genetics; Loss of Function Mutation/genetics; Promoter Regions, Genetic/genetics; DNA Methylation/genetics; Exons/genetics; Humans; RNA Polymerase II/genetics; Transcription Initiation Site
7.
Hum Mutat ; 43(8): 1012-1030, 2022 08.
Article in English | MEDLINE | ID: mdl-34859531

ABSTRACT

Reference population databases are an essential tool in variant and gene interpretation. Their use guides the identification of pathogenic variants amidst the sea of benign variation present in every human genome, and supports the discovery of new disease-gene relationships. The Genome Aggregation Database (gnomAD) is currently the largest and most widely used publicly available collection of population variation from harmonized sequencing data. The data is available through the online gnomAD browser (https://gnomad.broadinstitute.org/) that enables rapid and intuitive variant analysis. This review provides guidance on the content of the gnomAD browser, and its usage for variant and gene interpretation. We introduce key features including allele frequency, per-base expression levels, constraint scores, and variant co-occurrence, alongside guidance on how to use these in analysis, with a focus on the interpretation of candidate variants and novel genes in rare disease.


Subjects
Rare Diseases; Software; Databases, Genetic; Gene Frequency; Humans; Rare Diseases/genetics
8.
Hum Mutat ; 42(8): 903-946, 2021 08.
Article in English | MEDLINE | ID: mdl-34082484

ABSTRACT

Rare variants of the olfactomedin domain of myocilin are considered causative for inherited, early-onset open-angle glaucoma, with a misfolding toxic gain-of-function pathogenic mechanism detailed by 20 years of laboratory research. Myocilin variants are documented in the scientific literature and identified through large-scale genetic sequencing projects such as those curated in the Genome Aggregation Database (gnomAD). In the absence of key clinical and laboratory information, however, the pathogenicity of any given variant is not clear, because glaucoma is a heterogeneous and prevalent age-onset disease, and common variants are likely benign. In this review, we reevaluate the likelihood of pathogenicity for the ~100 nonsynonymous missense, insertion-deletion, and premature termination of myocilin olfactomedin variants documented in the literature. We integrate available clinical, laboratory cellular, biochemical and biophysical data, the olfactomedin domain structure, and population genetics data from gnomAD. Of the variants inspected, ~50% can be binned based on a preponderance of data, leaving many of uncertain pathogenicity that motivate additional studies. Ultimately, the approach of combining metrics from different disciplines will likely resolve outstanding complexities regarding the role of this misfolding-prone protein within the context of a multifactorial and prevalent ocular disease, and pave the way for new precision medicine therapeutics.


Subjects
Glaucoma, Open-Angle; Glaucoma; Cytoskeletal Proteins; Eye Proteins/chemistry; Eye Proteins/genetics; Glaucoma/genetics; Glaucoma, Open-Angle/genetics; Glaucoma, Open-Angle/metabolism; Glycoproteins; Humans; Mutation; Virulence
9.
Hum Mutat ; 42(5): 530-536, 2021 05.
Article in English | MEDLINE | ID: mdl-33600021

ABSTRACT

Aggregate population genomics data from large cohorts are vital for assessing germline variant pathogenicity. However, there are no specifications on how sequencing quality metrics should be considered, and whether exome-derived and genome-derived allele frequencies should be considered in isolation. Germline genome sequence data were simulated for nine read-depths to identify a minimum acceptable read-depth for detecting variants. gnomAD exome-derived and genome-derived datasets were assessed for read-depth, for six key cancer genes selected for variant curation by ClinGen expert panels. Non-Finnish European allele frequency (AF) or filter AF of coding variants in these genes, assigned into frequency bins using modified ACMG-AMP criteria, was compared between exome-derived and genome-derived datasets. A 30X read-depth achieved acceptable precision and recall for detection of substitutions, but poor recall for small insertions/deletions. Exome-derived and genome-derived datasets exhibited low read-depth for different gene exons. Individual variants were mostly assigned to non-divergent AF bins (>95%) or filter AF bins (>97%). Two major bin divergences were resolved by applying the minimal acceptable read-depth threshold. These findings show the importance of assessing read-depth separately for population datasets sourced from different short-read sequencing technologies before assigning a frequency-based ACMG-AMP classification code for variant interpretation.


Subjects
Genome, Human; Neoplasms; Gene Frequency; Genetic Testing; Genetic Variation; Genomics; Germ Cells; Humans; Neoplasms/genetics
10.
Hum Mutat ; 42(9): 1107-1123, 2021 09.
Article in English | MEDLINE | ID: mdl-34153149

ABSTRACT

Next-generation sequencing technology has afforded the discovery of many novel variants that are of significance to inheritable pharmacogenomics (PGx) traits but a large proportion of them have unknown consequences. These include missense variants resulting in single amino acid substitutions in cytochrome P450 (CYP) proteins that can impair enzyme function, leading to altered drug efficacy and toxicity. While most unknown variants are rare, an overlooked minority are variants that are collectively rare but enriched in specific populations. Here, we analyzed sequence variation data in 141,456 individuals from across eight study populations in gnomAD for 38 CYP genes to identify such variants in addition to common variants. By further comparison with data from two PGx-specific databases (PharmVar and PharmGKB) and ClinVar, we identified 234 missense variants in 35 CYP genes, of which 107 were unknown to these databases. Most unknown variants (n = 83) were population-specific common variants and several (n = 7) were found in important CYP pharmacogenes (CYP2D6, CYP4F2, and CYP2C19). Overall, 29% (n = 31) of 107 unknown variants were predicted to affect CYP enzyme function although further biochemical characterization is necessary. These variants may elucidate part of the unexplained interpopulation differences observed in drug response.


Subjects
Cytochrome P-450 CYP2D6; Cytochrome P-450 Enzyme System; Cytochrome P-450 CYP2D6/genetics; Cytochrome P-450 Enzyme System/genetics; High-Throughput Nucleotide Sequencing/methods; Humans; Pharmacogenetics/methods; Phenotype
11.
Am J Hum Genet ; 102(3): 415-426, 2018 03 01.
Article in English | MEDLINE | ID: mdl-29455857

ABSTRACT

The spatial distribution of genetic variation within proteins is shaped by evolutionary constraint and provides insight into the functional importance of protein regions and the potential pathogenicity of protein alterations. Here, we comprehensively evaluate the 3D spatial patterns of human germline and somatic variation in 6,604 experimentally derived protein structures and 33,144 computationally derived homology models covering 77% of all human proteins. Using a systematic approach, we quantify differences in the spatial distributions of neutral germline variants, disease-causing germline variants, and recurrent somatic variants. Neutral missense variants exhibit a general trend toward spatial dispersion, which is driven by constraint on core residues. In contrast, germline disease-causing variants are generally clustered in protein structures and form clusters more frequently than recurrent somatic variants identified from tumor sequencing. In total, we identify 215 proteins with significant spatial constraints on the distribution of disease-causing missense variants in experimentally derived protein structures, only 65 (30%) of which have been previously reported. This analysis identifies many clusters not detectable from sequence information alone; only 12% of proteins with significant clustering in 3D were identified from similar analyses of linear protein sequence. Furthermore, spatial analyses of mutations in homology-based structural models are highly correlated with those from experimentally derived structures, supporting the use of computationally derived models. Our approach highlights significant differences in the spatial constraints on different classes of mutations in protein structure and identifies regions of potential function within individual proteins.


Subjects
Mutation, Missense/genetics; Proteins/chemistry; Proteins/genetics; Amino Acid Sequence; Cluster Analysis; Humans; Models, Molecular
12.
Hum Mutat ; 41(1): 81-102, 2020 01.
Article in English | MEDLINE | ID: mdl-31553106

ABSTRACT

Massive parallel sequencing technologies are facilitating the faster identification of sequence variants with the consequent capability of untangling the molecular bases of many human genetic syndromes. However, it is not always easy to understand the impact of novel variants, especially for missense changes, which can lead to a spectrum of phenotypes. This study presents a custom-designed multistep methodology to evaluate the impact of novel variants aggregated in the genome aggregation database for the HBB, HBA2, and HBA1 genes, by testing and improving its performance with a dataset of previously described alterations affecting those same genes. This approach scored high sensitivity and specificity values and showed an overall better performance than sequence-derived predictors, highlighting the importance of protein conformation and interaction specific analyses in curating variant databases. This study also describes the strengths and limitations of these structural studies and allows identifying residues in the globin chains more prone to tolerate substitutions.


Subjects
Computational Biology; Databases, Genetic; Genetic Variation; Hemoglobins/genetics; Alleles; Amino Acid Substitution; Computational Biology/methods; Computational Biology/standards; Genotype; Hemoglobins/chemistry; Humans; Loss of Function Mutation; Mutation; Open Reading Frames; Phenotype; Sensitivity and Specificity; alpha-Globins/chemistry; alpha-Globins/genetics; beta-Globins/chemistry; beta-Globins/genetics
13.
Hum Mutat ; 41(9): 1629-1644, 2020 09.
Article in English | MEDLINE | ID: mdl-32598555

ABSTRACT

Genetic variation of the multi-zinc finger BTB domain transcription factor ZBTB18 can cause a spectrum of human neurodevelopmental disorders, but the underlying mechanisms are not well understood. Recently, we reported that pathogenic, de novo ZBTB18 missense mutations alter its DNA-binding specificity and gene regulatory functions, leading to human neurodevelopmental disease. However, the functional impact of the general population ZBTB18 missense variants is unknown. Here, we investigated such variants documented in the Genome Aggregation Database (gnomAD) to discover that ZBTB gene family members are intolerant to loss-of-function and missense mutations, but not synonymous mutations. We studied ZBTB18 as a protein-DNA complex to find that general population missense variants are rare, and disproportionately map to non-DNA-contact residues, in contrast to the majority of disease-associated variants that map to DNA-contact residues, essential to motif binding. We studied a selection of variants (n = 12), which spans the multi-zinc finger region to find 58.3% (7/12) of variants displayed altered DNA binding, 41.6% (5/12) exhibited altered transcriptional activity in a luciferase reporter assay, 33.3% (4/12) exhibited altered DNA binding and transcriptional activity, whereas 33.3% (4/12) displayed a negligible functional impact. Our results demonstrate that general population variants, while rare, can influence ZBTB18 function, with potential consequences for neurodevelopment, homeostasis, and disease.


Subjects
DNA-Binding Proteins/genetics; Mutation, Missense; Repressor Proteins/genetics; Gene Expression Regulation; Gene Frequency; Genetics, Population; HEK293 Cells; Humans; Protein Structure, Tertiary; Zinc Fingers
14.
Ann Hum Genet ; 84(6): 463-468, 2020 11.
Article in English | MEDLINE | ID: mdl-32484936

ABSTRACT

The complexity in the molecular diagnosis of Cystic Fibrosis (CF) also depends on the variable prevalence/incidence of the disease associated with the wide CFTR allelic heterogeneity among different populations. In fact, CF incidence in Asian and African countries is underestimated and the few patients reported so far have rare or unique CFTR pathogenic variants. To obtain insights into CF variants profile and frequency, we used the large population sequencing data in the Genome Aggregation Database (gnomAD). We selected 207 CF-causing/varying clinical consequence variants from CFTR2 database and additional 15 variants submitted to the ClinVar database. Only 14 of these variants were found in the East-Asian population, while for South-Asian and African populations we identified 43 and 52 variants, respectively, confirming the peculiarity of the CFTR allelic spectrum with only few population-specific variants. These data could be used to optimize CFTR carrier screening in non-Caucasian subjects, choosing between the full gene sequencing and cost and time-effective targeted panels.


Subjects
Asian People/genetics; Black People/genetics; Cystic Fibrosis Transmembrane Conductance Regulator/genetics; Cystic Fibrosis/genetics; Cystic Fibrosis/pathology; Databases, Genetic; Mutation; Alleles; Genetic Carrier Screening; Humans; Prognosis
15.
Hum Genomics ; 13(1): 61, 2019 12 03.
Article in English | MEDLINE | ID: mdl-31796115

ABSTRACT

Retinoic acid (RA) is a potent morphogen required for embryonic development. RA is formed in a multistep process from vitamin A (retinol); RA acts in a paracrine fashion to shape the developing eye and is essential for normal optic vesicle and anterior segment formation. Perturbation in RA-signaling can result in severe ocular developmental diseases, including microphthalmia, anophthalmia, and coloboma. RA-signaling is also essential for embryonic development and life, as indicated by the significant consequences of mutations in genes involved in RA-signaling. The requirement of RA-signaling for normal development is further supported by the manifestation of severe pathologies in animal models of RA deficiency, such as ventral lens rotation, failure of optic cup formation, and embryonic and postnatal lethality. In this review, we summarize RA-signaling, recent advances in our understanding of this pathway in eye development, and the requirement of RA-signaling for embryonic development (e.g., organogenesis and limb bud development) and life.


Subjects
Eye/metabolism; Signal Transduction/genetics; Tretinoin/metabolism; Animals; Eye/embryology; Gene Expression Regulation; Humans; Phenotype
16.
Int J Mol Sci ; 21(7)2020 Mar 26.
Article in English | MEDLINE | ID: mdl-32225115

ABSTRACT

Although several pharmacogenetic (PGx) predispositions affecting drug efficacy and safety are well established, drug selection and dosing as well as clinical trials are often performed in a non-pharmacogenetically-stratified manner, ultimately burdening healthcare systems. Pre-emptive PGx testing offers a solution which is often performed using microarrays or targeted gene panels, testing for common/known PGx variants. However, as an added value, whole-genome sequencing (WGS) could detect not only disease-causing but also pharmacogenetically-relevant variants in a single assay. Here, we present our WGS-based pipeline that extends the genetic testing of Mendelian diseases with PGx profiling, enabling the detection of rare/novel PGx variants as well. From our in-house WGS (PCR-free 60× PE150) data of 547 individuals we extracted PGx variants with drug-dosing recommendations of the Dutch Pharmacogenetics Working Group (DPWG). Furthermore, we explored the landscape of DPWG pharmacogenes in gnomAD and our in-house cohort as well as compared bioinformatic tools for WGS-based structural variant detection in CYP2D6. We show that although common/known PGx variants comprise the vast majority of detected DPWG pharmacogene alleles, for better precision medicine, PGx testing should move towards WGS-based approaches. Indeed, WGS-based PGx profiling is not only feasible and future-oriented but also the most comprehensive all-in-one approach without generating significant additional costs.


Subjects
Genetic Testing/methods; High-Throughput Nucleotide Sequencing/methods; Pharmacogenomic Variants; Whole Genome Sequencing/methods; Cytochrome P-450 CYP2D6/genetics; Genetic Testing/standards; High-Throughput Nucleotide Sequencing/standards; Humans; Whole Genome Sequencing/standards
17.
Hum Mutat ; 40(1): 97-105, 2019 01.
Article in English | MEDLINE | ID: mdl-30352134

ABSTRACT

Reports of variable cancer penetrance in Li-Fraumeni syndrome (LFS) have raised questions regarding the prevalence of pathogenic germline TP53 variants. We previously reported higher-than-expected population prevalence estimates in sequencing databases composed of individuals unselected for cancer history. This study aimed to expand and further evaluate the prevalence of pathogenic and likely pathogenic germline TP53 variants in the gnomAD dataset (version r2.0.2, n = 138,632). Variants were selected and classified based on our previously published algorithm and compared with alternative estimates based on three different classification databases: ClinVar, HGMD, and the UMD_TP53 database. Conservative prevalence estimates of pathogenic and likely pathogenic TP53 variants were within the range of one carrier in 3,555-5,476 individuals. Less stringent classification increased the approximate prevalence to one carrier in every 400-865 individuals, mainly due to the inclusion of the controvertible p.N235S, p.V31I, and p.R290H variants. This study shows a higher-than-expected population prevalence of pathogenic and likely pathogenic germline TP53 variants even with the most conservative estimates. However, these estimates may not necessarily reflect the prevalence of the classical LFS phenotype, which is based upon family history of cancer. Comprehensive approaches are needed to better understand the interplay of germline TP53 variant classification, prevalence estimates, cancer penetrance, and LFS-associated phenotype.
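The "one carrier in N individuals" figures are simple ratios of cohort size to carrier count. The sketch below shows the conversion in both directions. The only values taken from the abstract are the cohort size (138,632) and the reported "one in 3,555" bound; the carrier count is back-calculated here for illustration and is not a number reported by the authors. Both function names are hypothetical.

```python
def one_in_n(carriers, cohort_size):
    """Express a carrier count as 'one carrier in N individuals'."""
    if carriers <= 0:
        raise ValueError("carriers must be positive")
    return cohort_size / carriers


def implied_carriers(cohort_size, n):
    """Carrier count implied by a 'one in N' prevalence in a cohort."""
    if n <= 0:
        raise ValueError("n must be positive")
    return cohort_size / n


# Back-calculation (an assumption, not a reported count): a prevalence of
# one in 3,555 in a cohort of 138,632 implies about 39 carriers.
approx_carriers = implied_carriers(138632, 3555)
```

Because the counts involved are this small, reclassifying a handful of variants can shift the estimate substantially, which fits the gap the abstract reports between the conservative and less stringent estimates.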


Subjects
Databases, Genetic; Genetics, Population; Germ-Line Mutation/genetics; Molecular Sequence Annotation; Tumor Suppressor Protein p53/genetics; Humans; Middle Aged
18.
Hum Mutat ; 40(8): 1030-1038, 2019 08.
Article in English | MEDLINE | ID: mdl-31116477

ABSTRACT

The growing availability of human genetic variation has given rise to novel methods of measuring genetic tolerance that better interpret variants of unknown significance. We recently developed a concept based on protein domain homology in the human genome to improve variant interpretation. For this purpose, we mapped population variation from the Exome Aggregation Consortium (ExAC) and pathogenic mutations from the Human Gene Mutation Database (HGMD) onto Pfam protein domains. The aggregation of these variation data across homologous domains into meta-domains allowed us to generate amino acid resolution of genetic intolerance profiles for human protein domains. Here, we developed MetaDome, a fast and easy-to-use web server that visualizes meta-domain information and gene-wide profiles of genetic tolerance. We updated the underlying data of MetaDome to contain information from 56,319 human transcripts, 71,419 protein domains, 12,164,292 genetic variants from gnomAD, and 34,076 pathogenic mutations from ClinVar. MetaDome allows researchers to easily investigate their variants of interest for the presence or absence of variation at corresponding positions within homologous domains. We illustrate the added value of MetaDome by an example that highlights how it may help in the interpretation of variants of unknown significance. The MetaDome web server is freely accessible at https://stuart.radboudumc.nl/metadome.


Subjects
Computational Biology/methods; Genetic Variation; Proteins/chemistry; Proteins/genetics; Databases, Genetic; Genetic Predisposition to Disease; Genome, Human; Humans; Internet; Protein Domains; Software; Structural Homology, Protein
19.
Hum Mutat ; 40(5): 516-524, 2019 05.
Article in English | MEDLINE | ID: mdl-30720243

ABSTRACT

The 1000 Genomes Project, Exome Aggregation Consortium (ExAC), and Genome Aggregation Database (gnomAD) datasets were developed to provide large-scale reference data of genetic variations for various populations to filter out common benign variants and identify rare variants of clinical importance based on their frequency in the human population. Using a TP53 repository of 80,000 cancer variants, as well as TP53 variants from multiple cancer genome projects, we have defined a set of certified oncogenic TP53 variants. This specific set has been independently validated by functional and in silico predictive analysis. Here we show that a significant number of these variants are included in gnomAD and ExAC. Most of them correspond to TP53 hotspot variants occurring as somatic and germline events in human cancer. Similarly, disease-associated variants for five other tumor suppressor genes, including BRCA1, BRCA2, APC, PTEN, and MLH1, have also been identified. This study demonstrates that germline TP53 variants in the human population are more frequent than previously thought. Furthermore, population databases such as gnomAD or ExAC must be used with caution and need to be annotated for the presence of oncogenic variants to improve their clinical utility.


Subjects
Databases, Genetic; Genetic Predisposition to Disease; Genetic Variation; Neoplasms/genetics; Tumor Suppressor Protein p53/genetics; Alleles; Genetic Association Studies; Genotype; Germ-Line Mutation; Humans; Tumor Suppressor Proteins/genetics
20.
Clin Genet ; 96(6): 506-514, 2019 12.
Article in English | MEDLINE | ID: mdl-31402444

ABSTRACT

Arrhythmogenic right ventricular cardiomyopathy (ARVC) is one of the most common causes of sudden cardiac death in young people. Patients diagnosed with ARVC may be more likely to develop anxiety and depression, emphasizing the need for accurate diagnosis. To assist future genetic diagnosis and avoidance of misdiagnosis, we evaluated the reported monogenic disease-causing variants in ARVD/C Genetic Variants Database, Human Gene Mutation Database, and ClinVar. Within the aforementioned databases, 630 monogenic disease-causing variants from 18 genes were identified. Of these, 226 were present in the Genome Aggregation Database (gnomAD), 68 of which were found at greater than expected prevalence. Furthermore, 37 of the 226 variants were identified among the 409 000 UK Biobank participants; 23 of these were not associated with ARVC. Among the 14 remaining variants, 13 were previously found with greater than expected prevalence for a monogenic variant. Nevertheless, they were associated with serious cardiac phenotypes, suggesting that these 13 variants may be disease-modifiers of ARVC, rather than monogenic disease-causing variants. In summary, more than 10% of variants previously reported to cause ARVC were found unlikely to be associated with highly penetrant monogenic forms of ARVC. Notably, all variants in OBSCN and MYBPC3 were found, making these unlikely to be monogenic causes of ARVC.


Subjects
Arrhythmogenic Right Ventricular Dysplasia/genetics; Genetic Variation; Proteomics; Cohort Studies; Databases, Genetic; Genome, Human; Humans; Myocardium/pathology; Phenotype; Prevalence