Results 1 - 20 of 33
1.
Mol Cell ; 83(15): 2792-2809.e9, 2023 08 03.
Article in English | MEDLINE | ID: mdl-37478847

ABSTRACT

To maintain genome integrity, cells must accurately duplicate their genome and repair DNA lesions when they occur. To uncover genes that suppress DNA damage in human cells, we undertook flow-cytometry-based CRISPR-Cas9 screens that monitored DNA damage. We identified 160 genes whose mutation caused spontaneous DNA damage, a list enriched in essential genes, highlighting the importance of genomic integrity for cellular fitness. We also identified 227 genes whose mutation caused DNA damage in replication-perturbed cells. Among the genes characterized, we discovered that deoxyribose-phosphate aldolase DERA suppresses DNA damage caused by cytarabine (Ara-C) and that GNB1L, a gene implicated in 22q11.2 syndrome, promotes biogenesis of ATR and related phosphatidylinositol 3-kinase-related kinases (PIKKs). These results implicate defective PIKK biogenesis as a cause of some phenotypes associated with 22q11.2 syndrome. The phenotypic mapping of genes that suppress DNA damage therefore provides a rich resource to probe the cellular pathways that influence genome maintenance.


Subjects
CRISPR-Cas Systems , DNA Damage , Humans , Mutation , DNA Repair , Phenotype
2.
Cell ; 161(3): 647-660, 2015 Apr 23.
Article in English | MEDLINE | ID: mdl-25910212

ABSTRACT

How disease-associated mutations impair protein activities in the context of biological networks remains mostly undetermined. Although a few renowned alleles are well characterized, functional information is missing for over 100,000 disease-associated variants. Here we functionally profile several thousand missense mutations across a spectrum of Mendelian disorders using various interaction assays. The majority of disease-associated alleles exhibit wild-type chaperone binding profiles, suggesting they preserve protein folding or stability. While common variants from healthy individuals rarely affect interactions, two-thirds of disease-associated alleles perturb protein-protein interactions, with half corresponding to "edgetic" alleles affecting only a subset of interactions while leaving most other interactions unperturbed. With transcription factors, many alleles that leave protein-protein interactions intact affect DNA binding. Different mutations in the same gene leading to different interaction profiles often result in distinct disease phenotypes. Thus disease-associated alleles that perturb distinct protein activities rather than grossly affecting folding and stability are relatively widespread.


Subjects
Disease/genetics , Missense Mutation , Protein Interaction Maps , Proteins/genetics , Proteins/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Genome-Wide Association Study , Humans , Open Reading Frames , Protein Folding , Protein Stability
3.
Am J Hum Genet ; 110(10): 1769-1786, 2023 10 05.
Article in English | MEDLINE | ID: mdl-37729906

ABSTRACT

Defects in hydroxymethylbilane synthase (HMBS) can cause acute intermittent porphyria (AIP), an acute neurological disease. Although sequencing-based diagnosis can be definitive, ∼⅓ of clinical HMBS variants are missense variants, and most clinically reported HMBS missense variants are designated as "variants of uncertain significance" (VUSs). Using saturation mutagenesis, en masse selection, and sequencing, we applied a multiplexed validated assay to both the erythroid-specific and ubiquitous isoforms of HMBS, obtaining confident functional impact scores for >84% of all possible amino acid substitutions. The resulting variant effect maps generally agreed with biochemical expectations and provide further evidence that HMBS can function as a monomer. Additionally, the maps implicated specific residues as having roles in active site dynamics, which was further supported by molecular dynamics simulations. Most importantly, these maps can help discriminate pathogenic from benign HMBS variants, proactively providing evidence even for yet-to-be-observed clinical missense variants.


Subjects
Hydroxymethylbilane Synthase , Acute Intermittent Porphyria , Humans , Hydroxymethylbilane Synthase/chemistry , Hydroxymethylbilane Synthase/genetics , Hydroxymethylbilane Synthase/metabolism , Missense Mutation/genetics , Acute Intermittent Porphyria/diagnosis , Acute Intermittent Porphyria/genetics , Amino Acid Substitution , Molecular Dynamics Simulation
4.
Circulation ; 2024 Sep 24.
Article in English | MEDLINE | ID: mdl-39315434

ABSTRACT

BACKGROUND: Long QT syndrome is a lethal arrhythmia syndrome, frequently caused by rare loss-of-function variants in the potassium channel encoded by KCNH2. Variant classification is difficult, often because of lack of functional data. Moreover, variant-based risk stratification is also complicated by heterogenous clinical data and incomplete penetrance. Here we sought to test whether variant-specific information, primarily from high-throughput functional assays, could improve both classification and cardiac event risk stratification in a large, harmonized cohort of KCNH2 missense variant heterozygotes. METHODS: We quantified cell-surface trafficking of 18 796 variants in KCNH2 using a multiplexed assay of variant effect (MAVE). We recorded KCNH2 current density for 533 variants by automated patch clamping. We calibrated the strength of evidence of MAVE data according to ClinGen guidelines. We deeply phenotyped 1458 patients with KCNH2 missense variants, including QTc, cardiac event history, and mortality. We correlated variant functional data and Bayesian long QT syndrome penetrance estimates with cohort phenotypes and assessed hazard ratios for cardiac events. RESULTS: Variant MAVE trafficking scores and automated patch clamping peak tail currents were highly correlated (Spearman rank-order ρ=0.69; n=433). The MAVE data were found to provide up to pathogenic very strong evidence for severe loss-of-function variants. In the cohort, both functional assays and Bayesian long QT syndrome penetrance estimates were significantly predictive of cardiac events when independently modeled with patient sex and adjusted QT interval (QTc); however, MAVE data became nonsignificant when peak tail current and penetrance estimates were also available. 
The area under the receiver operator characteristic curve for 20-year event outcomes based on patient-specific sex and QTc (area under the curve, 0.80 [0.76-0.83]) was improved with prospectively available penetrance scores conditioned on MAVE (area under the curve, 0.86 [0.83-0.89]) or attainable automated patch clamping peak tail current data (area under the curve, 0.84 [0.81-0.88]). CONCLUSIONS: High-throughput KCNH2 variant MAVE data meaningfully contribute to variant classification at scale, whereas long QT syndrome penetrance estimates and automated patch clamping peak tail current measurements meaningfully contribute to risk stratification of cardiac events in patients with heterozygous KCNH2 missense variants.

5.
Bioinformatics ; 40(4)2024 03 29.
Article in English | MEDLINE | ID: mdl-38569896

ABSTRACT

MOTIVATION: Long-read sequencing technologies, an attractive solution for many applications, often suffer from higher error rates. Alignment of multiple reads can improve base-calling accuracy, but some applications, e.g. sequencing mutagenized libraries where multiple distinct clones differ by one or few variants, require the use of barcodes or unique molecular identifiers. Unfortunately, sequencing errors can interfere with correct barcode identification, and a given barcode sequence may be linked to multiple independent clones within a given library. RESULTS: Here we focus on the target application of sequencing mutagenized libraries in the context of multiplexed assays of variant effects (MAVEs). MAVEs are increasingly used to create comprehensive genotype-phenotype maps that can aid clinical variant interpretation. Many MAVE methods use long-read sequencing of barcoded mutant libraries for accurate association of barcode with genotype. Existing long-read sequencing pipelines do not account for inaccurate sequencing or nonunique barcodes. Here, we describe Pacybara, which handles these issues by clustering long reads based on the similarities of (error-prone) barcodes while also detecting barcodes that have been associated with multiple genotypes. Pacybara also detects recombinant (chimeric) clones and reduces false positive indel calls. In three example applications, we show that Pacybara identifies and correctly resolves these issues. AVAILABILITY AND IMPLEMENTATION: Pacybara, freely available at https://github.com/rothlab/pacybara, is implemented using R, Python, and bash for Linux. It runs on GNU/Linux HPC clusters via Slurm, PBS, or GridEngine schedulers. A single-machine simplex version is also available.


Subjects
High-Throughput Nucleotide Sequencing , Software , DNA Sequence Analysis/methods , High-Throughput Nucleotide Sequencing/methods , Gene Library , Genotype , Cluster Analysis
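The clustering idea in the abstract above can be illustrated with a minimal sketch. This is not Pacybara's implementation: the function names, the greedy strategy, the use of Hamming distance, and the threshold of one mismatch are all assumptions chosen for brevity. The sketch groups error-prone barcodes that lie within a small distance of a cluster seed, then flags clusters whose reads support more than one genotype.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length barcodes."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def cluster_barcodes(reads, max_dist=1):
    """Greedy clustering of (barcode, genotype) reads.

    Each read joins the first cluster whose seed barcode is within
    max_dist mismatches; otherwise it starts a new cluster.
    max_dist=1 is a hypothetical threshold, not a recommended value.
    """
    clusters = []
    for barcode, genotype in reads:
        for cluster in clusters:
            if hamming(cluster["seed"], barcode) <= max_dist:
                cluster["genotypes"][genotype] += 1
                break
        else:
            # No existing seed is close enough: open a new cluster.
            cluster = {"seed": barcode, "genotypes": defaultdict(int)}
            cluster["genotypes"][genotype] += 1
            clusters.append(cluster)
    return clusters


def ambiguous_clusters(clusters):
    """Return seeds of clusters linked to more than one genotype."""
    return [c["seed"] for c in clusters if len(c["genotypes"]) > 1]
```

A production pipeline would also need to handle indels in barcodes and distinguish genuinely non-unique barcodes from sequencing errors in the genotype call, which this sketch does not attempt.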
6.
Am J Hum Genet ; 108(10): 1891-1906, 2021 10 07.
Article in English | MEDLINE | ID: mdl-34551312

ABSTRACT

The success of personalized genomic medicine depends on our ability to assess the pathogenicity of rare human variants, including the important class of missense variation. There are many challenges in training accurate computational systems, e.g., in finding the balance between quantity, quality, and bias in the variant sets used as training examples and avoiding predictive features that can accentuate the effects of bias. Here, we describe VARITY, which judiciously exploits a larger reservoir of training examples with uncertain accuracy and representativity. To limit circularity and bias, VARITY excludes features informed by variant annotation and protein identity. To provide a rationale for each prediction, we quantified the contribution of features and feature combinations to the pathogenicity inference of each variant. VARITY outperformed all previous computational methods evaluated, identifying at least 10% more pathogenic variants at thresholds achieving high (90% precision) stringency.


Subjects
Algorithms , Computational Biology/standards , Disease/etiology , Missense Mutation , Genetic Predisposition to Disease , Humans , Phenotype , Precision Medicine , Software
7.
Am J Hum Genet ; 108(7): 1283-1300, 2021 07 01.
Article in English | MEDLINE | ID: mdl-34214447

ABSTRACT

Most rare clinical missense variants cannot currently be classified as pathogenic or benign. Deficiency in human 5,10-methylenetetrahydrofolate reductase (MTHFR), the most common inherited disorder of folate metabolism, is caused primarily by rare missense variants. Further complicating variant interpretation, variant impacts often depend on environment. An important example of this phenomenon is the MTHFR variant p.Ala222Val (c.665C>T), which is carried by half of all humans and has a phenotypic impact that depends on dietary folate. Here we describe the results of 98,336 variant functional-impact assays, covering nearly all possible MTHFR amino acid substitutions in four folinate environments, each in the presence and absence of p.Ala222Val. The resulting atlas of MTHFR variant effects reveals many complex dependencies on both folinate and p.Ala222Val. MTHFR atlas scores can distinguish pathogenic from benign variants and, among individuals with severe MTHFR deficiency, correlate with age of disease onset. Providing a powerful tool for understanding structure-function relationships, the atlas suggests a role for a disordered loop in retaining cofactor at the active site and identifies variants that enable escape of inhibition by S-adenosylmethionine. Thus, a model based on eight MTHFR variant effect maps illustrates how shifting landscapes of environment- and genetic-background-dependent missense variation can inform our clinical, structural, and functional understanding of MTHFR deficiency.


Subjects
Methylenetetrahydrofolate Reductase (NADPH2)/genetics , Missense Mutation , Amino Acid Substitution , DNA Mutational Analysis , Diploidy , Gene Library , Genotype , Humans , Methylenetetrahydrofolate Reductase (NADPH2)/deficiency , Methylenetetrahydrofolate Reductase (NADPH2)/physiology , Saccharomyces cerevisiae/genetics
8.
Bioinformatics ; 37(19): 3382-3383, 2021 Oct 11.
Article in English | MEDLINE | ID: mdl-33774657

ABSTRACT

SUMMARY: Multiplexed assays of variant effect (MAVEs) are capable of experimentally testing all possible single nucleotide or amino acid variants in selected genomic regions, generating 'variant effect maps', which provide biochemical insight and functional evidence to enable more rapid and accurate clinical interpretation of human variation. Because the international community applying MAVE approaches is growing rapidly, we developed the online MaveRegistry platform to catalyze collaboration, reduce redundant efforts, allow stakeholders to nominate targets and enable tracking and sharing of progress on ongoing MAVE projects. AVAILABILITY AND IMPLEMENTATION: MaveRegistry service: https://registry.varianteffect.org. MaveRegistry source code: https://github.com/kvnkuang/maveregistry-front-end.

9.
Bioinformatics ; 36(22-23): 5448-5455, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33300982

ABSTRACT

MOTIVATION: When rare missense variants are clinically interpreted as to their pathogenicity, most are classified as variants of uncertain significance (VUS). Although functional assays can provide strong evidence for variant classification, such results are generally unavailable. Multiplexed assays of variant effect can generate experimental 'variant effect maps' that score nearly all possible missense variants in selected protein targets for their impact on protein function. However, these efforts have not always prioritized proteins for which variant effect maps would have the greatest impact on clinical variant interpretation. RESULTS: Here, we mined databases of clinically interpreted variants and applied three strategies, each building on the previous, to prioritize genes for systematic functional testing of missense variation. The strategies ranked genes (i) by the number of unique missense VUS that had been reported to ClinVar; (ii) by movability- and reappearance-weighted impact scores, to give extra weight to reappearing, movable VUS and (iii) by difficulty-adjusted impact scores, to account for the more resource-intensive nature of generating variant effect maps for longer genes. Our results could be used to guide systematic functional testing of missense variation toward greater impact on clinical variant interpretation. AVAILABILITY AND IMPLEMENTATION: Source code available at: https://github.com/rothlab/mave-gene-prioritization. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Missense Mutation , Proteins
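Two of the ranking strategies named in the abstract above, ranking by VUS count and by a difficulty-adjusted score, can be sketched as follows. This is an illustration only: the abstract does not give the scoring formulas, so dividing the VUS count by protein length is an assumed stand-in for "difficulty adjustment", and the field names `vus` and `length_aa` are hypothetical.

```python
def rank_by_vus_count(genes):
    """Rank genes by number of unique missense VUS, highest first."""
    return sorted(genes, key=lambda g: g["vus"], reverse=True)


def rank_difficulty_adjusted(genes):
    """Rank genes by VUS per amino acid, highest first.

    Assumption: mapping effort grows with protein length, so the
    impact score (here simply the VUS count) is divided by length.
    """
    return sorted(genes, key=lambda g: g["vus"] / g["length_aa"], reverse=True)
```

With made-up inputs, a long gene carrying many VUS can top the first ranking yet fall behind a short gene under the adjusted ranking, which is the trade-off the third strategy is meant to capture.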
10.
Bioinformatics ; 36(12): 3938-3940, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32251504

ABSTRACT

SUMMARY: Fully realizing the promise of personalized medicine will require rapid and accurate classification of pathogenic human variation. Multiplexed assays of variant effect (MAVEs) can experimentally test nearly all possible variants in selected gene targets. Planning a MAVE study involves identifying target genes with clinical impact, and identifying scalable functional assays for that target. Here, we describe MaveQuest, a web-based resource enabling systematic variant effect mapping studies by identifying potential functional assays, disease phenotypes and clinical relevance for nearly all human protein-coding genes. AVAILABILITY AND IMPLEMENTATION: MaveQuest service: https://mavequest.varianteffect.org/. MaveQuest source code: https://github.com/kvnkuang/mavequest-front-end/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software , Humans , Phenotype
11.
Mol Syst Biol ; 16(9): e9828, 2020 09.
Article in English | MEDLINE | ID: mdl-32939983

ABSTRACT

Essential genes tend to be highly conserved across eukaryotes, but, in some cases, their critical roles can be bypassed through genetic rewiring. From a systematic analysis of 728 different essential yeast genes, we discovered that 124 (17%) were dispensable essential genes. Through whole-genome sequencing and detailed genetic analysis, we investigated the genetic interactions and genome alterations underlying bypass suppression. Dispensable essential genes often had paralogs, were enriched for genes encoding membrane-associated proteins, and were depleted for members of protein complexes. Functionally related genes frequently drove the bypass suppression interactions. These gene properties were predictive of essential gene dispensability and of specific suppressors among hundreds of genes on aneuploid chromosomes. Our findings identify yeast's core essential gene set and reveal that the properties of dispensable essential genes are conserved from yeast to human cells, correlating with human genes that display cell line-specific essentiality in the Cancer Dependency Map (DepMap) project.


Subjects
Essential Genes , Fungal Genes , Saccharomyces cerevisiae/genetics , Genetic Suppression , Aneuploidy , Molecular Evolution , Gene Deletion , Gene Duplication , Gene Regulatory Networks , Suppressor Genes , Multiprotein Complexes/metabolism
12.
Bioinformatics ; 35(17): 3191-3193, 2019 09 01.
Article in English | MEDLINE | ID: mdl-30649215

ABSTRACT

SUMMARY: The promise of personalized genomic medicine depends on our ability to assess the functional impact of rare sequence variation. Multiplexed assays can experimentally measure the functional impact of missense variants on a massive scale. However, even after such assays, many missense variants remain poorly measured. Here we describe a software pipeline and application to impute missing information in experimentally determined variant effect maps. AVAILABILITY AND IMPLEMENTATION: http://impute.varianteffect.org source code: https://github.com/joewuca/imputation. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software , Genome , Genomics , Missense Mutation
13.
Hum Mutat ; 40(9): 1463-1473, 2019 09.
Article in English | MEDLINE | ID: mdl-31283071

ABSTRACT

This paper reports the evaluation of predictions for the "CALM1" challenge in the fifth round of the Critical Assessment of Genome Interpretation held in 2018. In the challenge, the participants were asked to predict effects on yeast growth caused by missense variants of human calmodulin, a highly conserved protein in eukaryotic cells sensing calcium concentration. The performance of predictors implementing different algorithms and methods is similar. Most predictors are able to identify the deleterious or tolerated variants with modest accuracy, with a baseline predictor based purely on sequence conservation slightly outperforming the submitted predictions. Nevertheless, we think that the accuracy of predictions remains far from satisfactory, and the field awaits substantial improvements. The most poorly predicted variants in this round surround functional CALM1 sites that bind calcium or peptide, which suggests that better incorporation of structural analysis may help improve predictions.


Subjects
Calmodulin/chemistry , Calmodulin/genetics , Computational Biology/methods , Missense Mutation , Yeasts/growth & development , Algorithms , Binding Sites , Calcium/metabolism , Calmodulin/metabolism , Molecular Evolution , Fungal Proteins/chemistry , Fungal Proteins/genetics , Fungal Proteins/metabolism , Genetic Fitness , Humans , Genetic Models , Molecular Models , Protein Conformation , Protein Engineering , Yeasts/genetics
15.
Mol Syst Biol ; 14(5): e7985, 2018 05 28.
Article in English | MEDLINE | ID: mdl-29807908

ABSTRACT

Condition-dependent genetic interactions can reveal functional relationships between genes that are not evident under standard culture conditions. State-of-the-art yeast genetic interaction mapping, which relies on robotic manipulation of arrays of double-mutant strains, does not scale readily to multi-condition studies. Here, we describe barcode fusion genetics to map genetic interactions (BFG-GI), by which double-mutant strains generated via en masse "party" mating can also be monitored en masse for growth to detect genetic interactions. By using site-specific recombination to fuse two DNA barcodes, each representing a specific gene deletion, BFG-GI enables multiplexed quantitative tracking of double mutants via next-generation sequencing. We applied BFG-GI to a matrix of DNA repair genes under nine different conditions, including methyl methanesulfonate (MMS), 4-nitroquinoline 1-oxide (4NQO), bleomycin, zeocin, and three other DNA-damaging environments. BFG-GI recapitulated known genetic interactions and yielded new condition-dependent genetic interactions. We validated and further explored a subnetwork of condition-dependent genetic interactions involving MAG1, SLX4, and genes encoding the Shu complex, and inferred that loss of the Shu complex leads to an increase in the activation of the checkpoint protein kinase Rad53.


Subjects
Chromosome Mapping , Taxonomic DNA Barcoding , DNA Damage , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , DNA Repair , Genetic Epistasis , Gene Deletion , Genetic Loci , High-Throughput Nucleotide Sequencing , Methyl Methanesulfonate , Theoretical Models , Genetic Promoter Regions , Reproducibility of Results
16.
Hum Genet ; 137(9): 665-678, 2018 Sep.
Article in English | MEDLINE | ID: mdl-30073413

ABSTRACT

Given the constantly improving cost and speed of genome sequencing, it is reasonable to expect that personal genomes will soon be known for many millions of humans. This stands in stark contrast with our limited ability to interpret the sequence variants that we find. Although it is, perhaps, easiest to interpret variants in coding regions, the functional impact is unknown for the vast majority of missense variants. While many computational approaches can predict the impact of coding variants, they are given little weight in the current guidelines for interpreting clinical variants. Laboratory assays produce comparatively more trustworthy results, but until recently did not scale to the space of all possible mutations. The development of deep mutational scanning and other multiplexed assays of variant effect has now brought feasibility of this endeavour within view. Here, we review progress in this field over the last decade, break down the different approaches into their components, and compare methodological differences.


Subjects
Computational Biology/methods , Genetic Association Studies/methods , Genetic Variation , Human Genome , High-Throughput Nucleotide Sequencing/methods , Genotype , Humans , Phenotype
17.
Mol Syst Biol ; 13(12): 957, 2017 12 21.
Article in English | MEDLINE | ID: mdl-29269382

ABSTRACT

Although we now routinely sequence human genomes, we can confidently identify only a fraction of the sequence variants that have a functional impact. Here, we developed a deep mutational scanning framework that produces exhaustive maps for human missense variants by combining random codon mutagenesis and multiplexed functional variation assays with computational imputation and refinement. We applied this framework to four proteins corresponding to six human genes: UBE2I (encoding SUMO E2 conjugase), SUMO1 (small ubiquitin-like modifier), TPK1 (thiamin pyrophosphokinase), and CALM1/2/3 (three genes encoding the protein calmodulin). The resulting maps recapitulate known protein features and confidently identify pathogenic variation. Assays potentially amenable to deep mutational scanning are already available for 57% of human disease genes, suggesting that DMS could ultimately map functional variation for all human disease genes.


Subjects
DNA Mutational Analysis/methods , Missense Mutation/genetics , Calmodulin/genetics , Disease/genetics , Humans , Machine Learning , Phenotype , Phylogeny , Reproducibility of Results , SUMO-1 Protein/genetics , Ubiquitin-Conjugating Enzymes/genetics , Ubiquitin-Conjugating Enzymes/metabolism
18.
Hum Mutat ; 38(9): 1051-1063, 2017 09.
Article in English | MEDLINE | ID: mdl-28817247

ABSTRACT

The exponential growth of genomic variants uncovered by next-generation sequencing necessitates efficient and accurate computational analyses to predict their functional effects. A number of computational methods have been developed for the task, but few unbiased comparisons of their performance are available. To fill the gap, The Critical Assessment of Genome Interpretation (CAGI) comprehensively assesses phenotypic predictions on newly collected experimental datasets. Here, we present the results of the SUMO conjugase challenge where participants were predicting functional effects of missense mutations in human SUMO-conjugating enzyme UBE2I. The performance of the predictors is similar to each other and is far from perfection. Evolutionary information from sequence alignments dominates the success: deleterious mutations at conserved positions and benign mutations at variable positions are accurately predicted. Prediction accuracy of other mutations remains unsatisfactory, and this fast-growing field of research is yet to learn the use of spatial structure information to improve the predictions significantly.


Subjects
Computational Biology/methods , Missense Mutation , Ubiquitin-Conjugating Enzymes/genetics , Ubiquitin-Conjugating Enzymes/metabolism , Genetic Databases , Molecular Evolution , High-Throughput Nucleotide Sequencing , Humans , Molecular Models , Protein Binding , Genetic Selection , Sequence Alignment , Ubiquitin-Conjugating Enzymes/chemistry
19.
Mol Syst Biol ; 12(4): 863, 2016 Apr 22.
Article in English | MEDLINE | ID: mdl-27107012

ABSTRACT

High-throughput binary protein interaction mapping is continuing to extend our understanding of cellular function and disease mechanisms. However, we remain one or two orders of magnitude away from a complete interaction map for humans and other major model organisms. Completion will require screening at substantially larger scales with many complementary assays, requiring further efficiency gains in proteome-scale interaction mapping. Here, we report Barcode Fusion Genetics-Yeast Two-Hybrid (BFG-Y2H), by which a full matrix of protein pairs can be screened in a single multiplexed strain pool. BFG-Y2H uses Cre recombination to fuse DNA barcodes from distinct plasmids, generating chimeric protein-pair barcodes that can be quantified via next-generation sequencing. We applied BFG-Y2H to four different matrices ranging in scale from ~25 K to 2.5 M protein pairs. The results show that BFG-Y2H increases the efficiency of protein matrix screening, with quality that is on par with state-of-the-art Y2H methods.


Subjects
Centrosome/metabolism , Protein Interaction Mapping/methods , Proteome/metabolism , Saccharomyces cerevisiae/genetics , Human Chromosomes/metabolism , Gene Library , High-Throughput Nucleotide Sequencing , Humans , Protein Binding , Two-Hybrid System Techniques
20.
Genome Biol ; 25(1): 172, 2024 07 01.
Article in English | MEDLINE | ID: mdl-38951922

ABSTRACT

BACKGROUND: Computational variant effect predictors offer a scalable and increasingly reliable means of interpreting human genetic variation, but concerns of circularity and bias have limited previous methods for evaluating and comparing predictors. Population-level cohorts of genotyped and phenotyped participants that have not been used in predictor training can facilitate an unbiased benchmarking of available methods. Using a curated set of human gene-trait associations with a reported rare-variant burden association, we evaluate the correlations of 24 computational variant effect predictors with associated human traits in the UK Biobank and All of Us cohorts. RESULTS: AlphaMissense outperformed all other predictors in inferring human traits based on rare missense variants in UK Biobank and All of Us participants. The overall rankings of computational variant effect predictors in these two cohorts showed a significant positive correlation. CONCLUSION: We describe a method to assess computational variant effect predictors that sidesteps the limitations of previous evaluations. This approach is generalizable to future predictors and could continue to inform predictor choice for personal and clinical genetics.


Subjects
Benchmarking , Genetic Variation , Humans , Phenotype , Computational Biology/methods , Genotype