Calibration of variant effect predictors on genome-wide data masks heterogeneous performance across genes.

Tejura, Malvika; Fayer, Shawn; McEwen, Abbye E; Flynn, Jake; Starita, Lea M; Fowler, Douglas M

Affiliation

Tejura M; Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
Fayer S; Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
McEwen AE; Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA; Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, USA.
Flynn J; University of Washington Interdisciplinary Data Science Group, Seattle, WA 98195, USA.
Starita LM; Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA. Electronic address: lstarita@uw.edu.
Fowler DM; Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Department of Bioengineering, University of Washington, Seattle, WA 98195, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA. Electronic address: dfowler@uw.edu.

Am J Hum Genet; 2024 Aug 20.

Article in English | MEDLINE | ID: mdl-39173626

ABSTRACT

In silico variant effect predictions are available for nearly all missense variants but played a minimal role in clinical variant classification because they were deemed to provide only supporting evidence. Recently, the ClinGen Sequence Variant Interpretation (SVI) Working Group updated recommendations for variant effect prediction use. By analyzing control pathogenic and benign variants across all genes, they were able to compute evidence strength for predictor score intervals with some intervals generating moderate, strong, or even very strong evidence. However, this genome-wide approach could obscure heterogeneous predictor performance in different genes. We quantified the gene-by-gene performance of two top predictors, REVEL and BayesDel, by analyzing control variants in each predictor score interval in 3,668 disease-relevant genes. Approximately 10% of intervals had sufficient control variants for analysis, and ~70% of these intervals exceeded the maximum number of incorrect predictions implied by the SVI recommendations. These trending discordant intervals arose owing to the divergence of the gene-specific distribution of predictions from the genome-wide distribution, suggesting that gene-specific calibration is needed in many cases. Approximately 22% of ClinVar missense variants of uncertain significance in genes we analyzed (REVEL = 100,629, BayesDel = 71,928) had predictions in trending discordant intervals. Thus, genome-wide calibrations could result in many variants receiving inappropriate evidence strength. To facilitate a review of the SVI's calibrations, we developed a web application enabling visualization of gene-specific predictions and trending concordant and discordant intervals.
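
The per-gene check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the maximum number of incorrect predictions is derived from the SVI recommendations, nor how many control variants count as sufficient. Both are therefore stand-in parameters here (`max_error_fraction` and `min_controls`), and the names `IntervalCounts` and `classify_interval` are hypothetical.

```python
from dataclasses import dataclass
from math import floor


@dataclass
class IntervalCounts:
    """Control-variant counts for one gene in one predictor score interval."""
    gene: str
    direction: str      # what the interval predicts: "pathogenic" or "benign"
    n_pathogenic: int   # control pathogenic variants scored in the interval
    n_benign: int       # control benign variants scored in the interval


def classify_interval(counts, max_error_fraction, min_controls):
    """Label an interval as insufficient, trending concordant, or trending discordant.

    ASSUMPTION: the tolerated number of incorrect predictions is modelled as a
    fixed fraction of the controls in the interval. The abstract does not give
    the actual rule; this is a placeholder for illustration only.
    """
    if counts.direction not in ("pathogenic", "benign"):
        raise ValueError(f"unknown direction: {counts.direction!r}")

    total = counts.n_pathogenic + counts.n_benign
    if total < min_controls:
        return "insufficient"

    # A control variant is an incorrect prediction when its known class
    # contradicts the class the interval is supposed to support.
    if counts.direction == "pathogenic":
        incorrect = counts.n_benign
    else:
        incorrect = counts.n_pathogenic

    allowed = floor(max_error_fraction * total)
    if incorrect > allowed:
        return "trending discordant"
    return "trending concordant"


# Hypothetical counts, chosen only to exercise each branch.
discordant = IntervalCounts("GENE_A", "pathogenic", n_pathogenic=12, n_benign=8)
concordant = IntervalCounts("GENE_B", "pathogenic", n_pathogenic=18, n_benign=2)
sparse = IntervalCounts("GENE_C", "benign", n_pathogenic=1, n_benign=3)

for item in (discordant, concordant, sparse):
    label = classify_interval(item, max_error_fraction=0.25, min_controls=10)
    print(item.gene, label)
```

Under these made-up settings, a gene whose control variants are distributed differently from the genome-wide pool can exceed the allowed error count even when the same interval looks well calibrated in aggregate, which is the situation the abstract calls a trending discordant interval.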

Keywords

calibrations; variant effect predictors; variant interpretation

Full text: 1 Collections: 01-international Database: MEDLINE Language: English Publication year: 2024 Document type: Article
