Evaluating the relevance of sequence conservation in the prediction of pathogenic missense variants.
Capriotti, Emidio; Fariselli, Piero.
Affiliation
  • Capriotti E; BioFolD Unit, Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Via F. Selmi 3, 40126, Bologna, Italy. emidio.capriotti@unibo.it.
  • Fariselli P; Department of Medical Sciences, University of Torino, Via Santena 19, 10126, Turin, Italy. piero.fariselli@unito.it.
Hum Genet ; 141(10): 1649-1658, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35098354
ABSTRACT
Evolutionary information is the primary tool for detecting functional conservation in nucleic acids and proteins. This information has been extensively used to predict structure, interactions and functions in macromolecules. Pathogenicity prediction models rely on multiple sequence alignment information at different levels. However, the most accurate genome-wide variant deleteriousness ranking algorithms consider different features to assess the impact of variants. Here, we analyze three different ways of extracting evolutionary information from sequence alignments in the context of pathogenicity predictions at DNA and protein levels. We show that protein sequence-based information is slightly more informative in the annotation of ClinVar missense variants than that obtained at the DNA level. Furthermore, to achieve the performance of state-of-the-art methods, such as CADD and REVEL, the conservation of reference and variant, encoded as frequencies of reference/alternate alleles or wild-type/mutant residues, should be included. Our results on a large set of missense variants show that a basic method based on three input features derived from the protein sequence profile performs similarly to the CADD algorithm, which uses hundreds of genomic features. As expected, our method yields a ~3% lower area under the receiver operating characteristic curve (AUC) when compared with an ensemble-based algorithm (REVEL). Nevertheless, combining the predictions of multiple methods can help to identify more reliable predictions. These observations indicate that for missense variants, evolutionary information, when properly encoded, plays the primary role in ranking pathogenicity.
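To make the idea of profile-derived features concrete, the following is a minimal sketch of how three values could be read from a single column of a protein multiple sequence alignment. The abstract does not specify the exact encoding, so the choice of features here (wild-type frequency, mutant frequency, and an entropy-based conservation index), the function name `profile_features`, and the gap handling are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def profile_features(column, wild_type, mutant):
    """Return (wild-type frequency, mutant frequency, conservation index)
    for one alignment column given as a string of one-letter residues.

    Hypothetical encoding for illustration only; the paper's actual
    features may be defined differently.
    """
    # Keep only the 20 standard residues; gaps and unknown symbols are skipped.
    residues = [r for r in column.upper() if r in AMINO_ACIDS]
    if not residues:
        return 0.0, 0.0, 0.0

    counts = Counter(residues)
    total = len(residues)

    # Relative frequencies of the wild-type and mutant residues in the column.
    f_wt = counts[wild_type] / total
    f_mut = counts[mutant] / total

    # Shannon entropy of the column, rescaled so that 1.0 means fully
    # conserved and 0.0 means all 20 residues are equally frequent.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    conservation = 1.0 - entropy / math.log2(len(AMINO_ACIDS))

    return f_wt, f_mut, conservation
```

For a column such as `"AAAAAAAAGG"` with wild-type `A` and mutant `G`, the function returns a high wild-type frequency, a low mutant frequency, and a conservation index close to 1. In a sketch like this, those three numbers would then be passed to a classifier trained on labelled variants.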

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Nucleic Acids / Computational Biology Type of study: Prognostic_studies / Risk_factors_studies Limits: Humans Language: En Journal: Hum Genet Year: 2022 Document type: Article Affiliation country: Italy
