Proteins and domains vary in their tolerance of non-synonymous single nucleotide polymorphisms (nsSNPs).

Yates, Christopher M; Sternberg, Michael J E

Affiliation

Yates CM; Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, Sir Ernst Chain Building, South Kensington, London SW7 2AZ, UK. c.yates11@imperial.ac.uk

J Mol Biol; 425(8): 1274-86, 2013 Apr 26.

Article in English | MEDLINE | ID: mdl-23357174

ABSTRACT

The widespread application of whole-genome sequencing is identifying numerous non-synonymous single nucleotide polymorphisms (nsSNPs), many of which are associated with disease. We analyzed nsSNPs from Humsavar and the 1000 Genomes Project to investigate why some proteins and domains are more tolerant of mutations than others. We identified 311 proteins and 112 Pfam families, corresponding to 2910 domains, as disease-susceptible and 32 proteins and 67 Pfam families (10,783 domains) as disease-resistant, based on the relative numbers of disease-associated and neutral polymorphisms. Proteins whose numbers of disease and polymorphism nsSNPs do not differ significantly from the expected numbers are classified as "other". This classification takes into account the phenotypes of all known mutations in the protein or domain rather than simply classifying based on the presence or absence of disease nsSNPs. Of the two hypotheses suggested, our results support the model that disease-resistant domains and proteins are more able to tolerate mutations, rather than the model that they have more lethal mutations that are not observed. Disease-resistant proteins and domains show significantly higher mutation rates and lower sequence conservation than disease-susceptible proteins and domains. Disease-susceptible proteins are more likely to be encoded by essential genes, are more central in protein-protein interaction networks and are less likely to contain loss-of-function mutations in healthy individuals. We use this classification for nsSNP phenotype prediction, predicting nsSNPs in disease-susceptible domains to be disease and those in disease-resistant domains to be polymorphism. In this way, we achieve higher accuracy than SIFT, a state-of-the-art algorithm.
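
The classification and prediction steps described in the abstract can be illustrated with a minimal sketch. The abstract does not state which statistical test, significance threshold or multiple-testing correction was used, so the one-sided binomial tests, the `alpha` value and the `background_disease_fraction` parameter below are assumptions chosen for illustration only, and the function names are hypothetical.

```python
from math import comb


def binom_pmf(i, n, p):
    """Probability of exactly i successes in n Bernoulli(p) trials."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)


def classify(n_disease, n_neutral, background_disease_fraction, alpha=0.05):
    """Label a protein or domain from its disease and neutral nsSNP counts.

    Assumption: counts are compared with a background disease fraction
    using one-sided binomial tails. The paper's actual test may differ.
    """
    n = n_disease + n_neutral
    if n == 0:
        return "other"  # no observed nsSNPs, nothing to test
    p = background_disease_fraction
    # P(X >= n_disease): an excess of disease nsSNPs over expectation
    upper = sum(binom_pmf(i, n, p) for i in range(n_disease, n + 1))
    # P(X <= n_disease): a deficit of disease nsSNPs relative to expectation
    lower = sum(binom_pmf(i, n, p) for i in range(0, n_disease + 1))
    if upper < alpha:
        return "disease-susceptible"
    if lower < alpha:
        return "disease-resistant"
    return "other"


def predict(domain_class):
    """Predict an nsSNP phenotype from the class of its domain.

    Returns None for "other", where this rule makes no prediction.
    """
    return {
        "disease-susceptible": "disease",
        "disease-resistant": "polymorphism",
    }.get(domain_class)
```

For example, under these assumptions `predict(classify(20, 0, 0.3))` labels a new nsSNP in a domain with 20 disease and no neutral variants as "disease", while a domain with 3 disease and 7 neutral variants falls into "other" and receives no prediction.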

Subjects

Amino Acid Substitution; Polymorphism, Single Nucleotide; Proteins/genetics; Animals; Computational Biology; Disease Resistance; Genetic Predisposition to Disease; Genomics; Humans; Mice

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Proteins / Amino Acid Substitution / Polymorphism, Single Nucleotide | Type of study: Prognostic_studies | Limits: Animals / Humans | Language: En | Journal: J Mol Biol | Publication year: 2013 | Document type: Article | Affiliation country: United Kingdom
