ABSTRACT
MOTIVATION: Computational studies of molecular evolution are usually performed on a multiple alignment of homologous sequences, in which sequences descending from a common ancestor are aligned so that equivalent residues occupy the same position. Residue frequency patterns in a full alignment, or in a subset of its sequences, can be highly useful for suggesting positions under selection. Most methods for mapping co-evolving or specificity-determining sites focus on positions; however, they do not consider residues that are specificity determinants in one subclass but variable in others. In addition, many methods are impractical for very large alignments, such as those obtained from Pfam, or require a priori information about the subclasses to be analyzed. RESULTS: In this paper we apply complex network theory, widely used to analyze co-affiliation systems in social and ecological contexts, to map groups of functionally related residues. This methodology was initially evaluated in simulated environments and then applied to four different protein family datasets, in which several specificity determinant sets and functional motifs were successfully detected. AVAILABILITY AND IMPLEMENTATION: The algorithms and datasets used in the development of this project are available at http://www.biocomp.icb.ufmg.br/biocomp/software-and-databases/networkstats/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Algorithms, Software, Computational Biology, Proteins, Sequence Alignment
ABSTRACT
Trypsin-like serine proteases are a group of homologous enzymes that play multiple roles in both vertebrate and invertebrate organisms. Key properties of these enzymes include their activation from an inactive zymogen form to the active form by cleavage of residues at the N-terminus, the presence of a conserved catalytic triad of residues, and the existence of different patterns of substrate selectivity for residue cleavage among the members of this protein family. In this article, we apply the computational method of decomposition of residue coevolution networks to find sets of residues related to some of these key properties, especially zymogen activation. Positive selection detection, normal mode analysis, and the calculation of thermal couplings between residues of the bovine trypsinogen and bovine trypsin structures yielded further information for understanding the zymogen activation process and highlighted the importance of some residues of the coevolved sets during these transitions.
Subject(s)
Molecular Evolution, Serine Endopeptidases/chemistry, Serine Endopeptidases/metabolism, Animals, Cattle, Enzyme Activation, Humans, Molecular Models, Protein Conformation, Sequence Alignment, Temperature
ABSTRACT
Flaviviruses are responsible for serious diseases such as dengue, yellow fever, and Zika fever. Their genomes encode a polyprotein which, after cleavage, yields three structural and seven non-structural proteins. Homologous proteins can be studied by conservation and coevolution analysis of multiple sequence alignments, which usually reports positions that are strictly necessary for the structure and/or function of all members of a protein family, or that are involved in a specific subclass feature requiring the coevolution of residue sets. This study provides a complete conservation and coevolution analysis of all flavivirus non-structural proteins, with results mapped onto all available well-annotated sequences. A literature review of the residues found in the analysis enabled us to compile available information on their roles and distribution among different flaviviruses. We also provide the mapping of conserved and coevolved residues for all sequences currently in SwissProt as supplementary material, so that particularities of different viruses can be easily analyzed.
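As a rough illustration of what conservation and coevolution analysis of an alignment can involve, the following minimal sketch scores columns of a toy alignment. It is not the method used in the study above: the use of Shannon entropy as a conservation score and mutual information as a coevolution score is an assumption made here for illustration, and the function names and the four-sequence alignment are hypothetical.

```python
import math
from collections import Counter


def column(alignment, i):
    """Return the residues found at column i of every aligned sequence."""
    return [seq[i] for seq in alignment]


def shannon_entropy(symbols):
    """Shannon entropy in bits; 0 means the column is fully conserved."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(col_a, col_b):
    """Mutual information in bits between two columns.

    Computed as H(A) + H(B) - H(A, B); higher values mean the residues
    at the two positions vary together across the sequences.
    """
    joint = list(zip(col_a, col_b))
    return shannon_entropy(col_a) + shannon_entropy(col_b) - shannon_entropy(joint)


# Hypothetical toy alignment (illustration only, not real sequence data).
toy_alignment = [
    "GAKD",
    "GAKE",
    "GSRD",
    "GSRE",
]

conservation = [
    shannon_entropy(column(toy_alignment, i))
    for i in range(len(toy_alignment[0]))
]
mi_covarying = mutual_information(column(toy_alignment, 1), column(toy_alignment, 2))
mi_independent = mutual_information(column(toy_alignment, 1), column(toy_alignment, 3))
```

In this toy case column 0 is invariant, so its entropy is zero; columns 1 and 2 always change together and share one bit of mutual information, while columns 1 and 3 vary independently and share none. Real analyses need corrections for phylogenetic bias and small sample sizes, which this sketch omits.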