ABSTRACT
In multicellular organisms, duplicated genes can diverge through tissue-specific gene expression patterns, as exemplified by highly regulated expression of RUNX transcription factor paralogs with apparent functional redundancy. Here we asked what cell-type-specific biologies might be supported by the selective expression of RUNX paralogs during Langerhans cell and inducible regulatory T cell differentiation. We uncovered functional nonequivalence between RUNX paralogs. Selective expression of native paralogs allowed integration of transcription factor activity with extrinsic signals, while non-native paralogs enforced differentiation even in the absence of exogenous inducers. DNA binding affinity was controlled by divergent amino acids within the otherwise highly conserved RUNT domain and evolutionary reconstruction suggested convergence of RUNT domain residues toward submaximal strength. Hence, the selective expression of gene duplicates in specialized cell types can synergize with the acquisition of functional differences to enable appropriate gene expression, lineage choice and differentiation in the mammalian immune system.
Subject(s)
Core Binding Factor alpha Subunits/genetics; Immune System/physiology; Langerhans Cells/physiology; Organ Specificity/genetics; T-Lymphocytes, Regulatory/physiology; Animals; Cell Differentiation; Cell Lineage; Conserved Sequence; Evolution, Molecular; Gene Duplication; Humans; Mammals; Signal Transduction; Transcriptome
ABSTRACT
Transcription and translation require a high concentration of potassium across the entire tree of life. The conservation of a high intracellular potassium concentration was an absolute requirement for the evolution of life on Earth. This was achieved by the interplay of P- and V-ATPases that can set up electrochemical gradients across the cell membrane, an energetically costly process requiring the synthesis of ATP by F-ATPases. In animals, the control of an extracellular compartment was achieved by the emergence of multicellular organisms able to produce tight epithelial barriers creating a stable extracellular milieu. Finally, the adaptation to a terrestrial environment was achieved by the evolution of distinct regulatory pathways allowing salt and water conservation. In this review we emphasize the critical and dual role of Na(+)-K(+)-ATPase in the control of the ionic composition of the extracellular fluid and the renin-angiotensin-aldosterone system (RAAS) in salt and water conservation in vertebrates. The action of aldosterone on transepithelial sodium transport by activation of the epithelial sodium channel (ENaC) at the apical membrane and that of Na(+)-K(+)-ATPase at the basolateral membrane may have evolved in lungfish before the emergence of tetrapods. Finally, we discuss the implication of RAAS in the origin of the present pandemic of hypertension and its associated cardiovascular diseases.
Subject(s)
Aldosterone/metabolism; Biological Evolution; Epithelial Sodium Channels/metabolism; Sodium-Potassium-Exchanging ATPase/metabolism; Sodium/metabolism; Animals; Epithelial Sodium Channels/chemistry; Epithelial Sodium Channels/genetics; Genome, Human; Humans; Nephrons/physiology; Signal Transduction/physiology; Sodium-Potassium-Exchanging ATPase/chemistry; Sodium-Potassium-Exchanging ATPase/genetics
ABSTRACT
The latest version of the CATH-Gene3D protein structure classification database (4.0, http://www.cathdb.info) provides annotations for over 235,000 protein domain structures and includes 25 million domain predictions. This article provides an update on the major developments in the 2 years since the last publication in this journal including: significant improvements to the predictive power of our functional families (FunFams); the release of our 'current' putative domain assignments (CATH-B); a new, strictly non-redundant data set of CATH domains suitable for homology benchmarking experiments (CATH-40) and a number of improvements to the web pages.
Subject(s)
Databases, Protein; Molecular Sequence Annotation; Protein Structure, Tertiary; Genomics; Internet; Protein Structure, Tertiary/genetics; Proteins/classification
ABSTRACT
A well-known case of evolutionary adaptation is that of ribulose-1,5-bisphosphate carboxylase (RubisCO), the enzyme responsible for fixation of CO2 during photosynthesis. Although the majority of plants use the ancestral C3 photosynthetic pathway, many flowering plants have evolved a derived pathway named C4 photosynthesis. The latter concentrates CO2, and C4 RubisCOs consequently have lower specificity for, and faster turnover of, CO2. The C4 forms result from convergent evolution in multiple clades, with substitutions at a small number of sites under positive selection. To understand the physical constraints on these evolutionary changes, we reconstructed in silico ancestral sequences and 3D structures of RubisCO from a large group of related C3 and C4 species. We were able to precisely track their past evolutionary trajectories, identify mutations on each branch of the phylogeny, and evaluate their stability effect. We show that RubisCO evolution has been constrained by stability-activity tradeoffs similar in character to those previously identified in laboratory-based experiments. The C4 properties require a subset of several ancestral destabilizing mutations, which from their location in the structure are inferred to mainly be involved in enhancing conformational flexibility of the open-closed transition in the catalytic cycle. These mutations are near, but not in, the active site or at intersubunit interfaces. The C3 to C4 transition is preceded by a sustained period in which stability of the enzyme is increased, creating the capacity to accept the functionally necessary destabilizing mutations, and is immediately followed by compensatory mutations that restore global stability.
Subject(s)
Biological Evolution; Ribulose-Bisphosphate Carboxylase/physiology; Adaptation, Physiological; Carbon Dioxide/metabolism; Enzyme Stability; Models, Molecular; Mutation; Photosynthesis; Plant Physiological Phenomena; Ribulose-Bisphosphate Carboxylase/chemistry; Ribulose-Bisphosphate Carboxylase/genetics
ABSTRACT
Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of protein domain structure annotations for protein sequences. Domains are predicted using a library of profile HMMs from 2738 CATH superfamilies. Gene3D assigns domain annotations to Ensembl and UniProt sequence sets including >6000 cellular genomes and >20 million unique protein sequences. This represents an increase of 45% in the number of protein sequences since our last publication. Thanks to improvements in the underlying data and pipeline, we see large increases in the domain coverage of sequences. We have expanded this coverage by integrating Pfam and SUPERFAMILY domain annotations, and we now resolve domain overlaps to provide highly comprehensive composite multi-domain architectures. To make these data more accessible for comparative genome analyses, we have developed novel search algorithms for searching genomes to identify related multi-domain architectures. In addition to providing domain family annotations, we have now developed a pipeline for 3D homology modelling of domains in Gene3D. This has been applied to the human genome and will be rolled out to other major organisms over the next year.
Subject(s)
Databases, Protein; Molecular Sequence Annotation; Protein Structure, Tertiary; Genome; Genomics; Internet; Models, Molecular; Protein Structure, Tertiary/genetics; Sequence Analysis, Protein
ABSTRACT
Selectome (http://selectome.unil.ch/) is a database of positive selection, based on a branch-site likelihood test. This model estimates the number of nonsynonymous substitutions (dN) and synonymous substitutions (dS) to evaluate the variation in selective pressure (dN/dS ratio) over branches and over sites. Since the original release of Selectome, we have benchmarked and implemented a thorough quality control procedure on multiple sequence alignments, aiming to provide minimum false-positive results. We have also improved the computational efficiency of the branch-site test implementation, allowing larger data sets and more frequent updates. Release 6 of Selectome includes all gene trees from Ensembl for Primates and Glires, as well as a large set of vertebrate gene trees. A total of 6810 gene trees have some evidence of positive selection. Finally, the web interface has been improved to be more responsive and to facilitate searches and browsing.
Subject(s)
Databases, Nucleic Acid; Selection, Genetic; Genetic Variation; Genomics/standards; Humans; Internet; Quality Control; Sequence Alignment
ABSTRACT
The reproductive ground plan hypothesis (RGPH) proposes that the physiological pathways regulating reproduction were co-opted to regulate worker division of labor. Support for this hypothesis in honeybees is provided by studies demonstrating that the reproductive potential of workers, assessed by the levels of vitellogenin (Vg), is linked to task performance. Interestingly, contrary to honeybees that have a single Vg ortholog and potentially fertile nurses, the genome of the harvester ant Pogonomyrmex barbatus harbors two Vg genes (Pb_Vg1 and Pb_Vg2) and nurses produce infertile trophic eggs. P. barbatus, thus, provides a unique model to investigate whether Vg duplication in ants was followed by subfunctionalization to acquire reproductive and non-reproductive functions and whether Vg reproductive function was co-opted to regulate behavior in sterile workers. To investigate these questions, we compared the expression patterns of P. barbatus Vg genes and analyzed the phylogenetic relationships and molecular evolution of Vg genes in ants. qRT-PCRs revealed that Pb_Vg1 is more highly expressed in queens compared to workers and in nurses compared to foragers. By contrast, the level of expression of Pb_Vg2 was higher in foragers than in nurses and queens. Phylogenetic analyses show that a first duplication of the ancestral Vg gene occurred after the divergence between the poneroid and formicoid clades and subsequent duplications occurred in the lineages leading to Solenopsis invicta, Linepithema humile and Acromyrmex echinatior. The initial duplication resulted in two Vg gene subfamilies preferentially expressed in queens and nurses (subfamily A) or in foraging workers (subfamily B). Finally, molecular evolution analyses show that the subfamily A experienced positive selection, while the subfamily B showed overall relaxation of purifying selection. Our results suggest that in P. barbatus the Vg gene underwent subfunctionalization after duplication to acquire caste- and behavior-specific expression associated with reproductive and non-reproductive functions, supporting the validity of the RGPH in ants.
Subject(s)
Ants/genetics; Bees/genetics; Reproduction/genetics; Vitellogenins/metabolism; Animals; Ants/physiology; Bees/physiology; Behavior, Animal; Evolution, Molecular; Genome, Insect; Phylogeny; Reproduction/physiology; Vitellogenins/genetics
ABSTRACT
Blood coagulation occurs through a cascade of enzymes and cofactors that produces a fibrin clot, while otherwise maintaining hemostasis. The 11 human coagulation factors (FG, FII-FXIII) have been identified across all vertebrates, suggesting that they emerged with the first vertebrates around 500 Ma. Human FVIII, FIX, and FXI are associated with thousands of disease-causing mutations. Here, we evaluated the strength of selective pressures on the 14 genes coding for the 11 factors during vertebrate evolution, and compared these with human mutations in FVIII, FIX, and FXI. Positive selection was identified for fibrinogen (FG), FIII, FVIII, FIX, and FX in the mammalian Primates and Laurasiatheria and the Sauropsida (reptiles and birds). This showed that the coagulation system in vertebrates was under strong selective pressures, perhaps to adapt against blood-invading pathogens. The comparison of these results with disease-causing mutations reported in FVIII, FIX, and FXI showed that the number of disease-causing mutations, and the probability of positive selection were inversely related to each other. It was concluded that when a site was under positive selection, it was less likely to be associated with disease-causing mutations. In contrast, sites under negative selection were more likely to be associated with disease-causing mutations and be destabilizing. A residue-by-residue comparison of the FVIII, FIX, and FXI sequence alignments confirmed this. This improved understanding of evolutionary changes in FVIII, FIX, and FXI provided greater insight into disease-causing mutations, and better assessments of the codon sites that may be mutated in applications of gene therapy.
Subject(s)
Blood Coagulation Disorders/genetics; Factor IX/genetics; Factor VIII/genetics; Factor XI/genetics; Fibrinogen/genetics; Vertebrates/genetics; Animals; Base Sequence; Evolution, Molecular; Humans; Molecular Sequence Data; Mutation; Selection, Genetic; Sequence Alignment
ABSTRACT
CATH version 3.5 (Class, Architecture, Topology, Homology, available at http://www.cathdb.info/) contains 173 536 domains, 2626 homologous superfamilies and 1313 fold groups. When focusing on structural genomics (SG) structures, we observe that the number of new folds for CATH v3.5 is slightly less than for previous releases, and this observation suggests that we may now know the majority of folds that are easily accessible to structure determination. We have improved the accuracy of our functional family (FunFams) sub-classification method and the CATH sequence domain search facility has been extended to provide FunFam annotations for each domain. The CATH website has been redesigned. We have improved the display of functional data and of conserved sequence features associated with FunFams within each CATH superfamily.
Subject(s)
Databases, Protein; Protein Structure, Tertiary; Genomics; Internet; Molecular Sequence Annotation; Protein Folding; Proteins/chemistry; Proteins/classification; Proteins/genetics; Sequence Alignment; Sequence Analysis, Protein; Structural Homology, Protein
ABSTRACT
The present review focuses on the evolution of proteins and the impact of amino acid mutations on function from a structural perspective. Proteins evolve under the law of natural selection and undergo alternating periods of conservative evolution and of relatively rapid change. The likelihood of mutations being fixed in the genome depends on various factors, such as the fitness of the phenotype or the position of the residues in the three-dimensional structure. For example, co-evolution of residues located close together in three-dimensional space can occur to preserve global stability. Whereas point mutations can fine-tune the protein function, residue insertions and deletions ('decorations' at the structural level) can sometimes modify functional sites and protein interactions more dramatically. We discuss recent developments and tools to identify such episodic mutations, and examine their applications in medical research. Such tools have been tested on simulated data and applied to real data such as viruses or animal sequences. Traditionally, there has been little if any cross-talk between the fields of protein biophysics, protein structure-function and molecular evolution. However, the last several years have seen some exciting developments in combining these approaches to obtain an in-depth understanding of how proteins evolve. For example, a better understanding of how structural constraints affect protein evolution will greatly help us to optimize our models of sequence evolution. The present review explores this new synthesis of perspectives.
Subject(s)
Evolution, Molecular; Mutation; Proteins/chemistry; Proteins/genetics; Adaptation, Physiological/genetics; Amino Acid Substitution; Animals; Binding Sites/genetics; Databases, Genetic; Female; Genetic Diseases, Inborn/genetics; Humans; INDEL Mutation; Infection/genetics; Male; Models, Genetic; Models, Molecular; Neoplasms/genetics; Protein Interaction Domains and Motifs/genetics; Protein Stability; Proteins/metabolism
ABSTRACT
The function of most proteins is not determined experimentally, but is extrapolated from homologs. According to the "ortholog conjecture", or standard model of phylogenomics, protein function changes rapidly after duplication, leading to paralogs with different functions, while orthologs retain the ancestral function. We report here that a comparison of experimentally supported functional annotations among homologs from 13 genomes mostly supports this model. We show that to analyze GO annotation effectively, several confounding factors need to be controlled: authorship bias, variation of GO term frequency among species, variation of background similarity among species pairs, and propagated annotation bias. After controlling for these biases, we observe that orthologs have generally more similar functional annotations than paralogs. This is especially strong for sub-cellular localization. We observe only a weak decrease in functional similarity with increasing sequence divergence. These findings hold over a large diversity of species; notably orthologs from model organisms such as E. coli, yeast or mouse have conserved function with human proteins.
Subject(s)
Evolution, Molecular; Models, Chemical; Models, Genetic; Proteins/chemistry; Proteins/genetics; Amino Acid Sequence; Computer Simulation; Molecular Sequence Data; Proteins/metabolism; Sequence Analysis, Protein; Sequence Homology, Amino Acid; Structure-Activity Relationship
ABSTRACT
Homologous genes are classified into orthologs and paralogs, depending on whether they arose by speciation or duplication. It is widely assumed that orthologs share similar functions, whereas paralogs are expected to diverge more from each other. But does this assumption hold up on further examination? We present evidence that orthologs and paralogs are not so different in either their evolutionary rates or their mechanisms of divergence. We emphasize the importance of appropriately designed studies to test models of gene evolution between orthologs and between paralogs. Thus, functional change between orthologs might be as common as between paralogs, and future studies should be designed to test the impact of duplication against this alternative model.
Subject(s)
Gene Duplication; Genetic Speciation; Genomics/methods; Models, Genetic; Animals; Evolution, Molecular; Genetic Variation; Genomics/standards; Humans; Phylogeny; Proteomics/methods
ABSTRACT
Despite large changes in salt intake, the mammalian kidney is able to maintain the extracellular sodium concentration and osmolarity within very narrow margins, thereby controlling blood volume and blood pressure. In the aldosterone-sensitive distal nephron (ASDN), aldosterone tightly controls the activities of epithelial sodium channel (ENaC) and Na,K-ATPase, the two limiting factors in establishing transepithelial sodium transport. It has been proposed that the ENaC/degenerin gene family is restricted to Metazoans, whereas the α- and β-subunits of Na,K-ATPase have homologous genes in prokaryotes. This raises the question of the emergence of osmolarity control. By exploring recent genomic data of diverse organisms, we found that: 1) ENaC/degenerin exists in all of the Metazoans screened, including nonbilaterians and, by extension, was already present in ancestors of Metazoa; 2) ENaC/degenerin is also present in Naegleria gruberi, a eukaryotic microbe, consistent with either a vertical inheritance from the last common ancestor of Eukaryotes or a lateral transfer between Naegleria and Metazoan ancestors; and 3) The Na,K-ATPase β-subunit is restricted to Holozoa, the taxon that includes animals and their closest single-cell relatives. Since the β-subunit of Na,K-ATPase plays a key role in targeting the α-subunit to the plasma membrane and has an additional function in the formation of cell junctions, we propose that the emergence of Na,K-ATPase, together with ENaC/degenerin, is linked to the development of multicellularity in the Metazoan kingdom. The establishment of multicellularity and the associated extracellular compartment ("internal milieu") precedes the emergence of other key elements of the aldosterone signaling pathway.
Subject(s)
Aldosterone/metabolism; Epithelial Sodium Channels/genetics; Evolution, Molecular; Sodium-Potassium-Exchanging ATPase/genetics; Sodium/metabolism; Acid Sensing Ion Channels; Animals; Degenerin Sodium Channels; Humans; Ion Transport/drug effects; Membrane Proteins/genetics; Nerve Tissue Proteins/genetics; Phosphoproteins/genetics; Phylogeny
ABSTRACT
Functional divergence between homologous proteins is expected to affect amino acid sequences in two main ways, which can be considered as proxies of biochemical divergence: a "covarion-like" pattern of correlated changes in evolutionary rates, and switches in conserved residues ("conserved but different"). Although these patterns have been used in case studies, a large-scale analysis is needed to estimate their frequency and distribution. We use a phylogenomic framework of animal genes to answer three questions: 1) What is the prevalence of such patterns? 2) Can we link such patterns at the amino acid level with selection inferred at the codon level? 3) Are patterns different between paralogs and orthologs? We find that covarion-like patterns are more frequently detected than "constant but different," but that only the latter are correlated with signal for positive selection. Finally, there is no obvious difference in patterns between orthologs and paralogs.
Subject(s)
Amino Acids/genetics; Evolution, Molecular; Models, Genetic; Sequence Homology, Amino Acid; Animals; Databases, Nucleic Acid; Gene Duplication/genetics; Humans; Selection, Genetic
ABSTRACT
Gene duplication and neofunctionalization are known to be important processes in the evolution of phenotypic complexity. They account for important evolutionary novelties that confer ecological adaptation, such as the major histocompatibility complex (MHC), a multigene family crucial to the vertebrate immune system. In birds, two MHC class II β (MHCIIβ) exon 3 lineages have been recently characterized, and two hypotheses for the evolutionary history of MHCIIβ lineages were proposed. These lineages could have arisen either by 1) an ancient duplication and subsequent divergence of one paralog or by 2) recent parallel duplications followed by functional convergence. Here, we compiled a data set consisting of 63 MHCIIβ exon 3 sequences from six avian orders to distinguish between these hypotheses and to understand the role of selection in the divergent evolution of the two avian MHCIIβ lineages. Based on phylogenetic reconstructions and simulations, we show that a unique duplication event preceding the major avian radiations gave rise to two ancestral MHCIIβ lineages that were each likely lost once later during avian evolution. Maximum likelihood estimation shows that following the ancestral duplication, positive selection drove a radical shift from basic to acidic amino acid composition of a protein domain facing the α-chain in the MHCII αβ-heterodimer. Structural analyses of the MHCII αβ-heterodimer highlight that three of these residues are potentially involved in direct interactions with the α-chain, suggesting that the shift following duplication may have been accompanied by coevolution of the interacting α- and β-chains. These results provide new insights into the long-term evolutionary relationships among avian MHC genes and open interesting perspectives for comparative and population genomic studies of avian MHC evolution.
Subject(s)
Adaptation, Biological/genetics; Birds/genetics; Evolution, Molecular; Genes, Duplicate/genetics; Genes, MHC Class II/genetics; Phylogeny; Selection, Genetic; Amino Acid Sequence; Animals; Bayes Theorem; Computational Biology; Computer Simulation; Exons/genetics; Likelihood Functions; Models, Genetic; Sequence Alignment
ABSTRACT
Genome-wide scans have shown that positive selection is relatively frequent at the molecular level. It is of special interest to identify which protein sites and which phylogenetic branches are affected. We present Selectome, a database which provides the results of a rigorous branch-site-specific likelihood test for positive selection. The Web interface presents test results mapped both onto phylogenetic trees and onto protein alignments. It allows rapid access to results by keyword, gene name, or taxonomy-based queries. Selectome is freely available at http://bioinfo.unil.ch/selectome/.
Subject(s)
Databases, Genetic; Genes; Proteins/genetics; Selection, Genetic; Phylogeny; Proteins/classification; Sequence Homology, Amino Acid; User-Computer Interface
ABSTRACT
Understanding how the variability of protein structure arises during evolution and leads to new structure-function relationships ultimately promoting evolutionary novelties is a major goal of molecular evolution and is critical for interpreting genome sequences. We addressed this issue using the ecdysone receptor (ECR), a major developmental factor that controls development and reproduction of arthropods. The functional ECR is a heterodimer of two nuclear receptors: ECR, which binds ecdysteroids, and its obligatory partner ultraspiracle (USP), which is orthologous to the retinoid X receptor of vertebrates. Both genes underwent a dramatic increase of evolutionary rate in Mecopterida, the major insect terminal group containing Diptera and Lepidoptera. We therefore questioned the implication of this event in terms of coevolution of their dimerization interface. A structural comparison revealed a 30% larger ligand-binding domain (LBD) heterodimerization surface in the lepidopteran Heliothis when compared with basal insects, associated with a symmetrization of the interface, which is exceptional for nuclear receptors. Reconstruction of ancestral sequences and homology modeling of the ancestral Mecopterida ECR-USP reveal that this enlarged dimerization surface is a synapomorphy for Mecopterida. Furthermore, we show that the residues implicated in the new dimerization surface underwent specific evolutionary constraints in Mecopterida indicative of their new and conserved role in the dimerization interface. Most of all, the novel surface originates from a 15° torsion of a subdomain of USP LBD toward its partner ECR, which is a long-range consequence of the peculiar position of a Mecopterida-specific insertion in loop L1-3, located outside of the interaction surface, in a less crucial domain of the partner protein. These results indicate that the coevolution between ECR and USP occurred through a novel mechanism of intramolecular epistasis that will undoubtedly be generalized for other molecules because it uses flexibility of a less-constrained region of a protein to modify the structure of another, critical part of the molecule.
Subject(s)
Insect Proteins/genetics; Receptors, Steroid/genetics; Animals; Insect Proteins/chemistry; Insect Proteins/metabolism; Insects; Models, Molecular; Protein Binding; Protein Multimerization; Receptors, Steroid/chemistry; Receptors, Steroid/metabolism; Structural Homology, Protein
ABSTRACT
The evolution of protein function appears to involve alternating periods of conservative evolution and of relatively rapid change. Evidence for such episodic evolution, consistent with some theoretical expectations, comes from the application of increasingly sophisticated models of evolution to large sequence datasets. We present here some of the recent methods to detect functional shifts, using amino acid or codon models. Both provide evidence for punctual shifts in patterns of amino acid conservation, including the fixation of key changes by positive selection. Although a link to gene duplication, a presumed source of functional changes, has been difficult to establish, this episodic model appears to apply to a wide variety of proteins and organisms.
Subject(s)
Evolution, Molecular; Models, Genetic; Proteins/genetics; Amino Acid Sequence; Selection, Genetic
ABSTRACT
Post-translational modification (PTM) serves as a regulatory mechanism for protein function, influencing protein stability, interactions, activity and localization, and is critical in many signaling pathways. The best characterized PTM is phosphorylation, whereby a phosphate is added to an acceptor residue, most commonly serine, threonine and tyrosine in metazoans. As proteins are often phosphorylated at multiple sites, identifying those sites that are important for function is a challenging problem. Considering that any given phosphorylation site might be non-functional, prioritizing evolutionarily conserved phosphosites provides a general strategy to identify the putative functional sites. To facilitate the identification of conserved phosphosites, we generated a large-scale phosphoproteomics dataset from Drosophila embryos collected from six closely-related species. We built iProteinDB (https://www.flyrnai.org/tools/iproteindb/), a resource integrating these data with other high-throughput PTM datasets, including vertebrates, and manually curated information for Drosophila. At iProteinDB, scientists can view the PTM landscape for any Drosophila protein and identify predicted functional phosphosites based on a comparative analysis of data from closely-related Drosophila species. Further, iProteinDB enables comparison of PTM data from Drosophila to that of orthologous proteins from other model organisms, including human, mouse, rat, Xenopus tropicalis, Danio rerio, and Caenorhabditis elegans.
Subject(s)
Databases, Protein; Drosophila Proteins/genetics; Drosophila/genetics; Protein Processing, Post-Translational/genetics; Animals; Humans; Phosphorylation; Proteomics; Signal Transduction
ABSTRACT
Protein phosphorylation is the best characterized post-translational modification that regulates almost all cellular processes through diverse mechanisms such as changing protein conformations, interactions, and localization. While the inventory for phosphorylation sites across different species has rapidly expanded, their functional role remains poorly investigated. Here, we combine 537,321 phosphosites from 40 eukaryotic species to identify highly conserved phosphorylation hotspot regions within domain families. Mapping these regions onto structural data reveals that they are often found at interfaces, near catalytic residues and tend to harbor functionally important phosphosites. Notably, functional studies of a phospho-deficient mutant in the C-terminal hotspot region within the ribosomal S11 domain in the yeast ribosomal protein uS11 show impaired growth and defective cytoplasmic 20S pre-rRNA processing at 16 °C and 20 °C. Altogether, our study identifies phosphorylation hotspots for 162 protein domains suggestive of an ancient role for the control of diverse eukaryotic domain families.