ABSTRACT
The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/.
Subjects
Protein Databases , Molecular Evolution , Protein Conformation , Protein Sequence Analysis
ABSTRACT
Negative autoregulation is universally found across organisms. In the bacterium Escherichia coli, transcription factors often repress their own expression to form a negative feedback network motif that enables robustness to changes in biochemical parameters. Here we present a simple phenomenological model of a negative feedback transcription factor repressing both itself and another target gene. The strength of the negative feedback is characterized by three parameters: the cooperativity in self-repression, the maximal expression rate of the transcription factor, and the apparent dissociation constant of the transcription factor binding to its own promoter. Analysis of the model shows that the target gene levels are robust to mutations in the transcription factor, and that the robustness improves as the degree of cooperativity in self-repression increases. The prediction is tested in the LexA transcriptional network of E. coli by altering cooperativity in self-repression and promoter strength. Indeed, we find robustness is correlated with the former. Considering the proposed importance of gene regulation in speciation, parameters governing a transcription factor's robustness to mutation may have significant influence on a cell or organism's capacity to evolve.
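The robustness argument in this abstract can be illustrated with a minimal sketch. The abstract names the three parameters but gives no equations, so the Hill-type repression form, the unit dilution rate, the function names, and all numerical values below are assumptions made for illustration, not the authors' model. A mutation in the transcription factor is represented, again as an assumption, by a common factor `scale` that weakens binding at both its own promoter and the target promoter.

```python
def tf_steady_state(beta, K, n, iterations=200):
    """Steady-state level x of a self-repressing transcription factor.

    Assumed model: dx/dt = beta / (1 + (x / K)**n) - x, where beta is the
    maximal expression rate, K the apparent dissociation constant and n the
    cooperativity of self-repression. The right-hand side is positive at
    x = 0 and negative at x = beta, and it decreases with x, so bisection
    on [0, beta] finds the unique root.
    """
    lo, hi = 0.0, beta
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if beta / (1.0 + (mid / K) ** n) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def target_level(beta, K, n, scale=1.0, beta_t=1.0, K_t=1.0, m=1):
    """Steady-state level of a target gene repressed by the same factor.

    `scale` > 1 mimics a mutation that weakens DNA binding: it multiplies
    both dissociation constants (an assumption of this sketch).
    """
    x = tf_steady_state(beta, K * scale, n)
    return beta_t / (1.0 + (x / (K_t * scale)) ** m)


def fold_change(n, scale=2.0, beta=100.0, K=1.0):
    """Target level in the mutant relative to the wild type."""
    return target_level(beta, K, n, scale) / target_level(beta, K, n, 1.0)


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(f"cooperativity n={n}: target fold change = {fold_change(n):.3f}")
```

Under these assumptions the fold change in the target gene moves closer to 1 as `n` increases, which is the qualitative trend the abstract reports: stronger cooperativity in self-repression gives a target level that is less sensitive to the mutation.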
Subjects
Escherichia coli/genetics , Gene Expression Regulation , Homeostasis , Mutation , Genetic Transcription , Gene Regulatory Networks , Genetic Promoter Regions , Transcription Factors
ABSTRACT
Structural Genomics aims to elucidate protein structures to identify their functions. Unfortunately, the variation of just a few residues can be enough to alter activity or binding specificity and limit the functional resolution of annotations based on sequence and structure; in enzymes, substrates are especially difficult to predict. Here, large-scale controls and direct experiments show that the local similarity of five or six residues selected because they are evolutionarily important and on the protein surface can suffice to identify an enzyme activity and substrate. A motif of five residues predicted that a previously uncharacterized Silicibacter sp. protein was a carboxylesterase for short fatty acyl chains, similar to hormone-sensitive-lipase-like proteins that share less than 20% sequence identity. Assays and directed mutations confirmed this activity and showed that the motif was essential for catalysis and substrate specificity. We conclude that evolutionary and structural information may be combined on a Structural Genomics scale to create motifs of mixed catalytic and noncatalytic residues that identify enzyme activity and substrate specificity.
Subjects
Computational Biology/methods , Enzymes/metabolism , Proteomics/methods , Molecular Cloning , DNA Primers/genetics , Molecular Evolution , Molecular Sequence Annotation , Structure-Activity Relationship , Substrate Specificity
ABSTRACT
Robustness is a property built into biological systems to ensure stereotypical outcomes despite fluctuating inputs from gene dosage, biochemical noise, and the environment. During development, robustness safeguards embryos against structural and functional defects. Yet, our understanding of how robustness is achieved in embryos is limited. While much attention has been paid to the role of gene and signaling networks in promoting robust cell fate determination, little has been done to rigorously assay how mechanical processes like morphogenesis are designed to buffer against variable conditions. Here we show that the cell shape changes that drive morphogenesis can be made robust by mechanisms targeting the actin cytoskeleton. We identified two novel members of the Vinculin/α-Catenin Superfamily that work together to promote robustness during Drosophila cellularization, the dramatic tissue-building event that generates the primary epithelium of the embryo. We find that zygotically-expressed Serendipity-α (Sry-α) and maternally-loaded Spitting Image (Spt) share a redundant, actin-regulating activity during cellularization. Spt alone is sufficient for cellularization at an optimal temperature, but both Spt plus Sry-α are required at high temperature and when actin assembly is compromised by genetic perturbation. Our results offer a clear example of how the maternal and zygotic genomes interact to promote the robustness of early developmental events. Specifically, the Spt and Sry-α collaboration is informative when it comes to genes that show both a maternal and zygotic requirement during a given morphogenetic process. For the cellularization of Drosophilids, Sry-α and its expression profile may represent a genetic adaptive trait with the sole purpose of making this extreme event more reliable. 
Since all morphogenesis depends on cytoskeletal remodeling, both in embryos and adults, we suggest that robustness-promoting mechanisms aimed at actin could be effective at all life stages.
Subjects
Actins/genetics , Drosophila Proteins/genetics , Maternal-Fetal Exchange/genetics , Membrane Proteins/genetics , Morphogenesis/genetics , Actin Cytoskeleton/genetics , Actin Cytoskeleton/metabolism , Animals , Drosophila Proteins/metabolism , Drosophila melanogaster/genetics , Drosophila melanogaster/growth & development , Nonmammalian Embryo , Female , Developmental Gene Expression Regulation , Membrane Proteins/metabolism , Phenotype , Pregnancy , Signal Transduction/genetics , Vinculin/genetics , alpha Catenin/genetics
ABSTRACT
MOTIVATION: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. METHODS AND RESULTS: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. CONCLUSIONS: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. CONTACT: lichtarge@bcm.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Protein Conformation , Protein Sequence Analysis/methods , Algorithms , Bacterial Proteins/chemistry , Genetic Epistasis , Molecular Evolution , Molecular Sequence Annotation , Mutation , Proteins/chemistry , Proteins/genetics , Proteome/chemistry , Serine Endopeptidases/chemistry
ABSTRACT
The auxiliary factor DksA is a global transcription regulator and, with the help of ppGpp, controls the nutritional stress response in Escherichia coli. Although the consequences of its modulation of RNA polymerase (RNAP) are becoming better explained, it is still not fully understood how the two proteins interact. We employed a series of genetic suppressor selections to find residues in RNAP that alter its sensitivity to DksA. Our approach allowed us to identify and genetically characterize in vivo three single amino acid substitutions: β' E677G, β V146F, and β G534D. We demonstrate that the mutation β' E677G affects the activity of both DksA and its homolog, TraR, but does not affect the action of other secondary interactors, such as GreA or GreB. Our mutants provide insight into how different auxiliary transcription factors interact with RNAP and contribute to our understanding of how different stages of transcription are regulated through the secondary channel of RNAP in vivo.
Subjects
DNA-Directed RNA Polymerases/genetics , Escherichia coli Proteins/metabolism , Escherichia coli/enzymology , Escherichia coli/genetics , Bacterial Gene Expression Regulation , Mutation , Amino Acid Sequence , Amino Acid Substitution , DNA-Directed RNA Polymerases/chemistry , DNA-Directed RNA Polymerases/metabolism , Escherichia coli/metabolism , Escherichia coli Proteins/chemistry , Escherichia coli Proteins/genetics , Molecular Models , Genetic Promoter Regions , Protein Binding , Transcription Factors/genetics , Transcription Factors/metabolism , Genetic Transcription
ABSTRACT
SUMMARY: Understanding the differences between knotted and unknotted protein structures may offer insights into how proteins fold. To characterize the type of knot in a protein, we have developed PyKnot, a plugin that works seamlessly within the PyMOL molecular viewer and gives quick results including the knot's invariants, crossing numbers and simplified knot projections and backbones. PyKnot may be useful to researchers interested in classifying knots in macromolecules and provides tools for students of biology and chemistry with which to learn topology and macromolecular visualization. AVAILABILITY: PyMOL is available at http://www.pymol.org. The PyKnot module and tutorial videos are available at http://youtu.be/p95aif6xqcM. CONTACT: rhonald.lua@gmail.com.
Subjects
Computational Biology/methods , Protein Conformation , Proteins/chemistry , Software , Molecular Models , Protein Folding
ABSTRACT
Most proteins lack experimentally validated functions. To address this problem, we implemented the Evolutionary Trace Annotation (ETA) method in the Cytoscape network visualization environment. The result is the ETAscape plugin, which builds a structural genomics network based on local structural and evolutionary similarities among proteins and then globally diffuses known annotations across the resulting network. The plugin displays these novel functional annotations, their confidence, the molecular basis for individual matches and the set of matches that lead to a prediction. AVAILABILITY: The ETA Network Plugin is available publicly for download at http://mammoth.bcm.tmc.edu/networks/.
Subjects
Computational Biology/methods , Proteins/chemistry , Software , Enzymes/analysis , Enzymes/chemistry , Genomics/methods , Proteins/analysis , Substrate Specificity
ABSTRACT
SUMMARY: PyETV is a PyMOL plugin for viewing, analyzing and manipulating predictions of evolutionarily important residues and sites in protein structures and their complexes. It seamlessly captures the output of the Evolutionary Trace server, namely ranked importance of residues, for multiple chains of a complex. It then yields a high resolution graphical interface showing their distribution and clustering throughout a quaternary structure, including at interfaces. Together with other tools in the popular PyMOL viewer, PyETV thus provides a novel tool to integrate evolutionary forces into the design of experiments targeting the most functionally relevant sites of a protein. AVAILABILITY: The PyETV module is written in Python. Installation instructions and video demonstrations may be found at the URL http://mammoth.bcm.tmc.edu/traceview/HelpDocs/PyETVHelp/pyInstructions.html. CONTACT: lichtarge@bcm.tmc.edu.
Subjects
Multiprotein Complexes/chemistry , Protein Interaction Mapping/methods , Software , Binding Sites , Cluster Analysis , Molecular Evolution , Quaternary Protein Structure
ABSTRACT
Like shoelaces, the backbones of proteins may get entangled and form knots. However, only a few knots in native proteins have been identified so far. To more quantitatively assess the rarity of knots in proteins, we make an explicit comparison between the knotting probabilities in native proteins and in random compact loops. We identify knots in proteins statistically, applying the mathematics of knot invariants to the loops obtained by complementing the protein backbone with an ensemble of random closures, and assigning a certain knot type to a given protein if and only if this knot dominates the closure statistics (which tells us that the knot is determined by the protein and not by a particular method of closure). We also examine the local fractal or geometrical properties of proteins via computational measurements of the end-to-end distance and the degree of interpenetration of its subchains. Although we did identify some rather complex knots, we show that native conformations of proteins have statistically fewer knots than random compact loops, and that the local geometrical properties, such as the crumpled character of the conformations at a certain range of scales, are consistent with the rarity of knots. From these, we may conclude that the known "protein universe" (set of native conformations) avoids knots. However, the precise reason for this is unknown--for instance, if knots were removed by evolution due to their unfavorable effect on protein folding or function or due to some other unidentified property of protein evolution.
Subjects
Molecular Evolution , Proteins/chemistry , Proteins/genetics , Molecular Models , Molecular Weight , Probability , Protein Conformation , Protein Folding , Proteins/metabolism
ABSTRACT
Advances in cellular, molecular, and disease biology depend on the comprehensive characterization of gene interactions and pathways. Traditionally, these pathways are curated manually, limiting their efficient annotation and, potentially, reinforcing field-specific bias. Here, in order to test objective and automated identification of functionally cooperative genes, we compared a novel algorithm with three established methods to search for communities within gene interaction networks. Communities identified by the novel approach and by one of the established methods overlapped significantly (q < 0.1) with control pathways. With respect to disease, these communities were biased to genes with pathogenic variants in ClinVar (p ≪ 0.01), and often genes from the same community were co-expressed, including in breast cancers. The interesting subset of novel communities, defined by poor overlap with control pathways, also contained co-expressed genes, consistent with a possible functional role. This work shows that community detection based on topological features of networks suggests new, biologically meaningful groupings of genes that, in turn, point to health and disease relevant hypotheses.
Subjects
Protein Interaction Maps , Algorithms , Bardet-Biedl Syndrome/genetics , Breast Neoplasms/genetics , Computational Biology , Genetic Epistasis , Female , Gene Regulatory Networks , Genetic Predisposition to Disease , Humans , Protein Interaction Maps/genetics , Zellweger Syndrome/genetics
ABSTRACT
We suggest and discuss a simple model of an ideal gas under the piston to gain an insight into the workings of the Jarzynski identity connecting the average exponential of the work over the nonequilibrium trajectories with the equilibrium free energy. We show that the identity is valid for our system, due to the very rapid molecules belonging to the tail of the Maxwell distribution. For the most interesting extreme, when the system volume is large, while the piston is moving with great speed (compared to thermal velocity) for a very short time, the necessary number of independent experimental runs to obtain a reasonable approximation for the free energy from averaging the nonequilibrium work grows exponentially with the system size.
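For reference, the identity this abstract discusses is usually written in the following standard form. The abstract itself gives no formula, so the notation here is an assumption: W is the work done on the system along one nonequilibrium trajectory, ΔF the equilibrium free energy difference between the final and initial states, T the initial temperature, k_B the Boltzmann constant, and the angle brackets an average over independent runs.

```latex
\left\langle e^{-W / k_B T} \right\rangle = e^{-\Delta F / k_B T}
```

Because the average is of an exponential, rare trajectories with unusually small W can dominate it, which is consistent with the abstract's point that the fast molecules in the tail of the Maxwell distribution are what make the identity hold.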
ABSTRACT
Motivated by experiments in which single-stranded DNA with a short hairpin loop at one end undergoes unforced diffusion through a narrow pore, we study the first passage times for a particle, executing one-dimensional Brownian motion in an asymmetric sawtooth potential, to exit one of the boundaries. We consider the first passage times for the case of classical diffusion, characterized by a mean-square displacement of the form ⟨(Δx)²⟩ ~ t, and for the case of anomalous diffusion or subdiffusion, characterized by a mean-square displacement of the form ⟨(Δx)²⟩ ~ t^γ with 0 [...]
Subjects
Cell Membrane/chemistry , DNA/chemistry , Chemical Models , Computer Simulation , Statistical Models , Motion (Physics) , Nucleic Acid Conformation , Porosity
ABSTRACT
Natural selection for specific functions places limits upon the amino acid substitutions a protein can accept. Mechanisms that expand the range of tolerable amino acid substitutions include chaperones that can rescue destabilized proteins and additional stability-enhancing substitutions. Here, we present an alternative mechanism that is simple and uses a frequently encountered network motif. Computational and experimental evidence shows that the self-correcting, negative-feedback gene regulation motif increases repressor expression in response to deleterious mutations and thereby precisely restores repression of a target gene. Furthermore, this ability to rescue repressor function is observable across the Eubacteria kingdom through the greater accumulation of amino acid substitutions in negative-feedback transcription factors compared to genes they control. We propose that negative feedback represents a self-contained genetic canalization mechanism that preserves phenotype while permitting access to a wider range of functional genotypes.
Subjects
Amino Acid Substitution , Biological Evolution , Gene Expression Regulation , Bacterial Proteins/genetics , Feedback , Genetic Models , Genetic Selection , Serine Endopeptidases/genetics
ABSTRACT
Understanding the molecular basis of protein function remains a central goal of biology, with the hope to elucidate the role of human genes in health and in disease, and to rationally design therapies through targeted molecular perturbations. We review here some of the computational techniques and resources available for characterizing a critical aspect of protein function - those mediated by protein-protein interactions (PPI). We describe several applications and recent successes of the Evolutionary Trace (ET) in identifying molecular events and shapes that underlie protein function and specificity in both eukaryotes and prokaryotes. ET is a part of analytical approaches based on the successes and failures of evolution that enable the rational control of PPI.
Subjects
Computational Biology/methods , Protein Interaction Mapping/methods , Proteins/metabolism , Humans , Proteins/chemistry , Substrate Specificity
ABSTRACT
Genome-wide association studies (GWAS) and whole-exome sequencing (WES) generate massive amounts of genomic variant information, and a major challenge is to identify which variations drive disease or contribute to phenotypic traits. Because the majority of known disease-causing mutations are exonic non-synonymous single nucleotide variations (nsSNVs), most studies focus on whether these nsSNVs affect protein function. Computational studies show that the impact of nsSNVs on protein function reflects sequence homology and structural information and predict the impact through statistical methods, machine learning techniques, or models of protein evolution. Here, we review impact prediction methods and discuss their underlying principles, their advantages and limitations, and how they compare to and complement one another. Finally, we present current applications and future directions for these methods in biological research and medical genetics.
Subjects
Medical Genetics/methods , Single Nucleotide Polymorphism/genetics , Biomedical Research , Genome-Wide Association Study , Humans
ABSTRACT
Numerical studies of the average size of trivially knotted polymer loops with no excluded volume were undertaken. Topology was identified by Alexander and Vassiliev degree 2 invariants. Probability of a trivial knot, average gyration radius, and probability density distributions as functions of gyration radius were generated for loops of up to N = 3,000 segments. Gyration radii of trivially knotted loops were found to follow a power law similar to that of self-avoiding walks consistent with earlier theoretical predictions.