ABSTRACT
Motivation: Detecting novel functional modules in molecular networks is an important step in biological research. In the absence of gold standard functional modules, functional annotations are often used to verify whether detected modules/communities have biological meaning. However, as we show, the uneven distribution of functional annotations means that such evaluation methods favor communities of well-studied proteins. Results: We propose a novel framework for the evaluation of communities as functional modules. Our proposed framework, CommWalker, takes communities as inputs and evaluates them in their local network environment by performing short random walks. We test CommWalker's ability to overcome annotation bias using input communities from four community detection methods on two protein interaction networks. We find that modules accepted by CommWalker are similarly co-expressed as those accepted by current methods. Crucially, CommWalker performs well not only in well-annotated regions, but also in regions otherwise obscured by poor annotation. CommWalker community prioritization both faithfully captures well-validated communities and identifies functional modules that may correspond to more novel biology. Availability and implementation: The CommWalker algorithm is freely available at opig.stats.ox.ac.uk/resources or as a docker image on the Docker Hub at hub.docker.com/r/lueckenmd/commwalker/. Contact: deane@stats.ox.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
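The idea of probing a community's local network environment with short random walks can be illustrated with a minimal sketch. This is not the CommWalker scoring procedure, which the abstract does not specify. The function name, the adjacency-dictionary input, and the default walk length and walk count are all assumptions made for illustration.

```python
import random

def short_walk_visits(adjacency, community, walk_length=3, walks_per_node=100, seed=0):
    """Count node visits made by short random walks started inside a community.

    adjacency: dict mapping each node to a list of its neighbours (assumed input format).
    community: iterable of start nodes.
    walk_length and walks_per_node are illustrative defaults, not published values.
    """
    rng = random.Random(seed)
    visits = {}
    for start in community:
        for _ in range(walks_per_node):
            node = start
            for _ in range(walk_length):
                neighbours = adjacency.get(node, [])
                if not neighbours:
                    break  # dead end: stop this walk early
                node = rng.choice(neighbours)
                visits[node] = visits.get(node, 0) + 1
    return visits

# Toy path graph a - b - c, with walks started from the single-node community {a}.
toy_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
toy_visits = short_walk_visits(toy_graph, ["a"], walk_length=2, walks_per_node=10)
```

The visit counts describe which nodes lie within a few steps of the community; any annotation-based score computed over that neighbourhood would be a further step that this sketch does not attempt.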
Subjects
Computational Biology/methods , Molecular Sequence Annotation , Protein Interaction Mapping/methods , Software , Algorithms , Humans
ABSTRACT
WONKA is a tool for the systematic analysis of an ensemble of protein-ligand structures. It makes the identification of conserved and unusual features within such an ensemble straightforward. WONKA uses an intuitive workflow to process structural co-ordinates. Ligand and protein features are summarised and then presented within an interactive web application. WONKA's power in consolidating and summarising large amounts of data is described through the analysis of three bromodomain datasets. Furthermore, and in contrast to many current methods, WONKA relates analysis to individual ligands, from which we find unusual and erroneous binding modes. Finally the use of WONKA as an annotation tool to share observations about structures is demonstrated. WONKA is freely available to download and install locally or can be used online at http://wonka.sgc.ox.ac.uk.
Subjects
Drug Design , Proteins/chemistry , Software , Protein Databases , Histone Acetyltransferases , Histone Chaperones , Humans , Intracellular Signaling Peptides and Proteins/chemistry , Intracellular Signaling Peptides and Proteins/metabolism , Ligands , Molecular Models , Nuclear Proteins/chemistry , Nuclear Proteins/metabolism , Protein Binding , Protein Conformation , Tertiary Protein Structure , Proteins/metabolism , General Transcription Factors , Workflow
ABSTRACT
BACKGROUND: In mammals, oocyte activation at fertilization is thought to be induced by the sperm-specific phospholipase C zeta (PLCzeta). However, it still remains to be conclusively shown that PLCzeta is the endogenous agent of oocyte activation. Some types of human infertility appear to be caused by failure of the sperm to activate and this may be due to specific defects in PLCzeta. METHODS AND RESULTS: Immunofluorescence studies showed PLCzeta to be localized in the equatorial region of sperm from fertile men, but sperm deficient in oocyte activation exhibited no specific signal in this same region. Immunoblot analysis revealed reduced amounts of PLCzeta in sperm from infertile men, and in some cases, the presence of an abnormally low molecular weight form of PLCzeta. In one non-globozoospermic case, DNA analysis identified a point mutation in the PLCzeta gene that leads to a significant amino acid change in the catalytic region of the protein. Structural modelling suggested that this defect may have important effects upon the structure and function of the PLCzeta protein. cRNA corresponding to mutant PLCzeta failed to induce calcium oscillations when microinjected into mouse oocytes. Injection of infertile human sperm into mouse oocytes failed to activate the oocyte or trigger calcium oscillations. Injection of such infertile sperm followed by two calcium pulses, induced by assisted oocyte activation, activated the oocytes without inducing the typical pattern of calcium oscillations. CONCLUSIONS: Our findings illustrate the importance of PLCzeta during fertilization and suggest that mutant forms of PLCzeta may underlie certain types of human male infertility.
Subjects
Male Infertility/enzymology , Phosphoinositide Phospholipase C/metabolism , Sperm-Ovum Interactions/physiology , Spermatozoa/metabolism , Amino Acid Substitution , Animals , Binding Sites , Calcium/metabolism , Fertilization/physiology , Humans , Immunoblotting , Male , Mice , Molecular Models , Phosphoinositide Phospholipase C/chemistry , Phosphoinositide Phospholipase C/genetics , Point Mutation , Tertiary Protein Structure
ABSTRACT
Antibodies are proteins of the immune system that are able to bind to a huge variety of different substances, making them attractive candidates for therapeutic applications. Antibody structures have the potential to be useful during drug development, allowing the implementation of rational design procedures. The most challenging part of the antibody structure to experimentally determine or model is the H3 loop, which in addition is often the most important region in an antibody's binding site. This review summarises the approaches used so far in the pursuit of accurate computational H3 structure prediction.
ABSTRACT
CODA, an algorithm for predicting the variable regions in proteins, combines FREAD, a knowledge-based approach, and PETRA, which constructs the region ab initio. FREAD selects from a database of protein structure fragments with environmentally constrained substitution tables and other rule-based filters. FREAD was parameterized and tested on over 3000 loops. The average root mean square deviation ranged from 0.78 Å for three-residue loops to 3.5 Å for eight-residue loops on a nonhomologous test set. CODA clusters the predictions from the two independent programs and makes a consensus prediction that must pass a set of rule-based filters. CODA was parameterized and tested on two separate sets of structures that were nonhomologous to one another and to those found in the FREAD database. The average root mean square deviation in the test set ranged from 0.76 Å for three-residue loops to 3.09 Å for eight-residue loops. CODA shows a general improvement in loop prediction over PETRA and FREAD individually. The improvement is far more marked for lengths six and upward, probably as the predictive power of PETRA becomes more important. CODA was further tested on several model structures to determine its applicability to the modeling situation. A web server of CODA is available at http://www-cryst.bioc.cam.ac.uk/~charlotte/Coda/search_coda.html.
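The root mean square deviation quoted above is the square root of the mean squared distance between equivalent atoms in two structures. The sketch below is a minimal illustration only. It assumes the two coordinate lists are already superposed and listed in matching atom order. The abstract does not state which atoms were compared or how structures were fitted, so the function name and input format are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z) tuples.

    Assumes the structures are already superposed; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Toy example: two atoms, each displaced by exactly 1 unit along z.
example_rmsd = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
```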
Subjects
Algorithms , Amino Acid Motifs , Factual Databases , Peptide Fragments/chemistry , Amino Acid Substitution , Bayes Theorem , Genetic Variation , Molecular Models , Theoretical Models , Protein Conformation , Reproducibility of Results
ABSTRACT
We describe a database of protein structure alignments for homologous families. The database HOMSTRAD presently contains 130 protein families and 590 aligned structures, which have been selected on the basis of quality of the X-ray analysis and accuracy of the structure. For each family, the database provides a structure-based alignment derived using COMPARER and annotated with JOY in a special format that represents the local structural environment of each amino acid residue. HOMSTRAD also provides a set of superposed atomic coordinates obtained using MNYFIT, which can be viewed with a graphical user interface or used for comparative modeling studies. The database is freely available on the World Wide Web at http://www-cryst.bioc.cam.ac.uk/~homstrad/, with search facilities and links to other databases.
Subjects
Factual Databases , Proteins/chemistry , Sequence Alignment , Aspartic Acid Endopeptidases/chemistry , Molecular Evolution , Internet , Secondary Protein Structure
ABSTRACT
Thrombopoietin (TPO) is a glycoprotein hormone that regulates platelet production. Presented here is a modeling study of the extracellular region of the human thrombopoietin receptor complex, in particular the TPO-receptor interface. The models were developed from structural homology to other cytokines and their receptors. Experimental evidence suggests that the receptor is homodimeric and it was modeled accordingly. Key interactions are shown that correlate with previous cytokine receptor complexes, and the pattern of cysteine bonding (Cys7-Cys151 and Cys29-Cys85) agrees with that experimentally determined for thrombopoietin. These models pave the way for possible mutagenesis experimentation and the design of (ant)agonists.
Subjects
Molecular Models , Neoplasm Proteins , Protein Conformation , Proto-Oncogene Proteins/chemistry , Cytokine Receptors , Thrombopoietin/chemistry , Amino Acid Sequence , Binding Sites , Computer Graphics , Humans , Molecular Sequence Data , Molecular Structure , Proto-Oncogene Proteins/metabolism , Thrombopoietin Receptors , Sequence Alignment , Thrombopoietin/metabolism
ABSTRACT
The binding site of an antibody is formed between the two variable domains, VH and VL, of its antigen binding fragment (Fab). Understanding how VH and VL orientate with respect to one another is important both for studying the mechanisms of antigen specificity and affinity and improving antibody modelling, docking and engineering. Different VH-VL orientations are commonly described using relative measures such as root-mean-square deviation. Recently, the orientation has also been characterised using the absolute measure of a VH-VL packing angle. However, a single angle cannot fully describe all modes of orientation. Here, we present a method which fully characterises VH-VL orientation in a consistent and absolute sense using five angles (HL, HC1, LC1, HC2 and LC2) and a distance (dc). Additionally, we provide a computational tool, ABangle, to allow the VH-VL orientation for any antibody to be automatically calculated and compared with all other known structures. We compare previous studies and show how the modes of orientation being identified relate to movements of different angles. Thus, we are able to explain why different studies identify different structural clusters and different residues as important. Given this result, we then identify those positions and their residue identities which influence each of the angular measures of orientation. Finally, by analysing VH-VL orientation in bound and unbound forms, we find that antibodies specific for protein antigens are significantly more flexible in their unbound form than antibodies specific for hapten antigens. ABangle is freely available at http://opig.stats.ox.ac.uk/webapps/abangle.
Subjects
Antibodies/chemistry , Computational Biology , Immunoglobulin Light Chains/chemistry , Immunoglobulin Variable Region/chemistry , Animals , Antibodies/immunology , Consensus Sequence , Haptens/immunology , Humans , Immunoglobulin Light Chains/immunology , Immunoglobulin Variable Region/immunology , Mice , Molecular Models , Protein Conformation , Rats
ABSTRACT
5-HT3 receptors possess a number of highly conserved proline residues. We changed each of these to alanine, expressed the mutants as homomeric 5-HT3A receptors in HEK293 cells, and analyzed them with radioligand binding, electrophysiology, and immunocytochemistry. Mutation of Pro56, Pro104, Pro123, and Pro170 resulted in ablation of radioligand binding, whereas mutation of Pro257 and Pro301 did not. Only the latter two mutants were expressed at the plasma membrane, but they were non-functional. Thus the former, which are in the N-terminal domain, may be involved in forming correct receptor structure, while those in the transmembrane region (Pro257 and Pro301) are necessary for the function of the protein. To explore the conformational preference (propensity) of these residues we examined the proportion of cis-prolines and the influence of adjacent residues in known protein structures. 4.7% of prolines in the protein database were in the cis conformation, and the distribution of amino acids adjacent to cis-prolines was not random. Comparison of the proportion of each amino acid residue adjacent to a cis-proline revealed that aromatic and bend-facilitating residues were favored while those with beta-branched chains were not. Thus five residues (Gly, Pro, Tyr, Trp, Phe) and three residues (Pro, Tyr, Phe) were found more frequently than expected before and after cis-prolines respectively, whereas five residues (Val, Ile, Leu, Asp, Thr) and two residues (Asp, Glu) were found less frequently. Of the 20 proline residues in the 5-HT3A receptor subunit only Pro170 has adjacent residues that are favorable. Mutating these to non-favorable residues resulted in ablation of ligand binding, whereas replacement with alternative favorable residues did not. We therefore propose that Pro170, which is part of the characteristic Cys-loop found in this family of proteins, may be in the cis conformation.
Subjects
Proline/metabolism , Serotonin Receptors/metabolism , Amino Acid Sequence , Cell Line , Humans , Immunohistochemistry , Molecular Sequence Data , Mutagenesis , Proline/genetics , Radioligand Assay , Serotonin Receptors/chemistry , Serotonin Receptors/genetics , Serotonin 5-HT3 Receptors
ABSTRACT
We present a fast ab initio method for the prediction of local conformations in proteins. The program, PETRA, selects polypeptide fragments from a computer-generated database (APD) encoding all possible peptide fragments up to twelve amino acids long. Each fragment is defined by a representative set of eight phi/psi pairs, obtained iteratively from a trial set by calculating how fragments generated from them represent the Protein Data Bank (PDB). Ninety-six percent (96%) of length-five fragments in crystal structures, with a resolution better than 1.5 Å and less than 25% identity, have a conformer in the database with less than 1 Å root-mean-square deviation (rmsd). In order to select segments from APD, PETRA uses a set of simple rule-based filters, thus reducing the number of potential conformations to a manageable total. This reduced set is scored and sorted using rmsd fit to the anchor regions and a knowledge-based energy function dependent on the sequence to be modelled. The best scoring fragments can then be optimized by minimization of contact potentials and rmsd fit to the core model. The quality of the prediction made by PETRA is evaluated by calculating both the differences in rmsd and backbone torsion angles between the final model and the native fragment. The average rmsd ranges from 1.4 Å for three-residue loops to 3.9 Å for eight-residue loops.
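To give a sense of why filtering is needed, the size of such a fragment database can be sketched as follows. This assumes that every residue independently takes one of the eight representative phi/psi states, so a fragment of n residues has 8^n conformers. The abstract does not describe how APD is stored, so the function names and the use of state indices 0 to 7 in place of real angle values are purely illustrative.

```python
from itertools import product

N_STATES = 8  # eight representative phi/psi pairs, as stated in the abstract

def count_conformers(length, n_states=N_STATES):
    """Number of discrete conformers for a fragment of `length` residues."""
    return n_states ** length

def enumerate_conformers(length, n_states=N_STATES):
    """Lazily yield every conformer as a tuple of state indices (not real angles)."""
    return product(range(n_states), repeat=length)

# Conformer counts for fragment lengths 1 to 12 under the stated assumption.
sizes = {n: count_conformers(n) for n in range(1, 13)}
```

Under this assumption a five-residue fragment has 32768 conformers and a twelve-residue fragment has 8^12, which is why the candidate set has to be reduced before scoring.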
Subjects
Algorithms , Peptides/chemistry , Protein Conformation , Proteins/chemistry , Factual Databases , Molecular Models , Peptide Fragments/chemistry , Protein Folding , Software
ABSTRACT
The SLoop database of supersecondary fragments, first described by Donate et al. (Protein Sci., 1996, 5, 2600-2616), contains protein loops, classified according to structural similarity. The database has recently been updated and currently contains over 10 000 loops up to 20 residues in length, which cluster into over 560 well-populated classes. The database can be found at http://www-cryst.bioc.cam.ac.uk/~sloop. In this paper, we identify conserved structural features such as main chain conformation and hydrogen bonding. Using the original approach of Rufino and co-workers (1997), the correct structural class is predicted with the highest SLoop score for 35% of loops. This rises to 65% by considering the three highest scoring class predictions and to 75% in the top five scoring class predictions. Inclusion of residues from the neighbouring secondary structures and use of substitution tables derived using a reduced definition of secondary structure increase these prediction accuracies to 58, 78 and 85%, respectively. This suggests that capping residues can stabilize the loop conformation as well as that of the secondary structure. Further increases are achieved if only well-populated classes are considered in the prediction. These results correspond to an average loop root mean square deviation of between 0.4 and 2.6 Å for loops up to five residues in length.
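The "top one", "top three" and "top five" figures above are top-k accuracies: the fraction of loops whose correct class appears among the k highest-scoring predictions. A minimal sketch follows. The function name, the input layout (one best-first ranked list of class labels per loop) and the toy labels are assumptions for illustration and are not SLoop data.

```python
def top_k_accuracy(ranked_predictions, true_classes, k):
    """Fraction of cases whose true class is among the first k ranked predictions."""
    if len(ranked_predictions) != len(true_classes) or not true_classes:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, true_classes)
        if truth in ranked[:k]
    )
    return hits / len(true_classes)

# Toy data: four loops, each with three candidate classes ranked best-first.
toy_ranked = [["A", "B", "C"], ["B", "A", "C"], ["C", "B", "A"], ["A", "C", "B"]]
toy_truth = ["A", "A", "A", "B"]
toy_scores = {k: top_k_accuracy(toy_ranked, toy_truth, k) for k in (1, 2, 3)}
```

Accuracy can only stay the same or rise as k grows, which matches the progression reported in the abstract.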
Subjects
Amino Acid Motifs , Peptide Fragments , Proteins/chemistry , Algorithms , Amino Acid Sequence , Factual Databases , Hydrogen Bonding , Internet , Molecular Sequence Data , Peptide Fragments/chemistry , Peptides/chemistry , Protein Conformation , Proteins/genetics , Amino Acid Sequence Homology
ABSTRACT
We describe a web server that provides easy access to the SLoop database of loop conformations connecting elements of protein secondary structure. The loops are classified according to their length, the type of bounding secondary structures and the conformation of the main chain. The current release of the database consists of over 8000 loops of up to 20 residues in length. A loop prediction method, which selects conformers on the basis of the sequence and the positions of the elements of secondary structure, is also implemented. These web pages are freely accessible over the internet at http://www-cryst.bioc.cam.ac.uk/~sloop.
Subjects
Factual Databases , Proteins/chemistry , Amino Acid Sequence , Internet , Molecular Models , Molecular Sequence Data , Peptide Fragments/chemistry , Protein Conformation , Secondary Protein Structure , Proteins/genetics , Software
ABSTRACT
MOTIVATION: The prediction of the regions of homology models that can be 'restrained by' or 'copied from' the basis structures is a vital step in correct model generation, because these regions are the model's most accurate part. However, there is no ideal method for the identification of their limits. In most algorithms their length depends on the number of family members and definitions of secondary structure. RESULTS: The algorithm SCORE steps away from the conventional definitions of the core to identify from large numbers of basis structures those regions that can be considered structurally related to a target sequence. SCORE uses phi, psi constraints to pinpoint the regions that are conserved across a family, and environmentally constrained substitution tables to extend these regions. This allows it to rapidly identify and build the core of homology models from the alignments of the target sequence to the basis structures, generally in under 1 s, an order of magnitude faster than methods such as MODELLER. The SCORE algorithm was used to build 114 model cores. In only two cases was the core size less than 50% of the structure, and all the cores built had an RMSD of 3.7 Å or less to the target structure.
Subjects
Algorithms , Molecular Models , Proteins/analysis , Conserved Sequence , Genetic Variation , Tertiary Protein Structure/genetics , Sequence Alignment
ABSTRACT
MOTIVATION: JOY is a program to annotate protein sequence alignments with three-dimensional (3D) structural features. It was developed to display 3D structural information in a sequence alignment and to help understand the conservation of amino acids in their specific local environments. RESULTS: The JOY representation now constitutes an essential part of the two databases of protein structure alignments: HOMSTRAD (http://www-cryst.bioc.cam.ac.uk/homstrad) and CAMPASS (http://www-cryst.bioc.cam.ac.uk/campass). It has also been successfully used for identifying distant evolutionary relationships. AVAILABILITY: The program can be obtained via anonymous ftp from torsa.bioc.cam.ac.uk from the directory /pub/joy/. The address for the JOY server is http://www-cryst.bioc.cam.ac.uk/cgi-bin/joy.cgi. CONTACT: kenji@cryst.bioc.cam.ac.uk
Subjects
Protein Conformation , Proteins/chemistry , Software , Amino Acid Sequence , Hydrogen Bonding , Molecular Sequence Data , Secondary Protein Structure , Sequence Alignment
ABSTRACT
Asparagine and aspartate are known to adopt conformations in the left-handed alpha-helical region and other partially allowed regions of the Ramachandran plot more readily than any other non-glycyl amino acids. The reason for this preference has not been established. An examination of the local environments of asparagine and aspartic acid in protein structures with a resolution better than 1.5 Å revealed that their side-chain carbonyls are frequently within 4 Å of their own backbone carbonyl or the backbone carbonyl of the previous residue. Calculations using protein structures with a resolution better than 1.8 Å reveal that this close contact occurs in more than 80% of cases. This carbonyl-carbonyl interaction offers an energetic stabilization for the partially allowed conformations of asparagine and aspartic acid with respect to all other non-glycyl amino acids. The non-covalent attractive interaction between the dipoles of two carbonyls has recently been calculated to have an energy comparable to that of a hydrogen bond. The preponderance of asparagine in the left-handed alpha-helical region, and in general of aspartic acid and asparagine in the partially allowed regions of the Ramachandran plot, may be a consequence of this carbonyl-carbonyl stacking interaction.
Subjects
Asparagine/chemistry , Aspartic Acid/chemistry , Amino Acids/chemistry , X-Ray Crystallography , Molecular Models , Molecular Conformation , Secondary Protein Structure
ABSTRACT
Our approach to fold recognition for the fourth critical assessment of techniques for protein structure prediction (CASP4) experiment involved the use of the FUGUE sequence-structure homology recognition program (http://www-cryst.bioc.cam.ac.uk/fugue), followed by model building. We treat models as hypotheses and examine these to determine whether they explain the available data. Our method depends heavily on environment-specific substitution tables derived from our database of structural alignments of homologous proteins (HOMSTRAD, http://www-cryst.bioc.cam.ac.uk/homstrad/). FUGUE uses these tables to incorporate structural information into profiles created from HOMSTRAD alignments that are matched against a profile created for the target from multiple sequence alignment. In addition, environment-specific substitution tables are used throughout the modeling procedure and as part of the model evaluation. Annotation of sequence alignments with JOY, to reflect local structural features, proved valuable, both for modifying hypotheses, and for rejecting predictions when the expected pattern of conservation is not observed. Our stringency in rejecting incorrect predictions led us to submit a relatively small number of models, including only a low number of false positives, resulting in a high average score.
Subjects
Molecular Models , Tertiary Protein Structure , Amino Acid Sequence Homology , Amino Acid Sequence , Carbohydrate Metabolism , Carboxylic Ester Hydrolases/chemistry , Computer Simulation , Cytoskeletal Proteins/chemistry , Molecular Sequence Data , Polysaccharide-Lyases/chemistry , Protein Binding , Protein Folding , Protein Sequence Analysis , Software , alpha Catenin
ABSTRACT
Correct alignment of the sequence of a target protein with those of homologues of known three-dimensional structure is a key step in comparative modeling. Usually an iterative approach that takes account of the local and overall structural features is required. We describe such an approach that exploits databases of structural alignments of homologous proteins (HOMSTRAD, http://www-cryst.bioc.cam.ac.uk/~homstrad) and protein superfamilies (CAMPASS, http://www-cryst.bioc.cam.ac.uk/~campass), in which structure-based alignments are analyzed and formatted with the program JOY (http://www-cryst.bioc.cam.ac.uk/~joy) to reveal conserved local structural features. The databases facilitate the recognition of a family or superfamily, assist in the selection of useful parent structures, help in aligning the target sequences with the parent set, and are useful for deriving relationships that can be used in validating models. In the iterative approach, a model is constructed on the basis of the proposed sequence alignment and this is then re-expressed in the JOY format and realigned with the parent set. This is repeated until the model and sequence alignment are optimized. We examine the case for comparison and use of multiple structures of family members, rather than a single parent structure. We use the targets attempted by our group in CASP3 to assess the value of such procedures.