1.
PLoS Comput Biol ; 11(8): e1004362, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26312481

ABSTRACT

The African clawed frog Xenopus laevis is an important model organism for studies in developmental and cell biology, including cell-signaling. However, our knowledge of X. laevis protein post-translational modifications remains scarce. Here, we used a mass spectrometry-based approach to survey the phosphoproteome of this species, compiling a list of 2636 phosphosites. We used structural information and phosphoproteomic data for 13 other species in order to predict functionally important phospho-regulatory events. We found that the degree of conservation of phosphosites across species is predictive of sites with known molecular function. In addition, we predicted kinase-protein interactions for a set of cell-cycle kinases across all species. The degree of conservation of kinase-protein interactions was found to be predictive of functionally relevant regulatory interactions. Finally, using comparative protein structure models, we find that phosphosites within structured domains tend to be located at positions with high conformational flexibility. Our analysis suggests that a small class of phosphosites occurs in positions that have the potential to regulate protein conformation.
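
The conservation measure mentioned in this abstract can be illustrated with a minimal Python sketch that scores a phosphosite by the fraction of other species in which the same aligned position is also phosphorylated. The function name `conservation`, the use of alignment column numbers, and the toy data are all assumptions made for illustration; the abstract does not state how the authors actually computed conservation.

```python
def conservation(column, reference, sites_by_species):
    """Fraction of non-reference species with a phosphosite at `column`.

    `sites_by_species` maps a species name to the set of alignment
    columns where phosphorylation was observed (hypothetical layout).
    """
    others = [s for s in sites_by_species if s != reference]
    if not others:
        return 0.0
    hits = sum(1 for s in others if column in sites_by_species[s])
    return hits / len(others)


# Toy data, invented for illustration only.
sites = {
    "X. laevis": {12, 40, 77},
    "H. sapiens": {12, 77},
    "M. musculus": {12},
    "D. rerio": {12, 40},
    "S. cerevisiae": {90},
}

for col in sorted(sites["X. laevis"]):
    print(col, conservation(col, "X. laevis", sites))
```

A real analysis would also need orthology assignment and a way to handle species with sparse phosphoproteomic coverage, neither of which this sketch attempts.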


Subject(s)
Oocytes/metabolism; Phosphoproteins/analysis; Phosphoproteins/chemistry; Animals; Female; Mass Spectrometry; Models, Molecular; Phosphoproteins/metabolism; Phosphorylation; Protein Interaction Maps; Proteomics; Xenopus laevis
2.
Nucleic Acids Res ; 42(Database issue): D336-46, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24271400

ABSTRACT

ModBase (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by ModPipe, an automated modeling pipeline that relies primarily on Modeller for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller/). ModBase currently contains almost 30 million reliable models for domains in 4.7 million unique protein sequences. ModBase allows users to compute or update comparative models on demand, through an interface to the ModWeb modeling server (http://salilab.org/modweb). ModBase models are also available through the Protein Model Portal (http://www.proteinmodelportal.org/). Recently developed associated resources include the AllosMod server for modeling ligand-induced protein dynamics (http://salilab.org/allosmod), the AllosMod-FoXS server for predicting a structural ensemble that fits an SAXS profile (http://salilab.org/allosmod-foxs), the FoXSDock server for protein-protein docking filtered by an SAXS profile (http://salilab.org/foxsdock), the SAXS Merge server for automatic merging of SAXS profiles (http://salilab.org/saxsmerge) and the Pose & Rank server for scoring protein-ligand complexes (http://salilab.org/poseandrank). In this update, we also highlight two applications of ModBase: a PSI:Biology initiative to maximize the structural coverage of the human alpha-helical transmembrane proteome and a determination of structural determinants of human immunodeficiency virus-1 protease specificity.


Subject(s)
Databases, Protein; Models, Molecular; Structural Homology, Protein; HIV Protease/chemistry; Humans; Internet; Membrane Proteins/chemistry; Molecular Sequence Annotation; Protein Structure, Tertiary; Proteome/chemistry; Scattering, Small Angle; X-Ray Diffraction
3.
Proc Natl Acad Sci U S A ; 110(36): E3381-7, 2013 Sep 03.
Article in English | MEDLINE | ID: mdl-23959887

ABSTRACT

Although the universe of protein structures is vast, these innumerable structures can be categorized into a finite number of folds. New functions commonly evolve by elaboration of existing scaffolds, for example, via domain insertions. Thus, understanding structural diversity of a protein fold evolving via domain insertions is a fundamental challenge. The haloalkanoic dehalogenase superfamily serves as an excellent model system wherein a variable cap domain accessorizes the ubiquitous Rossmann-fold core domain. Here, we determine the impact of the cap-domain insertion on the sequence and structure divergence of the core domain. Through quantitative analysis on a unique dataset of 154 core-domain-only and cap-domain-only structures, basic principles of their evolution have been uncovered. The relationship between sequence and structure divergence of the core domain is shown to be monotonic and independent of the corresponding type of domain insert, reflecting the robustness of the Rossmann fold to mutation. However, core domains with the same cap type share greater similarity at the sequence and structure levels, suggesting interplay between the cap and core domains. Notably, results reveal that the variance in structure maps to α-helices flanking the central β-sheet and not to the domain-domain interface. Collectively, these results hint at intramolecular coevolution where the fold diverges differentially in the context of an accessory domain, a feature that might also apply to other multidomain superfamilies.


Subject(s)
Hydrolases/chemistry; Protein Structure, Secondary; Protein Structure, Tertiary; Evolution, Molecular; Genetic Variation; Hydrolases/classification; Hydrolases/genetics; Models, Molecular; Mutagenesis, Insertional; Phylogeny; Principal Component Analysis; Protein Folding
4.
Nat Methods ; 9(8): 834-9, 2012 May 20.
Article in English | MEDLINE | ID: mdl-22609626

ABSTRACT

Although nearly half of today's major pharmaceutical drugs target human integral membrane proteins (hIMPs), only 30 hIMP structures are currently available in the Protein Data Bank, largely owing to inefficiencies in protein production. Here we describe a strategy for the rapid structure determination of hIMPs, using solution NMR spectroscopy with systematically labeled proteins produced via cell-free expression. We report new backbone structures of six hIMPs, solved in only 18 months from 15 initial targets. Application of our protocols to an additional 135 hIMPs with molecular weight <30 kDa yielded 38 hIMPs suitable for structural characterization by solution NMR spectroscopy without additional optimization.


Subject(s)
Membrane Proteins/chemistry; Nuclear Magnetic Resonance, Biomolecular/methods; Databases, Protein; Humans; Models, Molecular; Molecular Weight; Protein Conformation
5.
PLoS Comput Biol ; 9(10): e1003253, 2013.
Article in English | MEDLINE | ID: mdl-24098102

ABSTRACT

Mycobacterium tuberculosis, the causative agent of tuberculosis (TB), infects an estimated two billion people worldwide and is the leading cause of mortality due to infectious disease. The development of new anti-TB therapeutics is required because of the emergence of multidrug-resistant strains as well as co-infection with other pathogens, especially HIV. Recently, the pharmaceutical company GlaxoSmithKline published the results of a high-throughput screen (HTS) of their two million compound library for anti-mycobacterial phenotypes. The screen revealed 776 compounds with significant activity against the M. tuberculosis H37Rv strain, including a subset of 177 prioritized compounds with high potency and low in vitro cytotoxicity. The next major challenge is the identification of the target proteins. Here, we use a computational approach that integrates historical bioassay data, chemical properties and structural comparisons of selected compounds to propose their potential targets in M. tuberculosis. We predicted 139 target-compound links, providing a necessary basis for further studies to characterize the mode of action of these compounds. The results from our analysis, including the predicted structural models, are available to the wider scientific community in the open source mode, to encourage further development of novel TB therapeutics.


Subject(s)
Antitubercular Agents/chemistry; Bacterial Proteins/chemistry; Computational Biology/methods; Drug Discovery/methods; Mycobacterium tuberculosis/chemistry; Amino Acid Sequence; Antitubercular Agents/metabolism; Bacterial Proteins/metabolism; Databases, Chemical; Molecular Docking Simulation; Molecular Sequence Data; Protein Conformation; Sequence Alignment
6.
Bioinformatics ; 28(15): 2072-3, 2012 Aug 01.
Article in English | MEDLINE | ID: mdl-22618536

ABSTRACT

SUMMARY: Accurate alignment of protein sequences and/or structures is crucial for many biological analyses, including functional annotation of proteins, classifying protein sequences into families, and comparative protein structure modeling. Described here is a web interface to SALIGN, the versatile protein multiple sequence/structure alignment module of MODELLER. The web server automatically determines the best alignment procedure based on the inputs, while allowing the user to override default parameter values. Multiple alignments are guided by a dendrogram computed from a matrix of all pairwise alignment scores. When aligning sequences to structures, SALIGN uses structural environment information to place gaps optimally. If two multiple sequence alignments of related proteins are input to the server, a profile-profile alignment is performed. All features of the server have been previously optimized for accuracy, especially in the contexts of comparative modeling and identification of interacting protein partners. AVAILABILITY: The SALIGN web server is freely accessible to the academic community at http://salilab.org/salign. SALIGN is a module of the MODELLER software, also freely available to academic users (http://salilab.org/modeller). CONTACT: sali@salilab.org; madhusudhan@bii.a-star.edu.sg.
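
The idea of a dendrogram computed from a matrix of all pairwise alignment scores can be sketched in plain Python as follows. This is a generic average-linkage agglomeration written for illustration, not SALIGN's actual procedure; the function name `guide_tree`, the nested-tuple tree representation and the toy scores are assumptions.

```python
def guide_tree(names, sim):
    """Build a guide tree by average-linkage agglomeration.

    `sim[i][j]` is a pairwise similarity score (higher = more similar)
    between names[i] and names[j]. Returns the tree as nested tuples.
    """
    # cluster id -> (subtree, indices of member sequences)
    clusters = {i: (names[i], [i]) for i in range(len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                a, b = ids[x], ids[y]
                ma, mb = clusters[a][1], clusters[b][1]
                # Average similarity over all cross-cluster pairs.
                avg = sum(sim[i][j] for i in ma for j in mb) / (len(ma) * len(mb))
                if best is None or avg > best[0]:
                    best = (avg, a, b)
        _, a, b = best
        tree = (clusters[a][0], clusters[b][0])
        members = clusters[a][1] + clusters[b][1]
        del clusters[a], clusters[b]
        clusters[next_id] = (tree, members)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Toy similarity matrix, invented for illustration only.
names = ["s1", "s2", "s3", "s4"]
sim = [
    [100, 90, 30, 20],
    [90, 100, 25, 35],
    [30, 25, 100, 80],
    [20, 35, 80, 100],
]
print(guide_tree(names, sim))
```

In a progressive alignment, a tree of this kind fixes the order in which sequences or sub-alignments are merged, with the most similar pairs aligned first.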


Subject(s)
Amino Acid Sequence; Proteins/chemistry; Sequence Alignment/methods; Software; Computational Biology/methods; Internet; User-Computer Interface
7.
Mol Cell Proteomics ; 10(6): M110.006478, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21454883

ABSTRACT

The presence of multiple membrane-bound intracellular compartments is a major feature of eukaryotic cells. Many of the proteins required for formation and maintenance of these compartments share an evolutionary history. Here, we identify the SEA (Seh1-associated) protein complex in yeast that contains the nucleoporin Seh1 and Sec13, the latter subunit of both the nuclear pore complex and the COPII coating complex. The SEA complex also contains Npr2 and Npr3 proteins (upstream regulators of TORC1 kinase) and four previously uncharacterized proteins (Sea1-Sea4). Combined computational and biochemical approaches indicate that the SEA complex proteins possess structural characteristics similar to the membrane coating complexes COPI, COPII, the nuclear pore complex, and, in particular, the related Vps class C vesicle tethering complexes HOPS and CORVET. The SEA complex dynamically associates with the vacuole in vivo. Genetic assays indicate a role for the SEA complex in intracellular trafficking, amino acid biogenesis, and response to nitrogen starvation. These data demonstrate that the SEA complex is an additional member of a family of membrane coating and vesicle tethering assemblies, extending the repertoire of protocoatomer-related complexes.


Subject(s)
Nuclear Pore Complex Proteins/metabolism; Saccharomyces cerevisiae Proteins/metabolism; Saccharomyces cerevisiae/metabolism; Vacuoles/metabolism; Autophagy; Immunoprecipitation; Intracellular Membranes/metabolism; Models, Molecular; Multiprotein Complexes/metabolism; Nuclear Pore Complex Proteins/chemistry; Phenotype; Phylogeny; Protein Structure, Secondary; Protein Structure, Tertiary; Protein Transport; Saccharomyces cerevisiae Proteins/chemistry; Structural Homology, Protein; Subcellular Fractions/metabolism
8.
Nucleic Acids Res ; 39(Database issue): D465-74, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21097780

ABSTRACT

ModBase (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by ModPipe, an automated modeling pipeline that relies primarily on Modeller for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller/). ModBase currently contains 10,355,444 reliable models for domains in 2,421,920 unique protein sequences. ModBase allows users to update comparative models on demand, and request modeling of additional sequences through an interface to the ModWeb modeling server (http://salilab.org/modweb). ModBase models are available through the ModBase interface as well as the Protein Model Portal (http://www.proteinmodelportal.org/). Recently developed associated resources include the SALIGN server for multiple sequence and structure alignment (http://salilab.org/salign), the ModEval server for predicting the accuracy of protein structure models (http://salilab.org/modeval), the PCSS server for predicting which peptides bind to a given protein (http://salilab.org/pcss) and the FoXS server for calculating and fitting Small Angle X-ray Scattering profiles (http://salilab.org/foxs).


Subject(s)
Databases, Protein; Models, Molecular; Protein Structure, Tertiary; Bacterial Proteins/chemistry; Computer Graphics; Peptides/chemistry; Protein Interaction Mapping; Proteins/chemistry; Scattering, Small Angle; Sequence Alignment; Software; Structural Homology, Protein; User-Computer Interface; X-Ray Diffraction
9.
Proteins ; 80(8): 2110-6, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22544723

ABSTRACT

The nuclear pore complex (NPC), embedded in the nuclear envelope, is a large, dynamic molecular assembly that facilitates exchange of macromolecules between the nucleus and the cytoplasm. The yeast NPC is an eightfold symmetric annular structure composed of ~456 polypeptide chains contributed by ~30 distinct proteins termed nucleoporins. Nup116, identified only in fungi, plays a central role in both protein import and mRNA export through the NPC. Nup116 is a modular protein with N-terminal "FG" repeats containing a Gle2p-binding sequence motif and a NPC targeting domain at its C-terminus. We report the crystal structure of the NPC targeting domain of Candida glabrata Nup116, consisting of residues 882-1034 [CgNup116(882-1034)], at 1.94 Å resolution. The X-ray structure of CgNup116(882-1034) is consistent with the molecular envelope determined in solution by small-angle X-ray scattering. Structural similarities of CgNup116(882-1034) with homologous domains from Saccharomyces cerevisiae Nup116, S. cerevisiae Nup145N, and human Nup98 are discussed.


Subject(s)
Fungal Proteins/chemistry; Nuclear Pore Complex Proteins/chemistry; Nuclear Pore/chemistry; Saccharomyces cerevisiae Proteins/chemistry; Sequence Homology, Amino Acid; Amino Acid Sequence; Candida glabrata/chemistry; Crystallography, X-Ray; Humans; Molecular Sequence Data; Multiprotein Complexes/chemistry; Nuclear Envelope/chemistry; Protein Structure, Tertiary; Saccharomyces cerevisiae/chemistry
10.
Biol Chem ; 393(3): 177-86, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22718633

ABSTRACT

Cathepsin E splice variant 2 appears in a number of gastric carcinomas. Here we report detecting this variant in HeLa cells using polyclonal antibodies and the biotinylated inhibitor pepstatin A. GFP fusion proteins of cathepsin E and its splice variant were overexpressed in HEK-293T cells to show their localization. Their distribution under a fluorescence microscope showed that they are colocalized. We also expressed variants 1 and 2 of cathepsin E, with and without the propeptide, in Escherichia coli. After refolding from the inclusion bodies, the enzymatic activity and circular dichroism spectra of splice variant 2 were compared to those of the wild-type mature active cathepsin E. While full-length cathepsin E variant 1 is activated at acid pH, the splice variant remains inactive. In contrast to the active cathepsin E, splice variant 2 predominantly assumes β-sheet structure, prone to oligomerization, at least under in vitro conditions, as shown by atomic force microscopy as shallow disk-like particles. A comparative structure model of splice variant 2 was computed based on its alignment to the known structure of cathepsin E intermediate (Protein Data Bank code 1TZS) and used to rationalize its conformational properties and loss of activity.


Subject(s)
Cathepsin E/chemistry; Amino Acid Sequence; Cathepsin E/genetics; Cathepsin E/metabolism; Escherichia coli/genetics; Gene Expression; HEK293 Cells; HeLa Cells; Humans; Microscopy, Atomic Force; Models, Molecular; Molecular Sequence Data; Protein Isoforms/chemistry; Protein Isoforms/genetics; Protein Isoforms/metabolism; Recombinant Proteins/chemistry; Recombinant Proteins/genetics; Recombinant Proteins/metabolism; Sequence Alignment
11.
Bioinformatics ; 26(14): 1714-22, 2010 Jul 15.
Article in English | MEDLINE | ID: mdl-20505003

ABSTRACT

MOTIVATION: Granzyme B (GrB) and caspases cleave specific protein substrates to induce apoptosis in virally infected and neoplastic cells. While substrates for both types of proteases have been determined experimentally, there are many more yet to be discovered in humans and other metazoans. Here, we present a bioinformatics method based on support vector machine (SVM) learning that identifies sequence and structural features important for protease recognition of substrate peptides and then uses these features to predict novel substrates. Our approach can act as a convenient hypothesis generator, guiding future experiments by high-confidence identification of peptide-protein partners. RESULTS: The method is benchmarked on the known substrates of both protease types, including our literature-curated GrB substrate set (GrBah). On these benchmark sets, the method outperforms a number of other methods that consider sequence only, predicting at a 0.87 true positive rate (TPR) and a 0.13 false positive rate (FPR) for caspase substrates, and a 0.79 TPR and a 0.21 FPR for GrB substrates. The method is then applied to approximately 25 000 proteins in the human proteome to generate a ranked list of predicted substrates of each protease type. Two of these predictions, AIF-1 and SMN1, were selected for further experimental analysis, and each was validated as a GrB substrate. AVAILABILITY: All predictions for both protease types are publicly available at http://salilab.org/peptide. A web server at the same site allows a user to train new SVM models to make predictions for any protein that recognizes specific oligopeptide ligands.
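
For readers unfamiliar with the two benchmark figures quoted in this abstract, the following Python sketch shows how a true positive rate and a false positive rate are computed from binary labels and predictions. It uses the standard definitions (TPR = TP / (TP + FN), FPR = FP / (FP + TN)); the function name `rates` and the toy data are illustrative assumptions and are unrelated to the authors' benchmark sets.

```python
def rates(labels, predictions):
    """Return (TPR, FPR) for boolean ground-truth labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)          # true positives
    fn = sum(1 for y, p in pairs if y and not p)      # false negatives
    fp = sum(1 for y, p in pairs if not y and p)      # false positives
    tn = sum(1 for y, p in pairs if not y and not p)  # true negatives
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy data, invented for illustration only:
# four real substrates and four non-substrates.
labels = [True, True, True, True, False, False, False, False]
predictions = [True, True, True, False, True, False, False, False]

tpr, fpr = rates(labels, predictions)
print(f"TPR={tpr:.2f} FPR={fpr:.2f}")
```

For a score-producing classifier such as an SVM, both rates depend on the decision threshold, so a reported TPR/FPR pair corresponds to one operating point on the ROC curve.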


Subject(s)
Computational Biology/methods; Peptide Hydrolases/chemistry; Sequence Analysis, Protein/methods; Caspases/chemistry; Ligands
12.
Nucleic Acids Res ; 37(Database issue): D347-54, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18948282

ABSTRACT

MODBASE (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by MODPIPE, an automated modeling pipeline that relies primarily on MODELLER for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE currently contains 5,152,695 reliable models for domains in 1,593,209 unique protein sequences; only models based on statistically significant alignments and/or models assessed to have the correct fold are included. MODBASE also allows users to calculate comparative models on demand, through an interface to the MODWEB modeling server (http://salilab.org/modweb). Other resources integrated with MODBASE include databases of multiple protein structure alignments (DBAli), structurally defined ligand binding sites (LIGBASE), predicted ligand binding sites (AnnoLyze), structurally defined binary domain interfaces (PIBASE) and annotated single nucleotide polymorphisms and somatic mutations found in human proteins (LS-SNP, LS-Mut). MODBASE models are also available through the Protein Model Portal (http://www.proteinmodelportal.org/).


Subject(s)
Databases, Protein; Models, Molecular; Protein Structure, Tertiary; Structural Homology, Protein; Genomics; Humans; Ligands; Mutation; Polymorphism, Single Nucleotide; Protein Folding; Protein Interaction Domains and Motifs; Proteins/genetics; User-Computer Interface
13.
J Struct Funct Genomics ; 10(4): 269-80, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19760129

ABSTRACT

Membrane proteins serve as cellular gatekeepers, regulators, and sensors. Prior studies have explored the functional breadth and evolution of proteins and families of particular interest, such as the diversity of transport-associated membrane protein families in prokaryotes and eukaryotes, the composition of integral membrane proteins, and family classification of all human G-protein coupled receptors. However, a comprehensive analysis of the content and evolutionary associations between membrane proteins and families in a diverse set of genomes is lacking. Here, a membrane protein annotation pipeline was developed to define the integral membrane genome and associations between 21,379 proteins from 34 genomes; most, but not all of these proteins belong to 598 defined families. The pipeline was used to provide target input for a structural genomics project that successfully cloned, expressed, and purified 61 of our first 96 selected targets in yeast. Furthermore, the methodology was applied (1) to explore the evolutionary history of the substrate-binding transmembrane domains of the human ABC transporter superfamily, (2) to identify the multidrug resistance-associated membrane proteins in whole genomes, and (3) to identify putative new membrane protein families.


Subject(s)
ATP-Binding Cassette Transporters/genetics; Drug Resistance, Multiple/genetics; Evolution, Molecular; Genome, Human/genetics; Membrane Proteins/genetics; Receptors, G-Protein-Coupled/genetics; Animals; Genomics/methods; Humans; Protein Structure, Secondary/genetics
14.
J Struct Funct Genomics ; 10(2): 107-25, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19219566

ABSTRACT

To study the substrate specificity of enzymes, we use the amidohydrolase and enolase superfamilies as model systems; members of these superfamilies share a common TIM barrel fold and catalyze a wide range of chemical reactions. Here, we describe a collaboration between the Enzyme Specificity Consortium (ENSPEC) and the New York SGX Research Center for Structural Genomics (NYSGXRC) that aims to maximize the structural coverage of the amidohydrolase and enolase superfamilies. Using sequence- and structure-based protein comparisons, we first selected 535 target proteins from a variety of genomes for high-throughput structure determination by X-ray crystallography; 63 of these targets were not previously annotated as superfamily members. To date, 20 unique amidohydrolase and 41 unique enolase structures have been determined, increasing the fraction of sequences in the two superfamilies that can be modeled based on at least 30% sequence identity from 45% to 73%. We present case studies of proteins related to uronate isomerase (an amidohydrolase superfamily member) and mandelate racemase (an enolase superfamily member), to illustrate how this structure-focused approach can be used to generate hypotheses about sequence-structure-function relationships.


Subject(s)
Amidohydrolases/chemistry; Computational Biology/methods; Genomics/methods; Phosphopyruvate Hydratase/chemistry; Binding Sites; Databases, Protein; Protein Conformation; Protein Folding; Substrate Specificity
15.
Nucleic Acids Res ; 35(Web Server issue): W393-7, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17478513

ABSTRACT

The DBAli tools use a comprehensive set of structural alignments in the DBAli database to leverage the structural information deposited in the Protein Data Bank (PDB). These tools include (i) the DBAlit program that allows users to input the 3D coordinates of a protein structure for comparison by MAMMOTH against all chains in the PDB; (ii) the AnnoLite and AnnoLyze programs that annotate a target structure based on its stored relationships to other structures; (iii) the ModClus program that clusters structures by sequence and structure similarities; (iv) the ModDom program that identifies domains as recurrent structural fragments and (v) an implementation of the COMPARER method in the SALIGN command in MODELLER that creates a multiple structure alignment for a set of related protein structures. Thus, the DBAli tools, which are freely accessible via the World Wide Web at http://salilab.org/DBAli/, allow users to mine the protein structure space by establishing relationships between protein structures and their functions.


Subject(s)
Algorithms; Computational Biology/methods; Databases, Protein; Proteins/chemistry; Proteins/metabolism; Pseudomonas aeruginosa/metabolism; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Software; Amino Acid Sequence; Data Interpretation, Statistical; Internet; Molecular Sequence Data; Protein Conformation; Proteins/classification; Sequence Homology, Amino Acid; Structure-Activity Relationship
16.
Angew Chem Int Ed Engl ; 48(16): 2978-82, 2009.
Article in English | MEDLINE | ID: mdl-19288503

ABSTRACT

The negative charge originating from deprotonation of the methyl group is distributed over the 2-picolyl ring. Bonding properties derived from the electron density distribution support the enamide character of picolyllithium (PicLi; the picture shows the deformation density of [2-PicLi·PicH]₂), but electrophilic attack occurs at the deprotonated C atom. This reactivity is rationalized by the electrostatic potential, which guides electrophiles towards the nucleophilic C atom.

17.
Nucleic Acids Res ; 34(10): 2943-52, 2006.
Article in English | MEDLINE | ID: mdl-16738133

ABSTRACT

Proteins function through interactions with other molecules. Thus, the network of physical interactions among proteins is of great interest to both experimental and computational biologists. Here we present structure-based predictions of 3387 binary and 1234 higher order protein complexes in Saccharomyces cerevisiae involving 924 and 195 proteins, respectively. To generate candidate complexes, comparative models of individual proteins were built and combined together using complexes of known structure as templates. These candidate complexes were then assessed using a statistical potential, derived from binary domain interfaces in PIBASE (http://salilab.org/pibase). The statistical potential discriminated a benchmark set of 100 interface structures from a set of sequence-randomized negative examples with a false positive rate of 3% and a true positive rate of 97%. Moreover, the predicted complexes were also filtered using functional annotation and sub-cellular localization data. The ability of the method to select the correct binding mode among alternates is demonstrated for three camelid VHH domain-porcine alpha-amylase interactions. We also highlight the prediction of co-complexed domain superfamilies that are not present in template complexes. Through integration with MODBASE, the application of the method to proteomes that are less well characterized than that of S. cerevisiae will contribute to expansion of the structural and functional coverage of protein interaction space. The predicted complexes are deposited in MODBASE (http://salilab.org/modbase).


Subject(s)
Multiprotein Complexes/chemistry; Protein Interaction Mapping/methods; Saccharomyces cerevisiae Proteins/chemistry; Algorithms; Computational Biology/methods; Models, Molecular; Multiprotein Complexes/metabolism; Protein Binding; Protein Structure, Tertiary; ROC Curve; Saccharomyces cerevisiae Proteins/metabolism; alpha-Amylases/chemistry; alpha-Amylases/metabolism
18.
Nucleic Acids Res ; 34(Database issue): D291-5, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381869

ABSTRACT

MODBASE (http://salilab.org/modbase) is a database of annotated comparative protein structure models for all available protein sequences that can be matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on MODELLER for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE is updated regularly to reflect the growth in protein sequence and structure databases, and improvements in the software for calculating the models. MODBASE currently contains 3 094 524 reliable models for domains in 1 094 750 out of 1 817 889 unique protein sequences in the UniProt database (July 5, 2005); only models based on statistically significant alignments and models assessed to have the correct fold despite insignificant alignments are included. MODBASE also allows users to generate comparative models for proteins of interest with the automated modeling server MODWEB (http://salilab.org/modweb). Our other resources integrated with MODBASE include comprehensive databases of multiple protein structure alignments (DBAli, http://salilab.org/dbali), structurally defined ligand binding sites and structurally defined binary domain interfaces (PIBASE, http://salilab.org/pibase) as well as predictions of ligand binding sites, interactions between yeast proteins, and functional consequences of human nsSNPs (LS-SNP, http://salilab.org/LS-SNP).


Subject(s)
Databases, Protein; Models, Molecular; Proteins/chemistry; Structural Homology, Protein; Binding Sites; Humans; Internet; Ligands; Polymorphism, Single Nucleotide; Protein Structure, Tertiary; Proteins/genetics; Proteins/metabolism; Software; Systems Integration; User-Computer Interface
19.
Methods Enzymol ; 606: 1-71, 2018.
Article in English | MEDLINE | ID: mdl-30097089

ABSTRACT

The radical SAM superfamily contains over 100,000 homologous enzymes that catalyze a remarkably broad range of reactions required for life, including metabolism, nucleic acid modification, and biogenesis of cofactors. While the highly conserved SAM-binding motif responsible for formation of the key 5'-deoxyadenosyl radical intermediate is a key structural feature that simplifies identification of superfamily members, our understanding of their structure-function relationships is complicated by the modular nature of their structures, which exhibit varied and complex domain architectures. To gain new insight about these relationships, we classified the entire set of sequences into similarity-based subgroups that could be visualized using sequence similarity networks. This superfamily-wide analysis reveals important features that had not previously been appreciated from studies focused on one or a few members. Functional information mapped to the networks indicates which members have been experimentally or structurally characterized, their known reaction types, and their phylogenetic distribution. Despite the biological importance of radical SAM chemistry, the vast majority of superfamily members have never been experimentally characterized in any way, suggesting that many new reactions remain to be discovered. In addition to 20 subgroups with at least one known function, we identified additional subgroups made up entirely of sequences of unknown function. Importantly, our results indicate that even general reaction types fail to track well with our sequence similarity-based subgroupings, raising major challenges for function prediction for currently identified and new members that continue to be discovered. Interactive similarity networks and other data from this analysis are available from the Structure-Function Linkage Database.
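
A sequence similarity network of the kind described in this abstract can be sketched in Python as follows: sequences are nodes, an edge joins two sequences whose pairwise score meets a threshold, and connected components serve as a crude stand-in for subgroups. Treating connected components as subgroups, the function name `similarity_components` and the toy scores are assumptions made for illustration; the abstract does not describe the authors' actual subgrouping procedure.

```python
from collections import defaultdict


def similarity_components(scores, threshold):
    """Group sequences into connected components of a similarity network.

    `scores` maps (seq_a, seq_b) pairs to a similarity score; an edge is
    drawn when the score is at least `threshold`.
    """
    graph = defaultdict(set)
    for (a, b), score in scores.items():
        graph[a]  # touch both keys so isolated sequences remain as nodes
        graph[b]
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)

    seen = set()
    components = []
    for node in sorted(graph):
        if node in seen:
            continue
        # Depth-first traversal to collect one component.
        stack, component = [node], []
        seen.add(node)
        while stack:
            current = stack.pop()
            component.append(current)
            for neighbor in graph[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components


# Toy pairwise scores, invented for illustration only.
scores = {
    ("A", "B"): 60,
    ("B", "C"): 45,
    ("A", "C"): 10,
    ("C", "D"): 5,
    ("D", "E"): 70,
}
print(similarity_components(scores, threshold=40))
print(similarity_components(scores, threshold=65))
```

Raising the threshold splits the network into smaller, tighter groups, which is why the choice of cutoff strongly affects how well any such grouping tracks function.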


Subject(s)
Enzymes/classification; Free Radicals/metabolism; Protein Domains/genetics; S-Adenosylmethionine/metabolism; Amino Acid Sequence/genetics; Computational Biology; Enzymes/chemistry; Enzymes/genetics; Enzymes/metabolism; Evolution, Molecular; Free Radicals/chemistry; Phylogeny; S-Adenosylmethionine/chemistry; Sequence Alignment; Structure-Activity Relationship
20.
BMC Bioinformatics ; 8 Suppl 4: S4, 2007 May 22.
Article in English | MEDLINE | ID: mdl-17570147

ABSTRACT

BACKGROUND: Advances in structural biology, including structural genomics, have resulted in a rapid increase in the number of experimentally determined protein structures. However, about half of the structures deposited by the structural genomics consortia have little or no information about their biological function. Therefore, there is a need for tools for automatically and comprehensively annotating the function of protein structures. We aim to provide such tools by applying comparative protein structure annotation that relies on detectable relationships between protein structures to transfer functional annotations. Here we introduce two programs, AnnoLite and AnnoLyze, which use the structural alignments deposited in the DBAli database. DESCRIPTION: AnnoLite predicts the SCOP, CATH, EC, InterPro, PfamA, and GO terms with an average sensitivity of ~90% and average precision of ~80%. AnnoLyze predicts ligand binding site and domain interaction patches with an average sensitivity of ~70% and average precision of ~30%, correctly localizing binding sites for small molecules in ~95% of its predictions. CONCLUSION: The AnnoLite and AnnoLyze programs for comparative annotation of protein structures can reliably and automatically annotate new protein structures. The programs are fully accessible via the Internet as part of the DBAli suite of tools at http://salilab.org/DBAli/.


Subject(s)
Algorithms; Databases, Protein; Proteins/chemistry; Proteins/metabolism; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Software; Amino Acid Sequence; Confidence Intervals; Data Interpretation, Statistical; Information Storage and Retrieval/methods; Molecular Sequence Data; Proteins/classification; Sensitivity and Specificity; Sequence Homology, Amino Acid; Structure-Activity Relationship