Results 1 - 13 of 13
1.
BMC Bioinformatics ; 24(1): 263, 2023 Jun 23.
Article in English | MEDLINE | ID: mdl-37353753

ABSTRACT

BACKGROUND: Protein-protein interactions play a crucial role in almost all cellular processes. Identifying interacting proteins reveals insight into living organisms and yields novel drug targets for disease treatment. Here, we present a publicly available, automated pipeline to predict genome-wide protein-protein interactions and produce high-quality multimeric structural models. RESULTS: Application of our method to the human and yeast genomes yields protein-protein interaction networks similar in quality to those from common experimental methods. We identified and modeled human proteins likely to interact with the papain-like protease of SARS-CoV-2's non-structural protein 3. We also produced models of SARS-CoV-2's spike protein (S) interacting with myelin-oligodendrocyte glycoprotein receptor and dipeptidyl peptidase-4. CONCLUSIONS: The presented method is capable of confidently identifying interactions while providing high-quality multimeric structural models for experimental validation. The interactome modeling pipeline is available at usegalaxy.org and usegalaxy.eu.


Subject(s)
COVID-19 , Protein Interaction Mapping , Humans , Viral RNA/metabolism , SARS-CoV-2 , Saccharomyces cerevisiae/metabolism
2.
Nucleic Acids Res ; 46(W1): W537-W544, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29790989

ABSTRACT

Galaxy (homepage: https://galaxyproject.org, main public server: https://usegalaxy.org) is a web-based scientific analysis platform used by tens of thousands of scientists across the world to analyze large biomedical datasets such as those found in genomics, proteomics, metabolomics and imaging. Started in 2005, Galaxy continues to focus on three key challenges of data-driven biomedical science: making analyses accessible to all researchers, ensuring analyses are completely reproducible, and making it simple to communicate analyses so that they can be reused and extended. During the last two years, the Galaxy team and the open-source community around Galaxy have made substantial improvements to Galaxy's core framework, user interface, tools, and training materials. Framework and user interface improvements now enable Galaxy to be used for analyzing tens of thousands of datasets, and >5500 tools are now available from the Galaxy ToolShed. The Galaxy community has led an effort to create numerous high-quality tutorials focused on common types of genomic analyses. The Galaxy developer and user communities continue to grow and be integral to Galaxy's development. The numbers of Galaxy public servers, developers contributing to the Galaxy framework and its tools, and users of the main Galaxy server have all increased substantially.


Subject(s)
Genomics/statistics & numerical data , Metabolomics/statistics & numerical data , Molecular Imaging/statistics & numerical data , Proteomics/statistics & numerical data , User-Computer Interface , Datasets as Topic , Humans , Information Dissemination , International Cooperation , Internet , Reproducibility of Results
3.
Nucleic Acids Res ; 44(W1): W3-W10, 2016 07 08.
Article in English | MEDLINE | ID: mdl-27137889

ABSTRACT

High-throughput data production technologies, particularly 'next-generation' DNA sequencing, have ushered in widespread and disruptive changes to biomedical research. Making sense of the large datasets produced by these technologies requires sophisticated statistical and computational methods, as well as substantial computational power. This has led to an acute crisis in life sciences, as researchers without informatics training attempt to perform computation-dependent analyses. Since 2005, the Galaxy project has worked to address this problem by providing a framework that makes advanced computational tools usable by non-experts. Galaxy seeks to make data-intensive research more accessible, transparent and reproducible by providing a Web-based environment in which users can perform computational analyses and have all of the details automatically tracked for later inspection, publication, or reuse. In this report we highlight recently added features enabling biomedical analyses on a large scale.


Subject(s)
Computational Biology/statistics & numerical data , Datasets as Topic/statistics & numerical data , User-Computer Interface , Biomedical Research , Computational Biology/methods , Genetic Databases , Humans , Internet , Reproducibility of Results
4.
J Chem Inf Model ; 53(3): 717-25, 2013 Mar 25.
Article in English | MEDLINE | ID: mdl-23413988

ABSTRACT

The key step of template-based protein-protein structure prediction is the recognition of complexes from experimental structure libraries that have a similar quaternary fold. Maintaining two separate libraries of monomer and dimer structures is, however, laborious, and inappropriate library construction can degrade template recognition coverage. We propose a novel strategy, SPRING, to identify complexes by mapping monomeric threading alignments to protein-protein interactions based on the original oligomer entries in the PDB, which does not rely on library construction and increases the efficiency and quality of complex template recognition. SPRING is tested on 1838 nonhomologous protein complexes, for which it recognizes correct quaternary template structures with a TM-score >0.5 in 1115 cases after excluding homologous proteins. The average TM-score of the first model is 60% and 17% higher than that by HHsearch and COTH, respectively, while the number of targets with an interface RMSD <2.5 Å by SPRING is 134% and 167% higher than with these competing methods. SPRING is also benchmarked against ZDOCK on 77 docking benchmark proteins. Although the relative performance of SPRING and ZDOCK depends on the level of homology filters, a combination of the two methods can result in a significantly higher model quality than ZDOCK at all homology thresholds. These data demonstrate a new efficient approach to quaternary structure recognition that is ready to use for genome-scale modeling of protein-protein interactions due to the high speed and accuracy.
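
The TM-score threshold of 0.5 quoted in this abstract can be made concrete with a small sketch. The code below uses the standard TM-score definition (a length-normalized sum of 1 / (1 + (d/d0)^2) with d0 = 1.24 (L - 15)^(1/3) - 1.8 Å). It assumes the residue-pair distances come from an already fixed superposition, whereas the full TM-score maximizes over superpositions. The function names and the 0.5 Å floor on d0 are assumptions of this sketch and are not taken from SPRING.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in Å) of the standard TM-score."""
    # The cube-root formula is only meaningful for chains longer than 15 residues.
    # The 0.5 Å floor is an assumption of this sketch for very short chains.
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length: int) -> float:
    """TM-score for ONE given superposition.

    distances: distances (Å) between aligned residue pairs.
    target_length: number of residues in the target used for normalization.
    Unaligned target residues contribute zero, so partial alignments score lower.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


if __name__ == "__main__":
    # Hypothetical example: 80 of 100 residues aligned at 2 Å each.
    print(round(tm_score([2.0] * 80, 100), 3))
```

Because every aligned pair contributes at most 1 and the sum is divided by the target length, the score stays in (0, 1] regardless of protein size, which is why a single cutoff such as 0.5 can be applied across all 1838 complexes.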


Subject(s)
Protein Conformation , Proteins/chemistry , Algorithms , Benchmarking , Protein Databases , Dimerization , Molecular Models , Sequence Alignment , Software
5.
Nucleic Acids Res ; 38(Web Server issue): W46-52, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20460464

ABSTRACT

A web service for analysis of protein structures that are sequentially or non-sequentially similar was generated. Recently, the non-sequential structure alignment algorithm GANGSTA+ was introduced. GANGSTA+ can detect non-sequential structural analogs for proteins stated to possess novel folds. Since GANGSTA+ ignores the polypeptide chain connectivity of secondary structure elements (i.e. alpha-helices and beta-strands), it is able to detect structural similarities also between proteins whose sequences were reshuffled during evolution. GANGSTA+ was applied in an all-against-all comparison on the ASTRAL40 database (SCOP version 1.75), which consists of >10,000 protein domains, yielding about 55 × 10^6 possible protein structure alignments. Here, we provide the resulting protein structure alignments as a public web-based service, named GANGSTA+ Internet Services (GIS). We also allow users to browse the ASTRAL40 database of protein structures with GANGSTA+ relative to an externally given protein structure using different constraints to select specific results. GIS allows us to analyze protein structure families according to the SCOP classification scheme. Additionally, users can upload their own protein structures for pairwise protein structure comparison, alignment against all protein structures of the ASTRAL40 database (SCOP version 1.75) or symmetry analysis. GIS is publicly available at http://agknapp.chemie.fu-berlin.de/gplus.


Subject(s)
Software , Protein Structural Homology , Algorithms , Computer Graphics , Protein Databases , Internet , Tertiary Protein Structure
6.
J Mol Biol ; 433(10): 166944, 2021 05 14.
Article in English | MEDLINE | ID: mdl-33741411

ABSTRACT

Genome-wide protein-protein interaction (PPI) determination remains a significant unsolved problem in structural biology. The difficulty is twofold since high-throughput experiments (HTEs) often have a relatively high false-positive rate in assigning PPIs, and PPI quaternary structures are more difficult to solve than tertiary structures using traditional structural biology techniques. We proposed a uniform pipeline, Threpp, to address both problems. Starting from a pair of monomer sequences, Threpp first threads both sequences through a complex structure library, where the alignment score is combined with HTE data using a naïve Bayesian classifier model to predict the likelihood of two chains to interact with each other. Next, quaternary complex structures of the identified PPIs are constructed by reassembling monomeric alignments with dimeric threading frameworks through interface-specific structural alignments. The pipeline was applied to the Escherichia coli genome and created 35,125 confident PPIs, which is 4.5-fold higher than HTE alone. Graphic analyses of the PPI networks show a scale-free cluster size distribution, consistent with previous studies, which was found critical to the robustness of genome evolution and the centrality of functionally important proteins that are essential to E. coli survival. Furthermore, complex structure models were constructed for all predicted E. coli PPIs based on the quaternary threading alignments, where 6771 of them were found to have a high confidence score that corresponds to the correct fold of the complexes with a TM-score >0.5, and 39 showed a close consistency with the later released experimental structures with an average TM-score = 0.73. These results demonstrated the significant usefulness of threading-based homologous modeling in both genome-wide PPI network detection and complex structural construction.
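
How a naïve Bayesian classifier can combine a threading alignment score with high-throughput evidence may be easier to see in a minimal sketch. It assumes each evidence source has already been converted to a likelihood ratio, P(evidence | interacting) / P(evidence | non-interacting), and that the sources are conditionally independent, which is the "naïve" assumption. The function name, the prior, and the ratio values are hypothetical. The abstract does not specify Threpp's actual features or score binning.

```python
def interaction_probability(prior: float, likelihood_ratios) -> float:
    """Posterior probability that two chains interact under a naive Bayes model.

    prior: prior probability of interaction for a random protein pair (0 < prior < 1).
    likelihood_ratios: one ratio per independent evidence source
        (for example a threading-score bin, or a high-throughput experiment hit).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    # Naive Bayes in odds form: posterior odds = prior odds * product of ratios.
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical numbers: a rare prior, strong threading evidence (x20)
    # and a supporting high-throughput observation (x5).
    print(round(interaction_probability(0.01, [20.0, 5.0]), 3))
```

Working in odds form makes the effect of each evidence source explicit: a pair supported only by a noisy experiment gains little, while agreement between the threading score and the experiment multiplies the odds.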


Subject(s)
Escherichia coli Proteins/genetics , Escherichia coli/genetics , HSP70 Heat-Shock Proteins/genetics , Phosphotransferases/genetics , Proteome/genetics , Transcription Factors/genetics , Bayes Theorem , Cluster Analysis , Escherichia coli/metabolism , Escherichia coli Proteins/chemistry , Escherichia coli Proteins/metabolism , Bacterial Gene Expression Regulation , Bacterial Genome , HSP70 Heat-Shock Proteins/chemistry , HSP70 Heat-Shock Proteins/metabolism , Phosphotransferases/chemistry , Phosphotransferases/metabolism , Protein Folding , Protein Interaction Mapping , Protein Interaction Maps/genetics , Quaternary Protein Structure , Proteome/chemistry , Proteome/metabolism , Signal Transduction , Transcription Factors/chemistry , Transcription Factors/metabolism
7.
Proteins ; 78(7): 1618-30, 2010 May 15.
Article in English | MEDLINE | ID: mdl-20112421

ABSTRACT

Finding and identifying circularly permuted protein pairs (CPP) is one of the harder tasks for structure alignment programs, because of the different location of the break in the polypeptide chain connectivity. The protein structure alignment tool GANGSTA+ was used to search for CPPs in a database of nearly 10,000 protein structures. It also allows determination of the statistical significance of the occurrence of circular permutations in the protein universe. The number of detected CPPs was found to be higher than expected, raising questions about the evolutionary processes leading to CPPs. The GANGSTA+ protein structure alignment tool is available online via the web server at http://gangsta.chemie.fu-berlin.de. On the same webpage the complete database of similar protein structure pairs based on the ASTRAL40 set of protein domains is provided and one can select CPPs specifically.


Subject(s)
Chemical Models , Protein Folding , Proteins/chemistry , Molecular Models , Protein Conformation , Sequence Alignment , Software
8.
Genome Inform ; 22: 21-9, 2010 Jan.
Article in English | MEDLINE | ID: mdl-20238416

ABSTRACT

Due to the large number of available protein structure alignment algorithms, a lot of effort has been made to define robust measures to evaluate their performances and the quality of generated alignments. Most quality measures involve the number of aligned residues and the RMSD. In this work, we analyze how these two properties are influenced by different residue assignment strategies as employed in common non-sequential structure alignment algorithms. To this end, we implemented different residue assignment strategies into our non-sequential structure alignment algorithm GANGSTA+. We compared the resulting numbers of aligned residues and RMSDs for each residue assignment strategy and different alignment algorithms on a benchmark set of circular-permuted protein pairs. Unfortunately, differences in the residue assignment strategies are often ignored when comparing the performances of different algorithms. However, our results clearly show that this may strongly bias the observations. Bringing residue assignment strategies in line can explain observed performance differences between entirely different alignment algorithms. Our results suggest that performance comparison of non-sequential protein structure alignment algorithms should be based on the same residue assignment strategy.
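
The dependence of both quality measures on the residue assignment strategy can be illustrated with a toy sketch. It assumes a purely hypothetical strategy in which a residue pair counts as aligned only if its distance after superposition is within a cutoff. This is not necessarily one of the strategies implemented in GANGSTA+, and the function name and distances are invented for illustration.

```python
import math


def alignment_quality(pair_distances, cutoff: float):
    """Return (number of aligned residues, RMSD in Å) for one assignment rule.

    pair_distances: distances (Å) between candidate residue pairs after
        the two structures have been superimposed.
    cutoff: hypothetical assignment rule, keep a pair only if distance <= cutoff.
    """
    kept = [d for d in pair_distances if d <= cutoff]
    if not kept:
        return 0, 0.0
    rmsd = math.sqrt(sum(d * d for d in kept) / len(kept))
    return len(kept), rmsd


if __name__ == "__main__":
    distances = [1.0, 1.0, 3.0, 5.0]  # made-up values for the same superposition
    print(alignment_quality(distances, 2.0))  # strict rule: few residues, low RMSD
    print(alignment_quality(distances, 6.0))  # permissive rule: more residues, higher RMSD
```

The same superposition yields two residues at 1 Å RMSD under the strict rule and four residues at 3 Å under the permissive one, so two programs that differ only in this rule would appear to perform differently.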


Subject(s)
Algorithms , Computational Biology , Proteins/chemistry , Sequence Alignment/methods , Factual Databases , Humans , Software
9.
Nucleic Acids Res ; 36(Web Server issue): W47-54, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18492720

ABSTRACT

The Superimposé webserver performs structural similarity searches with a preference towards 3D structure-based methods. Similarities can be detected between small molecules (e.g. drugs), parts of large structures (e.g. binding sites of proteins) and entire proteins. For this purpose, a number of algorithms were implemented and various databases are provided. Superimposé assists the user regarding the selection of a suitable combination of algorithm and database. After the computation on our server infrastructure, a visual assessment of the results is provided. The structure-based in silico screening for similar drug-like compounds enables the detection of scaffold-hoppers with putatively similar effects. The possibility to find similar binding sites can be of special interest in the functional analysis of proteins. The search for structurally similar proteins allows the detection of similar folds with different backbone topology. The Superimposé server is available at: http://bioinformatics.charite.de/superimpose.


Subject(s)
Molecular Conformation , Software , Protein Structural Homology , Algorithms , Binding Sites , Factual Databases , Internet , Molecular Models , Pharmaceutical Preparations/chemistry , Proteins/chemistry
10.
J Chem Inf Model ; 49(9): 2147-51, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19728738

ABSTRACT

Insights in structural biology can be gained by analyzing protein architectures and characterizing their structural similarities. Current computational approaches enable a comparison of a variety of structural and physicochemical properties in protein space. Here we describe the automated detection of rotational symmetries within a representative set of nearly 10,000 nonhomologous protein structures. To find structural symmetries in proteins, equivalent pairs of secondary structure elements (SSE), i.e., alpha-helices and beta-strands, are initially assigned. Thereby, we also allow SSE pairs to be assigned in reverse sequential order. The results highlight that the generation of symmetric, i.e., repetitive, protein structures is one of nature's major strategies to explore the universe of possible protein folds. This way structurally separated 'islands' of protein folds with a significant amount of symmetry were identified. The complete results of the present study are available at http://agknapp.chemie.fu-berlin.de/gplus, where symmetry analysis of new protein structures can also be performed.


Subject(s)
Protein Folding , Proteins/chemistry , Protein Databases , Molecular Models , Protein Conformation , Proteins/metabolism , Rotation
11.
Genome Inform ; 20: 260-9, 2008.
Article in English | MEDLINE | ID: mdl-19425140

ABSTRACT

Protein-protein docking is a major task in structural biology. In general, the geometries of protein pairs are sampled by generating docked conformations, analyzing them with scoring functions and selecting appropriate geometries for further refinement. Here, we present an algorithm in real space to sample geometries of protein pairs. To this end, we first determine uniformly distributed points on the surfaces of the two protein structures to be docked and additionally define a set of uniformly distributed rotations. Then, the sampling method generates structures of protein pairs as follows: (i) We rotate one protein of the protein pair according to a selected rotation and (ii) translate it along a line connecting two surface points belonging to different proteins such that these surface points coincide. The resulting protein pair geometries are then analyzed and selected using a scoring function that considers residues and atom pairs. We applied this approach to a set of 22 enzyme-inhibitor complexes and demonstrate that a discretisation of the rigid-body search in real space provides an efficient and robust sampling scheme. Our method generates decoy sets with a considerable fraction of near-native geometries for all considered enzyme-inhibitor complexes.
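
Steps (i) and (ii) of the sampling scheme can be sketched for a single pose as follows. The sketch assumes coordinates are plain (x, y, z) tuples, builds the rotation with Rodrigues' formula, and rotates about the origin. All function and parameter names are hypothetical. The surface-point generation, the uniform rotation set, and the scoring function described in the abstract are not reproduced here.

```python
import math


def rotation_matrix(axis, angle: float):
    """3x3 rotation matrix about `axis` by `angle` radians (Rodrigues' formula)."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]


def rotate(matrix, point):
    """Apply a 3x3 matrix to a single (x, y, z) point."""
    return tuple(sum(matrix[i][j] * point[j] for j in range(3)) for i in range(3))


def place_ligand(ligand_atoms, ligand_surface_point, receptor_surface_point, matrix):
    """Generate one candidate pose of the mobile protein.

    (i) rotate every ligand atom and the chosen ligand surface point;
    (ii) translate along the line joining the two surface points until
         the rotated ligand surface point coincides with the receptor one.
    """
    rotated_atoms = [rotate(matrix, atom) for atom in ligand_atoms]
    rotated_anchor = rotate(matrix, ligand_surface_point)
    shift = tuple(r - a for r, a in zip(receptor_surface_point, rotated_anchor))
    return [tuple(c + s for c, s in zip(atom, shift)) for atom in rotated_atoms]


if __name__ == "__main__":
    quarter_turn = rotation_matrix((0.0, 0.0, 1.0), math.pi / 2)
    pose = place_ligand([(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
                        (1.0, 0.0, 0.0), (5.0, 5.0, 5.0), quarter_turn)
    print(pose)
```

A full sampler along these lines would loop over every rotation in the set and every pair of surface points, then pass each resulting pose to a scoring function. Because both sets are discrete, the number of poses is the product of their sizes, which is what makes the search a discretisation of rigid-body space.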


Subject(s)
Proteins/chemistry , Proteins/metabolism , Algorithms , Conserved Sequence , Protein Databases , Homeostasis , Ligands , Molecular Models , Protein Binding , Protein Conformation , Cell Surface Receptors/chemistry , Cell Surface Receptors/metabolism , Serine Endopeptidases/chemistry , Serine Endopeptidases/metabolism , Serine Proteinase Inhibitors/chemistry , Serine Proteinase Inhibitors/pharmacology , Surface Properties , Thermodynamics
12.
Sci Rep ; 6: 24507, 2016 Apr 15.
Article in English | MEDLINE | ID: mdl-27079421

ABSTRACT

The laboratory mouse is the primary mammalian species used for studying alternative splicing events. Recent studies have generated computational models to predict functions for splice isoforms in the mouse. However, the functional relationship network, describing the probability of splice isoforms participating in the same biological process or pathway, has not yet been studied in the mouse. Here we describe a rich genome-wide resource of mouse networks at the isoform level, which was generated using a unique framework that was originally developed to infer isoform functions. This network was built through integrating heterogeneous genomic and protein data, including RNA-seq, exon array, protein docking and pseudo-amino acid composition. Through simulation and cross-validation studies, we demonstrated the accuracy of the algorithm in predicting isoform-level functional relationships. We showed that this network enables users to reveal functional differences between the isoforms of the same gene, as illustrated by literature evidence with Anxa6 (annexin a6) as an example. We expect this work will become a useful resource for the mouse genetics community to understand gene functions. The network is publicly available at: http://guanlab.ccmb.med.umich.edu/isoformnetwork.


Subject(s)
Alternative Splicing , Gene Regulatory Networks , RNA Isoforms , Algorithms , Animals , Computational Biology/methods , Computer Simulation , Genomics/methods , Machine Learning , Mice , Protein Interaction Mapping , Reproducibility of Results
13.
Protein Sci ; 17(8): 1374-82, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18583523

ABSTRACT

Newly determined protein structures are classified as belonging to a new fold if the structures are sufficiently dissimilar from all previously known protein structures. To analyze structural similarities of proteins, structure alignment tools are used. We demonstrate that the usage of nonsequential structure alignment tools, which neglect the polypeptide chain connectivity, can yield structure alignments with significant similarities between proteins of known three-dimensional structure and newly determined protein structures that possess a new fold. The recently introduced protein structure alignment tool, GANGSTA, is specialized to perform nonsequential alignments with proper assignment of the secondary structure types by focusing on helices and strands only. In the new version, GANGSTA+, the underlying algorithms were completely redesigned, yielding enhanced quality of structure alignments, offering alignment against a larger database of protein structures, and being more efficient. We applied DaliLite, TM-align, and GANGSTA+ on three protein crystal structures considered to be novel folds. Applying GANGSTA+ to these novel folds, we find proteins in the ASTRAL40 database that possess significant structural similarities, although the alignments are nonsequential and in some cases involve secondary structure elements aligned in reverse orientation. A web server is available at http://agknapp.chemie.fu-berlin.de/gplus for pairwise alignment, visualization, and database comparison.


Subject(s)
Algorithms , Protein Sequence Analysis/methods , Computational Biology/methods , Protein Databases , Molecular Models , Protein Folding , Secondary Protein Structure , Protein Structural Homology