ABSTRACT
Biological networks are often used to represent complex biological systems, which can contain several types of entities. Analysis and visualization of such networks is supported by the Cytoscape software tool and its many apps. While earlier versions of stringApp focused on providing intraspecies protein-protein interactions from the STRING database, the new stringApp 2.0 greatly improves the support for heterogeneous networks. Here, we highlight new functionality that makes it possible to create networks that contain proteins and interactions from STRING as well as other biological entities and associations from other sources. We exemplify this by complementing a published SARS-CoV-2 interactome with interactions from STRING. We have also extended stringApp with new data and query functionality for protein-protein interactions between eukaryotic parasites and their hosts. We show how this can be used to retrieve and visualize a cross-species network for a malaria parasite, its host, and its vector. Finally, the latest stringApp version has an improved user interface, allows retrieval of both functional associations and physical interactions, and supports group-wise enrichment analysis of different parts of a network to aid biological interpretation. stringApp is freely available at https://apps.cytoscape.org/apps/stringapp.
Subject(s)
COVID-19, Humans, SARS-CoV-2, Software, Proteins, Eukaryota
ABSTRACT
Cellular life depends on a complex web of functional associations between biomolecules. Among these associations, protein-protein interactions are particularly important due to their versatility, specificity and adaptability. The STRING database aims to integrate all known and predicted associations between proteins, including both physical interactions and functional associations. To achieve this, STRING collects and scores evidence from a number of sources: (i) automated text mining of the scientific literature, (ii) databases of interaction experiments and annotated complexes/pathways, (iii) computational interaction predictions from co-expression and from conserved genomic context and (iv) systematic transfers of interaction evidence from one organism to another. STRING aims for wide coverage; the upcoming version 11.5 of the resource will contain more than 14 000 organisms. In this update paper, we describe changes to the text-mining system, a new scoring mode for physical interactions, as well as extensive user interface features for customizing, extending and sharing protein networks. In addition, we describe how to query STRING with genome-wide experimental data, including the automated detection of enriched functionalities and potential biases in the user's query data. The STRING resource is available online at https://string-db.org/.
Subject(s)
Protein Databases, Protein Interaction Mapping, Proteins/genetics, User-Computer Interface
ABSTRACT
Research on linear B-cell epitope prediction has attracted steadily growing interest ever since the first method was developed in 1981. B-cell epitope identification with the help of an accurate prediction method can lead to an overall faster and cheaper vaccine design process, a crucial necessity in the COVID-19 era. Consequently, several B-cell epitope prediction methods have been developed over the past few decades, but without significant success. In this study, we review the current performance and methodology of some of the most widely used linear B-cell epitope predictors that are available via a command-line interface, namely BcePred, BepiPred, ABCpred, COBEpro, SVMTriP, LBtope, and LBEEP. Additionally, we attempted to remedy the performance issues of the individual methods by developing a consensus classifier, which combines the separate predictions of these methods into a single output, with the aim of accelerating epitope-based vaccine design. Although the method comparison was performed with some necessary caveats, and individual methods might perform much better on specialized datasets, we hope that this performance update can help researchers choose a predictor for the development of biomedical applications such as designed vaccines, diagnostic kits, immunotherapeutics, immunodiagnostic tests, antibody production, and disease diagnosis and therapy.
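The abstract does not state how the consensus classifier combines the individual predictions. As a minimal sketch, assuming that each predictor emits a binary per-residue call and that the consensus is a plain majority vote (both are assumptions, not the published method), the idea could look as follows. The function name `majority_consensus`, the `min_votes` parameter, and the toy calls are hypothetical.

```python
from typing import Dict, List, Optional


def majority_consensus(
    predictions: Dict[str, List[int]],
    min_votes: Optional[int] = None,
) -> List[int]:
    """Combine per-residue binary epitope calls (1 = epitope) by voting.

    `predictions` maps a predictor name to its calls for one sequence.
    `min_votes` is a hypothetical threshold; by default a strict majority
    of the predictors must agree.
    """
    if not predictions:
        raise ValueError("at least one predictor is required")
    lengths = {len(calls) for calls in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all predictors must cover the same residues")
    if min_votes is None:
        min_votes = len(predictions) // 2 + 1  # strict majority
    # Sum the votes position by position, then apply the threshold.
    votes = [sum(column) for column in zip(*predictions.values())]
    return [1 if v >= min_votes else 0 for v in votes]


# Toy calls for a five-residue peptide (made-up values, not real tool output).
toy = {
    "BepiPred": [1, 1, 0, 0, 1],
    "ABCpred": [1, 0, 0, 1, 1],
    "SVMTriP": [0, 1, 0, 0, 1],
}
consensus = majority_consensus(toy)
```

Raising `min_votes` trades sensitivity for specificity: requiring all three toy predictors to agree keeps only the last residue.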
Subject(s)
Computational Biology/methods, Epitope Mapping/methods, B-Lymphocyte Epitopes/chemistry, Vaccines/chemistry, Computer Simulation, Drug Design, B-Lymphocyte Epitopes/metabolism, Humans, SARS-CoV-2/chemistry, SARS-CoV-2/metabolism, Vaccines/metabolism
ABSTRACT
Monoclonal antibodies (mAbs) constitute a promising class of therapeutics, accounting for ca. 25% of all biotech drugs in development. Even though their therapeutic value is now well established, human- and murine-derived mAbs do have deficiencies, such as a short in vivo lifespan and low stability. However, the most difficult obstacle to overcome in exploiting mAbs for disease treatment is preventing the formation of protein aggregates. ANTISOMA is a pipeline that reduces the aggregation tendency of mAbs by decreasing their intrinsic aggregation propensity, based on an automated amino acid substitution approach. The method takes into consideration the special features of mAbs and aims to propose specific point mutations that could lead to the redesign of these promising therapeutics without affecting their epitope-binding ability. The method is available online at http://bioinformatics.biol.uoa.gr/ANTISOMA .
Subject(s)
Monoclonal Antibodies, Computational Biology, Pathological Protein Aggregation, Animals, Monoclonal Antibodies/genetics, Monoclonal Antibodies/metabolism, Monoclonal Antibodies/therapeutic use, Computational Biology/methods, Epitopes/genetics, Humans, Mice, Pathological Protein Aggregation/drug therapy
ABSTRACT
Voltage-gated ion channels (VGICs) are one of the largest groups of transmembrane proteins. Due to their major role in the generation and propagation of electrical signals, VGICs are considered important from a medical viewpoint, and their dysfunction is often associated with channelopathies. We identified disease-associated mutations and polymorphisms in these proteins by mapping missense single-nucleotide polymorphisms (SNPs) from the UniProt and ClinVar databases onto their amino acid sequences, taking into account their special topological and functional characteristics. Statistical analysis revealed that disease-associated SNPs are mostly found in the voltage sensor domain and the pore loop. Both of these regions are extremely important for the activation and ion conductivity of VGICs. Moreover, among the most frequently observed mutations are those of arginine to glutamine, histidine or cysteine, which can probably be attributed to the extremely important role of arginine residues in the regulation of membrane potential in these proteins. We suggest that topological information in combination with genetic variation data can contribute toward a better evaluation of the effect of currently unclassified mutations in VGICs. We hope that similar approaches will reveal potential associations with certain disease phenotypes in the future.
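The mapping step described above can be illustrated with a small sketch. It assumes that topological regions are available as 1-based, inclusive residue intervals and that each variant is reduced to its residue position; the region coordinates, labels, and function names below are made up for illustration and do not come from the study.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# (start, end, label) with 1-based inclusive residue coordinates.
Region = Tuple[int, int, str]


def region_of(position: int, regions: List[Region]) -> str:
    """Return the label of the first region that contains `position`."""
    for start, end, label in regions:
        if start <= position <= end:
            return label
    return "unannotated"


def count_by_region(positions: Iterable[int], regions: List[Region]) -> Dict[str, int]:
    """Count variants per region label."""
    return dict(Counter(region_of(p, regions) for p in positions))


def density_by_region(positions: Iterable[int], regions: List[Region]) -> Dict[str, float]:
    """Variants per residue, so that long regions are not favored by size alone."""
    counts = Counter(region_of(p, regions) for p in positions)
    lengths: Counter = Counter()
    for start, end, label in regions:
        lengths[label] += end - start + 1
    return {label: counts[label] / lengths[label] for label in lengths}


# Hypothetical annotation and variant positions for a single channel.
toy_regions = [(10, 30, "voltage sensor"), (50, 60, "pore loop")]
toy_positions = [12, 25, 30, 55, 70]
counts = count_by_region(toy_positions, toy_regions)
densities = density_by_region(toy_positions, toy_regions)
```

Normalizing by region length is one simple way to compare regions of different sizes; the abstract does not say which statistical test the authors applied.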
Subject(s)
Calcium Channels/genetics, Channelopathies/genetics, Single Nucleotide Polymorphism, Voltage-Gated Potassium Channels/genetics, Voltage-Gated Sodium Channels/genetics, Amino Acid Sequence, Arginine/metabolism, Calcium Channels/classification, Calcium Channels/metabolism, Channelopathies/metabolism, Channelopathies/pathology, Cysteine/metabolism, Protein Databases, Gene Expression, Glutamine/metabolism, Histidine/metabolism, Humans, Ion Channel Gating/genetics, Molecular Models, Voltage-Gated Potassium Channels/classification, Voltage-Gated Potassium Channels/metabolism, Protein Conformation, Protein Domains, Proteomics/methods, Voltage-Gated Sodium Channels/classification, Voltage-Gated Sodium Channels/metabolism
ABSTRACT
A large number of modular domains that exhibit specific lipid-binding properties are present in many membrane proteins involved in trafficking and signal transduction. These domains are present in either eukaryotic peripheral membrane or transmembrane proteins and are responsible for the non-covalent interactions of these proteins with membrane lipids. Here we report MBPpred, a profile hidden Markov model-based method capable of detecting Membrane Binding Proteins (MBPs) from information encoded in their amino acid sequence. The method identifies MBPs that contain one or more of the Membrane Binding Domains (MBDs) that have been described to date, and further classifies these proteins based on their position with respect to the membrane, as either peripheral or transmembrane. MBPpred is available online at http://bioinformatics.biol.uoa.gr/MBPpred. The method was applied to selected eukaryotic proteomes in order to examine the characteristics that MBPs exhibit in various eukaryotic kingdoms and phyla.
Subject(s)
Carrier Proteins/analysis, Markov Chains, Membrane Lipids/metabolism, Membrane Proteins/analysis, Proteome, Algorithms
ABSTRACT
Clusterin, a multitasking glycoprotein, is highly conserved amongst mammals. In humans, Clusterin is mainly a secreted protein, described as an extracellular chaperone with the capability of interacting with a broad spectrum of molecules. In neurodegenerative diseases, such as Alzheimer's disease, it is an amyloid-associated protein, co-localized with fibrillar deposits in amyloid plaques in systemic or localized amyloidoses. An 'aggregation-prone' segment (NFHAMFQ) was located within the Clusterin α-chain sequence using AMYLPRED, a consensus method for the prediction of amyloid propensity developed in our lab. This peptide was synthesized and was found to self-assemble into amyloid-like fibrils in vitro, as revealed by electron microscopy, X-ray fiber diffraction, Attenuated Total Reflectance Fourier-Transform Spectroscopy and Congo red staining studies. All experimental results verify that this human Clusterin peptide-analogue possesses high aggregation potency. Additional computational analysis highlighted novel and, at the same time, unexplored features of human Clusterin.
Subject(s)
Amyloidosis, Clusterin/chemistry, Computational Biology, Amyloid, Animals, Humans, Protein Conformation
ABSTRACT
Ligand-Gated Ion Channels (LGICs) are one of the largest groups of transmembrane proteins. Due to their major role in synaptic transmission, both in the nervous system and the somatic neuromuscular junction, LGICs present attractive therapeutic targets. During the last few years, several computational methods for the detection of LGICs have been developed. These methods are based on machine learning approaches utilizing features extracted solely from the amino acid composition. Here we report the development of LiGIoNs, a profile Hidden Markov Model (pHMM) method for the prediction and ligand-based classification of LGICs. The method consists of a library of 10 pHMMs, one per LGIC subfamily, built from the alignment of representative LGIC sequences. In addition, 14 Pfam pHMMs are used to further annotate and classify unknown protein sequences into one of the 10 LGIC subfamilies. Evaluation of the method showed that it outperforms existing methods in the detection of LGICs. Moreover, LiGIoNs is the only currently available method that classifies LGICs into subfamilies. The method is available online at http://bioinformatics.biol.uoa.gr/ligions/.
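Classification against a library with one profile per subfamily can be sketched as a best-hit rule. The example below assumes that a query sequence has already been scored against every profile by an external pHMM search tool, and that the subfamily whose profile scores highest above a cutoff is reported. The subfamily labels, scores, cutoff value, and function name are hypothetical; the abstract does not describe the decision rule actually used by LiGIoNs.

```python
from typing import Dict, Optional


def classify_by_best_profile(
    scores: Dict[str, float],
    min_score: float = 0.0,
) -> Optional[str]:
    """Return the subfamily whose profile scores highest.

    `scores` maps a subfamily label to the score of the query against
    that subfamily's profile. `min_score` is a hypothetical cutoff below
    which a hit is ignored. Returns None when no profile passes it.
    """
    passing = {name: s for name, s in scores.items() if s >= min_score}
    if not passing:
        return None
    return max(passing, key=passing.get)


# Made-up scores for one query against three placeholder subfamilies.
toy_scores = {"subfamily_A": 12.5, "subfamily_B": 87.3, "subfamily_C": -4.0}
best = classify_by_best_profile(toy_scores, min_score=25.0)
```

Returning `None` for sequences without a passing hit keeps detection (is this an LGIC at all?) separate from classification (which subfamily?).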
Subject(s)
Ligand-Gated Ion Channels, Amino Acid Sequence, Ligands
ABSTRACT
Alzheimer disease (AD) is a neurodegenerative disorder with an as yet unclear etiology and pathogenesis. Research to unveil the disease processes underlying AD often relies on the use of neurodegenerative disease model organisms, such as Caenorhabditis elegans. This study sought to identify biological pathways implicated in AD that are conserved in Homo sapiens and C. elegans. Protein-protein interaction networks were assembled for amyloid precursor protein (APP) and Tau in H. sapiens (two proteins whose aggregation is a hallmark of AD) and for their orthologs APL-1 and PTL-1 in C. elegans. Global network alignment was used to compare these networks and determine similar, likely conserved, network regions. This comparison revealed that two prominent pathways, the APP-processing and the Tau-phosphorylation pathways, are highly conserved in both organisms. While the majority of interactions between proteins in those pathways are known to be associated with AD in humans, they remain unexamined in C. elegans, signifying the need for further investigation. In this work, we have highlighted conserved interactions related to AD in humans and have identified specific proteins that can act as targets for experimental studies in C. elegans, aiming to uncover the underlying mechanisms of AD.
Subject(s)
Alzheimer Disease/metabolism, Biomarkers, Caenorhabditis elegans/metabolism, Signal Transduction, Alzheimer Disease/etiology, Amyloid beta-Protein Precursor/metabolism, Animals, Caenorhabditis elegans Proteins/metabolism, Computational Biology/methods, Animal Disease Models, Disease Susceptibility, Humans, Microtubule-Associated Proteins/metabolism, Phosphorylation, Protein Interaction Mapping, Protein Interaction Maps, Proteome, Proteomics/methods, tau Proteins/metabolism
ABSTRACT
The majority of all proteins in cells interact with membranes either permanently or temporarily. Peripheral membrane proteins form transient complexes with membrane proteins and/or lipids via non-covalent interactions and are of utmost importance, due to the numerous cellular functions in which they participate. In an effort to collect data regarding this heterogeneous group of proteins, we designed and constructed a database called PerMemDB. PerMemDB is currently the most complete and comprehensive repository of data for eukaryotic peripheral membrane proteins that are deposited in UniProt or predicted with MBPpred, a computational method that specializes in the detection of proteins that interact non-covalently with membrane lipids via membrane binding domains. The first version of the database contains 231,770 peripheral membrane proteins from 1009 organisms. All entries have cross-references to other databases, literature references and annotation regarding their interactions with other proteins. Moreover, thanks to the application of MBPpred, additional sequence annotation is available for the characteristic domains that allow these proteins to interact with membranes. Through the web interface of PerMemDB, users can browse the contents of the database and submit advanced text searches and BLAST queries against the protein sequences deposited in PerMemDB. We expect this repository to serve as a source of information that will allow the scientific community to gain a deeper understanding of the evolution and function of peripheral membrane proteins by enhancing proteome-wide analyses. The database is available at: http://bioinformatics.biol.uoa.gr/db=permemdb.
Subject(s)
Protein Databases, Eukaryota/chemistry, Membrane Proteins, Computational Biology, Data Collection/methods, Protein Binding
ABSTRACT
Blood-cell targeting Autoimmune Diseases (BLADs) are complex diseases that affect blood cell formation or prevent blood cell production. Since these clinical conditions are attracting growing attention, experimental approaches are being used to investigate the mechanisms behind their pathogenesis and to identify proteins associated with them. However, computational approaches have not been utilized extensively in the study of BLADs. This study aims to investigate the interaction network of proteins associated with BLADs (the BLAD interactome) and to identify novel associations with other human proteins. The method followed in this study combines information regarding protein-protein interaction network properties and autoimmune disease terms. Proteins with high network scores and statistically significant autoimmune disease term enrichment were obtained, and 14 of them were designated as candidate proteins associated with BLADs. Additionally, clustering analysis of the BLAD interactome detected 17 proteins that act as "connectors" of different BLADs. We expect our findings to further extend experimental efforts to investigate the pathogenesis and the relationships of BLADs.
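Term enrichment of this kind is commonly assessed with a hypergeometric (one-sided Fisher) test. The sketch below assumes that test; the abstract does not specify which statistic the authors used. The function name and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, sample: int, hits: int) -> float:
    """Upper-tail hypergeometric p-value, P(X >= hits).

    population: number of proteins in the background set
    annotated:  background proteins carrying the disease term
    sample:     number of proteins in the set being tested
    hits:       proteins in the tested set carrying the term
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie in [0, population]")
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass from `hits` up to the largest possible overlap.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(max(hits, 0), upper + 1)
    )
    return tail / total


# Toy numbers: 10 background proteins, 3 annotated, a tested set of 3, all annotated.
p_value = hypergeom_enrichment_p(population=10, annotated=3, sample=3, hits=3)
```

When many terms are tested, the resulting p-values would normally be corrected for multiple testing before a term is called significant.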
Subject(s)
Autoimmune Diseases/immunology, Blood Cells/immunology, Hematologic Diseases/immunology, Protein Interaction Mapping/methods, Protein Interaction Maps/immunology, Autoantibodies/immunology, Autoantibodies/metabolism, Autoantigens/immunology, Autoantigens/metabolism, Autoimmune Diseases/blood, Biomarkers/analysis, Biomarkers/blood, Biomarkers/metabolism, Blood Cells/metabolism, Cluster Analysis, Computational Biology/methods, Datasets as Topic, Gene Expression Profiling, Hematologic Diseases/blood, Hematopoiesis/immunology, Humans
ABSTRACT
Amyloid fibrils are formed when soluble proteins misfold into highly ordered insoluble fibrillar aggregates and affect various organs and tissues. The deposition of amyloid fibrils is the main hallmark of a group of disorders called amyloidoses. Curiously, fibril deposition has also been recorded as a complication in a number of other pathological conditions, including well-known neurodegenerative or endocrine diseases. To date, amyloidoses are only roughly classified, owing to their tremendous heterogeneity. In this work, we introduce AmyCo, a freely available collection of amyloidoses and clinical disorders related to amyloid deposition. AmyCo classifies 75 diseases associated with amyloid deposition into two distinct categories, namely 1) amyloidosis and 2) clinical conditions associated with amyloidosis. Each database entry is annotated with the major protein component (causative protein), other components of amyloid deposits and affected tissues or organs. Database entries are also supplemented with appropriate detailed annotation and are cross-referenced to the ICD-10, MeSH, OMIM, PubMed, AmyPro and UniProtKB databases. To our knowledge, AmyCo is the first attempt to create a complete and up-to-date repository containing information about amyloidoses and diseases related to amyloid deposition. The AmyCo web interface is available at http://bioinformatics.biol.uoa.gr/amyco .
Subject(s)
Alzheimer Disease/classification, Amyloid/genetics, Amyloidosis/classification, Parkinson Disease/classification, Alzheimer Disease/diagnosis, Alzheimer Disease/genetics, Alzheimer Disease/metabolism, Amyloid/metabolism, Amyloidosis/diagnosis, Amyloidosis/genetics, Amyloidosis/metabolism, Factual Databases, Genome-Wide Association Study, Humans, Mutation, Parkinson Disease/diagnosis, Parkinson Disease/genetics, Parkinson Disease/metabolism, Terminology as Topic
ABSTRACT
Protein aggregation has been an active area of research in recent decades, since it is the most common and troubling indication of protein instability. Understanding the mechanisms governing protein aggregation and amyloidogenesis is key to understanding the aetiology and pathogenesis of many devastating disorders, including Alzheimer's disease and type 2 diabetes. Protein aggregation data are currently "scattered" across an increasing number of repositories, since advances in computational biology greatly influence this field of research. This review explores the various resources of aggregation data and attempts to distinguish and analyze the biological knowledge they contain, by introducing protein-based, fragment-based and disease-based repositories related to aggregation. To give a broad overview of the available repositories, we present a novel, comprehensive network that maps and visualizes the current associations between aggregation databases and other important databases and/or tools, and we discuss the beneficial role of community annotation. The need to unify aggregation databases in a common platform is also addressed.
Subject(s)
Alzheimer Disease/metabolism, Amyloid/metabolism, Data Mining, Factual Databases, Type 2 Diabetes Mellitus/metabolism, Protein Aggregates, Pathological Protein Aggregation/metabolism, Animals, Humans
ABSTRACT
Protein-protein interactions are the quintessence of physiological activities, but also participate in pathological conditions. Amyloid formation, an abnormal protein-protein interaction process, is a widespread phenomenon in divergent proteins and peptides, resulting in a variety of aggregation disorders. The complexity of the mechanisms underlying amyloid formation/amyloidogenicity is a matter of great scientific interest, since elucidating them will provide important insight into the principles governing protein misfolding, self-assembly and aggregation. The implication of more than one protein in the progression of different aggregation disorders, together with the reported synergistic occurrence between amyloidogenic proteins, highlights the need for a more universal approach to the study of these proteins. In an attempt to address this pivotal need, we constructed and analyzed the human amyloid interactome, a protein-protein interaction network of amyloidogenic proteins and their experimentally verified interactors. This network assembles known interconnections between well-characterized amyloidogenic proteins and proteins related to amyloid fibril formation. The subsequent extended computational analysis revealed significant topological characteristics and unraveled the functional roles of all constituent elements. This study introduces a detailed protein map of amyloidogenicity that will greatly aid the development of distinct intervention strategies that specifically target sub-networks of significant nodes, in an attempt to design possible novel therapeutics for aggregation disorders.
Subject(s)
Amyloid/metabolism, Amyloidosis/metabolism, Pathological Protein Aggregation/metabolism, Protein Interaction Maps, Amyloid/chemistry, Humans
ABSTRACT
A major part of membrane function is carried out by proteins, both integral and peripheral. Peripheral membrane proteins temporarily adhere to biological membranes, either to the lipid bilayer or to integral membrane proteins, through noncovalent interactions. The aim of this study was to construct and analyze the interaction network of the human plasma membrane peripheral proteins (hereinafter, the peripherome). For this purpose, we collected a dataset of peripheral proteins of the human plasma membrane, together with a dataset of experimentally verified interactions for these proteins. The interaction network created from these data was visualized using Cytoscape. We grouped the proteins based on their subcellular location and clustered them using the MCL algorithm in order to detect functional modules. Moreover, functional and graph-theory-based analyses were performed to assess biological features of the network. Interaction data with drug molecules show that ~10% of peripheral membrane proteins are targets for approved drugs, suggesting their potential implication in disease. In conclusion, we reveal novel features and properties of the protein-protein interaction network formed by peripheral proteins of the human plasma membrane.
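The MCL (Markov Cluster) algorithm mentioned above alternates two matrix operations on a column-stochastic version of the network: expansion (squaring the matrix, which spreads flow along paths) and inflation (raising entries to a power and renormalizing, which strengthens strong flows). The following is a minimal pure-Python sketch for small dense networks. The inflation value of 2.0, the convergence tolerance, and all function names are assumptions for illustration; the abstract does not report the settings or the implementation the authors used.

```python
from typing import List, Set

Matrix = List[List[float]]


def _normalize_columns(m: Matrix) -> Matrix:
    """Scale every column so that it sums to 1 (column-stochastic)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)] for i in range(n)]


def _multiply(a: Matrix, b: Matrix) -> Matrix:
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mcl(adjacency: Matrix, inflation: float = 2.0, max_iter: int = 100, tol: float = 1e-9) -> List[Set[int]]:
    """Cluster an undirected network given as a dense adjacency matrix."""
    n = len(adjacency)
    if n == 0:
        return []
    # Add self-loops, then turn edge weights into transition probabilities.
    m = _normalize_columns(
        [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    )
    for _ in range(max_iter):
        expanded = _multiply(m, m)  # expansion step
        inflated = _normalize_columns(  # inflation step
            [[x ** inflation for x in row] for row in expanded]
        )
        delta = max(abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Rows with a positive diagonal are attractors; their nonzero columns form a cluster.
    clusters: List[Set[int]] = []
    for i in range(n):
        if m[i][i] > tol:
            members = {j for j in range(n) if m[i][j] > tol}
            if members not in clusters:
                clusters.append(members)
    return clusters


# Toy network: two disconnected triangles (nodes 0-2 and nodes 3-5).
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(two_triangles)
```

Higher inflation values generally produce more, smaller clusters. For networks of realistic size, a sparse-matrix implementation would be needed instead of the dense lists used here.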