Results 1 - 20 of 39
1.
Int J Mol Sci ; 25(8)2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38673810

ABSTRACT

Cardiovascular diseases (CVDs) represent a major concern for global health, and their mechanistic understanding is complicated by a complex interplay between genetic predisposition and environmental factors. Specifically, heart failure (HF), encompassing dilated cardiomyopathy (DC), ischemic cardiomyopathy (ICM), and hypertrophic cardiomyopathy (HCM), is a topic of substantial interest in basic and clinical research. Here, we used a Partial Correlation Coefficient-based algorithm (PCC) within a meta-analysis framework to construct a Gene Regulatory Network (GRN) that identifies key regulators whose activity is perturbed in heart failure. By integrating data from multiple independent studies, our approach unveiled crucial regulatory associations between transcription factors (TFs) and structural genes, emphasizing their pivotal roles in regulating metabolic pathways, such as fatty acid metabolism, oxidative stress response, epithelial-to-mesenchymal transition, and coagulation. In addition to known associations, our analysis also identified novel regulators, including the TFs FPM315 and OVOL2, which are implicated in dilated cardiomyopathies, and TEAD1 and TEAD2, which are implicated in both dilated and ischemic cardiomyopathies. Moreover, we uncovered alterations in adipogenesis and oxidative phosphorylation pathways in hypertrophic cardiomyopathy and discovered a role for IL2 STAT5 signaling in heart failure. Our findings underscore the importance of TF activity in the initiation and progression of cardiac disease, highlighting their potential as pharmacological targets.
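
The abstract above relies on partial correlation coefficients but does not spell out the computation. As a minimal sketch, assuming the simplest first-order case (one conditioning variable) and plain Pearson correlation, the quantity could be computed as below. The function names and the toy interpretation (a TF `x`, a target gene `y`, and a shared driver `z`) are illustrative assumptions, not the authors' pipeline.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

When `x` and `y` correlate only because both follow `z`, the raw Pearson coefficient can be high while the partial correlation is close to zero, which is the reason such a measure is attractive for separating direct from indirect regulatory associations.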


Subject(s)
Cardiovascular Diseases , Gene Regulatory Networks , Transcription Factors , Humans , Cardiovascular Diseases/genetics , Cardiovascular Diseases/metabolism , Transcription Factors/metabolism , Transcription Factors/genetics , Gene Expression Regulation , Algorithms , Heart Failure/genetics , Heart Failure/metabolism
2.
Noncoding RNA Res ; 7(2): 98-105, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35387279

ABSTRACT

Recent research provides insight into the ability of miRNAs to regulate various pathways in several cancer types. Despite their involvement in the regulation of mRNA via targeting of the 3'UTR, relatively few studies have examined the changes in these regulatory mechanisms that are specific to single cancer types or shared between different cancer types. We analyzed samples where both miRNA and mRNA expression had been measured and performed a thorough correlation analysis on 7494 experimentally validated human miRNA-mRNA target-gene pairs in both healthy and tumoral samples. We show that more than 90% of these miRNA-mRNA interactions show a loss of regulation in the tumoral samples compared with their healthy counterparts. As expected, we found shared miRNA-mRNA dysregulated pairs among different tumors of the same tissue. However, anatomically different cancers also share multiple dysregulated interactions, suggesting that some cancer-related mechanisms are not tumor-specific. In total, 2865 unique miRNA-mRNA pairs were identified across 13 cancer types, and ≈ 40% of these pairs showed a loss of correlation in the tumoral samples in at least 2 out of the 13 analyzed cancers. Specifically, the miR-200 family, miR-155 and miR-1 were identified, based on the computational analysis described below, as the miRNAs that potentially lose the highest number of interactions across different samples (only literature-based interactions were used for this analysis). Moreover, the miR-34a/ALDH2 and miR-9/MTHFD2 pairs show a switch in their correlation between healthy and tumor kidney samples, suggesting a possible change in the regulation exerted by the miRNAs. Interestingly, the expression of these mRNAs is also associated with overall survival. The disruption of miRNA regulation on its target, therefore, suggests the possible involvement of these pairs in malignant cell functions.
The analysis reported here shows that the regulation of miRNA-mRNA interactions strongly differs between healthy and tumoral cells, based on the strong variation in correlation between each miRNA and its target that we obtained by analyzing the expression data of healthy and tumor tissue for highly reliable miRNA-target pairs. Finally, a GO term enrichment analysis shows that the critical pairs identified are involved in cellular adhesion, proliferation, and migration.
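
The comparison described above (correlation of a miRNA-mRNA pair in healthy versus tumoral samples) can be illustrated with a small sketch. The abstract does not state which correlation measure or cutoff was used, so the Pearson coefficient, the `threshold` of 0.5, and the category labels below are all assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def classify_pair(healthy_mirna, healthy_mrna, tumor_mirna, tumor_mrna,
                  threshold=0.5):
    """Label how a pair's correlation changes from healthy to tumor samples.

    threshold is a hypothetical cutoff on |r|, not a value from the paper.
    """
    r_healthy = pearson(healthy_mirna, healthy_mrna)
    r_tumor = pearson(tumor_mirna, tumor_mrna)
    if abs(r_healthy) < threshold:
        return "uncorrelated in healthy"
    if abs(r_tumor) < threshold:
        return "lost"
    if r_healthy * r_tumor < 0:
        return "switched"
    return "retained"
```

Under these assumptions, a pair that is strongly anti-correlated in healthy tissue and weakly correlated in tumor tissue is labeled `"lost"`, while a sign change with both correlations strong is labeled `"switched"`, mirroring the two behaviors the abstract reports.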

3.
Nucleic Acids Res ; 49(W1): W67-W71, 2021 07 02.
Article in English | MEDLINE | ID: mdl-34038531

ABSTRACT

The interaction between RNA and RNA-binding proteins (RBPs) has a key role in the regulation of gene expression, in RNA stability, and in many other biological processes. RBPs accomplish these functions by binding target RNA molecules through specific sequence and structure motifs. The identification of these binding motifs is therefore fundamental to improve our knowledge of the cellular processes and how they are regulated. Here, we present BRIO (BEAM RNA Interaction mOtifs), a new web server designed for the identification of sequence and structure RNA-binding motifs in one or more RNA molecules of interest. BRIO enables the user to scan over 2508 sequence motifs and 2296 secondary structure motifs identified in Homo sapiens and Mus musculus, in three different types of experiments (PAR-CLIP, eCLIP, HITS). The motifs are associated with the binding of 186 RBPs and 69 protein domains. The web server is freely available at http://brio.bio.uniroma2.it.


Subject(s)
RNA-Binding Proteins/metabolism , RNA/chemistry , Software , Animals , Base Sequence , Cell Line , Humans , Internet , Mice , Nucleotide Motifs , RNA/metabolism , RNA, Small Nuclear/metabolism , RNA, Viral/metabolism , Sequence Analysis, RNA
4.
Methods Mol Biol ; 2284: 43-50, 2021.
Article in English | MEDLINE | ID: mdl-33835436

ABSTRACT

RNA primary and secondary motif discovery is an important step in the annotation and characterization of unknown interaction dynamics between RNAs and RNA-Binding Proteins, and several methods have been developed to meet the need for fast and efficient discovery of interaction motifs. Recent advances have increased the amount of data produced by experimental assays, and no available method is suitable for the analysis of all types of results. Here we present a simple workflow to help choose the most appropriate method, depending on the starting situation, among the three algorithms that best cover the landscape of approaches. A detailed analysis is presented to highlight the need for different algorithms in different working settings. In conclusion, the proposed workflow depends on the nature of the starting data and on the availability of RNA annotations.


Subject(s)
Nucleic Acid Conformation , RNA/chemistry , Sequence Analysis, RNA/methods , Algorithms , Animals , Binding Sites/genetics , Computational Biology/methods , Datasets as Topic , High-Throughput Nucleotide Sequencing/methods , Humans , Protein Binding/genetics , RNA/genetics , RNA/metabolism , RNA-Binding Proteins/metabolism , Software
5.
NAR Genom Bioinform ; 3(1): lqab007, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33615214

ABSTRACT

Structural characterization of RNAs is a dynamic field, offering many modelling possibilities. RNA secondary structure models are usually characterized by an encoding that depicts structural information of the molecule through string representations or graphs. In this work, we provide a generalization of the BEAR encoding (a context-aware structural encoding we previously developed) by expanding the set of alignments used for the construction of substitution matrices and then applying it to secondary structure encodings ranging from fine-grained to more coarse-grained representations. We also introduce a re-interpretation of the Shannon Information applied on RNA alignments, proposing a new scoring metric, the Relative Information Gain (RIG). The RIG score is available for any position in an alignment, showing how different levels of detail encoded in the RNA representation can contribute differently to convey structural information. The approaches presented in this study can be used alongside state-of-the-art tools to synergistically gain insights into the structural elements that RNAs and RNA families are composed of. This additional information could potentially contribute to their improvement or increase the degree of confidence in the secondary structure of families and any set of aligned RNAs.
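
The abstract does not give the formula for the Relative Information Gain, so the sketch below shows only the classical Shannon information content of an alignment column, which is the quantity the RIG is described as re-interpreting. The three-symbol dot-bracket alphabet and the function names are assumptions chosen for illustration, and gaps are not treated specially.

```python
import math
from collections import Counter


def column_information(column, alphabet_size):
    """Shannon information content (bits) of one alignment column.

    Computed as log2(alphabet_size) minus the column's entropy.
    """
    counts = Counter(column)
    n = len(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return math.log2(alphabet_size) - entropy


def alignment_information(aligned, alphabet_size):
    """Per-position information for a list of equal-length aligned strings."""
    return [column_information(col, alphabet_size) for col in zip(*aligned)]
```

A fully conserved column reaches the maximum of `log2(alphabet_size)` bits, while a column in which every symbol is equally frequent carries zero bits. Re-encoding the same alignment with a finer or coarser structural alphabet changes both the alphabet size and the column entropies, which is the kind of comparison the abstract refers to.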

6.
PLoS Comput Biol ; 15(12): e1007219, 2019 12.
Article in English | MEDLINE | ID: mdl-31846452

ABSTRACT

The most frequently used approach for protein structure prediction is currently homology modeling. The 3D model building phase of this methodology is critical for obtaining an accurate and biologically useful prediction. The most widely employed tool to perform this task is MODELLER. This program implements the "modeling by satisfaction of spatial restraints" strategy and its core algorithm has not been altered significantly since the early 1990s. In this work, we have explored the idea of modifying MODELLER with two effective, yet computationally light strategies to improve its 3D modeling performance. Firstly, we have investigated how the level of accuracy in the estimation of structural variability between a target protein and its templates in the form of σ values profoundly influences 3D modeling. We show that the σ values produced by MODELLER are on average weakly correlated to the true level of structural divergence between target-template pairs and that increasing this correlation greatly improves the program's predictions, especially in multiple-template modeling. Secondly, we have inquired into how the incorporation of statistical potential terms (such as the DOPE potential) in MODELLER's objective function positively impacts 3D modeling quality by providing a small but consistent improvement in metrics such as GDT-HA and lDDT and a large increase in stereochemical quality. Python modules to harness this second strategy are freely available at https://github.com/pymodproject/altmod. In summary, we show that there is considerable room for improving MODELLER in terms of 3D modeling quality and we propose strategies that could be pursued in order to further increase its performance.


Subject(s)
Models, Molecular , Software , Structural Homology, Protein , Algorithms , Computational Biology , Molecular Dynamics Simulation/statistics & numerical data , Proteins/chemistry , Sequence Alignment/statistics & numerical data
7.
Sci Rep ; 9(1): 15222, 2019 10 23.
Article in English | MEDLINE | ID: mdl-31645597

ABSTRACT

Recent advances in pharmacogenomics have generated a wealth of data of different types whose analysis has helped in the identification of signatures of different cellular sensitivity/resistance responses to hundreds of chemical compounds. Among the different data types, gene expression has proven to be the most successful for the inference of drug response in cancer cell lines. Although effective, the whole transcriptome can introduce noise in the predictive models, since specific mechanisms are required for different drugs and these realistically involve only part of the proteins encoded in the genome. We analyzed the pharmacogenomics data of 961 cell lines tested with 265 anti-cancer drugs and developed different machine learning approaches for dissecting the genome systematically and predicting drug responses using both drug-unspecific and drug-specific genes. These methodologies reach better response predictions for the vast majority of the screened drugs using tens to a few hundred genes specific to each drug instead of the whole genome, thus allowing a better understanding and interpretation of drug-specific response mechanisms, which are not necessarily restricted to the drug's known targets.


Subject(s)
Antineoplastic Agents/pharmacology , Gene Expression Regulation, Neoplastic/drug effects , Neoplasms/drug therapy , Neoplasms/genetics , Antineoplastic Agents/therapeutic use , Cell Line, Tumor , Dose-Response Relationship, Drug , Genome, Human/drug effects , Humans , Machine Learning , Models, Biological , Pharmacogenetics , Transcriptome/drug effects
8.
Nucleic Acids Res ; 47(10): 4958-4969, 2019 06 04.
Article in English | MEDLINE | ID: mdl-31162604

ABSTRACT

RNA molecules are able to bind proteins, DNA and other small or long RNAs using information at primary, secondary or tertiary structure level. Recent techniques that use cross-linking and immunoprecipitation of RNAs can detect these interactions and, if followed by high-throughput sequencing, molecules can be analysed to find recurrent elements shared by interactors, such as sequence and/or structure motifs. Many tools are able to find sequence motifs from lists of target RNAs, while others focus on structure using different approaches to find specific interaction elements. In this work, we make a systematic analysis of RBP-RNA and RNA-RNA datasets to better characterize the interaction landscape with information about multi-motifs on the same RNAs. To achieve this goal, we updated our BEAM algorithm to combine both sequence and structure information to create pairs of patterns that model motifs of interaction. This algorithm was applied to several RNA binding proteins and ncRNAs interactors, confirming already known motifs and discovering new ones. This landscape analysis on interaction variability reflects the diversity of target recognition and underlines that often both primary and secondary structure are involved in molecular recognition.


Subject(s)
Nucleotide Motifs , RNA-Binding Proteins/chemistry , RNA/chemistry , Sequence Analysis, RNA/methods , Algorithms , Animals , Base Sequence , Binding Sites , Cell Line , HEK293 Cells , Hep G2 Cells , Humans , K562 Cells , Mice , MicroRNAs/chemistry , MicroRNAs/genetics , MicroRNAs/metabolism , Protein Binding , RNA/genetics , RNA/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism
9.
Bioinformatics ; 35(3): 372-379, 2019 02 01.
Article in English | MEDLINE | ID: mdl-30016513

ABSTRACT

Motivation: Signaling and metabolic pathways are finely regulated by a network of protein phosphorylation events. Unraveling the nature of this intricate network, composed of kinases, target proteins and their interactions, is therefore of crucial importance. Although thousands of kinase-specific phosphorylations (KsP) have been annotated in model organisms their kinase-target network is far from being complete, with less studied organisms lagging behind. Results: In this work, we achieved an automated and accurate identification of kinase domains, inferring the residues that most likely contribute to peptide specificity. We integrated this information with the target peptides of known human KsP to predict kinase-specific interactions in other eukaryotes through a deep neural network, outperforming similar methods. We analyzed the differential conservation of kinase specificity among eukaryotes revealing the high conservation of the specificity of tyrosine kinases. With this approach we discovered 1590 novel KsP of potential clinical relevance in the human proteome. Availability and implementation: http://akid.bio.uniroma2.it. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Phosphotransferases/chemistry , Proteome , Signal Transduction , Eukaryota , Humans , Phosphorylation
10.
Bioinformatics ; 34(6): 1058-1060, 2018 03 15.
Article in English | MEDLINE | ID: mdl-29095974

ABSTRACT

Motivation: RNA structural motif finding is a relevant problem that becomes computationally hard when working on high-throughput data (e.g. eCLIP, PAR-CLIP), often represented by thousands of RNA molecules. Currently, the BEAM server is the only web tool capable of handling tens of thousands of RNAs as input with a motif discovery procedure that is only limited by the current secondary structure prediction accuracies. Results: The recently developed method BEAM (BEAr Motifs finder) can analyze tens of thousands of RNA molecules and identify RNA secondary structure motifs associated with a measure of their statistical significance. BEAM is extremely fast thanks to the BEAR encoding that transforms each RNA secondary structure into a string of characters. BEAM also exploits the evolutionary knowledge contained in a substitution matrix of secondary structure elements, extracted from the RFAM database of families of homologous RNAs. The BEAM web server has been designed to streamline data pre-processing by automatically handling folding and encoding of RNA sequences, giving users a choice of the preferred folding program. The server provides an intuitive and informative results page with the list of secondary structure motifs identified, the logo of each motif, its significance, graphic representation and information about its position in the RNA molecules sharing it. Availability and implementation: The web server is freely available at http://beam.uniroma2.it/ and it is implemented in NodeJS and Python with all major browsers supported. Contact: marco.pietrosanto@uniroma2.it. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
RNA/chemistry , Base Sequence , Databases, Factual , Internet , Nucleotide Motifs , Sequence Analysis, RNA , Software
11.
Mol Cell Proteomics ; 13(9): 2198-212, 2014 09.
Article in English | MEDLINE | ID: mdl-24830415

ABSTRACT

Phosphorylation is a widespread post-translational modification that modulates the function of a large number of proteins. Here we show that a significant proportion of all the domains in the human proteome is significantly enriched or depleted in phosphorylation events. A substantial improvement in phosphosites prediction is achieved by leveraging this observation, which has not been tapped by existing methods. Phosphorylation sites are often not shared between multiple occurrences of the same domain in the proteome, even when the phosphoacceptor residue is conserved. This is partly because of different functional constraints acting on the same domain in different protein contexts. Moreover, by augmenting domain alignments with structural information, we were able to provide direct evidence that phosphosites in protein-protein interfaces need not be positionally conserved, likely because they can modulate interactions simply by sitting in the same general surface area.


Subject(s)
Phosphorylation , Proteome/metabolism , Computational Biology/methods , Humans , Phosphoproteins/metabolism , Protein Domains , Proteome/chemistry
12.
Nucleic Acids Res ; 42(10): 6146-57, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24753415

ABSTRACT

Structural information is crucial in ribonucleic acid (RNA) analysis and functional annotation; nevertheless, how to include such structural data is still a debated problem. Dot-bracket notation is the most common and simple representation for RNA secondary structures, but its simplicity also leads to ambiguity that requires further processing steps to resolve. Here we present BEAR (Brand nEw Alphabet for RNA), a new context-aware structural encoding represented by a string of characters. Each character in BEAR encodes a specific secondary structure element (loop, stem, bulge and internal loop) with a specific length. Furthermore, exploiting this informative and yet simple encoding in multiple alignments of related RNAs, we captured how much structural variation is tolerated in RNA families and converted it into transition rates among secondary structure elements. This allowed us to compute a substitution matrix for secondary structure elements called MBR (Matrix of BEAR-encoded RNA secondary structures), whose ability to align RNA secondary structures we tested. We propose BEAR and the MBR as powerful resources for RNA secondary structure analysis, comparison and classification, motif finding and phylogeny.
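
To make the ambiguity of dot-bracket notation concrete: a `.` says only that a base is unpaired, not whether it sits in a hairpin loop, a bulge, or an internal loop. The sketch below shows the kind of extra processing step the abstract alludes to, recovering base pairs with a stack and then picking out hairpin loops. It is a generic illustration with hypothetical function names, not the BEAR encoder, whose alphabet is not reproduced here.

```python
def parse_dot_bracket(structure):
    """Return a dict mapping each paired position to its partner (0-based)."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {i}")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unbalanced '(' at position {stack[-1]}")
    return pairs


def hairpin_loops(structure):
    """List (open, close, loop_length) for pairs enclosing only unpaired bases."""
    pairs = parse_dot_bracket(structure)
    loops = []
    for i, j in pairs.items():
        if i < j and all(k not in pairs for k in range(i + 1, j)):
            loops.append((i, j, j - i - 1))
    return sorted(loops)
```

For example, `hairpin_loops("((...))")` reports a single hairpin closed by positions 1 and 5 with three unpaired bases. An encoding that attaches element type and length to every position, as the abstract describes for BEAR, makes this context available directly in the string.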


Subject(s)
RNA/chemistry , Algorithms , Computational Biology/methods , Nucleic Acid Conformation , Sequence Analysis, RNA
13.
Database (Oxford) ; 2013: bat050, 2013.
Article in English | MEDLINE | ID: mdl-23842462

ABSTRACT

The use of high-throughput RNA sequencing technology (RNA-seq) allows whole transcriptome analysis, providing an unbiased and unabridged view of alternative transcript expression. Coupling splicing variant-specific expression with its functional inference is still an open and difficult issue for which we created the DataBase of Alternative Transcripts Expression (DBATE), a web-based repository storing expression values and functional annotation of alternative splicing variants. We processed 13 large RNA-seq panels from human healthy tissues and in disease conditions, reporting expression levels and functional annotations gathered and integrated from different sources for each splicing variant, using a variant-specific annotation transfer pipeline. The possibility to perform complex queries by cross-referencing different functional annotations permits the retrieval of desired subsets of splicing variant expression values that can be visualized in several ways, from simple to more informative. DBATE is intended as a novel tool to help appreciate how, and possibly why, the transcriptome expression is shaped. DATABASE URL: http://bioinformatica.uniroma2.it/DBATE/.


Subject(s)
Alternative Splicing/genetics , Databases, Genetic , Humans , Internet , Molecular Sequence Annotation , Protein Interaction Maps/genetics , Search Engine , User-Computer Interface
14.
BMC Genomics ; 14: 379, 2013 Jun 07.
Article in English | MEDLINE | ID: mdl-23758645

ABSTRACT

BACKGROUND: Anecdotal evidence of the involvement of alternative splicing (AS) in the regulation of protein-protein interactions has been reported by several studies. AS events have been shown to significantly occur in regions where a protein interaction domain or a short linear motif is present. Several AS variants show partial or complete loss of interface residues, suggesting that AS can play a major role in the regulation of interactions by selectively targeting the protein binding sites. In the present study we performed a statistical analysis of the alternative splicing of a non-redundant dataset of human protein-protein interfaces known at the molecular level to determine the importance of this mode of modulating protein-protein interactions through AS. RESULTS: Using a Cochran-Mantel-Haenszel chi-square test we demonstrated that the alternative splicing-mediated partial removal of both heterodimeric and homodimeric binding sites occurs at lower frequencies than expected, and this holds true even if we consider only those isoforms whose sequence is less different from that of the canonical protein and which therefore allow selective regulation of functional regions of the protein. On the other hand, large removals of the binding site are not significantly prevented, possibly because they are associated with drastic structural changes of the protein. The observed protection of the binding sites from AS is not preferentially directed towards putative hot spot interface residues, and extends to all protein functional classes. CONCLUSIONS: Our findings indicate that protein-protein binding sites are generally protected from alternative splicing-mediated partial removals. However, some cases in which the binding site is selectively removed exist, and here we discuss one of them.
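
The Cochran-Mantel-Haenszel test named above pools evidence from several 2x2 tables (strata) into one chi-square statistic with one degree of freedom. The sketch below implements the textbook statistic. How the study defined its strata and table cells is not stated in the abstract, so the `(a, b, c, d)` layout and the reading suggested in the comments are assumptions.

```python
def cmh_statistic(tables, continuity=True):
    """Cochran-Mantel-Haenszel chi-square statistic (1 degree of freedom).

    tables: iterable of (a, b, c, d) counts, one 2x2 table per stratum,
            laid out as [[a, b], [c, d]]. A hypothetical reading: rows are
            interface vs. non-interface residues, columns are removed vs.
            retained by splicing, and each stratum is one group of proteins.
    """
    diff = 0.0      # sum over strata of observed a minus expected a
    variance = 0.0  # sum over strata of the hypergeometric variance of a
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        diff += a - (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    if variance == 0:
        raise ValueError("no informative strata")
    correction = 0.5 if continuity and abs(diff) >= 0.5 else 0.0
    return (abs(diff) - correction) ** 2 / variance
```

The statistic is compared with the chi-square distribution with one degree of freedom, so values above about 3.84 correspond to p < 0.05. The sign of the pooled difference (observed minus expected) indicates whether the event is enriched or, as reported in the abstract, depleted.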


Subject(s)
Alternative Splicing , Proteins/chemistry , Proteins/metabolism , Proteomics , Binding Sites , Cullin Proteins/chemistry , Cullin Proteins/genetics , Cullin Proteins/metabolism , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Humans , Models, Molecular , Protein Binding , Protein Isoforms/chemistry , Protein Isoforms/genetics , Protein Isoforms/metabolism , Protein Multimerization , Protein Structure, Quaternary , Proteins/genetics , Thermodynamics
15.
Nucleic Acids Res ; 41(Web Server issue): W308-13, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23737450

ABSTRACT

The webPDBinder (http://pdbinder.bio.uniroma2.it/PDBinder) is a web server for the identification of small ligand-binding sites in a protein structure. webPDBinder searches a protein structure against a library of known binding sites and a collection of control non-binding pockets. The number of similarities identified with the residues in the two sets is then used to derive a propensity value for each residue of the query protein associated with the likelihood that the residue is part of a ligand binding site. The predicted binding residues can be further refined using conservation scores derived from the multiple alignment of the PFAM protein family. webPDBinder correctly identifies residues belonging to the binding site in 77% of the cases and is able to identify binding pockets starting from holo or apo structures with comparable performance. This is important for all real-world cases in which the query protein has been crystallized without a ligand and it is difficult to obtain clear similarities with bound pockets from holo pocket libraries. The input is either a PDB code or a user-submitted structure. The output is a list of predicted binding pocket residues with propensity and conservation values both in text and graphical format.


Subject(s)
Proteins/chemistry , Software , Binding Sites , Internet , Ligands , Models, Molecular , Protein Conformation , Proteins/metabolism
16.
Nucleic Acids Res ; 41(Web Server issue): W281-5, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23703207

ABSTRACT

Nucleos is a web server for the identification of nucleotide-binding sites in protein structures. Nucleos compares the structure of a query protein against a set of known template 3D binding sites representing nucleotide modules, namely the nucleobase, carbohydrate and phosphate. Structural features, clustering and conservation are used to filter and score the predictions. The predicted nucleotide modules are then joined to build whole nucleotide-binding sites, which are ranked by their score. The server takes as input either the PDB code of the query protein structure or a user-submitted structure in PDB format. The output of Nucleos is composed of ranked lists of predicted nucleotide-binding sites divided by nucleotide type (e.g. ATP-like). For each ranked prediction, Nucleos provides detailed information about the score, the template structure and the structural match for each nucleotide module composing the nucleotide-binding site. The predictions on the query structure and the template-binding sites can be viewed directly on the web through a graphical applet. In 98% of the cases, the modules composing correct predictions belong to proteins with no homology relationship between each other, meaning that the identification of brand-new nucleotide-binding sites is possible using information from non-homologous proteins. Nucleos is available at http://nucleos.bio.uniroma2.it/nucleos/.


Subject(s)
Nucleotides/metabolism , Protein Conformation , Software , Apoproteins/chemistry , Apoproteins/metabolism , Binding Sites , Internet , Proteins/metabolism
17.
PLoS One ; 7(11): e50240, 2012.
Article in English | MEDLINE | ID: mdl-23209685

ABSTRACT

Nucleotides are involved in several cellular processes, ranging from the transmission of genetic information to energy transfer and storage. Both sequence- and structure-based methods have been developed to predict the location of nucleotide-binding sites in proteins. Here we propose a novel methodology that leverages the observation that nucleotide-binding sites have a modular structure. Nucleotides are composed of identifiable fragments, i.e. the phosphate, the nucleobase and the carbohydrate moieties. These fragments are bound by specific structural motifs that recur in proteins of different fold. Moreover, these motifs behave as modules and are found in different combinations across fold space. Our method predicts binding sites for each nucleotide fragment by comparing a query protein with a database of templates extracted from proteins of known structure. Whenever a similarity is found, the fragment bound by the template is transferred onto the query protein, thus identifying a putative binding site. Predictions falling inside the surface of the protein are discarded, and the remaining ones are scored using clustering and conservation. The method ranks a correct prediction first in 48%, 48% and 68% of the analyzed proteins for the nucleobase, carbohydrate and phosphate respectively; when the first five predictions are considered, these figures rise to 71%, 65% and 86% respectively. Furthermore, we attempted to reconstruct the full structure of the binding site, starting from the predicted positions of the fragments. We calculated that in 59% of the analyzed proteins the method ranks as first a reconstructed binding site or a part of it. Finally, we tested the reliability of our method in a real-world case in which it has to predict nucleotide-binding sites in unbound proteins. We analyzed proteins whose structure has been solved with and without the nucleotide and observed only small variations in the method's performance.


Subject(s)
Nucleotides/chemistry , Proteins/chemistry , Animals , Bacterial Proteins/chemistry , Binding Sites , Carbohydrates/chemistry , Computational Biology/methods , Databases, Protein , Humans , Nucleotides/genetics , Phosphates/chemistry , Plant Proteins/chemistry , Protein Binding , Protein Folding , Reproducibility of Results , Software , Solvents/chemistry
18.
Adv Appl Bioinform Chem ; 5: 11-21, 2012.
Article in English | MEDLINE | ID: mdl-22888263

ABSTRACT

The ability to predict immunogenic regions in selected proteins by in-silico methods has broad implications, such as allowing a quick selection of potential reagents to be used as diagnostics, vaccines, immunotherapeutics, or research tools in several branches of biological and biotechnological research. However, the prediction of antibody target sites in proteins using computational methodologies has proven to be a highly challenging task, which is likely due to the somewhat elusive nature of B-cell epitopes. This paper proposes a web-based platform for scoring potential immunological reagents based on the structures or 3D models of the proteins of interest. The method scores a protein's peptides set, which is derived from a sliding window, based on the average solvent exposure, with a filter on the average local model quality for each peptide. The platform was validated on a custom-assembled database of 1336 experimentally determined epitopes from 106 proteins for which a reliable 3D model could be obtained through standard modeling techniques. Despite showing poor sensitivity, this method can achieve a specificity of 0.70 and a positive predictive value of 0.29 by combining these two simple parameters. These values are slightly higher than those obtained with other established sequence-based or structure-based methods that have been evaluated using the same epitopes dataset. This method is implemented in a web server called B-Pred, which is accessible at http://immuno.bio.uniroma2.it/bpred. The server contains a number of original features that allow users to perform personalized reagent searches by manipulating the sliding window's width and sliding step, changing the exposure and model quality thresholds, and running sequential queries with different parameters. The B-Pred server should assist experimentalists in the rational selection of epitope antigens for a wide range of applications.
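
The scoring scheme described above (peptides from a sliding window, ranked by average solvent exposure, filtered by average local model quality) can be sketched in a few lines. This is not the B-Pred implementation: the function name, the default window width, step, and both thresholds are hypothetical values, and per-residue exposure and quality are assumed to be supplied as lists scaled to the range 0 to 1.

```python
def score_peptides(sequence, exposure, quality, width=10, step=1,
                   min_exposure=0.5, min_quality=0.5):
    """Rank sliding-window peptides by mean solvent exposure.

    exposure, quality: per-residue values aligned with sequence.
    All default parameter values are illustrative assumptions.
    Returns (peptide, start_index, mean_exposure) tuples, best first.
    """
    if not (len(sequence) == len(exposure) == len(quality)):
        raise ValueError("sequence, exposure and quality must have equal length")
    hits = []
    for start in range(0, len(sequence) - width + 1, step):
        end = start + width
        mean_exposure = sum(exposure[start:end]) / width
        mean_quality = sum(quality[start:end]) / width
        # Discard windows whose local model quality is too low to trust.
        if mean_quality >= min_quality and mean_exposure >= min_exposure:
            hits.append((sequence[start:end], start, mean_exposure))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)
```

Exposing `width`, `step`, and the two thresholds as arguments mirrors the server features listed in the abstract, where users rerun queries with different window and threshold settings.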

19.
BMC Bioinformatics ; 13 Suppl 4: S17, 2012 Mar 28.
Article in English | MEDLINE | ID: mdl-22536963

ABSTRACT

BACKGROUND: The identification of ligand binding sites is a key task in the annotation of proteins with known structure but uncharacterized function. Here we describe a knowledge-based method exploiting the observation that unrelated binding sites share small structural motifs that bind the same chemical fragments irrespective of the nature of the ligand as a whole. RESULTS: PDBinder compares a query protein against a library of binding and non-binding protein surface regions derived from the PDB. The results of the comparison are used to derive a propensity value for each residue which is correlated with the likelihood that the residue is part of a ligand binding site. The method was applied to two different problems: i) the prediction of ligand binding residues and ii) the identification of which surface cleft harbours the binding site. In both cases PDBinder performed consistently better than existing methods. PDBinder has been trained on a non-redundant set of 1356 high-quality protein-ligand complexes and tested on a set of 239 holo and apo complex pairs. We obtained an MCC of 0.313 on the holo set with a PPV of 0.413 while on the apo set we achieved an MCC of 0.271 and a PPV of 0.372. CONCLUSIONS: We show that PDBinder performs better than existing methods. The good performance on the unbound proteins is extremely important for real-world applications where the location of the binding site is unknown. Moreover, since our approach is orthogonal to those used in other programs, the PDBinder propensity value can be integrated in other algorithms further increasing the final performance.
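
The performance figures quoted in these abstracts (sensitivity, specificity, positive predictive value, and the Matthews correlation coefficient) all derive from a binary confusion matrix. The Python sketch below shows the standard definitions. The function name and the example counts are hypothetical and are not taken from either study.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    tp, fp, tn, fn: true/false positives and negatives, e.g. counted per
    residue. Undefined ratios (zero denominator) are reported as 0.0.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),   # fraction of true sites found
        "specificity": ratio(tn, tn + fp),   # fraction of non-sites rejected
        "ppv": ratio(tp, tp + fp),           # fraction of predictions correct
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts, for illustration only.
for name, value in binary_metrics(tp=30, fp=20, tn=130, fn=20).items():
    print(f"{name}: {value:.3f}")
```

Because the MCC uses all four cells of the confusion matrix, it remains informative when binding residues are far rarer than non-binding ones, which is the usual situation in binding-site prediction.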


Subject(s)
Algorithms , Knowledge Bases , Proteins/chemistry , Animals , Binding Sites , Protein Databases , Ligands , Molecular Models , Protein Binding , Protein Conformation , Tertiary Protein Structure , Proteins/metabolism
20.
BMC Genomics ; 12: 614, 2011 Dec 19.
Article in English | MEDLINE | ID: mdl-22182631

ABSTRACT

BACKGROUND: Protein phosphorylation modulates protein function in organisms at all levels of complexity. Parasites of the Leishmania genus undergo various developmental transitions in their life cycle triggered by changes in the environment. The molecular mechanisms that these organisms use to process and integrate these external cues are largely unknown. However, Leishmania lacks transcription factors; most regulatory processes may therefore occur at the post-translational level, and phosphorylation has recently been shown to be an important player in this process. Experimental identification of phosphorylation sites is a time-consuming task. Moreover, some sites could be missed due to the highly dynamic nature of this process or to difficulties in phospho-peptide enrichment. RESULTS: Here we present PhosTryp, a phosphorylation site predictor specific for trypanosomatids. This method uses an SVM-based approach and has been trained with recent Leishmania phosphoproteomics data. PhosTryp achieved a 17% improvement in prediction performance compared with NetPhos, a non-organism-specific predictor. The analysis of the peptides correctly predicted by our method but missed by NetPhos demonstrates that PhosTryp captures Leishmania-specific phosphorylation features. More specifically, our results show that Leishmania kinases have sequence specificities that differ from those of their counterparts in higher eukaryotes. Consequently, we were able to propose two possible Leishmania-specific phosphorylation motifs. We further demonstrate that this improvement in performance extends to the related trypanosomatids Trypanosoma brucei and Trypanosoma cruzi. Finally, in order to maximize the usefulness of PhosTryp, we trained a predictor combining all the peptides from L. infantum, T. brucei and T. cruzi. CONCLUSIONS: Our work demonstrates that training on organism-specific data results in an improvement that extends to related species.
PhosTryp is freely available at http://phostryp.bio.uniroma2.it.


Subject(s)
Trypanosoma/metabolism , Animals , Phosphorylation