Results 1 - 9 of 9
1.
Nucleic Acids Res ; 44(D1): D548-54, 2016 01 04.
Article in English | MEDLINE | ID: mdl-26467481

ABSTRACT

Assembly of large biochemical networks can be achieved by confronting new cell-specific experimental data with an interaction subspace constrained by prior literature evidence. The SIGnaling Network Open Resource, SIGNOR (available online at http://signor.uniroma2.it), was developed to support such a strategy by providing a scaffold of prior experimental evidence of causal relationships between biological entities. The core of SIGNOR is a collection of approximately 12,000 manually annotated causal relationships between over 2800 human proteins participating in signal transduction. Other entities annotated in SIGNOR are complexes, chemicals, phenotypes and stimuli. The information captured in SIGNOR can be represented as a signed directed graph illustrating the activation/inactivation relationships between signalling entities. Each entry is associated with the post-translational modifications that cause the activation/inactivation of the target proteins. More than 4900 modified residues causing a change in protein concentration or activity have been curated and linked to the modifying enzymes (about 351 human kinases and 94 phosphatases). Additional modifications such as ubiquitinations, sumoylations, acetylations and their effect on the modified target proteins are also annotated. This wealth of structured information can support experimental approaches based on multi-parametric analysis of cell systems after physiological or pathological perturbations, as well as the assembly of large logic models.


Subject(s)
Databases, Protein , Signal Transduction , Humans , Internet , Intracellular Signaling Peptides and Proteins/chemistry , Phosphoprotein Phosphatases/chemistry , Phosphoprotein Phosphatases/metabolism , Protein Kinases/chemistry , Protein Kinases/metabolism
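
The SIGNOR abstract in entry 1 describes its content as a signed directed graph of activation/inactivation relationships. A minimal sketch of that data structure follows, assuming a simple adjacency-list layout. The entity names, signs and mechanisms are hypothetical toy values, not records taken from SIGNOR, and the function names are illustrative only.

```python
from collections import defaultdict

# Edge signs: +1 for activation, -1 for inactivation.
ACTIVATES, INHIBITS = +1, -1


def build_graph(relations):
    """Map each source entity to a list of (target, sign, mechanism) edges."""
    graph = defaultdict(list)
    for source, target, sign, mechanism in relations:
        graph[source].append((target, sign, mechanism))
    return graph


def path_sign(graph, path):
    """Net effect along a path: the product of the edge signs."""
    net = 1
    for source, target in zip(path, path[1:]):
        signs = [s for t, s, _ in graph[source] if t == target]
        if not signs:
            raise ValueError(f"no edge {source} -> {target}")
        net *= signs[0]
    return net


# Hypothetical toy relations (assumption: invented for illustration).
relations = [
    ("KinaseA", "ProteinB", ACTIVATES, "phosphorylation"),
    ("PhosphataseC", "ProteinB", INHIBITS, "dephosphorylation"),
    ("ProteinB", "ProteinD", INHIBITS, "phosphorylation"),
]
graph = build_graph(relations)

# KinaseA activates ProteinB, which inhibits ProteinD: net inhibition.
print(path_sign(graph, ["KinaseA", "ProteinB", "ProteinD"]))
# PhosphataseC inhibits ProteinB, which inhibits ProteinD: net activation.
print(path_sign(graph, ["PhosphataseC", "ProteinB", "ProteinD"]))
```

Multiplying signs along a path is one common way to read the net effect of an upstream entity on a downstream one. Whether SIGNOR itself exposes such a computation is not stated in the abstract.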
2.
Nucleic Acids Res ; 42(Database issue): D358-63, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24234451

ABSTRACT

IntAct (freely available at http://www.ebi.ac.uk/intact) is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. IntAct has developed a sophisticated web-based curation tool, capable of supporting both IMEx- and MIMIx-level curation. This tool is now utilized by multiple additional curation teams, all of whom annotate data directly into the IntAct database. Members of the IntAct team supply appropriate levels of training, perform quality control on entries and take responsibility for long-term data maintenance. Recently, the MINT and IntAct databases decided to merge their separate efforts to make optimal use of limited developer resources and maximize the curation output. All data manually curated by the MINT curators have been moved into the IntAct database at EMBL-EBI and are merged with the existing IntAct dataset. Both IntAct and MINT are active contributors to the IMEx consortium (http://www.imexconsortium.org).


Subject(s)
Databases, Protein , Protein Interaction Mapping , Internet , Software
3.
Nat Methods ; 9(4): 345-50, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22453911

ABSTRACT

The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/). Common curation rules have been developed, and a central registry is used to manage the selection of articles to enter into the dataset. We discuss the advantages of such a service to the user, our quality-control measures and our data-distribution practices.


Subject(s)
Databases, Protein , Protein Interaction Mapping , Proteins/metabolism , Periodicals as Topic , Protein Binding , Proteins/chemistry , Quality Control
4.
Biotechnol Adv ; 30(1): 4-15, 2012.
Article in English | MEDLINE | ID: mdl-21740962

ABSTRACT

Families of conserved protein domains, specialized in mediating interactions with short linear peptide motifs, are responsible for the formation of a variety of dynamic complexes in the cell. An important subclass of these motifs is characterized by a high proline content and plays a pivotal role in biological processes requiring the coordinated assembly of multi-protein complexes. This is achieved via interaction of proteins containing modules such as Src Homology-3 (SH3) or WW domains and specific proline-rich patterns. Here we make available via a publicly accessible database a synopsis of our current understanding of the interaction landscape of the human SH3 protein family. This is achieved by integrating an information extraction strategy with a new experimental approach. In a first approach we have used a text mining strategy to capture a large number of manuscripts reporting interactions between SH3 domains and target peptides. Relevant information was annotated in the MINT database. In a second experimental approach we have used a variant of the WISE (Whole Interactome Scanning Experiment) strategy to probe a large number of naturally occurring and chemically-synthesized peptides arrayed at high density on a glass surface. By this method we have tested 60 human SH3 domains for their ability to bind a collection of 9192 poly-proline containing peptides immobilized on a glass chip. To evaluate the quality of the resulting interaction dataset, we retested some of the interactions on a smaller scale and performed a series of pull-down experiments on native proteins. Peptide chips, pull-down assays, SPOT synthesis and phage display experiments have allowed us to further characterize the specificity and promiscuity of proline-rich binding domains and to map their interaction network. Both the information captured from the literature and the interactions inferred from the peptide chip experiments were collected and stored in the PepspotDB (http://mint.bio.uniroma2.it/PepspotDB/).


Subject(s)
Computational Biology/methods , Databases, Protein , Proline-Rich Protein Domains , Protein Interaction Mapping/methods , src Homology Domains , Humans , Protein Array Analysis , Protein Interaction Maps , Proteins/chemistry , Proteins/classification , Proteins/metabolism , Reproducibility of Results , Software , User-Computer Interface
5.
Nucleic Acids Res ; 40(Database issue): D857-61, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22096227

ABSTRACT

The Molecular INTeraction Database (MINT, http://mint.bio.uniroma2.it/mint/) is a public repository for protein-protein interactions (PPI) reported in peer-reviewed journals. The database has grown steadily over the years and, as of September 2011, contains approximately 235,000 binary interactions captured from over 4750 publications. The web interface allows users to search, visualize and download interaction data. MINT is one of the members of the International Molecular Exchange consortium (IMEx) and adopts the Molecular Interaction Ontology of the Proteomics Standard Initiative (PSI-MI) standards for curation and data exchange. MINT data are freely accessible and downloadable at http://mint.bio.uniroma2.it/mint/download.do. We report here the growth of the database, the major changes in curation policy and a new algorithm to assign a confidence to each interaction.


Subject(s)
Databases, Protein , Protein Interaction Mapping , Algorithms , Animals , Humans , Mice , Proteins/chemistry , Proteins/genetics , Rats
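
The MINT abstract in entry 5 mentions PSI-MI standards for data exchange and a confidence value per interaction. As a rough illustration of working with such data, the sketch below parses one line in the tab-separated PSI-MI TAB 2.5 layout (15 columns, `|` between multiple values, `-` for an empty field). The column labels are shorthand chosen here, the sample row is invented with placeholder identifiers, and the score label `score` is hypothetical. The abstract does not say which file formats the MINT download page offers, so treating its files as MITAB 2.5 is an assumption.

```python
# Shorthand names for the 15 PSI-MI TAB 2.5 columns (labels are our own).
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def parse_mitab_line(line):
    """Split one MITAB 2.5 row into a dict of column name -> list of values."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(MITAB25_COLUMNS):
        raise ValueError(f"expected 15 columns, got {len(fields)}")
    record = {}
    for name, raw in zip(MITAB25_COLUMNS, fields):
        # "-" marks an empty field; "|" separates multiple values.
        # Note: this naive split would break on a quoted name containing "|".
        record[name] = [] if raw == "-" else raw.split("|")
    return record


# Invented sample row (assumption: placeholder identifiers, not MINT data).
sample = "\t".join([
    "uniprotkb:PLACEHOLDER_A", "uniprotkb:PLACEHOLDER_B",
    "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe J (2011)", "pubmed:00000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', "-",
    "-", "score:0.62",
])

record = parse_mitab_line(sample)
print(record["id_a"], record["confidence"])
```

Keeping every field as a list avoids special cases for columns that carry several cross-references.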
6.
BMC Bioinformatics ; 12 Suppl 8: S3, 2011 Oct 03.
Article in English | MEDLINE | ID: mdl-22151929

ABSTRACT

BACKGROUND: Determining usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations, trying to address aspects including how the end user would oversee the generated output, for instance by providing ranked results, textual evidence for human interpretation or measuring time savings by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. For this purpose the BCIII-ACT corpus was provided, which includes a training, development and test set of over 12,000 PPI-relevant and non-relevant PubMed abstracts labeled manually by domain experts and recording also the human classification times. The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full text articles and interaction detection method ontology concepts that had been applied to detect the PPIs reported in them. RESULTS: A total of 11 teams participated in at least one of the two PPI tasks (10 in ACT and 8 in the IMT) and a total of 62 persons were involved either as participants or in preparing data sets/evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. From the 52 runs submitted for the ACT, the highest Matthews correlation coefficient (MCC) score measured was 0.55 at an accuracy of 89% and the best AUC iP/R was 68%. Most ACT teams explored machine learning methods, some of them also used lexical resources like MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, some integrated NER approaches. For the IMT, a total of 42 runs were evaluated by comparing systems against manually generated annotations done by curators from the BioGRID and MINT databases. The highest AUC iP/R achieved by any run was 53%, the best MCC score 0.55. In case of competitive systems with an acceptable recall (above 35%) the macro-averaged precision ranged between 50% and 80%, with a maximum F-Score of 55%. CONCLUSIONS: The results of the ACT task of BioCreative III indicate that classification of large unbalanced article collections reflecting the real class imbalance is still challenging. Nevertheless, text-mining tools that report ranked lists of relevant articles for manual selection can potentially reduce the time needed to identify half of the relevant articles to less than 1/4 of the time when compared to unranked results. Detecting associations between full text articles and interaction detection method PSI-MI terms (IMT) is more difficult than might be anticipated. This is due to the variability of method term mentions, errors resulting from pre-processing of articles provided as PDF files, and the heterogeneity and different granularity of method term concepts encountered in the ontology. However, combining the sophisticated techniques developed by the participants with supporting evidence strings derived from the articles for human interpretation could result in practical modules for biological annotation workflows.


Subject(s)
Algorithms , Data Mining , Proteins/metabolism , Animals , Databases, Protein , Humans , Periodicals as Topic , PubMed
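
The BioCreative III abstract in entry 6 reports accuracy, MCC and F-score side by side and stresses class imbalance. The sketch below computes these metrics from a binary confusion matrix using the standard textbook formulas. The counts are invented toy values, not BioCreative results, and the function name is illustrative.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, F-score and MCC from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (
        2 * precision * recall / (precision + recall)
        if (precision + recall) else 0.0
    )
    # MCC is conventionally set to 0 when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "mcc": mcc,
    }


# Toy collection: 1000 abstracts, 100 of them relevant (invented counts).
# A classifier that labels everything "non-relevant":
trivial = confusion_metrics(tp=0, fp=0, tn=900, fn=100)
# A classifier that finds half of the relevant abstracts:
useful = confusion_metrics(tp=50, fp=50, tn=850, fn=50)

print(trivial["accuracy"], trivial["mcc"])
print(useful["accuracy"], round(useful["mcc"], 3))
```

Both toy classifiers reach the same accuracy, but only the second has a non-zero MCC, which illustrates why a balanced measure is informative on unbalanced article collections.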
7.
BMC Bioinformatics ; 12 Suppl 8: S8, 2011 Oct 03.
Article in English | MEDLINE | ID: mdl-22151178

ABSTRACT

BACKGROUND: The vast amount of data published in the primary biomedical literature represents a challenge for the automated extraction and codification of individual data elements. Biological databases that rely solely on manual extraction by expert curators are unable to comprehensively annotate the information dispersed across the entire biomedical literature. The development of efficient tools based on natural language processing (NLP) systems is essential for the selection of relevant publications, identification of data attributes and partially automated annotation. One of the tasks of the BioCreative 2010 Challenge III was devoted to the evaluation of NLP systems developed to identify articles for curation and extraction of protein-protein interaction (PPI) data. RESULTS: The BioCreative 2010 competition addressed three tasks: gene normalization, article classification and interaction method identification. The BioGRID and MINT protein interaction databases both participated in the generation of the test publication set for gene normalization, annotated the development and test sets for article classification, and curated the test set for interaction method classification. These test datasets served as a gold standard for the evaluation of data extraction algorithms. CONCLUSION: The development of efficient tools for extraction of PPI data is a necessary step to achieve full curation of the biomedical literature. NLP systems can in the first instance facilitate expert curation by refining the list of candidate publications that contain PPI data; more ambitiously, NLP approaches may be able to directly extract relevant information from full-text articles for rapid inspection by expert curators. Close collaboration between biological databases and NLP systems developers will continue to facilitate the long-term objectives of both disciplines.


Subject(s)
Data Mining , Databases, Protein , Genes , Natural Language Processing , Algorithms , Data Mining/standards , Humans
8.
PLoS One ; 6(7): e22270, 2011.
Article in English | MEDLINE | ID: mdl-21799808

ABSTRACT

The function of proteins is often mediated by short linear segments of their amino acid sequence, called Short Linear Motifs or SLiMs, the identification of which can provide important information about a protein's function. However, the short length of the motifs and their variable degree of conservation make their identification hard, since it is difficult to correctly estimate the statistical significance of their occurrence. Consequently, only a small fraction of them have been discovered so far. We describe here an approach for the discovery of SLiMs based on their occurrence in evolutionarily unrelated proteins belonging to the same biological, signalling or metabolic pathway and give specific examples of its effectiveness in both rediscovering known motifs and in discovering novel ones. An automatic implementation of the procedure, available for download, allows significant motifs to be identified, automatically annotated with functional, evolutionary and structural information and organized in a database that can be inspected and queried. An instance of the database populated with pre-computed data on seven organisms is accessible through a publicly available server and we believe it constitutes by itself a useful resource for the life sciences (http://www.biocomputing.it/modipath).


Subject(s)
Computational Biology/methods , Data Mining/methods , Databases, Protein , Proteins/chemistry , Proteins/metabolism , Amino Acid Motifs , Animals , Conserved Sequence , Evolution, Molecular , Humans , Internet , Mice , Molecular Sequence Annotation , Proteins/genetics , Rats , User-Computer Interface
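
The abstract in entry 8 bases SLiM discovery on a motif occurring in proteins of the same pathway and on estimating the significance of that occurrence. The sketch below is not the paper's procedure: it swaps in a plain hypergeometric test as a stand-in statistic and skips the filtering for evolutionarily unrelated proteins. The sequences, protein names and the `P..P` pattern are toy assumptions chosen for illustration.

```python
import re
from math import comb


def proteins_with_motif(pattern, sequences):
    """Names of the proteins whose sequence contains the regex pattern."""
    regex = re.compile(pattern)
    return {name for name, seq in sequences.items() if regex.search(seq)}


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n proteins are drawn from N, of which K carry the motif."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


# Toy background proteome (assumption: invented sequences).
background = {
    "p1": "MAPLLPSTK", "p2": "MKPAAPGQ", "p3": "GGPSTPLL", "p4": "MKKLLSTA",
    "p5": "AAGGSTLK", "p6": "MPKLLGGA", "p7": "QQSTLLKK", "p8": "LLKKGGAA",
}
# Hypothetical pathway membership.
pathway = {"p1", "p2", "p3"}

carriers = proteins_with_motif(r"P..P", background)  # proline-x-x-proline
k = len(carriers & pathway)  # motif carriers inside the pathway
p_value = hypergeom_tail(k, n=len(pathway), K=len(carriers), N=len(background))

print(sorted(carriers), k, p_value)
```

A small tail probability suggests the motif is over-represented in the pathway relative to the background, which is the kind of signal a pathway-based search would then annotate further.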
9.
Nucleic Acids Res ; 38(Database issue): D532-9, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19897547

ABSTRACT

MINT (http://mint.bio.uniroma2.it/mint) is a public repository for molecular interactions reported in peer-reviewed journals. Since its last report, MINT has grown considerably in size and evolved in scope to meet the requirements of its users. The main changes include a more precise definition of the curation policy and the development of an enhanced and user-friendly interface to facilitate the analysis of the ever-growing interaction dataset. MINT has adopted the PSI-MI standards for the annotation and for the representation of molecular interactions and is a member of the IMEx consortium.


Subject(s)
Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Protein Interaction Mapping , Animals , Computational Biology/trends , Databases, Protein , ErbB Receptors/metabolism , Genome, Viral , Humans , Information Storage and Retrieval/methods , Internet , Programming Languages , Protein Binding , Protein Structure, Tertiary , Software