Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 20
Filtrar
Más filtros










Base de datos
Intervalo de año de publicación
1.
Nat Genet ; 55(3): 389-398, 2023 03.
Artículo en Inglés | MEDLINE | ID: mdl-36823319

RESUMEN

Interacting proteins tend to have similar functions, influencing the same organismal traits. Interaction networks can be used to expand the list of candidate trait-associated genes from genome-wide association studies. Here, we performed network-based expansion of trait-associated genes for 1,002 human traits showing that this recovers known disease genes or drug targets. The similarity of network expansion scores identifies groups of traits likely to share an underlying genetic and biological process. We identified 73 pleiotropic gene modules linked to multiple traits, enriched in genes involved in processes such as protein ubiquitination and RNA processing. In contrast to gene deletion studies, pleiotropy as defined here captures specifically multicellular-related processes. We show examples of modules linked to human diseases enriched in genes with known pathogenic variants that can be used to map targets of approved drugs for repurposing. Finally, we illustrate the use of network expansion scores to study genes at inflammatory bowel disease genome-wide association study loci, and implicate inflammatory bowel disease-relevant genes with strong functional and genetic support.


Asunto(s)
Biología Celular , Células , Enfermedad , Estudios de Asociación Genética , Pleiotropía Genética , Estudios de Asociación Genética/métodos , Humanos , Ubiquitinación/genética , Procesamiento Postranscripcional del ARN/genética , Células/metabolismo , Células/patología , Reposicionamiento de Medicamentos/métodos , Reposicionamiento de Medicamentos/tendencias , Enfermedad/genética , Enfermedades Inflamatorias del Intestino/genética , Enfermedades Inflamatorias del Intestino/patología , Estudio de Asociación del Genoma Completo , Fenotipo , Enfermedades Autoinmunes/genética , Enfermedades Autoinmunes/patología
2.
Nucleic Acids Res ; 50(D1): D648-D653, 2022 01 07.
Artículo en Inglés | MEDLINE | ID: mdl-34761267

RESUMEN

The IntAct molecular interaction database (https://www.ebi.ac.uk/intact) is a curated resource of molecular interactions, derived from the scientific literature and from direct data depositions. As of August 2021, IntAct provides more than one million binary interactions, curated by twelve global partners of the International Molecular Exchange consortium, for which the IntAct database provides a shared curation and dissemination platform. The IMEx curation policy has always emphasised a fine-grained data and curation model, aiming to capture the relevant experimental detail essential for the interpretation of the provided molecular interaction data. Here, we present recent curation focus and progress, as well as a completely redeveloped website which presents IntAct data in a much more user-friendly and detailed way.


Asunto(s)
Bases de Datos de Proteínas , Mapas de Interacción de Proteínas/genética , Programas Informáticos , Humanos , Mapeo de Interacción de Proteínas/métodos
3.
Nucleic Acids Res ; 50(D1): D578-D586, 2022 01 07.
Artículo en Inglés | MEDLINE | ID: mdl-34718729

RESUMEN

The Complex Portal (www.ebi.ac.uk/complexportal) is a manually curated, encyclopaedic database of macromolecular complexes with known function from a range of model organisms. It summarizes complex composition, topology and function along with links to a large range of domain-specific resources (i.e. wwPDB, EMDB and Reactome). Since the last update in 2019, we have produced a first draft complexome for Escherichia coli, maintained and updated that of Saccharomyces cerevisiae, added over 40 coronavirus complexes and increased the human complexome to over 1100 complexes that include approximately 200 complexes that act as targets for viral proteins or are part of the immune system. The display of protein features in ComplexViewer has been improved and the participant table is now colour-coordinated with the nodes in ComplexViewer. Community collaboration has expanded, for example by contributing to an analysis of putative transcription cofactors and providing data accessible to semantic web tools through Wikidata which is now populated with manually curated Complex Portal content through a new bot. Our data license is now CC0 to encourage data reuse. Users are encouraged to get in touch, provide us with feedback and send curation requests through the 'Support' link.


Asunto(s)
Curaduría de Datos/métodos , Bases de Datos de Proteínas , Complejos Multiproteicos/química , Coronavirus/química , Visualización de Datos , Bases de Datos de Compuestos Químicos , Enzimas/química , Enzimas/metabolismo , Escherichia coli/química , Humanos , Cooperación Internacional , Anotación de Secuencia Molecular , Complejos Multiproteicos/metabolismo , Interfaz Usuario-Computador
4.
Bioinformatics ; 37(20): 3684-3685, 2021 Oct 25.
Artículo en Inglés | MEDLINE | ID: mdl-33961020

RESUMEN

SUMMARY: IntAct App is a Cytoscape 3 application that grants in-depth access to IntAct's molecular interaction data. It build networks where nodes are interacting molecules (mainly proteins, but also genes, RNA, chemicals…) and edges represent evidence of interaction. Users can query a network by providing its molecules, identified by different fields and optionally include all their interacting partners in the resulting network. The app offers three visualizations: one only displaying interactions, another representing every evidence and the last one emphasizing evidence where mutated versions of proteins were used. Users can also filter networks and click on nodes and edges to access all their related details. Finally, the application supports automation of its main features via Cytoscape commands. AVAILABILITY AND IMPLEMENTATION: Implementation available at https://apps.cytoscape.org/apps/intactapp, while the source code is available at https://github.com/EBI-IntAct/IntactApp.

5.
Nucleic Acids Res ; 49(6): 3156-3167, 2021 04 06.
Artículo en Inglés | MEDLINE | ID: mdl-33677561

RESUMEN

The EMBL-EBI Complex Portal is a knowledgebase of macromolecular complexes providing persistent stable identifiers. Entries are linked to literature evidence and provide details of complex membership, function, structure and complex-specific Gene Ontology annotations. Data are freely available and downloadable in HUPO-PSI community standards and missing entries can be requested for curation. In collaboration with Saccharomyces Genome Database and UniProt, the yeast complexome, a compendium of all known heteromeric assemblies from the model organism Saccharomyces cerevisiae, was curated. This expansion of knowledge and scope has led to a 50% increase in curated complexes compared to the previously published dataset, CYC2008. The yeast complexome is used as a reference resource for the analysis of complexes from large-scale experiments. Our analysis showed that genes coding for proteins in complexes tend to have more genetic interactions, are co-expressed with more genes, are more multifunctional, localize more often in the nucleus, and are more often involved in nucleic acid-related metabolic processes and processes where large machineries are the predominant functional drivers. A comparison to genetic interactions showed that about 40% of expanded co-complex pairs also have genetic interactions, suggesting strong functional links between complex members.


Asunto(s)
Proteínas de Saccharomyces cerevisiae/metabolismo , Saccharomyces cerevisiae/metabolismo , Conjuntos de Datos como Asunto , Ontología de Genes , Bases del Conocimiento , Saccharomyces cerevisiae/genética , Proteínas de Saccharomyces cerevisiae/genética
6.
Bioinformatics ; 36(24): 5712-5718, 2021 04 05.
Artículo en Inglés | MEDLINE | ID: mdl-32637990

RESUMEN

MOTIVATION: A large variety of molecular interactions occurs between biomolecular components in cells. When a molecular interaction results in a regulatory effect, exerted by one component onto a downstream component, a so-called 'causal interaction' takes place. Causal interactions constitute the building blocks in our understanding of larger regulatory networks in cells. These causal interactions and the biological processes they enable (e.g. gene regulation) need to be described with a careful appreciation of the underlying molecular reactions. A proper description of this information enables archiving, sharing and reuse by humans and for automated computational processing. Various representations of causal relationships between biological components are currently used in a variety of resources. RESULTS: Here, we propose a checklist that accommodates current representations, called the Minimum Information about a Molecular Interaction CAusal STatement (MI2CAST). This checklist defines both the required core information, as well as a comprehensive set of other contextual details valuable to the end user and relevant for reusing and reproducing causal molecular interaction information. The MI2CAST checklist can be used as reporting guidelines when annotating and curating causal statements, while fostering uniformity and interoperability of the data across resources. AVAILABILITY AND IMPLEMENTATION: The checklist together with examples is accessible at https://github.com/MI2CAST/MI2CAST. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Asunto(s)
Programas Informáticos , Causalidad , Humanos
7.
Nat Commun ; 11(1): 6144, 2020 12 01.
Artículo en Inglés | MEDLINE | ID: mdl-33262342

RESUMEN

The International Molecular Exchange (IMEx) Consortium provides scientists with a single body of experimentally verified protein interactions curated in rich contextual detail to an internationally agreed standard. In this update to the work of the IMEx Consortium, we discuss how this initiative has been working in practice, how it has ensured database sustainability, and how it is meeting emerging annotation challenges through the introduction of new interactor types and data formats. Additionally, we provide examples of how IMEx data are being used by biomedical researchers and integrated in other bioinformatic tools and resources.


Asunto(s)
Acceso a la Información , Bases de Datos Genéticas , Humanos , Difusión de la Información , Cooperación Internacional
8.
Nucleic Acids Res ; 47(D1): D550-D558, 2019 01 08.
Artículo en Inglés | MEDLINE | ID: mdl-30357405

RESUMEN

The Complex Portal (www.ebi.ac.uk/complexportal) is a manually curated, encyclopaedic database that collates and summarizes information on stable, macromolecular complexes of known function. It captures complex composition, topology and function and links out to a large range of domain-specific resources that hold more detailed data, such as PDB or Reactome. We have made several significant improvements since our last update, including improving compliance to the FAIR data principles by providing complex-specific, stable identifiers that include versioning. Protein complexes are now available from 20 species for download in standards-compliant formats such as PSI-XML, MI-JSON and ComplexTAB or can be accessed via an improved REST API. A component-based JS front-end framework has been implemented to drive a new website and this has allowed the use of APIs from linked services to import and visualize information such as the 3D structure of protein complexes, its role in reactions and pathways and the co-expression of complex components in the tissues of multi-cellular organisms. A first draft of the complete complexome of Saccharomyces cerevisiae is now available to browse and download.


Asunto(s)
Bases de Datos de Proteínas , Complejos Multiproteicos/química , Animales , Gráficos por Computador , Humanos , Sustancias Macromoleculares/química , Ratones , Complejos Multiproteicos/metabolismo , Ácidos Nucleicos/química , Conformación Proteica
11.
J Proteome Res ; 15(11): 4101-4115, 2016 11 04.
Artículo en Inglés | MEDLINE | ID: mdl-27581094

RESUMEN

The current catalogue of the human proteome is not yet complete, as experimental proteomics evidence is still elusive for a group of proteins known as the missing proteins. The Human Proteome Project (HPP) has been successfully using technology and bioinformatic resources to improve the characterization of such challenging proteins. In this manuscript, we propose a pipeline starting with the mining of the PRIDE database to select a group of data sets potentially enriched in missing proteins that are subsequently analyzed for protein identification with a method based on the statistical analysis of proteotypic peptides. Spermatozoa and the HEK293 cell line were found to be a promising source of missing proteins and clearly merit further attention in future studies. After the analysis of the selected samples, we found 342 PSMs, suggesting the presence of 97 missing proteins in human spermatozoa or the HEK293 cell line, while only 36 missing proteins were potentially detected in the retina, frontal cortex, aorta thoracica, or placenta. The functional analysis of the missing proteins detected confirmed their tissue specificity, and the validation of a selected set of peptides using targeted proteomics (SRM/MRM assays) further supports the utility of the proposed pipeline. As illustrative examples, DNAH3 and TEPP in spermatozoa, and UNCX and ATAD3C in HEK293 cells were some of the more robust and remarkable identifications in this study. We provide evidence indicating the relevance to carefully analyze the ever-increasing MS/MS data available from PRIDE and other repositories as sources for missing proteins detection in specific biological matrices as revealed for HEK293 cells.


Asunto(s)
Biología Computacional/métodos , Bases de Datos de Proteínas , Proteoma/análisis , Aorta/química , Femenino , Lóbulo Frontal/química , Células HEK293 , Humanos , Masculino , Placenta/química , Embarazo , Proteómica/métodos , Retina/química , Espermatozoides/química , Espectrometría de Masas en Tándem
12.
Nat Methods ; 13(8): 651-656, 2016 Aug.
Artículo en Inglés | MEDLINE | ID: mdl-27493588

RESUMEN

Mass spectrometry (MS) is the main technology used in proteomics approaches. However, on average 75% of spectra analysed in an MS experiment remain unidentified. We propose to use spectrum clustering at a large-scale to shed a light on these unidentified spectra. PRoteomics IDEntifications database (PRIDE) Archive is one of the largest MS proteomics public data repositories worldwide. By clustering all tandem MS spectra publicly available in PRIDE Archive, coming from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra: 1) incorrectly identified spectra, 2) spectra correctly identified but below the set scoring threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify less than 20% of the consistently unidentified spectra. The complete spectrum clustering results are available through the new version of the PRIDE Cluster resource (http://www.ebi.ac.uk/pride/cluster). This resource is intended, among other aims, to encourage and simplify further investigation into these unidentified spectra.

13.
Mol Cell Proteomics ; 15(1): 305-17, 2016 Jan.
Artículo en Inglés | MEDLINE | ID: mdl-26545397

RESUMEN

The original PRIDE Inspector tool was developed as an open source standalone tool to enable the visualization and validation of mass-spectrometry (MS)-based proteomics data before data submission or already publicly available in the Proteomics Identifications (PRIDE) database. The initial implementation of the tool focused on visualizing PRIDE data by supporting the PRIDE XML format and a direct access to private (password protected) and public experiments in PRIDE.The ProteomeXchange (PX) Consortium has been set up to enable a better integration of existing public proteomics repositories, maximizing its benefit to the scientific community through the implementation of standard submission and dissemination pipelines. Within the Consortium, PRIDE is focused on supporting submissions of tandem MS data. The increasing use and popularity of the new Proteomics Standards Initiative (PSI) data standards such as mzIdentML and mzTab, and the diversity of workflows supported by the PX resources, prompted us to design and implement a new suite of algorithms and libraries that would build upon the success of the original PRIDE Inspector and would enable users to visualize and validate PX "complete" submissions. The PRIDE Inspector Toolsuite supports the handling and visualization of different experimental output files, ranging from spectra (mzML, mzXML, and the most popular peak lists formats) and peptide and protein identification results (mzIdentML, PRIDE XML, mzTab) to quantification data (mzTab, PRIDE XML), using a modular and extensible set of open-source, cross-platform libraries. We believe that the PRIDE Inspector Toolsuite represents a milestone in the visualization and quality assessment of proteomics data. It is freely available at http://github.com/PRIDE-Toolsuite/.


Asunto(s)
Biología Computacional/métodos , Bases de Datos de Proteínas , Proteoma/metabolismo , Proteómica/métodos , Programas Informáticos , Internet , Reproducibilidad de los Resultados , Espectrometría de Masas en Tándem
14.
Nucleic Acids Res ; 44(D1): D447-56, 2016 Jan 04.
Artículo en Inglés | MEDLINE | ID: mdl-26527722

RESUMEN

The PRoteomics IDEntifications (PRIDE) database is one of the world-leading data repositories of mass spectrometry (MS)-based proteomics data. Since the beginning of 2014, PRIDE Archive (http://www.ebi.ac.uk/pride/archive/) is the new PRIDE archival system, replacing the original PRIDE database. Here we summarize the developments in PRIDE resources and related tools since the previous update manuscript in the Database Issue in 2013. PRIDE Archive constitutes a complete redevelopment of the original PRIDE, comprising a new storage backend, data submission system and web interface, among other components. PRIDE Archive supports the most-widely used PSI (Proteomics Standards Initiative) data standard formats (mzML and mzIdentML) and implements the data requirements and guidelines of the ProteomeXchange Consortium. The wide adoption of ProteomeXchange within the community has triggered an unprecedented increase in the number of submitted data sets (around 150 data sets per month). We outline some statistics on the current PRIDE Archive data contents. We also report on the status of the PRIDE related stand-alone tools: PRIDE Inspector, PRIDE Converter 2 and the ProteomeXchange submission tool. Finally, we will give a brief update on the resources under development 'PRIDE Cluster' and 'PRIDE Proteomes', which provide a complementary view and quality-scored information of the peptide and protein identification data available in PRIDE Archive.


Asunto(s)
Bases de Datos de Proteínas , Espectrometría de Masas , Proteómica , Péptidos/química , Proteínas/química , Proteínas/metabolismo , Programas Informáticos , Interfaz Usuario-Computador
15.
Bioinformatics ; 31(17): 2903-5, 2015 Sep 01.
Artículo en Inglés | MEDLINE | ID: mdl-25910694

RESUMEN

UNLABELLED: The ms-data-core-api is a free, open-source library for developing computational proteomics tools and pipelines. The Application Programming Interface, written in Java, enables rapid tool creation by providing a robust, pluggable programming interface and common data model. The data model is based on controlled vocabularies/ontologies and captures the whole range of data types included in common proteomics experimental workflows, going from spectra to peptide/protein identifications to quantitative results. The library contains readers for three of the most used Proteomics Standards Initiative standard file formats: mzML, mzIdentML, and mzTab. In addition to mzML, it also supports other common mass spectra data formats: dta, ms2, mgf, pkl, apl (text-based), mzXML and mzData (XML-based). Also, it can be used to read PRIDE XML, the original format used by the PRIDE database, one of the world-leading proteomics resources. Finally, we present a set of algorithms and tools whose implementation illustrates the simplicity of developing applications using the library. AVAILABILITY AND IMPLEMENTATION: The software is freely available at https://github.com/PRIDE-Utilities/ms-data-core-api. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online CONTACT: juan@ebi.ac.uk.


Asunto(s)
Algoritmos , Biología Computacional/métodos , Bases de Datos de Proteínas , Espectrometría de Masas/métodos , Proteínas/análisis , Proteómica/métodos , Programas Informáticos , Humanos , Fragmentos de Péptidos/análisis , Flujo de Trabajo
16.
Nucleic Acids Res ; 43(W1): W599-604, 2015 Jul 01.
Artículo en Inglés | MEDLINE | ID: mdl-25904633

RESUMEN

The PRIDE (PRoteomics IDEntifications) database is one of the world-leading public repositories of mass spectrometry (MS)-based proteomics data and it is a founding member of the ProteomeXchange Consortium of proteomics resources. In the original PRIDE database system, users could access data programmatically by accessing the web services provided by the PRIDE BioMart interface. New REST (REpresentational State Transfer) web services have been developed to serve the most popular functionality provided by BioMart (now discontinued due to data scalability issues) and address the data access requirements of the newly developed PRIDE Archive. Using the API (Application Programming Interface) it is now possible to programmatically query for and retrieve peptide and protein identifications, project and assay metadata and the originally submitted files. Searching and filtering is also possible by metadata information, such as sample details (e.g. species and tissues), instrumentation (mass spectrometer), keywords and other provided annotations. The PRIDE Archive web services were first made available in April 2014. The API has already been adopted by a few applications and standalone tools such as PeptideShaker, PRIDE Inspector, the Unipept web application and the Python-based BioServices package. This application is free and open to all users with no login requirement and can be accessed at http://www.ebi.ac.uk/pride/ws/archive/.


Asunto(s)
Bases de Datos de Proteínas , Proteómica , Internet , Espectrometría de Masas , Proteínas/química
17.
Mol Cell Proteomics ; 13(10): 2765-75, 2014 Oct.
Artículo en Inglés | MEDLINE | ID: mdl-24980485

RESUMEN

The HUPO Proteomics Standards Initiative has developed several standardized data formats to facilitate data sharing in mass spectrometry (MS)-based proteomics. These allow researchers to report their complete results in a unified way. However, at present, there is no format to describe the final qualitative and quantitative results for proteomics and metabolomics experiments in a simple tabular format. Many downstream analysis use cases are only concerned with the final results of an experiment and require an easily accessible format, compatible with tools such as Microsoft Excel or R. We developed the mzTab file format for MS-based proteomics and metabolomics results to meet this need. mzTab is intended as a lightweight supplement to the existing standard XML-based file formats (mzML, mzIdentML, mzQuantML), providing a comprehensive summary, similar in concept to the supplemental material of a scientific publication. mzTab files can contain protein, peptide, and small molecule identifications together with experimental metadata and basic quantitative information. The format is not intended to store the complete experimental evidence but provides mechanisms to report results at different levels of detail. These range from a simple summary of the final results to a representation of the results including the experimental design. This format is ideally suited to make MS-based proteomics and metabolomics results available to a wider biological community outside the field of MS. Several software tools for proteomics and metabolomics have already adapted the format as an output format. The comprehensive mzTab specification document and extensive additional documentation can be found online.


Asunto(s)
Bases de Datos de Proteínas , Programas Informáticos , Acceso a la Información , Espectrometría de Masas , Metabolómica , Proteómica , Interfaz Usuario-Computador
18.
Nucleic Acids Res ; 42(Database issue): D358-63, 2014 Jan.
Artículo en Inglés | MEDLINE | ID: mdl-24234451

RESUMEN

IntAct (freely available at http://www.ebi.ac.uk/intact) is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. IntAct has developed a sophisticated web-based curation tool, capable of supporting both IMEx- and MIMIx-level curation. This tool is now utilized by multiple additional curation teams, all of whom annotate data directly into the IntAct database. Members of the IntAct team supply appropriate levels of training, perform quality control on entries and take responsibility for long-term data maintenance. Recently, the MINT and IntAct databases decided to merge their separate efforts to make optimal use of limited developer resources and maximize the curation output. All data manually curated by the MINT curators have been moved into the IntAct database at EMBL-EBI and are merged with the existing IntAct dataset. Both IntAct and MINT are active contributors to the IMEx consortium (http://www.imexconsortium.org).


Asunto(s)
Bases de Datos de Proteínas , Mapeo de Interacción de Proteínas , Internet , Programas Informáticos
19.
Nucleic Acids Res ; 41(Web Server issue): W601-6, 2013 Jul.
Artículo en Inglés | MEDLINE | ID: mdl-23671334

RESUMEN

The Proteomics Standard Initiative Common QUery InterfaCe (PSICQUIC) specification was created by the Human Proteome Organization Proteomics Standards Initiative (HUPO-PSI) to enable computational access to molecular-interaction data resources by means of a standard Web Service and query language. Currently providing >150 million binary interaction evidences from 28 servers globally, the PSICQUIC interface allows the concurrent search of multiple molecular-interaction information resources using a single query. Here, we present an extension of the PSICQUIC specification (version 1.3), which has been released to be compliant with the enhanced standards in molecular interactions. The new release also includes a new reference implementation of the PSICQUIC server available to the data providers. It offers augmented web service capabilities and improves the user experience. PSICQUIC has been running for almost 5 years, with a user base growing from only 4 data providers to 28 (April 2013) allowing access to 151 310 109 binary interactions. The power of this web service is shown in PSICQUIC View web application, an example of how to simultaneously query, browse and download results from the different PSICQUIC servers. This application is free and open to all users with no login requirement (http://www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml).


Asunto(s)
Proteómica/normas , Programas Informáticos , Internet
20.
Bioinformatics ; 29(8): 1103-4, 2013 Apr 15.
Artículo en Inglés | MEDLINE | ID: mdl-23435069

RESUMEN

SUMMARY: BioJS is an open-source project whose main objective is the visualization of biological data in JavaScript. BioJS provides an easy-to-use consistent framework for bioinformatics application programmers. It follows a community-driven standard specification that includes a collection of components purposely designed to require a very simple configuration and installation. In addition to the programming framework, BioJS provides a centralized repository of components available for reutilization by the bioinformatics community. AVAILABILITY AND IMPLEMENTATION: http://code.google.com/p/biojs/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Asunto(s)
Gráficos por Computador , Programas Informáticos , Lenguajes de Programación
SELECCIÓN DE REFERENCIAS
DETALLE DE LA BÚSQUEDA
...