Results 1 - 20 of 42
2.
J Proteome Res ; 23(1): 418-429, 2024 01 05.
Article in English | MEDLINE | ID: mdl-38038272

ABSTRACT

The inherent diversity of approaches in proteomics research has led to a wide range of software solutions for data analysis. These software solutions encompass multiple tools, each employing different algorithms for various tasks such as peptide-spectrum matching, protein inference, quantification, statistical analysis, and visualization. To enable an unbiased comparison of commonly used bottom-up label-free proteomics workflows, we introduce WOMBAT-P, a versatile platform designed for automated benchmarking and comparison. WOMBAT-P simplifies the processing of public data by utilizing the sample and data relationship format for proteomics (SDRF-Proteomics) as input. This feature streamlines the analysis of annotated local or public ProteomeXchange data sets, promoting efficient comparisons among diverse outputs. Through an evaluation using experimental ground truth data and a realistic biological data set, we uncover significant disparities and a limited overlap in the quantified proteins. WOMBAT-P not only enables rapid execution and seamless comparison of workflows but also provides valuable insights into the capabilities of different software solutions. These benchmarking metrics are a valuable resource for researchers in selecting the most suitable workflow for their specific data sets. The modular architecture of WOMBAT-P promotes extensibility and customization. The software is available at https://github.com/wombat-p/WOMBAT-Pipelines.


Subject(s)
Benchmarking , Proteomics , Workflow , Software , Proteins , Data Analysis
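
The "limited overlap in the quantified proteins" reported in the abstract above can be made concrete with a pairwise set comparison. The following is a minimal sketch, assuming each workflow's output has already been reduced to a set of protein accessions; the workflow names, the accessions, and the choice of the Jaccard index are illustrative assumptions, not the metrics WOMBAT-P itself reports.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: |A & B| / |A | B| (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_overlap(results):
    """Map each pair of workflow names to the Jaccard index of their protein sets."""
    return {
        (x, y): jaccard(results[x], results[y])
        for x, y in combinations(sorted(results), 2)
    }


# Hypothetical workflow names and protein accessions, for illustration only.
quantified = {
    "workflow_a": {"P1", "P2", "P3"},
    "workflow_b": {"P2", "P3", "P4"},
}

for pair, score in pairwise_overlap(quantified).items():
    print(pair, round(score, 3))
```

A low index for a pair means the two workflows quantify largely different proteins from the same data, which is the kind of disparity the abstract describes.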
3.
Biomolecules ; 13(3)2023 03 07.
Article in English | MEDLINE | ID: mdl-36979426

ABSTRACT

Proteomic studies using mass spectrometry (MS)-based quantification are a main approach to the discovery of new biomarkers. However, a number of analytical conditions before and during MS data acquisition can affect the accuracy of the obtained outcome. Therefore, comprehensive quality assessment of the acquired data plays a central role in quantitative proteomics, though, due to the immense complexity of MS data, it is often neglected. Here, we practically address the quality assessment of quantitative MS data, describing key steps for the evaluation, including the levels of raw data, identification and quantification. With this, four independent datasets from cerebrospinal fluid, an important biofluid for neurodegenerative disease biomarker studies, were assessed, demonstrating that sample processing-based differences are already reflected at all three levels but with varying impacts on the quality of the quantitative data. Specifically, we provide guidance to critically interpret the quality of MS data for quantitative proteomics. Moreover, we provide the free and open source quality control tool MaCProQC, enabling systematic, rapid and uncomplicated data comparison of raw data, identification and feature detection levels through defined quality metrics and a step-by-step quality control workflow.


Subject(s)
Neurodegenerative Diseases , Tandem Mass Spectrometry , Humans , Tandem Mass Spectrometry/methods , Proteome/analysis , Proteomics/methods , Biomarkers/analysis , Quality Control
4.
J Proteome Res ; 22(3): 681-696, 2023 03 03.
Article in English | MEDLINE | ID: mdl-36744821

ABSTRACT

In recent years machine learning has made extensive progress in modeling many aspects of mass spectrometry data. We brought together proteomics data generators, repository managers, and machine learning experts in a workshop with the goals to evaluate and explore machine learning applications for realistic modeling of data from multidimensional mass spectrometry-based proteomics analysis of any sample or organism. Following this sample-to-data roadmap helped identify knowledge gaps and define needs. Being able to generate bespoke and realistic synthetic data has legitimate and important uses in system suitability, method development, and algorithm benchmarking, while also posing critical ethical questions. The interdisciplinary nature of the workshop informed discussions of what is currently possible and future opportunities and challenges. In the following perspective we summarize these discussions in the hope of conveying our excitement about the potential of machine learning in proteomics and to inspire future research.


Subject(s)
Machine Learning , Proteomics , Proteomics/methods , Algorithms , Mass Spectrometry
5.
Neuropathol Appl Neurobiol ; 49(1): e12853, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36180966

ABSTRACT

AIMS: Target skeletal muscle fibres - defined by different concentric areas in oxidative enzyme staining - can occur in patients with neurogenic muscular atrophy. Here, we used our established hypothesis-free proteomic approach with the aim of deciphering the protein composition of targets. We also searched for potential novel interactions between target proteins. METHODS: Targets and control areas were laser microdissected from skeletal muscle sections of 20 patients with neurogenic muscular atrophy. Samples were analysed by a highly sensitive mass spectrometry approach, enabling relative protein quantification. The results were validated by immunofluorescence studies. Protein interactions were investigated by yeast two-hybrid assays, coimmunoprecipitation experiments and bimolecular fluorescence complementation. RESULTS: More than 1000 proteins were identified. Among these, 55 proteins were significantly over-represented and 40 proteins were significantly under-represented in targets compared to intraindividual control samples. The majority of over-represented proteins were associated with the myofibrillar Z-disc and actin dynamics, followed by myosin and myosin-associated proteins, proteins involved in protein biosynthesis and chaperones. Under-represented proteins were mainly mitochondrial proteins. Functional studies revealed that the LIM domain of the over-represented protein LIMCH1 interacts with isoform A of Xin actin-binding repeat-containing protein 1 (XinA). CONCLUSIONS: In particular, proteins involved in myofibrillogenesis are over-represented in target structures, which indicates an ongoing process of sarcomere assembly and/or remodelling within this specific area of the muscle fibres. We speculate that target structures are the result of reinnervation processes in which filamin C-associated myofibrillogenesis is tightly regulated by the BAG3-associated protein quality system.


Subject(s)
Peripheral Nervous System Diseases , Humans , Peripheral Nervous System Diseases/metabolism , Actins/analysis , Actins/metabolism , Proteomics , Muscle Proteins/metabolism , Muscle Fibers, Skeletal/chemistry , Muscle Fibers, Skeletal/metabolism , Muscle, Skeletal/metabolism , Muscular Atrophy/metabolism , Adaptor Proteins, Signal Transducing/metabolism , Apoptosis Regulatory Proteins/analysis , Apoptosis Regulatory Proteins/metabolism
6.
Int J Mol Sci ; 23(19)2022 Sep 24.
Article in English | MEDLINE | ID: mdl-36232544

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is a major risk factor for the development of lung adenocarcinoma (AC). AC often develops on underlying COPD; thus, the differentiation of both entities by biomarkers is challenging. Although survival of AC patients strongly depends on early diagnosis, a biomarker panel for AC detection and differentiation from COPD is still missing. Plasma samples from 176 patients with AC with or without underlying COPD, COPD patients, and hospital controls were analyzed using mass-spectrometry-based proteomics. We performed univariate statistics and additionally evaluated machine learning algorithms regarding the differentiation of AC vs. COPD and AC with COPD vs. COPD. Univariate statistics revealed proteins that were significantly regulated between the patient groups. Furthermore, random forest classification yielded the best performance for differentiation of AC vs. COPD (area under the curve (AUC) 0.935) and AC with COPD vs. COPD (AUC 0.916). The most influential proteins were identified by permutation feature importance and compared to those identified by univariate testing. We demonstrate the great potential of machine learning for differentiation of highly similar disease entities and present a panel of biomarker candidates that should be considered for the development of a future biomarker panel.


Subject(s)
Adenocarcinoma of Lung , Lung Neoplasms , Pulmonary Disease, Chronic Obstructive , Biomarkers , Humans , Lung Neoplasms/diagnosis , Lung Neoplasms/pathology , Proteomics , Pulmonary Disease, Chronic Obstructive/pathology
7.
PLoS One ; 17(10): e0276401, 2022.
Article in English | MEDLINE | ID: mdl-36269744

ABSTRACT

In bottom-up proteomics, proteins are enzymatically digested into peptides before measurement with mass spectrometry. The relationship between proteins and their corresponding peptides can be represented by bipartite graphs. We conduct a comprehensive analysis of bipartite graphs using quantified peptides from measured data sets as well as theoretical peptides from an in silico digestion of the corresponding complete taxonomic protein sequence databases. The aim of this study is to characterize and structure the different types of graphs that occur and to compare them between data sets. We observed a large influence of the accepted minimum peptide length during in silico digestion. When changing from theoretical peptides to measured ones, the graph structures are subject to two opposite effects. On the one hand, the graphs based on measured peptides are on average smaller and less complex compared to graphs using theoretical peptides. On the other hand, the proportion of protein nodes without unique peptides, which are a complicated case for protein inference and quantification, is considerably larger for measured data. Additionally, the proportion of graphs containing at least one protein node without unique peptides rises when going from database to quantitative level. The fraction of shared peptides and proteins without unique peptides as well as the complexity and size of the graphs highly depends on the data set and organism. Large differences between the structures of bipartite peptide-protein graphs have been observed between database and quantitative level as well as between analyzed species. In the analyzed measured data sets, the proportion of protein nodes without unique peptides ranged from 6.4% to 55.0%. This highlights the need for novel methods that can quantify proteins without unique peptides. The knowledge about the structure of the bipartite peptide-protein graphs gained in this study will be useful for the development of such algorithms.


Subject(s)
Peptides , Proteins , Proteins/chemistry , Peptides/chemistry , Databases, Protein , Proteomics/methods , Mass Spectrometry/methods
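
The bipartite peptide-protein graphs described in the abstract above can be sketched with plain dictionaries. This is an illustrative example only, not the authors' code: the function names and the toy accessions are assumptions, and the input is assumed to be a mapping from each peptide to the set of proteins containing it.

```python
from collections import defaultdict


def invert(peptide_to_proteins):
    """Build the protein -> peptides side of the bipartite graph."""
    protein_to_peptides = defaultdict(set)
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_to_peptides[protein].add(peptide)
    return dict(protein_to_peptides)


def proteins_without_unique_peptides(peptide_to_proteins):
    """Proteins whose peptides are all shared with at least one other protein."""
    protein_to_peptides = invert(peptide_to_proteins)
    return {
        protein
        for protein, peptides in protein_to_peptides.items()
        if all(len(peptide_to_proteins[p]) > 1 for p in peptides)
    }


def connected_components(peptide_to_proteins):
    """Split the graph into components; each is returned as a set of proteins."""
    protein_to_peptides = invert(peptide_to_proteins)
    seen, components = set(), []
    for start in protein_to_peptides:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            protein = stack.pop()
            if protein in seen:
                continue
            seen.add(protein)
            component.add(protein)
            # Follow every peptide of this protein to its other proteins.
            for peptide in protein_to_peptides[protein]:
                stack.extend(peptide_to_proteins[peptide] - seen)
        components.append(component)
    return components


# Toy data (hypothetical peptides and accessions).
toy = {
    "PEPA": {"P1"},
    "PEPB": {"P1", "P2"},
    "PEPC": {"P2", "P3"},
    "PEPD": {"P3"},
    "PEPE": {"P4"},
}
shared_only = proteins_without_unique_peptides(toy)
print(shared_only, len(shared_only) / len(invert(toy)))
```

In the toy data, P2 is the only protein node without a unique peptide, so the proportion that the abstract reports per data set would be one in four here.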
8.
Data Brief ; 43: 108435, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35845101

ABSTRACT

In this article, we present a data dependent acquisition (DDA) dataset which was generated as a reference and ground truth quantitative dataset. While initially used to compare samples measured with DDA and data independent acquisition (DIA) (Barkovits et al., 2020), the presented dataset holds potential value as a benchmark reference for any workflows working on DDA data. The entire dataset consists of 15 LC-MS/MS measurements composed of five distinct spike-in-states, each with three replicates. To generate the data set, a C2C12 (immortalized mouse myoblast) cell lysate was used as a complex background for five different states which were simulated by spiking 13 defined proteins at different concentrations. For this purpose, the cell lysate was used in a constant amount of 20 µg for all samples and different amounts of the 13 selected proteins ranging from 0.1 to 10 pmol were added, reflecting physiological amounts of proteins. Afterwards, all samples were tryptically digested using the same method. From each sample 200 ng tryptic peptides were measured in triplicate on a Q Exactive HF (Thermo Fisher Scientific). The mass range for MS1 was set to 350-1400 m/z with a resolution of 60,000 at 200 m/z. HCD fragmentation of the Top10 abundant precursor ions was performed at 27% NCE. The fragment analysis (MS2) was performed with a resolution of 30,000 at 200 m/z. In addition to the raw files, the dataset contains centroided mzML files and spectrum identification results for peptide identifications performed by Mascot (Perkins et al., 1999), MS-GF+ (Kim et al., 2010) and X!Tandem (Craig and Beavis, 2004) for each separate MS analysis. The corresponding FASTA containing protein sequences as well as a combination of all identification runs performed by PIA (Uszkoreit et al., 2019, 2015) and a peptide and protein quantification performed by OpenMS (Pfeuffer et al., 2017) is included. All data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository (Perez-Riverol et al., 2018) with the dataset identifier PXD012986.

9.
Nat Commun ; 12(1): 5854, 2021 10 06.
Article in English | MEDLINE | ID: mdl-34615866

ABSTRACT

The amount of public proteomics data is rapidly increasing but there is no standardized format to describe the sample metadata and their relationship with the dataset files in a way that fully supports their understanding or reanalysis. Here we propose to develop the transcriptomics data format MAGE-TAB into a standard representation for proteomics sample metadata. We implement MAGE-TAB-Proteomics in a crowdsourcing project to manually curate over 200 public datasets. We also describe tools and libraries to validate and submit sample metadata-related information to the PRIDE repository. We expect that these developments will improve the reproducibility and facilitate the reanalysis and integration of public proteomics datasets.


Subject(s)
Data Analysis , Databases, Protein , Metadata , Proteomics , Big Data , Humans , Reproducibility of Results , Software , Transcriptome
10.
Methods Mol Biol ; 2228: 307-325, 2021.
Article in English | MEDLINE | ID: mdl-33950500

ABSTRACT

Data-independent acquisition (DIA) has recently developed as a powerful tool to enhance the quantification of peptides and proteins within a variety of sample types, by overcoming the stochastic nature of classical data-dependent approaches, as well as by enabling the identification of all peptides detected in a mass spectrometric event. Here, we describe a workflow for the establishment of a sample-fitting DIA method using Spectronaut Pulsar X (Biognosys, Switzerland).


Subject(s)
Proteins/analysis , Proteome , Proteomics , Spectrometry, Mass, Electrospray Ionization , Animals , Chromatography, High Pressure Liquid , Humans , Research Design , Workflow
11.
Rapid Commun Mass Spectrom ; : e9087, 2021 Apr 16.
Article in English | MEDLINE | ID: mdl-33861485

ABSTRACT

The European Bioinformatics Community for Mass Spectrometry (EuBIC-MS; eubic-ms.org) was founded in 2014 to unite European computational mass spectrometry researchers and proteomics bioinformaticians working in academia and industry. EuBIC-MS maintains educational resources (proteomics-academy.org) and organises workshops at national and international conferences on proteomics and mass spectrometry. Furthermore, EuBIC-MS is actively involved in several community initiatives such as the Human Proteome Organization's Proteomics Standards Initiative (HUPO-PSI). Apart from these collaborations, EuBIC-MS has organised two Winter Schools and two Developers' Meetings that have contributed to the strengthening of the European mass spectrometry network and fostered international collaboration in this field, even beyond Europe. Moreover, EuBIC-MS is currently actively developing a community-driven standard dedicated to mass spectrometry data annotation (SDRF-Proteomics) that will facilitate data reuse and collaboration. This manuscript highlights what EuBIC-MS is, what it does, and what it already has achieved. A warm invitation is extended to new researchers at all career stages to join the EuBIC-MS community on its Slack channel (eubic.slack.com).

12.
J Proteome Res ; 20(4): 2145-2150, 2021 04 02.
Article in English | MEDLINE | ID: mdl-33724838

ABSTRACT

Protein sequence databases play a crucial role in the majority of the currently applied mass-spectrometry-based proteomics workflows. Here UniProtKB serves as one of the major sources, as it combines the information of several smaller databases and enriches the entries with additional biological information. For the identification of peptides in a sample by tandem mass spectra, as generated by data-dependent acquisition, protein sequence databases provide the basis for most spectrum identification search engines. In addition, for targeted proteomics approaches like selected reaction monitoring (SRM) and parallel reaction monitoring (PRM), knowledge of the peptide sequences, their masses, and whether they are unique for a protein is essential. Because most bottom-up proteomics approaches use trypsin to cleave the proteins in a sample, the tryptic peptides contained in a protein database are of great interest. We present a database, called MaCPepDB (mass-centric peptide database), that consists of the complete tryptic digest of the Swiss-Prot and TrEMBL parts of UniProtKB. This database is especially designed to not only allow queries of peptide sequences and return the respective information about connected proteins and thus whether a peptide is unique but also allow queries of specific masses of peptides or precursors of MS/MS spectra. Furthermore, posttranslational modifications can be considered in a query as well as different mass deviations for posttranslational modifications. Hence the database can be used by a sequence query not only to, for example, check in which proteins of the UniProt database a tryptic peptide can be found but also to find possibly interfering peptides in PRM/SRM experiments using the mass query. The complete database currently contains 5 939 244 990 peptides from 185 561 610 proteins (UniProt version 2020_03), for which a single query usually takes less than 1 s. For easy exploration of the data, a web interface was developed. A REST application programming interface (API) for programmatic and workflow access is also available at https://macpepdb.mpc.rub.de.


Subject(s)
Peptides , Tandem Mass Spectrometry , Databases, Protein , Proteins , Proteomics
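
The two kinds of query described in the abstract above (by sequence and by mass) can be illustrated on a tiny in-memory digest. This is a rough sketch under stated assumptions, not MaCPepDB's implementation and not its REST API: it assumes the common trypsin rule (cleave after K or R unless the next residue is P), ignores missed cleavages and modifications, and uses standard monoisotopic residue masses; all function names and the two toy sequences are hypothetical.

```python
from collections import defaultdict

# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def tryptic_digest(sequence):
    """Cleave after K/R unless followed by P; no missed cleavages."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        cleave = residue in "KR" and not is_last and sequence[i + 1] != "P"
        if cleave or is_last:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of the unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def build_index(proteins):
    """Map every tryptic peptide to the set of proteins it occurs in."""
    index = defaultdict(set)
    for accession, sequence in proteins.items():
        for peptide in tryptic_digest(sequence):
            index[peptide].add(accession)
    return dict(index)


def query_by_mass(index, mass, tolerance_ppm=5.0):
    """Peptides within a ppm window of the query mass, with their proteins."""
    window = mass * tolerance_ppm / 1e6
    return {
        peptide: proteins
        for peptide, proteins in index.items()
        if abs(peptide_mass(peptide) - mass) <= window
    }


# Hypothetical accessions and sequences, for illustration only.
index = build_index({"PROT1": "AKRPGKC", "PROT2": "AKGGR"})
print(index["AK"])                      # sequence query: shared, so not unique
print(query_by_mass(index, 217.14264))  # mass query: finds the peptide AK
```

A peptide is unique in this sketch when its entry in `index` contains exactly one accession; a mass query that returns several peptides corresponds to the potentially interfering peptides the abstract mentions for PRM/SRM experiments.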
13.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33589928

ABSTRACT

This article describes some use case studies and self-assessments of FAIR status of de.NBI services to illustrate the challenges and requirements for the definition of the needs of adhering to the FAIR (findable, accessible, interoperable and reusable) data principles in a large distributed bioinformatics infrastructure. We address the challenge of heterogeneity of wet lab technologies, data, metadata, software, computational workflows and the levels of implementation and monitoring of FAIR principles within the different bioinformatics sub-disciplines joined in de.NBI. On the one hand, this broad service landscape and the excellent network of experts are a strong basis for the development of useful research data management plans. On the other hand, the large number of tools and techniques maintained by distributed teams renders FAIR compliance challenging.


Subject(s)
Data Management/methods , Metadata , Neural Networks, Computer , Proteomics/methods , Software , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , International Cooperation , Phenotype , Plants/genetics , Proteome , Self-Assessment (Psychology) , Workflow
14.
Acta Neuropathol Commun ; 8(1): 154, 2020 09 04.
Article in English | MEDLINE | ID: mdl-32887649

ABSTRACT

Filamin C (FLNc) is mainly expressed in striated muscle cells where it localizes to Z-discs, myotendinous junctions and intercalated discs. Recent studies have revealed numerous mutations in the FLNC gene causing familial and sporadic myopathies and cardiomyopathies with marked clinical variability. The most frequent myopathic mutation, p.W2710X, which is associated with myofibrillar myopathy, deletes the carboxy-terminal 16 amino acids from FLNc and abolishes the dimerization property of Ig-like domain 24. We previously characterized "knock-in" mice heterozygous for this mutation (p.W2711X), and have now investigated homozygous mice using protein and mRNA expression analyses, mass spectrometry, and extensive immunolocalization and ultrastructural studies. Although the latter mice display a relatively mild myopathy under normal conditions, our analyses identified major mechanisms causing the pathophysiology of this disease: in comparison to wildtype animals (i) the expression level of FLNc protein is drastically reduced; (ii) mutant FLNc is relocalized from Z-discs to particularly mechanically strained parts of muscle cells, i.e. myotendinous junctions and myofibrillar lesions; (iii) the number of lesions is greatly increased and these lesions lack Bcl2-associated athanogene 3 (BAG3) protein; (iv) the expression of heat shock protein beta-7 (HSPB7) is almost completely abolished. These findings indicate grave disturbances of BAG3-dependent and -independent autophagy pathways that are required for efficient lesion repair. In addition, our studies reveal general mechanisms of lesion formation and demonstrate that defective FLNc dimerization via its carboxy-terminal domain does not disturb assembly and basic function of myofibrils. An alternative, more amino-terminally located dimerization site might compensate for that loss. Since filamins function as stress sensors, our data further substantiate that FLNc is important for mechanosensing in the context of Z-disc stabilization and maintenance.


Subject(s)
Filamins/genetics , Myopathies, Structural, Congenital/genetics , Myopathies, Structural, Congenital/pathology , Sarcomeres/pathology , Animals , Gene Knock-In Techniques , Homozygote , Mice , Mutation , Myopathies, Structural, Congenital/metabolism , Sarcomeres/metabolism
15.
Mucosal Immunol ; 13(4): 702-714, 2020 07.
Article in English | MEDLINE | ID: mdl-32112048

ABSTRACT

The urothelium of the urinary bladder represents the first line of defense. However, uropathogenic E. coli (UPEC) damage the urothelium and cause acute bacterial infection. Here, we demonstrate the crosstalk between macrophages and the urothelium stimulating macrophage migration into the urothelium. Using spatial proteomics by MALDI-MSI and LC-MS/MS, a novel algorithm revealed the spatial activation and migration of macrophages. Analysis of the spatial proteome unravelled the coexpression of Myo9b and F4/80 in the infected urothelium, indicating that macrophages have entered the urothelium upon infection. Immunofluorescence microscopy additionally indicated that intraurothelial macrophages phagocytosed UPEC and eliminated neutrophils. Further analysis of the spatial proteome by MALDI-MSI showed strong expression of IL-6 in the urothelium and local inhibition of this molecule reduced macrophage migration into the urothelium and aggravated the infection. After IL-6 inhibition, the expression of matrix metalloproteinases and chemokines, such as CX3CL1 was reduced in the urothelium. Accordingly, macrophage migration into the urothelium was diminished in the absence of CX3CL1 signaling in Cx3cr1gfp/gfp mice. Conclusively, this study describes the crosstalk between the infected urothelium and macrophages through IL-6-induced CX3CL1 expression. Such crosstalk facilitates the relocation of macrophages into the urothelium and reduces bacterial burden in the urinary bladder.


Subject(s)
Cell Communication , Chemokine CX3CL1/metabolism , Interleukin-6/metabolism , Macrophages/metabolism , Proteomics , Urothelium/immunology , Urothelium/metabolism , Animals , Disease Models, Animal , Disease Susceptibility , Fluorescent Antibody Technique , Immunohistochemistry , Macrophages/immunology , Mice , Proteomics/methods , Urinary Bladder/immunology , Urinary Bladder/metabolism , Urinary Bladder/microbiology , Urinary Tract Infections/etiology , Urinary Tract Infections/metabolism , Urinary Tract Infections/pathology , Urothelium/microbiology
16.
Mol Cell Proteomics ; 19(1): 181-197, 2020 01.
Article in English | MEDLINE | ID: mdl-31699904

ABSTRACT

Currently data-dependent acquisition (DDA) is the method of choice for mass spectrometry-based proteomics discovery experiments, but data-independent acquisition (DIA) is steadily becoming more important. One of the most important requirements to perform a DIA analysis is the availability of suitable spectral libraries for peptide identification and quantification. Several studies were performed addressing the evaluation of spectral library performance for protein identification in DIA measurements. But so far only few experiments estimate the effect of these libraries on the quantitative level. In this work we created a gold standard spike-in sample set with known contents and ratios of proteins in a complex protein matrix that allowed a detailed comparison of DIA quantification data obtained with different spectral library approaches. We used in-house generated sample-specific spectral libraries created using varying sample preparation approaches and repeated DDA measurement. In addition, two different search engines were tested for protein identification from DDA data and subsequent library generation. In total, eight different spectral libraries were generated, and the quantification results compared with a library free method, as well as a default DDA analysis. Not only the number of identifications on peptide and protein level in the spectral libraries and the corresponding DIA analysis results was inspected, but also the number of expected and identified differentially abundant protein groups and their ratios. We found that, while libraries of prefractionated samples were generally larger, there was no significant increase in DIA identifications compared with repetitive non-fractionated measurements. Furthermore, we show that the accuracy of the quantification is strongly dependent on the applied spectral library and whether the quantification is based on peptide or protein level. Overall, the reproducibility and accuracy of DIA quantification is superior to DDA in all applied approaches. Data has been deposited to the ProteomeXchange repository with identifiers PXD012986, PXD012987, PXD012988 and PXD014956.


Subject(s)
Data Accuracy , Peptide Library , Proteome/analysis , Proteomics/methods , Animals , Cell Line , Chromatography, Liquid/methods , Databases, Protein , Mice , Myoblasts/metabolism , Peptides/analysis , Proteins/analysis , Reproducibility of Results , Sensitivity and Specificity , Sequence Analysis, Protein , Software , Tandem Mass Spectrometry/methods
17.
J Proteome Res ; 18(2): 741-747, 2019 02 01.
Article in English | MEDLINE | ID: mdl-30474983

ABSTRACT

Proteomics using LC-MS/MS has become one of the main methods to analyze the proteins in biological samples in high-throughput. But the existing mass-spectrometry instruments are still limited with respect to resolution and measurable mass ranges, which is one of the main reasons why shotgun proteomics is the major approach. Here proteins are digested, which leads to the identification and quantification of peptides instead. While often neglected, the important step of protein inference needs to be conducted to infer from the identified peptides to the actual proteins in the original sample. In this work, we highlight some of the previously published and newly added features of the tool PIA - Protein Inference Algorithms, which helps the user with the protein inference of measured samples. We also highlight the importance of the usage of PSI standard file formats, as PIA is the only current software supporting all available standards used for spectrum identification and protein inference. Additionally, we briefly describe the benefits of working with workflow environments for proteomics analyses and show the new features of the PIA nodes for the KNIME Analytics Platform. Finally, we benchmark PIA against a recently published data set for isoform detection. PIA is open source and available for download on GitHub ( https://github.com/mpc-bioinformatics/pia ) or directly via the community extensions inside the KNIME analytics platform.


Subject(s)
Computational Biology/methods , Peptides/analysis , Proteomics/methods , Software , Workflow , Algorithms , Benchmarking , Chromatography, Liquid , Protein Isoforms , Tandem Mass Spectrometry
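
The protein inference step named in the abstract above, going from identified peptides back to proteins, can be illustrated with a simple parsimony heuristic. The sketch below is not PIA's implementation; it is a generic greedy set cover, assuming the input maps each candidate protein to the set of its identified peptides, and all names and data in it are hypothetical.

```python
from collections import defaultdict


def group_indistinguishable(protein_to_peptides):
    """Merge proteins with identical peptide sets into one protein group."""
    by_peptides = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        by_peptides[frozenset(peptides)].append(protein)
    return {
        tuple(sorted(members)): set(peptides)
        for peptides, members in by_peptides.items()
    }


def greedy_parsimony(protein_to_peptides):
    """Pick protein groups until every identified peptide is explained."""
    groups = group_indistinguishable(protein_to_peptides)
    uncovered = set().union(*groups.values())
    selected = []
    while uncovered:
        # The group explaining the most still-unexplained peptides wins;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(groups), key=lambda g: len(groups[g] & uncovered))
        selected.append(best)
        uncovered -= groups[best]
    return selected


# Hypothetical identifications, for illustration only.
identified = {
    "P1": {"pepA", "pepB"},
    "P2": {"pepB"},   # fully explained by P1, so it is not reported
    "P3": {"pepC"},
    "P4": {"pepC"},   # indistinguishable from P3, reported as one group
}
print(greedy_parsimony(identified))
```

Greedy set cover is only one of several possible inference strategies and does not guarantee a minimal protein list; it is used here because it shows in a few lines why shared peptides make the inference step non-trivial.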
18.
Nucleic Acids Res ; 47(D1): D442-D450, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30395289

ABSTRACT

The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world's largest data repository of mass spectrometry-based proteomics data, and is one of the founding members of the global ProteomeXchange (PX) consortium. In this manuscript, we summarize the developments in PRIDE resources and related tools since the previous update manuscript was published in Nucleic Acids Research in 2016. In the last 3 years, public data sharing through PRIDE (as part of PX) has definitely become the norm in the field. In parallel, data re-use of public proteomics data has increased enormously, with multiple applications. We first describe the new architecture of PRIDE Archive, the archival component of PRIDE. PRIDE Archive and the related data submission framework have been further developed to support the increase in submitted data volumes and additional data types. A new scalable and fault tolerant storage backend, Application Programming Interface and web interface have been implemented, as a part of an ongoing process. Additionally, we emphasize the improved support for quantitative proteomics data through the mzTab format. At last, we outline key statistics on the current data contents and volume of downloads, and how PRIDE data are starting to be disseminated to added-value resources including Ensembl, UniProt and Expression Atlas.


Subject(s)
Databases, Protein , Mass Spectrometry , Proteomics , Peptides/chemistry , Software
19.
EuPA Open Proteom ; 22-23: 4-7, 2019 Mar.
Article in English | MEDLINE | ID: mdl-31890545

ABSTRACT

The 2019 European Bioinformatics Community (EuBIC) Winter School was held from January 15th to January 18th 2019 in Zakopane, Poland. This year's meeting was the third of its kind and gathered international researchers in the field of (computational) proteomics to discuss (mainly) challenges in proteomics quantification and data independent acquisition (DIA). Here, we present an overview of the scientific program of the 2019 EuBIC Winter School. Furthermore, we can already give a small outlook to the upcoming EuBIC 2020 Developer's Meeting.

20.
EuPA Open Proteom ; 22-23: 19-21, 2019 Mar.
Article in English | MEDLINE | ID: mdl-31890549

ABSTRACT

In a common proteomics analysis today, the origins of the sample in the vial are known, and therefore a database-dependent approach can be used to identify the peptides it contains. The first YPIC challenge, though, provided us with 19 synthetic peptides, which together formed an English sentence. For the identification of these peptides, a de-novo approach was used, which, together with an internet search engine, led us to the hidden sentence. But having only the sentence was not sufficient for us; we also wanted to identify as many as possible of the spectra in our data. Therefore, we created and refined a database approach from the de-novo method and finally could identify the peptide-sentence with a good overlap.
