Results 1 - 20 of 23
1.
Proteomics ; 24(8): e2300112, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37672792

ABSTRACT

Machine learning (ML) and deep learning (DL) models for peptide property prediction, such as Prosit, have enabled the creation of high-quality in silico reference libraries. These libraries are used in various applications, ranging from data-independent acquisition (DIA) data analysis to data-driven rescoring of search engine results. Here, we present Oktoberfest, an open-source Python package of our spectral library generation and rescoring pipeline, originally available online only via ProteomicsDB. Oktoberfest is largely search engine agnostic and provides access to online peptide property predictions, promoting the adoption of state-of-the-art ML/DL models in proteomics analysis pipelines. We demonstrate its ability to reproduce and even improve our results from previously published rescoring analyses on two distinct use cases. Oktoberfest is freely available on GitHub (https://github.com/wilhelm-lab/oktoberfest) and can easily be installed locally through the cross-platform PyPI Python package.
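Rescoring with predicted spectra hinges on similarity features between predicted and observed fragment intensities; the normalized spectral contrast angle is a metric commonly used with Prosit-style predictions. A minimal, dependency-free sketch of the idea (illustrative only, not Oktoberfest's API):

```python
import math

def spectral_angle(predicted, observed):
    """Normalized spectral contrast angle between two intensity vectors.

    Returns 1.0 for identical (up to scaling) spectra and 0.0 for
    orthogonal ones. Both inputs are equal-length intensity lists.
    """
    norm_p = math.sqrt(sum(x * x for x in predicted))
    norm_o = math.sqrt(sum(x * x for x in observed))
    if norm_p == 0 or norm_o == 0:
        return 0.0
    cosine = sum(p * o for p, o in zip(predicted, observed)) / (norm_p * norm_o)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding drift
    return 1.0 - 2.0 * math.acos(cosine) / math.pi

# Identical fragment intensity patterns score close to 1.0,
# orthogonal ones score 0.0.
print(spectral_angle([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(spectral_angle([1.0, 0.0], [0.0, 1.0]))
```

Features like this, computed for each candidate peptide-spectrum match, are what a rescoring engine feeds into its classifier.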


Subjects
Proteomics , Software , Proteomics/methods , Peptides , Algorithms
2.
Proteomics ; : e2400078, 2024 Jun 02.
Article in English | MEDLINE | ID: mdl-38824665

ABSTRACT

The human gut microbiome plays a vital role in preserving individual health and is intricately involved in essential functions. Imbalances or dysbiosis within the microbiome can significantly impact human health and are associated with many diseases. Several metaproteomics platforms are currently available to study microbial proteins within complex microbial communities. In this study, we attempted to develop an integrated pipeline to provide deeper insights into both the taxonomic and functional aspects of the cultivated human gut microbiomes derived from clinical colon biopsies. We combined a rapid peptide search by MSFragger against the Unified Human Gastrointestinal Protein database and the taxonomic and functional analyses with Unipept Desktop and MetaLab-MAG. Across seven samples, we identified and matched nearly 36,000 unique peptides to approximately 300 species and 11 phyla. Unipept Desktop provided gene ontology, InterPro entries, and enzyme commission number annotations, facilitating the identification of relevant metabolic pathways. MetaLab-MAG contributed functional annotations through Clusters of Orthologous Genes and Non-supervised Orthologous Groups categories. These results unveiled functional similarities and differences among the samples. This integrated pipeline holds the potential to provide deeper insights into the taxonomy and functions of the human gut microbiome for interrogating the intricate connections between microbiome balance and diseases.
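Unipept's taxonomic analysis rests on assigning each peptide to the lowest common ancestor (LCA) of all taxa whose proteins could have produced it. A toy sketch of the LCA idea (the lineages shown are simplified and hypothetical):

```python
def lowest_common_ancestor(lineages):
    """Return the most specific taxon shared by all lineages.

    Each lineage is a root-to-leaf list of taxon names; the LCA is the
    last rank at which every lineage still agrees.
    """
    lca = None
    for ranks in zip(*lineages):
        if len(set(ranks)) == 1:
            lca = ranks[0]
        else:
            break
    return lca

# A peptide found in two E. coli strains resolves to the species;
# one also found in Salmonella only resolves to the family.
ecoli_k12 = ["Bacteria", "Enterobacteriaceae", "Escherichia", "E. coli"]
ecoli_o157 = ["Bacteria", "Enterobacteriaceae", "Escherichia", "E. coli"]
salmonella = ["Bacteria", "Enterobacteriaceae", "Salmonella", "S. enterica"]

print(lowest_common_ancestor([ecoli_k12, ecoli_o157]))  # E. coli
print(lowest_common_ancestor([ecoli_k12, salmonella]))  # Enterobacteriaceae
```

Applied over tens of thousands of peptides, this is how species- and phylum-level profiles such as the ones reported above are aggregated.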

3.
Proteomics ; 23(21-22): e2200292, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37401192

ABSTRACT

Prediction of protein-protein interactions (PPIs) commonly involves a significant computational component. Rapid recent advances in the power of computational methods for protein interaction prediction motivate a review of the state-of-the-art. We review the major approaches, organized according to the primary source of data utilized: protein sequence, protein structure, and protein co-abundance. The advent of deep learning (DL) has brought with it significant advances in interaction prediction, and we show how DL is used for each source data type. We review the literature taxonomically, present example case studies in each category, and conclude with observations about the strengths and weaknesses of machine learning methods in the context of the principal sources of data for protein interaction prediction.
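As a concrete illustration of the co-abundance approach, interaction candidates can be scored by how strongly two proteins' abundance profiles correlate across conditions. A minimal sketch with invented abundance values (real pipelines use many more conditions and more robust measures):

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two abundance profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical abundances of three proteins across five conditions;
# A and B co-vary (candidate interactors), C does not.
profile_a = [1.0, 2.0, 3.0, 4.0, 5.0]
profile_b = [2.1, 3.9, 6.2, 8.0, 10.1]
profile_c = [5.0, 1.0, 4.0, 2.0, 3.0]

print(pearson(profile_a, profile_b))  # close to 1
print(pearson(profile_a, profile_c))  # weak
```

DL-based co-abundance methods replace this single correlation score with a learned embedding of each profile, but the underlying signal is the same.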


Subjects
Protein Interaction Mapping , Proteins , Protein Interaction Mapping/methods , Proteins/metabolism , Machine Learning , Amino Acid Sequence , Computational Biology/methods
4.
Proteomics ; 23(3-4): e2200068, 2023 02.
Article in English | MEDLINE | ID: mdl-35580145

ABSTRACT

Protein phosphorylation plays an essential role in modulating cell signalling and its downstream transcriptional and translational regulations. Until recently, protein phosphorylation has been studied mostly using low-throughput biochemical assays. The advancement of mass spectrometry (MS)-based phosphoproteomics transformed the field by enabling measurement of proteome-wide phosphorylation events, where tens of thousands of phosphosites are routinely identified and quantified in an experiment. This has brought a significant challenge in analysing large-scale phosphoproteomic data, making computational methods and systems approaches integral parts of phosphoproteomics. Previous works have primarily focused on reviewing the experimental techniques in MS-based phosphoproteomics, yet a systematic survey of the computational landscape in this field is still missing. Here, we review computational methods and tools, and systems approaches that have been developed for phosphoproteomics data analysis. We categorise them into four aspects including data processing, functional analysis, phosphoproteome annotation and their integration with other omics, and in each aspect, we discuss the key methods and example studies. Lastly, we highlight some of the potential research directions on which future work would make a significant contribution to this fast-growing field. We hope this review provides a useful snapshot of the field of computational systems phosphoproteomics and stimulates new research that drives future development.


Subjects
Phosphoproteins , Protein Processing, Post-Translational , Phosphoproteins/metabolism , Phosphorylation , Proteome/metabolism , Systems Analysis
5.
Proteomics ; 19(5): e1800300, 2019 03.
Article in English | MEDLINE | ID: mdl-30656827

ABSTRACT

Heavy methyl Stable Isotope Labeling with Amino acids in Cell culture (hmSILAC) is a metabolic labeling strategy employed in proteomics to increase the confidence of global identification of methylated peptides by MS. However, to this day, the automatic and robust identification of heavy and light peak doublets from MS raw data of hmSILAC experiments remains a challenging task, for which the choice of computational methods is very limited. Here, we describe hmSEEKER, a software tool designed to work downstream of a MaxQuant analysis to perform an in-depth search for MS peak pairs that correspond to light and heavy methyl-peptides within MaxQuant-generated tables, with good sensitivity and specificity. The software is written in Perl, and its code and user manual are freely available at Bitbucket (https://bit.ly/2scCT9u).
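At its core, the doublet search amounts to pairing peaks whose mass difference matches the heavy-methyl label shift (approximately 4.0222 Da per 13CD3 methyl group). A simplified sketch of that pairing logic (an illustration of the idea, not hmSEEKER's actual algorithm):

```python
HEAVY_METHYL_SHIFT = 4.0222  # approx. Da shift per 13CD3 methyl group

def find_doublets(masses, n_methyl=1, ppm=10.0):
    """Pair light/heavy masses separated by n_methyl label shifts.

    `masses` is a sorted list of observed monoisotopic masses; a pair is
    reported when the heavier member sits n_methyl * 4.0222 Da above the
    lighter one, within the given ppm tolerance.
    """
    expected = n_methyl * HEAVY_METHYL_SHIFT
    pairs = []
    for i, light in enumerate(masses):
        target = light + expected
        tol = target * ppm / 1e6
        for heavy in masses[i + 1:]:
            if abs(heavy - target) <= tol:
                pairs.append((light, heavy))
            elif heavy > target + tol:
                break  # sorted input: no further match possible
    return pairs

# Two peaks 4.0222 Da apart form a doublet; the third stays unpaired.
peaks = [800.4000, 804.4222, 950.5000]
print(find_doublets(peaks))  # [(800.4, 804.4222)]
```

A real implementation must additionally match retention times and charge states between the two members of each doublet.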


Subjects
Amino Acids/analysis , Isotope Labeling/methods , Peptides/chemistry , Proteomics/methods , Software , Amino Acids/metabolism , Animals , Chromatography, Liquid/methods , Humans , Metals, Heavy/analysis , Metals, Heavy/metabolism , Methylation , Peptides/metabolism , Tandem Mass Spectrometry/methods
6.
Proteomics ; 19(21-22): e1800450, 2019 11.
Article in English | MEDLINE | ID: mdl-31472481

ABSTRACT

Protein phosphorylation acts as an efficient switch controlling deregulated key signaling pathways in cancer. Computational biology aims to address the complexity of reconstructed networks but overrepresents well-known proteins and lacks information on less-studied proteins. A bioinformatic tool to reconstruct and select relatively small networks that connect signaling proteins to their targets in specific contexts is developed. It enables new signaling axes of the Syk kinase to be proposed and validated. To validate the potential of the tool, it is applied to two phosphoproteomic studies on oncogenic mutants of the well-known phosphatidylinositol 3-kinase (PIK3CA) and the unfamiliar Src-related tyrosine kinase lacking C-terminal regulatory tyrosine and N-terminal myristoylation sites (SRMS). By combining network reconstruction and signal propagation, comprehensive signaling networks are built from large-scale experimental data, and multiple molecular paths from these kinases to their targets are extracted. Specific paths from two distinct PIK3CA mutants are retrieved, and their differential impact on the HER3 receptor kinase is explained. In addition, to address the missing connectivity of the SRMS kinase to its targets in interaction pathway databases, phospho-tyrosine and phospho-serine/threonine proteomic data are integrated. The resulting SRMS-signaling network comprises casein kinase 2, thereby validating its currently suggested role downstream of SRMS. The computational pipeline is publicly available and contains a user-friendly graphical interface (http://doi.org/10.5281/zenodo.3333687).
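In its simplest form, extracting molecular paths from a kinase to its targets reduces to path search over a directed signaling graph. A minimal breadth-first sketch on a toy network (the edges shown are hypothetical and for illustration only, not claims about real kinase-substrate relationships):

```python
from collections import deque

def shortest_path(graph, source, target):
    """Breadth-first search for one shortest path in a directed graph.

    `graph` maps each node to the list of nodes it can signal to.
    Returns the path as a list of nodes, or None if target is unreachable.
    """
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Toy kinase-substrate network with invented edges.
network = {
    "SRMS": ["CSNK2A1", "DOK1"],
    "CSNK2A1": ["AKT1"],
    "AKT1": ["GSK3B"],
}
print(shortest_path(network, "SRMS", "GSK3B"))
# ['SRMS', 'CSNK2A1', 'AKT1', 'GSK3B']
```

Network reconstruction tools layer scoring and signal-propagation weights on top of this kind of traversal to rank competing paths.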


Subjects
Neoplasms/metabolism , Proteomics , Signal Transduction , Cell Line, Tumor , Humans , Mutation/genetics , Neoplasm Proteins/metabolism , Phosphorylation , User-Computer Interface
7.
Proteomics ; 16(18): 2495-501, 2016 09.
Article in English | MEDLINE | ID: mdl-27436706

ABSTRACT

Data sharing in the field of MS has advanced greatly thanks to innovations such as the standardized formats, data repositories, and publications guidelines. However, there is currently no data sharing mechanism that enables real-time data browsing and deep linking on a large scale: unrestricted data access (particularly at the quantitative level) ultimately requires the user to download a local copy of the relevant data files (e.g., in order to generate extracted ion chromatograms [XICs]). In this technical resource, we present a set of technologies (collectively termed OpenSlice) that enable the user to quantitatively query hundreds of hours of proteomics discovery data (i.e., nontargeted acquisition) in real time: the user is able to effectively generate XICs for arbitrary masses on the fly and across the entire dataset (so-called global ion chromatograms), interacting with the results through a very intuitive browser-based interface. A key design consideration underlying the OpenSlice approach is the notion that every aspect of the acquired data must be accessible through a RESTful uniform resource locator based application programming interface, up to and including individual chromatographic peaks (hence HyperPeaks). A publicly accessible demonstration of this technology based on the Clinical Proteomics Tumor Analysis Consortium CompRef dataset is made available at http://compref.fenyolab.org.
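Generating an XIC on the fly amounts to summing, per scan, the intensity falling inside a narrow m/z window around the query mass. A minimal sketch of this computation (an illustration of the concept, not OpenSlice's implementation):

```python
def extract_ion_chromatogram(scans, target_mz, ppm=10.0):
    """Build an XIC: per-scan summed intensity near target_mz.

    `scans` is a list of (retention_time, [(mz, intensity), ...]) tuples;
    the window half-width is derived from the ppm tolerance.
    """
    tol = target_mz * ppm / 1e6
    xic = []
    for rt, peaks in scans:
        intensity = sum(i for mz, i in peaks if abs(mz - target_mz) <= tol)
        xic.append((rt, intensity))
    return xic

# Three scans; the 500.25 m/z trace rises and falls over retention time.
scans = [
    (10.0, [(500.2500, 1000.0), (623.30, 50.0)]),
    (10.5, [(500.2501, 5000.0)]),
    (11.0, [(500.2499, 800.0), (410.18, 20.0)]),
]
print(extract_ion_chromatogram(scans, 500.25))
# [(10.0, 1000.0), (10.5, 5000.0), (11.0, 800.0)]
```

Serving this interactively at dataset scale is where the indexing and RESTful API design described above come in: the query above must run without downloading the raw files.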


Subjects
Chromatography/methods , Proteomics/methods , Software , Humans , Information Dissemination , Neoplasms/metabolism , User-Computer Interface
8.
Proteomics ; 16(18): 2461-9, 2016 09.
Article in English | MEDLINE | ID: mdl-27503675

ABSTRACT

A frequently sought output from a shotgun proteomics experiment is a list of proteins that we believe to have been present in the analyzed sample before proteolytic digestion. The standard technique to control for errors in such lists is to enforce a preset threshold for the false discovery rate (FDR). Many consider protein-level FDRs a difficult and vague concept, as the measurement entities, spectra, are manifestations of peptides and not proteins. Here, we argue that this confusion is unnecessary and provide a framework on how to think about protein-level FDRs, starting from its basic principle: the null hypothesis. Specifically, we point out that two competing null hypotheses are used concurrently in today's protein inference methods, which has gone unnoticed by many. Using simulations of a shotgun proteomics experiment, we show how confusing one null hypothesis for the other can lead to serious discrepancies in the FDR. Furthermore, we demonstrate how the same simulations can be used to verify FDR estimates of protein inference methods. In particular, we show that, for a simple protein inference method, decoy models can be used to accurately estimate protein-level FDRs for both competing null hypotheses.
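The decoy-based estimation mentioned above follows the standard target-decoy ratio: rank inferred proteins by score and, at each threshold, estimate the FDR as the number of accepted decoys over the number of accepted targets. A generic sketch of that estimator (not the paper's simulation framework):

```python
def decoy_fdr(protein_scores):
    """Estimate FDR at each score threshold from target/decoy labels.

    `protein_scores` is a list of (score, is_decoy) pairs. Walking down
    the score-sorted list, the FDR estimate at each entry is
    #decoys / #targets accepted so far.
    """
    ranked = sorted(protein_scores, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    fdrs = []
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append((score, decoys / max(targets, 1)))
    return fdrs

# Five inferred proteins: decoys accumulating lower in the ranking
# push the estimated FDR up.
scores = [(9.1, False), (8.7, False), (7.2, True), (6.5, False), (5.0, True)]
for score, fdr in decoy_fdr(scores):
    print(score, round(fdr, 2))
```

Which null hypothesis the decoys embody, the point of the abstract, determines how the decoy proteins themselves must be constructed before this ratio is meaningful.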


Subjects
Algorithms , Proteins/analysis , Proteomics/methods , Databases, Protein , High-Throughput Screening Assays , Proteins/metabolism
9.
Neuroimage ; 124(Pt B): 1131-1136, 2016 Jan 01.
Article in English | MEDLINE | ID: mdl-26032888

ABSTRACT

The Northwestern University Neuroimaging Data Archive (NUNDA), an XNAT-powered data archiving system, aims to facilitate secure data storage; centralized data management; automated, standardized data processing; and simple, intuitive data sharing. NUNDA is a federated data archive, wherein individual project owners regulate access to their data. NUNDA supports multiple methods of data import, enabling data collection in a central repository. Data in NUNDA are available by project to any authorized user, allowing coordinated data management and review across sites. With NUNDA pipelines, users capitalize on existing procedures or standardize custom routines for consistent, automated data processing. NUNDA can be integrated with other research databases to simplify data exploration and discovery. And data on NUNDA can be confidently shared for secure collaboration.


Subjects
Databases, Factual , Neuroimaging , Data Collection , Database Management Systems , Electronic Data Processing , Humans , Information Dissemination
10.
J Microsc ; 263(1): 3-9, 2016 07.
Article in English | MEDLINE | ID: mdl-26800017

ABSTRACT

Serial block face scanning electron microscopy is rapidly becoming a popular tool for collecting large three-dimensional data sets of cells and tissues, filling the resolution and volume gap between fluorescence microscopy and high-resolution electron microscopy. The automated collection of data within the instrument occupies the smallest proportion of the time required to prepare and analyse biological samples. It is the processing of data once it has been collected that proves the greatest challenge. In this review we discuss different methods that are used to process data. We suggest potential workflows that can be used to facilitate the transfer of raw image stacks into quantifiable data as well as propose a set of criteria for reporting methods for data analysis to enable replication of work.


Subjects
Electronic Data Processing/methods , Microscopy, Electron, Scanning/methods , Animals , Humans , Imaging, Three-Dimensional/methods , Software
11.
Acta Neurochir Suppl ; 122: 75-80, 2016.
Article in English | MEDLINE | ID: mdl-27165881

ABSTRACT

Continuous high-volume and high-frequency brain signals such as intracranial pressure (ICP) and electroencephalographic (EEG) waveforms are commonly collected by bedside monitors in neurocritical care. While such signals often carry early signs of neurological deterioration, detecting these signs in real time with conventional data processing methods mainly designed for retrospective analysis has been extremely challenging. Such methods are not designed to handle the large volumes of waveform data produced by bedside monitors. In this pilot study, we address this challenge by building a prototype system using the IBM InfoSphere Streams platform, a scalable stream computing platform, to detect unstable ICP dynamics in real time. The system continuously receives electrocardiographic and ICP signals and analyzes ICP pulse morphology looking for deviations from a steady state. We also designed a Web interface to display in real time the result of this analysis in a Web browser. With this interface, physicians are able to ubiquitously check on the status of their patients and gain direct insight into and interpretation of the patient's state in real time. The prototype system has been successfully tested prospectively on live hospitalized patients.
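A deviation-from-steady-state check of the kind described can be sketched, in grossly simplified form, as a rolling-baseline detector (this illustrates the streaming pattern only; it is not the pulse-morphology analysis used in the study):

```python
from collections import deque

def detect_deviations(samples, window=5, threshold=2.0):
    """Flag samples deviating from a rolling mean by more than threshold.

    A sliding window of recent values stands in for the 'steady state';
    each incoming sample is compared against the window mean before
    being appended, mimicking per-sample stream processing.
    """
    history = deque(maxlen=window)
    alarms = []
    for i, value in enumerate(samples):
        if len(history) == window and abs(value - sum(history) / window) > threshold:
            alarms.append(i)
        history.append(value)
    return alarms

# Simulated ICP trend (mmHg): stable around 10, then a sudden rise.
icp = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 15.5, 16.0]
print(detect_deviations(icp))  # -> [6, 7]
```

A stream computing platform applies exactly this shape of per-sample operator, but continuously and at waveform sampling rates.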


Subjects
Computer Systems , Intracranial Pressure , Monitoring, Physiologic/methods , Signal Processing, Computer-Assisted , Algorithms , Decision Support Systems, Clinical , Electroencephalography , Electronic Health Records , Humans , Intensive Care Units , Pilot Projects , Software
12.
Proteomics ; 15(5-6): 950-63, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25475148

ABSTRACT

Analysis of the phosphoproteome by MS has become a key technology for the characterization of dynamic regulatory processes in the cell, since kinase and phosphatase action underlie many major biological functions. However, the addition of a phosphate group to a suitable side chain often confounds informatic analysis by generating product ion spectra that are more difficult to interpret (and consequently identify) relative to unmodified peptides. Collectively, these challenges have motivated bioinformaticians to create novel software tools and pipelines to assist in the identification of phosphopeptides in proteomic mixtures, and help pinpoint or "localize" the most likely site of modification in cases where there is ambiguity. Here we review the challenges to be met and the informatics solutions available to address them for phosphoproteomic analysis, as well as highlighting the difficulties associated with using them and the implications for data standards.


Subjects
Phosphopeptides/analysis , Phosphoproteins/analysis , Proteomics
13.
Proteomics ; 15(5-6): 964-80, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25430050

ABSTRACT

Data-independent acquisition (DIA) offers several advantages over data-dependent acquisition (DDA) schemes for characterizing complex protein digests analyzed by LC-MS/MS. In contrast to the sequential detection, selection, and analysis of individual ions during DDA, DIA systematically parallelizes the fragmentation of all detectable ions within a wide m/z range regardless of intensity, thereby providing broader dynamic range of detected signals, improved reproducibility for identification, better sensitivity, and accuracy for quantification, and, potentially, enhanced proteome coverage. To fully exploit these advantages, composite or multiplexed fragment ion spectra generated by DIA require more elaborate processing algorithms compared to DDA. This review examines different DIA schemes and, in particular, discusses the concepts applied to and related to data processing. Available software implementations for identification and quantification are presented as comprehensively as possible and examples of software usage are cited. Processing workflows, including complete proprietary frameworks or combinations of modules from different open source data processing packages are described and compared in terms of software availability and usability, programming language, operating system support, input/output data formats, as well as the main principles employed in the algorithms used for identification and quantification. This comparative study concludes with further discussion of current limitations and expectable improvements in the short- and midterm future.


Subjects
Chromatography, Liquid/methods , Mass Spectrometry/methods , Proteomics/methods , Software , Reproducibility of Results , Sensitivity and Specificity
14.
Proteome Sci ; 12: 36, 2014.
Article in English | MEDLINE | ID: mdl-25028575

ABSTRACT

BACKGROUND: It is possible to identify thousands of phosphopeptides and -proteins in a single experiment with mass spectrometry-based phosphoproteomics. However, a current bottleneck is the downstream data analysis, which is often laborious and requires a number of manual steps. RESULTS: Toward automating the analysis steps, we have developed and implemented a software tool, PhosFox, which enables peptide-level processing of phosphoproteomic data generated by multiple protein identification search algorithms, including Mascot, Sequest, and Paragon, as well as cross-comparison of their identification results. The software supports both qualitative and quantitative phosphoproteomics studies, as well as multiple between-group comparisons. Importantly, PhosFox detects uniquely phosphorylated peptides and proteins in one sample compared to another. It also distinguishes differences in phosphorylation sites between phosphorylated proteins in different samples. Using two case study examples, a qualitative phosphoproteome dataset from human keratinocytes and a quantitative phosphoproteome dataset from rat kidney inner medulla, we demonstrate here how PhosFox facilitates an efficient and in-depth phosphoproteome data analysis. PhosFox was implemented in the Perl programming language and it can be run on most common operating systems. Due to its flexible interface and open source distribution, users can easily incorporate the program into their MS data analysis workflows and extend the program with new features. PhosFox source code, implementation and user instructions are freely available from https://bitbucket.org/phintsan/phosfox. CONCLUSIONS: PhosFox facilitates efficient and more in-depth comparisons between phosphoproteins in case-control settings. The open source implementation is easily extendable to accommodate additional features for widespread application use cases.
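The core between-sample comparison described here, identifying phosphosites unique to one sample versus another, can be sketched as simple set operations (a simplified illustration of the concept, not PhosFox's actual data model):

```python
def compare_phosphosites(sample_a, sample_b):
    """Report phosphosites unique to each sample and those shared.

    Each sample is a set of (protein, site) pairs, e.g. ("AKT1", "S473").
    """
    return {
        "only_a": sorted(sample_a - sample_b),
        "only_b": sorted(sample_b - sample_a),
        "shared": sorted(sample_a & sample_b),
    }

# Invented phosphosite sets for a control and a treated sample.
control = {("AKT1", "S473"), ("GSK3B", "S9"), ("MAPK1", "T185")}
treated = {("AKT1", "S473"), ("MAPK1", "T185"), ("MAPK1", "Y187")}

result = compare_phosphosites(control, treated)
print(result["only_a"])  # [('GSK3B', 'S9')]
print(result["only_b"])  # [('MAPK1', 'Y187')]
print(result["shared"])  # [('AKT1', 'S473'), ('MAPK1', 'T185')]
```

The difficulty in practice lies upstream of this step: reconciling site identifications and localization across Mascot, Sequest, and Paragon outputs so the sets are comparable at all.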

15.
J Proteomics ; 296: 105110, 2024 03 30.
Article in English | MEDLINE | ID: mdl-38325730

ABSTRACT

Clinical proteomics studies aiming to develop markers of clinical outcome or disease typically involve distinct discovery and validation stages, neither of which focuses on the clinical applicability of the candidate markers studied. Our clinically useful selection of proteins (CUSP) protocol proposes a rational approach, with statistical and non-statistical components, to identify proteins for the validation phase of studies that could be the most effective markers of disease or clinical outcome. Additionally, this protocol considers commercially available analysis methods for each selected protein to ensure that the use of each prospective marker is easily translated into clinical practice. SIGNIFICANCE: When developing proteomic markers of clinical outcomes, there is currently no consideration at the validation stage of how to implement such markers in a clinical setting. This has been identified by several studies as a limitation to the progression of research findings from proteomics studies. When integrated into a proteomic workflow, the CUSP protocol allows for a strategically designed validation study that improves researchers' ability to translate findings from discovery-based proteomics into clinical practice.


Subjects
Proteins , Proteomics , Proteomics/methods , Biomarkers/metabolism , Prospective Studies
16.
Biol Imaging ; 3: e11, 2023.
Article in English | MEDLINE | ID: mdl-38487685

ABSTRACT

With the aim of producing a 3D representation of tumors, imaging and molecular annotation of xenografts and tumors (IMAXT) uses a large variety of modalities in order to acquire tumor samples and produce a map of every cell in the tumor and its host environment. Given the large volume and variety of data produced in the project, we developed automatic data workflows and analysis pipelines. We introduce a research methodology in which scientists connect to a cloud environment to perform analysis close to where the data are located, instead of bringing data to their local computers. Here, we present the data and analysis infrastructure, discuss the unique computational challenges, and describe the analysis chains developed and deployed to generate molecularly annotated tumor models. Registration is achieved by use of a novel technique involving spherical fiducial marks that are visible in all imaging modalities used within IMAXT. The automatic pipelines are highly optimized and allow processed datasets to be obtained several times faster than with current solutions, narrowing the gap between data acquisition and scientific exploitation.

17.
Front Neuroinform ; 15: 689675, 2021.
Article in English | MEDLINE | ID: mdl-34483871

ABSTRACT

We present Clinica (www.clinica.run), an open-source software platform designed to make clinical neuroscience studies easier and more reproducible. Clinica aims for researchers to (i) spend less time on data management and processing, (ii) perform reproducible evaluations of their methods, and (iii) easily share data and results within their institution and with external collaborators. The core of Clinica is a set of automatic pipelines for processing and analysis of multimodal neuroimaging data (currently, T1-weighted MRI, diffusion MRI, and PET data), as well as tools for statistics, machine learning, and deep learning. It relies on the brain imaging data structure (BIDS) for the organization of raw neuroimaging datasets and on established tools written by the community to build its pipelines. It also provides converters of public neuroimaging datasets to BIDS (currently ADNI, AIBL, OASIS, and NIFD). Processed data include image-valued scalar fields (e.g., tissue probability maps), meshes, surface-based scalar fields (e.g., cortical thickness maps), or scalar outputs (e.g., regional averages). These data follow the ClinicA Processed Structure (CAPS) format which shares the same philosophy as BIDS. Consistent organization of raw and processed neuroimaging files facilitates the execution of single pipelines and of sequences of pipelines, as well as the integration of processed data into statistics or machine learning frameworks. The target audience of Clinica is neuroscientists or clinicians conducting clinical neuroscience studies involving multimodal imaging, and researchers developing advanced machine learning algorithms applied to neuroimaging data.

18.
Methods Mol Biol ; 1832: 185-203, 2018.
Article in English | MEDLINE | ID: mdl-30073528

ABSTRACT

Assays profiling nucleosome positioning and occupancy are often coupled with high-throughput sequencing, which results in generation of large data sets. These data sets require processing in specialized computational pipelines to yield useful information. Here, we describe main steps of such a pipeline, and discuss bioinformatic and statistical aspects of assessing data quality, as well as data visualization and further analysis.


Subjects
Computational Biology/methods , Histones/metabolism , Nucleosomes/metabolism , Algorithms , Base Composition/genetics , Gene Expression Regulation , Micrococcal Nuclease/metabolism , Protein Isoforms/metabolism , Sequence Alignment , Transcription Initiation Site
20.
Int Rev Neurobiol ; 141: 3-30, 2018.
Article in English | MEDLINE | ID: mdl-30314600

ABSTRACT

Recent advances in disease understanding, instrumentation technology, and computationally demanding image analysis approaches are opening new frontiers in the investigation of movement disorders and brain disease in general. A key aspect is the recognition of the need to determine molecular correlates to early functional and metabolic connectivity alterations, which are increasingly recognized as useful signatures of specific clinical disease phenotypes. Such multi-modal approaches are highly likely to provide new information on pathogenic mechanisms and to help the identification of novel therapeutic targets. This chapter describes recent methodological developments in PET starting with a very brief overview of radiotracers relevant to movement disorders while emphasizing the development of instrumentation, algorithms and imaging analysis methods relevant to multi-modal investigation of movement disorders.


Subjects
Movement Disorders/diagnostic imaging , Neuroimaging/methods , Positron-Emission Tomography/methods , Humans , Movement Disorders/metabolism , Movement Disorders/physiopathology