1.
PLoS Comput Biol ; 19(1): e1010752, 2023 01.
Article in English | MEDLINE | ID: mdl-36622853

ABSTRACT

There is an ongoing explosion of scientific datasets being generated, brought on by recent technological advances in many areas of the natural sciences. As a result, the life sciences have become increasingly computational in nature, and bioinformatics has taken on a central role in research studies. However, basic computational skills, data analysis, and stewardship are still rarely taught in life science educational programs, resulting in a skills gap in many of the researchers tasked with analysing these big datasets. In order to address this skills gap and empower researchers to perform their own data analyses, the Galaxy Training Network (GTN) has previously developed the Galaxy Training Platform (https://training.galaxyproject.org), an open access, community-driven framework for the collection of FAIR (Findable, Accessible, Interoperable, Reusable) training materials for data analysis utilizing the user-friendly Galaxy framework as its primary data analysis platform. Since its inception, this training platform has thrived, with the number of tutorials and contributors growing rapidly, and the range of topics extending beyond life sciences to include topics such as climatology, cheminformatics, and machine learning. While initially aimed at supporting researchers directly, the GTN framework has proven to be an invaluable resource for educators as well. We have focused our efforts in recent years on adding increased support for this growing community of instructors. We have added new features to facilitate the use of the materials in a classroom setting, simplified the contribution flow for new materials, and added a set of train-the-trainer lessons. Here, we present the latest developments in the GTN project, aimed at facilitating the use of the Galaxy Training materials by educators, and its usage in different learning environments.


Subject(s)
Computational Biology, Software, Humans, Computational Biology/methods, Data Analysis, Research Personnel
2.
J Proteome Res ; 22(8): 2608-2619, 2023 08 04.
Article in English | MEDLINE | ID: mdl-37450889

ABSTRACT

During the COVID-19 pandemic, impaired immunity and medical interventions resulted in cases of secondary infections. The clinical difficulties and dangers associated with secondary infections in patients necessitate the exploration of their microbiome. Metaproteomics is a powerful approach to study the taxonomic composition and functional status of the microbiome under study. In this study, the mass spectrometry (MS)-based data of nasopharyngeal swab samples from COVID-19 patients was used to investigate the metaproteome. We have established a robust bioinformatics workflow within the Galaxy platform, which includes (a) generation of a tailored database of the common respiratory tract pathogens, (b) database search using multiple search algorithms, and (c) verification of the detected microbial peptides. The microbial peptides detected in this study belong to several opportunistic pathogens such as Streptococcus pneumoniae, Klebsiella pneumoniae, Rhizopus microsporus, and Syncephalastrum racemosum. Microbial proteins with a role in stress response, gene expression, and DNA repair were found to be upregulated in severe patients compared to negative patients. Using parallel reaction monitoring (PRM), we confirmed some of the microbial peptides in fresh clinical samples. MS-based clinical metaproteomics can serve as a powerful tool for detection and characterization of potential pathogens, which can significantly impact the diagnosis and treatment of patients.


Subject(s)
COVID-19, Coinfection, Humans, COVID-19/diagnosis, Pandemics, Peptides, Nasopharynx
3.
Expert Rev Proteomics ; 20(11): 251-266, 2023.
Article in English | MEDLINE | ID: mdl-37787106

ABSTRACT

INTRODUCTION: Continuous advances in mass spectrometry (MS) technologies have enabled deeper and more reproducible proteome characterization and a better understanding of biological systems when integrated with other 'omics data. Bioinformatic resources meeting the analysis requirements of increasingly complex MS-based proteomic data and associated multi-omic data are critically needed. These requirements include the availability of software that spans diverse types of analyses, scalability for large-scale, compute-intensive applications, and mechanisms to ease adoption of the software. AREAS COVERED: The Galaxy ecosystem meets these requirements by offering a multitude of open-source tools for MS-based proteomics analyses and applications, all in an adaptable, scalable, and accessible computing environment. A thriving global community maintains this software and the associated training resources to empower researcher-driven analyses. EXPERT OPINION: The community-supported Galaxy ecosystem remains a crucial contributor to basic biological and clinical studies using MS-based proteomics. In addition to the current status of Galaxy-based resources, we describe ongoing developments for meeting emerging challenges in MS-based proteomic informatics. We hope this review will catalyze increased use of Galaxy by researchers employing MS-based proteomics and inspire software developers to join the community and implement new tools, workflows, and associated training content that will add further value to this already rich ecosystem.


Subject(s)
Proteomics, Humans, Computational Biology/methods, Mass Spectrometry/methods, Proteomics/methods, Software
4.
Clin Proteomics ; 20(1): 14, 2023 Apr 02.
Article in English | MEDLINE | ID: mdl-37005570

ABSTRACT

BACKGROUND: Clinical bronchoalveolar lavage fluid (BALF) samples are rich in biomolecules, including proteins, and useful for molecular studies of lung health and disease. However, mass spectrometry (MS)-based proteomic analysis of BALF is challenged by the dynamic range of protein abundance and the potential for interfering contaminants. A robust sample preparation workflow for BALF samples that is compatible with MS-based proteomics, including samples of small and large volume, would be useful for many researchers. RESULTS: We have developed a workflow that combines high abundance protein depletion, protein trapping, clean-up, and in-situ tryptic digestion, and that is compatible with either qualitative or quantitative MS-based proteomic analysis. The workflow includes a value-added collection of endogenous peptides for peptidomic analysis of BALF samples, if desired, as well as amenability to offline semi-preparative or microscale fractionation of complex peptide mixtures prior to LC-MS/MS analysis, for increased depth of analysis. We demonstrate the effectiveness of this workflow on BALF samples collected from COPD patients, including for smaller sample volumes of 1-5 mL that are commonly available from the clinic. We also demonstrate the repeatability of the workflow as an indicator of its utility for quantitative proteomic studies. CONCLUSIONS: Overall, our described workflow consistently provided high-quality proteins and tryptic peptides for MS analysis. It should enable researchers to apply MS-based proteomics to a wide variety of studies focused on BALF clinical specimens.

5.
J Proteome Res ; 20(2): 1451-1454, 2021 02 05.
Article in English | MEDLINE | ID: mdl-33393790

ABSTRACT

In this Letter, we reanalyze published mass spectrometry data sets of clinical samples with a focus on determining the coinfection status of individuals infected with SARS-CoV-2 coronavirus. We demonstrate the use of ComPIL 2.0 software along with a metaproteomics workflow within the Galaxy platform to detect cohabitating potential pathogens in COVID-19 patients using mass spectrometry-based analysis. From a sample collected from gargling solutions, we detected Streptococcus pneumoniae (opportunistic and multidrug-resistant pathogen) and Lactobacillus rhamnosus (a probiotic component) along with SARS-CoV-2. We could also detect Pseudomonas sps. Bc-h from COVID-19 positive samples and Acinetobacter ursingii and Pseudomonas monteilii from COVID-19 negative samples collected from oro- and nasopharyngeal samples. We believe that the early detection and characterization of coinfections by using metaproteomics from COVID-19 patients will potentially impact the diagnosis and treatment of patients affected by SARS-CoV-2 infection.


Subject(s)
Bacterial Infections/diagnosis, COVID-19/diagnosis, Proteomics/methods, SARS-CoV-2/metabolism, Acinetobacter/isolation & purification, Bacterial Infections/complications, Bacterial Infections/microbiology, COVID-19/complications, COVID-19/virology, Coinfection/microbiology, Coinfection/virology, Humans, Mass Spectrometry/methods, Nasopharynx/microbiology, Nasopharynx/virology, Pseudomonas/isolation & purification, SARS-CoV-2/physiology, Streptococcus pneumoniae/isolation & purification
6.
J Proteome Res ; 20(4): 2130-2137, 2021 04 02.
Article in English | MEDLINE | ID: mdl-33683127

ABSTRACT

metaQuantome is a software suite that enables the quantitative analysis, statistical evaluation, and visualization of mass-spectrometry-based metaproteomics data. In the latest update of this software, we have provided several extensions, including a step-by-step training guide, the ability to perform statistical analysis on samples from multiple conditions, and a comparative analysis of metatranscriptomics data. The training module, accessed via the Galaxy Training Network, will help users to use the suite effectively for both functional and taxonomic analysis. We extend the ability of metaQuantome to now perform multi-data-point quantitative and statistical analyses so that studies with measurements across multiple conditions, such as time-course studies, can be analyzed. With an eye on the multiomics analysis of microbial communities, we have also initiated the use of metaQuantome statistical and visualization tools on outputs from metatranscriptomics data, which complements the metagenomic and metaproteomic analyses already available. For this, we have developed a tool named MT2MQ ("metatranscriptomics to metaQuantome"), which takes in outputs from the ASaiM metatranscriptomics workflow and transforms them so that the data can be used as an input for comparative statistical analysis and visualization via metaQuantome. We believe that these improvements to metaQuantome will facilitate the use of the software for quantitative metaproteomics and metatranscriptomics and will enable multipoint data analysis. These improvements will take us a step toward integrative multiomic microbiome analysis so as to understand dynamic taxonomic and functional responses of these complex systems in a variety of biological contexts. The updated metaQuantome and MT2MQ are open-source software and are available via the Galaxy Toolshed and GitHub.


Subject(s)
Microbiota, Proteomics, Mass Spectrometry, Metagenomics, Software
7.
Clin Proteomics ; 18(1): 15, 2021 May 10.
Article in English | MEDLINE | ID: mdl-33971807

ABSTRACT

BACKGROUND: The Coronavirus Disease 2019 (COVID-19) global pandemic has had a profound, lasting impact on the world's population. A key aspect to providing care for those with COVID-19 and checking its further spread is early and accurate diagnosis of infection, which has been generally done via methods for amplifying and detecting viral RNA molecules. Detection and quantitation of peptides using targeted mass spectrometry-based strategies has been proposed as an alternative diagnostic tool due to direct detection of molecular indicators from non-invasively collected samples as well as the potential for high-throughput analysis in a clinical setting; many studies have revealed the presence of viral peptides within easily accessed patient samples. However, evidence suggests that some viral peptides could serve as better indicators of COVID-19 infection status than others, due to potential misidentification of peptides derived from human host proteins, poor spectral quality, high limits of detection, etc. METHODS: In this study we have compiled a list of 639 peptides identified from Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) samples, including from in vitro and clinical sources. These datasets were rigorously analyzed using automated, Galaxy-based workflows containing tools such as PepQuery, BLAST-P, and the Multi-omic Visualization Platform as well as the open-source tools MetaTryp and Proteomics Data Viewer (PDV). RESULTS: Using PepQuery for confirming peptide spectrum matches, we were able to narrow down the 639-peptide possibilities to 87 peptides that were most robustly detected and specific to the SARS-CoV-2 virus. The specificity of these sequences to coronavirus taxa was confirmed using Unipept and BLAST-P. Through stringent p-value cutoff combined with manual verification of peptide spectrum match quality, 4 peptides derived from the nucleocapsid phosphoprotein and membrane protein were found to be most robustly detected across all cell culture and clinical samples, including those collected non-invasively. CONCLUSION: We propose that these peptides would be of the most value for clinical proteomics applications seeking to detect COVID-19 from patient samples. We also contend that samples harvested from the upper respiratory tract and oral cavity have the highest potential for diagnosis of SARS-CoV-2 infection from easily collected patient samples using mass spectrometry-based proteomics assays.

8.
Mol Cell Proteomics ; 18(8 suppl 1): S82-S91, 2019 08 09.
Article in English | MEDLINE | ID: mdl-31235611

ABSTRACT

Microbiome research offers promising insights into the impact of microorganisms on biological systems. Metaproteomics, the study of microbial proteins at the community level, integrates genomic, transcriptomic, and proteomic data to determine the taxonomic and functional state of a microbiome. However, standard metaproteomics software is subject to several limitations, commonly supporting only spectral counts, emphasizing exploratory analysis rather than hypothesis testing and rarely offering the ability to analyze the interaction of function and taxonomy - that is, which taxa are responsible for different processes. Here we present metaQuantome, a novel, multifaceted software suite that analyzes the state of a microbiome by leveraging complex taxonomic and functional hierarchies to summarize peptide-level quantitative information, emphasizing label-free intensity-based methods. For experiments with multiple experimental conditions, metaQuantome offers differential abundance analysis, principal components analysis, and clustered heat map visualizations, as well as exploratory analysis for a single sample or experimental condition. We benchmark metaQuantome analysis against standard methods, using two previously published datasets: (1) an artificially assembled microbial community dataset (taxonomy benchmarking) and (2) a dataset with a range of recombinant human proteins spiked into an Escherichia coli background (functional benchmarking). Furthermore, we demonstrate the use of metaQuantome on a previously published human oral microbiome dataset. In both the taxonomic and functional benchmarking analyses, metaQuantome quantified taxonomic and functional terms more accurately than standard summarization-based methods. We use the oral microbiome dataset to demonstrate metaQuantome's ability to produce publication-quality figures and elucidate biological processes of the oral microbiome. metaQuantome enables advanced investigation of metaproteomic datasets, which should be broadly applicable to microbiome-related research. In the interest of accessible, flexible, and reproducible analysis, metaQuantome is open source and available on the command line and in Galaxy.


Subject(s)
Microbiota, Proteomics, Software, Child, Dental Plaque/microbiology, Dysbiosis/microbiology, Escherichia coli/genetics, Humans, Mouth Diseases/microbiology, Peptides/metabolism
9.
J Proteome Res ; 19(7): 2772-2785, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32396365

ABSTRACT

Multiomics approaches focused on mass spectrometry (MS)-based data, such as metaproteomics, utilize genomic and/or transcriptomic sequencing data to generate a comprehensive protein sequence database. These databases can be very large, containing millions of sequences, which reduces the sensitivity of matching tandem mass spectrometry (MS/MS) data to sequences to generate peptide spectrum matches (PSMs). Here, we describe and evaluate a sectioning method for generating an enriched database for those protein sequences that are most likely present in the sample. Our evaluation demonstrates how this method helps to increase the sensitivity of PSMs while maintaining acceptable false discovery rate statistics, offering a flexible alternative to traditional large database searching, as well as previously described two-step database searching methods for large sequence database applications. Furthermore, implementation in the Galaxy platform provides access to an automated and customizable workflow for carrying out the method. Additionally, the results of this study provide valuable insights into the advantages and limitations offered by available methods aimed at addressing challenges of genome-guided, large database applications in proteomics. Relevant raw data has been made available at https://zenodo.org/ using data set identifier "3754789" and https://arcticdata.io/catalog using data set identifier "A2VX06340".


Subject(s)
Proteomics, Tandem Mass Spectrometry, Protein Databases, Genomics, Peptides/genetics, Software
10.
J Proteome Res ; 19(1): 161-173, 2020 01 03.
Article in English | MEDLINE | ID: mdl-31793300

ABSTRACT

Workflows for large-scale mass spectrometry (MS)-based shotgun proteomics can potentially lead to costly errors in the form of incorrect peptide-spectrum matches (PSMs). To improve the robustness of these workflows, we have investigated the use of the precursor mass discrepancy (PMD) to detect and filter potentially false PSMs that have, nonetheless, a high confidence score. We identified and addressed three cases of unexpected bias in PMD results: time of acquisition within a liquid chromatography-mass spectrometry (LC-MS) run, decoy PSMs, and length of the peptide. We created a postanalysis Bayesian confidence measure based on score and PMD, called PMD-false discovery rate (FDR). We tested PMD-FDR on four data sets across three types of MS-based proteomics projects: standard (single organism; reference database), proteogenomics (single organism; customized genomic-based database plus reference), and metaproteomics (microorganism community; customized conglomerate database). On a ground-truth data set and other representative data, PMD-FDR was able to detect 60-80% of likely incorrect PSMs (false-hits) while losing only 5% of correct PSMs (true-hits). PMD-FDR can also be used to evaluate data quality for results generated within different experimental PSM-generating workflows, assisting in method development. Going forward, PMD-FDR should provide detection of high scoring but likely false-hits, aiding applications that rely heavily on accurate PSMs, such as proteogenomics and metaproteomics.


Subject(s)
Peptides, Tandem Mass Spectrometry, Algorithms, Bayes Theorem, Liquid Chromatography, Protein Databases, Proteomics
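
As a rough illustration of the quantity behind entry 10, the precursor mass discrepancy (PMD) of a PSM can be expressed in parts per million as the relative difference between the observed and theoretical precursor mass. The Python sketch below computes it and flags PSMs whose PMD lies far from the median of the run. The function names, the median-centring step, and the 5 ppm tolerance are assumptions made for illustration; this is not the Bayesian PMD-FDR measure described in the abstract.

```python
from statistics import median


def pmd_ppm(observed_mass, theoretical_mass):
    """Return the precursor mass discrepancy in parts per million (ppm)."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def flag_pmd_outliers(pmds, tolerance_ppm=5.0):
    """Flag PSMs whose PMD deviates from the run median by more than the tolerance.

    The 5 ppm default is a hypothetical value chosen for illustration only.
    """
    center = median(pmds)
    return [abs(p - center) > tolerance_ppm for p in pmds]


if __name__ == "__main__":
    # Hypothetical (observed, theoretical) precursor masses in daltons.
    psms = [(1000.0010, 1000.0), (1500.0020, 1500.0), (2000.0300, 2000.0)]
    pmds = [pmd_ppm(obs, theo) for obs, theo in psms]
    print(pmds)
    print(flag_pmd_outliers(pmds))
```

Centring on the median rather than on zero is one simple way to tolerate a systematic calibration offset; the abstract reports additional biases (acquisition time, decoys, peptide length) that this sketch does not model.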
11.
J Proteome Res ; 18(2): 728-731, 2019 02 01.
Article in English | MEDLINE | ID: mdl-30511867

ABSTRACT

moFF is a modular and operating-system-independent tool for quantitative analysis of label-free mass-spectrometry-based proteomics data. The moFF workflow, comprising matching-between-runs and apex quantification, can be applied to any upstream search engine's output, along with the corresponding Thermo or mzML raw file. We here present moFF 2.0, with improvements in speed through multithreading, the use of a new raw file access library, and a novel filtering approach in the matching-between-runs module. This filter allows moFF to correctly identify features that are present in one run but not in another, as demonstrated using spiked-in iRT peptides. Moreover, moFF 2.0 also provides a new peptide summary export that can be used in downstream statistical analysis. moFF is open source and freely available and can be downloaded from https://github.com/compomics/moFF.


Subject(s)
Algorithms, Statistical Data Interpretation, Proteomics/methods, Data Analysis, Peptides/analysis, Peptides/chemistry, Software
12.
J Proteome Res ; 18(2): 782-790, 2019 02 01.
Article in English | MEDLINE | ID: mdl-30582332

ABSTRACT

Next-generation sequencing technologies, coupled to advances in mass-spectrometry-based proteomics, have facilitated system-wide quantitative profiling of expressed mRNA transcripts and proteins. Proteo-transcriptomic analysis compares the relative abundance levels of transcripts and their corresponding proteins, illuminating discordant gene product responses to perturbations. These results reveal potential post-transcriptional regulation, providing researchers with important new insights into underlying biological and pathological disease mechanisms. To carry out proteo-transcriptomic analysis, researchers require software that statistically determines transcript-protein abundance correlation levels and provides results visualization and interpretation functionality, ideally within a flexible, user-friendly platform. As a solution, we have developed the QuanTP software within the Galaxy platform. The software offers a suite of tools and functionalities critical for proteo-transcriptomics, including statistical algorithms for assessing the correlation between single transcript-protein pairs as well as across two cohorts, outlier identification and clustering, along with a diverse set of results visualizations. It is compatible with analyses of results from single experiment data or from a two-cohort comparison of aggregated replicate experiments. The tool is available in the Galaxy Tool Shed through a cloud-based instance and a Docker container. In all, QuanTP provides an accessible and effective software resource, which should enable new multiomic discoveries from quantitative proteo-transcriptomic data sets.


Subject(s)
Computational Biology/methods, Data Analysis, Gene Expression Profiling/methods, Proteomics/methods, Software, Animals, High-Throughput Nucleotide Sequencing, Humans, Mass Spectrometry
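
Entry 12 describes assessing the correlation between transcript and protein abundance. A minimal Python sketch of one such statistic, the Pearson correlation coefficient, follows. The function name and the example log2 fold-change values are hypothetical, and the abstract does not state which statistics QuanTP uses internally.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical log2 fold changes for the same genes at both levels.
    transcript_lfc = [1.2, -0.4, 0.8, 2.1, -1.0]
    protein_lfc = [0.9, -0.1, 0.2, 1.7, -0.6]
    print(pearson(transcript_lfc, protein_lfc))
```

Genes that fall far from the overall trend are the discordant transcript-protein pairs the abstract refers to; identifying them would need a further outlier step not shown here.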
13.
J Proteome Res ; 17(12): 4329-4336, 2018 12 07.
Article in English | MEDLINE | ID: mdl-30130115

ABSTRACT

The Chromosome-centric Human Proteome Project (C-HPP) seeks to comprehensively characterize all protein products coded by the genome, including those expressed sequence variants confirmed via proteogenomics methods. The closely related Biology/Disease-driven Human Proteome Project (B/D-HPP) seeks to understand the biological and pathological associations of expressed protein products, especially those carrying sequence variants that may be drivers of disease. To achieve these objectives, informatics tools are required that interpret potential functional or disease implications of variant protein sequence detected via proteogenomics. Toward this end, we have developed an automated workflow within the Galaxy for Proteomics (Galaxy-P) platform, which leverages the Cancer-Related Analysis of Variants Toolkit (CRAVAT) and makes it interoperable with proteogenomic results. Protein sequence variants confirmed by proteogenomics are assessed for potential structure-function effects as well as associations with cancer using CRAVAT's rich suite of functionalities, including visualization of results directly within the Galaxy user interface. We demonstrate the effectiveness of this workflow on proteogenomic results generated from an MCF7 breast cancer cell line. Our free and open software should enable improved interpretation of the functional and pathological effects of protein sequence variants detected via proteogenomics, acting as a bridge between the C-HPP and B/D-HPP.


Subject(s)
Proteogenomics/methods, Proteome, Software, Amino Acid Sequence, Tumor Cell Line, Human Chromosomes/genetics, Genetic Variation, Humans, MCF-7 Cells, Neoplasms/genetics, Workflow
14.
Methods Mol Biol ; 2820: 165-185, 2024.
Article in English | MEDLINE | ID: mdl-38941023

ABSTRACT

The upper respiratory tract (URT) is home to a diverse range of microbial species. Respiratory infections disturb the microbial flora in the URT, putting people at risk of secondary infections. The potential dangers and clinical effects of bacterial and fungal coinfections with SARS-CoV-2 support the need to investigate the microbiome of the URT using clinical samples. Mass spectrometry (MS)-based metaproteomics analysis of microbial proteins is a novel approach to comprehensively assess the clinical specimens with complex microbial makeup. The coronavirus that causes severe acute respiratory syndrome (SARS-CoV-2) is responsible for the COVID-19 pandemic resulting in a plethora of microbial coinfections impeding therapy, prognosis, and overall disease management. In this chapter, the corresponding workflows for MS-based shotgun proteomics and metaproteomic analysis are illustrated.


Subject(s)
COVID-19, Coinfection, Proteomics, SARS-CoV-2, Humans, COVID-19/virology, COVID-19/complications, Proteomics/methods, Coinfection/microbiology, Coinfection/virology, SARS-CoV-2/isolation & purification, Microbiota, Respiratory Tract Infections/microbiology, Respiratory Tract Infections/virology, Respiratory Tract Infections/diagnosis, Mass Spectrometry/methods, Proteome/analysis, Respiratory System/microbiology, Respiratory System/metabolism, Respiratory System/virology
15.
mSphere ; 9(6): e0079323, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38780289

ABSTRACT

Clinical metaproteomics has the potential to offer insights into the host-microbiome interactions underlying diseases. However, the field faces challenges in characterizing microbial proteins found in clinical samples, usually present at low abundance relative to the host proteins. As a solution, we have developed an integrated workflow coupling mass spectrometry-based analysis with customized bioinformatic identification, quantification, and prioritization of microbial proteins, enabling targeted assay development to investigate host-microbe dynamics in disease. The bioinformatics tools are implemented in the Galaxy ecosystem, offering the development and dissemination of complex bioinformatic workflows. The modular workflow integrates MetaNovo (to generate a reduced protein database), SearchGUI/PeptideShaker and MaxQuant [to generate peptide-spectral matches (PSMs) and quantification], PepQuery2 (to verify the quality of PSMs), Unipept (for taxonomic and functional annotation), and MSstatsTMT (for statistical analysis). We have utilized this workflow in diverse clinical samples, from the characterization of nasopharyngeal swab samples to bronchoalveolar lavage fluid. Here, we demonstrate its effectiveness via analysis of residual fluid from cervical swabs. The complete workflow, including training data and documentation, is available via the Galaxy Training Network, empowering non-expert researchers to utilize these powerful tools in their clinical studies. IMPORTANCE: Clinical metaproteomics has immense potential to offer functional insights into the microbiome and its contributions to human disease. However, there are numerous challenges in the metaproteomic analysis of clinical samples, including handling of very large protein sequence databases for sensitive and accurate peptide and protein identification from mass spectrometry data, as well as taxonomic and functional annotation of quantified peptides and proteins to enable interpretation of results. To address these challenges, we have developed a novel clinical metaproteomics workflow that provides customized bioinformatic identification, verification, quantification, and taxonomic and functional annotation. This bioinformatic workflow is implemented in the Galaxy ecosystem and has been used to characterize diverse clinical sample types, such as nasopharyngeal swabs and bronchoalveolar lavage fluid. Here, we demonstrate its effectiveness and availability for use by the research community via analysis of residual fluid from cervical swabs.


Subject(s)
Computational Biology, Proteomics, Workflow, Proteomics/methods, Humans, Computational Biology/methods, Host Microbial Interactions, Mass Spectrometry, Microbiota/genetics, Bronchoalveolar Lavage Fluid/microbiology, Bronchoalveolar Lavage Fluid/chemistry, Bacterial Proteins/genetics
16.
mSystems ; : e0092923, 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38934598

ABSTRACT

Airway microbiota are known to contribute to lung diseases, such as cystic fibrosis (CF), but their contributions to pathogenesis are still unclear. To improve our understanding of host-microbe interactions, we have developed an integrated analytical and bioinformatic mass spectrometry (MS)-based metaproteomics workflow to analyze clinical bronchoalveolar lavage (BAL) samples from people with airway disease. Proteins from BAL cellular pellets were processed and pooled together in groups categorized by disease status (CF vs. non-CF) and bacterial diversity, based on previously performed small subunit rRNA sequencing data. Proteins from each pooled sample group were digested and subjected to liquid chromatography tandem mass spectrometry (MS/MS). MS/MS spectra were matched to human and bacterial peptide sequences leveraging a bioinformatic workflow using a metagenomics-guided protein sequence database and rigorous evaluation. Label-free quantification revealed differentially abundant human peptides from proteins with known roles in CF, like neutrophil elastase and collagenase, and proteins with lesser-known roles in CF, including apolipoproteins. Differentially abundant bacterial peptides were identified from known CF pathogens (e.g., Pseudomonas), as well as other taxa with potentially novel roles in CF. We used this host-microbe peptide panel for targeted parallel-reaction monitoring validation, demonstrating for the first time an MS-based assay effective for quantifying host-microbe protein dynamics within BAL cells from individual CF patients. Our integrated bioinformatic and analytical workflow combining discovery, verification, and validation should prove useful for diverse studies to characterize microbial contributors in airway diseases. Furthermore, we describe a promising preliminary panel of differentially abundant microbe and host peptide sequences for further study as potential markers of host-microbe relationships in CF disease pathogenesis. IMPORTANCE: Identifying microbial pathogenic contributors and dysregulated human responses in airway disease, such as CF, is critical to understanding disease progression and developing more effective treatments. To this end, characterizing the proteins expressed from bacterial microbes and human host cells during disease progression can provide valuable new insights. We describe here a new method to confidently detect and monitor abundance changes of both microbe and host proteins from challenging BAL samples commonly collected from CF patients. Our method uses both state-of-the-art mass spectrometry-based instrumentation to detect proteins present in these samples and customized bioinformatic software tools to analyze the data and characterize detected proteins and their association with CF. We demonstrate the use of this method to characterize microbe and host proteins from individual BAL samples, paving the way for a new approach to understand molecular contributors to CF and other diseases of the airway.

17.
Res Sq ; 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38883770

ABSTRACT

Background: Obstructive lung disease (OLD) is increasingly prevalent among persons living with HIV (PLWH). However, the role of proteases in HIV-associated OLD remains unclear. Methods: We combined proteomics and peptidomics to comprehensively characterize protease activities. Specifically, we paired mass spectrometry (MS) analysis of bronchoalveolar lavage fluid (BALF) peptides and proteins from PLWH with OLD (n=25) and without OLD (n=26) with a targeted Somascan aptamer-based proteomic approach to quantify individual proteases and assess their correlation with lung function. Endogenous peptidomics mapped peptides to native proteins to identify substrates of protease activity. Using the MEROPS database, we identified candidate proteases linked to peptide generation based on binding site affinities, which were assessed via z-scores. We used t-tests to compare average forced expiratory volume in 1 second per predicted value (FEV1pp) between samples with and without detection of each cleaved protein and adjusted for multiple comparisons by controlling the false discovery rate (FDR). Findings: We identified 101 proteases, of which 95 had functional network associations and 22 correlated with FEV1pp. These included cathepsins, metalloproteinases (MMP), caspases and neutrophil elastase. We discovered 31 proteins subject to proteolytic cleavage that associate with FEV1pp, with the top pathways involved in small ubiquitin-like modifier mediated modification (SUMOylation). Proteases linked to protein cleavage included neutrophil elastase, granzyme, and cathepsin D. Interpretations: In HIV-associated OLD, a significant number of proteases are up-regulated, many of which are involved in protein degradation. These proteases degrade proteins involved in cell cycle and protein stability, thereby disrupting critical biological functions.
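
As an illustration of the multiple-comparison adjustment mentioned in the Methods above, the following is a minimal sketch of false discovery rate control. The abstract does not name the FDR procedure used; the Benjamini-Hochberg step-up method is assumed here, and the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: the abstract only says the FDR was controlled; BH is
    one common way to do that and is used here for illustration.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, e.g. one t-test per cleaved protein.
example_p = [0.01, 0.04, 0.03, 0.005]
example_adjusted = benjamini_hochberg(example_p)
```

A test is then called significant at FDR level q when its adjusted p-value is at most q.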

18.
bioRxiv ; 2023 Dec 19.
Article in English | MEDLINE | ID: mdl-38045370

ABSTRACT

Clinical metaproteomics has the potential to offer insights into the host-microbiome interactions underlying diseases. However, the field faces challenges in characterizing microbial proteins found in clinical samples, which are usually present at low abundance relative to the host proteins. As a solution, we have developed an integrated workflow coupling mass spectrometry-based analysis with customized bioinformatic identification, quantification and prioritization of microbial and host proteins, enabling targeted assay development to investigate host-microbe dynamics in disease. The bioinformatics tools are implemented in the Galaxy ecosystem, which supports the development and dissemination of complex bioinformatic workflows. The modular workflow integrates MetaNovo (to generate a reduced protein database), SearchGUI/PeptideShaker and MaxQuant (to generate peptide-spectrum matches (PSMs) and quantification), PepQuery2 (to verify the quality of PSMs), and Unipept and MSstatsTMT (for taxonomy and functional annotation). We have applied this workflow to diverse clinical samples, from nasopharyngeal swab samples to bronchoalveolar lavage fluid. Here, we demonstrate its effectiveness via analysis of residual fluid from cervical swabs. The complete workflow, including training data and documentation, is available via the Galaxy Training Network, empowering non-expert researchers to utilize these powerful tools in their clinical studies.

19.
Environ Microbiome ; 18(1): 56, 2023 Jul 07.
Article in English | MEDLINE | ID: mdl-37420292

ABSTRACT

BACKGROUND: 'Omics methods have empowered scientists to tackle the complexity of microbial communities on a scale not attainable before. Individually, omics analyses can provide great insight, while combined as "meta-omics", they enhance the understanding of which organisms occupy specific metabolic niches, how they interact, and how they utilize environmental nutrients. Here we present three integrative meta-omics workflows, developed in Galaxy, for enhanced analysis and integration of metagenomics, metatranscriptomics, and metaproteomics, combined with our newly developed web-application, ViMO (Visualizer for Meta-Omics), to analyse metabolisms in complex microbial communities. RESULTS: In this study, we applied the workflows to a highly efficient cellulose-degrading minimal consortium enriched from a biogas reactor to analyse the key roles of uncultured microorganisms in complex biomass degradation processes. Metagenomic analysis recovered metagenome-assembled genomes (MAGs) for several constituent populations including Hungateiclostridium thermocellum, Thermoclostridium stercorarium and multiple heterogenic strains affiliated to Coprothermobacter proteolyticus. The metagenomics workflow was developed as two modules, one standard, and one optimized for improving the MAG quality in complex samples by implementing a combination of single- and co-assembly, and dereplication after binning. The exploration of the active pathways within the recovered MAGs can be visualized in ViMO, which also provides an overview of the MAG taxonomy and quality (contamination and completeness), and information about carbohydrate-active enzymes (CAZymes), as well as KEGG annotations and pathways, with counts and abundances at both mRNA and protein level. 
To achieve this, the metatranscriptomic reads and metaproteomic mass-spectrometry spectra are mapped onto predicted genes from the metagenome to analyse the functional potential of MAGs, as well as the actual expressed proteins and functions of the microbiome, all visualized in ViMO. CONCLUSION: Our three workflows for integrative meta-omics in combination with ViMO present a progression in the analysis of 'omics data, particularly within Galaxy, but also beyond. The optimized metagenomics workflow allows for detailed reconstruction of a microbial community consisting of high-quality MAGs, and thus improves analyses of the metabolism of the microbiome using the metatranscriptomics and metaproteomics workflows.

20.
Proteomes ; 10(2), 2022 Apr 14.
Article in English | MEDLINE | ID: mdl-35466239

ABSTRACT

Chronic inflammation of the colon causes genomic and/or transcriptomic events, which can lead to expression of non-canonical protein sequences contributing to oncogenesis. To better understand these mechanisms, Rag2-/-Il10-/- mice were infected with Helicobacter hepaticus to induce chronic inflammation of the cecum and the colon. Transcriptomic data from harvested proximal colon samples were used to generate a customized FASTA database containing non-canonical protein sequences. Using a proteogenomic approach, mass spectrometry data for proximal colon proteins were searched against this custom FASTA database using the Galaxy for Proteomics (Galaxy-P) platform. In addition to the increased abundance of inflammatory response proteins, we also discovered several non-canonical peptide sequences derived from unique proteoforms. We confirmed the veracity of these novel sequences using an automated bioinformatics verification workflow with targeted MS-based assays for peptide validation. Our bioinformatics discovery workflow identified 235 putative non-canonical peptide sequences, of which 58 were verified with high confidence and 39 were validated in targeted proteomics assays. This study provides insights into challenges faced when identifying non-canonical peptides using a proteogenomics approach and demonstrates an integrated workflow addressing these challenges. Our bioinformatic discovery and verification workflow is publicly available and accessible via the Galaxy platform and should be valuable in non-canonical peptide identification using proteogenomics.
