ABSTRACT
SUMMARY: The retarded transient function (RTF) approach serves as a complementary method to ordinary differential equations (ODEs) for modelling dynamics typically observed in cellular signalling processes. We introduce an R package that implements the RTF approach, originally implemented within the MATLAB-based Data2Dynamics modelling framework. This package facilitates the modelling of time and dose dependencies, and it includes the possibility of model reduction to minimize overfitting. It can be applied to experimental data or trajectories of ODE models to characterize their dynamics. Additionally, it can generate a low-dimensional representation based on the fitted RTF parameters of a set of time-resolved data, aiding in the identification of key targets of experimental perturbations. AVAILABILITY AND IMPLEMENTATION: The R package RTF is available at https://github.com/kreutz-lab/RTF.
Subjects
Software, Biological Models, Algorithms, Signal Transduction, Computational Biology/methods
ABSTRACT
State-of-the-art mass spectrometers, combined with modern bioinformatics algorithms for peptide-to-spectrum matching (PSM) and robust statistical scoring, allow more variable features (i.e., post-translational modifications) to be reliably identified from (tandem) mass spectrometry data, often without the need for biochemical enrichment. Semi-specific proteome searches, which enforce theoretical enzymatic cleavage at only the N- or C-terminal end of a peptide, allow the identification of native protein termini or of termini arising from endogenous proteolytic activity (also referred to as "neo-N-termini" analysis or "N-terminomics"). Nevertheless, deriving biological meaning from these search outputs can be challenging in terms of data mining and analysis. Thus, we introduce TermineR, a data analysis approach for the (1) annotation of peptides according to their enzymatic cleavage specificity and known protein processing features, (2) differential abundance and enrichment analysis of N-terminal sequence patterns, and (3) visualization of neo-N-termini location. We illustrate the use of TermineR by applying it to tandem mass tag (TMT)-based proteomics data from a mouse model of polycystic kidney disease, and we assess the use of semi-specific searches for the biological interpretation of cleavage events and of the variable contribution of proteolytic products to general protein abundance. The TermineR approach and example data are available as an R package at https://github.com/MiguelCos/TermineR.
Subjects
Proteolysis, Proteomics, Tandem Mass Spectrometry, Proteomics/methods, Animals, Mice, Tandem Mass Spectrometry/methods, Post-Translational Protein Processing, Algorithms, Polycystic Kidney Diseases/metabolism, Proteome/metabolism, Proteome/analysis, Software, Protein Databases, Peptides/metabolism, Peptides/analysis, Peptides/chemistry
ABSTRACT
Data-independent acquisition (DIA) has revolutionized the field of mass spectrometry (MS)-based proteomics over the past few years. DIA stands out for its ability to systematically sample all peptides in a given m/z range, allowing an unbiased acquisition of proteomics data. This greatly mitigates the issue of missing values and significantly enhances quantitative accuracy, precision, and reproducibility compared to many traditional methods. This review examines the critical role of DIA analysis software tools, focusing primarily on their capabilities and the challenges they address in proteomic research. Advances in MS technology, such as trapped ion mobility spectrometry or high-field asymmetric waveform ion mobility spectrometry, require sophisticated analysis software capable of handling the increased data complexity and exploiting the full potential of DIA. We identify and critically evaluate leading software tools in the DIA landscape, discussing their unique features and the reliability of their quantitative and qualitative outputs. We present the biological and clinical relevance of DIA-MS and discuss crucial publications that paved the way for in-depth proteomic characterization in patient-derived specimens. Furthermore, we provide a perspective on emerging trends in clinical applications and present upcoming challenges, including standardization and certification of MS-based acquisition strategies in molecular diagnostics. While we emphasize the need for continuous development of software tools to keep pace with evolving technologies, we advise researchers against uncritically accepting the results from DIA software tools. Each tool may have its own biases, and some may not be as sensitive or reliable as others. Our overarching recommendation for both researchers and clinicians is to employ multiple DIA analysis tools, utilizing orthogonal analysis approaches to enhance the robustness and reliability of their findings.
Subjects
Mass Spectrometry, Proteomics, Software, Proteomics/methods, Humans, Mass Spectrometry/methods, Reproducibility of Results
ABSTRACT
Despite substantial heterogeneity among studies, there is evidence that antibiotics commonly used in primary care alter the composition and/or diversity of the gastrointestinal microbiota. Benzyl isothiocyanate (BITC) from the food and medicinal plant nasturtium (Tropaeolum majus) is known for its antimicrobial activity and is used for the treatment of infections of the draining urinary tract and upper respiratory tract. Against this background, we asked whether a 14 d nasturtium intervention (3 g daily, N = 30 healthy females) could also affect the normal gut microbiota composition. Spot urinary BITC excretion correlated strongly with a weak but significant antibacterial effect against Escherichia coli. A significant increase in human beta defensin 1, a parameter of host defense, was seen in urine and exhaled breath condensate (EBC) upon verum intervention. Pre-to-post analysis revealed that mean gut microbiome composition did not differ significantly between groups, nor did the circulating serum metabolome. At the individual level, however, some large changes were observed between sampling points. Explorative Spearman rank correlation analysis in subgroups revealed associations between gut microbiota and the circulating metabolome, as well as between changes in blood markers and bacterial gut species.
Subjects
Gastrointestinal Microbiome, Nasturtium, Tropaeolum, Female, Humans, Isothiocyanates/pharmacology, Bacteria, Escherichia coli, Metabolome
ABSTRACT
Pancreatic ductal adenocarcinoma (PDAC) is a highly lethal cancer, often diagnosed at stages that preclude surgical resection. Neoadjuvant therapies offer potential tumor regression and improved resectability. Although features of the tumor biology (e.g., molecular markers) may guide adjuvant therapy, biological alterations after neoadjuvant therapy remain largely unexplored. We performed mass spectrometry to characterize the proteomes of 67 PDAC resection specimens from patients who received either neoadjuvant chemotherapy (NCT) or neoadjuvant chemoradiation therapy (NCRT). We employed data-independent acquisition (DIA), yielding a proteome coverage in excess of 3500 proteins. Moreover, we successfully integrated two publicly available proteome datasets of treatment-naïve PDAC to unravel proteome alterations in response to neoadjuvant therapy, highlighting the feasibility of this approach. We found highly distinguishable proteome profiles. Treatment-naïve PDAC was characterized by enrichment of immunoglobulins, complement and extracellular matrix (ECM) proteins. Post-NCT and post-NCRT PDAC presented a high abundance of ribosomal and metabolic proteins compared to treatment-naïve PDAC. Further analyses of patient survival and protein expression identified treatment-specific prognostic candidates. We present the first proteomic characterization of the residual PDAC mass after NCT and NCRT, and potential protein candidate markers associated with overall survival. We conclude that residual PDAC exhibits fundamentally different proteome profiles compared to treatment-naïve PDAC, influenced by the type of neoadjuvant treatment. These findings may impact adjuvant or targeted therapy options.
Subjects
Pancreatic Ductal Carcinoma, Pancreatic Neoplasms, Humans, Neoadjuvant Therapy, Ribosomal Proteins, Proteome, Residual Neoplasm, Proteomics, Pancreatic Neoplasms/pathology, Pancreatic Ductal Carcinoma/pathology, Complement Activation, Energy Metabolism
ABSTRACT
Recent extensions of single-cell studies to multiple data modalities raise new questions regarding experimental design. For example, the challenge of sparsity in single-omics data might be partly resolved by compensating for missing information across modalities. In particular, deep learning approaches, such as deep generative models (DGMs), can potentially uncover complex patterns via a joint embedding. Yet, this also raises the question of sample size requirements for identifying such patterns from single-cell multi-omics data. Here, we empirically examine the quality of DGM-based integrations for varying sample sizes. We first review the existing literature and give a short overview of deep learning methods for multi-omics integration. Next, we consider eight popular tools in more detail and examine their robustness to different cell numbers, covering two of the most common multi-omics types currently favored. Specifically, we use data featuring simultaneous gene expression measurements at the RNA level and protein abundance measurements for cell surface proteins (CITE-seq), as well as data where chromatin accessibility and RNA expression are measured in thousands of cells (10x Multiome). We examine the ability of the methods to learn joint embeddings based on biological and technical metrics. Finally, we provide recommendations for the design of multi-omics experiments and discuss potential future developments.
ABSTRACT
Numerous software tools exist for data-independent acquisition (DIA) analysis of clinical samples, necessitating their comprehensive benchmarking. We present a benchmark dataset comprising real-world inter-patient heterogeneity, which we use for in-depth benchmarking of DIA data analysis workflows for clinical settings. Combining spectral libraries, DIA software, sparsity reduction, normalization, and statistical tests results in 1428 distinct data analysis workflows, which we evaluate based on their ability to correctly identify differentially abundant proteins. From our dataset, we derive bootstrap datasets of varying sample sizes and use the whole range of bootstrap datasets to robustly evaluate each workflow. We find that all DIA software suites benefit from using a gas-phase fractionated spectral library, irrespective of the library refinement used. Gas-phase fractionation-based libraries perform best against two out of three reference protein lists. Among all investigated statistical tests, non-parametric permutation-based tests consistently perform best.
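To illustrate what a non-parametric permutation-based test does, the following is a minimal sketch in Python. It is not the benchmark's implementation: the function name `permutation_test`, the use of the difference in group means as the test statistic, and the default number of permutations are all assumptions made for this example.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the absolute difference in group means.

    Hypothetical helper for illustration; the statistic and defaults are
    assumptions, not taken from the benchmark described above.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # The add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up intensities for one protein in two patient groups.
    control = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    treated = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
    print(permutation_test(control, treated))
```

Because the null distribution is built from the data themselves, no distributional assumption about protein intensities is needed, which is the property that distinguishes this family of tests from parametric alternatives.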
Subjects
Benchmarking, Proteomics, Humans, Proteome/analysis, Proteomics/methods, Software, Workflow
ABSTRACT
Imputation is a prominent strategy when dealing with missing values (MVs) in proteomics data analysis pipelines. However, the performance of different imputation methods is difficult to assess and varies strongly depending on data characteristics. To overcome this issue, we present the concept of a data-driven selection of an imputation algorithm (DIMA). The performance and broad applicability of DIMA are demonstrated on 142 quantitative proteomics data sets from the PRoteomics IDEntifications (PRIDE) database and on simulated data consisting of 5-50% MVs with different proportions of missing not at random and missing completely at random values. DIMA reliably suggests a high-performing imputation algorithm, which is always among the three best algorithms and results in a root mean square error difference (ΔRMSE) ≤ 10% in 80% of the cases. DIMA implementation is available in MATLAB at github.com/kreutz-lab/OmicsData and in R at github.com/kreutz-lab/DIMAR.
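The general idea of choosing an imputation method from the data can be sketched as follows. This is a simplified illustration and not the DIMA implementation: it masks observed values completely at random, whereas the abstract also considers values missing not at random, and the three candidate imputers and all function names are hypothetical choices made for this example.

```python
import math
import random
from statistics import mean, median


def _fill(matrix, row_value):
    """Replace None in each row with row_value(observed values of that row)."""
    filled = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        # A row without any observed value falls back to 0.0 in this sketch.
        fill = row_value(observed) if observed else 0.0
        filled.append([fill if v is None else v for v in row])
    return filled


# Hypothetical candidate imputers; real pipelines use more elaborate methods.
CANDIDATES = {
    "row_mean": lambda m: _fill(m, mean),
    "row_median": lambda m: _fill(m, median),
    "row_min": lambda m: _fill(m, min),
}


def select_imputation(matrix, mask_fraction=0.2, seed=0):
    """Mask known values, impute them with each candidate, and compare RMSE."""
    rng = random.Random(seed)
    observed = [(i, j) for i, row in enumerate(matrix)
                for j, v in enumerate(row) if v is not None]
    n_mask = max(1, int(len(observed) * mask_fraction))
    masked = rng.sample(observed, n_mask)
    working = [list(row) for row in matrix]
    for i, j in masked:
        working[i][j] = None
    scores = {}
    for name, impute in CANDIDATES.items():
        filled = impute(working)
        squared = [(filled[i][j] - matrix[i][j]) ** 2 for i, j in masked]
        scores[name] = math.sqrt(mean(squared))
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0, None],
            [2.0, None, 4.0, 5.0],
            [10.0, 11.0, 12.0, 13.0]]
    print(select_imputation(data))
```

The candidate with the lowest RMSE on the artificially masked entries is returned, together with the scores of all candidates so that near-ties remain visible.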
Subjects
Algorithms, Proteomics, Factual Databases, Humans
ABSTRACT
High-throughput biological data, such as mass spectrometry (MS)-based proteomics data, suffer from systematic non-biological variance due to systematic errors. This hinders the estimation of "real" biological signals and, in turn, decreases the power of statistical tests and biases the identification of differentially expressed proteins. To remove such unintended variation while retaining the biological signal of interest, analysis workflows for quantitative MS data typically comprise normalization prior to statistical analysis. Several normalization methods, such as quantile normalization (QN), were originally developed for microarray data. In contrast to microarray data, proteomics data may contain features, in the form of protein intensities, that are consistently high across experimental conditions and, hence, are encountered in the tails of the protein intensity distribution. If QN is applied in the presence of such proteins, statistical inferences about the features' intensity profiles are impeded due to the biased estimation of their variance. A freely available, novel approach is introduced that improves on classical QN by preserving the biological signals of features in the tails of the intensity distribution and by accounting for sample-dependent missing values (MVs): the "tail-robust quantile normalization" (TRQN).
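For orientation, classical quantile normalization can be sketched in a few lines of Python. This shows the classical method only, not TRQN; it assumes a complete matrix without missing values, breaks ties by input order, and uses a hypothetical function name `quantile_normalize`.

```python
def quantile_normalize(columns):
    """Classical quantile normalization.

    `columns` holds one list of intensities per sample, all of equal length
    and without missing values (an assumption of this sketch).
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must have the same number of features")
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean across samples at each rank.
    rank_means = [sum(col[r] for col in sorted_cols) / len(columns)
                  for r in range(n)]
    normalized = []
    for col in columns:
        # Feature indices ordered by intensity; ties keep their input order.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
    print(quantile_normalize(samples))
```

After normalization every sample has the same set of values. A feature that holds the top rank in every sample therefore receives the identical value in all samples, so its variation between samples is flattened; this is the tail effect the abstract describes.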
Subjects
Proteins, Proteomics, Gene Expression Profiling, Mass Spectrometry
ABSTRACT
Autism spectrum disorders (ASD) are a heterogeneous group of disorders with complex behavioural phenotypes. Although ASD is a highly heritable neuropsychiatric disorder, genetic research alone has not provided a profound understanding of the underlying causes. Recent developments using biochemical tools such as transcriptomics, proteomics and cellular models will pave the way to new insights into the underlying pathological pathways. This review addresses the state of the art in the search for molecular biomarkers for ASD. In particular, the most important findings in the biochemical field are highlighted, and the need for establishing streamlined interaction between behavioural studies, genetics and proteomics is stressed. Eventually, these approaches will lead to suitable translational ASD models and, therefore, a better disease understanding, which may facilitate novel drug discovery efforts in this challenging field.
Subjects
Biomarkers, Pervasive Child Development Disorders/diagnosis, Proteomics, Pervasive Child Development Disorders/genetics, Humans
ABSTRACT
The Gram-negative opportunistic pathogen Legionella pneumophila replicates in phagocytes within a specific compartment, the Legionella-containing vacuole (LCV). Formation of LCVs is a complex process requiring the bacterial Icm/Dot type IV secretion system and more than 100 translocated effector proteins, which putatively subvert cellular signaling and vesicle trafficking pathways. Phosphoinositide (PI) glycerolipids are pivotal regulators of signal transduction and membrane dynamics in eukaryotes. Recently, a number of Icm/Dot substrates were found to anchor to the LCV membrane by binding to PIs. One of these effectors, SidC, specifically interacts with phosphatidylinositol-4 phosphate [PtdIns(4)P]. Using an antibody against SidC and magnetic beads coupled to a secondary antibody, intact LCVs were purified by immuno-magnetic separation, followed by density centrifugation. This purification strategy is in principle applicable to any pathogen vacuole that carries specific markers. The LCV proteome determined by LC-MS/MS revealed 566 host proteins, including novel components of the endosomal pathway, as well as the early and late secretory trafficking pathways. Thus, LCV formation is a robust process that involves many (functionally redundant) Icm/Dot substrates, as well as the interaction with different host cell vesicle trafficking pathways.
ABSTRACT
The causative agent of Legionnaires' disease, Legionella pneumophila, forms a replicative vacuole in phagocytes by means of the intracellular multiplication/defective organelle trafficking (Icm/Dot) type IV secretion system and translocated effector proteins, some of which subvert host GTP and phosphoinositide (PI) metabolism. The Icm/Dot substrate SidC anchors to the membrane of Legionella-containing vacuoles (LCVs) by specifically binding to phosphatidylinositol 4-phosphate (PtdIns(4)P). Using a nonbiased screen for novel L. pneumophila PI-binding proteins, we identified the Rab1 guanine nucleotide exchange factor (GEF) SidM/DrrA as the predominant PtdIns(4)P-binding protein. Purified SidM specifically and directly bound to PtdIns(4)P, whereas the SidM-interacting Icm/Dot substrate LidA preferentially bound PtdIns(3)P but also PtdIns(4)P, and the L. pneumophila Arf1 GEF RalF did not bind to any PIs. The PtdIns(4)P-binding domain of SidM was mapped to the 12-kDa C-terminal sequence, termed "P4M" (PtdIns4P binding of SidM/DrrA). The isolated P4M domain is largely helical and displayed higher PtdIns(4)P binding activity in the context of the α-helical, monomeric full-length protein. SidM constructs containing P4M were translocated by Icm/Dot-proficient L. pneumophila and localized to the LCV membrane, indicating that SidM anchors to PtdIns(4)P on LCVs via its P4M domain. An L. pneumophila ΔsidM mutant strain displayed significantly higher amounts of SidC on LCVs, suggesting that SidM and SidC compete for limiting amounts of PtdIns(4)P on the vacuole. Finally, RNA interference revealed that PtdIns(4)P on LCVs is specifically formed by host PtdIns 4-kinase IIIβ. Thus, L. pneumophila exploits PtdIns(4)P produced by PtdIns 4-kinase IIIβ to anchor the effectors SidC and SidM to LCVs.
Subjects
Bacterial Proteins/chemistry, Carrier Proteins/chemistry, Guanine Nucleotide Exchange Factors/chemistry, Legionella pneumophila/chemistry, Phosphatidylinositol Phosphates/chemistry, rab1 GTP-Binding Proteins/chemistry, Animals, Bacterial Proteins/genetics, Bacterial Proteins/metabolism, Carrier Proteins/genetics, Carrier Proteins/metabolism, Cell Line, Drosophila, Guanine Nucleotide Exchange Factors/genetics, Guanine Nucleotide Exchange Factors/metabolism, Humans, Legionella pneumophila/genetics, Legionella pneumophila/metabolism, Legionella pneumophila/pathogenicity, Legionnaires' Disease/genetics, Legionnaires' Disease/metabolism, Mutation, Peptide Mapping, Phagocytes/metabolism, Phagocytes/microbiology, Phosphatidylinositol Phosphates/genetics, Phosphatidylinositol Phosphates/metabolism, Protein Binding/physiology, Tertiary Protein Structure/physiology, Vacuoles/genetics, Vacuoles/metabolism, Vacuoles/microbiology, rab1 GTP-Binding Proteins/genetics, rab1 GTP-Binding Proteins/metabolism
ABSTRACT
Curli fibers, encoded by the csgBAC genes, promote biofilm formation in Escherichia coli and other enterobacteria. Curli production is dependent on the CsgD transcription activator, which also promotes cellulose biosynthesis. In this study, we investigated the effects of CsgD expression from a weak constitutive promoter in the biofilm formation-deficient PHL565 strain of E. coli. We found that despite its function as a transcription activator, the CsgD protein is localized in the cytoplasmic membrane. Constitutive CsgD expression promotes biofilm formation by PHL565 and activates transcription from the csgBAC promoter; however, csgBAC expression remains dependent on temperature and the growth medium. Constitutive expression of the CsgD protein results in altered transcription patterns for at least 24 novel genes, in addition to the previously identified CsgD-dependent genes. The cspA and fecR genes, encoding regulatory proteins responding to cold shock and to iron, respectively, and yoaD, encoding a putative negative regulator of cellulose biosynthesis, were found to be some of the novel CsgD-regulated genes. Consistent with the predicted functional role, increased expression of the yoaD gene negatively affects cell aggregation, while yoaD inactivation results in stimulation of cell aggregation and leads to increased cellulose production. Inactivation of fecR results in significant increases in both cell aggregation and biofilm formation, while the effects of cspA are not as strong in the conditions tested. Our results indicate that CsgD can modulate cellulose biosynthesis through activation of the yoaD gene. In addition, the positive effect of CsgD on biofilm formation might be enhanced by repression of the fecR gene.
Subjects
Bacterial Adhesion/physiology, Cellulose/biosynthesis, Escherichia coli Proteins/physiology, Escherichia coli/physiology, Bacterial Gene Expression Regulation/physiology, Trans-Activators/physiology, Artificial Gene Fusion, Bacterial Adhesion/genetics, Biofilms/growth & development, Cell Membrane/chemistry, Cold Shock Proteins and Peptides, Culture Media/chemistry, Escherichia coli/genetics, Escherichia coli Proteins/analysis, Escherichia coli Proteins/biosynthesis, Escherichia coli Proteins/genetics, Gene Deletion, Reporter Genes, Glucuronidase/analysis, Glucuronidase/genetics, Heat-Shock Proteins/genetics, Membrane Transport Proteins/genetics, Biological Models, Phosphoric Diester Hydrolases/genetics, Genetic Promoter Regions, Sigma Factor/genetics, Temperature, Trans-Activators/analysis, Genetic Transcription
ABSTRACT
Production of curli, extracellular structures important for biofilm formation, is positively regulated by OmpR, which, together with the EnvZ protein, constitutes an osmolarity-sensing two-component regulatory system. The expression of curli is cryptic in most Escherichia coli laboratory strains, such as MG1655, due to the lack of csgD expression. The csgD gene encodes a transcription activator of the curli-subunit-encoding csgBA operon. The ompR234 up-mutation can restore csgD expression, resulting in curli production and increased biofilm formation. In this report, it is shown that ompR234-dependent csgD expression, in addition to csgBA activation during the stationary phase of growth, stimulates expression of the yaiC gene and negatively regulates at least two other genes, pepD and yagS. The promoter regions of these four genes share a conserved 11 bp sequence (CGGGKGAKNKA), necessary for csgBA and yaiC regulation by CsgD. While at both the csgBA and yaiC promoters the sequence is located upstream of the promoter elements, in both yagS and pepD it overlaps either the putative -10 sequence or the transcription start point, suggesting that CsgD can function as both an activator and a repressor. Adhesion experiments show that csgD-independent expression of both yagS and pepD from a multicopy plasmid negatively affects biofilm formation, which, in contrast, is stimulated by yaiC expression. Thus, it is proposed that CsgD stimulates biofilm formation in E. coli by simultaneous activation of positive determinants of adhesion (the curli-encoding csg operons and the product of the yaiC gene) and repression of negative effectors such as yagS and pepD.
Subjects
Bacterial Proteins/biosynthesis, Biofilms/growth & development, Escherichia coli Proteins, Escherichia coli/growth & development, Trans-Activators/physiology, Alleles, Bacterial Adhesion, Base Sequence, Binding Sites, Escherichia coli/genetics, Bacterial Gene Expression Regulation, Molecular Sequence Data, Genetic Promoter Regions
ABSTRACT
This study reports on the development and application of a fish-specific estrogen-responsive reporter gene assay. The assay is based on the rainbow trout (Oncorhynchus mykiss) gonad cell line RTG-2 in which an acute estrogenic response is created by cotransfecting cultures with an expression vector containing rainbow trout estrogen receptor α complementary DNA (rtERα cDNA) in the presence of an estrogen-dependent reporter plasmid and an estrogen receptor (ER) agonist. In a further approach, RTG-2 cells were stably transfected with the rtERα cDNA expression vector, and clones responsive to 17β-estradiol (E2) were selected. The estrogenic activity of E2, 17α-ethinylestradiol, 4-nonylphenol, nonylphenoxy acetic acid, 4-tert-octylphenol, bisphenol A, o,p'-DDT, p,p'-DDT, o,p'-2,2-bis(chlorophenyl)-1,1-dichloroethylene (o,p'-DDE), p,p'-DDE, o,p'-2,2-bis(chlorophenyl)-1,1-dichloroethane (o,p'-DDD), p,p'-DDD, and p,p'-2,2-bis(chlorophenyl)acetic acid (p,p'-DDA) was assessed at increasing concentrations. All compounds except o,p'-DDT, p,p'-DDE, and p,p'-DDA showed logistic dose-response curves, which allowed the calculation of lowest-observed-effect concentrations and the concentrations at which half-maximal reporter gene activities were reached. To check whether estrogen-responsive RTG-2 cells may be used to detect the estrogenic activity of environmental samples, an extract from a sewage treatment plant (STP) effluent was assessed and found to have estrogenic activity corresponding to the transcriptional activity elicited by 0.05 nM of E2. Dose-response curves of nonylphenol, octylphenol, bisphenol A, and o,p'-DDD revealed that the RTG-2 reporter gene assay is more sensitive for these compounds when compared to transfection systems recombinant for mammalian ERs. These differences may have an effect on the calculation of E2 equivalents when estrogenic mixtures of known constitution, or environmental samples, such as STP effluents, are assessed.
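A logistic dose-response curve and the concentration of half-maximal activity can be illustrated with a short Python sketch. The abstract does not state which logistic model was fitted, so the four-parameter Hill-type form used here is an assumption, as are the function names and every parameter value in the example.

```python
def logistic_response(conc, bottom, top, ec50, hill):
    """Four-parameter logistic (Hill-type) dose-response curve.

    Assumed model form for illustration; `ec50` is the concentration at
    which the response is halfway between `bottom` and `top`.
    """
    if conc <= 0:
        return bottom
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill)


def equivalent_concentration(response, bottom, top, ec50, hill):
    """Invert the curve: concentration of the reference compound that would
    elicit `response` (must lie strictly between `bottom` and `top`)."""
    if not bottom < response < top:
        raise ValueError("response must lie strictly between bottom and top")
    ratio = (top - bottom) / (response - bottom) - 1.0
    return ec50 / ratio ** (1.0 / hill)


if __name__ == "__main__":
    # Hypothetical curve parameters, with concentrations in nM.
    params = dict(bottom=0.0, top=100.0, ec50=0.2, hill=1.5)
    activity = logistic_response(0.05, **params)
    print(activity, equivalent_concentration(activity, **params))
```

Inverting a reference curve in this way is one simple route to expressing the activity of a sample as an equivalent concentration of the reference compound, which is the kind of comparison the abstract makes for the STP effluent extract.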