Results 1 - 20 of 392
1.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38426325

ABSTRACT

Accurate metabolite annotation and false discovery rate (FDR) control remain challenging in large-scale metabolomics. Recent progress leveraging proteomics experiences and interdisciplinary inspirations has provided valuable insights. While target-decoy strategies have been introduced, generating reliable decoy libraries is difficult due to metabolite complexity. Moreover, continuous bioinformatics innovation is imperative to improve the utilization of expanding spectral resources while reducing false annotations. Here, we introduce the concept of ion entropy for metabolomics and propose two entropy-based decoy generation approaches. Assessment of public databases validates ion entropy as an effective metric to quantify ion information in massive metabolomics datasets. Our entropy-based decoy strategies outperform current representative methods in metabolomics and achieve superior FDR estimation accuracy. Analysis of 46 public datasets provides instructive recommendations for practical application.


Subjects
Algorithms , Tandem Mass Spectrometry , Entropy , Tandem Mass Spectrometry/methods , Metabolomics/methods , Computational Biology/methods , Databases, Protein
2.
Brain ; 147(3): 858-870, 2024 03 01.
Article in English | MEDLINE | ID: mdl-37671566

ABSTRACT

Parkinson's disease is an age-related neurodegenerative disorder with a higher incidence in males than females. The causes for this sex difference are unknown. Genome-wide association studies (GWAS) have identified 90 Parkinson's disease risk loci, but the genetic studies have not found sex-specific differences in allele frequency on autosomal chromosomes or sex chromosomes. Genetic variants, however, could exert sex-specific effects on gene function and regulation of gene expression. To identify genetic loci that might have sex-specific effects, we studied pleiotropy between Parkinson's disease and sex-specific traits. Summary statistics from GWASs were acquired from large-scale consortia for Parkinson's disease (n cases = 13 708; n controls = 95 282), age at menarche (n = 368 888 females) and age at menopause (n = 69 360 females). We applied the conditional/conjunctional false discovery rate (FDR) method to identify shared loci between Parkinson's disease and these sex-specific traits. Next, we investigated sex-specific gene expression differences in the superior frontal cortex of both neuropathologically healthy individuals and Parkinson's disease patients (n cases = 61; n controls = 23). To provide biological insights to the genetic pleiotropy, we performed sex-specific expression quantitative trait locus (eQTL) analysis and sex-specific age-related differential expression analysis for genes mapped to Parkinson's disease risk loci. Through conditional/conjunctional FDR analysis we found 11 loci shared between Parkinson's disease and the sex-specific traits age at menarche and age at menopause. Gene-set and pathway analysis of the genes mapped to these loci highlighted the importance of the immune response in determining an increased disease incidence in the male population. Moreover, we highlighted a total of nine genes whose expression or age-related expression in the human brain is influenced by genetic variants in a sex-specific manner. 
With these analyses we demonstrated that the lack of clear sex-specific differences in allele frequencies for Parkinson's disease loci does not exclude a genetic contribution to differences in disease incidence. Moreover, further studies are needed to elucidate the role that the candidate genes identified here could have in determining a higher incidence of Parkinson's disease in the male population.


Subjects
Parkinson Disease , Humans , Female , Male , Parkinson Disease/genetics , Genome-Wide Association Study , Sex Characteristics , Phenotype , Brain
3.
Proteomics ; 24(3-4): e2300068, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37997224

ABSTRACT

Top-down proteomics (TDP) directly analyzes intact proteins and thus provides more comprehensive qualitative and quantitative proteoform-level information than conventional bottom-up proteomics (BUP) that relies on digested peptides and protein inference. While significant advancements have been made in TDP in sample preparation, separation, instrumentation, and data analysis, reliable and reproducible data analysis still remains one of the major bottlenecks in TDP. A key step for robust data analysis is the establishment of an objective estimation of proteoform-level false discovery rate (FDR) in proteoform identification. The most widely used FDR estimation scheme is based on the target-decoy approach (TDA), which has primarily been established for BUP. We present evidence that the TDA-based FDR estimation may not work at the proteoform-level due to an overlooked factor, namely the erroneous deconvolution of precursor masses, which leads to incorrect FDR estimation. We argue that the conventional TDA-based FDR in proteoform identification is in fact protein-level FDR rather than proteoform-level FDR unless precursor deconvolution error rate is taken into account. To address this issue, we propose a formula to correct for proteoform-level FDR bias by combining TDA-based FDR and precursor deconvolution error rate.


Subjects
Peptides , Proteomics , DNA-Binding Proteins
4.
Proteomics ; 24(8): e2300084, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38380501

ABSTRACT

Assigning statistical confidence estimates to discoveries produced by a tandem mass spectrometry proteomics experiment is critical to enabling principled interpretation of the results and assessing the cost/benefit ratio of experimental follow-up. The most common technique for computing such estimates is to use target-decoy competition (TDC), in which observed spectra are searched against a database of real (target) peptides and a database of shuffled or reversed (decoy) peptides. TDC procedures for estimating the false discovery rate (FDR) at a given score threshold have been developed for application at the level of spectra, peptides, or proteins. Although these techniques are relatively straightforward to implement, it is common in the literature to skip over the implementation details or even to make mistakes in how the TDC procedures are applied in practice. Here we present Crema, an open-source Python tool that implements several TDC methods of spectrum-, peptide- and protein-level FDR estimation. Crema is compatible with a variety of existing database search tools and provides a straightforward way to obtain robust FDR estimates.


Subjects
Algorithms , Peptides , Databases, Protein , Peptides/chemistry , Proteins/analysis , Proteomics/methods
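
The target-decoy competition (TDC) idea summarized in the abstract above can be sketched in a few lines. This is a minimal illustration, not Crema's implementation or API: the function name `tdc_qvalues`, the "higher score is better" convention, and the conservative `(decoys + 1) / targets` estimator are assumptions chosen for the sketch, and ties in score are not treated specially.

```python
def tdc_qvalues(scores, is_decoy):
    """Estimate q-values from target-decoy competition winners.

    scores   -- one score per winning match (assumed: higher is better)
    is_decoy -- True where the winning match came from the decoy database
    Returns a list of q-values aligned with the input order.
    """
    # Rank matches from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Walk down the ranking and estimate the FDR at each score threshold.
    # The +1 is a conservative correction (an assumption of this sketch).
    fdrs = []
    targets = decoys = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, (decoys + 1) / max(targets, 1)))

    # q-value: the smallest estimated FDR at this threshold or any looser one.
    qvals = [0.0] * len(scores)
    running = 1.0
    for rank in range(len(order) - 1, -1, -1):
        running = min(running, fdrs[rank])
        qvals[order[rank]] = running
    return qvals


if __name__ == "__main__":
    # Hypothetical toy data: six winning matches, two of them decoys.
    example_scores = [10, 9, 8, 7, 6, 5]
    example_decoys = [False, False, False, True, False, True]
    for s, d, q in zip(example_scores, example_decoys,
                       tdc_qvalues(example_scores, example_decoys)):
        print(s, "decoy" if d else "target", round(q, 3))
```

Accepting every target whose q-value is at or below a chosen level (for example 0.01) gives the list of discoveries at that estimated FDR.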
5.
J Proteome Res ; 23(6): 1894-1906, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38652578

ABSTRACT

Searching tandem mass spectrometry proteomics data against a database is a well-established method for assigning peptide sequences to observed spectra but typically cannot identify peptides harboring unexpected post-translational modifications (PTMs). Open modification searching aims to address this problem by allowing a spectrum to match a peptide even if the spectrum's precursor mass differs from the peptide mass. However, expanding the search space in this way can lead to a loss of statistical power to detect peptides. We therefore developed a method, called CONGA (combining open and narrow searches with group-wise analysis), that takes into account results from both types of searches (a traditional "narrow window" search and an open modification search) while carrying out rigorous false discovery rate control. The result is an algorithm that provides the best of both worlds: the ability to detect unexpected PTMs without a concomitant loss of power to detect unmodified peptides.


Subjects
Algorithms , Databases, Protein , Protein Processing, Post-Translational , Proteomics , Tandem Mass Spectrometry , Tandem Mass Spectrometry/methods , Proteomics/methods , Peptides/analysis , Peptides/chemistry , Humans , Software , Amino Acid Sequence
6.
J Proteome Res ; 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38426863

ABSTRACT

Neuropeptides represent a unique class of signaling molecules that have garnered much attention but require special consideration when identifications are gleaned from mass spectra. With highly variable sequence lengths, neuropeptides must be analyzed in their endogenous state. Further, neuropeptides share great homology within families, differing by as little as a single amino acid residue, complicating even routine analyses and necessitating optimized computational strategies for confident and accurate identifications. We present EndoGenius, a database searching strategy designed specifically for elucidating neuropeptide identifications from mass spectra by leveraging optimized peptide-spectrum matching approaches, an expansive motif database, and a novel scoring algorithm to achieve broader representation of the neuropeptidome and minimize reidentification. This work describes an algorithm capable of reporting more neuropeptide identifications at 1% false-discovery rate than alternative software in five Callinectes sapidus neuronal tissue types.

7.
J Proteome Res ; 23(6): 1907-1914, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38687997

ABSTRACT

Traditional database search methods for the analysis of bottom-up proteomics tandem mass spectrometry (MS/MS) data are limited in their ability to detect peptides with post-translational modifications (PTMs). Recently, "open modification" database search strategies, in which the requirement that the mass of the database peptide closely matches the observed precursor mass is relaxed, have become popular as ways to find a wider variety of types of PTMs. Indeed, in one study, Kong et al. reported that the open modification search tool MSFragger can achieve higher statistical power to detect peptides than a traditional "narrow window" database search. We investigated this claim empirically and, in the process, uncovered a potential general problem with false discovery rate (FDR) control in the machine learning postprocessors Percolator and PeptideProphet. This problem might have contributed to Kong et al.'s report that their empirical results suggest that FDR control in the narrow window setting might generally be compromised. Indeed, reanalyzing the same data while using a more standard form of target-decoy competition-based FDR control, we found that, after accounting for chimeric spectra as well as for the inherent difference in the number of candidates in open and narrow searches, the data do not provide sufficient evidence that FDR control in proteomics MS/MS database search is inherently problematic.


Subjects
Databases, Protein , Protein Processing, Post-Translational , Proteomics , Tandem Mass Spectrometry , Tandem Mass Spectrometry/methods , Proteomics/methods , Peptides/analysis , Peptides/chemistry , Machine Learning , Humans , Algorithms , Software
8.
BMC Genomics ; 25(1): 264, 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38459442

ABSTRACT

While single-cell RNA sequencing (scRNA-seq) allows researchers to analyze gene expression in individual cells, its unique characteristics, such as over-dispersion, zero-inflation, high gene-gene correlation, and large data volume with many features, pose challenges for most existing feature selection methods. In this paper, we present a neural-network-based feature selection method (scFSNN) to solve classification problems for scRNA-seq data. scFSNN is an embedded method that can automatically select features (genes) during model training, control the false discovery rate of selected features and adaptively determine the number of features to be eliminated. Extensive simulation and real data studies demonstrate its excellent feature selection ability and predictive performance.


Subjects
Neural Networks, Computer , Single-Cell Gene Expression Analysis , Computer Simulation , Single-Cell Analysis/methods , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods , Cluster Analysis
9.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35534181

ABSTRACT

Proteogenomics refers to the integrated analysis of the genome and proteome that leverages mass-spectrometry (MS)-based proteomics data to improve genome annotations, understand gene expression control through proteoforms and find sequence variants to develop novel insights for disease classification and therapeutic strategies. However, proteogenomic studies often suffer from reduced sensitivity and specificity due to inflated database size. To control the error rates, proteogenomics depends on the target-decoy search strategy, the de-facto method for false discovery rate (FDR) estimation in proteomics. The proteogenomic databases constructed from three- or six-frame nucleotide database translation not only increase the search space and compute-time but also violate the equivalence of target and decoy databases. These searches result in poorer separation between target and decoy scores, leading to stringent FDR thresholds. Understanding these factors and applying modified strategies such as two-pass database search or peptide-class-specific FDR can result in a better interpretation of MS data without introducing additional statistical biases. Based on these considerations, a user can interpret the proteogenomics results appropriately and control false positives and negatives in a more informed manner. In this review, first, we briefly discuss the proteogenomic workflows and limitations in database construction, followed by various considerations that can influence potential novel discoveries in a proteogenomic study. We conclude with suggestions to counter these challenges for better proteogenomic data interpretation.


Subjects
Proteogenomics , Databases, Protein , Nucleotides , Peptides/chemistry , Proteogenomics/methods , Proteome , Proteomics/methods
10.
Stat Med ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38922944

ABSTRACT

Brain functional connectivity can typically be represented as a brain functional network, where nodes represent regions of interest (ROIs) and edges symbolize their connections. Studying group differences in brain functional connectivity can help identify brain regions and recover the brain functional network linked to neurodegenerative diseases. This process, known as differential network analysis, focuses on the differences between estimated precision matrices for two groups. Current methods struggle with individual heterogeneity in measuring brain connectivity, false discovery rate (FDR) control, and accounting for confounding factors, resulting in biased estimates and diminished power. To address these issues, we present a two-stage FDR-controlled feature selection method for differential network analysis using functional magnetic resonance imaging (fMRI) data. First, we create individual brain connectivity measures using a high-dimensional precision matrix estimation technique. Next, we devise a penalized logistic regression model that employs individual brain connectivity data and integrates a new knockoff filter for FDR control when detecting significant differential edges. Through extensive simulations, we showcase the superiority of our approach compared to other methods. Additionally, we apply our technique to fMRI data to identify differential edges between Alzheimer's disease and control groups. Our results are consistent with prior experimental studies, emphasizing the practical applicability of our method.
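
To make the FDR-control step in the abstract above more concrete, the sketch below shows the standard knockoff+ selection threshold. It is not the "new knockoff filter" the authors propose, whose details the abstract does not give. The function name `knockoff_plus_threshold` and the toy statistics are hypothetical; the input is assumed to be one signed statistic per candidate edge, where large positive values favor the real feature over its knockoff copy.

```python
def knockoff_plus_threshold(w, q):
    """Return the knockoff+ threshold for statistics w at target FDR level q.

    The threshold is the smallest t among the nonzero |w| values such that
    (1 + #{w_j <= -t}) / max(1, #{w_j >= t}) <= q.
    Returns infinity when no threshold qualifies (nothing is selected).
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)  # proxy for false selections
        positives = sum(1 for x in w if x >= t)   # features that would be selected
        if (1 + negatives) / max(positives, 1) <= q:
            return t
    return float("inf")


if __name__ == "__main__":
    # Hypothetical statistics for ten candidate edges.
    w = [5, 4, 3, 2.5, -1, 2, 1.5, -0.5, 3.5, 6]
    t = knockoff_plus_threshold(w, q=0.2)
    selected = [j for j, x in enumerate(w) if x >= t]
    print("threshold:", t, "selected indices:", selected)
```

Features whose statistic is at or above the returned threshold are selected; the count of large negative statistics serves as the estimate of how many selections are false.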

11.
Stat Appl Genet Mol Biol ; 22(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-37082815

ABSTRACT

It is often of research interest to identify genes that satisfy a particular expression pattern across different conditions such as tissues, genotypes, etc. One common practice is to perform differential expression analysis for each condition separately and then take the intersection of differentially expressed (DE) genes or non-DE genes under each condition to obtain genes that satisfy a particular pattern. Such a method can lead to many false positives, especially when the desired gene expression pattern involves equivalent expression under one condition. In this paper, we apply a Bayesian partition model to identify genes of all desired patterns while simultaneously controlling their false discovery rates (FDRs). Our simulation studies show that the common practice fails to control group specific FDRs for patterns involving equivalent expression while the proposed Bayesian method simultaneously controls group specific FDRs at all settings studied. In addition, the proposed method is more powerful when the FDR of the common practice is under control for identifying patterns only involving DE genes. Our simulation studies also show that it is an inherently more challenging problem to identify patterns involving equivalent expression than patterns only involving differential expression. Therefore, larger sample sizes are required to obtain the same target power to identify the former types of patterns than the latter types of patterns.


Subjects
Gene Expression Profiling , RNA-Seq , Gene Expression Profiling/methods , Bayes Theorem , Computer Simulation , Exome Sequencing
12.
Brain ; 146(8): 3392-3403, 2023 08 01.
Article in English | MEDLINE | ID: mdl-36757824

ABSTRACT

Psychiatric disorders and common epilepsies are heritable disorders with a high comorbidity and overlapping symptoms. However, the causative mechanisms underlying this relationship are poorly understood. Here we aimed to identify overlapping genetic loci between epilepsy and psychiatric disorders to gain a better understanding of their comorbidity and shared clinical features. We analysed genome-wide association study data for all epilepsies (n = 44 889), genetic generalized epilepsy (n = 33 446), focal epilepsy (n = 39 348), schizophrenia (n = 77 096), bipolar disorder (n = 406 405), depression (n = 500 199), attention deficit hyperactivity disorder (n = 53 293) and autism spectrum disorder (n = 46 350). First, we applied the MiXeR tool to estimate the total number of causal variants influencing the disorders. Next, we used the conjunctional false discovery rate statistical framework to improve power to discover shared genomic loci. Additionally, we assessed the validity of the findings in independent cohorts, and functionally characterized the identified loci. The epilepsy phenotypes were considerably less polygenic (1.0 K to 3.4 K causal variants) than the psychiatric disorders (5.6 K to 13.9 K causal variants), with focal epilepsy being the least polygenic (1.0 K variants), and depression having the highest polygenicity (13.9 K variants). We observed cross-trait genetic enrichment between genetic generalized epilepsy and all psychiatric disorders and between all epilepsies and schizophrenia and depression. Using conjunctional false discovery rate analysis, we identified 40 distinct loci jointly associated with epilepsies and psychiatric disorders at conjunctional false discovery rate <0.05, four of which were associated with all epilepsies and 39 with genetic generalized epilepsy. Most epilepsy risk loci were shared with schizophrenia (n = 31). Among the identified loci, 32 were novel for genetic generalized epilepsy, and two were novel for all epilepsies. 
There was a mixture of concordant and discordant allelic effects in the shared loci. The sign concordance of the identified variants was highly consistent between the discovery and independent datasets for all disorders, supporting the validity of the findings. Gene-set analysis for the shared loci between schizophrenia and genetic generalized epilepsy implicated biological processes related to cell cycle regulation, protein phosphatase activity, and membrane and vesicle function; the gene-set analyses for the other loci were underpowered. The extensive genetic overlap with mixed effect directions between psychiatric disorders and common epilepsies demonstrates a complex genetic relationship between these disorders, in line with their bi-directional relationship, and indicates that overlapping genetic risk may contribute to shared pathophysiological and clinical features between epilepsy and psychiatric disorders.


Subjects
Attention Deficit Disorder with Hyperactivity , Autism Spectrum Disorder , Epilepsies, Partial , Epilepsy, Generalized , Humans , Autism Spectrum Disorder/genetics , Genome-Wide Association Study , Epilepsies, Partial/genetics , Genomics , Epilepsy, Generalized/genetics , Genetic Loci/genetics , Genetic Predisposition to Disease/genetics , Polymorphism, Single Nucleotide/genetics
13.
Mol Cell Proteomics ; 21(12): 100437, 2022 12.
Article in English | MEDLINE | ID: mdl-36328188

ABSTRACT

Estimating false discovery rates (FDRs) of protein identification continues to be an important topic in mass spectrometry-based proteomics, particularly when analyzing very large datasets. One performant method for this purpose is the Picked Protein FDR approach which is based on a target-decoy competition strategy on the protein level that ensures that FDRs scale to large datasets. Here, we present an extension to this method that can also deal with protein groups, that is, proteins that share common peptides such as protein isoforms of the same gene. To obtain well-calibrated FDR estimates that preserve protein identification sensitivity, we introduce two novel ideas. First, the picked group target-decoy and second, the rescued subset grouping strategies. Using entrapment searches and simulated data for validation, we demonstrate that the new Picked Protein Group FDR method produces accurate protein group-level FDR estimates regardless of the size of the data set. The validation analysis also uncovered that applying the commonly used Occam's razor principle leads to anticonservative FDR estimates for large datasets. This is not the case for the Picked Protein Group FDR method. Reanalysis of deep proteomes of 29 human tissues showed that the new method identified up to 4% more protein groups than MaxQuant. Applying the method to the reanalysis of the entire human section of ProteomicsDB led to the identification of 18,000 protein groups at 1% protein group-level FDR. The analysis also showed that about 1250 genes were represented by ≥2 identified protein groups. To make the method accessible to the proteomics community, we provide a software tool including a graphical user interface that enables merging results from multiple MaxQuant searches into a single list of identified and quantified protein groups.


Subjects
Peptides , Tandem Mass Spectrometry , Humans , Tandem Mass Spectrometry/methods , Databases, Protein , Software , Proteome , Algorithms
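
The protein-level target-decoy competition that the abstract above builds on can be illustrated with a small sketch. It covers only the basic "picked" step (each target protein competes with its own decoy counterpart), not the protein-group extension or the rescued subset grouping introduced in the paper. The function name `picked_protein_fdr`, the dictionary layout, and the toy scores are hypothetical.

```python
def picked_protein_fdr(target_scores, decoy_scores, threshold):
    """Estimate protein-level FDR at a score threshold with the picked approach.

    target_scores -- dict: protein id -> best score of the target protein
    decoy_scores  -- dict: same protein id -> best score of its decoy counterpart
    (assumed: higher score is better; a missing entry counts as no evidence)
    """
    targets = decoys = 0
    for pid in set(target_scores) | set(decoy_scores):
        t = target_scores.get(pid, float("-inf"))
        d = decoy_scores.get(pid, float("-inf"))
        # Picked step: only the better-scoring member of each pair survives.
        best = max(t, d)
        if best >= threshold:
            if d > t:
                decoys += 1
            else:
                targets += 1
    # Surviving decoys estimate the number of false targets above the threshold.
    return min(1.0, decoys / max(targets, 1))


if __name__ == "__main__":
    # Hypothetical scores for four target/decoy protein pairs.
    t_scores = {"P1": 9.0, "P2": 4.0, "P3": 7.0, "P4": 8.0}
    d_scores = {"P1": 3.0, "P2": 6.0, "P3": 2.0, "P4": 1.0}
    print(picked_protein_fdr(t_scores, d_scores, threshold=5.0))
```

Because each pair contributes at most one entry, the number of decoys does not grow with the size of the database in the way it does when targets and decoys are ranked independently, which is the scaling property the abstract refers to.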
14.
Article in English | MEDLINE | ID: mdl-38098875

ABSTRACT

With the development of data collection techniques, analysis with a survival response and high-dimensional covariates has become routine. Here we consider an interaction model, which includes a set of low-dimensional covariates, a set of high-dimensional covariates, and their interactions. This model has been motivated by gene-environment (G-E) interaction analysis, where the E variables have a low dimension, and the G variables have a high dimension. For such a model, there has been extensive research on estimation and variable selection. Comparatively, inference studies with a valid false discovery rate (FDR) control have been very limited. The existing high-dimensional inference tools cannot be directly applied to interaction models, as interactions and main effects are not "equal". In this article, for high-dimensional survival analysis with interactions, we model survival using the Accelerated Failure Time (AFT) model and adopt a "weighted least squares + debiased Lasso" approach for estimation and selection. A hierarchical FDR control approach is developed for inference and respect of the "main effects, interactions" hierarchy. The asymptotic distribution properties of the debiased Lasso estimators are rigorously established. Simulation demonstrates the satisfactory performance of the proposed approach, and the analysis of a breast cancer dataset further establishes its practical utility.

15.
Biom J ; 66(1): e2300177, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38102999

ABSTRACT

Online testing procedures assume that hypotheses are observed in sequence, and allow the significance thresholds for upcoming tests to depend on the test statistics observed so far. Some of the most popular online methods include alpha investing, LORD++, and SAFFRON. These three methods have been shown to provide online control of the "modified" false discovery rate (mFDR) under a condition known as conditional super-uniformity (CS). However, to our knowledge, LORD++ and SAFFRON have only been shown to control the traditional false discovery rate (FDR) under an independence condition on the test statistics. Our work bolsters these results by showing that SAFFRON and LORD++ additionally ensure online control of the FDR under a "local" form of nonnegative dependence. Further, FDR control is maintained under certain types of adaptive stopping rules, such as stopping after a certain number of rejections have been observed. Because alpha investing can be recovered as a special case of the SAFFRON framework, our results immediately apply to alpha investing as well. In the process of deriving these results, we also formally characterize how the conditional super-uniformity assumption implicitly limits the allowed p-value dependencies. This implicit limitation is important not only to our proposed FDR result, but also to many existing mFDR results.


Subjects
Crocus , Research Design , False Positive Reactions
16.
Proteomics ; 23(18): e2200406, 2023 09.
Article in English | MEDLINE | ID: mdl-37357151

ABSTRACT

In discovery proteomics, as well as many other "omic" approaches, the possibility to test for the differential abundance of hundreds (or of thousands) of features simultaneously is appealing, despite requiring specific statistical safeguards, among which controlling for the false discovery rate (FDR) has become standard. Moreover, when more than two biological conditions or group treatments are considered, it has become customary to rely on the one-way analysis of variance (ANOVA) framework, where a first global differential abundance landscape provided by an omnibus test can be subsequently refined using various post-hoc tests (PHTs). However, the interactions between the FDR control procedures and the PHTs are complex, because both correspond to different types of multiple test corrections (MTCs). This article surveys various ways to orchestrate them in a data processing workflow and discusses their pros and cons.


Subjects
Proteomics , Proteomics/methods , Analysis of Variance
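
As a concrete instance of the FDR-control step that the abstract above calls standard, the sketch below computes Benjamini-Hochberg adjusted p-values. The abstract does not name a specific procedure, so the choice of Benjamini-Hochberg is an assumption, as are the function name `benjamini_hochberg` and the idea of feeding it one omnibus ANOVA p-value per feature.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, aligned with the input order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running = min(running, pvalues[i] * m / rank)
        adjusted[i] = running
    return adjusted


if __name__ == "__main__":
    # Hypothetical omnibus p-values for four features.
    omnibus_p = [0.01, 0.04, 0.03, 0.20]
    for p, adj in zip(omnibus_p, benjamini_hochberg(omnibus_p)):
        print(p, round(adj, 4))
```

Features whose adjusted p-value falls at or below the chosen level would pass the omnibus stage; how post-hoc tests are then corrected is exactly the design question the article surveys, and this sketch does not settle it.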
17.
J Proteome Res ; 22(2): 420-431, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36696582

ABSTRACT

Neuropeptides are a class of endogenous peptides that have key regulatory roles in biochemical, physiological, and behavioral processes. Mass spectrometry analyses of neuropeptides often rely on protein informatics tools for database searching and peptide identification. As neuropeptide databases are typically experimentally built and comprised of short sequences with high sequence similarity to each other, we developed a novel database searching tool, HyPep, which utilizes sequence homology searching for peptide identification. HyPep aligns de novo sequenced peptides, generated through PEAKS software, with neuropeptide database sequences and identifies neuropeptides based on the alignment score. HyPep performance was optimized using LC-MS/MS measurements of peptide extracts from various Callinectes sapidus neuronal tissue types and compared with a commercial database searching software, PEAKS DB. HyPep identified more neuropeptides from each tissue type than PEAKS DB at 1% false discovery rate, and the false match rate from both programs was 2%. In addition to identification, this report describes how HyPep can aid in the discovery of novel neuropeptides.


Subjects
Neuropeptides , Tandem Mass Spectrometry , Amino Acid Sequence , Chromatography, Liquid , Neuropeptides/genetics , Neuropeptides/metabolism , Peptides/analysis , Software , Sequence Homology , Databases, Protein
18.
J Proteome Res ; 22(6): 1828-1842, 2023 06 02.
Article in English | MEDLINE | ID: mdl-37099386

ABSTRACT

Phosphorylation is a post-translational modification of great interest to researchers due to its relevance in many biological processes. LC-MS/MS techniques have enabled high-throughput data acquisition, with studies claiming identification and localization of thousands of phosphosites. The identification and localization of phosphosites emerge from different analytical pipelines and scoring algorithms, with uncertainty embedded throughout the pipeline. For many pipelines and algorithms, arbitrary thresholding is used, but little is known about the actual global false localization rate in these studies. Recently, it has been suggested to use decoy amino acids to estimate global false localization rates of phosphosites, among the peptide-spectrum matches reported. Here, we describe a simple pipeline aiming to maximize the information extracted from these studies by objectively collapsing from peptide-spectrum match to the peptidoform-site level, as well as combining findings from multiple studies while maintaining track of false localization rates. We show that the approach is more effective than current processes that use a simpler mechanism for handling phosphosite identification redundancy within and across studies. In our case study using eight rice phosphoproteomics data sets, 6368 unique sites were confidently identified using our decoy approach compared to 4687 using traditional thresholding in which false localization rates are unknown.


Subjects
Proteomics , Rivers , Chromatography, Liquid , Proteomics/methods , Tandem Mass Spectrometry , Protein Processing, Post-Translational , Peptides/chemistry , Algorithms , Databases, Protein
19.
BMC Plant Biol ; 23(1): 552, 2023 Nov 08.
Article in English | MEDLINE | ID: mdl-37940862

ABSTRACT

In this study, we investigated the intricate interplay between Trichoderma and the tomato genome, focusing on the transcriptional and metabolic changes triggered during the late colonization event. A microarray probe set (GSE76332) was used to analyze changes in the gene expression profiles of the un-inoculated control (tomato) and Trichoderma-tomato interactions in order to identify significantly differentially expressed genes. Based on principal component analysis and R-based correlation, we observed a positive correlation between the two cross-comparable groups, corroborating the existence of transcriptional responses in the host triggered by Trichoderma priming. The statistically significant genes based on different p-value cut-off scores [(padj-values or q-value); padj-value < 0.05], [(pcal-values); pcal-value < 0.05; pcal < 0.01; pcal < 0.001] were cross-compared. Through cross-comparison, we identified 156 common genes that were consistently significant across all probability thresholds and showed a strong positive correlation between p-value and q-value in the selected probe sets. We reported TD2, CPT1, pectin synthase, EXT-3 (extensin-3), Lox C, and pyruvate kinase (PK), which exhibited upregulated expression, and Glb1 and nitrate reductase (nii), which demonstrated downregulated expression during Trichoderma-tomato interaction. In addition, microbial priming with Trichoderma resulted in differential expression of transcription factors related to systemic defense and flowering including MYB13, MYB78, ERF2, ERF3, ERF5, ERF-1B, NAC, MADS box, ZF3, ZAT10, A20/AN1, polyol sugar transporter like zinc finger proteins, and a novel plant defensin protein. The potential bottleneck and hub genes involved in this dynamic response were also identified. 
From the protein-protein interaction (PPI) network analysis based on the 25 topmost DEGs (pcal-value < 0.05) and the Weighted Correlation Gene Network Analysis (WGCNA) of the 1786 significant DEGs (pcal-value < 0.05), we reported hits associated with carbohydrate metabolism, secondary metabolite biosynthesis, and nitrogen metabolism. We conclude that the Trichoderma-induced microbial priming re-programmed the host genome for transcriptional response during the late colonization event, a response characterized by metabolic shifting and biochemical changes specific to plant growth and development. The work also highlights the relevance of statistical parameters in understanding the gene regulatory dynamics and complex regulatory networks based on differential expression, co-expression, and protein interaction networks orchestrating the host responses to beneficial microbial interactions.


Subjects
Hypocreales , Solanum lycopersicum , Transcriptome , Solanum lycopersicum/genetics , Gene Expression Profiling , Plant Proteins/genetics
20.
Article in English | MEDLINE | ID: mdl-38092030

ABSTRACT

OBJECTIVES: To assess the relationship between self-reported and serologic evidence of prior chlamydial infection, rheumatoid arthritis (RA)-related autoantibodies and risk of RA development. METHODS: This is a nested study within a prospective Swiss-based cohort including all first-degree relatives of RA patients (RA-FDR) who answered a question on past chlamydial infections. Primary outcome was systemic autoimmunity associated with RA (RA-autoimmunity) defined as positivity for anti-citrullinated peptide antibodies (ACPA) and/or rheumatoid factor (RF). Secondary outcomes were high levels of RA-autoimmunity, RA-associated symptoms and RA-autoimmunity, and subsequent seropositive RA diagnosis. We conducted a nested case-control analysis by measuring the serological status against Chlamydia trachomatis' major outer membrane protein. We replicated our analysis in an independent United States-based RA-FDR cohort. RESULTS: Among 1231 RA-FDRs, 168 (13.6%) developed RA-autoimmunity. Prevalence of self-reported chlamydial infection was significantly higher in individuals with RA-autoimmunity compared with controls (17.9% vs 9.8%, OR = 2.00, 95% CI: 1.27-3.09, p < 0.01). This association remained significant after adjustments (OR = 1.91, 95% CI: 1.20-2.95). Stronger effect sizes were observed in later stages of RA development. There was a similar trend between a positive C. trachomatis serology and high levels of RA-autoimmunity (OR = 3.05, 95% CI: 1.10-8.46, p = 0.032). In the replication cohort, there were significant associations between chlamydial infection and RF positivity and incident RA, but not anti-CCP positivity. CONCLUSIONS: Self-reported chlamydial infections are associated with elevated RA-autoimmunity in at-risk individuals. The differing association of chlamydial infections and ACPA/RF between cohorts will need to be explored in future studies but is consistent with a role of mucosal origin of RA-related autoimmunity.
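
The crude odds ratio reported in the RESULTS above can be checked from the two prevalences alone. The sketch below is only an arithmetic illustration; the function name `odds_ratio` is hypothetical, and the inputs are the rounded percentages from the abstract rather than the underlying counts, which the abstract does not give.

```python
def odds_ratio(p_exposed_cases, p_exposed_controls):
    """Odds ratio from the exposure prevalence in cases and in controls."""
    odds_cases = p_exposed_cases / (1 - p_exposed_cases)
    odds_controls = p_exposed_controls / (1 - p_exposed_controls)
    return odds_cases / odds_controls


if __name__ == "__main__":
    # 17.9% self-reported infection with RA-autoimmunity vs 9.8% in controls.
    crude_or = odds_ratio(0.179, 0.098)
    print(round(crude_or, 2))
```

The result is very close to the reported OR of 2.00; a small discrepancy is to be expected because the percentages are rounded.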
