1.
Mol Cell Proteomics ; 23(5): 100768, 2024 May.
Article in English | MEDLINE | ID: mdl-38621647

ABSTRACT

Mass spectrometry (MS)-based single-cell proteomics (SCP) offers the opportunity to explore biological variability within cells in an unbiased way, without being limited by antibody availability. The field is developing rapidly, with the main focus on instrument advancement, sample preparation refinement, and signal boosting methods; however, optimal data processing and analysis are rarely investigated and remain an arduous challenge because of the high proportion of missing values and batch effects. Here, we introduced a quantification quality control to strengthen the identification of differentially expressed proteins (DEPs) by considering both within and across SCP data. Combining quantification quality control with isobaric matching between runs (IMBR) and PSM-level normalization, an additional 12% and 19% of proteins and peptides, respectively, were quantified, with more than 90% of proteins/peptides containing valid values. Quantification quality control reduced quantification variations and q-values and yielded more apparent cell type separations. In addition, we found that PSM-level normalization performed similarly to other protein-level normalizations but kept the original data profiles without requiring additional data manipulation. As a proof of concept of our refined pipeline, six uniquely identified DEPs exhibiting varied fold-changes and playing critical roles in melanoma and monocyte functionalities were selected for validation using immunoblotting. Five of the six validated DEPs showed the same trend as in the SCP dataset, emphasizing the feasibility of combining IMBR, cell quality control, and PSM-level normalization in SCP analysis, which is beneficial for future SCP studies.


Subjects
Proteomics; Quality Control; Single-Cell Analysis; Single-Cell Analysis/methods; Proteomics/methods; Humans; Mass Spectrometry/methods; Data Analysis; Proteome/metabolism
2.
BMC Biol ; 22(1): 110, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38735918

ABSTRACT

BACKGROUND: Plants differ more than threefold in seed oil contents (SOCs). Soybean (Glycine max), cotton (Gossypium hirsutum), rapeseed (Brassica napus), and sesame (Sesamum indicum) are four important oil crops with markedly different SOCs and fatty acid compositions. RESULTS: Compared to grain crops like maize and rice, expanded acyl-lipid metabolism genes and relatively higher expression levels of genes involved in seed oil synthesis (SOS) in the oil crops contributed to the oil accumulation in seeds. Here, we conducted comparative transcriptomics on oil crops with two different SOC materials. In common, DIHYDROLIPOAMIDE DEHYDROGENASE, STEAROYL-ACYL CARRIER PROTEIN DESATURASE, PHOSPHOLIPID:DIACYLGLYCEROL ACYLTRANSFERASE, and oil-body protein genes were both differentially expressed between the high- and low-oil materials of each crop. By comparing functional components of SOS networks, we found that the strong correlations between genes in "glycolysis/gluconeogenesis" and "fatty acid synthesis" were conserved in both grain and oil crops, with PYRUVATE KINASE being the common factor affecting starch and lipid accumulation. Network alignment also found a conserved clique among oil crops affecting seed oil accumulation, which has been validated in Arabidopsis. Differently, secondary and protein metabolism affected oil synthesis to different degrees in different crops, and high SOC was due to less competition of the same precursors. The comparison of Arabidopsis mutants and wild type showed that CINNAMYL ALCOHOL DEHYDROGENASE 9, the conserved regulator we identified, was a factor resulting in different relative contents of lignins to oil in seeds. The interconnection of lipids and proteins was common but in different ways among crops, which partly led to differential oil production. 
CONCLUSIONS: This study goes beyond the observations made in studies of individual species to provide new insights into which genes and networks may be fundamental to seed oil accumulation from a multispecies perspective.


Subjects
Crops, Agricultural; Gene Expression Profiling; Gene Regulatory Networks; Plant Oils; Crops, Agricultural/genetics; Crops, Agricultural/metabolism; Plant Oils/metabolism; Gene Expression Profiling/methods; Transcriptome; Seeds/genetics; Seeds/metabolism; Gene Expression Regulation, Plant
3.
BMC Bioinformatics ; 25(1): 259, 2024 Aug 07.
Article in English | MEDLINE | ID: mdl-39112940

ABSTRACT

BACKGROUND: Effective identification of differentially expressed genes (DEGs) has been challenging for single-cell RNA sequencing (scRNA-seq) profiles. Many existing algorithms have high false positive rates (FPRs) and often fail to identify weak biological signals. RESULTS: We present a novel method for identifying DEGs in scRNA-seq data called RankCompV3. It is based on the comparison of relative expression orderings (REOs) of gene pairs which are determined by comparing the expression levels of a pair of genes in a set of single-cell profiles. The numbers of genes with consistently higher or lower expression levels than the gene of interest are counted in two groups in comparison, respectively, and the result is tabulated in a 3 × 3 contingency table which is tested by McCullagh's method to determine if the gene is dysregulated. In both simulated and real scRNA-seq data, RankCompV3 tightly controlled the FPR and demonstrated high accuracy, outperforming 11 other common single-cell DEG detection algorithms. Analysis with either regular single-cell or synthetic pseudo-bulk profiles produced highly concordant DEGs with the ground-truth. In addition, RankCompV3 demonstrates higher sensitivity to weak biological signals than other methods. The algorithm was implemented using Julia and can be called in R. The source code is available at https://github.com/pathint/RankCompV3.jl . CONCLUSIONS: The REOs-based algorithm is a valuable tool for analyzing single-cell RNA profiles and identifying DEGs with high accuracy and sensitivity.


Subjects
Algorithms; Gene Expression Profiling; Sequence Analysis, RNA; Single-Cell Analysis; Single-Cell Analysis/methods; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Transcriptome/genetics; Humans; Software
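
The relative expression ordering (REO) counting described in the RankCompV3 abstract above can be illustrated with a small sketch. This is not the RankCompV3.jl implementation: the function names, the `min_fraction` stability threshold, and the dictionary-based data layout are all assumptions made for illustration, and McCullagh's test on the resulting table is deliberately left out.

```python
from typing import Dict, List

# The three REO states of a partner gene relative to the gene of interest.
STATES = ("lower", "unstable", "higher")


def reo_state(target: List[float], partner: List[float],
              min_fraction: float = 0.9) -> str:
    """Classify the target gene against one partner gene within one group.

    Assumption: an ordering counts as consistent when it holds in at least
    `min_fraction` of the cells. The abstract does not give the criterion.
    """
    n = len(target)
    higher = sum(t > p for t, p in zip(target, partner))
    lower = sum(t < p for t, p in zip(target, partner))
    if higher / n >= min_fraction:
        return "higher"
    if lower / n >= min_fraction:
        return "lower"
    return "unstable"


def reo_table(gene: str,
              group1: Dict[str, List[float]],
              group2: Dict[str, List[float]],
              min_fraction: float = 0.9) -> Dict[str, Dict[str, int]]:
    """Build the 3 x 3 table: rows are REO states in group 1, columns in group 2.

    Each partner gene present in both groups contributes one count.
    """
    table = {row: {col: 0 for col in STATES} for row in STATES}
    for partner in group1:
        if partner == gene or partner not in group2:
            continue
        s1 = reo_state(group1[gene], group1[partner], min_fraction)
        s2 = reo_state(group2[gene], group2[partner], min_fraction)
        table[s1][s2] += 1
    return table
```

Each group is passed as a mapping from gene name to per-cell expression values. Counts concentrated off the diagonal of the returned table indicate that the gene's orderings changed between the two groups, which is the pattern the test described in the abstract is meant to detect.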
4.
Genet Epidemiol ; 47(5): 379-393, 2023 07.
Article in English | MEDLINE | ID: mdl-37042632

ABSTRACT

Variation in RNA-Seq data creates modeling challenges for differential gene expression (DE) analysis. Statistical approaches address conventional small sample sizes and implement empirical Bayes or non-parametric tests, but frequently produce different conclusions. Increasing sample sizes enable proposal of alternative DE paradigms. Here we develop RoPE, which uses a data-driven adjustment for variation and a robust profile likelihood ratio DE test. Simulation studies show RoPE can have improved performance over existing tools as sample size increases and has the most reliable control of error rates. Application of RoPE demonstrates that an active Pseudomonas aeruginosa infection downregulates the SLC9A3 Cystic Fibrosis modifier gene.


Subjects
Gene Expression Profiling; Models, Genetic; Humans; Likelihood Functions; Gene Expression Profiling/methods; Bayes Theorem; Computer Simulation
5.
BMC Genomics ; 25(1): 548, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38824502

ABSTRACT

Gibel carp (Carassius gibelio) is a cyprinid fish that originated in eastern Eurasia and is considered as invasive in European freshwater ecosystems. The populations of gibel carp in Europe are mostly composed of asexually reproducing triploid females (i.e., reproducing by gynogenesis) and sexually reproducing diploid females and males. Although some cases of coexisting sexual and asexual reproductive forms are known in vertebrates, the molecular mechanisms maintaining such coexistence are still in question. Both reproduction modes are supposed to exhibit evolutionary and ecological advantages and disadvantages. To better understand the coexistence of these two reproduction strategies, we performed transcriptome profile analysis of gonad tissues (ovaries) and studied the differentially expressed reproduction-associated genes in sexual and asexual females. We used high-throughput RNA sequencing to generate transcriptomic profiles of gonadal tissues of triploid asexual females and males, diploid sexual males and females of gibel carp, as well as diploid individuals from two closely-related species, C. auratus and Cyprinus carpio. Using SNP clustering, we showed the close similarity of C. gibelio and C. auratus with a basal position of C. carpio to both Carassius species. Using transcriptome profile analyses, we showed that many genes and pathways are involved in both gynogenetic and sexual reproduction in C. gibelio; however, we also found that 1500 genes, including 100 genes involved in cell cycle control, meiosis, oogenesis, embryogenesis, fertilization, steroid hormone signaling, and biosynthesis were differently expressed in the ovaries of asexual and sexual females. We suggest that the overall downregulation of reproduction-associated pathways in asexual females, and their maintenance in sexual ones, allows the populations of C. gibelio to combine the evolutionary and ecological advantages of the two reproductive strategies. 
However, we showed that many sexual-reproduction-related genes are maintained and expressed in asexual females, suggesting that gynogenetic gibel carp retains the genetic toolkits for meiosis and sexual reproduction. These findings shed new light on the evolution of this asexual and sexual complex.


Subjects
Carps; Reproduction, Asexual; Reproduction; Animals; Female; Reproduction, Asexual/genetics; Reproduction/genetics; Carps/genetics; Carps/physiology; Male; Transcriptome; Gene Expression Profiling; Ovary/metabolism; Polymorphism, Single Nucleotide
6.
BMC Genomics ; 25(1): 221, 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38418960

ABSTRACT

BACKGROUND: Wheat streak mosaic virus (WSMV) and Triticum mosaic virus (TriMV) are components of the wheat streak mosaic virus disease complex in the Great Plains region of the U.S.A. and elsewhere. Co-infection of wheat with WSMV and TriMV causes synergistic interaction with more severe disease symptoms compared to single infections. Plants are equipped with multiple antiviral mechanisms, of which regulation of microRNAs (miRNAs) is a potentially effective constituent. In this investigation, we have analyzed the total and relative expression of miRNA transcriptome in two wheat cultivars, Arapahoe (susceptible) and Mace (temperature-sensitive-resistant), that were mock-inoculated or inoculated with WSMV, TriMV, or both at 18 °C and 27 °C. RESULTS: Our results showed that the most abundant miRNA family among all the treatments was miRNA166, followed by 159a and 168a, although the order of the latter two changed depending on the infections. When comparing infected and control groups, twenty miRNAs showed significant upregulation, while eight miRNAs were significantly downregulated. Among them, miRNAs 9670-3p, 397-5p, and 5384-3p exhibited the most significant upregulation, whereas miRNAs 319, 9773, and 9774 were the most downregulated. The comparison of infection versus the control group for the cultivar Mace showed temperature-dependent regulation of these miRNAs. The principal component analysis confirmed that less abundant miRNAs among differentially expressed miRNAs were strongly correlated with the inoculated symptomatic wheat cultivars. Notably, miRNAs 397-5p, 398, and 9670-3p were upregulated in response to WSMV and TriMV infections, an observation not yet reported in this context. The significant upregulation of these three miRNAs was further confirmed with RT-qPCR analysis; in general, the RT-qPCR results were in agreement with our computational analysis. 
Target prediction analysis showed that the miRNAs standing out in our analysis targeted genes involved in defense response and regulation of transcription. CONCLUSION: Investigation into the roles of these miRNAs and their corresponding targets holds promise for advancing our understanding of the mechanisms of virus infection and possible manipulation of these factors for developing durable virus resistance in crop plants.


Subjects
MicroRNAs; Potyviridae; MicroRNAs/genetics; Plant Diseases/genetics; Potyviridae/genetics
7.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34850807

ABSTRACT

Cytometry techniques are widely used to discover cellular characteristics at single-cell resolution. Many data analysis methods for cytometry data focus solely on identifying subpopulations via clustering and testing for differential cell abundance. For differential expression analysis of markers between conditions, only few tools exist. These tools either reduce the data distribution to medians, discarding valuable information, or have underlying assumptions that may not hold for all expression patterns. Here, we systematically evaluated existing and novel approaches for differential expression analysis on real and simulated CyTOF data. We found that methods using median marker expressions compute fast and reliable results when the data are not strongly zero-inflated. Methods using all data detect changes in strongly zero-inflated markers, but partially suffer from overprediction or cannot handle big datasets. We present a new method, CyEMD, based on calculating the earth mover's distance between expression distributions that can handle strong zero-inflation without being too sensitive. Additionally, we developed CYANUS - CYtometry ANalysis Using Shiny - a user-friendly R Shiny App allowing the user to analyze cytometry data with state-of-the-art tools, including well-performing methods from our comparison. A public web interface is available at https://exbio.wzw.tum.de/cyanus/.


Subjects
Cluster Analysis; Biomarkers
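
The earth mover's distance that CyEMD is built on (see the abstract above) can be sketched for one-dimensional marker expression values. This is a minimal illustration, not the CyEMD or CYANUS code: the function name is hypothetical, and any binning, per-sample aggregation, or significance testing the tool performs is not described in the abstract and is omitted here.

```python
from bisect import bisect_right
from typing import Sequence


def earth_movers_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1-D earth mover's distance between two empirical distributions.

    Computed as the area between the two empirical CDFs, which equals the
    minimal cost of moving probability mass from one sample to the other.
    """
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(a), sorted(b)
    points = sorted(set(xs) | set(ys))
    total = 0.0
    # Between consecutive observed values both CDFs are constant, so the
    # area is a sum of rectangles.
    for left, right in zip(points, points[1:]):
        cdf_a = bisect_right(xs, left) / len(xs)
        cdf_b = bisect_right(ys, left) / len(ys)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total
```

Because the distance compares whole distributions, a marker that is zero in every cell of one condition and zero in only half the cells of the other still yields a non-zero value, whereas a median-based comparison of the same data would see two medians that may coincide.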
8.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34571530

ABSTRACT

The identification of differentially expressed genes between different cell groups is a crucial step in analyzing single-cell RNA-sequencing (scRNA-seq) data. Although various differential expression analysis methods for scRNA-seq data have recently been proposed, based on different model assumptions and strategies, the differentially expressed genes they identify differ considerably from one another, and their performance depends on the underlying data structures. In this paper, we propose a new ensemble learning-based differential expression analysis method, scDEA, to produce more stable and accurate results. scDEA integrates the P-values obtained from 12 individual differential expression analysis methods for each gene using a P-value combination method. Comprehensive experiments show that scDEA outperforms the state-of-the-art individual methods across different experimental settings and evaluation metrics. We expect that scDEA will serve a wide range of users, including biologists, bioinformaticians and data scientists, who need to detect differentially expressed genes in scRNA-seq data.


Subjects
RNA; Single-Cell Analysis; Gene Expression Profiling/methods; Machine Learning; RNA/genetics; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Exome Sequencing
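
The idea of combining per-gene P-values from several methods, as in the scDEA abstract above, can be sketched as follows. The abstract does not say which combination method scDEA uses, so Fisher's method is used here purely as a stand-in, and the function name is hypothetical. Fisher's method also assumes independent P-values, which is unlikely to hold when several tests are run on the same data, so this should be read as an illustration of the general idea only.

```python
import math
from typing import Sequence


def fisher_combine(p_values: Sequence[float]) -> float:
    """Combine P-values for one gene with Fisher's method (a stand-in).

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, for k independent tests.
    For even degrees of freedom the survival function has a closed form:
        P(chi2_{2k} > X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    """
    if not p_values:
        raise ValueError("need at least one P-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half_stat / i
        series += term
    return min(1.0, math.exp(-half_stat) * series)
```

For a single input the function returns that P-value unchanged, and several moderately small P-values combine into a smaller one, which is how an ensemble can gain stability over any single method.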
9.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36274239

ABSTRACT

Gene-based transcriptome analysis, such as differential expression analysis, can identify the key factors driving disease development, cell differentiation and other biological processes. However, this is not enough, because basic life activities are mainly driven by the interactions between genes. Although many differential network inference methods already exist for identifying differential gene interactions, most studies still use only the node information in the network for downstream analyses. To gain insight into differential gene interactions, interaction-based transcriptome analysis (IBTA) should be performed instead of gene-based analysis once the differential networks have been obtained. In this paper, we illustrated a workflow of IBTA by developing a Co-hub Differential Network inference (CDN) algorithm, and a novel interaction-based metric, pivot APC2. We confirmed the superior performance of CDN through simulation experiments compared with other popular differential network inference algorithms. Furthermore, three case studies are given using colorectal cancer, COVID-19 and triple-negative breast cancer datasets to demonstrate the ability of our interaction-based analytical process to uncover causative mechanisms.


Subjects
COVID-19; Gene Regulatory Networks; Humans; Gene Expression Profiling/methods; Transcriptome; Algorithms
10.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34664075

ABSTRACT

Transposable elements (TEs) have been associated with many, frequently detrimental, biological roles. Consequently, the regulations of TEs, e.g. via DNA-methylation and histone modifications, are considered critical for maintaining genomic integrity and other functions. Still, the high-throughput study of TEs is usually limited to the family or consensus-sequence level because of alignment problems prompted by high-sequence similarities and short read lengths. To entirely comprehend the effects and reasons of TE expression, however, it is necessary to assess the TE expression at the level of individual instances. Our simulation study demonstrates that sequence similarities and short read lengths do not rule out the accurate assessment of (differential) expression of TEs at the instance-level. With only slight modifications to existing methods, TE expression analysis works surprisingly well for conventional paired-end sequencing data. We find that SalmonTE and Telescope can accurately tally a considerable amount of TE instances, allowing for differential expression recovery in model and non-model organisms.


Subjects
DNA Transposable Elements; Genomics; DNA Methylation; Sequence Analysis, DNA
11.
Mol Cell Proteomics ; 21(9): 100269, 2022 09.
Article in English | MEDLINE | ID: mdl-35853575

ABSTRACT

Several algorithms for the normalization of proteomic data are currently available, each based on a priori assumptions. Among these is the extent to which differential expression (DE) can be present in the dataset. This factor is usually unknown in explorative biomarker screens. Simultaneously, the increasing depth of proteomic analyses often requires the selection of subsets with a high probability of being DE to obtain meaningful results in downstream bioinformatical analyses. Based on the relationship of technical variation and (true) biological DE of an unknown share of proteins, we propose the "Normics" algorithm: Proteins are ranked based on their expression level-corrected variance and the mean correlation with all other proteins. The latter serves as a novel indicator of the non-DE likelihood of a protein in a given dataset. Subsequent normalization is based on a subset of non-DE proteins only. No a priori information such as batch, clinical, or replicate group is necessary. Simulation data demonstrated robust and superior performance across a wide range of stochastically chosen parameters. Five publicly available spike-in and biologically variant datasets were reliably and quantitively accurately normalized by Normics with improved performance compared to standard variance stabilization as well as median, quantile, and LOESS normalizations. In complex biological datasets Normics correctly determined proteins as being DE that had been cross-validated by an independent transcriptome analysis of the same samples. In both complex datasets Normics identified the most DE proteins. We demonstrate that combining variance analysis and data-inherent correlation structure to identify non-DE proteins improves data normalization. Standard normalization algorithms can be consolidated against high shares of (one-sided) biological regulation. The statistical power of downstream analyses can be increased by focusing on Normics-selected subsets of high DE likelihood.


Subjects
Gene Expression Profiling; Proteomics; Algorithms; Analysis of Variance; Computer Simulation; Gene Expression Profiling/methods; Proteins; Proteomics/methods
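
The central idea of the Normics abstract above, normalizing on a subset of proteins judged unlikely to be differentially expressed, can be sketched roughly as follows. This is not the Normics algorithm itself. Several things are assumptions: proteins are ranked by mean correlation alone (the expression-level-corrected variance term is left out because the abstract does not specify its form), a higher mean correlation is taken to indicate a non-DE protein, the input is log-scale intensities, and the per-sample shift is the median of the reference subset. All function names are hypothetical.

```python
from statistics import mean, median
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def mean_correlations(data: Dict[str, List[float]]) -> Dict[str, float]:
    """Mean correlation of each protein with all other proteins (needs >= 2)."""
    names = list(data)
    return {n: mean(pearson(data[n], data[m]) for m in names if m != n)
            for n in names}


def normalize_on_reference_subset(
        data: Dict[str, List[float]],
        subset_size: int) -> Tuple[Dict[str, List[float]], List[str]]:
    """Shift each sample so the reference proteins share a common median."""
    scores = mean_correlations(data)
    # Assumption: the most mutually correlated proteins mostly reflect
    # shared technical variation and serve as the non-DE reference set.
    reference = sorted(scores, key=scores.get, reverse=True)[:subset_size]
    n_samples = len(next(iter(data.values())))
    shifts = [median(data[p][j] for p in reference) for j in range(n_samples)]
    centre = mean(shifts)
    normalized = {p: [v - s + centre for v, s in zip(values, shifts)]
                  for p, values in data.items()}
    return normalized, reference
```

Because the shift is estimated only from the reference subset, a protein with a strong, genuine change between samples does not distort the normalization factor, which is the motivation the abstract gives for restricting normalization to non-DE proteins.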
12.
Biochem Genet ; 2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38871957

ABSTRACT

Idiopathic pulmonary fibrosis (IPF) is a chronic, progressive form of pulmonary fibrosis of unknown etiology. Despite ongoing research, there is currently no cure for this disease. Recent studies have highlighted the significance of competitive endogenous RNA (ceRNA) regulatory networks in IPF development. Therefore, this study investigated the ceRNA network associated with IPF pathogenesis. We obtained gene expression datasets (GSE32538, GSE32537, GSE47460, and GSE24206) from the Gene Expression Omnibus (GEO) database and analyzed them using bioinformatics tools to identify differentially expressed messenger RNAs (DEmRNAs), microRNAs (DEmiRNAs), and long non-coding RNAs (DElncRNAs). For DEmRNAs, we conducted an enrichment analysis, constructed protein-protein interaction networks, and identified hub genes. Additionally, we predicted the target genes of differentially expressed mRNAs and their interacting long non-coding RNAs using various databases. Subsequently, we screened RNA molecules with ceRNA regulatory relations in the lncACTdb database based on the screening results. Furthermore, we performed disease and functional enrichment analyses and pathway prediction for miRNAs in the ceRNA network. We also validated the expression levels of candidate DEmRNAs through quantitative real-time reverse transcriptase polymerase chain reaction and analyzed the correlation between the expression of these candidate DEmRNAs and the percent predicted pre-bronchodilator forced vital capacity [%predicted FVC (pre-bd)]. We found that three ceRNA regulatory axes, specifically KCNQ1OT1/XIST/NEAT1-miR-20a-5p-ITGB8, XIST-miR-146b-5p/miR-31-5p-MMP16, and NEAT1-miR-31-5p-MMP16, have the potential to significantly affect IPF progression. Further examination of the underlying regulatory mechanisms within this network enhances our understanding of IPF pathogenesis and may aid in the identification of diagnostic biomarkers and therapeutic targets.

13.
Genomics ; 115(6): 110708, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37730167

ABSTRACT

It has become widely accepted that sample cellular composition is a significant determinant of the gene expression patterns observed in any transcriptomic experiment performed with bulk tissue. Despite this, many investigations currently performed with whole blood do not experimentally account for possible inter-specimen differences in cellularity, and often assume that any observed gene expression differences are a result of true differences in nuclear transcription. To determine how strongly this assumption may confound results, we recruited a large cohort of human donors (n = 138) and used a combination of next generation sequencing and flow cytometry to quantify and compare the underlying contributions of variance in leukocyte counts versus variance in other biological factors to overall variance in whole blood transcript levels. Our results suggest that the combination of donor neutrophil and lymphocyte counts alone is the primary determinant of whole blood transcript levels for up to 75% of the protein-coding genes expressed in peripheral circulation, whereas other factors such as age, sex, race, ethnicity, and common disease states have comparatively minimal influence. Broadly, this implies that a majority of gene expression differences observed in experiments performed with whole blood are driven by latent differences in leukocyte counts, and that cell count heterogeneity must be accounted for to interpret the results in a biologically meaningful way.


Subjects
Leukocytes; Transcriptome; Humans; Leukocyte Count; Gene Expression Profiling
14.
BMC Bioinformatics ; 24(1): 318, 2023 Aug 22.
Article in English | MEDLINE | ID: mdl-37608264

ABSTRACT

BACKGROUND: Single-cell RNA sequencing (scRNA-seq) technology has enabled assessment of transcriptome-wide changes at single-cell resolution. Due to the heterogeneity in environmental exposure and genetic background across subjects, subject effect contributes to the major source of variation in scRNA-seq data with multiple subjects, which severely confounds cell type specific differential expression (DE) analysis. Moreover, dropout events are prevalent in scRNA-seq data, leading to excessive number of zeroes in the data, which further aggravates the challenge in DE analysis. RESULTS: We developed iDESC to detect cell type specific DE genes between two groups of subjects in scRNA-seq data. iDESC uses a zero-inflated negative binomial mixed model to consider both subject effect and dropouts. The prevalence of dropout events (dropout rate) was demonstrated to be dependent on gene expression level, which is modeled by pooling information across genes. Subject effect is modeled as a random effect in the log-mean of the negative binomial component. We evaluated and compared the performance of iDESC with eleven existing DE analysis methods. Using simulated data, we demonstrated that iDESC had well-controlled type I error and higher power compared to the existing methods. Applications of those methods with well-controlled type I error to three real scRNA-seq datasets from the same tissue and disease showed that the results of iDESC achieved the best consistency between datasets and the best disease relevance. CONCLUSIONS: iDESC was able to achieve more accurate and robust DE analysis results by separating subject effect from disease effect with consideration of dropouts to identify DE genes, suggesting the importance of considering subject effect and dropouts in the DE analysis of scRNA-seq data with multiple subjects.


Subjects
Models, Statistical; Transcriptome; Humans; Sequence Analysis, RNA
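
The zero-inflated negative binomial model named in the iDESC abstract above can be made concrete with a short sketch of its probability mass function and of a log-mean that carries a subject-level random effect. This is not the iDESC code. The mean/size parameterization of the negative binomial, the function names, and the linear predictor with a single binary group indicator are assumptions for illustration. The abstract's dependence of the dropout rate on expression level, pooled across genes, is not modeled here.

```python
import math


def nb_pmf(y: int, mu: float, size: float) -> float:
    """Negative binomial pmf with mean `mu` > 0 and dispersion `size` > 0."""
    log_p = (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
             + size * math.log(size / (size + mu))
             + y * math.log(mu / (size + mu)))
    return math.exp(log_p)


def zinb_pmf(y: int, mu: float, size: float, dropout: float) -> float:
    """Zero-inflated NB: with probability `dropout` the count is forced to 0."""
    base = (1.0 - dropout) * nb_pmf(y, mu, size)
    return dropout + base if y == 0 else base


def cell_mean(intercept: float, group_effect: float, group: int,
              subject_effect: float) -> float:
    """Mean of the NB component on the log scale.

    log(mu) = intercept + group_effect * group + subject_effect, where
    `group` is 0 or 1 and `subject_effect` is the random effect drawn
    once per subject (assumed form; the abstract gives no formula).
    """
    return math.exp(intercept + group_effect * group + subject_effect)
```

Placing the subject effect inside the log-mean means that all cells from one subject share the same multiplicative shift in expected counts, which is what lets the model separate subject-to-subject variation from the group difference of interest.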
15.
BMC Bioinformatics ; 24(1): 133, 2023 Apr 04.
Article in English | MEDLINE | ID: mdl-37016291

ABSTRACT

BACKGROUND: RNA-seq followed by de novo transcriptome assembly has been a transformative technique in biological research of non-model organisms, but the computational processing of RNA-seq data entails many different software tools. The complexity of these de novo transcriptomics workflows therefore presents a major barrier for researchers to adopt best-practice methods and up-to-date versions of software. RESULTS: Here we present a streamlined and universal de novo transcriptome assembly and annotation pipeline, transXpress, implemented in Snakemake. transXpress supports two popular assembly programs, Trinity and rnaSPAdes, and allows parallel execution on heterogeneous cluster computing hardware. CONCLUSIONS: transXpress simplifies the use of best-practice methods and up-to-date software for de novo transcriptome assembly, and produces standardized output files that can be mined using SequenceServer to facilitate rapid discovery of new genes and proteins in non-model organisms.


Subjects
Software; Transcriptome; Sequence Analysis, RNA/methods; RNA-Seq; Gene Expression Profiling; Molecular Sequence Annotation
16.
J Proteome Res ; 22(4): 1092-1104, 2023 04 07.
Article in English | MEDLINE | ID: mdl-36939687

ABSTRACT

Mass spectrometry is widely used for quantitative proteomics studies, relative protein quantification, and differential expression analysis of proteins. There is a large variety of quantification software and analysis tools. Nevertheless, there is a need for a modular, easy-to-use application programming interface in R that transparently supports a variety of well principled statistical procedures to make applying them to proteomics data, comparing and understanding their differences easy. The prolfqua package integrates essential steps of the mass spectrometry-based differential expression analysis workflow: quality control, data normalization, protein aggregation, statistical modeling, hypothesis testing, and sample size estimation. The package makes integrating new data formats easy. It can be used to model simple experimental designs with a single explanatory variable and complex experiments with multiple factors and hypothesis testing. The implemented methods allow sensitive and specific differential expression analysis. Furthermore, the package implements benchmark functionality that can help to compare data acquisition, data preprocessing, or data modeling methods using a gold standard data set. The application programmer interface of prolfqua strives to be clear, predictable, discoverable, and consistent to make proteomics data analysis application development easy and exciting. Finally, the prolfqua R-package is available on GitHub https://github.com/fgcz/prolfqua, distributed under the MIT license. It runs on all platforms supported by the R free software environment for statistical computing and graphics.


Subjects
Proteomics; Software; Proteomics/methods; Proteins/analysis; Models, Statistical; Mass Spectrometry/methods
17.
BMC Plant Biol ; 23(1): 346, 2023 Jun 30.
Article in English | MEDLINE | ID: mdl-37391695

ABSTRACT

BACKGROUND: The solubilization of aluminum ions (Al3+) that results from soil acidity (pH < 5.5) is a limiting factor in oil palm yield. Al can be taken up by plant roots, affecting DNA replication and cell division and triggering root morphological alterations and nutrient and water deprivation. In different oil palm-producing countries, oil palm is planted in acidic soils, representing a challenge for achieving high productivity. Several studies have reported the morphological, physiological, and biochemical mechanisms of oil palm in response to Al-stress. However, the molecular mechanisms are only partially understood. RESULTS: Differential gene expression and network analysis of four contrasting oil palm genotypes (IRHO 7001, CTR 3-0-12, CR 10-0-2, and CD 19 - 12) exposed to Al-stress helped to identify a set of genes and modules involved in the early response of oil palm to the metal. Networks were identified that include the ABA-independent transcription factors DREB1F and NAC and the calcium sensor Calmodulin-like (CML), which could induce the expression of the internal detoxifying enzymes GRXC1, PER15, ROMT, ZSS1, BBI, and HS1 against Al-stress. Also, some gene networks pinpoint the role of secondary metabolites like polyphenols, sesquiterpenoids, and antimicrobial components in reducing oxidative stress in oil palm seedlings. STOP1 expression could be the first step in the induction of common Al-response genes as an external detoxification mechanism mediated by ABA-dependent pathways. CONCLUSIONS: Twelve hub genes were validated in this study, supporting the reliability of the experimental design and network analysis. Differential expression analysis and systems biology approaches provide a better understanding of the molecular network mechanisms of the response to aluminum stress in oil palm roots. These findings lay a foundation for further functional characterization of candidate genes associated with Al-stress in oil palm.


Subjects
Aluminum; Calcium; Aluminum/toxicity; Reproducibility of Results; Calmodulin; Cell Division
18.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-33975339

ABSTRACT

The mechanisms controlling biological processes, such as the development of disease or cell differentiation, can be investigated by examining changes in the networks of gene dependencies between states in the process. High-throughput experimental methods, such as microarrays and RNA sequencing, have been widely used to gather gene expression data, which paves the way for inferring gene dependencies with computational methods. However, most differential network analysis methods are designed for fully observed data, whereas missing values, such as the dropout events in single-cell RNA-sequencing data, are frequent. New methods are needed to take account of these missing values. Moreover, since changes in gene dependencies may be driven by certain perturbed genes, considering changes in gene expression levels may help identify gene network rewiring. In this study, a novel weighted differential network estimation (WDNE) model is proposed to handle multi-platform gene expression data with missing values and to take account of changes in gene expression levels. Simulation studies demonstrate that WDNE outperforms state-of-the-art differential network estimation methods. When WDNE is applied to infer differential gene networks associated with drug resistance in ovarian tumors, cell differentiation, and breast tumor heterogeneity, the hub genes in the estimated differential gene networks provide important insights into the underlying mechanisms. Furthermore, a Matlab toolbox, the differential network analysis toolbox, was developed to implement the WDNE model and visualize the estimated differential networks.
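
The general idea of a differential network on data with missing values can be illustrated with a minimal sketch. This is not the WDNE model, whose formulation is not given in the abstract; it swaps in a much simpler stand-in, the difference between pairwise-complete Pearson correlations in two states. The function names `pairwise_corr` and `differential_edges`, the threshold of 0.5, and the toy matrices are all assumptions made for illustration.

```python
import numpy as np


def pairwise_corr(x):
    """Pearson correlation between columns (genes) of a samples-by-genes
    matrix, using only the samples where both genes are observed (not NaN)."""
    n_genes = x.shape[1]
    corr = np.eye(n_genes)
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            ok = ~np.isnan(x[:, i]) & ~np.isnan(x[:, j])
            r = np.nan  # undefined unless enough varying, jointly observed data
            if ok.sum() >= 3:
                xi, xj = x[ok, i], x[ok, j]
                if xi.std() > 0 and xj.std() > 0:
                    r = np.corrcoef(xi, xj)[0, 1]
            corr[i, j] = corr[j, i] = r
    return corr


def differential_edges(state_a, state_b, threshold=0.5):
    """Return (gene_i, gene_j, change) for gene pairs whose correlation
    changes by at least `threshold` between the two states.
    Pairs with an undefined correlation (NaN) are skipped."""
    diff = pairwise_corr(state_b) - pairwise_corr(state_a)
    n = diff.shape[0]
    return [
        (i, j, float(diff[i, j]))
        for i in range(n)
        for j in range(i + 1, n)
        if abs(diff[i, j]) >= threshold
    ]


# Hypothetical toy data: rows are samples, columns are three genes.
# Genes 0 and 1 rise together in state A and move in opposite directions in state B.
state_a = np.array([
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, np.nan],
    [4.0, 8.0, 4.0],
    [5.0, 10.0, 6.0],
])
state_b = np.array([
    [1.0, 10.0, 5.0],
    [2.0, 8.0, 3.0],
    [3.0, 6.0, 4.0],
    [4.0, np.nan, 4.0],
    [5.0, 2.0, 6.0],
])

edges = differential_edges(state_a, state_b)
for i, j, change in edges:
    print(f"gene {i} - gene {j}: correlation change {change:+.2f}")
```

Pairwise-complete correlation simply discards missing entries pair by pair; it does not weight observations or use changes in expression level, which are the features the abstract attributes to WDNE.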


Subjects
Algorithms , Breast Neoplasms , Drug Resistance, Neoplasm/genetics , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Models, Genetic , Ovarian Neoplasms , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Female , Gene Expression Profiling , Humans , Ovarian Neoplasms/genetics , Ovarian Neoplasms/metabolism
19.
Brief Bioinform ; 22(3), 2021 May 20.
Article in English | MEDLINE | ID: mdl-32520347

ABSTRACT

Label-free shotgun proteomics is an important tool in biomedical research, where tandem mass spectrometry with data-dependent acquisition (DDA) is frequently used for protein identification and quantification. However, DDA datasets contain a significant number of missing values (MVs) that severely hinder proper analysis. Existing literature suggests that different imputation methods should be used for the two types of MVs: missing completely at random or missing not at random. However, the simulated or biased datasets used in most such studies offer few clues about the composition, and thus the proper imputation, of MVs in real-life proteomic datasets. Moreover, the impact of imputation methods on downstream differential expression analysis, a critical goal for many biomedical projects, is largely undetermined. In this study, we investigated public DDA datasets of various tissue/sample types to determine the composition of MVs in them. We then developed simulated datasets that imitate the MV profile of real-life datasets. Using such datasets, we compared the impact of various popular imputation methods on the analysis of differentially expressed proteins. Finally, we make recommendations on which imputation method(s) to use for proteomic data beyond just DDA datasets.
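
To make the distinction between the two MV types concrete, here is a minimal sketch of two simple imputation strategies often associated with them: row-mean imputation, which is a common choice when values are assumed missing completely at random, and row-minimum imputation, which is a common choice when values are assumed missing because they fell below the detection limit. The abstract does not name the methods the study compared, so these two, the function names, and the toy intensity matrix are assumptions for illustration only.

```python
import numpy as np


def impute_row_mean(x):
    """Replace each NaN with the mean of the observed values in its row
    (one row per protein). Assumes every row has at least one observed value."""
    out = x.copy()
    row_means = np.nanmean(out, axis=1, keepdims=True)
    mask = np.isnan(out)
    out[mask] = np.broadcast_to(row_means, out.shape)[mask]
    return out


def impute_row_min(x):
    """Replace each NaN with the smallest observed value in its row,
    a crude stand-in for left-censored (below detection limit) values."""
    out = x.copy()
    row_mins = np.nanmin(out, axis=1, keepdims=True)
    mask = np.isnan(out)
    out[mask] = np.broadcast_to(row_mins, out.shape)[mask]
    return out


# Hypothetical log-intensity matrix: rows are proteins, columns are runs.
toy = np.array([
    [20.0, 21.0, np.nan, 19.0],
    [15.0, np.nan, np.nan, 16.0],
    [25.0, 24.0, 26.0, 25.0],
])

mean_filled = impute_row_mean(toy)
min_filled = impute_row_min(toy)
print(mean_filled)
print(min_filled)
```

The two strategies fill the same gaps with different values, which changes the group means and variances that a later differential expression test depends on. That sensitivity is the downstream effect the abstract sets out to measure.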


Subjects
Algorithms , Databases, Protein , Proteome , Proteomics , Humans
20.
Cancer Invest ; 41(4): 394-404, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36797673

ABSTRACT

Identifying differentially expressed genes and co-expression modules can lead to novel biomarkers. GO, pathway enrichment, network, and tumor stage analyses were performed on 318 ovarian cancer samples from TCGA, categorised into primary and recurrent, pre-menopause and post-menopause, and early and late stage tumors. The numbers of upregulated and downregulated genes in primary vs recurrent, early stage vs late stage, and pre-menopause vs post-menopause tumors were 84 and 62, 84 and 35, and 88 and 14, respectively. IRAK2 and CXCL8 had higher expression in recurrent tumors, while REG1A had higher expression in post-menopause samples. In late stage tumors, constant expression of IRAK2 and REG1A was observed, while that of CXCL8 and EGF decreased. These genes may be potential biomarkers for the diagnosis of the disease.


Subjects
Gene Regulatory Networks , Ovarian Neoplasms , Humans , Female , Neoplasm Recurrence, Local , Biomarkers , Ovarian Neoplasms/genetics , Ovarian Neoplasms/pathology , Sequence Analysis, RNA , Lithostathine/genetics