ABSTRACT
Pathway enrichment analysis is indispensable for interpreting omics datasets and generating hypotheses. However, the foundations of enrichment analysis remain elusive to many biologists. Here, we discuss best practices in interpreting different types of omics data using pathway enrichment analysis and highlight the importance of considering intrinsic features of various types of omics data. We further explain major components that influence the outcomes of a pathway enrichment analysis, including defining background sets and choosing reference annotation databases. To improve reproducibility, we describe how to standardize reporting methodological details in publications. This article aims to serve as a primer for biologists to leverage the wealth of omics resources and motivate bioinformatics tool developers to enhance the power of pathway enrichment analysis.
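The influence of the background set can be illustrated with a minimal over-representation sketch based on the hypergeometric test. This is an illustration under stated assumptions, not a method taken from the article: the gene identifiers, set sizes, and the function names `hypergeom_sf` and `ora_pvalue` are all hypothetical.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k): draw n genes from N, of which K belong to the pathway."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

def ora_pvalue(hits, pathway, background):
    """One-sided over-representation P-value, restricted to the background."""
    background = set(background)
    hits = set(hits) & background
    pathway = set(pathway) & background
    overlap = len(hits & pathway)
    return hypergeom_sf(overlap, len(background), len(pathway), len(hits))

# Hypothetical toy data: 1000 annotated genes, of which 200 are detectable
# in the experiment; the pathway lies entirely within the detectable genes.
genome = [f"g{i}" for i in range(1000)]
expressed = genome[:200]
pathway = genome[:20]
hits = genome[:5] + genome[100:115]  # 20 hits, 5 of them in the pathway

p_genome = ora_pvalue(hits, pathway, genome)
p_expressed = ora_pvalue(hits, pathway, expressed)
print(f"whole-genome background: {p_genome:.3g}")
print(f"expressed-gene background: {p_expressed:.3g}")
```

In this toy case the same overlap looks more significant against the whole-genome background than against the genes that could actually have been detected, which is why the choice of background needs to be reported.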
Subjects
Computational Biology, Reproducibility of Results
ABSTRACT
Enrichment analysis (EA) is a common approach to gain functional insights from genome-scale experiments. As a consequence, a large number of EA methods have been developed, yet it is unclear from previous studies which method is the best for a given dataset. The main issues with previous benchmarks include the complexity of correctly assigning true pathways to a test dataset, and lack of generality of the evaluation metrics, for which the rank of a single target pathway is commonly used. We here provide a generalized EA benchmark and apply it to the most widely used EA methods, representing all four categories of current approaches. The benchmark employs a new set of 82 curated gene expression datasets from DNA microarray and RNA-Seq experiments for 26 diseases, of which only 13 are cancers. In order to address the shortcomings of the single target pathway approach and to enhance the sensitivity evaluation, we present the Disease Pathway Network, in which related Kyoto Encyclopedia of Genes and Genomes pathways are linked. We introduce a novel approach to evaluate pathway EA by combining sensitivity and specificity to provide a balanced evaluation of EA methods. This approach identifies Network Enrichment Analysis methods as the overall top performers compared with overlap-based methods. By using randomized gene expression datasets, we explore the null hypothesis bias of each method, revealing that most of them produce skewed P-values.
Subjects
Benchmarking, RNA-Seq
ABSTRACT
irGSEA is an R package designed to assess the outcomes of various gene set scoring methods when applied to single-cell RNA sequencing data. This package incorporates six distinct scoring methods that rely on the expression ranks of genes, emphasizing relative expression levels over absolute values. The implemented methods include AUCell, UCell, singscore, ssGSEA, JASMINE and Viper. Previous studies have demonstrated the robustness of these methods to variations in dataset size and composition, generating enrichment scores based solely on the relative gene expression of individual cells. By employing the robust rank aggregation algorithm, irGSEA amalgamates results from all six methods to ascertain the statistical significance of target gene sets across diverse scoring methods. The package prioritizes user-friendliness, allowing direct input of expression matrices or seamless interaction with Seurat objects. Furthermore, it facilitates a comprehensive visualization of results. The irGSEA package and its accompanying documentation are accessible on GitHub (https://github.com/chuiqin/irGSEA).
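The idea of scoring a gene set from within-cell expression ranks can be sketched as follows. This is a deliberately simplified mean-rank score, not the algorithm of AUCell, UCell, singscore, ssGSEA, JASMINE, Viper, or irGSEA itself; the function name `rank_score` and the toy cell are hypothetical.

```python
def rank_score(expression, gene_set):
    """Mean normalized rank of the gene-set members within one cell.

    expression: dict mapping gene name -> expression value for a single cell.
    Ties are broken by gene name, which is a simplification.
    """
    ordered = sorted(expression, key=lambda g: (expression[g], g))
    rank = {gene: i + 1 for i, gene in enumerate(ordered)}  # 1 = lowest
    members = [g for g in gene_set if g in rank]
    if not members:
        raise ValueError("no gene-set member is present in this cell")
    return sum(rank[g] for g in members) / (len(members) * len(ordered))

# Hypothetical cell with four genes; the gene set holds the two highest.
cell = {"A": 5.0, "B": 0.5, "C": 3.0, "D": 1.0}
scaled_cell = {gene: 10 * value for gene, value in cell.items()}

print(rank_score(cell, {"A", "C"}))         # 0.875
print(rank_score(scaled_cell, {"A", "C"}))  # 0.875, ranks are unchanged
```

Because only ranks within a cell enter the score, rescaling the cell's expression values leaves the result unchanged, which is the property that makes such scores insensitive to dataset size and composition.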
Subjects
Algorithms, Single-Cell Analysis, Software, Single-Cell Analysis/methods, Humans, Computational Biology/methods, Gene Expression Profiling/methods, RNA Sequence Analysis/methods
ABSTRACT
Enrichment analysis contextualizes biological features in pathways to facilitate a systematic understanding of high-dimensional data and is widely used in biomedical research. The emerging reporter score-based analysis (RSA) method shows more promising sensitivity, as it relies on P-values instead of raw values of features. However, RSA cannot be directly applied to multi-group and longitudinal experimental designs and is often misused due to the lack of a proper tool. Here, we propose the Generalized Reporter Score-based Analysis (GRSA) method for multi-group and longitudinal omics data. A comparison with other popular enrichment analysis methods demonstrated that GRSA had increased sensitivity across multiple benchmark datasets. We applied GRSA to microbiome, transcriptome and metabolome data and discovered new biological insights in omics studies. Finally, we demonstrated the application of GRSA beyond functional enrichment using a taxonomy database. We implemented GRSA in an R package, ReporterScore, integrating with a powerful visualization module and updatable pathway databases, which is available on the Comprehensive R Archive Network (https://cran.r-project.org/web/packages/ReporterScore). We believe that the ReporterScore package will be a valuable asset for broad biomedical research fields.
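How a reporter score is derived from P-values can be sketched with the classic formulation: each P-value is converted to a Z-score, the Z-scores of a pathway's features are aggregated, and the result is corrected against random feature sets of the same size. This is a sketch of that classic scheme under stated assumptions, not the GRSA method or the ReporterScore package API; the function names and toy data are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def z_from_p(p, eps=1e-12):
    """Convert a P-value to a Z-score via the inverse normal CDF."""
    p = min(max(p, eps), 1 - eps)  # keep the argument strictly inside (0, 1)
    return NormalDist().inv_cdf(1 - p)

def reporter_score(pvalues, pathway, n_perm=1000, seed=0):
    """Background-corrected aggregate Z-score for one pathway."""
    z = {feature: z_from_p(p) for feature, p in pvalues.items()}
    members = [f for f in pathway if f in z]
    k = len(members)
    if k == 0:
        raise ValueError("pathway has no measured features")
    raw = sum(z[f] for f in members) / sqrt(k)
    # Null distribution: random feature sets of the same size k.
    rng = random.Random(seed)
    features = sorted(z)
    null = [sum(z[f] for f in rng.sample(features, k)) / sqrt(k)
            for _ in range(n_perm)]
    return (raw - mean(null)) / stdev(null)

# Hypothetical data: 45 unremarkable features plus 5 strongly significant ones.
toy_pvalues = {f"feat{i:02d}": 0.2 + 0.7 * i / 44 for i in range(45)}
toy_pvalues.update({f"hit{i}": 0.001 for i in range(5)})
toy_pathway = [f"hit{i}" for i in range(5)]

print(round(reporter_score(toy_pvalues, toy_pathway), 2))
```

The fixed seed only makes the permutation background reproducible; the number of permutations is an arbitrary choice for this sketch.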
Subjects
Biomedical Research, Microbiota, Benchmarking, Factual Databases, Metabolome
ABSTRACT
BACKGROUND: Global or untargeted metabolomics is widely used to comprehensively investigate metabolic profiles under various pathophysiological conditions such as inflammation, infections, responses to exposures or interactions with microbial communities. However, biological interpretation of global metabolomics data remains a daunting task. Recent years have seen growing applications of pathway enrichment analysis based on putative annotations of liquid chromatography coupled with mass spectrometry (LC-MS) peaks for functional interpretation of LC-MS-based global metabolomics data. However, due to intricate peak-metabolite and metabolite-pathway relationships, considerable variations are observed among results obtained using different approaches. There is an urgent need to benchmark these approaches to inform best practices. RESULTS: We have conducted a benchmark study of common peak annotation approaches and pathway enrichment methods in current metabolomics studies. Representative approaches, including three peak annotation methods and four enrichment methods, were selected and benchmarked under different scenarios. Based on the results, we have provided a set of recommendations regarding peak annotation, ranking metrics and feature selection. Overall, the mummichog approach performed better than the other approaches. We have observed that a ~30% annotation rate is sufficient to achieve high recall (~90% based on mummichog), and that using semi-annotated data improves functional interpretation. Based on the current platforms and enrichment methods, we further propose an identifiability index to indicate the possibility of a pathway being reliably identified. Finally, we evaluated all methods using 11 COVID-19 and 8 inflammatory bowel disease (IBD) global metabolomics datasets.
Subjects
COVID-19, Tandem Mass Spectrometry, Humans, Liquid Chromatography/methods, Metabolomics/methods, Metabolome
ABSTRACT
BACKGROUND: It is valuable to analyze the genome-wide association studies (GWAS) data for a complex disease phenotype in the context of the protein-protein interaction (PPI) network, as the related pathophysiology results from the function of interacting polyprotein pathways. The analysis may include the design and curation of a phenotype-specific GWAS meta-database incorporating genotypic and eQTL data linking to PPI and other biological datasets, and the development of systematic workflows for PPI network-based data integration toward protein and pathway prioritization. Here, we pursued this analysis for blood pressure (BP) regulation. METHODS: The relational scheme of the BP-GWAS meta-database, implemented in Microsoft SQL Server, enabled the combined storage of GWAS data and attributes mined from the GWAS Catalog and the literature, Ensembl-defined SNP-transcript associations, and GTEx eQTL data. The BP-protein interactome was reconstructed from the PICKLE PPI meta-database, extending the GWAS-deduced network with the shortest paths connecting all GWAS-proteins into one component. The shortest-path intermediates were considered as BP-related. For protein prioritization, we combined a new integrated GWAS-based scoring scheme with two network-based criteria: one considering the protein's role in the interactome reconstructed by shortest paths (RbSP), and a novel one promoting the common neighbors of GWAS-prioritized proteins. Prioritized proteins were ranked by the number of satisfied criteria. RESULTS: The meta-database includes 6687 variants linked with 1167 BP-associated protein-coding genes. The GWAS-deduced PPI network includes 1065 proteins, with 672 forming a connected component. The RbSP interactome contains 1443 additional, network-deduced proteins and indicated that essentially all BP-GWAS proteins are at most second neighbors. The prioritized BP-protein set was derived from the union of the proteins ranked as most BP-significant by any of the GWAS-based or network-based criteria. It included 335 proteins, with ~2/3 deduced from the BP PPI network extension and 126 prioritized by at least two criteria. ESR1 was the only protein satisfying all three criteria, followed in the top 10 by INSR, PTN11, CDK6, CSK, NOS3, SH2B3, ATP2B1, FES and FINC, each satisfying two. Pathway analysis of the RbSP interactome revealed numerous bioprocesses, which are indeed functionally supported as BP-associated, extending our understanding of BP regulation. CONCLUSIONS: The implemented workflow could be used for other multifactorial diseases.
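The step of extending a seed network with shortest-path intermediates can be sketched with a breadth-first search on an unweighted graph. This is a simplified illustration, not the authors' workflow: it takes one shortest path per seed pair, and the node names, edge list, and function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def build_graph(edges):
    """Undirected adjacency sets from a list of (a, b) interaction pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def shortest_path(graph, source, target):
    """One shortest path found by breadth-first search, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None

def path_intermediates(graph, seeds):
    """Non-seed nodes lying on a shortest path between any two seeds."""
    seeds = set(seeds)
    found = set()
    for a, b in combinations(sorted(seeds), 2):
        path = shortest_path(graph, a, b)
        if path:
            found.update(node for node in path[1:-1] if node not in seeds)
    return found

# Hypothetical network: seeds S1, S2, S3 are linked only through X and Y.
toy_graph = build_graph([("S1", "X"), ("X", "S2"), ("S2", "Y"),
                         ("Y", "S3"), ("S3", "W")])
print(sorted(path_intermediates(toy_graph, ["S1", "S2", "S3"])))  # ['X', 'Y']
```

Node W is not reported because it lies on no path between seeds, which mirrors the idea that only connecting intermediates are treated as phenotype-related.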
Subjects
Genome-Wide Association Study, Protein Interaction Maps, Humans, Protein Interaction Maps/genetics, Genome-Wide Association Study/methods, Blood Pressure/genetics, Genotype, Factual Databases, Plasma Membrane Calcium-Transporting ATPases
ABSTRACT
Although glioblastoma multiforme (GBM) is not an invariably cold tumor, checkpoint inhibition has largely failed in GBM. In order to investigate T cell-intrinsic properties that contribute to the resistance of GBM to endogenous or therapeutically enhanced adaptive immune responses, we sorted CD4+ and CD8+ T cells from the peripheral blood, normal-appearing brain tissue, and tumor bed of nine treatment-naive patients with GBM. Bulk RNA sequencing of highly pure T cell populations from these different compartments was used to obtain deep transcriptomes of tumor-infiltrating T cells (TILs). While the transcriptome of CD8+ TILs suggested that they were partly locked in a dysfunctional state, CD4+ TILs showed a robust commitment to the type 17 T helper cell (TH17) lineage, which was corroborated by flow cytometry in four additional GBM cases. Therefore, our study illustrates that the brain tumor environment in GBM might instruct TH17 commitment of infiltrating T helper cells. Whether these properties of CD4+ TILs facilitate a tumor-promoting milieu and thus could be a target for adjuvant anti-TH17 cell interventions needs to be further investigated.
Subjects
Brain Neoplasms, CD4-Positive T-Lymphocytes, Glioblastoma, Helper-Inducer T-Lymphocytes, Brain Neoplasms/pathology, CD4-Positive T-Lymphocytes/cytology, CD8-Positive T-Lymphocytes/cytology, Flow Cytometry, Glioblastoma/pathology, Humans, Tumor-Infiltrating Lymphocytes/cytology, Helper-Inducer T-Lymphocytes/cytology
ABSTRACT
Growing evidence has shown that, besides the protein-coding genes, the non-coding elements of the genome are indispensable for maintaining the property of self-renewal in human embryonic stem cells and in cell fate determination. However, the regulatory mechanisms and the landscape of interactions between the coding and non-coding elements are poorly understood. In this work, we used weighted gene co-expression network analysis (WGCNA) on transcriptomic data retrieved from RNA-seq and small RNA-seq experiments and reconstructed the core human pluripotency network (called PluriMLMiNet) consisting of 375 mRNAs, 57 lncRNAs and 207 miRNAs. Furthermore, we derived networks specific to the naive and primed states of human pluripotency (called NaiveMLMiNet and PrimedMLMiNet, respectively) that revealed a set of molecular markers (RPS6KA1, ZYG11A, ZNF695, ZNF273, and NLRP2 for the naive state, and RAB34, TMEM178B, PTPRZ1, USP44, KIF1A and LRRN1 for the primed state) which can be used to distinguish the pluripotent state from the non-pluripotent state and also to identify the intra-pluripotency states (i.e., the naive and primed states). The lncRNA DANT1 was found to be crucial, as it formed a bridge between the naive and primed state-specific networks. Analysis of the genes neighbouring DANT1 suggested its possible role as a competing endogenous RNA (ceRNA) for the induction and maintenance of human pluripotency. This was computationally validated by predicting the missing DANT1-miRNA interactions to complete the ceRNA circuit. Here we report for the first time that DANT1 might harbour binding sites for the miRNAs hsa-miR-30c-2-3p, hsa-miR-210-3p and hsa-let-7b-5p, which may influence pluripotency.
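The core of a weighted co-expression network can be sketched as a soft-thresholded correlation: in an unsigned WGCNA-style network, the adjacency between two genes is the absolute Pearson correlation raised to a power beta. This is a minimal sketch, not the WGCNA R package; the expression profiles, gene names, and the choice beta = 6 are assumptions for illustration, and the power used in the study is not stated in the abstract.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta for each gene pair."""
    genes = sorted(profiles)
    return {(g, h): abs(pearson(profiles[g], profiles[h])) ** beta
            for g in genes for h in genes if g < h}

# Hypothetical expression profiles across five samples.
toy_profiles = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],  # perfectly correlated with geneA
    "geneC": [5, 3, 4, 1, 2],   # correlation of -0.8 with geneA
}
adjacency = soft_threshold_adjacency(toy_profiles)
for pair, weight in adjacency.items():
    print(pair, round(weight, 4))
```

Raising the correlation to a power keeps strong relationships close to 1 while pushing moderate ones toward 0, so modules are driven by the strongest co-expression.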
Subjects
Human Embryonic Stem Cells, MicroRNAs, Long Noncoding RNA, Humans, Long Noncoding RNA/genetics, Long Noncoding RNA/metabolism, Messenger RNA/genetics, Human Embryonic Stem Cells/metabolism, MicroRNAs/genetics, MicroRNAs/metabolism, Gene Expression Profiling, Gene Regulatory Networks/genetics, Cell Cycle Proteins/metabolism, Kinesins/genetics, Kinesins/metabolism, Class 5 Receptor-Like Protein Tyrosine Phosphatases/genetics, Class 5 Receptor-Like Protein Tyrosine Phosphatases/metabolism, Ubiquitin Thiolesterase/genetics, Ubiquitin Thiolesterase/metabolism
ABSTRACT
The ever-decreasing cost of Next-Generation Sequencing, coupled with the emergence of efficient and reproducible analysis pipelines, has rendered genomic methods more accessible. However, downstream analyses are basic or missing in most workflows, creating a significant barrier for non-bioinformaticians. To help close this gap, we developed Cactus, an end-to-end pipeline for analyzing ATAC-Seq and mRNA-Seq data, either separately or jointly. Its Nextflow-, container-, and virtual environment-based architecture ensures efficient and reproducible analyses. Cactus preprocesses raw reads, conducts differential analyses between conditions, and performs enrichment analyses in various databases, including DNA-binding motifs, ChIP-Seq binding sites, chromatin states, and ontologies. We demonstrate the utility of Cactus in a multi-modal and multi-species case study, and showcase its unique capabilities compared with other ATAC-Seq pipelines. In conclusion, Cactus can assist researchers in gaining comprehensive insights from chromatin accessibility and gene expression data in a quick, user-friendly, and reproducible manner.
Subjects
Software, Humans, Animals, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods, Chromatin Immunoprecipitation Sequencing/methods, Chromatin/genetics, Chromatin/metabolism, RNA-Seq/methods
ABSTRACT
BACKGROUND: Bufo gargarizans Cantor, a widely distributed amphibian species in Asia, produces and releases toxins through its retroauricular and granular glands. Although various tissues have been sequenced, the molecular mechanisms underlying toxin production remain unclear. To elucidate these mechanisms, abdominal skin (non-toxic secretory glands) and retroauricular gland (toxic secretory glands) samples were collected at different time points (3, 6, 12, 24, and 36 months) for RNA sequencing (RNA-seq) and analysis. RESULTS: In comparison to the S group during the same period, a total of 3053, 3026, 1516, 1028, and 2061 differentially expressed genes (DEGs) were identified across the five developmental stages. Gene Ontology (GO) analysis revealed that DEGs were primarily enriched in biological processes including cellular processes, single-organism processes, metabolic processes, and biological regulation. In terms of cellular components, the DEGs were predominantly localized in the cell and cell parts, whereas the molecular function category indicated significant enrichment in binding and catalytic activity. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed that the metabolism and synthesis of various substances, such as lipid metabolism, cofactor and vitamin metabolism, tryptophan metabolism, steroid biosynthesis, and primary bile acid biosynthesis, changed as the toads developed. Additionally, using trend analysis, we discovered candidate genes that were upregulated in the retroauricular glands during development and whose abundance in the abdominal skin was extremely low. Finally, we identified 26 genes that are likely to be involved in toxin production and anabolism. CONCLUSION: Overall, these results provide new insights into the genes involved in toxin production in B. gargarizans, which will improve our understanding of the molecular mechanisms underlying toxigenic gene expression.
Subjects
Bufonidae, Animals, Bufonidae/genetics, Bufonidae/metabolism, Bufonidae/growth & development, Transcriptome, RNA-Seq, RNA Sequence Analysis, Spatio-Temporal Analysis
ABSTRACT
BACKGROUND: High-throughput experimental technologies can provide deeper insights into pathway perturbations in biomedical studies. Accordingly, their usage is central to the identification of molecular targets and the subsequent development of suitable treatments for various diseases. Classical interpretations of generated data, such as differential gene expression and pathway analyses, disregard interconnections between studied genes when looking for gene-disease associations. Given that these interconnections are central to cellular processes, there has been recent interest in incorporating them in such studies. This allows the detection of gene modules that underlie complex phenotypes in gene interaction networks. Existing methods either impose radius-based restrictions or freely grow modules at the expense of a statistical bias towards large modules. We propose a heuristic method, inspired by Ant Colony Optimization, to apply gene-level scoring and module identification with distance-based search constraints and penalties, rather than radius-based constraints. RESULTS: We test and compare our results to other approaches using three datasets of different neurodegenerative diseases, namely Alzheimer's, Parkinson's, and Huntington's, over three independent experiments. We report the outcomes of enrichment analyses and concordance of gene-level scores for each disease. Results indicate that the proposed approach generally shows superior stability in comparison to existing methods. It produces stable and meaningful enrichment results in all three datasets, which have different case-to-control proportions and sample sizes. CONCLUSION: The presented network-based gene expression analysis approach successfully identifies dysregulated gene modules associated with a certain disease. Using a heuristic based on Ant Colony Optimization, we perform a distance-based search with no radius constraints. Experimental results support the effectiveness and stability of our method in prioritizing modules of high relevance. Our tool is publicly available at github.com/GhadiElHasbani/ACOxGS.git.
Subjects
Gene Regulatory Networks, Gene Regulatory Networks/genetics, Humans, Algorithms, Neurodegenerative Diseases/genetics, Gene Expression Profiling/methods, Computational Biology/methods, Animals, Ants/genetics, Genetic Databases
ABSTRACT
Functional analysis of high-throughput experiments using pathway analysis is now ubiquitous. Though powerful, these methods often produce thousands of redundant results owing to knowledgebase redundancies upstream. This scale of results hinders extensive exploration by biologists and can lead to investigator biases due to previous knowledge and expectations. To address this issue, we present vissE, a flexible network-based analysis and visualisation tool that organises information into semantic categories and provides various visualisation modules to characterise them with respect to the underlying data, thus providing a comprehensive view of the biological system. We demonstrate vissE's versatility by applying it to three different technologies: bulk, single-cell and spatial transcriptomics. Applying vissE to a factor analysis of a breast cancer spatial transcriptomic dataset, we identified stromal phenotypes that support tumour dissemination. Its adaptability allows vissE to enhance all existing gene-set enrichment and pathway analysis workflows, empowering biologists during molecular discovery.
Subjects
Breast Neoplasms, Gene Expression Profiling, Humans, Female, Breast Neoplasms/genetics, Breast Neoplasms/pathology, Transcriptome, Phenotype
ABSTRACT
Determination of the prognosis and treatment outcomes of dilated cardiomyopathy is a serious problem due to the lack of valid specific protein markers. Using in-depth proteome discovery analysis, we compared 49 plasma samples from patients suffering from dilated cardiomyopathy with plasma samples from their healthy counterparts. In total, we identified 97 proteins exhibiting statistically significant dysregulation in diseased plasma samples. The functional enrichment analysis of differentially expressed proteins uncovered dysregulation in biological processes such as inflammatory response, wound healing, complement cascade, blood coagulation, and lipid metabolism in dilated cardiomyopathy patients. The same proteomic approach was employed to find protein markers whose expression differs between patients responding well to therapy and nonresponders. In this case, 45 plasma proteins showed statistically significant differential expression between these two groups. Of them, fructose-1,6-bisphosphate aldolase seems to be a promising biomarker candidate because it accumulates in plasma samples obtained from patients with insufficient treatment response and with worse or fatal outcomes. Data are available via ProteomeXchange with the identifier PXD046288.
Subjects
Dilated Cardiomyopathy, Humans, Dilated Cardiomyopathy/therapy, Proteome/genetics, Proteomics, Biomarkers, Blood Coagulation
ABSTRACT
Huntington's disease (HD) is a progressive neurodegenerative disorder characterised by the expansion of a specific trinucleotide repeat sequence (cytosine-adenine-guanine, CAG). It is inherited as a dominant trait and worsens over time. Despite HD being monogenic, its underlying mechanisms and biomarkers remain poorly understood. Furthermore, early detection of HD is challenging, and the available diagnostic procedures have low precision and accuracy. This research was conducted to provide knowledge of the biomarkers, pathways and therapeutic targets involved in the molecular processes of HD using informatics-based analysis and network-based systems biology approaches. The gene expression profile datasets GSE97100 and GSE74201 relevant to HD were studied. As a result, 46 differentially expressed genes (DEGs) were identified. Ten hub genes (TPM1, EIF2S3, CCN2, ACTN1, ACTG2, CCN1, CSRP1, EIF1AX, BEX2 and TCEAL5) were further identified in the protein-protein interaction (PPI) network. These hub genes were typically down-regulated. Additionally, DEG-transcription factor (TF) connections (e.g. GATA2, YY1 and FOXC1) and DEG-microRNA (miRNA) interactions (e.g. hsa-miR-124-3p and hsa-miR-26b-5p) were predicted. Related gene ontology terms (e.g. sequence-specific DNA binding and TF activity) connected to DEGs in HD were also identified using gene set enrichment analysis (GSEA). Finally, in silico drug design was employed to find candidate drugs for the treatment of HD, and possible modest therapeutic compounds (e.g. cortistatin A, 13,16-Epoxy-25-hydroxy-17-cheilanthen-19,25-olide, Hecogenin) against HD were predicted. Consequently, the results from this study may give researchers useful resources for the experimental validation of diagnostic and therapeutic approaches for HD.
Subjects
Computational Biology, Gene Regulatory Networks, Huntington Disease, Protein Interaction Maps, Huntington Disease/genetics, Huntington Disease/drug therapy, Huntington Disease/metabolism, Humans, Computational Biology/methods, Protein Interaction Maps/genetics, Protein Interaction Maps/drug effects, Gene Expression Profiling, Biomarkers/metabolism, Gene Expression Regulation/drug effects, Molecular Targeted Therapy, Transcriptome/genetics, Gene Ontology, MicroRNAs/genetics, Transcription Factors/genetics, Transcription Factors/metabolism
ABSTRACT
BACKGROUND: Most of the important biological mechanisms and functions of transmembrane proteins (TMPs) are realized through their interactions with non-transmembrane proteins (nonTMPs). The interactions between TMPs and nonTMPs in cells play vital roles in intracellular signaling and energy metabolism, and are relevant to investigating membrane-crossing mechanisms and the correlations between diseases and drugs. RESULTS: Despite the importance of TMP-nonTMP interactions, their study remains at the wet-lab experimental stage, lacking specific and comprehensive studies in the field of bioinformatics. To fill this gap, we performed a comprehensive statistical analysis of known TMP-nonTMP interactions and constructed a deep learning-based predictor to identify potential interactions. The statistical analysis describes known TMP-nonTMP interactions from various perspectives, such as distributions of species and protein families, enrichment of GO and KEGG pathways, as well as hub proteins and subnetwork modules in the PPI network. The predictor, implemented as an end-to-end deep learning model, can identify potential interactions from protein primary sequence information. The results on the independent validation set demonstrated considerable prediction performance, with an MCC of 0.541. CONCLUSIONS: To our knowledge, we were the first to focus on TMP-nonTMP interactions. We comprehensively analyzed them using bioinformatics methods and predicted them via deep learning based solely on their sequences. This research completes a key link in the protein network, benefits the understanding of protein functions, and helps in pathogenesis studies of diseases and associated drug development.
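For readers unfamiliar with the reported metric, the Matthews correlation coefficient (MCC) is computed from the four cells of a binary confusion matrix. The sketch below uses the standard formula; the confusion counts are hypothetical and are not the study's results, and returning 0.0 for an empty row or column is a common convention assumed here.

```python
from math import sqrt

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator

# Hypothetical confusion counts for an interaction predictor.
print(matthews_corrcoef(tp=50, tn=50, fp=0, fn=0))    # 1.0, perfect prediction
print(matthews_corrcoef(tp=25, tn=25, fp=25, fn=25))  # 0.0, no better than chance
print(round(matthews_corrcoef(tp=40, tn=45, fp=5, fn=10), 3))
```

MCC ranges from -1 to 1 and uses all four cells of the matrix, so it stays informative when interacting and non-interacting pairs are imbalanced.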
Subjects
Computational Biology, Membrane Proteins, Membrane Proteins/metabolism, Membrane Proteins/genetics, Computational Biology/methods, Deep Learning, Humans, Protein Interaction Maps
ABSTRACT
BACKGROUND: Dendrobium huoshanense, a traditional medicinal and food plant, has a rich history of use. Recently, its genome was decoded, offering valuable insights into gene function. However, there is no comprehensive gene functional analysis platform for D. huoshanense. RESULTS: To address this, we created a platform for gene function analysis and comparison in D. huoshanense (DhuFAP). Using 69 RNA-seq samples, we constructed a gene co-expression network and annotated D. huoshanense genes by aligning sequences with public protein databases. Our platform contains tools such as Blast, gene set enrichment analysis, heatmap analysis, sequence extraction, and JBrowse. Analysis revealed co-expression of transcription factors (C2H2, GRAS, NAC) with genes encoding key enzymes in alkaloid biosynthesis. We also showcased the reliability and applicability of our platform using chalcone synthases (CHS). CONCLUSION: DhuFAP (www.gzybioinformatics.cn/DhuFAP) and its suite of tools represent an accessible resource for researchers, enabling the exploration of functional information pertaining to D. huoshanense genes. This platform is expected to facilitate biological discoveries in this domain.
Subjects
Dendrobium, Dendrobium/genetics, Dendrobium/metabolism, Reproducibility of Results
ABSTRACT
BACKGROUND: Mosaic loss of chromosome Y (LOY) in leukocytes is the most prevalent somatic aneuploidy in aging humans. Men with LOY have increased risks of all-cause mortality and the major causes of death, including many forms of cancer. It has been suggested that the association between LOY and disease risk depends on what type of leukocyte is affected by Y loss, with prostate cancer patients showing higher levels of LOY in CD4+ T lymphocytes. In previous studies, however, Y loss has been observed at relatively low levels in this cell type. This motivated us to investigate whether specific subsets of CD4+ T lymphocytes are particularly affected by LOY. Publicly available, T lymphocyte-enriched, single-cell RNA sequencing datasets from patients with liver, lung or colorectal cancer were used to study how LOY affects different subtypes of T lymphocytes. To validate the observations from the public data, we also generated a single-cell RNA sequencing dataset comprising 23 PBMC samples and 32 CD4+ T lymphocyte-enriched samples. RESULTS: Regulatory T cells had significantly more LOY than any other studied T lymphocyte subtype. Furthermore, LOY in regulatory T cells increased the ratio of regulatory T cells compared with other T lymphocyte subtypes, indicating an effect of Y loss on lymphocyte differentiation. This was supported by developmental trajectory analysis of CD4+ T lymphocytes culminating in the regulatory T cell cluster most heavily affected by LOY. Finally, we identified dysregulation of 465 genes in regulatory T cells with Y loss, many involved in the immunosuppressive functions and development of regulatory T cells. CONCLUSIONS: Here, we show that regulatory T cells are particularly affected by Y loss, resulting in an increased fraction of regulatory T cells and dysregulated immune functions. Considering that regulatory T cells play a critical role in the process of immunosuppression, this enrichment for regulatory T cells with LOY might contribute to the increased risk for cancer observed among men with Y loss in leukocytes.
Subjects
Human Y Chromosomes, Neoplasms, Humans, Male, Human Y Chromosomes/genetics, Regulatory T-Lymphocytes, Mononuclear Leukocytes, Mosaicism
ABSTRACT
Although sows do not directly enter the market, they play an important role in piglet breeding on farms. They consume large amounts of feed, resulting in a significant environmental burden. Pig farms can increase their income and reduce environmental pollution by increasing the litter size (LS) of swine. PCR-RFLP/SSCP and GWAS are common methods to evaluate single-nucleotide polymorphisms (SNPs) in candidate genes. We conducted a systematic meta-analysis of the effect of SNPs on pig LS. We collected and analysed data published over the past 30 years using traditional and network meta-analyses. Trial sequential analysis (TSA) was used to analyse population data. Gene set enrichment analysis and protein-protein interaction network analysis were used to analyse the GWAS dataset. The results showed that the candidate genes were positively correlated with LS, and defects in PCR-RFLP/SSCP affected the reliability of candidate gene results. However, the genotypes with high and low LSs did not have a significant advantage. Current breeding and management practices for sows should consider increasing the LS while reducing lactation length and minimizing the sows' non-pregnancy period as much as possible.
ABSTRACT
Advancements in sequencing technologies have facilitated the generation of omics-level information for various diseases in humans. High-throughput technologies have become a powerful tool for differential expression studies and transcriptional network analysis. An understanding of complex transcriptional networks in human diseases requires integration of datasets representing different RNA species, including microRNA (miRNA) and messenger RNA (mRNA). This review emphasises the conceptual explanation of a generalized workflow and methodologies for studying miRNA-mediated responses in human diseases using different in silico analyses. Although there have been many prior explorations of miRNA-mediated responses in human diseases, the advantages and limitations of the methods, and ways of overcoming those limitations through different statistical techniques, have not yet been discussed. This review focuses on miRNAs as important gene regulators in human diseases, methodologies for miRNA-target gene prediction, and data-driven methods for enrichment and network analysis of miRnome-targetome interactions. Additionally, it proposes an integrative workflow to analyse structural components of networks obtained from high-throughput data. This review explains how to apply the existing methods to analyse miRNA-mediated responses in human diseases. It addresses the unique characteristics of different analyses, their limitations, and the statistical solutions that influence the choice of methods through a workflow. Moreover, it provides an overview of promising common integrative approaches to comprehend miRNA-mediated gene regulatory events in biological processes in humans. The proposed methodologies and workflow should help in the analysis of multi-source data to identify molecular signatures of various human diseases.
Subjects
Computational Biology, Computer Simulation, Gene Expression Regulation, Gene Regulatory Networks, MicroRNAs, Humans, MicroRNAs/genetics, Computational Biology/methods, Messenger RNA/genetics, Messenger RNA/metabolism
ABSTRACT
With global warming, high temperature (HT) has become one of the most common abiotic stresses resulting in significant crop yield losses, especially for jujube (Ziziphus jujuba Mill.), an important temperate economic crop cultivated worldwide. This study aims to explore the coping mechanism of jujube under HT stress at the transcriptional and post-transcriptional levels, including identifying differentially expressed miRNAs and mRNAs as well as elucidating the critical pathways involved. High-throughput sequencing analyses of miRNA and mRNA were performed on jujube leaves collected from the "Fucuimi" (heat-tolerant) and "Junzao" (heat-sensitive) cultivars subjected to HT stress (42 °C) for 0, 1, 3, 5, and 7 days. In total, 45 known miRNAs, 482 novel miRNAs, and 13,884 differentially expressed mRNAs (DEMs) were identified. Among them, integrated analysis of miRNA target gene prediction and mRNA-seq yielded 1306 differentially expressed miRNA-mRNA (DEMI-DEM) pairs, including 484, 769, and 865 DEMI-DEM pairs discovered in the "Fucuimi", "Junzao" and between-genotype comparison groups, respectively. Furthermore, functional enrichment analysis of 1306 DEMs revealed that plant-pathogen interaction, starch and sucrose metabolism, spliceosome, and plant hormone signal transduction were crucial pathways in the response of jujube leaves to HT stress. The constructed miRNA-mRNA network, composed of 20 DEMIs and 33 DEMs, displayed significantly different expression between these two genotypes. This study provides further evidence for the regulatory role of miRNAs in the response to HT stress in plants and will provide a theoretical foundation for the innovation and cultivation of heat-tolerant varieties.