1.
PLoS Comput Biol ; 20(5): e1012024, 2024 May.
Article En | MEDLINE | ID: mdl-38717988

The activation levels of biologically significant gene sets are emerging tumor molecular markers and play an irreplaceable role in the tumor research field; however, web-based tools for prognostic analyses using them as tumor molecular markers remain scarce. We developed a web-based tool, PESSA, for survival analysis using gene set activation levels. All data analyses were implemented in R. Activation levels of the Molecular Signatures Database (MSigDB) gene sets were assessed using the single-sample gene set enrichment analysis (ssGSEA) method based on data from the Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA), the European Genome-phenome Archive (EGA) and supplementary tables of articles. PESSA was used to perform median and optimal cut-off dichotomous grouping of ssGSEA scores for each dataset, relying on the survival and survminer packages for survival analysis and visualization. PESSA is an open-access web tool for visualizing the results of tumor prognostic analyses using gene set activation levels. A total of 238 datasets from the GEO, TCGA, EGA, and supplementary tables of articles, covering 51 cancer types and 13 survival outcome types, and 13,434 tumor-related gene sets obtained from MSigDB were used for pre-grouping. Users can obtain the results, including Kaplan-Meier analyses based on the median and optimal cut-off values with accompanying visualization plots and Cox regression analyses of dichotomous and continuous variables, by selecting the gene set markers of interest. PESSA (https://smuonco.shinyapps.io/PESSA/ or http://robinl-lab.com/PESSA) is a large-scale web-based tumor survival analysis tool covering a large amount of data that creatively uses predefined gene set activation levels as molecular markers of tumors.
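The median-based grouping and Kaplan-Meier estimate described in this abstract can be illustrated with a minimal sketch. It is not PESSA's implementation (the abstract states that PESSA uses the R packages survival and survminer); the function names and toy data below are hypothetical, and only the standard product-limit formula is assumed.

```python
from statistics import median


def median_split(scores):
    """Label each sample 'high' if its score is above the median, else 'low'."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time per sample
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Both deaths and censored samples leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical toy data: ssGSEA-like scores, follow-up in months, event flags.
scores = [0.10, 0.40, 0.35, 0.80, 0.65, 0.20]
times = [5, 12, 8, 30, 24, 6]
events = [1, 1, 0, 0, 1, 1]

groups = median_split(scores)
curves = {}
for label in ("low", "high"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curves[label] = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
    print(label, curves[label])
```

An "optimal cut-off" grouping would replace the median with a threshold chosen to maximize a test statistic between the two groups; that step is not shown here.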


Biomarkers, Tumor , Computational Biology , Databases, Genetic , Internet , Neoplasms , Software , Humans , Neoplasms/genetics , Neoplasms/mortality , Survival Analysis , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Computational Biology/methods , Prognosis , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic/genetics
2.
Sci Adv ; 10(19): eadj1424, 2024 May 10.
Article En | MEDLINE | ID: mdl-38718126

The ongoing expansion of human genomic datasets propels therapeutic target identification; however, extracting gene-disease associations from gene annotations remains challenging. Here, we introduce Mantis-ML 2.0, a framework integrating AstraZeneca's Biological Insights Knowledge Graph and numerous tabular datasets, to assess gene-disease probabilities throughout the phenome. We use graph neural networks, capturing the graph's holistic structure, and train them on hundreds of balanced datasets via a robust semi-supervised learning framework to provide gene-disease probabilities across the human exome. Mantis-ML 2.0 incorporates natural language processing to automate disease-relevant feature selection for thousands of diseases. The enhanced models demonstrate a 6.9% average classification power boost, achieving a median receiver operating characteristic (ROC) area under curve (AUC) score of 0.90 across 5220 diseases from Human Phenotype Ontology, OpenTargets, and Genomics England. Notably, Mantis-ML 2.0 prioritizes associations from an independent UK Biobank phenome-wide association study (PheWAS), providing a stronger form of triaging and mitigating against underpowered PheWAS associations. Results are exposed through an interactive web resource.
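For readers unfamiliar with the ROC AUC metric reported in this abstract, the following is a minimal sketch of how it can be computed from labels and predicted scores. It uses the pairwise (Mann-Whitney) formulation and hypothetical toy values; it is not taken from Mantis-ML 2.0.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win
    (Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = known disease gene, 0 = other gene.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of samples; for large gene lists a rank-based computation gives the same value more efficiently.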


Biological Specimen Banks , Neural Networks, Computer , Humans , Genome-Wide Association Study/methods , Phenotype , United Kingdom , Phenomics/methods , Genetic Predisposition to Disease , Genomics/methods , Databases, Genetic , Algorithms , Computational Biology/methods , UK Biobank
3.
Sci Data ; 11(1): 488, 2024 May 11.
Article En | MEDLINE | ID: mdl-38734729

Domesticated herbivores are an important agricultural resource that play a critical role in global food security, particularly as they can adapt to varied environments, including marginal lands. An understanding of the molecular basis of their biology would contribute to better management and sustainable production. Thus, we conducted transcriptome sequencing of 100 to 105 tissues from two females of each of seven species of herbivore (cattle, sheep, goats, sika deer, horses, donkeys, and rabbits) including two breeds of sheep. The quality of raw and trimmed reads was assessed in terms of base quality, GC content, duplication sequence rate, overrepresented k-mers, and quality score distribution with FastQC. The high-quality filtered RNA-seq raw reads were deposited in a public database which provides approximately 54 billion high-quality paired-end sequencing reads in total, with an average mapping rate of ~93.92%. Transcriptome databases represent valuable resources that can be used to study patterns of gene expression, and pathways that are related to key biological processes, including important economic traits in herbivores.
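Two of the read-level quality metrics named in this abstract, GC content and duplication rate, can be sketched in a few lines. This is a simplified illustration with hypothetical reads; FastQC computes these metrics with its own sampling and binning rules, which are not reproduced here.

```python
def gc_content(read):
    """Fraction of G/C among unambiguous bases (A, C, G, T); N is ignored."""
    read = read.upper()
    gc = read.count("G") + read.count("C")
    total = gc + read.count("A") + read.count("T")
    return gc / total if total else 0.0


def duplication_rate(reads):
    """Fraction of reads that are exact copies of another read in the list."""
    if not reads:
        return 0.0
    return 1 - len(set(reads)) / len(reads)


# Hypothetical reads for illustration only.
reads = ["ACGT", "GGCC", "ACGT", "ATNN"]
gc_values = [gc_content(r) for r in reads]
dup = duplication_rate(reads)
print(gc_values, dup)
```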


Herbivory , Transcriptome , Animals , Cattle/genetics , Female , Rabbits/genetics , Databases, Genetic , Deer/genetics , Equidae/genetics , Goats/genetics , Horses/genetics , Sheep/genetics
4.
Front Immunol ; 15: 1347415, 2024.
Article En | MEDLINE | ID: mdl-38736878

Objective: Emerging evidence has shown that gut diseases can regulate the development and function of the immune, metabolic, and nervous systems through dynamic bidirectional communication on the brain-gut axis. However, the specific mechanism linking intestinal diseases and vascular dementia (VD) remains unclear. We designed this study to further clarify the connection between VD and inflammatory bowel disease (IBD) through bioinformatics analyses. Methods: We downloaded gene expression profiles for VD (GSE122063) and IBD (GSE47908, GSE179285) from the Gene Expression Omnibus (GEO) database. Then, Gene Set Enrichment Analysis (GSEA) was performed for each disease individually to confirm the connection between the two diseases. The common differentially expressed genes (coDEGs) were identified, and the STRING database together with Cytoscape software was used to construct a protein-protein interaction (PPI) network and core functional modules. We identified the hub genes by using the Cytohubba plugin. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were applied to identify pathways of coDEGs and hub genes. Subsequently, receiver operating characteristic (ROC) analysis was used to assess the diagnostic ability of these hub genes, and a training dataset was used to verify the expression levels of the hub genes. An alternative single-sample gene set enrichment (ssGSEA) algorithm was used to analyze immune cell infiltration between coDEGs and immune cells. Finally, the correlation between hub genes and immune cells was analyzed. Results: We screened 167 coDEGs. The main terms in the coDEG enrichment analysis focused on immune function. Eight shared hub genes were identified, including PTPRC, ITGB2, CYBB, IL1B, TLR2, CASP1, IL10RA, and BTK. The functional categories in the hub gene enrichment analysis were mainly involved in the regulation of immune function and neuroinflammatory response. Compared to the healthy controls, abnormal infiltration of immune cells was found in VD and IBD. We also found correlations between the 8 shared hub genes and immune cells. Conclusions: This study suggests that IBD may be a new risk factor for VD. The 8 hub genes may predict IBD complicated by VD. Immune-related coDEGs may be related to their association, which requires further research to prove.


Computational Biology , Dementia, Vascular , Gene Expression Profiling , Gene Regulatory Networks , Inflammatory Bowel Diseases , Protein Interaction Maps , Humans , Inflammatory Bowel Diseases/genetics , Inflammatory Bowel Diseases/immunology , Computational Biology/methods , Dementia, Vascular/genetics , Dementia, Vascular/immunology , Databases, Genetic , Transcriptome , Gene Ontology
5.
Front Immunol ; 15: 1347139, 2024.
Article En | MEDLINE | ID: mdl-38726016

Background: Autism spectrum disorder (ASD) is a disease characterized by social disorder. Recently, the population affected by ASD has gradually increased around the world. There are great difficulties in diagnosis and treatment at present. Methods: The ASD datasets were obtained from the Gene Expression Omnibus database and the immune-relevant genes were downloaded from a previously published compilation. Subsequently, we used WGCNA to screen the modules related to ASD and immunity. We also chose the best combination and screened out the core genes using Consensus Machine Learning Driven Signatures (CMLS). Subsequently, we evaluated the genetic correlation between immune cells and ASD using GNOVA. Pleiotropic regions between ASD and immune cells were identified by PLACO and CPASSOC. FUMA was used to identify pleiotropic regions, and expression quantitative trait loci (eQTL) analysis was used to determine their expression in different tissues and cells. Finally, we used qPCR to detect the gene expression level of the core gene. Results: We found a close relationship between neutrophils and ASD, and subsequently, CMLS identified a total of 47 potential candidate genes. Secondly, GNOVA showed a significant genetic correlation between neutrophils and ASD, and PLACO and CPASSOC identified a total of 14 pleiotropic regions. We annotated the 14 regions mentioned above and identified a total of 6 potential candidate genes. Through eQTL analysis, we found that the CFLAR gene has a specific expression pattern in neutrophils, suggesting that it may serve as a potential biomarker for ASD and is closely related to its pathogenesis. Conclusions: In conclusion, our study yields unprecedented insights into the molecular and genetic heterogeneity of ASD through a comprehensive bioinformatics analysis. These valuable findings hold significant implications for tailoring personalized ASD therapies.


Autism Spectrum Disorder , Computational Biology , Genetic Predisposition to Disease , Quantitative Trait Loci , Humans , Autism Spectrum Disorder/genetics , Autism Spectrum Disorder/immunology , Computational Biology/methods , Gene Expression Profiling , Gene Regulatory Networks , Machine Learning , Databases, Genetic , Immunogenetics , Neutrophils/immunology , Neutrophils/metabolism , Transcriptome
6.
PLoS One ; 19(5): e0302753, 2024.
Article En | MEDLINE | ID: mdl-38739634

Leprosy causes a high rate of disability and lacks effective early diagnostic methods for prevention and treatment; thus, novel effective molecular markers are urgently required. In this study, we conducted bioinformatics analysis with leprosy and normal samples acquired from the GEO database (GSE84893, GSE74481, GSE17763, GSE16844 and GSE443). Through WGCNA analysis, 85 hub genes were screened (GS > 0.7 and MM > 0.8). Through DEG analysis, 82 up-regulated and 3 down-regulated genes were screened (|Log2FC| > 3 and FDR < 0.05). Then 49 intersection genes were considered crucial and subjected to GO annotation, KEGG pathway and PPI analysis to determine their biological significance in the pathogenesis of leprosy. Finally, we identified a gene-pathway network, suggesting that ITK, CD48, IL2RG, CCR5, FGR, JAK3, STAT1, LCK, PTPRC and CXCR4 can be used as biomarkers and that these genes are active in 6 immune system pathways, including Chemokine signaling pathway, Th1 and Th2 cell differentiation, Th17 cell differentiation, T cell receptor signaling pathway, Natural killer cell mediated cytotoxicity and Leukocyte transendothelial migration. We identified 10 crucial gene markers and related important pathways that acted as essential components in the etiology of leprosy. Our study provides potential targets for diagnostic biomarkers and therapy of leprosy.
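The screening logic in this abstract (threshold filters on WGCNA and DEG statistics, followed by an intersection) can be sketched as below. The thresholds are the ones quoted in the abstract; the gene names, values, and function names are hypothetical placeholders, not data from the study.

```python
# Hypothetical per-gene statistics; values are illustrative, not from the study.
deg_stats = {
    "GENE_A": {"log2fc": 3.4, "fdr": 0.001},
    "GENE_B": {"log2fc": -3.2, "fdr": 0.020},
    "GENE_C": {"log2fc": 1.1, "fdr": 0.001},
    "GENE_D": {"log2fc": 4.0, "fdr": 0.200},
}
wgcna_stats = {
    "GENE_A": {"gs": 0.82, "mm": 0.91},  # gs: gene significance, mm: module membership
    "GENE_B": {"gs": 0.65, "mm": 0.88},
    "GENE_C": {"gs": 0.75, "mm": 0.85},
    "GENE_D": {"gs": 0.90, "mm": 0.95},
}


def select_degs(stats, min_abs_log2fc=3.0, max_fdr=0.05):
    """Keep genes with |log2FC| above the cut-off and FDR below the cut-off."""
    return {
        gene
        for gene, s in stats.items()
        if abs(s["log2fc"]) > min_abs_log2fc and s["fdr"] < max_fdr
    }


def select_hubs(stats, min_gs=0.7, min_mm=0.8):
    """Keep genes whose GS and MM both exceed their cut-offs."""
    return {
        gene
        for gene, s in stats.items()
        if s["gs"] > min_gs and s["mm"] > min_mm
    }


degs = select_degs(deg_stats)
hubs = select_hubs(wgcna_stats)
crucial = sorted(degs & hubs)  # genes passing both screens
print(crucial)
```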


Biomarkers , Gene Regulatory Networks , Leprosy , Leprosy/genetics , Leprosy/microbiology , Humans , Biomarkers/metabolism , Computational Biology/methods , Databases, Genetic , Gene Expression Profiling , Protein Interaction Maps/genetics , Signal Transduction
7.
Comput Biol Med ; 175: 108495, 2024 Jun.
Article En | MEDLINE | ID: mdl-38697003

Allergic rhinitis is a common allergic disease with a complex pathogenesis and many unresolved issues. Studies have shown that the incidence of allergic rhinitis is closely related to genetic factors, and research on the related genes could help further understand its pathogenesis and develop new treatment methods. In this study, 446 allergic rhinitis-related genes were obtained from the DisGeNET database. The protein-protein interaction network was searched using the random-walk-with-restart algorithm with these 446 genes as seed nodes to assess the linkages between other genes and allergic rhinitis. Then, this result was further examined by three screening tests (permutation, interaction, and enrichment tests), which aimed to select genes that have strong and specific associations with allergic rhinitis. Finally, 52 novel genes were obtained. The functional enrichment test confirmed their relationships to the biological processes and pathways related to allergic rhinitis. Furthermore, some genes were extensively analyzed to uncover their special or latent associations with allergic rhinitis, including IRAK2 and MAPK, which are involved in the pathogenesis of allergic rhinitis and the inhibition of allergic inflammation via the p38-MAPK pathway, respectively. The newly found genes may help subsequent investigations into the underlying molecular mechanisms of allergic rhinitis and the development of effective treatments.
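The random-walk-with-restart step named in this abstract can be illustrated on a toy graph. This is a minimal sketch under stated assumptions: the restart probability (0.8), the four-node network, and all names are hypothetical, since the abstract does not give the parameters or the network used.

```python
def random_walk_with_restart(adj, seeds, restart=0.8, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adj   : dict mapping each node to the set of its neighbours (undirected)
    seeds : nodes that share the restart probability equally
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            walk_mass = (1 - restart) * p[u]
            if not adj[u]:
                nxt[u] += walk_mass  # isolated node keeps its own mass
                continue
            share = walk_mass / len(adj[u])
            for v in adj[u]:
                nxt[v] += share  # spread mass evenly over neighbours
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical path-shaped network A - B - C - D with A as the only seed.
adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
rwr_scores = random_walk_with_restart(adj, seeds={"A"})
ranked = sorted((n for n in adj if n != "A"), key=rwr_scores.get, reverse=True)
print(ranked)
```

Non-seed nodes with high steady-state probability are the candidates that would then go on to further screening, such as the permutation, interaction, and enrichment tests mentioned in the abstract.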


Protein Interaction Maps , Rhinitis, Allergic , Humans , Rhinitis, Allergic/genetics , Protein Interaction Maps/genetics , Databases, Genetic , Algorithms , Computational Biology/methods , Gene Regulatory Networks
8.
BMC Plant Biol ; 24(1): 410, 2024 May 17.
Article En | MEDLINE | ID: mdl-38760710

Rosa roxburghii Tratt, a valuable plant with a long history in China, is famous for its fruit. It possesses various secondary metabolites, such as L-ascorbic acid (vitamin C), alkaloids and polysaccharides, which give it high nutritional and medicinal value. Here we characterized the chromosome-level genome sequence of R. roxburghii, comprising seven pseudo-chromosomes with a total size of 531 Mb and a heterozygosity of 0.25%. We also annotated 45,226 coding gene loci after masking repeat elements. Orthologs for 90.1% of the Complete Single-Copy BUSCOs were found in the R. roxburghii annotation. By aligning with protein sequences from public platforms, we annotated 85.89% of the genes from R. roxburghii. Comparative genomic analysis revealed that R. roxburghii diverged from Rosa chinensis approximately 5.58 to 13.17 million years ago, and no whole-genome duplication event occurred after the divergence from eudicots. To fully utilize this genomic resource, we constructed a genomic database, RroFGD, with various analysis tools. In addition, 69 enzyme genes involved in L-ascorbate biosynthesis were identified, and a key enzyme in the biosynthesis of vitamin C, GDH (L-Gal-1-dehydrogenase), is used as an example to introduce the functions of the database. This genome and database will facilitate future investigations into gene function and molecular breeding in R. roxburghii.


Chromosomes, Plant , Genome, Plant , Rosa , Rosa/genetics , Rosa/metabolism , Chromosomes, Plant/genetics , Databases, Genetic , Secondary Metabolism/genetics , Ascorbic Acid/metabolism , Ascorbic Acid/biosynthesis
9.
BMC Med Genomics ; 17(1): 134, 2024 May 20.
Article En | MEDLINE | ID: mdl-38764052

BACKGROUND: Acute myocardial infarction (AMI) and diabetic nephropathy (DN) are common clinical co-morbidities, but they are challenging to manage and have poor prognoses. There is no research on the bioinformatics mechanisms of comorbidity, and this study aims to investigate such mechanisms. METHODS: We downloaded the AMI data (GSE66360) and DN datasets (GSE30528 and GSE30529) from the Gene Expression Omnibus (GEO) platform. The GSE66360 dataset was divided into two parts: the training set and the validation set, and GSE30529 was used as the training set and GSE30528 as the validation set. After identifying the common differentially expressed genes (DEGs) in AMI and DN in the training set, gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses and protein-protein interaction (PPI) network construction were performed. A sub-network graph was constructed by MCODE, and 15 hub genes were screened by the Cytohubba plugin. The screened hub genes were validated, and the 15 screened hub genes were subjected to GO, KEGG, Gene MANIA analysis, and transcription factor (TF) prediction. Finally, we performed TF differential analysis, enrichment analysis, and TF and gene regulatory network construction. RESULTS: A total of 46 genes (43 up-regulated and 3 down-regulated) were identified for subsequent analysis. GO functional analysis emphasized the presence of genes mainly in the vesicle membrane and secretory granule membrane involved in antigen processing and presentation, lipopeptide binding, NAD+ nucleosidase activity, and Toll-like receptor binding. The KEGG pathways analyzed were mainly in the phagosome, neutrophil extracellular trap formation, natural killer cell-mediated cytotoxicity, apoptosis, Fc gamma R-mediated phagocytosis, and Toll-like receptor signaling pathways. Eight co-expressed hub genes were identified and validated, namely TLR2, FCER1G, CD163, CTSS, CLEC4A, IGSF6, NCF2, and MS4A6A. Three transcription factors were identified and validated in AMI, namely NFKB1, HIF1A, and SPI1. CONCLUSIONS: Our study reveals the common pathogenesis of AMI and DN. These common pathways and hub genes may provide new ideas for further mechanistic studies.


Diabetic Nephropathies , Myocardial Infarction , Transcription Factors , Myocardial Infarction/genetics , Humans , Diabetic Nephropathies/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Protein Interaction Maps , Computational Biology/methods , Gene Expression Profiling , Gene Regulatory Networks , Gene Ontology , Gene Expression Regulation , Databases, Genetic
10.
Brief Bioinform ; 25(3)2024 Mar 27.
Article En | MEDLINE | ID: mdl-38770717

Drug therapy is vital in cancer treatment. Accurate analysis of drug sensitivity for specific cancers can guide healthcare professionals in prescribing drugs, leading to improved patient survival and quality of life. However, there is a lack of web-based tools that offer comprehensive visualization and analysis of pancancer drug sensitivity. We gathered cancer drug sensitivity data from publicly available databases (GEO, TCGA and GDSC) and developed a web tool called Comprehensive Pancancer Analysis of Drug Sensitivity (CPADS) using Shiny. CPADS currently includes transcriptomic data from over 29 000 samples, encompassing 44 types of cancer, 288 drugs and more than 9000 gene perturbations. It allows easy execution of various analyses related to cancer drug sensitivity. With its large sample size and diverse drug range, CPADS offers a range of analysis methods, such as differential gene expression, gene correlation, pathway analysis, drug analysis and gene perturbation analysis. Additionally, it provides several visualization approaches. CPADS significantly aids physicians and researchers in exploring primary and secondary drug resistance at both gene and pathway levels. The integration of drug resistance and gene perturbation data also presents novel perspectives for identifying pivotal genes influencing drug resistance. Access CPADS at https://smuonco.shinyapps.io/CPADS/ or https://robinl-lab.com/CPADS.


Drug Resistance, Neoplasm , Internet , Neoplasms , Software , Humans , Neoplasms/drug therapy , Neoplasms/genetics , Drug Resistance, Neoplasm/genetics , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Computational Biology/methods , Databases, Genetic , Transcriptome , Gene Expression Profiling/methods
11.
PLoS One ; 19(5): e0303506, 2024.
Article En | MEDLINE | ID: mdl-38771826

OBJECTIVE: To elucidate potential molecular mechanisms differentiating osteoarthritis (OA) and rheumatoid arthritis (RA) through a bioinformatics analysis of differentially expressed genes (DEGs) in patient synovial cells, aiming to provide new insights for clinical treatment strategies. MATERIALS AND METHODS: Gene expression datasets GSE1919, GSE82107, and GSE77298 were downloaded from the Gene Expression Omnibus (GEO) database to serve as the training groups, with GSE55235 being used as the validation dataset. The OA and RA data from the GSE1919 dataset were merged with the standardized data from GSE82107 and GSE77298, followed by batch effect removal to obtain the merged datasets of DEGs for OA and RA. Intersection analysis was conducted on the DEGs between the two conditions to identify commonly upregulated and downregulated DEGs. Enrichment analysis was then performed on these common co-expressed DEGs, and a protein-protein interaction (PPI) network was constructed to identify hub genes. These hub genes were further analyzed using the GENEMANIA online platform and subjected to enrichment analysis. Subsequent validation analysis was conducted using the GSE55235 dataset. RESULTS: The analysis of differentially expressed genes in the synovial cells from patients with OA and RA, compared to a control group (individuals without OA or RA), revealed significant changes in gene expression patterns. Specifically, the genes APOD, FASN, and SCD were observed to have lower expression levels in the synovial cells of both OA and RA patients, indicating downregulation within the pathological context of these diseases. In contrast, the SDC1 gene was found to be upregulated, displaying higher expression levels in the synovial cells of OA and RA patients compared to normal controls. Additionally, a noteworthy observation was the downregulation of the transcription factor PPARG in the synovial cells of patients with OA and RA. The decrease in expression levels of PPARG further validates the alteration in lipid metabolism and inflammatory processes associated with the pathogenesis of OA and RA. These findings underscore the significance of these genes and the transcription factor not only as biomarkers for differential diagnosis between OA and RA but also as potential targets for therapeutic interventions aimed at modulating their expression to counteract disease progression. CONCLUSION: The outcomes of this investigation reveal the existence of potentially shared molecular mechanisms within OA and RA. The identification of APOD, FASN, SDC1, TNFSF11 as key target genes, along with their downstream transcription factor PPARG, highlights common potential factors implicated in both diseases. A deeper examination and exploration of these findings could pave the way for new candidate targets and directions in therapeutic research aimed at treating both OA and RA. This study underscores the significance of leveraging bioinformatics approaches to unravel complex disease mechanisms, offering a promising avenue for the development of more effective and targeted treatments.


Arthritis, Rheumatoid , Gene Expression Profiling , Osteoarthritis , Protein Interaction Maps , Synovial Membrane , Arthritis, Rheumatoid/genetics , Arthritis, Rheumatoid/metabolism , Arthritis, Rheumatoid/pathology , Humans , Osteoarthritis/genetics , Osteoarthritis/metabolism , Osteoarthritis/pathology , Protein Interaction Maps/genetics , Synovial Membrane/metabolism , Synovial Membrane/pathology , Computational Biology/methods , Gene Regulatory Networks , Gene Expression Regulation , Databases, Genetic
12.
Sci Rep ; 14(1): 10728, 2024 May 10.
Article En | MEDLINE | ID: mdl-38730027

The purpose of this study was to explore the diagnostic implications of ubiquitination-related gene signatures in Alzheimer's disease. In this study, we first collected 161 samples from the GEO database (including 87 in the AD group and 74 in the normal group). Subsequently, through differential expression analysis and the iUUCD 2.0 database, we obtained 3450 Differentially Expressed Genes (DEGs) and 806 Ubiquitin-related genes (UbRGs). After taking the intersection, we obtained 128 UbR-DEGs. Secondly, by conducting GO and KEGG enrichment analysis on these 128 UbR-DEGs, we identified the main molecular functions and biological pathways related to AD. Furthermore, through the utilization of GSEA analysis, we have gained insight into the enrichment of functions and pathways within both the AD and normal groups. Further, using lasso regression analysis and cross-validation techniques, we identified 22 characteristic genes associated with AD. Subsequently, we constructed a logistic regression model and optimized it, resulting in the identification of 6 RUbR-DEGs: KLHL21, WDR82, DTX3L, UBTD2, CISH, and ATXN3L. In addition, the ROC result showed that the diagnostic model we built has excellent accuracy and reliability in identifying AD patients. Finally, we constructed a lncRNA-miRNA-mRNA (competing endogenous RNA, ceRNA) regulatory network for AD based on six RUbR-DEGs, further elucidating the interaction between UbRGs and lncRNA, miRNA. In conclusion, our findings will contribute to further understanding of the molecular pathogenesis of AD and provide a new perspective for AD risk prediction, early diagnosis and targeted therapy in the population.


Alzheimer Disease , Ubiquitination , Alzheimer Disease/genetics , Alzheimer Disease/diagnosis , Alzheimer Disease/metabolism , Humans , Gene Expression Profiling , Transcriptome , Gene Regulatory Networks , Databases, Genetic
13.
BMC Bioinformatics ; 25(1): 184, 2024 May 09.
Article En | MEDLINE | ID: mdl-38724907

BACKGROUND: Major advances in sequencing technologies and the sharing of data and metadata in science have resulted in a wealth of publicly available datasets. However, working with and especially curating public omics datasets remains challenging despite these efforts. While a growing number of initiatives aim to re-use previous results, these present limitations that often lead to the need for further in-house curation and processing. RESULTS: Here, we present the Omics Dataset Curation Toolkit (OMD Curation Toolkit), a python3 package designed to accompany and guide the researcher during the curation process of metadata and fastq files of public omics datasets. This workflow provides a standardized framework with multiple capabilities (collection, control check, treatment and integration) to facilitate the arduous task of curating public sequencing data projects. While centered on the European Nucleotide Archive (ENA), the majority of the provided tools are generic and can be used to curate datasets from different sources. CONCLUSIONS: Thus, it offers valuable tools for the in-house curation previously needed to re-use public omics data. Due to its workflow structure and capabilities, it can be easily used and benefit investigators in developing novel omics meta-analyses based on sequencing data.


Data Curation , Software , Workflow , Data Curation/methods , Metadata , Databases, Genetic , Genomics/methods , Computational Biology/methods
14.
Technol Cancer Res Treat ; 23: 15330338241241484, 2024.
Article En | MEDLINE | ID: mdl-38725284

Introduction: Endoplasmic reticulum stress (ERS) is a response to the accumulation of unfolded proteins and plays a crucial role in the development of tumors, including processes such as tumor cell invasion, metastasis, and immune evasion. However, the specific regulatory mechanisms of ERS in breast cancer (BC) remain unclear. Methods: In this study, we analyzed RNA sequencing data from The Cancer Genome Atlas (TCGA) for breast cancer and identified 8 core genes associated with ERS: ELOVL2, IFNG, MAP2K6, MZB1, PCSK6, PCSK9, IGF2BP1, and POP1. We evaluated their individual expression, independent diagnostic, and prognostic values in breast cancer patients. A multifactorial Cox analysis established a risk prognostic model, validated with an external dataset. Additionally, we conducted a comprehensive assessment of immune infiltration and drug sensitivity for these genes. Results: The results indicate that these eight core genes play a crucial role in regulating the immune microenvironment of breast cancer (BRCA) patients. Meanwhile, an independent diagnostic model based on the expression of these eight genes shows limited independent diagnostic value, and its independent prognostic value is unsatisfactory, with the time ROC AUC values generally below 0.5. According to the results of logistic regression neural networks and risk prognosis models, when these eight genes interact synergistically, they can serve as excellent biomarkers for the diagnosis and prognosis of breast cancer patients. Furthermore, the research findings have been confirmed through qPCR experiments and validation. Conclusion: In conclusion, we explored the mechanisms of ERS in BRCA patients and identified 8 outstanding biomolecular diagnostic markers and prognostic indicators. The research results were double-validated using the GEO database and qPCR.


Biomarkers, Tumor , Breast Neoplasms , Endoplasmic Reticulum Stress , Gene Expression Regulation, Neoplastic , Tumor Microenvironment , Humans , Female , Tumor Microenvironment/immunology , Tumor Microenvironment/genetics , Breast Neoplasms/genetics , Breast Neoplasms/immunology , Breast Neoplasms/pathology , Prognosis , Endoplasmic Reticulum Stress/genetics , Biomarkers, Tumor/genetics , Gene Expression Profiling , Computational Biology/methods , Databases, Genetic , ROC Curve , Kaplan-Meier Estimate , Transcriptome
15.
J Diabetes Res ; 2024: 4815488, 2024.
Article En | MEDLINE | ID: mdl-38766319

Background: Tubulointerstitial injury plays a pivotal role in the progression of diabetic kidney disease (DKD), yet the link between neutrophil extracellular traps (NETs) and diabetic tubulointerstitial injury is still unclear. Methods: We analyzed microarray data (GSE30122) from the Gene Expression Omnibus (GEO) database to identify differentially expressed genes (DEGs) associated with DKD's tubulointerstitial injury. Functional and pathway enrichment analyses were conducted to elucidate the involved biological processes (BP) and pathways. Weighted gene coexpression network analysis (WGCNA) identified modules associated with DKD. LASSO regression and random forest selected NET-related characteristic genes (NRGs) related to DKD tubulointerstitial injury. Results: Eight hundred ninety-eight DEGs were identified from the GSE30122 dataset. A significant module associated with diabetic tubulointerstitial injury overlapped with 15 NRGs. The hub genes, CASP1 and LYZ, were identified as potential biomarkers. Functional enrichment linked these genes with immune cell trafficking, metabolic alterations, and inflammatory responses. NRGs negatively correlated with glomerular filtration rate (GFR) in the Neph v5 database. Immunohistochemistry (IHC) validated increased NRGs in DKD tubulointerstitial injury. Conclusion: Our findings suggest that the CASP1 and LYZ genes may serve as potential diagnostic biomarkers for diabetic tubulointerstitial injury. Furthermore, NRGs involved in diabetic tubulointerstitial injury could emerge as prospective targets for the diagnosis and treatment of DKD.


Biomarkers , Diabetic Nephropathies , Extracellular Traps , Gene Expression Profiling , Diabetic Nephropathies/genetics , Diabetic Nephropathies/diagnosis , Diabetic Nephropathies/metabolism , Humans , Biomarkers/metabolism , Extracellular Traps/metabolism , Gene Regulatory Networks , Databases, Genetic , Nephritis, Interstitial/genetics , Nephritis, Interstitial/diagnosis , Glomerular Filtration Rate
16.
J Diabetes Res ; 2024: 5550812, 2024.
Article En | MEDLINE | ID: mdl-38774257

Objective: This study aimed to investigate diagnostic biomarkers associated with lipotoxicity and the molecular mechanisms underlying diabetic nephropathy (DN). Methods: The GSE96804 dataset from the Gene Expression Omnibus (GEO) database was used to identify differentially expressed genes (DEGs) in DN patients. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were conducted using the DEGs. A protein-protein interaction (PPI) network was established to identify key genes linked to lipotoxicity in DN. Immune infiltration analysis was employed to identify immune cells with differential abundance in DN and to assess the correlation between these immune cells and lipotoxicity-related hub genes. The findings were validated using the external dataset GSE104954. ROC analysis was performed to assess the diagnostic performance of the hub genes. Gene set enrichment analysis (GSEA) was used to analyze the key lipotoxicity-associated genes identified above. Results: A total of 544 DEGs were identified. Among them, extracellular matrix (ECM), fatty acid metabolism, AGE-RAGE, and PI3K-Akt signaling pathways were significantly enriched. Combining the PPI network and lipotoxicity-related genes (LRGs), LUM and ALB were identified as lipotoxicity-related diagnostic biomarkers for DN. ROC analysis showed that the AUC values for LUM and ALB were 0.882 and 0.885, respectively. The AUC values for LUM and ALB in the external validation dataset were 0.98 and 0.82, respectively. Immune infiltration analysis revealed significant changes in various immune cells during disease progression. M2 macrophages, activated mast cells, and neutrophils were significantly associated with all lipotoxicity-related hub genes. These key genes were enriched in fatty acid metabolism and extracellular matrix-related pathways. Conclusion: The identified lipotoxicity-related hub genes provide a deeper understanding of the mechanisms of DN development, potentially offering new theoretical foundations for the development of diagnostic biomarkers and therapeutic targets related to lipotoxicity in DN.
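
The AUC values reported in the abstract above summarize how well a single gene's expression separates DN samples from controls. As a minimal sketch of that idea (not the authors' pipeline), the rank-based definition of AUC can be computed directly: it is the fraction of (case, control) pairs in which the case has the higher score, with ties counted as one half. The function name `roc_auc` and the toy expression values are illustrative assumptions, not data from the study.

```python
def roc_auc(scores, labels):
    """Rank-based AUC: share of (positive, negative) pairs where the
    positive sample scores higher; ties contribute 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression values for one candidate gene (made up for illustration).
expression = [2.1, 3.4, 3.6, 4.8, 0.9, 3.9]
is_case = [0, 1, 0, 1, 0, 1]  # 1 = disease sample, 0 = control

auc = roc_auc(expression, is_case)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for small cohorts; a sorted-rank formulation would be preferable for large ones.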


Biomarkers , Computational Biology , Diabetic Nephropathies , Gene Expression Profiling , Protein Interaction Maps , Humans , Diabetic Nephropathies/genetics , Diabetic Nephropathies/metabolism , Diabetic Nephropathies/diagnosis , Biomarkers/metabolism , Lumican/genetics , Lumican/metabolism , Gene Ontology , Gene Regulatory Networks , Databases, Genetic , Signal Transduction
17.
Front Cell Infect Microbiol ; 14: 1384809, 2024.
Article En | MEDLINE | ID: mdl-38774631

Introduction: Sharing microbiome data among researchers fosters innovation and reduces research costs. In practice, this means that the (meta)data must be standardized, transparent and readily available to researchers. The microbiome data and associated metadata must then be described with regard to composition and origin, in order to maximize the possibilities for application in various research contexts. Here, we propose a set of tools and protocols to develop a real-time FAIR (Findable, Accessible, Interoperable and Reusable) compliant database for the handling and storage of human microbiome and host-associated data. Methods: The conflicts arising from privacy laws with respect to metadata, possible human genome sequences in the metagenome shotgun data and FAIR implementations are discussed. Alternate pathways for achieving compliance in such conflicts are analyzed. Sample-traceable and sensitive microbiome data, such as DNA sequences or geolocalized metadata, are identified, and the role of the GDPR (General Data Protection Regulation) is considered. For the construction of the database, procedures have been implemented to make data FAIR compliant while preserving the privacy of the participants providing the data. Results and discussion: An open-source development platform, Supabase, was used to implement the microbiome database. Researchers can deploy this real-time database to access, upload, download and interact with human microbiome data in a FAIR-compliant manner. In addition, a large language model (LLM) powered by ChatGPT was developed and deployed to enable knowledge dissemination and non-expert usage of the database.


Microbiota , Humans , Microbiota/genetics , Databases, Factual , Metadata , Metagenome , Information Dissemination , Computational Biology/methods , Metagenomics/methods , Databases, Genetic
18.
Front Immunol ; 15: 1354348, 2024.
Article En | MEDLINE | ID: mdl-38774864

Background: Systemic lupus erythematosus (SLE) is a multi-organ chronic autoimmune disease. Inflammatory bowel disease (IBD) is a common chronic inflammatory disease of the gastrointestinal tract. Previous studies have shown that SLE and IBD share common pathogenic pathways and genetic susceptibility, but the specific pathogenic mechanisms remain unclear. Methods: The datasets of SLE and IBD were downloaded from the Gene Expression Omnibus (GEO). Differentially expressed genes (DEGs) were identified using the Limma package. Weighted gene coexpression network analysis (WGCNA) was used to determine co-expression modules related to SLE and IBD. Pathway enrichment was performed using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis for co-driver genes. Using Least Absolute Shrinkage and Selection Operator (LASSO) regression and Support Vector Machine-Recursive Feature Elimination (SVM-RFE), common diagnostic markers for both diseases were further evaluated. Then, we used the CIBERSORT method to assess the abundance of immune cell infiltration. Finally, we used single-cell analysis to locate the common diagnostic markers. Results: 71 common driver genes were identified in the SLE and IBD cohorts based on the DEGs and module genes. KEGG and GO enrichment results showed that these genes were closely associated with positive regulation of programmed cell death and inflammatory responses. By using LASSO regression and SVM, five hub genes (KLRF1, GZMK, KLRB1, CD40LG, and IL-7R) were ultimately determined as common diagnostic markers for SLE and IBD. ROC curve analysis also showed good diagnostic performance. The immune cell infiltration results demonstrated that SLE and IBD shared almost identical immune infiltration patterns. Furthermore, single-cell analysis showed that the majority of the hub genes were commonly expressed in NK cells. Conclusion: This study demonstrates that SLE and IBD share common diagnostic markers and pathogenic pathways. In addition, SLE and IBD show similar immune cell infiltration microenvironments, which provides new perspectives for future treatment.


Biomarkers , Gene Expression Profiling , Gene Regulatory Networks , Inflammatory Bowel Diseases , Lupus Erythematosus, Systemic , Humans , Lupus Erythematosus, Systemic/genetics , Lupus Erythematosus, Systemic/diagnosis , Lupus Erythematosus, Systemic/immunology , Inflammatory Bowel Diseases/genetics , Inflammatory Bowel Diseases/diagnosis , Inflammatory Bowel Diseases/immunology , Transcriptome , Computational Biology/methods , Gene Ontology , Databases, Genetic
19.
Zhong Nan Da Xue Xue Bao Yi Xue Ban ; 49(2): 207-219, 2024 Feb 28.
Article En, Zh | MEDLINE | ID: mdl-38755717

OBJECTIVES: Abnormal immune system activation and inflammation are crucial in causing Parkinson's disease. However, how certain immune-related genes contribute to the development and progression of the disease is still not fully understood. This study aims to screen key immune-related genes in Parkinson's disease based on weighted gene co-expression network analysis (WGCNA) and machine learning. METHODS: This study downloaded gene chip data from the Gene Expression Omnibus (GEO) database and used WGCNA to screen out important gene modules related to Parkinson's disease. Genes from important modules were exported, and a Venn diagram of important Parkinson's disease-related genes and immune-related genes was drawn to screen out immune-related genes of Parkinson's disease. Gene ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis were used to analyze the functions of the immune-related genes and the signaling pathways involved. Immune cell infiltration analysis was performed using the CIBERSORT package in R. Using a bioinformatics method and 3 machine learning methods [least absolute shrinkage and selection operator (LASSO) regression, random forest (RF), and support vector machine (SVM)], the immune-related genes of Parkinson's disease were further screened. A Venn diagram of the differentially expressed genes screened using the 4 methods was drawn, with the intersecting gene taken as the hub gene. The downstream proteins of the Parkinson's disease hub gene were identified through the STRING database, and a protein-protein interaction network diagram was drawn. RESULTS: A total of 218 immune genes related to Parkinson's disease were identified, including 45 upregulated genes and 50 downregulated genes. Enrichment analysis showed that the 218 genes were mainly enriched in immune system response to foreign substances and viral infection pathways. The results of immune infiltration analysis showed that the infiltration percentages of CD4+ T cells, NK cells, CD8+ T cells, and B cells were higher in the samples of Parkinson's disease patients, while resting NK cells and resting CD4+ T cells were significantly infiltrated in the samples of Parkinson's disease patients. ANK1 was screened out as the hub gene. The analysis of the protein-protein interaction network showed that ANK1 was linked to 11 downstream proteins, which mainly participated in functions such as signal transduction, iron homeostasis regulation, and immune system activation. CONCLUSIONS: This study identifies the Parkinson's disease immune-related key gene ANK1 via WGCNA and machine learning methods, suggesting its potential as a candidate therapeutic target for Parkinson's disease.
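
The hub-gene step described in the abstract above (a Venn diagram over the genes selected by four methods, keeping the intersection) amounts to a set intersection. The following is a minimal sketch under that reading, not the authors' code; the function name `intersect_gene_sets`, the method labels, and the placeholder gene names are all hypothetical.

```python
from functools import reduce


def intersect_gene_sets(selected_by_method):
    """Return the genes chosen by every method (the centre of the Venn diagram)."""
    if not selected_by_method:
        return set()
    gene_sets = (set(genes) for genes in selected_by_method.values())
    return reduce(set.intersection, gene_sets)


# Placeholder selections; the gene names are invented for illustration only.
selected = {
    "differential_expression": ["GENE_A", "GENE_B", "GENE_C"],
    "lasso": ["GENE_B", "GENE_C", "GENE_D"],
    "random_forest": ["GENE_B", "GENE_E"],
    "svm": ["GENE_A", "GENE_B"],
}

hub_genes = intersect_gene_sets(selected)
print(sorted(hub_genes))
```

A strict intersection is conservative: a gene missed by any one method is dropped, so the result depends heavily on the least permissive selector.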


Gene Regulatory Networks , Machine Learning , Parkinson Disease , Parkinson Disease/genetics , Parkinson Disease/immunology , Humans , Gene Expression Profiling , Computational Biology/methods , Gene Ontology , Databases, Genetic , Signal Transduction/genetics , Oligonucleotide Array Sequence Analysis
20.
Brief Bioinform ; 25(3)2024 Mar 27.
Article En | MEDLINE | ID: mdl-38747283

The analysis and comparison of gene neighborhoods is a powerful approach for exploring microbial genome structure, function, and evolution. Although numerous tools exist for genome visualization and comparison, genome exploration across large genomic databases or user-generated datasets remains a challenge. Here, we introduce AnnoView, a web server designed for interactive exploration of gene neighborhoods across the bacterial and archaeal tree of life. Our server offers users the ability to identify, compare, and visualize gene neighborhoods of interest from 30 238 bacterial genomes and 1672 archaeal genomes, through integration with the comprehensive Genome Taxonomy Database and AnnoTree databases. Identified gene neighborhoods can be visualized using pre-computed functional annotations from different sources such as KEGG, Pfam and TIGRFAM, or clustered based on similarity. Alternatively, users can upload and explore their own custom genomic datasets in GBK, GFF or CSV format, or use AnnoView as a genome browser for relatively small genomes (e.g. viruses and plasmids). Ultimately, we anticipate that AnnoView will catalyze biological discovery by enabling user-friendly search, comparison, and visualization of genomic data. AnnoView is available at http://annoview.uwaterloo.ca.
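
To make the notion of a gene neighborhood in the abstract above concrete, here is a small sketch that extracts the genes flanking a target gene on the same contig. It illustrates the general concept only and is not AnnoView's implementation; the `Gene` record, the `neighborhood` function, the `flank` parameter, and the example coordinates are assumptions introduced for this example.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    contig: str
    start: int
    end: int


def neighborhood(genes: List[Gene], target: str, flank: int = 2) -> List[Gene]:
    """Return the target gene plus up to `flank` genes on each side,
    ordered by start position and restricted to the target's contig."""
    hit = next((g for g in genes if g.name == target), None)
    if hit is None:
        raise KeyError(f"gene {target!r} not found")
    same_contig = sorted(
        (g for g in genes if g.contig == hit.contig), key=lambda g: g.start
    )
    i = same_contig.index(hit)
    return same_contig[max(0, i - flank): i + flank + 1]


# Hypothetical annotation records (names and coordinates are made up).
annotation = [
    Gene("g1", "contig1", 100, 900),
    Gene("g2", "contig1", 1000, 1800),
    Gene("g3", "contig1", 1900, 2500),
    Gene("g4", "contig1", 2600, 3300),
    Gene("g5", "contig2", 50, 700),
]

window = neighborhood(annotation, "g3", flank=1)
print([g.name for g in window])
```

The window is defined by gene count rather than base-pair distance; a distance-based window would be an equally reasonable choice depending on the comparison being made.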


Software , Databases, Genetic , Genome, Bacterial , Genome, Archaeal , Genomics/methods , Archaea/genetics , Genes, Microbial/genetics , Computational Biology/methods , Bacteria/genetics , Bacteria/classification
...