1.
Trends Genet ; 39(4): 308-319, 2023 04.
Article in English | MEDLINE | ID: mdl-36750393

ABSTRACT

Pathway enrichment analysis is indispensable for interpreting omics datasets and generating hypotheses. However, the foundations of enrichment analysis remain elusive to many biologists. Here, we discuss best practices in interpreting different types of omics data using pathway enrichment analysis and highlight the importance of considering intrinsic features of various types of omics data. We further explain major components that influence the outcomes of a pathway enrichment analysis, including defining background sets and choosing reference annotation databases. To improve reproducibility, we describe how to standardize reporting methodological details in publications. This article aims to serve as a primer for biologists to leverage the wealth of omics resources and motivate bioinformatics tool developers to enhance the power of pathway enrichment analysis.


Subjects
Computational Biology , Reproducibility of Results
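
The abstract above names the background set as a major influence on enrichment results. The following is a minimal sketch of an over-representation test, assuming a one-sided hypergeometric model; the function names and the toy gene identifiers are hypothetical and are not taken from the article.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric variable X.

    population: size of the background set
    successes:  pathway genes present in the background
    draws:      hit genes present in the background
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def ora_pvalue(hits, pathway, background):
    """One-sided over-representation p-value for a single pathway.

    Hits and pathway genes outside the background are dropped first,
    so the background defines the universe of the test.
    """
    background = set(background)
    hits = set(hits) & background
    pathway = set(pathway) & background
    overlap = len(hits & pathway)
    return hypergeom_sf(overlap, len(background), len(pathway), len(hits))


# Toy data (hypothetical identifiers): same hits and pathway, two backgrounds.
pathway = {"g0", "g1", "g2", "g3", "g4"}
hits = {"g0", "g1", "g2", "g10"}
small_background = {f"g{i}" for i in range(20)}
large_background = {f"g{i}" for i in range(100)}

p_small = ora_pvalue(hits, pathway, small_background)
p_large = ora_pvalue(hits, pathway, large_background)
```

With identical hits and pathway, the p-value is much smaller against the 100-gene background than against the 20-gene one, which is why the background should reflect the genes that could actually have been detected in the experiment.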
2.
Am J Hum Genet ; 110(1): 44-57, 2023 01 05.
Article in English | MEDLINE | ID: mdl-36608684

ABSTRACT

Integrative genetic association methods have shown great promise in post-GWAS (genome-wide association study) analyses, in which one of the most challenging tasks is identifying putative causal genes and uncovering molecular mechanisms of complex traits. Recent studies suggest that prevailing computational approaches, including transcriptome-wide association studies (TWASs) and colocalization analysis, are individually imperfect, but their joint usage can yield robust and powerful inference results. This paper presents INTACT, a computational framework to integrate probabilistic evidence from these distinct types of analyses and implicate putative causal genes. This procedure is flexible and can work with a wide range of existing integrative analysis approaches. It has the unique ability to quantify the uncertainty of implicated genes, enabling rigorous control of false-positive discoveries. Taking advantage of this highly desirable feature, we further propose an efficient algorithm, INTACT-GSE, for gene set enrichment analysis based on the integrated probabilistic evidence. We examine the proposed computational methods and illustrate their improved performance over the existing approaches through simulation studies. We apply the proposed methods to analyze the multi-tissue eQTL data from the GTEx project and eight large-scale complex- and molecular-trait GWAS datasets from multiple consortia and the UK Biobank. Overall, we find that the proposed methods markedly improve the existing putative gene implication methods and are particularly advantageous in evaluating and identifying key gene sets and biological pathways underlying complex traits.


Subjects
Genome-Wide Association Study , Transcriptome , Humans , Transcriptome/genetics , Genome-Wide Association Study/methods , Multifactorial Inheritance/genetics , Quantitative Trait Loci/genetics , Computer Simulation , Polymorphism, Single Nucleotide/genetics , Genetic Predisposition to Disease
3.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38436561

ABSTRACT

Enrichment analysis (EA) is a common approach to gain functional insights from genome-scale experiments. As a consequence, a large number of EA methods have been developed, yet it is unclear from previous studies which method is the best for a given dataset. The main issues with previous benchmarks include the complexity of correctly assigning true pathways to a test dataset, and lack of generality of the evaluation metrics, for which the rank of a single target pathway is commonly used. We here provide a generalized EA benchmark and apply it to the most widely used EA methods, representing all four categories of current approaches. The benchmark employs a new set of 82 curated gene expression datasets from DNA microarray and RNA-Seq experiments for 26 diseases, of which only 13 are cancers. In order to address the shortcomings of the single target pathway approach and to enhance the sensitivity evaluation, we present the Disease Pathway Network, in which related Kyoto Encyclopedia of Genes and Genomes pathways are linked. We introduce a novel approach to evaluate pathway EA by combining sensitivity and specificity to provide a balanced evaluation of EA methods. This approach identifies Network Enrichment Analysis methods as the overall top performers compared with overlap-based methods. By using randomized gene expression datasets, we explore the null hypothesis bias of each method, revealing that most of them produce skewed P-values.


Subjects
Benchmarking , RNA-Seq
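
The abstract above reports that most methods produce skewed P-values on randomized datasets. One simple way to quantify such skew is the Kolmogorov-Smirnov distance between the observed P-values and the uniform distribution expected under the null hypothesis. The sketch below is an illustration under that assumption; it is not necessarily the metric used in the benchmark, and the function name is hypothetical.

```python
def ks_uniform_statistic(pvalues):
    """Kolmogorov-Smirnov distance between p-values and Uniform(0, 1).

    Values near 0 mean the p-values look uniform, as expected under a
    well-calibrated null; values near 1 indicate strong skew.
    """
    xs = sorted(pvalues)
    n = len(xs)
    if n == 0:
        raise ValueError("at least one p-value is required")
    # Compare the empirical CDF just after and just before each point
    # with the uniform CDF, which equals x on [0, 1].
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


# Evenly spread p-values versus p-values piled up near zero.
evenly_spread = [(i + 0.5) / 10 for i in range(10)]
piled_near_zero = [0.01] * 10

d_even = ks_uniform_statistic(evenly_spread)
d_skewed = ks_uniform_statistic(piled_near_zero)
```

A method whose null P-values pile up near zero would report spurious enrichment on random data, so a large distance here is a warning sign rather than evidence of sensitivity.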
4.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36572652

ABSTRACT

BACKGROUND: Global or untargeted metabolomics is widely used to comprehensively investigate metabolic profiles under various pathophysiological conditions such as inflammations, infections, responses to exposures or interactions with microbial communities. However, biological interpretation of global metabolomics data remains a daunting task. Recent years have seen growing applications of pathway enrichment analysis based on putative annotations of liquid chromatography coupled with mass spectrometry (LC-MS) peaks for functional interpretation of LC-MS-based global metabolomics data. However, due to intricate peak-metabolite and metabolite-pathway relationships, considerable variations are observed among results obtained using different approaches. There is an urgent need to benchmark these approaches to inform the best practices. RESULTS: We have conducted a benchmark study of common peak annotation approaches and pathway enrichment methods in current metabolomics studies. Representative approaches, including three peak annotation methods and four enrichment methods, were selected and benchmarked under different scenarios. Based on the results, we have provided a set of recommendations regarding peak annotation, ranking metrics and feature selection. The overall better performance was obtained for the mummichog approach. We have observed that a ~30% annotation rate is sufficient to achieve high recall (~90% based on mummichog), and using semi-annotated data improves functional interpretation. Based on the current platforms and enrichment methods, we further propose an identifiability index to indicate the possibility of a pathway being reliably identified. Finally, we evaluated all methods using 11 COVID-19 and 8 inflammatory bowel diseases (IBD) global metabolomics datasets.


Subjects
COVID-19 , Tandem Mass Spectrometry , Humans , Chromatography, Liquid/methods , Metabolomics/methods , Metabolome
5.
Hum Genomics ; 18(1): 15, 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38326862

ABSTRACT

BACKGROUND: It is valuable to analyze the genome-wide association studies (GWAS) data for a complex disease phenotype in the context of the protein-protein interaction (PPI) network, as the related pathophysiology results from the function of interacting polyprotein pathways. The analysis may include the design and curation of a phenotype-specific GWAS meta-database incorporating genotypic and eQTL data linking to PPI and other biological datasets, and the development of systematic workflows for PPI network-based data integration toward protein and pathway prioritization. Here, we pursued this analysis for blood pressure (BP) regulation. METHODS: The relational scheme of the BP-GWAS meta-database, implemented in Microsoft SQL Server, enabled the combined storage of GWAS data and attributes mined from the GWAS Catalog and the literature, Ensembl-defined SNP-transcript associations, and GTEx eQTL data. The BP-protein interactome was reconstructed from the PICKLE PPI meta-database, extending the GWAS-deduced network with the shortest paths connecting all GWAS proteins into one component. The shortest-path intermediates were considered BP-related. For protein prioritization, we combined a new integrated GWAS-based scoring scheme with two network-based criteria: one considering the protein's role in the reconstructed-by-shortest-path (RbSP) interactome and a novel one promoting the common neighbors of GWAS-prioritized proteins. Prioritized proteins were ranked by the number of satisfied criteria. RESULTS: The meta-database includes 6687 variants linked with 1167 BP-associated protein-coding genes. The GWAS-deduced PPI network includes 1065 proteins, with 672 forming a connected component. The RbSP interactome contains 1443 additional, network-deduced proteins and indicated that essentially all BP-GWAS proteins are at most second neighbors. 
The prioritized BP-protein set was derived from the union of the most BP-significant by any of the GWAS-based or the network-based criteria. It included 335 proteins, with ~ 2/3 deduced from the BP PPI network extension and 126 prioritized by at least two criteria. ESR1 was the only protein satisfying all three criteria, followed in the top-10 by INSR, PTN11, CDK6, CSK, NOS3, SH2B3, ATP2B1, FES and FINC, satisfying two. Pathway analysis of the RbSP interactome revealed numerous bioprocesses, which are indeed functionally supported as BP-associated, extending our understanding about BP regulation. CONCLUSIONS: The implemented workflow could be used for other multifactorial diseases.


Subjects
Genome-Wide Association Study , Protein Interaction Maps , Humans , Protein Interaction Maps/genetics , Genome-Wide Association Study/methods , Blood Pressure/genetics , Genotype , Databases, Factual , Plasma Membrane Calcium-Transporting ATPases
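
The network extension described in the abstract above (connecting GWAS proteins through shortest paths and treating the intermediates as trait-related) can be sketched as follows. This is a simplified illustration, not the authors' workflow: it follows only one shortest path per protein pair, uses a plain breadth-first search, and runs on a hypothetical toy graph.

```python
from collections import deque
from itertools import combinations


def shortest_path(adj, src, dst):
    """Return one shortest path from src to dst in an unweighted graph.

    adj maps each node to an iterable of neighbours. Returns None when
    the two nodes are not connected.
    """
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for neighbour in adj.get(node, ()):
            if neighbour not in prev:
                prev[neighbour] = node
                queue.append(neighbour)
    return None


def shortest_path_intermediates(adj, seeds):
    """Collect non-seed nodes lying on a shortest path between seed pairs."""
    seeds = set(seeds)
    intermediates = set()
    for a, b in combinations(sorted(seeds), 2):
        path = shortest_path(adj, a, b)
        if path is not None:
            intermediates.update(p for p in path[1:-1] if p not in seeds)
    return intermediates


# Hypothetical toy interactome: A, B and C stand in for GWAS proteins.
toy_ppi = {
    "A": ["X"],
    "X": ["A", "B"],
    "B": ["X", "Y"],
    "Y": ["B", "C"],
    "C": ["Y"],
    "D": [],
}
extension = shortest_path_intermediates(toy_ppi, {"A", "B", "C"})
```

In the toy graph, X and Y are the network-deduced intermediates, while the isolated node D contributes nothing. A fuller treatment would enumerate all shortest paths between each pair rather than a single one.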
6.
Plant J ; 116(4): 1097-1117, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37824297

ABSTRACT

We have developed a compendium and interactive platform, named Stress Combinations and their Interactions in Plants Database (SCIPDb; http://www.nipgr.ac.in/scipdb.php), which offers information on morpho-physio-biochemical (phenome) and molecular (transcriptome and metabolome) responses of plants to different stress combinations. SCIPDb is a plant stress informatics hub for data mining on phenome, transcriptome, trait-gene ontology, and data-driven research for advancing mechanistic understanding of combined stress biology. We analyzed global phenome data from 939 studies to delineate the effects of various stress combinations on yield in major crops and found that yield was substantially affected under abiotic-abiotic stresses. Transcriptome datasets from 36 studies hosted in SCIPDb identified novel genes, whose roles have not been earlier established in combined stress. Integretome analysis under combined drought-heat stress pinpointed carbohydrate, amino acid, and energy metabolism pathways as the crucial metabolic, proteomic, and transcriptional components in plant tolerance to combined stress. These examples illustrate the application of SCIPDb in identifying novel genes and pathways involved in combined stress tolerance. Further, we showed the application of this database in identifying novel candidate genes and pathways for combined drought and pathogen stress tolerance. To our knowledge, SCIPDb is the only publicly available platform offering combined stress-specific omics big data visualization tools, such as an interactive scrollbar, stress matrix, radial tree, global distribution map, meta-phenome analysis, search, BLAST, transcript expression pattern table, Manhattan plot, and co-expression network. These tools facilitate a better understanding of the mechanisms underlying plant responses to combined stresses.


Subjects
Plants , Proteomics , Plants/genetics , Transcriptome , Stress, Physiological/genetics , Phenotype , Droughts , Gene Expression Regulation, Plant/genetics
7.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35453140

ABSTRACT

Pathway enrichment analysis has become a widely used knowledge-based approach for the interpretation of biomedical data. Its popularity has led to an explosion of both enrichment methods and pathway databases. While the elegance of pathway enrichment lies in its simplicity, multiple factors can impact the results of such an analysis, which may not be accounted for. Researchers may fail to give influential aspects their due, resorting instead to popular methods and gene set collections, or default settings. Despite ongoing efforts to establish set guidelines, meaningful results are still hampered by a lack of consensus or gold standards around how enrichment analysis should be conducted. Nonetheless, such concerns have prompted a series of benchmark studies specifically focused on evaluating the influence of various factors on pathway enrichment results. In this review, we organize and summarize the findings of these benchmarks to provide a comprehensive overview on the influence of these factors. Our work covers a broad spectrum of factors, spanning from methodological assumptions to those related to prior biological knowledge, such as pathway definitions and database choice. In doing so, we aim to shed light on how these aspects can lead to insignificant, uninteresting or even contradictory results. Finally, we conclude the review by proposing future benchmarks as well as solutions to overcome some of the challenges, which originate from the outlined factors.


Subjects
Databases, Factual , Factor Analysis, Statistical , Longitudinal Studies
8.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-36063560

ABSTRACT

Biological pathways are a broadly used formalism for representing and interpreting the cascade of biochemical reactions underlying cellular and biological mechanisms. Pathway representation provides an ontological link among biomolecules such as RNA, DNA, small molecules, proteins, protein complexes, hormones and genes. Frequently, pathway annotations are used to identify mechanisms linked to genes within affected biological contexts. This important role and the simplicity and elegance in representing complex interactions led to an explosion of pathway representations and databases. Unfortunately, the lack of overlap across databases results in inconsistent enrichment analysis results, unless databases are integrated. However, due to absence of consensus, guidelines or gold standards in pathway definition and representation, integration of data across pathway databases is not straightforward. Despite multiple attempts to provide consolidated pathways, highly related, redundant, poorly overlapping or ambiguous pathways continue to render pathways analysis inconsistent and hard to interpret. Ontology-based integration will promote unbiased, comprehensive yet streamlined analysis of experiments, and will reduce the number of enriched pathways when performing pathway enrichment analysis. Moreover, appropriate and consolidated pathways provide better training data for pathway prediction algorithms. In this manuscript, we describe the current methods for pathway consolidation, their strengths and pitfalls, and highlight directions for future improvements to this research area.


Subjects
Algorithms , Proteins , Databases, Factual , Hormones , Molecular Sequence Annotation , RNA
9.
Cytokine ; 180: 156609, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38781871

ABSTRACT

BACKGROUND: We aimed to identify the hub genes and signalling pathways associated with sepsis-associated encephalopathy (SAE). METHODS: The raw datasets were acquired from the Gene Expression Omnibus (GEO) database (GSE198861 and GSE167610). Differentially expressed genes (DEGs) were filtered using R software and subjected to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. Hub genes were identified from the intersection of DEGs via a protein-protein interaction (PPI) network. The single-cell dataset (GSE101901) was used to verify in which hippocampal cells the hub genes are expressed. Cell-cell interaction analysis and Gene Set Variation Analysis (GSVA) of the whole transcriptome validated the interactions between hippocampal cells. RESULTS: A total of 161 DEGs were revealed in the GSE198861 and GSE167610 datasets. Biological function analysis showed that the DEGs were primarily and significantly enriched in the phagosome pathway. Ten hub genes were extracted from the PPI network. M2 macrophages decreased significantly during the acute period, and the hub genes may play a role in this biological process. The hippocampal variation pathway was associated with the MAPK signaling pathway. CONCLUSION: Hub genes (Pecam1, Cdh5, Fcgr, C1qa, Vwf, Vegfa, C1qb, C1qc, Fcgr4 and Fcgr2b) may participate in the biological process of SAE.


Subjects
Protein Interaction Maps , Sepsis-Associated Encephalopathy , Humans , Sepsis-Associated Encephalopathy/genetics , Sepsis-Associated Encephalopathy/metabolism , Protein Interaction Maps/genetics , Databases, Genetic , Gene Expression Profiling , Gene Regulatory Networks , Hippocampus/metabolism , Signal Transduction/genetics , Transcriptome/genetics , Animals , Sepsis/genetics , Sepsis/metabolism
10.
Exp Dermatol ; 33(3): e15043, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38459629

ABSTRACT

Despite progress made with immune checkpoint inhibitors and targeted therapies, skin cancer remains a significant public health concern in the United States. The intricacies of the disease, encompassing genetics, immune responses, and external factors, call for a comprehensive approach. Techniques in systems genetics, including transcriptional correlation analysis, functional pathway enrichment analysis, and protein-protein interaction network analysis, prove valuable in deciphering intricate molecular mechanisms and identifying potential diagnostic and therapeutic targets for skin cancer. Recent studies demonstrate the efficacy of these techniques in uncovering molecular processes and pinpointing diagnostic markers for various skin cancer types, highlighting the potential of systems genetics in advancing innovative therapies. While certain limitations exist, such as generalizability and contextualization of external factors, the ongoing progress in AI technologies provides hope in overcoming these challenges. By providing protocols and a practical example involving Braf, we aim to inspire early-career experimental dermatologists to adopt these tools and seamlessly integrate these techniques into their skin cancer research, positioning them at the forefront of innovative approaches in combating this devastating disease.


Subjects
Skin Neoplasms , Humans , Skin Neoplasms/genetics , Skin
11.
BMC Gastroenterol ; 24(1): 60, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38308210

ABSTRACT

Ulcerative colitis (UC) is a chronic inflammatory disease that targets the colon and has seen an increasing prevalence worldwide. In our pursuit of new diagnostic and therapeutic approaches for UC, we undertook a sequencing of colons from UC mouse models. We focused on analyzing their differentially expressed genes (DEGs), enriching pathways, and constructing protein-protein interaction (PPI) and Competing Endogenous RNA (ceRNA) networks. Our analysis highlighted novel DEGs such as Tppp3, Saa3, Cemip, Pappa, and Nr1d1. These DEGs predominantly play roles in pathways like cytokine-mediated signaling, extracellular matrix organization, extracellular structure organization, and external encapsulating structure organization. This suggests that the UC pathogenesis is intricately linked to the interactions between immune and non-immune cells with the extracellular matrix (ECM). To corroborate our findings, we also verified certain DEGs through quantitative real-time PCR. Within the PPI network, nodes like Stat3, Il1b, Mmp3, and Lgals3 emerged as significant and were identified to be involved in the crucial cytokine-mediated signaling pathway, which is central to inflammation. Our ceRNA network analysis further brought to light the role of the Smad7 Long non-coding RNA (lncRNA). Key MicroRNA (miRNAs) in the ceRNA network were pinpointed as mmu-miR-17-5p, mmu-miR-93-5p, mmu-miR-20b-5p, mmu-miR-16-5p, and mmu-miR-106a-5p, while central mRNAs included Egln3, Plagl2, Sema7a, Arrdc3, and Stat3. These insights imply that ceRNA networks are influential in UC progression and could provide further clarity on its pathogenesis. In conclusion, this research deepens our understanding of UC pathogenesis and paves the way for potential new diagnostic and therapeutic methods. Nevertheless, to solidify our findings, additional experiments are essential to confirm the roles and molecular interplay of the identified DEGs in UC.


Subjects
Colitis, Ulcerative , MicroRNAs , Animals , Mice , Colitis, Ulcerative/genetics , Intestines , Inflammation/genetics , MicroRNAs/genetics , Disease Models, Animal
12.
BMC Cardiovasc Disord ; 24(1): 375, 2024 Jul 19.
Article in English | MEDLINE | ID: mdl-39026189

ABSTRACT

BACKGROUND: Acute myocardial injury, cytokine storms, hypoxemia and pathogen-mediated damage were the major causes responsible for mortality induced by coronavirus disease 2019 (COVID-19)-related myocarditis. These need ECMO treatment. We investigated differentially expressed genes (DEGs) in patients with COVID-19-related myocarditis and ECMO prognosis. METHODS: GSE150392 and GSE93101 were analyzed to identify DEGs. A Venn diagram was used to obtain the same transcripts between myocarditis-related and ECMO-related DEGs. Enrichment pathway analysis was performed and hub genes were identified. Pivotal miRNAs, transcription factors, and chemicals with the screened gene interactions were identified. The GSE167028 dataset and single-cell sequencing data were used to validate the screened genes. RESULTS: Using a Venn diagram, 229 overlapping DEGs were identified between myocarditis-related and ECMO-related DEGs, which were mainly involved in T cell activation, contractile actin filament bundle, actomyosin, cyclic nucleotide phosphodiesterase activity, and cytokine-cytokine receptor interaction. 15 hub genes and 15 neighboring DEGs were screened, which were mainly involved in the positive regulation of T cell activation, integrin complex, integrin binding, the PI3K-Akt signaling pathway, and the TNF signaling pathway. Data in GSE167028 and single-cell sequencing data were used to validate the screened genes, and this demonstrated that the screened genes CCL2, APOE, ITGB8, LAMC2, COL6A3 and TNC were mainly expressed in fibroblast cells; IL6, ITGA1, PTK2, ITGB5, IL15, LAMA4, CAV1, SNCA, BDNF, ACTA2, CD70, MYL9, DPP4, ENO2 and VEGFC were expressed in cardiomyocytes; IL6, PTK2, ITGB5, IL15, APOE, JUN, SNCA, CD83, DPP4 and ENO2 were expressed in macrophages; and IL6, ITGA1, PTK2, ITGB5, IL15, VCAM1, LAMA4, CAV1, ACTA2, MYL9, CD83, DPP4, ENO2, VEGFC and IL32 were expressed in vascular endothelial cells. 
CONCLUSION: The screened hub genes, IL6, ITGA1, PTK2, ITGB3, ITGB5, CCL2, IL15, VCAM1, GZMB, APOE, ITGB8, LAMA4, LAMC2, COL6A3 and TNFRSF9, were validated using a GEO dataset and single-cell sequencing data and may be therapeutic targets for patients with myocarditis to prevent MI progression and adverse cardiovascular events.


Subjects
COVID-19 , Extracorporeal Membrane Oxygenation , Myocarditis , Humans , COVID-19/genetics , COVID-19/therapy , COVID-19/complications , Myocarditis/genetics , Myocarditis/therapy , Myocarditis/virology , Prognosis , Gene Expression Profiling , Databases, Genetic , SARS-CoV-2 , Gene Regulatory Networks , Transcriptome
13.
Environ Res ; 243: 117776, 2024 Feb 15.
Article in English | MEDLINE | ID: mdl-38043890

ABSTRACT

INTRODUCTION: Exposure to metals is associated with increased risk of type 2 diabetes (T2D). Potential mechanisms for metals-T2D associations involve biological processes including oxidative stress and disruption of insulin-regulated glucose uptake. In this study, we assessed whether associations between metal exposure and metabolite profiles relate to biological pathways linked to T2D. MATERIALS AND METHODS: We used data from 29 adult rural Colorado residents enrolled in the San Luis Valley Diabetes Study. Urinary concentrations of arsenic, cadmium, cobalt, lead, manganese, and tungsten were measured. Metabolic effects were evaluated using untargeted metabolic profiling, which included 61,851 metabolite signals detected in serum. We evaluated cross-sectional associations between metals and metabolites present in at least 50% of samples. Primary analyses adjusted urinary heavy metal concentrations for creatinine. Metabolite outcomes associated with each metal exposure were evaluated using pathway enrichment to investigate potential mechanisms underlying the relationship between metals and T2D. RESULTS: Participants had a mean age of 58.5 years (standard deviation = 9.2), 48.3% were female, 48.3% identified as Hispanic/Latino, 13.8% were current smokers, and 65.5% had T2D. Of the detected metabolites, 455 were associated with at least one metal, including 42 associated with arsenic, 22 with cadmium, 10 with cobalt, 313 with lead, 66 with manganese, and two with tungsten. The metabolic features were linked to 24 pathways including linoleate metabolism, butanoate metabolism, and arginine and proline metabolism. Several of these pathways have been previously associated with T2D, and our results were similar when including only participants with T2D. CONCLUSIONS: Our results support the hypothesis that metals exposure may be associated with biological processes related to T2D, including amino acid, co-enzyme, and sugar and fatty acid metabolism. 
Insight into biological pathways could influence interventions to prevent adverse health outcomes due to metal exposure.


Subjects
Arsenic , Diabetes Mellitus, Type 2 , Metals, Heavy , Adult , Humans , Female , Middle Aged , Male , Diabetes Mellitus, Type 2/epidemiology , Manganese , Cadmium , Arsenic/toxicity , Tungsten , Cross-Sectional Studies , Cobalt
14.
Environ Res ; 247: 118276, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38246299

ABSTRACT

Ambient PM2.5 exposure has been recognized as a major health risk and is related to aging, cardiovascular, respiratory and neurologic diseases, and cancer. However, the underlying mechanisms of epigenetic alteration and the regulated pathways remain unclear. Studies on the methylome effects of PM2.5 exposure are quite limited in Chinese populations, and cohort-based studies are absent. This study included blood-derived DNA methylation data for 3365 Chinese participants from the NSPT cohort. We estimated individual short-medium-, medium- and long-term PM2.5 exposure levels based on a validated prediction model. We performed epigenome-wide association studies to estimate the links between PM2.5 exposure and DNA methylation change, as well as stratification and sensitivity analyses to examine the robustness of the association models. A systematic review was conducted to obtain previously published CpGs, which were examined for replication. We also compared the DNA methylation variation corresponding to different time windows. We further conducted gene function analysis and pathway enrichment analysis to reveal related biological responses. We identified a total of 177 CpGs and 107 DMRs associated with short-medium-term PM2.5 exposure at strict genome-wide significance (P < 5 × 10⁻⁸). The effect sizes on most CpGs tended to diminish as the exposure time scale was extended. Associated markers and aligned genes were related to aging, immunity, inflammation and carcinogenesis. Enriched pathways were mostly involved in cell cycle and cell division, signal transduction, and inflammatory pathways. Our study is the first EWAS on PM2.5 exposure conducted in a large-scale Han Chinese cohort and identified associated DNA methylation changes at CpGs and regions, as well as related gene functions and pathways.

15.
Metab Brain Dis ; 39(1): 29-42, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38153584

ABSTRACT

Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition characterized by altered brain connectivity and function. In this study, we employed advanced bioinformatics and explainable AI to analyze gene expression associated with ASD, using data from five GEO datasets. Among 351 neurotypical controls and 358 individuals with autism, we identified 3,339 Differentially Expressed Genes (DEGs) with an adjusted p-value (≤ 0.05). A subsequent meta-analysis pinpointed 342 DEGs (adjusted p-value ≤ 0.001), including 19 upregulated and 10 down-regulated genes across all datasets. Shared genes, pathogenic single nucleotide polymorphisms (SNPs), chromosomal positions, and their impact on biological pathways were examined. We identified potential biomarkers (HOXB3, NR2F2, MAPK8IP3, PIGT, SEMA4D, and SSH1) through text mining, meriting further investigation. Additionally, we shed light on the roles of RPS4Y1 and KDM5D genes in neurogenesis and neurodevelopment. Our analysis detected 1,286 SNPs linked to ASD-related conditions, of which 14 high-risk SNPs were located on chromosomes 10 and X. We highlighted potential missense SNPs associated with FGFR inhibitors, suggesting that it may serve as a promising biomarker for responsiveness to targeted therapies. Our explainable AI model identified the MID2 gene as a potential ASD biomarker. This research unveils vital genes and potential biomarkers, providing a foundation for novel gene discovery in complex diseases.


Subjects
Autism Spectrum Disorder , Autistic Disorder , Humans , Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , Biomarkers , Brain , Genomics , Minor Histocompatibility Antigens , Histone Demethylases
16.
Molecules ; 29(12)2024 Jun 09.
Article in English | MEDLINE | ID: mdl-38930811

ABSTRACT

Due to the intricate complexity of the original microbiota, residual heat-resistant enzymes, and chemical components, identifying the essential factors that affect dairy quality using traditional methods is challenging. In this study, raw milk, pasteurized milk, and ultra-heat-treated (UHT) milk samples were collectively analyzed using metagenomic next-generation sequencing (mNGS), high-throughput liquid chromatography-mass spectrometry (LC-MS), and gas chromatography-mass spectrometry (GC-MS). The results revealed that raw milk and its corresponding heated dairy products exhibited different trends in terms of microbiota shifts and metabolite changes during storage. Via the analysis of differences in microbiota and correlation analysis of the microorganisms present in differential metabolites in refrigerated pasteurized milk, the top three differential microorganisms with increased abundance, Microbacterium (p < 0.01), unclassified Actinomycetia class (p < 0.05), and Micrococcus (p < 0.01), were detected; these were highly correlated with certain metabolites in pasteurized milk (r > 0.8). This indicated that these genera were the main proliferating microorganisms and were the primary genera involved in the metabolism of pasteurized milk during refrigeration-based storage. Microorganisms with decreased abundance were classified into two categories based on correlation analysis with certain metabolites. It was speculated that the heat-resistant enzyme system of a group of microorganisms with high correlation (r > 0.8), such as Pseudomonas and Acinetobacter, was the main factor causing milk spoilage and that the group with lower correlation (r < 0.3) had a lower impact on the storage process of pasteurized dairy products. 
By comparing the metabolic pathway results based on metagenomic and metabolite annotation, it was proposed that protein degradation may be associated with microbial growth, whereas lipid degradation may be linked to raw milk's initial heat-resistant enzymes. By leveraging the synergy of metagenomics and metabolomics, the interacting factors determining the quality evolution of dairy products were systematically investigated, providing a novel perspective for controlling dairy processing and storage effectively.


Subjects
Microbiota , Milk , Animals , Milk/microbiology , Milk/metabolism , Food Storage/methods , Pasteurization , High-Throughput Nucleotide Sequencing , Dairy Products/microbiology , Metagenomics/methods , Gas Chromatography-Mass Spectrometry , Food Handling/methods , Bacteria/metabolism , Bacteria/classification , Bacteria/genetics , Metabolome
17.
J Gene Med ; 25(12): e3561, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37394280

ABSTRACT

BACKGROUND: The present study aimed to identify the module genes and key gene functions and biological pathways of septic shock (SS) through integrated bioinformatics analysis. METHODS: In the study, we performed batch correction and principal component analysis on 282 SS samples and 79 normal control samples in three datasets, GSE26440, GSE95233 and GSE57065, to obtain a combined corrected gene expression matrix containing 21,654 transcripts. Patients with SS were then divided into three molecular subtypes according to sample subtyping analysis. RESULTS: By analyzing the demographic characteristics of the different subtypes, we found no statistically significant differences in gender ratio and age composition among the three groups. Then, three subtypes of differentially expressed genes (DEGs) and specific upregulated DEGs (SDEGs) were identified by differential gene expression analysis. We found 7361 DEGs in the type I group, 5594 DEGs in the type II group, and 7159 DEGs in the type III group. There were 1698 SDEGs in the type I group, 2443 in the type II group, and 1831 in the type III group. In addition, we analyzed the correlation between the expression data of 5972 SDEGs in the three subtypes and the gender and age of 227 patients, constructed a weighted gene co-expression network, and identified 11 gene modules, among which the module with the highest correlation with gender ratio was MEgrey. The modules with the highest correlation with age composition were MEgrey60 and MElightyellow. Then, by analyzing the differences in module genes among different subgroups of SS, we obtained the differential expression of 11 module genes in four groups: type I, type II, type III and the control group. Finally, we analyzed the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment of all module DEGs, and the GO function and KEGG pathway enrichment of different module genes were different. 
CONCLUSIONS: This study aimed to identify the specific genes and intrinsic molecular functional pathways of SS subtypes and to further explore the genetic and molecular pathophysiological mechanisms of SS.


Subjects
Protein Interaction Maps , Septic Shock , Humans , Protein Interaction Maps/genetics , Septic Shock/genetics , Gene Expression Profiling , Gene Regulatory Networks , Biomarkers , Computational Biology
18.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33316063

ABSTRACT

Erectile dysfunction (ED) can be caused by different diseases and is controlled by several genetic networks. In this study, to identify the genes related to ED, the expression profiles of normal and ED samples were investigated using the Gene Expression Omnibus (GEO) database. Seventeen genes were identified as associated with ED. The protein and nucleic acid sequences of the selected genes were retrieved from the UCSC database. The selected genes were diverse in their physicochemical properties and functions. Functional categorization revealed that the selected genes are involved in pathways related to some human diseases. Furthermore, based on protein interactions, genes associated with the insulin pathway had the greatest interaction with the studied genes. To identify common cis-regulatory elements, the promoter sites of the selected genes were retrieved from the UCSC database. The Gapped Local Alignment of Motifs tool was used to find common conserved motifs in the promoter sites of the selected genes. In addition, the INSR protein, an insulin receptor precursor, showed high potential for post-translational modifications, including phosphorylation and N-glycosylation. Two guanine-cytosine (GC)-rich regions were also identified as conserved motifs upstream of the studied genes, which may be involved in regulating the expression of genes associated with ED. Moreover, a conserved binding site for miR-29-3p, which is involved in various cancers, was observed in the 3' untranslated region of genes associated with ED. Our study introduces new genes associated with ED, which may be good candidates for further analysis of human ED.


Subjects
3' Untranslated Regions , Nucleic Acid Databases , Erectile Dysfunction , Gene Expression Regulation , Genetic Promoter Regions , Erectile Dysfunction/genetics , Erectile Dysfunction/metabolism , Genome-Wide Association Study , Humans , Male
19.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34396389

ABSTRACT

Typical clustering analysis for large-scale genomics data combines two unsupervised learning techniques: dimensionality reduction and clustering (DR-CL) methods. It has been demonstrated that transforming gene expression to pathway-level information can improve the robustness and interpretability of disease grouping results. This approach, referred to as the biological knowledge-driven clustering (BK-CL) approach, is often neglected due to a lack of tools enabling systematic comparisons with more established DR-based methods. Moreover, classic clustering metrics based on group separability tend to favor the DR-CL paradigm, which may increase the risk of identifying less actionable disease subtypes that have ambiguous biological and clinical explanations. Hence, there is a need to develop metrics that assess biological and clinical relevance. To facilitate the systematic analysis of BK-CL methods, we propose a computational protocol for quantitative analysis of clustering results derived from both DR-CL and BK-CL methods. Moreover, we propose a new BK-CL method that combines prior knowledge of disease-relevant genes, network diffusion algorithms and gene set enrichment analysis to generate robust pathway-level information. Benchmarking studies were conducted to compare the grouping results from different DR-CL and BK-CL approaches with respect to standard clustering evaluation metrics, concordance with known subtypes, association with clinical outcomes and disease modules in co-expression networks of genes. No single approach dominated every metric, showing the importance of multi-objective evaluation in clustering analysis. However, we demonstrated that, on gene expression data sets derived from TCGA samples, the BK-CL approach can find groupings that provide significant prognostic value in both breast and prostate cancers.


Subjects
Biomarkers , Computational Biology/methods , Data Mining , Disease Susceptibility , Algorithms , Cluster Analysis , Genetic Databases , Gene Expression Profiling/methods , Gene Regulatory Networks , Genetic Predisposition to Disease , Genomics/methods , Humans , Prognosis , Signal Transduction , Survival Analysis , Workflow
20.
Exp Eye Res ; 235: 109644, 2023 10.
Article in English | MEDLINE | ID: mdl-37683796

ABSTRACT

Sulfur mustard (SM) ocular exposure severely damages the cornea and causes vision impairment. At present, no specific therapy exists to mitigate SM-induced corneal injury and vision loss. This study performed transcriptome profiling of naïve, SM-damaged, and SM-undamaged rabbit corneas using RNA-seq analysis and bioinformatic tools to gain a better mechanistic understanding and develop SM-specific medical countermeasures. The mRNA profiles of rabbit corneas 4 weeks post SM vapor exposure were generated using Illumina-NextSeq deep sequencing (Gene Expression Omnibus accession # GSE127708). The RNA sequences of naïve (n = 4), SM-damaged (n = 5), and SM-undamaged (n = 5) corneas were subjected to differential expression (DE) analysis after quality control profiling with FastQC. DE analysis was performed using HISAT2, StringTie, and DESeq2. Thresholds of log2(FC) ±2 and adjusted p < 0.05 were chosen to identify the most relevant genes. A total of 5930 differentially expressed genes (DEGs) (upregulated: 3196, downregulated: 2734) were found in SM-damaged corneas compared to naïve corneas, whereas SM-undamaged corneas showed 1884 DEGs (upregulated: 1029, downregulated: 855) compared to naïve corneas. DE profiling of SM-damaged versus SM-undamaged corneas revealed 985 genes (upregulated: 308, downregulated: 677). The DE profiles were subsequently subjected to signaling pathway enrichment, and protein-protein interactions (PPIs) were analyzed. Pathway enrichment was performed for the genes associated with cellular apoptosis, death, adhesion, migration, differentiation, proliferation, extracellular matrix, and tumor necrosis factor production. To identify novel targets, we narrowed the pathway analysis to upregulated and downregulated genes associated with cell proliferation and differentiation, and PPI networks were developed. Furthermore, protein targets associated with cell differentiation and proliferation that may play vital roles in corneal fibrosis and wound healing post SM injury were identified.
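
The DEG selection thresholds quoted in this abstract can be illustrated with a minimal Python sketch. The study itself used DESeq2; the record type, function name, and toy values below are hypothetical, and reading "log2(FC) ±2" as an inclusive cutoff on the absolute value (|log2(FC)| >= 2) is an assumption.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    """One row of a differential expression table (hypothetical layout)."""
    gene: str
    log2_fc: float
    padj: float


def classify_deg(result, lfc_cutoff=2.0, padj_cutoff=0.05):
    """Label a gene as up/downregulated or not significant.

    Assumes an inclusive |log2FC| >= cutoff and a strict padj < cutoff.
    A missing adjusted p-value encoded as NaN fails the padj test.
    """
    passes_padj = result.padj < padj_cutoff  # False for NaN
    passes_lfc = abs(result.log2_fc) >= lfc_cutoff
    if not (passes_padj and passes_lfc):
        return "not_significant"
    return "upregulated" if result.log2_fc > 0 else "downregulated"


# Toy values for illustration only (not study data).
results = [
    GeneResult("geneA", 3.1, 0.001),         # large increase, significant
    GeneResult("geneB", -2.4, 0.03),         # large decrease, significant
    GeneResult("geneC", 2.5, 0.20),          # fails adjusted p cutoff
    GeneResult("geneD", 0.8, 0.001),         # fails fold-change cutoff
    GeneResult("geneE", 4.0, float("nan")),  # adjusted p not available
]

counts = {}
for r in results:
    label = classify_deg(r)
    counts[label] = counts.get(label, 0) + 1

print(counts)  # {'upregulated': 1, 'downregulated': 1, 'not_significant': 3}
```

Tallying the labels in this way mirrors how the abstract reports upregulated and downregulated counts for each comparison.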


Subjects
Mustard Gas , Animals , Rabbits , Mustard Gas/toxicity , Protein Interaction Maps , RNA-Seq , Cornea , Gene Expression Profiling , Gene Expression , Computational Biology