Results 1 - 10 of 10
1.
J Comput Biol ; 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39117342

ABSTRACT

Recent technological advancements have enabled spatially resolved transcriptomic profiling, but at a multicellular resolution that is more cost-effective. The task of cell type deconvolution has been introduced to disentangle discrete cell types from such multicellular spots. However, existing benchmark datasets for cell type deconvolution are either generated by simulation or limited in scale, predominantly cover mouse data, and are not designed for human immuno-oncology. To overcome these limitations and promote comprehensive investigation of cell type deconvolution for human immuno-oncology, we introduce a large-scale spatial transcriptomic deconvolution benchmark dataset named SpatialCTD, encompassing 1.8 million cells and 12,900 pseudo spots from the human tumor microenvironment across the lung, kidney, and liver. In addition, SpatialCTD provides a more realistic reference than those generated from single-cell RNA sequencing (scRNA-seq) data for most reference-based deconvolution methods. To utilize the location-aware SpatialCTD reference, we propose a graph neural network-based deconvolution method (i.e., GNNDeconvolver). Extensive experiments show that GNNDeconvolver often outperforms existing state-of-the-art methods by a substantial margin, without requiring scRNA-seq data. To enable comprehensive evaluations of spatial transcriptomics data from flexible protocols, we provide an online tool capable of converting spatial transcriptomic data from various platforms (e.g., 10× Visium, MERFISH, and sci-Space) into pseudo spots, featuring adjustable spot size. The SpatialCTD dataset and GNNDeconvolver implementation are available at https://github.com/OmicsML/SpatialCTD, and the online converter tool can be accessed at https://omicsml.github.io/SpatialCTD/.
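
The conversion of single-cell coordinates into pseudo spots with an adjustable spot size, as described in the abstract above, can be illustrated with a minimal sketch. The abstract does not say how the SpatialCTD converter bins cells, so the square grid, the function name `make_pseudo_spots`, and the toy coordinates below are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def make_pseudo_spots(cells, spot_size):
    """Group cells into square pseudo spots and return cell type proportions.

    cells: iterable of (x, y, cell_type) tuples; x and y use the same
           length unit as spot_size.
    Returns a dict mapping (column, row) grid indices to
    {cell_type: fraction of cells in that spot}.
    """
    if spot_size <= 0:
        raise ValueError("spot_size must be positive")
    counts = defaultdict(Counter)
    for x, y, cell_type in cells:
        # Assumption: spots are axis-aligned squares tiling the slide.
        key = (int(x // spot_size), int(y // spot_size))
        counts[key][cell_type] += 1
    spots = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        spots[key] = {ct: n / total for ct, n in counter.items()}
    return spots


# Hypothetical toy input: four cells with made-up coordinates and labels.
cells = [
    (5.0, 5.0, "T cell"),
    (20.0, 30.0, "tumor"),
    (40.0, 10.0, "tumor"),
    (60.0, 60.0, "macrophage"),
]
spots = make_pseudo_spots(cells, spot_size=50.0)
```

The per-spot proportions play the role of ground truth that a deconvolution method would try to recover from the summed expression of each spot. Changing `spot_size` changes how many cells are mixed per spot.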

2.
Front Cardiovasc Med ; 11: 1414974, 2024.
Article in English | MEDLINE | ID: mdl-39055656

ABSTRACT

Background: Atrial fibrillation (AF) is a common persistent arrhythmia characterized by rapid and chaotic atrial electrical activity, potentially leading to severe complications such as thromboembolism, heart failure, and stroke, significantly affecting patient quality of life and safety. As the global population ages, the prevalence of AF is on the rise, placing considerable strains on individuals and healthcare systems. This study utilizes bioinformatics and Mendelian Randomization (MR) to analyze transcriptome data and genome-wide association study (GWAS) summary statistics, aiming to identify biomarkers causally associated with AF and explore their potential pathogenic pathways. Methods: We obtained AF microarray datasets GSE41177 and GSE79768 from the Gene Expression Omnibus (GEO) database, merged them, and corrected for batch effects to pinpoint differentially expressed genes (DEGs). We gathered exposure data from expression quantitative trait loci (eQTL) and outcome data from AF GWAS through the IEU Open GWAS database. We employed inverse variance weighting (IVW), MR-Egger, weighted median, and weighted mode approaches for MR analysis to assess exposure-outcome causality. IVW was the primary method, supplemented by the other techniques. The robustness of our results was evaluated using Cochran's Q test, MR-Egger intercept, MR-PRESSO, and leave-one-out sensitivity analysis. A Venn diagram visualized the overlap of DEGs with significant eQTL genes from MR analysis, referred to as common genes (CGs). Additional analyses, including Gene Ontology (GO) enrichment, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and immune cell infiltration studies, were conducted on these intersecting genes to reveal their roles in AF pathogenesis. Results: The combined dataset revealed 355 DEGs, of which 228 were significantly upregulated and 127 were downregulated.
MR analysis showed that the autocrine motility factor receptor (AMFR) [IVW: OR = 0.977; 95% CI, 0.956-0.998; P = 0.030], leucine aminopeptidase 3 (LAP3) [IVW: OR = 0.967; 95% CI, 0.934-0.997; P = 0.048], Rab acceptor 1 (RABAC1) [IVW: OR = 0.928; 95% CI, 0.875-0.985; P = 0.015], and tryptase beta 2 (TPSB2) [IVW: OR = 0.971; 95% CI, 0.943-0.999; P = 0.049] are associated with a reduced risk of AF. Conversely, GTPase-activating SH3 domain-binding protein 2 (G3BP2) [IVW: OR = 1.030; 95% CI, 1.004-1.056; P = 0.024], integrin subunit beta 2 (ITGB2) [IVW: OR = 1.050; 95% CI, 1.017-1.084; P = 0.003], glutaminyl-peptide cyclotransferase (QPCT) [IVW: OR = 1.080; 95% CI, 1.010-1.154], and tripartite motif containing 22 (TRIM22) [IVW: OR = 1.048; 95% CI, 1.003-1.095; P = 0.035] are positively associated with AF risk. Sensitivity analyses indicated a lack of heterogeneity or horizontal pleiotropy (P > 0.05), and leave-one-out analysis did not reveal any single nucleotide polymorphisms (SNPs) impacting the MR results significantly. GO and KEGG analyses showed that the CGs are significantly enriched in processes such as protein polyubiquitination, neutrophil degranulation, specific and tertiary granule formation, protein-macromolecule adaptor activity, molecular adaptor activity, and the SREBP signaling pathway. The analysis of immune cell infiltration demonstrated associations of the CGs with various immune cells, including plasma cells, CD8 T cells, resting memory CD4 T cells, regulatory T cells (Tregs), gamma delta T cells, activated NK cells, activated mast cells, and neutrophils. Conclusion: By integrating bioinformatics and MR approaches, genes such as AMFR, G3BP2, ITGB2, LAP3, QPCT, RABAC1, TPSB2, and TRIM22 are identified as causally linked to AF, enhancing our understanding of its molecular foundations.
This strategy may facilitate the development of more precise biomarkers and therapeutic targets for AF diagnosis and treatment.
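
The abstract above names inverse variance weighting (IVW) as the primary MR method and Cochran's Q as a heterogeneity check. The sketch below shows the standard fixed-effect form of both, computed from per-SNP summary statistics. It is not the authors' pipeline: the function name `ivw_estimate` and the three SNP effect sizes are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate with Cochran's Q from SNP summary statistics.

    Each SNP contributes a Wald ratio (outcome effect / exposure effect),
    weighted by the inverse of its first-order variance.
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviation of each ratio from the pooled estimate.
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    return beta, se, q


# Hypothetical summary statistics for three instruments (not from the study).
beta_exposure = [0.10, 0.20, 0.15]
beta_outcome = [-0.002, -0.005, -0.003]
se_outcome = [0.001, 0.002, 0.0015]

beta, se, q = ivw_estimate(beta_exposure, beta_outcome, se_outcome)

# Convert the log odds estimate into an odds ratio with a 95% confidence interval.
odds_ratio = math.exp(beta)
ci_low = math.exp(beta - 1.96 * se)
ci_high = math.exp(beta + 1.96 * se)
```

Under the usual assumptions, Q is compared against a chi-squared distribution with (number of SNPs - 1) degrees of freedom. A large Q suggests heterogeneity among the instruments.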

3.
Methods Mol Biol ; 2822: 293-309, 2024.
Article in English | MEDLINE | ID: mdl-38907925

ABSTRACT

Dynamic and reversible N6-methyladenosine (m6A) modifications are associated with many essential cellular functions as well as physiological and pathological phenomena. In-depth study of m6A co-functional patterns in epi-transcriptomic data may help to understand its complex regulatory mechanisms. In this chapter, we describe several biclustering mining algorithms for epi-transcriptomic data to discover potential co-functional patterns. The concepts and computational methods discussed in this chapter will be particularly useful for researchers working in related fields. We also aim to introduce new deep learning techniques into the field of co-functional analysis of epi-transcriptomic data.


Subjects
Adenosine , Algorithms , Computational Biology , Transcriptome , Adenosine/analogs & derivatives , Adenosine/metabolism , Computational Biology/methods , Humans , Cluster Analysis , Gene Expression Profiling/methods , Deep Learning , Genetic Epigenesis , Epigenomics/methods , Software
4.
Comput Biol Chem ; 112: 108127, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38870559

ABSTRACT

Spatial transcriptomics, a groundbreaking field in cellular biology, faces the challenge of effectively deciphering complex spatial-temporal gene expression patterns. Traditional data analysis methods often fail to capture the intricate nuances of this data, limiting the depth of understanding in spatial distribution and gene interactions. In response, we present Spatial-Temporal Patterns for Downstream Analysis (STPDA), a sophisticated computational framework tailored for spatial transcriptomic data analysis. STPDA leverages high-resolution mapping to bridge the gap between genomics and histopathology, offering a comprehensive perspective on the spatial dynamics of gene expression within tissues. This approach enables a view of cellular function and organization, marking a paradigm shift in our comprehension of biological systems. By employing Autoregressive Moving Average (ARMA) and Long Short-Term Memory (LSTM) models, STPDA effectively deciphers both global and local spatio-temporal dynamics in cellular environments. This integration of spatial-temporal patterns for downstream analysis offers a transformative approach to spatial transcriptomics data analysis. STPDA excels in various single-cell analytical tasks, including the identification of ligand-receptor interactions and cell type classification. Its ability to harness spatial-temporal patterns not only matches but frequently surpasses the performance of existing state-of-the-art methods. To ensure widespread usability and impact, we have encapsulated STPDA in a scalable and accessible Python package, addressing single-cell tasks through advanced spatial-temporal pattern analysis. This development promises to enhance our understanding of cellular biology, offering novel insights and therapeutic strategies, and represents a substantial advancement in the field of spatial transcriptomics.

5.
BMC Bioinformatics ; 25(1): 167, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38671342

ABSTRACT

BACKGROUND: Numerous transcriptomic-based models have been developed to predict or understand the fundamental mechanisms driving biological phenotypes. However, few models have successfully transitioned into clinical practice due to challenges associated with generalizability and interpretability. To address these issues, researchers have turned to dimensionality reduction methods and have begun implementing transfer learning approaches. METHODS: In this study, we aimed to determine the optimal combination of dimensionality reduction and regularization methods for predictive modeling. We applied seven dimensionality reduction methods to various datasets, including two supervised methods (linear optimal low-rank projection and low-rank canonical correlation analysis), two unsupervised methods [principal component analysis and consensus independent component analysis (c-ICA)], and three methods [autoencoder (AE), adversarial variational autoencoder, and c-ICA] within a transfer learning framework, trained on > 140,000 transcriptomic profiles. To assess the performance of the different combinations, we used a cross-validation setup encapsulated within a permutation testing framework, analyzing 30 different transcriptomic datasets with binary phenotypes. Furthermore, we included datasets with small sample sizes and phenotypes of varying degrees of predictability, and we employed independent datasets for validation. RESULTS: Our findings revealed that regularized models without dimensionality reduction achieved the highest predictive performance, challenging the necessity of dimensionality reduction when the primary goal is to achieve optimal predictive performance. However, models using AE and c-ICA with transfer learning for dimensionality reduction showed comparable performance, with enhanced interpretability and robustness of predictors, compared to models using non-dimensionality-reduced data. 
CONCLUSION: These findings offer valuable insights into the optimal combination of strategies for enhancing the predictive performance, interpretability, and generalizability of transcriptomic-based models.
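
The evaluation design described in the abstract above, cross-validation wrapped in a permutation test, can be sketched as follows. This is only an illustration of the general idea: a nearest-centroid classifier stands in for the regularized models of the study, leave-one-out replaces whatever fold scheme the authors used, and the two-feature toy data are invented.

```python
import random


def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to `sample`."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(xs, ys):
    """Leave-one-out cross-validated accuracy."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)


def permutation_p_value(xs, ys, n_permutations=200, seed=0):
    """Compare the observed CV accuracy with accuracies on shuffled labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(xs, ys)
    at_least_as_good = 0
    for _ in range(n_permutations):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        if loo_accuracy(xs, shuffled) >= observed:
            at_least_as_good += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Hypothetical binary phenotype with two well-separated groups of samples.
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.2], [0.2, 0.3],
      [2.0, 2.1], [2.2, 1.9], [1.9, 2.3], [2.1, 2.0], [2.3, 2.2], [2.0, 1.8]]
ys = [0] * 6 + [1] * 6

observed, p_value = permutation_p_value(xs, ys)
```

Because the whole cross-validation loop is rerun for every shuffled labelling, the resulting P value reflects how often a model could reach the observed accuracy by chance on data of the same size and structure.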


Subjects
Phenotype , Transcriptome , Transcriptome/genetics , Humans , Gene Expression Profiling/methods , Machine Learning , Computational Biology/methods , Algorithms , Principal Component Analysis
6.
Front Immunol ; 15: 1270401, 2024.
Article in English | MEDLINE | ID: mdl-38464525

ABSTRACT

Background: The co-occurrence of primary biliary cholangitis (PBC) and systemic lupus erythematosus (SLE) has been consistently reported in observational studies. Nevertheless, the underlying causal relationship between these two conditions still needs to be established. Methods: We performed a bidirectional two-sample Mendelian randomization (MR) study to assess their causal association. Five MR analysis methods were utilized for causal inference, with inverse-variance weighted (IVW) selected as the primary method. The Mendelian Randomization Pleiotropy RESidual Sum and Outlier (MR-PRESSO) and the IVW Radial method were applied to exclude outlying SNPs. To assess the robustness of the MR results, five sensitivity analyses were carried out. Multivariable MR (MVMR) analysis was also employed to evaluate the effect of possible confounders. In addition, we integrated transcriptomic data from PBC and SLE, employing Weighted Gene Co-expression Network Analysis (WGCNA) to explore shared genes between the two diseases. Then, we performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses on the shared genes. The Least Absolute Shrinkage and Selection Operator (LASSO) regression algorithm was utilized to identify potential shared diagnostic genes. Finally, we verified the potential shared diagnostic genes in specific peripheral blood mononuclear cell (PBMC) populations of SLE patients by single-cell analysis. Results: Our MR study provided evidence that PBC had a causal relationship with SLE (IVW, OR: 1.347, 95% CI: 1.276 - 1.422, P < 0.001) after removing outliers (MR-PRESSO, rs35464393, rs3771317; IVW Radial, rs11065987, rs12924729, rs3745516). Conversely, SLE also had a causal association with PBC (IVW, OR: 1.225, 95% CI: 1.141 - 1.315, P < 0.001) after outlier correction (MR-PRESSO, rs11065987, rs3763295, rs7774434; IVW Radial, rs2297067). Sensitivity analyses confirmed the robustness of the MR findings.
MVMR analysis indicated that body mass index (BMI), smoking, and drinking were not confounding factors. Moreover, bioinformatic analysis identified PARP9, ABCA1, CEACAM1, and DDX60L as promising diagnostic biomarkers for PBC and SLE. These four genes are highly expressed in CD14+ monocytes in PBMCs of SLE patients and are potentially associated with innate immune responses and immune activation. Conclusion: Our study confirmed the bidirectional causal relationship between PBC and SLE and identified PARP9, ABCA1, CEACAM1, and DDX60L as the most promising shared diagnostic genes between the two diseases, providing insights for the exploration of the underlying mechanisms of these disorders.
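
A reported odds ratio, its 95% confidence interval, and its P value are linked through the standard error on the log scale, so the three can be cross-checked. The sketch below recovers the standard error, z statistic, and two-sided normal P value from the OR and CI quoted in the Results above. The function name `z_and_p_from_or_ci` is hypothetical, and the check assumes a Wald-type interval that is symmetric on the log scale.

```python
import math


def z_and_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE, z, and a two-sided P value from an OR and its 95% CI.

    Assumes the interval was built as exp(log(OR) +/- z_crit * SE).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided P value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values quoted in the abstract: PBC -> SLE and SLE -> PBC (IVW).
se_fwd, z_fwd, p_fwd = z_and_p_from_or_ci(1.347, 1.276, 1.422)
se_rev, z_rev, p_rev = z_and_p_from_or_ci(1.225, 1.141, 1.315)
```

Both directions give z statistics well above 3.29, which agrees with the reported P < 0.001. The same check applied to any other OR and CI pair flags entries whose interval and P value cannot both be right.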


Subjects
Biliary Liver Cirrhosis , Systemic Lupus Erythematosus , Humans , Mononuclear Leukocytes , Biliary Liver Cirrhosis/diagnosis , Biliary Liver Cirrhosis/genetics , Mendelian Randomization Analysis , Gene Expression Profiling , CEACAM1 Protein , Systemic Lupus Erythematosus/diagnosis , Systemic Lupus Erythematosus/genetics
7.
Front Genet ; 15: 1270387, 2024.
Article in English | MEDLINE | ID: mdl-38348453

ABSTRACT

Preserving data privacy is an important concern in the research use of patient data. The DataSHIELD suite enables privacy-aware advanced statistical analysis in a federated setting. Despite its many applications, it has a few open practical issues: the complexity of hosting a federated infrastructure, the performance penalty imposed by the privacy-preserving constraints, and the ease of use by non-technical users. In this work, we describe a case study in which we review different breast cancer classifiers and report our findings about the limits and advantages of such a non-disclosive suite of tools in a realistic setting. Five independent gene expression datasets of breast cancer survival were downloaded from Gene Expression Omnibus (GEO) and pooled together through the federated infrastructure. Three previously published and two newly proposed 5-year cancer-free survival risk score classifiers were trained in a federated environment, and an additional reference classifier was trained with unconstrained data access. The performance of these six classifiers was systematically evaluated, and the results show that i) the published classifiers do not generalize well when applied to patient cohorts that differ from those used to develop them; ii) among the methods we tried, classification using logistic regression performed best on average, closely followed by random forest; iii) the unconstrained version of the logistic regression classifier outperformed the federated version by 4% on average. Reproducibility of our experiments is ensured through the use of VisualSHIELD, an open-source tool that augments DataSHIELD with new functions, a standardized deployment procedure, and a simple graphical user interface.
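
The non-disclosive, federated style of analysis described above can be illustrated with a small sketch in which each site releases only aggregate statistics and the analyst pools them. This does not use or reproduce the DataSHIELD API. The function names, the `MIN_COUNT` threshold, and the site values are hypothetical and serve only to show the principle.

```python
import math

MIN_COUNT = 3  # hypothetical disclosure threshold, not a DataSHIELD setting


def site_summary(values, min_count=MIN_COUNT):
    """Run at a data-holding site: return aggregates only, never raw records."""
    if len(values) < min_count:
        raise ValueError("too few records to release a summary")
    return {
        "n": len(values),
        "sum": sum(values),
        "sum_sq": sum(v * v for v in values),
    }


def pooled_mean_and_sd(summaries):
    """Run by the analyst: combine per-site aggregates into pooled statistics."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["sum"] for s in summaries)
    total_sq = sum(s["sum_sq"] for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean ** 2) / (n - 1)
    return mean, math.sqrt(max(variance, 0.0))


# Hypothetical expression values held at two separate sites.
site_a = [1.0, 2.0, 3.0]
site_b = [4.0, 5.0, 6.0, 7.0]

mean, sd = pooled_mean_and_sd([site_summary(site_a), site_summary(site_b)])
```

The pooled mean and standard deviation are identical to what a centralized analysis would give, yet no individual record leaves its site. More complex models need iterative exchanges of such aggregates, which is one source of the overhead the abstract mentions.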

8.
BMC Bioinformatics ; 25(1): 53, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38302900

ABSTRACT

BACKGROUND: Non-coding RNAs represent a large part of the human transcriptome and have been shown to play an important role in diseases such as cancer. However, their biological functions are still incompletely understood. Among non-coding RNAs, circular RNAs (circRNAs) have recently been identified for their microRNA (miRNA) sponge function, which allows them to modulate the expression of miRNA target genes by taking on the role of competitive endogenous RNAs (ce-circRNAs). Today, most computational tools are either not adapted to the search for ce-circRNAs or not designed to search for them in a user's own transcriptomic data. RESULTS: In this study, we present Cirscan (CIRcular RNA Sponge CANdidates), an interactive Shiny application that automatically infers circRNA-miRNA-mRNA networks from human multi-level transcript expression data from two biological conditions (e.g. tumor versus normal conditions in the case of a cancer study) in order to identify, on a large scale, potential sponge mechanisms active in a specific condition. Cirscan ranks each circRNA-miRNA-mRNA subnetwork according to a sponge score that integrates multiple criteria based on interaction reliability and expression level. Finally, the top-ranked sponge mechanisms can be visualized as networks, and an enrichment analysis is performed to help their biological interpretation. We showed on two real case studies that Cirscan is capable of retrieving previously described sponge mechanisms, as well as identifying potential novel circRNA sponge candidates. CONCLUSIONS: Cirscan can be considered a companion tool for biologists, helping them prioritize sponge mechanisms for experimental validation and identify potential therapeutic targets. Cirscan is implemented in R, released under the GPL-3 license, and accessible on GitLab ( https://gitlab.com/geobioinfo/cirscan_Rshiny ).
The scripts used in this paper are also provided on GitLab ( https://gitlab.com/geobioinfo/cirscan_paper ).


Subjects
MicroRNAs , Neoplasms , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Circular RNA/genetics , Messenger RNA/genetics , Messenger RNA/metabolism , Reproducibility of Results , Gene Regulatory Networks
9.
Int J Mol Sci ; 25(2), 2024 Jan 21.
Article in English | MEDLINE | ID: mdl-38279299

ABSTRACT

Parkinson's disease (PD) is a prevalent neurodegenerative disorder characterized by the progressive degeneration of dopaminergic neurons in the substantia nigra region of the brain. The hallmark pathological feature of PD is the accumulation of misfolded proteins, leading to the formation of intracellular aggregates known as Lewy bodies. Recent data show that disruptions in protein synthesis, folding, and degradation are commonly observed in PD and may provide information on the molecular background of its etiopathogenesis. In the present study, we used a publicly available transcriptomic microarray dataset of peripheral blood from PD patients and healthy controls (GSE6613) to investigate the potential dysregulation of elements involved in proteostasis-related processes at the transcriptomic level. Our bioinformatics analysis revealed 375 differentially expressed genes (DEGs), of which 281 were down-regulated and 94 were up-regulated. Network analysis performed on the observed DEGs highlighted a cluster of 36 elements mainly involved in protein synthesis. The enriched ontologies were related to translation initiation and regulation, ribosome structure, and nuclear export of ribosome components. Overall, these data consistently point to a generalized impairment of the translational machinery and proteostasis. Dysregulation of these mechanisms has been associated with PD pathogenesis. Understanding the precise regulation of such processes may shed light on the molecular mechanisms of PD and provide potential data for early diagnosis.


Subjects
Parkinson Disease , Humans , Parkinson Disease/metabolism , Transcriptome , Lewy Bodies/metabolism , Gene Expression Profiling , Protein Biosynthesis , Substantia Nigra/metabolism
10.
BMC Bioinformatics ; 24(1): 492, 2023 Dec 21.
Article in English | MEDLINE | ID: mdl-38129786

ABSTRACT

BACKGROUND: Flux Balance Analysis (FBA) is a key metabolic modeling method used to simulate cellular metabolism under steady-state conditions. Its simplicity and versatility have led to various strategies incorporating transcriptomic and proteomic data into FBA, successfully predicting flux distribution and phenotypic results. However, despite these advances, untapped potential lies in leveraging gene-related connections such as co-expression patterns for valuable insights. RESULTS: To fill this gap, we introduce ICON-GEMs, an innovative constraint-based model that incorporates a gene co-expression network into the FBA model, facilitating more precise determination of flux distributions and functional pathways. In this study, transcriptomic data from both Escherichia coli and Saccharomyces cerevisiae were integrated into their respective genome-scale metabolic models. A comprehensive gene co-expression network was constructed as a global view of the metabolic mechanisms of the cell. By leveraging quadratic programming, we maximized the alignment between pairs of reaction fluxes and the correlation of their corresponding genes in the co-expression network. The outcomes demonstrated that ICON-GEMs outperformed existing methodologies in predictive accuracy. Flux variabilities over subsystems and functional modules also show promising results. Furthermore, a comparison involving different types of biological networks, including protein-protein interactions and random networks, reveals insights into the utilization of the co-expression network in genome-scale metabolic engineering. CONCLUSION: ICON-GEMs is an innovative constraint-based model capable of integrating gene co-expression networks, ready for broad application across diverse transcriptomic data sets and multiple organisms. It is freely available as open-source at https://github.com/ThummaratPaklao/ICOM-GEMs.git .
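
FBA rests on the steady-state condition S·v = 0, where S is the stoichiometric matrix and v the vector of reaction fluxes, together with lower and upper bounds on each flux. The sketch below checks that condition for a candidate flux vector on a hypothetical three-reaction toy network. It does not reproduce the ICON-GEMs quadratic program, whose objective is not specified in the abstract, and the helper names are illustrative.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def is_feasible(stoich, fluxes, lower, upper, tol=1e-9):
    """Check the steady-state condition S.v = 0 and the flux bounds."""
    balanced = all(abs(r) <= tol for r in mat_vec(stoich, fluxes))
    bounded = all(lo - tol <= v <= hi + tol
                  for v, lo, hi in zip(fluxes, lower, upper))
    return balanced and bounded


# Hypothetical toy network: uptake -> A, A -> B, B -> export.
# Rows are metabolites (A, B); columns are reactions (v1, v2, v3).
stoich = [
    [1, -1, 0],   # A is produced by v1 and consumed by v2
    [0, 1, -1],   # B is produced by v2 and consumed by v3
]
lower = [0.0, 0.0, 0.0]
upper = [10.0, 10.0, 10.0]

steady = is_feasible(stoich, [2.0, 2.0, 2.0], lower, upper)
accumulating = is_feasible(stoich, [2.0, 1.0, 1.0], lower, upper)
```

In a full FBA problem, an optimizer searches the set of flux vectors that pass this check for one that maximizes an objective such as biomass production. Methods that integrate expression data add further terms or constraints on top of the same feasible set.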


Subjects
Proteomics , Systems Biology , Genome , Metabolic Engineering , Gene Expression Profiling , Escherichia coli/genetics , Escherichia coli/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Biological Models , Metabolic Networks and Pathways/genetics , Metabolic Flux Analysis/methods