Results 1 - 20 of 105
1.
Biology (Basel) ; 13(10)2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39452096

ABSTRACT

Bipolar disorder (BPD) is a serious psychiatric condition that is characterized by the frequent shifting of mood patterns, ranging from manic to depressive episodes. Although there are already treatment strategies that aim at regulating the manifestations of this disorder, its etiology remains unclear and continues to be a question of interest within the scientific community. The development of RNA sequencing techniques has provided newer and better approaches to studying disorders at the transcriptomic level. Hence, using RNA-seq data, we employed intramodular connectivity analysis and network pharmacology assessment of disease-associated variants to elucidate the biological pathways underlying the complex nature of BPD. This study was intended to characterize the expression profiles obtained from three regions in the brain, which are the nucleus accumbens (nAcc), the anterior cingulate cortex (AnCg), and the dorsolateral prefrontal cortex (DLPFC), provide insights into the specific roles of these regions in the onset of the disorder, and present potential targets for drug design and development. The nAcc was found to be highly associated with genes responsible for the deregulated transcription of neurotransmitters, while the DLPFC was greatly correlated with genes involved in the impairment of components crucial in neurotransmission. The AnCg did show association with some of the expressions, but the relationship was not as strong as the other two regions. Furthermore, disease-associated variants or single nucleotide polymorphisms (SNPs) were identified among the significant genes in BPD, which suggests the genetic interrelatedness of such a disorder and other mental illnesses. 
DRD2, GFRA2, and DCBLD1 were the genes with disease-associated variants expressed in the nAcc; ST8SIA2 and ADAMTS16 were the genes with disease-associated variants expressed in the AnCg; and FOXO3, ITGA9, CUBN, PLCB4, and RORB were the genes with disease-associated variants expressed in the DLPFC. Aside from unraveling the molecular and cellular mechanisms behind the expression of BPD, this investigation was envisioned to propose a new research pipeline in studying the transcriptome of psychiatric disorders to support and improve existing studies.

2.
Brief Bioinform ; 25(6)2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39322626

ABSTRACT

RNA sequencing is the gold-standard method to quantify transcriptomic changes between two conditions. The overwhelming majority of data analysis methods available are focused on polyadenylated RNA transcribed from single-copy genes and overlook transcripts from repeated sequences such as transposable elements (TEs). These self-autonomous genetic elements are increasingly studied, and specialized tools designed to handle multimapping sequencing reads are available. Transfer RNAs are transcribed by RNA polymerase III and are essential for protein translation. There is a need for integrated software that is able to analyze multiple types of RNA. Here, we present 3t-seq, a Snakemake pipeline for integrated differential expression analysis of transcripts from single-copy genes, TEs, and tRNA. 3t-seq produces an accessible report and easy-to-use results for downstream analysis starting from raw sequencing data and performing quality control, genome mapping, gene expression quantification, and statistical testing. It implements three methods to quantify TEs expression and one for tRNA genes. It provides an easy-to-configure method to manage software dependencies that lets the user focus on results. 3t-seq is released under MIT license and is available at https://github.com/boulardlab/3t-seq.


Subjects
DNA Transposable Elements, Transfer RNA, RNA-Seq, Software, Transfer RNA/genetics, RNA-Seq/methods, Gene Expression Profiling/methods, Humans, Computational Biology/methods, RNA Sequence Analysis/methods
3.
BMC Genomics ; 25(1): 875, 2024 Sep 18.
Article in English | MEDLINE | ID: mdl-39294558

ABSTRACT

BACKGROUND: The widely adopted bulk RNA-seq measures the gene expression average of cells, masking cell type heterogeneity, which confounds downstream analyses. Therefore, identifying the cellular composition and cell type-specific gene expression profiles (GEPs) facilitates the study of the underlying mechanisms of various biological processes. Although single-cell RNA-seq focuses on cell type heterogeneity in gene expression, it requires specialized and expensive resources and currently is not practical for a large number of samples or a routine clinical setting. Recently, computational deconvolution methodologies have been developed, while many of them only estimate cell type composition or cell type-specific GEPs by requiring the other as input. The development of more accurate deconvolution methods to infer cell type abundance and cell type-specific GEPs is still essential. RESULTS: We propose a new deconvolution algorithm, DSSC, which infers cell type-specific gene expression and cell type proportions of heterogeneous samples simultaneously by leveraging gene-gene and sample-sample similarities in bulk expression and single-cell RNA-seq data. Through comparisons with the other existing methods, we demonstrate that DSSC is effective in inferring both cell type proportions and cell type-specific GEPs across simulated pseudo-bulk data (including intra-dataset and inter-dataset simulations) and experimental bulk data (including mixture data and real experimental data). DSSC shows robustness to the change of marker gene number and sample size and also has cost and time efficiencies. CONCLUSIONS: DSSC provides a practical and promising alternative to the experimental techniques to characterize cellular composition and heterogeneity in the gene expression of heterogeneous samples.
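The general idea of reference-based deconvolution described above can be illustrated with a small sketch. This is not the DSSC algorithm, which additionally exploits gene-gene and sample-sample similarities and infers cell type-specific GEPs at the same time. The sketch assumes a known signature matrix (all values are made up for illustration) and estimates non-negative cell type proportions with multiplicative updates; the function name `deconvolve` is hypothetical.

```python
# Minimal sketch of reference-based bulk deconvolution (NOT the DSSC algorithm).
# Model: bulk[g] ~ sum_k signature[g][k] * proportion[k], with proportion >= 0.
# All numbers below are invented for illustration only.

def deconvolve(bulk, signature, n_iter=500):
    """Estimate non-negative cell-type proportions by multiplicative updates."""
    n_genes = len(signature)
    n_types = len(signature[0])
    p = [1.0 / n_types] * n_types  # start from uniform proportions
    for _ in range(n_iter):
        # Reconstruct the bulk profile from the current proportions.
        fitted = [sum(signature[g][k] * p[k] for k in range(n_types))
                  for g in range(n_genes)]
        for k in range(n_types):
            num = sum(signature[g][k] * bulk[g] for g in range(n_genes))
            den = sum(signature[g][k] * fitted[g] for g in range(n_genes))
            if den > 0:
                p[k] *= num / den  # multiplicative step keeps p non-negative
    total = sum(p)
    return [v / total for v in p]  # rescale so the proportions sum to 1


# Toy signature: 4 genes x 2 cell types; bulk built from proportions 0.3 / 0.7.
toy_signature = [[10, 1], [8, 2], [1, 9], [2, 7]]
toy_bulk = [3.7, 3.8, 6.6, 5.5]
print([round(v, 3) for v in deconvolve(toy_bulk, toy_signature)])
```

On this noise-free toy input the estimate should recover the proportions used to build the bulk profile; real data would also need normalization and marker gene selection, which the sketch leaves out.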


Subjects
Algorithms, Gene Expression Profiling, RNA-Seq, Single-Cell Analysis, Single-Cell Analysis/methods, Humans, RNA-Seq/methods, Gene Expression Profiling/methods, RNA Sequence Analysis/methods, Computational Biology/methods, Transcriptome, Single-Cell Gene Expression Analysis
4.
Plants (Basel) ; 13(13)2024 Jul 07.
Article in English | MEDLINE | ID: mdl-38999718

ABSTRACT

Heat shock proteins (HSPs) are molecular chaperones that play essential roles in plant development and in response to various environmental stresses. Understanding R. delavayi HSP genes is of great importance since R. delavayi is severely affected by heat stress. In the present study, a total of 76 RdHSP genes were identified in the R. delavayi genome and divided into five subfamilies based on molecular weight and domain composition. The chromosome distribution, gene structure, and conserved motifs of the RdHSP family genes were analyzed using bioinformatics methods. Gene duplication analysis showed that 15 and 8 RdHSP genes were obtained and retained from WGD/segmental duplication and tandem duplication, respectively. Cis-element analysis revealed the importance of RdHSP genes in plant adaptations to the environment. Moreover, the expression patterns of RdHSP family genes were investigated in R. delavayi treated with high temperature based on our RNA-seq data, and were further verified by qRT-PCR. Further analysis revealed that nine candidate genes, including six RdHSP20 subfamily genes (RdHSP20.4, RdHSP20.8, RdHSP20.6, RdHSP20.3, RdHSP20.10, and RdHSP20.15) and three RdHSP70 subfamily genes (RdHSP70.15, RdHSP70.21, and RdHSP70.16), might be involved in enhancing heat stress tolerance. Subcellular localization of two candidate RdHSPs (RdHSP20.8 and RdHSP20.6) showed that they were expressed and functioned in the chloroplast and nucleus, respectively. These results provide a basis for the functional characterization of HSP genes and investigations on the molecular mechanisms of heat stress response in R. delavayi.

5.
Methods Mol Biol ; 2812: 39-46, 2024.
Article in English | MEDLINE | ID: mdl-39068356

ABSTRACT

In this chapter, we outline an approach to analyzing metatranscriptomic data, focusing on the assessment of differential enzyme expression and metabolic pathway activities using a novel bioinformatics software tool, EMPathways2. The analysis pipeline commences with raw data originating from a sequencer and concludes with an output of enzyme expressions and an estimate of metabolic pathway activities. The initial step involves aligning specific transcriptomes assembled from RNA-Seq data using Bowtie2 and acquiring gene expression data with IsoEM2. Subsequently, the pipeline proceeds to quality assessment and preprocessing of the input data, ensuring accurate estimates of enzymes and their differential regulation. Upon completion of the preprocessing stage, EMPathways2 is employed to decipher the intricate relationships between genes, enzymes, and pathways. An online repository containing sample data has been made available, alongside custom Python scripts designed to modify the output of the programs within the pipeline for diverse downstream analyses. This chapter highlights the technical aspects and practical applications of using EMPathways2, which facilitates the advancement of transcriptome data analysis and contributes to a deeper understanding of the complex regulatory mechanisms underlying living systems.


Subjects
Computational Biology, Gene Expression Profiling, Metabolic Networks and Pathways, RNA-Seq, Software, RNA-Seq/methods, Metabolic Networks and Pathways/genetics, Computational Biology/methods, Gene Expression Profiling/methods, Transcriptome, Humans, RNA Sequence Analysis/methods
6.
BMC Genomics ; 25(1): 631, 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38914930

ABSTRACT

BACKGROUND: Current RNA-seq analysis software tends to use similar parameters across different species without considering species-specific differences. However, the suitability and accuracy of these tools may vary when analyzing data from different species, such as humans, animals, plants, fungi, and bacteria. For most laboratory researchers lacking a background in information science, determining how to construct an analysis workflow that meets their specific needs from the array of complex analytical tools available poses a significant challenge. RESULTS: Using RNA-seq data from plants, animals, and fungi, we observed that different analytical tools vary in performance when applied to different species. A comprehensive experiment was conducted specifically for analyzing plant pathogenic fungal data, with differential gene analysis as the ultimate goal. In this study, 288 pipelines using different tools were applied to analyze five fungal RNA-seq datasets, and the performance of their results was evaluated based on simulation. This led to the establishment of a relatively universal and superior fungal RNA-seq analysis pipeline that can serve as a reference, along with certain standards for selecting analysis tools. Additionally, we compared various tools for alternative splicing analysis. The results based on simulated data indicated that rMATS remained the optimal choice, although supplementing it with tools such as SpliceWiz could be considered. CONCLUSION: The experimental results demonstrate that, compared with default software parameter configurations, tuned analysis combinations can provide more accurate biological insights. It is beneficial to carefully select suitable analysis software based on the data, rather than indiscriminately choosing tools, in order to achieve high-quality analysis results more efficiently.


Subjects
RNA-Seq, Software, Workflow, RNA-Seq/methods, Fungi/genetics, Computational Biology/methods, RNA Sequence Analysis/methods, Alternative Splicing
7.
Endocrine ; 86(1): 255-267, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38676768

ABSTRACT

PURPOSE: To perform an extensive exploratory analysis to build a deeper insight into clinically relevant molecular biomarkers in Papillary, Follicular, and Anaplastic thyroid carcinomas (PTC, FTC, ATC). METHODS: Thirteen Thyroid Cancer (THCA) datasets incorporating PTC, FTC, and ATC were derived from the Gene Expression Omnibus. Genes differentially expressed (DEGs) between THCA and normal were identified and subjected to GO and KEGG analyses. Multiple topological properties were harnessed and protein-protein interaction (PPI) networks were constructed to identify the hub genes followed by survival analysis and validation. RESULTS: There were 70, 87, and 377 DEGs, and 23, 27, and 53 hub genes for PTC, FTC, and ATC samples, respectively. Survival analysis detected 39 overall and 49 relapse-free survival-relevant hub genes. Six hub genes, BCL2, FN1, ITPR1, LYVE1, NTRK2, TBC1D4, were found common to more than one THCA type. The most significant hub genes found in the study were: BCL2, CD44, DCN, FN1, IRS1, ITPR1, MFAP4, MKI67, NTRK2, PCLO, TGFA. The most enriched and significant GO terms were Melanocyte differentiation for PTC, Extracellular region for FTC, and Extracellular exosome for ATC. Prostate cancer for PTC was the most significantly enriched KEGG pathway. The results were validated using TCGA data. CONCLUSIONS: The findings unravel potential biomarkers and therapeutic targets of thyroid carcinomas.
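Hub gene identification from a PPI network, as mentioned in the methods above, can be illustrated with the simplest topological property, node degree. The abstract does not say which topological measures were used, so degree is an assumption here, and the node names (`P1` to `P5`) and the function `degree_hubs` are placeholders rather than genes or code from the study.

```python
from collections import defaultdict

# Minimal sketch: rank nodes of an undirected interaction network by degree.
# Degree is only one of many topological measures; its use here is an assumption.

def degree_hubs(edges, top_n=2):
    """Return the top_n (node, degree) pairs, ties broken alphabetically."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    ranked = sorted(neighbors, key=lambda n: (-len(neighbors[n]), n))
    return [(n, len(neighbors[n])) for n in ranked[:top_n]]


# Placeholder interaction list, not data from the study.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
print(degree_hubs(toy_edges))  # [('P1', 3), ('P2', 2)]
```

In practice several measures (for example betweenness or closeness) are often combined, and only nodes ranked highly by more than one are kept as hubs.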


Subjects
Follicular Adenocarcinoma, Computational Biology, Anaplastic Thyroid Carcinoma, Thyroid Neoplasms, Humans, Thyroid Neoplasms/genetics, Thyroid Neoplasms/pathology, Anaplastic Thyroid Carcinoma/genetics, Anaplastic Thyroid Carcinoma/pathology, Follicular Adenocarcinoma/genetics, Follicular Adenocarcinoma/pathology, Papillary Thyroid Cancer/genetics, Papillary Thyroid Cancer/pathology, Protein Interaction Maps, Tumor Biomarkers/genetics, Gene Expression Profiling, Neoplastic Gene Expression Regulation, Gene Regulatory Networks
8.
Int J Mol Sci ; 25(7)2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38612783

ABSTRACT

Although the pathogenesis of solar lentigo (SL) involves chronic ultraviolet (UV) exposure, cellular senescence, and upregulated melanogenesis, underlying molecular-level mechanisms associated with SL remain unclear. The aim of this study was to investigate the gene regulatory mechanisms intimately linked to inflammation in SL. Skin samples from patients with SL with or without histological inflammatory features were obtained. RNA-seq data from the samples were analyzed via multiple analysis approaches, including exploration of core inflammatory gene alterations, identifying functional pathways at both transcription and protein levels, comparison of inflammatory module (gene clusters) activation levels, and analyzing correlations between modules. These analyses disclosed specific core genes implicated in oxidative stress, especially the upregulation of nuclear factor kappa B in the inflammatory SLs, while genes associated with protective mechanisms, such as SLC6A9, were highly expressed in the non-inflammatory SLs. For inflammatory modules, Extracellular Immunity and Mitochondrial Innate Immunity were exclusively upregulated in the inflammatory SL. Analysis of protein-protein interactions revealed the significance of CXCR3 upregulation in the pathogenesis of inflammatory SL. In conclusion, the upregulation of stress response-associated genes and inflammatory pathways in response to UV-induced oxidative stress implies their involvement in the pathogenesis of inflammatory SL.


Subjects
Lentigo, Multigene Family, Humans, Inflammation/genetics, Cellular Senescence, Innate Immunity, Lentigo/genetics
9.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38600665

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) facilitates the study of cell type heterogeneity and the construction of cell atlases. However, due to its limitations, many genes may be detected with zero expression (dropout events), which biases downstream analyses and hinders the identification and characterization of cell types and cell functions. Although many imputation methods have been developed, their performance is generally lower than expected across different kinds and dimensions of data and application scenarios. Therefore, developing an accurate and robust single-cell gene expression data imputation method is still essential. To maintain the original cell-cell and gene-gene correlations and to leverage bulk RNA sequencing (bulk RNA-seq) information, we propose scINRB, a single-cell gene expression imputation method with network regularization and bulk RNA-seq data. scINRB adopts network-regularized non-negative matrix factorization to ensure that the imputed data maintain the cell-cell and gene-gene similarities and also approach the average gene expression calculated from bulk RNA-seq data. To evaluate its performance, we test scINRB on simulated and experimental datasets and compare it with other commonly used imputation methods. The results show that scINRB recovers gene expression accurately even at high dropout rates and dimensions, preserves cell-cell and gene-gene similarities, and improves various downstream analyses including visualization, clustering and trajectory inference.


Subjects
Algorithms, Single-Cell Analysis, RNA-Seq, Single-Cell Analysis/methods, RNA Sequence Analysis/methods, Cluster Analysis, Gene Expression, Gene Expression Profiling, Software
10.
Front Microbiol ; 15: 1342328, 2024.
Article in English | MEDLINE | ID: mdl-38655085

ABSTRACT

Introduction: Our study undertakes a detailed exploration of gene expression dynamics within human lung organ tissue equivalents (OTEs) in response to Influenza A virus (IAV), Human metapneumovirus (MPV), and Parainfluenza virus type 3 (PIV3) infections. Through the analysis of RNA-Seq data from 19,671 genes, we aim to identify differentially expressed genes under various infection conditions, elucidating the complexities of virus-host interactions. Methods: We employ Generalized Linear Models (GLMs) with Quasi-Likelihood (QL) F-tests (GLMQL) and introduce the novel Magnitude-Altitude Score (MAS) and Relaxed Magnitude-Altitude Score (RMAS) algorithms to navigate the intricate landscape of RNA-Seq data. This approach facilitates the precise identification of potential biomarkers, highlighting the host's reliance on innate immune mechanisms. Our comprehensive methodological framework includes RNA extraction, library preparation, sequencing, and Gene Ontology (GO) enrichment analysis to interpret the biological significance of our findings. Results: The differential expression analysis unveils significant changes in gene expression triggered by IAV, MPV, and PIV3 infections. The MAS and RMAS algorithms enable focused identification of biomarkers, revealing a consistent activation of interferon-stimulated genes (e.g., IFIT1, IFIT2, IFIT3, OAS1) across all viruses. Our GO analysis provides deep insights into the host's defense mechanisms and viral strategies exploiting host cellular functions. Notably, changes in cellular structures, such as cilium assembly and mitochondrial ribosome assembly, indicate a strategic shift in cellular priorities. The precision of our methodology is validated by a 92% mean accuracy in classifying respiratory virus infections using multinomial logistic regression, demonstrating the superior efficacy of our approach over traditional methods. 
Discussion: This study highlights the intricate interplay between viral infections and host gene expression, underscoring the need for targeted therapeutic interventions. The stability and reliability of the MAS/RMAS ranking method, even under stringent statistical corrections, and the critical importance of adequate sample size for biomarker reliability are significant findings. Our comprehensive analysis not only advances our understanding of the host's response to viral infections but also sets a new benchmark for the identification of biomarkers, paving the way for the development of effective diagnostic and therapeutic strategies.

11.
Inflamm Regen ; 44(1): 10, 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38475915

ABSTRACT

Inflammatory responses are known to suppress neural regeneration in patients receiving stem cell-based regenerative therapy for spinal cord injury (SCI). Consequently, pathways involved in neurogenesis and immunomodulation, such as the hepatocyte growth factor (HGF)/MET signaling cascade, have garnered significant attention. Notably, various studies, including our own, have highlighted the enhanced recovery of locomotor functions achieved in SCI animal models by combining HGF pretreatment and human induced pluripotent stem cell-derived neural stem/progenitor cell (hiPSC-NS/PC) transplantation. However, these studies implicitly hypothesized that the functionality of HGF in SCI would be time-consistent and did not elucidate its dynamics. In the present article, we investigated the time course of the effect of HGF on SCI, aiming to uncover a more precise mechanism for HGF administration, which is indispensable for crystallizing protocols for combination therapy. To this end, we performed a detailed investigation of the temporal variation of HGF using the RNA-seq data we obtained in our most recent study. Leveraging the time-series design of the data, which we did not fully exploit previously, we identified three components in the effects of HGF that operate at different times: early effects, continuous effects, and delayed effects. Our findings suggest a concept in which the three components together contribute to the acceleration of neurogenesis and immunomodulation, which reinforces the legitimacy of empirically fine-tuned protocols for HGF administration and advocates the novel possibility that the time-inconsistent effects of HGF progressively augment the efficacy of combined therapy.

12.
Heliyon ; 10(5): e27132, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38449649

ABSTRACT

In Catharanthus roseus, vital plant hormones, namely methyl jasmonate (MeJA) and ethylene, serve as abiotic triggers, playing a crucial role in stimulating the production of specific secondary compounds with anticancer properties. Understanding how plants react to various stresses, stimuli, and the pathways involved in biosynthesis holds significant promise. The application of stressors like ethylene and MeJA induces the plant's defense mechanisms, leading to increased secondary metabolite production. To delve into the essential transcriptomic processes linked to hormonal responses, this study employed an integrated approach combining RNA-Seq data meta-analysis and system biology methodologies. Furthermore, the validity of the meta-analysis findings was confirmed using RT-qPCR. Within the meta-analysis, 903 genes exhibited differential expression (DEGs) when comparing normal conditions to those of the treatment. Subsequent analysis, encompassing gene ontology, KEGG, TF, and motifs, revealed that these DEGs were actively engaged in multiple biological processes, particularly in responding to various stresses and stimuli. Additionally, these genes were notably enriched in diverse biosynthetic pathways, including those related to TIAs, housing valuable medicinal compounds found in this plant. Furthermore, by conducting co-expression network analysis, we identified hub genes within modules associated with stress response and the production of TIAs. Most genes linked to the biosynthesis pathway of TIAs clustered within three specific modules. Noteworthy hub genes, including Helicase ATP-binding domain, hbdA, and ALP1 genes within the blue, turquoise, and green module networks, are presumed to play a role in the TIAs pathway. These identified candidate genes hold potential for forthcoming genetic and metabolic engineering initiatives aimed at augmenting the production of secondary metabolites and medicinal compounds within C. roseus.

13.
bioRxiv ; 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38328080

ABSTRACT

Background: Gene co-expression networks (GCNs) describe relationships among expressed genes key to maintaining cellular identity and homeostasis. However, the sample size of typical RNA-seq experiments, which is several orders of magnitude smaller than the number of genes, is too low to infer GCNs reliably. recount3, a publicly available dataset comprising 316,443 uniformly processed human RNA-seq samples, provides an opportunity to improve power for accurate network reconstruction and obtain biological insight from the resulting networks. Results: We compared alternate aggregation strategies to identify an optimal workflow for GCN inference by data aggregation and inferred three consensus networks (a universal network, a non-cancer network, and a cancer network) in addition to 27 tissue context-specific networks. Central network genes from our consensus networks were enriched for evolutionarily constrained genes and ubiquitous biological pathways, whereas central context-specific network genes included tissue-specific transcription factors, and factorization based on the hubs led to clustering of related tissue contexts. We discovered that annotations corresponding to context-specific networks inferred from aggregated data were enriched for trait heritability beyond known functional genomic annotations and were significantly more enriched when we aggregated over a larger number of samples. Conclusion: This study outlines best practices for GCN inference and evaluation by data aggregation. We recommend estimating and regressing confounders in each data set before aggregation and prioritizing large sample size studies for GCN reconstruction. Increased statistical power in inferring context-specific networks enabled the derivation of variant annotations that were enriched for concordant trait heritability independent of functional genomic annotations that are context-agnostic. 
While we observed strictly increasing held-out log-likelihood with data aggregation, we noted diminishing marginal improvements. Future directions aimed at alternate methods for estimating confounders and integrating orthogonal information from modalities such as Hi-C and ChIP-seq can further improve GCN inference.
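One way to picture data aggregation for co-expression inference is to compute a gene-gene correlation within each study and then combine the per-study values with weights proportional to sample size. The abstract does not specify the aggregation strategies that were compared, so the weighting scheme below is an assumption, the function names are hypothetical, and the confounder regression step recommended by the authors is omitted.

```python
import math

# Minimal sketch: sample-size-weighted aggregation of per-study correlations.
# The weighting scheme is an assumption, not the method from the abstract.

def pearson(x, y):
    """Pearson correlation of two equal-length lists (non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def aggregate_correlation(studies):
    """studies: list of (gene_a_expression, gene_b_expression), one per study."""
    total = sum(len(a) for a, _ in studies)
    # Larger studies contribute proportionally more to the aggregate edge weight.
    return sum(len(a) * pearson(a, b) for a, b in studies) / total


# Invented toy values: one study of 4 samples and one of 2 samples.
toy_studies = [([1, 2, 3, 4], [2, 4, 6, 8]), ([1, 2], [2, 1])]
print(aggregate_correlation(toy_studies))
```

Applied to every gene pair, this yields an aggregated correlation matrix from which a network can be derived by thresholding or a sparse estimator.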

14.
Brief Funct Genomics ; 23(2): 118-127, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-36752035

ABSTRACT

Analysis of cell-cell communication (CCC) in the tumor micro-environment helps decipher the underlying mechanism of cancer progression and drug tolerance. Currently, single-cell RNA-Seq data are available on a large scale, providing an unprecedented opportunity to predict cellular communications. There have been many achievements and applications in inferring cell-cell communication based on the known interactions between molecules, such as ligands, receptors and extracellular matrix. However, the prior information is not quite adequate and only involves a fraction of cellular communications, producing many false-positive or false-negative results. To this end, we propose an improved hierarchical variational autoencoder (HiVAE) based model to fully use single-cell RNA-seq data for automatically estimating CCC. Specifically, the HiVAE model is used to learn the potential representation of cells on known ligand-receptor genes and all genes in single-cell RNA-seq data, respectively, which are then utilized for cascade integration. Subsequently, transfer entropy is employed to measure the transmission of information flow between two cells based on the learned representations, which are regarded as directed communication relationships. Experiments are conducted on single-cell RNA-seq data of the human skin disease dataset and the melanoma dataset, respectively. Results show that the HiVAE model is effective in learning cell representations, and transfer entropy could be used to estimate the communication scores between cell types.
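Transfer entropy, the directed information measure named above, can be sketched for discrete sequences with a history length of one. This is a plug-in estimator on toy binary series, not the HiVAE pipeline: in the paper the measure is applied to learned cell representations, and how those are discretized or estimated is not stated in the abstract. The function name and the toy data are illustrative assumptions.

```python
import math
import random
from collections import Counter

# Minimal sketch: plug-in transfer entropy TE(source -> target), history length 1.
# TE = sum p(y1, y0, x0) * log2( p(y1 | y0, x0) / p(y1 | y0) )

def transfer_entropy(source, target):
    """Estimate TE in bits from two equal-length discrete sequences."""
    triples = Counter()  # counts of (target_next, target_now, source_now)
    for t in range(len(target) - 1):
        triples[(target[t + 1], target[t], source[t])] += 1
    n = sum(triples.values())
    pairs_ts = Counter()  # (target_now, source_now)
    pairs_tt = Counter()  # (target_next, target_now)
    singles = Counter()   # target_now
    for (y1, y0, x0), c in triples.items():
        pairs_ts[(y0, x0)] += c
        pairs_tt[(y1, y0)] += c
        singles[y0] += c
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_full = c / pairs_ts[(y0, x0)]            # p(y1 | y0, x0)
        p_self = pairs_tt[(y1, y0)] / singles[y0]  # p(y1 | y0)
        te += (c / n) * math.log2(p_full / p_self)
    return te


# Toy example: y copies x with a one-step delay, so information flows x -> y.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]
print(transfer_entropy(x, y), transfer_entropy(y, x))
```

The estimate from x to y should be close to one bit, while the reverse direction should be close to zero, which is the asymmetry that makes the measure usable as a directed communication score.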


Subjects
Neoplasms, Single-Cell Gene Expression Analysis, Humans, Single-Cell Analysis/methods, Cell Communication, Exome Sequencing, RNA Sequence Analysis/methods, Gene Expression Profiling/methods, Tumor Microenvironment
15.
Geroscience ; 46(1): 999-1015, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37314668

ABSTRACT

Following prolonged cell division, mesenchymal stem cells enter replicative senescence, a state of permanent cell cycle arrest that constrains the use of this cell type in regenerative medicine applications and that in vivo substantially contributes to organismal ageing. Multiple cellular processes such as telomere dysfunction, DNA damage and oncogene activation are implicated in promoting replicative senescence, but whether mesenchymal stem cells enter different pre-senescent and senescent states has remained unclear. To address this knowledge gap, we subjected serially passaged human ESC-derived mesenchymal stem cells (esMSCs) to single cell profiling and single cell RNA-sequencing during their progressive entry into replicative senescence. We found that esMSC transitioned through newly identified pre-senescent cell states before entering into three different senescent cell states. By deconstructing this heterogeneity and temporally ordering these pre-senescent and senescent esMSC subpopulations into developmental trajectories, we identified markers and predicted drivers of these cell states. Regulatory networks that capture connections between genes at each timepoint demonstrated a loss of connectivity, and specific genes altered their gene expression distributions as cells entered senescence. Collectively, this data reconciles previous observations that identified different senescence programs within an individual cell type and should enable the design of novel senotherapeutic regimes that can overcome in vitro MSC expansion constraints or that can perhaps slow organismal ageing.


Subjects
Cellular Senescence, Mesenchymal Stem Cells, Humans, Cellular Senescence/physiology, Mesenchymal Stem Cells/metabolism
16.
J Theor Biol ; 577: 111636, 2024 01 21.
Article in English | MEDLINE | ID: mdl-37944593

ABSTRACT

Gene expression analysis is valuable for cancer type classification and identifying diverse cancer phenotypes. The latest high-throughput RNA sequencing devices have enabled access to large volumes of gene expression data. However, we face several challenges, such as data security and privacy, when we develop machine learning-based classifiers for categorizing cancer types with these datasets. To address these issues, we propose IP3G (Intelligent Phenotype-detection and Gene expression profile Generation with Generative adversarial network), a model based on Generative Adversarial Networks. IP3G tackles two major problems: augmenting gene expression data and unsupervised phenotype discovery. By converting gene expression profiles into 2-Dimensional images and leveraging IP3G, we generate new profiles for specific phenotypes. IP3G learns disentangled representations of gene expression patterns and identifies phenotypes without labeled data. We improve the objective function of the GAN used in IP3G by employing the earth mover distance and a novel mutual information function. IP3G outperforms clustering methods like k-Means, DBSCAN, and GMM in unsupervised phenotype discovery, while also surpassing SVM and CNN classification accuracy by up to 6% through gene expression profile augmentation. The source code for the developed IP3G is accessible to the public on GitHub.
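The step of converting a gene expression profile into a 2-dimensional image can be sketched as a reshape of the expression vector into a square grid. The abstract does not describe how IP3G orders genes or pads the grid, so the row-major layout, the zero padding, and the function name `profile_to_image` are assumptions made for illustration.

```python
import math

# Minimal sketch: reshape a 1-D expression profile into a square 2-D grid.
# Row-major gene order and zero padding are assumptions, not the IP3G method.

def profile_to_image(profile, pad_value=0.0):
    """Return a list of rows forming the smallest square that fits the profile."""
    n = len(profile)
    if n == 0:
        return []
    side = math.isqrt(n - 1) + 1  # ceil(sqrt(n)) using integer arithmetic
    padded = list(profile) + [pad_value] * (side * side - n)
    return [padded[r * side:(r + 1) * side] for r in range(side)]


# Five invented expression values become a 3 x 3 grid with four padded cells.
print(profile_to_image([1, 2, 3, 4, 5]))
# [[1, 2, 3], [4, 5, 0.0], [0.0, 0.0, 0.0]]
```

A grid like this can then be fed to image-based generator and discriminator networks; gene ordering matters for convolutional models, which is why the layout choice should be treated as a design decision rather than a given.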


Subjects
Neoplasms , Transcriptome , Humans , Gene Expression Profiling , Cluster Analysis , Phenotype , Neoplasms/genetics
17.
Biol Methods Protoc ; 8(1): bpad028, 2023.
Article in English | MEDLINE | ID: mdl-38023349

ABSTRACT

High-throughput RNA-seq enables comprehensive analysis of the transcriptome for various purposes. However, this technology generally generates massive amounts of sequencing reads with short read lengths. Consequently, fast, accurate, and flexible tools are needed for assembling raw RNA-seq data into full-length transcripts and quantifying their expression levels. In this protocol, we report TransBorrow, a novel transcriptome assembly software specifically designed for short RNA-seq reads. TransBorrow is employed in conjunction with a splice-aware alignment tool (e.g. HISAT2 and STAR) and some other transcriptome assembly tools (e.g. StringTie, Cufflinks, and Scallop). The protocol encompasses all necessary steps, starting from downloading and processing raw sequencing data to assembling the full-length transcripts and quantifying their expression abundances. The execution time of the protocol may vary depending on the sizes of processed datasets and computational platforms.

18.
bioRxiv ; 2023 Oct 29.
Article in English | MEDLINE | ID: mdl-37961165

ABSTRACT

Intratumor heterogeneity (ITH) of tumor-infiltrated leukocytes (TILs) is an important phenomenon of cancer biology with potentially profound clinical impacts. Multi-region gene expression sequencing data provide a promising opportunity to explore TILs and their intratumor heterogeneity for each subject. Although several existing methods are available to infer the proportions of TILs, considerable methodological gaps exist for evaluating intratumor heterogeneity of TILs with multi-region gene expression data. Here, we develop ICeITH (immune cell estimation reveals intratumor heterogeneity), a Bayesian hierarchical model that borrows cell type profiles as prior knowledge to decompose mixed bulk data while accounting for the within-subject correlations among tumor samples. ICeITH quantifies intratumor heterogeneity by the variability of targeted cellular compositions. Through extensive simulation studies, we demonstrate that ICeITH is more accurate in measuring relative cellular abundance and evaluating intratumor heterogeneity compared with existing methods. We also assess the ability of ICeITH to stratify patients by their intratumor heterogeneity score and associate the estimations with the survival outcomes. Finally, we apply ICeITH to two multi-region gene expression datasets from lung cancer studies to classify patients into different risk groups according to the ITH estimations of targeted TILs that shape either pro- or anti-tumor processes. In conclusion, ICeITH is a useful tool to evaluate intratumor heterogeneity of TILs from multi-region gene expression data.
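
The idea of scoring heterogeneity by "the variability of targeted cellular compositions" can be illustrated with a deliberately simplified sketch. It assumes the per-region proportions of one cell type have already been estimated, and it uses a plain population standard deviation across regions. ICeITH itself derives its score from a Bayesian hierarchical model, which this sketch does not reproduce. The function name `ith_score` and the example values are hypothetical.

```python
from statistics import pstdev


def ith_score(proportions_by_region):
    """Population standard deviation of one cell type's proportion across regions.

    Simplifying assumption: variability is summarized by a plain standard
    deviation, not by the model-based estimate described in the abstract.
    """
    if len(proportions_by_region) < 2:
        raise ValueError("need at least two tumor regions per subject")
    return pstdev(proportions_by_region)


# Hypothetical estimated proportions of one immune cell type in three regions.
subjects = {
    "subject_A": [0.10, 0.12, 0.11],   # similar across regions
    "subject_B": [0.05, 0.30, 0.18],   # varies strongly across regions
}
scores = {name: ith_score(props) for name, props in subjects.items()}
most_heterogeneous = max(scores, key=scores.get)
```

Subjects could then be ranked or split into groups by this score, which mirrors the stratification step the abstract describes only in outline.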

19.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-37991248

ABSTRACT

The high dimensionality and sparsity of the gene expression matrix in single-cell RNA-sequencing (scRNA-seq) data, coupled with significant noise generated by shallow sequencing, pose a great challenge for cell clustering methods. While numerous computational methods have been proposed, the majority of existing approaches center on processing the target dataset itself. This approach disregards the wealth of knowledge present within other species and batches of scRNA-seq data. In light of this, our paper proposes a novel method named graph-based deep embedding clustering (GDEC) that leverages transfer learning across species and batches. GDEC integrates graph convolutional networks, effectively overcoming the challenges posed by sparse gene expression matrices. Additionally, the incorporation of DEC in GDEC enables the partitioning of cell clusters within a lower-dimensional space, thereby mitigating the adverse effects of noise on clustering outcomes. GDEC constructs a model based on existing scRNA-seq datasets and then applies transfer learning techniques to fine-tune the model using a limited amount of prior knowledge gleaned from the target dataset. This empowers GDEC to adeptly cluster scRNA-seq data across different species and batches. Through cross-species and cross-batch clustering experiments, we conducted a comparative analysis between GDEC and conventional packages. Furthermore, we implemented GDEC on the scRNA-seq data of uterine fibroids. Compared with results obtained from the Seurat package, GDEC unveiled a novel cell type (epithelial cells) and identified a notable number of new pathways among various cell types, thus underscoring the enhanced analytical capabilities of GDEC. Availability and implementation: https://github.com/YuzhiSun/GDEC/tree/main.


Subjects
Gene Expression Profiling , Leiomyoma , Humans , Gene Expression Profiling/methods , Algorithms , Sequence Analysis, RNA/methods , Single-Cell Gene Expression Analysis , Single-Cell Analysis/methods , Cluster Analysis , Machine Learning
20.
BMC Genomics ; 24(1): 687, 2023 Nov 16.
Article in English | MEDLINE | ID: mdl-37974076

ABSTRACT

BACKGROUND: Advances in sequencing technology and cost reduction have enabled the emergence of various statistical methods for RNA-sequencing data, including differential co-expression network analysis (or differential network analysis). A key benefit of this method is that it takes into consideration the interactions between or among genes and does not require established knowledge of biological pathways. At present, no existing software can incorporate covariates that should be adjusted for when they are confounding factors in a differential network analysis. RESULTS: We develop an R package, PRANA, in which a user can easily include multiple covariates. The main R function in this package leverages a novel pseudo-value regression approach for differential network analysis in RNA-sequencing data. The package also includes complementary R functions for extracting adjusted p-values and coefficient estimates of all or specific variables for each gene, as well as for identifying, from the output, the names of genes that are differentially connected (DC, hereafter) between subjects under biologically different conditions. CONCLUSION: We demonstrate the application of this package to a real dataset on chronic obstructive pulmonary disease. PRANA is available through the CRAN repositories under the GPL-3 license: https://cran.r-project.org/web/packages/PRANA/index.html .
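
The abstract does not spell out how the pseudo-values are built. A minimal sketch, assuming they are jackknife (leave-one-out) pseudo-values, is given below in Python rather than in PRANA's own R code. The function names are hypothetical, and the simple mean stands in for a per-gene network connectivity statistic. For a statistic computed on all n subjects, subject i's pseudo-value is n times the full-sample statistic minus (n - 1) times the statistic computed without subject i.

```python
def jackknife_pseudo_values(samples, statistic):
    """Return one leave-one-out pseudo-value per subject.

    pseudo_i = n * statistic(all) - (n - 1) * statistic(all except subject i)
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two subjects")
    full = statistic(samples)
    pseudo = []
    for i in range(n):
        leave_one_out = samples[:i] + samples[i + 1:]
        pseudo.append(n * full - (n - 1) * statistic(leave_one_out))
    return pseudo


def mean(values):
    """Stand-in statistic; a real analysis would use a gene's connectivity."""
    return sum(values) / len(values)


# For the mean, each pseudo-value recovers that subject's own observation.
pseudo = jackknife_pseudo_values([1.0, 2.0, 4.0], mean)
```

Because each subject ends up with their own pseudo-value, these values can serve as the response in an ordinary regression on group membership and covariates, which is how a covariate-adjusted comparison becomes possible for a statistic that is otherwise defined only at the group level.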


Subjects
RNA , Software , Humans , Base Sequence , Sequence Analysis, RNA