Results 1 - 20 of 39
1.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-38189539

ABSTRACT

Sequence motif discovery algorithms enhance the identification of novel deoxyribonucleic acid sequences with pivotal biological significance, especially transcription factor (TF)-binding motifs. The advent of assay for transposase-accessible chromatin using sequencing (ATAC-seq) has broadened the toolkit for motif characterization. Nonetheless, prevailing computational approaches have focused on delineating TF-binding footprints, with motif discovery receiving less attention. Herein, we present Cis rEgulatory Motif Influence using de Bruijn Graph (CEMIG), an algorithm leveraging de Bruijn and Hamming distance graph paradigms to predict and map motif sites. Assessment on 129 ATAC-seq datasets from the Cistrome Data Browser demonstrates CEMIG's exceptional performance, surpassing three established methodologies on four evaluative metrics. CEMIG accurately identifies both cell-type-specific and common TF motifs within GM12878 and K562 cell lines, demonstrating its comparative genomic capabilities in the identification of evolutionary conservation and cell-type specificity. In-depth transcriptional and functional genomic studies have validated the functional relevance of CEMIG-identified motifs across various cell types. CEMIG is available at https://github.com/OSU-BMBL/CEMIG, developed in C++ to ensure cross-platform compatibility with Linux, macOS and Windows operating systems.


Subject(s)
Algorithms , Chromatin Immunoprecipitation Sequencing , Benchmarking , Biological Evolution , Cell Line
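
The CEMIG abstract above names two graph structures, de Bruijn graphs and Hamming distance graphs, without describing how they are built. The following is a minimal generic sketch of both, not CEMIG's actual implementation: the function names, the toy sequences, and the `max_dist` threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def de_bruijn_edges(seqs, k):
    """Map each (k-1)-mer node to the set of (k-1)-mer nodes that follow it.

    Every k-mer contributes one edge: its prefix -> its suffix.
    """
    graph = defaultdict(set)
    for seq in seqs:
        for kmer in kmers(seq, k):
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def hamming_graph(words, max_dist=1):
    """Connect distinct words whose Hamming distance is at most max_dist."""
    unique = sorted(set(words))
    return {(a, b) for a, b in combinations(unique, 2) if hamming(a, b) <= max_dist}


if __name__ == "__main__":
    # Hypothetical accessible-region sequences, for illustration only.
    toy_peaks = ["ACGTAC", "ACGTTC"]
    print(de_bruijn_edges(toy_peaks, 3))
    print(hamming_graph(kmers(toy_peaks[0], 4) + kmers(toy_peaks[1], 4)))
```

In a de Bruijn graph, overlapping k-mers that recur across sequences share nodes, while a Hamming distance graph links near-identical k-mers; how CEMIG combines the two is described in the paper itself, not here.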
2.
Circ Res ; 132(2): 187-204, 2023 01 20.
Article in English | MEDLINE | ID: mdl-36583388

ABSTRACT

BACKGROUND: NOTCH1 pathogenic variants are implicated in multiple types of congenital heart defects including hypoplastic left heart syndrome, where the left ventricle is underdeveloped. It is unknown how NOTCH1 regulates human cardiac cell lineage determination and cardiomyocyte proliferation. In addition, mechanisms by which NOTCH1 pathogenic variants lead to ventricular hypoplasia in hypoplastic left heart syndrome remain elusive. METHODS: CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas9 genome editing was utilized to delete NOTCH1 in human induced pluripotent stem cells. Cardiac differentiation was carried out by sequential modulation of WNT signaling, and NOTCH1 knockout and wild-type differentiating cells were collected at day 0, 2, 5, 10, 14, and 30 for single-cell RNA-seq. RESULTS: Human NOTCH1 knockout induced pluripotent stem cells are able to generate functional cardiomyocytes and endothelial cells, suggesting that NOTCH1 is not required for mesoderm differentiation and cardiovascular development in vitro. However, disruption of NOTCH1 blocks human ventricular-like cardiomyocyte differentiation but promotes atrial-like cardiomyocyte generation through shortening the action potential duration. NOTCH1 deficiency leads to defective proliferation of early human cardiomyocytes, and transcriptomic analysis indicates that pathways involved in cell cycle progression and mitosis are downregulated in NOTCH1 knockout cardiomyocytes. Single-cell transcriptomic analysis reveals abnormal cell lineage determination of cardiac mesoderm, which is manifested by the biased differentiation toward epicardial and second heart field progenitors at the expense of first heart field progenitors in NOTCH1 knockout cell populations. CONCLUSIONS: NOTCH1 is essential for human ventricular-like cardiomyocyte differentiation and proliferation through balancing cell fate determination of cardiac mesoderm and modulating cell cycle progression. 
Because first heart field progenitors primarily contribute to the left ventricle, we speculate that pathogenic NOTCH1 variants lead to biased differentiation of first heart field progenitors, blocked ventricular-like cardiomyocyte differentiation, and defective cardiomyocyte proliferation, which collaboratively contribute to left ventricular hypoplasia in hypoplastic left heart syndrome.


Subject(s)
Hypoplastic Left Heart Syndrome , Induced Pluripotent Stem Cells , Humans , Endothelial Cells/metabolism , Induced Pluripotent Stem Cells/metabolism , Cell Differentiation/physiology , Myocytes, Cardiac/metabolism , Receptor, Notch1/genetics , Receptor, Notch1/metabolism
3.
Cancer Immunol Immunother ; 73(3): 52, 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38349405

ABSTRACT

INTRODUCTION: As one of the major components of the tumor microenvironment (TME), tumor-associated macrophages (TAMs) possess profound inhibitory activity against T cells and facilitate tumor escape from immune checkpoint blockade therapy. Converting these macrophages from a pro-tumorigenic to an anti-tumorigenic phenotype is thus an important strategy for enhancing adaptive immunity against cancer. However, although a plethora of mechanisms have been described for pro-tumorigenic differentiation in cancer, the metabolic switches that program the anti-tumorigenic property of TAMs remain elusive. MATERIALS AND METHODS: From an unbiased analysis of single-cell transcriptome data from multiple tumor models, we discovered that anti-tumorigenic TAMs uniquely express elevated levels of a specific fatty acid receptor, G-protein-coupled receptor 84 (GPR84). Genetic ablation of GPR84 in mice leads to impaired pro-inflammatory polarization of macrophages, while enhancing their anti-inflammatory phenotype. By contrast, GPR84 activation by its agonist, 6-n-octylaminouracil (6-OAU), potentiates the pro-inflammatory phenotype via the enhanced STAT1 pathway. Moreover, 6-OAU treatment significantly retards tumor growth and increases the anti-tumor efficacy of anti-PD-1 therapy. CONCLUSION: Overall, we report a previously unappreciated fatty acid receptor, GPR84, that serves as an important metabolic sensing switch for orchestrating anti-tumorigenic macrophage polarization. Pharmacological agonists of GPR84 hold promise to reshape and reverse the immunosuppressive TME, and thereby restore the responsiveness of cancer to immune checkpoint blockade.


Subject(s)
Immune Checkpoint Inhibitors , Immunotherapy , Animals , Mice , Carcinogenesis , Fatty Acids , Macrophages , Tumor Microenvironment , Tumor-Associated Macrophages
4.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33367506

ABSTRACT

Non-coding RNAs (ncRNAs) play crucial roles in multiple biological processes. However, only a few ncRNAs' functions have been well studied. Given the significance of ncRNA classification for understanding ncRNAs' functions, more and more computational methods have been introduced to improve the classification automatically and accurately. In this paper, based on a convolutional neural network and a deep forest algorithm, multi-grained cascade forest (GcForest), we propose a novel deep fusion learning framework, GcForest fusion method (GCFM), to classify alignments of ncRNA sequences for accurate clustering of ncRNAs. GCFM integrates a multi-view structure feature representation including sequence-structure alignment encoding, structure image representation and shape alignment encoding of structural subunits, enabling us to capture the potential specificity between ncRNAs. For the classification of pairwise alignments of two ncRNA sequences, the F-value of GCFM is 6% higher than that of an existing alignment-based method. Furthermore, the clustering of ncRNA families is carried out based on the classification matrix generated from GCFM. Results suggest better performance (a 20% improvement in accuracy) than existing ncRNA clustering methods (RNAclust, Ensembleclust and CNNclust). Additionally, we apply GCFM to construct a phylogenetic tree of ncRNAs and predict the probability of interactions between RNAs. Most ncRNAs are located correctly in the phylogenetic tree, and the prediction accuracy of RNA interaction is 90.63%. A web server (http://bmbl.sdstate.edu/gcfm/) is developed to maximize its availability, and the source code and related data are available at the same URL.


Subject(s)
Neural Networks, Computer , Nucleic Acid Conformation , RNA, Untranslated/genetics , Sequence Alignment , Software
5.
J Med Virol ; 95(8): e29060, 2023 08.
Article in English | MEDLINE | ID: mdl-37638381

ABSTRACT

Human papillomaviruses (HPVs) are associated with around 5%-10% of human cancers, notably nearly 99% of cervical cancers. The mechanisms by which HPV interacts with stratified epithelium (differentiated layers) during the viral life cycle and oncogenesis remain unclear. In this study, we used single-cell transcriptome analysis to study viral gene and host cell differentiation-associated heterogeneity of HPV-positive cervical cancer tissue. We examined the HPV16 genes E1, E6, and E7 and found that they were expressed differently across nine epithelial clusters. We found that three epithelial clusters had the highest proportions of HPV-positive cells (33.6%, 37.5%, and 32.4%, respectively), while two exhibited the lowest proportions (7.21% and 5.63%, respectively). Notably, the cluster with the most HPV-positive cells deviated significantly from normal epithelial layer markers, exhibiting functional heterogeneity and altered epithelial structuring, indicating that significant molecular heterogeneity existed in cancer tissues and that these cells exhibited distinct gene signatures compared with normal epithelial cells. Compared with HPV-negative cells, these HPV-positive cells showed different expression of genes related to the extracellular matrix, cell adhesion, proliferation, and apoptosis. Further, the viral oncogenes E6 and E7 appeared to modify epithelial function via distinct pathways, thus contributing to cervical cancer progression. We investigated the HPV and host transcripts from a novel viewpoint focusing on layer heterogeneity. Our results indicated varied HPV expression across epithelial clusters and epithelial heterogeneity associated with viral oncogenes, contributing biological insights to this critical field of study.


Subject(s)
Papillomavirus Infections , Uterine Cervical Neoplasms , Humans , Female , Uterine Cervical Neoplasms/genetics , Papillomavirus Infections/genetics , Transcriptome , Oncogenes , Human Papillomavirus Viruses , Cell Differentiation
6.
BMC Bioinformatics ; 23(1): 135, 2022 Apr 15.
Article in English | MEDLINE | ID: mdl-35428172

ABSTRACT

BACKGROUND: Long non-coding RNAs (lncRNAs) play important roles in physiological and pathological processes. Identifying lncRNA-protein interactions (LPIs) is essential to understand the molecular mechanisms and infer the functions of lncRNAs. Given the overwhelming size of the biomedical literature, extracting LPIs directly from it is essential, promising and challenging. However, there is no webserver for extracting LPI relationships from the literature. RESULTS: LPInsider is developed as the first webserver for extracting LPIs from biomedical literature texts based on multiple text features (semantic word vectors, syntactic structure vectors, distance vectors, and part-of-speech vectors) and logistic regression. LPInsider allows researchers to extract LPIs by uploading a PMID, PMCID, PMID list, or biomedical text. A manually filtered and highly reliable LPI corpus is integrated in LPInsider. Comprehensive experiments on different combinations of features and machine learning models show that LPInsider's performance is optimal. CONCLUSIONS: LPInsider is an efficient analytical tool for LPIs that helps researchers enhance their comprehension of lncRNAs through text mining while saving them time. In addition, LPInsider is freely accessible from http://www.csbg-jlu.info/LPInsider/ with no login requirement. The source code and LPI corpus can be downloaded from https://github.com/qiufengdiewu/LPInsider .


Subject(s)
RNA, Long Noncoding , Computational Biology , Data Mining , Machine Learning , RNA, Long Noncoding/genetics , Software
7.
Acta Neuropathol ; 143(5): 547-569, 2022 05.
Article in English | MEDLINE | ID: mdl-35389045

ABSTRACT

Selective neuronal vulnerability to protein aggregation is found in many neurodegenerative diseases including Alzheimer's disease (AD). Understanding the molecular origins of this selective vulnerability is, therefore, of fundamental importance. Tau protein aggregates have been found in Wolframin (WFS1)-expressing excitatory neurons in the entorhinal cortex, one of the earliest affected regions in AD. The role of WFS1 in tauopathies and its levels in tau pathology-associated neurodegeneration, however, are largely unknown. Here we report that WFS1 deficiency is associated with increased tau pathology and neurodegeneration, whereas overexpression of WFS1 reduces those changes. We also find that WFS1 interacts with tau protein and controls the susceptibility to tau pathology. Furthermore, chronic ER stress and autophagy-lysosome pathway (ALP)-associated genes are enriched in WFS1-high excitatory neurons in human AD at early Braak stages. The levels of ER stress- and ALP-associated proteins are changed in tau transgenic mice with WFS1 deficiency, while overexpression of WFS1 reverses those changes. This work demonstrates a possible role for WFS1 in the regulation of tau pathology and neurodegeneration via chronic ER stress and the downstream ALP. Our findings provide insights into mechanisms that underpin selective neuronal vulnerability and may inform the development of new therapeutics to protect vulnerable neurons in AD.


Subject(s)
Alzheimer Disease , Tauopathies , Alzheimer Disease/pathology , Animals , Lysosomes/metabolism , Mice , Mice, Transgenic , Neurons/pathology , Protein Aggregates , Tauopathies/pathology
8.
Nucleic Acids Res ; 48(W1): W275-W286, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32421805

ABSTRACT

A group of genes controlled as a unit, usually by the same repressor or activator gene, is known as a regulon. The ability to identify active regulons within a specific cell type, i.e., cell-type-specific regulons (CTSRs), provides an extraordinary opportunity to pinpoint crucial regulators and target genes responsible for complex diseases. However, the identification of CTSRs from single-cell RNA-Seq (scRNA-Seq) data is computationally challenging. We introduce IRIS3, the first-of-its-kind web server for CTSR inference from scRNA-Seq data for human and mouse. IRIS3 is an easy-to-use server empowered by over 20 functionalities to support comprehensive interpretations and graphical visualizations of identified CTSRs. CTSR data can be used to reliably characterize and distinguish the corresponding cell type from others and can be combined with other computational or experimental analyses for biomedical studies. CTSRs can, therefore, aid in the discovery of major regulatory mechanisms and allow reliable construction of global transcriptional regulation networks encoded in a specific cell type. The broader impact of IRIS3 includes, but is not limited to, investigation of complex disease hierarchies and heterogeneity, causal gene regulatory network construction, and drug development. IRIS3 is freely accessible from https://bmbl.bmi.osumc.edu/iris3/ with no login requirement.


Subject(s)
RNA-Seq , Regulon , Single-Cell Analysis , Software , Animals , Brain/metabolism , Cluster Analysis , Mice
9.
Brief Bioinform ; 20(6): 2009-2027, 2019 11 27.
Article in English | MEDLINE | ID: mdl-30084867

ABSTRACT

Discovering new long non-coding RNAs (lncRNAs) has been a fundamental step in lncRNA-related research. Nowadays, many machine learning-based tools have been developed for lncRNA identification. However, many methods predict lncRNAs using sequence-derived features alone, which tend to display unstable performance on different species. Moreover, the majority of tools cannot be re-trained or tailored by users, nor can the features be customized or integrated to meet researchers' requirements. In this study, features extracted from sequence-intrinsic composition, secondary structure and physicochemical properties are comprehensively reviewed and evaluated. An integrated platform named LncFinder is also developed to enhance the performance and promote the research of lncRNA identification. LncFinder includes a novel lncRNA predictor using the heterologous features we designed. Experimental results show that our method outperforms several state-of-the-art tools on multiple species with more robust and satisfactory results. Researchers can additionally employ LncFinder to extract various classic features, build classifiers with numerous machine learning algorithms and evaluate classifier performance effectively and efficiently. LncFinder can reveal the properties of lncRNA and mRNA from various perspectives and further inspire lncRNA-protein interaction prediction and lncRNA evolution analysis. It is anticipated that LncFinder can significantly facilitate lncRNA-related research, especially for poorly explored species. LncFinder is released as an R package (https://CRAN.R-project.org/package=LncFinder). A web server (http://bmbl.sdstate.edu/lncfinder/) is also developed to maximize its availability.


Subject(s)
Nucleic Acid Conformation , RNA, Long Noncoding/chemistry , Algorithms , Animals , Computational Biology/methods , Humans , Machine Learning
10.
Bioinformatics ; 36(4): 1143-1149, 2020 02 15.
Article in English | MEDLINE | ID: mdl-31503285

ABSTRACT

MOTIVATION: The biclustering of large-scale gene expression data holds promising potential for detecting condition-specific functional gene modules (i.e. biclusters). However, existing methods do not adequately address a comprehensive detection of all significant bicluster structures and have limited power when applied to expression data generated by RNA-Sequencing (RNA-Seq), especially single-cell RNA-Seq (scRNA-Seq) data, where massive zero and low expression values are observed. RESULTS: We present a new biclustering algorithm, QUalitative BIClustering algorithm Version 2 (QUBIC2), which is empowered by: (i) a novel left-truncated mixture of Gaussian model for an accurate assessment of multimodality in zero-enriched expression data, (ii) a fast and efficient dropouts-saving expansion strategy for functional gene module optimization using information divergence and (iii) a rigorous statistical test for the significance of all the identified biclusters in any organism, including those without substantial functional annotations. QUBIC2 demonstrated considerably improved performance in detecting biclusters compared with five other widely used algorithms on various benchmark datasets from E. coli, human and simulated data. QUBIC2 also showcased robust and superior performance on gene expression data generated by microarray, bulk RNA-Seq and scRNA-Seq. AVAILABILITY AND IMPLEMENTATION: The source code of QUBIC2 is freely available at https://github.com/OSU-BMBL/QUBIC2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Expression Profiling , RNA , Algorithms , Humans , Sequence Analysis, RNA , Software
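
The QUBIC2 abstract above mentions a left-truncated mixture of Gaussian model for zero-enriched expression data. As a rough illustration of what left truncation does to a normal density, here is a generic textbook sketch. It is an assumption-laden example, not QUBIC2's model or fitting procedure: the function names, the cutoff, and the component parameters are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative distribution of N(mu, sigma^2) at x, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def left_truncated_pdf(x, mu, sigma, cutoff):
    """Density of N(mu, sigma^2) conditioned on x >= cutoff.

    The untruncated density is rescaled by the probability mass that
    survives the truncation, so the result still integrates to 1.
    """
    if x < cutoff:
        return 0.0
    tail = 1.0 - normal_cdf(cutoff, mu, sigma)
    if tail <= 0.0:
        raise ValueError("cutoff is too far above mu for this precision")
    return normal_pdf(x, mu, sigma) / tail


def mixture_pdf(x, components, cutoff):
    """Mixture density; components is a list of (weight, mu, sigma) with weights summing to 1."""
    return sum(w * left_truncated_pdf(x, mu, s, cutoff) for w, mu, s in components)
```

For example, `mixture_pdf(1.5, [(0.6, 0.0, 1.0), (0.4, 3.0, 1.0)], cutoff=0.0)` evaluates a hypothetical two-component mixture whose values below zero are excluded.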
11.
Nucleic Acids Res ; 47(15): 7809-7824, 2019 09 05.
Article in English | MEDLINE | ID: mdl-31372637

ABSTRACT

The identification of transcription factor binding sites and cis-regulatory motifs is a frontier whereupon the rules governing protein-DNA binding are being revealed. Here, we developed a new method (DEep Sequence and Shape mOtif or DESSO) for cis-regulatory motif prediction using deep neural networks and the binomial distribution model. DESSO outperformed existing tools, including DeepBind, in predicting motifs in 690 human ENCODE ChIP-sequencing datasets. Furthermore, the deep-learning framework of DESSO expanded motif discovery beyond the state-of-the-art by allowing the identification of known and new protein-protein-DNA tethering interactions in human transcription factors (TFs). Specifically, 61 putative tethering interactions were identified among the 100 TFs expressed in the K562 cell line. In this work, the power of DESSO was further expanded by integrating the detection of DNA shape features. We found that shape information has strong predictive power for TF-DNA binding and provides new putative shape motif information for human TFs. Thus, DESSO improves the identification and structural analysis of TF binding sites by integrating the complexities of DNA binding into a deep-learning framework.


Subject(s)
Computational Biology/statistics & numerical data , DNA/chemistry , Deep Learning , Transcription Factors/genetics , Binding Sites , Computational Biology/methods , DNA/genetics , DNA/metabolism , Gene Expression Regulation , Humans , K562 Cells , Nucleotide Motifs , Protein Binding , Transcription Factors/classification , Transcription Factors/metabolism
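
The DESSO abstract above refers to a binomial distribution model without giving details. One common use of a binomial model in motif analysis is to ask how surprising it is that a candidate motif occurs in at least k of n sequences when each sequence matches by chance with probability p. The sketch below computes that upper-tail probability. Whether DESSO applies the model in exactly this way is not stated in the abstract, so treat the framing and the function names as assumptions.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))
```

A small tail probability, for instance from `binomial_upper_tail(40, 100, 0.1)` with hypothetical counts, would suggest that the motif occurs more often than the assumed background rate explains. The direct sum is fine for small n; large n would call for a numerically stabler routine.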
13.
PLoS Comput Biol ; 15(2): e1006792, 2019 02.
Article in English | MEDLINE | ID: mdl-30763315

ABSTRACT

Next-Generation Sequencing has made available substantial amounts of large-scale Omics data, providing unprecedented opportunities to understand complex biological systems. Specifically, the value of RNA-Sequencing (RNA-Seq) data has been confirmed in inferring how gene regulatory systems will respond under various conditions (bulk data) or cell types (single-cell data). RNA-Seq can generate genome-scale gene expression profiles that can be further analyzed using correlation analysis, co-expression analysis, clustering, differential gene expression (DGE), among many other studies. While these analyses can provide invaluable information related to gene expression, integration and interpretation of the results can prove challenging. Here we present a tool called IRIS-EDA, which is a Shiny web server for expression data analysis. It provides a straightforward and user-friendly platform for performing numerous computational analyses on user-provided RNA-Seq or Single-cell RNA-Seq (scRNA-Seq) data. Specifically, three commonly used R packages (edgeR, DESeq2, and limma) are implemented in the DGE analysis with seven unique experimental design functionalities, including a user-specified design matrix option. Seven discovery-driven methods and tools (correlation analysis, heatmap, clustering, biclustering, Principal Component Analysis (PCA), Multidimensional Scaling (MDS), and t-distributed Stochastic Neighbor Embedding (t-SNE)) are provided for gene expression exploration which is useful for designing experimental hypotheses and determining key factors for comprehensive DGE analysis. Furthermore, this platform integrates seven visualization tools in a highly interactive manner, for improved interpretation of the analyses. It is noteworthy that, for the first time, IRIS-EDA provides a framework to expedite submission of data and results to NCBI's Gene Expression Omnibus following the FAIR (Findable, Accessible, Interoperable and Reusable) Data Principles. 
IRIS-EDA is freely available at http://bmbl.sdstate.edu/IRIS/.


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Cells, Cultured , Cluster Analysis , Databases, Factual , Humans , RNA/analysis , RNA/genetics , RNA/metabolism
14.
Patterns (N Y) ; 5(3): 100927, 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38487805

ABSTRACT

In this study, we introduce TESA (weighted two-stage alignment), an innovative motif prediction tool that refines the identification of DNA-binding protein motifs, essential for deciphering transcriptional regulatory mechanisms. Unlike traditional algorithms that rely solely on sequence data, TESA integrates the high-resolution chromatin immunoprecipitation (ChIP) signal, specifically from ChIP-exonuclease (ChIP-exo), by assigning weights to sequence positions, thereby enhancing motif discovery. TESA employs a nuanced approach combining a binomial distribution model with a graph model, further supported by a "bookend" model, to improve the accuracy of predicting motifs of varying lengths. Our evaluation, utilizing an extensive compilation of 90 prokaryotic ChIP-exo datasets from proChIPdb and 167 H. sapiens datasets, compared TESA's performance against seven established tools. The results indicate TESA's improved precision in motif identification, suggesting its valuable contribution to the field of genomic research.
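
The TESA abstract above says that ChIP-exo signal is used to assign weights to sequence positions. One simple way such weights could enter motif estimation is a position frequency matrix in which each base contributes its weight instead of a count of one. The sketch below illustrates that idea only. It is not TESA's algorithm, and the function name, inputs, and toy values are hypothetical.

```python
def weighted_frequency_matrix(sites, weights, alphabet="ACGT"):
    """Build a per-position base frequency matrix from weighted aligned sites.

    sites:   non-empty list of equal-length strings (aligned candidate sites)
    weights: one list of non-negative floats per site, one weight per position
    Returns a list (one entry per position) of {base: frequency} dicts.
    """
    length = len(sites[0])
    totals = [{base: 0.0 for base in alphabet} for _ in range(length)]
    for site, site_weights in zip(sites, weights):
        if len(site) != length or len(site_weights) != length:
            raise ValueError("every site and weight list must have the same length")
        for pos, (base, w) in enumerate(zip(site, site_weights)):
            totals[pos][base] += w
    matrix = []
    for column in totals:
        column_sum = sum(column.values())
        # Normalise each column; an all-zero column stays all zeros.
        matrix.append({b: (v / column_sum if column_sum else 0.0) for b, v in column.items()})
    return matrix
```

With hypothetical inputs `sites=["AC", "AG"]` and `weights=[[1.0, 3.0], [1.0, 1.0]]`, the second position favours C over G three to one, because the first site carries more signal there.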

15.
Nat Commun ; 15(1): 4710, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38844475

ABSTRACT

Alzheimer's Disease (AD) pathology has been increasingly explored through single-cell and single-nucleus RNA-sequencing (scRNA-seq & snRNA-seq) and spatial transcriptomics (ST). However, the surge in data demands a comprehensive, user-friendly repository. Addressing this, we introduce a single-cell and spatial RNA-seq database for Alzheimer's disease (ssREAD). It offers a broader spectrum of AD-related datasets, an optimized analytical pipeline, and improved usability. The database encompasses 1,053 samples (277 integrated datasets) from 67 AD-related scRNA-seq & snRNA-seq studies, totaling 7,332,202 cells. Additionally, it archives 381 ST datasets from 18 human and mouse brain studies. Each dataset is annotated with details such as species, gender, brain region, disease/control status, age, and AD Braak stages. ssREAD also provides an analysis suite for cell clustering, identification of differentially expressed and spatially variable genes, cell-type-specific marker genes and regulons, and spot deconvolution for integrative analysis. ssREAD is freely available at https://bmblx.bmi.osumc.edu/ssread/ .


Subject(s)
Alzheimer Disease , RNA-Seq , Single-Cell Analysis , Alzheimer Disease/genetics , Humans , Single-Cell Analysis/methods , Animals , Mice , RNA-Seq/methods , Brain/metabolism , Brain/pathology , Databases, Genetic , Transcriptome , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods , Male
16.
medRxiv ; 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-39040206

ABSTRACT

Alzheimer's Disease (AD) is a complex neurodegenerative disorder significantly influenced by sex differences, with approximately two-thirds of AD patients being women. Characterizing the sex-specific AD progression and identifying its progression trajectory is a crucial step to developing effective risk stratification and prevention strategies. In this study, we developed an autoencoder to uncover sex-specific sub-phenotypes in AD progression leveraging longitudinal electronic health record (EHR) data from OneFlorida+ Clinical Research Consortium. Specifically, we first constructed temporal patient representation using longitudinal EHRs from a sex-stratified AD cohort. We used a long short-term memory (LSTM)-based autoencoder to extract and generate latent representation embeddings from sequential clinical records of patients. We then applied hierarchical agglomerative clustering to the learned representations, grouping patients based on their progression sub-phenotypes. The experimental results show we successfully identified five primary sex-based AD sub-phenotypes with corresponding progression pathways with high confidence. These sex-specific sub-phenotypes not only illustrated distinct AD progression patterns but also revealed differences in clinical characteristics and comorbidities between females and males in AD development. These findings could provide valuable insights for advancing personalized AD intervention and treatment strategies.
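
The abstract above describes applying hierarchical agglomerative clustering to patient embeddings learned by an LSTM-based autoencoder. The clustering step can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the embeddings are stand-in vectors supplied by the caller, not autoencoder outputs, and average linkage with Euclidean distance is an assumed choice, since the abstract does not name the linkage criterion.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return sorted(clusters)
```

This naive version recomputes all pairwise cluster distances at every merge, so it only suits small inputs; a real cohort would more likely use an optimized library routine.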

17.
Cancer Res Commun ; 4(2): 293-302, 2024 02 05.
Article in English | MEDLINE | ID: mdl-38259095

ABSTRACT

Evidence supports significant interactions among microbes, immune cells, and tumor cells in at least 10%-20% of human cancers, emphasizing the importance of further investigating these complex relationships. However, the implications and significance of tumor-related microbes remain largely unknown. Studies have demonstrated the critical roles of host microbes in cancer prevention and treatment responses. Understanding interactions between host microbes and cancer can drive cancer diagnosis and microbial therapeutics (bugs as drugs). Computational identification of cancer-specific microbes and their associations is still challenging due to the high dimensionality and high sparsity of intratumoral microbiome data, which requires large datasets containing sufficient event observations to identify relationships, and the interactions within microbial communities, the heterogeneity in microbial composition, and other confounding effects that can lead to spurious associations. To solve these issues, we present a bioinformatics tool, microbial graph attention (MEGA), to identify the microbes most strongly associated with 12 cancer types. We demonstrate its utility on a dataset from a consortium of nine cancer centers in the Oncology Research Information Exchange Network. This package has three unique features: species-sample relations are represented in a heterogeneous graph and learned by a graph attention network; it incorporates metabolic and phylogenetic information to reflect intricate relationships within microbial communities; and it provides multiple functionalities for association interpretations and visualizations. We analyzed 2,704 tumor RNA sequencing samples and MEGA interpreted the tissue-resident microbial signatures of each of 12 cancer types. MEGA can effectively identify cancer-associated microbial signatures and refine their interactions with tumors. 
SIGNIFICANCE: Studying the tumor microbiome in high-throughput sequencing data is challenging because of the extremely sparse data matrices, heterogeneity, and high likelihood of contamination. We present a new deep learning tool, MEGA, to refine the organisms that interact with tumors.


Subject(s)
Microbiota , Humans , Phylogeny , Microbiota/genetics , Computational Biology , High-Throughput Nucleotide Sequencing
18.
bioRxiv ; 2023 Sep 12.
Article in English | MEDLINE | ID: mdl-37745592

ABSTRACT

Alzheimer's Disease (AD) is a neurodegenerative malady predominantly affecting the elderly and exhibits its debilitating effects on a dementia-prone population. Recently, the advent of innovative technologies, such as single-cell and single-nucleus RNA-sequencing (scRNA-seq & snRNA-seq) and spatial transcriptomics (ST), has reformed our investigative approaches toward comprehending AD's neuropathological intricacies and underpinning regulatory mechanisms, encompassing sub-cellular, cellular, and spatial dimensions. In light of the overwhelming proliferation of single-cell and ST data associated with AD, the imperative for a comprehensive, user-friendly database that addresses the scientific community's analytical demands has never been more paramount. Introduced initially in 2020, scREAD presented itself as a pioneering repository that systematized publicly available scRNA-seq and snRNA-seq datasets derived from post-mortem human brain tissues and mouse models mirroring AD pathology. Here, we introduce ssREAD, a substantial upgrade over scREAD, enriching the platform with a broader spectrum of datasets, an optimized analytical pipeline, and enhanced usability and visibility. Specifically, ssREAD amalgamates an impressive portfolio of over 189 datasets extracted from 35 distinct AD-related scRNA-seq and snRNA-seq studies, encompassing a staggering 2,572,355 cells. In addition, we have diligently curated and archived 300 ST datasets, originating from 12 human and mouse brain studies, which include two focused on AD and ten control studies. Every dataset within our repository is meticulously annotated, bearing critical identifiers including species, gender, brain region, disease/control status, age, and AD stages. 
Beyond collecting these datasets, ssREAD delivers an exhaustive analysis suite offering cell clustering and annotation, inference of differentially expressed and spatially variable genes, identification of cell-type-specific marker genes and regulons, and spot deconvolution for integrative analysis of ST and scRNA-seq & snRNA-seq data from public domains. All these resources are freely accessible through a user-friendly, consolidated web portal available at https://bmblx.bmi.osumc.edu/ssread/.

19.
Heliyon ; 9(12): e22232, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38107273

ABSTRACT

In this work, the comprehensive properties of flammable casing for underground coal gasification are systematically investigated, including the physical, chemical and mechanical properties of the casing material and the mechanical properties and burning behavior of full-size flammable casing. The flammable casing material consists of a magnesium alloy matrix and rare earth particles, and its thermal conductivity and thermal expansion are low. High-temperature tensile tests reveal that the material has good high-temperature strength, which declines by 30 % at 300 °C. The corrosion rate of the material is relatively high without extra protection. The full-size flammable casing has good mechanical properties, thread properties and high-temperature collapse resistance. Burning of the flammable casing is safe and stable, and the burning rate of the casing material can be effectively controlled by water flow. The combustion product of the flammable casing is a powder, which poses no risk of blocking the gasification channel. In summary, flammable casing is necessary for realizing the underground coal gasification process and plays a significant role in the development and application of underground coal gasification technology.

20.
bioRxiv ; 2023 Aug 14.
Article in English | MEDLINE | ID: mdl-37645794

ABSTRACT

Human Papillomaviruses (HPVs) are associated with around 5-10% of human cancers, notably nearly 99% of cervical cancers. The mechanisms by which HPV interacts with stratified epithelium (differentiated layers) during the viral life cycle and oncogenesis remain unclear. In this study, we used single-cell transcriptome analysis to study viral gene and host cell differentiation-associated heterogeneity in HPV-positive cervical cancer tissue. We examined the HPV16 genes E1, E6, and E7 and found that they were expressed differently across nine epithelial clusters. Three epithelial clusters had the highest proportions of HPV-positive cells (33.6%, 37.5%, and 32.4%, respectively), while two exhibited the lowest proportions (7.21% and 5.63%, respectively). Notably, the cluster with the most HPV-positive cells deviated significantly from normal epithelial layer markers, exhibiting functional heterogeneity and altered epithelial structuring. This indicates that significant molecular heterogeneity exists in cancer tissues and that these cells have gene signatures distinct from those of normal epithelial cells. Compared with HPV-negative cells, these HPV-positive cells showed differences in the expression of genes related to the extracellular matrix, cell adhesion, proliferation, and apoptosis. Further, the viral oncogenes E6 and E7 appeared to modify epithelial function via distinct pathways, thus contributing to cervical cancer progression. We investigated HPV and host transcripts from a novel viewpoint focusing on layer heterogeneity. Our results indicate varied HPV expression across epithelial clusters and epithelial heterogeneity associated with viral oncogenes, contributing biological insights to this critical field of study.
