Results 1 - 20 of 46,868
1.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38980369

ABSTRACT

Recent studies have extensively used deep learning algorithms to analyze gene expression to predict disease diagnosis, treatment effectiveness, and survival outcomes. Survival analysis studies on diseases with high mortality rates, such as cancer, are indispensable. However, deep learning models are plagued by overfitting owing to the limited sample size relative to the large number of genes. Consequently, the latest style-transfer deep generative models have been implemented to generate gene expression data. However, these models are limited in their applicability for clinical purposes because they generate only transcriptomic data. Therefore, this study proposes ctGAN, which enables the combined transformation of gene expression and survival data using a generative adversarial network (GAN). ctGAN improves survival analysis by augmenting data through style transformations between breast cancer and 11 other cancer types. We evaluated the concordance index (C-index) enhancements compared with previous models to demonstrate its superiority. Performance improvements were observed in nine of the 11 cancer types. Moreover, ctGAN outperformed previous models in seven out of the 11 cancer types, with colon adenocarcinoma (COAD) exhibiting the most significant improvement (median C-index increase of ~15.70%). Furthermore, integrating the generated COAD enhanced the log-rank p-value (0.041) compared with using only the real COAD (p-value = 0.797). Based on the data distribution, we demonstrated that the model generated highly plausible data. In clustering evaluation, ctGAN exhibited the highest performance in most cases (89.62%). These findings suggest that ctGAN can be meaningfully utilized to predict disease progression and select personalized treatments in the medical field.
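For readers unfamiliar with the metric reported above, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the ctGAN evaluation code; the function name `concordance_index` and its arguments are hypothetical, and pairs with tied survival times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs whose predicted risks are ordered correctly.

    times: observed follow-up times
    events: 1 if the event (death) was observed, 0 if censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times (simplification) and pairs where the earlier
        # subject was censored, because their true order is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so an increase in the median C-index means the model ranks more patient pairs correctly.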


Subject(s)
Deep Learning , Humans , Survival Analysis , Algorithms , Neoplasms/genetics , Neoplasms/mortality , Gene Expression Profiling/methods , Neural Networks, Computer , Computational Biology/methods , Breast Neoplasms/genetics , Breast Neoplasms/mortality , Female , Gene Expression Regulation, Neoplastic
2.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38980371

ABSTRACT

Accurate prediction of protein-ligand binding affinity (PLA) is important for drug discovery. Recent advances in applying graph neural networks have shown great potential for PLA prediction. However, existing methods usually neglect the geometric information (i.e. bond angles), leading to difficulties in accurately distinguishing different molecular structures. In addition, these methods also pose limitations in representing the binding process of protein-ligand complexes. To address these issues, we propose a novel geometry-enhanced mid-fusion network, named GEMF, to learn comprehensive molecular geometry and interaction patterns. Specifically, the GEMF consists of a graph embedding layer, a message passing phase, and a multi-scale fusion module. GEMF can effectively represent protein-ligand complexes as graphs, with graph embeddings based on physicochemical and geometric properties. Moreover, our dual-stream message passing framework models both covalent and non-covalent interactions. In particular, the edge-update mechanism, which is based on line graphs, can fuse both distance and angle information in the covalent branch. In addition, the communication branch consisting of multiple heterogeneous interaction modules is developed to learn intricate interaction patterns. Finally, we fuse the multi-scale features from the covalent, non-covalent, and heterogeneous interaction branches. The extensive experimental results on several benchmarks demonstrate the superiority of GEMF compared with other state-of-the-art methods.


Subject(s)
Neural Networks, Computer , Protein Binding , Proteins , Proteins/chemistry , Proteins/metabolism , Ligands , Algorithms , Computational Biology/methods , Drug Discovery/methods
3.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38980373

ABSTRACT

Inferring gene regulatory networks (GRNs) allows us to obtain a deeper understanding of cellular function and disease pathogenesis. Recent advances in single-cell RNA sequencing (scRNA-seq) technology have improved the accuracy of GRN inference. However, many methods for inferring individual GRNs from scRNA-seq data are limited because they overlook intercellular heterogeneity and similarities between different cell subpopulations, which are often present in the data. Here, we propose a deep learning-based framework, DeepGRNCS, for jointly inferring GRNs across cell subpopulations. We follow the commonly accepted hypothesis that the expression of a target gene can be predicted based on the expression of transcription factors (TFs) due to underlying regulatory relationships. We initially processed scRNA-seq data by discretizing data scattering using the equal-width method. Then, we trained deep learning models to predict target gene expression from TFs. By individually removing each TF from the expression matrix, we used pre-trained deep model predictions to infer regulatory relationships between TFs and genes, thereby constructing the GRN. Our method outperforms existing GRN inference methods for various simulated and real scRNA-seq datasets. Finally, we applied DeepGRNCS to non-small cell lung cancer scRNA-seq data to identify key genes in each cell subpopulation and analyzed their biological relevance. In conclusion, DeepGRNCS effectively predicts cell subpopulation-specific GRNs. The source code is available at https://github.com/Nastume777/DeepGRNCS.
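The equal-width discretization step mentioned above can be illustrated with a short NumPy sketch. This is not taken from the DeepGRNCS source; the function name `equal_width_discretize` and the choice of bin count are assumptions made for illustration only.

```python
import numpy as np


def equal_width_discretize(values, n_bins):
    """Map each value to a bin index in 0..n_bins-1 using equally wide bins."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # All values identical: everything falls into a single bin.
        return np.zeros(values.shape, dtype=int)
    edges = np.linspace(lo, hi, n_bins + 1)
    # Use interior edges only, so the maximum value lands in the last bin
    # instead of overflowing into an extra one.
    return np.digitize(values, edges[1:-1])


expression = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
bins = equal_width_discretize(expression, n_bins=3)
```

Here the range 0 to 9 is split into three intervals of width 3, so low, medium, and high expression values receive bin indices 0, 1, and 2.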


Subject(s)
Deep Learning , Gene Regulatory Networks , Single-Cell Analysis , Humans , Single-Cell Analysis/methods , Transcription Factors/genetics , Transcription Factors/metabolism , Computational Biology/methods , Sequence Analysis, RNA/methods , RNA-Seq/methods
4.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38980372

ABSTRACT

Around 50 years ago, molecular biology opened the path to understand changes in forms, adaptations, complexity, or the basis of human diseases through myriads of reports on gene birth, gene duplication, gene expression regulation, and splicing regulation, among other relevant mechanisms behind gene function. Here, with the advent of big data and artificial intelligence (AI), we focus on an elusive and intriguing mechanism of gene function regulation, RNA editing, in which a single nucleotide from an RNA molecule is changed, with a remarkable impact in the increase of the complexity of the transcriptome and proteome. We present a new generation approach to assess the functional conservation of the RNA-editing targeting mechanism using two AI learning algorithms, random forest (RF) and bidirectional long short-term memory (biLSTM) neural networks with an attention layer. These algorithms, combined with RNA-editing data coming from databases and variant calling from same-individual RNA and DNA-seq experiments from different species, allowed us to predict RNA-editing events using both primary sequence and secondary structure. Then, we devised a method for assessing conservation or divergence in the molecular mechanisms of editing completely in silico: the cross-testing analysis. This novel method not only helps to understand the conservation of the editing mechanism through evolution but could set the basis for achieving a better understanding of the adenosine-targeting mechanism in other fields.


Subject(s)
Machine Learning , RNA Editing , Humans , Algorithms , Computer Simulation , Computational Biology/methods , Neural Networks, Computer , RNA/genetics , RNA/metabolism
5.
BMC Bioinformatics ; 25(1): 232, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38982382

ABSTRACT

BACKGROUND: Characterization of microbial growth is of both fundamental and applied interest. Modern platforms can automate collection of high-throughput microbial growth curves, necessitating the development of computational tools to handle and analyze these data to produce insights. RESULTS: To address this need, here I present a newly-developed R package: gcplyr. gcplyr can flexibly import growth curve data in common tabular formats, and reshapes it under a tidy framework that is flexible and extendable, enabling users to design custom analyses or plot data with popular visualization packages. gcplyr can also incorporate metadata and generate or import experimental designs to merge with data. Finally, gcplyr carries out model-free (non-parametric) analyses. These analyses do not require mathematical assumptions about microbial growth dynamics, and gcplyr is able to extract a broad range of important traits, including growth rate, doubling time, lag time, maximum density and carrying capacity, diauxie, area under the curve, extinction time, and more. CONCLUSIONS: gcplyr makes scripted analyses of growth curve data in R straightforward, streamlines common data wrangling and analysis steps, and easily integrates with common visualization and statistical analyses.
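As an illustration of what a model-free trait calculation can look like, the sketch below estimates a maximum per-capita growth rate from finite differences of log density and converts it to a doubling time via ln(2)/rate. It is written in Python rather than R and does not use or reproduce the gcplyr API; both function names are hypothetical.

```python
import math


def max_percapita_growth_rate(times, densities):
    """Largest slope of ln(density) between consecutive measurements."""
    rates = []
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        # Per-capita rate is the change in log density per unit time.
        rates.append((math.log(densities[k + 1]) - math.log(densities[k])) / dt)
    return max(rates)


def doubling_time(rate):
    """Time for the population to double at a constant per-capita rate."""
    return math.log(2) / rate


hours = [0, 1, 2, 3]
od = [0.01, 0.02, 0.04, 0.05]  # made-up optical densities
rate = max_percapita_growth_rate(hours, od)
```

No growth model is fitted here: the rate comes directly from the measured points, which is the sense in which such analyses avoid assumptions about growth dynamics. Real data would usually need smoothing before differencing.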


Subject(s)
Software , Computational Biology/methods , Data Analysis
6.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38982642

ABSTRACT

Inferring cell type proportions from bulk transcriptome data is crucial in immunology and oncology. Here, we introduce guided LDA deconvolution (GLDADec), a bulk deconvolution method that guides topics using cell type-specific marker gene names to estimate topic distributions for each sample. Through benchmarking using blood-derived datasets, we demonstrate its high estimation performance and robustness. Moreover, we apply GLDADec to heterogeneous tissue bulk data and perform comprehensive cell type analysis in a data-driven manner. We show that GLDADec outperforms existing methods in estimation performance and evaluate its biological interpretability by examining enrichment of biological processes for topics. Finally, we apply GLDADec to The Cancer Genome Atlas tumor samples, enabling subtype stratification and survival analysis based on estimated cell type proportions, thus proving its practical utility in clinical settings. This approach, utilizing marker gene names as partial prior information, can be applied to various scenarios for bulk data deconvolution. GLDADec is available as an open-source Python package at https://github.com/mizuno-group/GLDADec.


Subject(s)
Software , Humans , Gene Expression Profiling/methods , Algorithms , Transcriptome , Computational Biology/methods , Neoplasms/genetics , Biomarkers, Tumor/genetics , Genetic Markers
7.
Front Immunol ; 15: 1424197, 2024.
Article in English | MEDLINE | ID: mdl-38983866

ABSTRACT

Background: Lung squamous cell carcinoma (LUSC) ranks among the carcinomas with the highest incidence and dismal survival rates, suffering from a lack of effective therapeutic strategies. Consequently, biomarkers facilitating early diagnosis of LUSC could significantly enhance patient survival. This study aims to identify novel biomarkers for LUSC. Methods: Utilizing the TCGA, GTEx, and CGGA databases, we focused on the gene encoding Family with Sequence Similarity 20, Member A (FAM20A) across various cancers. We then corroborated these bioinformatic predictions with clinical samples. A range of analytical tools, including Kaplan-Meier, MethSurv database, Wilcoxon rank-sum, Kruskal-Wallis tests, Gene Set Enrichment Analysis, and TIMER database, were employed to assess the diagnostic and prognostic value of FAM20A in LUSC. These tools also helped evaluate immune cell infiltration, immune checkpoint genes, DNA repair-related genes, DNA methylation, and tumor-related pathways. Results: FAM20A expression was found to be significantly reduced in LUSC, correlating with lower survival rates. It exhibited a negative correlation with key proteins in DNA repair signaling pathways, potentially contributing to LUSC's radiotherapy resistance. Additionally, FAM20A showed a positive correlation with immune checkpoints like CTLA-4, indicating potential heightened sensitivity to immunotherapies targeting these checkpoints. Conclusion: FAM20A emerges as a promising diagnostic and prognostic biomarker for LUSC, offering potential clinical applications.


Subject(s)
Biomarkers, Tumor , Carcinoma, Squamous Cell , Lung Neoplasms , Humans , Biomarkers, Tumor/genetics , Lung Neoplasms/diagnosis , Lung Neoplasms/genetics , Lung Neoplasms/mortality , Lung Neoplasms/immunology , Carcinoma, Squamous Cell/diagnosis , Carcinoma, Squamous Cell/genetics , Carcinoma, Squamous Cell/immunology , Prognosis , Gene Expression Regulation, Neoplastic , Computational Biology/methods , Databases, Genetic , Bromodomain Containing Proteins , Nerve Tissue Proteins , Transcription Factors , Antigens, Nuclear
8.
F1000Res ; 13: 556, 2024.
Article in English | MEDLINE | ID: mdl-38984017

ABSTRACT

Background: Determining the appropriate computational requirements and software performance is essential for efficient genomic surveillance. The lack of standardized benchmarking complicates software selection, especially with limited resources. Methods: We developed a containerized benchmarking pipeline to evaluate seven long-read assemblers-Canu, GoldRush, MetaFlye, Strainline, HaploDMF, iGDA, and RVHaplo-for viral haplotype reconstruction, using both simulated and experimental Oxford Nanopore sequencing data of HIV-1 and other viruses. Benchmarking was conducted on three computational systems to assess each assembler's performance, utilizing QUAST and BLASTN for quality assessment. Results: Our findings show that assembler choice significantly impacts assembly time, with CPU and memory usage having minimal effect. Assembler selection also influences the size of the contigs, with a minimum read length of 2,000 nucleotides required for quality assembly. A 4,000-nucleotide read length improves quality further. Canu was efficient among de novo assemblers but not suitable for multi-strain mixtures, while GoldRush produced only consensus assemblies. Strainline and MetaFlye were suitable for metagenomic sequencing data, with Strainline requiring high memory and MetaFlye operable on low-specification machines. Among reference-based assemblers, iGDA had high error rates, RVHaplo showed the best runtime and accuracy but became ineffective with similar sequences, and HaploDMF, utilizing machine learning, had fewer errors with a slightly longer runtime. Conclusions: The HIV-64148 pipeline, containerized using Docker, facilitates easy deployment and offers flexibility to select from a range of assemblers to match computational systems or study requirements. This tool aids in genome assembly and provides valuable information on HIV-1 sequences, enhancing viral evolution monitoring and understanding.


Subject(s)
Computational Biology , Genomics , HIV-1 , Software , HIV-1/genetics , Computational Biology/methods , Genomics/methods , Humans , Genome, Viral/genetics
9.
Chin Clin Oncol ; 13(3): 32, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38984486

ABSTRACT

BACKGROUND: Hepatocellular carcinoma (HCC) is the third leading cause of cancer-related deaths globally. To reduce HCC-related mortality, early diagnosis and therapeutic improvement are essential. Hub differentially expressed genes (HubGs) may serve as potential diagnostic and prognostic biomarkers, also offering therapeutic targets for precise therapies. Therefore, we aimed to identify top-ranked hub genes for the diagnosis, prognosis, and therapy of HCC. METHODS: Through a systematic literature review, 202 HCC-related HubGs were derived from 59 studies, yet consistent detection across these was lacking. Then, we identified top-ranked HubGs (tHubGs) by integrated bioinformatics analysis, highlighting their functions, pathways, and regulators that might be more representative of the diagnosis, prognosis, and therapies of HCC. RESULTS: In this study, eight HubGs (CDK1, AURKA, CDC20, CCNB2, TOP2A, PLK1, BUB1B, and BIRC5) were identified as the tHubGs through the protein-protein interaction (PPI) network and survival analysis. Their differential expression in different stages of HCC, validated using The Cancer Genome Atlas (TCGA) Program database, suggests their potential as early HCC markers. The enrichment analyses revealed some important roles in HCC-related biological processes (BPs), molecular functions (MFs), cellular components (CCs), and signaling pathways. Moreover, the gene regulatory network analysis highlighted key transcription factors (TFs) and microRNAs (miRNAs) that regulate these tHubGs at the transcriptional and post-transcriptional levels. Finally, we selected three drugs (CD437, avrainvillamide, and LRRK2-IN-1) as candidate drugs for HCC treatment as they showed strong binding with all of our proposed and published protein receptors. CONCLUSIONS: The findings of this study may provide valuable resources for early diagnosis, prognosis, and therapies for HCC.


Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , Humans , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/therapy , Liver Neoplasms/genetics , Liver Neoplasms/therapy , Prognosis , Protein Interaction Maps , Computational Biology/methods , Biomarkers, Tumor/genetics , Gene Expression Regulation, Neoplastic
10.
NPJ Syst Biol Appl ; 10(1): 71, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38969664

ABSTRACT

This article reviews the current knowledge and recent advancements in computational modeling of the cell cycle. It offers a comparative analysis of various modeling paradigms, highlighting their unique strengths, limitations, and applications. Specifically, the article compares deterministic and stochastic models, single-cell versus population models, and mechanistic versus abstract models. This detailed analysis helps determine the most suitable modeling framework for various research needs. Additionally, the discussion extends to the utilization of these computational models to illuminate cell cycle dynamics, with a particular focus on cell cycle viability, crosstalk with signaling pathways, tumor microenvironment, DNA replication, and repair mechanisms, underscoring their critical roles in tumor progression and the optimization of cancer therapies. By applying these models to crucial aspects of cancer therapy planning for better outcomes, including drug efficacy quantification, drug discovery, drug resistance analysis, and dose optimization, the review highlights the significant potential of computational insights in enhancing the precision and effectiveness of cancer treatments. This emphasis on the intricate relationship between computational modeling and therapeutic strategy development underscores the pivotal role of advanced modeling techniques in navigating the complexities of cell cycle dynamics and their implications for cancer therapy.


Subject(s)
Cell Cycle , Computer Simulation , Models, Biological , Neoplasms , Humans , Neoplasms/therapy , Neoplasms/pathology , Cell Cycle/physiology , Signal Transduction , Tumor Microenvironment , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Computational Biology/methods
11.
Sci Rep ; 14(1): 15551, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38969714

ABSTRACT

A major challenge in therapeutic approaches applying hematopoietic stem cells (HSCs) is the cell quantity. The primary objective of this study was to predict miRNAs and anti-miRNAs using bioinformatics tools and to investigate their effects on the expression levels of key genes predicted to improve proliferation and inhibit differentiation in HSCs isolated from human umbilical cord blood (HUCB). A network including genes related to the differentiation and proliferation stages of HSCs was constructed by enriching data from text (PubMed) and the StemChecker server with KEGG signaling pathways, and was improved using GEO datasets. Bioinformatics tools predicted a profile of miRNAs containing miR-20a-5p, miR-423-5p, and a chimeric anti-miRNA constructed from 5'-miR-340/3'-miR-524 for the high-score genes (RB1, SMAD4, STAT1, CALML4, GNG13, and CDKN1A/CDKN1B) in the network. The miRNAs and anti-miRNA were transferred into HSCs using polyethylenimine (PEI). The gene expression levels were estimated using the RT-qPCR technique in the cell groups containing PEI + (miRNA/anti-miRNA) (n = 6). Furthermore, CD markers (90, 16, and 45) were evaluated using flow cytometry. Strong relationships were found between the high-score genes, miRNAs, and chimeric anti-miRNA. The RB1, SMAD4, and STAT1 gene expression levels were decreased by miR-20a-5p (P < 0.05). Additionally, the anti-miRNA increased the gene expression level of GNG13 (P < 0.05), whereas miR-423-5p decreased the CDKN1A gene expression level (P < 0.01). The cellular count also increased significantly (P < 0.05), but the CD45 differentiation marker did not change in the cell groups. The study revealed that the predicted miRNA/anti-miRNA profile expands HSCs isolated from HUCB. While miR-20a-5p suppressed the RB1, SMAD4, and STAT1 genes involved in cellular differentiation, the anti-miRNA promoted the GNG13 gene related to the proliferation process. Notably, the mixed miRNA/anti-miRNA group exhibited the highest cellular expansion. This approach could hold promise for enhancing the cell quantity in HSC therapy.


Subject(s)
Cell Differentiation , Cell Proliferation , Hematopoietic Stem Cells , MicroRNAs , MicroRNAs/genetics , MicroRNAs/metabolism , Hematopoietic Stem Cells/metabolism , Hematopoietic Stem Cells/cytology , Humans , Cell Proliferation/genetics , Cell Differentiation/genetics , Fetal Blood/cytology , Computational Biology/methods , Gene Regulatory Networks , Gene Expression Regulation , Gene Expression Profiling
12.
BMC Biotechnol ; 24(1): 45, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38970027

ABSTRACT

Marburg virus (MARV) is a highly contagious and virulent agent belonging to the Filoviridae family. MARV causes severe hemorrhagic fever in humans and non-human primates. Owing to its highly virulent nature, preventive approaches are promising for its control. There is currently no approved drug or vaccine against MARV, and management mainly involves supportive care to treat symptoms and prevent complications. Our aim was to design a novel multi-epitope vaccine (MEV) against MARV using immunoinformatics studies. In this study, various proteins (VP35, VP40 and glycoprotein precursor) were used and potential epitopes were selected. CTL and HTL epitopes covered 79.44% and 70.55% of the global population, respectively. The designed MEV construct was stable and expressed in an Escherichia coli (E. coli) host. The physicochemical properties were also acceptable. The MARV MEV candidate was predicted in silico to elicit comprehensive humoral and cellular immune responses. Additionally, efficient interaction with toll-like receptor 3 (TLR3) and its agonist (β-defensin) was predicted. These results need to be validated in further in vitro and in vivo studies.


Subject(s)
Computational Biology , Marburg Virus Disease , Marburgvirus , Viral Vaccines , Marburgvirus/immunology , Marburg Virus Disease/prevention & control , Marburg Virus Disease/immunology , Viral Vaccines/immunology , Computational Biology/methods , Animals , Humans , Epitopes, T-Lymphocyte/immunology , Epitopes, T-Lymphocyte/genetics , Epitopes/immunology , Epitopes/genetics , Epitopes/chemistry , Escherichia coli/genetics , Escherichia coli/metabolism , Immunoinformatics
14.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38975895

ABSTRACT

Spatial transcriptomics provides valuable insights into gene expression within the native tissue context, effectively merging molecular data with spatial information to uncover intricate cellular relationships and tissue organizations. In this context, deciphering cellular spatial domains becomes essential for revealing complex cellular dynamics and tissue structures. However, current methods encounter challenges in seamlessly integrating gene expression data with spatial information, resulting in less informative representations of spots and suboptimal accuracy in spatial domain identification. We introduce stCluster, a novel method that integrates graph contrastive learning with multi-task learning to refine informative representations for spatial transcriptomic data, consequently improving spatial domain identification. stCluster first leverages graph contrastive learning technology to obtain discriminative representations capable of recognizing spatially coherent patterns. Through jointly optimizing multiple tasks, stCluster further fine-tunes the representations to be able to capture complex relationships between gene expression and spatial organization. Benchmarked against six state-of-the-art methods, the experimental results reveal its proficiency in accurately identifying complex spatial domains across various datasets and platforms, spanning tissue, organ, and embryo levels. Moreover, stCluster can effectively denoise the spatial gene expression patterns and enhance the spatial trajectory inference. The source code of stCluster is freely available at https://github.com/hannshu/stCluster.


Subject(s)
Gene Expression Profiling , Transcriptome , Gene Expression Profiling/methods , Computational Biology/methods , Algorithms , Humans , Animals , Software , Machine Learning
15.
Nat Commun ; 15(1): 5690, 2024 Jul 07.
Article in English | MEDLINE | ID: mdl-38971800

ABSTRACT

Omics techniques generate comprehensive profiles of biomolecules in cells and tissues. However, a holistic understanding of underlying systems requires joint analyses of multiple data modalities. We present DPM, a data fusion method for integrating omics datasets using directionality and significance estimates of genes, transcripts, or proteins. DPM allows users to define how the input datasets are expected to interact directionally given the experimental design or biological relationships between the datasets. DPM prioritises genes and pathways that change consistently across the datasets and penalises those with inconsistent directionality. To demonstrate our approach, we characterise gene and pathway regulation in IDH-mutant gliomas by jointly analysing transcriptomic, proteomic, and DNA methylation datasets. Directional integration of survival information in ovarian cancer reveals candidate biomarkers with consistent prognostic signals in transcript and protein expression. DPM is a general and adaptable framework for gene prioritisation and pathway analysis in multi-omics datasets.


Subject(s)
DNA Methylation , Glioma , Ovarian Neoplasms , Proteomics , Humans , Proteomics/methods , Glioma/genetics , Glioma/metabolism , Female , Ovarian Neoplasms/genetics , Ovarian Neoplasms/metabolism , Transcriptome , Gene Expression Profiling/methods , Genomics/methods , Computational Biology/methods , Gene Expression Regulation, Neoplastic , Databases, Genetic , Multiomics
16.
Sci Rep ; 14(1): 15578, 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-38971817

ABSTRACT

There is a growing body of evidence suggesting that Hashimoto's thyroiditis (HT) may contribute to an increased risk of papillary thyroid carcinoma (PTC). However, the exact relationship between HT and PTC is still not fully understood. The objective of this study was to identify potential common biomarkers that may be associated with both PTC and HT. Three microarray datasets from the GEO database and an RNA-seq dataset from the TCGA database were collected to identify shared differentially expressed genes (DEGs) between HT and PTC. A total of 101 genes were identified as common DEGs, primarily enriched in inflammation- and immune-related pathways through GO and KEGG analysis. We performed protein-protein interaction analysis and identified six significant modules comprising a total of 29 genes. Subsequently, three hub genes (CD53, FCER1G, TYROBP) were selected using random forest (RF) algorithms for the development of three diagnostic models. The artificial neural network (ANN) model demonstrated superior performance. Notably, CD53 exerted the greatest influence on the ANN model output. We analyzed the protein expression of the three genes using the Human Protein Atlas database. Moreover, we observed various dysregulated immune cells that were significantly associated with the hub genes through immune infiltration analysis. Immunofluorescence staining confirmed the differential expression of CD53, FCER1G, and TYROBP, as well as the results of immune infiltration analysis. Lastly, we hypothesise that benzylpenicilloyl polylysine and aspirin may be effective in the treatment of HT and PTC and may prevent HT carcinogenesis. This study indicates that CD53, FCER1G, and TYROBP play a role in the development of HT and PTC, and may contribute to the progression of HT to PTC. These hub genes could potentially serve as diagnostic markers and therapeutic targets for PTC and HT.


Subject(s)
Biomarkers, Tumor , Computational Biology , Hashimoto Disease , Machine Learning , Thyroid Cancer, Papillary , Thyroid Neoplasms , Humans , Hashimoto Disease/genetics , Thyroid Cancer, Papillary/genetics , Thyroid Cancer, Papillary/diagnosis , Computational Biology/methods , Biomarkers, Tumor/genetics , Thyroid Neoplasms/genetics , Thyroid Neoplasms/diagnosis , Protein Interaction Maps/genetics , Gene Expression Regulation, Neoplastic , Gene Expression Profiling , Gene Regulatory Networks , Neural Networks, Computer
17.
Sci Rep ; 14(1): 15581, 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-38971877

ABSTRACT

In higher organisms, individual cells respond to signals and perturbations by epigenetic regulation and transcriptional adaptation. However, in addition to shifting the expression level of individual genes, the adaptive response of cells can also lead to shifts in the proportions of different cell types. Recent methods such as scRNA-seq allow for the interrogation of expression on the single-cell level, and can quantify individual cell type clusters within complex tissue samples. In order to identify clusters showing differential composition between different biological conditions, differential proportion analysis has recently been introduced. However, bioinformatics tools for robust proportion analysis of both replicated and unreplicated single-cell datasets are critically missing. In this manuscript, we present Scanpro, a modular tool for proportion analysis, seamlessly integrating into widely accepted frameworks in the Python environment. Scanpro is fast, accurate, supports datasets without replicates, and is intended to be used by bioinformatics experts and beginners alike.


Subject(s)
Computational Biology , Single-Cell Analysis , Software , Single-Cell Analysis/methods , Computational Biology/methods , Humans , Animals , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods
18.
Front Cell Infect Microbiol ; 14: 1393108, 2024.
Article in English | MEDLINE | ID: mdl-38975327

ABSTRACT

Multiple research groups have consistently underscored the intricate interplay between the microbiome and apical periodontitis. However, variability in experimental design and quantitative assessment has added a layer of complexity, making it challenging to comprehensively assess the relationship. Through an unbiased methodological refinement analysis, we re-analyzed 4 microbiota studies including 120 apical samples from infected teeth (with/without root canal treatment) and healthy teeth, using meta-analysis and machine learning. With high-performing machine-learning models, we discover disease signatures of related species and enriched metabolic pathways, expanding the understanding of apical periodontitis with potential therapeutic implications. Our approach employs uniform computational tools across datasets to leverage statistical power and define a reproducible signal potentially linked to the development of secondary apical periodontitis (SAP).


Subject(s)
Machine Learning , Microbiota , Periapical Periodontitis , Periapical Periodontitis/microbiology , Humans , Bacteria/classification , Bacteria/genetics , Bacteria/isolation & purification , Computational Biology/methods
19.
PLoS One ; 19(7): e0305413, 2024.
Article in English | MEDLINE | ID: mdl-38976715

ABSTRACT

Pancreatic ductal adenocarcinoma is the most prevalent pancreatic cancer and is considered a significant global health concern. Chemotherapy and surgery are the mainstays of current pancreatic cancer treatments; however, only a few cases are suitable for surgery, and most cases will experience recurrence. Compared to DNA or peptide vaccines, mRNA vaccines for pancreatic cancer have more promise because of their delivery, enhanced immune responses, and lower risk of mutation. We constructed an mRNA vaccine by analyzing S100 family proteins, which are all major activators of receptors for advanced glycation end products. We applied immunoinformatic approaches, including physicochemical properties analysis, structural prediction and validation, molecular docking study, in silico cloning, and immune simulations. The designed mRNA vaccine was estimated to have a molecular weight of 165023.50 Da and was highly soluble (grand average of hydropathicity of -0.440). In the structural assessment, the vaccine appeared to be a stable and functional protein (Z score of -8.94). Also, the docking analysis suggested that the vaccine had a high affinity for TLR-2 and TLR-4 receptors. Additionally, the molecular mechanics with generalized Born and surface area solvation analysis of the "Vaccine-TLR-2" (-141.07 kcal/mol) and "Vaccine-TLR-4" (-271.72 kcal/mol) complexes also suggested a strong binding affinity for the receptors. Codon optimization also provided a high expression level, with a GC content of 47.04% and a codon adaptation index score of 1.0. The appearance of memory B-cells and T-cells was also observed over time, with an increased level of helper T-cells and immunoglobulins (IgM and IgG). Moreover, the minimum free energy of the mRNA vaccine was predicted at -1760.00 kcal/mol, indicating the stability of the vaccine following its entry, transcription, and expression. This hypothetical vaccine offers a groundbreaking tool for future research and therapeutic development of pancreatic cancer.


Subject(s)
Cancer Vaccines , Molecular Docking Simulation , Pancreatic Neoplasms , Pancreatic Neoplasms/immunology , Humans , Cancer Vaccines/immunology , Cancer Vaccines/therapeutic use , mRNA Vaccines/immunology , Computational Biology/methods , Toll-Like Receptor 4/immunology , Toll-Like Receptor 4/metabolism , Vaccinology/methods , Toll-Like Receptor 2/immunology , Computer Simulation , RNA, Messenger/genetics , RNA, Messenger/immunology , Immunoinformatics
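
The GC content reported in record 19 is the percentage of G and C bases in the sequence. A minimal sketch of that calculation follows; the function name `gc_content` and the 12-base toy sequence are hypothetical and unrelated to the vaccine construct in the study.

```python
def gc_content(sequence):
    """Return the percentage of G and C bases in a DNA or RNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)

# Toy mRNA fragment (hypothetical): 6 of its 12 bases are G or C.
toy_mrna = "AUGGCCAAGUGA"
gc_percent = gc_content(toy_mrna)
print(f"GC content: {gc_percent:.2f}%")
```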
20.
Nat Commun ; 15(1): 5700, 2024 Jul 07.
Article in English | MEDLINE | ID: mdl-38972896

ABSTRACT

Identifying spatially variable genes (SVGs) is crucial for understanding the spatiotemporal characteristics of diseases and tissue structures, posing a distinctive challenge in spatial transcriptomics research. We propose HEARTSVG, a distribution-free, test-based method for fast and accurate identification of spatially variable genes in large-scale spatial transcriptomic data. Extensive simulations demonstrate that HEARTSVG outperforms state-of-the-art methods with higher F1 scores (average F1 score = 0.948), improved computational efficiency, scalability, and reduced false positives (FPs). Through analysis of twelve real datasets from various spatial transcriptomic technologies, HEARTSVG identifies a greater number of biologically significant SVGs (average AUC = 0.792) than other comparative methods without prespecifying spatial patterns. Furthermore, by clustering SVGs, we uncover two distinct tumor spatial domains characterized by unique spatial expression patterns, spatial-temporal locations, and biological functions in human colorectal cancer data, unraveling the complexity of tumors.


Subject(s)
Gene Expression Profiling , Transcriptome , Humans , Gene Expression Profiling/methods , Colorectal Neoplasms/genetics , Computational Biology/methods , Algorithms , Gene Expression Regulation, Neoplastic , Computer Simulation , Databases, Genetic
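
The F1 score cited in record 20 is the harmonic mean of precision and recall. The sketch below shows how it could be computed for a set of called SVGs against a simulated ground truth. This is the standard metric only, not the HEARTSVG method; the function name `f1_score` and the gene names are hypothetical.

```python
def f1_score(predicted, truth):
    """F1 score of a predicted gene set against a ground-truth gene set."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    precision = true_positives / len(predicted)  # share of calls that are correct
    recall = true_positives / len(truth)         # share of true SVGs recovered
    return 2 * precision * recall / (precision + recall)

# Toy example (hypothetical genes): 3 of 4 calls are correct, 1 true SVG is missed.
true_svgs = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
called_svgs = {"GENE_A", "GENE_B", "GENE_C", "GENE_X"}
score = f1_score(called_svgs, true_svgs)
print(f"F1 = {score:.3f}")
```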