Results 1 - 20 of 45,995
1.
PLoS Comput Biol ; 20(5): e1012024, 2024 May.
Article in English | MEDLINE | ID: mdl-38717988

ABSTRACT

The activation levels of biologically significant gene sets are emerging tumor molecular markers and play an irreplaceable role in the tumor research field; however, web-based tools for prognostic analyses using them as tumor molecular markers remain scarce. We developed a web-based tool, PESSA, for survival analysis using gene set activation levels. All data analyses were implemented in R. Activation levels of the Molecular Signatures Database (MSigDB) gene sets were assessed using the single-sample gene set enrichment analysis (ssGSEA) method based on data from the Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA), the European Genome-phenome Archive (EGA) and supplementary tables of articles. PESSA was used to perform median and optimal cut-off dichotomous grouping of ssGSEA scores for each dataset, relying on the survival and survminer packages for survival analysis and visualisation. PESSA is an open-access web tool for visualizing the results of tumor prognostic analyses using gene set activation levels. A total of 238 datasets from the GEO, TCGA, EGA, and supplementary tables of articles, covering 51 cancer types and 13 survival outcome types, and 13,434 tumor-related gene sets obtained from MSigDB were used for pre-grouping. Users can obtain the results, including Kaplan-Meier analyses based on the median and optimal cut-off values and accompanying visualization plots and the Cox regression analyses of dichotomous and continuous variables, by selecting the gene set markers of interest. PESSA (https://smuonco.shinyapps.io/PESSA/ or http://robinl-lab.com/PESSA) is a large-scale web-based tumor survival analysis tool covering a large amount of data that creatively uses predefined gene set activation levels as molecular markers of tumors.


Subject(s)
Biomarkers, Tumor , Computational Biology , Databases, Genetic , Internet , Neoplasms , Software , Humans , Neoplasms/genetics , Neoplasms/mortality , Survival Analysis , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Computational Biology/methods , Prognosis , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic/genetics
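
The median cut-off grouping and Kaplan-Meier analysis mentioned in the abstract above can be illustrated with a minimal sketch. The abstract states that PESSA is implemented in R with the survival and survminer packages; the Python below is not that code, only a plain illustration of the two ideas. The function names `median_split` and `kaplan_meier` and the sample values are hypothetical.

```python
from statistics import median


def median_split(scores):
    """Label each sample 'high' if its score exceeds the cohort median, else 'low'."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if the sample was censored.
    """
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time point.
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Both events and censored samples leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical ssGSEA-like scores and follow-up data for six samples.
scores = [0.1, 0.4, 0.2, 0.9, 0.7, 0.3]
groups = median_split(scores)
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In practice one curve would be estimated per group and the groups compared with a log-rank test or a Cox model, which this sketch does not attempt.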
2.
PLoS One ; 19(5): e0303471, 2024.
Article in English | MEDLINE | ID: mdl-38718074

ABSTRACT

OBJECTIVE: Preeclampsia (PE) is a severe complication of unclear pathogenesis associated with pregnancy. This research aimed to elucidate the properties of immune cell infiltration and potential biomarkers of PE based on bioinformatics analysis. MATERIALS AND METHODS: Two PE datasets were imported from the Gene Expression Omnibus (GEO) and screened to identify differentially expressed genes (DEGs). Significant module genes were identified by weighted gene co-expression network analysis (WGCNA). DEGs that interacted with key module genes (GLu-DEGs) were analyzed further by Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses. The diagnostic value of the genes was assessed using receiver operating characteristic (ROC) curves, protein-protein interaction (PPI) networks were constructed using GeneMANIA, and GSVA analysis was performed using the MSigDB database. Immune cell infiltration was analyzed using the TISIDB database, and StarBase and Cytoscape were used to construct an RBP-mRNA network. The identified hub genes were validated in two independent datasets. For further confirmation, placental tissue from healthy pregnant women and women with PE was collected and analyzed using both RT-qPCR and immunohistochemistry. RESULTS: A total of seven GLu-DEGs were obtained and were found to be involved in pathways associated with the transport of sulfur compounds, PPAR signaling, and energy metabolism, as shown by GO and KEGG analyses. GSVA indicated significant increases in adipocytokine signaling. Furthermore, single-sample Gene Set Enrichment Analysis (ssGSEA) indicated that the levels of activated B cells and T follicular helper cells were significantly increased in the PE group and were negatively correlated with GLu-DEGs, suggesting their potential importance.
CONCLUSION: In summary, the results showed a correlation between glutamine metabolism and immune cells, providing new insights into the understanding of PE pathogenesis and furnishing evidence for future advances in the treatment of this disease.


Subject(s)
Gene Regulatory Networks , Glutamine , Pre-Eclampsia , Protein Interaction Maps , Humans , Pre-Eclampsia/genetics , Pre-Eclampsia/immunology , Female , Pregnancy , Protein Interaction Maps/genetics , Glutamine/metabolism , Computational Biology/methods , Gene Ontology , Gene Expression Profiling , Adult , Placenta/metabolism , Placenta/immunology
3.
Sci Adv ; 10(19): eadj1424, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38718126

ABSTRACT

The ongoing expansion of human genomic datasets propels therapeutic target identification; however, extracting gene-disease associations from gene annotations remains challenging. Here, we introduce Mantis-ML 2.0, a framework integrating AstraZeneca's Biological Insights Knowledge Graph and numerous tabular datasets, to assess gene-disease probabilities throughout the phenome. We use graph neural networks, capturing the graph's holistic structure, and train them on hundreds of balanced datasets via a robust semi-supervised learning framework to provide gene-disease probabilities across the human exome. Mantis-ML 2.0 incorporates natural language processing to automate disease-relevant feature selection for thousands of diseases. The enhanced models demonstrate a 6.9% average classification power boost, achieving a median receiver operating characteristic (ROC) area under curve (AUC) score of 0.90 across 5220 diseases from Human Phenotype Ontology, OpenTargets, and Genomics England. Notably, Mantis-ML 2.0 prioritizes associations from an independent UK Biobank phenome-wide association study (PheWAS), providing a stronger form of triaging and mitigating against underpowered PheWAS associations. Results are exposed through an interactive web resource.


Subject(s)
Biological Specimen Banks , Neural Networks, Computer , Humans , Genome-Wide Association Study/methods , Phenotype , United Kingdom , Phenomics/methods , Genetic Predisposition to Disease , Genomics/methods , Databases, Genetic , Algorithms , Computational Biology/methods , UK Biobank
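
The ROC AUC score quoted in the abstract above has a simple interpretation: it is the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, with ties counted as half. The sketch below computes it by direct pairwise comparison. It is an illustration of the metric only, not code from Mantis-ML 2.0; the function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels are 1 (e.g. known disease gene) or 0; scores are predicted probabilities.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A positive outranking a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical gene-level predictions for one disease.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise form is quadratic in the number of examples; for large gene lists a rank-based formulation gives the same value more cheaply.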
4.
Methods Mol Biol ; 2808: 121-127, 2024.
Article in English | MEDLINE | ID: mdl-38743366

ABSTRACT

During the infection of a host cell by an infectious agent, a series of gene expression changes occurs as a consequence of host-pathogen interactions. Unraveling this complex interplay is key to understanding microbial virulence and host response pathways, thus providing the basis for new molecular insights into the mechanisms of pathogenesis and the corresponding immune response. Dual RNA sequencing (dual RNA-seq) has been developed to simultaneously determine pathogen and host transcriptomes, enabling both differential and coexpression analyses between the two partners as well as genome characterization in the case of RNA viruses. Here, we provide a detailed laboratory protocol and bioinformatics analysis guidelines for dual RNA-seq experiments focusing on, but not restricted to, measles virus (MeV) as a pathogen of interest. The application of dual RNA-seq technologies in MeV-infected patients can potentially provide valuable information on the structure of the viral RNA genome and on cellular innate immune responses and drive the discovery of new targets for antiviral therapy.


Subject(s)
Genome, Viral , Host-Pathogen Interactions , Measles virus , Measles , RNA, Viral , Humans , Measles/virology , Measles/immunology , Measles/genetics , Measles virus/genetics , Measles virus/pathogenicity , RNA, Viral/genetics , Host-Pathogen Interactions/genetics , Host-Pathogen Interactions/immunology , Computational Biology/methods , Sequence Analysis, RNA/methods , RNA-Seq/methods , Transcriptome , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods
5.
Nat Commun ; 15(1): 4055, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38744843

ABSTRACT

We introduce GRouNdGAN, a gene regulatory network (GRN)-guided reference-based causal implicit generative model for simulating single-cell RNA-seq data, in silico perturbation experiments, and benchmarking GRN inference methods. Through the imposition of a user-defined GRN in its architecture, GRouNdGAN simulates steady-state and transient-state single-cell datasets where genes are causally expressed under the control of their regulating transcription factors (TFs). Training on six experimental reference datasets, we show that our model captures non-linear TF-gene dependencies and preserves gene identities, cell trajectories, pseudo-time ordering, and technical and biological noise, with no user manipulation and only implicit parameterization. GRouNdGAN can synthesize cells under new conditions to perform in silico TF knockout experiments. Benchmarking various GRN inference algorithms reveals that GRouNdGAN effectively bridges the existing gap between simulated and biological data benchmarks of GRN inference algorithms, providing gold standard ground truth GRNs and realistic cells corresponding to the biological system of interest.


Subject(s)
Algorithms , Computer Simulation , Gene Regulatory Networks , RNA-Seq , Single-Cell Analysis , Single-Cell Analysis/methods , RNA-Seq/methods , Humans , Transcription Factors/metabolism , Transcription Factors/genetics , Computational Biology/methods , Benchmarking , Sequence Analysis, RNA/methods , Single-Cell Gene Expression Analysis
6.
Sci Rep ; 14(1): 10782, 2024 May 11.
Article in English | MEDLINE | ID: mdl-38734775

ABSTRACT

The inflammasome component absent in melanoma 2 (AIM2) and the cholesterol efflux protein ATP binding cassette transporter A1 (ABCA1) have been reported to play opposing roles in atherosclerosis (AS) plaques. However, the relationship between AIM2 and ABCA1 remains unclear. In this study, we explored the potential connection between AIM2 and ABCA1 in the modulation of AS by bioinformatic analysis combined with in vitro experiments. The GEO database was used to obtain AS transcriptional profiling data, screen differentially expressed genes (DEGs), and perform weighted gene co-expression network analysis (WGCNA) to obtain AS-related modules. Phorbol myristate acetate (PMA) was used to induce macrophage modelling in THP-1 cells, and ox-LDL was used to induce macrophage foam cell formation. The experiment was divided into a Negative Control (NC) group, a Model Control (MC) group, an AIM2 overexpression + ox-LDL (OE AIM2 + ox-LDL) group, and an AIM2 short hairpin RNA + ox-LDL (sh AIM2 + ox-LDL) group. The intracellular cholesterol efflux rate was detected by scintillation counting; high-performance liquid chromatography (HPLC) was used to detect intracellular cholesterol levels; apoptosis levels were detected by TUNEL kit; levels of inflammatory markers (IL-1β, IL-18, ROS, and GSH) were detected by ELISA kits; and levels of AIM2 and ABCA1 proteins were detected by Western blot. Bioinformatic analysis revealed that the turquoise module correlated most strongly with AS, and AIM2 and ABCA1 were co-expressed in the turquoise module with a trend towards negative correlation. In vitro experiments demonstrated that AIM2 inhibited macrophage cholesterol efflux, resulting in increased intracellular cholesterol levels and foam cell formation. Moreover, AIM2 had a synergistic effect with ox-LDL, exacerbating macrophage oxidative stress and inflammatory response. Silencing AIM2 ameliorated the above conditions.
Furthermore, the protein expression levels of AIM2 and ABCA1 were consistent with the bioinformatic analysis, showing a negative correlation. AIM2 inhibits ABCA1 expression, causing abnormal cholesterol metabolism in macrophages and ultimately leading to foam cell formation. Inhibiting AIM2 may reverse this process. Overall, our study suggests that AIM2 is a reliable anti-inflammatory therapeutic target for AS. Inhibiting AIM2 expression may reduce foam cell formation and, consequently, inhibit the progression of AS plaques.


Subject(s)
ATP Binding Cassette Transporter 1 , Cholesterol , DNA-Binding Proteins , Foam Cells , Lipoproteins, LDL , ATP Binding Cassette Transporter 1/metabolism , ATP Binding Cassette Transporter 1/genetics , Foam Cells/metabolism , Humans , Cholesterol/metabolism , Lipoproteins, LDL/metabolism , DNA-Binding Proteins/metabolism , DNA-Binding Proteins/genetics , Atherosclerosis/metabolism , Atherosclerosis/pathology , Atherosclerosis/genetics , THP-1 Cells , Macrophages/metabolism , Computational Biology/methods , Apoptosis , Inflammation/metabolism , Inflammation/pathology
7.
Front Immunol ; 15: 1347415, 2024.
Article in English | MEDLINE | ID: mdl-38736878

ABSTRACT

Objective: Emerging evidence has shown that gut diseases can regulate the development and function of the immune, metabolic, and nervous systems through dynamic bidirectional communication on the brain-gut axis. However, the specific mechanism linking intestinal diseases and vascular dementia (VD) remains unclear. We designed this study specifically to further clarify the connection between VD and inflammatory bowel disease (IBD) through bioinformatics analyses. Methods: We downloaded gene expression profiles for VD (GSE122063) and IBD (GSE47908, GSE179285) from the Gene Expression Omnibus (GEO) database. Then individual Gene Set Enrichment Analysis (GSEA) was used to confirm the connection between the two diseases. The common differentially expressed genes (coDEGs) were identified, and the STRING database together with Cytoscape software was used to construct a protein-protein interaction (PPI) network and core functional modules. We identified the hub genes by using the Cytohubba plugin. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were applied to identify pathways of coDEGs and hub genes. Subsequently, receiver operating characteristic (ROC) analysis was used to assess the diagnostic ability of these hub genes, and a training dataset was used to verify the expression levels of the hub genes. An alternative single-sample gene set enrichment analysis (ssGSEA) algorithm was used to analyze immune cell infiltration between coDEGs and immune cells. Finally, the correlation between hub genes and immune cells was analyzed. Results: We screened 167 coDEGs. The main terms in the coDEG enrichment analysis focused on immune function. Eight shared hub genes were identified, including PTPRC, ITGB2, CYBB, IL1B, TLR2, CASP1, IL10RA, and BTK. The functional categories in the hub gene enrichment analysis were mainly involved in the regulation of immune function and neuroinflammatory response.
Compared to the healthy controls, abnormal infiltration of immune cells was found in VD and IBD. We also found a correlation between the 8 shared hub genes and immune cells. Conclusions: This study suggests that IBD may be a new risk factor for VD. The 8 hub genes may predict IBD complicated with VD. Immune-related coDEGs may be related to their association, which requires further research to prove.


Subject(s)
Computational Biology , Dementia, Vascular , Gene Expression Profiling , Gene Regulatory Networks , Inflammatory Bowel Diseases , Protein Interaction Maps , Humans , Inflammatory Bowel Diseases/genetics , Inflammatory Bowel Diseases/immunology , Computational Biology/methods , Dementia, Vascular/genetics , Dementia, Vascular/immunology , Databases, Genetic , Transcriptome , Gene Ontology
8.
Curr Protoc ; 4(5): e1046, 2024 May.
Article in English | MEDLINE | ID: mdl-38717471

ABSTRACT

Whole-genome sequencing is widely used to investigate population genomic variation in organisms of interest. Assorted tools have been independently developed to call variants from short-read sequencing data aligned to a reference genome, including single nucleotide polymorphisms (SNPs) and structural variations (SVs). We developed SNP-SVant, an integrated, flexible, and computationally efficient bioinformatic workflow that predicts high-confidence SNPs and SVs in organisms without benchmarked variants, which are traditionally used for distinguishing sequencing errors from real variants. In the absence of these benchmarked datasets, we leverage multiple rounds of statistical recalibration to increase the precision of variant prediction. The SNP-SVant workflow is flexible, with user options to trade off accuracy for sensitivity. The workflow predicts SNPs and small insertions and deletions using the Genome Analysis ToolKit (GATK) and predicts SVs using the Genome Rearrangement IDentification Software Suite (GRIDSS), and it culminates in variant annotation using custom scripts. A key utility of SNP-SVant is its scalability. Variant calling is a computationally expensive procedure, and thus, SNP-SVant uses a workflow management system with intermediary checkpoint steps to ensure efficient use of resources by minimizing redundant computations and omitting steps where dependent files are available. SNP-SVant also provides metrics to assess the quality of called variants and converts between VCF and aligned FASTA format outputs to ensure compatibility with downstream tools to calculate selection statistics, which are commonplace in population genomics studies. By accounting for both small and large structural variants, users of this workflow can obtain a wide-ranging view of genomic alterations in an organism of interest.
Overall, this workflow advances our capabilities in assessing the functional consequences of different types of genomic alterations, ultimately improving our ability to associate genotypes with phenotypes. © 2024 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol: Predicting single nucleotide polymorphisms and structural variations. Support Protocol 1: Downloading publicly available sequencing data. Support Protocol 2: Visualizing variant loci using Integrated Genome Viewer. Support Protocol 3: Converting between VCF and aligned FASTA formats.


Subject(s)
Polymorphism, Single Nucleotide , Software , Workflow , Polymorphism, Single Nucleotide/genetics , Computational Biology/methods , Genomics/methods , Molecular Sequence Annotation/methods , Whole Genome Sequencing/methods
9.
BMC Bioinformatics ; 25(1): 180, 2024 May 08.
Article in English | MEDLINE | ID: mdl-38720249

ABSTRACT

BACKGROUND: High-throughput sequencing (HTS) has become the gold standard approach for variant analysis in cancer research. However, somatic variants may occur at low fractions due to contamination from normal cells or tumor heterogeneity; this poses a significant challenge for standard HTS analysis pipelines. The problem is exacerbated in scenarios with minimal tumor DNA, such as circulating tumor DNA in plasma. Assessing sensitivity and detection of HTS approaches in such cases is paramount, but time-consuming and expensive: specialized experimental protocols and a sufficient quantity of samples are required for processing and analysis. To overcome these limitations, we propose a new computational approach specifically designed for the generation of artificial datasets suitable for this task, simulating ultra-deep targeted sequencing data with low-fraction variants and demonstrating their effectiveness in benchmarking low-fraction variant calling. RESULTS: Our approach enables the generation of artificial raw reads that mimic real data without relying on pre-existing data by using NEAT, a fine-grained read simulator that generates artificial datasets using models learned from multiple different datasets. Then, it incorporates low-fraction variants to simulate somatic mutations in samples with minimal tumor DNA content. To prove the suitability of the created artificial datasets for low-fraction variant calling benchmarking, we used them as ground truth to evaluate the performance of widely-used variant calling algorithms: they allowed us to define tuned parameter values of major variant callers, considerably improving their detection of very low-fraction variants. 
CONCLUSIONS: Our findings highlight both the pivotal role of our approach in creating adequate artificial datasets with low tumor fraction, facilitating rapid prototyping and benchmarking of algorithms for this type of dataset, and the need to advance low-fraction variant calling techniques.


Subject(s)
Benchmarking , High-Throughput Nucleotide Sequencing , Neoplasms , High-Throughput Nucleotide Sequencing/methods , Humans , Neoplasms/genetics , Mutation , Algorithms , DNA, Neoplasm/genetics , Sequence Analysis, DNA/methods , Computational Biology/methods
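
The idea of incorporating a low-fraction variant into simulated reads, described in the abstract above, can be shown with a toy sketch. This is not the authors' pipeline and does not use NEAT; it only illustrates what a known ground-truth allele fraction means at one position. The function names `spike_in_variant` and `allele_fraction`, the read sequence, and the 0.5% fraction are all hypothetical.

```python
import random


def spike_in_variant(reads, pos, alt, fraction, seed=0):
    """Return a copy of `reads` where round(len(reads) * fraction) reads carry `alt` at `pos`."""
    rng = random.Random(seed)  # fixed seed so the simulated dataset is reproducible
    n_alt = round(len(reads) * fraction)
    chosen = set(rng.sample(range(len(reads)), n_alt))
    return [
        r[:pos] + alt + r[pos + 1:] if i in chosen else r
        for i, r in enumerate(reads)
    ]


def allele_fraction(reads, pos, alt):
    """Fraction of reads showing base `alt` at index `pos`."""
    return sum(1 for r in reads if r[pos] == alt) / len(reads)


# 1000 identical toy reads covering one locus; reference base at index 3 is 'T'.
reads = ["ACGTACGT"] * 1000
mutated = spike_in_variant(reads, pos=3, alt="A", fraction=0.005)
observed = allele_fraction(mutated, pos=3, alt="A")
```

Because the number of altered reads is fixed in advance, the true allele fraction is known exactly, which is what makes such data usable as ground truth when tuning a variant caller.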
10.
BMC Genomics ; 25(1): 455, 2024 May 08.
Article in English | MEDLINE | ID: mdl-38720252

ABSTRACT

BACKGROUND: Standard ChIP-seq and RNA-seq processing pipelines typically disregard sequencing reads whose origin is ambiguous ("multimappers"). This usual practice has potentially important consequences for the functional interpretation of the data: genomic elements belonging to clusters composed of highly similar members are left unexplored. RESULTS: In particular, disregarding multimappers leads to the underrepresentation in epigenetic studies of recently active transposable elements, such as AluYa5, L1HS and SVAs. Furthermore, this common strategy also has implications for transcriptomic analysis: members of repetitive gene families, such as the ones including major histocompatibility complex (MHC) class I and II genes, are under-quantified. CONCLUSION: Revealing inherent biases that permeate routine tasks such as functional enrichment analysis, our results underscore the urgency of broadly adopting multimapper-aware bioinformatic pipelines (currently restricted to specific contexts or communities) to ensure the reliability of genomic and transcriptomic studies.


Subject(s)
High-Throughput Nucleotide Sequencing , Humans , DNA Transposable Elements/genetics , Computational Biology/methods , Gene Expression Profiling/methods , Genomics/methods , Sequence Analysis, RNA/methods
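
The effect of discarding multimappers, as described in the abstract above, can be made concrete with a small counting sketch. It contrasts unique-only counting with fractional assignment, in which a read aligning equally well to n features contributes 1/n to each. Fractional assignment is one simple multimapper-aware strategy chosen here for illustration, not necessarily the one the authors evaluate. The function name `count_reads` and the read and feature identifiers are hypothetical.

```python
from collections import defaultdict


def count_reads(alignments, keep_multimappers):
    """Count reads per feature.

    alignments maps a read id to the list of features it aligns to equally well.
    With keep_multimappers=False, ambiguous reads are dropped (the usual practice).
    With keep_multimappers=True, each ambiguous read is split evenly among its features.
    """
    counts = defaultdict(float)
    for features in alignments.values():
        if len(features) == 1:
            counts[features[0]] += 1.0
        elif keep_multimappers:
            for f in features:
                counts[f] += 1.0 / len(features)
    return dict(counts)


# Two reads map uniquely to a gene; two are ambiguous between near-identical repeat copies.
alignments = {
    "r1": ["GENE_A"],
    "r2": ["L1HS_copy1", "L1HS_copy2"],
    "r3": ["L1HS_copy1", "L1HS_copy2"],
    "r4": ["GENE_A"],
}
unique_only = count_reads(alignments, keep_multimappers=False)
with_multi = count_reads(alignments, keep_multimappers=True)
```

In the unique-only result the repeat copies do not appear at all, which is the underrepresentation the abstract describes for young transposable elements and repetitive gene families.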
11.
BMC Bioinformatics ; 25(1): 181, 2024 May 08.
Article in English | MEDLINE | ID: mdl-38720247

ABSTRACT

BACKGROUND: RNA sequencing combined with machine learning techniques has provided a modern approach to the molecular classification of cancer. Class predictors, reflecting the disease class, can be constructed for known tissue types using the gene expression measurements extracted from cancer patients. One challenge of current cancer predictors is that they often have suboptimal performance estimates when integrating molecular datasets generated from different labs. Often, the quality of the data is variable, procured differently, and contains unwanted noise hampering the ability of a predictive model to extract useful information. Data preprocessing methods can be applied in attempts to reduce these systematic variations and harmonize the datasets before they are used to build a machine learning model for resolving tissue of origin. RESULTS: We aimed to investigate the impact of data preprocessing steps (focusing on normalization, batch effect correction, and data scaling) through trial and comparison. Our goal was to improve the cross-study predictions of tissue of origin for common cancers on large-scale RNA-Seq datasets derived from thousands of patients and over a dozen tumor types. The results showed that the choice of data preprocessing operations affected the performance of the associated classifier models constructed for tissue of origin predictions in cancer. CONCLUSION: By using TCGA as a training set and applying data preprocessing methods, we demonstrated that batch effect correction improved performance measured by weighted F1-score in resolving tissue of origin against an independent GTEx test dataset. On the other hand, the use of data preprocessing operations worsened classification performance when the independent test dataset was aggregated from separate studies in ICGC and GEO.
Therefore, based on our findings with these publicly available large-scale RNA-Seq datasets, the application of data preprocessing techniques to a machine learning pipeline is not always appropriate.


Subject(s)
Machine Learning , Neoplasms , RNA-Seq , Humans , RNA-Seq/methods , Neoplasms/genetics , Transcriptome/genetics , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods , Computational Biology/methods
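
The weighted F1-score used as the performance measure in the abstract above is the per-class F1 averaged with weights proportional to each class's number of true samples. A minimal sketch follows, assuming the standard definition; it is not the authors' evaluation code, and the function name `weighted_f1` and the tissue labels are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support in y_true."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        # F1 = 2TP / (2TP + FP + FN); the denominator is positive because n >= 1.
        f1 = 2 * tp / (2 * tp + fp + fn)
        total += f1 * n
    return total / len(y_true)


# Hypothetical tissue-of-origin predictions for four tumor samples.
y_true = ["lung", "lung", "lung", "breast"]
y_pred = ["lung", "lung", "breast", "breast"]
score = weighted_f1(y_true, y_pred)
```

Weighting by support keeps rare tumor types from dominating the average, at the cost of hiding poor performance on them; a macro average would expose it.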
12.
Microbiome ; 12(1): 84, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38725076

ABSTRACT

BACKGROUND: Emergence of antibiotic resistance in bacteria is an important threat to global health. Antibiotic resistance genes (ARGs) are some of the key components that define bacterial resistance and its spread in different environments. Identification of ARGs, particularly from high-throughput sequencing data of the specimens, is the state-of-the-art method for comprehensively monitoring their spread and evolution. Current computational methods to identify ARGs mainly rely on alignment-based sequence similarities with known ARGs. Such approaches are limited by the choice of reference databases and may miss novel ARGs. The similarity thresholds are usually simple and cannot accommodate variations across different gene families and regions. These methods are also difficult to scale up as the volume of sequence data increases. RESULTS: In this study, we developed ARGNet, a deep neural network that incorporates an unsupervised learning autoencoder model to identify ARGs and a multiclass classification convolutional neural network to classify ARGs, neither of which depends on sequence alignment. This approach enables a more efficient discovery of both known and novel ARGs. ARGNet accepts both amino acid and nucleotide sequences of variable lengths, from partial (30-50 aa; 100-150 nt) sequences to full-length proteins or genes, allowing its application in both target sequencing and metagenomic sequencing. Our performance evaluation showed that ARGNet outperformed other deep learning models, including DeepARG and HMD-ARG, in most of the application scenarios, especially the quasi-negative test and the analysis of prediction consistency with the phylogenetic tree. ARGNet reduces inference runtime by up to 57% relative to DeepARG. CONCLUSIONS: ARGNet is flexible, efficient, and accurate at predicting a broad range of ARGs from the sequencing data. ARGNet is freely available at https://github.com/id-bioinfo/ARGNet, with an online service provided at https://ARGNet.hku.hk. Video Abstract.


Subject(s)
Bacteria , Neural Networks, Computer , Bacteria/genetics , Bacteria/drug effects , Bacteria/classification , Drug Resistance, Bacterial/genetics , Anti-Bacterial Agents/pharmacology , High-Throughput Nucleotide Sequencing/methods , Computational Biology/methods , Genes, Bacterial/genetics , Drug Resistance, Microbial/genetics , Humans , Deep Learning
13.
Front Immunol ; 15: 1347139, 2024.
Article in English | MEDLINE | ID: mdl-38726016

ABSTRACT

Background: Autism spectrum disorder (ASD) is a disease characterized by social disorder. Recently, the population affected by ASD has gradually increased around the world. There are great difficulties in diagnosis and treatment at present. Methods: The ASD datasets were obtained from the Gene Expression Omnibus database and the immune-relevant genes were downloaded from a previously published compilation. Subsequently, we used WGCNA to screen the modules related to ASD and immunity. We also chose the best combination and screened out the core genes from Consensus Machine Learning Driven Signatures (CMLS). Subsequently, we evaluated the genetic correlation between immune cells and ASD using GNOVA. Pleiotropic regions between ASD and immune cells were identified by PLACO and CPASSOC. FUMA was used to identify pleiotropic regions, and expression quantitative trait loci (eQTL) analysis was used to determine their expression in different tissues and cells. Finally, we used qPCR to detect the gene expression level of the core gene. Results: We found a close relationship between neutrophils and ASD, and subsequently, CMLS identified a total of 47 potential candidate genes. Secondly, GNOVA showed a significant genetic correlation between neutrophils and ASD, and PLACO and CPASSOC identified a total of 14 pleiotropic regions. We annotated the 14 regions mentioned above and identified a total of 6 potential candidate genes. Through eQTL analysis, we found that the CFLAR gene has a specific expression pattern in neutrophils, suggesting that it may serve as a potential biomarker for ASD and is closely related to its pathogenesis. Conclusions: In conclusion, our study yields unprecedented insights into the molecular and genetic heterogeneity of ASD through a comprehensive bioinformatics analysis. These valuable findings hold significant implications for tailoring personalized ASD therapies.


Subject(s)
Autism Spectrum Disorder , Computational Biology , Genetic Predisposition to Disease , Quantitative Trait Loci , Humans , Autism Spectrum Disorder/genetics , Autism Spectrum Disorder/immunology , Computational Biology/methods , Gene Expression Profiling , Gene Regulatory Networks , Machine Learning , Databases, Genetic , Immunogenetics , Neutrophils/immunology , Neutrophils/metabolism , Transcriptome
14.
Cell ; 187(10): 2343-2358, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38729109

ABSTRACT

As the number of single-cell datasets continues to grow rapidly, workflows that map new data to well-curated reference atlases offer enormous promise for the biological community. In this perspective, we discuss key computational challenges and opportunities for single-cell reference-mapping algorithms. We discuss how mapping algorithms will enable the integration of diverse datasets across disease states, molecular modalities, genetic perturbations, and diverse species and will eventually replace manual and laborious unsupervised clustering pipelines.


Subject(s)
Algorithms , Single-Cell Analysis , Single-Cell Analysis/methods , Humans , Computational Biology/methods , Data Analysis , Animals , Cluster Analysis
15.
PLoS One ; 19(5): e0302425, 2024.
Article in English | MEDLINE | ID: mdl-38728301

ABSTRACT

The joint analysis of two datasets [Formula: see text] and [Formula: see text] that describe the same phenomenon (e.g. the cellular state) but measure disjoint sets of variables (e.g. mRNA vs. protein levels) is currently challenging. Traditional methods typically analyze single interaction patterns such as variance or covariance. However, problem-tailored external knowledge may contain multiple different types of information about the interaction between the measured variables. We introduce MIASA, a holistic framework for the joint analysis of multiple different variables. It consists of assembling multiple different types of information, such as similarity vs. association, expressed in terms of interaction-scores or distances, for subsequent clustering/classification. In addition, our framework includes a novel qualitative Euclidean embedding method (qEE-Transition), which enables the use of Euclidean-distance/vector-based clustering/classification methods on datasets that have a non-Euclidean-based interaction structure. As an alternative to conventional optimization-based multidimensional scaling methods, which are prone to uncertainties, our qEE-Transition generates a new vector representation for each element of the dataset union [Formula: see text] in a common Euclidean space while strictly preserving the original ordering of the assembled interaction-distances. To demonstrate our work, we applied the framework to three types of simulated datasets: samples from families of distributions, samples from correlated random variables, and time-courses of statistical moments for three different types of stochastic two-gene interaction models. We then compared different clustering methods with vs. without the qEE-Transition. For all examples, we found that the qEE-Transition followed by Ward clustering had superior performance compared to non-agglomerative clustering methods but had a varied performance against ultrametric-based agglomerative methods. We also tested the qEE-Transition followed by supervised and unsupervised machine learning methods and found promising results; however, more work is needed for optimal parametrization of these methods. As a future perspective, our framework points to the importance of further development and validation of distance-distribution models aiming to capture multiple complex interactions between different variables.
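The ordering property stated in the abstract can be illustrated with a small check. The sketch below is not the qEE-Transition itself; it only tests whether a given set of embedded points keeps the strict ordering of a set of original interaction-distances. The function names, the distances and the coordinates are hypothetical and chosen for illustration.

```python
from itertools import combinations
from math import dist


def pairwise_euclidean(points):
    """Return {(i, j): Euclidean distance} for all index pairs i < j."""
    return {(i, j): dist(points[i], points[j])
            for i, j in combinations(range(len(points)), 2)}


def preserves_ordering(original, embedded):
    """True if every strict ordering between two original distances
    is kept by the corresponding embedded distances."""
    for a, b in combinations(list(original), 2):
        if original[a] < original[b] and not embedded[a] < embedded[b]:
            return False
        if original[a] > original[b] and not embedded[a] > embedded[b]:
            return False
    return True


# Hypothetical interaction-distances between three elements (not from the paper).
original = {(0, 1): 0.2, (0, 2): 0.9, (1, 2): 0.5}
# Hypothetical coordinates of the same elements in a common Euclidean space.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 1.0)]
embedded = pairwise_euclidean(points)
print(preserves_ordering(original, embedded))
```

An embedding that passes this check can be handed to any Euclidean clustering method, such as Ward clustering, without changing which pairs of elements are considered closer than others.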


Subject(s)
Algorithms; Cluster Analysis; Humans; Computational Biology/methods
16.
Medicine (Baltimore) ; 103(19): e38066, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38728485

ABSTRACT

CDCA3, a cell cycle regulator gene that plays a catalytic role in many tumors, was initially identified as a regulator of cell cycle progression, specifically facilitating the transition from the G2 phase to mitosis. However, its role in glioma remains unknown. In this study, bioinformatics analyses (TCGA, CGGA, Rembrandt) shed light on the upregulation and prognostic value of CDCA3 in gliomas. It can also be included in a nomogram as a parameter predicting 3- and 5-year survival risk (C index = 0.86). According to Gene Set Enrichment Analysis and gene ontology analysis, CDCA3 is mainly involved in cell cycle-related biological processes such as DNA replication and nuclear division. CDCA3 is closely associated with many classic glioma biomarkers (CDK4, CDK6), and inhibitors of CDK4 and CDK6 have been shown to be effective in tumor therapy. We have demonstrated that high expression of CDCA3 indicates higher malignancy and poorer prognosis in gliomas.


Subject(s)
Biomarkers, Tumor; Brain Neoplasms; Cell Cycle Proteins; Glioma; Humans; Glioma/genetics; Glioma/metabolism; Biomarkers, Tumor/metabolism; Biomarkers, Tumor/genetics; Cell Cycle Proteins/genetics; Cell Cycle Proteins/metabolism; Brain Neoplasms/genetics; Brain Neoplasms/metabolism; Prognosis; Molecular Targeted Therapy/methods; Up-Regulation; Computational Biology/methods
17.
Medicine (Baltimore) ; 103(19): e37999, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38728502

ABSTRACT

Glioma is a typical malignant tumor of the nervous system. It is of great significance to identify new biomarkers for accurate diagnosis of glioma. In this context, THOC6 was studied as a biomarker with high diagnostic and prognostic value, which may help ease the difficulty of diagnosing gliomas. We used online databases and a variety of statistical methods, such as the Wilcoxon rank sum test, Dunn test and t test. We analyzed the mutation, location and expression profile of THOC6, revealing the network of THOC6 interactions with disease. The Wilcoxon rank sum test showed that THOC6 is highly expressed in gliomas (P < 0.001). The Dunn test, Wilcoxon rank sum test and t test showed that THOC6 expression was correlated with multiple clinical features. Logistic regression analysis, with THOC6 gene expression as the categorical dependent variable, further confirmed its association with clinical features of poor prognosis. Kaplan-Meier survival analysis showed that the overall survival (OS) of glioma patients with high expression of THOC6 was poor (P < 0.001). Both univariate (P < 0.001) and multivariate (P = 0.04) Cox analyses confirmed that THOC6 gene expression was an independent risk factor for OS in patients with glioma. ROC curve analysis showed that THOC6 had a high diagnostic value in glioma (AUC = 0.915). Based on this, we constructed a nomogram to predict patient survival. Enrichment analysis showed that THOC6 expression was associated with multiple signaling pathways. Immune infiltration analysis showed that the expression of THOC6 in glioma was closely related to the infiltration level of multiple immune cells. Molecular docking results showed that THOC6 might be a target of anti-glioma drugs. THOC6 is a novel diagnostic factor and prognostic biomarker of glioma.
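The Kaplan-Meier comparison of high- versus low-expression patients mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not state how the high-expression group was defined, so the median split is an assumption, and the expression values, follow-up times and event flags are invented toy data.

```python
from statistics import median


def split_by_median(expression):
    """Label each value 'high' if above the median, else 'low' (assumed cut-off)."""
    cutoff = median(expression)
    return ["high" if x > cutoff else "low" for x in expression]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical toy cohort: expression level, months of follow-up, death flag.
expression = [1.2, 5.1, 4.8, 0.7, 6.3, 0.9]
times = [30, 6, 9, 24, 4, 28]
events = [0, 1, 1, 1, 1, 0]

groups = split_by_median(expression)
for label in ("high", "low"):
    t = [times[k] for k, g in enumerate(groups) if g == label]
    e = [events[k] for k, g in enumerate(groups) if g == label]
    print(label, kaplan_meier(t, e))
```

A formal comparison of the two curves would additionally need a log-rank test, which this sketch leaves out.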


Subject(s)
Biomarkers, Tumor; Brain Neoplasms; Computational Biology; Glioma; Molecular Docking Simulation; Humans; Glioma/genetics; Glioma/pathology; Biomarkers, Tumor/genetics; Biomarkers, Tumor/metabolism; Computational Biology/methods; Prognosis; Brain Neoplasms/genetics; Brain Neoplasms/metabolism; Female; Male; Kaplan-Meier Estimate
18.
Mol Genet Genomics ; 299(1): 50, 2024 May 11.
Article in English | MEDLINE | ID: mdl-38734849

ABSTRACT

Intracerebral hemorrhage (ICH) is one of the major causes of death and disability, and hypertensive ICH (HICH) is the most common type of ICH. Currently, the outcomes of HICH patients remain poor after treatment, and early prognosis prediction of HICH is important. However, there are limited effective clinical treatments and biomarkers for HICH patients. Although circRNA has been widely studied in diseases, the role of plasma exosomal circRNAs in HICH remains unknown. The present study was conducted to investigate the characteristics and function of plasma exosomal circRNAs in six HICH patients using circRNA microarray and bioinformatics analysis. The results showed that there were 499 differentially expressed exosomal circRNAs between the HICH patients and control subjects. According to GO annotation and KEGG pathway analyses, the targets regulated by differentially expressed exosomal circRNAs were closely related to the development of HICH via nerve/neuronal growth, neuroinflammation and endothelial homeostasis. In addition, the differentially expressed exosomal circRNAs could mainly bind to four RNA-binding proteins (EIF4A3, FMRP, AGO2 and HUR). Moreover, among the differentially expressed exosomal circRNAs, hsa_circ_00054843, hsa_circ_0010493 and hsa_circ_00090516 were significantly associated with bleeding volume and Glasgow Coma Scale score of the subjects. Our findings reveal for the first time that plasma exosomal circRNAs are significantly involved in the progression of HICH and could be potential biomarkers for HICH. This provides the basis for further research to pinpoint the best biomarkers and elucidate the mechanism of exosomal circRNAs in HICH.


Subject(s)
Exosomes; RNA, Circular; Humans; RNA, Circular/genetics; RNA, Circular/blood; Exosomes/genetics; Exosomes/metabolism; Male; Female; Middle Aged; Aged; Intracranial Hemorrhage, Hypertensive/genetics; Intracranial Hemorrhage, Hypertensive/blood; Biomarkers/blood; Computational Biology/methods; Gene Expression Profiling; Cerebral Hemorrhage/genetics; Cerebral Hemorrhage/blood; Gene Regulatory Networks
19.
PLoS One ; 19(5): e0302753, 2024.
Article in English | MEDLINE | ID: mdl-38739634

ABSTRACT

Leprosy causes a high rate of disability, and effective methods for early diagnosis to guide prevention and treatment are lacking; novel, effective molecular markers are therefore urgently required. In this study, we conducted bioinformatics analysis with leprosy and normal samples acquired from the GEO database (GSE84893, GSE74481, GSE17763, GSE16844 and GSE443). Through WGCNA analysis, 85 hub genes were screened (GS > 0.7 and MM > 0.8). Through DEG analysis, 82 up-regulated and 3 down-regulated genes were screened (|Log2FC| > 3 and FDR < 0.05). Then 49 intersection genes were considered crucial and subjected to GO annotation, KEGG pathway and PPI analysis to determine their biological significance in the pathogenesis of leprosy. Finally, we identified a gene-pathway network, suggesting that ITK, CD48, IL2RG, CCR5, FGR, JAK3, STAT1, LCK, PTPRC and CXCR4 can be used as biomarkers and that these genes are active in 6 immune system pathways: Chemokine signaling pathway, Th1 and Th2 cell differentiation, Th17 cell differentiation, T cell receptor signaling pathway, Natural killer cell mediated cytotoxicity and Leukocyte transendothelial migration. We identified 10 crucial gene markers and related important pathways that act as essential components in the etiology of leprosy. Our study provides potential targets for diagnostic biomarkers and therapy of leprosy.
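The screening logic described in the abstract (threshold filters followed by an intersection) can be sketched in a few lines. The thresholds are the ones quoted in the abstract; the gene names, the statistics and the function names are hypothetical placeholders, and the real study worked on GEO expression data processed with WGCNA and a differential expression pipeline.

```python
# Hypothetical per-gene WGCNA statistics: (gene significance GS, module membership MM).
wgcna_stats = {
    "GENE_A": (0.82, 0.91),
    "GENE_B": (0.75, 0.85),
    "GENE_C": (0.65, 0.90),
}

# Hypothetical differential expression statistics: (log2 fold change, FDR).
deg_stats = {
    "GENE_A": (3.4, 0.001),
    "GENE_B": (-3.2, 0.010),
    "GENE_C": (3.5, 0.002),
    "GENE_D": (2.0, 0.001),
}


def hub_genes(stats, gs_min=0.7, mm_min=0.8):
    """Genes passing the hub criteria GS > 0.7 and MM > 0.8."""
    return {g for g, (gs, mm) in stats.items() if gs > gs_min and mm > mm_min}


def de_genes(stats, lfc_min=3.0, fdr_max=0.05):
    """Genes passing |log2FC| > 3 and FDR < 0.05, split by direction."""
    up = {g for g, (lfc, fdr) in stats.items() if lfc > lfc_min and fdr < fdr_max}
    down = {g for g, (lfc, fdr) in stats.items() if lfc < -lfc_min and fdr < fdr_max}
    return up, down


hubs = hub_genes(wgcna_stats)
up, down = de_genes(deg_stats)
# Candidate genes are those that are both hub genes and differentially expressed.
crucial = hubs & (up | down)
print(sorted(crucial))
```

Taking the intersection keeps only genes supported by both lines of evidence, which is why the candidate list is smaller than either input list.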


Subject(s)
Biomarkers; Gene Regulatory Networks; Leprosy; Leprosy/genetics; Leprosy/microbiology; Humans; Biomarkers/metabolism; Computational Biology/methods; Databases, Genetic; Gene Expression Profiling; Protein Interaction Maps/genetics; Signal Transduction
20.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38739759

ABSTRACT

Proteins interact with diverse ligands to perform a large number of biological functions, such as gene expression and signal transduction. Accurate identification of these protein-ligand interactions is crucial to the understanding of molecular mechanisms and the development of new drugs. However, traditional biological experiments are time-consuming and expensive. With the development of high-throughput technologies, an increasing amount of protein data is available. In the past decades, many computational methods have been developed to predict protein-ligand interactions. Here, we review a comprehensive set of over 160 protein-ligand interaction predictors, which cover protein-protein, protein-nucleic acid, protein-peptide and protein-other ligands (nucleotide, heme, ion) interactions. We have carried out a comprehensive analysis of the above four types of predictors from several significant perspectives, including their inputs, feature profiles, models, availability, etc. The current methods primarily rely on protein sequences, especially utilizing evolutionary information. The significant improvement in predictions is attributed to deep learning methods. Additionally, sequence-based pretrained models and structure-based approaches are emerging as new trends.


Subject(s)
Computational Biology; Nucleic Acids; Proteins; Nucleic Acids/metabolism; Nucleic Acids/chemistry; Proteins/chemistry; Proteins/metabolism; Computational Biology/methods; Ligands; Protein Binding; Humans