Results 1 - 20 of 99
1.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38975895

ABSTRACT

Spatial transcriptomics provides valuable insights into gene expression within the native tissue context, effectively merging molecular data with spatial information to uncover intricate cellular relationships and tissue organizations. In this context, deciphering cellular spatial domains becomes essential for revealing complex cellular dynamics and tissue structures. However, current methods encounter challenges in seamlessly integrating gene expression data with spatial information, resulting in less informative representations of spots and suboptimal accuracy in spatial domain identification. We introduce stCluster, a novel method that integrates graph contrastive learning with multi-task learning to refine informative representations for spatial transcriptomic data, consequently improving spatial domain identification. stCluster first leverages graph contrastive learning technology to obtain discriminative representations capable of recognizing spatially coherent patterns. Through jointly optimizing multiple tasks, stCluster further fine-tunes the representations to be able to capture complex relationships between gene expression and spatial organization. Benchmarked against six state-of-the-art methods, the experimental results reveal its proficiency in accurately identifying complex spatial domains across various datasets and platforms, spanning tissue, organ, and embryo levels. Moreover, stCluster can effectively denoise the spatial gene expression patterns and enhance the spatial trajectory inference. The source code of stCluster is freely available at https://github.com/hannshu/stCluster.


Subjects
Gene Expression Profiling , Transcriptome , Gene Expression Profiling/methods , Computational Biology/methods , Algorithms , Humans , Animals , Software , Machine Learning
2.
Front Genet ; 15: 1407072, 2024.
Article in English | MEDLINE | ID: mdl-38846963

ABSTRACT

Background and Objective: Accurate identification of cancer stages is challenging due to the complexity and heterogeneity of the disease. Current clinical diagnosis methods primarily rely on phenotypic observations, which may not capture early molecular-level changes accurately. Methods: In this study, a novel biomarker recognition method tailored to cancer stages was proposed, which considers changes in gene expression relationships. Utilizing sample-specific information and protein-protein interaction networks, group-specific networks were constructed to address the limited specificity of potential biomarkers. Then, a specific feature recognition method was proposed based on these group-specific networks, which employed the random forest algorithm for initial screening followed by a recursive feature elimination process to identify the optimal biomarker subset. While searching for optimal results, a strategy termed the Cost-Benefit Ratio was devised to facilitate the identification of stage-specific biomarkers. Results: Comparative experiments were conducted on lung adenocarcinoma and breast cancer datasets to validate the method's efficacy and generalizability. The results showed that the identified biomarkers were highly stage-specific, and the F1 scores for predicting cancer stages were significantly improved. For the lung adenocarcinoma dataset, the F1 score reached 97.68%, and for the breast cancer dataset, it achieved 96.87%. These results significantly surpassed those of three conventional methods in terms of F1 scores. Moreover, from the perspective of biological functions, the biomarkers were shown to play an important role in cancer stage evolution. Conclusion: The proposed method demonstrated its effectiveness in identifying stage-related biomarkers. By using these biomarkers as features, accurate prediction of cancer stages was achieved. Furthermore, the method exhibited potential for biomarker identification in subtype analyses, offering novel perspectives for cancer prognosis.

3.
Int J Mol Sci ; 25(8)2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38673997

ABSTRACT

The pathogenesis of carcinoma is believed to come from the combined effect of polygenic variation, and the initiation and progression of malignant tumors are closely related to the dysregulation of biological pathways. Quantifying the alteration in pathway activation and identifying coordinated patterns of pathway dysfunction are an imperative part of understanding the malignancy process and distinguishing different tumor stages or clinical outcomes of individual patients. In this study, we have conducted in silico pathway activation analysis using a Riemannian manifold (RiePath) toward pan-cancer personalized characterization, which is the first attempt to apply Riemannian manifold theory to measure the extent of pathway dysregulation in individual patients on the tangent space of the Riemannian manifold. RiePath effectively integrates pathway and gene expression information, not only generating a relatively low-dimensional and biologically relevant representation, but also identifying a robust panel of biologically meaningful pathway signatures as biomarkers. The pan-cancer analysis across 16 cancer types reveals the capability of RiePath to evaluate pathway activation accurately and identify clinical outcome-related pathways. We believe that RiePath has the potential to provide new prospects in understanding the molecular mechanisms of complex diseases and may find broader applications in predicting biomarkers for other intricate diseases.


Subjects
Neoplasms , Precision Medicine , Humans , Neoplasms/genetics , Neoplasms/metabolism , Precision Medicine/methods , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Gene Expression Regulation, Neoplastic , Signal Transduction , Gene Expression Profiling/methods , Algorithms , Computational Biology/methods , Gene Regulatory Networks , Computer Simulation
4.
iScience ; 27(4): 109387, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38510118

ABSTRACT

Identifying cancer genes is vital for cancer diagnosis and treatment. However, because of the complexity of cancer occurrence and limited knowledge of cancer genes, it is hard to identify cancer genes accurately using only a few types of omics data, and the overall performance of existing methods needs further improvement. Here, we introduce GLIMS, a two-stage gradual-learning strategy, to predict cancer genes using integrative features from multi-omics data. Firstly, it uses a semi-supervised hierarchical graph neural network to predict the initial candidate cancer genes by integrating multi-omics data and a protein-protein interaction (PPI) network. Then, it uses an unsupervised approach to further optimize the initial prediction by integrating the co-splicing network in post-transcriptional regulation, which plays an important role in cancer development. Systematic experiments on multi-omics cancer data demonstrated that GLIMS outperforms the state-of-the-art methods for the identification of cancer genes and that it could be a useful tool to help advance cancer analysis.

5.
Front Microbiol ; 15: 1345794, 2024.
Article in English | MEDLINE | ID: mdl-38314434

ABSTRACT

Introduction: Seasonal influenza A H3N2 viruses are constantly changing, reducing the effectiveness of existing vaccines. As a result, the World Health Organization (WHO) needs to frequently update the vaccine strains to match the antigenicity of newly emerged H3N2 variants. Traditional assessments of antigenicity rely on serological methods, which are both labor-intensive and time-consuming. Although numerous computational models aim to simplify antigenicity determination, they either lack a robust quantitative linkage between antigenicity and viral sequences or focus restrictively on selected features. Methods: Here, we propose a novel computational method to predict antigenic distances using multiple features, including not only viral sequence attributes but also four distinct categories of sequence features that significantly affect viral antigenicity. Results: This method exhibits low error in virus antigenicity prediction and achieves superior accuracy in discerning antigenic drift. Utilizing this method, we investigated the evolutionary process of the H3N2 influenza viruses and identified a total of 21 major antigenic clusters from 1968 to 2022. Discussion: Interestingly, our predicted antigenic map aligns closely with the antigenic map generated with serological data. Thus, our method is a promising tool for detecting antigenic variants and guiding the selection of vaccine candidates.

6.
Comput Biol Med ; 171: 108108, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38359659

ABSTRACT

While genome-wide association studies (GWAS) have unequivocally identified a vast number of disease susceptibility variants, a majority of them are situated in non-coding regions and are in high linkage disequilibrium (LD). To pave the way for translating GWAS signals to clinical drug targets, it is essential to identify the underlying causal variants and further causal genes. To this end, a myriad of post-GWAS methods have been devised, each grounded in distinct principles including fine-mapping, colocalization, and transcriptome-wide association study (TWAS) techniques. Yet, no platform currently exists that seamlessly integrates these diverse post-GWAS methodologies. In this work, we present a user-friendly web server for post-GWAS analysis that seamlessly integrates 9 distinct methods with 12 models, categorized by fine-mapping, colocalization, and TWAS. The server mainly helps users decipher the causality obscured by complex GWAS signals, including causal variants and causal genes, without the burden of computational skills and complex environment configuration. It provides a convenient platform for post-GWAS analysis and result visualization, facilitating the understanding and interpretation of genome-wide association studies. The postGWAS server is available at http://g2g.biographml.com/.


Subjects
Genome-Wide Association Study , Quantitative Trait Loci , Humans , Genome-Wide Association Study/methods , Linkage Disequilibrium/genetics , Transcriptome , Polymorphism, Single Nucleotide/genetics , Genetic Predisposition to Disease/genetics
7.
iScience ; 27(1): 108592, 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-38205240

ABSTRACT

A key regulatory mechanism involves circular RNA (circRNA) acting as a sponge to modulate microRNA (miRNA), and thus, studying their interaction has significant medical implications. In this field, two pressing issues remain unresolved. Firstly, due to the scarcity of verified interactions, models must be trained on only a small number of samples. Secondly, the current models lack interpretability. Therefore, we propose SPBCMI, a method that combines sequence features extracted using the Bidirectional Encoder Representations from Transformers (BERT) model and structural features of biological molecule networks extracted through graph embedding to train a gradient-boosted decision tree (GBDT) classifier for prediction. Our method yielded an AUC of 0.9143, which is currently the best for this problem. Furthermore, in the case study, SPBCMI accurately predicted 7 out of 10 circRNA-miRNA interactions. These results show that our method provides an innovative and high-performing approach to understanding the interaction between circRNA and miRNA.
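
The abstract reports an AUC without saying how it was computed. As a reminder of what the metric measures, the following is a minimal sketch of the pairwise (Mann-Whitney) definition of ROC AUC: the probability that a randomly chosen positive pair scores higher than a randomly chosen negative pair. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not taken from SPBCMI.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison; tied scores count as half a win.

    labels: iterable of 0/1 ground-truth values (1 = true interaction).
    scores: iterable of classifier scores, higher = more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores): 3 of the 4 positive/negative pairs
# are ranked correctly, so the AUC is 0.75.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The quadratic loop is kept for readability; on large evaluation sets a rank-based formulation gives the same value more efficiently.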

8.
PLoS One ; 19(1): e0291741, 2024.
Article in English | MEDLINE | ID: mdl-38181020

ABSTRACT

Although various methods have been developed to detect structural variations (SVs) in genomic sequences, few are used to validate these results. Several commonly used SV callers produce many false positive SVs, and existing validation methods are not accurate enough. Therefore, a highly efficient and accurate validation method is essential. In response, we propose SVvalidation, a new method that uses long-read sequencing data for validating SVs with higher accuracy and efficiency. Compared to existing methods, SVvalidation performs better in validating SVs in repeat regions and can determine the homozygosity or heterozygosity of an SV. Additionally, SVvalidation offers the highest recall, precision, and F1-score (improving by 7-16%) across all datasets. Moreover, SVvalidation is suitable for different types of SVs. The program is available at https://github.com/nwpuzhengyan/SVvalidation.
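
Recall, precision, and F1-score recur throughout the SV abstracts in this listing. A minimal sketch of how the three are derived from true-positive, false-positive, and false-negative counts is shown below. The function name and the counts are hypothetical. How a call is matched to a truth-set SV (breakpoint tolerance, size similarity) is tool-specific and not stated in the abstract.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f1) from confusion counts.

    tp: calls that match a truth-set SV
    fp: calls with no matching truth-set SV
    fn: truth-set SVs with no matching call
    Undefined ratios (zero denominators) are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 80 matched calls, 20 spurious calls, 20 missed SVs.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=20)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, so a caller cannot score well by trading many false positives for high recall.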


Subjects
Chromosome Aberrations , Genomic Structural Variation , Humans , Research Design , Genomics , Heterozygote
9.
Bioinformatics ; 40(2)2024 02 01.
Article in English | MEDLINE | ID: mdl-38290765

ABSTRACT

SUMMARY: Single-cell multi-omics technologies provide a unique platform for characterizing cell states and reconstructing developmental processes by simultaneously quantifying and integrating molecular signatures across various modalities, including genome, transcriptome, epigenome, and other omics layers. However, there is still an urgent unmet need for novel computational tools in this nascent field, which are critical for both effective and efficient interrogation of functionality across different omics modalities. Scbean is a user-friendly Python library designed to seamlessly incorporate a diverse array of models for the examination of single-cell data, encompassing both paired and unpaired multi-omics data. The library offers uniform and straightforward interfaces for tasks such as dimensionality reduction, batch effect elimination, cell label transfer from well-annotated scRNA-seq data to scATAC-seq data, and the identification of spatially variable genes. Moreover, Scbean's models are engineered to harness the computational power of GPU acceleration through TensorFlow, rendering them capable of handling datasets comprising millions of cells. AVAILABILITY AND IMPLEMENTATION: Scbean is released on the Python Package Index (PyPI) (https://pypi.org/project/scbean/) and GitHub (https://github.com/jhu99/scbean) under the MIT license. The documentation and example code can be found at https://scbean.readthedocs.io/en/latest/.


Subjects
Multiomics , Software , Genome , Transcriptome , Single-Cell Analysis , Data Analysis
10.
Brief Funct Genomics ; 23(2): 118-127, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-36752035

ABSTRACT

Analysis of cell-cell communication (CCC) in the tumor microenvironment helps decipher the underlying mechanism of cancer progression and drug tolerance. Currently, single-cell RNA-seq data are available on a large scale, providing an unprecedented opportunity to predict cellular communications. There have been many achievements and applications in inferring cell-cell communication based on the known interactions between molecules, such as ligands, receptors and extracellular matrix. However, the prior information is not quite adequate and only involves a fraction of cellular communications, producing many false-positive or false-negative results. To this end, we propose an improved hierarchical variational autoencoder (HiVAE) based model to fully use single-cell RNA-seq data for automatically estimating CCC. Specifically, the HiVAE model is used to learn the potential representation of cells on known ligand-receptor genes and all genes in single-cell RNA-seq data, respectively, which are then utilized for cascade integration. Subsequently, transfer entropy is employed to measure the transmission of information flow between two cells based on the learned representations, which are regarded as directed communication relationships. Experiments are conducted on single-cell RNA-seq data of the human skin disease dataset and the melanoma dataset, respectively. Results show that the HiVAE model is effective in learning cell representations, and transfer entropy could be used to estimate the communication scores between cell types.
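
Transfer entropy is the directed measure the abstract relies on. For intuition only, here is a minimal plug-in estimate for two discrete sequences with a history length of one step, TE(X→Y) = Σ p(y', y, x) · log2[ p(y' | y, x) / p(y' | y) ], where y' is the next value of Y. This is a toy sketch under stated assumptions: the paper applies transfer entropy to learned (continuous) cell representations, and its estimator is not described in the abstract. The function name and the sequences are hypothetical.

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in transfer entropy (bits) from `source` to `target`, lag 1.

    Both arguments are equal-length sequences of hashable discrete symbols.
    """
    if len(source) != len(target) or len(target) < 2:
        raise ValueError("sequences must have equal length >= 2")
    n = len(target) - 1                      # number of transitions
    triples = Counter()                      # (y_next, y_now, x_now)
    pairs_yx = Counter()                     # (y_now, x_now)
    pairs_yy = Counter()                     # (y_next, y_now)
    singles = Counter()                      # y_now
    for t in range(n):
        y_next, y_now, x_now = target[t + 1], target[t], source[t]
        triples[(y_next, y_now, x_now)] += 1
        pairs_yx[(y_now, x_now)] += 1
        pairs_yy[(y_next, y_now)] += 1
        singles[y_now] += 1
    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_yx[(y_now, x_now)]
        p_given_self = pairs_yy[(y_next, y_now)] / singles[y_now]
        te += p_joint * log2(p_given_both / p_given_self)
    return te


# Toy data: `receiver` copies `sender` with a one-step delay, so knowing the
# sender's current value removes all uncertainty about the receiver's next one.
sender = [0, 1, 0, 1, 1, 0, 0, 1, 0, 1]
receiver = [0] + sender[:-1]
print(transfer_entropy(sender, receiver) > 0)   # information flows sender -> receiver
print(transfer_entropy([0] * 10, receiver))     # a constant source carries none
```

The measure is asymmetric by construction, which is what allows it to be read as a directed communication score rather than a plain correlation.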


Subjects
Neoplasms , Single-Cell Gene Expression Analysis , Humans , Single-Cell Analysis/methods , Cell Communication , Exome Sequencing , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods , Tumor Microenvironment
11.
Neural Netw ; 169: 475-484, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37939536

ABSTRACT

The assessment of Enterprise Credit Risk (ECR) is a critical technique for investment decisions and financial regulation. Previous methods usually construct enterprise representations from credit-related indicators, such as liquidity and staff quality. However, the indicators of many enterprises are not accessible, especially for small- and medium-sized enterprises. To alleviate this indicator deficiency, graph-learning-based methods have been proposed to enhance enterprise representation learning through the neighbor structure of enterprise graphs. However, existing methods usually focus only on pairwise relationships and overlook the ubiquitous high-order relationships among enterprises, e.g., supply chains connecting multiple enterprises. To resolve this issue, we propose a Multi-Structure Cascaded Graph Neural Network framework (MS-CGNN) for ECR assessment. It enhances enterprise representation learning based on enterprise graph structures of different granularity, including knowledge graphs of pairwise relationships and homogeneous and heterogeneous hypergraphs of high-order relationships. To distinguish the influences of different types of hyperedges, MS-CGNN defines new type-dependent hyperedge weight matrices for heterogeneous hypergraph convolutions. Experimental results show that MS-CGNN achieves state-of-the-art performance on real-world ECR datasets.


Subjects
Investments , Learning , Humans , Knowledge , Neural Networks, Computer , Risk Assessment
12.
Sci Rep ; 13(1): 21407, 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38049546

ABSTRACT

A scientific and rational evaluation of teaching is essential for personalized learning. In the current teaching assessment model that solely relies on Grade Point Average (GPA), learners with different learning abilities may be classified as the same type of student. It is challenging to uncover the underlying logic behind different learning patterns when GPA scores are the same. To address the limitations of pure GPA evaluation, we propose a data-driven assessment strategy as a supplement to the current methodology. Firstly, we integrate self-paced learning and graph memory neural networks to develop a learning performance prediction model called the self-paced graph memory network. Secondly, inspired by outliers in linear regression, we use a t-test approach to identify those student samples whose loss values significantly differ from normal samples, indicating that these students have different inherent learning patterns/logic compared to the majority. We find that these learners' GPA levels are distributed across different levels. Through analyzing the learning process data of learners with the same GPA level, we find that our data-driven strategy effectively addresses the shortcomings of the GPA evaluation model. Furthermore, we validate our method for student data modeling through protein classification experiments and student performance prediction experiments, which support its rationality and effectiveness.

13.
Article in English | MEDLINE | ID: mdl-38055356

ABSTRACT

Acquiring large datasets to raise the performance of deep models has become one of the most critical problems in representation learning (RL) techniques, which is the core potential of the emerging paradigm of federated learning (FL). However, most current FL models concentrate on seeking an identical model for isolated clients and thus fail to make full use of the data specificity between clients. To enhance the classification performance of each client, this study introduces FDRL, a federated discriminative RL model, which partitions the data features of each client into a global subspace and a local subspace. More specifically, FDRL learns a global representation for federated communication between the isolated clients, which captures common features from all protected datasets via model sharing, and local representations for personalization in each client, which preserve client-specific features via model differentiation. Toward this goal, FDRL trains in each client a shared submodel for federated communication and, meanwhile, an unshared submodel for locality preservation; the two submodels partition the client feature space by maximizing their differences, and a linear model fed with the combined features then performs image classification. The proposed model is implemented with neural networks and optimized iteratively between the server, which computes the global model, and the clients, which learn the local classifiers. Thanks to its capability for local feature preservation, FDRL yields more discriminative data representations than the compared FL models. Experimental results on public datasets demonstrate that FDRL benefits from the subspace partition and achieves better performance on federated image classification than the state-of-the-art FL models.

14.
Commun Biol ; 6(1): 1268, 2023 12 14.
Article in English | MEDLINE | ID: mdl-38097699

ABSTRACT

Recent developments in single-cell technology have enabled the exploration of cellular heterogeneity at an unprecedented level, providing invaluable insights into various fields, including medicine and disease research. Cell type annotation is an essential step in single-cell omics research. The mainstream approach is to use well-annotated single-cell data in supervised learning for cell type annotation of new single-cell data. However, existing methods lack good generalization and robustness in cell annotation tasks, partially due to difficulties in dealing with technical differences between datasets, as well as not considering the heterogeneous associations of genes at the level of regulatory mechanisms. Here, we propose the scPML model, which utilizes various gene signaling pathway data to partition the genetic features of cells, thus characterizing different interaction maps between cells. Extensive experiments demonstrate that scPML performs better in cell type annotation and detection of unknown cell types from different species, platforms, and tissues.


Subjects
Medicine , Single-Cell Gene Expression Analysis , Signal Transduction , Technology
15.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37903416

ABSTRACT

The emergence of single-cell RNA sequencing (scRNA-seq) technology has revolutionized the identification of cell types and the study of cellular states at a single-cell level. Despite its significant potential, scRNA-seq data analysis is plagued by the issue of missing values. Many existing imputation methods rely on simplistic data distribution assumptions while ignoring the intrinsic gene expression distribution specific to cells. This work presents a novel deep-learning model, named scMultiGAN, for scRNA-seq imputation, which utilizes multiple collaborative generative adversarial networks (GANs). Unlike traditional GAN-based imputation methods that generate missing values based on random noise, scMultiGAN employs a two-stage training process and utilizes multiple GANs to achieve cell-specific imputation. Experimental results show the efficacy of scMultiGAN in imputation accuracy, cell clustering, differential gene expression analysis and trajectory analysis, significantly outperforming existing state-of-the-art techniques. Additionally, scMultiGAN is scalable to large scRNA-seq datasets and consistently performs well across sequencing platforms. The scMultiGAN code is freely available at https://github.com/Galaxy8172/scMultiGAN.


Subjects
Single-Cell Analysis , Transcriptome , Single-Cell Analysis/methods , Cluster Analysis , Exome Sequencing , Data Analysis , Sequence Analysis, RNA , Gene Expression Profiling
16.
BMC Bioinformatics ; 24(1): 213, 2023 May 23.
Article in English | MEDLINE | ID: mdl-37221476

ABSTRACT

BACKGROUND: Structural variations (SVs) refer to variations in an organism's chromosome structure that exceed a length of 50 base pairs. They play a significant role in genetic diseases and evolutionary mechanisms. While long-read sequencing technology has led to the development of numerous SV callers, their performance has been suboptimal. Researchers have observed that current SV callers often miss true SVs and generate many false SVs, especially in repetitive regions and areas with multi-allelic SVs. These errors are due to the messy alignments of long-read data, which are affected by their high error rate. Therefore, there is a need for a more accurate SV caller. RESULTS: We propose SVcnn, a new and more accurate deep learning-based method for detecting SVs by using long-read sequencing data. We run SVcnn and other SV callers on three real datasets and find that SVcnn improves the F1-score by 2-8% compared with the second-best method when the read depth is greater than 5×. More importantly, SVcnn has better performance for detecting multi-allelic SVs. CONCLUSIONS: SVcnn is an accurate deep learning-based method to detect SVs. The program is available at https://github.com/nwpuzhengyan/SVcnn.


Subjects
Deep Learning , Biological Evolution
17.
Front Microbiol ; 14: 1147778, 2023.
Article in English | MEDLINE | ID: mdl-37180267

ABSTRACT

Introduction: Abnormal lncRNA expression can lead to the resistance of tumor cells to anticancer drugs, which is a crucial factor leading to high cancer mortality. Studying the relationship between lncRNA and drug resistance is therefore necessary. Recently, deep learning has achieved promising results in predicting biomolecular associations. However, to our knowledge, deep learning-based prediction of lncRNA-drug resistance associations has yet to be studied. Methods: Here, we proposed a new computational model, DeepLDA, which used deep neural networks and graph attention mechanisms to learn lncRNA and drug embeddings for predicting potential relationships between lncRNAs and drug resistance. DeepLDA first constructed similarity networks for lncRNAs and drugs using known association information. Subsequently, deep graph neural networks were utilized to automatically extract features from multiple attributes of lncRNAs and drugs. These features were fed into graph attention networks to learn lncRNA and drug embeddings. Finally, the embeddings were used to predict potential associations between lncRNAs and drug resistance. Results: Experimental results on the given datasets show that DeepLDA outperforms other machine learning-related prediction methods, and that the deep neural network and attention mechanism can improve model performance. Discussion: In summary, this study proposes a powerful deep-learning model that can effectively predict lncRNA-drug resistance associations and facilitate the development of lncRNA-targeted drugs. DeepLDA is available at https://github.com/meihonggao/DeepLDA.

18.
Comput Biol Med ; 158: 106843, 2023 05.
Article in English | MEDLINE | ID: mdl-37019014

ABSTRACT

Structural variations (SVs) represent genomic rearrangements (such as deletions, insertions, and inversions) whose sizes are larger than 50 bp. They play important roles in genetic diseases and evolutionary mechanisms. Due to the advance of long-read sequencing (i.e. PacBio long-read sequencing and Oxford Nanopore (ONT) long-read sequencing), we can call SVs accurately. However, for ONT long reads, we observe that existing long-read SV callers miss many true SVs and call many false SVs in repetitive regions and in regions with multi-allelic SVs. Those errors are caused by messy alignments of ONT reads due to their high error rate. Hence, we propose a novel method, SVsearcher, to solve these issues. We run SVsearcher and other callers on three real datasets and find that SVsearcher improves the F1 score by approximately 10% for high coverage (50×) datasets and more than 25% for low coverage (10×) datasets. More importantly, SVsearcher can identify 81.7%-91.8% of multi-allelic SVs while existing methods only identify 13.2% (Sniffles)-54.0% (nanoSV) of them. SVsearcher is available at https://github.com/kensung-lab/SVsearcher.


Subjects
Genomics , High-Throughput Nucleotide Sequencing , High-Throughput Nucleotide Sequencing/methods , Genomics/methods , Genome , Sequence Analysis, DNA/methods
19.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36847701

ABSTRACT

Emerging studies have shown that circular RNAs (circRNAs) are involved in a variety of biological processes and play a key role in disease diagnosis, treatment and inference. Although many methods, including traditional machine learning and deep learning, have been developed to predict associations between circRNAs and diseases, the biological function of circRNAs has not been fully exploited. Some methods have explored disease-related circRNAs based on different views, but how to efficiently use the multi-view data about circRNAs is still not well studied. Therefore, we propose a computational model to predict potential circRNA-disease associations based on collaborative learning with circRNA multi-view functional annotations. First, we extract circRNA multi-view functional annotations and build a circRNA association network for each view to enable effective network fusion. Then, a collaborative deep learning framework for multi-view information is designed to obtain circRNA multi-source information features, which makes full use of the internal relationships among circRNA multi-view information. We build a network consisting of circRNAs and diseases based on their functional similarity and extract the consistency description information of circRNAs and diseases. Last, we predict potential associations between circRNAs and diseases based on a graph autoencoder. Our computational model performs better in predicting candidate disease-related circRNAs than existing ones. Furthermore, case studies on several common diseases, in which the method finds previously unknown circRNAs related to them, demonstrate its high practicability. The experiments show that CLCDA can efficiently predict disease-related circRNAs and is helpful for the diagnosis and treatment of human disease.


Subjects
Deep Learning , Interdisciplinary Placement , Humans , RNA, Circular/genetics , Machine Learning , Computational Biology/methods
20.
Front Genet ; 14: 1120185, 2023.
Article in English | MEDLINE | ID: mdl-36741325

ABSTRACT

Worldwide, colon cancer is regarded as one of the most common deadly cancers. Due to the lack of a better understanding of its prognosis system, this prevailing cancer has the second-highest morbidity and mortality rate compared with other cancers. A variety of genes participate in colon cancer, and the molecular mechanism remains largely unclear. In addition, various studies have identified differentially expressed genes to investigate gene dysfunctions, but most of them examined the genes individually. In this study, we constructed a functional interaction network for identifying the groups of genes that conduct cellular functions, together with a Protein-Protein Interaction network, with the aim of better understanding protein functions and their biological relationships. A functional evolution network was also generated to analyze the dysfunctions from the initial stage to later stages of colon cancer by investigating the gene modules and their molecular functions. The results show that the proposed evolution network is able to detect the significant cellular functions, which can be used to explore the evolution process of colon cancer. Moreover, a total of 10 core genes associated with colon cancer were identified, namely INS, SNAP25, GRIA2, SST, GCG, PVALB, SLC17A7, SLC32A1, SLC17A6, and NPY. The candidate genes and corresponding pathways presented in this study could be used to develop new tumor indicators and novel therapeutic targets for the prevention and treatment of colon cancer.
