Results 1 - 20 of 22
1.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37930025

ABSTRACT

Drug combination therapy has gradually become a promising treatment strategy for complex or co-existing diseases. As drug-drug interactions (DDIs) may cause unexpected adverse drug reactions, DDI prediction is an important task in pharmacology and clinical applications. Recently, researchers have proposed several deep learning methods to predict DDIs. However, these methods mainly exploit the chemical or biological features of drugs, which is insufficient and limits the performance of DDI prediction. Here, we propose a new deep multimodal feature fusion framework for DDI prediction, DMFDDI, which fuses the drug molecular graph, the DDI network and the biochemical similarity features of drugs to predict DDIs. To fully extract drug molecular structure, we introduce an attention-gated graph neural network for capturing the global features of the molecular graph and the local features of each atom. A sparse graph convolution network is introduced to learn the topological structure information of the DDI network. In the multimodal feature fusion module, an attention mechanism is used to efficiently fuse different features. To validate the performance of DMFDDI, we compare it with 10 state-of-the-art methods. The comparison results demonstrate that DMFDDI achieves better performance in DDI prediction. Our method DMFDDI is implemented in Python using the PyTorch machine-learning library, and it is freely available at https://github.com/DHUDEBLab/DMFDDI.git.


Subjects
Machine Learning , Neural Networks, Computer , Drug Interactions , Molecular Structure , Gene Library
2.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37313714

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) measures transcriptome-wide gene expression at single-cell resolution. Clustering analysis of scRNA-seq data enables researchers to characterize cell types and states, shedding new light on cell-to-cell heterogeneity in complex tissues. Recently, self-supervised contrastive learning has become a prominent technique for underlying feature representation learning. However, for the noisy, high-dimensional and sparse scRNA-seq data, existing methods still encounter difficulties in capturing the intrinsic patterns and structures of cells, and seldom utilize prior knowledge, resulting in clusters that do not match the real situation. To this end, we propose scDECL, a novel deep enhanced constraint clustering algorithm for scRNA-seq data analysis based on contrastive learning and pairwise constraints. Specifically, based on interpolated contrastive learning, a pre-training model is trained to learn the feature embedding, and clustering is then performed according to the constructed enhanced pairwise constraints. In the pre-training stage, a mixup data augmentation strategy and an interpolation loss are introduced to improve the diversity of the dataset and the robustness of the model. In the clustering stage, the prior information is converted into enhanced pairwise constraints to guide the clustering. To validate the performance of scDECL, we compare it with six state-of-the-art algorithms on six real scRNA-seq datasets. The experimental results demonstrate that the proposed algorithm outperforms the six competing methods. In addition, the ablation studies on each module of the algorithm indicate that these modules are complementary to each other and effective in improving the performance of the proposed algorithm. Our method scDECL is implemented in Python using the PyTorch machine-learning library, and it is freely available at https://github.com/DBLABDHU/scDECL.


Subjects
Gene Expression Profiling , Single-Cell Gene Expression Analysis , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Algorithms , Cluster Analysis
3.
Bioinformatics ; 40(5)2024 May 02.
Article in English | MEDLINE | ID: mdl-38810116

ABSTRACT

MOTIVATION: Gene regulatory networks (GRNs) encode gene regulation in living organisms, and have become a critical tool to understand complex biological processes. However, due to the dynamic and complex nature of gene regulation, inferring GRNs from scRNA-seq data is still a challenging task. Existing computational methods usually focus on the close connections between genes, and ignore the global structure and distal regulatory relationships. RESULTS: In this study, we develop a supervised deep learning framework, IGEGRNS, to infer GRNs from scRNA-seq data based on graph embedding. In the framework, contextual information of genes is captured by GraphSAGE, which aggregates gene features and neighborhood structures to generate low-dimensional embeddings for genes. Then, the k most influential nodes in the whole graph are filtered through Top-k pooling. Finally, potential regulatory relationships between genes are predicted by stacking CNNs. Compared with nine competing supervised and unsupervised methods, our method achieves better performance on six time-series scRNA-seq datasets. AVAILABILITY AND IMPLEMENTATION: Our method IGEGRNS is implemented in Python using the PyTorch machine learning library, and it is freely available at https://github.com/DHUDBlab/IGEGRNS.


Subjects
Gene Regulatory Networks , Single-Cell Analysis , Single-Cell Analysis/methods , Computational Biology/methods , Transcriptome/genetics , Gene Expression Profiling/methods , Humans , Deep Learning , Algorithms
4.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35696651

ABSTRACT

The development of single-cell RNA-seq (scRNA-seq) technology allows researchers to characterize the cell types, states and transitions during dynamic biological processes at single-cell resolution. One of the critical tasks is to infer pseudo-time trajectory. However, the existence of transition cells in the intermediate state of complex biological processes poses a challenge for the trajectory inference. Here, we propose a new single-cell trajectory inference method based on transition entropy, named scTite, to identify transitional states and reconstruct cell trajectory from scRNA-seq data. Taking into account the continuity of cellular processes, we introduce a new metric called transition entropy to measure the uncertainty of a cell belonging to different cell clusters, and then identify cell states and transition cells. Specifically, we adopt different strategies to infer the trajectory for the identified cell states and transition cells, and combine them to obtain a detailed cell trajectory. For the identified cell clusters, we utilize the Wasserstein distance based on the probability distribution to calculate distance between clusters, and construct the minimum spanning tree. Meanwhile, we adopt the signaling entropy and partial correlation coefficient to determine transition paths, which contain a group of transition cells with the largest similarity. Then the transitional paths and the MST are combined to infer a refined cell trajectory. We apply scTite to four real scRNA-seq datasets and an integrated dataset, and conduct extensive performance comparison with nine existing trajectory inference methods. The experimental results demonstrate that the proposed method can reconstruct the cell trajectory more accurately than the compared algorithms. The scTite software package is available at https://github.com/dblab2022/scTite.
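
The transition-entropy idea in this abstract can be illustrated with a minimal Python sketch. It assumes that the metric behaves like the Shannon entropy of a cell's soft cluster-membership probabilities; the abstract does not give the exact formula, so the function names, the normalization and the threshold are hypothetical, and this is not the scTite implementation.

```python
import math


def membership_entropy(probs):
    """Shannon entropy (in nats) of one cell's cluster-membership weights.

    Assumption: 'probs' holds non-negative weights, one per cluster; they are
    normalized here so that they sum to 1.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("membership weights must sum to a positive value")
    entropy = 0.0
    for p in probs:
        q = p / total
        if q > 0:  # the limit of q * log(q) as q -> 0 is 0
            entropy -= q * math.log(q)
    return entropy


def flag_transition_cells(memberships, threshold):
    """Return indices of cells whose membership entropy exceeds 'threshold'.

    Hypothetical decision rule: a high entropy means the cell is not clearly
    assigned to one cluster, so it is treated as a candidate transition cell.
    """
    return [
        index
        for index, probs in enumerate(memberships)
        if membership_entropy(probs) > threshold
    ]
```

Under this reading, a cell with memberships such as `[0.98, 0.01, 0.01]` has low entropy and would be kept as a stable cell state, whereas `[0.5, 0.45, 0.05]` has high entropy and would be flagged as a candidate transition cell.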


Subjects
Single-Cell Analysis , Transcriptome , Entropy , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Software
5.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35172334

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) permits researchers to study the complex mechanisms of cell heterogeneity and diversity. Unsupervised clustering is of central importance for the analysis of scRNA-seq data, as it can be used to identify putative cell types. However, due to noise, high dimensionality and pervasive dropout events, clustering analysis of scRNA-seq data remains a computational challenge. Here, we propose a new deep structural clustering method for scRNA-seq data, named scDSC, which integrates structural information into deep clustering of single cells. The proposed scDSC consists of a Zero-Inflated Negative Binomial (ZINB) model-based autoencoder, a graph neural network (GNN) module and a mutual-supervised module. To learn the data representation from the sparse and zero-inflated scRNA-seq data, we add a ZINB model to the basic autoencoder. The GNN module is introduced to capture the structural information among cells. By joining the ZINB-based autoencoder with the GNN module, the model transfers the data representation learned by the autoencoder to the corresponding GNN layer. Furthermore, we adopt a mutual supervised strategy to unify these two different deep neural architectures and to guide the clustering task. Extensive experimental results on six real scRNA-seq datasets demonstrate that scDSC outperforms state-of-the-art methods in terms of clustering accuracy and scalability. Our method scDSC is implemented in Python using the PyTorch machine-learning library, and it is freely available at https://github.com/DHUDBlab/scDSC.
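
To make the ZINB component concrete, the following Python sketch evaluates the negative log-likelihood of a single count under a zero-inflated negative binomial distribution. It assumes the usual mean/dispersion parameterization; the parameter names `mu`, `theta` and `pi` are illustrative choices, and the way scDSC wires this loss into its autoencoder is not described in the abstract.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial distribution
    with mean mu > 0 and dispersion (inverse overdispersion) theta > 0."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of count x under a ZINB distribution.

    pi (0 <= pi < 1) is the probability of an extra 'dropout' zero that is
    mixed with the negative binomial component.
    """
    if mu <= 0 or theta <= 0 or not 0 <= pi < 1:
        raise ValueError("require mu > 0, theta > 0 and 0 <= pi < 1")
    if x == 0:
        # A zero can come from the dropout component or from the NB itself.
        nb_zero = math.exp(nb_log_pmf(0, mu, theta))
        return -math.log(pi + (1.0 - pi) * nb_zero)
    # A positive count can only come from the NB component.
    return -(math.log(1.0 - pi) + nb_log_pmf(x, mu, theta))
```

In a ZINB-based autoencoder the decoder would typically output `mu`, `theta` and `pi` for every gene in every cell, and the reconstruction loss would sum `zinb_nll` over the whole count matrix.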


Subjects
Neural Networks, Computer , Single-Cell Analysis , Cluster Analysis , Gene Expression Profiling , RNA-Seq , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
6.
Bioinformatics ; 39(10)2023 10 03.
Article in English | MEDLINE | ID: mdl-37812255

ABSTRACT

MOTIVATION: Drug combination therapy has exhibited remarkable therapeutic efficacy and has gradually become a promising clinical treatment strategy of complex diseases such as cancers. As the related databases keep expanding, computational methods based on deep learning model have become powerful tools to predict synergistic drug combinations. However, predicting effective synergistic drug combinations is still a challenge due to the high complexity of drug combinations, the lack of biological interpretability, and the large discrepancy in the response of drug combinations in vivo and in vitro biological systems. RESULTS: Here, we propose DGSSynADR, a new deep learning method based on global structured features of drugs and targets for predicting synergistic anticancer drug combinations. DGSSynADR constructs a heterogeneous graph by integrating the drug-drug, drug-target, protein-protein interactions and multi-omics data, utilizes a low-rank global attention (LRGA) model to perform global weighted aggregation of graph nodes and learn the global structured features of drugs and targets, and then feeds the embedded features into a bilinear predictor to predict the synergy scores of drug combinations in different cancer cell lines. Specifically, LRGA network brings better model generalization ability, and effectively reduces the complexity of graph computation. The bilinear predictor facilitates the dimension transformation of the features and fuses the feature representation of the two drugs to improve the prediction performance. The loss function Smooth L1 effectively avoids gradient explosion, contributing to better model convergence. To validate the performance of DGSSynADR, we compare it with seven competitive methods. The comparison results demonstrate that DGSSynADR achieves better performance. Meanwhile, the prediction of DGSSynADR is validated by previous findings in case studies. 
Furthermore, detailed ablation studies indicate that the one-hot encoded drug feature, the LRGA model and the bilinear predictor play a key role in improving the prediction performance. AVAILABILITY AND IMPLEMENTATION: DGSSynADR is implemented in Python using the PyTorch machine-learning library, and it is freely available at https://github.com/DHUDBlab/DGSSynADR.
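
The remark about the Smooth L1 loss can be made concrete with a small Python sketch. It uses the common piecewise definition with a threshold `beta` (the same form that PyTorch documents for `SmoothL1Loss`); the scalar helper functions here are illustrative and are not taken from the DGSSynADR code.

```python
def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss for one prediction: quadratic for small errors,
    linear for large ones."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


def smooth_l1_grad(pred, target, beta=1.0):
    """Derivative of smooth_l1 with respect to pred."""
    diff = pred - target
    if abs(diff) < beta:
        return diff / beta
    return 1.0 if diff > 0 else -1.0
```

Because the derivative never exceeds 1 in magnitude, a single badly mispredicted synergy score cannot produce an arbitrarily large gradient, which is the usual argument for preferring this loss over a plain squared error.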


Subjects
Antineoplastic Combined Chemotherapy Protocols , Neoplasms , Humans , Computational Biology/methods , Drug Combinations , Neoplasms/drug therapy , Machine Learning
7.
BMC Bioinformatics ; 20(Suppl 15): 598, 2019 Dec 24.
Article in English | MEDLINE | ID: mdl-31874597

ABSTRACT

BACKGROUND: Super-enhancers (SEs) are clusters of transcriptionally active enhancers, which dictate the expression of genes defining cell identity and play an important role in the development and progression of tumors and other diseases. Many key cancer oncogenes are driven by super-enhancers, and the mutations associated with common diseases such as Alzheimer's disease are significantly enriched in super-enhancers. Super-enhancers have shown great potential for the identification of key oncogenes and the discovery of disease-associated mutational sites. RESULTS: In this paper, we propose a new computational method called DEEPSEN for predicting super-enhancers based on a convolutional neural network. The proposed method integrates 36 kinds of features. Compared with existing approaches, our method performs better and can be used for genome-wide prediction of super-enhancers. In addition, we screen important features for predicting super-enhancers. CONCLUSION: Convolutional neural networks are effective in boosting the performance of super-enhancer prediction.


Subjects
Neural Networks, Computer , Humans , Neoplasms/genetics , Oncogenes
8.
BMC Genomics ; 20(Suppl 2): 221, 2019 Apr 04.
Article in English | MEDLINE | ID: mdl-30967107

ABSTRACT

BACKGROUND: Epigenome is highly dynamic during the early stages of embryonic development. Epigenetic modifications provide the necessary regulation for lineage specification and enable the maintenance of cellular identity. Given the rapid accumulation of genome-wide epigenomic modification maps across cellular differentiation process, there is an urgent need to characterize epigenetic dynamics and reveal their impacts on differential gene regulation. METHODS: We proposed DiffEM, a computational method for differential analysis of epigenetic modifications and identified highly dynamic modification sites along cellular differentiation process. We applied this approach to investigating 6 epigenetic marks of 20 kinds of human early developmental stages and tissues, including hESCs, 4 hESC-derived lineages and 15 human primary tissues. RESULTS: We identified highly dynamic modification sites where different cell types exhibit distinctive modification patterns, and found that these highly dynamic sites enriched in the genes related to cellular development and differentiation. Further, to evaluate the effectiveness of our method, we correlated the dynamics scores of epigenetic modifications with the variance of gene expression, and compared the results of our method with those of the existing algorithms. The comparison results demonstrate the power of our method in evaluating the epigenetic dynamics and identifying highly dynamic regions along cell differentiation process.


Subjects
Cell Lineage , Embryonic Stem Cells/cytology , Embryonic Stem Cells/metabolism , Epigenomics , Gene Expression Regulation, Developmental , Genome, Human , Cell Differentiation , Histones/genetics , Histones/metabolism , Humans , Organ Specificity
9.
BMC Bioinformatics ; 18(Suppl 12): 418, 2017 Oct 16.
Article in English | MEDLINE | ID: mdl-29072144

ABSTRACT

BACKGROUND: Studies have shown that enhancers are significant regulatory elements that play crucial roles in gene expression regulation. Since enhancers function independently of their orientation and distance relative to their target genes, accurately predicting distal enhancers is a challenging task. In recent years, with the development of high-throughput ChIP-seq technologies, several computational techniques have emerged to predict enhancers using epigenetic or genomic features. Nevertheless, the inconsistency of computational models across different cell lines and the unsatisfactory prediction performance call for further research in this area. RESULTS: Here, we propose a new Deep Belief Network (DBN) based computational method for enhancer prediction, which is called EnhancerDBN. This method combines diverse features, composed of DNA sequence compositional features, DNA methylation and histone modifications. Our computational results indicate that 1) EnhancerDBN outperforms 13 existing methods in prediction, and 2) GC content and DNA methylation can serve as relevant features for enhancer prediction. CONCLUSION: Deep learning is effective in boosting the performance of enhancer prediction.


Subjects
Algorithms , Computational Biology/methods , Enhancer Elements, Genetic , Databases, Genetic , Humans , ROC Curve
10.
BMC Bioinformatics ; 18(1): 103, 2017 Feb 11.
Article in English | MEDLINE | ID: mdl-28187703

ABSTRACT

BACKGROUND: Differences in chromatin states are critical to the multiplicity of cell states. Recently genome-wide histone modification maps of diverse human developmental stages and tissues have been charted. DESCRIPTION: To facilitate the investigation of epigenetic dynamics and regulatory mechanisms in cellular differentiation processes, we developed iHMS, an integrated human histone modification database that incorporates massive histone modification maps spanning different developmental stages, lineages and tissues ( http://www.tongjidmb.com/human/index.html ). It also includes genome-wide expression data of different conditions, reference gene annotations, GC content and CpG island information. By providing an intuitive and user-friendly query interface, iHMS enables comprehensive query and comparative analysis based on gene names, genomic region locations, histone modification marks and cell types. Moreover, it offers an efficient browser that allows users to visualize and compare multiple genome-wide histone modification maps and related expression profiles across different developmental stages and tissues. CONCLUSION: iHMS is of great helpfulness to understand how global histone modification state transitions impact cellular phenotypes across different developmental stages and tissues in the human genome. This extensive catalog of histone modification states thus presents an important resource for epigenetic and developmental studies.


Subjects
Databases, Genetic , Histones/metabolism , User-Computer Interface , Chromatin/metabolism , CpG Islands , Humans , Internet , Protein Processing, Post-Translational
11.
BMC Bioinformatics ; 17(Suppl 17): 537, 2016 Dec 23.
Article in English | MEDLINE | ID: mdl-28155634

ABSTRACT

BACKGROUND: Differentiation of human embryonic stem cells requires precise control of gene expression that depends on specific spatial and temporal epigenetic regulation. Recently available temporal epigenomic data derived from cellular differentiation processes provides an unprecedented opportunity for characterizing fundamental properties of epigenomic dynamics and revealing regulatory roles of epigenetic modifications. RESULTS: This paper presents a spatial temporal clustering approach, named STCluster, which exploits the temporal variation information of epigenomes to characterize dynamic epigenetic mode during cellular differentiation. This approach identifies significant spatial temporal patterns of epigenetic modifications along human embryonic stem cell differentiation and cluster regulatory sequences by their spatial temporal epigenetic patterns. CONCLUSIONS: The results show that this approach is effective in capturing epigenetic modification patterns associated with specific cell types. In addition, STCluster allows straightforward identification of coherent epigenetic modes in multiple cell types, indicating the ability in the establishment of the most conserved epigenetic signatures during cellular differentiation process.


Subjects
Cell Differentiation/genetics , Cluster Analysis , Embryonic Stem Cells/physiology , Epigenesis, Genetic , Gene Expression Regulation, Developmental , DNA Methylation , Embryonic Stem Cells/metabolism , Histones/metabolism , Humans
12.
BMC Bioinformatics ; 13: 4, 2012 Jan 07.
Article in English | MEDLINE | ID: mdl-22226192

ABSTRACT

BACKGROUND: Promoter prediction is an integral step for understanding gene regulation and annotating genomes. Traditional promoter analysis is mainly based on sequence compositional features. Recently, many kinds of structural features have been employed in promoter prediction. However, considering the high-dimensionality and overfitting problems, it is infeasible to utilize all available features for promoter prediction. Thus it is necessary to choose some appropriate features for the prediction task. RESULTS: This paper conducts an extensive comparison study on feature selection of DNA structural properties for promoter prediction. Firstly, to examine whether promoters possess some special structures, we carry out a systematic comparison among the profiles of thirteen structural features on promoter and non-promoter sequences. Secondly, we investigate the correlations between these structural features and promoter sequences. Thirdly, both filter and wrapper methods are utilized to select appropriate feature subsets from thirteen different kinds of structural features for promoter prediction, and the predictive power of the selected feature subsets is evaluated. Finally, we compare the prediction performance of the feature subsets selected in this paper with nine existing promoter prediction approaches. CONCLUSIONS: Experimental results show that the structural features are differentially correlated to promoters. Specifically, DNA-bending stiffness, DNA denaturation and energy-related features are highly correlated with promoters. The predictive power for promoter sequences differs greatly among different structural features. Selecting the relevant features can significantly improve the accuracy of promoter prediction.


Subjects
DNA/chemistry , Genome, Human , Promoter Regions, Genetic , Animals , Base Sequence , DNA-Directed RNA Polymerases/metabolism , Humans , Sequence Analysis, DNA , Software , Support Vector Machine
13.
BMC Bioinformatics ; 13: 49, 2012 Mar 26.
Article in English | MEDLINE | ID: mdl-22449207

ABSTRACT

BACKGROUND: Nucleosome distribution along chromatin dictates genomic DNA accessibility and thus profoundly influences gene expression. However, the underlying mechanism of nucleosome formation remains elusive. Here, taking a structural perspective, we systematically explored nucleosome formation potential of genomic sequences and the effect on chromatin organization and gene expression in S. cerevisiae. RESULTS: We analyzed twelve structural features related to flexibility, curvature and energy of DNA sequences. The results showed that some structural features such as DNA denaturation, DNA-bending stiffness, Stacking energy, Z-DNA, Propeller twist and free energy, were highly correlated with in vitro and in vivo nucleosome occupancy. Specifically, they can be classified into two classes, one positively and the other negatively correlated with nucleosome occupancy. These two kinds of structural features facilitated nucleosome binding in centromere regions and repressed nucleosome formation in the promoter regions of protein-coding genes to mediate transcriptional regulation. Based on these analyses, we integrated all twelve structural features in a model to predict more accurately nucleosome occupancy in vivo than the existing methods that mainly depend on sequence compositional features. Furthermore, we developed a novel approach, named DLaNe, that located nucleosomes by detecting peaks of structural profiles, and built a meta predictor to integrate information from different structural features. As a comparison, we also constructed a hidden Markov model (HMM) to locate nucleosomes based on the profiles of these structural features. The result showed that the meta DLaNe and HMM-based method performed better than the existing methods, demonstrating the power of these structural features in predicting nucleosome positions. 
CONCLUSIONS: Our analysis revealed that DNA structures significantly contribute to nucleosome organization and influence chromatin structure and gene expression regulation. The results indicated that our proposed methods are effective in predicting nucleosome occupancy and positions and that these structural features are highly predictive of nucleosome organization. The implementation of our DLaNe method based on structural features is available online.


Subjects
Gene Expression Regulation, Fungal , Nucleosomes/metabolism , Saccharomyces cerevisiae/genetics , Centromere , Chromatin/metabolism , DNA, Z-Form/metabolism , Genome, Fungal , Genome-Wide Association Study , Markov Chains , Promoter Regions, Genetic , Saccharomyces cerevisiae/metabolism
14.
Front Oncol ; 12: 899825, 2022.
Article in English | MEDLINE | ID: mdl-35692809

ABSTRACT

Accurate inference of gene regulatory rules is critical to understanding cellular processes. Existing computational methods usually decompose the inference of gene regulatory networks (GRNs) into multiple subproblems, rather than detecting potential causal relationships simultaneously, which limits the application to data with a small number of genes. Here, we propose BiRGRN, a novel computational algorithm for inferring GRNs from time-series single-cell RNA-seq (scRNA-seq) data. BiRGRN utilizes a bidirectional recurrent neural network to infer GRNs. The recurrent neural network is a complex deep neural network that can capture complex, non-linear, and dynamic relationships among variables. It maps neurons to genes, and maps the connections between neural network layers to the regulatory relationship between genes, providing an intuitive solution to model GRNs with biological closeness and mathematical flexibility. Based on the deep network, we transform the inference of GRNs into a regression problem, using the gene expression data at previous time points to predict the gene expression data at the later time point. Furthermore, we adopt two strategies to improve the accuracy and stability of the algorithm. Specifically, we utilize a bidirectional structure to integrate the forward and reverse inference results and exploit an incomplete set of prior knowledge to filter out some candidate inferences of low confidence. BiRGRN is applied to four simulated datasets and three real scRNA-seq datasets to verify the proposed method. We perform comprehensive comparisons between our proposed method with other state-of-the-art techniques. These experimental results indicate that BiRGRN is capable of inferring GRN simultaneously from time-series scRNA-seq data. Our method BiRGRN is implemented in Python using the TensorFlow machine-learning library, and it is freely available at https://gitee.com/DHUDBLab/bi-rgrn.

15.
IEEE/ACM Trans Comput Biol Bioinform ; 19(4): 2512-2522, 2022.
Article in English | MEDLINE | ID: mdl-33630737

ABSTRACT

Cellular programs often exhibit strong heterogeneity and asynchrony in the timing of program execution. Single-cell RNA-seq technology has provided an unprecedented opportunity for characterizing these cellular processes by simultaneously quantifying many parameters at single-cell resolution. Robust trajectory inference is a critical step in the analysis of dynamic temporal gene expression, which can shed light on the mechanisms of normal development and diseases. Here, we present TiC2D, a novel algorithm for cell trajectory inference from single-cell RNA-seq data, which adopts a consensus clustering strategy to precisely cluster cells. To evaluate the power of TiC2D, we compare it with three state-of-the-art methods on four independent single-cell RNA-seq datasets. The results show that TiC2D can accurately infer developmental trajectories from single-cell transcriptome. Furthermore, the reconstructed trajectories enable us to identify key genes involved in cell fate determination and to obtain new insights about their roles at different developmental stages.


Subjects
Algorithms , Single-Cell Analysis , Cluster Analysis , Consensus , Gene Expression Profiling/methods , RNA-Seq , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
16.
Comput Biol Chem ; 93: 107512, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34044202

ABSTRACT

A gene regulatory network models the interactions between transcription factors and target genes. Reconstructing gene regulatory networks is critically important to understand gene function in a particular cellular context, providing key insights into complex biological systems. We develop a new computational method, named iMPRN, which integrates multiple prior networks to infer regulatory networks. Based on the network component analysis model, iMPRN adopts linear regression, graph embedding, and elastic networks to optimize each prior network in line with the specific biological context. For each rewired prior network, iMPRN evaluates the confidence of the regulatory edges based on B scores and finally integrates these optimized networks. We validate the effectiveness of iMPRN by comparing it with four widely used gene regulatory network reconstruction algorithms on a simulated dataset. The results show that iMPRN can infer the gene regulatory network more accurately. Further, on a real scRNA-seq dataset, iMPRN is applied to reconstruct gene regulatory networks for malignant and nonmalignant head and neck tumor cells, respectively, demonstrating distinctive differences in their corresponding regulatory networks.


Subjects
Gene Regulatory Networks , Single-Cell Analysis , Humans , Transcriptome
17.
Bioinformatics ; 25(16): 2006-12, 2009 Aug 15.
Article in English | MEDLINE | ID: mdl-19515962

ABSTRACT

MOTIVATION: Identification of core promoters is a key clue in understanding gene regulations. However, due to the diverse nature of promoter sequences, the accuracy of existing prediction approaches for non-CpG island (simply CGI)-related promoters is not as high as that for CGI-related promoters. This consequently leads to a low genome-wide promoter prediction accuracy. RESULTS: In this article, we first systematically analyze the similarities and differences between the two types of promoters (CGI- and non-CGI-related) from a novel structural perspective, and then devise a unified framework, called PNNP (Pattern-based Nearest Neighbor search for Promoter), to predict both CGI- and non-CGI-related promoters based on their structural features. Our comparative analysis on the structural characteristics of promoters reveals two interesting facts: (i) the structural values of CGI- and non-CGI-related promoters are quite different, but they exhibit nearly similar structural patterns; (ii) the structural patterns of promoters are obviously different from that of non-promoter sequences though the sequences have almost similar structural values. Extensive experiments demonstrate that the proposed PNNP approach is effective in capturing the structural patterns of promoters, and can significantly improve genome-wide performance of promoters prediction, especially non-CGI-related promoters prediction. AVAILABILITY: The implementation of the program PNNP is available at http://admis.tongji.edu.cn/Projects/pnnp.aspx.


Subjects
Computational Biology/methods , DNA/chemistry , Promoter Regions, Genetic/genetics , Cluster Analysis , CpG Islands , Sequence Analysis, DNA/methods
18.
Front Cell Dev Biol ; 8: 588041, 2020.
Article in English | MEDLINE | ID: mdl-33195248

ABSTRACT

A complex tissue contains a variety of cells with distinct molecular signatures. Single-cell RNA sequencing has characterized the transcriptomes of different cell types and enables researchers to discover the underlying mechanisms of cellular heterogeneity. A critical task in single-cell transcriptome studies is to uncover transcriptional differences among specific cell types. However, the intercellular transcriptional variation is usually confounded with high level of technical noise, which masks the important biological signals. Here, we propose a new computational method DiffGE for differential analysis, adopting network entropy to measure the expression dynamics of gene groups among different cell types and to identify the highly differential gene groups. To evaluate the effectiveness of our proposed method, DiffGE is applied to three independent single-cell RNA-seq datasets and to identify the highly dynamic gene groups that exhibit distinctive expression patterns in different cell types. We compare the results of our method with those of three widely applied algorithms. Further, the gene function analysis indicates that these detected differential gene groups are significantly related to cellular regulation processes. The results demonstrate the power of our method in evaluating the transcriptional dynamics and identifying highly differential gene groups among different cell types.

19.
Front Genet ; 10: 1298, 2019.
Article in English | MEDLINE | ID: mdl-32010182

ABSTRACT

Epigenetic alteration is a fundamental characteristic of nearly all human cancers. Tumor cells not only harbor genetic alterations, but are also regulated by diverse epigenetic modifications. Identifying epigenetic similarities across different cancer types is beneficial for the discovery of treatments that can be extended to different cancers. Abundant epigenetic modification profiles now provide a great opportunity to achieve this goal. Here, we propose a new approach, TriPCE, which introduces a tri-clustering strategy into integrative pan-cancer epigenomic analysis. The method is able to identify coherent patterns of various epigenetic modifications across different cancer types. To validate its capability, we applied TriPCE to analyze six important epigenetic marks among seven cancer types, and identified significant cross-cancer epigenetic similarities. These results suggest that specific epigenetic patterns indeed exist among the investigated cancers. Furthermore, gene functional analysis performed on the associated gene sets demonstrates strong relevance to cancer development and reveals a consistent risk tendency among the investigated cancer types.

20.
BMC Med Genomics ; 11(Suppl 6): 117, 2018 Dec 31.
Article in English | MEDLINE | ID: mdl-30598115

ABSTRACT

BACKGROUND: Human cancers are complex ecosystems composed of cells with distinct molecular signatures. Such intratumoral heterogeneity poses a major challenge to cancer diagnosis and treatment. Recent advances in single-cell techniques such as scRNA-seq have brought unprecedented insights into cellular heterogeneity. A resulting computational challenge is to cluster high-dimensional, noisy datasets with substantially fewer cells than genes. METHODS: In this paper, we introduce a consensus clustering framework, conCluster, for cancer subtype identification from single-cell RNA-seq data. Using an ensemble strategy, conCluster fuses multiple basic partitions into consensus clusters. RESULTS: Applied to real cancer scRNA-seq datasets, conCluster detects cancer subtypes more accurately than widely used scRNA-seq clustering methods. Further, we conducted co-expression network analysis for the identified melanoma subtypes. CONCLUSIONS: Our analysis demonstrates that these subtypes exhibit distinct gene co-expression networks and significant gene sets with different functional enrichment.


Subjects
Neoplasms/classification , RNA, Neoplasm , Cluster Analysis , Datasets as Topic , Humans , Molecular Sequence Data , Neoplasms/genetics