Search | Virtual Health Library

1.

Systems biology intertwines with single cell and AI.

Wang, Yong; Zhang, Xiang-Sun; Chen, Luonan.

BMC Bioinformatics ; 20(Suppl 7): 204, 2019 May 01.

Article in English | MEDLINE | ID: mdl-31074375

ABSTRACT

A report of the 12th International Conference on Systems Biology (ISB2018), 18-21 August, Guiyang, China.

Subjects

Artificial Intelligence; Computational Biology/methods; Genomics/methods; Single-Cell Analysis/methods; Systems Biology/methods; Congresses as Topic; Humans

2.

CEA: Combination-based gene set functional enrichment analysis.

Sun, Duanchen; Liu, Yinliang; Zhang, Xiang-Sun; Wu, Ling-Yun.

Sci Rep ; 8(1): 13085, 2018 Aug 30.

Article in English | MEDLINE | ID: mdl-30166636

ABSTRACT

Functional enrichment analysis is a fundamental and challenging task in bioinformatics. Most of the current enrichment analysis approaches individually evaluate functional terms and often output a list of enriched terms with high similarity and redundancy, which makes it difficult for downstream studies to extract the underlying biological interpretation. In this paper, we proposed a novel framework to assess the performance of combination-based enrichment analysis. Using this framework, we formulated the enrichment analysis as a multi-objective combinatorial optimization problem and developed the CEA (Combination-based Enrichment Analysis) method. CEA provides the whole landscape of term combinations; therefore, it is a good benchmark for evaluating the current state-of-the-art combination-based functional enrichment methods in a comprehensive manner. We tested the effectiveness of CEA on four published microarray datasets. Enriched functional terms identified by CEA not only involve crucial biological processes of related diseases, but also have much less redundancy and can serve as a preferable representation for the enriched terms found by traditional single-term-based methods. CEA has been implemented in the R package CopTea and is available at http://github.com/wulingyun/CopTea/.

Subjects

Algorithms; Computational Biology; Databases, Genetic; Gene Ontology; Models, Genetic; Software; Humans; Oligonucleotide Array Sequence Analysis
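
Illustrative note on entry 2: the abstract contrasts CEA with traditional single-term-based enrichment methods but does not say which test they use. A common choice for scoring one term at a time is the one-sided hypergeometric test, sketched below in Python. The function name and the toy numbers are assumptions made for illustration; this is not code from CopTea.

```python
from math import comb


def hypergeom_enrichment_p(population, term_size, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `term_size` are annotated with the term."""
    upper = min(term_size, selected)
    total = comb(population, selected)
    hits = sum(
        comb(term_size, i) * comb(population - term_size, selected - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total


# Hypothetical toy numbers: 20 genes overall, 5 annotated with the term,
# 4 genes in the study list, 2 of which carry the term.
p_value = hypergeom_enrichment_p(20, 5, 4, 2)
print(f"p = {p_value:.4f}")
```

Because every term is scored independently, closely related terms tend to be reported together, which is the redundancy that combination-based methods aim to reduce.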

3.

Integrating data- and model-driven strategies in systems biology.

Wang, Yong; Zhang, Xiang-Sun; Chen, Luonan.

BMC Syst Biol ; 12(Suppl 4): 38, 2018 Apr 24.

Article in English | MEDLINE | ID: mdl-29745831

ABSTRACT

A report of the 11th International Conference on Systems Biology (ISB2017), 18-21 August, Shenzhen, China.

Subjects

Big Data; Models, Biological; Systems Biology

4.

NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis.

Sun, Duanchen; Liu, Yinliang; Zhang, Xiang-Sun; Wu, Ling-Yun.

BMC Syst Biol ; 11(Suppl 4): 75, 2017 Sep 21.

Article in English | MEDLINE | ID: mdl-28950861

ABSTRACT

BACKGROUND: High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of high-throughput experimental results, such as differentially expressed gene sets derived from microarray or RNA-seq experiments, is still a challenging task. Gene Ontology (GO) is commonly used in functional enrichment studies. The GO terms identified by current functional enrichment analysis tools often include direct parent or descendant terms in the GO hierarchical structure. Such highly redundant terms make it difficult for users to analyze the underlying biological processes. RESULTS: In this paper, a novel network-based probabilistic generative model, NetGen, was proposed to perform functional enrichment analysis. An additional protein-protein interaction (PPI) network was explicitly used to assist the identification of significantly enriched GO terms. NetGen outperformed the existing methods in the simulation studies. The effectiveness of NetGen was explored further on four real datasets. Notably, several GO terms that were not directly linked with the active gene list for each disease were identified. These terms were found to be closely related to the corresponding diseases when checked against the curated literature. NetGen has been implemented in the R package CopTea, publicly available on GitHub ( http://github.com/wulingyun/CopTea/ ). CONCLUSION: Our procedure leads to a more reasonable and interpretable result of the functional enrichment analysis. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods, and can help to explore the underlying pathogenesis of complex diseases.

Subjects

Computational Biology/methods; Gene Ontology; Algorithms; Databases, Genetic; Models, Statistical

5.

Understanding biological systems through the lens of data.

Wang, Yong; Zhang, Xiang-Sun; Chen, Luonan.

BMC Syst Biol ; 11(Suppl 4): 77, 2017 Sep 21.

Article in English | MEDLINE | ID: mdl-28950868

ABSTRACT

A report of the 10th International Conference on Systems Biology (ISB2016), 19-22 August, Weihai, China.

Subjects

Databases, Factual; Systems Biology

6.

NCC-AUC: an AUC optimization method to identify multi-biomarker panel for cancer prognosis from genomic and clinical data.

Zou, Meng; Liu, Zhaoqi; Zhang, Xiang-Sun; Wang, Yong.

Bioinformatics ; 31(20): 3330-8, 2015 Oct 15.

Article in English | MEDLINE | ID: mdl-26092859

ABSTRACT

MOTIVATION: In prognosis and survival studies, an important goal is to identify multi-biomarker panels with predictive power using molecular characteristics or clinical observations. Such analysis is often challenged by censored, small-sample-size, but high-dimensional genomic profiles or clinical data. Therefore, sophisticated models and algorithms are urgently needed. RESULTS: In this study, we propose a novel Area Under Curve (AUC) optimization method for multi-biomarker panel identification named Nearest Centroid Classifier for AUC optimization (NCC-AUC). Our method is motivated by the connection between the AUC score for evaluating classification accuracy and Harrell's concordance index in survival analysis. This connection allows us to convert the survival time regression problem into a binary classification problem. An optimization model is then formulated to directly maximize AUC while minimizing the number of selected features, in order to construct a predictor in the nearest centroid classifier framework. NCC-AUC performs well when validated on both genomic data of breast cancer and clinical data of stage IB Non-Small-Cell Lung Cancer (NSCLC). For the genomic data, NCC-AUC outperforms Support Vector Machine (SVM) and Support Vector Machine-based Recursive Feature Elimination (SVM-RFE) in classification accuracy. It tends to select a multi-biomarker panel with low average redundancy and enriched biological meaning. NCC-AUC also separates low- and high-risk cohorts more significantly than the widely used Cox model (Cox proportional-hazards regression model) and the L1-Cox model (L1-penalized Cox model). These performance gains of NCC-AUC are quite robust across 5 subtypes of breast cancer. Further, on an independent clinical dataset, NCC-AUC outperforms SVM and SVM-RFE in predictive accuracy and is consistently better than the Cox model and the L1-Cox model in grouping patients into high- and low-risk categories.
CONCLUSION: In summary, NCC-AUC provides a rigorous optimization framework to systematically reveal multi-biomarker panels from genomic and clinical data. It can serve as a useful tool to identify prognostic biomarkers for survival analysis. AVAILABILITY AND IMPLEMENTATION: NCC-AUC is available at http://doc.aporc.org/wiki/NCC-AUC. CONTACT: ywang@amss.ac.cn SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Algorithms; Area Under Curve; Biomarkers/analysis; Breast Neoplasms/diagnosis; Carcinoma, Non-Small-Cell Lung/diagnosis; Data Interpretation, Statistical; Genomics/methods; Lung Neoplasms/diagnosis; Breast Neoplasms/genetics; Breast Neoplasms/mortality; Carcinoma, Non-Small-Cell Lung/genetics; Carcinoma, Non-Small-Cell Lung/mortality; Female; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Humans; Lung Neoplasms/genetics; Lung Neoplasms/mortality; Models, Biological; Pattern Recognition, Automated; Prognosis; Proportional Hazards Models; Support Vector Machine; Survival Rate; Systems Biology; Systems Integration
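
Illustrative note on entry 6: the abstract relies on the connection between AUC and Harrell's concordance index. The Python sketch below uses the textbook definitions of both quantities to show that, for a binary outcome, the concordance index reduces to AUC. It is not the NCC-AUC optimization model, and the function names and toy values are assumptions.

```python
def concordance_index(times, events, risk):
    """Harrell's C: share of comparable pairs that `risk` ranks correctly.
    A pair is comparable when the subject with the shorter time had an
    observed event (events[i] == 1); tied risk scores count as 0.5."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four subjects, the third one censored.
risk = [0.9, 0.4, 0.5, 0.1]
c_survival = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], risk)

# Binary outcome recast as survival data: positives "fail" at time 0,
# negatives at time 1, and every event is observed.
labels = [1, 1, 0, 0]
auc_value = auc(labels, risk)
c_binary = concordance_index([0 if y == 1 else 1 for y in labels], [1] * 4, risk)
print(c_survival, auc_value, c_binary)
```

In the binary case the comparable pairs are exactly the positive-negative pairs, so the two scores coincide.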

7.

Cancer stem cells display extremely large evolvability: alternating plastic and rigid networks as a potential mechanism: network models, novel therapeutic target strategies, and the contributions of hypoxia, inflammation and cellular senescence.

Csermely, Peter; Hódsági, János; Korcsmáros, Tamás; Módos, Dezso; Perez-Lopez, Áron R; Szalay, Kristóf; Veres, Dániel V; Lenti, Katalin; Wu, Ling-Yun; Zhang, Xiang-Sun.

Semin Cancer Biol ; 30: 42-51, 2015 Feb.

Article in English | MEDLINE | ID: mdl-24412105

ABSTRACT

Cancer is increasingly perceived as a systems-level, network phenomenon. The major trend of malignant transformation can be described as a two-phase process, where an initial increase of network plasticity is followed by a decrease of plasticity at late stages of tumor development. The fluctuating intensity of stress factors, like hypoxia, inflammation and the either cooperative or hostile interactions of tumor inter-cellular networks, all increase the adaptation potential of cancer cells. This may lead to the bypass of cellular senescence, and to the development of cancer stem cells. We propose that the central tenet of cancer stem cell definition lies exactly in the indefinability of cancer stem cells. Actual properties of cancer stem cells depend on the individual "stress-history" of the given tumor. Cancer stem cells are characterized by an extremely large evolvability (i.e. a capacity to generate heritable phenotypic variation), which corresponds well with the defining hallmarks of cancer stem cells: the possession of the capacity to self-renew and to repeatedly re-build the heterogeneous lineages of cancer cells that comprise a tumor in new environments. Cancer stem cells represent a cell population, which is adapted to adapt. We argue that the high evolvability of cancer stem cells is helped by their repeated transitions between plastic (proliferative, symmetrically dividing) and rigid (quiescent, asymmetrically dividing, often more invasive) phenotypes having plastic and rigid networks. Thus, cancer stem cells reverse and replay cancer development multiple times. We describe network models potentially explaining cancer stem cell-like behavior. Finally, we propose novel strategies including combination therapies and multi-target drugs to overcome the Nietzschean dilemma of cancer stem cell targeting: "what does not kill me makes me stronger".

Subjects

Cell Hypoxia/physiology; Cell Transformation, Neoplastic/pathology; Cellular Senescence/physiology; Inflammation/pathology; Neoplastic Stem Cells/pathology; Humans

8.

Discovery of co-occurring driver pathways in cancer.

Zhang, Junhua; Wu, Ling-Yun; Zhang, Xiang-Sun; Zhang, Shihua.

BMC Bioinformatics ; 15: 271, 2014 Aug 09.

Article in English | MEDLINE | ID: mdl-25106096

ABSTRACT

BACKGROUND: It has been widely realized that pathways rather than individual genes govern the course of carcinogenesis. Therefore, discovering driver pathways is becoming an important step to understand the molecular mechanisms underlying cancer and design efficient treatments for cancer patients. Previous studies have focused mainly on observation of the alterations in cancer genomes at the individual gene or single pathway level. However, a great deal of evidence has indicated that multiple pathways often function cooperatively in carcinogenesis and other key biological processes. RESULTS: In this study, an exact mathematical programming method was proposed to de novo identify co-occurring mutated driver pathways (CoMDP) in carcinogenesis without any prior information beyond mutation profiles. Two possible properties of mutations that occurred in cooperative pathways were exploited to achieve this: (1) each individual pathway has high coverage and high exclusivity; and (2) the mutations between the pair of pathways showed statistically significant co-occurrence. The efficiency of CoMDP was validated first by testing on simulated data and comparing it with a previous method. Then CoMDP was applied to several real biological data including glioblastoma, lung adenocarcinoma, and ovarian carcinoma datasets. The discovered co-occurring driver pathways were here found to be involved in several key biological processes, such as cell survival and protein synthesis. Moreover, CoMDP was modified to (1) identify an extra pathway co-occurring with a known pathway and (2) detect multiple significant co-occurring driver pathways for carcinogenesis. CONCLUSIONS: The present method can be used to identify gene sets with more biological relevance than the ones currently used for the discovery of single driver pathways.

Subjects

Carcinogenesis/genetics; Neoplasms/genetics; Neoplasms/pathology; Software; Systems Biology/methods; Algorithms; Disease Progression; Humans; Mutation; Signal Transduction/genetics
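
Illustrative note on entry 8: the abstract names three properties of mutation profiles (coverage, exclusivity and co-occurrence) without giving formulas. The Python sketch below shows one simple way to count them on a patient-by-gene mutation table. The definitions, function names and toy data are assumptions, and the exact objective used by CoMDP may differ.

```python
def covered_patients(mutations, gene_set):
    """Patients with at least one mutated gene in `gene_set`."""
    return {p for p, genes in mutations.items() if genes & gene_set}


def coverage_overlap(mutations, gene_set):
    """Mutations beyond one per covered patient; 0 means the genes in
    `gene_set` are mutually exclusive across patients."""
    total_hits = sum(len(genes & gene_set) for genes in mutations.values())
    return total_hits - len(covered_patients(mutations, gene_set))


def co_occurrence(mutations, set_a, set_b):
    """Number of patients hit in both gene sets."""
    return len(covered_patients(mutations, set_a) & covered_patients(mutations, set_b))


# Hypothetical mutation profiles: patient -> set of mutated genes.
mutations = {
    "p1": {"A", "C"},
    "p2": {"B", "C"},
    "p3": {"A", "D"},
    "p4": {"B"},
    "p5": {"D"},
}
pathway_1, pathway_2 = {"A", "B"}, {"C", "D"}
print(len(covered_patients(mutations, pathway_1)),
      coverage_overlap(mutations, pathway_1),
      co_occurrence(mutations, pathway_1, pathway_2))
```

In this toy table each candidate pathway covers four of five patients with no overlap, and the two pathways are mutated together in three patients.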

9.

Breast tumor subgroups reveal diverse clinical prognostic power.

Liu, Zhaoqi; Zhang, Xiang-Sun; Zhang, Shihua.

Sci Rep ; 4: 4002, 2014 Feb 06.

Article in English | MEDLINE | ID: mdl-24499868

ABSTRACT

Predicting the outcome of cancer therapies using molecular features and clinical observations is a key goal of cancer biology, which has been addressed comprehensively using whole patient datasets without considering the effect of tumor heterogeneity. We hypothesized that molecular features and clinical observations have different prognostic abilities for different cancer subtypes, and made a systematic study using both clinical observations and gene expression data. This analysis revealed that (1) gene expression profiles and clinical features show different prognostic power for the five breast cancer subtypes; (2) gene expression data of the normal-like subgroup contains more valuable prognostic information and survival associated contexts than the other subtypes, and the patient survival time of the normal-like subtype is more predictable based on the gene expression profiles; and (3) the prognostic power of many previously reported breast cancer gene signatures increased in the normal-like subtype and reduced in the other subtypes compared with that in the whole sample set.

Subjects

Breast Neoplasms/classification; Breast Neoplasms/mortality; Gene Expression Regulation, Neoplastic; Antibiotics, Antineoplastic/therapeutic use; Antibodies, Monoclonal, Humanized/therapeutic use; Antineoplastic Agents, Phytogenic/therapeutic use; Breast Neoplasms/drug therapy; Breast Neoplasms/genetics; Doxorubicin/therapeutic use; Drug Resistance, Neoplasm; Female; Gene Expression Profiling; Humans; Oligonucleotide Array Sequence Analysis; Paclitaxel/therapeutic use; Prognosis; Receptor, ErbB-2/metabolism; Receptors, Estrogen/metabolism; Receptors, Progesterone/metabolism; Trastuzumab

10.

Discovery of cell-type specific regulatory elements in the human genome using differential chromatin modification analysis.

Chen, Chen; Zhang, Shihua; Zhang, Xiang-Sun.

Nucleic Acids Res ; 41(20): 9230-42, 2013 Nov.

Article in English | MEDLINE | ID: mdl-23945931

ABSTRACT

Chromatin modifications have been shown in recent years to play important roles in gene regulation and cell diversity. Given the rapid accumulation of genome-wide chromatin modification maps across multiple cell types, there is an urgent need for computational methods to analyze multiple maps to reveal combinatorial modification patterns and define functional DNA elements, especially those that are specific to cell types or tissues. In this study, we developed a computational method using differential chromatin modification analysis (dCMA) to identify cell-type-specific genomic regions with distinctive chromatin modifications. We then applied this method to a public data set with modification profiles of nine marks for nine cell types to evaluate its effectiveness. We found cell-type-specific elements unique to each cell type investigated. These unique features show significant cell-type-specific biological relevance and tend to be located within functional regulatory elements. These results demonstrate the power of a differential comparative epigenomic strategy in deciphering the human genome and characterizing cell specificity.

Subjects

Chromatin/metabolism; Epigenesis, Genetic; Genome, Human; Binding Sites; E1A-Associated p300 Protein/metabolism; Epigenomics/methods; Histones/metabolism; Humans; Transcription, Genetic

11.

iPcc: a novel feature extraction method for accurate disease class discovery and prediction.

Ren, Xianwen; Wang, Yong; Zhang, Xiang-Sun; Jin, Qi.

Nucleic Acids Res ; 41(14): e143, 2013 Aug.

Article in English | MEDLINE | ID: mdl-23761440

ABSTRACT

Gene expression profiling has gradually become a routine procedure for disease diagnosis and classification. In the past decade, many computational methods have been proposed, resulting in great improvements on various levels, including feature selection and algorithms for classification and clustering. In this study, we present iPcc, a novel method from the feature extraction perspective to further propel gene expression profiling technologies from bench to bedside. We define 'correlation feature space' for samples based on the gene expression profiles by iterative employment of Pearson's correlation coefficient. Numerical experiments on both simulated and real gene expression data sets demonstrate that iPcc can greatly highlight the latent patterns underlying noisy gene expression data and thus greatly improve the robustness and accuracy of the algorithms currently available for disease diagnosis and classification based on gene expression profiles.

Subjects

Algorithms; Disease/classification; Gene Expression Profiling/methods; Classification/methods; Cluster Analysis; Diagnostic Techniques and Procedures; Disease/genetics; Humans; Leukemia/classification; Leukemia/genetics; Male; Prostatic Neoplasms/classification; Prostatic Neoplasms/genetics; Psoriasis/classification; Psoriasis/genetics
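
Illustrative note on entry 11: the abstract describes a "correlation feature space" built by applying Pearson's correlation coefficient iteratively. The Python sketch below shows one plausible reading of that idea: each sample is re-described by its correlations with all samples, and the mapping is repeated. The iteration count, function names and simulated data are assumptions, and the published iPcc procedure may differ in its details.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def correlation_features(rows):
    """Re-describe each sample by its correlations with every sample."""
    return [[pearson(r, s) for s in rows] for r in rows]


def iterate_correlation(rows, n_iter=3):
    """Apply the sample-by-sample correlation mapping n_iter times."""
    feats = rows
    for _ in range(n_iter):
        feats = correlation_features(feats)
    return feats


# Simulated toy data (assumption): 6 samples x 20 genes, two groups of 3
# that differ in which half of the genes is shifted upwards.
random.seed(0)
group_a = [[random.gauss(0, 1) + (2 if g < 10 else 0) for g in range(20)] for _ in range(3)]
group_b = [[random.gauss(0, 1) + (2 if g >= 10 else 0) for g in range(20)] for _ in range(3)]
features = iterate_correlation(group_a + group_b, n_iter=3)
for row in features:
    print(" ".join(f"{x:+.2f}" for x in row))
```

The output is a square sample-by-sample matrix that can be passed to any clustering or classification routine in place of the raw expression profiles.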

12.

APG: an Active Protein-Gene network model to quantify regulatory signals in complex biological systems.

Wang, Jiguang; Sun, Yidan; Zheng, Si; Zhang, Xiang-Sun; Zhou, Huarong; Chen, Luonan.

Sci Rep ; 3: 1097, 2013.

Article in English | MEDLINE | ID: mdl-23346354

ABSTRACT

Synergistic interactions among transcription factors (TFs) and their cofactors collectively determine gene expression in complex biological systems. In this work, we develop a novel graphical model, called the Active Protein-Gene (APG) network model, to quantify regulatory signals of transcription in complex biomolecular networks by integrating both TF upstream-regulation and downstream-regulation high-throughput data. Firstly, we theoretically and computationally demonstrate the effectiveness of APG by comparing it with the traditional strategy based only on TF downstream-regulation information. We then apply this model to study spontaneous type 2 diabetic Goto-Kakizaki (GK) and Wistar control rats. Our biological experiments validate the theoretical results. In particular, SP1 is found to be a hidden TF with changed regulatory activity, and the loss of SP1 activity contributes to the increased glucose production during diabetes development. The APG model provides a theoretical basis to quantitatively elucidate transcriptional regulation by modelling TF combinatorial interactions and exploiting multilevel high-throughput information.

Subjects

Gene Regulatory Networks/genetics; Proteins/genetics; Signal Transduction/genetics; Animals; Diabetes Mellitus, Type 2/genetics; Diabetes Mellitus, Type 2/metabolism; Disease Models, Animal; Glucose/genetics; Glucose/metabolism; Immunoglobulins/genetics; Immunoglobulins/metabolism; Male; Proteins/metabolism; Rats; Rats, Wistar; Transcription Factors/genetics; Transcription, Genetic

13.

ellipsoidFN: a tool for identifying a heterogeneous set of cancer biomarkers based on gene expressions.

Ren, Xianwen; Wang, Yong; Chen, Luonan; Zhang, Xiang-Sun; Jin, Qi.

Nucleic Acids Res ; 41(4): e53, 2013 Feb 01.

Article in English | MEDLINE | ID: mdl-23262226

ABSTRACT

Computationally identifying effective biomarkers for cancers from gene expression profiles is an important and challenging task. The challenge lies in the complicated pathogenesis of cancers, which often involves the dysfunction of many genes and regulatory interactions. Thus, sophisticated classification models are urgently needed. In this study, we proposed an efficient approach, called ellipsoidFN (ellipsoid Feature Net), to model the disease complexity by ellipsoids and seek a set of heterogeneous biomarkers. Our approach achieves a non-linear classification scheme for the mixed samples by the ellipsoid concept, and at the same time uses a linear programming framework to efficiently select biomarkers from high-dimensional space. ellipsoidFN reduces the redundancy and improves the complementarity between the identified biomarkers, thus significantly enhancing the distinctiveness between cancers and normal samples, and even between cancer types. Numerical evaluation on real prostate cancer, breast cancer and leukemia gene expression datasets suggested that ellipsoidFN outperforms the state-of-the-art biomarker identification methods, and it can serve as a useful tool for cancer biomarker identification in the future. The Matlab code of ellipsoidFN is freely available from http://doc.aporc.org/wiki/EllipsoidFN.

Subjects

Biomarkers, Tumor/analysis; Software; Transcriptome; Biomarkers, Tumor/genetics; Biomarkers, Tumor/metabolism; Breast Neoplasms/genetics; Breast Neoplasms/metabolism; Female; Humans; Leukemia/genetics; Leukemia/metabolism; Male; Prostatic Neoplasms/genetics; Prostatic Neoplasms/metabolism

14.

De novo prediction of RNA-protein interactions from sequence information.

Wang, Ying; Chen, Xiaowei; Liu, Zhi-Ping; Huang, Qiang; Wang, Yong; Xu, Derong; Zhang, Xiang-Sun; Chen, Runsheng; Chen, Luonan.

Mol Biosyst ; 9(1): 133-42, 2013 Jan 27.

Article in English | MEDLINE | ID: mdl-23138266

ABSTRACT

Protein-RNA interactions are fundamentally important in understanding cellular processes. In particular, non-coding RNA-protein interactions play an important role in facilitating biological functions in signalling, transcriptional regulation, and even the progression of complex diseases. However, experimental determination of protein-RNA interactions remains time-consuming and labour-intensive. Here, we develop a novel extended naïve Bayes classifier for de novo prediction of protein-RNA interactions, using only protein and RNA sequence information. Specifically, we first collect a set of known protein-RNA interactions as gold-standard positives and extract sequence-based features to represent each protein-RNA pair. To bridge the gap between high-dimensional features and the scarcity of gold-standard positives, we select effective features by applying a cutoff to a likelihood ratio score, which not only reduces the computational complexity but also allows transparent feature integration during prediction. An extended naïve Bayes classifier is then constructed using these effective features to train a protein-RNA interaction prediction model. Numerical experiments show that our method can achieve a prediction accuracy of 0.77 even though only a small amount of protein-RNA interaction data is available. In particular, we demonstrate that the extended naïve Bayes classifier is superior to the naïve Bayes classifier because it fully considers the dependences among features. Importantly, we conduct ncRNA pull-down experiments to validate the predicted novel protein-RNA interactions and identify the interacting proteins of sbRNA CeN72 in C. elegans, which further demonstrates the effectiveness of our method.

Subjects

Bayes Theorem; Computational Biology/methods; RNA-Binding Proteins/metabolism; RNA/metabolism; Sequence Analysis, Protein/methods; Animals; Caenorhabditis elegans/chemistry; Caenorhabditis elegans/genetics; Caenorhabditis elegans/metabolism; Caenorhabditis elegans Proteins/chemistry; Caenorhabditis elegans Proteins/genetics; Caenorhabditis elegans Proteins/metabolism; Databases, Protein; Models, Biological; Protein Binding; RNA/chemistry; RNA/genetics; RNA-Binding Proteins/chemistry; ROC Curve; Reproducibility of Results
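
Illustrative note on entry 14: the abstract says features are selected by a cutoff on a likelihood ratio score but does not define the score. The Python sketch below uses a generic likelihood ratio for binary features with add-one smoothing. The feature names, the smoothing and the cutoff value are assumptions, and the sketch ignores the feature dependences that the extended classifier in the paper is said to model.

```python
def likelihood_ratio(values, labels, pseudo=1.0):
    """LR = P(feature present | positive) / P(feature present | negative),
    estimated from 0/1 feature values with add-one smoothing."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    p_pos = (sum(pos) + pseudo) / (len(pos) + 2 * pseudo)
    p_neg = (sum(neg) + pseudo) / (len(neg) + 2 * pseudo)
    return p_pos / p_neg


def select_features(features, labels, cutoff):
    """Keep the features whose likelihood ratio reaches the cutoff."""
    ratios = {name: likelihood_ratio(vals, labels) for name, vals in features.items()}
    return {name: lr for name, lr in ratios.items() if lr >= cutoff}


# Hypothetical toy data: 3 interacting and 3 non-interacting pairs,
# two binary sequence features per pair.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "motif_x": [1, 1, 1, 0, 0, 0],  # present only in interacting pairs
    "motif_y": [1, 0, 1, 1, 0, 1],  # equally common in both classes
}
selected = select_features(features, labels, cutoff=2.0)
print(selected)
```

A ratio near 1 means the feature carries little evidence either way, so only features well above the cutoff are passed on to the classifier.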

15.

GOMA: functional enrichment analysis tool based on GO modules.

Huang, Qiang; Wu, Ling-Yun; Wang, Yong; Zhang, Xiang-Sun.

Chin J Cancer ; 32(4): 195-204, 2013 Apr.

Article in English | MEDLINE | ID: mdl-23237213

ABSTRACT

Analyzing the function of gene sets is a critical step in interpreting the results of high-throughput experiments in systems biology. A variety of enrichment analysis tools have been developed in recent years, but most output a long list of significantly enriched terms that are often redundant, making it difficult to extract the most meaningful functions. In this paper, we present GOMA, a novel enrichment analysis method based on the new concept of enriched functional Gene Ontology (GO) modules. With this method, we systematically revealed functional GO modules, i.e., groups of functionally similar GO terms, via an optimization model and then ranked them by enrichment scores. Our new method simplifies enrichment analysis results by reducing redundancy, thereby preventing inconsistent enrichment results among functionally similar terms and providing more biologically meaningful results.

Subjects

Computational Biology/methods; Gene Expression Profiling; Gene Ontology; Gene Regulatory Networks; Algorithms; Breast Neoplasms/genetics; Databases, Genetic; Female; Humans; Oligonucleotide Array Sequence Analysis/methods

16.

Discovering link communities in complex networks by an integer programming model and a genetic algorithm.

Li, Zhenping; Zhang, Xiang-Sun; Wang, Rui-Sheng; Liu, Hongwei; Zhang, Shihua.

PLoS One ; 8(12): e83739, 2013.

Article in English | MEDLINE | ID: mdl-24386268

ABSTRACT

Identification of communities in complex networks is an important topic and issue in many fields such as sociology, biology, and computer science. Communities are often defined as groups of related nodes or links that correspond to functional subunits in the corresponding complex systems. While most conventional approaches have focused on discovering communities of nodes, some recent studies start partitioning links to find overlapping communities straightforwardly. In this paper, we propose a new quantity function for link community identification in complex networks. Based on this quantity function we formulate the link community partition problem into an integer programming model which allows us to partition a complex network into overlapping communities. We further propose a genetic algorithm for link community detection which can partition a network into overlapping communities without knowing the number of communities. We test our model and algorithm on both artificial networks and real-world networks. The results demonstrate that the model and algorithm are efficient in detecting overlapping community structure in complex networks.

Subjects

Algorithms; Models, Theoretical; Humans

17.

Computational systems biology in the big data era.

Wang, Yong; Zhang, Xiang-Sun; Chen, Luonan.

BMC Syst Biol ; 7 Suppl 2: S1, 2013.

Article in English | MEDLINE | ID: mdl-24564834

ABSTRACT

A report of the 6th IEEE International Conference on Systems Biology (IEEE ISB2012), 18-20 August, Xi'an, China.

Subjects

Databases, Factual; Systems Biology; Algorithms; High-Throughput Nucleotide Sequencing; Software

18.

Identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data.

Zhang, Junhua; Zhang, Shihua; Wang, Yong; Zhang, Xiang-Sun.

BMC Syst Biol ; 7 Suppl 2: S4, 2013.

Article in English | MEDLINE | ID: mdl-24565034

ABSTRACT

MOTIVATION: Understanding the molecular mechanisms underlying cancer is an important step for the effective diagnosis and treatment of cancer patients. With the huge volume of data from the large-scale cancer genomics projects, an open challenge is to distinguish driver mutations, pathways, and gene sets (or core modules) that contribute to cancer formation and progression from random passengers which accumulate in somatic cells but do not contribute to tumorigenesis. Due to mutational heterogeneity, current analyses are often restricted to known pathways and functional modules for enrichment of somatic mutations. Therefore, discovery of new pathways and functional modules is a pressing need. RESULTS: In this study, we propose a novel method to identify Mutated Core Modules in Cancer (iMCMC) without any prior information other than cancer genomic data from patients with tumors. This is a network-based approach in which three kinds of data are integrated: somatic mutations, copy number variations (CNVs), and gene expressions. Firstly, the first two datasets are merged to obtain a mutation matrix, based on which a weighted mutation network is constructed where the vertex weight corresponds to gene coverage and the edge weight corresponds to the mutual exclusivity between gene pairs. Similarly, a weighted expression network is generated from the expression matrix where the vertex and edge weights correspond to the influence of a gene mutation on other genes and the Pearson correlation of gene mutation-correlated expressions, respectively. Then an integrative network is obtained by further combining these two networks, and the most coherent subnetworks are identified by using an optimization model. Finally, we obtained the core modules for tumors by filtering with significance and exclusivity tests. 
We applied iMCMC to the Cancer Genome Atlas (TCGA) glioblastoma multiforme (GBM) and ovarian carcinoma data, and identified several mutated core modules, some of which are involved in known pathways. Most of the implicated genes are oncogenes or tumor suppressors previously reported to be related to carcinogenesis. As a comparison, we also performed iMCMC on two of the three kinds of data, i.e., the datasets combining somatic mutations with CNVs and secondly the datasets combining somatic mutations with gene expressions. The results indicate that gene expressions or CNVs indeed provide extra useful information to the original data for the identification of core modules in cancer. CONCLUSIONS: This study demonstrates the utility of our iMCMC by integrating multiple data sources to identify mutated core modules in cancer. In addition to presenting a generally applicable methodology, our findings provide several candidate pathways or core modules recurrently perturbed in GBM or ovarian carcinoma for further studies.

Subjects

Algorithms; Computational Biology/methods; DNA Copy Number Variations; Mutation; Neoplasms/genetics; Transcriptome; Carcinoma, Ovarian Epithelial; Female; Gene Regulatory Networks; Glioblastoma/genetics; Humans; Neoplasms, Glandular and Epithelial/genetics; Ovarian Neoplasms/genetics

19.

Corbi: a new R package for biological network alignment and querying.

Huang, Qiang; Wu, Ling-Yun; Zhang, Xiang-Sun.

BMC Syst Biol ; 7 Suppl 2: S6, 2013.

Article in English | MEDLINE | ID: mdl-24565104

ABSTRACT

In the last decade, many biological networks have been built from the large-scale experimental data produced by rapidly developing high-throughput techniques as well as from literature and other sources. However, this huge amount of network data has not been fully utilized because of the limited availability of biological network analysis tools. As basic and essential bioinformatics methods, biological network alignment and querying have been applied in many fields, such as predicting new protein-protein interactions (PPI). Although many algorithms have been published, the network alignment and querying problems have not been solved satisfactorily. In this paper, we extended CNetQ, a novel network querying method based on the conditional random fields model, to solve the network alignment problem by adopting an iterative bi-directional mapping strategy. The new method, called CNetA, was compared with four other methods on fifty simulated and three real PPI network alignment instances by using four structural and five biological measures. The computational experiments on the simulated data, which were generated from a biological network evolutionary model to validate the effectiveness of network alignment methods, show that CNetA achieves the best accuracy in terms of both nodes and networks. For the real data, larger biologically conserved subnetworks and larger connected subnetworks were identified, compared with the structure-dominated methods and the biology-dominated methods, respectively, which suggests that CNetA can better balance the biological and structural similarities. Further, CNetQ and CNetA have been implemented in a new R package, Corbi (http://doc.aporc.org/wiki/Corbi), and freely accessible, easy-to-use web services for CNetQ and CNetA have also been constructed based on the R package. The simulated and real datasets used in this paper are available for download at http://doc.aporc.org/wiki/CNetA/.

Subjects

Computational Biology/methods; Protein Interaction Mapping/methods; Bacteria/metabolism; Humans; Internet; Models, Biological; Saccharomyces cerevisiae/metabolism; Software

20.

Modelling biological systems from molecules to dynamical networks.

Wang, Yong; Zhang, Xiang-Sun; Chen, Luonan.

BMC Syst Biol ; 6 Suppl 1: S1, 2012.

Article in English | MEDLINE | ID: mdl-23046669

ABSTRACT

A report of the 5th IEEE International Conference on Systems Biology (IEEE ISB2011), 2-4 September 2011, Zhuhai, China.

Subjects

Models, Biological; Systems Biology/methods; Evolution, Molecular; Humans; Molecular Biology
