Results 1 - 20 of 76
1.
BMC Bioinformatics ; 20(Suppl 7): 204, 2019 May 01.
Article in English | MEDLINE | ID: mdl-31074375

ABSTRACT

A report of the 12th International Conference on Systems Biology (ISB2018), 18-21 August, Guiyang, China.


Subject(s)
Artificial Intelligence; Computational Biology/methods; Genomics/methods; Single-Cell Analysis/methods; Systems Biology/methods; Congresses as Topic; Humans
2.
Sci Rep ; 8(1): 13085, 2018 Aug 30.
Article in English | MEDLINE | ID: mdl-30166636

ABSTRACT

Functional enrichment analysis is a fundamental and challenging task in bioinformatics. Most of the current enrichment analysis approaches individually evaluate functional terms and often output a list of enriched terms with high similarity and redundancy, which makes it difficult for downstream studies to extract the underlying biological interpretation. In this paper, we proposed a novel framework to assess the performance of combination-based enrichment analysis. Using this framework, we formulated the enrichment analysis as a multi-objective combinatorial optimization problem and developed the CEA (Combination-based Enrichment Analysis) method. CEA provides the whole landscape of term combinations; therefore, it is a good benchmark for evaluating the current state-of-the-art combination-based functional enrichment methods in a comprehensive manner. We tested the effectiveness of CEA on four published microarray datasets. Enriched functional terms identified by CEA not only involve crucial biological processes of related diseases, but also have much less redundancy and can serve as a preferable representation for the enriched terms found by traditional single-term-based methods. CEA has been implemented in the R package CopTea and is available at http://github.com/wulingyun/CopTea/.


Subject(s)
Algorithms; Computational Biology; Databases, Genetic; Gene Ontology; Models, Genetic; Software; Humans; Oligonucleotide Array Sequence Analysis
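To make the redundancy problem described in this abstract concrete, here is a minimal Python sketch of a conventional single-term enrichment test, the kind of baseline the abstract contrasts CEA with. It is not code from CopTea and it does not implement CEA; the one-sided hypergeometric test, the function names, and the toy gene and term identifiers are all assumptions chosen for illustration.

```python
from math import comb

def hypergeom_pvalue(population, annotated, sample, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of background genes
    annotated:  background genes carrying the term
    sample:     size of the study gene set
    hits:       study genes carrying the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

def single_term_enrichment(study, background, term_to_genes):
    """Test every term on its own (hypothetical helper, not a CopTea function)."""
    background = set(background)
    study = set(study) & background
    pvalues = {}
    for term, genes in term_to_genes.items():
        genes = set(genes) & background
        hits = len(study & genes)
        pvalues[term] = hypergeom_pvalue(
            len(background), len(genes), len(study), hits
        )
    return pvalues

# Toy data (made up): 20 background genes, a parent term and its child term.
background = [f"g{i}" for i in range(1, 21)]
terms = {
    "parent_term": ["g1", "g2", "g3", "g4", "g5", "g6"],
    "child_term": ["g1", "g2", "g3", "g4"],
}
study = ["g1", "g2", "g3", "g4"]
pvalues = single_term_enrichment(study, background, terms)
```

On this toy input both the parent and the child term receive small p-values, although they describe essentially the same four genes. That is the kind of overlap that combination-based methods aim to reduce.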
3.
BMC Syst Biol ; 12(Suppl 4): 38, 2018 Apr 24.
Article in English | MEDLINE | ID: mdl-29745831

ABSTRACT

A report of the 11th International Conference on Systems Biology (ISB2017), 18-21 August, Shenzhen, China.


Subject(s)
Big Data; Models, Biological; Systems Biology
4.
BMC Syst Biol ; 11(Suppl 4): 75, 2017 Sep 21.
Article in English | MEDLINE | ID: mdl-28950861

ABSTRACT

BACKGROUND: High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of high-throughput experimental results, such as differentially expressed gene sets derived from microarray or RNA-seq experiments, is still a challenging task. Gene Ontology (GO) is commonly used in functional enrichment studies. The GO terms identified by current functional enrichment analysis tools often include direct parent or descendant terms in the GO hierarchical structure. Such highly redundant terms make it difficult for users to analyze the underlying biological processes. RESULTS: In this paper, a novel network-based probabilistic generative model, NetGen, was proposed to perform functional enrichment analysis. An additional protein-protein interaction (PPI) network was explicitly used to assist the identification of significantly enriched GO terms. NetGen outperformed the existing methods in the simulation studies. The effectiveness of NetGen was explored further on four real datasets. Notably, several GO terms that were not directly linked with the active gene list for each disease were identified. These terms were closely related to the corresponding diseases when checked against the curated literature. NetGen has been implemented in the R package CopTea, publicly available at GitHub ( http://github.com/wulingyun/CopTea/ ). CONCLUSION: Our procedure leads to a more reasonable and interpretable result of the functional enrichment analysis. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods, and can help to explore the underlying pathogenesis of complex diseases.


Subject(s)
Computational Biology/methods; Gene Ontology; Algorithms; Databases, Genetic; Models, Statistical
5.
BMC Syst Biol ; 11(Suppl 4): 77, 2017 Sep 21.
Article in English | MEDLINE | ID: mdl-28950868

ABSTRACT

A report of the 10th International Conference on Systems Biology (ISB2016), 19-22 August, Weihai, China.


Subject(s)
Databases, Factual; Systems Biology
6.
Bioinformatics ; 31(20): 3330-8, 2015 Oct 15.
Article in English | MEDLINE | ID: mdl-26092859

ABSTRACT

MOTIVATION: In prognosis and survival studies, an important goal is to identify multi-biomarker panels with predictive power using molecular characteristics or clinical observations. Such analysis is often challenged by censored, small-sample-size, but high-dimensional genomic profiles or clinical data. Therefore, sophisticated models and algorithms are urgently needed. RESULTS: In this study, we propose a novel Area Under Curve (AUC) optimization method for multi-biomarker panel identification named Nearest Centroid Classifier for AUC optimization (NCC-AUC). Our method is motivated by the connection between the AUC score for classification accuracy evaluation and Harrell's concordance index in survival analysis. This connection allows us to convert the survival time regression problem into a binary classification problem. An optimization model is then formulated to directly maximize AUC while minimizing the number of selected features, in order to construct a predictor in the nearest centroid classifier framework. NCC-AUC performs well on both genomic data of breast cancer and clinical data of stage IB Non-Small-Cell Lung Cancer (NSCLC). For the genomic data, NCC-AUC outperforms Support Vector Machine (SVM) and Support Vector Machine-based Recursive Feature Elimination (SVM-RFE) in classification accuracy. It tends to select a multi-biomarker panel with low average redundancy and enriched biological meaning. NCC-AUC also separates low- and high-risk cohorts more significantly than the widely used Cox model (Cox proportional-hazards regression model) and the L1-Cox model (L1-penalized Cox model). These performance gains of NCC-AUC are quite robust across 5 subtypes of breast cancer. Furthermore, on an independent clinical dataset, NCC-AUC outperforms SVM and SVM-RFE in predictive accuracy and is consistently better than the Cox model and the L1-Cox model in grouping patients into high- and low-risk categories. CONCLUSION: In summary, NCC-AUC provides a rigorous optimization framework to systematically reveal multi-biomarker panels from genomic and clinical data. It can serve as a useful tool to identify prognostic biomarkers for survival analysis. AVAILABILITY AND IMPLEMENTATION: NCC-AUC is available at http://doc.aporc.org/wiki/NCC-AUC. CONTACT: ywang@amss.ac.cn SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms; Area Under Curve; Biomarkers/analysis; Breast Neoplasms/diagnosis; Carcinoma, Non-Small-Cell Lung/diagnosis; Data Interpretation, Statistical; Genomics/methods; Lung Neoplasms/diagnosis; Breast Neoplasms/genetics; Breast Neoplasms/mortality; Carcinoma, Non-Small-Cell Lung/genetics; Carcinoma, Non-Small-Cell Lung/mortality; Female; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Humans; Lung Neoplasms/genetics; Lung Neoplasms/mortality; Models, Biological; Pattern Recognition, Automated; Prognosis; Proportional Hazards Models; Support Vector Machine; Survival Rate; Systems Biology; Systems Integration
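The link this abstract draws between AUC and Harrell's concordance index can be illustrated with a small Python sketch. It is only an illustration of the two standard definitions, not the NCC-AUC optimization model; the function names and the toy numbers are assumptions, and tied survival times are simply ignored.

```python
def concordance_index(times, events, risk):
    """Harrell's concordance index.

    A pair (i, j) is comparable when sample i has the shorter time and its
    event was observed (not censored). The pair is concordant when the
    sample with the shorter time also has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))

# Toy check (made-up numbers): label 1 = early event, label 0 = late event.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
times = [1 if label == 1 else 2 for label in labels]
events = [True] * len(labels)

c_index = concordance_index(times, events, scores)
auc_value = auc(labels, scores)
```

With only two distinct survival times and no censoring, the comparable pairs are exactly the (positive, negative) pairs, so `c_index` and `auc_value` coincide. This is the special case behind converting a survival problem into binary classification.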
7.
Semin Cancer Biol ; 30: 42-51, 2015 Feb.
Article in English | MEDLINE | ID: mdl-24412105

ABSTRACT

Cancer is increasingly perceived as a systems-level, network phenomenon. The major trend of malignant transformation can be described as a two-phase process, where an initial increase of network plasticity is followed by a decrease of plasticity at late stages of tumor development. The fluctuating intensity of stress factors, like hypoxia, inflammation and the either cooperative or hostile interactions of tumor inter-cellular networks, all increase the adaptation potential of cancer cells. This may lead to the bypass of cellular senescence, and to the development of cancer stem cells. We propose that the central tenet of cancer stem cell definition lies exactly in the indefinability of cancer stem cells. Actual properties of cancer stem cells depend on the individual "stress-history" of the given tumor. Cancer stem cells are characterized by an extremely large evolvability (i.e. a capacity to generate heritable phenotypic variation), which corresponds well with the defining hallmarks of cancer stem cells: the possession of the capacity to self-renew and to repeatedly re-build the heterogeneous lineages of cancer cells that comprise a tumor in new environments. Cancer stem cells represent a cell population, which is adapted to adapt. We argue that the high evolvability of cancer stem cells is helped by their repeated transitions between plastic (proliferative, symmetrically dividing) and rigid (quiescent, asymmetrically dividing, often more invasive) phenotypes having plastic and rigid networks. Thus, cancer stem cells reverse and replay cancer development multiple times. We describe network models potentially explaining cancer stem cell-like behavior. Finally, we propose novel strategies including combination therapies and multi-target drugs to overcome the Nietzschean dilemma of cancer stem cell targeting: "what does not kill me makes me stronger".


Subject(s)
Cell Hypoxia/physiology; Cell Transformation, Neoplastic/pathology; Cellular Senescence/physiology; Inflammation/pathology; Neoplastic Stem Cells/pathology; Humans
8.
BMC Bioinformatics ; 15: 271, 2014 Aug 09.
Article in English | MEDLINE | ID: mdl-25106096

ABSTRACT

BACKGROUND: It has been widely realized that pathways rather than individual genes govern the course of carcinogenesis. Therefore, discovering driver pathways is becoming an important step to understand the molecular mechanisms underlying cancer and design efficient treatments for cancer patients. Previous studies have focused mainly on observation of the alterations in cancer genomes at the individual gene or single pathway level. However, a great deal of evidence has indicated that multiple pathways often function cooperatively in carcinogenesis and other key biological processes. RESULTS: In this study, an exact mathematical programming method was proposed to de novo identify co-occurring mutated driver pathways (CoMDP) in carcinogenesis without any prior information beyond mutation profiles. Two possible properties of mutations that occurred in cooperative pathways were exploited to achieve this: (1) each individual pathway has high coverage and high exclusivity; and (2) the mutations between the pair of pathways showed statistically significant co-occurrence. The efficiency of CoMDP was validated first by testing on simulated data and comparing it with a previous method. Then CoMDP was applied to several real biological data including glioblastoma, lung adenocarcinoma, and ovarian carcinoma datasets. The discovered co-occurring driver pathways were here found to be involved in several key biological processes, such as cell survival and protein synthesis. Moreover, CoMDP was modified to (1) identify an extra pathway co-occurring with a known pathway and (2) detect multiple significant co-occurring driver pathways for carcinogenesis. CONCLUSIONS: The present method can be used to identify gene sets with more biological relevance than the ones currently used for the discovery of single driver pathways.


Subject(s)
Carcinogenesis/genetics; Neoplasms/genetics; Neoplasms/pathology; Software; Systems Biology/methods; Algorithms; Disease Progression; Humans; Mutation; Signal Transduction/genetics
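The two properties named in this abstract (high coverage and high exclusivity within a pathway, co-occurrence between pathways) can be made tangible with a short Python sketch. The abstract does not give CoMDP's formulas, so the measures below are assumptions: coverage is counted as the number of patients with at least one mutated gene in the set, and exclusivity is expressed through a coverage overlap that is zero when no patient has more than one mutated gene in the set. Patient and gene names are hypothetical, and no significance test or optimization is performed.

```python
def covered_patients(mutations, genes):
    """Patients carrying at least one mutation in `genes`."""
    genes = set(genes)
    return {patient for patient, mutated in mutations.items() if mutated & genes}

def coverage_overlap(mutations, genes):
    """Mutations in `genes` beyond one per covered patient.

    Zero means the gene set is perfectly mutually exclusive.
    """
    genes = set(genes)
    total = sum(len(mutated & genes) for mutated in mutations.values())
    return total - len(covered_patients(mutations, genes))

def co_mutated_patients(mutations, genes_a, genes_b):
    """Patients mutated in both gene sets (raw co-occurrence, no statistics)."""
    return covered_patients(mutations, genes_a) & covered_patients(mutations, genes_b)

# Toy mutation profile (made up): patient -> set of mutated genes.
mutations = {
    "p1": {"A", "C"},
    "p2": {"B", "D"},
    "p3": {"A", "D"},
    "p4": {"B"},
    "p5": set(),
}
pathway_1 = {"A", "B"}
pathway_2 = {"C", "D"}
```

In this toy profile each candidate pathway is mutually exclusive on its own, while three of the five patients carry a mutation in both, which is the pattern a co-occurrence search would look for.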
9.
Sci Rep ; 4: 4002, 2014 Feb 06.
Article in English | MEDLINE | ID: mdl-24499868

ABSTRACT

Predicting the outcome of cancer therapies using molecular features and clinical observations is a key goal of cancer biology, which has been addressed comprehensively using whole patient datasets without considering the effect of tumor heterogeneity. We hypothesized that molecular features and clinical observations have different prognostic abilities for different cancer subtypes, and made a systematic study using both clinical observations and gene expression data. This analysis revealed that (1) gene expression profiles and clinical features show different prognostic power for the five breast cancer subtypes; (2) gene expression data of the normal-like subgroup contains more valuable prognostic information and survival associated contexts than the other subtypes, and the patient survival time of the normal-like subtype is more predictable based on the gene expression profiles; and (3) the prognostic power of many previously reported breast cancer gene signatures increased in the normal-like subtype and reduced in the other subtypes compared with that in the whole sample set.


Subject(s)
Breast Neoplasms/classification; Breast Neoplasms/mortality; Gene Expression Regulation, Neoplastic; Antibiotics, Antineoplastic/therapeutic use; Antibodies, Monoclonal, Humanized/therapeutic use; Antineoplastic Agents, Phytogenic/therapeutic use; Breast Neoplasms/drug therapy; Breast Neoplasms/genetics; Doxorubicin/therapeutic use; Drug Resistance, Neoplasm; Female; Gene Expression Profiling; Humans; Oligonucleotide Array Sequence Analysis; Paclitaxel/therapeutic use; Prognosis; Receptor, ErbB-2/metabolism; Receptors, Estrogen/metabolism; Receptors, Progesterone/metabolism; Trastuzumab
10.
Nucleic Acids Res ; 41(20): 9230-42, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23945931

ABSTRACT

Chromatin modifications have been comprehensively shown in recent years to play important roles in gene regulation and cell diversity. Given the rapid accumulation of genome-wide chromatin modification maps across multiple cell types, there is an urgent need for computational methods that analyze multiple maps to reveal combinatorial modification patterns and define functional DNA elements, especially those that are specific to cell types or tissues. In this study, we developed a computational method using differential chromatin modification analysis (dCMA) to identify cell-type-specific genomic regions with distinctive chromatin modifications. We then applied this method to a public data set with modification profiles of nine marks for nine cell types to evaluate its effectiveness. We found cell-type-specific elements unique to each cell type investigated. These unique features show significant cell-type-specific biological relevance and tend to be located within functional regulatory elements. These results demonstrate the power of a differential comparative epigenomic strategy in deciphering the human genome and characterizing cell specificity.


Subject(s)
Chromatin/metabolism; Epigenesis, Genetic; Genome, Human; Binding Sites; E1A-Associated p300 Protein/metabolism; Epigenomics/methods; Histones/metabolism; Humans; Transcription, Genetic
11.
Nucleic Acids Res ; 41(14): e143, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23761440

ABSTRACT

Gene expression profiling has gradually become a routine procedure for disease diagnosis and classification. In the past decade, many computational methods have been proposed, resulting in great improvements on various levels, including feature selection and algorithms for classification and clustering. In this study, we present iPcc, a novel method from the feature extraction perspective to further propel gene expression profiling technologies from bench to bedside. We define 'correlation feature space' for samples based on the gene expression profiles by iterative employment of Pearson's correlation coefficient. Numerical experiments on both simulated and real gene expression data sets demonstrate that iPcc can greatly highlight the latent patterns underlying noisy gene expression data and thus greatly improve the robustness and accuracy of the algorithms currently available for disease diagnosis and classification based on gene expression profiles.


Subject(s)
Algorithms; Disease/classification; Gene Expression Profiling/methods; Classification/methods; Cluster Analysis; Diagnostic Techniques and Procedures; Disease/genetics; Humans; Leukemia/classification; Leukemia/genetics; Male; Prostatic Neoplasms/classification; Prostatic Neoplasms/genetics; Psoriasis/classification; Psoriasis/genetics
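The idea of building a "correlation feature space" by iterating Pearson's correlation coefficient can be sketched in a few lines of Python. This is one plausible reading of the abstract, not the published iPcc implementation: each sample is re-described by its vector of correlations with all samples, and the step is repeated. The function names, the fixed number of rounds, and the toy expression values are assumptions.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(samples):
    """Sample-by-sample matrix of Pearson correlations."""
    return [[pearson(a, b) for b in samples] for a in samples]

def iterate_correlation(samples, rounds=2):
    """Replace each sample's features by its correlations, `rounds` times."""
    features = samples
    for _ in range(rounds):
        features = correlation_matrix(features)
    return features

# Toy expression profiles (made up): rows are samples, columns are genes.
samples = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.5, 2.5, 2.5, 4.5, 5.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [5.0, 3.5, 3.5, 2.0, 1.5],
]
features = iterate_correlation(samples, rounds=2)
```

In the toy data the first two samples and the last two samples form two groups; after the iterations, samples within a group are strongly positively correlated and samples from different groups strongly negatively correlated in the new feature space.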
12.
Sci Rep ; 3: 1097, 2013.
Article in English | MEDLINE | ID: mdl-23346354

ABSTRACT

Synergistic interactions among transcription factors (TFs) and their cofactors collectively determine gene expression in complex biological systems. In this work, we develop a novel graphical model, called the Active Protein-Gene (APG) network model, to quantify regulatory signals of transcription in complex biomolecular networks by integrating both TF upstream-regulation and downstream-regulation high-throughput data. First, we theoretically and computationally demonstrate the effectiveness of APG by comparing it with the traditional strategy based only on TF downstream-regulation information. We then apply this model to study spontaneous type 2 diabetic Goto-Kakizaki (GK) and Wistar control rats. Our biological experiments validate the theoretical results. In particular, SP1 is found to be a hidden TF with changed regulatory activity, and the loss of SP1 activity contributes to the increased glucose production during diabetes development. The APG model provides a theoretical basis for quantitatively elucidating transcriptional regulation by modelling TF combinatorial interactions and exploiting multilevel high-throughput information.


Subject(s)
Gene Regulatory Networks/genetics; Proteins/genetics; Signal Transduction/genetics; Animals; Diabetes Mellitus, Type 2/genetics; Diabetes Mellitus, Type 2/metabolism; Disease Models, Animal; Glucose/genetics; Glucose/metabolism; Immunoglobulins/genetics; Immunoglobulins/metabolism; Male; Proteins/metabolism; Rats; Rats, Wistar; Transcription Factors/genetics; Transcription, Genetic
13.
Nucleic Acids Res ; 41(4): e53, 2013 Feb 01.
Article in English | MEDLINE | ID: mdl-23262226

ABSTRACT

Computationally identifying effective biomarkers for cancers from gene expression profiles is an important and challenging task. The challenge lies in the complicated pathogenesis of cancers, which often involves the dysfunction of many genes and regulatory interactions. Thus, a sophisticated classification model is urgently needed. In this study, we proposed an efficient approach, called ellipsoidFN (ellipsoid Feature Net), to model the disease complexity by ellipsoids and seek a set of heterogeneous biomarkers. Our approach achieves a non-linear classification scheme for the mixed samples by the ellipsoid concept, and at the same time uses a linear programming framework to efficiently select biomarkers from high-dimensional space. ellipsoidFN reduces the redundancy and improves the complementarity between the identified biomarkers, thus significantly enhancing the distinction between cancers and normal samples, and even between cancer types. Numerical evaluation on real prostate cancer, breast cancer and leukemia gene expression datasets suggested that ellipsoidFN outperforms the state-of-the-art biomarker identification methods, and it can serve as a useful tool for cancer biomarker identification in the future. The Matlab code of ellipsoidFN is freely available from http://doc.aporc.org/wiki/EllipsoidFN.


Subject(s)
Biomarkers, Tumor/analysis; Software; Transcriptome; Biomarkers, Tumor/genetics; Biomarkers, Tumor/metabolism; Breast Neoplasms/genetics; Breast Neoplasms/metabolism; Female; Humans; Leukemia/genetics; Leukemia/metabolism; Male; Prostatic Neoplasms/genetics; Prostatic Neoplasms/metabolism
14.
Mol Biosyst ; 9(1): 133-42, 2013 Jan 27.
Article in English | MEDLINE | ID: mdl-23138266

ABSTRACT

Protein-RNA interactions are fundamentally important in understanding cellular processes. In particular, non-coding RNA-protein interactions play an important role in facilitating biological functions in signalling, transcriptional regulation, and even the progression of complex diseases. However, experimental determination of protein-RNA interactions remains time-consuming and labour-intensive. Here, we develop a novel extended naïve-Bayes-classifier for de novo prediction of protein-RNA interactions, using only protein and RNA sequence information. Specifically, we first collect a set of known protein-RNA interactions as gold-standard positives and extract sequence-based features to represent each protein-RNA pair. To bridge the gap between the high dimensionality of the features and the scarcity of gold-standard positives, we select effective features by applying a cutoff to a likelihood ratio score, which not only reduces the computational complexity but also allows transparent feature integration during prediction. An extended naïve Bayes classifier is then constructed using these effective features to train a protein-RNA interaction prediction model. Numerical experiments show that our method achieves a prediction accuracy of 0.77 even though only a small amount of protein-RNA interaction data is available. In particular, we demonstrate that the extended naïve-Bayes-classifier is superior to the naïve-Bayes-classifier because it fully considers the dependences among features. Importantly, we conduct ncRNA pull-down experiments to validate the predicted novel protein-RNA interactions and identify the interacting proteins of sbRNA CeN72 in C. elegans, which further demonstrates the effectiveness of our method.


Subject(s)
Bayes Theorem; Computational Biology/methods; RNA-Binding Proteins/metabolism; RNA/metabolism; Sequence Analysis, Protein/methods; Animals; Caenorhabditis elegans/chemistry; Caenorhabditis elegans/genetics; Caenorhabditis elegans/metabolism; Caenorhabditis elegans Proteins/chemistry; Caenorhabditis elegans Proteins/genetics; Caenorhabditis elegans Proteins/metabolism; Databases, Protein; Models, Biological; Protein Binding; RNA/chemistry; RNA/genetics; RNA-Binding Proteins/chemistry; ROC Curve; Reproducibility of Results
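Selecting features by a likelihood ratio cutoff and then combining them in a naïve Bayes fashion, as this abstract describes, might look roughly like the following Python sketch. It covers only the plain naïve Bayes part with binary presence/absence features; the "extended" handling of feature dependences is not attempted. The pseudocount smoothing, the cutoff value, the function names, and the toy feature labels are all assumptions.

```python
import math

def likelihood_ratios(positives, negatives, pseudocount=1.0):
    """Likelihood ratio P(feature | positive) / P(feature | negative).

    positives / negatives: lists of feature sets, one set per protein-RNA pair.
    A pseudocount keeps the ratio finite for features unseen in one class.
    """
    features = set().union(*positives, *negatives)
    ratios = {}
    for feature in features:
        in_pos = sum(feature in pair for pair in positives)
        in_neg = sum(feature in pair for pair in negatives)
        p_pos = (in_pos + pseudocount) / (len(positives) + 2 * pseudocount)
        p_neg = (in_neg + pseudocount) / (len(negatives) + 2 * pseudocount)
        ratios[feature] = p_pos / p_neg
    return ratios

def select_features(ratios, cutoff):
    """Keep only features whose likelihood ratio reaches the cutoff."""
    return {feature: r for feature, r in ratios.items() if r >= cutoff}

def log_score(pair_features, selected):
    """Naive Bayes style score: sum of log ratios of the selected features present."""
    return sum(math.log(r) for feature, r in selected.items() if feature in pair_features)

# Toy training data (made up): each set lists the features present in one pair.
positives = [{"kmer_AA", "kmer_GC"}, {"kmer_AA"}, {"kmer_AA", "kmer_UU"}]
negatives = [{"kmer_GC"}, {"kmer_UU"}, {"kmer_GC", "kmer_UU"}]

ratios = likelihood_ratios(positives, negatives)
selected = select_features(ratios, cutoff=2.0)
score = log_score({"kmer_AA", "kmer_GC"}, selected)
```

Because each retained feature contributes its own log ratio to the score, it stays visible which features drive a prediction, which is one way to read the "transparent feature integration" mentioned in the abstract.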
15.
Chin J Cancer ; 32(4): 195-204, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23237213

ABSTRACT

Analyzing the function of gene sets is a critical step in interpreting the results of high-throughput experiments in systems biology. A variety of enrichment analysis tools have been developed in recent years, but most output a long list of significantly enriched terms that are often redundant, making it difficult to extract the most meaningful functions. In this paper, we present GOMA, a novel enrichment analysis method based on the new concept of enriched functional Gene Ontology (GO) modules. With this method, we systematically revealed functional GO modules, i.e., groups of functionally similar GO terms, via an optimization model and then ranked them by enrichment scores. Our new method simplifies enrichment analysis results by reducing redundancy, thereby preventing inconsistent enrichment results among functionally similar terms and providing more biologically meaningful results.


Subject(s)
Computational Biology/methods; Gene Expression Profiling; Gene Ontology; Gene Regulatory Networks; Algorithms; Breast Neoplasms/genetics; Databases, Genetic; Female; Humans; Oligonucleotide Array Sequence Analysis/methods
16.
PLoS One ; 8(12): e83739, 2013.
Article in English | MEDLINE | ID: mdl-24386268

ABSTRACT

Identification of communities in complex networks is an important topic in many fields such as sociology, biology, and computer science. Communities are often defined as groups of related nodes or links that correspond to functional subunits in the corresponding complex systems. While most conventional approaches have focused on discovering communities of nodes, some recent studies have started partitioning links to find overlapping communities directly. In this paper, we propose a new quantity function for link community identification in complex networks. Based on this quantity function, we formulate the link community partition problem as an integer programming model, which allows us to partition a complex network into overlapping communities. We further propose a genetic algorithm for link community detection that can partition a network into overlapping communities without knowing the number of communities. We test our model and algorithm on both artificial networks and real-world networks. The results demonstrate that the model and algorithm are efficient in detecting overlapping community structure in complex networks.


Subject(s)
Algorithms; Models, Theoretical; Humans
17.
BMC Syst Biol ; 7 Suppl 2: S1, 2013.
Article in English | MEDLINE | ID: mdl-24564834

ABSTRACT

A report of the 6th IEEE International Conference on Systems Biology (IEEE ISB2012), 18-20 August, Xi'an, China.


Subject(s)
Databases, Factual; Systems Biology; Algorithms; High-Throughput Nucleotide Sequencing; Software
18.
BMC Syst Biol ; 7 Suppl 2: S4, 2013.
Article in English | MEDLINE | ID: mdl-24565034

ABSTRACT

MOTIVATION: Understanding the molecular mechanisms underlying cancer is an important step for the effective diagnosis and treatment of cancer patients. With the huge volume of data from the large-scale cancer genomics projects, an open challenge is to distinguish driver mutations, pathways, and gene sets (or core modules) that contribute to cancer formation and progression from random passengers which accumulate in somatic cells but do not contribute to tumorigenesis. Due to mutational heterogeneity, current analyses are often restricted to known pathways and functional modules for enrichment of somatic mutations. Therefore, discovery of new pathways and functional modules is a pressing need. RESULTS: In this study, we propose a novel method to identify Mutated Core Modules in Cancer (iMCMC) without any prior information other than cancer genomic data from patients with tumors. This is a network-based approach in which three kinds of data are integrated: somatic mutations, copy number variations (CNVs), and gene expressions. Firstly, the first two datasets are merged to obtain a mutation matrix, based on which a weighted mutation network is constructed where the vertex weight corresponds to gene coverage and the edge weight corresponds to the mutual exclusivity between gene pairs. Similarly, a weighted expression network is generated from the expression matrix where the vertex and edge weights correspond to the influence of a gene mutation on other genes and the Pearson correlation of gene mutation-correlated expressions, respectively. Then an integrative network is obtained by further combining these two networks, and the most coherent subnetworks are identified by using an optimization model. Finally, we obtained the core modules for tumors by filtering with significance and exclusivity tests. 
We applied iMCMC to the Cancer Genome Atlas (TCGA) glioblastoma multiforme (GBM) and ovarian carcinoma data, and identified several mutated core modules, some of which are involved in known pathways. Most of the implicated genes are oncogenes or tumor suppressors previously reported to be related to carcinogenesis. As a comparison, we also performed iMCMC on two of the three kinds of data, i.e., the datasets combining somatic mutations with CNVs and secondly the datasets combining somatic mutations with gene expressions. The results indicate that gene expressions or CNVs indeed provide extra useful information to the original data for the identification of core modules in cancer. CONCLUSIONS: This study demonstrates the utility of our iMCMC by integrating multiple data sources to identify mutated core modules in cancer. In addition to presenting a generally applicable methodology, our findings provide several candidate pathways or core modules recurrently perturbed in GBM or ovarian carcinoma for further studies.


Subject(s)
Algorithms; Computational Biology/methods; DNA Copy Number Variations; Mutation; Neoplasms/genetics; Transcriptome; Carcinoma, Ovarian Epithelial; Female; Gene Regulatory Networks; Glioblastoma/genetics; Humans; Neoplasms, Glandular and Epithelial/genetics; Ovarian Neoplasms/genetics
19.
BMC Syst Biol ; 7 Suppl 2: S6, 2013.
Article in English | MEDLINE | ID: mdl-24565104

ABSTRACT

In the last decade, many biological networks have been built from the large-scale experimental data produced by rapidly developing high-throughput techniques, as well as from the literature and other sources. However, this huge amount of network data has not been fully utilized because of the limited availability of biological network analysis tools. As a basic and essential bioinformatics method, biological network alignment and querying have been applied in many fields, such as predicting new protein-protein interactions (PPI). Although many algorithms have been published, the network alignment and querying problems have not been solved satisfactorily. In this paper, we extended CNetQ, a novel network querying method based on the conditional random fields model, to solve the network alignment problem by adopting an iterative bi-directional mapping strategy. The new method, called CNetA, was compared with four other methods on fifty simulated and three real PPI network alignment instances using four structural and five biological measures. The computational experiments on the simulated data, which were generated from a biological network evolutionary model to validate the effectiveness of network alignment methods, show that CNetA achieves the best accuracy in terms of both nodes and networks. For the real data, larger biologically conserved subnetworks and larger connected subnetworks were identified, compared with the structure-dominated methods and the biology-dominated methods, respectively, which suggests that CNetA can better balance the biological and structural similarities. Further, CNetQ and CNetA have been implemented in a new R package, Corbi (http://doc.aporc.org/wiki/Corbi), and freely accessible, easy-to-use web services for CNetQ and CNetA have also been constructed based on the R package. The simulated and real datasets used in this paper are available for download at http://doc.aporc.org/wiki/CNetA/.


Subject(s)
Computational Biology/methods; Protein Interaction Mapping/methods; Bacteria/metabolism; Humans; Internet; Models, Biological; Saccharomyces cerevisiae/metabolism; Software
20.
BMC Syst Biol ; 6 Suppl 1: S1, 2012.
Article in English | MEDLINE | ID: mdl-23046669

ABSTRACT

A report of the 5th IEEE International Conference on Systems Biology (IEEE ISB2011), 2-4 September 2011, Zhuhai, China.


Subject(s)
Models, Biological; Systems Biology/methods; Evolution, Molecular; Humans; Molecular Biology