ABSTRACT
BACKGROUND: Biomolecular pathways and networks are dynamic and complex, and the perturbations to them which cause disease are often multiple, heterogeneous and contingent. Pathway and network visualizations, rendered on a computer or published on paper, however, tend to be static, lacking in detail, and ill-equipped to explore the variety and quantities of data available today, and the complex causes we seek to understand. RESULTS: RCytoscape integrates R (an open-ended programming environment rich in statistical power and data-handling facilities) and Cytoscape (powerful network visualization and analysis software). RCytoscape extends Cytoscape's functionality beyond what is possible with the Cytoscape graphical user interface. To illustrate the power of RCytoscape, a portion of the Glioblastoma multiforme (GBM) data set from the Cancer Genome Atlas (TCGA) is examined. Network visualization reveals previously unreported patterns in the data suggesting heterogeneous signaling mechanisms active in GBM Proneural tumors, with possible clinical relevance. CONCLUSIONS: Progress in bioinformatics and computational biology depends upon exploratory and confirmatory data analysis, upon inference, and upon modeling. These activities will eventually permit the prediction and control of complex biological systems. Network visualizations--molecular maps--created from an open-ended programming environment rich in statistical power and data-handling facilities, such as RCytoscape, will play an essential role in this progression.
Subjects
Human Genome, Software, Chromosome Mapping, Computational Biology, Glioblastoma/genetics, Humans, Genetic Models
ABSTRACT
SUMMARY: CytoscapeRPC is a plugin for Cytoscape that allows users to create, query and modify Cytoscape networks from any programming language that supports XML-RPC. This enables them to access Cytoscape functionality and visualize their data interactively without leaving the programming environment with which they are familiar. AVAILABILITY: Install through the Cytoscape plugin manager or visit the web page: http://wiki.nbic.nl/index.php/CytoscapeRPC for the user tutorial and download. CONTACT: j.j.bot@tudelft.nl.
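As a rough illustration of what "any programming language which supports XML-RPC" means in practice, the sketch below uses Python's standard `xmlrpc.client` module to serialize and decode a remote call. The method name `example.createNetwork` and its argument are placeholders chosen for illustration; they are not taken from the CytoscapeRPC documentation, and the real method names would have to come from the user tutorial.

```python
import xmlrpc.client

# Placeholder method name and argument: hypothetical, NOT from CytoscapeRPC docs.
METHOD_NAME = "example.createNetwork"
ARGUMENTS = ("demo network",)

# Serialize the call into the XML document an XML-RPC client would POST.
payload = xmlrpc.client.dumps(ARGUMENTS, methodname=METHOD_NAME)

# Decoding recovers the same arguments and method name on the receiving side.
decoded_args, decoded_method = xmlrpc.client.loads(payload)

print(payload)
```

Against a running server, `xmlrpc.client.ServerProxy(url)` would send such payloads over HTTP; the URL and port depend on how the plugin is configured and are not assumed here.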
Subjects
Computer Graphics, Software, Programming Languages
ABSTRACT
MOTIVATION: We propose an efficient method to infer combinatorial association logic networks from multiple genome-wide measurements from the same sample. We demonstrate our method on a genetical genomics dataset, in which we search for Boolean combinations of multiple genetic loci that associate with transcript levels. RESULTS: Our method provably finds the global solution and is very efficient, with runtimes up to four orders of magnitude shorter than those of an exhaustive search. This enables permutation procedures for determining accurate false positive rates and allows selection of the most parsimonious model. When applied to transcript levels measured in myeloid cells from 24 genotyped recombinant inbred mouse strains, we discovered that nine gene clusters are putatively modulated by a logical combination of trait loci rather than a single locus. A literature survey supports and further elucidates one of these findings. With our approach, optimal solutions for multi-locus logic models and accurate estimates of the associated false discovery rates become feasible. Our algorithm, therefore, offers a valuable alternative to approaches employing complex, albeit suboptimal, optimization strategies to identify complex models. AVAILABILITY: The MATLAB code of the prototype implementation is available at: http://bioinformatics.tudelft.nl/ or http://bioinformatics.nki.nl/.
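To make the search problem concrete, the following toy sketch shows the exhaustive baseline that the abstract compares against, not the authors' efficient algorithm. It assumes binary genotypes, only two-locus models, three Boolean functions, and a simple difference-in-means score; all of these choices, the data, and the function names are illustrative assumptions.

```python
from itertools import combinations

# Two-input Boolean functions considered in this sketch (an illustrative choice).
LOGIC_FUNCTIONS = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "XOR": lambda a, b: a != b,
}

def mean(values):
    return sum(values) / len(values)

def separation_score(labels, expression):
    """Absolute difference in mean expression between the two sample groups
    defined by a Boolean label per sample (0.0 if either group is empty)."""
    on = [e for flag, e in zip(labels, expression) if flag]
    off = [e for flag, e in zip(labels, expression) if not flag]
    if not on or not off:
        return 0.0
    return abs(mean(on) - mean(off))

def exhaustive_two_locus_search(genotypes, expression):
    """Try every locus pair with every Boolean function and return
    (score, locus_a, function_name, locus_b) for the best-scoring model."""
    best = (0.0, None, None, None)
    for a, b in combinations(sorted(genotypes), 2):
        for name, fn in LOGIC_FUNCTIONS.items():
            labels = [fn(x, y) for x, y in zip(genotypes[a], genotypes[b])]
            score = separation_score(labels, expression)
            if score > best[0]:
                best = (score, a, name, b)
    return best

# Made-up data: expression is high only when both L1 and L2 carry the allele.
GENOTYPES = {
    "L1": [True, True, True, True, False, False, False, False],
    "L2": [True, True, False, False, True, True, False, False],
    "L3": [True, False, True, False, True, False, True, False],
}
EXPRESSION = [5.0, 5.1, 1.0, 1.1, 0.9, 1.0, 1.2, 0.8]

best = exhaustive_two_locus_search(GENOTYPES, EXPRESSION)
print(best)
```

The cost of this loop grows with the number of locus combinations times the number of Boolean functions, which is why a method that reaches the same global optimum without full enumeration makes permutation testing practical.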
Subjects
Algorithms, Genome, Genomics/methods, Animals, Genetic Loci, Genotype, Mice, Genetic Models
ABSTRACT
Tumorigenesis is a multi-step process in which normal cells transform into malignant tumors following the accumulation of genetic mutations that enable them to evade the growth control checkpoints that would normally suppress their growth or result in apoptosis. It is therefore important to identify those combinations of mutations that collaborate in cancer development and progression. DNA copy number alterations (CNAs) are one of the ways in which cancer genes are deregulated in tumor cells. We hypothesized that synergistic interactions between cancer genes might be identified by looking for regions of co-occurring gain and/or loss. To this end we developed a scoring framework to separate truly co-occurring aberrations from passenger mutations and dominant single signals present in the data. The resulting regions of high co-occurrence can be investigated for between-region functional interactions. Analysis of high-resolution DNA copy number data from a panel of 95 hematological tumor cell lines correctly identified co-occurring recombinations at the T-cell receptor and immunoglobulin loci in T- and B-cell malignancies, respectively, showing that we can recover truly co-occurring genomic alterations. In addition, our analysis revealed networks of co-occurring genomic losses and gains that are enriched for cancer genes. These networks are also highly enriched for functional relationships between genes. We further examine sub-networks of these networks, core networks, which contain many known cancer genes. The core network for co-occurring DNA losses we find seems to be independent of the canonical cancer genes within the network. Our findings suggest that large-scale, low-intensity copy number alterations may be an important feature of cancer development or maintenance by affecting gene dosage of a large interconnected network of functionally related genes.
Subjects
Chromosome Mapping/methods, DNA Copy Number Variations/genetics, Neoplasm DNA/genetics, Neoplastic Gene Expression Regulation, Human Genome/genetics, Neoplasm Proteins/genetics, Signal Transduction/genetics, Humans
ABSTRACT
MOTIVATION: Cancers are caused by an accumulation of multiple independent mutations that collectively deregulate cellular pathways, such as those regulating cell division and cell death. The publicly available Retroviral Tagged Cancer Gene Database (RTCGD) contains the data of many insertional mutagenesis screens, in which the virally induced mutations result in tumor formation in mice. The insertion loci therefore indicate the location of putative cancer genes. Additionally, the presence of multiple independent insertions within one tumor hints at cooperation between the insertionally mutated genes. In this study we focus on the detection of statistically significant co-mutations. RESULTS: We propose a two-dimensional Gaussian Kernel Convolution method (2DGKC), a computational technique that identifies the cooperating mutations in insertional mutagenesis data. We define the Common Co-occurrence of Insertions (CCI), signifying the co-mutations that are statistically significant across all different screens in the RTCGD. Significance estimates are made on multiple scales, and the results are visualized in a scale space, thereby providing valuable extra information on the putative cooperation. The multidimensional analysis of the insertion data results in the discovery of 86 statistically significant co-mutations, indicating the presence of cooperating oncogenes that play a role in tumor development. Since oncogenes may cooperate with several members of a parallel pathway, we combined the co-occurrence data with gene family information to find significant cooperations between oncogenes and families of genes. We show, for instance, the interchangeable cooperation of Myc insertions with insertions in the Pim family. AVAILABILITY: A list of the resulting CCIs is available at: http://ict.ewi.tudelft.nl/~jeroen/CCI/CCI_list.txt.
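The core idea of a two-dimensional Gaussian kernel convolution can be sketched as follows, under strong simplifying assumptions: a single linear genome axis, one kernel width, a coarse grid, and made-up insertion positions. The function names are hypothetical, and the significance testing and multi-scale analysis described in the abstract are not reproduced.

```python
import math
from itertools import combinations

def cooccurrence_points(tumors):
    """For each tumor (a list of insertion positions), emit every pair of
    positions as one point in the 2D co-occurrence space."""
    points = []
    for insertions in tumors:
        for x, y in combinations(sorted(insertions), 2):
            points.append((x, y))
    return points

def kernel_density(points, grid, sigma):
    """Sum of isotropic 2D Gaussian kernels, evaluated at every grid pair."""
    density = {}
    for gx in grid:
        for gy in grid:
            total = 0.0
            for px, py in points:
                d2 = (gx - px) ** 2 + (gy - py) ** 2
                total += math.exp(-d2 / (2.0 * sigma ** 2))
            density[(gx, gy)] = total
    return density

# Made-up positions: three tumors share insertions near 100 and near 500.
TUMORS = [[100, 500], [110, 490], [105, 510], [300, 900]]
GRID = range(0, 1001, 50)
SIGMA = 30.0  # kernel width; the abstract describes repeating this at several scales

density = kernel_density(cooccurrence_points(TUMORS), GRID, SIGMA)
peak = max(density, key=density.get)
print(peak, round(density[peak], 3))
```

Pairs of loci that are hit together in many tumors pile up kernel mass at one spot, so a peak in the smoothed 2D landscape marks a candidate co-mutation; whether that peak is significant would still have to be assessed separately.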
Subjects
Multigene Family/genetics, Insertional Mutagenesis/methods, Neoplasms/genetics, Neoplasms/metabolism, Oncogene Proteins/metabolism, Oncogenes/genetics, Signal Transduction, Computer Simulation, Genetic Databases, Biological Models, Oncogene Proteins/genetics
ABSTRACT
Genetic risk factors often localize to noncoding regions of the genome with unknown effects on disease etiology. Expression quantitative trait loci (eQTLs) help to explain the regulatory mechanisms underlying these genetic associations. Knowledge of the context that determines the nature and strength of eQTLs may help identify cell types relevant to pathophysiology and the regulatory networks underlying disease. Here we generated peripheral blood RNA-seq data from 2,116 unrelated individuals and systematically identified context-dependent eQTLs using a hypothesis-free strategy that does not require previous knowledge of the identity of the modifiers. Of the 23,060 significant cis-regulated genes (false discovery rate (FDR) ≤ 0.05), 2,743 (12%) showed context-dependent eQTL effects. The majority of these effects were influenced by cell type composition. A set of 145 cis-eQTLs depended on type I interferon signaling. Others were modulated by specific transcription factors binding to the eQTL SNPs.
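One common way to express a context-dependent eQTL is a linear model with a genotype-by-context interaction term, expression ~ b0 + b1*g + b2*c + b3*g*c, where a non-zero b3 means the genotype effect changes with the context. The sketch below fits that model by ordinary least squares on noise-free, made-up data. It is an assumption about the general form of such a test, not a reproduction of the authors' pipeline, and the context variable `c` is a hypothetical stand-in for any modifier, such as a cell type proportion.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_interaction_model(genotype, context, expression):
    """Least-squares fit of expression ~ b0 + b1*g + b2*c + b3*g*c,
    solved through the normal equations (X^T X) b = X^T y."""
    design = [[1.0, g, c, g * c] for g, c in zip(genotype, context)]
    k = 4
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, expression))
           for i in range(k)]
    return solve_linear_system(xtx, xty)

# Made-up, noise-free data generated from known coefficients.
TRUE_COEFFICIENTS = [1.0, 0.5, 2.0, 1.5]  # b0, b1, b2, b3
GENOTYPE = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]  # allele dosage per sample
CONTEXT = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]   # hypothetical context proxy
EXPRESSION = [
    TRUE_COEFFICIENTS[0] + TRUE_COEFFICIENTS[1] * g
    + TRUE_COEFFICIENTS[2] * c + TRUE_COEFFICIENTS[3] * g * c
    for g, c in zip(GENOTYPE, CONTEXT)
]

coefficients = fit_interaction_model(GENOTYPE, CONTEXT, EXPRESSION)
print([round(value, 6) for value in coefficients])
```

With real data the fit would include noise and covariates, and the interaction coefficient would need a significance test with multiple-testing correction, which this sketch leaves out.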
Subjects
Blood Proteins/genetics, Cell Lineage/genetics, Genetic Predisposition to Disease, Single Nucleotide Polymorphism/genetics, Quantitative Trait Loci/genetics, Messenger RNA/blood, Nucleic Acid Regulatory Sequences/genetics, Cohort Studies, Female, Gene Expression Profiling, High-Throughput Nucleotide Sequencing, Humans, Male, Middle Aged, Messenger RNA/genetics
ABSTRACT
Most disease-associated genetic variants are noncoding, making it challenging to design experiments to understand their functional consequences. Identification of expression quantitative trait loci (eQTLs) has been a powerful approach to infer the downstream effects of disease-associated variants, but most of these variants remain unexplained. The analysis of DNA methylation, a key component of the epigenome, offers highly complementary data on the regulatory potential of genomic regions. Here we show that disease-associated variants have widespread effects on DNA methylation in trans that likely reflect differential occupancy of trans binding sites by cis-regulated transcription factors. Using multiple omics data sets from 3,841 Dutch individuals, we identified 1,907 established trait-associated SNPs that affect the methylation levels of 10,141 different CpG sites in trans (false discovery rate (FDR) < 0.05). These included SNPs that affect both the expression of a nearby transcription factor (such as NFKB1, CTCF and NKX2-3) and methylation of its respective binding site across the genome. Trans methylation QTLs effectively expose the downstream effects of disease-associated variants.