1.
Sci Signal ; 11(531)2018 05 22.
Article in English | MEDLINE | ID: mdl-29789295

ABSTRACT

Protein posttranslational modifications (PTMs) have typically been studied independently, yet many proteins are modified by more than one PTM type, and cell signaling pathways somehow integrate this information. We coupled immunoprecipitation using PTM-specific antibodies with tandem mass tag (TMT) mass spectrometry to simultaneously examine phosphorylation, methylation, and acetylation in 45 lung cancer cell lines compared to normal lung tissue and to cell lines treated with anticancer drugs. This simultaneous, large-scale, integrative analysis of these PTMs using a cluster-filtered network (CFN) approach revealed that cell signaling pathways were outlined by clustering patterns in PTMs. We used the t-distributed stochastic neighbor embedding (t-SNE) method to identify PTM clusters and then integrated each with known protein-protein interactions (PPIs) to elucidate functional cell signaling pathways. The CFN identified known and previously unknown cell signaling pathways in lung cancer cells that were not present in normal lung epithelial tissue. In various proteins modified by more than one type of PTM, the incidence of those PTMs exhibited inverse relationships, suggesting that molecular exclusive "OR" gates determine a large number of signal transduction events. We also showed that the acetyltransferase EP300 appears to be a hub in the network of pathways involving different PTMs. In addition, the data shed light on the mechanism of action of geldanamycin, an HSP90 inhibitor. Together, the findings reveal that cell signaling pathways mediated by acetylation, methylation, and phosphorylation regulate the cytoskeleton, membrane traffic, and RNA binding protein-mediated control of gene expression.


Subject(s)
Biomarkers, Tumor/metabolism , Computational Biology/methods , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Lung Neoplasms/metabolism , Protein Interaction Maps , Protein Processing, Post-Translational , Acetylation , Antineoplastic Agents/pharmacology , Biomarkers, Tumor/genetics , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/metabolism , Carcinoma, Non-Small-Cell Lung/pathology , Gene Expression Profiling , Humans , Lung Neoplasms/drug therapy , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Methylation , Phosphorylation , Proteomics , Signal Transduction , Small Cell Lung Carcinoma/drug therapy , Small Cell Lung Carcinoma/genetics , Small Cell Lung Carcinoma/metabolism , Small Cell Lung Carcinoma/pathology , Tumor Cells, Cultured
2.
Pac Symp Biocomput ; 23: 32-43, 2018.
Article in English | MEDLINE | ID: mdl-29218867

ABSTRACT

Gene expression profiling of in vitro drug perturbations is useful for many biomedical discovery applications including drug repurposing and elucidation of drug mechanisms. However, limited data availability across cell types has hindered our capacity to leverage or explore the cell-specificity of these perturbations. While recent efforts have generated a large number of drug perturbation profiles across a variety of human cell types, many gaps remain in this combinatorial drug-cell space. Hence, we asked whether it is possible to fill these gaps by predicting cell-specific drug perturbation profiles using available expression data from related conditions--i.e. from other drugs and cell types. We developed a computational framework that first arranges existing profiles into a three-dimensional array (or tensor) indexed by drugs, genes, and cell types, and then uses either local (nearest-neighbors) or global (tensor completion) information to predict unmeasured profiles. We evaluate prediction accuracy using a variety of metrics, and find that the two methods have complementary performance, each superior in different regions in the drug-cell space. Predictions achieve correlations of 0.68 with true values, and maintain accurate differentially expressed genes (AUC 0.81). Finally, we demonstrate that the predicted profiles add value for making downstream associations with drug targets and therapeutic classes.
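
The local (nearest-neighbors) idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary layout, the function names `pearson` and `predict_profile`, the use of Pearson correlation as the similarity, and the similarity-weighted average are all assumptions made for illustration. An unmeasured (drug, cell type) profile is estimated from drugs that were measured in the target cell type and that behave similarly to the query drug in the cell types where both were measured.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def predict_profile(tensor, drug, cell, k=2):
    """Estimate the missing profile for (drug, cell).

    tensor maps (drug, cell) -> list of per-gene expression changes
    (a hypothetical sparse stand-in for the drug x gene x cell array).
    Returns None when no positively correlated neighbor drug exists.
    """
    cells_of_target = {c for (d, c) in tensor if d == drug}
    candidates = {d for (d, c) in tensor if c == cell and d != drug}
    scored = []
    for other in candidates:
        # Compare the two drugs only in cell types where both were measured.
        shared = [c for c in cells_of_target if (other, c) in tensor]
        if not shared:
            continue
        sim = sum(pearson(tensor[(drug, c)], tensor[(other, c)])
                  for c in shared) / len(shared)
        scored.append((sim, other))
    # Keep the k most similar drugs, discarding non-positive similarities.
    top = [(s, d) for s, d in sorted(scored, reverse=True)[:k] if s > 0]
    if not top:
        return None
    total = sum(s for s, _ in top)
    n_genes = len(tensor[(top[0][1], cell)])
    # Similarity-weighted average of the neighbors' profiles in the target cell.
    return [sum(s * tensor[(d, cell)][g] for s, d in top) / total
            for g in range(n_genes)]
```

A global method such as tensor completion would instead fit a low-rank model to all measured entries at once; the abstract reports that the two approaches are complementary.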


Subject(s)
Transcriptome/drug effects , Algorithms , Cells/drug effects , Cells/metabolism , Computational Biology/methods , Databases, Genetic , Databases, Pharmaceutical , Drug Discovery , Drug Repositioning , Gene Expression Profiling/statistics & numerical data , Humans
3.
Nat Commun ; 7: 12846, 2016 Sep 26.
Article in English | MEDLINE | ID: mdl-27667448

ABSTRACT

Gene expression data are accumulating exponentially in public repositories. Reanalysis and integration of themed collections from these studies may provide new insights, but requires further human curation. Here we report a crowdsourcing project to annotate and reanalyse a large number of gene expression profiles from Gene Expression Omnibus (GEO). Through a massive open online course on Coursera, over 70 participants from over 25 countries identify and annotate 2,460 single-gene perturbation signatures, 839 disease versus normal signatures, and 906 drug perturbation signatures. All these signatures are unique and are manually validated for quality. Global analysis of these signatures confirms known associations and identifies novel associations between genes, diseases and drugs. The manually curated signatures are used as a training set to develop classifiers for extracting similar signatures from the entire GEO repository. We develop a web portal to serve these signatures for query, download and visualization.

4.
Sci Rep ; 6: 26419, 2016 05 26.
Article in English | MEDLINE | ID: mdl-27226058

ABSTRACT

The glucocorticoid receptor (GR), a nuclear receptor and major drug target, has a highly conserved minor splice variant, GRγ, which differs by a single arginine within the DNA binding domain. GRγ, which comprises 10% of all GR transcripts, is constitutively expressed and tightly conserved through mammalian evolution, suggesting an important non-redundant role. However, to date no specific role for GRγ has been reported. We discovered significant differences in subcellular localisation and in nuclear-cytoplasmic shuttling in response to ligand. In addition, the GRγ transcriptome and protein interactome were distinct, with a gene ontology signal for mitochondrial regulation that was confirmed using Seahorse technology. We propose that evolutionary conservation of the single additional arginine in GRγ is driven by a distinct, non-redundant functional profile, including regulation of mitochondrial function.


Subject(s)
Adenosine Triphosphate/metabolism , Mitochondria/genetics , Mitochondria/metabolism , Receptors, Glucocorticoid/metabolism , A549 Cells , Cell Nucleus/metabolism , Cytoplasm , Evolution, Molecular , Gene Expression Profiling , Gene Regulatory Networks , HEK293 Cells , Humans , Models, Molecular , Protein Binding , Proteomics , Receptors, Glucocorticoid/chemistry
5.
Bioinformatics ; 32(15): 2338-45, 2016 08 01.
Article in English | MEDLINE | ID: mdl-27153606

ABSTRACT

MOTIVATION: Adverse drug reactions (ADRs) are a central consideration during drug development. Here we present a machine learning classifier to prioritize ADRs for approved drugs and pre-clinical small-molecule compounds by combining chemical structure (CS) and gene expression (GE) features. The GE data is from the Library of Integrated Network-based Cellular Signatures (LINCS) L1000 dataset that measured changes in GE before and after treatment of human cells with over 20 000 small-molecule compounds including most of the FDA-approved drugs. Using various benchmarking methods, we show that the integration of GE data with the CS of the drugs can significantly improve the predictability of ADRs. Moreover, transforming GE features to enrichment vectors of biological terms further improves the predictive capability of the classifiers. The most predictive biological-term features can assist in understanding the drug mechanisms of action. Finally, we applied the classifier to all >20 000 small-molecules profiled, and developed a web portal for browsing and searching predictive small-molecule/ADR connections. AVAILABILITY AND IMPLEMENTATION: The interface for the adverse event predictions for the >20 000 LINCS compounds is available at http://maayanlab.net/SEP-L1000/ CONTACT: avi.maayan@mssm.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Gene Expression , Gene Library , Humans
6.
Article in English | MEDLINE | ID: mdl-28413689

ABSTRACT

The library of integrated network-based cellular signatures (LINCS) L1000 data set currently comprises over a million gene expression profiles of chemically perturbed human cell lines. Through several unique intrinsic and extrinsic benchmarking schemes, we demonstrate that processing the L1000 data with the characteristic direction (CD) method significantly improves signal to noise compared with the MODZ method currently used to compute L1000 signatures. The CD processed L1000 signatures are served through a state-of-the-art web-based search engine application called L1000CDS2. The L1000CDS2 search engine provides prioritization of thousands of small-molecule signatures, and their pairwise combinations, predicted to either mimic or reverse an input gene expression signature using two methods. The L1000CDS2 search engine also predicts drug targets for all the small molecules profiled by the L1000 assay that we processed. Targets are predicted by computing the cosine similarity between the L1000 small-molecule signatures and a large collection of signatures extracted from the gene expression omnibus (GEO) for single-gene perturbations in mammalian cells. We applied L1000CDS2 to prioritize small molecules that are predicted to reverse expression in 670 disease signatures also extracted from GEO, and prioritized small molecules that can mimic expression of 22 endogenous ligand signatures profiled by the L1000 assay. As a case study, to further demonstrate the utility of L1000CDS2, we collected expression signatures from human cells infected with Ebola virus at 30, 60 and 120 min. Querying these signatures with L1000CDS2 we identified kenpaullone, a GSK3B/CDK2 inhibitor that we show, in subsequent experiments, has a dose-dependent efficacy in inhibiting Ebola infection in vitro without causing cellular toxicity in human cell lines. In summary, the L1000CDS2 tool can be applied in many biological and biomedical settings, while improving the extraction of knowledge from the LINCS L1000 resource.
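
The abstract names cosine similarity for target prediction but does not spell out the two methods used for mimic/reverse ranking. The sketch below therefore makes an assumption: it reuses cosine similarity to rank a small library of signatures against a query, with the most negative scores treated as "reversers" and the most positive as "mimickers". The function names and the toy data layout are hypothetical and are not part of L1000CDS2.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_signatures(query, library, mode="reverse"):
    """Rank library signatures against a query signature.

    library maps a signature name to a vector over the same genes as query.
    mode="mimic" puts the highest cosine first; mode="reverse" puts the
    lowest (most anti-correlated) cosine first.
    """
    scores = {name: cosine(query, sig) for name, sig in library.items()}
    return sorted(scores.items(), key=lambda kv: kv[1],
                  reverse=(mode == "mimic"))
```

For example, with the query `[1, -1, 0]` and a library containing `[-1, 1, 0]`, that entry scores -1 and is ranked first in `reverse` mode.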

7.
BMC Syst Biol ; 9: 26, 2015 Jun 06.
Article in English | MEDLINE | ID: mdl-26048415

ABSTRACT

BACKGROUND: Thousands of biological and biomedical investigators study the functional role of single genes and their protein products in normal physiology and in disease. The findings from these studies are reported in research articles that stimulate new research. It is now established that a complex regulatory network controls human cellular fate, and this community of researchers is continually unraveling this network's topology. Attempts to integrate results from such accumulated knowledge resulted in literature-based protein-protein interaction networks (PPINs) and pathway databases. These databases are widely used by the community to analyze new data collected from emerging genome-wide studies, with the assumption that the data within these literature-based databases are the ground truth and contain no biases. While suspicion of research focus biases is growing, concrete proof of them is still missing. They are difficult to prove because the real PPINs are mostly unknown. RESULTS: Here we analyzed the longitudinal discovery process of literature-based mammalian and yeast PPINs to observe that these networks are discovered non-uniformly. The pattern of discovery is related to a theoretical concept proposed by Kauffman called "expanding the adjacent possible". We introduce a network discovery model which explicitly includes the space of possibilities in the form of a true underlying PPIN. CONCLUSIONS: Our model strongly suggests that research focus biases exist in the observed discovery dynamics of these networks. In summary, more care should be taken when using PPIN databases for analysis of newly acquired data, and when considering prior knowledge when designing new experiments.


Subject(s)
Computational Biology/methods , Protein Interaction Mapping , Animals , Humans , Mice , Saccharomyces cerevisiae/metabolism
8.
Article in English | MEDLINE | ID: mdl-26848405

ABSTRACT

Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed, nor has it been implemented as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.

9.
Eur J Med Chem ; 87: 611-23, 2014 Nov 24.
Article in English | MEDLINE | ID: mdl-25299683

ABSTRACT

A virtual screening procedure was applied to identify new tankyrase inhibitors. Through pharmacophore screening of a compound collection from the SPECS database, the methoxy[1]benzothieno[2,3-c]quinolin-6(5H)-one scaffold was identified as a nicotinamide mimetic able to inhibit tankyrase activity at low micromolar concentration. In order to improve potency and selectivity, tandem structure-based and scaffold hopping approaches were carried out over the new scaffold, leading to the discovery of 2-(phenyl)-3H-benzo[4,5]thieno[3,2-d]pyrimidin-4-one as a powerful chemotype suitable for tankyrase inhibition. The best compound, 2-(4-tert-butyl-phenyl)-3H-benzo[4,5]thieno[3,2-d]pyrimidin-4-one (23), displayed nanomolar potencies (IC50s TNKS-1 = 21 nM and TNKS-2 = 29 nM) and high selectivity when profiled against several other PARPs. Furthermore, striking inhibition of Wnt signaling, as well as of cell growth, was observed when assaying 23 in DLD-1 cancer cells.


Subject(s)
Enzyme Inhibitors/pharmacology , Tankyrases/antagonists & inhibitors , Enzyme Inhibitors/chemistry , Magnetic Resonance Spectroscopy , Models, Molecular , Spectrometry, Mass, Electrospray Ionization
10.
Trends Pharmacol Sci ; 35(9): 450-60, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25109570

ABSTRACT

Data sets from recent large-scale projects can be integrated into one unified puzzle that can provide new insights into how drugs and genetic perturbations applied to human cells are linked to whole-organism phenotypes. Data that report how drugs affect the phenotype of human cell lines and how drugs induce changes in gene and protein expression in human cell lines can be combined with knowledge about human disease, side effects induced by drugs, and mouse phenotypes. Such data integration efforts can be achieved through the conversion of data from the various resources into single-node-type networks, gene-set libraries, or multipartite graphs. This approach can lead us to the identification of more relationships between genes, drugs, and phenotypes as well as benchmark computational and experimental methods. Overall, this lean 'Big Data' integration strategy will bring us closer toward the goal of realizing personalized medicine.


Subject(s)
Data Mining , Databases, Factual , Animals , Humans , Pharmacology , Systems Biology
11.
PLoS One ; 9(6): e100660, 2014.
Article in English | MEDLINE | ID: mdl-24949636

ABSTRACT

MYH9 encodes non-muscle myosin heavy chain IIA (NMMHCIIA), the predominant force-generating ATPase in non-muscle cells. Several lines of evidence implicate a role for MYH9 in podocytopathies. However, NMMHCIIA's function in podocytes remains unknown. To better understand this function, we performed immuno-precipitation followed by mass-spectrometry proteomics to identify proteins interacting with the NMMHCIIA-enriched actin-myosin complexes. Computational analyses revealed that these proteins belong to functional networks including regulators of cytoskeletal organization, metabolism and networks regulated by the HIV-1 gene nef. We further characterized the subcellular localization of NMMHCIIA within podocytes in vivo, and found it to be present within the podocyte major foot processes. Finally, we tested the effect of loss of MYH9 expression in podocytes in vitro, and found that it was necessary for cytoskeletal organization. Our results provide the first survey of NMMHCIIA-enriched actin-myosin-interacting proteins within the podocyte, demonstrating the important role of NMMHCIIA in organizing the elaborate cytoskeleton structure of podocytes. Our characterization of NMMHCIIA's functions goes beyond the podocyte, providing important insights into its general molecular role.


Subject(s)
Actins/metabolism , Molecular Motor Proteins/metabolism , Myosin Heavy Chains/metabolism , Proteomics , Actins/biosynthesis , Animals , Cytoskeletal Proteins/biosynthesis , Cytoskeletal Proteins/metabolism , Gene Expression Regulation , Humans , Membrane Proteins/biosynthesis , Membrane Proteins/metabolism , Mice , Molecular Motor Proteins/biosynthesis , Multiprotein Complexes/metabolism , Myosin Heavy Chains/biosynthesis , Podocytes/metabolism , Protein Interaction Maps , nef Gene Products, Human Immunodeficiency Virus/metabolism
12.
BMC Bioinformatics ; 15: 79, 2014 Mar 21.
Article in English | MEDLINE | ID: mdl-24650281

ABSTRACT

BACKGROUND: Identifying differentially expressed genes (DEG) is a fundamental step in studies that perform genome-wide expression profiling. Typically, DEG are identified by univariate approaches such as Significance Analysis of Microarrays (SAM) or Linear Models for Microarray Data (LIMMA) for processing cDNA microarrays, and differential gene expression analysis based on the negative binomial distribution (DESeq) or Empirical analysis of Digital Gene Expression data in R (edgeR) for RNA-seq profiling. RESULTS: Here we present a new geometrical multivariate approach to identify DEG called the Characteristic Direction. We demonstrate that the Characteristic Direction method is significantly more sensitive than existing methods for identifying DEG in the context of transcription factor (TF) and drug perturbation responses over a large number of microarray experiments. We also benchmarked the Characteristic Direction method using synthetic data, as well as RNA-Seq data. A large collection of microarray expression data from TF perturbations (73 experiments) and drug perturbations (130 experiments) extracted from the Gene Expression Omnibus (GEO), as well as an RNA-Seq study that profiled genome-wide gene expression and STAT3 DNA binding in two subtypes of diffuse large B-cell lymphoma, were used for benchmarking the method using real data. ChIP-Seq data identifying DNA binding sites of the perturbed TFs, as well as known drug targets of the perturbing drugs, were used as prior knowledge silver-standard for validation. In all cases the Characteristic Direction DEG calling method outperformed other methods. We find that when drugs are applied to cells in various contexts, the proteins that interact with the drug-targets are differentially expressed and more of the corresponding genes are discovered by the Characteristic Direction method. In addition, we show that the Characteristic Direction conceptualization can be used to perform improved gene set enrichment analyses when compared with the gene-set enrichment analysis (GSEA) and the hypergeometric test. CONCLUSIONS: The application of the Characteristic Direction method may shed new light on relevant biological mechanisms that would have remained undiscovered by the current state-of-the-art DEG methods. The method is freely accessible via various open source code implementations using four popular programming languages: R, Python, MATLAB and Mathematica, all available at: http://www.maayanlab.net/CD.
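
For context on the baseline named in this abstract, the following is a minimal sketch of a hypergeometric enrichment test, one of the methods the Characteristic Direction approach is compared against. It is not the Characteristic Direction method itself. The function names and the choice of a one-sided upper-tail p-value are assumptions for illustration; `math.comb` requires Python 3.8 or later.

```python
from math import comb


def hypergeom_pvalue(N, K, n, k):
    """Upper-tail probability P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: background size, K: genes in the gene set,
    n: genes in the query list, k: observed overlap.
    """
    upper = min(K, n)
    # comb() returns 0 when the lower index exceeds the upper one,
    # so impossible overlaps contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def enrichment(query_genes, gene_set, background_size):
    """Return (overlap size, p-value) for a query list against one gene set."""
    query, members = set(query_genes), set(gene_set)
    overlap = len(query & members)
    return overlap, hypergeom_pvalue(background_size, len(members),
                                     len(query), overlap)
```

A smaller p-value indicates that the overlap between the query list and the gene set is larger than expected when drawing genes at random from the background.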


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , Proteins/genetics , Humans , Oligonucleotide Array Sequence Analysis , Proteins/metabolism , Software
13.
Bioinformatics ; 29(15): 1872-8, 2013 Aug 01.
Article in English | MEDLINE | ID: mdl-23749960

ABSTRACT

MOTIVATION: Networks are vital to computational systems biology research, but visualizing them is a challenge. For networks larger than ∼100 nodes and ∼200 links, ball-and-stick diagrams fail to convey much information. To address this, we developed Network2Canvas (N2C), a web application that provides an alternative way to view networks. N2C visualizes networks by placing nodes on a square toroidal canvas. The network nodes are clustered on the canvas using simulated annealing to maximize local connections where a node's brightness is made proportional to its local fitness. The interactive canvas is implemented in HyperText Markup Language (HTML)5 with the JavaScript library Data-Driven Documents (D3). We applied N2C to visualize 30 canvases made from human and mouse gene-set libraries and 6 canvases made from the Food and Drug Administration (FDA)-approved drug-set libraries. Given lists of genes or drugs, enriched terms are highlighted on the canvases, and their degree of clustering is computed. Because N2C produces visual patterns of enriched terms on canvases, a trained eye can detect signatures instantly. In summary, N2C provides a new flexible method to visualize large networks and can be used to perform and visualize gene-set and drug-set enrichment analyses. AVAILABILITY: N2C is freely available at http://www.maayanlab.net/N2C and is open source. CONTACT: avi.maayan@mssm.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
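
The canvas layout described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the N2C code: local fitness is taken to be the number of linked nodes among a cell's eight wrap-around neighbors, candidate moves are random swaps of two cells, and the cooling schedule and parameter names (`t0`, `cooling`, `steps`) are hypothetical. The grid side should be at least 4 so that the eight toroidal neighbors are distinct cells.

```python
import math
import random


def neighbors(i, j, size):
    """Yield the eight neighbors of cell (i, j) on a toroidal size x size grid."""
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di or dj:
                yield (i + di) % size, (j + dj) % size


def local_fitness(grid, links, size, i, j):
    """Count neighbors of cell (i, j) linked to the node placed there."""
    linked = links.get(grid[i][j], ())
    return sum(1 for ni, nj in neighbors(i, j, size) if grid[ni][nj] in linked)


def total_fitness(grid, links, size):
    """Sum of local fitness over all cells (each adjacent link counts twice)."""
    return sum(local_fitness(grid, links, size, i, j)
               for i in range(size) for j in range(size))


def anneal(nodes, links, size, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Place nodes on the torus and maximize total fitness by annealing.

    links maps each node to the set of nodes it is connected to (symmetric).
    Empty cells hold None. Returns (best_grid, best_score).
    """
    rng = random.Random(seed)
    cells = list(nodes) + [None] * (size * size - len(nodes))
    rng.shuffle(cells)
    grid = [cells[r * size:(r + 1) * size] for r in range(size)]
    current = total_fitness(grid, links, size)
    best, best_grid = current, [row[:] for row in grid]
    temp = t0
    for _ in range(steps):
        i1, j1 = rng.randrange(size), rng.randrange(size)
        i2, j2 = rng.randrange(size), rng.randrange(size)
        grid[i1][j1], grid[i2][j2] = grid[i2][j2], grid[i1][j1]
        candidate = total_fitness(grid, links, size)
        delta = candidate - current
        # Always accept improvements; accept worse layouts with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
            if current > best:
                best, best_grid = current, [row[:] for row in grid]
        else:
            grid[i1][j1], grid[i2][j2] = grid[i2][j2], grid[i1][j1]
        temp *= cooling
    return best_grid, best
```

In a rendering step, each cell's brightness could then be scaled by `local_fitness` for that cell, in line with the abstract's description; recomputing the full fitness at every step keeps the sketch short but would be too slow for large canvases.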


Subject(s)
Computer Graphics , Drug-Related Side Effects and Adverse Reactions , Gene Regulatory Networks , Software , Animals , Embryonic Stem Cells/metabolism , Gene Library , Humans , Internet , Mice , Pharmaceutical Preparations/chemistry , Systems Biology
14.
Cell Stem Cell ; 13(2): 205-18, 2013 Aug 01.
Article in English | MEDLINE | ID: mdl-23770078

ABSTRACT

Definitive hematopoiesis emerges during embryogenesis via an endothelial-to-hematopoietic transition. We attempted to induce this process in mouse fibroblasts by screening a panel of factors for hemogenic activity. We identified a combination of four transcription factors, Gata2, Gfi1b, cFos, and Etv6, that efficiently induces endothelial-like precursor cells, with the subsequent appearance of hematopoietic cells. The precursor cells express a human CD34 reporter, Sca1, and Prominin1 within a global endothelial transcription program. Emergent hematopoietic cells possess nascent hematopoietic stem cell gene-expression profiles and cell-surface phenotypes. After transgene silencing and reaggregation culture, the specified cells generate hematopoietic colonies in vitro. Thus, we show that a simple combination of transcription factors is sufficient to induce a complex, dynamic, and multistep developmental program in vitro. These findings provide insights into the specification of definitive hemogenesis and a platform for future development of patient-specific stem and progenitor cells, as well as more-differentiated blood products.


Subject(s)
Fibroblasts/metabolism , Hematopoiesis , Animals , Biomarkers/metabolism , Cell Aggregation , Cell Lineage/genetics , Cell Membrane/metabolism , Cells, Cultured , Colony-Forming Units Assay , Endothelial Cells/cytology , Endothelial Cells/metabolism , Fibroblasts/cytology , Gene Expression Profiling , Gene Expression Regulation , Green Fluorescent Proteins/metabolism , Hematopoiesis/genetics , Humans , Mice , Mice, Inbred C57BL , Phenotype , Transcription Factors/metabolism
15.
J Am Soc Nephrol ; 24(5): 801-11, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23559582

ABSTRACT

The Connectivity Map database contains microarray signatures of gene expression derived from approximately 6000 experiments that examined the effects of approximately 1300 single drugs on several human cancer cell lines. We used these data to prioritize pairs of drugs expected to reverse the changes in gene expression observed in the kidneys of a mouse model of HIV-associated nephropathy (Tg26 mice). We predicted that the combination of an angiotensin-converting enzyme (ACE) inhibitor and a histone deacetylase inhibitor would maximally reverse the disease-associated expression of genes in the kidneys of these mice. Testing the combination of these inhibitors in Tg26 mice revealed an additive renoprotective effect, as suggested by reduction of proteinuria, improvement of renal function, and attenuation of kidney injury. Furthermore, we observed the predicted treatment-associated changes in the expression of selected genes and pathway components. In summary, these data suggest that the combination of an ACE inhibitor and a histone deacetylase inhibitor could have therapeutic potential for various kidney diseases. In addition, this study provides proof-of-concept that drug-induced expression signatures have potential use in predicting the effects of combination drug therapy.


Subject(s)
Angiotensin-Converting Enzyme Inhibitors/pharmacology , Histone Deacetylase Inhibitors/pharmacology , Kidney/drug effects , Animals , Benzazepines/pharmacology , Cell Line, Tumor , Drug Synergism , Drug Therapy, Combination , Gene Expression Profiling , Humans , Hydroxamic Acids/pharmacology , Kidney Diseases/drug therapy , Male , Mice , Vorinostat
16.
BMC Bioinformatics ; 14: 128, 2013 Apr 15.
Article in English | MEDLINE | ID: mdl-23586463

ABSTRACT

BACKGROUND: System-wide profiling of genes and proteins in mammalian cells produces lists of differentially expressed genes/proteins that need to be further analyzed for their collective functions in order to extract new knowledge. Once unbiased lists of genes or proteins are generated from such experiments, these lists are used as input for computing enrichment with existing lists created from prior knowledge organized into gene-set libraries. While many enrichment analysis tools and gene-set library databases have been developed, there is still room for improvement. RESULTS: Here, we present Enrichr, an integrative web-based and mobile software application that includes new gene-set libraries, an alternative approach to rank enriched terms, and various interactive visualization approaches to display enrichment results using the JavaScript library, Data Driven Documents (D3). The software can also be embedded into any tool that performs gene list analysis. We applied Enrichr to analyze nine cancer cell lines by comparing their enrichment signatures to the enrichment signatures of matched normal tissues. We observed a common pattern of upregulation of the polycomb group PRC2 and enrichment for the histone mark H3K27me3 in many cancer cell lines, as well as alterations in Toll-like receptor and interleukin signaling in K562 cells when compared with normal myeloid CD33+ cells. Such analyses provide global visualization of critical differences between normal tissues and cancer cell lines but can be applied to many other scenarios. CONCLUSIONS: Enrichr is an easy-to-use, intuitive, web-based enrichment analysis tool providing various types of visualization summaries of collective functions of gene lists. Enrichr is open source and freely available online at: http://amp.pharm.mssm.edu/Enrichr.


Subject(s)
Software , Transcriptome , Animals , Cell Line, Tumor , Gene Expression Regulation, Neoplastic , Gene Library , Histones/metabolism , Humans , Internet , Mice , Polycomb-Group Proteins/metabolism , Proteins/genetics , User-Computer Interface
17.
BMC Bioinformatics ; 13: 156, 2012 Jul 02.
Article in English | MEDLINE | ID: mdl-22748121

ABSTRACT

BACKGROUND: Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent. RESULTS: Genes2FANs is a web-based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user's PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting disease genes through FANs, separating diseases into two categories. CONCLUSIONS: Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and help form hypotheses for further experimentation. Our finding that disease genes in many cancers are mostly connected through PPIs whereas other complex diseases, such as autism and type-2 diabetes, are mostly connected through FANs without PPIs, can guide better strategies for disease gene discovery. Genes2FANs is available at: http://actin.pharm.mssm.edu/genes2FANs.


Subject(s)
Gene Regulatory Networks , Protein Interaction Mapping , Software , Animals , Disease/genetics , Genes , Humans , Internet , Mice , PubMed
18.
BMC Syst Biol ; 6: 89, 2012 Jul 23.
Article in English | MEDLINE | ID: mdl-22824380

ABSTRACT

BACKGROUND: The skeleton of complex systems can be represented as networks where vertices represent entities and edges represent the relations between these entities. Often it is impossible, or expensive, to determine the network structure by experimental validation of the binary interactions between every vertex pair. It is usually more practical to infer the network from surrogate observations. Network inference is the process by which an underlying network of relations between entities is determined from indirect evidence. While many algorithms have been developed to infer networks from quantitative data, less attention has been paid to methods that infer networks from repeated co-occurrence of entities in related sets. This type of data is ubiquitous in systems biology and in other areas of complex systems research, so such methods would be of great utility. RESULTS: Here we present a general method for network inference from repeated observations of sets of related entities. Given experimental observations of such sets, we infer the underlying network connecting these entities by generating an ensemble of networks consistent with the data. The frequency of occurrence of a given link throughout this ensemble is interpreted as the probability that the link is present in the underlying real network, conditioned on the data. Exponential random graphs are used to generate and sample the ensemble of consistent networks, and we take an algorithmic approach to numerically execute the inference method. The effectiveness of the method is demonstrated on synthetic data before the approach is applied to problems in systems biology and systems pharmacology, and to the construction of a co-authorship collaboration network.
We predict direct protein-protein interactions from high-throughput mass-spectrometry proteomics; integrate data from ChIP-seq and from loss-of-function/gain-of-function experiments followed by expression profiling to infer a network of associations between pluripotency regulators; extract a network that connects 53 cancer drugs to each other and to 34 severe adverse events by mining the FDA's Adverse Event Reporting System (AERS); and construct a co-authorship network that connects Mount Sinai School of Medicine investigators. The predicted networks and software to create networks from entity-set libraries are provided online at http://www.maayanlab.net/S2N. CONCLUSIONS: The network inference method presented here can be applied to resolve different types of networks in current systems biology and systems pharmacology as well as in other fields of research.
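
The step of reading link probabilities off an ensemble can be sketched in Python. This is a simplified stand-in, not the authors' method: the abstract uses exponential random graphs to sample the ensemble, whereas this sketch assumes that "consistent with the data" means every observed set is connected, and samples networks by drawing a random tree over each set. The function names and the toy observations are hypothetical.

```python
import random
from collections import Counter


def random_tree_edges(members, rng):
    """Connect `members` with a random tree (random attachment order)."""
    nodes = list(members)
    rng.shuffle(nodes)
    edges = set()
    for i in range(1, len(nodes)):
        partner = nodes[rng.randrange(i)]  # attach to an already-placed node
        edges.add(frozenset((nodes[i], partner)))
    return edges


def link_frequencies(observed_sets, n_samples=1000, seed=0):
    """Fraction of sampled networks in which each link appears.

    Assumption: a network is 'consistent' when every observed set is
    connected. Random trees replace the exponential random graph sampler.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        network = set()
        for members in observed_sets:
            network |= random_tree_edges(sorted(members), rng)
        counts.update(network)
    return {tuple(sorted(e)): c / n_samples for e, c in counts.items()}


# Toy observations; entity names are placeholders.
observed = [{"A", "B"}, {"A", "B", "C"}, {"B", "C"}]
freq = link_frequencies(observed)
print(freq)
```

Links that are forced by a two-member set (A-B and B-C here) appear in every sampled network, while A-C appears only in some of them, which is the sense in which ensemble frequency acts as a link probability.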


Subject(s)
Computational Biology/methods , Animals , Antineoplastic Agents/adverse effects , Embryonic Stem Cells/cytology , Humans , Mice , Protein Interaction Maps , Time Factors
19.
Sci Signal ; 4(190): tr3, 2011 Sep 06.
Article in English | MEDLINE | ID: mdl-21917717

ABSTRACT

This Teaching Resource provides lecture notes, slides, and a problem set for a series of lectures from a course entitled "Systems Biology: Biomedical Modeling." The materials cover a lecture introducing the mathematical concepts behind principal components analysis (PCA). The lecture describes how to handle large data sets with correlation methods and unsupervised clustering using this popular method of analysis.
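
As a minimal illustration of the mathematics behind PCA, the Python sketch below centers a small data set, forms its sample covariance matrix, and finds the first principal component by power iteration. Power iteration is chosen here only because it needs no external library; the abstract does not say which computation the lecture uses, and the function name and toy data are assumptions.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Leading eigenvector of the sample covariance matrix via power iteration."""
    n, d = len(rows), len(rows[0])
    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Repeatedly multiply by the covariance matrix and renormalize.
    # Assumes the all-ones start vector is not orthogonal to the answer.
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Toy data lying exactly on the line y = 2x.
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
pc1 = first_principal_component(data)
print(pc1)
```

Because the toy points lie on the line y = 2x, the first component points along the direction (1, 2), normalized to unit length.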


Subject(s)
Data Interpretation, Statistical , Models, Biological , Systems Biology/methods , Systems Biology/trends , Systems Biology/education
20.
Sci Signal ; 4(190): tr4, 2011 Sep 06.
Article in English | MEDLINE | ID: mdl-21917718

ABSTRACT

This Teaching Resource provides lecture notes, slides, and a problem set for a series of lectures that introduce the mathematical concepts behind gene-set enrichment analysis (GSEA) and were part of a course entitled "Systems Biology: Biomedical Modeling." GSEA is a statistical functional enrichment analysis commonly applied to identify enrichment of biological functional categories in sets of ranked differentially expressed genes from genome-wide mRNA expression data sets.
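
The core of an enrichment score on a ranked gene list can be sketched in Python as a running sum that steps up at genes in the set and down at genes outside it. This is a simplified, unweighted version with no permutation-based significance test; the function name and the toy ranking are assumptions, and the lecture materials may present a different variant.

```python
def enrichment_score(ranked_genes, gene_set):
    """Maximum deviation from zero of an unweighted running-sum statistic."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up for a gene in the set, down otherwise; steps sum to zero.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking from most up-regulated to most down-regulated (placeholder names).
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
top_set = {"g1", "g2"}
es = enrichment_score(ranked, top_set)
print(es)
```

A set concentrated at the top of the ranking gives a positive score and a set concentrated at the bottom gives a negative one; judging significance would additionally require comparing against scores from permuted data.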


Subject(s)
Gene Expression Profiling , Gene Expression Regulation , Models, Biological , Systems Biology/education , Systems Biology/methods , Systems Biology/trends