Search | Secretaria de Estado da Saúde

Enrichr: a comprehensive gene set enrichment analysis web server 2016 update.

Kuleshov, Maxim V; Jones, Matthew R; Rouillard, Andrew D; Fernandez, Nicolas F; Duan, Qiaonan; Wang, Zichen; Koplev, Simon; Jenkins, Sherry L; Jagodnik, Kathleen M; Lachmann, Alexander; McDermott, Michael G; Monteiro, Caroline D; Gundersen, Gregory W; Ma'ayan, Avi.

Nucleic Acids Res ; 44(W1): W90-7, 2016 07 08.

Article in English | MEDLINE | ID: mdl-27141961

ABSTRACT

Enrichment analysis is a popular method for analyzing gene sets generated by genome-wide experiments. Here we present a significant update to one of the tools in this domain called Enrichr. Enrichr currently contains a large collection of diverse gene set libraries available for analysis and download. In total, Enrichr currently contains 180 184 annotated gene sets from 102 gene set libraries. New features have been added to Enrichr including the ability to submit fuzzy sets, upload BED files, improved application programming interface and visualization of the results as clustergrams. Overall, Enrichr is a comprehensive resource for curated gene sets and a search engine that accumulates biological knowledge for further biological discoveries. Enrichr is freely available at: http://amp.pharm.mssm.edu/Enrichr.

Subjects

Computational Biology/methods , Gene Library , Gene Ontology , User-Computer Interface , Benchmarking , Computational Biology/statistics & numerical data , Databases, Genetic , Gene Expression Profiling , Genome, Human , Humans , Internet , Molecular Sequence Annotation
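The enrichment analysis described in the Enrichr abstract can be illustrated with a small over-representation test. The abstract does not state which statistic Enrichr applies, so this is only a minimal sketch that assumes a one-sided hypergeometric test; the function names, the toy library, and the background size are all hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(overlap, query_size, set_size, background_size):
    """Return P(X >= overlap) for a query of query_size genes drawn from a
    background in which set_size genes belong to the annotated gene set."""
    upper = min(query_size, set_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def rank_library(query_genes, library, background_size):
    """Score every term of a gene set library (term -> gene list) against the
    query and return (term, overlapping genes, p-value), smallest p first."""
    query = set(query_genes)
    results = []
    for term, genes in library.items():
        members = set(genes)
        hits = query & members
        p = hypergeom_enrichment_p(
            len(hits), len(query), len(members), background_size
        )
        results.append((term, sorted(hits), p))
    return sorted(results, key=lambda row: row[2])


# Hypothetical toy library and query, for illustration only.
toy_library = {"TERM_A": ["g1", "g2"], "TERM_B": ["g3", "g4"]}
ranked = rank_library(["g1", "g2"], toy_library, background_size=4)
```

In practice the background size and the multiple-testing correction across many libraries matter a great deal; neither is addressed in this sketch.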

GEN3VA: aggregation and analysis of gene expression signatures from related studies.

Gundersen, Gregory W; Jagodnik, Kathleen M; Woodland, Holly; Fernandez, Nicholas F; Sani, Kevin; Dohlman, Anders B; Ung, Peter Man-Un; Monteiro, Caroline D; Schlessinger, Avner; Ma'ayan, Avi.

BMC Bioinformatics ; 17(1): 461, 2016 Nov 15.

Article in English | MEDLINE | ID: mdl-27846806

ABSTRACT

BACKGROUND: Genome-wide gene expression profiling of mammalian cells is becoming a staple of many published biomedical and biological research studies. Such data is deposited into data repositories such as the Gene Expression Omnibus (GEO) for potential reuse. However, these repositories currently do not provide simple interfaces to systematically analyze collections of related studies. RESULTS: Here we present GENE Expression and Enrichment Vector Analyzer (GEN3VA), a web-based system that enables the integrative analysis of aggregated collections of tagged gene expression signatures identified and extracted from GEO. Each tagged collection of signatures is presented in a report that consists of heatmaps of the differentially expressed genes; principal component analysis of all signatures; enrichment analysis with several gene set libraries across all signatures, which we term enrichment vector analysis; and global mapping of small molecules that are predicted to reverse or mimic each signature in the aggregate. We demonstrate how GEN3VA can be used to identify common molecular mechanisms of aging by analyzing tagged signatures from 244 studies that compared young vs. old tissues in mammalian systems. In a second case study, we collected 86 signatures from treatment of human cells with dexamethasone, a glucocorticoid receptor (GR) agonist. Our analysis confirms consensus GR target genes and predicts potential drug mimickers. CONCLUSIONS: GEN3VA can be used to identify, aggregate, and analyze themed collections of gene expression signatures from diverse but related studies. Such integrative analyses can be used to address concerns about data reproducibility, confirm results across labs, and discover new collective knowledge by data reuse. GEN3VA is an open-source web-based system that is freely available at: http://amp.pharm.mssm.edu/gen3va .

Subjects

Aging/genetics , Dexamethasone/pharmacology , Gene Expression Regulation, Neoplastic/drug effects , Software , Transcriptome , Animals , Gene Expression Profiling/methods , Humans , Reproducibility of Results

GEO2Enrichr: browser extension and server app to extract gene sets from GEO and analyze them for biological functions.

Gundersen, Gregory W; Jones, Matthew R; Rouillard, Andrew D; Kou, Yan; Monteiro, Caroline D; Feldmann, Axel S; Hu, Kevin S; Ma'ayan, Avi.

Bioinformatics ; 31(18): 3060-2, 2015 Sep 15.

Article in English | MEDLINE | ID: mdl-25971742

ABSTRACT

MOTIVATION: Identification of differentially expressed genes is an important step in extracting knowledge from gene expression profiling studies. The raw expression data from microarray and other high-throughput technologies is deposited into the Gene Expression Omnibus (GEO) and served as Simple Omnibus Format in Text (SOFT) files. However, extracting and analyzing differentially expressed genes from GEO requires significant computational skills. RESULTS: Here we introduce GEO2Enrichr, a browser extension for extracting differentially expressed gene sets from GEO and analyzing those sets with Enrichr, an independent gene set enrichment analysis tool containing over 70 000 annotated gene sets organized into 75 gene-set libraries. GEO2Enrichr adds JavaScript code to GEO web-pages; this code scrapes user selected accession numbers and metadata, and then, with one click, users can submit this information to a web-server application that downloads the SOFT files, parses, cleans and normalizes the data, identifies the differentially expressed genes, and then pipes the resulting gene lists to Enrichr for downstream functional analysis. GEO2Enrichr opens a new avenue for adding functionality to major bioinformatics resources such as GEO by integrating tools and resources without the need for a plug-in architecture. Importantly, GEO2Enrichr helps researchers to quickly explore hypotheses with little technical overhead, lowering the barrier of entry for biologists by automating data processing steps needed for knowledge extraction from the major repository GEO. AVAILABILITY AND IMPLEMENTATION: GEO2Enrichr is an open source tool, freely available for installation as browser extensions at the Chrome Web Store and Firefox Add-ons. Documentation and a browser independent web application can be found at http://amp.pharm.mssm.edu/g2e/. CONTACT: avi.maayan@mssm.edu.

Subjects

Computational Biology/methods , Databases, Genetic , Gene Expression Profiling/methods , Microarray Analysis/methods , TRPV Cation Channels/physiology , 3T3 Cells , Animals , Electronic Data Processing , Gene Expression Regulation , Gene Library , Internet , Mice , User-Computer Interface
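The server-side steps named in the GEO2Enrichr abstract (normalize the data, identify differentially expressed genes, hand the gene lists on) can be sketched in a few lines. The abstract does not say which differential expression method is used, so this sketch substitutes a plain Welch t-statistic on log2-transformed values as a stand-in; the function names, the sample layout, and the toy expression values are hypothetical.

```python
from math import log2, sqrt
from statistics import mean, variance


def welch_t(control, treated):
    """Welch t-statistic; positive when the treated mean is higher."""
    se = sqrt(variance(control) / len(control) + variance(treated) / len(treated))
    return 0.0 if se == 0 else (mean(treated) - mean(control)) / se


def differential_genes(expression, control_idx, treated_idx, top_n=1):
    """Return (up, down) gene lists from a gene -> per-sample values mapping.

    Values are log2(x + 1) transformed, then each gene is scored by the
    t-statistic between the control and treated sample columns.
    """
    scores = {}
    for gene, values in expression.items():
        logged = [log2(v + 1.0) for v in values]
        control = [logged[i] for i in control_idx]
        treated = [logged[i] for i in treated_idx]
        scores[gene] = welch_t(control, treated)
    ranked = sorted(scores, key=scores.get)  # most negative score first
    down = [g for g in ranked[:top_n] if scores[g] < 0]
    up = [g for g in reversed(ranked[-top_n:]) if scores[g] > 0]
    return up, down


# Hypothetical toy data: two control samples followed by two treated samples.
toy_expression = {
    "UP1": [10, 12, 100, 120],
    "DOWN1": [200, 220, 20, 25],
    "FLAT": [50, 55, 52, 53],
}
up, down = differential_genes(toy_expression, [0, 1], [2, 3], top_n=1)
```

The resulting `up` and `down` lists are the kind of gene sets that would then be submitted to an enrichment tool.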

The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins.

Rouillard, Andrew D; Gundersen, Gregory W; Fernandez, Nicolas F; Wang, Zichen; Monteiro, Caroline D; McDermott, Michael G; Ma'ayan, Avi.

Database (Oxford) ; 2016, 2016.

Article in English | MEDLINE | ID: mdl-27374120

ABSTRACT

Genomics, epigenomics, transcriptomics, proteomics and metabolomics efforts rapidly generate a plethora of data on the activity and levels of biomolecules within mammalian cells. At the same time, curation projects that organize knowledge from the biomedical literature into online databases are expanding. Hence, there is a wealth of information about genes, proteins and their associations, with an urgent need for data integration to achieve better knowledge extraction and data reuse. For this purpose, we developed the Harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins from over 70 major online resources. We extracted, abstracted and organized data into ~72 million functional associations between genes/proteins and their attributes. Such attributes could be physical relationships with other biomolecules, expression in cell lines and tissues, genetic associations with knockout mouse or human phenotypes, or changes in expression after drug treatment. We stored these associations in a relational database along with rich metadata for the genes/proteins, their attributes and the original resources. The freely available Harmonizome web portal provides a graphical user interface, a web service and a mobile app for querying, browsing and downloading all of the collected data. To demonstrate the utility of the Harmonizome, we computed and visualized gene-gene and attribute-attribute similarity networks, and through unsupervised clustering, identified many unexpected relationships by combining pairs of datasets such as the association between kinase perturbations and disease signatures. We also applied supervised machine learning methods to predict novel substrates for kinases, endogenous ligands for G-protein coupled receptors, mouse phenotypes for knockout genes, and classified unannotated transmembrane proteins for likelihood of being ion channels. The Harmonizome is a comprehensive resource of knowledge about genes and proteins, and as such, it enables researchers to discover novel relationships between biological entities, as well as form novel data-driven hypotheses for experimental validation. Database URL: http://amp.pharm.mssm.edu/Harmonizome.

Subjects

Data Mining/methods , Databases, Nucleic Acid , Databases, Protein , Machine Learning , Animals , Humans , Mice
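The gene-gene similarity networks mentioned in the Harmonizome abstract can be pictured with a small example built from gene-attribute associations. The abstract does not name the similarity measure, so this sketch assumes Jaccard similarity over attribute sets; the function names, the threshold, and the toy associations are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def gene_similarity_network(associations, threshold=0.5):
    """Build network edges from (gene, attribute) pairs.

    Returns (gene1, gene2, score) for every gene pair whose attribute sets
    have Jaccard similarity at or above the threshold.
    """
    attributes = {}
    for gene, attribute in associations:
        attributes.setdefault(gene, set()).add(attribute)
    edges = []
    for g1, g2 in combinations(sorted(attributes), 2):
        score = jaccard(attributes[g1], attributes[g2])
        if score >= threshold:
            edges.append((g1, g2, score))
    return edges


# Hypothetical associations, for illustration only.
toy_associations = [
    ("GENE_A", "liver"), ("GENE_A", "kidney"),
    ("GENE_B", "liver"), ("GENE_B", "kidney"),
    ("GENE_C", "brain"),
]
edges = gene_similarity_network(toy_associations, threshold=0.5)
```

Comparing all pairs is quadratic in the number of genes, so a resource with millions of associations would need a sparse or indexed approach rather than this direct loop.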

Extraction and analysis of signatures from the Gene Expression Omnibus by the crowd.

Wang, Zichen; Monteiro, Caroline D; Jagodnik, Kathleen M; Fernandez, Nicolas F; Gundersen, Gregory W; Rouillard, Andrew D; Jenkins, Sherry L; Feldmann, Axel S; Hu, Kevin S; McDermott, Michael G; Duan, Qiaonan; Clark, Neil R; Jones, Matthew R; Kou, Yan; Goff, Troy; Woodland, Holly; Amaral, Fabio M R; Szeto, Gregory L; Fuchs, Oliver; Schüssler-Fiorenza Rose, Sophia M; Sharma, Shvetank; Schwartz, Uwe; Bausela, Xabier Bengoetxea; Szymkiewicz, Maciej; Maroulis, Vasileios; Salykin, Anton; Barra, Carolina M; Kruth, Candice D; Bongio, Nicholas J; Mathur, Vaibhav; Todoric, Radmila D; Rubin, Udi E; Malatras, Apostolos; Fulp, Carl T; Galindo, John A; Motiejunaite, Ruta; Jüschke, Christoph; Dishuck, Philip C; Lahl, Katharina; Jafari, Mohieddin; Aibar, Sara; Zaravinos, Apostolos; Steenhuizen, Linda H; Allison, Lindsey R; Gamallo, Pablo; de Andres Segura, Fernando; Dae Devlin, Tyler; Pérez-García, Vicente; Ma'ayan, Avi.

Nat Commun ; 7: 12846, 2016 Sep 26.

Article in English | MEDLINE | ID: mdl-27667448

ABSTRACT

Gene expression data are accumulating exponentially in public repositories. Reanalysis and integration of themed collections from these studies may provide new insights, but requires further human curation. Here we report a crowdsourcing project to annotate and reanalyse a large number of gene expression profiles from Gene Expression Omnibus (GEO). Through a massive open online course on Coursera, over 70 participants from over 25 countries identify and annotate 2,460 single-gene perturbation signatures, 839 disease versus normal signatures, and 906 drug perturbation signatures. All these signatures are unique and are manually validated for quality. Global analysis of these signatures confirms known associations and identifies novel associations between genes, diseases and drugs. The manually curated signatures are used as a training set to develop classifiers for extracting similar signatures from the entire GEO repository. We develop a web portal to serve these signatures for query, download and visualization.

Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool.

Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi.

Proceedings (IEEE Int Conf Bioinformatics Biomed) ; 2015: 256-262, 2015 Nov.

Article in English | MEDLINE | ID: mdl-26848405

ABSTRACT

Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed, nor has it been implemented as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.
