ABSTRACT
Research in model organisms is central to the characterization of signaling pathways in multicellular organisms. Here, we present the comprehensive and systematic curation of 17 Drosophila signaling pathways using the Gene Ontology framework to establish a dynamic resource that has been incorporated into FlyBase, providing visualization and data integration tools to aid research projects. By restricting to experimental evidence reported in the research literature and quantifying the amount of such evidence for each gene in a pathway, we captured the landscape of empirical knowledge of signaling pathways in Drosophila.
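The evidence-counting idea in this abstract can be illustrated with a minimal sketch. It assumes each annotation is a (gene, GO evidence code, reference) triple and that "amount of evidence" means the number of distinct references carrying an experimental evidence code; both are assumptions for illustration, not the authors' stated procedure. The gene and reference names are hypothetical.

```python
from collections import defaultdict

# Standard GO experimental evidence codes (low-throughput).
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def evidence_per_gene(annotations):
    """Count distinct references with experimental evidence for each gene.

    `annotations` is an iterable of (gene, evidence_code, reference) triples.
    Annotations with non-experimental codes (e.g. IEA, ISS) are ignored.
    """
    refs = defaultdict(set)
    for gene, code, reference in annotations:
        if code in EXPERIMENTAL_CODES:
            refs[gene].add(reference)
    return {gene: len(r) for gene, r in refs.items()}


# Hypothetical annotations for one pathway (illustrative names only).
annotations = [
    ("geneA", "IMP", "ref1"),
    ("geneA", "IDA", "ref2"),
    ("geneA", "IEA", "ref3"),  # electronic annotation: excluded
    ("geneB", "IMP", "ref1"),
    ("geneB", "IMP", "ref1"),  # same reference twice: counted once
    ("geneC", "ISS", "ref4"),  # sequence similarity: excluded
]

counts = evidence_per_gene(annotations)
print(counts)
```

Counting distinct references rather than raw annotation rows keeps a single paper from inflating the score for a gene that it annotates several times.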
Subject(s)
Databases, Genetic; Drosophila; Animals; Drosophila/genetics; Gene Ontology; Signal Transduction; Drosophila melanogaster/genetics
ABSTRACT
Gene set enrichment analysis (GSEA) plays an important role in large-scale data analysis, helping scientists discover the underlying biological patterns over-represented in a gene list resulting from, for example, an 'omics' study. Gene Ontology (GO) annotation is the most frequently used classification mechanism for gene set definition. Here we present a new GSEA tool, PANGEA (PAthway, Network and Gene-set Enrichment Analysis; https://www.flyrnai.org/tools/pangea/), developed to allow a more flexible and configurable approach to data analysis using a variety of classification sets. PANGEA allows GO analysis to be performed on different sets of GO annotations, for example excluding high-throughput studies. Beyond GO, gene sets for pathway annotation and protein complex data from various resources, as well as expression and disease annotation from the Alliance of Genome Resources (Alliance), are also available. In addition, visualizations of results are enhanced by an option to view a network of gene-set-to-gene relationships. The tool also allows comparison of multiple input gene lists and provides accompanying visualization tools for quick and easy comparison. This new tool will facilitate GSEA for Drosophila and other major model organisms based on high-quality annotated information available for these species.
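To make the notion of a gene set being "over-represented in a gene list" concrete, the following is a minimal sketch of a one-sided hypergeometric over-representation test. It is not PANGEA's implementation; the function names, gene identifiers, and gene sets are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K belong to the set."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enrich(gene_list, gene_sets, background):
    """Return (set name, overlap size, p-value) tuples sorted by p-value."""
    background = set(background)
    hits = set(gene_list) & background
    results = []
    for name, members in gene_sets.items():
        members = set(members) & background
        overlap = hits & members
        p = hypergeom_sf(len(overlap), len(background), len(members), len(hits))
        results.append((name, len(overlap), p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical background of 20 genes and two gene sets of 5 genes each.
background = [f"g{i}" for i in range(20)]
gene_sets = {
    "set_A": ["g0", "g1", "g2", "g3", "g4"],
    "set_B": ["g10", "g11", "g12", "g13", "g14"],
}
gene_list = ["g0", "g1", "g2", "g3", "g15"]

results = enrich(gene_list, gene_sets, background)
for name, overlap, p in results:
    print(f"{name}: overlap={overlap}, p={p:.4g}")
```

Swapping the `gene_sets` dictionary for a different classification (for example, annotations with high-throughput studies excluded) changes the outcome without touching the test itself, which is the kind of configurability the abstract describes.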
Subject(s)
Drosophila; Software; Animals; Drosophila/genetics; Genome; Molecular Sequence Annotation; Databases, Genetic
ABSTRACT
FlyBase (flybase.org) is an essential online database for researchers using Drosophila melanogaster as a model organism, facilitating access to a diverse array of information that includes genetic, molecular, genomic and reagent resources. Here, we describe the introduction of several new features at FlyBase, including Pathway Reports, paralog information, disease models based on orthology, customizable tables within reports and overview displays ('ribbons') of expression and disease data. We also describe a variety of recent important updates, including incorporation of a developmental proteome, upgrades to the GAL4 search tab, additional Experimental Tool Reports, migration to JBrowse for genome browsing and improvements to batch queries/downloads and the Fast-Track Your Paper tool.
Subject(s)
Computational Biology/methods; Databases, Genetic; Drosophila melanogaster/genetics; Genome, Insect/genetics; Genomics/methods; Animals; Genes, Insect/genetics; Knowledge Bases; Molecular Sequence Annotation/methods; Search Engine/methods; Web Browser
ABSTRACT
FlyBase (flybase.org) is a knowledge base that supports the community of researchers that use the fruit fly, Drosophila melanogaster, as a model organism. The FlyBase team curates and organizes a diverse array of genetic, molecular, genomic, and developmental information about Drosophila. At the beginning of 2018, 'FlyBase 2.0' was released with a significantly improved user interface and new tools. Among these important changes are a new organization of search results into interactive lists or tables (hitlists), enhanced reference lists, and new protein domain graphics. An important new data class called 'experimental tools' consolidates information on useful fly strains and other resources related to a specific gene, which significantly enhances the ability of the Drosophila researcher to design and carry out experiments. With the release of FlyBase 2.0, there has also been a restructuring of backend architecture and a continued development of application programming interfaces (APIs) for programmatic access to FlyBase data. In this review, we describe these major new features and functionalities of the FlyBase 2.0 site and how they support the use of Drosophila as a model organism for biological discovery and translational research.
Subject(s)
Databases, Genetic; Drosophila melanogaster/genetics; Genome, Insect/genetics; Genomics; Animals; Protein Domains/genetics; Software
ABSTRACT
Since 1992, FlyBase (flybase.org) has been an essential online resource for the Drosophila research community. Concentrating on the most extensively studied species, Drosophila melanogaster, FlyBase includes information on genes (molecular and genetic), transgenic constructs, phenotypes, genetic and physical interactions, and reagents such as stocks and cDNAs. Access to data is provided through a number of tools, reports, and bulk-data downloads. Looking to the future, FlyBase is expanding its focus to serve a broader scientific community. In this update, we describe new features, datasets, reagent collections, and data presentations that address this goal, including enhanced orthology data, Human Disease Model Reports, protein domain search and visualization, concise gene summaries, a portal for external resources, video tutorials and the FlyBase Community Advisory Group.
Subject(s)
Computational Biology/methods; Databases, Genetic; Drosophila/genetics; Genomics/methods; Animals; Disease Models, Animal; Genetic Association Studies; Humans; Web Browser
ABSTRACT
Release 6, the latest reference genome assembly of the fruit fly Drosophila melanogaster, was released by the Berkeley Drosophila Genome Project in 2014; it replaces their previous Release 5 genome assembly, which had been the reference genome assembly for over 7 years. With the enormous amount of information now attached to the D. melanogaster genome in public repositories and individual laboratories, the replacement of the previous assembly by the new one is a major event requiring careful migration of annotations and genome-anchored data to the new, improved assembly. In this report, we describe the attributes of the new Release 6 reference genome assembly, the migration of FlyBase genome annotations to this new assembly, how genome features on this new assembly can be viewed in FlyBase (http://flybase.org) and how users can convert coordinates for their own data to the corresponding Release 6 coordinates.
Subject(s)
Databases, Genetic; Drosophila melanogaster/genetics; Genome, Insect; Molecular Sequence Annotation; Animals; Genomics/standards; High-Throughput Nucleotide Sequencing; Internet; Models, Genetic; Molecular Sequence Data; Reference Standards; Sequence Alignment; Software
ABSTRACT
An accurate, comprehensive, non-redundant and up-to-date bibliography is a crucial component of any Model Organism Database (MOD). Principally, the bibliography provides a set of references that are specific to the field served by the MOD. Moreover, it serves as a backbone to which all curated biological data can be attributed. Here, we describe the organization and main features of the bibliography in FlyBase (flybase.org), the MOD for Drosophila melanogaster. We present an overview of the current content of the bibliography, the pipeline for identifying and adding new references, the presentation of data within Reference Reports and effective methods for searching and retrieving bibliographic data. We highlight recent improvements in these areas and describe the advantages of using the FlyBase bibliography over alternative literature resources. Although this article is focused on bibliographic data, many of the features and tools described are applicable to browsing and querying other datasets in FlyBase.
Subject(s)
Bibliographies as Topic; Databases, Genetic; Drosophila melanogaster/genetics; Animals; Drosophila/genetics; Internet
ABSTRACT
Hemorrhagic fever viruses (HFVs) are a diverse set of over 80 viral species, found in 10 different genera comprising five different families: arena-, bunya-, flavi-, filo- and togaviridae. All these viruses are highly variable and evolve rapidly, making them elusive targets for the immune system and for vaccine and drug design. About 55,000 HFV sequences exist in the public domain today. A central website that provides annotated sequences and analysis tools will be helpful to HFV researchers worldwide. The HFV sequence database collects and stores sequence data and provides a user-friendly search interface and a large number of sequence analysis tools, following the model of the highly regarded and widely used Los Alamos HIV database [Kuiken, C., B. Korber, and R.W. Shafer, HIV sequence databases. AIDS Rev, 2003. 5: p. 52-61]. The database uses an algorithm that aligns each sequence to a species-wide reference sequence. The NCBI RefSeq database [Sayers et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 39, D38-D51.] is used for this; if a reference sequence is not available, a Blast search finds the best candidate. Using this method, sequences in each genus can be retrieved pre-aligned. The HFV website can be accessed via http://hfv.lanl.gov.
Subject(s)
Bioterrorism; Databases, Genetic; Viruses/genetics; Genes, Viral; Genome, Viral; Molecular Sequence Annotation/standards; Quality Control; Sequence Alignment; Sequence Analysis
ABSTRACT
FlyBase (http://flybase.org) is the leading database and web portal for genetic and genomic information on the fruit fly Drosophila melanogaster and related fly species. Whether you use the fruit fly as an experimental system or want to apply Drosophila biological knowledge to another field of study, FlyBase can help you successfully navigate the wealth of available Drosophila data. Here, we review the FlyBase web site with novice and less-experienced users of FlyBase in mind and point out recent developments stemming from the availability of genome-wide data from the modENCODE project. The first section of this paper explains the organization of the web site and describes the report pages available on FlyBase, focusing on the most popular, the Gene Report. The next section introduces some of the search tools available on FlyBase, in particular, our heavily used and recently redesigned search tool QuickSearch, found on the FlyBase homepage. The final section concerns genomic data, including recent modENCODE (http://www.modencode.org) data, available through our Genome Browser, GBrowse.
Subject(s)
Databases, Genetic; Drosophila melanogaster/genetics; Genome, Insect; Animals; Genes, Insect; Genomics; Internet; Software
ABSTRACT
Research in model organisms is central to the characterization of signaling pathways in multicellular organisms. Here, we present the systematic curation of 17 Drosophila signaling pathways using the Gene Ontology framework to establish a comprehensive and dynamic resource that has been incorporated into FlyBase, providing visualization and data integration tools to aid research projects. By restricting to experimental evidence reported in the research literature and quantifying the amount of such evidence for each gene in a pathway, we captured the landscape of empirical knowledge of signaling pathways in Drosophila. Summary statement: Comprehensive curation of Drosophila signaling pathways and new visual displays of the pathways provide a new FlyBase resource for researchers, and new insights into signaling pathway architecture.
ABSTRACT
The Gene Ontology (GO) knowledgebase (http://geneontology.org) is a comprehensive resource concerning the functions of genes and gene products (proteins and noncoding RNAs). GO annotations cover genes from organisms across the tree of life as well as viruses, though most gene function knowledge currently derives from experiments carried out in a relatively small number of model organisms. Here, we provide an updated overview of the GO knowledgebase, as well as the efforts of the broad, international consortium of scientists that develops, maintains, and updates the GO knowledgebase. The GO knowledgebase consists of three components: (1) the GO, a computational knowledge structure describing the functional characteristics of genes; (2) GO annotations, evidence-supported statements asserting that a specific gene product has a particular functional characteristic; and (3) GO Causal Activity Models (GO-CAMs), mechanistic models of molecular "pathways" (GO biological processes) created by linking multiple GO annotations using defined relations. Each of these components is continually expanded, revised, and updated in response to newly published discoveries and receives extensive QA checks, reviews, and user feedback. For each of these components, we provide a description of the current contents, recent developments to keep the knowledgebase up to date with new discoveries, and guidance on how users can best make use of the data that we provide. We conclude with future directions for the project.
Subject(s)
Databases, Genetic; Proteins; Gene Ontology; Proteins/genetics; Molecular Sequence Annotation; Computational Biology
ABSTRACT
Since 1992, FlyBase has provided a freely available online database of information about the model organism Drosophila melanogaster. Data in FlyBase is curated manually from research papers as well as computationally from a variety of relevant sources, to serve as an information hub that enables and accelerates research discovery. This chapter aims to give users new to the database an overview of the layout and types of data available, as well as introducing some tools with which to access the data. More experienced users will find useful information about recent improvements and descriptions to enable more efficient navigation of the database.
Subject(s)
Drosophila melanogaster; Drosophila; Animals; Databases, Genetic; Drosophila/genetics; Drosophila melanogaster/genetics; Genes, Insect; Genome, Insect
ABSTRACT
FlyBase (flybase.org) is the primary online database of genetic, genomic, and functional information about Drosophila species, with a major focus on the model organism Drosophila melanogaster. The long and rich history of Drosophila research, combined with recent surges in genomic-scale and high-throughput technologies, means that FlyBase now houses a huge quantity of data. Researchers need to be able to rapidly and intuitively query these data, and the QuickSearch tool has been designed to meet these needs. This tool is conveniently located on the FlyBase homepage and is organized into a series of simple tabbed interfaces that cover the major data and annotation classes within the database. This unit describes the functionality of all aspects of the QuickSearch tool. With this knowledge, FlyBase users will be equipped to take full advantage of all QuickSearch features and thereby gain improved access to data relevant to their research. © 2016 by John Wiley & Sons, Inc.