ABSTRACT
WormBase (https://wormbase.org/) is a mature Model Organism Information Resource supporting researchers using the nematode Caenorhabditis elegans as a model system for studies across a broad range of basic biological processes. Toward this mission, WormBase efforts are arranged in three primary facets: curation, user interface and architecture. In this update, we describe progress in each of these three areas. In particular, we discuss the status of literature curation and recently added data, detail new features of the web interface and options for users wishing to conduct data mining workflows, and discuss our efforts to build a robust and scalable architecture by leveraging commercial cloud offerings. We conclude with a description of WormBase's role as a founding member of the nascent Alliance of Genome Resources.
Subjects
Caenorhabditis elegans/genetics; Databases, Genetic; Genes, Helminth; Animals; Data Mining; Genomics; Internet; User-Computer Interface
ABSTRACT
WormBase (http://www.wormbase.org) is an important knowledge resource for biomedical researchers worldwide. To accommodate the ever-increasing amount and complexity of research data, WormBase continues to advance its practices on data acquisition, curation and retrieval to most effectively deliver comprehensive knowledge about Caenorhabditis elegans, and genomic information about other nematodes and parasitic flatworms. Recent notable enhancements include user-directed submission of data, such as micropublication; genomic data curation and presentation, including additional genomes and JBrowse, respectively; new query tools, such as SimpleMine and Gene Enrichment Analysis; and new data displays, such as the Person Lineage browser and the Summary of Ontology-based Annotations. Anticipating more rapid data growth ahead, WormBase continues the process of migrating to a cutting-edge database technology to achieve better stability, scalability, reproducibility and a faster response time. To better serve the broader research community, WormBase, with five other Model Organism Databases and The Gene Ontology project, has begun to collaborate formally as the Alliance of Genome Resources.
Subjects
Databases, Genetic; Genome; Nematoda/genetics; Animals; Caenorhabditis/genetics; Caenorhabditis elegans/genetics; Data Curation; Data Mining; Datasets as Topic; Disease Models, Animal; Forecasting; Gene Ontology; Humans; Information Storage and Retrieval; Platyhelminths/genetics; Publishing; RNA Interference; Sequence Alignment; User-Computer Interface; Web Browser
ABSTRACT
WormBase (www.wormbase.org) is a central repository for research data on the biology, genetics and genomics of Caenorhabditis elegans and other nematodes. The project has evolved from its original remit to collect and integrate all data for a single species, and now extends to numerous nematodes, ranging from evolutionary comparators of C. elegans to parasitic species that threaten plant, animal and human health. Research activity using C. elegans as a model system is as vibrant as ever, and we have created new tools for community curation in response to the ever-increasing volume and complexity of data. To better allow users to navigate their way through these data, we have made a number of improvements to our main website, including new tools for browsing genomic features and ontology annotations. Finally, we have developed a new portal for parasitic worm genomes. WormBase ParaSite (parasite.wormbase.org) contains all publicly available nematode and platyhelminth annotated genome sequences, and is designed specifically to support helminth genomic research.
Subjects
Caenorhabditis elegans/genetics; Databases, Genetic; Genome, Helminth; Genomics; Nematoda/genetics; Animals; Genes, Helminth; Molecular Sequence Annotation; Platyhelminths/genetics; Software
ABSTRACT
WormBase (http://www.wormbase.org/) is a highly curated resource dedicated to supporting research using the model organism Caenorhabditis elegans. With an electronic history predating the World Wide Web, WormBase contains information ranging from the sequence and phenotype of individual alleles to genome-wide studies generated using next-generation sequencing technologies. In recent years, we have expanded the contents to include data on additional nematodes of agricultural and medical significance, bringing the knowledge of C. elegans to bear on these systems and providing support for underserved research communities. Manual curation of the primary literature remains a central focus of the WormBase project, providing users with reliable, up-to-date and highly cross-linked information. In this update, we describe efforts to organize the original atomized and highly contextualized curated data into integrated syntheses of discrete biological topics. Next, we discuss our experiences coping with the vast increase in available genome sequences made possible through next-generation sequencing platforms. Finally, we describe some of the features and tools of the new WormBase Web site that help users better find and explore data of interest.
Subjects
Caenorhabditis elegans/genetics; Databases, Genetic; Genome, Helminth; Animals; Internet; Molecular Sequence Annotation; Nematoda/genetics
ABSTRACT
Since its release in 2000, WormBase (http://www.wormbase.org) has grown from a small resource focusing on a single species and serving a dedicated research community, to one now spanning 15 species essential to the broader biomedical and agricultural research fields. To enhance the rate of curation, we have automated the identification of key data in the scientific literature and use similar methodology for data extraction. To ease access to the data, we are collaborating with journals to link entities in research publications to their report pages at WormBase. To facilitate discovery, we have added new views of the data, integrated large-scale datasets and expanded descriptions of models for human disease. Finally, we have introduced a dramatic overhaul of the WormBase website for public beta testing. Designed to balance complexity and usability, the new site is species-agnostic, highly customizable, and interactive. Casual users and developers alike will be able to leverage the public RESTful application programming interface (API) to generate custom data mining solutions and extensions to the site. We report on the growth of our database and on our work in keeping pace with the growing demand for data, efforts to anticipate the requirements of users and new collaborations with the larger science community.
Subjects
Caenorhabditis elegans/genetics; Databases, Genetic; Genome, Helminth; Nematoda/genetics; Animals; Caenorhabditis/genetics; Caenorhabditis elegans/anatomy & histology; Computer Graphics; Gene Expression Profiling; Genomics; Internet; Molecular Sequence Annotation; Phenotype
ABSTRACT
WormBase and the Alliance of Genome Resources provide several types of gene data including annotations to ontology terms and controlled vocabularies. These are used to automatically generate text summaries to give users a cogent view of gene function. However, automated summaries are not available for genes that lack curated annotations. To increase the genome coverage of the summaries in WormBase, we developed a new software module that generates additional gene summaries for C. elegans and new gene summaries for nine other nematode species: four Caenorhabditis species (C. brenneri, C. briggsae, C. japonica, C. remanei), P. pacificus, and four parasitic species (B. malayi, O. volvulus, S. ratti and T. muris).
ABSTRACT
WormBase has been the major repository and knowledgebase of information about the genome and genetics of Caenorhabditis elegans and other nematodes of experimental interest for over 2 decades. We have 3 goals: to keep current with the fast-paced C. elegans research, to provide better integration with other resources, and to be sustainable. Here, we discuss the current state of WormBase as well as progress and plans for moving core WormBase infrastructure to the Alliance of Genome Resources (the Alliance). As an Alliance member, WormBase will continue to interact with the C. elegans community, develop new features as needed, and curate key information from the literature and large-scale projects.
Subjects
Caenorhabditis elegans; Caenorhabditis elegans/genetics; Animals; Databases, Genetic; Genome, Helminth; Genomics/methods
ABSTRACT
WormBase (http://www.wormbase.org) is a central data repository for nematode biology. Initially created as a service to the Caenorhabditis elegans research field, WormBase has evolved into a powerful research tool in its own right. In the past 2 years, we expanded WormBase to include the complete genomic sequence, gene predictions and orthology assignments from a range of related nematodes. These comparative data enrich the C. elegans data with improved gene predictions and a better understanding of gene function. In turn, they bring the wealth of experimental knowledge of C. elegans to other systems of medical and agricultural importance. Here, we describe new species and data types now available at WormBase. In addition, we detail enhancements to our curatorial pipeline and website infrastructure to accommodate new genomes and an extensive user base.
Subjects
Caenorhabditis elegans/genetics; Caenorhabditis/genetics; Computational Biology/methods; Databases, Genetic; Databases, Nucleic Acid; Alleles; Animals; Computational Biology/trends; Databases, Protein; Information Storage and Retrieval/methods; Internet; Phenotype; Protein Structure, Tertiary; Software; Transcription Factors
ABSTRACT
WormBase (www.wormbase.org) is the central repository for the genetics and genomics of the nematode Caenorhabditis elegans. We provide the research community with data and tools to facilitate the use of C. elegans and related nematodes as model organisms for studying human health, development, and many aspects of fundamental biology. Throughout our 22-year history, we have continued to evolve to reflect progress and innovation in the science and technologies involved in the study of C. elegans. We strive to incorporate new data types and richer data sets, and to provide integrated displays and services that avail the knowledge generated by the published nematode genetics literature. Here, we provide a broad overview of the current state of WormBase in terms of data type, curation workflows, analysis, and tools, including exciting new advances for analysis of single-cell data, text mining and visualization, and the new community collaboration forum. Concurrently, we continue the integration and harmonization of infrastructure, processes, and tools with the Alliance of Genome Resources, of which WormBase is a founding member.
Subjects
Caenorhabditis; Nematoda; Animals; Caenorhabditis/genetics; Caenorhabditis elegans/genetics; Databases, Genetic; Genome; Genomics; Humans; Nematoda/genetics
ABSTRACT
WormBase (www.wormbase.org) is the major publicly available database of information about Caenorhabditis elegans, an important system for basic biological and biomedical research. Derived from the initial ACeDB database of C. elegans genetic and sequence information, WormBase now includes the genomic, anatomical and functional information about C. elegans, other Caenorhabditis species and other nematodes. As such, it is a crucial resource not only for C. elegans biologists but the larger biomedical and bioinformatics communities. Coverage of core areas of C. elegans biology will allow the biomedical community to make full use of the results of intensive molecular genetic analysis and functional genomic studies of this organism. Improved search and display tools, wider cross-species comparisons and extended ontologies are some of the features that will help scientists extend their research and take advantage of other nematode species genome sequences.
Subjects
Caenorhabditis elegans/genetics; Databases, Genetic; Genome, Helminth; Animals; Caenorhabditis elegans/metabolism; Chromosome Mapping; Gene Expression; Gene Regulatory Networks; Genes, Helminth; Genomics; Internet; Mass Spectrometry; Peptides/chemistry; Phenotype; User-Computer Interface
ABSTRACT
Short paragraphs that describe gene function, referred to as gene summaries, are valued by users of biological knowledgebases for the ease with which they convey key aspects of gene function. Manual curation of gene summaries, while desirable, is difficult for knowledgebases to sustain. We developed an algorithm that uses curated, structured gene data at the Alliance of Genome Resources (Alliance; www.alliancegenome.org) to automatically generate gene summaries that simulate natural language. The gene data used for this purpose include curated associations (annotations) to ontology terms from the Gene Ontology, Disease Ontology, model organism knowledgebase (MOK)-specific anatomy ontologies and Alliance orthology data. The method uses sentence templates for each data category included in the gene summary in order to build a natural language sentence from the list of terms associated with each gene. To improve readability of the summaries when numerous gene annotations are present, we developed a new algorithm that traverses ontology graphs in order to group terms by their common ancestors. The algorithm optimizes the coverage of the initial set of terms and limits the length of the final summary, using measures of information content of each ontology term as a criterion for inclusion in the summary. The automated gene summaries are generated with each Alliance release, ensuring that they reflect current data at the Alliance. Our method effectively leverages category-specific curation efforts of the Alliance member databases to create modular, structured and standardized gene summaries for seven member species of the Alliance. These automatically generated gene summaries make cross-species gene function comparisons tenable and increase discoverability of potential models of human disease. In addition to being displayed on Alliance gene pages, these summaries are also included on several MOK gene pages.
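The template-and-grouping idea described in this abstract can be illustrated with a minimal sketch. It is not the Alliance implementation: the toy ontology, the template wording, and the function names below are all hypothetical, and the information-content criterion mentioned in the abstract is replaced by a much simpler rule (merge sibling terms into their shared parent once the list exceeds a length limit).

```python
from collections import defaultdict

# Hypothetical toy ontology (child -> parent). Real ontologies are graphs
# in which a term may have several parents; a tree keeps the sketch short.
PARENT = {
    "vulval cell fate specification": "cell fate specification",
    "neuron fate specification": "cell fate specification",
    "cell fate specification": "developmental process",
    "cell migration": "cell motility",
}

# Hypothetical sentence templates, one per data category.
TEMPLATES = {
    "process": "Involved in {terms}.",
    "anatomy": "Expressed in {terms}.",
}


def join_terms(terms):
    """Join terms into an English list: 'a', 'a and b', 'a, b, and c'."""
    if len(terms) <= 2:
        return " and ".join(terms)
    return ", ".join(terms[:-1]) + ", and " + terms[-1]


def group_by_parent(terms, max_terms):
    """If there are more than max_terms terms, replace each set of terms
    that share a parent with that parent; lone terms are kept as-is."""
    terms = sorted(set(terms))
    if len(terms) <= max_terms:
        return terms
    by_parent = defaultdict(list)
    for term in terms:
        by_parent[PARENT.get(term, term)].append(term)
    merged = [
        parent if len(children) > 1 else children[0]
        for parent, children in by_parent.items()
    ]
    return sorted(merged)


def summarize(category, terms, max_terms=2):
    """Fill the category's sentence template with the grouped terms."""
    grouped = group_by_parent(terms, max_terms)
    return TEMPLATES[category].format(terms=join_terms(grouped))


summary = summarize(
    "process",
    [
        "vulval cell fate specification",
        "neuron fate specification",
        "cell migration",
    ],
)
print(summary)
```

Here the two fate-specification terms collapse into their common parent, which shortens the sentence at the cost of some specificity; the published method balances that trade-off using information content rather than a fixed length limit.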
Subjects
Databases, Genetic; Genomics; Molecular Sequence Annotation/methods; Gene Ontology; Information Storage and Retrieval
ABSTRACT
WormBase (http://wormbase.org), a model organism database for Caenorhabditis elegans and other related nematodes, continues to evolve and expand. Over the past year WormBase has added new data on C. elegans, including data on classical genetics, cell biology and functional genomics; expanded the annotation of closely related nematodes with a new genome browser for Caenorhabditis remanei; and deployed new hardware for stronger performance. Several existing datasets including phenotype descriptions and RNAi experiments have seen a large increase in new content. New datasets such as the C. remanei draft assembly and annotations, the Vancouver Fosmid library and TEC-RED 5' end sites are now available as well. Access to and searching WormBase has become more dependable and flexible via multiple mirror sites and indexing through Google.
Subjects
Caenorhabditis elegans/genetics; Caenorhabditis/genetics; Databases, Genetic; Animals; Genes, Helminth; Genome, Helminth; Genomics; Internet; Oligonucleotide Array Sequence Analysis; Phenotype; RNA Interference; User-Computer Interface
ABSTRACT
WormBase (http://wormbase.org), the public database for genomics and biology of Caenorhabditis elegans, has been restructured for stronger performance and expanded for richer biological content. Performance was improved by accelerating the loading of central data pages such as the omnibus Gene page, by rationalizing internal data structures and software for greater portability, and by making the Genome Browser highly customizable in how it views and exports genomic subsequences. Arbitrarily complex, user-specified queries are now possible through Textpresso (for all available literature) and through WormMart (for most genomic data). Biological content was enriched by reconciling all available cDNA and expressed sequence tag data with gene predictions, clarifying single nucleotide polymorphism and RNAi sites, and summarizing known functions for most genes studied in this organism.
Subjects
Caenorhabditis elegans Proteins/chemistry; Caenorhabditis elegans Proteins/genetics; Caenorhabditis elegans/genetics; Databases, Genetic; Software; Animals; Caenorhabditis elegans/physiology; DNA, Complementary/chemistry; Expressed Sequence Tags/chemistry; Genome, Helminth; Genomics; Internet; Polymorphism, Single Nucleotide; RNA Interference; User-Computer Interface
ABSTRACT
WormBase (www.wormbase.org) provides the nematode research community with a centralized database for information pertaining to nematode genes and genomes. As more nematode genome sequences are becoming available and as richer data sets are published, WormBase strives to maintain updated information, displays, and services to facilitate efficient access to and understanding of the knowledge generated by the published nematode genetics literature. This chapter aims to provide an explanation of how to use basic features of WormBase, new features, and some commonly used tools and data queries. Explanations of the curated data and step-by-step instructions of how to access the data via the WormBase website and available data mining tools are provided.
Subjects
Caenorhabditis elegans/genetics; Databases, Genetic; Genome, Helminth; Genomics; Animals; Computational Biology/methods; Data Mining/methods; Epistasis, Genetic; Gene Ontology; Genes, Helminth; Genomics/methods; Humans; Phenotype; Proteome; Search Engine; Software; Transcriptome; User-Computer Interface; Web Browser
ABSTRACT
Model organism databases (MODs) have been collecting and integrating biomedical research data for 30 years and were designed to meet specific needs of each model organism research community. The contributions of model organism research to understanding biological systems would be hard to overstate. Modern molecular biology methods and cost reductions in nucleotide sequencing have opened avenues for direct application of model organism research to elucidating mechanisms of human diseases. Thus, the mandate for model organism research and databases has now grown to include facilitating use of these data in translational applications. Challenges in meeting this opportunity include the distribution of research data across many databases and websites, a lack of data format standards for some data types, and sustainability of scale and cost for genomic database resources like MODs. The issues of widely distributed data and application of data standards are some of the challenges addressed by FAIR (Findable, Accessible, Interoperable, and Re-usable) data principles. The Alliance of Genome Resources is now moving to address these challenges by bringing together expertly curated research data from fly, mouse, rat, worm, yeast, zebrafish, and the Gene Ontology consortium. Centralized multi-species data access, integration, and format standardization will lower the data utilization barrier in comparative genomics and translational applications and will provide a framework in which sustainable scale and cost can be addressed. This article presents a brief historical perspective on how the Alliance model organisms are complementary and how they have already contributed to understanding the etiology of human diseases. In addition, we discuss four challenges for using data from MODs in translational applications and how the Alliance is working to address them, in part by applying FAIR data principles. 
Ultimately, combined data from these animal models are more powerful than the sum of the parts.
Subjects
Animals, Laboratory; Databases as Topic; Translational Research, Biomedical/methods; Animals; Models, Animal
ABSTRACT
WormBase (http://www.wormbase.org), the model organism database for information about Caenorhabditis elegans and related nematodes, continues to expand in breadth and depth. Over the past year, WormBase has added multiple large-scale datasets including SAGE, interactome, 3D protein structure datasets and NCBI KOGs. To accommodate this growth, the International WormBase Consortium has improved the user interface by adding new features to aid in navigation, visualization of large-scale datasets, advanced searching and data mining. Internally, we have restructured the database models to rationalize the representation of genes and to prepare the system to accept the genome sequences of three additional Caenorhabditis species over the coming year.
Subjects
Caenorhabditis elegans Proteins/chemistry; Caenorhabditis elegans Proteins/genetics; Caenorhabditis elegans/genetics; Caenorhabditis/genetics; Databases, Genetic; Genomics; Animals; Caenorhabditis/metabolism; Caenorhabditis elegans/metabolism; Caenorhabditis elegans Proteins/metabolism; Databases, Genetic/trends; Gene Expression Profiling; Protein Conformation; Software; Systems Integration; Two-Hybrid System Techniques; User-Computer Interface
ABSTRACT
WormBase (http://www.wormbase.org/) is a web-accessible central data repository for information about Caenorhabditis elegans and related nematodes. The past two years have seen a significant expansion in the biological scope of WormBase, including the integration of large-scale, genome-wide data sets, the inclusion of genome sequence and gene predictions from related species and active literature curation. This expansion of data has also driven the development and refinement of user interfaces and operability, including a new Genome Browser, new searches and facilities for data access and the inclusion of extensive documentation. These advances have expanded WormBase beyond the obvious target audience of C. elegans researchers, to include researchers wishing to explore problems in functional and comparative genomics within the context of a powerful genetic system.
Subjects
Caenorhabditis elegans/genetics; Caenorhabditis/genetics; Databases, Nucleic Acid; Genomics; Animals; Caenorhabditis elegans/embryology; Caenorhabditis elegans/growth & development; DNA, Helminth/analysis; Data Collection; Expressed Sequence Tags; Gene Expression; Information Storage and Retrieval; Neurons/classification; Polymorphism, Single Nucleotide; Quality Control; RNA Interference; RNA, Helminth/antagonists & inhibitors; Sequence Homology, Nucleic Acid
ABSTRACT
WormBase (http://www.wormbase.org/) is the central data repository for information about Caenorhabditis elegans and related nematodes. As a model organism database, WormBase extends beyond the genomic sequence, integrating experimental results with extensively annotated views of the genome. The WormBase Consortium continues to expand the biological scope and utility of WormBase with the inclusion of large-scale genomic analyses, through active data and literature curation, through new analysis and visualization tools, and through refinement of the user interface. Over the past year, the nearly complete genomic sequence and comparative analyses of the closely related species Caenorhabditis briggsae have been integrated into WormBase, including gene predictions, ortholog assignments and a new synteny viewer to display the relationships between the two species. Extensive site-wide refinement of the user interface now provides quick access to the most frequently accessed resources and a consistent browsing experience across the site. Unified single-page views now provide complete summaries of commonly accessed entries like genes. These advances continue to increase the utility of WormBase for C. elegans researchers, as well as for those researchers exploring problems in functional and comparative genomics in the context of a powerful genetic system.
Subjects
Caenorhabditis elegans/genetics; Caenorhabditis/genetics; Databases, Genetic; Genomics; Animals; Computational Biology; Information Storage and Retrieval; Internet; User-Computer Interface
ABSTRACT
Vulval development in the nematode Caenorhabditis elegans can be divided into a fate specification phase controlled in part by let-60 Ras, and a fate execution phase involving stereotypical patterns of cell division and migration controlled in part by lin-17 Frizzled. Since the small GTPase Rac has been implicated as a downstream target of both Ras and Frizzled and influences cytoskeletal dynamics, we investigated the role of Rac signaling during each phase of vulval development. We show that the Rac gene ced-10 and the Rac-related gene mig-2 are redundantly required for the proper orientation of certain vulval cell divisions, suggesting a role in spindle positioning. ced-10 Rac and mig-2 are also redundantly required for vulval cell migrations and play a minor role in vulval fate specification. Constitutively active and dominant-negative mutant forms of mig-2 cause vulval defects that are very similar to those seen in ced-10;mig-2 double loss-of-function mutants, indicating that they interfere with the functions of both ced-10 Rac and mig-2. Mutations in unc-73 (a Trio-like guanine nucleotide exchange factor) cause similar vulval defects, suggesting that UNC-73 is an exchange factor for both CED-10 and MIG-2. We discuss the similarities and differences between the cellular defects seen in Rac mutants and let-60 Ras or lin-17 Frizzled mutants.