ABSTRACT
Formed in late 1999, the Rat Genome Database (RGD, https://rgd.mcw.edu) will be 20 in 2020, the Year of the Rat. Because the laboratory rat, Rattus norvegicus, has been used as a model for complex human diseases such as cardiovascular disease, diabetes, cancer, neurological disorders and arthritis, among others, for >150 years, RGD has always been disease-focused and committed to providing data and tools for researchers doing comparative genomics and translational studies. At its inception, before the sequencing of the rat genome, RGD started with only a few data types localized on genetic and radiation hybrid (RH) maps and offered only a few tools for querying and consolidating that data. Since that time, RGD has expanded to include a wealth of structured and standardized genetic, genomic, phenotypic, and disease-related data for eight species, and a suite of innovative tools for querying, analyzing and visualizing this data. This article provides an overview of recent substantial additions and improvements to RGD's data and tools that can assist researchers in finding and utilizing the data they need, whether their goal is to develop new precision models of disease or to more fully explore emerging details within a system or across multiple systems.
Subjects
Chromosome Mapping , Computational Biology/methods , Databases, Genetic , Genome , Rats/genetics , Algorithms , Animals , Chinchilla/genetics , Disease Models, Animal , Dogs/genetics , Genetic Markers , Genetic Variation , Humans , Internet , Mice/genetics , Pan troglodytes/genetics , Phenotype , Protein Interaction Mapping , Retina/metabolism , Sciuridae/genetics , Software , Species Specificity , Swine/genetics , User-Computer Interface
ABSTRACT
The Rat Genome Database (RGD, http://rgd.mcw.edu) provides the most comprehensive data repository and informatics platform related to the laboratory rat, one of the most important model organisms for disease studies. RGD maintains and updates datasets for genomic elements such as genes, transcripts and increasingly in recent years, sequence variations, as well as map positions for multiple assemblies and sequence information. Functional annotations for genomic elements are curated from published literature, submitted by researchers and integrated from other public resources. Complementing the genomic data catalogs are those associated with phenotypes and disease, including strains, QTL and experimental phenotype measurements across hundreds of strains. Data are submitted by researchers, acquired through bulk data pipelines or curated from published literature. Innovative software tools provide users with an integrated platform to query, mine, display and analyze valuable genomic and phenomic datasets for discovery and enhancement of their own research. This update highlights recent developments that reflect an increasing focus on: (i) genomic variation, (ii) phenotypes and diseases, (iii) data related to the environment and experimental conditions and (iv) datasets and software tools that allow the user to explore and analyze the interactions among these and their impact on disease.
Subjects
Databases, Genetic , Genetic Variation , Genomics , Phenotype , Rats/genetics , Animals , Disease/genetics , Environment , Genome , Internet , Molecular Sequence Annotation
ABSTRACT
INTRODUCTION: Decreasing costs and increased availability of genetic testing and genome sequencing mean many physicians will consider using these services over the next few years. Despite this promising future, some argue the present roadmap for translating genetics and genomics into routine clinical practice is unclear. OBJECTIVE: We conducted a pilot study to explore Wisconsin physicians' views, practices and educational desires regarding genetic and genomic testing. METHODS: Our study consists of an Internet survey (n=155) conducted in August and September 2015 and follow-up phone interviews with a portion of survey participants. Physicians of all specialties were invited to participate. Variables measured include physicians' general knowledge and experience regarding genetic and genomic testing, attitudes and perceptions toward these tests, testing intentions, and educational desires. Sociodemographic variables included gender, age, and medical specialty. RESULTS: In our exploratory survey of Wisconsin physicians, adult primary care providers (PCPs) lagged behind other providers in terms of familiarity and experience with genetic and genomic testing. PCPs in our sample were less likely than other physicians to feel their training in genetics and genomics is adequate. Physicians younger than 50 were more likely than older colleagues to feel their training is adequate. CONCLUSIONS: Our exploratory study suggests a gap in physician education and understanding regarding genomic testing, which is fast becoming part of personalized medical care. Future studies with larger samples should examine ways for physicians to close this gap, with special focus on the needs of PCPs.
Subjects
Attitude of Health Personnel , Genetic Testing/trends , Genomics , Health Knowledge, Attitudes, Practice , Physicians , Age Factors , Genomics/education , Health Care Surveys , Humans , Physicians/psychology , Pilot Projects , Practice Patterns, Physicians' , Wisconsin
ABSTRACT
Cardiovascular diseases are complex diseases caused by a combination of genetic and environmental factors. To facilitate progress in complex disease research, the Rat Genome Database (RGD) provides the community with a disease portal where genome objects and biological data related to cardiovascular diseases are systematically organized. The purpose of this study is to present biocuration at RGD, including disease, genetic, and pathway data. The RGD curation team uses controlled vocabularies/ontologies to organize data curated from the published literature or imported from disease and pathway databases. These organized annotations are associated with genes, strains, and quantitative trait loci (QTLs), thus linking functional annotations to genome objects. Screen shots from the web pages are used to demonstrate the organization of annotations at RGD. The human cardiovascular disease genes identified by annotations were grouped according to data sources and their annotation profiles were compared by in-house tools and other enrichment tools available to the public. The analysis results show that the imported cardiovascular disease genes from ClinVar and OMIM are functionally different from the RGD manually curated genes in terms of pathway and Gene Ontology annotations. The inclusion of disease genes from other databases enriches the collection of disease genes not only in quantity but also in quality.
Subjects
Cardiovascular Diseases/genetics , Genome/genetics , Animals , Databases, Genetic , Gene Ontology , Genomics/methods , Humans , Molecular Sequence Annotation/methods , Quantitative Trait Loci/genetics , Rats
ABSTRACT
The Rat Genome Database (RGD) was started >10 years ago to provide a core genomic resource for rat researchers. Currently, RGD combines genetic, genomic, pathway, phenotype and strain information with a focus on disease. RGD users are provided with access to structured and curated data from the molecular level through the organismal level. Those users access RGD from all over the world. End users are not only rat researchers but also researchers working with mouse and human data. Translational research is supported by RGD's comparative genetics/genomics data in disease portals, in GBrowse, in VCMap and on gene report pages. The impact of RGD also goes beyond the traditional biomedical researcher, as the influence of RGD reaches bioinformaticians, tool developers and curators. Import of RGD data into other publicly available databases expands the influence of RGD to a larger set of end users than those who avail themselves of the RGD website. The value of RGD continues to grow as more types of data and more tools are added, while reaching more types of end users.
Subjects
Databases, Genetic , Genome , Animals , Humans , Mice , Phenotype , Rats
ABSTRACT
BACKGROUND: Biological systems are exquisitely poised to respond and adjust to challenges, including damage. However, sustained damage can overcome the ability of the system to adjust and result in a disease phenotype, its underpinnings many times elusive. Unraveling the molecular mechanisms of systems biology, of how and why it falters, is essential for delineating the details of the path(s) leading to the diseased state and for designing strategies to revert its progression. An important aspect of this process is not only to define the function of a gene but to identify the context within which gene functions act. It is within the network, or pathway context, that the function of a gene fulfills its ultimate biological role. Resolving the extent to which defective function(s) affect the proceedings of pathway(s) and how altered pathways merge into overpowering the system's defense machinery are key to understanding the molecular aspects of disease and envisioning ways to counteract it. A network-centric approach to diseases is increasingly being considered in current research. It also underlies the deployment of disease pathways at the Rat Genome Database Pathway Portal. The portal is presented with an emphasis on disease and altered pathways, associated drug pathways, pathway suites, and suite networks. RESULTS: The Pathway Portal at the Rat Genome Database (RGD) provides an ever-increasing collection of interactive pathway diagrams and associated annotations for metabolic, signaling, regulatory, and drug pathways, including disease and altered pathways. A disease pathway is viewed from the perspective of networks whose alterations are manifested in the affected phenotype. The Pathway Ontology (PW), built and maintained at RGD, facilitates the annotations of genes, the deployment of pathway diagrams, and provides an overall navigational tool. 
Pathways that revolve around a common concept and are globally connected are presented within pathway suites; a suite network combines two or more pathway suites. CONCLUSIONS: The Pathway Portal is a rich resource that offers a range of pathway data and visualization, including disease pathways and related pathway suites. Viewing a disease pathway from the perspective of underlying altered pathways is an aid for dissecting the molecular mechanisms of disease.
Subjects
Databases, Genetic , Gene Regulatory Networks/genetics , Genome , Metabolic Networks and Pathways/genetics , Systems Biology/methods , Animals , Disease Models, Animal , Female , Male , Molecular Sequence Annotation , Phenotype , Rats , Signal Transduction , User-Computer Interface
ABSTRACT
The details of protein pathways at a structural level provide a bridge between genetics/molecular biology and physiology. The renin-angiotensin system is involved in many physiological pathways with informative structural details in multiple components. Few studies have been performed assessing structural knowledge across the system. This assessment allows use of bioinformatics tools to fill in missing structural voids. In this paper we detail known structures of the renin-angiotensin system and use computational approaches to estimate and model components that do not have their protein structures defined. With the subsequent large library of protein structures, we then created a species-specific protein library for human, mouse, rat, bovine, zebrafish, and chicken for the system. The rat structural system allowed for rapid screening of genetic variants from 51 commonly used rat strains, identifying amino acid variants in angiotensinogen, ACE2, and AT1b that are in contact positions with other macromolecules. We believe the structural map will be of value for other researchers to understand their experimental data in the context of an environment for multiple proteins, providing pdb files of proteins for the renin-angiotensin system in six species. With detailed structural descriptions of each protein, it is easier to assess a species for use in translating human diseases with animal models. Additionally, as whole genome sequencing continues to decrease in cost, tools such as molecular modeling will gain use as an initial step in designing efficient hypothesis-driven research, addressing potential functional outcomes of genetic variants with precompiled protein libraries aiding in rapid characterizations.
Subjects
Angiotensinogen/chemistry , Biological Evolution , Computational Biology , Models, Molecular , Renin-Angiotensin System , Renin/chemistry , Amino Acid Sequence , Angiotensinogen/metabolism , Animals , Cattle , Chickens , Humans , Mice , Molecular Sequence Data , Protein Conformation , Rats , Renin/metabolism , Sequence Homology, Amino Acid , Species Specificity , Zebrafish
ABSTRACT
The RGD Pathway Portal provides pathway annotations for rat, human and mouse genes and pathway diagrams and suites, all interconnected via the pathway ontology. Diagram pages present the diagram and description, with diagram objects linked to additional resources. A newly-developed dual-functionality web application composes the diagram page. Curators input the description, diagram, references and additional pathway objects. The application combines these with tables of rat, human and mouse pathway genes, including genetic information, analysis tool and reference links, and disease, phenotype and other pathway annotations to pathway genes. The application increases the information content of diagram pages while expediting publication.
Subjects
Computational Biology/methods , Genome, Human , Software , Animals , Databases, Genetic , Gene Regulatory Networks , Humans , Internet , Metabolic Networks and Pathways , Mice , Molecular Sequence Annotation , Quantitative Trait Loci , Rats , Reproducibility of Results , Search Engine , Signal Transduction
ABSTRACT
The rat has been widely used as a disease model in a laboratory setting, resulting in an abundance of genetic and phenotype data from a wide variety of studies. These data can be found at the Rat Genome Database (RGD, http://rgd.mcw.edu/), which provides a platform for researchers interested in linking genomic variations to phenotypes. Quantitative trait loci (QTLs) form one of the earliest and core datasets, allowing researchers to identify loci harboring genes associated with disease. These QTLs are not only important for those using the rat to identify genes and regions associated with disease, but also for cross-organism analyses of syntenic regions on the mouse and the human genomes to identify potential regions for study in these organisms. Currently, RGD has data on >1,900 rat QTLs that include details about the methods and animals used to determine the respective QTL along with the genomic positions and markers that define the region. RGD also curates human QTLs (>1,900) and houses >4,000 mouse QTLs (imported from Mouse Genome Informatics). Multiple ontologies are used to standardize traits, phenotypes, diseases, and experimental methods to facilitate queries, analyses, and cross-organism comparisons. QTLs are visualized in tools such as GBrowse and GViewer, with additional tools for analysis of gene sets within QTL regions. The QTL data at RGD provide valuable information for the study of mapped phenotypes and identification of candidate genes for disease associations.
Subjects
Databases, Genetic , Genome , Quantitative Trait Loci , Access to Information , Animals , Genetic Markers , Humans , Internet , Mice , Phenotype , Rats
ABSTRACT
Complex diseases result from contributions of multiple genes that act in concert through pathways. Here we present a method to prioritize novel candidates of disease-susceptibility genes depending on the biological similarities to the known disease-related genes. The extent of disease-susceptibility of a gene is prioritized by analyzing seven features of human genes captured in H-InvDB. Taking rheumatoid arthritis (RA) and prostate cancer (PC) as two examples, we evaluated the efficiency of our method. Highly scored genes obtained included TNFSF12 and OSM as candidate disease genes for RA and PC, respectively. Subsequent characterization of these genes based upon an extensive literature survey reinforced the validity of these highly scored genes as possible disease-susceptibility genes. Our approach, Prioritization ANalysis of Disease Association (PANDA), is an efficient and cost-effective method to narrow down a large set of genes into smaller subsets that are most likely to be involved in the disease pathogenesis.
Subjects
Genetic Association Studies/methods , Genetic Predisposition to Disease/genetics , Genomics/methods , Arthritis, Rheumatoid/genetics , Cost-Benefit Analysis , Cytokine TWEAK , Data Mining , Genetic Association Studies/economics , Humans , Male , Oncostatin M/genetics , Prostatic Neoplasms/genetics , Tumor Necrosis Factors/genetics
ABSTRACT
The Rat Genome Database (RGD) (http://rgd.mcw.edu) provides a comprehensive platform for comparative genomics and genetics research. RGD houses gene, QTL and polymorphic marker data for rat, mouse and human and provides easy access to data through sophisticated searches, disease portals, interactive pathway diagrams and rat and human genome browsers.
Subjects
Databases, Genetic , Animals , Cardiovascular Diseases/genetics , Genome , Humans , Metabolic Diseases/genetics , Mice , Models, Genetic , Neoplasms/genetics , Nervous System Diseases/genetics , Obesity/genetics , Online Systems , Phenotype , Quantitative Trait Loci , Rats
ABSTRACT
The Rat Genome Database (RGD, http://rgd.mcw.edu) was developed to provide a core resource for rat researchers combining genetic, genomic, pathway, phenotype and strain information with a focus on disease. RGD users are provided with access to structured and curated data from the molecular level through to the level of the whole organism, including the variations associated with disease phenotypes. To fully support use of the rat as a translational model for biological systems and human disease, RGD continues to curate these datasets while enhancing and developing tools to allow efficient and effective access to the data in a variety of formats including linear genome viewers, pathway diagrams and biological ontologies. To support pathophysiological analysis of data, RGD Disease Portals provide an entryway to integrated gene, QTL and strain data specific to a particular disease. In addition to tool and content development and maintenance, RGD promotes rat research and provides user education by creating and disseminating tutorials on the curated datasets, submission processes, and tools available at RGD. By curating, storing, integrating, visualizing and promoting rat data, RGD ensures that the investment made into rat genomics and genetics can be leveraged by all interested investigators.
Subjects
Databases, Genetic , Genomics , Rats/genetics , Animals , Disease/genetics , Disease Models, Animal , Genetic Variation , Genome , Phenotype , Rats/metabolism , Rats/physiology , Signal Transduction , Software , Terminology as Topic
ABSTRACT
We present a genome assembly from an individual male Rattus norvegicus (the Norway rat; Chordata; Mammalia; Rodentia; Muridae). The genome sequence is 2.44 gigabases in span. The majority of the assembly is scaffolded into 20 chromosomal pseudomolecules, with both X and Y sex chromosomes assembled. This genome assembly, mRatBN7.2, represents the new reference genome for R. norvegicus and has been adopted by the Genome Reference Consortium.
ABSTRACT
Short paragraphs that describe gene function, referred to as gene summaries, are valued by users of biological knowledgebases for the ease with which they convey key aspects of gene function. Manual curation of gene summaries, while desirable, is difficult for knowledgebases to sustain. We developed an algorithm that uses curated, structured gene data at the Alliance of Genome Resources (Alliance; www.alliancegenome.org) to automatically generate gene summaries that simulate natural language. The gene data used for this purpose include curated associations (annotations) to ontology terms from the Gene Ontology, Disease Ontology, model organism knowledgebase (MOK)-specific anatomy ontologies and Alliance orthology data. The method uses sentence templates for each data category included in the gene summary in order to build a natural language sentence from the list of terms associated with each gene. To improve readability of the summaries when numerous gene annotations are present, we developed a new algorithm that traverses ontology graphs in order to group terms by their common ancestors. The algorithm optimizes the coverage of the initial set of terms and limits the length of the final summary, using measures of information content of each ontology term as a criterion for inclusion in the summary. The automated gene summaries are generated with each Alliance release, ensuring that they reflect current data at the Alliance. Our method effectively leverages category-specific curation efforts of the Alliance member databases to create modular, structured and standardized gene summaries for seven member species of the Alliance. These automatically generated gene summaries make cross-species gene function comparisons tenable and increase discoverability of potential models of human disease. In addition to being displayed on Alliance gene pages, these summaries are also included on several MOK gene pages.
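As a rough illustration of the template-and-grouping idea described above, the following Python sketch collapses terms that share a parent and fills a sentence template. The toy ontology, the template text and all function names are hypothetical; the sketch also groups only by direct parent and ignores the information-content criterion, so it is a simplification and not the Alliance algorithm itself.

```python
# Hypothetical toy ontology: child term -> parent term (single parent for simplicity).
PARENT = {
    "glucose import": "carbohydrate transport",
    "fructose import": "carbohydrate transport",
    "carbohydrate transport": "transport",
    "DNA repair": "DNA metabolic process",
}

# Hypothetical sentence template, one per data category.
TEMPLATES = {"process": "Involved in {terms}."}


def collapse_siblings(terms):
    """Replace any group of terms that share a direct parent with that parent."""
    by_parent = {}
    for term in terms:
        by_parent.setdefault(PARENT.get(term), []).append(term)
    collapsed = []
    for parent, members in by_parent.items():
        if parent is not None and len(members) > 1:
            collapsed.append(parent)   # several siblings: report the common ancestor
        else:
            collapsed.extend(members)  # lone term: keep it as annotated
    return collapsed


def join_terms(terms):
    """Join terms as 'a, b and c' for use inside a sentence."""
    if len(terms) == 1:
        return terms[0]
    return ", ".join(terms[:-1]) + " and " + terms[-1]


def summarize(category, terms):
    """Build one summary sentence for a data category from its annotated terms."""
    return TEMPLATES[category].format(terms=join_terms(collapse_siblings(terms)))


print(summarize("process", ["glucose import", "fructose import", "DNA repair"]))
```

Grouping before templating is what keeps the sentence short when a gene carries many closely related annotations.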
Subjects
Databases, Genetic , Genomics , Molecular Sequence Annotation/methods , Gene Ontology , Information Storage and Retrieval
ABSTRACT
The Rat Genome Database (RGD, http://rgd.mcw.edu) is one of the core resources for rat genomics and recent developments have focused on providing support for disease-based research using the rat model. Recognizing the importance of the rat as a disease model we have employed targeted curation strategies to curate genes, QTL and strain data for neurological and cardiovascular disease areas. This work has centered on rat but also includes data for mouse and human to create 'disease portals' that provide a unified view of the genes, QTL and strain models for these diseases across the three species. The disease curation efforts combined with normal curation activities have served to greatly increase the content of the database, particularly for biological information, including gene ontology, disease, pathway and phenotype ontology annotations. In addition to improving the features and database content, community outreach has been expanded to demonstrate how investigators can leverage the resources at RGD to facilitate their research and to elicit suggestions and needs for future developments. We have published a number of papers that provide additional information on the ontology annotations and the tools at RGD for data mining and analysis to better enable researchers to fully utilize the database.
Subjects
Databases, Genetic , Disease Models, Animal , Genomics , Rats/genetics , Animals , Cardiovascular Diseases/genetics , Chromosome Mapping , Humans , Internet , Mice , Nervous System Diseases/genetics , Quantitative Trait Loci , User-Computer Interface
ABSTRACT
The laboratory rat has been widely used as an animal model in biomedical research. There are many strains exhibiting a wide variety of phenotypes. Capturing these phenotypes in a centralized database provides researchers with an easy method for choosing the appropriate strains for their studies. Existing resources have provided some preliminary work in rat phenotype databases. However, these resources suffer from problems such as small numbers of animals, infrequent updates, limited web interface queries and a lack of standardized metadata. The Rat Genome Database (RGD) PhenoMiner tool has provided the first step in this effort by standardizing and integrating data from individual studies. Our work, mainly utilizing data curated in RGD, involves the following key steps: (i) we developed a meta-analysis pipeline to automatically integrate data from heterogeneous sources and to produce expected ranges (standardized phenotype ranges) for different strains and phenotypes under different experimental conditions; (ii) we created tools to visualize expected ranges for individual strains and strain groups. We developed a meta-analysis pipeline and an interactive web interface that summarizes and visualizes expected ranges produced from the meta-analysis pipeline. Automation of the pipeline allows for updates as additional data becomes available. The interactive web interface provides curators and researchers with a platform for identifying and validating expected ranges for a variety of quantitative phenotypes. The data analysis result and visualization tools will promote an understanding of rat disease models, guide researchers to choose optimal strains for their research needs and encourage data sharing from different research hubs. Such resources also help to promote research reproducibility. The interactive platforms created in this project will continue to provide a valuable resource for translational research efforts.
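The abstract does not state how the expected ranges are computed. As one possible sketch, assuming each study reports a sample size, mean and standard deviation for a strain and phenotype, the studies can be pooled and a range of mean ± 2 SD reported. The function name, the input layout and the ± 2 SD convention are all assumptions made for illustration, not the RGD pipeline.

```python
import math


def expected_range(studies, width=2.0):
    """Pool (n, mean, sd) summaries and return (pooled_mean, pooled_sd, low, high).

    The pooled variance combines within-study and between-study variation, which
    equals the variance of all individual measurements taken together.
    """
    total_n = sum(n for n, _, _ in studies)
    pooled_mean = sum(n * mean for n, mean, _ in studies) / total_n
    within = sum((n - 1) * sd ** 2 for n, _, sd in studies)
    between = sum(n * (mean - pooled_mean) ** 2 for n, mean, _ in studies)
    pooled_sd = math.sqrt((within + between) / (total_n - 1))
    return (pooled_mean, pooled_sd,
            pooled_mean - width * pooled_sd,
            pooled_mean + width * pooled_sd)


# Hypothetical systolic blood pressure summaries (n, mean in mmHg, SD) for one strain.
studies = [(10, 120.0, 5.0), (10, 130.0, 5.0)]
mean, sd, low, high = expected_range(studies)
print(f"pooled mean {mean:.1f}, SD {sd:.2f}, expected range {low:.1f} to {high:.1f}")
```

Because the function takes only per-study summaries, it can be re-run whenever a new study is curated, which matches the automation goal the abstract describes.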
Subjects
Disease Models, Animal , Animals , Blood Pressure , Body Weight , Databases, Genetic , Female , Genome , Male , Meta-Analysis as Topic , Models, Biological , Phenotype , Publication Bias , Quality Control , Rats , Software , Systole
ABSTRACT
BACKGROUND: To improve the outcomes of biological pathway analysis, a better way of integrating pathway data is needed. Ontologies can be used to organize data from disparate sources, and we leverage the Pathway Ontology as a unifying ontology for organizing pathway data. We aim to associate pathway instances from different databases to the appropriate class in the Pathway Ontology. RESULTS: Using a supervised machine learning approach, we trained neural networks to predict mappings between Reactome pathways and Pathway Ontology (PW) classes. For 2222 Reactome classes, the neural network (NN) model generated 10,952 class recommendations. We compared against a baseline bag-of-words (BOW) model for predicting correct PW classes. A 5% subset of Reactome pathways (111 pathways) was randomly selected, and the corresponding class recommendations from both models were evaluated by two curators. The precision of the BOW model was higher (0.49 for BOW and 0.39 for NN), but the recall was lower (0.42 for BOW and 0.78 for NN). Around 78% of Reactome pathways received pertinent recommendations from the NN model. CONCLUSIONS: The neural predictive model produced meaningful class recommendations that assisted PW curators in selecting appropriate class mappings for Reactome pathways. Our methods can be used to reduce the manual effort associated with ontology curation, and more broadly, for augmenting the curators' ability to organize and integrate data from pathway databases using the Pathway Ontology.
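To make the precision and recall figures above concrete, the sketch below computes both metrics for a single pathway from a set of recommended classes and a set of curator-approved classes. The class labels are made up, and how the study aggregated scores across the 111 evaluated pathways is not stated in the abstract, so this shows only the standard definitions.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one set of recommendations.

    precision = correct recommendations / all recommendations
    recall    = correct recommendations / all relevant classes
    """
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four recommended classes, three classes judged correct.
recommended = {"PW:A", "PW:B", "PW:C", "PW:D"}
relevant = {"PW:A", "PW:B", "PW:E"}
p, r = precision_recall(recommended, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

A model that issues many recommendations per pathway, as the NN model does here, tends to gain recall at the cost of precision, which is consistent with the reported trade-off.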
Subjects
Biological Ontologies , Neural Networks, Computer , Supervised Machine Learning
ABSTRACT
Rats have been used as research models in biomedical research for over 150 years. These disease models arise from naturally occurring mutations, selective breeding and, more recently, genome manipulation. Through the innovation of genome-editing technologies, genome-modified rats provide precision models of disease by disrupting or complementing targeted genes. To facilitate the use of these data produced from rat disease models, the Rat Genome Database (RGD) organizes rat strains and annotates these strains with disease and qualitative phenotype terms as well as quantitative phenotype measurements. From the curated quantitative data, the expected phenotype profile ranges were established through a meta-analysis pipeline using inbred rat strains in control conditions. The disease and qualitative phenotype annotations are propagated to their associated genes and alleles if applicable. Currently, RGD has curated nearly 1300 rat strains with disease/phenotype annotations and about 11% of them have known allele associations. All of the annotations (disease and phenotype) are integrated and displayed on the strain, gene and allele report pages. Finding disease and phenotype models at RGD can be done by searching for terms in the ontology browser, browsing the disease or phenotype ontology branches or entering keywords in the general search. Use cases are provided to show different targeted searches of rat strains at RGD.
Subjects
Data Curation , Data Mining , Databases, Genetic , Disease/genetics , Genome , Animals , Cytochrome P-450 Enzyme System/genetics , Disease Models, Animal , Molecular Sequence Annotation , Phenotype , Rats
ABSTRACT
Resources for rat researchers are extensive, including strain repositories and databases all around the world. The Rat Genome Database (RGD) serves as the primary rat data repository, providing both manual and computationally collected data from other databases.