Results 1 - 11 of 11
1.
Sci Data ; 11(1): 363, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605048

ABSTRACT

Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.


Subjects
Biological Science Disciplines , Knowledge Bases , Automated Pattern Recognition , Algorithms , Biomedical Translational Research
2.
Bioinformatics ; 39(7)2023 07 01.
Article in English | MEDLINE | ID: mdl-37389415

ABSTRACT

MOTIVATION: Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of KGs is lacking. RESULTS: Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of KGs. Features include a simple, modular extract-transform-load pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate KGs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph ML, including node embeddings and training of models for link prediction and node classification. AVAILABILITY AND IMPLEMENTATION: https://kghub.org.


Subjects
Biological Ontologies , COVID-19 , Humans , Automated Pattern Recognition , Rare Diseases , Machine Learning
3.
Sci Data ; 9(1): 714, 2022 11 19.
Article in English | MEDLINE | ID: mdl-36402838

ABSTRACT

The standardized identification of biomedical entities is a cornerstone of interoperability, reuse, and data integration in the life sciences. Several registries have been developed to catalog resources maintaining identifiers for biomedical entities such as small molecules, proteins, cell lines, and clinical trials. However, existing registries have struggled to provide sufficient coverage and metadata standards that meet the evolving needs of modern life sciences researchers. Here, we introduce the Bioregistry, an integrative, open, community-driven metaregistry that synthesizes and substantially expands upon 23 existing registries. The Bioregistry addresses the need for a sustainable registry by leveraging public infrastructure and automation, and employing a progressive governance model centered around open code and open data to foster community contribution. The Bioregistry can be used to support the standardized annotation of data, models, ontologies, and scientific literature, thereby promoting their interoperability and reuse. The Bioregistry can be accessed through https://bioregistry.io and its source code and data are available under the MIT and CC0 Licenses at https://github.com/biopragmatics/bioregistry .

4.
Clin Transl Sci ; 15(8): 1848-1855, 2022 08.
Article in English | MEDLINE | ID: mdl-36125173

ABSTRACT

Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these "knowledge graphs" (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs has left the task of reconciling data sources to downstream consumers. Biolink Model is an open-source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates) representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.


Subjects
Automated Pattern Recognition , Biomedical Translational Science , Knowledge
5.
Front Plant Sci ; 11: 592730, 2020.
Article in English | MEDLINE | ID: mdl-33193550

ABSTRACT

MaizeMine is the data mining resource of the Maize Genetics and Genome Database (MaizeGDB; http://maizemine.maizegdb.org). It enables researchers to create and export customized annotation datasets that can be merged with their own research data for use in downstream analyses. MaizeMine uses the InterMine data warehousing system to integrate genomic sequences and gene annotations from the Zea mays B73 RefGen_v3 and B73 RefGen_v4 genome assemblies, Gene Ontology annotations, single nucleotide polymorphisms, protein annotations, homologs, pathways, and precomputed gene expression levels based on RNA-seq data from the Z. mays B73 Gene Expression Atlas. MaizeMine also provides database cross references between genes of alternative gene sets from Gramene and NCBI RefSeq. MaizeMine includes several search tools, including a keyword search, built-in template queries with intuitive search menus, and a QueryBuilder tool for creating custom queries. The Genomic Regions search tool executes queries based on lists of genome coordinates, and supports both the B73 RefGen_v3 and B73 RefGen_v4 assemblies. The List tool allows you to upload identifiers to create custom lists, perform set operations such as unions and intersections, and execute template queries with lists. When used with gene identifiers, the List tool automatically provides gene set enrichment for Gene Ontology (GO) and pathways, with a choice of statistical parameters and background gene sets. With the ability to save query outputs as lists that can be input to new queries, MaizeMine provides limitless possibilities for data integration and meta-analysis.

6.
Nucleic Acids Res ; 48(D1): D676-D681, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31647100

ABSTRACT

The Bovine Genome Database (BGD) (http://bovinegenome.org) has been the key community bovine genomics database for more than a decade. To accommodate the increasing amount and complexity of bovine genomics data, BGD continues to advance its practices in data acquisition, curation, integration and efficient data retrieval. BGD provides tools for genome browsing (JBrowse), genome annotation (Apollo), data mining (BovineMine) and sequence database searching (BLAST). To augment the BGD genome annotation capabilities, we have developed a new Apollo plug-in, called the Locus-Specific Alternate Assembly (LSAA) tool, which enables users to identify and report potential genome assembly errors and structural variants. BGD now hosts both the newest bovine reference genome assembly, ARS-UCD1.2, as well as the previous reference genome, UMD3.1.1, with cross-genome navigation and queries supported in JBrowse and BovineMine, respectively. Other notable enhancements to BovineMine include the incorporation of genomes and gene annotation datasets for non-bovine ruminant species (goat and sheep), support for multiple assemblies per organism in the Regions Search tool, integration of additional ontologies and development of many new template queries. To better serve the research community, we continue to focus on improving existing tools, developing new tools, adding new datasets and encouraging researchers to use these resources.


Subjects
Cattle/genetics , Computational Biology/methods , Factual Databases , Genome , Algorithms , Animals , Computer Graphics , Data Mining , Genetic Databases , Gene Expression Profiling , Genomics , Internet , Molecular Sequence Annotation , RNA-Seq , Reference Values , Ruminants/genetics , Sequence Alignment , Software , User-Computer Interface
7.
PLoS Comput Biol ; 15(2): e1006790, 2019 02.
Article in English | MEDLINE | ID: mdl-30726205

ABSTRACT

Genome annotation is the process of identifying the location and function of a genome's encoded features. Improving the biological accuracy of annotation is a complex and iterative process requiring researchers to review and incorporate multiple sources of information such as transcriptome alignments, predictive models based on sequence profiles, and comparisons to features found in related organisms. Because rapidly decreasing costs are enabling an ever-growing number of scientists to incorporate sequencing as a routine laboratory technique, there is widespread demand for tools that can assist in the deliberative analytical review of genomic information. To this end, we present Apollo, an open source software package that enables researchers to efficiently inspect and refine the precise structure and role of genomic features in a graphical browser-based platform. Some of Apollo's newer user interface features include support for real-time collaboration, allowing distributed users to simultaneously edit the same encoded features while also instantly seeing the updates made by other researchers on the same region in a manner similar to Google Docs. Its technical architecture enables Apollo to be integrated into multiple existing genomic analysis pipelines and heterogeneous laboratory workflow platforms. Finally, we consider the implications that Apollo and related applications may have on how the results of genome research are published and made accessible.


Subjects
Computational Biology/methods , Molecular Sequence Annotation/methods , Chromosome Mapping/methods , Database Management Systems , Genome/genetics , Genomics , Information Storage and Retrieval , Internet , Software , User-Computer Interface
8.
Methods Mol Biol ; 1757: 211-249, 2018.
Article in English | MEDLINE | ID: mdl-29761461

ABSTRACT

The Bovine Genome Database (BGD; http://bovinegenome.org) is a web-accessible resource that supports bovine genomics research by providing genome annotation and data mining tools. BovineMine is a tool within BGD that integrates BGD data, including the genome, genes, precomputed gene expression levels and variant consequences, with external data sources that include quantitative trait loci (QTL), orthologues, Gene Ontology, gene interactions, and pathways. BovineMine enables researchers without programming skills to create custom integrated datasets for use in downstream analyses. This chapter describes how to enhance a bovine genomics project using the Bovine Genome Database, with data mining examples demonstrating BovineMine.


Subjects
Genetic Databases , Genome , Genomics , Web Browser , Animals , Cattle , Computational Biology/methods , Data Mining/methods , Gene Expression , Genetic Variation , Genome-Wide Association Study , Genomics/methods , Meta-Analysis as Topic , Molecular Sequence Annotation , Quantitative Trait Loci , Software , User-Computer Interface
9.
Methods Mol Biol ; 1757: 513-556, 2018.
Article in English | MEDLINE | ID: mdl-29761469

ABSTRACT

The Hymenoptera Genome Database (HGD; http://hymenopteragenome.org) is a genome informatics resource for insects of the order Hymenoptera, which includes bees, ants and wasps. HGD provides genome browsers with manual annotation tools (JBrowse/Apollo), BLAST, bulk data download, and a data mining warehouse (HymenopteraMine). This chapter focuses on the use of HymenopteraMine to create annotation data sets that can be exported for use in downstream analyses. HymenopteraMine leverages the InterMine platform to combine genome assemblies and official gene sets with data from OrthoDB, RefSeq, FlyBase, Gene Ontology, UniProt, InterPro, KEGG, Reactome, dbSNP, PubMed, and BioGrid, as well as precomputed gene expression information based on publicly available RNAseq. Built-in template queries provide starting points for data exploration, while the QueryBuilder tool supports construction of complex custom queries. The List Analysis and Genomic Regions search tools execute queries based on uploaded lists of identifiers and genome coordinates, respectively. HymenopteraMine facilitates cross-species data mining based on orthology and supports meta-analyses by tracking identifiers across gene sets and genome assemblies.


Subjects
Genetic Databases , Insect Genome , Genomics , Hymenoptera/genetics , Animals , Computational Biology/methods , Data Mining , Genomics/methods , Software , User-Computer Interface , Web Browser
10.
Nucleic Acids Res ; 44(D1): D793-800, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26578564

ABSTRACT

We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search.


Subjects
Genetic Databases , Insect Genome , Hymenoptera/genetics , Molecular Sequence Annotation , Animals , Data Mining , Genomics , Sequence Alignment
11.
Nucleic Acids Res ; 44(D1): D834-9, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26481361

ABSTRACT

We report an update of the Bovine Genome Database (BGD) (http://BovineGenome.org). The goal of BGD is to support bovine genomics research by providing genome annotation and data mining tools. We have developed new genome and annotation browsers using JBrowse and WebApollo for two Bos taurus genome assemblies, the reference genome assembly (UMD3.1.1) and the alternate genome assembly (Btau_4.6.1). Annotation tools have been customized to highlight priority genes for annotation, and to aid annotators in selecting gene evidence tracks from 91 tissue specific RNAseq datasets. We have also developed BovineMine, based on the InterMine data warehousing system, to integrate the bovine genome, annotation, QTL, SNP and expression data with external sources of orthology, gene ontology, gene interaction and pathway information. BovineMine provides powerful query building tools, as well as customized query templates, and allows users to analyze and download genome-wide datasets. With BovineMine, bovine researchers can use orthology to leverage the curated gene pathways of model organisms, such as human, mouse and rat. BovineMine will be especially useful for gene ontology and pathway analyses in conjunction with GWAS and QTL studies.


Subjects
Cattle/genetics , Genetic Databases , Genome , Animals , Cattle/metabolism , Data Mining , Gene Expression , Humans , Mice , Molecular Sequence Annotation , Rats , Software