Results 1 - 20 of 49
1.
Nucleic Acids Res ; 50(D1): D837-D847, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34788826

ABSTRACT

Since 2005, the Pathogen-Host Interactions Database (PHI-base) has manually curated experimentally verified pathogenicity, virulence and effector genes from fungal, bacterial and protist pathogens, which infect animal, plant, fish, insect and/or fungal hosts. PHI-base (www.phi-base.org) is devoted to the identification and presentation of phenotype information on pathogenicity and effector genes and their host interactions. Specific gene alterations that did not alter the in-host interaction phenotype are also presented. PHI-base is invaluable for comparative analyses and for the discovery of candidate targets in medically and agronomically important species for intervention. Version 4.12 (September 2021) contains 4387 references, and provides information on 8411 genes from 279 pathogens, tested on 228 hosts in 18,190 interactions. This provides a 24% increase in gene content since Version 4.8 (September 2019). Bacterial and fungal pathogens represent the majority of the interaction data, with a 54:46 split of entries, whilst protists, protozoa, nematodes and insects represent 3.6% of entries. Host species consist of approximately 54% plants and 46% others of medical, veterinary and/or environmental importance. PHI-base data is disseminated to UniProtKB, FungiDB and Ensembl Genomes. PHI-base will migrate to a new gene-centric version (version 5.0) in early 2022. This major development is briefly described.


Subjects
Databases, Factual; Host-Pathogen Interactions/genetics; Phenotype; User-Computer Interface; Animals; Apicomplexa/classification; Apicomplexa/genetics; Apicomplexa/pathogenicity; Bacteria/classification; Bacteria/genetics; Bacteria/pathogenicity; Diplomonadida/classification; Diplomonadida/genetics; Diplomonadida/pathogenicity; Fungi/classification; Fungi/genetics; Fungi/pathogenicity; Insects/classification; Insects/genetics; Insects/pathogenicity; Internet; Nematoda/classification; Nematoda/genetics; Nematoda/pathogenicity; Phylogeny; Plants/microbiology; Plants/parasitology; Virulence
2.
Nucleic Acids Res ; 48(D1): D613-D620, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31733065

ABSTRACT

The pathogen-host interactions database (PHI-base) is available at www.phi-base.org. PHI-base contains expertly curated molecular and biological information on genes proven to affect the outcome of pathogen-host interactions reported in peer reviewed research articles. PHI-base also curates literature describing specific gene alterations that did not affect the disease interaction phenotype, in order to provide complete datasets for comparative purposes. Viruses are not included, due to their extensive coverage in other databases. In this article, we describe the increased data content of PHI-base, plus new database features and further integration with complementary databases. The release of PHI-base version 4.8 (September 2019) contains 3454 manually curated references, and provides information on 6780 genes from 268 pathogens, tested on 210 hosts in 13,801 interactions. Prokaryotic and eukaryotic pathogens are represented in almost equal numbers. Host species consist of approximately 60% plants (split 50:50 between cereal and non-cereal plants), and 40% other species of medical and/or environmental importance. The information available on pathogen effectors has risen by more than a third, and the number of entries for pathogens that infect crop species of global importance has dramatically increased in this release. We also briefly describe the future direction of the PHI-base project, and some existing problems with the PHI-base curation process.


Subjects
Communicable Diseases/microbiology; Communicable Diseases/parasitology; Computational Biology/methods; Databases, Factual; Host-Pathogen Interactions/genetics; Algorithms; Animals; Antifungal Agents; Biological Assay; Crops, Agricultural; Data Management; Genome, Plant; Humans; Internet; Phenotype; Plants; Search Engine
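
The growth figure quoted in record 1 (a 24% increase in gene content between Version 4.8 and Version 4.12) can be checked against the gene counts given in records 1 and 2. The snippet below is only a minimal arithmetic sketch; the helper name `percent_increase` and the variable names are illustrative assumptions, not part of PHI-base or any of its tools.

```python
def percent_increase(old: int, new: int) -> float:
    """Relative growth from old to new, expressed as a percentage."""
    return (new - old) / old * 100.0


# Gene counts as reported in the two abstracts above.
genes_v4_8 = 6780   # PHI-base Version 4.8 (September 2019)
genes_v4_12 = 8411  # PHI-base Version 4.12 (September 2021)

gene_growth = percent_increase(genes_v4_8, genes_v4_12)
print(f"Gene content growth: {gene_growth:.1f}%")  # prints "Gene content growth: 24.1%"
```

The same helper could be applied to the reference counts (3454 and 4387) or the interaction counts (13,801 and 18,190) reported in the two records.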
3.
Nucleic Acids Res ; 47(D1): D821-D827, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30321395

ABSTRACT

PomBase (www.pombase.org), the model organism database for the fission yeast Schizosaccharomyces pombe, has undergone a complete redevelopment, resulting in a more fully integrated, better-performing service. The new infrastructure supports daily data updates as well as fast, efficient querying and smoother navigation within and between pages. New pages for publications and genotypes provide routes to all data curated from a single source and to all phenotypes associated with a specific genotype, respectively. For ontology-based annotations, improved displays balance comprehensive data coverage with ease of use. The default view now uses ontology structure to provide a concise, non-redundant summary that can be expanded to reveal underlying details and metadata. The phenotype annotation display also offers filtering options to allow users to focus on specific areas of interest. An instance of the JBrowse genome browser has been integrated, facilitating loading of, and intuitive access to, genome-scale datasets. Taken together, the new data and pages, along with improvements in annotation display and querying, allow users to probe connections among different types of data to form a comprehensive view of fission yeast biology. The new PomBase implementation also provides a rich set of modular, reusable tools that can be deployed to create new, or enhance existing, organism-specific databases.


Subjects
Databases, Genetic; Genome, Fungal/genetics; Schizosaccharomyces/genetics; Internet; Software; User-Computer Interface
4.
FEMS Yeast Res ; 19(2), 2019 03 01.
Article in English | MEDLINE | ID: mdl-30753445

ABSTRACT

Topological analysis of large networks, which focus on a specific biological process or on related biological processes, where functional coherence exists among the interacting members, may provide a wealth of insight into cellular functionality. This work presents an unbiased systems approach to analyze genetic, transcriptional regulatory and physical interaction networks of yeast genes possessing such functional coherence to gain novel biological insight. The present analysis identified only a few transcriptional regulators amongst a large gene cohort associated with the protein metabolism and processing in yeast. These transcription factors are not functionally required for the maintenance of these tasks in growing cells. Rather, they are involved in rewiring gene transcription in response to such major challenges as starvation, hypoxia, DNA damage, heat shock or the accumulation of unfolded proteins. Indeed, only a subset of these proteins were captured empirically in the nuclear-enriched fraction of non-stressed yeast cells, suggesting that the transcriptional regulation of protein metabolism and processing in yeast is primarily concerned with maintaining cellular robustness in the face of threat by either internal or external stressors.


Subjects
Gene Expression Regulation, Fungal; Protein Processing, Post-Translational; Saccharomyces cerevisiae Proteins/metabolism; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Transcription Factors/metabolism; Transcription, Genetic; Gene Regulatory Networks
5.
Nucleic Acids Res ; 45(D1): D128-D134, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27794554

ABSTRACT

RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within the context of a single species. The website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality. All RNAcentral data is provided for free and is available for browsing, bulk downloads, and programmatic access at http://rnacentral.org/.


Subjects
Databases, Nucleic Acid; RNA, Untranslated/chemistry; Animals; Genomics; Humans; Nucleotides/chemistry; Sequence Analysis, RNA; Species Specificity
6.
RNA ; 22(5): 667-76, 2016 May.
Article in English | MEDLINE | ID: mdl-26917558

ABSTRACT

MicroRNA regulation of developmental and cellular processes is a relatively new field of study, and the available research data have not been organized to enable its inclusion in pathway and network analysis tools. The association of gene products with terms from the Gene Ontology is an effective method to analyze functional data, but until recently there has been no substantial effort dedicated to applying Gene Ontology terms to microRNAs. Consequently, when performing functional analysis of microRNA data sets, researchers have had to rely instead on the functional annotations associated with the genes encoding microRNA targets. In consultation with experts in the field of microRNA research, we have created comprehensive recommendations for the Gene Ontology curation of microRNAs. This curation manual will enable provision of a high-quality, reliable set of functional annotations for the advancement of microRNA research. Here we describe the key aspects of the work, including development of the Gene Ontology to represent this data, standards for describing the data, and guidelines to support curators making these annotations. The full microRNA curation guidelines are available on the GO Consortium wiki (http://wiki.geneontology.org/index.php/MicroRNA_GO_annotation_manual).


Subjects
Guidelines as Topic; MicroRNAs/genetics; Animals; Gene Silencing; Humans; Mice
7.
Nucleic Acids Res ; 43(Database issue): D656-61, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25361970

ABSTRACT

PomBase (http://www.pombase.org) is the model organism database for the fission yeast Schizosaccharomyces pombe. PomBase provides a central hub for the fission yeast community, supporting both exploratory and hypothesis-driven research. It provides users easy access to data ranging from the sequence level, to molecular and phenotypic annotations, through to the display of genome-wide high-throughput studies. Recent improvements to the site extend annotation specificity, improve usability and allow for monthly data updates. Both in-house curators and community researchers provide manually curated data to PomBase. The genome browser provides access to published high-throughput data sets and the genomes of three additional Schizosaccharomyces species (Schizosaccharomyces cryophilus, Schizosaccharomyces japonicus and Schizosaccharomyces octosporus).


Subjects
Databases, Genetic; Schizosaccharomyces/genetics; Gene Expression; Gene Ontology; Genes, Fungal; Genomics; High-Throughput Nucleotide Sequencing; Internet; Molecular Sequence Annotation
8.
BMC Biol ; 14: 49, 2016 06 22.
Article in English | MEDLINE | ID: mdl-27334346

ABSTRACT

Modern biomedical research depends critically on access to databases that house and disseminate genetic, genomic, molecular, and cell biological knowledge. Even as the explosion of available genome sequences and associated genome-scale data continues apace, the sustainability of professionally maintained biological databases is under threat due to policy changes by major funding agencies. Here, we focus on model organism databases to demonstrate the myriad ways in which biological databases not only act as repositories but actively facilitate advances in research. We present data that show that reducing financial support to model organism databases could prove to be not just scientifically, but also economically, unsound.


Subjects
Biomedical Research; Databases, Genetic; Genome, Fungal; Genomics; Molecular Biology; Molecular Sequence Annotation; Schizosaccharomyces/genetics
9.
Bioinformatics ; 30(12): 1791-2, 2014 Jun 15.
Article in English | MEDLINE | ID: mdl-24574118

ABSTRACT

MOTIVATION: Detailed curation of published molecular data is essential for any model organism database. Community curation enables researchers to contribute data from their papers directly to databases, supplementing the activity of professional curators and improving coverage of a growing body of literature. We have developed Canto, a web-based tool that provides an intuitive curation interface for both curators and researchers, to support community curation in the fission yeast database, PomBase. Canto supports curation using OBO ontologies, and can be easily configured for use with any species. AVAILABILITY: Canto code and documentation are available under an Open Source license from http://curation.pombase.org/. Canto is a component of the Generic Model Organism Database (GMOD) project (http://www.gmod.org/).


Subjects
Databases, Factual; Software; Biological Ontologies; Internet; Schizosaccharomyces
10.
BMC Bioinformatics ; 15: 155, 2014 May 21.
Article in English | MEDLINE | ID: mdl-24885854

ABSTRACT

BACKGROUND: The Gene Ontology project integrates data about the function of gene products across a diverse range of organisms, allowing the transfer of knowledge from model organisms to humans, and enabling computational analyses for interpretation of high-throughput experimental and clinical data. The core data structure is the annotation, an association between a gene product and a term from one of the three ontologies comprising the GO. Historically, it has not been possible to provide additional information about the context of a GO term, such as the target gene or the location of a molecular function. This has limited the specificity of knowledge that can be expressed by GO annotations. RESULTS: The GO Consortium has introduced annotation extensions that enable manually curated GO annotations to capture additional contextual details. Extensions represent effector-target relationships such as localization dependencies, substrates of protein modifiers and regulation targets of signaling pathways and transcription factors as well as spatial and temporal aspects of processes such as cell or tissue type or developmental stage. We describe the content and structure of annotation extensions, provide examples, and summarize the current usage of annotation extensions. CONCLUSIONS: The additional contextual information captured by annotation extensions improves the utility of functional annotation by representing dependencies between annotations to terms in the different ontologies of GO, external ontologies, or an organism's gene products. These enhanced annotations can also support sophisticated queries and reasoning, and will provide curated, directional links between many gene products to support pathway and network reconstruction.


Subjects
Gene Ontology; Molecular Sequence Annotation; Computational Biology/methods; Humans; Proteins/genetics
11.
Bioinformatics ; 29(13): 1671-8, 2013 Jul 01.
Article in English | MEDLINE | ID: mdl-23658422

ABSTRACT

MOTIVATION: To provide consistent computable descriptions of phenotype data, PomBase is developing a formal ontology of phenotypes observed in fission yeast. RESULTS: The fission yeast phenotype ontology (FYPO) is a modular ontology that uses several existing ontologies from the open biological and biomedical ontologies (OBO) collection as building blocks, including the phenotypic quality ontology PATO, the Gene Ontology and Chemical Entities of Biological Interest. Modular ontology development facilitates partially automated effective organization of detailed phenotype descriptions with complex relationships to each other and to underlying biological phenomena. As a result, FYPO supports sophisticated querying, computational analysis and comparison between different experiments and even between species. AVAILABILITY: FYPO releases are available from the Subversion repository at the PomBase SourceForge project page (https://sourceforge.net/p/pombase/code/HEAD/tree/phenotype_ontology/). The current version of FYPO is also available on the OBO Foundry Web site (http://obofoundry.org/).


Subjects
Phenotype; Schizosaccharomyces/genetics; Biological Ontologies; Databases, Genetic; Gene Ontology
12.
Nat Rev Genet ; 9(7): 509-15, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18475267

ABSTRACT

The Gene Ontology (GO) project is a collaboration among model organism databases to describe gene products from all organisms using a consistent and computable language. GO produces sets of explicitly defined, structured vocabularies that describe biological processes, molecular functions and cellular components of gene products in both a computer- and human-readable manner. Here we describe key aspects of GO, which, when overlooked, can cause erroneous results, and address how these pitfalls can be avoided.


Subjects
Computational Biology; Databases, Genetic; Proteins/genetics; Proteins/metabolism; Humans; Natural Language Processing; Software
13.
Nature ; 453(7199): 1239-43, 2008 Jun 26.
Article in English | MEDLINE | ID: mdl-18488015

ABSTRACT

Recent data from several organisms indicate that the transcribed portions of genomes are larger and more complex than expected, and that many functional properties of transcripts are based not on coding sequences but on regulatory sequences in untranslated regions or non-coding RNAs. Alternative start and polyadenylation sites and regulation of intron splicing add additional dimensions to the rich transcriptional output. This transcriptional complexity has been sampled mainly using hybridization-based methods under one or few experimental conditions. Here we applied direct high-throughput sequencing of complementary DNAs (RNA-Seq), supplemented with data from high-density tiling arrays, to globally sample transcripts of the fission yeast Schizosaccharomyces pombe, independently from available gene annotations. We interrogated transcriptomes under multiple conditions, including rapid proliferation, meiotic differentiation and environmental stress, as well as in RNA processing mutants to reveal the dynamic plasticity of the transcriptional landscape as a function of environmental, developmental and genetic factors. High-throughput sequencing proved to be a powerful and quantitative method to sample transcriptomes deeply at maximal resolution. In contrast to hybridization, sequencing showed little, if any, background noise and was sensitive enough to detect widespread transcription in >90% of the genome, including traces of RNAs that were not robustly transcribed or rapidly degraded. The combined sequencing and strand-specific array data provide rich condition-specific information on novel, mostly non-coding transcripts, untranslated regions and gene structures, thus improving the existing genome annotation. Sequence reads spanning exon-exon or exon-intron junctions give unique insight into a surprising variability in splicing efficiency across introns, genes and conditions. 
Splicing efficiency was largely coordinated with transcript levels, and increased transcription led to increased splicing in test genes. Hundreds of introns showed such regulated splicing during cellular proliferation or differentiation.


Subjects
Eukaryotic Cells/metabolism; Gene Expression Profiling; Oligonucleotide Array Sequence Analysis; Schizosaccharomyces/genetics; Alternative Splicing/genetics; Chromatin Immunoprecipitation; Exons/genetics; Gene Expression Regulation, Fungal; Genes, Fungal/genetics; Introns/genetics; RNA Polymerase II/metabolism; RNA, Fungal/analysis; RNA, Fungal/genetics; RNA, Messenger/analysis; RNA, Messenger/genetics; Schizosaccharomyces pombe Proteins/genetics; Sensitivity and Specificity; Transcription, Genetic/genetics
14.
Nucleic Acids Res ; 40(Database issue): D695-9, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22039153

ABSTRACT

PomBase (www.pombase.org) is a new model organism database established to provide access to comprehensive, accurate, and up-to-date molecular data and biological information for the fission yeast Schizosaccharomyces pombe to effectively support both exploratory and hypothesis-driven research. PomBase encompasses annotation of genomic sequence and features, comprehensive manual literature curation and genome-wide data sets, and supports sophisticated user-defined queries. The implementation of PomBase integrates a Chado relational database that houses manually curated data with Ensembl software that supports sequence-based annotation and web access. PomBase will provide user-friendly tools to promote curation by experts within the fission yeast community. This will make a key contribution to shaping its content and ensuring its comprehensiveness and long-term relevance.


Subjects
Databases, Genetic; Schizosaccharomyces/genetics; Genome, Fungal; Genomics; Internet; Molecular Sequence Annotation; Phenotype
15.
Genetics ; 227(1), 2024 05 07.
Article in English | MEDLINE | ID: mdl-38376816

ABSTRACT

PomBase (https://www.pombase.org), the model organism database (MOD) for fission yeast, was recently awarded Global Core Biodata Resource (GCBR) status by the Global Biodata Coalition (GBC; https://globalbiodata.org/) after a rigorous selection process. In this MOD review, we present PomBase's continuing growth and improvement over the last 2 years. We describe these improvements in the context of the qualitative GCBR indicators related to scientific quality, comprehensivity, accelerating science, user stories, and collaborations with other biodata resources. This review also showcases the depth of existing connections both within the biocuration ecosystem and between PomBase and its user community.


Subjects
Schizosaccharomyces; Schizosaccharomyces/genetics; Databases, Genetic; Genome, Fungal
16.
J Biomed Semantics ; 15(1): 19, 2024 Oct 17.
Article in English | MEDLINE | ID: mdl-39415214

ABSTRACT

BACKGROUND: Ontologies are fundamental components of informatics infrastructure in domains such as biomedical, environmental, and food sciences, representing consensus knowledge in an accurate and computable form. However, their construction and maintenance demand substantial resources and necessitate substantial collaboration between domain experts, curators, and ontology experts. We present Dynamic Retrieval Augmented Generation of Ontologies using AI (DRAGON-AI), an ontology generation method employing Large Language Models (LLMs) and Retrieval Augmented Generation (RAG). DRAGON-AI can generate textual and logical ontology components, drawing from existing knowledge in multiple ontologies and unstructured text sources. RESULTS: We assessed performance of DRAGON-AI on de novo term construction across ten diverse ontologies, making use of extensive manual evaluation of results. Our method has high precision for relationship generation, but has slightly lower precision than from logic-based reasoning. Our method is also able to generate definitions deemed acceptable by expert evaluators, but these scored worse than human-authored definitions. Notably, evaluators with the highest level of confidence in a domain were better able to discern flaws in AI-generated definitions. We also demonstrated the ability of DRAGON-AI to incorporate natural language instructions in the form of GitHub issues. CONCLUSIONS: These findings suggest DRAGON-AI's potential to substantially aid the manual ontology construction process. However, our results also underscore the importance of having expert curators and ontology editors drive the ontology generation process.


Subjects
Artificial Intelligence; Biological Ontologies; Natural Language Processing; Information Storage and Retrieval/methods
17.
bioRxiv ; 2024 Sep 22.
Article in English | MEDLINE | ID: mdl-39345458

ABSTRACT

Phenotypic data are critical for understanding biological mechanisms and consequences of genomic variation, and are pivotal for clinical use cases such as disease diagnostics and treatment development. For over a century, vast quantities of phenotype data have been collected in many different contexts covering a variety of organisms. The emerging field of phenomics focuses on integrating and interpreting these data to inform biological hypotheses. A major impediment in phenomics is the wide range of distinct and disconnected approaches to recording the observable characteristics of an organism. Phenotype data are collected and curated using free text, single terms or combinations of terms, using multiple vocabularies, terminologies, or ontologies. Integrating these heterogeneous and often siloed data enables the application of biological knowledge both within and across species. Existing integration efforts are typically limited to mappings between pairs of terminologies; a generic knowledge representation that captures the full range of cross-species phenomics data is much needed. We have developed the Unified Phenotype Ontology (uPheno) framework, a community effort to provide an integration layer over domain-specific phenotype ontologies, as a single, unified, logical representation. uPheno comprises (1) a system for consistent computational definition of phenotype terms using ontology design patterns, maintained as a community library; (2) a hierarchical vocabulary of species-neutral phenotype terms under which their species-specific counterparts are grouped; and (3) mapping tables between species-specific ontologies. This harmonized representation supports use cases such as cross-species integration of genotype-phenotype associations from different organisms and cross-species informed variant prioritization.

18.
PLoS Comput Biol ; 8(2): e1002386, 2012.
Article in English | MEDLINE | ID: mdl-22359495

ABSTRACT

A recent paper (Nehrt et al., PLoS Comput. Biol. 7:e1002073, 2011) has proposed a metric for the "functional similarity" between two genes that uses only the Gene Ontology (GO) annotations directly derived from published experimental results. Applying this metric, the authors concluded that paralogous genes within the mouse genome or the human genome are more functionally similar on average than orthologous genes between these genomes, an unexpected result with broad implications if true. We suggest, based on both theoretical and empirical considerations, that this proposed metric should not be interpreted as a functional similarity, and therefore cannot be used to support any conclusions about the "ortholog conjecture" (or, more properly, the "ortholog functional conservation hypothesis"). First, we reexamine the case studies presented by Nehrt et al. as examples of orthologs with divergent functions, and come to a very different conclusion: they actually exemplify how GO annotations for orthologous genes provide complementary information about conserved biological functions. We then show that there is a global ascertainment bias in the experiment-based GO annotations for human and mouse genes: particular types of experiments tend to be performed in different model organisms. We conclude that the reported statistical differences in annotations between pairs of orthologous genes do not reflect differences in biological function, but rather complementarity in experimental approaches. 
Our results underscore two general considerations for researchers proposing novel types of analysis based on the GO: 1) that GO annotations are often incomplete, potentially in a biased manner, and subject to an "open world assumption" (absence of an annotation does not imply absence of a function), and 2) that conclusions drawn from a novel, large-scale GO analysis should whenever possible be supported by careful, in-depth examination of examples, to help ensure the conclusions have a justifiable biological basis.


Subjects
Computational Biology/methods; Genome, Human; Algorithms; Animals; Cell Nucleus/metabolism; Genome; Genomics/methods; Humans; Mice; Models, Genetic; Models, Statistical; Molecular Biology/methods; Molecular Sequence Annotation/methods; Phosphorylation; Probability; Species Specificity
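
The "open world assumption" described in record 18 (absence of an annotation does not imply absence of a function) can be illustrated with a small sketch. Everything below is hypothetical: the gene names, the term labels, and the functions `is_annotated` and `jaccard` are invented for illustration and do not reproduce the metric of Nehrt et al. or any GO Consortium tool.

```python
from typing import Optional

# Hypothetical experiment-based annotations for an orthologous gene pair.
# Each organism was studied with a different kind of experiment, so the
# two annotation sets are complementary rather than contradictory.
annotations: dict[str, set[str]] = {
    "human_geneA": {"term_from_biochemical_assay"},
    "mouse_geneA": {"term_from_knockout_phenotype"},
}


def is_annotated(gene: str, term: str) -> Optional[bool]:
    """Return True if the annotation exists, otherwise None (unknown).

    Under the open world assumption a missing annotation is never
    reported as False: the function may simply not have been tested.
    """
    return True if term in annotations.get(gene, set()) else None


def jaccard(a: set[str], b: set[str]) -> float:
    """Naive set-overlap score that wrongly treats missing annotations as absent."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


overlap = jaccard(annotations["human_geneA"], annotations["mouse_geneA"])
status = is_annotated("human_geneA", "term_from_knockout_phenotype")
print(overlap, status)  # prints "0.0 None"
```

In this toy case the overlap score is zero even though nothing in the data says the two genes differ in function, which is the kind of ascertainment effect the abstract cautions against reading as a biological difference.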
19.
Elife ; 12, 2023 07 04.
Article in English | MEDLINE | ID: mdl-37401199

ABSTRACT

The quantity and complexity of data being generated and published in biology has increased substantially, but few methods exist for capturing knowledge about phenotypes derived from molecular interactions between diverse groups of species, in such a way that is amenable to data-driven biology and research. To improve access to this knowledge, we have constructed a framework for the curation of the scientific literature studying interspecies interactions, using data curated for the Pathogen-Host Interactions database (PHI-base) as a case study. The framework provides a curation tool, phenotype ontology, and controlled vocabularies to curate pathogen-host interaction data, at the level of the host, pathogen, strain, gene, and genotype. The concept of a multispecies genotype, the 'metagenotype,' is introduced to facilitate capturing changes in the disease-causing abilities of pathogens, and host resistance or susceptibility, observed by gene alterations. We report on this framework and describe PHI-Canto, a community curation tool for use by publication authors.


The increasingly vast amount of data being produced in research communities can be difficult to manage, making it challenging for both humans and computers to organise and connect information from different sources. Currently, software tools that allow authors to curate peer-reviewed life science publications are designed solely for single species, or closely related species that do not interact. Although most research communities are striving to make their data FAIR (Findable, Accessible, Interoperable and Reusable), it is particularly difficult to curate detailed information based on interactions between two or more species (interspecies), such as pathogen-host interactions. As a result, there was a lack of tools to support multi-species interaction databases, leading to a reliance on labour-intensive curation methods. To address this problem, Cuzick et al. used the Pathogen-Host Interactions database (PHI-base), which curates knowledge from the text, tables and figures published in over 200 journals, as a case study. A framework was developed that could capture the many observable traits (phenotype annotations) for interactions and link them directly to the combination of genotypes involved in those interactions across multiple scales, ranging from microscopic to macroscopic. This demonstrated that it was possible to build a framework of software tools to enable curation of interactions between species in more detail than had been done before. Cuzick et al. developed an online tool called PHI-Canto that allows any researcher to curate published pathogen-host interactions between almost any known species. An ontology (a collection of concepts and their relations) was created to describe the outcomes of pathogen-host interactions in a standardised way. 
Additionally, a new concept called the 'metagenotype' was developed which represents the combination of a pathogen and a host genotype and can be easily annotated with the phenotypes arising from each interaction. The newly curated multi-species FAIR data on pathogen-host interactions will enable researchers in different disciplines to compare and contrast interactions across species and scales. Ultimately, this will assist the development of new approaches to reduce the impact of pathogens on humans, livestock, crops and ecosystems with the aim of decreasing disease while increasing food security and biodiversity. The framework is potentially adoptable by any research community investigating interactions between species and could be adapted to explore other harmful and beneficial interspecies interactions.


Subjects
Data Curation; Databases, Factual; Genotype; Phenotype
20.
Genetics ; 225(3), 2023 11 01.
Article in English | MEDLINE | ID: mdl-37758508

ABSTRACT

Standardized nomenclature for genes, gene products, and isoforms is crucial to prevent ambiguity and enable clear communication of scientific data, facilitating efficient biocuration and data sharing. Standardized genotype nomenclature, which describes alleles present in a specific strain that differ from those in the wild-type reference strain, is equally essential to maximize research impact and ensure that results linking genotypes to phenotypes are Findable, Accessible, Interoperable, and Reusable (FAIR). In this publication, we extend the fission yeast clade gene nomenclature guidelines to support the curation efforts at PomBase (www.pombase.org), the Schizosaccharomyces pombe Model Organism Database. This update introduces nomenclature guidelines for noncoding RNA genes, following those set forth by the Human Genome Organisation Gene Nomenclature Committee. Additionally, we provide a significant update to the allele and genotype nomenclature guidelines originally published in 1987, to standardize the diverse range of genetic modifications enabled by the fission yeast genetic toolbox. These updated guidelines reflect a community consensus between numerous fission yeast researchers. Adoption of these rules will improve consistency in gene and genotype nomenclature, and facilitate machine-readability and automated entity recognition of fission yeast genes and alleles in publications or datasets. In conclusion, our updated guidelines provide a valuable resource for the fission yeast research community, promoting consistency, clarity, and FAIRness in genetic data sharing and interpretation.


Subjects
Schizosaccharomyces; Humans; Schizosaccharomyces/genetics; Alleles; Comprehension; Databases, Genetic; Phenotype