Results 1 - 20 of 30
1.
Genome Biol ; 24(1): 115, 2023 05 12.
Article in English | MEDLINE | ID: mdl-37173739

ABSTRACT

The Vertebrate Gene Nomenclature Committee (VGNC) was established in 2016 as a sister project to the HUGO Gene Nomenclature Committee, to approve gene nomenclature in vertebrate species without an existing dedicated nomenclature committee. The VGNC aims to harmonize gene nomenclature across selected vertebrate species in line with human gene nomenclature, with orthologs assigned the same nomenclature where possible. This article presents an overview of the VGNC project and discussion of key findings resulting from this work to date. VGNC-approved nomenclature is accessible at https://vertebrate.genenames.org and is additionally displayed by the NCBI, Ensembl, and UniProt databases.


Subject(s)
Databases, Genetic , Vertebrates , Animals , Humans , Vertebrates/genetics
2.
IUBMB Life ; 75(5): 380-389, 2023 05.
Article in English | MEDLINE | ID: mdl-35880706

ABSTRACT

The HUGO Gene Nomenclature Committee (HGNC) is the sole group with the authority to approve symbols for human genes, including long non-coding RNA (lncRNA) genes. Use of approved symbols ensures that publications and biomedical databases are easily searchable and reduces the risks of confusion that can be caused by using the same symbol to refer to different genes or using many different symbols for the same gene. Here, we describe how the HGNC names lncRNA genes and review the nomenclature of the seven lncRNA genes most mentioned in the scientific literature.


Subject(s)
RNA, Long Noncoding , Humans , RNA, Long Noncoding/genetics , Databases, Genetic
3.
Nucleic Acids Res ; 51(D1): D1003-D1009, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36243972

ABSTRACT

The HUGO Gene Nomenclature Committee (HGNC) assigns unique symbols and names to human genes. The HGNC database (www.genenames.org) currently contains over 43 000 approved gene symbols, over 19 200 of which are assigned to protein-coding genes, 14 000 to pseudogenes and nearly 9000 to non-coding RNA genes. The public website, www.genenames.org, displays all approved nomenclature within Symbol Reports that contain data curated by HGNC nomenclature advisors and links to related genomic, clinical, and proteomic information. Here, we describe updates to our resource, including improvements to our search facility and new download features.


Subject(s)
Databases, Genetic , Humans , Genome , Genomics , Proteomics , Pseudogenes , Terminology as Topic
4.
Hum Genomics ; 16(1): 58, 2022 11 15.
Article in English | MEDLINE | ID: mdl-36380364

ABSTRACT

The HUGO Gene Nomenclature Committee (HGNC) has been providing standardized symbols and names for human genes since the late 1970s. As funding agencies change their priorities, finding financial support for critical biomedical resources such as the HGNC becomes ever more challenging. In this article, we outline the key roles the HGNC currently plays in aiding communication and the need for these activities to be maintained.


Subject(s)
Databases, Genetic , Genomics , Humans
5.
Genet Med ; 24(8): 1732-1742, 2022 08.
Article in English | MEDLINE | ID: mdl-35507016

ABSTRACT

PURPOSE: Several groups and resources provide information that pertains to the validity of gene-disease relationships used in genomic medicine and research; however, universal standards and terminologies to define the evidence base for the role of a gene in disease and a single harmonized resource were lacking. To tackle this issue, the Gene Curation Coalition (GenCC) was formed. METHODS: The GenCC drafted harmonized definitions for differing levels of gene-disease validity on the basis of existing resources, and performed a modified Delphi survey with 3 rounds to narrow the list of terms. The GenCC also developed a unified database to display curated gene-disease validity assertions from its members. RESULTS: On the basis of 241 survey responses from the genetics community, a consensus term set was chosen for grading gene-disease validity and database submissions. As of December 2021, the database contained 15,241 gene-disease assertions on 4569 unique genes from 12 submitters. When comparing submissions to the database from distinct sources, conflicts in assertions of gene-disease validity ranged from 5.3% to 13.4%. CONCLUSION: Terminology standardization, sharing of gene-disease validity classifications, and resolution of curation conflicts will facilitate collaborations across international curation efforts and in turn, improve consistency in genetic testing and variant interpretation.


Subject(s)
Databases, Genetic , Genomics , Genetic Testing , Genetic Variation , Humans
6.
Am J Hum Genet ; 108(10): 1813-1816, 2021 10 07.
Article in English | MEDLINE | ID: mdl-34626580

ABSTRACT

The use of approved nomenclature in publications is vital to enable effective scientific communication and is particularly crucial when discussing genes of clinical relevance. Here, we discuss several examples of cases where the failure of researchers to use a HUGO Gene Nomenclature Committee (HGNC)-approved symbol in publications has led to confusion between unrelated human genes in the literature. We also inform authors of the steps they can take to ensure that they use approved nomenclature in their manuscripts and discuss how referencing HGNC IDs can remove ambiguity when referring to genes that have previously been published with confusing alias symbols.


Subject(s)
Databases, Genetic/standards , Genes/genetics , Genome, Human , Research Personnel/standards , Terminology as Topic , Genomics , Humans
7.
Nucleic Acids Res ; 49(D1): D939-D946, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33152070

ABSTRACT

The HUGO Gene Nomenclature Committee (HGNC) based at EMBL's European Bioinformatics Institute (EMBL-EBI) assigns unique symbols and names to human genes. There are over 42 000 approved gene symbols in our current database of which over 19 000 are for protein-coding genes. While we still update placeholder and problematic symbols, we are working towards stabilizing symbols where possible; over 2000 symbols for disease associated genes are now marked as stable in our symbol reports. All of our data is available at the HGNC website https://www.genenames.org. The Vertebrate Gene Nomenclature Committee (VGNC) was established to assign standardized nomenclature in line with human for vertebrate species lacking their own nomenclature committee. In addition to the previous VGNC core species of chimpanzee, cow, horse and dog, we now name genes in cat, macaque and pig. Gene groups have been added to VGNC and currently include two complex families: olfactory receptors (ORs) and cytochrome P450s (CYPs). In collaboration with specialists we have also named CYPs in species beyond our core set. All VGNC data is available at https://vertebrate.genenames.org/. This article provides an overview of our online data and resources, focusing on updates over the last two years.


Subject(s)
Computational Biology/methods , Databases, Genetic , Genes/genetics , Genomics/methods , Terminology as Topic , Vertebrates/genetics , Animals , Humans , Internet , Proteins/genetics , Species Specificity , User-Computer Interface , Vertebrates/classification
9.
Genome Res ; 29(12): 2073-2087, 2019 12.
Article in English | MEDLINE | ID: mdl-31537640

ABSTRACT

The most widely appreciated role of DNA is to encode protein, yet the exact portion of the human genome that is translated remains to be ascertained. We previously developed PhyloCSF, a widely used tool to identify evolutionary signatures of protein-coding regions using multispecies genome alignments. Here, we present the first whole-genome PhyloCSF prediction tracks for human, mouse, chicken, fly, worm, and mosquito. We develop a workflow that uses machine learning to predict novel conserved protein-coding regions and efficiently guide their manual curation. We analyze more than 1000 high-scoring human PhyloCSF regions and confidently add 144 conserved protein-coding genes to the GENCODE gene set, as well as additional coding regions within 236 previously annotated protein-coding genes, and 169 pseudogenes, most of them disabled after primates diverged. The majority of these represent new discoveries, including 70 previously undetected protein-coding genes. The novel coding genes are additionally supported by single-nucleotide variant evidence indicative of continued purifying selection in the human lineage, coding-exon splicing evidence from new GENCODE transcripts using next-generation transcriptomic data sets, and mass spectrometry evidence of translation for several new genes. Our discoveries required simultaneous comparative annotation of other vertebrate genomes, which we show is essential to remove spurious ORFs and to distinguish coding from pseudogene regions. Our new coding regions help elucidate disease-associated regions by revealing that 118 GWAS variants previously thought to be noncoding are in fact protein altering. Altogether, our PhyloCSF data sets and algorithms will help researchers seeking to interpret these genomes, while our new annotations present exciting loci for further experimental characterization.


Subject(s)
Exons , Genome, Human , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing , Open Reading Frames , Sequence Analysis, DNA , Animals , Humans , Pseudogenes
10.
J Cell Commun Signal ; 13(3): 435, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31468292

ABSTRACT

The original version of this article unfortunately contained a mistake. In the Abstract section, a production query number (Q2) was inadvertently inserted within the new official gene names of the CCN proteins.

11.
Nucleic Acids Res ; 47(D1): D786-D792, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30304474

ABSTRACT

The HUGO Gene Nomenclature Committee (HGNC) based at EMBL's European Bioinformatics Institute (EMBL-EBI) assigns unique symbols and names to human genes. There are over 40 000 approved gene symbols in our current database of which over 19 000 are for protein-coding genes. The Vertebrate Gene Nomenclature Committee (VGNC) was established in 2016 to assign standardized nomenclature in line with human for vertebrate species that lack their own nomenclature committees. The VGNC initially assigned nomenclature for over 15 000 protein-coding genes in chimpanzee. We have extended this process to other vertebrate species, naming over 14 000 protein-coding genes in cow and dog and over 13 000 in horse to date. Our HGNC website https://www.genenames.org has undergone a major design update, simplifying the homepage to provide easy access to our search tools and making the site more mobile friendly. Our gene families pages are now known as 'gene groups' and have increased in number to over 1200, with nearly half of all named genes currently assigned to at least one gene group. This article provides an overview of our online data and resources, focusing on our work over the last two years.


Subject(s)
Computational Biology/standards , Databases, Genetic/standards , Genomics/standards , Terminology as Topic , Animals , Cattle , Dogs , Horses/genetics , Humans , Pan troglodytes/genetics , Search Engine
12.
J Cell Commun Signal ; 12(4): 625-629, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30393824

ABSTRACT

An examination of the confusion generated around the use of different acronyms for CCN proteins has been performed by the editors of the HUGO Gene Nomenclature Committee upon the request of the International CCN Society Scientific Committee. After careful consideration of the various arguments, and after polling the community of researchers who have published in the field over the past ten years, the HGNC have decided to adopt and approve the CCN nomenclature for all 6 genes. Effective October 2018, the genes referred to as CYR61, CTGF, NOV and WISP1-3 will be respectively designated by the gene symbols CCN1-6 with corresponding gene names « cellular communication network factor 1-6 ». We believe that this decision will be a step towards better communication between researchers working in the field, and will set the stage for fruitful collaborative projects. Accordingly, the Journal of Cell Communication and Signaling, the official journal of the International CCN Society, available both in print and online, constitutes a unique window into the CCN field. This official nomenclature will benefit the international scientific community that is supported by the established and renowned professionalism of the Springer-Nature publishing group.

13.
Circ Genom Precis Med ; 11(2): e001813, 2018 02.
Article in English | MEDLINE | ID: mdl-29440116

ABSTRACT

BACKGROUND: A systems biology approach to cardiac physiology requires a comprehensive representation of how coordinated processes operate in the heart, as well as the ability to interpret relevant transcriptomic and proteomic experiments. The Gene Ontology (GO) Consortium provides structured, controlled vocabularies of biological terms that can be used to summarize and analyze functional knowledge for gene products. METHODS AND RESULTS: In this study, we created a computational resource to facilitate genetic studies of cardiac physiology by integrating literature curation with attention to an improved and expanded ontological representation of heart processes in the Gene Ontology. As a result, the Gene Ontology now contains terms that comprehensively describe the roles of proteins in cardiac muscle cell action potential, electrical coupling, and the transmission of the electrical impulse from the sinoatrial node to the ventricles. Evaluating the effectiveness of this approach to inform data analysis demonstrated that Gene Ontology annotations, analyzed within an expanded ontological context of heart processes, can help to identify candidate genes associated with arrhythmic disease risk loci. CONCLUSIONS: We determined that a combination of curation and ontology development for heart-specific genes and processes supports the identification and downstream analysis of genes responsible for the spread of the cardiac action potential through the heart. Annotating these genes and processes in a structured format facilitates data analysis and supports effective retrieval of gene-centric information about cardiac defects.


Subject(s)
Gene Ontology , Heart Diseases , Proteomics , Computational Biology , Databases, Genetic , Heart , Heart Diseases/genetics , Humans , Molecular Sequence Annotation , Phenotype
14.
Nucleic Acids Res ; 45(D1): D619-D625, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27799471

ABSTRACT

The HUGO Gene Nomenclature Committee (HGNC) based at the European Bioinformatics Institute (EMBL-EBI) assigns unique symbols and names to human genes. Currently the HGNC database contains almost 40 000 approved gene symbols, over 19 000 of which represent protein-coding genes. In addition to naming genomic loci we manually curate genes into family sets based on shared characteristics such as homology, function or phenotype. We have recently updated our gene family resources and introduced new improved visualizations which can be seen alongside our gene symbol reports on our primary website http://www.genenames.org. In 2016 we expanded our remit and formed the Vertebrate Gene Nomenclature Committee (VGNC) which is responsible for assigning names to vertebrate species lacking a dedicated nomenclature group. Using the chimpanzee genome as a pilot project we have approved symbols and names for over 14 500 protein-coding genes in chimpanzee, and have developed a new website http://vertebrate.genenames.org to distribute these data. Here, we review our online data and resources, focusing particularly on the improvements and new developments made during the last two years.


Subject(s)
Databases, Genetic , Genes , Genome , Genomics/methods , Terminology as Topic , Vertebrates , Web Browser , Animals , Humans , Multigene Family , Search Engine
15.
J Biol Chem ; 291(46): 24036-24040, 2016 Nov 11.
Article in English | MEDLINE | ID: mdl-27645994

ABSTRACT

The human genome contains 25 genes coding for selenocysteine-containing proteins (selenoproteins). These proteins are involved in a variety of functions, most notably redox homeostasis. Selenoprotein enzymes with known functions are designated according to these functions: TXNRD1, TXNRD2, and TXNRD3 (thioredoxin reductases), GPX1, GPX2, GPX3, GPX4, and GPX6 (glutathione peroxidases), DIO1, DIO2, and DIO3 (iodothyronine deiodinases), MSRB1 (methionine sulfoxide reductase B1), and SEPHS2 (selenophosphate synthetase 2). Selenoproteins without known functions have traditionally been denoted by SEL or SEP symbols. However, these symbols are sometimes ambiguous and conflict with the approved nomenclature for several other genes. Therefore, there is a need to implement a rational and coherent nomenclature system for selenoprotein-encoding genes. Our solution is to use the root symbol SELENO followed by a letter. This nomenclature applies to SELENOF (selenoprotein F, the 15-kDa selenoprotein, SEP15), SELENOH (selenoprotein H, SELH, C11orf31), SELENOI (selenoprotein I, SELI, EPT1), SELENOK (selenoprotein K, SELK), SELENOM (selenoprotein M, SELM), SELENON (selenoprotein N, SEPN1, SELN), SELENOO (selenoprotein O, SELO), SELENOP (selenoprotein P, SeP, SEPP1, SELP), SELENOS (selenoprotein S, SELS, SEPS1, VIMP), SELENOT (selenoprotein T, SELT), SELENOV (selenoprotein V, SELV), and SELENOW (selenoprotein W, SELW, SEPW1). This system, approved by the HUGO Gene Nomenclature Committee, also resolves conflicting, missing, and ambiguous designations for selenoprotein genes and is applicable to selenoproteins across vertebrates.


Subject(s)
Selenoproteins/classification , Selenoproteins/genetics , Humans , Terminology as Topic
16.
Dis Model Mech ; 9(3): 245-52, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26935103

ABSTRACT

The use of Drosophila melanogaster as a model for studying human disease is well established, reflected by the steady increase in both the number and proportion of fly papers describing human disease models in recent years. In this article, we highlight recent efforts to improve the availability and accessibility of the disease model information in FlyBase (http://flybase.org), the model organism database for Drosophila. FlyBase has recently introduced Human Disease Model Reports, each of which presents background information on a specific disease, a tabulation of related disease subtypes, and summaries of experimental data and results using fruit flies. Integrated presentations of relevant data and reagents described in other sections of FlyBase are incorporated into these reports, which are specifically designed to be accessible to non-fly researchers in order to promote collaboration across model organism communities working in translational science. Another key component of disease model information in FlyBase is that data are collected in a consistent format, using the evolving Disease Ontology (an open-source standardized ontology for human-disease-associated biomedical data), to allow robust and intuitive searches. To facilitate this, FlyBase has developed a dedicated tool for querying and navigating relevant data, which include mutations that model a disease and any associated interacting modifiers. In this article, we describe how data related to fly models of human disease are presented in individual Gene Reports and in the Human Disease Model Reports. Finally, we discuss search strategies and new query tools that are available to access the disease model data in FlyBase.


Subject(s)
Biomedical Research , Databases, Genetic , Disease Models, Animal , Disease , Drosophila melanogaster/physiology , Amyotrophic Lateral Sclerosis/pathology , Animals , Humans
17.
Hum Genomics ; 10: 6, 2016 Feb 03.
Article in English | MEDLINE | ID: mdl-26842383

ABSTRACT

The HUGO Gene Nomenclature Committee (HGNC) approves unique gene symbols and names for human loci. As well as naming genomic loci, we manually curate genes into family sets based on shared characteristics such as function, homology or phenotype. Each HGNC gene family has its own dedicated gene family report on our website, www.genenames.org. We have recently redesigned these reports to support the visualisation and browsing of complex relationships between families and to provide extra curated information such as family descriptions, protein domain graphics and gene family aliases. Here, we review how our gene families are curated and explain how to view, search and download the gene family data.


Subject(s)
Databases, Genetic , Genomics , Neoplasm Proteins/genetics , Humans , Internet , Neoplasm Proteins/classification
18.
Article in English | MEDLINE | ID: mdl-25157073

ABSTRACT

Gene ontology (GO) annotation is a common task among model organism databases (MODs) for capturing gene function data from journal articles. It is a time-consuming and labor-intensive task, and is thus often considered as one of the bottlenecks in literature curation. There is a growing need for semiautomated or fully automated GO curation techniques that will help database curators to rapidly and accurately identify gene function information in full-length articles. Despite multiple attempts in the past, few studies have proven to be useful with regard to assisting real-world GO curation. The shortage of sentence-level training data and opportunities for interaction between text-mining developers and GO curators has limited the advances in algorithm development and corresponding use in practical circumstances. To this end, we organized a text-mining challenge task for literature-based GO annotation in BioCreative IV. More specifically, we developed two subtasks: (i) to automatically locate text passages that contain GO-relevant information (a text retrieval task) and (ii) to automatically identify relevant GO terms for the genes in a given article (a concept-recognition task). With the support from five MODs, we provided teams with >4000 unique text passages that served as the basis for each GO annotation in our task data. Such evidence text information has long been recognized as critical for text-mining algorithm development but was never made available because of the high cost of curation. In total, seven teams participated in the challenge task. From the team results, we conclude that the state of the art in automatically mining GO terms from literature has improved over the past decade while much progress is still needed for computer-assisted GO curation. Future work should focus on addressing remaining technical challenges for improved performance of automatic GO concept recognition and incorporating practical benefits of text-mining tools into real-world GO annotation. DATABASE URL: http://www.biocreative.org/tasks/biocreative-iv/track-4-GO/.


Subject(s)
Computational Biology/methods , Data Mining , Gene Ontology , Molecular Sequence Annotation/methods , Algorithms , Humans , Reproducibility of Results
19.
Article in English | MEDLINE | ID: mdl-25070993

ABSTRACT

Gene function curation via Gene Ontology (GO) annotation is a common task among Model Organism Database groups. Owing to its manual nature, this task is considered one of the bottlenecks in literature curation. There have been many previous attempts at automatic identification of GO terms and supporting information from full text. However, few systems have delivered an accuracy that is comparable with humans. One recognized challenge in developing such systems is the lack of marked sentence-level evidence text that provides the basis for making GO annotations. We aim to create a corpus that includes the GO evidence text along with the three core elements of GO annotations: (i) a gene or gene product, (ii) a GO term and (iii) a GO evidence code. To ensure our results are consistent with real-life GO data, we recruited eight professional GO curators and asked them to follow their routine GO annotation protocols. Our annotators marked up more than 5000 text passages in 200 articles for 1356 distinct GO terms. For evidence sentence selection, the inter-annotator agreement (IAA) results are 9.3% (strict) and 42.7% (relaxed) in F1-measures. For GO term selection, the IAAs are 47% (strict) and 62.9% (hierarchical). Our corpus analysis further shows that abstracts contain ∼10% of relevant evidence sentences and 30% distinct GO terms, while the Results/Experiment section has nearly 60% relevant sentences and >70% GO terms. Further, of those evidence sentences found in abstracts, less than one-third contain enough experimental detail to fulfill the three core criteria of a GO annotation. This result demonstrates the need of using full-text articles for text mining GO annotations. Through its use at the BioCreative IV GO (BC4GO) task, we expect our corpus to become a valuable resource for the BioNLP research community. Database URL: http://www.biocreative.org/resources/corpora/bc-iv-go-task-corpus/.


Subject(s)
Data Mining/methods , Databases, Genetic , Molecular Sequence Annotation , Software , Vocabulary, Controlled , Computational Biology/methods , Humans
20.
PLoS One ; 9(6): e99864, 2014.
Article in English | MEDLINE | ID: mdl-24941002

ABSTRACT

Gene Ontology (GO) provides dynamic controlled vocabularies to aid in the description of the functional biological attributes and subcellular locations of gene products from all taxonomic groups (www.geneontology.org). Here we describe collaboration between the renal biomedical research community and the GO Consortium to improve the quality and quantity of GO terms describing renal development. In the associated annotation activity, the new and revised terms were associated with gene products involved in renal development and function. This project resulted in a total of 522 GO terms being added to the ontology and the creation of approximately 9,600 kidney-related GO term associations to 940 UniProt Knowledgebase (UniProtKB) entries, covering 66 taxonomic groups. We demonstrate the impact of these improvements on the interpretation of GO term analyses performed on genes differentially expressed in kidney glomeruli affected by diabetic nephropathy. In summary, we have produced a resource that can be utilized in the interpretation of data from small- and large-scale experiments investigating molecular mechanisms of kidney function and development and thereby help towards alleviating renal disease.


Subject(s)
Gene Ontology , Kidney/embryology , Kidney/metabolism , Animals , Databases, Genetic , Databases, Protein , Humans , Mice , Molecular Sequence Annotation , Species Specificity , Statistics as Topic