1.
Nucleic Acids Res ; 50(D1): D1216-D1220, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34718739

ABSTRACT

The European Variation Archive (EVA; https://www.ebi.ac.uk/eva/) is a resource for sharing all types of genetic variation data (SNPs, indels, and structural variants) for all species. The EVA was created in 2014 to provide FAIR access to genetic variation data and has since grown to be a primary resource for genomic variants hosting >3 billion records. The EVA and dbSNP have established a compatible global system to assign unique identifiers to all submitted genetic variants. The EVA is active within the Global Alliance for Genomics and Health (GA4GH), maintaining, contributing to and implementing standards such as VCF, Refget and Variant Representation Specification (VRS). In this article, we describe the submission and permanent accessioning services along with the different ways the data can be retrieved by the scientific community.


Subject(s)
Computational Biology , Databases, Genetic , Genetic Variation/genetics , Software , Animals , Genomic Structural Variation/genetics , Genomics , Humans , INDEL Mutation/genetics , Molecular Sequence Annotation , Polymorphism, Single Nucleotide/genetics
2.
Bioinform Adv ; 4(1): vbae018, 2024.
Article in English | MEDLINE | ID: mdl-38384863

ABSTRACT

Summary: Semantic ontology mapping of clinical descriptors with disease outcome is essential. ClinVar is a key resource for human variation with known clinical significance. We present CMAT, a software toolkit and curation protocol for accurately enriching ClinVar releases with disease ontology associations and complex functional consequences. Availability and implementation: The software and ontology mappings can be obtained from: https://github.com/EBIvariation/CMAT.

3.
F1000Res ; 11, 2022.
Article in English | MEDLINE | ID: mdl-35811804

ABSTRACT

In this opinion article, we discuss the formatting of files from (plant) genotyping studies, in particular the formatting of (meta)data in Variant Call Format (VCF) files. The flexibility of the VCF format specification facilitates its use as a generic interchange format across domains but can lead to inconsistency between files in the presentation of metadata. To enable fully autonomous machine-actionable data flow, generic elements need to be further specified. We strongly support the merits of the FAIR principles and see the need to facilitate them also through technical implementation specifications. VCF files are an established standard for the exchange and publication of genotyping data. Other data formats are also used to capture variant call data (for example, the HapMap format and the gVCF format), but none currently have the reach of VCF. In VCF, only the sites of variation are described, whereas in gVCF, all positions are listed, and confidence values are also provided. For the sake of simplicity, we will only discuss VCF and our recommendations for its use. However, the part of the VCF standard relating to metadata (as opposed to the actual variant calls) defines a syntactic format but no vocabulary, unique identifier or recommended content. In practice, often only sparse (if any) descriptive metadata is included. When descriptive metadata is provided, it frequently includes proprietary fields that have not been agreed upon within the community, which may limit long-term and comprehensive interoperability. To address this, we propose recommendations for supplying and encoding metadata, focusing on use cases from the plant sciences. We expect there to be overlap, but also divergence, with the needs of other domains.
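The remark that the VCF standard fixes a syntax for metadata but no vocabulary can be illustrated with a minimal Python sketch. It is not taken from the article: the function name `parse_vcf_meta` and the sample header are hypothetical, and the only assumption about the format is the `##key=value` and `##key=<...>` layout of meta-information lines. Any key parses equally well, which is why agreed field names have to come from community conventions rather than from the parser.

```python
import re

def parse_vcf_meta(lines):
    """Collect '##key=value' meta-information lines from a VCF header.

    Hypothetical helper for illustration; it does not handle escaped
    quotes inside quoted values.
    """
    meta = {}
    for line in lines:
        if not line.startswith("##"):
            break  # '#CHROM' column header or data line: metadata is over
        key, _, value = line[2:].rstrip("\n").partition("=")
        if value.startswith("<") and value.endswith(">"):
            # Structured line: read key=value pairs, keeping quoted commas
            pairs = re.findall(r'(\w+)=("[^"]*"|[^,]*)', value[1:-1])
            value = {k: v.strip('"') for k, v in pairs}
        meta.setdefault(key, []).append(value)
    return meta

# Illustrative header, not from a real file
header = [
    '##fileformat=VCFv4.3',
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth, all samples">',
    '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO',
]
meta = parse_vcf_meta(header)
```

Here `meta["INFO"][0]` is a dictionary of the structured fields, while an unstructured line such as `fileformat` is stored as a plain string. The parser accepts a made-up key just as readily as a standard one, so consistency across files depends on the kind of shared recommendations the authors propose.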


Subject(s)
Metadata , Software , Genotype