ABSTRACT
The European Genome-phenome Archive (EGA, https://ega-archive.org/) is a resource for the long-term, secure archiving of all types of potentially identifiable genetic, phenotypic, and clinical data resulting from biomedical research projects. Its mission is to foster reuse of the data it hosts, enable reproducibility, and accelerate biomedical and translational research in line with the FAIR principles. Launched in 2008, the EGA has grown quickly and currently archives over 4,500 studies from nearly one thousand institutions. The EGA operates a distributed data access model in which requests are made to the data controller, not to the EGA; the submitter therefore retains control over who has access to the data and under which conditions. Given the size and value of the data hosted, the EGA is constantly improving its value chain, that is, the ways in which it enhances the value of human health data by facilitating their submission, discovery, access, and distribution, and by leading the design and implementation of the standards and methods needed to deliver that value chain. The EGA has become a key GA4GH Driver Project, leading multiple development efforts and implementing new standards and tools, and has been appointed an ELIXIR Core Data Resource.
Subject(s)
Confidentiality/legislation & jurisprudence; Genome, Human; Information Dissemination/methods; Phenomics/organization & administration; Translational Biomedical Research/methods; Datasets as Topic; Genotype; History, 20th Century; History, 21st Century; Humans; Information Dissemination/ethics; Metadata/ethics; Metadata/statistics & numerical data; Phenomics/history; Phenotype
ABSTRACT
The objectives of the Solve-RD project include solving undiagnosed rare diseases (RD) through collaborative research on shared genome-phenome datasets. The RD-Connect Genome-Phenome Analysis Platform (GPAP), for data collation and analysis, and the European Genome-Phenome Archive (EGA), for file storage, are two key components of the Solve-RD infrastructure. Clinical researchers can identify candidate genetic variants within the RD-Connect GPAP and, thanks to the developments presented here as part of joint ELIXIR activities, can remotely visualize the corresponding alignments stored at the EGA. The Global Alliance for Genomics and Health (GA4GH) htsget streaming application programming interface (API) is used to retrieve alignment slices, which are rendered by an Integrative Genomics Viewer (IGV) instance embedded in the GPAP. As a result, for over 11,000 datasets it is no longer necessary to download large alignment files in order to visualize them locally. This work highlights the advantages, from both the user and infrastructure perspectives, of implementing interoperability standards for establishing federated genomics data networks.
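
To illustrate how an alignment slice might be requested over htsget, the following minimal Python sketch builds a "reads" request URL for one genomic interval and unpacks the JSON ticket that an htsget server returns. It is not the GPAP or EGA implementation: the server address `https://htsget.example.org`, the record identifier `EXAMPLE0001`, and the sample ticket are hypothetical placeholders, and authentication is omitted entirely.

```python
import json
from urllib.parse import quote, urlencode


def build_htsget_reads_url(base_url, record_id, reference_name, start, end, fmt="BAM"):
    """Build an htsget 'reads' request URL for one genomic interval.

    In the htsget protocol, start is 0-based inclusive and end is
    0-based exclusive.
    """
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    query = urlencode({
        "format": fmt,
        "referenceName": reference_name,
        "start": start,
        "end": end,
    })
    return f"{base_url.rstrip('/')}/reads/{quote(record_id, safe='')}?{query}"


def parse_ticket(ticket_json):
    """Return the (url, headers) pairs listed in an htsget ticket.

    A client fetches each URL in order and concatenates the payloads
    to obtain the requested slice.
    """
    ticket = json.loads(ticket_json)["htsget"]
    return [(block["url"], block.get("headers", {})) for block in ticket["urls"]]


# Hypothetical server and record identifier, for illustration only.
url = build_htsget_reads_url(
    "https://htsget.example.org", "EXAMPLE0001", "chr1", 100, 200
)

# Hypothetical ticket in the shape the protocol describes.
sample_ticket = json.dumps({
    "htsget": {
        "format": "BAM",
        "urls": [
            {"url": "https://data.example.org/block/0",
             "headers": {"Range": "bytes=0-65535"}},
            {"url": "https://data.example.org/block/1"},
        ],
    }
})
blocks = parse_ticket(sample_ticket)
```

Because only the blocks covering the requested interval are transferred, a viewer such as IGV can render a region without the full alignment file ever being downloaded.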