Results 1 - 19 of 19
2.
PLoS One ; 18(5): e0285433, 2023.
Article in English | MEDLINE | ID: mdl-37196000

ABSTRACT

The Global Alliance for Genomics and Health (GA4GH) is a standards-setting organization that is developing a suite of coordinated standards for genomics. The GA4GH Phenopacket Schema is a standard for sharing disease and phenotype information that characterizes an individual person or biosample. The Phenopacket Schema is flexible and can represent clinical data for any kind of human disease including rare disease, complex disease, and cancer. It also allows consortia or databases to apply additional constraints to ensure uniform data collection for specific goals. We present phenopacket-tools, an open-source Java library and command-line application for construction, conversion, and validation of phenopackets. Phenopacket-tools simplifies construction of phenopackets by providing concise builders, programmatic shortcuts, and predefined building blocks (ontology classes) for concepts such as anatomical organs, age of onset, biospecimen type, and clinical modifiers. Phenopacket-tools can be used to validate the syntax and semantics of phenopackets as well as to assess adherence to additional user-defined requirements. The documentation includes examples showing how to use the Java library and the command-line tool to create and validate phenopackets. We demonstrate how to create, convert, and validate phenopackets using the library or the command-line application. Source code, API documentation, comprehensive user guide and a tutorial can be found at https://github.com/phenopackets/phenopacket-tools. The library can be installed from the public Maven Central artifact repository and the application is available as a standalone archive. The phenopacket-tools library helps developers implement and standardize the collection and exchange of phenotypic and other clinical data for use in phenotype-driven genomic diagnostics, translational research, and precision medicine applications.
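
Illustrative sketch (not from the article): a minimal phenopacket assembled with the protobuf-generated builders of the GA4GH Phenopacket Schema v2, which the concise phenopacket-tools builders wrap. The package names and fields below follow the schema but are assumptions rather than verbatim phenopacket-tools API; a complete phenopacket would also carry a MetaData block describing the ontologies used.

// Sketch only: builds a minimal Phenopacket with protobuf builders.
// Package names and fields are assumptions based on phenopacket-schema v2.
import org.phenopackets.schema.v2.Phenopacket;
import org.phenopackets.schema.v2.core.Individual;
import org.phenopackets.schema.v2.core.OntologyClass;
import org.phenopackets.schema.v2.core.PhenotypicFeature;

public class PhenopacketSketch {
    public static void main(String[] args) {
        // Ontology class for an observed phenotype (HPO term for "Seizure").
        OntologyClass seizure = OntologyClass.newBuilder()
                .setId("HP:0001250")
                .setLabel("Seizure")
                .build();

        Phenopacket packet = Phenopacket.newBuilder()
                .setId("example-phenopacket-1")          // arbitrary identifier
                .setSubject(Individual.newBuilder()
                        .setId("patient-1")
                        .build())
                .addPhenotypicFeatures(PhenotypicFeature.newBuilder()
                        .setType(seizure)
                        .build())
                .build();

        // A real phenopacket should also include MetaData; protobuf messages
        // can then be serialized to JSON and checked with the validators.
        System.out.println(packet);
    }
}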


Subjects
Neoplasms; Software; Humans; Genomics; Databases, Factual; Gene Library
3.
Eur Radiol Exp ; 7(1): 20, 2023 05 08.
Article in English | MEDLINE | ID: mdl-37150779

ABSTRACT

Artificial intelligence (AI) is transforming the field of medical imaging and has the potential to bring medicine from the era of 'sick-care' to the era of healthcare and prevention. The development of AI requires access to large, complete, and harmonized real-world datasets that are representative of the population and of disease diversity. To date, however, efforts are fragmented and based on single-institution, size-limited, and annotation-limited datasets. Available public datasets (e.g., The Cancer Imaging Archive, TCIA, USA) are limited in scope, making model generalizability difficult. In this direction, five European Union projects are currently developing big data infrastructures that will enable European, ethically and General Data Protection Regulation-compliant, quality-controlled, cancer-related medical imaging platforms in which both large-scale data and AI algorithms will coexist. The vision is to create sustainable, cloud-based AI platforms for the development, implementation, verification, and validation of trustable, usable, and reliable AI models that address specific unmet needs in cancer care provision. In this paper, we present an overview of these development efforts, highlighting the challenges encountered and the approaches selected, to provide valuable feedback for future work in the area.
Key points:
• Artificial intelligence models for health imaging require access to large amounts of harmonized imaging data and metadata.
• The main infrastructures adopted either collect centrally anonymized data or enable access to pseudonymized distributed data.
• Developing a common data model for storing all relevant information is a challenge.
• The trust of data providers in data-sharing initiatives is essential.
• An online European Union meta-tool repository is needed to minimize effort duplication across projects in the area.


Subjects
Artificial Intelligence; Neoplasms; Humans; Diagnostic Imaging; Forecasting; Big Data
4.
Bioinformatics ; 38(19): 4656-4657, 2022 09 30.
Article in English | MEDLINE | ID: mdl-35980167

ABSTRACT

SUMMARY: Beacon v2 is an API specification established by the Global Alliance for Genomics and Health initiative (GA4GH) that defines a standard for federated discovery of genomic and phenotypic data. Here, we present the Beacon v2 Reference Implementation (B2RI), a set of open-source software tools that allow 'lighting up' a local Beacon instance out of the box. Along with the software, we have created detailed 'Read the Docs' documentation that includes information on deployment and installation. AVAILABILITY AND IMPLEMENTATION: The B2RI is released under the GNU General Public License v3.0 and the Apache License v2.0. Documentation and source code are available at: https://b2ri-documentation.readthedocs.io. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
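
Illustrative sketch (not from the article): what a simple allele query against a locally deployed B2RI-style beacon might look like from Java. The base URL, endpoint path, and parameter names are assumptions modeled loosely on the Beacon v2 specification; consult the B2RI 'Read the Docs' documentation for the exact interface of a given deployment.

// Sketch: query a hypothetical Beacon v2 deployment for a single variant.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BeaconQuerySketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical deployment URL and parameter names.
        String query = "https://beacon.example.org/api/g_variants"
                + "?assemblyId=GRCh38"
                + "&referenceName=17"
                + "&start=43045702"
                + "&referenceBases=G"
                + "&alternateBases=A";

        HttpRequest request = HttpRequest.newBuilder(URI.create(query))
                .header("Accept", "application/json")
                .GET()
                .build();

        // A boolean-granularity response essentially answers "yes"/"no":
        // does the queried variant exist in the datasets behind the beacon?
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}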


Subjects
Genome; Genomics; Software; Documentation
5.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35438138

ABSTRACT

Since its launch in 2008, the European Genome-Phenome Archive (EGA) has been leading the archiving and distribution of identifiable human genomic data. One community concern in this regard is the potential usability of the stored data: at present, data submitters are not required to perform any quality control (QC) before uploading their data and associated metadata. Here, we present a new File QC Portal developed at the EGA, along with QC reports created for 1,694,442 files [FASTQ, sequence alignment map (SAM)/binary alignment map (BAM)/CRAM and variant call format (VCF)] submitted to the EGA. QC reports allow anonymous EGA users to view summary-level information about the files within a specific dataset, such as read quality, alignment quality, and the number and type of variants, among other features. Researchers benefit from being able to assess the quality of data before requesting access, thereby increasing the reusability of the data (https://ega-archive.org/blog/data-upcycling-powered-by-ega/).


Subjects
Genome; Genomics; High-Throughput Nucleotide Sequencing; Humans; Metadata; Quality Control; Software
6.
Hum Mutat ; 43(6): 791-799, 2022 06.
Article in English | MEDLINE | ID: mdl-35297548

ABSTRACT

Beacon is a basic data discovery protocol issued by the Global Alliance for Genomics and Health (GA4GH). The main goal of version 1 of the Beacon protocol was to test the feasibility of broadly sharing human genomic data by providing simple "yes" or "no" responses to queries about the presence of a given variant in datasets hosted by Beacon providers. The popularity of this concept has fostered the design of version 2, which better serves real-world requirements and addresses the needs of clinical genomics research and healthcare, as assessed by several contributing projects and organizations. In particular, rare disease genetics and cancer research will benefit from new case-level and genomic variant-level requests, richer phenotype and clinical queries, and support for fuzzy searches. Beacon is designed as a "lingua franca" to bridge data collections hosted in software solutions with different and rich interfaces. Beacon version 2 works alongside popular standards like Phenopackets, OMOP, and FHIR, allowing implementing consortia to return matches in Beacon responses and to provide a handover to their preferred data exchange format. The protocol is being explored by other research domains and is being tested in several international projects.
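
Illustrative sketch (not from the article): a Beacon v2-style filtered query asking, at boolean granularity, whether any individuals in the hosted datasets match a given HPO phenotype term. The endpoint and request-body layout are assumptions loosely following the Beacon v2 framework and may differ in real deployments.

// Sketch: POST a filtered, boolean-granularity query to a hypothetical beacon.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BeaconFilterSketch {
    public static void main(String[] args) throws Exception {
        // Request body shape is an assumption based on the Beacon v2 framework.
        String body = """
            {
              "meta": { "apiVersion": "2.0" },
              "query": {
                "filters": [ { "id": "HP:0001250" } ],
                "requestedGranularity": "boolean"
              }
            }
            """;

        HttpRequest request = HttpRequest.newBuilder(
                        URI.create("https://beacon.example.org/api/individuals"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // The boolean reply is the modern form of the original "yes"/"no" response.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + ": " + response.body());
    }
}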


Subjects
Genomics; Information Dissemination; Humans; Information Dissemination/methods; Phenotype; Rare Diseases; Software
7.
Hum Mutat ; 43(6): 717-733, 2022 06.
Article in English | MEDLINE | ID: mdl-35178824

ABSTRACT

Thanks to the wide adoption of next-generation sequencing, rare disease patients are now more likely to receive a rapid molecular diagnosis. However, many cases remain undiagnosed even after exome or genome analysis, because the methods used missed the molecular cause in a known gene, or because a novel causative gene could not be identified and/or confirmed. To address these challenges, the RD-Connect Genome-Phenome Analysis Platform (GPAP) facilitates the collation, discovery, sharing, and analysis of standardized genome-phenome data within a collaborative environment. Authorized clinicians and researchers submit pseudonymised phenotypic profiles encoded using the Human Phenotype Ontology, together with raw genomic data that are processed through a standardized pipeline. After an optional embargo period, the data are shared with other platform users, with the objective that similar cases in the system and queries from peers may help diagnose the case. Additionally, the platform enables bidirectional discovery of similar cases in other databases from the Matchmaker Exchange network. To facilitate genome-phenome analysis and interpretation by clinical researchers, the RD-Connect GPAP provides a powerful, user-friendly interface and draws on dozens of information sources. As a result, the resource has already helped diagnose hundreds of rare disease patients and discover new disease-causing genes.


Subjects
Genomics; Rare Diseases; Exome; Genetic Association Studies; Genomics/methods; Humans; Phenotype; Rare Diseases/diagnosis; Rare Diseases/genetics
8.
Nucleic Acids Res ; 50(D1): D980-D987, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34791407

ABSTRACT

The European Genome-phenome Archive (EGA - https://ega-archive.org/) is a resource for the long-term, secure archiving of all types of potentially identifiable genetic, phenotypic, and clinical data resulting from biomedical research projects. Its mission is to foster reuse of the hosted data, enable reproducibility, and accelerate biomedical and translational research in line with the FAIR principles. Launched in 2008, the EGA has grown quickly and currently archives over 4,500 studies from nearly one thousand institutions. The EGA operates a distributed data access model in which requests are made to the data controller, not to the EGA; the submitter therefore retains control over who has access to the data and under which conditions. Given the size and value of the data hosted, the EGA is constantly improving its value chain, that is, how the EGA can contribute to enhancing the value of human health data by facilitating its submission, discovery, access, and distribution, as well as by leading the design and implementation of the standards and methods necessary to deliver that value chain. The EGA has become a key GA4GH Driver Project, leading multiple development efforts and implementing new standards and tools, and has been designated an ELIXIR Core Data Resource.


Assuntos
Confidencialidade/legislação & jurisprudência , Genoma Humano , Disseminação de Informação/métodos , Fenômica/organização & administração , Pesquisa Translacional Biomédica/métodos , Conjuntos de Dados como Assunto , Genótipo , História do Século XX , História do Século XXI , Humanos , Disseminação de Informação/ética , Metadados/ética , Metadados/estatística & dados numéricos , Fenômica/história , Fenótipo
9.
Cell Genom ; 1(2), 2021 Nov 10.
Article in English | MEDLINE | ID: mdl-34820659

ABSTRACT

Human biomedical datasets that are critical for research and clinical studies to benefit human health often also contain sensitive or potentially identifying information about individual participants. Thus, care must be taken when they are processed and made available, to comply with ethical and regulatory frameworks and informed consent data conditions. To enable and streamline data access for these biomedical datasets, the Global Alliance for Genomics and Health (GA4GH) Data Use and Researcher Identities (DURI) work stream developed and approved the Data Use Ontology (DUO) standard. DUO is a hierarchical vocabulary of human- and machine-readable data use terms that consistently and unambiguously represents a dataset's allowable data uses. DUO has been implemented by major international stakeholders such as the Broad and Sanger Institutes and is currently used in the annotation of over 200,000 datasets worldwide. Using DUO in data management and access facilitates researchers' discovery of, and access to, relevant datasets. DUO annotations increase the FAIRness of datasets and support data linkages using common data use profiles when integrating the data for secondary analyses. DUO is implemented in the Web Ontology Language (OWL) and, to increase community awareness and engagement, is hosted in an open, centralized GitHub repository. DUO, together with the GA4GH Passport standard, offers a new, efficient, and streamlined data authorization and access framework that has enabled increased sharing of biomedical datasets worldwide.
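
Illustrative sketch (not from the article): a toy example of how machine-readable DUO annotations can drive dataset discovery and access checks. The dataset accessions are hypothetical, the DUO identifiers are quoted for illustration only (verify their labels against the ontology), and the matching logic is deliberately simplistic.

// Toy sketch: datasets annotated with DUO terms, filtered by a proposed use.
import java.util.Map;
import java.util.Set;

public class DuoAnnotationSketch {
    public static void main(String[] args) {
        // Dataset catalogue: hypothetical accession -> allowed data-use terms.
        Map<String, Set<String>> catalogue = Map.of(
                "EGAD00000000001", Set.of("DUO:0000042"),   // e.g. general research use
                "EGAD00000000002", Set.of("DUO:0000007")    // e.g. disease-specific research
        );

        String proposedUse = "DUO:0000042";

        // Simple filter: which datasets permit the proposed use?
        catalogue.forEach((accession, terms) -> {
            if (terms.contains(proposedUse)) {
                System.out.println(accession + " is discoverable for " + proposedUse);
            }
        });
    }
}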

10.
Evolution ; 75(6): 1288-1303, 2021 06.
Article in English | MEDLINE | ID: mdl-33844299

ABSTRACT

Due to their effects on reducing recombination, chromosomal inversions may play an important role in speciation by establishing and/or maintaining linked blocks of genes causing reproductive isolation (RI) between populations. This view fits empirical data indicating that inversions typically harbor loci involved in RI. However, previous computer simulations of infinite populations with two to four loci involved in RI implied that, even with gene flux as low as 10⁻⁸ per gamete per generation between alternative arrangements, inversions may not have large, qualitative advantages over collinear regions in maintaining population differentiation after secondary contact. Here, we report that finite population sizes can help counteract the homogenizing consequences of gene flux, especially when several fitness-related loci reside within the inversion. In these cases, the persistence time of differentiation after secondary contact can be similar to when gene flux is absent and notably longer than the persistence time without inversions. Thus, despite gene flux, population differentiation may be maintained for up to 100,000 generations, during which time new incompatibilities and/or local adaptations might accumulate and facilitate progress toward speciation. How often these conditions are met in nature remains to be determined.


Subjects
Chromosome Inversion; Genetic Drift; Genetic Speciation; Models, Genetic; Adaptation, Physiological/genetics; Computer Simulation; Reproductive Isolation
11.
Cell Genom ; 1(2), 2021 Nov 10.
Article in English | MEDLINE | ID: mdl-35128509

ABSTRACT

We promote a shared vision and guide for how and when to federate genomic and health-related data sharing, enabling connections and insights across independent, secure databases. The GA4GH encourages a federated approach wherein data providers have the mandate and resources to share, but where data cannot move for legal or technical reasons. We recommend a federated approach to connect national genomics initiatives into a global network and precision medicine resource.

12.
Bioinformatics ; 36(3): 890-896, 2020 02 01.
Article in English | MEDLINE | ID: mdl-31393550

ABSTRACT

MOTIVATION: Association studies based on SNP arrays and next-generation sequencing technologies have enabled the discovery of thousands of genetic loci related to human diseases. Nevertheless, their biological interpretation is still elusive, and their medical applications limited. Recently, various tools have been developed to help bridge the gap between genomes and phenomes. To our knowledge, however, none of these tools allows users to retrieve the phenotype-wide list of genetic variants that may be linked to a given disease, or to visually explore the joint genetic architecture of different pathologies. RESULTS: We present the Genome-Phenome Explorer (GePhEx), a web tool that eases the visual exploration of phenotypic relationships supported by genetic evidence. GePhEx is primarily based on the thorough analysis of linkage disequilibrium between disease-associated variants, and also considers relationships based on genes, pathways, or drug targets, leveraging publicly available variant-disease associations to detect potential relationships between diseases. We demonstrate that GePhEx retrieves well-known relationships as well as novel ones, and thus might help shed light on the pathophysiological mechanisms underlying complex diseases. To this end, we investigate the potential relationship between schizophrenia and lung cancer, first detected using GePhEx, and provide further evidence supporting a functional link between them. AVAILABILITY AND IMPLEMENTATION: GePhEx is available at: https://gephex.ega-archive.org/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome-Wide Association Study; Genome; High-Throughput Nucleotide Sequencing; Humans; Phenomics; Phenotype; Software
14.
Nat Rev Genet ; 20(11): 693-701, 2019 11.
Article in English | MEDLINE | ID: mdl-31455890

ABSTRACT

Human genomics is undergoing a step change from being a predominantly research-driven activity to one driven through health care as many countries in Europe now have nascent precision medicine programmes. To maximize the value of the genomic data generated, these data will need to be shared between institutions and across countries. In recognition of this challenge, 21 European countries recently signed a declaration to transnationally share data on at least 1 million human genomes by 2022. In this Roadmap, we identify the challenges of data sharing across borders and demonstrate that European research infrastructures are well-positioned to support the rapid implementation of widespread genomic data access.


Subjects
Biomedical Research; Genome, Human; Human Genome Project; Europe; Humans
16.
Nat Biotechnol ; 37(4): 480, 2019 04.
Article in English | MEDLINE | ID: mdl-30894680

ABSTRACT

In the version of this article initially published, Lena Dolman's second affiliation was given as Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK. The correct second affiliation is Ontario Institute for Cancer Research, Toronto, Ontario, Canada. The error has been corrected in the HTML and PDF versions of the article.

17.
F1000Res ; 6, 2017.
Article in English | MEDLINE | ID: mdl-29123641

ABSTRACT

The availability of high-throughput molecular profiling techniques has provided more accurate and informative data for routine clinical studies. Nevertheless, complex computational workflows are required to interpret these data. Over recent years, the data volume has been growing explosively, requiring robust management of human data to organise and integrate it efficiently. For this reason, we set up an ELIXIR implementation study, together with the Translational Research IT (TraIT) programme, to design a data ecosystem that is able to link raw and interpreted data. In this project, the data from the TraIT Cell Line Use Case (TraIT-CLUC) are used as a test case for this system. Within this ecosystem, we use the European Genome-phenome Archive (EGA) to store raw molecular profiling data; tranSMART to collect interpreted molecular profiling data and clinical data for corresponding samples; and Galaxy to store, run and manage the computational workflows. We can integrate these data by linking their repositories systematically. To showcase our design, we have structured the TraIT-CLUC data, which contain a variety of molecular profiling data types, for storage in both tranSMART and EGA. The metadata provided allows referencing between tranSMART and EGA, fulfilling the cycle of data submission and discovery; we have also designed a data flow from EGA to Galaxy, enabling reanalysis of the raw data in Galaxy. In this way, users can select patient cohorts in tranSMART, trace them back to the raw data and perform (re)analysis in Galaxy. Our conclusion is that the majority of the metadata does not necessarily need to be stored (redundantly) in both databases; instead, FAIR persistent identifiers should be available for well-defined data ontology levels: study, data access committee, physical sample, data sample and raw data file. This approach will pave the way for the stable linkage and reuse of data.

18.
F1000Res ; 5, 2016.
Article in English | MEDLINE | ID: mdl-28232859

ABSTRACT

High-throughput molecular profiling techniques routinely generate vast amounts of data for translational medicine studies. Secure, access-controlled systems are needed to manage, store, transfer and distribute these data due to their personally identifiable nature. The European Genome-phenome Archive (EGA) was created to facilitate access to, and management of, long-term archives of bio-molecular data. Each data provider is responsible for ensuring that a Data Access Committee is in place to grant access to data stored in the EGA. Moreover, the transfer of data during upload and download is encrypted. ELIXIR, a European research infrastructure for life-science data, initiated a project (the 2016 Human Data Implementation Study) to understand and document the ELIXIR requirements for secure management of controlled-access data. As part of this project, a full ecosystem was designed to connect archived raw experimental molecular profiling data with interpreted data and the computational workflows, using the CTMM Translational Research IT (CTMM-TraIT) infrastructure (http://www.ctmm-trait.nl) as an example. Here we present the first outcome of this project: a framework to enable the download of EGA data to a Galaxy server in a secure way. Galaxy provides an intuitive user interface for molecular biologists and bioinformaticians to run and design data analysis workflows. More specifically, we developed a tool, ega_download_streamer, that can securely download data from EGA into a Galaxy server, where it can subsequently be further processed. This tool allows a user to run, within the browser, an entire analysis involving sensitive data from EGA, and to make this analysis available to other researchers in a reproducible manner, as shown with a proof-of-concept study. The tool ega_download_streamer is available in the Galaxy tool shed: https://toolshed.g2.bx.psu.edu/view/yhoogstrate/ega_download_streamer.

19.
Genome Biol Evol ; 7(6): 1490-505, 2015 May 14.
Article in English | MEDLINE | ID: mdl-25977458

ABSTRACT

We set out to investigate potential differences and similarities between the selective forces acting upon the coding and noncoding regions of five different sets of genes defined according to functional and evolutionary criteria: 1) two reference gene sets presenting accelerated and slow rates of protein evolution (the Complement and Actin pathways); 2) a set of genes with evidence of accelerated evolution in at least one of their introns; and 3) two gene sets related to neurological function (Parkinson's and Alzheimer's diseases). To that effect, we combine human-chimpanzee divergence patterns with polymorphism data obtained from targeted resequencing of 20 central chimpanzees, our closest relatives with the largest long-term effective population size. By using the distribution of fitness effects (DFE)-alpha extension of the McDonald-Kreitman test, we reproduce inferences of rates of evolution previously based only on divergence data for both coding and intronic sequences, and also obtain inferences for other classes of genomic elements (untranslated regions, promoters, and conserved noncoding sequences). Our results suggest that 1) the DFE-alpha method helps distinguish different scenarios of accelerated divergence (adaptation or relaxed selective constraints) and 2) the adaptive history of coding and noncoding sequences within the gene sets analyzed is decoupled.
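
For context (not stated in the record; a standard textbook formulation): the McDonald-Kreitman logic on which DFE-alpha builds estimates the proportion of adaptive substitutions at a selected class of sites as

    \alpha \;=\; 1 - \frac{D_{\mathrm{neu}}\, P_{\mathrm{sel}}}{D_{\mathrm{sel}}\, P_{\mathrm{neu}}}

where D and P are divergence and polymorphism counts at selected (sel) and neutral reference (neu) sites; the DFE-alpha extension additionally uses the inferred distribution of fitness effects to correct for segregating slightly deleterious variants.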


Subjects
Evolution, Molecular; Pan troglodytes/genetics; Selection, Genetic; Actins/genetics; Animals; Complement System Proteins/genetics; Genes; Humans; Introns; Mutation; Open Reading Frames; Polymorphism, Single Nucleotide; Promoter Regions, Genetic; Untranslated Regions