ABSTRACT
DECIPHER (https://decipher.sanger.ac.uk) is a web-based platform for secure deposition, analysis, and sharing of plausibly pathogenic genomic variants from well-phenotyped patients suffering from genetic disorders. DECIPHER aids clinical interpretation of these rare sequence and copy-number variants by providing tools for variant analysis and identification of other patients exhibiting similar genotype-phenotype characteristics. DECIPHER also provides mechanisms to encourage collaboration among a global community of clinical centers and researchers, as well as exchange of information between clinicians and researchers within a consortium, to accelerate discovery and diagnosis. DECIPHER has contributed to matchmaking efforts by enabling the global clinical genetics community to identify many previously undiagnosed syndromes and new disease genes, and has facilitated over 700 peer-reviewed scientific publications since 2004. At the time of writing, DECIPHER contains anonymized data from ~250 registered centers on more than 51,500 patients (~18,000 patients with consent for data sharing and ~25,000 anonymized records shared privately). In this paper, we describe salient features of the platform, with special emphasis on the tools and processes that aid interpretation, sharing, and effective matchmaking with other data held in the database and that make DECIPHER an invaluable clinical and research resource.
Subject(s)
Genetic Predisposition to Disease/genetics; Information Dissemination/methods; Rare Diseases/genetics; Databases, Genetic; Genetic Variation; Humans; Phenotype; Software; User-Computer Interface; Web Browser
ABSTRACT
There are few better examples of the need for data sharing than in the rare disease community, where patients, physicians, and researchers must search for "the needle in a haystack" to uncover rare, novel causes of disease within the genome. Impeding the pace of discovery has been the existence of many small siloed datasets within individual research or clinical laboratory databases and/or disease-specific organizations, hoping for serendipitous occasions when two distant investigators happen to learn they have a rare phenotype in common and can "match" these cases to build evidence for causality. However, serendipity has never proven to be a reliable or scalable approach in science. As such, the Matchmaker Exchange (MME) was launched to provide a robust and systematic approach to rare disease gene discovery through the creation of a federated network connecting databases of genotypes and rare phenotypes using a common application programming interface (API). The core building blocks of the MME have been defined and assembled. Three MME services have now been connected through the API and are available for community use. Additional databases that support internal matching are anticipated to join the MME network as it continues to grow.
Subject(s)
Genetic Predisposition to Disease/genetics; Information Dissemination/methods; Rare Diseases/genetics; Database Management Systems; Databases, Genetic; Genetic Association Studies; Humans; Software
ABSTRACT
Patients with developmental disorders often harbour sub-microscopic deletions or duplications that lead to a disruption of normal gene expression or perturbation in the copy number of dosage-sensitive genes. Clinical interpretation for such patients in isolation is hindered by the rarity and novelty of such disorders. The DECIPHER project (https://decipher.sanger.ac.uk) was established in 2004 as an accessible online repository of genomic and associated phenotypic data with the primary goal of aiding the clinical interpretation of rare copy-number variants (CNVs). DECIPHER integrates information from a variety of bioinformatics resources and uses visualization tools to identify potential disease genes within a CNV. A two-tier access system permits clinicians and clinical scientists to maintain confidential linked anonymous records of phenotypes and CNVs for their patients that, with informed consent, can subsequently be shared with the wider clinical genetics and research communities. Advances in next-generation sequencing technologies are making it practical and affordable to sequence the whole exome/genome of patients who display features suggestive of a genetic disorder. This approach enables the identification of smaller intragenic mutations including single-nucleotide variants that are not accessible even with high-resolution genomic array analysis. This article briefly summarizes the current status and achievements of the DECIPHER project and looks ahead to the opportunities and challenges of jointly analysing structural and sequence variation in the human genome.
Subject(s)
DNA Copy Number Variations; Databases, Nucleic Acid; Developmental Disabilities/genetics; Genetic Diseases, Inborn/genetics; Internet; Computational Biology; Genetic Predisposition to Disease; Genetic Variation; Genome, Human; Humans; Information Dissemination; Mutation; Phenotype; Polymorphism, Single Nucleotide
ABSTRACT
The Protein Data Bank (PDB) is the repository for three-dimensional structures of biological macromolecules, determined by experimental methods. The data in the archive are free and easily available via the Internet from any of the worldwide centers managing this global archive. These data are used by scientists, researchers, bioinformatics specialists, educators, students, and general audiences to understand biological phenomena at a molecular level. Analysis of these structural data also inspires and facilitates new discoveries in science. This chapter describes the tools and methods currently used for deposition, processing, and release of data in the PDB. References to future enhancements are also included.
Subject(s)
Databases, Protein; Information Storage and Retrieval/methods; Proteins/chemistry; Computational Biology; Documentation; Reproducibility of Results
ABSTRACT
The Protein Data Bank (PDB) is the repository for the three-dimensional structures of biological macromolecules, determined by experimental methods. The data in the archive are free and easily available via the Internet from any of the worldwide centers managing this global archive. These data are used by scientists, researchers, bioinformatics specialists, educators, students, and lay audiences to understand biological phenomena at a molecular level. Analysis of these structural data also inspires and facilitates new discoveries in science. This chapter describes the tools and methods currently used for deposition, processing, and release of data in the PDB. References to future enhancements are also included.
Subject(s)
Databases, Protein; Documentation/methods; Information Storage and Retrieval/methods; Protein Conformation; Proteins/chemistry
ABSTRACT
Discovery of most autosomal recessive disease-associated genes has involved analysis of large, often consanguineous multiplex families or small cohorts of unrelated individuals with a well-defined clinical condition. Discovery of new dominant causes of rare, genetically heterogeneous developmental disorders has been revolutionized by exome analysis of large cohorts of phenotypically diverse parent-offspring trios. Here we analyzed 4,125 families with diverse, rare and genetically heterogeneous developmental disorders and identified four new autosomal recessive disorders. These four disorders were identified by integrating Mendelian filtering (selecting probands with rare, biallelic and putatively damaging variants in the same gene) with statistical assessments of (i) the likelihood of sampling the observed genotypes from the general population and (ii) the phenotypic similarity of patients with recessive variants in the same candidate gene. This new paradigm promises to catalyze the discovery of novel recessive disorders, especially those with less consistent or nonspecific clinical presentations and those caused predominantly by compound heterozygous genotypes.
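The Mendelian filtering step described in this abstract (selecting probands with rare, biallelic and putatively damaging variants in the same gene) can be illustrated with a minimal sketch. It is not the authors' pipeline. The `Variant` record, the `biallelic_candidates` function, the gene and proband names, and the 1% allele-frequency cutoff are all hypothetical, and two heterozygous variants in one gene are simply assumed to be in trans, whereas a real trio analysis would phase them using the parental genotypes. The statistical assessments (i) and (ii) are not covered.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    proband: str        # hypothetical proband identifier
    gene: str           # gene symbol (placeholder names below)
    allele_freq: float  # population allele frequency
    genotype: str       # "het" or "hom"
    damaging: bool      # putatively damaging according to some annotation


def biallelic_candidates(variants, max_af=0.01):
    """Map each gene to the probands carrying a rare, damaging, biallelic genotype.

    max_af is an assumed rarity cutoff, not a value taken from the abstract.
    """
    # Count qualifying alleles per (proband, gene): hom = 2 alleles, het = 1.
    alleles = defaultdict(int)
    for v in variants:
        if v.damaging and v.allele_freq <= max_af:
            alleles[(v.proband, v.gene)] += 2 if v.genotype == "hom" else 1

    # Biallelic means at least two qualifying alleles in the same gene
    # (one homozygous variant, or two heterozygous variants assumed in trans).
    by_gene = defaultdict(set)
    for (proband, gene), count in alleles.items():
        if count >= 2:
            by_gene[gene].add(proband)
    return {gene: sorted(probands) for gene, probands in by_gene.items()}


example = [
    Variant("P1", "GENE_A", 0.001, "het", True),   # compound het, first hit
    Variant("P1", "GENE_A", 0.002, "het", True),   # compound het, second hit
    Variant("P2", "GENE_A", 0.0005, "hom", True),  # rare homozygote
    Variant("P3", "GENE_A", 0.2, "hom", True),     # too common, filtered out
    Variant("P4", "GENE_B", 0.001, "het", True),   # single het, not biallelic
]

candidates = biallelic_candidates(example)
```

With these toy records, only GENE_A is retained, with probands P1 and P2. Genes with several unrelated candidate probands would then go forward to the statistical assessments the abstract mentions.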