ABSTRACT
Von Hippel-Lindau (VHL) disease is a rare, autosomal dominant disorder that predisposes individuals to developing tumors in many organs. The syndrome shows significant phenotypic and genetic variability, posing a considerable challenge to patient care. The lack of VHL variant data sharing, paired with the absence of aggregated genotype-phenotype information, makes characterizing genetic variants and predicting patient prognosis arduous. To address these gaps in knowledge, the Clinical Genome Resource (ClinGen) VHL Variant Curation Expert Panel (VCEP) has been resolving a list of variants of uncertain significance within the VHL gene. Through community curation, we crowdsourced the laborious task of variant annotation by modifying the ClinGen Community Curation (C3)-developed Baseline Annotation protocol and annotating all published VHL cases with the reported genotype-phenotype information in Hypothes.is, an open-access web annotation tool. This process, incorporated into the ClinGen VCEP's workflow, will aid its curation efforts. To facilitate curation at all levels of genetics expertise, our team developed a 4-day biocuration training protocol and resource guide. To date, 91.3% of annotations have been completed by undergraduate and high-school students without formal academic genetics specialization. Here, we present our VHL-specific annotation protocol utilizing Hypothes.is, which offers a standardized method to present case-resolution data, and our biocuration training protocol, which can be adapted for other rare disease platforms. By facilitating training for community curation of VHL disease, we increased student engagement with clinical genetics while enhancing knowledge translation in the field of hereditary cancer. Database URL: https://hypothes.is/groups/dKymJJpZ/vhl-hypothesis-annotation.
Subjects
Neoplasms, Humans, Neoplasms/genetics, Von Hippel-Lindau Tumor Suppressor Protein/genetics
ABSTRACT
CIViC (Clinical Interpretation of Variants in Cancer; civicdb.org) is a crowd-sourced, public domain knowledgebase composed of literature-derived evidence characterizing the clinical utility of cancer variants. As clinical sequencing becomes more prevalent in cancer management, the need for cancer variant interpretation has grown beyond the capability of any single institution. CIViC contains peer-reviewed, published literature curated and expertly moderated into structured data units (Evidence Items) that can be accessed globally and in real time, reducing barriers to clinical variant knowledge sharing. We have extended CIViC's functionality to support emergent variant interpretation guidelines, increase interoperability with other variant resources, and promote widespread dissemination of structured curated data. To support the full breadth of variant interpretation from basic to translational, including integration of somatic and germline variant knowledge and inference of drug response, we have enabled curation of three new Evidence Types (Predisposing, Oncogenic and Functional). The growing CIViC knowledgebase has over 300 contributors and distributes clinically relevant cancer variant data currently representing >3200 variants in >470 genes from >3100 publications.
Subjects
Genetic Variation, Neoplasms, Humans, Neoplasms/genetics, Knowledge Bases, High-Throughput Nucleotide Sequencing
ABSTRACT
Von Hippel-Lindau (VHL) disease is a hereditary cancer syndrome in which individuals are predisposed to tumor development in the brain, adrenal gland, kidney, and other organs. It is caused by pathogenic variants in the VHL tumor suppressor gene. Standardized disease information has been difficult to collect due to the rarity and diversity of VHL patients. Over 4100 unique articles published until October 2019 were screened for germline genotype-phenotype data. Patient data were translated into standardized descriptions using Human Genome Variation Society gene variant nomenclature and Human Phenotype Ontology terms and were manually curated into an open-access knowledgebase called Clinical Interpretation of Variants in Cancer. In total, 634 unique VHL variants, 2882 patients, and 1991 families from 427 papers were captured. We identified relationship trends between phenotype and genotype data using classic statistical methods and spectral clustering unsupervised learning. Our analyses reveal earlier onset of pheochromocytoma/paraganglioma and retinal angiomas, phenotype co-occurrences and genotype-phenotype correlations including hotspots. They confirm existing VHL associations and can be used to identify new patterns and associations in VHL disease. Our database serves as an aggregate knowledge translation tool to facilitate sharing information about the pathogenicity of VHL variants.
Subjects
Adrenal Gland Neoplasms, von Hippel-Lindau Disease, Adrenal Gland Neoplasms/diagnosis, Adrenal Gland Neoplasms/genetics, Genotype, Humans, Machine Learning, Phenotype, Von Hippel-Lindau Tumor Suppressor Protein/genetics, von Hippel-Lindau Disease/complications, von Hippel-Lindau Disease/diagnosis, von Hippel-Lindau Disease/genetics
ABSTRACT
BACKGROUND: Variant interpretation is the main bottleneck in medical genomic sequencing efforts. This usually involves genome analysts manually searching through a multitude of independent databases, often with the aid of several, mostly independent, computational tools. To streamline variant interpretation, we developed the GeneTerpret platform which collates data from current interpretation tools and databases, and applies a phenotype-driven query to categorize the variants identified in the genome(s). The platform assigns quantitative validity scores to genes by query and assembly of the genotype-phenotype data, sequence homology, molecular interactions, expression data, and animal models. It also uses the American College of Medical Genetics and Genomics (ACMG) criteria to categorize variants into five tiers of pathogenicity. The final output is a prioritized list of potentially causal variants/genes. RESULTS: We tested GeneTerpret by comparing its performance to expert-curated genes (ClinGen's gene-validity database) and variant pathogenicity reports (DECIPHER database). Output from GeneTerpret was 97.2% and 83.5% concordant with the expert-curated sources, respectively. Additionally, similar concordance was observed when GeneTerpret's performance was compared with our internal expert-interpreted clinical datasets. CONCLUSIONS: GeneTerpret is a flexible platform designed to streamline the genome interpretation process, through a unique interface, with improved ease, speed and accuracy. This modular and customizable system allows the user to tailor the component-programs in the analysis process to their preference. GeneTerpret is available online at https://geneterpret.com.
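The abstract reports concordance figures but does not say how they were computed. As a rough illustration only, the sketch below assumes the simplest reading: percent agreement between a tool's five-tier call and an expert call, taken over the variants both sources classify. The function name `concordance`, the variant identifiers, and the example calls are all hypothetical and are not taken from GeneTerpret, ClinGen, or DECIPHER.

```python
from collections import Counter

# The five pathogenicity tiers used in ACMG-style variant classification.
ACMG_TIERS = (
    "benign",
    "likely benign",
    "uncertain significance",
    "likely pathogenic",
    "pathogenic",
)


def concordance(tool_calls, expert_calls):
    """Return (percent agreement, Counter of (tool, expert) disagreements).

    Assumption: concordance is plain percent agreement over the variants
    that appear in both mappings; the abstract does not specify the metric.
    """
    shared = tool_calls.keys() & expert_calls.keys()
    if not shared:
        raise ValueError("no variants in common between the two sources")

    agree = 0
    disagreements = Counter()
    for variant in shared:
        tool_tier = tool_calls[variant]
        expert_tier = expert_calls[variant]
        for tier in (tool_tier, expert_tier):
            if tier not in ACMG_TIERS:
                raise ValueError(f"unknown tier {tier!r} for {variant}")
        if tool_tier == expert_tier:
            agree += 1
        else:
            disagreements[(tool_tier, expert_tier)] += 1
    return 100.0 * agree / len(shared), disagreements


# Hypothetical calls, for illustration only.
tool = {
    "variant_A": "pathogenic",
    "variant_B": "benign",
    "variant_C": "uncertain significance",
    "variant_D": "likely pathogenic",
}
expert = {
    "variant_A": "pathogenic",
    "variant_B": "benign",
    "variant_C": "likely benign",
    "variant_D": "likely pathogenic",
    "variant_E": "benign",  # not classified by the tool, so ignored
}

percent, mismatches = concordance(tool, expert)
print(f"{percent:.1f}% concordant")  # prints "75.0% concordant"
```

Tallying the disagreements by tier pair, rather than reporting only the percentage, shows whether mismatches are adjacent-tier differences or outright reversals, which matters more clinically than the headline figure.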