ABSTRACT
CIViC (Clinical Interpretation of Variants in Cancer; civicdb.org) is a crowd-sourced, public domain knowledgebase composed of literature-derived evidence characterizing the clinical utility of cancer variants. As clinical sequencing becomes more prevalent in cancer management, the need for cancer variant interpretation has grown beyond the capability of any single institution. CIViC contains peer-reviewed, published literature curated and expertly moderated into structured data units (Evidence Items) that can be accessed globally and in real time, reducing barriers to clinical variant knowledge sharing. We have extended CIViC's functionality to support emergent variant interpretation guidelines, increase interoperability with other variant resources, and promote widespread dissemination of structured curated data. To support the full breadth of variant interpretation from basic to translational, including integration of somatic and germline variant knowledge and inference of drug response, we have enabled curation of three new Evidence Types (Predisposing, Oncogenic and Functional). The growing CIViC knowledgebase has over 300 contributors and distributes clinically relevant cancer variant data currently representing >3200 variants in >470 genes from >3100 publications.
Subjects
Genetic Variation , Neoplasms , Humans , Neoplasms/genetics , Knowledge Bases , High-Throughput Nucleotide Sequencing
ABSTRACT
Von Hippel-Lindau (VHL) disease is a hereditary cancer syndrome in which individuals are predisposed to tumor development in the brain, adrenal gland, kidney, and other organs. It is caused by pathogenic variants in the VHL tumor suppressor gene. Standardized disease information has been difficult to collect due to the rarity and diversity of VHL patients. Over 4100 unique articles published until October 2019 were screened for germline genotype-phenotype data. Patient data were translated into standardized descriptions using Human Genome Variation Society gene variant nomenclature and Human Phenotype Ontology terms and were manually curated into an open-access knowledgebase called Clinical Interpretation of Variants in Cancer. In total, 634 unique VHL variants, 2882 patients, and 1991 families from 427 papers were captured. We identified relationship trends between phenotype and genotype data using classic statistical methods and spectral clustering unsupervised learning. Our analyses reveal earlier onset of pheochromocytoma/paraganglioma and retinal angiomas, phenotype co-occurrences, and genotype-phenotype correlations including hotspots. They confirm existing VHL associations and can be used to identify new patterns and associations in VHL disease. Our database serves as an aggregate knowledge translation tool to facilitate sharing information about the pathogenicity of VHL variants.
Subjects
Adrenal Gland Neoplasms , von Hippel-Lindau Disease , Adrenal Gland Neoplasms/diagnosis , Adrenal Gland Neoplasms/genetics , Genotype , Humans , Machine Learning , Phenotype , Von Hippel-Lindau Tumor Suppressor Protein/genetics , von Hippel-Lindau Disease/complications , von Hippel-Lindau Disease/diagnosis , von Hippel-Lindau Disease/genetics
ABSTRACT
PURPOSE: Following automated variant calling, manual review of aligned read sequences is required to identify a high-quality list of somatic variants. Despite widespread use in analyzing sequence data, methods to standardize manual review have not been described, resulting in high inter- and intralab variability. METHODS: This manual review standard operating procedure (SOP) consists of methods to annotate variants with four different calls and 19 tags. The calls indicate a reviewer's confidence in each variant and the tags indicate commonly observed sequencing patterns and artifacts that inform the manual review call. Four individuals were asked to classify variants prior to, and after, reading the SOP and accuracy was assessed by comparing reviewer calls with orthogonal validation sequencing. RESULTS: After reading the SOP, average accuracy in somatic variant identification increased by 16.7% (p value = 0.0298) and average interreviewer agreement increased by 12.7% (p value < 0.001). Manual review conducted after reading the SOP did not significantly increase reviewer time. CONCLUSION: This SOP supports and enhances manual somatic variant detection by improving reviewer accuracy while reducing the interreviewer variability for variant calling and annotation.
Subjects
High-Throughput Nucleotide Sequencing/standards , Mutation/genetics , Neoplasms/genetics , Software , Algorithms , Humans , Neoplasms/pathology , Polymorphism, Single Nucleotide/genetics , Sequence Alignment
ABSTRACT
Widely distributed plants of western North America experience divergent selection across environmental gradients, have complex histories shaped by biogeographic barriers and distributional shifts, and often illustrate continuums of reproductive isolation. Rubber rabbitbrush (Ericameria nauseosa) is a foundational shrub species that occurs across diverse environments of western North America. Its remarkable phenotypic diversity is currently ascribed to two subspecies (Ericameria nauseosa nauseosa and Ericameria nauseosa consimilis) and 22 named varieties. To understand how genetic variation is partitioned across subspecies, varieties, and environments, we used high-throughput sequencing of reduced representation libraries. We found clear evidence for divergence between the two subspecies, despite largely sympatric distributions. Numerous locations exhibiting admixed ancestry were not geographically localized but were widely distributed across a mosaic hybrid zone. The occurrence of hybrid and subspecific ancestries was strongly predicted by environmental variables as well as the proximity to major ecotones between ecoregions. Although this repeatability illustrates the importance of environmental factors in shaping reproductive isolation, variability in the prevalence of hybridization also indicates these factors likely differ across ecological contexts. There was mixed evidence for the evolutionary cohesiveness of varieties, but several genetically distinct and narrow endemic varieties exhibited admixed subspecific ancestries, hinting at the possibility that transgressive hybridization contributes to phenotypic novelty and the colonization of new environments in E. nauseosa.
Subjects
Reproductive Isolation , Rubber , Biological Evolution , North America , Hybridization, Genetic
ABSTRACT
The spatial structure of genomic and phenotypic variation across populations reflects historical and demographic processes as well as evolution via natural selection. Characterizing such variation can provide an important perspective for understanding the evolutionary consequences of changing climate and for guiding ecological restoration. While evidence for local adaptation has been traditionally evaluated using phenotypic data, modern methods for generating and analyzing landscape genomic data can directly quantify local adaptation by associating allelic variation with environmental variation. Here, we analyze both genomic and phenotypic variation of rubber rabbitbrush (Ericameria nauseosa), a foundational shrub species of western North America. To quantify landscape genomic structure and provide perspective on patterns of local adaptation, we generated reduced representation sequencing data for 17 wild populations (222 individuals; 38,615 loci) spanning a range of environmental conditions. Population genetic analyses illustrated pronounced landscape genomic structure jointly shaped by geography and environment. Genetic-environment association (GEA) analyses using both redundancy analysis (RDA) and a machine-learning approach (Gradient Forest) indicated environmental variables (precipitation seasonality, slope, aspect, elevation, and annual precipitation) influenced spatial genomic structure and were correlated with allele frequency shifts indicative of local adaptation at a consistent set of genomic regions. We compared our GEA-based inference of local adaptation with phenotypic data collected by growing seeds from each population in a greenhouse common garden. 
Population differentiation in seed weight, emergence, and seedling traits was associated with environmental variables (e.g., precipitation seasonality) that were also implicated in GEA analyses, suggesting complementary conclusions about the drivers of local adaptation across different methods and data sources. Our results provide a baseline understanding of spatial genomic structure for E. nauseosa across the western Great Basin and illustrate the utility of GEA analyses for detecting the environmental causes and genetic signatures of local adaptation in a widely distributed plant species of restoration significance.
ABSTRACT
PURPOSE: Clinical targeted sequencing panels are important for identifying actionable variants for patients with cancer; however, existing approaches do not provide transparent and rationally designed clinical panels to accommodate the rapidly growing knowledge within oncology. MATERIALS AND METHODS: We used the Clinical Interpretation of Variants in Cancer (CIViC) database to develop an Open-Sourced CIViC Annotation Pipeline (OpenCAP). OpenCAP provides methods to identify variants within the CIViC database, build probes for variant capture, use probes on prospective samples, and link somatic variants to CIViC clinical relevance statements. OpenCAP was tested using a single-molecule molecular inversion probe (smMIP) capture design on 27 cancer samples from 5 tumor types. In total, 2,027 smMIPs were designed to target 111 eligible CIViC variants (61.5 kb of genomic space). RESULTS: When compared with orthogonal sequencing, CIViC smMIP sequencing demonstrated a 95% sensitivity for variant detection (n = 61 of 64 variants). Variant allele frequencies for variants identified on both sequencing platforms were highly concordant (Pearson's r = 0.885; n = 61 variants). Moreover, for individuals with paired tumor and normal samples (n = 12), 182 clinically relevant variants missed by orthogonal sequencing were discovered by CIViC smMIP sequencing. CONCLUSION: The OpenCAP design paradigm demonstrates the utility of an open-source and open-access database built on attendant community contributions with peer-reviewed interpretations. Use of a public repository for variant identification, probe development, and variant interpretation provides a transparent approach to build dynamic next-generation sequencing-based oncology panels.
Subjects
Biomarkers, Tumor/genetics , Computational Biology/methods , DNA Probes/genetics , High-Throughput Nucleotide Sequencing/standards , Molecular Sequence Annotation/methods , Neoplasms/genetics , DNA Mutational Analysis/methods , Databases, Genetic , Genetic Variation , High-Throughput Nucleotide Sequencing/methods , Humans , Molecular Sequence Annotation/standards , Molecular Targeted Therapy , Neoplasms/diagnosis , ROC Curve , Software Design
ABSTRACT
Manually curated variant knowledgebases and their associated knowledge models are serving an increasingly important role in distributing and interpreting variants in cancer. These knowledgebases vary in their level of public accessibility and in the complexity of the models used to capture clinical knowledge. CIViC (Clinical Interpretation of Variants in Cancer; www.civicdb.org) is a fully open, free-to-use cancer variant interpretation knowledgebase that incorporates highly detailed curation of evidence obtained from peer-reviewed publications and meeting abstracts, and currently holds over 6300 Evidence Items for over 2300 variants derived from over 400 genes. CIViC has seen increased adoption by, and also undertaken collaboration with, a wide range of users and organizations involved in research. To enhance CIViC's clinical value, regular submission to the ClinVar database and pursuit of other regulatory approvals are necessary. For this reason, a formal peer-reviewed curation guideline and a discussion of the underlying principles of curation are needed. We present here the CIViC knowledge model, standard operating procedures (SOP) for variant curation, and detailed examples to support community-driven curation of cancer variants.