Search | Nursing VHL Search Portal

1.

Improving reporting standards for polygenic scores in risk prediction studies.

Wand, Hannah; Lambert, Samuel A; Tamburro, Cecelia; Iacocca, Michael A; O'Sullivan, Jack W; Sillari, Catherine; Kullo, Iftikhar J; Rowley, Robb; Dron, Jacqueline S; Brockman, Deanna; Venner, Eric; McCarthy, Mark I; Antoniou, Antonis C; Easton, Douglas F; Hegele, Robert A; Khera, Amit V; Chatterjee, Nilanjan; Kooperberg, Charles; Edwards, Karen; Vlessis, Katherine; Kinnear, Kim; Danesh, John N; Parkinson, Helen; Ramos, Erin M; Roberts, Megan C; Ormond, Kelly E; Khoury, Muin J; Janssens, A Cecile J W; Goddard, Katrina A B; Kraft, Peter; MacArthur, Jaqueline A L; Inouye, Michael; Wojcik, Genevieve L.

Nature ; 591(7849): 211-219, 2021 03.

Article in English | MEDLINE | ID: mdl-33692554

ABSTRACT

Polygenic risk scores (PRSs), which often aggregate results from genome-wide association studies, can bridge the gap between initial discovery efforts and clinical applications for the estimation of disease risk using genetics. However, there is notable heterogeneity in the application and reporting of these risk scores, which hinders the translation of PRSs into clinical care. Here, in a collaboration between the Clinical Genome Resource (ClinGen) Complex Disease Working Group and the Polygenic Score (PGS) Catalog, we present the Polygenic Risk Score Reporting Standards (PRS-RS), in which we update the Genetic Risk Prediction Studies (GRIPS) Statement to reflect the present state of the field. Drawing on the input of experts in epidemiology, statistics, disease-specific applications, implementation and policy, this comprehensive reporting framework defines the minimal information that is needed to interpret and evaluate PRSs, especially with respect to downstream clinical applications. Items span detailed descriptions of study populations, statistical methods for the development and validation of PRSs and considerations for the potential limitations of these scores. In addition, we emphasize the need for data availability and transparency, and we encourage researchers to deposit and share PRSs through the PGS Catalog to facilitate reproducibility and comparative benchmarking. By providing these criteria in a structured format that builds on existing standards and ontologies, the use of this framework in publishing PRSs will facilitate translation into clinical care and progress towards defining best practice.

Subject(s)

Genetic Predisposition to Disease , Genetics, Medical/standards , Multifactorial Inheritance/genetics , Humans , Reproducibility of Results , Risk Assessment/standards
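
The abstract in entry 1 describes polygenic risk scores as aggregating results from genome-wide association studies. One common form of that aggregation, shown here as a minimal sketch and not as the method the PRS-RS prescribes, is a weighted sum of effect-allele dosages. The variant identifiers, weights, and the function name `polygenic_score` are hypothetical illustrations.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages over the variants in the score.

    dosages: variant id -> number of effect alleles carried (0, 1, or 2)
    weights: variant id -> per-allele effect size (for example, a log odds ratio)
    """
    missing = set(weights) - set(dosages)
    if missing:
        # A score is only comparable across people if every variant is present.
        raise KeyError(f"missing genotypes for: {sorted(missing)}")
    return sum(weights[v] * dosages[v] for v in weights)


# Hypothetical three-variant score and one hypothetical individual.
weights = {"variant_a": 0.10, "variant_b": -0.05, "variant_c": 0.20}
dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_score(dosages, weights)
print(f"score = {score:.2f}")
```

Raising on missing genotypes rather than silently skipping them reflects the abstract's emphasis on transparent reporting: how missing variants were handled is one of the details a reader needs in order to interpret a score.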

2.

Exome variant discrepancies due to reference-genome differences.

Li, He; Dawood, Moez; Khayat, Michael M; Farek, Jesse R; Jhangiani, Shalini N; Khan, Ziad M; Mitani, Tadahiro; Coban-Akdemir, Zeynep; Lupski, James R; Venner, Eric; Posey, Jennifer E; Sabo, Aniko; Gibbs, Richard A.

Am J Hum Genet ; 108(7): 1239-1250, 2021 07 01.

Article in English | MEDLINE | ID: mdl-34129815

ABSTRACT

Despite release of the GRCh38 human reference genome more than seven years ago, GRCh37 remains more widely used by most research and clinical laboratories. To date, no study has quantified the impact of utilizing different reference assemblies for the identification of variants associated with rare and common diseases from large-scale exome-sequencing data. By calling variants on both the GRCh37 and GRCh38 references, we identified single-nucleotide variants (SNVs) and insertion-deletions (indels) in 1,572 exomes from participants with Mendelian diseases and their family members. We found that a total of 1.5% of SNVs and 2.0% of indels were discordant when different references were used. Notably, 76.6% of the discordant variants were clustered within discrete discordant reference patches (DISCREPs) comprising only 0.9% of loci targeted by exome sequencing. These DISCREPs were enriched for genomic elements including segmental duplications, fix patch sequences, and loci known to contain alternate haplotypes. We identified 206 genes significantly enriched for discordant variants, most of which were in DISCREPs and caused by multi-mapped reads on the reference assembly that lacked the variant call. Among these 206 genes, eight are implicated in known Mendelian diseases and 53 are associated with common phenotypes from genome-wide association studies. In addition, variant interpretations could also be influenced by the reference after lifting-over variant loci to another assembly. Overall, we identified genes and genomic loci affected by reference assembly choice, including genes associated with Mendelian disorders and complex human diseases that require careful evaluation in both research and clinical applications.

Subject(s)

Exome , Genome, Human , Polymorphism, Single Nucleotide , Cohort Studies , Genetic Diseases, Inborn/genetics , Humans , Reference Values

3.

Variant Classification Concordance using the ACMG-AMP Variant Interpretation Guidelines across Nine Genomic Implementation Research Studies.

Amendola, Laura M; Muenzen, Kathleen; Biesecker, Leslie G; Bowling, Kevin M; Cooper, Greg M; Dorschner, Michael O; Driscoll, Catherine; Foreman, Ann Katherine M; Golden-Grant, Katie; Greally, John M; Hindorff, Lucia; Kanavy, Dona; Jobanputra, Vaidehi; Johnston, Jennifer J; Kenny, Eimear E; McNulty, Shannon; Murali, Priyanka; Ou, Jeffrey; Powell, Bradford C; Rehm, Heidi L; Rolf, Bradley; Roman, Tamara S; Van Ziffle, Jessica; Guha, Saurav; Abhyankar, Avinash; Crosslin, David; Venner, Eric; Yuan, Bo; Zouk, Hana; Jarvik, Gail P.

Am J Hum Genet ; 107(5): 932-941, 2020 11 05.

Article in English | MEDLINE | ID: mdl-33108757

ABSTRACT

Harmonization of variant pathogenicity classification across laboratories is important for advancing clinical genomics. The two CLIA-accredited Electronic Medical Record and Genomics Network sequencing centers and the six CLIA-accredited laboratories and one research laboratory performing genome or exome sequencing in the Clinical Sequencing Evidence-Generating Research Consortium collaborated to explore current sources of discordance in classification. Eight laboratories each submitted 20 classified variants in the ACMG secondary finding v.2.0 genes. After removing duplicates, each of the 158 variants was annotated and independently classified by two additional laboratories using the ACMG-AMP guidelines. Overall concordance across three laboratories was assessed and discordant variants were reviewed via teleconference and email. The submitted variant set included 28 P/LP variants, 96 VUS, and 34 LB/B variants, mostly in cancer (40%) and cardiac (27%) risk genes. Eighty-six (54%) variants reached complete five-category (i.e., P, LP, VUS, LB, B) concordance, and 17 (11%) had a discordance that could affect clinical recommendations (P/LP versus VUS/LB/B). 21% and 63% of variants submitted as P and LP, respectively, were discordant with VUS. Of the 54 originally discordant variants that underwent further review, 32 reached agreement, for a post-review concordance rate of 84% (118/140 variants). This project provides an updated estimate of variant concordance, identifies considerations for LP classified variants, and highlights ongoing sources of discordance. Continued and increased sharing of variant classifications and evidence across laboratories, and the ongoing work of ClinGen to provide general as well as gene- and disease-specific guidance, will lead to continued increases in concordance.

Subject(s)

Cardiovascular Diseases/genetics , Genetic Variation , Genomics/standards , Laboratories/standards , Neoplasms/genetics , Cardiovascular Diseases/diagnosis , Computational Biology/methods , Genetic Testing , Genetics, Medical/methods , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Laboratory Proficiency Testing/statistics & numerical data , Neoplasms/diagnosis , Sequence Analysis, DNA , Software , Terminology as Topic
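
The percentages in entry 3 can be checked from the counts given in the abstract. The sketch below assumes that the post-review denominator of 140 is the 86 initially concordant variants plus the 54 discordant variants that were reviewed; the abstract does not spell this out, but the numbers are consistent with it.

```python
def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Counts taken from the abstract.
submitted = {"P/LP": 28, "VUS": 96, "LB/B": 34}
total_variants = sum(submitted.values())   # 158
complete_concordance = 86                  # five-category agreement
clinically_discordant = 17                 # P/LP versus VUS/LB/B
reviewed_discordant = 54                   # discordant variants sent to review
resolved_on_review = 32                    # reached agreement after review

initial_rate = pct(complete_concordance, total_variants)
clinical_rate = pct(clinically_discordant, total_variants)

# Assumed derivation of the post-review figure (118/140).
post_numerator = complete_concordance + resolved_on_review
post_denominator = complete_concordance + reviewed_discordant
post_rate = pct(post_numerator, post_denominator)

print(total_variants, initial_rate, clinical_rate)
print(post_numerator, post_denominator, post_rate)
```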

4.

Harmonizing variant classification for return of results in the All of Us Research Program.

Harrison, Steven M; Austin-Tse, Christina A; Kim, Serra; Lebo, Matthew; Leon, Annette; Murdock, David; Radhakrishnan, Aparna; Shirts, Brian H; Steeves, Marcie; Venner, Eric; Gibbs, Richard A; Jarvik, Gail P; Rehm, Heidi L.

Hum Mutat ; 43(8): 1114-1121, 2022 08.

Article in English | MEDLINE | ID: mdl-34923710

ABSTRACT

The All of Us Research Program (AoURP) is a historic effort to accelerate research and improve healthcare by generating and collating data from one million people in the United States. Participants will have the option to receive results from their genome analysis, including actionable findings in 59 gene-disorder pairs for which disorder-associated variants are recommended for return by the American College of Medical Genetics and Genomics. To ensure consistent reporting across the AoURP, in a prelaunch study the four participating clinical laboratories shared all variant classifications in the 59 genes of interest from their internal databases. Of the 11,813 unique variants classified by at least two of the four laboratories, classifications were concordant with regard to reportability for 99.1% (11,711), with only 0.9% (102) having reportability differences. Through variant reassessment, data sharing, and discussion of rationale, participating laboratories resolved all 102 reportable differences. These approaches will be maintained during routine AoU reporting to ensure continuous classification harmonization and consistent reporting within AoURP.

Subject(s)

Genome, Human , Population Health , Genetic Testing/methods , Genetic Variation , Genome, Human/genetics , Genomics/methods , Humans , United States

5.

Implementation of preemptive DNA sequence-based pharmacogenomics testing across a large academic medical center: The Mayo-Baylor RIGHT 10K Study.

Wang, Liewei; Scherer, Steven E; Bielinski, Suzette J; Muzny, Donna M; Jones, Leila A; Black, John Logan; Moyer, Ann M; Giri, Jyothsna; Sharp, Richard R; Matey, Eric T; Wright, Jessica A; Oyen, Lance J; Nicholson, Wayne T; Wiepert, Mathieu; Sullard, Terri; Curry, Timothy B; Rohrer Vitek, Carolyn R; McAllister, Tammy M; St Sauver, Jennifer L; Caraballo, Pedro J; Lazaridis, Konstantinos N; Venner, Eric; Qin, Xiang; Hu, Jianhong; Kovar, Christie L; Korchina, Viktoriya; Walker, Kimberly; Doddapaneni, HarshaVardhan; Wu, Tsung-Jung; Raj, Ritika; Denson, Shawn; Liu, Wen; Chandanavelli, Gauthami; Zhang, Lan; Wang, Qiaoyan; Kalra, Divya; Karow, Mary Beth; Harris, Kimberley J; Sicotte, Hugues; Peterson, Sandra E; Barthel, Amy E; Moore, Brenda E; Skierka, Jennifer M; Kluge, Michelle L; Kotzer, Katrina E; Kloke, Karen; Vander Pol, Jessica M; Marker, Heather; Sutton, Joseph A; Kekic, Adrijana.

Genet Med ; 24(5): 1062-1072, 2022 05.

Article in English | MEDLINE | ID: mdl-35331649

ABSTRACT

PURPOSE: The Mayo-Baylor RIGHT 10K Study enabled preemptive, sequence-based pharmacogenomics (PGx)-driven drug prescribing practices in routine clinical care within a large cohort. We also generated the tools and resources necessary for clinical PGx implementation and identified challenges that need to be overcome. Furthermore, we measured the frequency of both common genetic variation for which clinical guidelines already exist and rare variation that could be detected by DNA sequencing, rather than genotyping. METHODS: Targeted oligonucleotide-capture sequencing of 77 pharmacogenes was performed using DNA from 10,077 consented Mayo Clinic Biobank volunteers. The resulting predicted drug response-related phenotypes for 13 genes, including CYP2D6 and HLA, affecting 21 drug-gene pairs, were deposited preemptively in the Mayo electronic health record. RESULTS: For the 13 pharmacogenes of interest, the genomes of 79% of participants carried clinically actionable variants in 3 or more genes, and DNA sequencing identified an average of 3.3 additional conservatively predicted deleterious variants that would not have been evident using genotyping. CONCLUSION: Implementation of preemptive rather than reactive and sequence-based rather than genotype-based PGx prescribing revealed nearly universal patient applicability and required integrated institution-wide resources to fully realize individualized drug therapy and to show more efficient use of health care resources.

Subject(s)

Cytochrome P-450 CYP2D6 , Pharmacogenetics , Academic Medical Centers , Base Sequence , Cytochrome P-450 CYP2D6/genetics , Genotype , Humans , Pharmacogenetics/methods

6.

Genetic testing in ambulatory cardiology clinics reveals high rate of findings with clinical management implications.

Murdock, David R; Venner, Eric; Muzny, Donna M; Metcalf, Ginger A; Murugan, Mullai; Hadley, Trevor D; Chander, Varuna; de Vries, Paul S; Jia, Xiaoming; Hussain, Aliza; Agha, Ali M; Sabo, Aniko; Li, Shoudong; Meng, Qingchang; Hu, Jianhong; Tian, Xia; Cohen, Michelle; Yi, Victoria; Kovar, Christie L; Gingras, Marie-Claude; Korchina, Viktoriya; Howard, Chad; Riconda, Daniel L; Pereira, Stacey; Smith, Hadley S; Huda, Zohra A; Buentello, Alexandria; Marino, Patricia R; Leiber, Lee; Balasubramanyam, Ashok; Amos, Christopher I; Civitello, Andrew B; Chelu, Mihail G; Maag, Ronald; McGuire, Amy L; Boerwinkle, Eric; Wehrens, Xander H T; Ballantyne, Christie M; Gibbs, Richard A.

Genet Med ; 23(12): 2404-2414, 2021 12.

Article in English | MEDLINE | ID: mdl-34363016

ABSTRACT

PURPOSE: Cardiovascular disease (CVD) is the leading cause of death in adults in the United States, yet the benefits of genetic testing are not universally accepted. METHODS: We developed the "HeartCare" panel of genes associated with CVD, evaluating high-penetrance Mendelian conditions, coronary artery disease (CAD) polygenic risk, LPA gene polymorphisms, and specific pharmacogenetic (PGx) variants. We enrolled 709 individuals from cardiology clinics at Baylor College of Medicine, and samples were analyzed in a CAP/CLIA-certified laboratory. Results were returned to the ordering physician and uploaded to the electronic medical record. RESULTS: Notably, 32% of patients had a genetic finding with clinical management implications, even after excluding PGx results, including 9% who were molecularly diagnosed with a Mendelian condition. Among surveyed physicians, 84% reported medical management changes based on these results, including specialist referrals, cardiac tests, and medication changes. LPA polymorphisms and high polygenic risk of CAD were found in 20% and 9% of patients, respectively, leading to diet, lifestyle, and other changes. Warfarin and simvastatin pharmacogenetic variants were present in roughly half of the cohort. CONCLUSION: Our results support the use of genetic information in routine cardiovascular health management and provide a roadmap for accompanying research.

Subject(s)

Cardiology , Cardiovascular Diseases , Adult , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/genetics , Cardiovascular Diseases/therapy , Genetic Testing , Humans , Pharmacogenetics/methods , Pharmacogenomic Testing , United States

7.

Genomic considerations for FHIR®; eMERGE implementation lessons.

Murugan, Mullai; Babb, Lawrence J; Overby Taylor, Casey; Rasmussen, Luke V; Freimuth, Robert R; Venner, Eric; Yan, Fei; Yi, Victoria; Granite, Stephen J; Zouk, Hana; Aronson, Samuel J; Power, Kevin; Fedotov, Alex; Crosslin, David R; Fasel, David; Jarvik, Gail P; Hakonarson, Hakon; Bangash, Hana; Kullo, Iftikhar J; Connolly, John J; Nestor, Jordan G; Caraballo, Pedro J; Wei, WeiQi; Wiley, Ken; Rehm, Heidi L; Gibbs, Richard A.

J Biomed Inform ; 118: 103795, 2021 06.

Article in English | MEDLINE | ID: mdl-33930535

ABSTRACT

Structured representation of clinical genetic results is necessary for advancing precision medicine. The Electronic Medical Records and Genomics (eMERGE) Network's Phase III program initially used a commercially developed XML message format for standardized and structured representation of genetic results for electronic health record (EHR) integration. In a desire to move towards a standard representation, the network created a new standardized format based upon Health Level Seven Fast Healthcare Interoperability Resources (HL7® FHIR®), to represent clinical genomics results. These new standards improve the utility of HL7® FHIR® as an international healthcare interoperability standard for management of genetic data from patients. This work advances the establishment of standards that are being designed for broad adoption in the current health information technology landscape.

Subject(s)

Electronic Health Records , Medical Informatics , Genomics , Health Level Seven , Humans , Precision Medicine

8.

Atlas-CNV: a validated approach to call single-exon CNVs in the eMERGESeq gene panel.

Chiang, Theodore; Liu, Xiuping; Wu, Tsung-Jung; Hu, Jianhong; Sedlazeck, Fritz J; White, Simon; Schaid, Daniel; Andrade, Mariza de; Jarvik, Gail P; Crosslin, David; Stanaway, Ian; Carrell, David S; Connolly, John J; Hakonarson, Hakon; Groopman, Emily E; Gharavi, Ali G; Fedotov, Alexander; Bi, Weimin; Leduc, Magalie S; Murdock, David R; Jiang, Yunyun; Meng, Linyan; Eng, Christine M; Wen, Shu; Yang, Yaping; Muzny, Donna M; Boerwinkle, Eric; Salerno, William; Venner, Eric; Gibbs, Richard A.

Genet Med ; 21(9): 2135-2144, 2019 09.

Article in English | MEDLINE | ID: mdl-30890783

ABSTRACT

PURPOSE: To provide a validated method to confidently identify exon-containing copy-number variants (CNVs), with a low false discovery rate (FDR), in targeted sequencing data from a clinical laboratory with particular focus on single-exon CNVs. METHODS: DNA sequence coverage data are normalized within each sample and subsequently exonic CNVs are identified in a batch of samples, when the target log2 ratio of the sample to the batch median exceeds defined thresholds. The quality of exonic CNV calls is assessed by C-scores (Z-like scores) using thresholds derived from gold standard samples and simulation studies. We integrate an ExonQC threshold to lower FDR and compare performance with alternate software (VisCap). RESULTS: Thirteen CNVs were used as a truth set to validate Atlas-CNV and compared with VisCap. We demonstrated FDR reduction in validation, simulation, and 10,926 eMERGESeq samples without sensitivity loss. Sixty-four multiexon and 29 single-exon CNVs with high C-scores were assessed by Multiplex Ligation-dependent Probe Amplification (MLPA). CONCLUSION: Atlas-CNV is validated as a method to identify exonic CNVs in targeted sequencing data generated in the clinical laboratory. The ExonQC and C-score assignment can reduce FDR (identification of targets with high variance) and improve calling accuracy of single-exon CNVs respectively. We propose guidelines and criteria to identify high confidence single-exon CNVs.

Subject(s)

DNA Copy Number Variations/genetics , Exons/genetics , Genome, Human/genetics , Software , High-Throughput Nucleotide Sequencing , Humans , Sequence Analysis, DNA
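
The calling step described in entry 8 (normalize coverage within each sample, then flag targets whose log2 ratio to the batch median exceeds a threshold) can be sketched roughly as follows. This is not the Atlas-CNV implementation: normalizing by the sample's median target coverage, the threshold values, and all names are assumptions made for illustration, and the C-score and ExonQC steps are omitted because the abstract does not define them.

```python
import math
from statistics import median


def normalize(sample_cov):
    """Scale each target's coverage by the sample's median target coverage (assumed scheme)."""
    m = median(sample_cov.values())
    return {target: cov / m for target, cov in sample_cov.items()}


def call_cnvs(batch, del_thresh=-0.6, dup_thresh=0.4):
    """Flag (sample, target) pairs whose log2 ratio to the batch median crosses a threshold.

    batch: sample id -> {target id -> raw coverage}; all samples share the same targets.
    The thresholds are hypothetical placeholders, not values from the paper.
    """
    norm = {sample: normalize(cov) for sample, cov in batch.items()}
    targets = list(next(iter(norm.values())))
    calls = []
    for target in targets:
        batch_median = median(norm[s][target] for s in norm)
        if batch_median == 0:
            continue  # target not covered in this batch; no ratio can be formed
        for sample in norm:
            value = norm[sample][target]
            ratio = math.log2(value / batch_median) if value > 0 else float("-inf")
            if ratio <= del_thresh:
                calls.append((sample, target, "del", ratio))
            elif ratio >= dup_thresh:
                calls.append((sample, target, "dup", ratio))
    return calls


# Hypothetical batch: sample s3 has half the expected coverage on exon2.
batch = {
    "s1": {"exon1": 100, "exon2": 100, "exon3": 100},
    "s2": {"exon1": 110, "exon2": 110, "exon3": 110},
    "s3": {"exon1": 100, "exon2": 50, "exon3": 100},
    "s4": {"exon1": 90, "exon2": 90, "exon3": 90},
    "s5": {"exon1": 100, "exon2": 100, "exon3": 100},
}
calls = call_cnvs(batch)
print(calls)
```

Comparing each sample with the batch median rather than a fixed reference means that systematic per-target capture biases cancel out, which is presumably why the method works on a batch of samples processed together.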

9.

UET: a database of evolutionarily-predicted functional determinants of protein sequences that cluster as functional sites in protein structures.

Lua, Rhonald C; Wilson, Stephen J; Konecki, Daniel M; Wilkins, Angela D; Venner, Eric; Morgan, Daniel H; Lichtarge, Olivier.

Nucleic Acids Res ; 44(D1): D308-12, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26590254

ABSTRACT

The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/.

Subject(s)

Databases, Protein , Evolution, Molecular , Protein Conformation , Sequence Analysis, Protein

10.

A large-scale evaluation of computational protein function prediction.

Radivojac, Predrag; Clark, Wyatt T; Oron, Tal Ronnen; Schnoes, Alexandra M; Wittkop, Tobias; Sokolov, Artem; Graim, Kiley; Funk, Christopher; Verspoor, Karin; Ben-Hur, Asa; Pandey, Gaurav; Yunes, Jeffrey M; Talwalkar, Ameet S; Repo, Susanna; Souza, Michael L; Piovesan, Damiano; Casadio, Rita; Wang, Zheng; Cheng, Jianlin; Fang, Hai; Gough, Julian; Koskinen, Patrik; Törönen, Petri; Nokso-Koivisto, Jussi; Holm, Liisa; Cozzetto, Domenico; Buchan, Daniel W A; Bryson, Kevin; Jones, David T; Limaye, Bhakti; Inamdar, Harshal; Datta, Avik; Manjari, Sunitha K; Joshi, Rajendra; Chitale, Meghana; Kihara, Daisuke; Lisewski, Andreas M; Erdin, Serkan; Venner, Eric; Lichtarge, Olivier; Rentzsch, Robert; Yang, Haixuan; Romero, Alfonso E; Bhat, Prajwal; Paccanaro, Alberto; Hamp, Tobias; Kaßner, Rebecca; Seemayer, Stefan; Vicedo, Esmeralda; Schaefer, Christian.

Nat Methods ; 10(3): 221-7, 2013 Mar.

Article in English | MEDLINE | ID: mdl-23353650

ABSTRACT

Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.

Subject(s)

Computational Biology/methods , Molecular Biology/methods , Molecular Sequence Annotation , Proteins/physiology , Algorithms , Animals , Databases, Protein , Exoribonucleases/classification , Exoribonucleases/genetics , Exoribonucleases/physiology , Forecasting , Humans , Proteins/chemistry , Proteins/classification , Proteins/genetics , Species Specificity

11.

Accounting for epistatic interactions improves the functional analysis of protein structures.

Wilkins, Angela D; Venner, Eric; Marciano, David C; Erdin, Serkan; Atri, Benu; Lua, Rhonald C; Lichtarge, Olivier.

Bioinformatics ; 29(21): 2714-21, 2013 Nov 01.

Article in English | MEDLINE | ID: mdl-24021383

ABSTRACT

MOTIVATION: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. METHODS AND RESULTS: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. CONCLUSIONS: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. CONTACT: lichtarge@bcm.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Protein Conformation , Sequence Analysis, Protein/methods , Algorithms , Bacterial Proteins/chemistry , Epistasis, Genetic , Evolution, Molecular , Molecular Sequence Annotation , Mutation , Proteins/chemistry , Proteins/genetics , Proteome/chemistry , Serine Endopeptidases/chemistry

12.

Empowering personalized pharmacogenomics with generative AI solutions.

Murugan, Mullai; Yuan, Bo; Venner, Eric; Ballantyne, Christie M; Robinson, Katherine M; Coons, James C; Wang, Liwen; Empey, Philip E; Gibbs, Richard A.

J Am Med Inform Assoc ; 31(6): 1356-1366, 2024 May 20.

Article in English | MEDLINE | ID: mdl-38447590

ABSTRACT

OBJECTIVE: This study evaluates an AI assistant developed using OpenAI's GPT-4 for interpreting pharmacogenomic (PGx) testing results, aiming to improve decision-making and knowledge sharing in clinical genetics and to enhance patient care with equitable access. MATERIALS AND METHODS: The AI assistant employs retrieval-augmented generation (RAG), which combines retrieval and generative techniques, by harnessing a knowledge base (KB) that comprises data from the Clinical Pharmacogenetics Implementation Consortium (CPIC). It uses context-aware GPT-4 to generate tailored responses to user queries from this KB, further refined through prompt engineering and guardrails. RESULTS: Evaluated against a specialized PGx question catalog, the AI assistant showed high efficacy in addressing user queries. Compared with OpenAI's ChatGPT 3.5, it demonstrated better performance, especially in provider-specific queries requiring specialized data and citations. Key areas for improvement include enhancing accuracy, relevancy, and representative language in responses. DISCUSSION: The integration of context-aware GPT-4 with RAG significantly enhanced the AI assistant's utility. RAG's ability to incorporate domain-specific CPIC data, including recent literature, proved beneficial. Challenges persist, such as the need for specialized genetic/PGx models to improve accuracy and relevancy and addressing ethical, regulatory, and safety concerns. CONCLUSION: This study underscores generative AI's potential for transforming healthcare provider support and patient accessibility to complex pharmacogenomic information. While careful implementation of large language models like GPT-4 is necessary, it is clear that they can substantially improve understanding of pharmacogenomic data. With further development, these tools could augment healthcare expertise, provider productivity, and the delivery of equitable, patient-centered healthcare services.

Subject(s)

Pharmacogenetics , Precision Medicine , Humans , Artificial Intelligence , Knowledge Bases , Information Storage and Retrieval/methods , Pharmacogenomic Testing
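
Entry 12 describes retrieval-augmented generation as combining a retrieval step over a knowledge base with a generative model. The toy sketch below shows only the retrieval and prompt-assembly half, under loud assumptions: word-overlap scoring stands in for whatever retriever the study used, the knowledge-base passages are placeholders rather than CPIC content, and the call to the language model is left out entirely.

```python
import re


def tokenize(text):
    """Lowercase word set; punctuation is dropped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, knowledge_base, k=1):
    """Return the k passages sharing the most words with the query (toy scoring)."""
    query_tokens = tokenize(query)
    ranked = sorted(
        knowledge_base,
        key=lambda passage: len(query_tokens & tokenize(passage["text"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, passages):
    """Assemble the context-grounded prompt that would be sent to a generative model."""
    context = "\n".join(f"[{p['id']}] {p['text']}" for p in passages)
    return (
        "Answer using only the context below and cite passage ids.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Placeholder passages; real guideline text is deliberately not reproduced here.
knowledge_base = [
    {"id": "doc1", "text": "placeholder guidance text about gene alpha and drug one"},
    {"id": "doc2", "text": "placeholder guidance text about gene beta and drug two"},
]

query = "What does the guidance say about gene beta?"
top = retrieve(query, knowledge_base)
prompt = build_prompt(query, top)
print(prompt)
```

Restricting the prompt to retrieved passages and asking for passage ids is one simple way to get the citations the abstract highlights for provider-specific queries.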

13.

Defining and Reducing Variant Classification Disparities.

Dawood, Moez; Fayer, Shawn; Pendyala, Sriram; Post, Mason; Kalra, Divya; Patterson, Karynne; Venner, Eric; Muffley, Lara A; Fowler, Douglas M; Rubin, Alan F; Posey, Jennifer E; Plon, Sharon E; Lupski, James R; Gibbs, Richard A; Starita, Lea M; Robles-Espinoza, Carla Daniela; Coyote-Maestas, Willow; Gallego Romero, Irene.

medRxiv ; 2024 Apr 12.

Article in English | MEDLINE | ID: mdl-38645101

ABSTRACT

Background: Multiplexed Assays of Variant Effects (MAVEs) can test all possible single variants in a gene of interest. The resulting saturation-style data may help resolve variant classification disparities between populations, especially for variants of uncertain significance (VUS). Methods: We analyzed clinical significance classifications in 213,663 individuals of European-like genetic ancestry versus 206,975 individuals of non-European-like genetic ancestry from All of Us and the Genome Aggregation Database. Then, we incorporated clinically calibrated MAVE data into the Clinical Genome Resource's Variant Curation Expert Panel rules to automate VUS reclassification for BRCA1, TP53, and PTEN. Results: Using two orthogonal statistical approaches, we show a higher prevalence (p ≤ 5.95e-06) of VUS in individuals of non-European-like genetic ancestry across all medical specialties assessed in all three databases. Further, in the non-European-like genetic ancestry group, higher rates of Benign or Likely Benign and variants with no clinical designation (p ≤ 2.5e-05) were found across many medical specialties, whereas Pathogenic or Likely Pathogenic assignments were higher in individuals of European-like genetic ancestry (p ≤ 2.5e-05). Using MAVE data, we reclassified VUS in individuals of non-European-like genetic ancestry at a significantly higher rate in comparison to reclassified VUS from European-like genetic ancestry (p = 9.1e-03) effectively compensating for the VUS disparity. Further, essential code analysis showed equitable impact of MAVE evidence codes but inequitable impact of allele frequency (p = 7.47e-06) and computational predictor (p = 6.92e-05) evidence codes for individuals of non-European-like genetic ancestry. Conclusions: Generation of saturation-style MAVE data should be a priority to reduce VUS disparities and produce equitable training data for future computational predictors.

14.

Frequency of pharmacogenomic variation and medication exposures among All of Us Participants.

Haddad, Andrew; Radhakrishnan, Aparna; McGee, Sean; Smith, Joshua D; Karnes, Jason H; Venner, Eric; Wheeler, Marsha M; Patterson, Karynne; Walker, Kimberly; Kalra, Divya; Kalla, Sara E; Wang, Qiaoyan; Gibbs, Richard A; Jarvik, Gail P; Sanchez, Janeth; Musick, Anjene; Ramirez, Andrea H; Denny, Joshua C; Empey, Philip E.

medRxiv ; 2024 Jun 13.

Article in English | MEDLINE | ID: mdl-38946996

ABSTRACT

Pharmacogenomics promises improved outcomes through individualized prescribing. However, the lack of diversity in studies impedes clinical translation and equitable application of precision medicine. We evaluated the frequencies of PGx variants, predicted phenotypes, and medication exposures using whole genome sequencing and EHR data from nearly 100k diverse All of Us Research Program participants. We report 100% of participants carried at least one pharmacogenomics variant and nearly all (99.13%) had a predicted phenotype with prescribing recommendations. Clinical impact was high with over 20% having both an actionable phenotype and a prior exposure to an impacted medication with pharmacogenomic prescribing guidance. Importantly, we also report hundreds of alleles and predicted phenotypes that deviate from known frequencies and/or were previously unreported, including within admixed American and African ancestry groups.

15.

The frequency of pathogenic variation in the All of Us cohort reveals ancestry-driven disparities.

Venner, Eric; Patterson, Karynne; Kalra, Divya; Wheeler, Marsha M; Chen, Yi-Ju; Kalla, Sara E; Yuan, Bo; Karnes, Jason H; Walker, Kimberly; Smith, Joshua D; McGee, Sean; Radhakrishnan, Aparna; Haddad, Andrew; Empey, Philip E; Wang, Qiaoyan; Lichtenstein, Lee; Toledo, Diana; Jarvik, Gail; Musick, Anjene; Gibbs, Richard A.

Commun Biol ; 7(1): 174, 2024 Feb 19.

Article in English | MEDLINE | ID: mdl-38374434

ABSTRACT

Disparities in the data underlying clinical genomic interpretation are an acknowledged problem, but there is a paucity of data demonstrating them. The All of Us Research Program is collecting data including whole-genome sequences, health records, and surveys for at least a million participants with diverse ancestry and access to healthcare, representing one of the largest biomedical research repositories of its kind. Here, we examine pathogenic and likely pathogenic variants that were identified in the All of Us cohort. The European ancestry subgroup showed the highest overall rate of pathogenic variation, with 2.26% of participants having a pathogenic variant. Other ancestry groups had lower rates of pathogenic variation, including 1.62% for the African ancestry group and 1.32% for the Latino/Admixed American ancestry group. Pathogenic variants were most frequently observed in genes related to Breast/Ovarian Cancer or Hypercholesterolemia. Variant frequencies in many genes were consistent with the data from the public gnomAD database, with some notable exceptions resolved using gnomAD subsets. Differences in pathogenic variant frequency observed between ancestral groups generally indicate biases of ascertainment of knowledge about those variants, but some deviations may be indicative of differences in disease prevalence. This work will allow precision medicine efforts to target the revealed disparities.

Subject(s)

Genetic Predisposition to Disease, Population Health, Humans, Black People, Genomics, Hispanic or Latino/genetics, United States/epidemiology, European People, African People, Black or African American

16.

Genetic sex validation for sample tracking in next-generation sequencing clinical testing.

Hu, Jianhong; Korchina, Viktoriya; Zouk, Hana; Harden, Maegan V; Murdock, David; Macbeth, Alyssa; Harrison, Steven M; Lennon, Niall; Kovar, Christie; Balasubramanian, Adithya; Zhang, Lan; Chandanavelli, Gauthami; Pasham, Divya; Rowley, Robb; Wiley, Ken; Smith, Maureen E; Gordon, Adam; Jarvik, Gail P; Sleiman, Patrick; Kelly, Melissa A; Bland, Harris T; Murugan, Mullai; Venner, Eric; Boerwinkle, Eric; Prows, Cynthia; Mahanta, Lisa; Rehm, Heidi L; Gibbs, Richard A; Muzny, Donna M.

BMC Res Notes ; 17(1): 62, 2024 Mar 03.

Article in English | MEDLINE | ID: mdl-38433186

ABSTRACT

OBJECTIVE: Data from DNA genotyping via a 96-SNP panel in a study of 25,015 clinical samples were utilized for quality control and tracking of sample identity in a clinical sequencing network. The study aimed to demonstrate both the value of precise SNP tracking and the utility of the panel for predicting the sex-by-genotype of the participants, to identify possible sample mix-ups. RESULTS: Precise SNP tracking showed no sample swap errors within the clinical testing laboratories. In contrast, when comparing predicted sex-by-genotype to the provided sex on the test requisition, we identified 110 inconsistencies among 25,015 clinical samples (0.44%) that had occurred during sample collection or accessioning. The genetic sex predictions were confirmed using additional SNP sites in the sequencing data or high-density genotyping arrays. Discrepancies resulted from clerical errors (49.09%), samples from transgender participants (3.64%), samples from stem cell or bone marrow transplant patients (7.27%), and undetermined sample mix-ups (40%). For the undetermined mix-ups, the sample swaps occurred before arrival at the genome centers, but the exact cause at the sampling sites could not be determined.

Subject(s)

Clinical Laboratory Services, High-Throughput Nucleotide Sequencing, Humans, Bone Marrow Transplantation, Genotype, Laboratories

17.

Function prediction from networks of local evolutionary similarity in protein structure.

Erdin, Serkan; Venner, Eric; Lisewski, Andreas Martin; Lichtarge, Olivier.

BMC Bioinformatics ; 14 Suppl 3: S6, 2013.

Article in English | MEDLINE | ID: mdl-23514548

ABSTRACT

BACKGROUND: Annotating protein function with both high accuracy and sensitivity remains a major challenge in structural genomics. One proven computational strategy has been to group a few key functional amino acids into templates and search for these templates in other protein structures, so as to transfer function when a match is found. To this end, we previously developed Evolutionary Trace Annotation (ETA) and showed that diffusing known annotations over a network of template matches on a structural genomic scale improved predictions of function. In order to further increase sensitivity, we now let each protein contribute multiple templates rather than just one, and also let the template size vary. RESULTS: Retrospective benchmarks in 605 Structural Genomics enzymes showed that multiple templates increased sensitivity by up to 14% when combined with single template predictions even as they maintained the accuracy over 91%. Diffusing function globally on networks of single and multiple template matches marginally increased the area under the ROC curve over 0.97, but in a subset of proteins that could not be annotated by ETA, the network approach recovered annotations for the most confident 20-23 of 91 cases with 100% accuracy. CONCLUSIONS: We improve the accuracy and sensitivity of predictions by using multiple templates per protein structure when constructing networks of ETA matches and diffusing annotations.

Subject(s)

Protein Conformation, Proteins/physiology, Algorithms, Computational Biology, Databases, Protein, Enzymes/chemistry, Evolution, Molecular, Genomics, Molecular Sequence Annotation, Proteins/chemistry, Proteins/genetics

18.

ETAscape: analyzing protein networks to predict enzymatic function and substrates in Cytoscape.

Bachman, Benjamin J; Venner, Eric; Lua, Rhonald C; Erdin, Serkan; Lichtarge, Olivier.

Bioinformatics ; 28(16): 2186-8, 2012 Aug 15.

Article in English | MEDLINE | ID: mdl-22689386

ABSTRACT

UNLABELLED: Most proteins lack experimentally validated functions. To address this problem, we implemented the Evolutionary Trace Annotation (ETA) method in the Cytoscape network visualization environment. The result is the ETAscape plugin, which builds a structural genomics network based on local structural and evolutionary similarities among proteins and then globally diffuses known annotations across the resulting network. The plugin displays these novel functional annotations, their confidence, the molecular basis for individual matches and the set of matches that lead to a prediction. AVAILABILITY: The ETA Network Plugin is available publicly for download at http://mammoth.bcm.tmc.edu/networks/.

Subject(s)

Computational Biology/methods, Proteins/chemistry, Software, Enzymes/analysis, Enzymes/chemistry, Genomics/methods, Proteins/analysis, Substrate Specificity

19.

Familial Hypercholesterolemia in the Electronic Medical Records and Genomics Network: Prevalence, Penetrance, Cardiovascular Risk, and Outcomes After Return of Results.

Dikilitas, Ozan; Sherafati, Alborz; Saadatagah, Seyedmohammad; Satterfield, Benjamin A; Kochan, David C; Anderson, Katherine C; Chung, Wendy K; Hebbring, Scott J; Salvati, Zachary M; Sharp, Richard R; Sturm, Amy C; Gibbs, Richard A; Rowley, Robb; Venner, Eric; Linder, Jodell E; Jones, Laney K; Perez, Emma F; Peterson, Josh F; Jarvik, Gail P; Rehm, Heidi L; Zouk, Hana; Roden, Dan M; Williams, Marc S; Manolio, Teri A; Kullo, Iftikhar J.

Circ Genom Precis Med ; 16(2): e003816, 2023 04.

Article in English | MEDLINE | ID: mdl-37071725

ABSTRACT

BACKGROUND: The implications of secondary findings detected in large-scale sequencing projects remain uncertain. We assessed prevalence and penetrance of pathogenic familial hypercholesterolemia (FH) variants, their association with coronary heart disease (CHD), and 1-year outcomes following return of results in phase III of the Electronic Medical Records and Genomics Network. METHODS: Adult participants (n=18,544) at 7 sites were enrolled in a prospective cohort study to assess the clinical impact of returning results from targeted sequencing of 68 actionable genes, including LDLR, APOB, and PCSK9. FH variant prevalence and penetrance (defined as low-density lipoprotein cholesterol >155 mg/dL) were estimated after excluding participants enrolled on the basis of hypercholesterolemia. Multivariable logistic regression was used to estimate the odds of CHD compared to age- and sex-matched controls without FH-associated variants. Process (eg, referral to a specialist or ordering new tests), intermediate (eg, new diagnosis of FH), and clinical (eg, treatment modification) outcomes within 1 year after return of results were ascertained by electronic health record review. RESULTS: The prevalence of FH-associated pathogenic variants was 1 in 188 (69 of 13,019 unselected participants). Penetrance was 87.5%. The presence of an FH variant was associated with CHD (odds ratio, 3.02 [2.00-4.53]) and premature CHD (odds ratio, 3.68 [2.34-5.78]). At least 1 outcome occurred in 92% of participants; 44% received a new diagnosis of FH and 26% had treatment modified following return of results. CONCLUSIONS: In a multisite cohort of electronic health record-linked biobanks, monogenic FH was prevalent, penetrant, and associated with presence of CHD. Nearly half of participants with an FH-associated variant received a new diagnosis of FH and a quarter had treatment modified after return of results. These results highlight the potential utility of sequencing electronic health record-linked biobanks to detect FH.

Subject(s)

Cardiovascular Diseases, Coronary Artery Disease, Hyperlipoproteinemia Type II, Adult, Humans, Proprotein Convertase 9/genetics, Electronic Health Records, Penetrance, Prevalence, Prospective Studies, Risk Factors, Hyperlipoproteinemia Type II/diagnosis, Hyperlipoproteinemia Type II/epidemiology, Hyperlipoproteinemia Type II/genetics, Coronary Artery Disease/genetics, Heart Disease Risk Factors, Genomics

20.

Genetic Sex Validation for Sample Tracking in Clinical Testing.

Hu, Jianhong; Korchina, Viktoriya; Zouk, Hana; Harden, Maegan V; Murdock, David; Macbeth, Alyssa; Harrison, Steven M; Lennon, Niall; Kovar, Christie; Balasubramanian, Adithya; Zhang, Lan; Chandanavelli, Gauthami; Pasham, Divya; Rowley, Robb; Wiley, Ken; Smith, Maureen E; Gordon, Adam; Jarvik, Gail P; Sleiman, Patrick; Kelly, Melissa A; Bland, Harris T; Murugan, Mullai; Venner, Eric; Boerwinkle, Eric; Prows, Cynthia; Mahanta, Lisa; Rehm, Heidi L; Gibbs, Richard A; Muzny, Donna M.

Res Sq ; 2023 Sep 11.

Article in English | MEDLINE | ID: mdl-37790445

ABSTRACT

Objective: Data from DNA genotyping via a 96-SNP panel in a study of 25,015 clinical samples were utilized for quality control and tracking of sample identity in a clinical sequencing network. The study aimed to demonstrate both the value of precise SNP tracking and the utility of the panel for predicting the sex-by-genotype of the participants, to identify possible sample mix-ups. Results: Precise SNP tracking showed no sample swap errors within the clinical testing laboratories. In contrast, when comparing predicted sex-by-genotype to the provided sex on the test requisition, we identified 110 inconsistencies among 25,015 clinical samples (0.44%) that had occurred during sample collection or accessioning. The genetic sex predictions were confirmed using additional SNP sites in the sequencing data or high-density genotyping arrays. Discrepancies resulted from clerical errors, samples from transgender participants, samples from stem cell or bone marrow transplant patients, and undetermined sample mix-ups.
