Results 1 - 20 of 71
1.
Hum Genet ; 2024 Aug 07.
Article in English | MEDLINE | ID: mdl-39110250

ABSTRACT

This paper presents an evaluation of predictions submitted for the "HMBS" challenge, a component of the sixth round of the Critical Assessment of Genome Interpretation held in 2021. The challenge required participants to predict the effects of missense variants of the human HMBS gene on yeast growth. The HMBS enzyme, critical for the biosynthesis of heme in eukaryotic cells, is highly conserved among eukaryotes. Despite the application of a variety of algorithms and methods, the performance of predictors was relatively similar, with Kendall's tau correlation coefficients between predictions and experimental scores around 0.3 for a majority of submissions. Notably, the median correlation (≥ 0.34) observed among these predictors, especially the top predictions from different groups, was greater than the correlation observed between their predictions and the actual experimental results. Most predictors were moderately successful in distinguishing between deleterious and benign variants, as evidenced by an area under the receiver operating characteristic (ROC) curve (AUC) of approximately 0.7. Compared with the two most recent rounds of CAGI competitions, we noticed that more predictors outperformed the baseline predictor, which is based solely on amino acid frequencies. Nevertheless, the overall accuracy of predictions is still far short of the positive control, which is derived from experimental scores, indicating the necessity for considerable improvements in the field. The most inaccurately predicted variants in this round were associated with the insertion loop, which is absent in many orthologs, suggesting that the predictors still rely heavily on information from multiple sequence alignments.
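The two headline metrics in this abstract, Kendall's tau and the ROC AUC, can be illustrated with a short standard-library sketch. It is not the assessors' code: the function names, the toy scores, and the 0.5 cutoff used to label variants as deleterious are all assumptions made for illustration, and the tau computed here is the simple tau-a variant, which may differ from what the assessment used when ties are present.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Pairs tied in either list count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def roc_auc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical fitness scores for five variants (higher = more functional).
experimental = [0.1, 0.4, 0.35, 0.8, 0.9]
predicted = [0.2, 0.3, 0.5, 0.6, 0.95]

tau = kendall_tau(experimental, predicted)

# Assumed cutoff: variants with experimental fitness below 0.5 are "deleterious".
deleterious = [score < 0.5 for score in experimental]
# Negate the predictions so that a higher score means "more deleterious".
auc = roc_auc([-p for p in predicted], deleterious)

print(f"Kendall tau = {tau:.2f}, AUC = {auc:.2f}")
```

Tau rewards getting the rank order of all variants right, whereas the AUC only asks whether deleterious variants are ranked above benign ones, which is one reason a predictor can score modestly on the first and better on the second.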

2.
Cancer Discov ; 14(9): 1699-1716, 2024 Sep 04.
Article in English | MEDLINE | ID: mdl-39193992

ABSTRACT

Upregulation of MYC is a hallmark of cancer, wherein MYC drives oncogenic gene expression and elevates total RNA synthesis across cancer cell transcriptomes. Although this transcriptional anabolism fuels cancer growth and survival, the consequences and metabolic stresses induced by excess cellular RNA are poorly understood. Herein, we discover that RNA degradation and downstream ribonucleotide catabolism is a novel mechanism of MYC-induced cancer cell death. Combining genetics and metabolomics, we find that MYC increases RNA decay through the cytoplasmic exosome, resulting in the accumulation of cytotoxic RNA catabolites and reactive oxygen species. Notably, tumor-derived exosome mutations abrogate MYC-induced cell death, suggesting excess RNA decay may be toxic to human cancers. In agreement, purine salvage acts as a compensatory pathway that mitigates MYC-induced ribonucleotide catabolism, and inhibitors of purine salvage impair MYC+ tumor progression. Together, these data suggest that MYC-induced RNA decay is an oncogenic stress that can be exploited therapeutically. Significance: MYC is the most common oncogenic driver of poor-prognosis cancers but has been recalcitrant to therapeutic inhibition. We discovered a new vulnerability in MYC+ cancer where MYC induces cell death through excess RNA decay. Therapeutics that exacerbate downstream ribonucleotide catabolism provide a therapeutically tractable approach to TNBC (Triple-negative Breast Cancer) and other MYC-driven cancers.


Subject(s)
Breast Neoplasms , Proto-Oncogene Proteins c-myc , RNA Stability , Ribonucleotides , Humans , Female , Breast Neoplasms/metabolism , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Proto-Oncogene Proteins c-myc/metabolism , Proto-Oncogene Proteins c-myc/genetics , Ribonucleotides/pharmacology , Cell Line, Tumor , Mice , Gene Expression Regulation, Neoplastic , Animals
3.
Res Sq ; 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-39011112

ABSTRACT

Critical evaluation of computational tools for predicting variant effects is important considering their increased use in disease diagnosis and driving molecular discoveries. In the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, a dataset of 28 STK11 rare variants (27 missense, 1 single amino acid deletion), identified in primary non-small cell lung cancer biopsies, was experimentally assayed to characterize computational methods from four participating teams and five publicly available tools. Predictors demonstrated a high level of performance on key evaluation metrics, measuring correlation with the assay outputs and separating loss-of-function (LoF) variants from wildtype-like (WT-like) variants. The best participant model, 3Cnet, performed competitively with well-known tools. Unique to this challenge was that the functional data was generated with both biological and technical replicates, thus allowing the assessors to realistically establish maximum predictive performance based on experimental variability. Three out of the five publicly available tools and 3Cnet approached the performance of the assay replicates in separating LoF variants from WT-like variants. Surprisingly, REVEL, an often-used model, achieved a comparable correlation with the real-valued assay output as that seen for the experimental replicates. Performing variant interpretation by combining the new functional evidence with computational and population data evidence led to 16 new variants receiving a clinically actionable classification of likely pathogenic (LP) or likely benign (LB). Overall, the STK11 challenge highlights the utility of variant effect predictors in biomedical sciences and provides encouraging results for driving research in the field of computational genome interpretation.

4.
Blood Cancer J ; 14(1): 99, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38890297

ABSTRACT

Current therapies for high-grade TP53-mutated myeloid neoplasms (≥10% blasts) do not offer a meaningful survival benefit except allogeneic stem cell transplantation in the minority who achieve a complete response to first-line therapy (CR1). To identify reliable pre-therapy predictors of CR1 and outcomes, we assembled a cohort of 242 individuals with TP53-mutated myeloid neoplasms and ≥10% blasts with well-annotated clinical, molecular and pathology data. Key outcomes examined were CR1 and 24-month survival (OS24). In this elderly cohort (median age 68.2 years) with 74.0% receiving frontline non-intensive regimens (hypomethylating agents +/- venetoclax; HMA +/- Ven), the overall cohort CR1 rate was 25.6% (50/195). We additionally identified several pre-therapy factors predictive of inferior CR1 including male gender (P = 0.026), ≥2 autosomal monosomies (P < 0.001), -17/17p (P = 0.011), multi-hit TP53 allelic state (P < 0.001) and CUX1 co-alterations (P = 0.010). In univariable analysis of the entire cohort, inferior OS24 was predicted by ≥2 monosomies (P = 0.004), TP53 VAF > 25% (P = 0.002), TP53 splice junction mutations (P = 0.007) and antecedent treated myeloid neoplasm (P = 0.001). In addition, mutations/deletions in CUX1, U2AF1, EZH2, TET2, CBL, or KRAS ('EPI6' signature) predicted inferior OS24 (HR = 2.0 [1.5-2.8]; P < 0.0001). In a subgroup analysis of HMA +/- Ven treated individuals (N = 144), TP53 VAF and monosomies did not impact OS24. A risk score for HMA +/- Ven treated individuals incorporating three pre-therapy predictors (TP53 splice junction mutations, EPI6 and antecedent treated myeloid neoplasm) stratified three prognostically distinct groups: intermediate, intermediate-poor, and poor, with significantly different median (12.8, 6.0, 4.3 months) and 24-month (20.9%, 5.7%, 0.5%) survival (P < 0.0001). For the first time, in a seemingly monolithic high-risk cohort, our data identify several baseline factors that predict response and 24-month survival.


Subject(s)
Mutation , Tumor Suppressor Protein p53 , Humans , Male , Female , Aged , Tumor Suppressor Protein p53/genetics , Middle Aged , Aged, 80 and over , Adult , Prognosis , Treatment Outcome
5.
J Biol Chem ; 300(6): 107368, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38750793

ABSTRACT

Activating signal co-integrator complex 1 (ASCC1) acts with the ASCC-ALKBH3 complex in alkylation damage responses. ASCC1 uniquely combines two evolutionarily ancient domains: nucleotide-binding K-Homology (KH) (associated with regulating splicing, transcription, and translation) and two-histidine phosphodiesterase (PDE; associated with hydrolysis of cyclic nucleotide phosphate bonds). Germline mutations link loss of ASCC1 function to spinal muscular atrophy with congenital bone fractures 2 (SMABF2). Herein, analysis of The Cancer Genome Atlas (TCGA) suggests ASCC1 RNA overexpression in certain tumors correlates with poor survival, Signatures 29 and 3 mutations, and genetic instability markers. We determined crystal structures of Alvinella pompejana (Ap) ASCC1 and Human (Hs) PDE domain revealing high-resolution details and features conserved over 500 million years of evolution. Extending our understanding of the KH domain Gly-X-X-Gly sequence motif, we define a novel structural Helix-Clasp-Helix (HCH) nucleotide binding motif and show ASCC1 sequence-specific binding to CGCG-containing RNA. The V-shaped PDE nucleotide binding channel has two His-Φ-Ser/Thr-Φ (HXT) motifs (Φ being hydrophobic) positioned to initiate cyclic phosphate bond hydrolysis. A conserved atypical active-site histidine torsion angle implies a novel PDE substrate. A flexible active-site loop and an arginine-rich domain linker appear regulatory. Small-angle X-ray scattering (SAXS) revealed aligned KH-PDE RNA binding sites with limited flexibility in solution. Quantitative evolutionary bioinformatic analyses of disease and cancer-associated mutations support implied functional roles for RNA binding, phosphodiesterase activity, and regulation. Collective results inform ASCC1's roles in transactivation and alkylation damage responses, its targeting by structure-based inhibitors, and how ASCC1 mutations may impact inherited disease and cancer.


Subject(s)
Phosphoric Diester Hydrolases , Humans , Computational Biology/methods , Crystallography, X-Ray , Phosphoric Diester Hydrolases/metabolism , Phosphoric Diester Hydrolases/chemistry , Phosphoric Diester Hydrolases/genetics , RNA-Binding Motifs/genetics
6.
bioRxiv ; 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38798479

ABSTRACT

Continued advances in variant effect prediction are necessary to demonstrate the ability of machine learning methods to accurately determine the clinical impact of variants of unknown significance (VUS). Towards this goal, the ARSA Critical Assessment of Genome Interpretation (CAGI) challenge was designed to characterize progress by utilizing 219 experimentally assayed missense VUS in the Arylsulfatase A (ARSA) gene to assess the performance of community-submitted predictions of variant functional effects. The challenge involved 15 teams, and evaluated additional predictions from established and recently released models. Notably, a model developed by participants of a genetics and coding bootcamp, trained with standard machine-learning tools in Python, demonstrated superior performance among submissions. Furthermore, the study observed that state-of-the-art deep learning methods provided small but statistically significant improvement in predictive performance compared to less elaborate techniques. These findings underscore the utility of variant effect prediction, and the potential for models trained with modest resources to accurately classify VUS in genetic and clinical research.

7.
Hum Genomics ; 18(1): 44, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38685113

ABSTRACT

BACKGROUND: A major obstacle faced by families with rare diseases is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years and causal variants are identified in under 50%, even when capturing variants genome-wide. To aid in the interpretation and prioritization of the vast number of variants detected, computational methods are proliferating. Knowing which tools are most effective remains unclear. To evaluate the performance of computational methods, and to encourage innovation in method development, we designed a Critical Assessment of Genome Interpretation (CAGI) community challenge to place variant prioritization models head-to-head in a real-life clinical diagnostic setting. METHODS: We utilized genome sequencing (GS) data from families sequenced in the Rare Genomes Project (RGP), a direct-to-participant research study on the utility of GS for rare disease diagnosis and gene discovery. Challenge predictors were provided with a dataset of variant calls and phenotype terms from 175 RGP individuals (65 families), including 35 solved training set families with causal variants specified, and 30 unlabeled test set families (14 solved, 16 unsolved). We tasked teams to identify causal variants in as many families as possible. Predictors submitted variant predictions with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on the rank position of causal variants, and the maximum F-measure, based on precision and recall of causal variants across all EPCR values. RESULTS: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performers recalled causal variants in up to 13 of 14 solved families within the top 5 ranked variants. 
Newly discovered diagnostic variants were returned to two previously unsolved families following confirmatory RNA sequencing, and two novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant in an unsolved proband with phenotypes consistent with asparagine synthetase deficiency. CONCLUSIONS: Model methodology and performance were highly variable. Models weighing call quality, allele frequency, predicted deleteriousness, segregation, and phenotype were effective in identifying causal variants, and models open to phenotype expansion and non-coding variants were able to capture more difficult diagnoses and discover new diagnoses. Overall, computational models can significantly aid variant prioritization. For use in diagnostics, detailed review and conservative assessment of prioritized variants against established criteria is needed.


Subject(s)
Rare Diseases , Humans , Rare Diseases/genetics , Rare Diseases/diagnosis , Genome, Human/genetics , Genetic Variation/genetics , Computational Biology/methods , Phenotype
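The maximum F-measure described in the METHODS of this abstract can be sketched as a sweep over EPCR thresholds. The abstract does not say whether precision and recall were computed per family or pooled across families, so the pooled version below is an assumption, as are the function name and the toy variant identifiers; the rank-based weighted score is not specified in enough detail to reproduce and is left out.

```python
def max_f_measure(predictions, causal):
    """Best F1 over all EPCR thresholds.

    predictions: list of (variant_id, epcr) pairs from one model.
    causal: set of variant ids known to be causal.
    A variant is "called" at threshold t when its EPCR is >= t.
    """
    if not causal:
        raise ValueError("need at least one causal variant")
    best = 0.0
    thresholds = sorted({epcr for _, epcr in predictions}, reverse=True)
    for t in thresholds:
        called = {variant for variant, epcr in predictions if epcr >= t}
        true_pos = len(called & causal)
        if true_pos == 0:
            continue  # F1 is 0 (or undefined) when nothing correct is called
        precision = true_pos / len(called)
        recall = true_pos / len(causal)
        f1 = 2 * precision * recall / (precision + recall)
        best = max(best, f1)
    return best


# Hypothetical submission: four ranked variants, two of them truly causal.
submission = [("v1", 0.9), ("v2", 0.8), ("v3", 0.6), ("v4", 0.3)]
truth = {"v1", "v3"}

print(f"max F-measure = {max_f_measure(submission, truth):.2f}")
```

Because the score is taken at the best threshold, it rewards a model for ranking causal variants well even when its EPCR values are poorly calibrated in absolute terms.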
8.
Am J Hum Genet ; 111(3): 487-508, 2024 03 07.
Article in English | MEDLINE | ID: mdl-38325380

ABSTRACT

Pathogenic variants in multiple genes on the X chromosome have been implicated in syndromic and non-syndromic intellectual disability disorders. ZFX on Xp22.11 encodes a transcription factor that has been linked to diverse processes including oncogenesis and development, but germline variants have not been characterized in association with disease. Here, we present clinical and molecular characterization of 18 individuals with germline ZFX variants. Exome or genome sequencing revealed 11 variants in 18 subjects (14 males and 4 females) from 16 unrelated families. Four missense variants were identified in 11 subjects, with seven truncation variants in the remaining individuals. Clinical findings included developmental delay/intellectual disability, behavioral abnormalities, hypotonia, and congenital anomalies. Overlapping and recurrent facial features were identified in all subjects, including thickening and medial broadening of eyebrows, variations in the shape of the face, external eye abnormalities, smooth and/or long philtrum, and ear abnormalities. Hyperparathyroidism was found in four families with missense variants, and enrichment of different tumor types was observed. In molecular studies, DNA-binding domain variants elicited differential expression of a small set of target genes relative to wild-type ZFX in cultured cells, suggesting a gain or loss of transcriptional activity. Additionally, a zebrafish model of ZFX loss displayed an altered behavioral phenotype, providing additional evidence for the functional significance of ZFX. Our clinical and experimental data support that variants in ZFX are associated with an X-linked intellectual disability syndrome characterized by a recurrent facial gestalt, neurocognitive and behavioral abnormalities, and an increased risk for congenital anomalies and hyperparathyroidism.


Subject(s)
Hyperparathyroidism , Intellectual Disability , Neurodevelopmental Disorders , Male , Female , Animals , Humans , Intellectual Disability/pathology , Zebrafish/genetics , Mutation, Missense/genetics , Transcription Factors/genetics , Phenotype , Neurodevelopmental Disorders/genetics
9.
Hum Genet ; 2024 Jan 03.
Article in English | MEDLINE | ID: mdl-38170232

ABSTRACT

Variants which disrupt splicing are a frequent cause of rare disease that have been under-ascertained clinically. Accurate and efficient methods to predict a variant's impact on splicing are needed to interpret the growing number of variants of unknown significance (VUS) identified by exome and genome sequencing. Here, we present the results of the CAGI6 Splicing VUS challenge, which invited predictions of the splicing impact of 56 variants ascertained clinically and functionally validated to determine splicing impact. The performance of 12 prediction methods, along with SpliceAI and CADD, was compared on the 56 functionally validated variants. The maximum accuracy achieved was 82% from two different approaches, one weighting SpliceAI scores by minor allele frequency, and one applying the recently published Splicing Prediction Pipeline (SPiP). SPiP performed optimally in terms of sensitivity, while an ensemble method combining multiple prediction tools and information from databases exceeded all others for specificity. Several challenge methods equalled or exceeded the performance of SpliceAI, with ultimate choice of prediction method likely to depend on experimental or clinical aims. One quarter of the variants were incorrectly predicted by at least 50% of the methods, highlighting the need for further improvements to splicing prediction methods for successful clinical application.
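The accuracy, sensitivity and specificity figures quoted in this abstract are the standard confusion-matrix ratios for binary calls; a minimal sketch follows. The function name and the eight toy calls are hypothetical and are not data from the challenge.

```python
def confusion_metrics(predicted, observed):
    """Accuracy, sensitivity and specificity for binary calls.

    predicted, observed: equal-length sequences of truthy/falsy values,
    where a truthy value means "variant disrupts splicing".
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two equal-length, non-empty sequences")
    tp = sum(1 for p, o in zip(predicted, observed) if p and o)
    tn = sum(1 for p, o in zip(predicted, observed) if not p and not o)
    fp = sum(1 for p, o in zip(predicted, observed) if p and not o)
    fn = sum(1 for p, o in zip(predicted, observed) if not p and o)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / len(observed),
        "sensitivity": tp / (tp + fn),  # share of true disruptors detected
        "specificity": tn / (tn + fp),  # share of neutral variants cleared
    }


# Hypothetical calls for eight variants (1 = splice-disrupting).
method_calls = [1, 1, 0, 0, 1, 0, 1, 0]
assay_result = [1, 1, 1, 0, 0, 0, 1, 0]

print(confusion_metrics(method_calls, assay_result))
```

Reporting the three numbers together matters here because, as the abstract notes, one method led on sensitivity while another led on specificity, and accuracy alone would hide that trade-off.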

10.
Nat Metab ; 5(10): 1673-1684, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37709961

ABSTRACT

The glucagon-like peptide 1 receptor (GLP1R) is a major drug target with several agonists being prescribed in individuals with type 2 diabetes and obesity. The impact of genetic variability of GLP1R on receptor function and its association with metabolic traits are unclear with conflicting reports. Here, we show an unexpected diversity of phenotypes ranging from defective cell surface expression to complete or pathway-specific gain of function (GoF) and loss of function (LoF), after performing a functional profiling of 60 GLP1R variants across four signalling pathways. The defective insulin secretion of GLP1R LoF variants is rescued by allosteric GLP1R ligands or high concentrations of exendin-4/semaglutide in INS-1 823/3 cells. Genetic association studies in 200,000 participants from the UK Biobank show that impaired GLP1R cell surface expression contributes to poor glucose control and increased adiposity with increased glycated haemoglobin A1c and body mass index. This study defines impaired GLP1R cell surface expression as a risk factor for traits associated with type 2 diabetes and obesity and provides potential treatment options for GLP1R LoF variant carriers.


Subject(s)
Blood Glucose , Diabetes Mellitus, Type 2 , Humans , Insulin/metabolism , Diabetes Mellitus, Type 2/genetics , Adiposity/genetics , Obesity/genetics
11.
Res Sq ; 2023 Aug 02.
Article in English | MEDLINE | ID: mdl-37577579

ABSTRACT

In the context of the sixth edition of the Critical Assessment of Genome Interpretation (CAGI6), the Genetics of Neurodevelopmental Disorders Lab in Padua proposed a new ID-challenge as an opportunity to develop computational methods for predicting patients' phenotypes and causal variants. Eight research teams, with 30 models in total, had access to the phenotype details and real genetic data, based on the sequences of 74 genes (VCF format) in 415 pediatric patients affected by neurodevelopmental disorders (NDDs). NDDs are clinically and genetically heterogeneous conditions with onset in infancy. In this study we evaluate the ability and accuracy of computational methods to predict comorbid phenotypes, based on the clinical features described in each patient, and causal variants. Finally, we asked participants to develop a method to find new possible genetic causes for patients without a genetic diagnosis. As in CAGI5, seven clinical features (ID, ASD, ataxia, epilepsy, microcephaly, macrocephaly, hypotonia) and variants (causative, putative pathogenic and contributing factors) were provided. Considering the overall clinical manifestation of our cohort, we provided the variant data and phenotypic traits of the 150 patients from the CAGI5 ID-Challenge for training and validation during the development of prediction methods.

12.
medRxiv ; 2023 Aug 04.
Article in English | MEDLINE | ID: mdl-37577678

ABSTRACT

Background: A major obstacle faced by rare disease families is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years, and causal variants are identified in under 50%. The Rare Genomes Project (RGP) is a direct-to-participant research study on the utility of genome sequencing (GS) for diagnosis and gene discovery. Families are consented for sharing of sequence and phenotype data with researchers, allowing development of a Critical Assessment of Genome Interpretation (CAGI) community challenge, placing variant prioritization models head-to-head in a real-life clinical diagnostic setting. Methods: Predictors were provided a dataset of phenotype terms and variant calls from GS of 175 RGP individuals (65 families), including 35 solved training set families, with causal variants specified, and 30 test set families (14 solved, 16 unsolved). The challenge tasked teams with identifying the causal variants in as many test set families as possible. Ranked variant predictions were submitted with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on rank position of true positive causal variants and maximum F-measure, based on precision and recall of causal variants across EPCR thresholds. Results: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performing teams recalled the causal variants in up to 13 of 14 solved families by prioritizing high quality variant calls that were rare, predicted deleterious, segregating correctly, and consistent with reported phenotype. In unsolved families, newly discovered diagnostic variants were returned to two families following confirmatory RNA sequencing, and two prioritized novel disease gene candidates were entered into Matchmaker Exchange. 
In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant, in an unsolved proband with phenotype overlap with asparagine synthetase deficiency. Conclusions: By objective assessment of variant predictions, we provide insights into current state-of-the-art algorithms and platforms for genome sequencing analysis for rare disease diagnosis and explore areas for future optimization. Identification of diagnostic variants in unsolved families promotes synergy between researchers with clinical and computational expertise as a means of advancing the field of clinical genome interpretation.

13.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37522889

ABSTRACT

SUMMARY: In any population under selective pressure, a central challenge is to distinguish the genes that drive adaptation from others which, subject to population variation, harbor many neutral mutations de novo. We recently showed that such genes could be identified by supplementing information on mutational frequency with an evolutionary analysis of the likely functional impact of coding variants. This approach improved the discovery of driver genes in both lab-evolved and environmental Escherichia coli strains. To facilitate general adoption, we have now developed ShinyBioHEAT, an R Shiny web-based application that enables identification of phenotype-driving genes in two commonly used model bacteria, E. coli and Bacillus subtilis, with no specific computational skill requirements. ShinyBioHEAT not only supports transparent and interactive analysis of lab evolution data in E. coli and B. subtilis, but it also creates dynamic visualizations of mutational impact on protein structures, which add orthogonal checks on predicted drivers. AVAILABILITY AND IMPLEMENTATION: Code for ShinyBioHEAT is available at https://github.com/LichtargeLab/ShinyBioHEAT. The Shiny application is additionally hosted at http://bioheat.lichtargelab.org/.


Subject(s)
Escherichia coli , Mobile Applications , Escherichia coli/genetics , Software , Mutation , Data Interpretation, Statistical , Mutation Rate
14.
Nat Commun ; 14(1): 2765, 2023 05 13.
Article in English | MEDLINE | ID: mdl-37179358

ABSTRACT

The incidence of Alzheimer's Disease in females is almost double that of males. To search for sex-specific gene associations, we build a machine learning approach focused on functionally impactful coding variants. This method can detect differences between sequenced cases and controls in small cohorts. In the Alzheimer's Disease Sequencing Project with mixed sexes, this approach identified genes enriched for immune response pathways. After sex-separation, genes become specifically enriched for stress-response pathways in males and cell-cycle pathways in females. These genes improve disease risk prediction in silico and modulate Drosophila neurodegeneration in vivo. Thus, a general approach for machine learning on functionally impactful variants can uncover sex-specific candidates towards diagnostic biomarkers and therapeutic targets.


Subject(s)
Alzheimer Disease , Sex Factors , Female , Humans , Male , Alzheimer Disease/genetics , Alzheimer Disease/metabolism
15.
Br J Cancer ; 128(11): 2013-2024, 2023 06.
Article in English | MEDLINE | ID: mdl-37012319

ABSTRACT

BACKGROUND: Cisplatin (CDDP) is a mainstay treatment for advanced head and neck squamous cell carcinomas (HNSCC) despite a high frequency of innate and acquired resistance. We hypothesised that tumours acquire CDDP resistance through an enhanced reductive state dependent on metabolic rewiring. METHODS: To validate this model and understand how an adaptive metabolic programme might be imprinted, we performed an integrated analysis of CDDP-resistant HNSCC clones from multiple genomic backgrounds by whole-exome sequencing, RNA-seq, mass spectrometry, steady state and flux metabolomics. RESULTS: Inactivating KEAP1 mutations or reductions in KEAP1 RNA correlated with Nrf2 activation in CDDP-resistant cells, which functionally contributed to resistance. Proteomics identified elevation of downstream Nrf2 targets and the enrichment of enzymes involved in generation of biomass and reducing equivalents, metabolism of glucose, glutathione, NAD(P), and oxoacids. This was accompanied by biochemical and metabolic evidence of an enhanced reductive state dependent on coordinated glucose and glutamine catabolism, associated with reduced energy production and proliferation, despite normal mitochondrial structure and function. CONCLUSIONS: Our analysis identified coordinated metabolic changes associated with CDDP resistance that may provide new therapeutic avenues through targeting of these convergent pathways.


Subject(s)
Antineoplastic Agents , Head and Neck Neoplasms , Humans , Cisplatin/metabolism , Squamous Cell Carcinoma of Head and Neck , Kelch-Like ECH-Associated Protein 1/genetics , NF-E2-Related Factor 2/genetics , Drug Resistance, Neoplasm/genetics , Cell Line, Tumor , Glucose , Antineoplastic Agents/pharmacology
16.
Nat Commun ; 13(1): 3189, 2022 06 09.
Article in English | MEDLINE | ID: mdl-35680894

ABSTRACT

Since antibiotic development lags, we search for potential drug targets through directed evolution experiments. A challenge is that many resistance genes hide in a noisy mutational background as mutator clones emerge in the adaptive population. Here, to overcome this noise, we quantify the impact of mutations through evolutionary action (EA). After sequencing ciprofloxacin- or colistin-resistant strains grown under different mutational regimes, we find that an elevated sum of the evolutionary action of mutations in a gene identifies known resistance drivers. This EA integration approach also suggests new antibiotic resistance genes, which are then shown to provide a fitness advantage in competition experiments. Moreover, EA integration analysis of clinical and environmental isolates of antibiotic-resistant E. coli identifies gene drivers of resistance where a standard approach fails. Together these results inform the genetic basis of de novo colistin resistance and support the robust discovery of phenotype-driving genes via the evolutionary action of genetic perturbations in fitness landscapes.


Subject(s)
Anti-Bacterial Agents , Drug Resistance, Bacterial , Escherichia coli Proteins , Escherichia coli , Anti-Bacterial Agents/pharmacology , Ciprofloxacin/pharmacology , Colistin/pharmacology , Drug Resistance, Bacterial/genetics , Escherichia coli/drug effects , Escherichia coli/genetics , Escherichia coli Proteins/genetics , Microbial Sensitivity Tests , Mutation
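The core idea in this abstract, ranking genes by the summed evolutionary action (EA) of their mutations, can be sketched in a few lines. This is only the bare summation step under stated assumptions: the EA values below are invented, are assumed to lie on a 0-100 scale, and the gene "yfgX" is a hypothetical placeholder. The abstract does not describe any normalization (for gene length or background mutation rate, for example), so none is applied here and the published method may well do more.

```python
from collections import defaultdict


def rank_genes_by_ea_sum(mutations):
    """Sum EA scores per gene and return genes sorted by descending total.

    mutations: iterable of (gene_name, ea_score) pairs, one per observed
    coding mutation across the sequenced resistant strains.
    """
    totals = defaultdict(float)
    for gene, ea_score in mutations:
        totals[gene] += ea_score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Invented EA scores; gyrA and parC stand in for known ciprofloxacin
# resistance genes, yfgX for a background gene hit by low-impact mutations.
observed = [
    ("gyrA", 85.0),
    ("gyrA", 72.5),
    ("yfgX", 12.0),
    ("parC", 64.0),
    ("yfgX", 8.5),
]

for gene, total in rank_genes_by_ea_sum(observed):
    print(f"{gene}\t{total:.1f}")
```

The point of summing impact rather than counting mutations is visible even in this toy case: yfgX carries as many mutations as gyrA but ranks last because each of them is predicted to have little functional effect.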
17.
Hum Genet ; 141(10): 1549-1577, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35488922

ABSTRACT

Estimating the effects of variants found in disease driver genes opens the door to personalized therapeutic opportunities. Clinical associations and laboratory experiments can only characterize a tiny fraction of all the available variants, leaving the majority as variants of unknown significance (VUS). In silico methods bridge this gap by providing instant estimates on a large scale, most often based on the numerous genetic differences between species. Despite concerns that these methods may lack reliability in individual subjects, their numerous practical applications over cohorts suggest they are already helpful and have a role to play in genome interpretation when used at the proper scale and context. In this review, we aim to gain insights into the training and validation of these variant effect predicting methods and illustrate representative types of experimental and clinical applications. Objective performance assessments using various datasets that are not yet published indicate the strengths and limitations of each method. These show that cautious use of in silico variant impact predictors is essential for addressing genome interpretation challenges.


Subject(s)
Genetic Testing , Genome, Human , Computational Biology , Genetic Variation , Humans , Reproducibility of Results
18.
Genome Res ; 32(5): 916-929, 2022 05.
Article in English | MEDLINE | ID: mdl-35301263

ABSTRACT

Genetic variants drive the evolution of traits and diseases. We previously modeled these variants as small displacements in fitness landscapes and estimated their functional impact by differentiating the evolutionary relationship between genotype and phenotype. Conversely, here we integrate these derivatives to identify genes steering specific traits. Over cancer cohorts, integration identified 460 likely tumor-driving genes. Many have literature and experimental support but had eluded prior genomic searches for positive selection in tumors. Beyond providing cancer insights, these results introduce a general calculus of evolution to quantify the genotype-phenotype relationship and discover genes associated with complex traits and diseases.


Subject(s)
Calculi , Neoplasms , Biological Evolution , Genetic Fitness , Genotype , Humans , Models, Genetic , Neoplasms/genetics , Phenotype , Selection, Genetic
19.
Front Mol Biosci ; 8: 791792, 2021.
Article in English | MEDLINE | ID: mdl-34966786

ABSTRACT

All tumors have DNA mutations, and a predictive understanding of those mutations could inform clinical treatments. However, 40% of the mutations are variants of unknown significance (VUS), with the challenge being to objectively predict whether a VUS is pathogenic and supports the tumor or whether it is benign. To objectively decode VUS, we mapped cancer sequence data and evolutionary trace (ET) scores onto crystallography and cryo-electron microscopy structures with variant impacts quantitated by evolutionary action (EA) measures. As tumors depend on helicases and nucleases to deal with transcription/replication stress, we targeted helicase-nuclease-RPA complexes: (1) XPB-XPD (within TFIIH), XPF-ERCC1, XPG, and RPA for transcription and nucleotide excision repair pathways and (2) BLM, EXO5, and RPA plus DNA2 for stalled replication fork restart. As validation, EA scoring predicts severe effects for most disease mutations, but disease mutants with low ET scores not only are likely destabilizing but also disrupt sophisticated allosteric mechanisms. For sites of disease mutations and VUS predicted to be severe, we found strong co-localization to ordered regions. Rare discrepancies highlighted the different survival requirements between disease and tumor mutations, as well as the value of examining proteins within complexes. In a genome-wide analysis of 33 cancer types, we found a correlation between the number of mutations in each tumor and the pathways or functional processes in which the mutations occur, revealing different mutagenic routes to tumorigenesis. We also found upregulation of ancient genes including BLM, which supports a non-random and concerted cancer process: reversion to a unicellular, proliferation-uncontrolled status by breaking multicellular constraints on cell division.
Together, these genes and global analyses challenge the binary "driver" and "passenger" mutation paradigm, support a gradient impact as revealed by EA scoring from moderate to severe at a single gene level, and indicate reduced regulation as well as activity. The objective quantitative assessment of VUS scoring and gene overexpression in the context of functional interactions and pathways provides insights for biology, oncology, and precision medicine.

20.
Methods Enzymol ; 661: 407-431, 2021.
Article in English | MEDLINE | ID: mdl-34776222

ABSTRACT

We present a Chemistry and Structure Screen Integrated Efficiently (CASSIE) approach (named for Greek prophet Cassandra) to design inhibitors for cancer biology and pathogenesis. CASSIE provides an effective path to target master keys to control the repair-replication interface for cancer cells and SARS CoV-2 pathogenesis as exemplified here by specific targeting of Poly(ADP-ribose) glycohydrolase (PARG) and ADP-ribose glycohydrolase ARH3 macrodomains plus SARS CoV-2 nonstructural protein 3 (Nsp3) Macrodomain 1 (Mac1) and Nsp15 nuclease. As opposed to the classical massive effort employing libraries with large numbers of compounds against single proteins, we make inhibitor design for multiple targets efficient. Our compact, chemically diverse, 5000 compound Goldilocks (GL) library has an intermediate number of compounds sized between fragments and drugs with predicted favorable ADME (absorption, distribution, metabolism, and excretion) and toxicological profiles. Amalgamating our core GL library with an approved drug (AD) library, we employ a combined GLAD library virtual screen, enabling an effective and efficient design cycle of ranked computer docking, top hit biophysical and cell validations, and defined bound structures using human proteins or their avatars. As new drug design is increasingly pathway directed as well as molecular and mechanism based, our CASSIE approach facilitates testing multiple related targets by efficiently turning a set of interacting drug discovery problems into a tractable medicinal chemistry engineering problem of optimizing affinity and ADME properties based upon early co-crystal structures. Optimization efforts are made efficient by a computationally-focused iterative chemistry and structure screen. Thus, we herein describe and apply CASSIE to define prototypic, specific inhibitors for PARG vs distinct inhibitors for the related macrodomains of ARH3 and SARS CoV-2 Nsp3 plus the SARS CoV-2 Nsp15 RNA nuclease.


Subject(s)
COVID-19 , Nucleic Acids , Severe Acute Respiratory Syndrome , DNA Repair , Humans , Molecular Docking Simulation , SARS-CoV-2