ABSTRACT
Deep mutational scanning enables high-throughput functional assessment of genetic variants. While phenotypic measurements from screening assays generally align with clinical outcomes, experimental noise may affect the accuracy of individual variant estimates. We developed the FUSE (functional substitution estimation) pipeline, which leverages measurements collectively within screening assays to improve the estimation of variant impacts. Drawing data from 115 published functional assays, FUSE assesses the mean functional effect per amino acid position and makes estimates for individual allelic variants. It enhances the correlation of variant functional effects from different assay platforms and increases the classification accuracy of missense variants in ClinVar across 29 genes (area under the receiver operating characteristic [ROC] curve [AUC] from 0.83 to 0.90). In UK Biobank patients with rare missense variants in BRCA1, LDLR, or TP53, FUSE improves the classification accuracy of associated phenotypes. FUSE can also impute variant effects for substitutions not experimentally screened. This approach improves accuracy and broadens the utility of data from functional screening.
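The idea of using measurements collectively, by computing a mean effect per position and using it to steady noisy individual scores, can be illustrated with a minimal Python sketch. This is not the FUSE algorithm itself, which the abstract does not specify. The function names, the `weight` parameter, the simple linear shrinkage rule, and the toy scores are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

def position_means(scores):
    """Average the measured scores of all variants at each position.

    scores: dict mapping (position, alt_amino_acid) -> measured score.
    """
    by_pos = defaultdict(list)
    for (pos, _alt), score in scores.items():
        by_pos[pos].append(score)
    return {pos: mean(vals) for pos, vals in by_pos.items()}

def shrink_to_position_mean(scores, weight=0.5):
    """Blend each raw score with its position mean (hypothetical rule).

    weight = 1.0 keeps the raw score; weight = 0.0 uses only the position mean.
    """
    pos_mean = position_means(scores)
    return {
        key: weight * score + (1 - weight) * pos_mean[key[0]]
        for key, score in scores.items()
    }

# Toy data (invented): two variants at position 10, two at position 11.
toy = {(10, "A"): -1.0, (10, "G"): -3.0, (11, "A"): 0.0, (11, "P"): 0.5}
print(position_means(toy))
print(shrink_to_position_mean(toy))
```

With noisy assays, a blend of this kind trades a little bias for lower variance in each variant estimate. How much to trust the position mean is a modeling choice that this sketch leaves as a free parameter.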
Subject(s)
BRCA1 Protein, Humans, BRCA1 Protein/genetics, Tumor Suppressor Protein p53/genetics, LDL Receptors/genetics, Missense Mutation, Phenotype, Genetic Variation/genetics
ABSTRACT
Prediction of protein fitness from computational modeling is an area of active research in rational protein design. Here, we investigated whether protein fluctuations computed from molecular dynamics simulations can be used to predict the expression levels of SARS-CoV-2 receptor binding domain (RBD) mutants determined in the deep mutational scanning experiment of Starr et al. [Science (New York, N.Y.) 2022, 377, 420]. Specifically, we performed more than 0.7 milliseconds of molecular dynamics (MD) simulations of 557 mutant RBDs in triplicate to achieve statistical significance under various simulation conditions. Our results show modest but significant anticorrelation in the range [-0.4, -0.3] between expression and RBD protein flexibility. A simple linear regression machine learning model achieved correlation coefficients in the range [0.7, 0.8], thus outperforming MD-based models, but required about 25 mutations at each residue position for training.
ABSTRACT
RNA polymerase II (Pol II) has a highly conserved domain, the trigger loop (TL), that controls transcription fidelity and speed. We previously probed pairwise genetic interactions between residues within and surrounding the TL to understand functional interactions between residues and how individual mutants might alter TL function. We identified widespread incompatibility between TLs of different species when placed in the Saccharomyces cerevisiae Pol II context, indicating species-specific interactions between otherwise highly conserved TLs and their surroundings. These interactions represent epistasis between TL residues and the rest of Pol II. We sought to understand why certain TL sequences are incompatible with S. cerevisiae Pol II and to dissect the nature of genetic interactions within multiply substituted TLs as a window on higher-order epistasis in this system. We identified both positive and negative higher-order residue interactions within example TL haplotypes. Intricate higher-order epistasis formed by TL residues was sometimes only apparent from analysis of intermediate genotypes, emphasizing the complexity of epistatic interactions. Furthermore, we distinguished TL substitutions with distinct classes of epistatic patterns, suggesting specific TL residues that potentially influence TL evolution. Our examples of complex residue interactions suggest possible pathways for epistasis to facilitate Pol II evolution.
ABSTRACT
The interaction of sclerostin (Scl) with the low-density lipoprotein receptor-related protein 4 (LRP4) leads to a marked reduction in bone formation by inhibiting the Wnt/β-catenin pathway. To characterize the Scl-LRP4 binding interface, we sorted a combinatorial library of Scl variants and isolated variants with reduced affinity to LRP4. We identified Scl single-mutation variants enriched during the sorting process and verified their reduction in affinity toward LRP4, a reduction that was not a result of changes in the variants' secondary structure or stability. We found that Scl positions K75 (loop 1) and V136 (loop 3) are critical hotspots for binding to LRP4. Our findings establish the foundation for targeting these hotspots for developing novel therapeutic strategies to promote bone formation.
ABSTRACT
The effect of replacing the amino acid at a given site in a protein is difficult to predict. Yet, evolutionary comparisons have revealed highly regular patterns of interchangeability between pairs of amino acids, and such patterns have proved enormously useful in a range of applications in bioinformatics, evolutionary inference, and protein design. Here we reconcile these apparently contradictory observations using fitness data from over 350,000 experimental amino acid replacements. Almost one-quarter of the 20 × 19 = 380 types of replacements have broad distributions of fitness effects (DFEs) that closely resemble the background DFE for random changes, indicating an overwhelming influence of protein context in determining mutational effects. However, we also observe that the 380 pair-specific DFEs closely follow a maximum entropy distribution, specifically a truncated exponential distribution. The shape of this distribution is determined entirely by its mean, which is equivalent to the chance that a replacement of the given type is fitter than a random replacement. In this type of distribution, modest deviations in the mean correspond to much larger changes in the probability of falling in the far right tail, so that modest differences in mean exchangeability may result in much larger differences in the chance of a highly fit mutation. Indeed, we show that under the assumption that purifying selection filters out the vast majority of mutations, the maximum entropy distributions of fitness effects inferred from deep mutational scanning experiments predict the characteristic patterns of amino acid change observed in molecular evolution. These maximum entropy distributions of mutational effects not only provide a tuneable model for molecular evolution, but also have implications for mutational effect prediction and protein engineering.
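The claim that the shape of the distribution is fixed by its mean, and that small shifts in the mean move the right tail disproportionately, can be made concrete with a short worked derivation. The parameterization below is an assumption made for illustration, not taken from the abstract: fitness is expressed as a quantile x on [0, 1] relative to the background distribution, and λ is a hypothetical rate parameter.

```latex
% Truncated exponential on [0,1] with rate parameter \lambda \neq 0
f(x;\lambda) = \frac{\lambda\, e^{\lambda x}}{e^{\lambda} - 1},
\qquad 0 \le x \le 1 .

% Mean, a monotone function of \lambda (so the mean fixes the shape)
\mu(\lambda) = \int_0^1 x\, f(x;\lambda)\, dx
             = \frac{1}{1 - e^{-\lambda}} - \frac{1}{\lambda},
\qquad \lim_{\lambda \to 0} \mu(\lambda) = \tfrac{1}{2}.

% Right-tail probability above a threshold t
P(X > t) = \frac{e^{\lambda} - e^{\lambda t}}{e^{\lambda} - 1}.
```

Under these assumptions, moving from the uniform case (λ → 0) to λ = 1 raises the mean from 0.5 to about 0.58, while the probability of exceeding t = 0.9 rises from 0.10 to about 0.15. The tail responds roughly three times as strongly, in relative terms, as the mean.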
ABSTRACT
Deep mutational scanning experiments aid in the surveillance and forecasting of viral evolution by providing prospective measurements of mutational effects on viral traits, but epistatic shifts in the impacts of mutations can hinder viral forecasting when measurements were made in outdated strain backgrounds. Here, we report measurements of the impact of all single amino acid mutations on ACE2-binding affinity and protein folding and expression in the SARS-CoV-2 Omicron BA.2.86 spike receptor-binding domain. As with other SARS-CoV-2 variants, we find a plastic and evolvable basis for receptor binding, with many mutations at the ACE2 interface maintaining or even improving ACE2-binding affinity. Despite its large genetic divergence, mutational effects in BA.2.86 have not diverged greatly from those measured in its Omicron BA.2 ancestor. However, we do identify strong positive epistasis among subsequent mutations that have accrued in BA.2.86 descendants. Specifically, the Q493E mutation that decreased ACE2-binding affinity in all previous SARS-CoV-2 backgrounds is reversed in sign to enhance human ACE2-binding affinity when coupled with L455S and F456L in the currently emerging KP.3 variant. Our results point to a modest degree of epistatic drift in mutational effects during recent SARS-CoV-2 evolution but highlight how these small epistatic shifts can have important consequences for the emergence of new SARS-CoV-2 variants.
ABSTRACT
The Cytochrome P450s (CYPs) enzyme family metabolizes ~80% of small molecule drugs. Variants in CYPs can substantially alter drug metabolism, leading to improper dosing and severe adverse drug reactions. Due to low sequence conservation, predicting variant effects across CYPs is challenging. Even closely related CYPs like CYP2C9 and CYP2C19, which share 92% amino acid sequence identity, display distinct phenotypic properties. Using Variant Abundance by Massively Parallel sequencing (VAMP-seq), we measured the steady-state protein abundance of 7,660 single amino acid variants in CYP2C19 expressed in cultured human cells. Our findings confirmed critical positions and structural features essential for CYP function and revealed how variants at conserved positions influence abundance. We jointly analyzed 4,670 variants whose abundance was measured in both CYP2C19 and CYP2C9, finding that the homologs have different variant abundances in substrate recognition sites within the hydrophobic core. We also measured the abundance of all single and some multiple WT amino acid exchanges between CYP2C19 and CYP2C9. While most exchanges had no effect, substitutions in substrate recognition site 4 (SRS4) reduced abundance in CYP2C19. Double and triple mutants showed distinct interactions, highlighting a region that points to differing thermodynamic properties between the two homologs. These positions are known contributors to substrate specificity, suggesting an evolutionary tradeoff between stability and enzymatic function. Finally, we analyzed 368 previously unannotated human variants, finding that 43% had decreased abundance. By comparing variant effects between these homologs, we uncovered regions underlying their functional differences, advancing our understanding of this versatile family of enzymes.
ABSTRACT
Interpretation of disease-causing genetic variants remains a challenge in human genetics. Current costs and complexity of deep mutational scanning methods are obstacles for achieving genome-wide resolution of variants in disease-related genes. Our framework, saturation mutagenesis-reinforced functional assays (SMuRF), offers simple and cost-effective saturation mutagenesis paired with streamlined functional assays to enhance the interpretation of unresolved variants. Applying SMuRF to neuromuscular disease genes FKRP and LARGE1, we generated functional scores for all possible coding single-nucleotide variants, which aid in resolving clinically reported variants of uncertain significance. SMuRF also demonstrates utility in predicting disease severity, resolving critical structural regions, and providing training datasets for the development of computational predictors. Overall, our approach enables variant-to-function insights for disease genes in a cost-effective manner that can be broadly implemented by standard research laboratories.
ABSTRACT
The COVID-19 pandemic has driven substantial evolution of the SARS-CoV-2 virus, yielding subvariants that exhibit enhanced infectiousness in humans. However, this adaptive advantage may not universally extend to zoonotic transmission. In this work, we hypothesize that viral adaptations favoring animal hosts do not necessarily correlate with increased human infectivity. In addition, we consider the potential for gain-of-function mutations that could facilitate the virus's rapid evolution in humans following adaptation in animal hosts. Specifically, we identify the SARS-CoV-2 receptor-binding domain (RBD) mutations that enhance human-animal cross-transmission. To this end, we construct a multitask deep learning model, MT-TopLap, trained on multiple deep mutational scanning datasets, to accurately predict the binding free energy changes upon mutation for the RBD to ACE2 of various species, including humans, cats, bats, deer, and hamsters. By analyzing these changes, we identified key RBD mutations such as Q498H in SARS-CoV-2 and R493K in the BA.2 variant that are likely to increase the potential for human-animal cross-transmission.
ABSTRACT
MET is a receptor tyrosine kinase (RTK) responsible for initiating signaling pathways involved in development and wound repair. MET activation relies on ligand binding to the extracellular receptor, which prompts dimerization, intracellular phosphorylation, and recruitment of associated signaling proteins. Mutations, which are predominantly observed clinically in the intracellular juxtamembrane and kinase domains, can disrupt typical MET regulatory mechanisms. Understanding how juxtamembrane variants, such as exon 14 skipping (METΔEx14), and rare kinase domain mutations can increase signaling, often leading to cancer, remains a challenge. Here, we perform a parallel deep mutational scan (DMS) of the MET intracellular kinase domain in two fusion protein backgrounds: wild-type and METΔEx14. Our comparative approach has revealed a critical hydrophobic interaction between a juxtamembrane segment and the kinase αC-helix, pointing to potential differences in regulatory mechanisms between MET and other RTKs. Additionally, we have uncovered a β5 motif that acts as a structural pivot for the kinase domain in MET and other TAM family kinases. We also describe a number of previously unknown activating mutations, aiding the effort to annotate driver, passenger, and drug resistance mutations in the MET kinase domain.
Subject(s)
Proto-Oncogene Proteins c-met, Proto-Oncogene Proteins c-met/genetics, Proto-Oncogene Proteins c-met/metabolism, Humans, Protein Domains/genetics, Mutation, Amino Acid Motifs, DNA Mutational Analysis
ABSTRACT
Multi-domain enzymes can be regulated by both inter-domain interactions and structural features intrinsic to the catalytic domain. The tyrosine phosphatase SHP2 is a quintessential example of a multi-domain protein that is regulated by inter-domain interactions. This enzyme has a protein tyrosine phosphatase (PTP) domain and two phosphotyrosine-recognition domains (N-SH2 and C-SH2) that regulate phosphatase activity through autoinhibitory interactions. SHP2 is canonically activated by phosphoprotein binding to the SH2 domains, which causes large inter-domain rearrangements, but autoinhibition can also be disrupted by disease-associated mutations. Many details of the SHP2 activation mechanism are still unclear, the physiologically-relevant active conformations remain elusive, and hundreds of human variants of SHP2 have not been functionally characterized. Here, we perform deep mutational scanning on both full-length SHP2 and its isolated PTP domain to examine mutational effects on inter-domain regulation and catalytic activity. Our experiments provide a comprehensive map of SHP2 mutational sensitivity, both in the presence and absence of inter-domain regulation. Coupled with molecular dynamics simulations, our investigation reveals novel structural features that govern the stability of the autoinhibited and active states of SHP2. Our analysis also identifies key residues beyond the SHP2 active site that control PTP domain dynamics and intrinsic catalytic activity. This work expands our understanding of SHP2 regulation and provides new insights into SHP2 pathogenicity.
ABSTRACT
B cells surveil the body for foreign matter using their surface-expressed B cell antigen receptor (BCR), a tetrameric complex comprising a membrane-tethered antibody (mIg) that binds antigens and a signaling dimer (CD79AB) that conveys this interaction to the B cell. Recent cryogenic electron microscopy (cryo-EM) structures of IgM and IgG isotype BCRs provide the first complete views of their architecture, revealing that the largest interaction surfaces between the mIg and CD79AB are in their transmembrane domains (TMDs). These structures support decades of biochemical work interrogating the requirements for assembly of a functional BCR and provide the basis for explaining the effects of mutations. Here we report a focused saturating mutagenesis to comprehensively characterize the nature of the interactions in the mIg TMD that are required for BCR surface expression. We examined the effects of 600 single-amino-acid changes simultaneously in a pooled competition assay and quantified their effects by next-generation sequencing. Our deep mutational scanning results reflect a feature-rich TMD sequence, with some positions completely intolerant to mutation and others requiring specific biochemical properties such as charge, polarity or hydrophobicity, emphasizing the high value of saturating mutagenesis over, for example, alanine scanning. The data agree closely with published mutagenesis and the cryo-EM structures, while also highlighting several positions and surfaces that have not previously been characterized or have effects that are difficult to rationalize purely based on structure. This unbiased and complete mutagenesis dataset serves as a reference and framework for informed hypothesis testing, design of therapeutics to regulate BCR surface expression, and annotation of patient mutations.
Subject(s)
B-Cell Antigen Receptors, B-Cell Antigen Receptors/genetics, B-Cell Antigen Receptors/immunology, B-Cell Antigen Receptors/metabolism, Humans, Mutation, Animals, B-Lymphocytes/immunology, B-Lymphocytes/metabolism, CD79 Antigens/genetics, CD79 Antigens/metabolism, CD79 Antigens/immunology, Cell Membrane/metabolism, Mice
ABSTRACT
The large-scale experimental measures of variant functional assays submitted to MaveDB have the potential to provide key information for resolving variants of uncertain significance, but the reporting of results relative to assayed sequence hinders their downstream utility. The Atlas of Variant Effects Alliance mapped multiplexed assays of variant effect data to human reference sequences, creating a robust set of machine-readable homology mappings. This method processed approximately 2.5 million protein and genomic variants in MaveDB, successfully mapping 98.61% of examined variants and disseminating data to resources such as the UCSC Genome Browser and Ensembl Variant Effect Predictor.
ABSTRACT
Baloxavir acid (BXA) is a pan-influenza antiviral that targets the cap-dependent endonuclease of the polymerase acidic (PA) protein required for viral mRNA synthesis. To gain a comprehensive understanding of the molecular changes associated with reduced susceptibility to BXA and their fitness profile, we performed deep mutational scanning of the PA endonuclease domain of an A(H1N1)pdm09 virus. The recombinant virus libraries were serially passaged in vitro under increasing concentrations of BXA followed by next-generation sequencing to monitor PA amino acid substitutions with increased detection frequencies. Enriched PA amino acid changes were each introduced into a recombinant A(H1N1)pdm09 virus to validate their effect on BXA susceptibility and viral replication fitness in vitro. The I38T/M substitutions known to confer reduced susceptibility to BXA were invariably detected from recombinant virus libraries within 5 serial passages. In addition, we identified a novel L106R substitution that emerged in the third passage and conferred greater than 10-fold reduced susceptibility to BXA. PA-L106 is highly conserved among seasonal influenza A and B viruses. Compared to the wild-type virus, the L106R substitution resulted in reduced polymerase activity and a minor reduction of the peak viral load, suggesting the amino acid change may result in moderate fitness loss. Our results support the use of deep mutational scanning as a practical tool to elucidate genotype-phenotype relationships, including mapping amino acid substitutions with reduced susceptibility to antivirals.
Subject(s)
Amino Acid Substitution, Antiviral Agents, Dibenzothiepins, Viral Drug Resistance, Influenza A Virus H1N1 Subtype, Morpholines, Pyridones, Triazines, Viral Proteins, Virus Replication, Dibenzothiepins/pharmacology, Viral Drug Resistance/genetics, Antiviral Agents/pharmacology, Influenza A Virus H1N1 Subtype/drug effects, Influenza A Virus H1N1 Subtype/genetics, Triazines/pharmacology, Virus Replication/drug effects, Pyridones/pharmacology, Humans, Morpholines/pharmacology, Viral Proteins/genetics, Animals, Thiepins/pharmacology, RNA-Dependent RNA Polymerase/genetics, High-Throughput Nucleotide Sequencing, Dogs, Madin Darby Canine Kidney Cells, Human Influenza/virology, Human Influenza/drug therapy, Oxazines/pharmacology
ABSTRACT
Human influenza virus evolves to escape neutralization by polyclonal antibodies. However, we have a limited understanding of how the antigenic effects of viral mutations vary across the human population and how this heterogeneity affects virus evolution. Here, we use deep mutational scanning to map how mutations to the hemagglutinin (HA) proteins of two H3N2 strains, A/Hong Kong/45/2019 and A/Perth/16/2009, affect neutralization by serum from individuals of a variety of ages. The effects of HA mutations on serum neutralization differ across age groups in ways that can be partially rationalized in terms of exposure histories. Mutations that were fixed in influenza variants after 2020 cause greater escape from sera from younger individuals compared with adults. Overall, these results demonstrate that influenza faces distinct antigenic selection regimes from different age groups and suggest approaches to understand how this heterogeneous selection shapes viral evolution.
Subject(s)
Antiviral Antibodies, Influenza Virus Hemagglutinin Glycoproteins, Influenza A Virus H3N2 Subtype, Human Influenza, Mutation, Humans, Influenza Virus Hemagglutinin Glycoproteins/genetics, Influenza Virus Hemagglutinin Glycoproteins/immunology, Influenza A Virus H3N2 Subtype/genetics, Influenza A Virus H3N2 Subtype/immunology, Adult, Antiviral Antibodies/immunology, Antiviral Antibodies/blood, Human Influenza/virology, Human Influenza/immunology, Age Factors, Middle Aged, Young Adult, Neutralizing Antibodies/immunology, Neutralizing Antibodies/blood, Viral Antigens/genetics, Viral Antigens/immunology, Adolescent, Molecular Evolution, Aged, Child
ABSTRACT
Lassa virus is estimated to cause thousands of human deaths per year, primarily due to spillovers from its natural host, Mastomys rodents. Efforts to create vaccines and antibody therapeutics must account for the evolutionary variability of the Lassa virus's glycoprotein complex (GPC), which mediates viral entry into cells and is the target of neutralizing antibodies. To map the evolutionary space accessible to GPC, we used pseudovirus deep mutational scanning to measure how nearly all GPC amino-acid mutations affected cell entry and antibody neutralization. Our experiments defined functional constraints throughout GPC. We quantified how GPC mutations affected neutralization with a panel of monoclonal antibodies. All antibodies tested were escaped by mutations that existed among natural Lassa virus lineages. Overall, our work describes a biosafety-level-2 method to elucidate the mutational space accessible to GPC and shows how prospective characterization of antigenic variation could aid the design of therapeutics and vaccines.
Subject(s)
Monoclonal Antibodies, Neutralizing Antibodies, Antiviral Antibodies, Lassa Fever, Lassa Virus, Mutation, Lassa Virus/immunology, Lassa Virus/genetics, Humans, Antiviral Antibodies/immunology, Neutralizing Antibodies/immunology, Animals, Monoclonal Antibodies/immunology, Lassa Fever/immunology, Lassa Fever/virology, Virus Internalization, Viral Envelope Proteins/immunology, Viral Envelope Proteins/genetics, Glycoproteins/immunology, Glycoproteins/genetics, Immune Evasion/immunology, Immune Evasion/genetics, HEK293 Cells
ABSTRACT
Adeno-associated viruses 2 (AAV2) are minute viruses renowned for their capacity to infect human cells and akin organisms. They have recently emerged as prominent candidates in the field of gene therapy, primarily attributed to their inherent non-pathogenic nature in humans and the safety associated with their manipulation. The efficacy of AAV2 as gene therapy vectors hinges on their ability to infiltrate host cells, a phenomenon reliant on their competence to construct a capsid capable of breaching the nucleus of the target cell. To enhance their infection potential, researchers have extensively scrutinized various combinatorial libraries by introducing mutations into the capsid, aiming to boost their effectiveness. The emergence of high-throughput experimental techniques, like deep mutational scanning (DMS), has made it feasible to experimentally assess the fitness of these libraries for their intended purpose. Notably, machine learning is starting to demonstrate its potential in addressing predictions within the mutational landscape from sequence data. In this context, we introduce a biophysically-inspired model designed to predict the viability of genetic variants in DMS experiments. This model is tailored to a specific segment of the CAP region within AAV2's capsid protein. To evaluate its effectiveness, we conduct model training with diverse datasets, each tailored to explore different aspects of the mutational landscape influenced by the selection process. Our assessment of the biophysical model centers on two primary objectives: (i) providing quantitative forecasts for the log-selectivity of variants and (ii) deploying it as a binary classifier to categorize sequences into viable and non-viable classes.
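The two evaluation objectives, a quantitative log-selectivity and a viable/non-viable call, can be sketched in a few lines of Python. The abstract does not define log-selectivity, so the definition used here (log ratio of a variant's post-selection frequency to its pre-selection frequency) is an assumption. The function names, the pseudocount, and the threshold are likewise hypothetical.

```python
import math

def log_selectivity(pre_count, post_count, pre_total, post_total, pseudocount=0.5):
    """Log ratio of post-selection to pre-selection variant frequency.

    Assumed definition; the pseudocount keeps zero counts from producing log(0).
    """
    pre_freq = (pre_count + pseudocount) / pre_total
    post_freq = (post_count + pseudocount) / post_total
    return math.log(post_freq / pre_freq)

def is_viable(log_sel, threshold=0.0):
    """Binary call: viable if the variant is enriched beyond the threshold."""
    return log_sel > threshold

# Invented counts: a variant seen 100 times before and 400 times after selection,
# in libraries of one million reads each.
score = log_selectivity(100, 400, 1_000_000, 1_000_000)
print(score, is_viable(score))
```

A regression model would be trained to predict the continuous score, and a classifier evaluated on the thresholded label. Where to place the threshold depends on the experiment and is not given in the abstract.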
Subject(s)
Mutation, Humans, Capsid Proteins/genetics, Dependovirus/genetics, Parvovirinae/genetics
ABSTRACT
Interpreting the wealth of rare genetic variants discovered in population-scale sequencing efforts and deciphering their associations with human health and disease present a critical challenge due to the lack of sufficient clinical case reports. One promising avenue to overcome this problem is deep mutational scanning (DMS), a method of introducing and evaluating large-scale genetic variants in model cell lines. DMS allows unbiased investigation of variants, including those that are not found in clinical reports, thus improving rare disease diagnostics. Currently, the main obstacle limiting the full potential of DMS is the availability of functional assays that are specific to disease mechanisms. Thus, we explore high-throughput functional methodologies suitable to examine broad disease mechanisms. We specifically focus on methods that do not require robotics or automation but instead use well-designed molecular tools to transform biological mechanisms into easily detectable signals, such as cell survival rate, fluorescence or drug resistance. Here, we aim to bridge the gap between disease-relevant assays and their integration into the DMS framework.
Subject(s)
High-Throughput Screening Assays, Animals, Humans, Disease/genetics, Genetic Variation, High-Throughput Screening Assays/methods, Mutation/genetics
ABSTRACT
Nonsense and missense mutations in the transcription factor PAX6 cause a wide range of eye development defects, including aniridia, microphthalmia and coloboma. To understand how changes of PAX6:DNA binding cause these phenotypes, we combined saturation mutagenesis of the paired domain of PAX6 with a yeast one-hybrid (Y1H) assay in which expression of a PAX6-GAL4 fusion gene drives antibiotic resistance. We quantified binding of more than 2700 single amino-acid variants to two DNA sequence elements. Mutations in DNA-facing residues of the N-terminal subdomain and linker region were most detrimental, as were mutations to prolines and to negatively charged residues. Many variants caused sequence-specific molecular gain-of-function effects, including variants in position 71 that increased binding to the LE9 enhancer but decreased binding to a SELEX-derived binding site. In the absence of antibiotic selection, variants that retained DNA binding slowed yeast growth, likely because such variants perturbed the yeast transcriptome. Benchmarking against known patient variants and applying ACMG/AMP guidelines to variant classification, we obtained supporting-to-moderate evidence that 977 variants are likely pathogenic and 1306 are likely benign. Our analysis shows that most pathogenic mutations in the paired domain of PAX6 can be explained simply by the effects of these mutations on PAX6:DNA association, and establishes Y1H as a generalisable assay for the interpretation of variant effects in transcription factors.