ABSTRACT
Mutations in transporters can impact an individual's response to drugs and cause many diseases. Few variants in transporters have been evaluated for their functional impact. Here, we combine saturation mutagenesis and multi-phenotypic screening to dissect the impact of 11,213 missense, single-amino-acid deletion, and synonymous variants across the 554 residues of OCT1, a key liver xenobiotic transporter. By quantifying expression and substrate uptake in parallel, we find that most variants exert their primary effect on protein abundance, a phenotype not commonly measured alongside function. Using our mutagenesis results combined with structure prediction and molecular dynamics simulations, we develop accurate structure-function models of the entire transport cycle, providing biophysical characterization of all known and possible human OCT1 polymorphisms. This work provides a complete functional map of OCT1 variants along with a framework for integrating functional genomics, biophysical modeling, and human genetics to predict variant effects on disease and drug efficacy.
Subjects
Molecular Dynamics Simulation , Organic Cation Transporter 1 , Protein Conformation , Humans , Biological Transport , HEK293 Cells , Mutation , Missense Mutation , Octamer Transcription Factor-1 , Organic Cation Transporter 1/genetics , Organic Cation Transporter 1/metabolism , Pharmacogenetics , Phenotype , Structure-Activity Relationship
ABSTRACT
The µ-opioid receptor (µOR) represents an important target of therapeutic and abused drugs. So far, most understanding of µOR activity has focused on a subset of known signal transducers and regulatory molecules. Yet µOR signaling is coordinated by additional proteins in the interaction network of the activated receptor, which have largely remained invisible given the lack of technologies to interrogate these networks systematically. Here we describe a proteomics and computational approach to map the proximal proteome of the activated µOR and to extract subcellular location, trafficking and functional partners of G-protein-coupled receptor (GPCR) activity. We demonstrate that distinct opioid agonists exert differences in the µOR proximal proteome mediated by endocytosis and endosomal sorting. Moreover, we identify two new µOR network components, EYA4 and KCTD12, which are recruited on the basis of receptor-triggered G-protein activation and might form a previously unrecognized buffering system for G-protein activity broadly modulating cellular GPCR signaling.
Subjects
Proteome , Proteomics , Mu Opioid Receptors , Humans , Endocytosis , HEK293 Cells , Proteome/metabolism , Proteomics/methods , G-Protein-Coupled Receptors/metabolism , G-Protein-Coupled Receptors/agonists , Mu Opioid Receptors/metabolism , Mu Opioid Receptors/agonists , Signal Transduction
ABSTRACT
Domain recombination is a key principle in protein evolution and protein engineering, but inserting a donor domain into every position of a target protein is not easily experimentally accessible. Most contemporary domain insertion profiling approaches rely on DNA transposons, which are constrained by sequence bias. Here, we establish Saturated Programmable Insertion Engineering (SPINE), an unbiased, comprehensive, and targeted domain insertion library generation technique using oligo library synthesis and multi-step Golden Gate cloning. Through benchmarking to MuA transposon-mediated library generation on four ion channel genes, we demonstrate that SPINE-generated libraries are enriched for in-frame insertions, have drastically reduced sequence bias as well as near-complete and highly-redundant coverage. Unlike transposon-mediated domain insertion that was severely biased and sparse for some genes, SPINE generated high-quality libraries for all genes tested. Using the Inward Rectifier K+ channel Kir2.1, we validate the practical utility of SPINE by constructing and comparing domain insertion permissibility maps. SPINE is the first technology to enable saturated domain insertion profiling. SPINE could help explore the relationship between domain insertions and protein function, and how this relationship is shaped by evolutionary forces and can be engineered for biomedical applications.
Subjects
DNA Transposable Elements/genetics , Molecular Evolution , Insertional Mutagenesis/genetics , Inwardly Rectifying Potassium Channels/genetics , Gene Library , Humans , Oligonucleotides , Protein Domains/genetics , Protein Engineering , Genetic Recombination/genetics
ABSTRACT
Deep mutational scanning (DMS) facilitates data-driven models of protein structure and function. Here, we adapted Saturated Programmable Insertion Engineering (SPINE) as a programmable DMS technique. We validate SPINE with a reference single mutant dataset in the PSD95 PDZ3 domain and then characterize most pairwise double mutants to study epistasis. We observe widespread proximal negative epistasis, which we attribute to mutations affecting thermodynamic stability, and strong long-range positive epistasis, which is enriched in an evolutionarily conserved and function-defining network of "sector" and clade-specifying residues. Conditional neutrality of mutations in clade-specifying residues compensates for deleterious mutations in sector positions. This suggests that epistatic interactions between these position pairs facilitated the evolutionary expansion and specialization of PDZ domains. We propose that SPINE provides easy experimental access to reveal epistasis signatures in proteins that will improve our understanding of the structural basis for protein function and adaptation.
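As an illustration of how pairwise epistasis is commonly quantified from single- and double-mutant scores, the sketch below assumes an additive null model on log-scale fitness, so that epistasis is the double-mutant score minus the sum of the two single-mutant scores. The function names, the tolerance, and the example numbers are hypothetical and are not taken from the study.

```python
def pairwise_epistasis(f_a: float, f_b: float, f_ab: float) -> float:
    """Epistasis under an additive null model (assumption of this sketch).

    f_a, f_b: log-scale fitness of each single mutant, with wild type = 0.
    f_ab:     log-scale fitness of the double mutant, with wild type = 0.
    """
    return f_ab - (f_a + f_b)


def classify_epistasis(eps: float, tol: float = 0.1) -> str:
    """Label the sign of epistasis; `tol` is an arbitrary illustrative cutoff."""
    if eps > tol:
        return "positive"
    if eps < -tol:
        return "negative"
    return "none"


# Hypothetical scores: the double mutant is worse than the additive expectation.
eps_negative = pairwise_epistasis(f_a=-1.0, f_b=-0.5, f_ab=-2.5)
# Hypothetical scores: the second mutation compensates for the first.
eps_positive = pairwise_epistasis(f_a=-1.0, f_b=-0.5, f_ab=-0.5)
print(classify_epistasis(eps_negative), classify_epistasis(eps_positive))
```

Other null models (for example multiplicative fitness on a linear scale) would change the formula, so the choice of scale should be stated alongside any reported epistasis values.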
ABSTRACT
Organisms within all domains of life require the cofactor cobalamin (vitamin B12), which is produced only by a subset of bacteria and archaea. On the basis of genomic analyses, cobalamin biosynthesis in marine systems has been inferred in three main groups: select heterotrophic Proteobacteria, chemoautotrophic Thaumarchaeota, and photoautotrophic Cyanobacteria. Culture work demonstrates that many Cyanobacteria do not synthesize cobalamin but rather produce pseudocobalamin, challenging the connection between the occurrence of cobalamin biosynthesis genes and production of the compound in marine ecosystems. Here we show that cobalamin and pseudocobalamin coexist in the surface ocean, have distinct microbial sources, and support different enzymatic demands. Even in the presence of cobalamin, Cyanobacteria synthesize pseudocobalamin, likely reflecting their retention of an oxygen-independent pathway to produce pseudocobalamin, which is used as a cofactor in their specialized methionine synthase (MetH). This contrasts with a model diatom, Thalassiosira pseudonana, which transported pseudocobalamin into the cell but was unable to use pseudocobalamin in its homolog of MetH. Our genomic and culture analyses showed that marine Thaumarchaeota and select heterotrophic bacteria produce cobalamin. This indicates that cobalamin in the surface ocean is a result of de novo synthesis by heterotrophic bacteria or of modification of closely related compounds like cyanobacterially produced pseudocobalamin. Deeper in the water column, our study implicates Thaumarchaeota as major producers of cobalamin based on genomic potential, cobalamin cell quotas, and abundance. Together, these findings establish the distinctive roles played by abundant prokaryotes in cobalamin-based microbial interdependencies that sustain community structure and function in the ocean.
Subjects
Vitamin B 12/metabolism , 5-Methyltetrahydrofolate-Homocysteine S-Methyltransferase/metabolism , Archaea/metabolism , Cyanobacteria/metabolism , Diatoms/metabolism , Ecosystem , Heterotrophic Processes/physiology , Oceans and Seas
ABSTRACT
Tuberculosis remains the deadliest infectious disease in the world and requires novel therapeutic targets. The ESX-3 secretion system, which is essential for iron and zinc homeostasis and thus M. tuberculosis survival, is a promising target. In this study, we perform a deep mutational scan on the ESX-3 core protein EccD3 in the model organism M. smegmatis. We systematically investigated the functional roles of 145 residues across the soluble ubiquitin-like domain, the conformationally distinct flexible linker, and selected transmembrane helices of EccD3. Our data combined with structural comparisons to ESX-5 complexes support a model where EccD3 stabilizes the complex, with the hinge motif within the linker being particularly sensitive to disruption. Our study is the first deep mutational scan in mycobacteria, which could help guide drug development toward novel treatment of tuberculosis. This study underscores the importance of context-specific mutational analyses for discovering essential protein interactions within mycobacterial systems.
ABSTRACT
In the canonical genetic code, many amino acids are assigned more than one codon. Work by us and others has shown that the choice of these synonymous codons is not random and carries regulatory and functional consequences. Existing protein foundation models ignore this context-dependent role of coding sequence in shaping the protein landscape of the cell. To address this gap, we introduce cdsFM, a suite of codon-resolution large language models, including both EnCodon and DeCodon models, with up to 1B parameters. Pre-trained on 60 million protein-coding sequences from more than 5,000 species, our models effectively learn the relationship between codons and amino acids, recapitulating the overall structure of the genetic code. In addition to outperforming state-of-the-art genomic foundation models in a variety of zero-shot and few-shot learning tasks, the larger pre-trained models were superior in predicting the choice of synonymous codons. To systematically assess the impact of synonymous codon choices on protein expression and our models' ability to capture these effects, we generated a large dataset measuring overall and surface expression levels of three proteins as a function of changes in their synonymous codons. We showed that our EnCodon models could be readily fine-tuned to predict the contextual consequences of synonymous codon choices. Armed with this knowledge, we applied EnCodon to existing clinical datasets of synonymous variants, and we identified a large number of synonymous codons that are likely pathogenic, several of which we experimentally confirmed in a cell-based model. Together, our findings establish the cdsFM suite as a powerful tool for decoding the complex functional grammar underlying the choice of synonymous codons.
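To make "codon resolution" and "synonymous codons" concrete, the following minimal sketch splits a coding sequence into codon tokens and lists the alternative codons for each amino acid under the standard genetic code. It is not the cdsFM tokenizer; the function names are hypothetical, and the only external fact used is the standard codon table (NCBI translation table 1).

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered by T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def tokenize_cds(seq: str) -> list[str]:
    """Split a coding sequence into codon tokens (hypothetical helper)."""
    seq = seq.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(codons: list[str]) -> str:
    """Translate codon tokens to one-letter amino acids ('*' marks a stop)."""
    return "".join(CODON_TABLE[c] for c in codons)


def synonymous_codons(codon: str) -> list[str]:
    """Return the other codons that encode the same amino acid, sorted."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa and c != codon)


codons = tokenize_cds("ATGGCCTAA")
print(codons, translate(codons), synonymous_codons("GCC"))
```

A protein-level model sees only the translated string, whereas a codon-level model sees the token list, which is what allows it to distinguish synonymous variants.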
ABSTRACT
Deep mutational scanning (DMS) measures the effects of thousands of genetic variants in a protein simultaneously. The small sample size renders classical statistical methods ineffective. For example, p-values cannot be correctly calibrated when treating variants independently. We propose Rosace, a Bayesian framework for analyzing growth-based DMS data. Rosace leverages amino acid position information to increase power and control the false discovery rate by sharing information across parameters via shrinkage. We also developed Rosette for simulating the distributional properties of DMS. We show that Rosace is robust to the violation of model assumptions and is more powerful than existing tools.
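The idea of sharing information across variants at the same amino acid position can be illustrated with a deliberately simplified sketch. This is not the Rosace model: it computes a wild-type-normalized log enrichment from hypothetical pre- and post-selection counts and then pulls each variant score toward the mean of its position with a fixed weight. The function names, the pseudocount, and the variance values are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def log_enrichment(pre: int, post: int, wt_pre: int, wt_post: int,
                   pseudo: float = 0.5) -> float:
    """Log enrichment of a variant relative to wild type (pseudocount is assumed)."""
    variant = math.log((post + pseudo) / (pre + pseudo))
    wild_type = math.log((wt_post + pseudo) / (wt_pre + pseudo))
    return variant - wild_type


def shrink_by_position(scores: dict[tuple[int, str], float],
                       noise_var: float, prior_var: float) -> dict[tuple[int, str], float]:
    """Shrink each score toward the mean score of its position.

    scores maps (position, amino_acid) to a raw score. The weight
    prior_var / (prior_var + noise_var) is a fixed, illustrative choice;
    a full Bayesian model would infer these quantities from the data.
    """
    by_position = defaultdict(list)
    for (position, _aa), score in scores.items():
        by_position[position].append(score)
    weight = prior_var / (prior_var + noise_var)
    shrunk = {}
    for (position, aa), score in scores.items():
        position_mean = sum(by_position[position]) / len(by_position[position])
        shrunk[(position, aa)] = weight * score + (1 - weight) * position_mean
    return shrunk


raw = {(10, "A"): -2.0, (10, "G"): 0.0, (11, "A"): 1.0}
print(shrink_by_position(raw, noise_var=1.0, prior_var=1.0))
```

With noisy scores (large `noise_var`), variants are pulled strongly toward their position mean; with precise scores, they are left nearly unchanged.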
Subjects
Bayes Theorem , Humans , Software , Mutation , DNA Mutational Analysis/methods
ABSTRACT
MET is a receptor tyrosine kinase (RTK) responsible for initiating signaling pathways involved in development and wound repair. MET activation relies on ligand binding to the extracellular receptor, which prompts dimerization, intracellular phosphorylation, and recruitment of associated signaling proteins. Mutations, which are predominantly observed clinically in the intracellular juxtamembrane and kinase domains, can disrupt typical MET regulatory mechanisms. Understanding how juxtamembrane variants, such as exon 14 skipping (METΔEx14), and rare kinase domain mutations can increase signaling, often leading to cancer, remains a challenge. Here, we perform a parallel deep mutational scan (DMS) of the MET intracellular kinase domain in two fusion protein backgrounds: wild-type and METΔEx14. Our comparative approach has revealed a critical hydrophobic interaction between a juxtamembrane segment and the kinase αC-helix, pointing to potential differences in regulatory mechanisms between MET and other RTKs. Additionally, we have uncovered a β5 motif that acts as a structural pivot for the kinase domain in MET and other TAM family kinases. We also describe a number of previously unknown activating mutations, aiding the effort to annotate driver, passenger, and drug resistance mutations in the MET kinase domain.
Subjects
Proto-Oncogene Proteins c-met , Proto-Oncogene Proteins c-met/genetics , Proto-Oncogene Proteins c-met/metabolism , Humans , Protein Domains/genetics , Mutation , Amino Acid Motifs , DNA Mutational Analysis
ABSTRACT
Mutations in the kinase and juxtamembrane domains of the MET Receptor Tyrosine Kinase are responsible for oncogenesis in various cancers and can drive resistance to MET-directed treatments. Determining the most effective inhibitor for each mutational profile is a major challenge for MET-driven cancer treatment in precision medicine. Here, we used a deep mutational scan (DMS) of ~5,764 MET kinase domain variants to profile the growth of each mutation against a panel of 11 inhibitors that are reported to target the MET kinase domain. We identified common resistance sites across type I, type II, and type I ½ inhibitors, unveiled unique resistance and sensitizing mutations for each inhibitor, and validated non-cross-resistant sensitivities for type I and type II inhibitor pairs. We augment a protein language model with biophysical and chemical features to improve the predictive performance for inhibitor-treated datasets. Together, our study demonstrates a pooled experimental pipeline for identifying resistance mutations, provides a reference dictionary for mutations that are sensitized to specific therapies, and offers insights for future drug development.
ABSTRACT
Three proton-sensing G protein-coupled receptors (GPCRs), GPR4, GPR65, and GPR68, respond to changes in extracellular pH to regulate diverse physiology and are implicated in a wide range of diseases. A central challenge in determining how protons activate these receptors is identifying the set of residues that bind protons. Here, we determine structures of each receptor to understand the spatial arrangement of putative proton-sensing residues in the active state. With a newly developed deep mutational scanning approach, we determined the functional importance of every residue in proton activation for GPR68 by generating ~9,500 mutants and measuring effects on signaling and surface expression. This unbiased screen revealed that, unlike other proton-sensitive cell surface channels and receptors, no single site is critical for proton recognition in GPR68. Instead, a network of titratable residues extends from the extracellular surface to the transmembrane region and converges on canonical class A GPCR activation motifs to activate proton-sensing GPCRs. More broadly, our approach integrating structure and unbiased functional interrogation defines a new framework for understanding the rich complexity of GPCR signaling.
ABSTRACT
Background: Multiplexed Assays of Variant Effects (MAVEs) can test all possible single variants in a gene of interest. The resulting saturation-style data may help resolve variant classification disparities between populations, especially for variants of uncertain significance (VUS). Methods: We analyzed clinical significance classifications in 213,663 individuals of European-like genetic ancestry versus 206,975 individuals of non-European-like genetic ancestry from All of Us and the Genome Aggregation Database. Then, we incorporated clinically calibrated MAVE data into the Clinical Genome Resource's Variant Curation Expert Panel rules to automate VUS reclassification for BRCA1, TP53, and PTEN. Results: Using two orthogonal statistical approaches, we show a higher prevalence (p ≤ 5.95e-06) of VUS in individuals of non-European-like genetic ancestry across all medical specialties assessed in all three databases. Further, in the non-European-like genetic ancestry group, higher rates of Benign or Likely Benign variants and of variants with no clinical designation (p ≤ 2.5e-05) were found across many medical specialties, whereas Pathogenic or Likely Pathogenic assignments were higher in individuals of European-like genetic ancestry (p ≤ 2.5e-05). Using MAVE data, we reclassified VUS in individuals of non-European-like genetic ancestry at a significantly higher rate than VUS from individuals of European-like genetic ancestry (p = 9.1e-03), effectively compensating for the VUS disparity. Further, essential code analysis showed equitable impact of MAVE evidence codes but inequitable impact of allele frequency (p = 7.47e-06) and computational predictor (p = 6.92e-05) evidence codes for individuals of non-European-like genetic ancestry. Conclusions: Generation of saturation-style MAVE data should be a priority to reduce VUS disparities and produce equitable training data for future computational predictors.
ABSTRACT
Insertions and deletions (indels) enable evolution and cause disease. Due to technical challenges, indels are left out of most mutational scans, limiting our understanding of them in disease, biology, and evolution. We develop a low-cost, low-bias method, DIMPLE, for systematically generating deletions, insertions, and missense mutations in genes, which we test on a range of targets, including Kir2.1. We use DIMPLE to study how indels impact potassium channel structure, disease, and evolution. We find that deletions are most disruptive overall, beta sheets are most sensitive to indels, and flexible loops are sensitive to deletions yet tolerate insertions.
Subjects
Missense Mutation , Proteins , Mutation , Proteins/genetics , INDEL Mutation , Biology
ABSTRACT
MET is a receptor tyrosine kinase (RTK) responsible for initiating signaling pathways involved in development and wound repair. MET activation relies on ligand binding to the extracellular receptor, which prompts dimerization, intracellular phosphorylation, and recruitment of associated signaling proteins. Mutations, which are predominantly observed clinically in the intracellular juxtamembrane and kinase domains, can disrupt typical MET regulatory mechanisms. Understanding how juxtamembrane variants, such as exon 14 skipping (METΔEx14), and rare kinase domain mutations can increase signaling, often leading to cancer, remains a challenge. Here, we perform a parallel deep mutational scan (DMS) of the MET intracellular kinase domain in two fusion protein backgrounds: wild type and METΔEx14. Our comparative approach has revealed a critical hydrophobic interaction between a juxtamembrane segment and the kinase αC helix, pointing to differences in regulatory mechanisms between MET and other RTKs. Additionally, we have uncovered a β5 motif that acts as a structural pivot for kinase domain activation in MET and other TAM family kinases. We also describe a number of previously unknown activating mutations, aiding the effort to annotate driver, passenger, and drug resistance mutations in the MET kinase domain.
ABSTRACT
Membrane transporters play a fundamental role in the tissue distribution of endogenous compounds and xenobiotics and are major determinants of efficacy and side-effect profiles. Polymorphisms within these drug transporters result in inter-individual variation in drug response, with some patients not responding to the recommended dosage of a drug whereas others experience catastrophic side effects. For example, variants within the major hepatic human organic cation transporter OCT1 (SLC22A1) can change the levels of endogenous organic cations and many prescription drugs. To understand how variants mechanistically impact drug uptake, we systematically study how all known and possible single missense and single-amino-acid deletion variants impact expression and substrate uptake of OCT1. We find that human variants primarily disrupt function via folding rather than substrate uptake. Our study revealed that the major determinants of folding reside in the first 300 amino acids, including the first 6 transmembrane domains and the extracellular domain (ECD), with a highly conserved stabilizing helical motif making key interactions between the ECD and transmembrane domains. Using the functional data combined with computational approaches, we determine and validate a structure-function model of OCT1's conformational ensemble without experimental structures. Using this model and molecular dynamics simulations of key mutants, we determine biophysical mechanisms for how specific human variants alter transport phenotypes. We identify differences in the frequencies of reduced-function alleles across populations, with East Asian and European populations having the lowest and highest frequencies of reduced-function variants, respectively. Mining human population databases reveals that reduced-function alleles of OCT1 identified in this study associate significantly with high LDL cholesterol levels. Our general approach, broadly applied, could transform the landscape of precision medicine by producing a mechanistic basis for understanding the effects of human mutations on disease and drug response.
ABSTRACT
A long-standing goal in protein science and clinical genetics is to develop quantitative models of sequence, structure, and function relationships to understand how mutations cause disease. Deep mutational scanning (DMS) is a promising strategy to map how amino acids contribute to protein structure and function and to advance clinical variant interpretation. Here, we introduce 7429 single-residue missense mutations into the inward rectifier K+ channel Kir2.1 and determine how this affects folding, assembly, and trafficking, as well as regulation by allosteric ligands and ion conduction. Our data provide high-resolution information on a cotranslationally folded biogenic unit, trafficking and quality control signals, and segregated roles of different structural elements in fold stability and function. We show that Kir2.1 surface trafficking mutants are underrepresented in variant effect databases, which has implications for clinical practice. By comparing fitness scores with expert-reviewed variant effects, we can predict the pathogenicity of 'variants of unknown significance' and disease mechanisms of known pathogenic mutations. Our study in Kir2.1 provides a blueprint for how multiparametric DMS can help us understand the mechanistic basis of genetic disorders and the structure-function relationships of proteins.
Subjects
Inwardly Rectifying Potassium Channels , Mutation , Missense Mutation , Inwardly Rectifying Potassium Channels/genetics , Inwardly Rectifying Potassium Channels/metabolism , Proteins/metabolism
ABSTRACT
A new way to alter the genome of bacteriophages helps produce large libraries of variants, allowing these bacteria-killing viruses to be designed to target species harmful to human health.
Subjects
Bacteriophages , Bacteria/genetics , Bacteriophages/genetics , Humans
ABSTRACT
Protein domains are the basic units of protein structure and function. Comparative analysis of genomes and proteomes showed that domain recombination is a main driver of multidomain protein functional diversification, and some of the constraining genomic mechanisms are known. Much less is known about the biophysical mechanisms that determine whether protein domains can be combined into viable protein folds. Here, we use massively parallel insertional mutagenesis to determine the compatibility of over 300,000 domain recombination variants of the Inward Rectifier K+ channel Kir2.1 with channel surface expression. Our data suggest that genomic and biophysical mechanisms acted in concert to favor the gain of large, structured domains at protein termini during ion channel evolution. We use machine learning to build a quantitative biophysical model of domain compatibility in Kir2.1 that allows us to derive rudimentary rules for designing domain insertion variants that fold and traffic to the cell surface. Positional Kir2.1 responses to motif insertion cluster into distinct groups that correspond to contiguous structural regions of the channel with distinct biophysical properties tuned towards providing either folding stability or gating transitions. This suggests that insertional profiling is a high-throughput method to annotate the function of ion channel structural regions.
Subjects
Biophysics , Potassium Channels/chemistry , Potassium Channels/genetics , Genetic Recombination , Cell Line , Cell Membrane , G Protein-Coupled Inwardly-Rectifying Potassium Channels , Gene Expression Profiling , HEK293 Cells , Humans , Ion Channel Gating/genetics , Ion Channel Gating/physiology , Machine Learning , Insertional Mutagenesis , Potassium/metabolism , Inwardly Rectifying Potassium Channels/chemistry , Inwardly Rectifying Potassium Channels/genetics , Protein Domains/genetics , Transcriptome
ABSTRACT
Allostery is a fundamental principle of protein regulation that remains hard to engineer, particularly in membrane proteins such as ion channels. Here we use human Inward Rectifier K+ Channel Kir2.1 to map site-specific permissibility to the insertion of domains with different biophysical properties. We find that permissibility is best explained by dynamic protein properties, such as conformational flexibility. Several regions in Kir2.1 that are equivalent to those regulated in homologs, such as G-protein-gated inward rectifier K+ channels (GIRK), have differential permissibility; that is, for these sites permissibility depends on the structural properties of the inserted domain. Our data and the well-established link between protein dynamics and allostery led us to propose that differential permissibility is a metric of latent allosteric capacity in Kir2.1. In support of this notion, inserting light-switchable domains into sites with predicted latent allosteric capacity renders Kir2.1 activity sensitive to light.