Results 1 - 20 of 196
1.
bioRxiv ; 2024 Jul 24.
Article in English | MEDLINE | ID: mdl-39091798

ABSTRACT

Multi-domain enzymes can be regulated by both inter-domain interactions and structural features intrinsic to the catalytic domain. The tyrosine phosphatase SHP2 is a quintessential example of a multi-domain protein that is regulated by inter-domain interactions. This enzyme has a protein tyrosine phosphatase (PTP) domain and two phosphotyrosine-recognition domains (N-SH2 and C-SH2) that regulate phosphatase activity through autoinhibitory interactions. SHP2 is canonically activated by phosphoprotein binding to the SH2 domains, which causes large inter-domain rearrangements, but autoinhibition can also be disrupted by disease-associated mutations. Many details of the SHP2 activation mechanism are still unclear, the physiologically-relevant active conformations remain elusive, and hundreds of human variants of SHP2 have not been functionally characterized. Here, we perform deep mutational scanning on both full-length SHP2 and its isolated PTP domain to examine mutational effects on inter-domain regulation and catalytic activity. Our experiments provide a comprehensive map of SHP2 mutational sensitivity, both in the presence and absence of inter-domain regulation. Coupled with molecular dynamics simulations, our investigation reveals novel structural features that govern the stability of the autoinhibited and active states of SHP2. Our analysis also identifies key residues beyond the SHP2 active site that control PTP domain dynamics and intrinsic catalytic activity. This work expands our understanding of SHP2 regulation and provides new insights into SHP2 pathogenicity.

2.
Front Immunol ; 15: 1426795, 2024.
Article in English | MEDLINE | ID: mdl-39108267

ABSTRACT

B cells surveil the body for foreign matter using their surface-expressed B cell antigen receptor (BCR), a tetrameric complex comprising a membrane-tethered antibody (mIg) that binds antigens and a signaling dimer (CD79AB) that conveys this interaction to the B cell. Recent cryogenic electron microscopy (cryo-EM) structures of IgM and IgG isotype BCRs provide the first complete views of their architecture, revealing that the largest interaction surfaces between the mIg and CD79AB are in their transmembrane domains (TMDs). These structures support decades of biochemical work interrogating the requirements for assembly of a functional BCR and provide the basis for explaining the effects of mutations. Here we report a focused saturating mutagenesis to comprehensively characterize the nature of the interactions in the mIg TMD that are required for BCR surface expression. We examined the effects of 600 single-amino-acid changes simultaneously in a pooled competition assay and quantified their effects by next-generation sequencing. Our deep mutational scanning results reflect a feature-rich TMD sequence, with some positions completely intolerant to mutation and others requiring specific biochemical properties such as charge, polarity or hydrophobicity, emphasizing the high value of saturating mutagenesis over, for example, alanine scanning. The data agree closely with published mutagenesis and the cryo-EM structures, while also highlighting several positions and surfaces that have not previously been characterized or have effects that are difficult to rationalize purely based on structure. This unbiased and complete mutagenesis dataset serves as a reference and framework for informed hypothesis testing, design of therapeutics to regulate BCR surface expression and to annotate patient mutations.


Subjects
B-Cell Antigen Receptors , B-Cell Antigen Receptors/genetics , B-Cell Antigen Receptors/immunology , B-Cell Antigen Receptors/metabolism , Humans , Mutation , Animals , B-Lymphocytes/immunology , B-Lymphocytes/metabolism , CD79 Antigens/genetics , CD79 Antigens/metabolism , CD79 Antigens/immunology , Cell Membrane/metabolism , Mice
3.
BMC Bioinformatics ; 25(1): 229, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38956474

ABSTRACT

Adeno-associated viruses 2 (AAV2) are minute viruses renowned for their capacity to infect human cells and akin organisms. They have recently emerged as prominent candidates in the field of gene therapy, primarily attributed to their inherent non-pathogenic nature in humans and the safety associated with their manipulation. The efficacy of AAV2 as gene therapy vectors hinges on their ability to infiltrate host cells, a phenomenon reliant on their competence to construct a capsid capable of breaching the nucleus of the target cell. To enhance their infection potential, researchers have extensively scrutinized various combinatorial libraries by introducing mutations into the capsid, aiming to boost their effectiveness. The emergence of high-throughput experimental techniques, like deep mutational scanning (DMS), has made it feasible to experimentally assess the fitness of these libraries for their intended purpose. Notably, machine learning is starting to demonstrate its potential in addressing predictions within the mutational landscape from sequence data. In this context, we introduce a biophysically-inspired model designed to predict the viability of genetic variants in DMS experiments. This model is tailored to a specific segment of the CAP region within AAV2's capsid protein. To evaluate its effectiveness, we conduct model training with diverse datasets, each tailored to explore different aspects of the mutational landscape influenced by the selection process. Our assessment of the biophysical model centers on two primary objectives: (i) providing quantitative forecasts for the log-selectivity of variants and (ii) deploying it as a binary classifier to categorize sequences into viable and non-viable classes.


Subjects
Mutation , Humans , Capsid Proteins/genetics , Dependovirus/genetics , Parvovirinae/genetics
4.
Immunity ; 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-39013466

ABSTRACT

Lassa virus is estimated to cause thousands of human deaths per year, primarily due to spillovers from its natural host, Mastomys rodents. Efforts to create vaccines and antibody therapeutics must account for the evolutionary variability of the Lassa virus's glycoprotein complex (GPC), which mediates viral entry into cells and is the target of neutralizing antibodies. To map the evolutionary space accessible to GPC, we used pseudovirus deep mutational scanning to measure how nearly all GPC amino-acid mutations affected cell entry and antibody neutralization. Our experiments defined functional constraints throughout GPC. We quantified how GPC mutations affected neutralization with a panel of monoclonal antibodies. All antibodies tested were escaped by mutations that existed among natural Lassa virus lineages. Overall, our work describes a biosafety-level-2 method to elucidate the mutational space accessible to GPC and shows how prospective characterization of antigenic variation could aid the design of therapeutics and vaccines.

5.
bioRxiv ; 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-38979347

ABSTRACT

The large-scale experimental measures of variant functional assays submitted to MaveDB have the potential to provide key information for resolving variants of uncertain significance, but the reporting of results relative to assayed sequence hinders their downstream utility. The Atlas of Variant Effects Alliance mapped multiplexed assays of variant effect data to human reference sequences, creating a robust set of machine-readable homology mappings. This method processed approximately 2.5 million protein and genomic variants in MaveDB, successfully mapping 98.61% of examined variants and disseminating data to resources such as the UCSC Genome Browser and Ensembl Variant Effect Predictor.

6.
Cell Host Microbe ; 32(8): 1397-1411.e11, 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-39032493

ABSTRACT

Human influenza virus evolves to escape neutralization by polyclonal antibodies. However, we have a limited understanding of how the antigenic effects of viral mutations vary across the human population and how this heterogeneity affects virus evolution. Here, we use deep mutational scanning to map how mutations to the hemagglutinin (HA) proteins of two H3N2 strains, A/Hong Kong/45/2019 and A/Perth/16/2009, affect neutralization by serum from individuals of a variety of ages. The effects of HA mutations on serum neutralization differ across age groups in ways that can be partially rationalized in terms of exposure histories. Mutations that were fixed in influenza variants after 2020 cause greater escape from sera from younger individuals compared with adults. Overall, these results demonstrate that influenza faces distinct antigenic selection regimes from different age groups and suggest approaches to understand how this heterogeneous selection shapes viral evolution.


Subjects
Viral Antibodies , Influenza Virus Hemagglutinin Glycoproteins , Influenza A Virus H3N2 Subtype , Human Influenza , Mutation , Humans , Influenza Virus Hemagglutinin Glycoproteins/genetics , Influenza Virus Hemagglutinin Glycoproteins/immunology , Influenza A Virus H3N2 Subtype/genetics , Influenza A Virus H3N2 Subtype/immunology , Adult , Viral Antibodies/immunology , Viral Antibodies/blood , Human Influenza/virology , Human Influenza/immunology , Age Factors , Middle Aged , Young Adult , Neutralizing Antibodies/immunology , Neutralizing Antibodies/blood , Viral Antigens/genetics , Viral Antigens/immunology , Adolescent , Molecular Evolution , Aged , Child
7.
Antiviral Res ; 229: 105961, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39002800

ABSTRACT

Baloxavir acid (BXA) is a pan-influenza antiviral that targets the cap-dependent endonuclease of the polymerase acidic (PA) protein required for viral mRNA synthesis. To gain a comprehensive understanding of the molecular changes associated with reduced susceptibility to BXA and their fitness profile, we performed deep mutational scanning of the PA endonuclease domain of an A(H1N1)pdm09 virus. The recombinant virus libraries were serially passaged in vitro under increasing concentrations of BXA, followed by next-generation sequencing to monitor PA amino acid substitutions with increased detection frequencies. Enriched PA amino acid changes were each introduced into a recombinant A(H1N1)pdm09 virus to validate their effect on BXA susceptibility and viral replication fitness in vitro. The I38T/M substitutions known to confer reduced susceptibility to BXA were invariably detected from recombinant virus libraries within 5 serial passages. In addition, we identified a novel L106R substitution that emerged in the third passage and conferred greater than 10-fold reduced susceptibility to BXA. PA-L106 is highly conserved among seasonal influenza A and B viruses. Compared to the wild-type virus, the L106R substitution resulted in reduced polymerase activity and a minor reduction of the peak viral load, suggesting the amino acid change may result in moderate fitness loss. Our results support the use of deep mutational scanning as a practical tool to elucidate genotype-phenotype relationships, including mapping amino acid substitutions with reduced susceptibility to antivirals.


Subjects
Amino Acid Substitution , Antiviral Agents , Dibenzothiepins , Viral Drug Resistance , Influenza A Virus H1N1 Subtype , Morpholines , Pyridones , Triazines , Viral Proteins , Virus Replication , Dibenzothiepins/pharmacology , Viral Drug Resistance/genetics , Antiviral Agents/pharmacology , Influenza A Virus H1N1 Subtype/drug effects , Influenza A Virus H1N1 Subtype/genetics , Triazines/pharmacology , Virus Replication/drug effects , Pyridones/pharmacology , Humans , Morpholines/pharmacology , Viral Proteins/genetics , Animals , Thiepins/pharmacology , RNA-Dependent RNA Polymerase/genetics , High-Throughput Nucleotide Sequencing , Dogs , Madin Darby Canine Kidney Cells , Human Influenza/virology , Human Influenza/drug therapy , Oxazines/pharmacology
8.
BMC Genomics ; 25(1): 630, 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38914936

ABSTRACT

Deep Mutational Scanning (DMS) assays are powerful tools to study sequence-function relationships by measuring the effects of thousands of sequence variants on protein function. During a DMS experiment, several technical artefacts might non-linearly distort the functional scores obtained, potentially biasing the interpretation of the results. We therefore tested several technical parameters in the deepPCA workflow, a DMS assay for protein-protein interactions, in order to identify technical sources of non-linearities. We found that parameters common to many DMS assays, such as amount of transformed DNA, timepoint of harvest and library composition, can cause non-linearities in the data. Designing experiments to minimize these non-linear effects will improve the quantification and interpretation of mutation effects.


Subjects
Mutation , Workflow , Proteins/metabolism , Proteins/genetics , High-Throughput Nucleotide Sequencing , Protein Interaction Mapping/methods , DNA Mutational Analysis/methods , Protein Binding
9.
Dis Model Mech ; 17(6), 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38940340

ABSTRACT

Interpreting the wealth of rare genetic variants discovered in population-scale sequencing efforts and deciphering their associations with human health and disease present a critical challenge due to the lack of sufficient clinical case reports. One promising avenue to overcome this problem is deep mutational scanning (DMS), a method of introducing and evaluating large-scale genetic variants in model cell lines. DMS allows unbiased investigation of variants, including those that are not found in clinical reports, thus improving rare disease diagnostics. Currently, the main obstacle limiting the full potential of DMS is the availability of functional assays that are specific to disease mechanisms. Thus, we explore high-throughput functional methodologies suitable to examine broad disease mechanisms. We specifically focus on methods that do not require robotics or automation but instead use well-designed molecular tools to transform biological mechanisms into easily detectable signals, such as cell survival rate, fluorescence or drug resistance. Here, we aim to bridge the gap between disease-relevant assays and their integration into the DMS framework.


Subjects
High-Throughput Screening Assays , Animals , Humans , Disease/genetics , Genetic Variation , High-Throughput Screening Assays/methods , Mutation/genetics
10.
Mol Syst Biol ; 20(7): 825-844, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38849565

ABSTRACT

Nonsense and missense mutations in the transcription factor PAX6 cause a wide range of eye development defects, including aniridia, microphthalmia and coloboma. To understand how changes of PAX6:DNA binding cause these phenotypes, we combined saturation mutagenesis of the paired domain of PAX6 with a yeast one-hybrid (Y1H) assay in which expression of a PAX6-GAL4 fusion gene drives antibiotic resistance. We quantified binding of more than 2700 single amino-acid variants to two DNA sequence elements. Mutations in DNA-facing residues of the N-terminal subdomain and linker region were most detrimental, as were mutations to prolines and to negatively charged residues. Many variants caused sequence-specific molecular gain-of-function effects, including variants in position 71 that increased binding to the LE9 enhancer but decreased binding to a SELEX-derived binding site. In the absence of antibiotic selection, variants that retained DNA binding slowed yeast growth, likely because such variants perturbed the yeast transcriptome. Benchmarking against known patient variants and applying ACMG/AMP guidelines to variant classification, we obtained supporting-to-moderate evidence that 977 variants are likely pathogenic and 1306 are likely benign. Our analysis shows that most pathogenic mutations in the paired domain of PAX6 can be explained simply by the effects of these mutations on PAX6:DNA association, and establishes Y1H as a generalisable assay for the interpretation of variant effects in transcription factors.


Subjects
DNA , PAX6 Transcription Factor , PAX6 Transcription Factor/genetics , PAX6 Transcription Factor/metabolism , Humans , DNA/genetics , DNA/metabolism , Binding Sites , Protein Binding , Mutation , Two-Hybrid System Techniques , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Missense Mutation , Transcription Factors/genetics , Transcription Factors/metabolism , DNA Mutational Analysis
11.
Elife ; 12, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38767330

ABSTRACT

A protein's genetic architecture - the set of causal rules by which its sequence produces its functions - also determines its possible evolutionary trajectories. Prior research has proposed that the genetic architecture of proteins is very complex, with pervasive epistatic interactions that constrain evolution and make function difficult to predict from sequence. Most of this work has analyzed only the direct paths between two proteins of interest - excluding the vast majority of possible genotypes and evolutionary trajectories - and has considered only a single protein function, leaving unaddressed the genetic architecture of functional specificity and its impact on the evolution of new functions. Here, we develop a new method based on ordinal logistic regression to directly characterize the global genetic determinants of multiple protein functions from 20-state combinatorial deep mutational scanning (DMS) experiments. We use it to dissect the genetic architecture and evolution of a transcription factor's specificity for DNA, using data from a combinatorial DMS of an ancient steroid hormone receptor's capacity to activate transcription from two biologically relevant DNA elements. We show that the genetic architecture of DNA recognition consists of a dense set of main and pairwise effects that involve virtually every possible amino acid state in the protein-DNA interface, but higher-order epistasis plays only a tiny role. Pairwise interactions enlarge the set of functional sequences and are the primary determinants of specificity for different DNA elements. They also massively expand the number of opportunities for single-residue mutations to switch specificity from one DNA target to another. By bringing variants with different functions close together in sequence space, pairwise epistasis therefore facilitates rather than constrains the evolution of new functions.


Subjects
Genetic Epistasis , Molecular Evolution , Transcription Factors/metabolism , Transcription Factors/genetics , DNA/genetics , DNA/metabolism , Mutation , Protein Binding
12.
Cell ; 187(11): 2735-2745.e12, 2024 May 23.
Article in English | MEDLINE | ID: mdl-38723628

ABSTRACT

Hepatitis B virus (HBV) is a small double-stranded DNA virus that chronically infects 296 million people. Over half of its compact genome encodes proteins in two overlapping reading frames, and during evolution, multiple selective pressures can act on shared nucleotides. This study combines an RNA-based HBV cell culture system with deep mutational scanning (DMS) to uncouple cis- and trans-acting sequence requirements in the HBV genome. The results support a leaky ribosome scanning model for polymerase translation, provide a fitness map of the HBV polymerase at single-nucleotide resolution, and identify conserved prolines adjacent to the HBV polymerase termination codon that stall ribosomes. Further experiments indicated that stalled ribosomes tether the nascent polymerase to its template RNA, ensuring cis-preferential RNA packaging and reverse transcription of the HBV genome.


Subjects
Hepatitis B Virus , Reverse Transcription , Humans , Viral Genome/genetics , Hepatitis B Virus/genetics , Mutation , Ribosomes/metabolism , Viral RNA/genetics , Viral RNA/metabolism , Cell Line
13.
Mol Cell ; 84(10): 1932-1947.e10, 2024 May 16.
Article in English | MEDLINE | ID: mdl-38703769

ABSTRACT

Mutations in transporters can impact an individual's response to drugs and cause many diseases. Few variants in transporters have been evaluated for their functional impact. Here, we combine saturation mutagenesis and multi-phenotypic screening to dissect the impact of 11,213 missense, single-amino-acid deletion, and synonymous variants across the 554 residues of OCT1, a key liver xenobiotic transporter. By quantifying expression and substrate uptake in parallel, we find that most variants exert their primary effect on protein abundance, a phenotype not commonly measured alongside function. Using our mutagenesis results combined with structure prediction and molecular dynamics simulations, we develop accurate structure-function models of the entire transport cycle, providing biophysical characterization of all known and possible human OCT1 polymorphisms. This work provides a complete functional map of OCT1 variants along with a framework for integrating functional genomics, biophysical modeling, and human genetics to predict variant effects on disease and drug efficacy.


Subjects
Molecular Dynamics Simulation , Organic Cation Transporter 1 , Protein Conformation , Humans , Biological Transport , HEK293 Cells , Mutation , Missense Mutation , Octamer Transcription Factor-1 , Organic Cation Transporter 1/genetics , Organic Cation Transporter 1/metabolism , Pharmacogenetics , Phenotype , Structure-Activity Relationship
14.
Front Cell Infect Microbiol ; 14: 1381155, 2024.
Article in English | MEDLINE | ID: mdl-38650737

ABSTRACT

Kinetoplastid pathogens, including Trypanosoma brucei, T. cruzi, and Leishmania species, are early diverged, eukaryotic, unicellular parasites. Functional understanding of many proteins from these pathogens has been hampered by limited sequence homology to proteins from other model organisms. Here we describe the development of a high-throughput deep mutational scanning approach in T. brucei that facilitates rapid and unbiased assessment of the impacts of many possible amino acid substitutions within a protein on cell fitness, as measured by relative cell growth. The approach leverages several molecular technologies: cells with conditional expression of a wild-type gene of interest and constitutive expression of a library of mutant variants, degron-controlled stabilization of I-SceI meganuclease to mediate highly efficient transfection of a mutant allele library, and a high-throughput sequencing readout for cell growth upon conditional knockdown of wild-type gene expression and exclusive expression of mutant variants. Using this method, we queried the effects of amino acid substitutions in the apparently non-catalytic RNase III-like domain of KREPB4 (B4), which is an essential component of the RNA Editing Catalytic Complexes (RECCs) that carry out mitochondrial RNA editing in T. brucei. We measured the impacts of thousands of B4 variants on bloodstream form cell growth and validated the most deleterious variants containing single amino acid substitutions. Crucially, there was no correlation between phenotypes and amino acid conservation, demonstrating the greater power of this method over traditional sequence homology searching to identify functional residues. The bloodstream form cell growth phenotypes were combined with structural modeling, RECC protein proximity data, and analysis of selected substitutions in procyclic form T. brucei. These analyses revealed that the B4 RNase III-like domain is essential for maintenance of RECC integrity and RECC protein abundances and is also involved in changes in RECCs that occur between bloodstream and procyclic form life cycle stages.


Subjects
Protozoan Proteins , RNA Editing , Ribonuclease III , Trypanosoma brucei brucei , Amino Acid Substitution , DNA Mutational Analysis , High-Throughput Nucleotide Sequencing , Mutation , Protein Domains/genetics , Protozoan Proteins/genetics , Protozoan Proteins/metabolism , Ribonuclease III/genetics , Ribonuclease III/metabolism , Trypanosoma brucei brucei/genetics , Trypanosoma brucei brucei/metabolism , Trypanosoma brucei brucei/growth & development
15.
Genome Biol ; 25(1): 100, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38641812

ABSTRACT

Multiplexed assays of variant effect (MAVEs) have emerged as a powerful approach for interrogating thousands of genetic variants in a single experiment. The flexibility and widespread adoption of these techniques across diverse disciplines have led to a heterogeneous mix of data formats and descriptions, which complicates the downstream use of the resulting datasets. To address these issues and promote reproducibility and reuse of MAVE data, we define a set of minimum information standards for MAVE data and metadata and outline a controlled vocabulary aligned with established biomedical ontologies for describing these experimental designs.


Subjects
Metadata , Research Design , Reproducibility of Results
16.
Proc Natl Acad Sci U S A ; 121(15): e2317222121, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38557175

ABSTRACT

Antigenic drift of SARS-CoV-2 is typically defined by mutations in the N-terminal domain and receptor binding domain of spike protein. In contrast, whether antigenic drift occurs in the S2 domain remains largely elusive. Here, we perform a deep mutational scanning experiment to identify S2 mutations that affect binding of SARS-CoV-2 spike to three S2 apex public antibodies. Our results indicate that spatially diverse mutations, including D950N and Q954H, which are observed in Delta and Omicron variants, respectively, weaken the binding of spike to these antibodies. Although S2 apex antibodies are known to be nonneutralizing, we show that they confer protection in vivo through Fc-mediated effector functions. Overall, this study indicates that the S2 domain of SARS-CoV-2 spike can undergo antigenic drift, which represents a potential challenge for the development of more universal coronavirus vaccines.


Subjects
Antigenic Drift and Shift , COVID-19 , Humans , SARS-CoV-2/genetics , Antibodies , Coronavirus Spike Glycoprotein/genetics , Viral Antibodies
17.
Proteins ; 92(7): 886-902, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38501649

ABSTRACT

Proteins are used in various biotechnological applications, often requiring the optimization of protein properties by introducing specific amino-acid exchanges. Deep mutational scanning (DMS) is an effective high-throughput method for evaluating the effects of these exchanges on protein function. DMS data can then inform the training of a neural network to predict the impact of mutations. Most approaches use some representation of the protein sequence for training and prediction. As proteins are characterized by complex structures and intricate residue interaction networks, directly providing structural information as input reduces the need to learn these features from the data. We introduce a method for encoding protein structures as stacked 2D contact maps, which capture residue interactions, their evolutionary conservation, and mutation-induced interaction changes. Furthermore, we explored techniques to augment neural network training performance on smaller DMS datasets. To validate our approach, we trained three neural network architectures originally used for image analysis on three DMS datasets, and we compared their performances with networks trained solely on protein sequences. The results confirm the effectiveness of the protein structure encoding in machine learning efforts on DMS data. Using structural representations as direct input to the networks, along with data augmentation and pretraining, significantly reduced demands on training data size and improved prediction performance, especially on smaller datasets, while performance on large datasets was on par with state-of-the-art sequence convolutional neural networks. The methods presented here have the potential to provide the same workflow as DMS without the experimental and financial burden of testing thousands of mutants. Additionally, we present an open-source, user-friendly software tool to make these data analysis techniques accessible, particularly to biotechnology and protein engineering researchers who wish to apply them to their mutagenesis data.


Subjects
Computer Neural Networks , Proteins , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Mutation , Protein Databases , Computational Biology/methods , Deep Learning , Algorithms , Protein Conformation , Software , Machine Learning , Humans
18.
Methods Mol Biol ; 2774: 135-152, 2024.
Article in English | MEDLINE | ID: mdl-38441763

ABSTRACT

Sequencing-based, massively parallel genetic assays have enabled simultaneous characterization of the genotype-phenotype relationships for libraries encoding thousands of unique protein variants. Since plasmid transfection and lentiviral transduction have characteristics that limit multiplexing with pooled libraries, we developed a mammalian synthetic biology platform that harnesses the Bxb1 bacteriophage DNA recombinase to insert single promoterless plasmids encoding a transgene of interest into a pre-engineered "landing pad" site within the cell genome. The transgene is expressed behind a genomically integrated promoter, ensuring only one transgene is expressed per cell, preserving a strict genotype-phenotype link. Upon selecting cells based on a desired phenotype, the transgene can be sequenced to ascribe each variant a phenotypic score. We describe how to create and utilize landing pad cells for large-scale, library-based genetic experiments. Using the provided examples, the experimental template can be adapted to explore protein variants in diverse biological problems within mammalian cells.


Subjects
Bacteriophages , Genomics , Animals , High-Throughput Nucleotide Sequencing , Gene Library , Biological Assay , Mutant Proteins , Mammals
19.
Cell Syst ; 15(4): 374-387.e6, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38537640

ABSTRACT

How a protein's function influences the shape of its fitness landscape, smooth or rugged, is a fundamental question in evolutionary biochemistry. Smooth landscapes arise when incremental mutational steps lead to a progressive change in function, as commonly seen in enzymes and binding proteins. On the other hand, rugged landscapes are poorly understood because of the inherent unpredictability of how sequence changes affect function. Here, we experimentally characterize the entire sequence phylogeny, comprising 1,158 extant and ancestral sequences, of the DNA-binding domain (DBD) of the LacI/GalR transcriptional repressor family. Our analysis revealed an extremely rugged landscape with rapid switching of specificity, even between adjacent nodes. Further, the ruggedness arises from the need for the repressor to simultaneously evolve specificity for asymmetric operators and disfavor potentially adverse regulatory crosstalk. Our study provides fundamental insight into evolutionary, molecular, and biophysical rules of genetic regulation through the lens of fitness landscapes.


Subjects
Phylogeny
20.
Circ Genom Precis Med ; 17(2): e004377, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38362799

ABSTRACT

BACKGROUND: Pathogenic autosomal-dominant missense variants in MYH7 (myosin heavy chain 7), which encodes the sarcomeric protein β-MHC (beta myosin heavy chain) expressed in cardiac and skeletal myocytes, are a leading cause of hypertrophic cardiomyopathy and are clinically actionable. However, ≈75% of MYH7 missense variants are of unknown significance. While human-induced pluripotent stem cells (hiPSCs) can be differentiated into cardiomyocytes to enable the interrogation of MYH7 variant effect in a disease-relevant context, deep mutational scanning has not been executed using diploid hiPSC derivatives due to low hiPSC gene-editing efficiency. Moreover, multiplexable phenotypes enabling deep mutational scanning of MYH7 variant hiPSC-derived cardiomyocytes are unknown. METHODS: To overcome these obstacles, we used CRISPRa On-Target Editing Retrieval enrichment to generate an hiPSC library containing 113 MYH7 codon variants suitable for deep mutational scanning. We first established that β-MHC protein loss occurs in a hypertrophic cardiomyopathy human heart with a pathogenic MYH7 variant. We then differentiated the MYH7 missense variant hiPSC library to cardiomyocytes for multiplexed assessment of β-MHC variant abundance by massively parallel sequencing and hiPSC-derived cardiomyocyte survival. RESULTS: Both the multiplexed assessment of β-MHC abundance and hiPSC-derived cardiomyocyte survival accurately segregated all known pathogenic variants from synonymous variants. Functional data were generated for 4 variants of unknown significance and 58 additional MYH7 missense variants not yet detected in patients. CONCLUSIONS: This study leveraged hiPSC differentiation into disease-relevant cardiomyocytes to enable multiplexed assessments of MYH7 missense variants for the first time. Phenotyping strategies used here enable the application of deep mutational scanning to clinically actionable genes, which should reduce the burden of variants of unknown significance on patients and clinicians.


Subjects
Hypertrophic Cardiomyopathy , Induced Pluripotent Stem Cells , Humans , Cardiac Myocytes/metabolism , Myosin Heavy Chains/genetics , Induced Pluripotent Stem Cells/metabolism , Hypertrophic Cardiomyopathy/genetics , Hypertrophic Cardiomyopathy/metabolism , Cell Differentiation/genetics , Cardiac Myosins/genetics