Results 1 - 20 of 990
1.
Protein Sci ; 33(10): e5164, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39276008

ABSTRACT

This review aims to provide an overview of the progress in protein-based artificial photosystem design and the potential of such systems to uncover the underlying principles governing light-harvesting in photosynthesis. While significant advances have been made in this area, a gap persists in reviewing these advances. This review provides a perspective of the field, pinpointing knowledge gaps and unresolved challenges that warrant further inquiry. In particular, it delves into the key considerations when designing photosystems based on the chromophore and protein scaffold characteristics, presents the established strategies for artificial photosystem engineering with their advantages and disadvantages, and underscores the recent breakthroughs in understanding the molecular mechanisms governing light-harvesting, charge separation, and the role of protein motions in the chromophore's excited state relaxation. By disseminating this knowledge, this article provides a foundational resource for defining the field of bio-hybrid photosystems and aims to inspire the continued exploration of artificial photosystems using protein design.


Subject(s)
Photosynthesis; Protein Engineering; Protein Engineering/methods; Light-Harvesting Protein Complexes/chemistry; Light-Harvesting Protein Complexes/metabolism; Models, Molecular
2.
Biotechnol Adv ; 77: 108457, 2024 Sep 27.
Article in English | MEDLINE | ID: mdl-39343083

ABSTRACT

Conditional protein-protein interactions enable dynamic regulation of cellular activity and are an attractive approach to probe native protein interactions, improve metabolic engineering of microbial factories, and develop smart therapeutics. Conditional protein-protein interactions have been engineered to respond to various chemical, light, and nucleic acid-based stimuli. These interactions have been applied to assemble protein fragments, build protein scaffolds, and spatially organize proteins in many microbial and higher-order hosts. To foster the development of novel conditional protein-protein interactions that respond to new inputs or can be utilized in alternative settings, we provide an overview of the process of designing new engineered protein interactions while showcasing many recently developed computational tools that may accelerate protein engineering in this space.

3.
Macromol Biosci ; : e2400126, 2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39239781

ABSTRACT

Protein assembly is an essential process in biological systems, where proteins self-assemble into complex structures with diverse functions. Inspired by the exquisite control over protein assembly in nature, scientists have been exploring ways to design and assemble protein structures with precise control over their topologies and functions. One promising approach for achieving this goal is through metal coordination, which utilizes metal-binding motifs to mediate protein-protein interactions and assemble protein complexes with controlled stoichiometry and geometry. Metal coordination provides a modular and tunable approach for protein assembly and de novo structure design, where the metal ion acts as a molecular glue that holds the protein subunits together in a specific orientation. Metal-coordinated protein assemblies have shown great potential for developing functional metalloproteinases, novel biomaterials, and integrated drug delivery systems. In this review, an overview of the recent advances in protein assemblies that benefit from metal coordination is provided, focusing on various protein arrangements in different dimensions, including protein oligomers, protein nanocages, and higher-order protein architectures. Moreover, the key metal-binding motifs and strategies used to assemble protein structures with precise control over their properties are highlighted. The potential applications of metal-mediated protein assemblies in biotechnology and biomedicine are also discussed.

4.
Front Plant Sci ; 15: 1449579, 2024.
Article in English | MEDLINE | ID: mdl-39286837

ABSTRACT

Improving crop traits requires genetic diversity, which allows breeders to select advantageous alleles of key genes. In species or loci that lack sufficient genetic diversity, synthetic directed evolution (SDE) can supplement natural variation, thus expanding the possibilities for trait engineering. In this review, we explore recent advances and applications of SDE for crop improvement, highlighting potential targets (coding sequences and cis-regulatory elements) and computational tools to enhance crop resilience and performance across diverse environments. Recent advancements in SDE approaches have streamlined the generation of variants and the selection processes; by leveraging these advanced technologies and principles, we can minimize concerns about host fitness and unintended effects, thus opening promising avenues for effectively enhancing crop traits.

5.
Talanta ; 281: 126827, 2024 Sep 07.
Article in English | MEDLINE | ID: mdl-39245003

ABSTRACT

Bisphenol analogues are a typical class of endocrine disrupting chemicals (EDCs) that interfere with binding of endogenous hormones to the androgen receptor (AR). With the expansion of industrial activities and the intensification of environmental pollution, an increasing array of bisphenol analogues is being released into the environment and food chain. This highlights the urgency to develop sensitive methods for the detection of bisphenol analogues. Here, we propose a biomimetic AR-based biosensor platform for detecting bisphenol analogues (BPF, TBBPA, and TBBPS) by binding with Aggregation-Induced Emission (AIE) probes. Following a comparison of the PROSS and ABACUS methods, biomimetic AR was designed using the ABACUS approach and subsequently expressed in vitro via the E. coli expression system. Through molecular docking and the observation of fluorescence changes upon binding with biomimetic AR, BS-46006 was selected as the AIE probe for the biosensor. The biomimetic AR-based biosensor showed sensitive detection of BPF, TBBPA, and TBBPS within a range of 0-50 mM. To further elucidate the multi-residue recognition mechanism, molecular orbitals, Electron Localization Function (ELF), and Localized Orbital Locator (LOL) were systematically calculated in this study. The lowest unoccupied and highest occupied molecular orbitals indicated energy gaps for BPF, TBBPA, and TBBPS of 0.12812, 0.19689, and 0.18711 eV, respectively. ELF and LOL offered a clearer perspective through heat maps that visually represent the electron delocalization in BPF, TBBPA, and TBBPS. The matrix effect analysis suggested that the responses of bisphenol analogues in soil matrices could be effectively mitigated through sample pretreatment. The analysis of spiked soil samples showed acceptable recoveries ranging from 91% to 105%. Additionally, the biomimetic AR-based AIE biosensor, which combines multi-residue detection with Tolerable Daily Intakes, shows great promise for the risk assessment of bisphenol analogues. This research may present a viable approach for the analysis of environmental pollutants.

6.
bioRxiv ; 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39229177

ABSTRACT

There is strong interest in accurate methods for predicting changes in protein stability resulting from amino acid mutations to the protein sequence. Recombinant proteins must often be stabilized to be used as therapeutics or reagents, and destabilizing mutations are implicated in a variety of diseases. Due to increased data availability and improved modeling techniques, recent studies have shown advancements in predicting changes in protein stability when a single point mutation is made. Less focus has been directed toward predicting changes in protein stability when there are two or more mutations, despite the significance of mutation clusters for disease pathways and protein design studies. Here, we analyze the largest available dataset of double point mutation stability and benchmark several widely used protein stability models on this and other datasets. We identify a blind spot in how predictors are typically evaluated on multiple mutations, finding that, contrary to assumptions in the field, current stability models are unable to consistently capture epistatic interactions between double mutations. We observe one notable deviation from this trend, which is that epistasis-aware models provide marginally better predictions on stabilizing double point mutations. We develop an extension of the ThermoMPNN framework for double mutant modeling as well as a novel data augmentation scheme which mitigates some of the limitations in available datasets. Collectively, our findings indicate that current protein stability models fail to capture the nuanced epistatic interactions between concurrent mutations due to several factors, including training dataset limitations and insufficient model sensitivity.
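
To make the notion of epistasis between double mutations concrete, the following is a minimal sketch, assuming the common additive null model in which the double-mutant stability change is expected to equal the sum of the two single-mutant changes. The function name and all numbers are hypothetical illustrations; they are not taken from the preprint or from the ThermoMPNN code.

```python
def epistasis(ddg_a: float, ddg_b: float, ddg_ab: float) -> float:
    """Deviation of a double-mutant ddG from the additive expectation.

    Assumption: additive null model, ddG(AB) ~ ddG(A) + ddG(B).
    All three values must share the same units (e.g. kcal/mol).
    """
    return ddg_ab - (ddg_a + ddg_b)


# Hypothetical values, for illustration only.
single_a = 1.0    # ddG of mutation A alone
single_b = 0.5    # ddG of mutation B alone
double_ab = 2.25  # measured ddG of the double mutant

# A non-zero result means the two mutations do not combine additively.
print(epistasis(single_a, single_b, double_ab))
```

A predictor that simply adds single-mutant predictions returns zero epistasis by construction, which is one way to see why an evaluation on double mutants can hide whether a model captures non-additive effects.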

7.
J Mol Biol ; : 168791, 2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39260686

ABSTRACT

The vastness of the unexplored protein fold universe remains a significant question. Through systematic de novo design of proteins with novel αβ-folds, we demonstrated that nature has explored only a tiny portion of the possible folds. Numerous possible protein folds are still untouched by nature. This review outlines this study and discusses the prospects for the design of functional proteins with novel folds.

8.
Angew Chem Int Ed Engl ; : e202411461, 2024 Sep 19.
Article in English | MEDLINE | ID: mdl-39295564

ABSTRACT

Designing sequences for specific protein backbones is a key step in creating new functional proteins. Here, we introduce GeoSeqBuilder, a deep learning framework that integrates protein sequence generation with side chain conformation prediction to produce complete all-atom structures for designed sequences. GeoSeqBuilder uses spatial geometric features from protein backbones and explicitly includes three-body interactions of neighboring residues. GeoSeqBuilder achieves a native residue type recovery rate of 51.6%, comparable to ProteinMPNN and other leading methods, while accurately predicting side chain conformations. We first used GeoSeqBuilder to design sequences for thioredoxin and a hallucinated three-helical bundle protein. All 15 tested sequences expressed as soluble monomeric proteins with high thermal stability, and the 2 high-resolution crystal structures solved closely match the designed models. The generated protein sequences exhibit low similarity (minimum 23%) to the original sequences, with significantly altered hydrophobic cores. We further redesigned the hydrophobic core of glutathione peroxidase 4, and 3 of the 5 designs showed improved enzyme activity. Although further testing is needed, the high experimental success rate in our testing demonstrates that GeoSeqBuilder is a powerful tool for designing novel sequences for predefined protein structures with atomic details. GeoSeqBuilder is available at https://github.com/PKUliujl/GeoSeqBuilder.
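
A native residue recovery rate such as the one quoted above is usually understood as the fraction of positions at which the designed residue matches the native one. The sketch below assumes that simple per-position definition over two equal-length sequences; the abstract does not describe the exact evaluation protocol, and the function name and toy sequences are hypothetical.

```python
def residue_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the native one.

    Assumption: both sequences describe the same fixed backbone, so they
    have the same length and are compared position by position.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(n == d for n, d in zip(native, designed))
    return matches / len(native)


# Toy sequences, for illustration only (they differ at two of ten positions).
native = "MKTAYIAKQR"
designed = "MKSAYLAKQR"
print(f"{residue_recovery(native, designed):.1%}")  # prints "80.0%"
```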

9.
Angew Chem Int Ed Engl ; : e202410435, 2024 Sep 27.
Article in English | MEDLINE | ID: mdl-39329252

ABSTRACT

Current methods for proteomimetic engineering rely on structure-based design. Here we describe a design strategy that allows the construction of proteomimetics against challenging targets without a priori characterization of the target surface. Our approach relies on (i) a 100-membered photoreactive foldamer library, the members of which act as local surface mimetics, and (ii) the subsequent affinity maturation of the primary hits using systems chemistry. Two surface-oriented proteinogenic side chains drove the interactions between the short helical foldamer fragments and the proteins. Diazirine-based photo-crosslinking was applied to sensitively detect and localize binding even to shallow and dynamic patches on representatively difficult targets. Photo-foldamers identified functionally relevant protein interfaces, allosteric and previously unexplored targetable regions on the surface of STAT3 and an oncogenic K-Ras variant. Target-templated dynamic linking of foldamer hits resulted in two orders of magnitude affinity improvement in a single step. The dimeric K-Ras ligand mimicked protein-like catalytic functions. The photo-foldamer approach thus enables the highly efficient mapping of protein-protein interaction sites and provides a viable starting point for proteomimetic ligand development without a priori structural hypotheses.

10.
Biomolecules ; 14(9), 2024 Aug 27.
Article in English | MEDLINE | ID: mdl-39334841

ABSTRACT

Therapeutic protein engineering has revolutionized medicine by enabling the development of highly specific and potent treatments for a wide range of diseases. This review examines recent advances in computational and experimental approaches for engineering improved protein therapeutics. Key areas of focus include antibody engineering, enzyme replacement therapies, and cytokine-based drugs. Computational methods like structure-based design, machine learning integration, and protein language models have dramatically enhanced our ability to predict protein properties and guide engineering efforts. Experimental techniques such as directed evolution and rational design approaches continue to evolve, with high-throughput methods accelerating the discovery process. Applications of these methods have led to breakthroughs in affinity maturation, bispecific antibodies, enzyme stability enhancement, and the development of conditionally active cytokines. Emerging approaches like intracellular protein delivery, stimulus-responsive proteins, and de novo designed therapeutic proteins offer exciting new possibilities. However, challenges remain in predicting in vivo behavior, scalable manufacturing, immunogenicity mitigation, and targeted delivery. Addressing these challenges will require continued integration of computational and experimental methods, as well as a deeper understanding of protein behavior in complex physiological environments. As the field advances, we can anticipate increasingly sophisticated and effective protein therapeutics for treating human diseases.


Subject(s)
Biological Products; Protein Engineering; Humans; Protein Engineering/methods; Biological Products/chemistry; Biological Products/therapeutic use; Animals; Drug Design; Computational Biology/methods; Antibodies, Bispecific/chemistry; Antibodies, Bispecific/therapeutic use
11.
Int J Mol Sci ; 25(18), 2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39337680

ABSTRACT

99mTc is a well-known radionuclide that is widely used and readily available for SPECT/CT (Single-Photon Emission Computed Tomography) diagnosis. However, commercial isotope carriers are not specific enough to tumours, rapidly clear from the bloodstream, and are not safe. To overcome these limitations, we suggest immunologically compatible recombinant proteins containing a combination of metal binding sites as 99mTc chelators and several different tumour-specific ligands for early detection of tumours. E1b protein containing metal-binding centres and tumour-specific ligands targeting integrin αvβ3 and nucleolin, as well as a short Cys-rich sequence, was artificially constructed. It was produced in E. coli, purified by metal-chelate chromatography, and used to obtain a complex with 99mTc. This was administered intravenously to healthy Balb/C mice at an activity dose of about 80 MBq per mouse, and the biodistribution was studied by SPECT/CT for 24 h. Free sodium 99mTc-pertechnetate at the same dose was used as a reference. The selectivity of 99mTc-E1b and the kinetics of isotope retention in tumours were then investigated in experiments in C57Bl/6 and Balb/C mice with subcutaneously transplanted lung carcinoma (LLC) or mammary adenocarcinoma (Ca755, EMT6, or 4T1). The radionuclide distribution ratio in tumour and adjacent normal tissue (T/N) steadily increased over 24 h, reaching 15.7 ± 4.2 for EMT6, 16.5 ± 3.8 for Ca755, 6.7 ± 4.2 for LLC, and 7.5 ± 3.1 for 4T1.


Subject(s)
Mice, Inbred BALB C; Recombinant Proteins; Technetium; Tomography, Emission-Computed, Single-Photon; Animals; Mice; Recombinant Proteins/administration & dosage; Tomography, Emission-Computed, Single-Photon/methods; Technetium/chemistry; Female; Tissue Distribution; Radiopharmaceuticals/chemistry; Mice, Inbred C57BL; Cell Line, Tumor; Lung Neoplasms/diagnostic imaging; Lung Neoplasms/metabolism; Single Photon Emission Computed Tomography Computed Tomography/methods; Neoplasm Transplantation; Integrin alphaVbeta3/metabolism
12.
FEBS Lett ; 2024 Aug 06.
Article in English | MEDLINE | ID: mdl-39107909

ABSTRACT

The dynamic evolution of SARS-CoV-2 variants necessitates ongoing advancements in therapeutic strategies. Despite the promise of monoclonal antibody (mAb) therapies like bebtelovimab, concerns persist regarding resistance mutations, particularly single-to-multipoint mutations in the receptor-binding domain (RBD). Our study addresses this by employing interface-guided computational protein design to predict potential bebtelovimab-resistance mutations. Through extensive physicochemical analysis, mutational preferences, precision-recall metrics, protein-protein docking, and energetic analyses, combined with all-atom and coarse-grained molecular dynamics (MD) simulations, we elucidated the structural-dynamics-binding features of the bebtelovimab-RBD complexes. Identification of susceptible RBD residues under positive selection pressure, coupled with validation against bebtelovimab-escape mutations, clinically reported resistance mutations, and viral genomic sequences, enhances the translational significance of our findings and contributes to a better understanding of the resistance mechanisms of SARS-CoV-2.

13.
Structure ; 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-39173620

ABSTRACT

With advanced computational methods, it is now feasible to modify or design proteins for specific functions, a process with significant implications for disease treatment and other medical applications. Protein structures and functions are intrinsically linked to their backbones, making the design of these backbones a pivotal aspect of protein engineering. In this study, we focus on the task of unconditionally generating protein backbones. By means of codebook quantization and compression dictionaries, we convert protein backbone structures into a distinctive coded language and propose a GPT-based protein backbone generation model, PB-GPT. To validate the generalization performance of the model, we trained and evaluated the model on both public datasets and small protein datasets. The results demonstrate that our model has the capability to unconditionally generate elaborate, highly realistic protein backbones with structural patterns resembling those of natural proteins, thus showcasing the significant potential of large language models in protein structure design.

14.
Protein Sci ; 33(9): e5159, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39180469

ABSTRACT

Beta turns, in which the protein backbone abruptly changes direction over four amino acid residues, are the most common type of protein secondary structure after alpha helices and beta sheets and play key structural and functional roles. Previous work has produced classification systems for turn geometry at multiple levels of precision, but these operate in backbone dihedral-angle (Ramachandran) space, and the absence of a local Euclidean-space coordinate system and structural alignment for turns, or of any systematic Euclidean-space characterization of turn backbone shape, presents challenges for the visualization, comparison and analysis of the wide range of turn conformations and the design of turns and the structures that incorporate them. This work derives a turn-local coordinate system that implicitly aligns turns, together with a set of geometric descriptors that characterize the bulk backbone (BB) shapes of turns and describe modes of structural variation not explicitly captured by existing systems. These modes are shown to be meaningful by the demonstration of clear relationships between descriptor values and the electrostatic energy of the beta-turn H-bond, the overrepresentations of key side-chain motifs, and the structural contexts of turns. Geometric turn descriptors complement Ramachandran-space classifications, and they can be used to select turn structures for compatibility with particular side-chain interactions or contexts. Potential applications include protein design and other tasks in which an enhanced Euclidean-space characterization of turns may improve understanding or performance. The web-based tools ExploreTurns, MapTurns, and ProfileTurn, available at www.betaturn.com, incorporate turn-local coordinates and turn descriptors and demonstrate their utility.


Subject(s)
Models, Molecular; Proteins; Proteins/chemistry; Hydrogen Bonding; Databases, Protein; Protein Structure, Secondary; Static Electricity; Protein Conformation, beta-Strand
15.
Protein Sci ; 33(9): e5148, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39180484

ABSTRACT

In protein design, the ultimate test of success is that the designs function as desired. Here, we discuss the utility of cell-free protein synthesis (CFPS) as a rapid, convenient and versatile method to screen for activity. We champion the use of CFPS in screening potential designs. Compared to in vivo protein screening, a wider range of different activities can be evaluated using CFPS, and the scale on which it can easily be used (screening tens to hundreds of designed proteins) is ideally suited to current needs. Protein design using physics-based strategies tended to have a relatively low success rate compared with current machine-learning-based methods. Screening steps (such as yeast display) were often used to identify proteins that displayed the desired activity from many designs that were highly ranked computationally. We also describe how CFPS is well-suited to identify the reasons designs fail, which may include problems with transcription, translation, and solubility, in addition to not achieving the desired structure and function.


Subject(s)
Cell-Free System; Protein Biosynthesis; Proteins; Proteins/chemistry; Proteins/metabolism; Cell-Free System/metabolism; Protein Engineering/methods
16.
Angew Chem Int Ed Engl ; : e202409234, 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39168829

ABSTRACT

Cells have evolved intricate mechanisms for recognizing and responding to changes in oxygen (O2) concentrations. Here, we have reprogrammed cellular hypoxia (low O2) signaling via gas tunnel engineering of prolyl hydroxylase 2 (PHD2), a non-heme iron dependent O2 sensor. Using computational modeling and protein engineering techniques, we identify a gas tunnel and critical residues therein that limit the flow of O2 to PHD2's catalytic core. We show that systematic modification of these residues can open the constriction topology of PHD2's gas tunnel. Using kinetic stopped-flow measurements with NO as a surrogate diatomic gas, we demonstrate up to 3.5-fold enhancement in its association rate to the iron center of tunnel-engineered mutants. Our most effectively designed mutant displays 9-fold enhanced catalytic efficiency (kcat/KM = 830 ± 40 M⁻¹ s⁻¹) in hydroxylating a peptide mimic of hypoxia inducible transcription factor HIF-1α, as compared to WT PHD2 (kcat/KM = 90 ± 9 M⁻¹ s⁻¹). Furthermore, transfection of plasmids that express designed PHD2 mutants in HEK-293T mammalian cells reveals a significant reduction of HIF-1α and downstream hypoxia response transcripts under hypoxic conditions of 1% O2. Overall, these studies highlight activation of PHD2 as a new pathway to reprogram hypoxia responses and HIF signaling in cells.
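
The 9-fold figure can be checked directly from the two kcat/KM values quoted in the abstract. The sketch below does that arithmetic and, as an added assumption that the abstract does not state, propagates the reported uncertainties as if they were independent standard errors. Function names are illustrative.

```python
from math import sqrt


def fold_change(mutant: float, wild_type: float) -> float:
    """Ratio of mutant to wild-type catalytic efficiency (kcat/KM)."""
    return mutant / wild_type


def fold_change_error(mutant: float, mutant_err: float,
                      wild_type: float, wild_type_err: float) -> float:
    """Uncertainty of the ratio.

    Assumption: the two errors are independent, so relative errors
    add in quadrature (standard first-order error propagation).
    """
    relative = sqrt((mutant_err / mutant) ** 2 + (wild_type_err / wild_type) ** 2)
    return fold_change(mutant, wild_type) * relative


# kcat/KM values in M^-1 s^-1, as quoted in the abstract.
print(round(fold_change(830.0, 90.0), 1))                    # prints 9.2
print(round(fold_change_error(830.0, 40.0, 90.0, 9.0), 1))   # prints 1.0
```

Under these assumptions the ratio comes out at roughly 9.2 ± 1.0, which is consistent with the 9-fold enhancement stated in the abstract.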

17.
bioRxiv ; 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39091726

ABSTRACT

Francis Crick's global parameterization of coiled coil geometry has been widely useful for guiding design of new protein structures and functions. However, design guided by similar global parameterization of beta barrel structures has been less successful, likely due to the deviations required from ideal beta barrel geometry to maintain extensive inter-strand hydrogen bonding without introducing considerable backbone strain. Instead, beta barrels and other protein folds have been designed guided by 2D structural blueprints; while this approach has successfully generated new fluorescent proteins, transmembrane nanopores, and other structures, it requires considerable expert knowledge and provides only indirect control over the global barrel shape. Here we show that the simplicity and control over shape and structure provided by global parametric representations can be generalized beyond coiled coils by taking advantage of the rich sequence-structure relationships implicit in RoseTTAFold based inpainting and diffusion design methods. Starting from parametrically generated idealized barrel backbones, both RFjoint inpainting and RFdiffusion readily incorporate the backbone irregularities necessary for proper folding with minimal deviation from the idealized barrel geometries. We show that for beta barrels across a broad range of global beta sheet parameterizations, these methods achieve high in silico and experimental success rates, with atomic accuracy confirmed by an X-ray crystal structure of a novel beta barrel topology, and de novo designed 12, 14, and 16 stranded transmembrane nanopores with conductances ranging from 200 to 500 pS. By combining the simplicity and control of parametric generation with the high success rates of deep learning based protein design methods, our approach makes the design of proteins where global shape confers function, such as beta barrel nanopores, more precisely specifiable and accessible.

18.
Cell Syst ; 15(8): 725-737.e7, 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39106868

ABSTRACT

Evolution-based deep generative models represent an exciting direction in understanding and designing proteins. An open question is whether such models can learn specialized functional constraints that control fitness in specific biological contexts. Here, we examine the ability of generative models to produce synthetic versions of Src-homology 3 (SH3) domains that mediate signaling in the Sho1 osmotic stress response pathway of yeast. We show that a variational autoencoder (VAE) model produces artificial sequences that experimentally recapitulate the function of natural SH3 domains. More generally, the model organizes all fungal SH3 domains such that locality in the model latent space (but not simply locality in sequence space) enriches the design of synthetic orthologs and exposes non-obvious amino acid constraints distributed near and far from the SH3 ligand-binding site. The ability of generative models to design ortholog-like functions in vivo opens new avenues for engineering protein function in specific cellular contexts and environments.


Subject(s)
Deep Learning; Signal Transduction; src Homology Domains; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism
19.
bioRxiv ; 2024 Jul 31.
Article in English | MEDLINE | ID: mdl-39131267

ABSTRACT

Protein Language Models (pLMs) have revolutionized the computational modeling of protein systems, building numerical embeddings that are centered around structural features. To enhance the breadth of biochemically relevant properties available in protein embeddings, we engineered the Annotation Vocabulary, a transformer-readable language of protein properties defined by structured ontologies. We trained Annotation Transformers (AT) from the ground up to recover masked protein property inputs without reference to amino acid sequences, building a new numerical feature space on protein descriptions alone. We leverage AT representations in various model architectures, for both protein representation and generation. To showcase the merit of Annotation Vocabulary integration, we performed 515 diverse downstream experiments. Using a novel loss function and only $3 in commercial compute, our premier representation model CAMP produces state-of-the-art embeddings for five out of 15 common datasets with competitive performance on the rest, highlighting the computational efficiency of latent space curation with Annotation Vocabulary. To standardize the comparison of de novo generated protein sequences, we suggest a new sequence alignment-based score that is more flexible and biologically relevant than traditional language modeling metrics. Our generative model, GSM, produces high alignment scores from annotation-only prompts with a BERT-like generation scheme. Of particular note, many GSM hallucinations return statistically significant BLAST hits, where enrichment analysis shows properties matching the annotation prompt, even when the ground truth has low sequence identity to the entire training set. Overall, the Annotation Vocabulary toolbox presents a promising pathway to replace traditional tokens with members of ontologies and knowledge graphs, enhancing transformer models in specific domains. The concise, accurate, and efficient descriptions of proteins by the Annotation Vocabulary offer a novel way to build numerical representations of proteins for protein annotation and design.

20.
Int J Mol Sci ; 25(15), 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39125888

ABSTRACT

Statistical analyses of homologous protein sequences can identify amino acid residue positions that co-evolve to generate family members with different properties. Based on the hypothesis that the coevolution of residue positions is necessary for maintaining protein structure, coevolutionary traits revealed by statistical models provide insight into residue-residue interactions that are important for understanding protein mechanisms at the molecular level. With the rapid expansion of genome sequencing databases that facilitate statistical analyses, this sequence-based approach has been used to study a broad range of protein families. An emerging application of this approach is to design hybrid transcriptional regulators as modular genetic sensors for novel wiring between input signals and genetic elements to control outputs. Among many allosterically regulated regulator families, the members contain structurally conserved and functionally independent protein domains, including a DNA-binding module (DBM) for interacting with a specific genetic element and a ligand-binding module (LBM) for sensing an input signal. By hybridizing a DBM and an LBM from two different family members, a hybrid regulator can be created with a new combination of signal-detection and DNA-recognition properties not present in natural systems. In this review, we present recent advances in the development of hybrid regulators and their applications in cellular engineering, especially focusing on the use of statistical analyses for characterizing DBM-LBM interactions and hybrid regulator design. Based on these studies, we then discuss the current limitations and potential directions for enhancing the impact of this sequence-based design approach.


Subject(s)
Evolution, Molecular; Models, Statistical; Protein Engineering/methods; Humans; Amino Acid Sequence; Proteins/genetics; Proteins/chemistry; Proteins/metabolism
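
As an illustration of how a statistical analysis of homologous sequences can flag co-evolving residue positions, the sketch below computes the mutual information between two columns of a multiple sequence alignment. Mutual information is only one simple coevolution statistic, chosen here as an assumption; the abstract does not say which statistical models the review covers. The toy alignment and function names are hypothetical.

```python
from collections import Counter
from math import log2


def column(msa: list[str], index: int) -> str:
    """Extract one alignment column as a string."""
    return "".join(seq[index] for seq in msa)


def mutual_information(col_i: str, col_j: str) -> float:
    """Mutual information (in bits) between two alignment columns."""
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * log2(c * n / (count_i[a] * count_j[b]))
    return mi


# Toy alignment, for illustration only: positions 0 and 1 always change
# together, while position 2 varies independently of position 0.
msa = ["AKLE", "AKVE", "GRLD", "GRVD"]
print(mutual_information(column(msa, 0), column(msa, 1)))  # prints 1.0
print(mutual_information(column(msa, 0), column(msa, 2)))  # prints 0.0
```

In practice such raw scores are sensitive to alignment depth and shared ancestry, so a sketch like this should be read as a starting point rather than a usable detector of DBM-LBM interactions.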