Search | BVS IEC

1.

Addressing epistasis in the design of protein function.

Lipsh-Sokolik, Rosalie; Fleishman, Sarel J.

Proc Natl Acad Sci U S A ; 121(34): e2314999121, 2024 Aug 20.

Article in English | MEDLINE | ID: mdl-39133844

ABSTRACT

Mutations in protein active sites can dramatically improve function. The active site, however, is densely packed and extremely sensitive to mutations. Therefore, some mutations may only be tolerated in combination with others in a phenomenon known as epistasis. Epistasis reduces the likelihood of obtaining improved functional variants and dramatically slows natural and lab evolutionary processes. Research has shed light on the molecular origins of epistasis and its role in shaping evolutionary trajectories and outcomes. In addition, sequence- and AI-based strategies that infer epistatic relationships from mutational patterns in natural or experimental evolution data have been used to design functional protein variants. In recent years, combinations of such approaches and atomistic design calculations have successfully predicted highly functional combinatorial mutations in active sites. These were used to design thousands of functional active-site variants, demonstrating that, while our understanding of epistasis remains incomplete, some of the determinants that are critical for accurate design are now sufficiently understood. We conclude that the space of active-site variants that has been explored by evolution may be expanded dramatically to enhance natural activities or discover new ones. Furthermore, design opens the way to systematically exploring sequence and structure space and mutational impacts on function, deepening our understanding and control over protein activity.

Subjects

Genetic Epistasis, Mutation, Molecular Evolution, Proteins/genetics, Proteins/chemistry, Proteins/metabolism, Catalytic Domain, Protein Engineering/methods

2.

A comprehensive computational benchmark for evaluating deep learning-based protein function prediction approaches.

Wang, Wenkang; Shuai, Yunyan; Yang, Qiurong; Zhang, Fuhao; Zeng, Min; Li, Min.

Brief Bioinform ; 25(2)2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38388682

ABSTRACT

Proteins play an important role in life activities and are the basic units for performing functions. Accurately annotating functions to proteins is crucial for understanding the intricate mechanisms of life and developing effective treatments for complex diseases. Traditional biological experiments struggle to keep pace with the growing number of known proteins. With the development of high-throughput sequencing technology, a wide variety of biological data provides the possibility to accurately predict protein functions by computational methods. Consequently, many computational methods have been proposed. Due to the diversity of application scenarios, it is necessary to conduct a comprehensive evaluation of these computational methods to determine the suitability of each algorithm for specific cases. In this study, we present a comprehensive benchmark, BeProf, to process data and evaluate representative computational methods. We first collect the latest datasets and analyze the data characteristics. Then, we investigate and summarize 17 state-of-the-art computational methods. Finally, we propose a novel comprehensive evaluation metric, design eight application scenarios and evaluate the performance of existing methods on these scenarios. Based on the evaluation, we provide practical recommendations for different scenarios, enabling users to select the most suitable method for their specific needs. All of these servers can be obtained from https://csuligroup.com/BEPROF and https://github.com/CSUBioGroup/BEPROF.

Subjects

Deep Learning, Benchmarking, Proteins, Algorithms, High-Throughput Nucleotide Sequencing

3.

Partial order relation-based gene ontology embedding improves protein function prediction.

Li, Wenjing; Wang, Bin; Dai, Jin; Kou, Yan; Chen, Xiaojun; Pan, Yi; Hu, Shuangwei; Xu, Zhenjiang Zech.

Brief Bioinform ; 25(2)2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38446740

ABSTRACT

Protein annotation has long been a challenging task in computational biology. Gene Ontology (GO) has become one of the most popular frameworks to describe protein functions and their relationships. Prediction of a protein annotation with proper GO terms demands high-quality GO term representation learning, which aims to learn a low-dimensional dense vector representation with accompanying semantic meaning for each functional label, also known as embedding. However, existing GO term embedding methods, which mainly take into account ancestral co-occurrence information, have yet to capture the full topological information in the GO-directed acyclic graph (DAG). In this study, we propose a novel GO term representation learning method, PO2Vec, to utilize the partial order relationships to improve the GO term representations. Extensive evaluations show that PO2Vec achieves better outcomes than existing embedding methods in a variety of downstream biological tasks. Based on PO2Vec, we further developed a new protein function prediction method PO2GO, which demonstrates superior performance measured in multiple metrics and annotation specificity as well as few-shot prediction capability in the benchmarks. These results suggest that the high-quality representation of GO structure is critical for diverse biological tasks including computational protein annotation.

Subjects

Benchmarking, Computational Biology, Gene Ontology, Learning, Molecular Sequence Annotation

4.

DeepSS2GO: protein function prediction from secondary structure.

Song, Fu V; Su, Jiaqi; Huang, Sixing; Zhang, Neng; Li, Kaiyue; Ni, Ming; Liao, Maofu.

Brief Bioinform ; 25(3)2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38701416

ABSTRACT

Predicting protein function is crucial for understanding biological life processes, preventing diseases and developing new drug targets. In recent years, methods based on sequence, structure and biological networks for protein function annotation have been extensively researched. Although obtaining a protein's three-dimensional structure through experimental or computational methods enhances the accuracy of function prediction, the sheer volume of proteins sequenced by high-throughput technologies presents a significant challenge. To address this issue, we introduce a deep neural network model, DeepSS2GO (Secondary Structure to Gene Ontology). It is a predictor incorporating secondary structure features along with primary sequence and homology information. The algorithm combines the speed of sequence-based information with the accuracy of structure-based features while streamlining the redundant data in primary sequences and bypassing the time-consuming challenges of tertiary structure analysis. The results show that the prediction performance surpasses state-of-the-art algorithms. It has the ability to predict key functions by effectively utilizing secondary structure information, rather than broadly predicting general Gene Ontology terms. Additionally, DeepSS2GO predicts five times faster than advanced algorithms, making it highly applicable to massive sequencing data. The source code and trained models are available at https://github.com/orca233/DeepSS2GO.

Subjects

Algorithms, Computational Biology, Neural Networks (Computer), Protein Secondary Structure, Proteins, Proteins/chemistry, Proteins/metabolism, Proteins/genetics, Computational Biology/methods, Protein Databases, Gene Ontology, Protein Sequence Analysis/methods, Software

5.

A large-scale assessment of sequence database search tools for homology-based protein function prediction.

Zhang, Chengxin; Freddolino, Lydia.

Brief Bioinform ; 25(4)2024 May 23.

Article in English | MEDLINE | ID: mdl-39038936

ABSTRACT

Sequence database searches followed by homology-based function transfer form one of the oldest and most popular approaches for predicting protein functions, such as Gene Ontology (GO) terms. These searches are also a critical component in most state-of-the-art machine learning and deep learning-based protein function predictors. Although sequence search tools are the basis of homology-based protein function prediction, previous studies have scarcely explored how to select the optimal sequence search tools and configure their parameters to achieve the best function prediction. In this paper, we evaluate the effect of using different options from among popular search tools, as well as the impacts of search parameters, on protein function prediction. When predicting GO terms on a large benchmark dataset, we found that BLASTp and MMseqs2 consistently exceed the performance of other tools, including DIAMOND (one of the most popular tools for function prediction), under default search parameters. However, with the correct parameter settings, DIAMOND can perform comparably to BLASTp and MMseqs2 in function prediction. Additionally, we developed a new scoring function to derive GO predictions from homologous hits that consistently outperforms previously proposed scoring functions. These findings enable the improvement of almost all protein function prediction algorithms with a few easily implementable changes in their sequence homolog-based component. This study emphasizes the critical role of search parameter settings in homology-based function transfer and should make an important contribution to the development of future protein function prediction algorithms.

Subjects

Protein Databases, Proteins, Proteins/chemistry, Proteins/metabolism, Proteins/genetics, Computational Biology/methods, Gene Ontology, Algorithms, Protein Sequence Analysis/methods, Software, Machine Learning
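
The homology-based function transfer described in this abstract can be illustrated with a minimal sketch. It assumes a simple bit-score-weighted vote, a common baseline and not the new scoring function the authors developed. The function name, the hit tuples, and the choice of example GO identifiers are illustrative assumptions.

```python
from collections import defaultdict


def transfer_go_terms(hits):
    """Score GO terms for a query protein from its homologous hits.

    hits: list of (bitscore, set_of_go_terms) pairs, one per database hit.
    Returns a dict mapping each GO term to a score in [0, 1]: the share of
    the total bit score contributed by hits annotated with that term.
    (Assumed baseline scheme, not the scoring function from the paper.)
    """
    total = sum(bitscore for bitscore, _ in hits)
    if total <= 0:
        return {}
    scores = defaultdict(float)
    for bitscore, terms in hits:
        for term in terms:
            scores[term] += bitscore
    return {term: score / total for term, score in scores.items()}


# Hypothetical search results for one query protein.
example_hits = [
    (200.0, {"GO:0003824", "GO:0016787"}),
    (100.0, {"GO:0003824"}),
    (100.0, {"GO:0005515"}),
]
predictions = transfer_go_terms(example_hits)
```

Terms supported by several high-scoring hits receive scores near 1, so a threshold on the score trades precision against recall. The set of hits, and therefore the result, depends on the search tool and its parameters, which is the effect the study evaluates.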

6.

A comprehensive review and comparison of existing computational methods for protein function prediction.

Lin, Baohui; Luo, Xiaoling; Liu, Yumeng; Jin, Xiaopeng.

Brief Bioinform ; 25(4)2024 May 23.

Article in English | MEDLINE | ID: mdl-39003530

ABSTRACT

Protein function prediction is critical for understanding the cellular physiological and biochemical processes, and it opens up new possibilities for advancements in fields such as disease research and drug discovery. During the past decades, with the exponential growth of protein sequence data, many computational methods for predicting protein function have been proposed. Therefore, a systematic review and comparison of these methods are necessary. In this study, we divide these methods into four different categories, including sequence-based methods, 3D structure-based methods, PPI network-based methods and hybrid information-based methods. Furthermore, their advantages and disadvantages are discussed, and then their performance is comprehensively evaluated and compared. Finally, we discuss the challenges and opportunities present in this field.

Subjects

Computational Biology, Proteins, Proteins/chemistry, Proteins/metabolism, Computational Biology/methods, Humans, Protein Sequence Analysis/methods, Algorithms

7.

Rheostats, toggles, and neutrals, Oh my! A new framework for understanding how amino acid changes modulate protein function.

Swint-Kruse, Liskin; Fenton, Aron W.

J Biol Chem ; 300(3): 105736, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38336297

ABSTRACT

Advances in personalized medicine and protein engineering require accurately predicting outcomes of amino acid substitutions. Many algorithms correctly predict that evolutionarily-conserved positions show "toggle" substitution phenotypes, which is defined when a few substitutions at that position retain function. In contrast, predictions often fail for substitutions at the less-studied "rheostat" positions, which are defined when different amino acid substitutions at a position sample at least half of the possible functional range. This review describes efforts to understand the impact and significance of rheostat positions: (1) They have been observed in globular soluble, integral membrane, and intrinsically disordered proteins; within single proteins, their prevalence can be up to 40%. (2) Substitutions at rheostat positions can have biological consequences and ∼10% of substitutions gain function. (3) Although both rheostat and "neutral" (defined when all substitutions exhibit wild-type function) positions are nonconserved, the two classes have different evolutionary signatures. (4) Some rheostat positions have pleiotropic effects on function, simultaneously modulating multiple parameters (e.g., altering both affinity and allosteric coupling). (5) In structural studies, substitutions at rheostat positions appear to cause only local perturbations; the overall conformations appear unchanged. (6) Measured functional changes show promising correlations with predicted changes in protein dynamics; the emergent properties of predicted, dynamically coupled amino acid networks might explain some of the complex functional outcomes observed when substituting rheostat positions. Overall, rheostat positions provide unique opportunities for using single substitutions to tune protein function. Future studies of these positions will yield important insights into the protein sequence/function relationship.

Subjects

Amino Acid Substitution, Amino Acids, Proteins, Amino Acid Sequence, Amino Acids/genetics, Amino Acids/metabolism, Conserved Sequence, Molecular Evolution, Intrinsically Disordered Proteins/chemistry, Intrinsically Disordered Proteins/genetics, Intrinsically Disordered Proteins/metabolism, Membrane Proteins/chemistry, Membrane Proteins/genetics, Membrane Proteins/metabolism, Protein Engineering, Proteins/chemistry, Proteins/genetics, Proteins/metabolism, Structure-Activity Relationship, Humans
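
The three position classes defined in this abstract (toggle, rheostat, neutral) can be made concrete with a rough sketch. It assumes functional readouts on a scale from 0 (no function) to 1 (wild type). The bin count, the tolerance, and the majority rule for toggles are illustrative assumptions, not thresholds taken from the review.

```python
def classify_position(values, wild_type=1.0, dead=0.0, n_bins=10, tol=0.1):
    """Classify one position from the measured function of its substitutions.

    values: functional readouts for the substitutions at this position,
    on a scale where `dead` is no function and `wild_type` is full function.
    All thresholds here are illustrative assumptions.
    """
    span = wild_type - dead
    # Neutral: every substitution stays within `tol` of wild-type function.
    if all(abs(v - wild_type) <= tol * span for v in values):
        return "neutral"
    # Rheostat: substitutions occupy at least half of the bins that
    # partition the functional range between dead and wild type.
    occupied = set()
    for v in values:
        frac = min(max((v - dead) / span, 0.0), 1.0)
        occupied.add(min(int(frac * n_bins), n_bins - 1))
    if len(occupied) >= n_bins / 2:
        return "rheostat"
    # Toggle: most substitutions abolish function; only a few retain it.
    n_dead = sum(1 for v in values if abs(v - dead) <= tol * span)
    if n_dead > len(values) / 2:
        return "toggle"
    return "unclassified"


neutral_call = classify_position([1.0, 0.95, 1.05])
rheostat_call = classify_position([0.05, 0.25, 0.45, 0.65, 0.85, 1.0])
toggle_call = classify_position([0.0, 0.0, 0.02, 0.0, 1.0])
```

Binning the functional range is one way to operationalize "sample at least half of the possible functional range"; a real analysis would also need to account for measurement error and for the number of substitutions tested.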

8.

Substitution Models of Protein Evolution with Selection on Enzymatic Activity.

Ferreiro, David; Khalil, Ruqaiya; Sousa, Sergio F; Arenas, Miguel.

Mol Biol Evol ; 41(2)2024 Feb 01.

Article in English | MEDLINE | ID: mdl-38314876

ABSTRACT

Substitution models of evolution are necessary for diverse evolutionary analyses including phylogenetic tree and ancestral sequence reconstructions. At the protein level, empirical substitution models are traditionally used due to their simplicity, but they ignore the variability of substitution patterns among protein sites. Next, in order to improve the realism of the modeling of protein evolution, a series of structurally constrained substitution models were presented, but still they usually ignore constraints on the protein activity. Here, we present a substitution model of protein evolution with selection on both protein structure and enzymatic activity, and that can be applied to phylogenetics. In particular, the model considers the binding affinity of the enzyme-substrate complex as well as structural constraints that include the flexibility of structural flaps, hydrogen bonds, amino acids backbone radius of gyration, and solvent-accessible surface area that are quantified through molecular dynamics simulations. We applied the model to the HIV-1 protease and evaluated it by phylogenetic likelihood in comparison with the best-fitting empirical substitution model and a structurally constrained substitution model that ignores the enzymatic activity. We found that accounting for selection on the protein activity improves the fitting of the modeled functional regions with the real observations, especially in data with high molecular identity, which recommends considering constraints on the protein activity in the development of substitution models of evolution.

Subjects

Amino Acids, Molecular Evolution, Phylogeny, Probability, Genetic Models, Amino Acid Substitution

9.

Deep learning methods for protein function prediction.

Boadu, Frimpong; Lee, Ahhyun; Cheng, Jianlin.

Proteomics ; : e2300471, 2024 Jul 12.

Article in English | MEDLINE | ID: mdl-38996351

ABSTRACT

Predicting protein function from protein sequence, structure, interaction, and other relevant information is important for generating hypotheses for biological experiments and studying biological systems, and therefore has been a major challenge in protein bioinformatics. Numerous computational methods have been developed to advance protein function prediction gradually in the last two decades. Particularly, in recent years, leveraging the revolutionary advances in artificial intelligence (AI), more and more deep learning methods have been developed to improve protein function prediction at a faster pace. Here, we provide an in-depth review of the recent developments of deep learning methods for protein function prediction. We summarize the significant advances in the field, identify several remaining major challenges to be tackled, and suggest some potential directions to explore. The data sources and evaluation metrics widely used in protein function prediction are also discussed to assist the machine learning, AI, and bioinformatics communities to develop more cutting-edge methods to advance protein function prediction.

10.

Protein function prediction through multi-view multi-label latent tensor reconstruction.

Armah-Sekum, Robert Ebo; Szedmak, Sandor; Rousu, Juho.

BMC Bioinformatics ; 25(1): 174, 2024 May 02.

Article in English | MEDLINE | ID: mdl-38698340

ABSTRACT

BACKGROUND: In the last two decades, the use of high-throughput sequencing technologies has accelerated the pace of discovery of proteins. However, due to the time and resource limitations of rigorous experimental functional characterization, the functions of a vast majority of them remain unknown. As a result, computational methods offering accurate, fast and large-scale assignment of functions to new and previously unannotated proteins are sought after. Leveraging the underlying associations between the multiplicity of features that describe proteins could reveal functional insights into the diverse roles of proteins and improve performance on the automatic function prediction task. RESULTS: We present GO-LTR, a multi-view multi-label prediction model that relies on a high-order tensor approximation of model weights combined with non-linear activation functions. The model is capable of learning high-order relationships between multiple input views representing the proteins and predicting high-dimensional multi-label output consisting of protein functional categories. We demonstrate the competitiveness of our method on various performance measures. Experiments show that GO-LTR learns polynomial combinations between different protein features, resulting in improved performance. Additional investigations establish GO-LTR's practical potential in assigning functions to proteins under diverse challenging scenarios: very low sequence similarity to previously observed sequences, rarely observed and highly specific terms in the gene ontology. IMPLEMENTATION: The code and data used for training GO-LTR are available at https://github.com/aalto-ics-kepaco/GO-LTR-prediction.

Subjects

Computational Biology, Proteins, Proteins/chemistry, Proteins/metabolism, Computational Biology/methods, Protein Databases, Algorithms

11.

KEGG orthology prediction of bacterial proteins using natural language processing.

Chen, Jing; Wu, Haoyu; Wang, Ning.

BMC Bioinformatics ; 25(1): 146, 2024 Apr 11.

Article in English | MEDLINE | ID: mdl-38600441

ABSTRACT

BACKGROUND: The advent of high-throughput technologies has led to an exponential increase in uncharacterized bacterial protein sequences, surpassing the capacity of manual curation. A large number of bacterial protein sequences remain unannotated by Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology, making it necessary to use auto annotation tools. These tools are now indispensable in the biological research landscape, bridging the gap between the vastness of unannotated sequences and meaningful biological insights. RESULTS: In this work, we propose a novel pipeline for KEGG orthology annotation of bacterial protein sequences that uses natural language processing and deep learning. To assess the effectiveness of our pipeline, we conducted evaluations using the genomes of two randomly selected species from the KEGG database. In our evaluation, we obtain competitive results on precision, recall, and F1 score, with values of 0.948, 0.947, and 0.947, respectively. CONCLUSIONS: Our experimental results suggest that our pipeline demonstrates performance comparable to traditional methods and excels in identifying distant relatives with low sequence identity. This demonstrates the potential of our pipeline to significantly improve the accuracy and comprehensiveness of KEGG orthology annotation, thereby advancing our understanding of functional relationships within biological systems.

Subjects

Bacterial Proteins, Natural Language Processing, Genome, Molecular Sequence Annotation, Amino Acid Sequence
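
The precision, recall, and F1 score reported in this abstract follow the standard definitions, sketched below from counts of true positives, false positives, and false negatives. The counts in the example are made up for illustration and are not taken from the study; how the authors aggregated results across KEGG orthology classes is not stated here.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from annotation counts.

    tp: correct annotations, fp: wrong annotations that were predicted,
    fn: true annotations that were missed.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts for illustration only.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, so a high F1 requires both few wrong annotations and few missed ones.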

12.

Transient excited states of the metamorphic protein Mad2 and their implications for function.

Jain, Shefali; Sekhar, Ashok.

Proteins ; 2024 Jan 14.

Article in English | MEDLINE | ID: mdl-38221646

ABSTRACT

The spindle checkpoint complex is a key surveillance mechanism in cell division that prevents premature separation of sister chromatids. Mad2 is an integral component of this spindle checkpoint complex that recognizes cognate substrates such as Mad1 and Cdc20 in its closed (C-Mad2) conformation by fastening a "seatbelt" around short peptide regions that bind to the substrate recognition site. Mad2 is also a metamorphic protein that adopts not only the fold found in C-Mad2, but also a structurally distinct open conformation (O-Mad2) which is incapable of binding substrates. Here, we show using chemical exchange saturation transfer (CEST) and relaxation dispersion (CPMG) NMR experiments that Mad2 transiently populates three other higher free energy states with millisecond lifetimes, two in equilibrium with C-Mad2 (E1 and E2) and one with O-Mad2 (E3). E1 is a mimic of substrate-bound C-Mad2 in which the N-terminus of one C-Mad2 molecule inserts into the seatbelt region of a second molecule of C-Mad2, providing a potential pathway for autoinhibition of C-Mad2. E2 is the "unbuckled" conformation of C-Mad2 that facilitates the triage of molecules along competing fold-switching and substrate binding pathways. The E3 conformation that coexists with O-Mad2 shows fluctuations at a hydrophobic lock that is required for stabilizing the O-Mad2 fold and we hypothesize that E3 represents an early intermediate on-pathway towards conversion to C-Mad2. Collectively, the NMR data highlight the rugged free energy landscape of Mad2 with multiple low-lying intermediates that interlink substrate-binding and fold-switching, and also emphasize the role of molecular dynamics in its function.

13.

Role of Yme1 in mitochondrial protein homeostasis: from regulation of protein import, OXPHOS function to lipid synthesis and mitochondrial dynamics.

Kan, Kwan Ting; Wilcock, Joel; Lu, Hui.

Biochem Soc Trans ; 52(3): 1539-1548, 2024 Jun 26.

Article in English | MEDLINE | ID: mdl-38864432

ABSTRACT

Mitochondria are essential organelles of eukaryotic cells and thus the mitochondrial proteome is under constant quality control and remodelling. Yme1 is a multi-functional protein and subunit of the homo-hexameric complex i-AAA proteinase. Yme1 plays vital roles in the regulation of mitochondrial protein homeostasis and mitochondrial plasticity, ranging from substrate degradation to the regulation of protein functions involved in mitochondrial protein biosynthesis, energy production, mitochondrial dynamics, and lipid biosynthesis and signalling. In this mini review, we focus on the current understanding of the roles of Yme1 in mitochondrial protein import via the TIM22 and TIM23 pathways, oxidative phosphorylation complex function, and mitochondrial lipid biosynthesis and signalling, and briefly discuss the role of Yme1 in modulating mitochondrial dynamics.

Subjects

Mitochondria, Mitochondrial Dynamics, Mitochondrial Proteins, Oxidative Phosphorylation, Protein Transport, Proteostasis, Humans, Mitochondrial Proteins/metabolism, Mitochondria/metabolism, Animals, ATPases Associated with Diverse Cellular Activities/metabolism, Lipids/biosynthesis, Lipids/chemistry, Lipid Metabolism, Homeostasis, Signal Transduction, ATP-Dependent Proteases/metabolism

14.

Associating protein sequence positions with the modulation of quantitative phenotypes.

Hernández Berthet, Ayelén S; Aptekmann, Ariel A; Tejero, Jesús; Sánchez, Ignacio E; Noguera, Martín E; Roman, Ernesto A.

Arch Biochem Biophys ; 755: 109979, 2024 May.

Article in English | MEDLINE | ID: mdl-38583654

ABSTRACT

Although protein sequences encode the information for folding and function, understanding their link is not an easy task. Unfortunately, the prediction of how specific amino acids contribute to these features is still considerably impaired. Here, we developed a simple algorithm that finds positions in a protein sequence with potential to modulate the studied quantitative phenotypes. From a few hundred protein sequences, we perform multiple sequence alignments, obtain the per-position pairwise differences for both the sequence and the observed phenotypes, and calculate the correlation between these last two quantities. We tested our methodology with four cases: archaeal adenylate kinases and the organisms' optimal growth temperatures, microbial rhodopsins and their maximal absorption wavelengths, mammalian myoglobins and their muscular concentration, and inhibition of HIV protease clinical isolates by two different molecules. We found from 3 to 10 positions tightly associated with those phenotypes, depending on the studied case. We showed that these correlations appear using individual positions but an improvement is achieved when the most correlated positions are jointly analyzed. Notably, we performed phenotype predictions using a simple linear model that links per-position divergences and differences in the observed phenotypes. Predictions are comparable to the state-of-the-art methodologies which, in most of the cases, are far more complex. All of the calculations are obtained at a very low information cost since the only input needed is a multiple sequence alignment of protein sequences with their associated quantitative phenotypes. The diversity of the explored systems makes our work a valuable tool to find sequence determinants of biological activity modulation and to predict various functional features for uncharacterized members of a protein family.
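
The procedure outlined in this abstract (pairwise sequence differences per alignment position, pairwise phenotype differences, then a correlation between the two) can be sketched as follows. This is a toy illustration under stated assumptions: the sequence difference is a binary mismatch indicator, the correlation is Pearson's, and the alignment and phenotype values are invented. The abstract does not specify these choices, so the authors' implementation may differ.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def position_phenotype_correlations(alignment, phenotypes):
    """Correlate per-position sequence differences with phenotype differences.

    alignment: equal-length aligned sequences (one string per protein).
    phenotypes: one quantitative phenotype per sequence, in the same order.
    Returns one correlation per alignment column.
    """
    pairs = list(combinations(range(len(alignment)), 2))
    pheno_diff = [abs(phenotypes[i] - phenotypes[j]) for i, j in pairs]
    correlations = []
    for pos in range(len(alignment[0])):
        # Assumed difference measure: 1 if the two residues differ, else 0.
        seq_diff = [float(alignment[i][pos] != alignment[j][pos])
                    for i, j in pairs]
        correlations.append(pearson(seq_diff, pheno_diff))
    return correlations


# Invented toy data: the second column tracks the phenotype.
toy_alignment = ["AKL", "AKV", "ARL", "ARV"]
toy_phenotypes = [10.0, 10.0, 20.0, 20.0]
correlations = position_phenotype_correlations(toy_alignment, toy_phenotypes)
```

In the toy data the conserved first column carries no signal, the second column separates the two phenotype groups, and the third varies independently of the phenotype, so only the second would be flagged as a candidate modulating position.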

15.

Reviewing the Structure-Function Paradigm in Polyglutamine Disorders: A Synergistic Perspective on Theoretical and Experimental Approaches.

Moldovean-Cioroianu, Nastasia Sanda.

Int J Mol Sci ; 25(12)2024 Jun 20.

Article in English | MEDLINE | ID: mdl-38928495

ABSTRACT

Polyglutamine (polyQ) disorders are a group of neurodegenerative diseases characterized by the excessive expansion of CAG (cytosine, adenine, guanine) repeats within host proteins. The quest to unravel the complex disease mechanisms has led researchers to adopt both theoretical and experimental methods, each offering unique insights into the underlying pathogenesis. This review emphasizes the significance of combining multiple approaches in the study of polyQ disorders, focusing on the structure-function correlations and the relevance of polyQ-related protein dynamics in neurodegeneration. By integrating computational/theoretical predictions with experimental observations, one can establish robust structure-function correlations, aiding in the identification of key molecular targets for therapeutic interventions. PolyQ proteins' dynamics, influenced by their length and interactions with other molecular partners, play a pivotal role in the polyQ-related pathogenic cascade. Moreover, conformational dynamics of polyQ proteins can trigger aggregation, leading to toxic assemblies that hinder proper cellular homeostasis. Understanding these intricacies offers new avenues for therapeutic strategies by fine-tuning polyQ kinetics, in order to prevent and control disease progression. Last but not least, this review highlights the importance of integrating multidisciplinary efforts to advance research in this field, bringing us closer to the ultimate goal of finding effective treatments against polyQ disorders.

Subjects

Neurodegenerative Diseases, Peptides, Humans, Peptides/chemistry, Peptides/metabolism, Neurodegenerative Diseases/metabolism, Neurodegenerative Diseases/genetics, Structure-Activity Relationship, Animals

16.

De Novo Antimicrobial Peptide Design with Feedback Generative Adversarial Networks.

Zervou, Michaela Areti; Doutsi, Effrosyni; Pantazis, Yannis; Tsakalides, Panagiotis.

Int J Mol Sci ; 25(10)2024 May 18.

Article in English | MEDLINE | ID: mdl-38791544

ABSTRACT

Antimicrobial peptides (AMPs) are promising candidates for new antibiotics due to their broad-spectrum activity against pathogens and reduced susceptibility to resistance development. Deep-learning techniques, such as deep generative models, offer a promising avenue to expedite the discovery and optimization of AMPs. A remarkable example is the Feedback Generative Adversarial Network (FBGAN), a deep generative model that incorporates a classifier during its training phase. Our study aims to explore the impact of enhanced classifiers on the generative capabilities of FBGAN. To this end, we introduce two alternative classifiers for the FBGAN framework, both surpassing the accuracy of the original classifier. The first classifier utilizes the k-mers technique, while the second applies transfer learning from the large protein language model Evolutionary Scale Modeling 2 (ESM2). Integrating these classifiers into FBGAN not only yields notable performance enhancements compared to the original FBGAN but also enables the proposed generative models to achieve comparable or even superior performance to established methods such as AMPGAN and HydrAMP. This achievement underscores the effectiveness of leveraging advanced classifiers within the FBGAN framework, enhancing its computational robustness for AMP de novo design and making it comparable to existing literature.

Subjects

Antimicrobial Peptides, Antimicrobial Peptides/chemistry, Antimicrobial Peptides/pharmacology, Drug Design/methods, Neural Networks (Computer), Deep Learning, Algorithms
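
The k-mers technique mentioned in this abstract represents a peptide by the overlapping substrings of length k that it contains. The sketch below shows only that featurization step; the value of k, the example sequence, and how the resulting counts would feed a classifier are assumptions, not details from the paper.

```python
from collections import Counter


def kmer_counts(sequence, k=3):
    """Count overlapping k-mers in a peptide sequence.

    A sequence shorter than k yields an empty Counter.
    """
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_frequencies(sequence, k=3):
    """Normalize k-mer counts so that the frequencies sum to 1."""
    counts = kmer_counts(sequence, k)
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


# Invented toy peptide for illustration.
toy_counts = kmer_counts("AAKAAK", k=2)
toy_freqs = kmer_frequencies("AAKAAK", k=2)
```

Frequencies rather than raw counts make peptides of different lengths comparable, which matters when a classifier is trained on sequences that vary in length.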

17.

DeepPI: Alignment-Free Analysis of Flexible Length Proteins Based on Deep Learning and Image Generator.

Ji, Mingeun; Kan, Yejin; Kim, Dongyeon; Lee, Seungmin; Yi, Gangman.

Interdiscip Sci ; 2024 Apr 03.

Article in English | MEDLINE | ID: mdl-38568406

ABSTRACT

With the rapid development of NGS technology, the number of protein sequences has increased exponentially. Computational methods have been introduced in protein functional studies because the analysis of large numbers of proteins through biological experiments is costly and time-consuming. In recent years, new approaches based on deep learning have been proposed to overcome the limitations of conventional methods. Although deep learning-based methods effectively utilize features of protein function, they are limited to sequences of fixed length and consider information from adjacent amino acids. Therefore, new protein analysis tools that extract functional features from proteins of flexible length and train models are required. We introduce DeepPI, a deep learning-based tool for analyzing proteins in large-scale databases. The proposed model, which utilizes Global Average Pooling, is applied to proteins of flexible length and leads to reduced information loss compared to existing algorithms that use fixed sizes. The image generator converts a one-dimensional sequence into a distinct two-dimensional structure, which can extract common parts of various shapes. Finally, filtering techniques automatically detect representative data from the entire database and ensure coverage of large protein databases. We demonstrate that DeepPI has been successfully applied to large databases such as the Pfam-A database. Comparative experiments on four types of image generators illustrated the impact of structure on feature extraction. The filtering performance was verified by varying the parameter values and proved to be applicable to large databases. Compared to existing methods, DeepPI achieves higher family classification accuracy for protein function inference.
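
The reason Global Average Pooling accommodates proteins of flexible length can be shown with a small sketch. It averages each feature channel over all positions, so the output size depends only on the number of channels. The one-dimensional layout and the toy feature values are simplifying assumptions; DeepPI itself operates on generated two-dimensional images, and its architecture is not reproduced here.

```python
def global_average_pool(feature_map):
    """Average each channel over all positions.

    feature_map: list of positions, each a list of C channel values.
    Returns a list of C averages, independent of the number of positions.
    """
    n_positions = len(feature_map)
    n_channels = len(feature_map[0])
    return [sum(row[c] for row in feature_map) / n_positions
            for c in range(n_channels)]


# Two toy feature maps of different lengths with the same channel count.
short_map = [[1.0, 2.0], [3.0, 4.0]]
long_map = [[0.0, 6.0], [3.0, 0.0], [3.0, 3.0]]
short_vector = global_average_pool(short_map)
long_vector = global_average_pool(long_map)
```

Both inputs produce a vector of the same size, so a fixed-size classification layer can follow without padding or truncating the sequence.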

18.

CRD: A de novo design algorithm for the prediction of cognate protein receptors for small molecule ligands.

Sankar, Santhosh; Vasudevan, Sneha; Chandra, Nagasuma.

Structure ; 32(3): 362-375.e4, 2024 Mar 07.

Article in English | MEDLINE | ID: mdl-38194962

ABSTRACT

While predicting a ligand that binds to a protein is feasible with current methods, the reverse problem, i.e., predicting a receptor for a given ligand, remains challenging. We present an approach for predicting receptors of a given ligand that uses de novo design and structural bioinformatics. We have developed the algorithm CRD, comprising multiple modules that combine fragment-based sub-site finding, a machine learning function to estimate the size of the site, a genetic algorithm that encodes knowledge of protein structures, and a physics-based fitness scoring scheme. CRD includes a pseudo-receptor design component followed by a mapping component to identify proteins that might contain these sites. CRD recovers the sites and receptors of several natural ligands. It designs similar sites for similar ligands, yet to some extent can distinguish between closely related ligands. CRD correctly predicts receptor classes for several drugs and might become a valuable tool for drug discovery.

Subjects

Algorithms, Proteins, Binding Sites, Protein Binding, Ligands, Proteins/chemistry, Drug Design

19.

A review: Mechanisms and molecular pathways of signaling lymphocytic activation molecule family 3 (SLAMF3) in immune modulation and therapeutic prospects.

Zhou, Tong; Guan, Yanjie; Sun, Lin; Liu, Wentao.

Int Immunopharmacol ; 133: 112088, 2024 May 30.

Article in English | MEDLINE | ID: mdl-38626547

ABSTRACT

The signaling lymphocytic activation molecule (SLAM) family participates in the modulation of various innate and adaptive immune responses. SLAM family (SLAMF) receptors include nine transmembrane glycoproteins, of which SLAMF3 (also known as CD229 or Ly9) has important roles in the modulation of immune responses, from the fundamental activation and suppression of immune cells to the regulation of intricate immune networks. SLAMF3 is mainly expressed in immune cells, such as T, B, and natural killer cells. It has a unique molecular structure, including four immunoglobulin-like domains in the extracellular domain and two immunoreceptor tyrosine-based signaling motifs in the intracellular domain. These unique structures have important implications for protein function. SLAMF3 is involved in the pathogenesis of various diseases, particularly autoimmune diseases and cancer. However, despite its potential clinical significance, a comprehensive overview of the current paradigm of SLAMF3 research is lacking. This review summarizes the structure, functional mechanisms, and therapeutic implications of SLAMF3. Our findings highlight the significance of SLAMF3 in both physiological and pathological contexts, and underline its dual role in autoimmunity and malignancies, including disease progression and prognosis. The review also proposes that future studies on SLAMF3 should explore its context-specific inhibitory and stimulatory effects, expand on its potential in disease mapping, investigate related signaling pathways, and explore its value as a drug target. Research in these areas can provide more precise directions for future therapeutic strategies.

Subjects

Neoplasms, Signal Transduction, Signaling Lymphocytic Activation Molecule Family, Humans, Signaling Lymphocytic Activation Molecule Family/metabolism, Signaling Lymphocytic Activation Molecule Family/genetics, Signaling Lymphocytic Activation Molecule Family/immunology, Animals, Neoplasms/immunology, Neoplasms/therapy, Neoplasms/metabolism, Autoimmune Diseases/immunology, Autoimmune Diseases/therapy

20.

Genome-wide characterization, phylogenetic and expression analysis of Galectin gene family in Golden pompano Trachinotus ovatus.

Pan, Jin-Min; Liang, Yu; Zhu, Ke-Cheng; Guo, Hua-Yang; Liu, Bao-Suo; Zhang, Nan; Xian, Lin; Zhu, Teng-Fei; Zhang, Dian-Chang.

Front Immunol ; 15: 1452609, 2024.

Article in English | MEDLINE | ID: mdl-39091499

ABSTRACT

Galectins (Gals) are a type of S-type lectin that are widespread and evolutionarily conserved among metazoans, and can act as pattern recognition receptors (PRRs) to recognize pathogen-associated molecular patterns (PAMPs). In this study, 10 Gals (ToGals) were identified in the Golden pompano (Trachinotus ovatus), and their conserved domains, motifs, and collinearity relationships were analyzed. The expression of ToGals was regulated following infection with Cryptocaryon irritans and Streptococcus agalactiae, indicating that ToGals participate in immune responses against microbial pathogens. Further analysis was conducted on one important member, Galectin-3; subcellular localization showed that the ToGal-3like protein is expressed in both the nucleus and the cytoplasm. Experiments with recombinant protein obtained through prokaryotic expression showed that rToGal-3like can agglutinate red blood cells of rabbit, carp, and golden pompano, and can also agglutinate and kill Staphylococcus aureus, Bacillus subtilis, Vibrio vulnificus, S. agalactiae, Pseudomonas aeruginosa, and Aeromonas hydrophila. This study lays the foundation for further research on the immune roles of Gals in teleosts.

Subjects

Galectins, Phylogeny, Animals, Galectins/genetics, Galectins/immunology, Galectins/metabolism, Fish Proteins/genetics, Fish Proteins/immunology, Fish Proteins/metabolism, Multigene Family, Streptococcus agalactiae/immunology, Fish Diseases/immunology, Fish Diseases/microbiology, Fishes/immunology, Fishes/genetics, Perciformes/immunology, Perciformes/genetics, Gene Expression Profiling
