Results 1 - 20 of 38
1.
Proc Natl Acad Sci U S A ; 121(34): e2314999121, 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-39133844

ABSTRACT

Mutations in protein active sites can dramatically improve function. The active site, however, is densely packed and extremely sensitive to mutations. Therefore, some mutations may only be tolerated in combination with others in a phenomenon known as epistasis. Epistasis reduces the likelihood of obtaining improved functional variants and dramatically slows natural and lab evolutionary processes. Research has shed light on the molecular origins of epistasis and its role in shaping evolutionary trajectories and outcomes. In addition, sequence- and AI-based strategies that infer epistatic relationships from mutational patterns in natural or experimental evolution data have been used to design functional protein variants. In recent years, combinations of such approaches and atomistic design calculations have successfully predicted highly functional combinatorial mutations in active sites. These were used to design thousands of functional active-site variants, demonstrating that, while our understanding of epistasis remains incomplete, some of the determinants that are critical for accurate design are now sufficiently understood. We conclude that the space of active-site variants that has been explored by evolution may be expanded dramatically to enhance natural activities or discover new ones. Furthermore, design opens the way to systematically exploring sequence and structure space and mutational impacts on function, deepening our understanding and control over protein activity.


Subject(s)
Genetic Epistasis , Mutation , Molecular Evolution , Proteins/genetics , Proteins/chemistry , Proteins/metabolism , Catalytic Domain , Protein Engineering/methods
2.
Front Immunol ; 15: 1452609, 2024.
Article in English | MEDLINE | ID: mdl-39091499

ABSTRACT

Galectins (Gals) are a type of S-type lectin that are widespread and evolutionarily conserved among metazoans, and can act as pattern recognition receptors (PRRs) to recognize pathogen-associated molecular patterns (PAMPs). In this study, 10 Gals (ToGals) were identified in the golden pompano (Trachinotus ovatus), and their conserved domains, motifs, and collinearity relationships were analyzed. The expression of ToGals was regulated following infection with Cryptocaryon irritans and Streptococcus agalactiae, indicating that ToGals participate in immune responses against microbial pathogens. Further analysis was conducted on one important member, Galectin-3; subcellular localization showed that the ToGal-3like protein is expressed in both the nucleus and the cytoplasm. Recombinant protein obtained through prokaryotic expression showed that rToGal-3like can agglutinate red blood cells of rabbit, carp, and golden pompano, and can also agglutinate and kill Staphylococcus aureus, Bacillus subtilis, Vibrio vulnificus, S. agalactiae, Pseudomonas aeruginosa, and Aeromonas hydrophila. This study lays the foundation for further research on the immune roles of Gals in teleosts.


Subject(s)
Galectins , Phylogeny , Animals , Galectins/genetics , Galectins/immunology , Galectins/metabolism , Fish Proteins/genetics , Fish Proteins/immunology , Fish Proteins/metabolism , Multigene Family , Streptococcus agalactiae/immunology , Fish Diseases/immunology , Fish Diseases/microbiology , Fishes/immunology , Fishes/genetics , Perciformes/immunology , Perciformes/genetics , Gene Expression Profiling
3.
Proteomics ; : e2300471, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-38996351

ABSTRACT

Predicting protein function from protein sequence, structure, interaction, and other relevant information is important for generating hypotheses for biological experiments and studying biological systems, and therefore has been a major challenge in protein bioinformatics. Numerous computational methods have been developed to gradually advance protein function prediction over the last two decades. In particular, in recent years, leveraging the revolutionary advances in artificial intelligence (AI), more and more deep learning methods have been developed to improve protein function prediction at a faster pace. Here, we provide an in-depth review of recent developments in deep learning methods for protein function prediction. We summarize the significant advances in the field, identify several remaining major challenges to be tackled, and suggest some potential directions to explore. The data sources and evaluation metrics widely used in protein function prediction are also discussed to help the machine learning, AI, and bioinformatics communities develop more cutting-edge methods to advance protein function prediction.

4.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-39003530

ABSTRACT

Protein function prediction is critical for understanding the cellular physiological and biochemical processes, and it opens up new possibilities for advancements in fields such as disease research and drug discovery. During the past decades, with the exponential growth of protein sequence data, many computational methods for predicting protein function have been proposed. Therefore, a systematic review and comparison of these methods are necessary. In this study, we divide these methods into four different categories, including sequence-based methods, 3D structure-based methods, PPI network-based methods and hybrid information-based methods. Furthermore, their advantages and disadvantages are discussed, and then their performance is comprehensively evaluated and compared. Finally, we discuss the challenges and opportunities present in this field.


Subject(s)
Computational Biology , Proteins , Proteins/chemistry , Proteins/metabolism , Computational Biology/methods , Humans , Protein Sequence Analysis/methods , Algorithms
5.
Heliyon ; 10(12): e32951, 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-38988537

ABSTRACT

The use of anti-inflammatory peptides (AIPs) as an alternative therapeutic approach for inflammatory diseases holds great research significance. Due to the high cost and difficulty in identifying AIPs with experimental methods, the discovery and design of peptides by computational methods before the experimental stage have become promising technology. In this study, we present BertAIP, a bidirectional encoder representation from transformers (BERT)-based method for predicting AIPs directly from their amino acid sequence without using any other information. BertAIP implements a BERT model to extract features of a protein, and uses a fully connected feed-forward network for AIP classification. It was constructed and evaluated using the AIP datasets that were reconstructed from the latest Immune Epitope Database. The experimental results showed that BertAIP achieved an accuracy of 0.751 and a Matthews correlation coefficient of 0.451, which were higher than other commonly used methods. The results of the independent test suggested that BertAIP outperformed the existing AIP predictors. In addition, to enhance the interpretability of BertAIP, we explored and visualized the amino acids that the model considered important for AIP prediction. We believe that the BertAIP proposed herein will be a useful tool for large-scale screening and identifying novel AIPs for drug development and therapeutic research related to inflammatory diseases.
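
The two figures reported in the abstract above (accuracy and Matthews correlation coefficient) are both derived from a binary confusion matrix. The following is a minimal sketch of those standard formulas, not code from BertAIP; the function name and the counts are hypothetical and chosen only for illustration.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary confusion matrix.

    tp/tn/fp/fn are counts of true positives, true negatives,
    false positives and false negatives.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts (not taken from the paper).
acc, mcc = accuracy_and_mcc(tp=40, tn=35, fp=15, fn=10)
print(f"accuracy={acc:.3f}  MCC={mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is commonly reported alongside accuracy for classifiers evaluated on imbalanced peptide datasets.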

6.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-39038936

ABSTRACT

Sequence database searches followed by homology-based function transfer form one of the oldest and most popular approaches for predicting protein functions, such as Gene Ontology (GO) terms. These searches are also a critical component in most state-of-the-art machine learning and deep learning-based protein function predictors. Although sequence search tools are the basis of homology-based protein function prediction, previous studies have scarcely explored how to select the optimal sequence search tools and configure their parameters to achieve the best function prediction. In this paper, we evaluate the effect of using different options from among popular search tools, as well as the impacts of search parameters, on protein function prediction. When predicting GO terms on a large benchmark dataset, we found that BLASTp and MMseqs2 consistently exceed the performance of other tools, including DIAMOND (one of the most popular tools for function prediction), under default search parameters. However, with the correct parameter settings, DIAMOND can perform comparably to BLASTp and MMseqs2 in function prediction. Additionally, we developed a new scoring function to derive GO predictions from homologous hits that consistently outperforms previously proposed scoring functions. These findings enable the improvement of almost all protein function prediction algorithms with a few easily implementable changes in their sequence homolog-based component. This study emphasizes the critical role of search parameter settings in homology-based function transfer and should make an important contribution to the development of future protein function prediction algorithms.


Subject(s)
Protein Databases , Proteins , Proteins/chemistry , Proteins/metabolism , Proteins/genetics , Computational Biology/methods , Gene Ontology , Algorithms , Protein Sequence Analysis/methods , Software , Machine Learning
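
To make the idea of homology-based function transfer in entry 6 concrete, here is a minimal sketch of one simple scoring scheme: each GO term receives the bit-score-weighted fraction of hits that carry it. This is an assumed baseline for illustration only and is not the new scoring function developed in the paper; the function name, the hit list, and the bit scores are hypothetical.

```python
from collections import defaultdict


def transfer_go_terms(hits):
    """Score GO terms from homologous hits.

    hits: list of (bit_score, set_of_go_terms), one item per database hit.
    Returns {go_term: score in [0, 1]}, where the score is the share of
    the total bit score contributed by hits annotated with that term.
    """
    total = sum(bits for bits, _ in hits)
    if total <= 0:
        return {}
    scores = defaultdict(float)
    for bits, terms in hits:
        for term in terms:
            scores[term] += bits / total
    return dict(scores)


# Hypothetical search result for one query protein.
hits = [
    (200.0, {"GO:0003824", "GO:0005524"}),
    (100.0, {"GO:0003824"}),
    (100.0, {"GO:0016301"}),
]
scores = transfer_go_terms(hits)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(term, round(score, 2))
```

Because the scores depend entirely on which hits the search returns and how they are scored, changes to the search tool or its parameters propagate directly into the predictions, which is the sensitivity the abstract examines.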
7.
Sheng Wu Gong Cheng Xue Bao ; 40(7): 2087-2099, 2024 Jul 25.
Article in Chinese | MEDLINE | ID: mdl-39044577

ABSTRACT

With increasing computing power and the rapid expansion of biological data, the application of bioinformatics tools has become the mainstream approach to addressing biological problems. The accurate identification of protein function by bioinformatics tools is crucial for both biomedical research and drug discovery, making it a hot topic of research. In this paper, we divide bioinformatics-based protein function prediction methods into three categories: protein sequence-based methods, protein structure-based methods, and protein interaction network-based methods. We further analyze these specific algorithms, highlighting the latest research advancements and providing valuable references for the application of bioinformatics-based protein function prediction in biomedical research and drug discovery.


Subject(s)
Algorithms , Computational Biology , Proteins , Computational Biology/methods , Proteins/genetics , Proteins/metabolism , Proteins/chemistry , Protein Conformation , Protein Interaction Maps , Protein Sequence Analysis , Amino Acid Sequence , Drug Discovery
8.
Int J Mol Sci ; 25(12), 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38928495

ABSTRACT

Polyglutamine (polyQ) disorders are a group of neurodegenerative diseases characterized by the excessive expansion of CAG (cytosine, adenine, guanine) repeats within host proteins. The quest to unravel the complex disease mechanisms has led researchers to adopt both theoretical and experimental methods, each offering unique insights into the underlying pathogenesis. This review emphasizes the significance of combining multiple approaches in the study of polyQ disorders, focusing on the structure-function correlations and the relevance of polyQ-related protein dynamics in neurodegeneration. By integrating computational/theoretical predictions with experimental observations, one can establish robust structure-function correlations, aiding in the identification of key molecular targets for therapeutic interventions. PolyQ proteins' dynamics, influenced by their length and interactions with other molecular partners, play a pivotal role in the polyQ-related pathogenic cascade. Moreover, conformational dynamics of polyQ proteins can trigger aggregation, leading to toxic assemblies that hinder proper cellular homeostasis. Understanding these intricacies offers new avenues for therapeutic strategies by fine-tuning polyQ kinetics in order to prevent and control disease progression. Last but not least, this review highlights the importance of integrating multidisciplinary efforts to advance research in this field, bringing us closer to the ultimate goal of finding effective treatments against polyQ disorders.


Subject(s)
Neurodegenerative Diseases , Peptides , Humans , Peptides/chemistry , Peptides/metabolism , Neurodegenerative Diseases/metabolism , Neurodegenerative Diseases/genetics , Structure-Activity Relationship , Animals
9.
Biochem Soc Trans ; 52(3): 1539-1548, 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38864432

ABSTRACT

Mitochondria are essential organelles of eukaryotic cells and thus the mitochondrial proteome is under constant quality control and remodelling. Yme1 is a multi-functional protein and a subunit of the homo-hexameric i-AAA proteinase complex. Yme1 plays vital roles in the regulation of mitochondrial protein homeostasis and mitochondrial plasticity, ranging from substrate degradation to the regulation of protein functions involved in mitochondrial protein biosynthesis, energy production, mitochondrial dynamics, and lipid biosynthesis and signalling. In this mini review, we focus on the current understanding of the roles of Yme1 in mitochondrial protein import via the TIM22 and TIM23 pathways, oxidative phosphorylation complex function, and mitochondrial lipid biosynthesis and signalling, and briefly discuss the role of Yme1 in modulating mitochondrial dynamics.


Subject(s)
Mitochondria , Mitochondrial Dynamics , Mitochondrial Proteins , Oxidative Phosphorylation , Protein Transport , Proteostasis , Humans , Mitochondrial Proteins/metabolism , Mitochondria/metabolism , Animals , ATPases Associated with Diverse Cellular Activities/metabolism , Lipids/biosynthesis , Lipids/chemistry , Lipid Metabolism , Homeostasis , Signal Transduction , ATP-Dependent Proteases/metabolism
10.
Sci Rep ; 14(1): 13566, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38866950

ABSTRACT

The identification of protein binding residues helps to understand their biological processes as protein function is often defined through ligand binding, such as to other proteins, small molecules, ions, or nucleotides. Methods predicting binding residues often err for intrinsically disordered proteins or regions (IDPs/IDPRs), often also referred to as molecular recognition features (MoRFs). Here, we presented a novel machine learning (ML) model trained to specifically predict binding regions in IDPRs. The proposed model, IDBindT5, leveraged embeddings from the protein language model (pLM) ProtT5 to reach a balanced accuracy of 57.2 ± 3.6% (95% confidence interval). Assessed on the same data set, this did not differ at the 95% CI from the state-of-the-art (SOTA) methods ANCHOR2 and DeepDISOBind that rely on expert-crafted features and evolutionary information from multiple sequence alignments (MSAs). Assessed on other data, methods such as SPOT-MoRF reached higher MCCs. IDBindT5's SOTA predictions are much faster than other methods, easily enabling full-proteome analyses. Our findings emphasize the potential of pLMs as a promising approach for exploring and predicting features of disordered proteins. The model and a comprehensive manual are publicly available at https://github.com/jahnl/binding_in_disorder .


Subject(s)
Intrinsically Disordered Proteins , Machine Learning , Protein Binding , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/metabolism , Binding Sites , Computational Biology/methods , Protein Databases , Humans
11.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701416

ABSTRACT

Predicting protein function is crucial for understanding biological life processes, preventing diseases and developing new drug targets. In recent years, methods based on sequence, structure and biological networks for protein function annotation have been extensively researched. Although obtaining a protein's three-dimensional structure through experimental or computational methods enhances the accuracy of function prediction, the sheer volume of proteins sequenced by high-throughput technologies presents a significant challenge. To address this issue, we introduce a deep neural network model, DeepSS2GO (Secondary Structure to Gene Ontology). It is a predictor incorporating secondary structure features along with primary sequence and homology information. The algorithm combines the speed of sequence-based information with the accuracy of structure-based features while streamlining the redundant data in primary sequences and bypassing the time-consuming challenges of tertiary structure analysis. The results show that the prediction performance surpasses state-of-the-art algorithms. It has the ability to predict key functions by effectively utilizing secondary structure information, rather than broadly predicting general Gene Ontology terms. Additionally, DeepSS2GO predicts five times faster than advanced algorithms, making it highly applicable to massive sequencing data. The source code and trained models are available at https://github.com/orca233/DeepSS2GO.


Subject(s)
Algorithms , Computational Biology , Computer Neural Networks , Protein Secondary Structure , Proteins , Proteins/chemistry , Proteins/metabolism , Proteins/genetics , Computational Biology/methods , Protein Databases , Gene Ontology , Protein Sequence Analysis/methods , Software
12.
Biophys Rev ; 16(2): 189-218, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38737201

ABSTRACT

The formation of a heterogeneous set of advanced glycation end products (AGEs) is the final outcome of a non-enzymatic process that occurs in vivo on long-life biomolecules. This process, known as glycation, starts with the reaction between reducing sugars, or their autoxidation products, and the amino groups of proteins, DNA, or lipids, thus gaining relevance under hyperglycemic conditions. Once AGEs are formed, they might affect the biological function of the biomacromolecule and, therefore, induce the development of pathophysiological events. In fact, the accumulation of AGEs has been pointed to as a triggering factor of obesity, diabetes-related diseases, coronary artery disease, neurological disorders, or chronic renal failure, among others. Given the deleterious consequences of glycation, evolution has designed endogenous mechanisms to undo glycation or to prevent it. In addition, many exogenous molecules have also emerged as powerful glycation inhibitors. This review aims to provide an overview of what glycation is. It starts by explaining the similarities and differences between glycation and glycosylation. Then, it describes in detail the molecular mechanism underlying glycation reactions, and the biomolecular targets with a higher propensity to be glycated. Next, it discusses the precise effects of glycation on protein structure, function, and aggregation, and how computational chemistry has provided insights into these aspects. Finally, it reports the most prevalent diseases induced by glycation, and the endogenous mechanisms and the current therapeutic interventions against it.

13.
Int J Mol Sci ; 25(10), 2024 May 18.
Article in English | MEDLINE | ID: mdl-38791544

ABSTRACT

Antimicrobial peptides (AMPs) are promising candidates for new antibiotics due to their broad-spectrum activity against pathogens and reduced susceptibility to resistance development. Deep-learning techniques, such as deep generative models, offer a promising avenue to expedite the discovery and optimization of AMPs. A remarkable example is the Feedback Generative Adversarial Network (FBGAN), a deep generative model that incorporates a classifier during its training phase. Our study aims to explore the impact of enhanced classifiers on the generative capabilities of FBGAN. To this end, we introduce two alternative classifiers for the FBGAN framework, both surpassing the accuracy of the original classifier. The first classifier utilizes the k-mers technique, while the second applies transfer learning from the large protein language model Evolutionary Scale Modeling 2 (ESM2). Integrating these classifiers into FBGAN not only yields notable performance enhancements compared to the original FBGAN but also enables the proposed generative models to achieve comparable or even superior performance to established methods such as AMPGAN and HydrAMP. This achievement underscores the effectiveness of leveraging advanced classifiers within the FBGAN framework, enhancing its computational robustness for AMP de novo design and making it comparable to existing literature.


Subject(s)
Antimicrobial Peptides , Antimicrobial Peptides/chemistry , Antimicrobial Peptides/pharmacology , Drug Design/methods , Computer Neural Networks , Deep Learning , Algorithms
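
The abstract of entry 13 mentions a classifier built on the k-mers technique without giving details. As a rough illustration of what k-mer featurization of a peptide usually means, the sketch below counts overlapping substrings of length k; the function name, the choice of k, and the example sequence are assumptions, and the paper's actual feature pipeline may differ.

```python
from collections import Counter


def kmer_counts(sequence, k=3):
    """Count overlapping k-mers (substrings of length k) in a sequence.

    Returns an empty Counter when the sequence is shorter than k.
    """
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Hypothetical toy peptide, not taken from any dataset.
counts = kmer_counts("KKLKKL", k=2)
print(counts)
```

The resulting counts can be mapped onto a fixed vocabulary of k-mers to give a fixed-length feature vector for a downstream classifier, regardless of peptide length.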
14.
BMC Bioinformatics ; 25(1): 174, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38698340

ABSTRACT

BACKGROUND: In the last two decades, the use of high-throughput sequencing technologies has accelerated the pace of discovery of proteins. However, due to the time and resource limitations of rigorous experimental functional characterization, the functions of a vast majority of them remain unknown. As a result, computational methods offering accurate, fast and large-scale assignment of functions to new and previously unannotated proteins are sought after. Leveraging the underlying associations between the multiplicity of features that describe proteins could reveal functional insights into the diverse roles of proteins and improve performance on the automatic function prediction task. RESULTS: We present GO-LTR, a multi-view multi-label prediction model that relies on a high-order tensor approximation of model weights combined with non-linear activation functions. The model is capable of learning high-order relationships between multiple input views representing the proteins and predicting high-dimensional multi-label output consisting of protein functional categories. We demonstrate the competitiveness of our method on various performance measures. Experiments show that GO-LTR learns polynomial combinations between different protein features, resulting in improved performance. Additional investigations establish GO-LTR's practical potential in assigning functions to proteins under diverse challenging scenarios: very low sequence similarity to previously observed sequences, and rarely observed and highly specific terms in the gene ontology. IMPLEMENTATION: The code and data used for training GO-LTR are available at https://github.com/aalto-ics-kepaco/GO-LTR-prediction .


Subject(s)
Computational Biology , Proteins , Proteins/chemistry , Proteins/metabolism , Computational Biology/methods , Protein Databases , Algorithms
15.
Interdiscip Sci ; 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38568406

ABSTRACT

With the rapid development of NGS technology, the number of protein sequences has increased exponentially. Computational methods have been introduced in protein functional studies because the analysis of large numbers of proteins through biological experiments is costly and time-consuming. In recent years, new approaches based on deep learning have been proposed to overcome the limitations of conventional methods. Although deep learning-based methods effectively utilize features of protein function, they are limited to sequences of fixed length and consider information from adjacent amino acids. Therefore, new protein analysis tools that extract functional features from proteins of flexible length and train models are required. We introduce DeepPI, a deep learning-based tool for analyzing proteins in large-scale databases. The proposed model, which utilizes Global Average Pooling, is applied to proteins of flexible length and leads to reduced information loss compared to existing algorithms that use fixed sizes. The image generator converts a one-dimensional sequence into a distinct two-dimensional structure, which can extract common parts of various shapes. Finally, filtering techniques automatically detect representative data from the entire database and ensure coverage of large protein databases. We demonstrate that DeepPI has been successfully applied to large databases such as the Pfam-A database. Comparative experiments on four types of image generators illustrated the impact of structure on feature extraction. The filtering performance was verified by varying the parameter values and proved to be applicable to large databases. Compared to existing methods, DeepPI achieves higher family classification accuracy for protein function inference.
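
The abstract above credits Global Average Pooling with letting the model accept proteins of flexible length. The sketch below shows why: averaging each feature channel over all positions yields a vector whose size depends only on the number of channels. It is a simplified, one-dimensional illustration in plain Python, not DeepPI's implementation (which, per the abstract, operates on generated two-dimensional images); the function name and the feature values are hypothetical.

```python
def global_average_pool(feature_map):
    """Average each channel over all positions.

    feature_map: list of positions, each a list of per-channel values.
    Returns a list with one mean per channel, independent of the
    number of positions.
    """
    if not feature_map:
        raise ValueError("feature_map must contain at least one position")
    n_positions = len(feature_map)
    n_channels = len(feature_map[0])
    return [
        sum(position[c] for position in feature_map) / n_positions
        for c in range(n_channels)
    ]


# Two hypothetical feature maps of different lengths, two channels each.
short_protein = [[1.0, 2.0], [3.0, 4.0]]
long_protein = [[0.0, 3.0], [3.0, 3.0], [6.0, 3.0]]

print(global_average_pool(short_protein))
print(global_average_pool(long_protein))
```

Both inputs produce a two-element vector, so a fixed-size classification layer can follow the pooling step without padding or truncating the sequence.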

16.
Curr Res Struct Biol ; 7: 100142, 2024.
Article in English | MEDLINE | ID: mdl-38655428

ABSTRACT

Binding of nucleotides and their derivatives is one of the most ancient elementary functions, dating back to the Origin of Life. We review here the works considering one of the key elements in binding of (di)nucleotide-containing ligands: phosphate binding. We start with a brief discussion of the major participants, conditions, and events in prebiotic evolution that resulted in the Origin of Life. Tracing back to the basic functions, including metal and phosphate binding and, potentially, the formation of primitive protein-protein interactions, we focus here on phosphate binding. Critically assessing works on the structural, functional, and evolutionary aspects of phosphate binding, we perform a simple computational experiment reconstructing its most ancient and generic sequence prototype. The profiles of the phosphate binding signatures have been derived in the form of position-specific scoring matrices (PSSMs), their peculiarities depending on the type of ligand have been analyzed, and evolutionary connections between them have been delineated. Then, the apparent prototype that gave rise to all relevant phosphate-binding signatures has also been reconstructed. We show that two major signatures of phosphate binding that discriminate between the binding of dinucleotide- and nucleotide-containing ligands are GxGxxG and GxxGxG, respectively. It appears that the signature archetypal for dinucleotide-containing ligands is more generic, and it can frequently bind phosphate groups in nucleotide-containing ligands as well. The reconstructed prototype's key signature GxGGxG underscores the role of glycine residues in providing the flexibility and interactions necessary for binding the phosphate groups. The prototype also contains other ancient amino acids, valine and alanine, showing versatility towards evolutionary design and functional diversification.
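
The three signatures named in the abstract above (GxGxxG, GxxGxG, and the prototype GxGGxG) can be located in a sequence with ordinary regular expressions, reading each `x` as "any residue". The following is a minimal sketch under that reading; the authors derived PSSM profiles rather than strict patterns, so this is a simplification, and the function name and example sequence are hypothetical.

```python
import re

# Signature name -> regular expression, with "x" read as any single residue.
MOTIFS = {
    "GxGxxG": "G.G..G",
    "GxxGxG": "G..G.G",
    "GxGGxG": "G.GG.G",
}


def find_motifs(sequence):
    """Return {signature: [(start_index, matched_text), ...]}.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    hits = {}
    for name, pattern in MOTIFS.items():
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start(), m.group(1)) for m in regex.finditer(sequence)]
    return hits


# Hypothetical sequence fragment, used only to exercise the function.
result = find_motifs("MKIAVIGAGSWGTALA")
print(result)
```

Note that any stretch matching the prototype GxGGxG also satisfies both GxGxxG and GxxGxG under this strict reading, since its glycines cover the fixed positions of both patterns.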

17.
BMC Bioinformatics ; 25(1): 146, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38600441

ABSTRACT

BACKGROUND: The advent of high-throughput technologies has led to an exponential increase in uncharacterized bacterial protein sequences, surpassing the capacity of manual curation. A large number of bacterial protein sequences remain unannotated by Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology, making it necessary to use auto annotation tools. These tools are now indispensable in the biological research landscape, bridging the gap between the vastness of unannotated sequences and meaningful biological insights. RESULTS: In this work, we propose a novel pipeline for KEGG orthology annotation of bacterial protein sequences that uses natural language processing and deep learning. To assess the effectiveness of our pipeline, we conducted evaluations using the genomes of two randomly selected species from the KEGG database. In our evaluation, we obtain competitive results on precision, recall, and F1 score, with values of 0.948, 0.947, and 0.947, respectively. CONCLUSIONS: Our experimental results suggest that our pipeline demonstrates performance comparable to traditional methods and excels in identifying distant relatives with low sequence identity. This demonstrates the potential of our pipeline to significantly improve the accuracy and comprehensiveness of KEGG orthology annotation, thereby advancing our understanding of functional relationships within biological systems.


Subject(s)
Bacterial Proteins , Natural Language Processing , Genome , Molecular Sequence Annotation , Amino Acid Sequence
18.
Arch Biochem Biophys ; 755: 109979, 2024 May.
Article in English | MEDLINE | ID: mdl-38583654

ABSTRACT

Although protein sequences encode the information for folding and function, understanding their link is not an easy task. Unfortunately, the prediction of how specific amino acids contribute to these features is still considerably impaired. Here, we developed a simple algorithm that finds positions in a protein sequence with the potential to modulate the studied quantitative phenotypes. From a few hundred protein sequences, we perform multiple sequence alignments, obtain the per-position pairwise differences for both the sequence and the observed phenotypes, and calculate the correlation between these last two quantities. We tested our methodology with four cases: archaeal adenylate kinases and the organisms' optimal growth temperatures, microbial rhodopsins and their maximal absorption wavelengths, mammalian myoglobins and their muscular concentration, and inhibition of HIV protease clinical isolates by two different molecules. We found from 3 to 10 positions tightly associated with those phenotypes, depending on the studied case. We showed that these correlations appear using individual positions, but an improvement is achieved when the most correlated positions are jointly analyzed. Notably, we performed phenotype predictions using a simple linear model that links per-position divergences and differences in the observed phenotypes. Predictions are comparable to the state-of-the-art methodologies which, in most cases, are far more complex. All of the calculations are obtained at a very low information cost, since the only input needed is a multiple sequence alignment of protein sequences with their associated quantitative phenotypes. The diversity of the explored systems makes our work a valuable tool to find sequence determinants of biological activity modulation and to predict various functional features for uncharacterized members of a protein family.
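
The procedure outlined in the abstract above (per-position pairwise sequence differences, pairwise phenotype differences, and their correlation) can be sketched as follows. This is an interpretation under stated assumptions, not the authors' code: a sequence difference is taken as 0/1 identity at an alignment column, a phenotype difference as the absolute difference, and the correlation as Pearson's r. The function names, the toy alignment, and the phenotype values are all hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def position_scores(alignment, phenotypes):
    """Correlate, for each alignment column, pairwise residue mismatches
    (assumed 0/1 measure) with pairwise absolute phenotype differences."""
    pairs = list(combinations(range(len(alignment)), 2))
    pheno_diffs = [abs(phenotypes[i] - phenotypes[j]) for i, j in pairs]
    scores = []
    for col in range(len(alignment[0])):
        seq_diffs = [float(alignment[i][col] != alignment[j][col]) for i, j in pairs]
        scores.append(pearson(seq_diffs, pheno_diffs))
    return scores


# Hypothetical aligned sequences and phenotype values (e.g. growth temperatures).
alignment = ["MAKL", "MAKL", "MGKL", "MGKV"]
phenotypes = [30.0, 32.0, 70.0, 72.0]
scores = position_scores(alignment, phenotypes)
best = max(range(len(scores)), key=lambda i: scores[i])
print([round(s, 3) for s in scores], "best column:", best)
```

In this toy case the second column, where the A/G substitution tracks the phenotype split, receives the highest score, while fully conserved columns score zero because they carry no pairwise variation.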

19.
Comput Biol Chem ; 110: 108064, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38677014

ABSTRACT

MOTIVATION: Elucidating protein function is a central problem in biochemistry, genetics, and molecular biology. Developing computational methods for protein function prediction is critical due to the significant gap between sequence and functional data. Recent advances in protein structure prediction, which strongly correlates with function, make it feasible to use structure to predict function. However, current structure-based methods overlook the fact that individual residues may contribute differently to the protein's function and do not take into account the correlation between protein residues and their functions. The challenge of effectively utilizing the relationship between protein residues and function-level information to predict protein function remains unsolved. RESULT: We proposed a protein function prediction method based on Soft Mask Graph Networks and Residue-Label Attention (POLAT), which could combine sequence features, predicted structure features, and function-level information to get an accurate prediction. We use soft mask graph networks to adaptively extract the residues relevant to functions. A residue-label attention mechanism is adopted to obtain the protein-level encoded features of a protein, which are then concatenated with a protein-level embedding and fed into a dense classifier to determine the probabilities of each function. POLAT achieves 0.670, 0.515, 0.578 Fmax and 0.677, 0.409, 0.507 AUPR on the PDB cdhit test set for the MFO, BPO, and CCO domains, respectively, outperforming the existing structure-based SOTA method GAT-GO (Fmax 0.633, 0.492, 0.547; AUPR 0.660, 0.381, 0.479). POLAT is also competitive in extensive experiments among sequence-based and multimodal methods and achieves the SOTA performance in three out of six metrics.


Subject(s)
Computational Biology , Proteins , Proteins/chemistry , Proteins/metabolism , Protein Conformation , Algorithms
20.
Int Immunopharmacol ; 133: 112088, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38626547

ABSTRACT

The signaling lymphocytic activation molecule (SLAM) family participates in the modulation of various innate and adaptive immune responses. SLAM family (SLAMF) receptors include nine transmembrane glycoproteins, of which SLAMF3 (also known as CD229 or Ly9) has important roles in the modulation of immune responses, from the fundamental activation and suppression of immune cells to the regulation of intricate immune networks. SLAMF3 is mainly expressed in immune cells, such as T, B, and natural killer cells. It has a unique molecular structure, including four immunoglobulin-like domains in the extracellular domain and two immunoreceptor tyrosine-based signaling motifs in the intracellular structural domains. These unique structures have important implications for protein functioning. SLAMF3 is involved in the pathogenesis of various diseases, particularly autoimmune diseases and cancer. However, despite its potential clinical significance, a comprehensive overview of the current paradigm of SLAMF3 research is lacking. This review summarizes the structure, functional mechanisms, and therapeutic implications of SLAMF3. Our findings highlight the significance of SLAMF3 in both physiological and pathological contexts, and underline its dual role in autoimmunity and malignancies, including disease progression and prognosis. The review also proposes that future studies on SLAMF3 should explore its context-specific inhibitory and stimulatory effects, expand on its potential in disease mapping, investigate related signaling pathways, and explore its value as a drug target. Research in these areas related to SLAMF3 can provide more precise directions for future therapeutic strategies.


Subject(s)
Neoplasms , Signal Transduction , Signaling Lymphocytic Activation Molecule Family , Humans , Signaling Lymphocytic Activation Molecule Family/metabolism , Signaling Lymphocytic Activation Molecule Family/genetics , Signaling Lymphocytic Activation Molecule Family/immunology , Animals , Neoplasms/immunology , Neoplasms/therapy , Neoplasms/metabolism , Autoimmune Diseases/immunology , Autoimmune Diseases/therapy
...