1.
Protein Sci ; 33(6): e5001, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38723111

ABSTRACT

De novo protein design expands the protein universe by creating new sequences to accomplish tailor-made enzymes in the future. A promising topology to implement diverse enzyme functions is the ubiquitous TIM-barrel fold. Since the initial de novo design of an idealized four-fold symmetric TIM barrel, the family of de novo TIM barrels is expanding rapidly. Despite this and in contrast to natural TIM barrels, these novel proteins lack cavities and structural elements essential for the incorporation of binding sites or enzymatic functions. In this work, we diversified a de novo TIM barrel by extending multiple βα-loops using constrained hallucination. Experimentally tested designs were found to be soluble upon expression in Escherichia coli and well-behaved. Biochemical characterization and crystal structures revealed successful extensions with defined α-helical structures. These diversified de novo TIM barrels provide a framework to explore a broad spectrum of functions based on the potential of natural TIM barrels.


Subject(s)
Models, Molecular , Escherichia coli/genetics , Escherichia coli/metabolism , Crystallography, X-Ray , Protein Folding , Protein Engineering/methods , Proteins/chemistry , Proteins/metabolism
2.
BMC Genomics ; 25(1): 406, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38724906

ABSTRACT

Most proteins exert their functions by interacting with other proteins, making the identification of protein-protein interactions (PPI) crucial for understanding biological activities, pathological mechanisms, and clinical therapies. Developing effective and reliable computational methods for predicting PPI can significantly reduce the time and labor associated with traditional biological experiments. However, accurately identifying the specific categories of protein-protein interactions and improving the prediction accuracy of the computational methods remain dual challenges. To tackle these challenges, we proposed a novel graph neural network method called GNNGL-PPI for multi-category prediction of PPI based on global graphs and local subgraphs. GNNGL-PPI consisted of two main components: using Graph Isomorphism Network (GIN) to extract global graph features from the PPI network graph, and employing GIN As Kernel (GIN-AK) to extract local subgraph features from the subgraphs of protein vertices. Additionally, considering the imbalanced distribution of samples in each category within the benchmark datasets, we introduced an Asymmetric Loss (ASL) function to further enhance the predictive performance of the method. Through evaluations on six benchmark test sets formed by three different dataset partitioning algorithms (Random, BFS, DFS), GNNGL-PPI outperformed the state-of-the-art multi-category prediction methods of PPI, as measured by the comprehensive performance evaluation metric F1-measure. Furthermore, interpretability analysis confirmed the effectiveness of GNNGL-PPI as a reliable multi-category prediction method for predicting protein-protein interactions.


Subject(s)
Algorithms , Computational Biology , Neural Networks, Computer , Protein Interaction Mapping , Protein Interaction Mapping/methods , Computational Biology/methods , Protein Interaction Maps , Humans , Proteins/metabolism
3.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38725156

ABSTRACT

Protein acetylation is one of the most extensively studied post-translational modifications (PTMs) due to its significant roles across a myriad of biological processes. Although many computational tools for acetylation site identification have been developed, there is a lack of benchmark datasets and bespoke predictors for non-histone acetylation site prediction. To address these problems, we have contributed to both dataset creation and predictor benchmarking in this study. First, we construct a non-histone acetylation site benchmark dataset, namely NHAC, which includes 11 subsets according to the sequence length ranging from 11 to 61 amino acids. In total, there are 886 positive samples and 4707 negative samples for each sequence length. Second, we propose TransPTM, a transformer-based neural network model for non-histone acetylation site prediction. During the data representation phase, per-residue contextualized embeddings are extracted using ProtT5 (an existing pre-trained protein language model). This is followed by the implementation of a graph neural network framework, which consists of three TransformerConv layers for feature extraction and a multilayer perceptron module for classification. The benchmark results show that TransPTM achieves competitive performance for non-histone acetylation site prediction compared with three state-of-the-art tools. It improves our comprehension of the PTM mechanism and provides a theoretical basis for developing drug targets for diseases. Moreover, the created PTM dataset fills the gap in non-histone acetylation site datasets and is beneficial to the related communities. The related source code and data utilized by TransPTM are accessible at https://www.github.com/TransPTM/TransPTM.


Subject(s)
Neural Networks, Computer , Protein Processing, Post-Translational , Acetylation , Computational Biology/methods , Databases, Protein , Software , Algorithms , Humans , Proteins/chemistry , Proteins/metabolism
4.
AAPS J ; 26(3): 60, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730115

ABSTRACT

Subcutaneous (SC) administration of therapeutic proteins is perceived to pose higher risk of immunogenicity when compared with intravenous (IV) route of administration (RoA). However, systematic evaluations of clinical data to support this claim are lacking. This meta-analysis was conducted to compare the immunogenicity of the same therapeutic protein by IV and SC RoA. Anti-drug antibody (ADA) data and controlling variables for 7 therapeutic proteins administered by both IV and SC routes across 48 treatment groups were analyzed. RoA was the primary independent variable of interest while therapeutic protein, patient population, adjusted dose, and number of ADA samples were controlling variables. Analysis of variance was used to compare the ADA incidence between IV and SC RoA, while accounting for controlling variables and potential interactions. Subsequently, 10 additional therapeutic proteins with ADA data published for both IV and SC administration were added to the above 7 therapeutic proteins and were evaluated for ADA incidence. RoA had no statistically significant effect on ADA incidence for the initial dataset of 7 therapeutic proteins (p = 0.55). The only variable with a significant effect on ADA incidence was the therapeutic protein. None of the other controlling variables, including their interactions with RoA, was significant. When all data from the 17 therapeutic proteins were pooled, there was no statistically significant effect of RoA on ADA incidence (p = 0.81). In conclusion, there is no significant difference in ADA incidence between the IV and SC RoA, based on analysis of clinical ADA data from 17 therapeutic proteins.


Subject(s)
Administration, Intravenous , Humans , Injections, Subcutaneous , Antibodies/administration & dosage , Antibodies/immunology , Proteins/administration & dosage , Proteins/immunology
5.
Clin Respir J ; 18(5): e13774, 2024 May.
Article in English | MEDLINE | ID: mdl-38742362

ABSTRACT

OBJECTIVE: This study aimed to explore the application value of human epididymis protein 4 (HE4) in diagnosing and monitoring the prognosis of lung cancer. METHODS: First, TCGA (The Cancer Genome Atlas) databases were used to analyze whey-acidic-protein 4-disulfide bond core domain 2 (WFDC2) gene expression levels in lung cancer tissues. Then, a total of 160 individuals were enrolled, categorized into three groups: the lung cancer group (n = 80), the benign lesions group (n = 40), and the healthy controls group (n = 40). Serum HE4 levels and other biomarkers were quantified using an electro-chemiluminescent immunoassay. Additionally, the expression of HE4 in tissues was analyzed through immunohistochemistry (IHC). In vitro cultures of human airway epithelial (human bronchial epithelial [HBE]) cells and various lung cancer cell lines (SPC/PC9/A549/H520) were utilized to detect HE4 levels via western blot (WB). RESULTS: Analysis of the TCGA and UALCAN (The University of Alabama at Birmingham Cancer Data Analysis Portal) databases showed that WFDC2 gene expression levels were upregulated in lung cancer tissues (p < 0.01). Compared with the control group and the benign group, HE4 was significantly higher in the serum of patients with lung cancer (p < 0.001). Receiver operating characteristic (ROC) analysis confirmed that HE4 had better diagnostic efficacy than classical markers in the differential diagnosis of lung cancer and benign lesions and had the highest diagnostic value in lung adenocarcinoma (area under the ROC curve [AUC] = 0.826). HE4 increased in early lung cancer and positively correlated with poor prognosis (p < 0.001). Moreover, the results of WB and IHC revealed that the expression of HE4 was increased in lung cancer cells (SPC/A549/H520) and lung cancer tissues but decreased in PC9 cells with a lack of exon EGFR19 (p < 0.05). CONCLUSION: Serum HE4 emerges as a promising novel biomarker for the diagnosis and prognosis assessment of lung cancer.


Subject(s)
Biomarkers, Tumor , Lung Neoplasms , WAP Four-Disulfide Core Domain Protein 2 , Humans , WAP Four-Disulfide Core Domain Protein 2/metabolism , WAP Four-Disulfide Core Domain Protein 2/analysis , Lung Neoplasms/diagnosis , Lung Neoplasms/genetics , Lung Neoplasms/metabolism , Lung Neoplasms/pathology , Biomarkers, Tumor/metabolism , Biomarkers, Tumor/blood , Biomarkers, Tumor/genetics , Male , Prognosis , Female , Middle Aged , Proteins/metabolism , Proteins/genetics , Aged , Gene Expression Regulation, Neoplastic , Cell Line, Tumor , Immunohistochemistry
6.
Sci Rep ; 14(1): 10475, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38714683

ABSTRACT

To ensure that an external force can break the interaction between a protein and a ligand, steered molecular dynamics simulation requires a harmonic restraint potential applied to the protein backbone. A usual practice is that all or a certain number of the protein's heavy atoms or Cα atoms are fixed, being restrained by a small force. The present study reveals that fixing either all heavy atoms or all Cα atoms is not a good approach, while fixing too few atoms sometimes cannot prevent the protein from rotating under the influence of the bulk water layer, so that the pulled molecule may smack into the wall of the active site. We found that restraining the Cα atoms under certain conditions is more relevant. Thus, we propose an alternative solution in which only the Cα atoms of the protein at a distance larger than 1.2 nm from the ligand are restrained. A more flexible, but not too flexible, protein is expected to lead to a more natural release of the ligand.
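
The selection rule described in this abstract (restrain only Cα atoms farther than 1.2 nm from the ligand) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name select_restrained_ca is hypothetical, coordinates are assumed to be in nm, and the abstract does not say how the protein-ligand distance is measured, so the minimum distance to any ligand atom is used here as an assumption.

```python
import math


def select_restrained_ca(ca_coords, ligand_coords, cutoff_nm=1.2):
    """Return indices of C-alpha atoms farther than cutoff_nm from the ligand.

    Assumption: "distance from the ligand" is taken as the minimum distance
    to any ligand atom; a centre-of-mass distance would be another option.
    """
    selected = []
    for index, ca in enumerate(ca_coords):
        min_dist = min(math.dist(ca, lig) for lig in ligand_coords)
        if min_dist > cutoff_nm:
            selected.append(index)
    return selected


# Toy coordinates in nm: four C-alpha atoms on a line, one ligand atom.
ca_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
ligand_atoms = [(0.0, 0.5, 0.0)]
restrained = select_restrained_ca(ca_atoms, ligand_atoms)
print(restrained)
```

In a real setup the returned indices would be passed to the restraint definition of whichever simulation package is in use.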


Subject(s)
Molecular Dynamics Simulation , Protein Binding , Proteins , Ligands , Proteins/chemistry , Proteins/metabolism , Protein Conformation
7.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38695119

ABSTRACT

Sequence similarity is of paramount importance in biology, as similar sequences tend to have similar function and share common ancestry. Scoring matrices, such as PAM or BLOSUM, play a crucial role in all bioinformatics algorithms for identifying similarities, but have the drawback that they are fixed, independent of context. We propose a new scoring method for amino acid similarity that remedies this weakness, being contextually dependent. It relies on recent advances in deep learning architectures that employ self-supervised learning in order to leverage the power of enormous amounts of unlabelled data to generate contextual embeddings, which are vector representations for words. These ideas have been applied to protein sequences, producing embedding vectors for protein residues. We propose the E-score between two residues as the cosine similarity between their embedding vector representations. Thorough testing on a wide variety of reference multiple sequence alignments indicates that the alignments produced using the new E-score method, especially ProtT5-score, are significantly better than those obtained using BLOSUM matrices. The new method proposes to change the way alignments are computed, with far-reaching implications in all areas of textual data that use sequence similarity. The program to compute alignments based on various E-scores is available as a web server at e-score.csd.uwo.ca. The source code is freely available for download from github.com/lucian-ilie/E-score.
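
The E-score definition given in this abstract, the cosine similarity between two residue embedding vectors, can be written as a short Python sketch. The function name e_score and the toy three-dimensional vectors are illustrative assumptions; real embeddings from a model such as ProtT5 have many more dimensions and are not computed here.

```python
import math


def e_score(embedding_a, embedding_b):
    """Cosine similarity between two residue embedding vectors."""
    if len(embedding_a) != len(embedding_b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(embedding_a, embedding_b))
    norm_a = math.sqrt(sum(a * a for a in embedding_a))
    norm_b = math.sqrt(sum(b * b for b in embedding_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for per-residue embeddings (hypothetical values).
residue_1 = [0.2, -0.5, 0.1]
residue_2 = [0.4, -1.0, 0.2]   # same direction as residue_1
residue_3 = [0.5, 0.2, 0.0]    # orthogonal to residue_1
print(e_score(residue_1, residue_2), e_score(residue_1, residue_3))
```

Because the score depends on the embeddings rather than on residue identity alone, the same pair of amino acids can score differently in different sequence contexts, which is the contrast with a fixed matrix that the abstract draws.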


Subject(s)
Algorithms , Computational Biology , Sequence Alignment , Sequence Alignment/methods , Computational Biology/methods , Software , Sequence Analysis, Protein/methods , Amino Acid Sequence , Proteins/chemistry , Proteins/genetics , Deep Learning , Databases, Protein
8.
Acta Crystallogr D Struct Biol ; 80(Pt 5): 314-327, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38700059

ABSTRACT

Radiation damage remains one of the major impediments to accurate structure solution in macromolecular crystallography. The artefacts of radiation damage can manifest as structural changes that result in incorrect biological interpretations being drawn from a model, they can reduce the resolution to which data can be collected and they can even prevent structure solution entirely. In this article, we discuss how to identify and mitigate against the effects of radiation damage at each stage in the macromolecular crystal structure-solution pipeline.


Subject(s)
Macromolecular Substances , Crystallography, X-Ray/methods , Macromolecular Substances/chemistry , Models, Molecular , Proteins/chemistry
9.
PLoS One ; 19(5): e0299287, 2024.
Article in English | MEDLINE | ID: mdl-38701058

ABSTRACT

Matrix-assisted laser desorption/ionization time-of-flight-time-of-flight (MALDI-TOF-TOF) tandem mass spectrometry (MS/MS) is a rapid technique for identifying intact proteins from unfractionated mixtures by top-down proteomic analysis. MS/MS allows isolation of specific intact protein ions prior to fragmentation, allowing fragment ion attribution to a specific precursor ion. However, the fragmentation efficiency of mature, intact protein ions by MS/MS post-source decay (PSD) varies widely, and the biochemical and structural factors of the protein that contribute to it are poorly understood. With the advent of protein structure prediction algorithms such as Alphafold2, we have wider access to protein structures for which no crystal structure exists. In this work, we use a statistical approach to explore the properties of bacterial proteins that can affect their gas phase dissociation via PSD. We extract various protein properties from Alphafold2 predictions and analyze their effect on fragmentation efficiency. Our results show that the fragmentation efficiencies from cleavage of the polypeptide backbone on the C-terminal side of glutamic acid (E) and asparagine (N) residues were nearly equal. In addition, we found that the rearrangement and cleavage on the C-terminal side of aspartic acid (D) residues that result from the aspartic acid effect (AAE) were higher than for E- and N-residues. From residue interaction network analysis, we identified several local centrality measures and discussed their implications regarding the AAE. We also confirmed the selective cleavage of the backbone at D-proline bonds in proteins and further extend it to N-proline bonds. Finally, we note an enhancement of the AAE mechanism when the residue on the C-terminal side of D-, E- and N-residues is glycine. To the best of our knowledge, this is the first report of this phenomenon. Our study demonstrates the value of using statistical analyses of protein sequences and their predicted structures to better understand the fragmentation of the intact protein ions in the gas phase.


Subject(s)
Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Tandem Mass Spectrometry , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Tandem Mass Spectrometry/methods , Bacterial Proteins/chemistry , Proteomics/methods , Algorithms , Proteins/chemistry , Proteins/analysis
10.
Protein Sci ; 33(6): e5021, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38747394

ABSTRACT

While nickel-nitrilotriacetic acid (Ni-NTA) has greatly advanced recombinant protein purification, its limitations, including nonspecific binding and partial purification for certain proteins, highlight the necessity for additional purification such as size exclusion and ion exchange chromatography. However, these steps typically require specialized equipment such as FPLC, which is not available in many laboratories. Here, we show a novel method utilizing polyphosphate (polyP) for purifying proteins with histidine repeats via non-covalent interactions. Our study demonstrates that immobilized polyP efficiently binds to histidine-tagged proteins across a pH range of 5.5-7.5, maintaining binding efficacy even in the presence of the reducing agent DTT and the chelating agent EDTA. We purified various proteins from cell lysates and from post-Ni-NTA fractions. Our results demonstrate that polyP resin is capable of further purification post-Ni-NTA without the need for specialized equipment and without compromising protein activity. This cost-effective and convenient method offers a viable complement to Ni-NTA.


Subject(s)
Histidine , Polyphosphates , Histidine/chemistry , Polyphosphates/chemistry , Polyphosphates/metabolism , Nitrilotriacetic Acid/chemistry , Recombinant Proteins/chemistry , Recombinant Proteins/isolation & purification , Recombinant Proteins/metabolism , Recombinant Proteins/genetics , Humans , Proteins/chemistry , Proteins/isolation & purification
11.
Protein Sci ; 33(6): e5022, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38747440

ABSTRACT

Differential scanning fluorimetry (DSF) is a method to determine the apparent melting temperature (Tma) of a purified protein. In DSF, the raw unfolding curves from which Tma is calculated vary widely in shape and complexity. However, the tools available for calculating Tma are only compatible with the simplest of DSF curves, hindering many otherwise straightforward applications of the technology. To overcome this limitation, we designed new mathematical models for Tma calculation that accommodate common forms of variation in DSF curves, including the number of transitions, the presence of high initial signal, and temperature-dependent signal decay. When tested against DSFbase, an open-source database of 6235 raw, real-life DSF curves, these models outperformed the existing standard approaches of sigmoid fitting and maximum of the first derivative. To make these models accessible, we created open-source software and a website, DSFworld (https://gestwickilab.shinyapps.io/dsfworld/). In addition to these improved fitting capabilities, DSFworld also includes features that overcome the practical limitations of many analysis workflows, including automatic reformatting of raw data exported from common qPCR instruments, labeling of data based on experimental variables, and flexible interactive plotting. We hope that DSFworld will enable more streamlined and accurate calculation of Tma values for DSF experiments.
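
For orientation, the "maximum of the first derivative" baseline mentioned in this abstract can be sketched in a few lines of Python. This illustrates only that standard approach, not the new models in DSFworld; the function name tm_first_derivative, the finite-difference scheme and the synthetic single-transition curve are all assumptions made for the example.

```python
import math


def tm_first_derivative(temperatures, signal):
    """Estimate an apparent melting temperature as the midpoint of the
    temperature interval with the steepest increase in signal."""
    if len(temperatures) != len(signal) or len(temperatures) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    best_slope = None
    best_midpoint = None
    for i in range(len(temperatures) - 1):
        dt = temperatures[i + 1] - temperatures[i]
        slope = (signal[i + 1] - signal[i]) / dt
        if best_slope is None or slope > best_slope:
            best_slope = slope
            best_midpoint = (temperatures[i] + temperatures[i + 1]) / 2
    return best_midpoint


# Synthetic unfolding curve: a single sigmoid transition centred at 55.5 degrees.
temps = [float(t) for t in range(25, 96)]
curve = [1.0 / (1.0 + math.exp(-(t - 55.5) / 2.0)) for t in temps]
estimated_tm = tm_first_derivative(temps, curve)
print(estimated_tm)
```

A curve with several transitions, a high initial signal or signal decay at high temperature would not be handled well by this single-maximum rule, which is the kind of limitation the abstract describes.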


Subject(s)
Fluorometry , Software , Fluorometry/methods , Transition Temperature , Proteins/chemistry
12.
Proc Natl Acad Sci U S A ; 121(21): e2400260121, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38743624

ABSTRACT

We introduce ZEPPI (Z-score Evaluation of Protein-Protein Interfaces), a framework to evaluate structural models of a complex based on sequence coevolution and conservation involving residues in protein-protein interfaces. The ZEPPI score is calculated by comparing metrics for an interface to those obtained from randomly chosen residues. Since contacting residues are defined by the structural model, this obviates the need to account for indirect interactions. Further, although ZEPPI relies on species-paired multiple sequence alignments, its focus on interfacial residues allows it to leverage quite shallow alignments. ZEPPI can be implemented on a proteome-wide scale and is applied here to millions of structural models of dimeric complexes in the Escherichia coli and human interactomes found in the PrePPI database. PrePPI's scoring function is based primarily on the evaluation of protein-protein interfaces, and ZEPPI adds a new feature to this analysis through the incorporation of evolutionary information. ZEPPI performance is evaluated through applications to experimentally determined complexes and to decoys from the CASP-CAPRI experiment. As we discuss, the standard CAPRI scores used to evaluate docking models are based on model quality and not on the ability to give yes/no answers as to whether two proteins interact. ZEPPI is able to detect weak signals from PPI models that the CAPRI scores define as incorrect and, similarly, to identify potential PPIs defined as low confidence by the current PrePPI scoring function. A number of examples that illustrate how the combination of PrePPI and ZEPPI can yield functional hypotheses are provided.
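
The general idea of scoring an interface against randomly chosen residues can be sketched as a Z-score in Python. This is an illustration under stated assumptions, not the ZEPPI implementation: the per-residue scores, the use of a simple mean as the interface metric, the number of random samples and the function names z_score and interface_z_score are all hypothetical.

```python
import random
import statistics


def z_score(observed, background):
    """Standard Z-score of an observed value against a background sample."""
    mu = statistics.mean(background)
    sigma = statistics.pstdev(background)
    if sigma == 0:
        raise ValueError("background has zero variance")
    return (observed - mu) / sigma


def interface_z_score(residue_scores, interface_indices, n_samples=200, seed=0):
    """Compare the mean score of interface residues with the mean score of
    equally sized sets of randomly chosen residues (an assumed metric)."""
    rng = random.Random(seed)
    size = len(interface_indices)
    observed = statistics.mean(residue_scores[i] for i in interface_indices)
    background = [
        statistics.mean(rng.sample(residue_scores, size))
        for _ in range(n_samples)
    ]
    return z_score(observed, background)


# Toy per-residue scores: three high-scoring interface residues among 20.
scores = [0.9, 0.9, 0.9] + [0.1] * 17
print(interface_z_score(scores, [0, 1, 2]))
```

A large positive value indicates that the residues placed in contact by the structural model score higher than random residue sets of the same size.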


Subject(s)
Proteome , Proteome/metabolism , Humans , Protein Interaction Mapping/methods , Models, Molecular , Escherichia coli/metabolism , Escherichia coli/genetics , Databases, Protein , Protein Binding , Escherichia coli Proteins/metabolism , Escherichia coli Proteins/chemistry , Escherichia coli Proteins/genetics , Proteins/chemistry , Proteins/metabolism , Sequence Alignment
13.
PLoS One ; 19(5): e0302504, 2024.
Article in English | MEDLINE | ID: mdl-38743747

ABSTRACT

To enable personalized medicine, it is important yet highly challenging to accurately predict disease-causing mutations in target proteins at high throughput. Previous computational methods have been developed using evolutionary information in combination with various biochemical and structural features of protein residues to discriminate neutral vs. deleterious mutations. However, the power of these methods is often limited because they either assume known protein structures or treat residues independently without fully considering their interactions. To address the above limitations, we build upon recent progress in machine learning, network analysis, and protein language models, and develop a sequence-based variant site prediction workflow based on protein residue contact networks: (1) We employ and integrate various methods of building protein residue networks using state-of-the-art coevolution analysis tools (RaptorX, DeepMetaPSICOV, and SPOT-Contact) powered by deep learning. (2) We use machine learning algorithms (Random Forest, Gradient Boosting, and Extreme Gradient Boosting) to optimally combine 20 network centrality scores to jointly predict key residues as hot spots for disease mutations. (3) Using a dataset of 107 proteins rich in disease mutations, we rigorously evaluate the network scores individually and collectively (via machine learning). This work supports a promising strategy of combining an ensemble of network scores based on different coevolution analysis methods (and optionally predictive scores from other methods) via machine learning to predict hotspot sites of disease mutations, which will inform downstream applications of disease diagnosis and targeted drug design.


Subject(s)
Machine Learning , Polymorphism, Single Nucleotide , Humans , Algorithms , Computational Biology/methods , Mutation , Proteins/genetics , Proteins/chemistry , Evolution, Molecular
14.
Sci Data ; 11(1): 495, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38744964

ABSTRACT

Single amino acid substitutions can profoundly affect protein folding, dynamics, and function. The ability to discern between benign and pathogenic substitutions is pivotal for therapeutic interventions and research directions. Given the limitations in experimental examination of these variants, AlphaMissense has emerged as a promising predictor of the pathogenicity of missense variants. Since heterogeneous performance on different types of proteins can be expected, we assessed the efficacy of AlphaMissense across several protein groups (e.g. soluble, transmembrane, and mitochondrial proteins) and regions (e.g. intramembrane, membrane interacting, and high confidence AlphaFold segments) using ClinVar data for validation. Our comprehensive evaluation showed that AlphaMissense delivers outstanding performance, with MCC scores predominantly between 0.6 and 0.74. We observed low performance on disordered datasets and ClinVar data related to the CFTR ABC protein. However, a superior performance was shown when benchmarked against the high quality CFTR2 database. Our results with CFTR emphasize AlphaMissense's potential in pinpointing functional hot spots, with its performance likely surpassing benchmarks calculated from ClinVar and ProteinGym datasets.
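
The MCC (Matthews correlation coefficient) values reported in this abstract are computed from a binary confusion matrix. A minimal Python sketch of the standard formula follows; the function name matthews_corrcoef_from_counts and the example counts are illustrative and are not taken from the study.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    By a common convention, 0.0 is returned when the denominator is zero.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts: pathogenic variants are the positive class.
example_mcc = matthews_corrcoef_from_counts(tp=80, tn=90, fp=10, fn=20)
print(round(example_mcc, 3))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it stays informative when benign and pathogenic variants are unevenly represented.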


Subject(s)
Cystic Fibrosis Transmembrane Conductance Regulator , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Cystic Fibrosis Transmembrane Conductance Regulator/chemistry , Proteins/chemistry , Proteins/genetics , Protein Folding , Humans , Databases, Protein , Amino Acid Substitution , Mutation, Missense
15.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38739759

ABSTRACT

Proteins interact with diverse ligands to perform a large number of biological functions, such as gene expression and signal transduction. Accurate identification of these protein-ligand interactions is crucial to the understanding of molecular mechanisms and the development of new drugs. However, traditional biological experiments are time-consuming and expensive. With the development of high-throughput technologies, an increasing amount of protein data is available. In the past decades, many computational methods have been developed to predict protein-ligand interactions. Here, we review a comprehensive set of over 160 protein-ligand interaction predictors, which cover protein-protein, protein-nucleic acid, protein-peptide and protein-other ligands (nucleotide, heme, ion) interactions. We have carried out a comprehensive analysis of the above four types of predictors from several significant perspectives, including their inputs, feature profiles, models, availability, etc. The current methods primarily rely on protein sequences, especially utilizing evolutionary information. The significant improvement in predictions is attributed to deep learning methods. Additionally, sequence-based pretrained models and structure-based approaches are emerging as new trends.


Subject(s)
Computational Biology , Nucleic Acids , Proteins , Nucleic Acids/metabolism , Nucleic Acids/chemistry , Proteins/chemistry , Proteins/metabolism , Computational Biology/methods , Ligands , Protein Binding , Humans
16.
Nat Commun ; 15(1): 4029, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38740745

ABSTRACT

Protein folds and the local environments they create can be compared using a variety of differently designed measures, such as the root mean squared deviation, the global distance test, the template modeling score or the local distance difference test. Although these measures have proven to be useful for a variety of tasks, each fails to fully incorporate the valuable chemical information inherent to atoms and residues, and considers these only partially and indirectly. Here, we develop the highly flexible local composition Hellinger distance (LoCoHD) metric, which is based on the chemical composition of local residue environments. Using LoCoHD, we analyze the chemical heterogeneity of amino acid environments and identify valines as having the most conserved, and arginines the most variable, chemical environments. We use LoCoHD to investigate structural ensembles, to evaluate critical assessment of structure prediction (CASP) competitors, to compare the results with the local distance difference test (lDDT) scoring system, and to evaluate a molecular dynamics simulation. We show that LoCoHD measurements provide unique information about protein structures that is distinct from, for example, those derived using the alignment-based RMSD metric, or the similarly distance matrix-based but alignment-free lDDT metric.
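
LoCoHD is built on the Hellinger distance between composition distributions. The Python sketch below shows only that underlying distance for two count tables; how LoCoHD defines atom or residue types, weights them by distance and aggregates over an environment is not stated in the abstract and is not reproduced here. The function name hellinger_distance and the toy compositions are assumptions.

```python
import math


def hellinger_distance(counts_a, counts_b):
    """Hellinger distance between two compositions given as {type: count}.

    H(P, Q) = sqrt( 0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2 ), in [0, 1].
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each composition needs a positive total count")
    squared_sum = 0.0
    for key in set(counts_a) | set(counts_b):
        p = counts_a.get(key, 0) / total_a
        q = counts_b.get(key, 0) / total_b
        squared_sum += (math.sqrt(p) - math.sqrt(q)) ** 2
    return math.sqrt(squared_sum / 2)


# Hypothetical local environments described by counts of chemical types.
env_1 = {"hydrophobic": 6, "polar": 3, "charged": 1}
env_2 = {"hydrophobic": 2, "polar": 3, "charged": 5}
print(hellinger_distance(env_1, env_2))
```

Identical compositions give 0 and compositions with no shared types give 1, so the value reads as a bounded dissimilarity between two chemical environments.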


Subject(s)
Molecular Dynamics Simulation , Proteins , Proteins/chemistry , Amino Acids/chemistry , Protein Conformation , Protein Folding , Algorithms , Computational Biology/methods
17.
BMC Genomics ; 25(1): 466, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38741045

ABSTRACT

BACKGROUND: Protein-protein interactions (PPIs) hold significant importance in biology, with precise PPI prediction as a pivotal factor in comprehending cellular processes and facilitating drug design. However, experimental determination of PPIs is laborious, time-consuming, and often constrained by technical limitations. METHODS: We introduce a new node representation method based on initial information fusion, called FFANE, which amalgamates PPI networks and protein sequence data to enhance the precision of PPIs' prediction. A Gaussian kernel similarity matrix is initially established by leveraging protein structural resemblances. Concurrently, protein sequence similarities are gauged using the Levenshtein distance, enabling the capture of diverse protein attributes. Subsequently, to construct an initial information matrix, these two feature matrices are merged by employing weighted fusion to achieve an organic amalgamation of structural and sequence details. To gain a more profound understanding of the amalgamated features, a Stacked Autoencoder (SAE) is employed for encoding learning, thereby yielding more representative feature representations. Ultimately, classification models are trained to predict PPIs by using the well-learned fusion feature. RESULTS: When employing 5-fold cross-validation experiments on SVM, our proposed method achieved average accuracies of 94.28%, 97.69%, and 84.05% in terms of Saccharomyces cerevisiae, Homo sapiens, and Helicobacter pylori datasets, respectively. CONCLUSION: Experimental findings across various authentic datasets validate the efficacy and superiority of this fusion feature representation approach, underscoring its potential value in bioinformatics.
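
Two of the ingredients named in the METHODS, the Levenshtein distance between sequences and a weighted fusion of two similarity matrices, can be sketched in Python as follows. This is not the FFANE code: the conversion of distance into a similarity, the fusion weight alpha and the function names are assumptions, and the Gaussian kernel and stacked autoencoder steps are omitted.

```python
def levenshtein(seq_a, seq_b):
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    previous = list(range(len(seq_b) + 1))
    for i, char_a in enumerate(seq_a, start=1):
        current = [i]
        for j, char_b in enumerate(seq_b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def sequence_similarity(seq_a, seq_b):
    """Assumed normalisation: 1 - distance / length of the longer sequence."""
    longest = max(len(seq_a), len(seq_b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(seq_a, seq_b) / longest


def weighted_fusion(matrix_a, matrix_b, alpha=0.5):
    """Element-wise alpha * A + (1 - alpha) * B for equally shaped matrices."""
    return [[alpha * a + (1 - alpha) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(matrix_a, matrix_b)]


# Hypothetical 2-protein example: a structure-derived and a sequence-derived matrix.
structure_sim = [[1.0, 0.0], [0.0, 1.0]]
seq_sim_value = sequence_similarity("MKTAYIAK", "MKTAYLAK")
sequence_sim = [[1.0, seq_sim_value], [seq_sim_value, 1.0]]
print(weighted_fusion(structure_sim, sequence_sim, alpha=0.5))
```

The fused matrix would then serve as the initial feature matrix passed to a downstream encoder and classifier, as the abstract outlines.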


Subject(s)
Computational Biology , Protein Interaction Mapping , Protein Interaction Mapping/methods , Computational Biology/methods , Algorithms , Helicobacter pylori/metabolism , Helicobacter pylori/genetics , Support Vector Machine , Proteins/metabolism , Proteins/chemistry , Humans , Protein Interaction Maps , Databases, Protein
18.
Mikrochim Acta ; 191(6): 307, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38713296

ABSTRACT

An assay that integrates histidine-rich peptides (HisRPs) with high-affinity aptamers was developed enabling the specific and sensitive determination of the target lysozyme. The enzyme-like activity of HisRP is inhibited by its interaction with a target recognized by an aptamer. In the presence of the target, lysozyme molecules progressively assemble on the surface of HisRP in a concentration-dependent manner, resulting in the gradual suppression of enzyme-like activity. This inhibition of HisRP's enzyme-like activity can be visually observed through color changes in the reaction product or quantified using UV-visible absorption spectroscopy. Under optimal conditions, the proposed colorimetric assay for lysozyme had a detection limit as low as 1 nM and exhibited excellent selectivity against other nonspecific interferents. Furthermore, subsequent research validated the practical applicability of the developed colorimetric approach to saliva samples, indicating that the assay holds significant potential for the detection of lysozymes in samples derived from humans.


Subject(s)
Colorimetry , Muramidase , Saliva , Muramidase/analysis , Muramidase/chemistry , Muramidase/metabolism , Colorimetry/methods , Humans , Saliva/chemistry , Saliva/enzymology , Limit of Detection , Peptides/chemistry , Aptamers, Nucleotide/chemistry , Proteins/analysis , Biosensing Techniques/methods , Histidine/analysis , Histidine/chemistry
19.
Curr Protoc ; 4(5): e1047, 2024 May.
Article in English | MEDLINE | ID: mdl-38720559

ABSTRACT

Recent advancements in protein structure determination and especially in protein structure prediction techniques have led to the availability of vast amounts of macromolecular structures. However, the accessibility and integration of these structures into scientific workflows are hindered by the lack of standardization among publicly available data resources. To address this issue, we introduced the 3D-Beacons Network, a unified platform that aims to establish a standardized framework for accessing and displaying protein structure data. In this article, we highlight the importance of standardized approaches for accessing protein structure data and showcase the capabilities of 3D-Beacons. We describe four protocols for finding and accessing macromolecular structures from various specialist data resources via 3D-Beacons. First, we describe three scenarios for programmatically accessing and retrieving data using the 3D-Beacons API. Next, we show how to perform sequence-based searches to find structures from model providers. Then, we demonstrate how to search for structures and fetch them directly into a workflow using JalView. Finally, we outline the process of facilitating access to data from providers interested in contributing their structures to the 3D-Beacons Network. © 2024 The Authors. Current Protocols published by Wiley Periodicals LLC.
Basic Protocol 1: Programmatic access to the 3D-Beacons API
Basic Protocol 2: Sequence-based search using the 3D-Beacons API
Basic Protocol 3: Accessing macromolecules from 3D-Beacons with JalView
Basic Protocol 4: Enhancing data accessibility through 3D-Beacons


Subject(s)
Protein Conformation , Proteins , Proteins/chemistry , Databases, Protein , Software