Results 1 - 20 of 209,959
1.
Molecules ; 29(12), 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38930976

ABSTRACT

Accurately predicting drug-target interactions is a critical yet challenging task in drug discovery. Traditionally, pocket detection and drug-target affinity prediction have been treated as separate aspects of drug-target interaction, with few methods combining these tasks within a unified deep learning system to accelerate drug development. In this study, we propose EMPDTA, an end-to-end framework that integrates protein pocket prediction and drug-target affinity prediction to provide a comprehensive understanding of drug-target interactions. The EMPDTA framework consists of three main modules: pocket online detection, multimodal representation learning for affinity prediction, and multi-task joint training. The performance and potential of the proposed framework have been validated across diverse benchmark datasets, achieving robust results in both tasks. Furthermore, the visualization results of the predicted pockets demonstrate accurate pocket detection, confirming the effectiveness of our framework.


Subject(s)
Drug Discovery , Drug Discovery/methods , Proteins/chemistry , Proteins/metabolism , Deep Learning , Protein Binding , Binding Sites , Humans , Algorithms
2.
Microb Biotechnol ; 17(6): e14505, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38932670

ABSTRACT

In recent years, the production of volatile fatty acids (VFA) through mixed culture fermentation (MCF) has been gaining attention. Most authors have focused on the fermentation of carbohydrates, while other possible substrates, such as proteins, have not been considered. Moreover, there is little information about how operational parameters affect the microbial communities involved in these processes, even though they are strongly related to reactor performance and VFA selectivity. Hence, this study aims to evaluate how microbial composition changes according to three different parameters (pH, type of protein and micronutrient addition) during anaerobic fermentation of protein-rich side streams. For this, two continuous stirred tank reactors (CSTR) were fed with two different proteins (casein and gelatine) and operated under different conditions: three pH values (5.0, 7.0 and 9.0) with macronutrient supplementation only, and two pH values (5.0 and 7.0) with micronutrient supplementation as well. Firmicutes, Proteobacteria and Bacteroidetes were the dominant phyla in the two reactors under all operational conditions, but their relative abundance varied with the parameters studied. At pH 7.0 and 9.0, the microbial composition was mainly affected by protein type, while under acidic conditions the driving force was the pH. The influence of micronutrients depended on the pH and the protein type, with a particular effect on Clostridiales and Bacteroidales populations. Overall, this study shows that the acidogenic microbial community is affected by the three parameters studied and that the changes in the microbial community can partially explain the macroscopic results, especially the process selectivity.


Subject(s)
Bacteria , Bioreactors , Fatty Acids, Volatile , Fermentation , Fatty Acids, Volatile/metabolism , Bioreactors/microbiology , Hydrogen-Ion Concentration , Bacteria/metabolism , Bacteria/genetics , Bacteria/classification , Anaerobiosis , Proteins/metabolism , Biota , Microbiota
3.
Bioinformatics ; 40(Supplement_1): i418-i427, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940145

ABSTRACT

MOTIVATION: Mutations are the crucial driving force for biological evolution as they can disrupt protein stability and protein-protein interactions which have notable impacts on protein structure, function, and expression. However, existing computational methods for protein mutation effects prediction are generally limited to single point mutations with global dependencies, and do not systematically take into account the local and global synergistic epistasis inherent in multiple point mutations. RESULTS: To this end, we propose a novel spatial and sequential message passing neural network, named DDAffinity, to predict the changes in binding affinity caused by multiple point mutations based on protein 3D structures. Specifically, instead of being on the whole protein, we perform message passing on the k-nearest neighbor residue graphs to extract pocket features of the protein 3D structures. Furthermore, to learn global topological features, a two-step additive Gaussian noising strategy during training is applied to blur out local details of protein geometry. We evaluate DDAffinity on benchmark datasets and external validation datasets. Overall, the predictive performance of DDAffinity is significantly improved compared with state-of-the-art baselines on multiple point mutations, including end-to-end and pre-training based methods. The ablation studies indicate the reasonable design of all components of DDAffinity. In addition, applications in nonredundant blind testing, predicting mutation effects of SARS-CoV-2 RBD variants, and optimizing human antibody against SARS-CoV-2 illustrate the effectiveness of DDAffinity. AVAILABILITY AND IMPLEMENTATION: DDAffinity is available at https://github.com/ak422/DDAffinity.
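
The two geometric ideas named in this abstract, a k-nearest neighbor residue graph and additive Gaussian noise on coordinates, can be illustrated with a minimal sketch. This is not DDAffinity's code: the function names, the toy coordinates, and the single noising step (the abstract describes a two-step strategy) are assumptions made for illustration only.

```python
import math
import random


def knn_residue_graph(coords, k):
    """Map each residue index to the indices of its k nearest other residues."""
    graph = {}
    for i, ci in enumerate(coords):
        # Euclidean distance from residue i to every other residue.
        others = [(math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i]
        others.sort()
        graph[i] = [j for _, j in others[:k]]
    return graph


def add_gaussian_noise(coords, sigma, rng):
    """Return a copy of the coordinates with independent N(0, sigma^2) noise added."""
    return [tuple(x + rng.gauss(0.0, sigma) for x in c) for c in coords]


# Hypothetical toy input: four residue positions (e.g. C-alpha atoms) on a line.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
graph = knn_residue_graph(coords, k=2)
noisy = add_gaussian_noise(coords, sigma=0.1, rng=random.Random(0))
```

Message passing would then run only along the edges in `graph`, which keeps the computation local to each residue's neighborhood rather than spanning the whole protein.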


Subject(s)
Point Mutation , SARS-CoV-2 , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , Computational Biology/methods , Protein Conformation , Humans , Neural Networks, Computer , Protein Binding , COVID-19/virology , Proteins/chemistry , Proteins/metabolism , Algorithms
4.
Bioinformatics ; 40(Supplement_1): i328-i336, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940160

ABSTRACT

SUMMARY: Multiple sequence alignment is an important problem in computational biology with applications that include phylogeny and the detection of remote homology between protein sequences. UPP is a popular software package that constructs accurate multiple sequence alignments for large datasets based on ensembles of hidden Markov models (HMMs). A computational bottleneck for this method is a sequence-to-HMM assignment step, which relies on the precise computation of probability scores on the HMMs. In this work, we show that we can speed up this assignment step significantly by replacing these HMM probability scores with alternative scores that can be efficiently estimated. Our proposed approach utilizes a multi-armed bandit algorithm to adaptively and efficiently compute estimates of these scores. This allows us to achieve similar alignment accuracy as UPP with a significant reduction in computation time, particularly for datasets with long sequences. AVAILABILITY AND IMPLEMENTATION: The code used to produce the results in this paper is available on GitHub at: https://github.com/ilanshom/adaptiveMSA.
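
As a rough illustration of how a multi-armed bandit can replace exact scoring, the sketch below treats each candidate model as an arm and samples per-position scores instead of evaluating every position. It uses generic successive elimination with a Hoeffding-style confidence radius; the scoring scheme, the function name `best_arm`, and the toy data are assumptions, not the algorithm or scores used in the paper.

```python
import math
import random


def best_arm(arms, rng, delta=0.01, max_rounds=2000):
    """Successive elimination over arms whose sampled rewards lie in [0, 1].

    Each arm is a list of per-position scores; one "pull" samples a single
    position rather than averaging over all of them.
    """
    active = set(range(len(arms)))
    totals = [0.0] * len(arms)
    means = {}
    for t in range(1, max_rounds + 1):
        for a in active:
            totals[a] += rng.choice(arms[a])
        # Every active arm has now been pulled exactly t times.
        radius = math.sqrt(math.log(4 * len(arms) * t * t / delta) / (2 * t))
        means = {a: totals[a] / t for a in active}
        best_lower = max(means.values()) - radius
        # Keep only arms whose upper bound still reaches the best lower bound.
        active = {a for a in active if means[a] + radius >= best_lower}
        if len(active) == 1:
            break
    return max(active, key=lambda a: means[a])


# Hypothetical per-position scores for three candidate models.
arms = [[0.0, 0.1, 0.3], [0.35, 0.5, 0.6], [0.7, 0.9, 1.0]]
chosen = best_arm(arms, random.Random(0))
```

Clearly inferior arms are dropped after a few samples, so most of the sampling effort goes to the arms that are hard to tell apart.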


Subject(s)
Algorithms , Markov Chains , Sequence Alignment , Software , Sequence Alignment/methods , Computational Biology/methods , Sequence Analysis, Protein/methods , Phylogeny , Proteins/chemistry
5.
Bioinformatics ; 40(Supplement_1): i401-i409, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940168

ABSTRACT

Automated protein function prediction is a crucial and widely studied problem in bioinformatics. Computationally, protein function is a multilabel classification problem where only positive samples are defined and there is a large number of unlabeled annotations. Most existing methods rely on the assumption that the unlabeled set of protein function annotations are negatives, inducing the false negative issue, where potential positive samples are trained as negatives. We introduce a novel approach named PU-GO, wherein we address function prediction as a positive-unlabeled ranking problem. We apply empirical risk minimization, i.e. we minimize the classification risk of a classifier where class priors are obtained from the Gene Ontology hierarchical structure. We show that our approach is more robust than other state-of-the-art methods on similarity-based and time-based benchmark datasets. AVAILABILITY AND IMPLEMENTATION: Data and code are available at https://github.com/bio-ontology-research-group/PU-GO.
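
To make the positive-unlabeled idea concrete, here is a minimal sketch of the widely used non-negative PU risk estimator with a sigmoid surrogate loss. It is a generic textbook form, not PU-GO's ranking loss; in particular, the `prior` argument is simply passed in here, whereas the abstract says class priors are derived from the Gene Ontology hierarchy.

```python
import math


def sigmoid_loss(score, label):
    """Smooth surrogate for the 0-1 loss; label is +1 or -1."""
    return 1.0 / (1.0 + math.exp(label * score))


def non_negative_pu_risk(pos_scores, unl_scores, prior):
    """Estimate classification risk from positive and unlabeled scores only."""
    def mean(values):
        return sum(values) / len(values)

    pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    # The unlabeled set mixes positives and negatives, so the positive share
    # is subtracted; clamping at zero keeps the negative-class risk valid.
    return prior * pos_as_pos + max(0.0, unl_as_neg - prior * pos_as_neg)


# Hypothetical classifier scores: a good scorer and a badly inverted one.
good = non_negative_pu_risk([10.0], [-10.0], prior=0.5)
bad = non_negative_pu_risk([-10.0], [10.0], prior=0.5)
```

Unlike training with unlabeled annotations treated as negatives, this estimator does not penalize the classifier for scoring an unlabeled example highly when the prior says some unlabeled examples are positive.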


Subject(s)
Computational Biology , Gene Ontology , Proteins , Proteins/chemistry , Proteins/metabolism , Computational Biology/methods , Databases, Protein , Algorithms
6.
Bioinformatics ; 40(Supplement_1): i428-i436, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940171

ABSTRACT

MOTIVATION: Cross-linking tandem mass spectrometry (XL-MS/MS) is an established analytical platform used to determine distance constraints between residues within a protein or from physically interacting proteins, thus improving our understanding of protein structure and function. To aid biological discovery with XL-MS/MS, it is essential that pairs of chemically linked peptides be accurately identified, a process that requires: (i) database search, that creates a ranked list of candidate peptide pairs for each experimental spectrum and (ii) false discovery rate (FDR) estimation, that determines the probability of a false match in a group of top-ranked peptide pairs with scores above a given threshold. Currently, the only available FDR estimation mechanism in XL-MS/MS is the target-decoy approach (TDA). However, despite its simplicity, TDA has both theoretical and practical limitations that impact the estimation accuracy and increase run time over potential decoy-free approaches (DFAs). RESULTS: We introduce a novel decoy-free framework for FDR estimation in XL-MS/MS. Our approach relies on multi-sample mixtures of skew normal distributions, where the latent components correspond to the scores of correct peptide pairs (both peptides identified correctly), partially incorrect peptide pairs (one peptide identified correctly, the other incorrectly), and incorrect peptide pairs (both peptides identified incorrectly). To learn these components, we exploit the score distributions of first- and second-ranked peptide-spectrum matches for each experimental spectrum and subsequently estimate FDR using a novel expectation-maximization algorithm with constraints. We evaluate the method on ten datasets and provide evidence that the proposed DFA is theoretically sound and a viable alternative to TDA owing to its good performance in terms of accuracy, variance of estimation, and run time. AVAILABILITY AND IMPLEMENTATION: https://github.com/shawn-peng/xlms.
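
For context, the target-decoy baseline that this abstract contrasts with can be sketched in its simplest form: the number of decoy matches above a score threshold divided by the number of target matches above it. This is an illustrative simplification with hypothetical names and scores; cross-linking workflows typically use variants that distinguish target-decoy from decoy-decoy pairs, and the paper's decoy-free mixture model is not shown here.

```python
def target_decoy_fdr(target_scores, decoy_scores, threshold):
    """Estimate FDR as (#decoys >= threshold) / (#targets >= threshold)."""
    targets = sum(1 for s in target_scores if s >= threshold)
    decoys = sum(1 for s in decoy_scores if s >= threshold)
    if targets == 0:
        return 0.0  # Nothing accepted, so there are no false discoveries to report.
    return min(1.0, decoys / targets)


# Hypothetical match scores against the target and decoy databases.
fdr = target_decoy_fdr([10, 9, 8, 3], [7, 2], threshold=5)
```

Here three target matches and one decoy match pass the threshold, giving an estimated FDR of one third.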


Subject(s)
Algorithms , Databases, Protein , Proteomics , Tandem Mass Spectrometry , Tandem Mass Spectrometry/methods , Proteomics/methods , Peptides/chemistry , Proteins/chemistry
7.
Bioinformatics ; 40(Supplement_1): i539-i547, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940179

ABSTRACT

MOTIVATION: In drug discovery, it is crucial to assess the drug-target binding affinity (DTA). Although molecular docking is widely used, computational efficiency limits its application in large-scale virtual screening. Deep learning-based methods learn virtual scoring functions from labeled datasets and can quickly predict affinity. However, there are three limitations. First, existing methods only consider the atom-bond graph or one-dimensional sequence representations of compounds, ignoring the information about functional groups (pharmacophores) with specific biological activities. Second, relying on limited labeled datasets fails to learn comprehensive embedding representations of compounds and proteins, resulting in poor generalization performance in complex scenarios. Third, existing feature fusion methods cannot adequately capture contextual interaction information. RESULTS: Therefore, we propose a novel DTA prediction method named HeteroDTA. Specifically, a multi-view compound feature extraction module is constructed to model the atom-bond graph and pharmacophore graph. The residue concat graph and protein sequence are also utilized to model protein structure and function. Moreover, to enhance the generalization capability and reduce the dependence on task-specific labeled data, pre-trained models are utilized to initialize the atomic features of the compounds and the embedding representations of the protein sequence. A context-aware nonlinear feature fusion method is also proposed to learn interaction patterns between compounds and proteins. Experimental results on public benchmark datasets show that HeteroDTA significantly outperforms existing methods. In addition, HeteroDTA shows excellent generalization performance in cold-start experiments and superiority in the representation learning ability of drug-target pairs. Finally, the effectiveness of HeteroDTA is demonstrated in a real-world drug discovery study. 
AVAILABILITY AND IMPLEMENTATION: The source code and data are available at https://github.com/daydayupzzl/HeteroDTA.


Subject(s)
Drug Discovery , Drug Discovery/methods , Molecular Docking Simulation , Proteins/chemistry , Proteins/metabolism , Deep Learning , Pharmacophore
8.
Methods Mol Biol ; 2820: 7-20, 2024.
Article in English | MEDLINE | ID: mdl-38941010

ABSTRACT

Wastewater treatment plants (WWTPs) are the main barrier against the increasing pressure that municipal and industrial wastewater places on natural water resources, in terms of both pollutant load and volume produced. For this reason, WWTP efficiency should be as high as possible, which makes monitoring critical. In conventional WWTPs, biodegradation of pollutants mainly occurs in the biological reactors, and interest in a deeper characterization of the biomasses involved in these processes (biofilms, granules, and suspended activated sludge) has grown in recent years. In this context, meta-omics approaches were recently developed to investigate the entire set of biomolecules of a given class in a microbial community, with the same general objective: identifying the biomolecules through high sequence similarity to entries in existing databases. In particular, metaproteomics concerns the identification of all proteins in a microbial community at a given moment or under a given condition. In this chapter, a protocol for the extraction and separation of proteins from activated sludge sampled at WWTPs is proposed.


Subject(s)
Sewage , Wastewater , Sewage/microbiology , Wastewater/microbiology , Wastewater/chemistry , Wastewater/analysis , Proteomics/methods , Proteins/isolation & purification , Proteins/analysis , Waste Disposal, Fluid/methods
9.
Methods Mol Biol ; 2820: 21-28, 2024.
Article in English | MEDLINE | ID: mdl-38941011

ABSTRACT

The metaproteomic approach allows a deep microbiome characterization in different complex systems. Based on metaproteome data, microbial communities' composition, succession, and functional role in different environmental conditions can be established. The main challenge in metaproteomic studies is protein extraction, and although many protocols have been developed, few focus on protein extraction from fermented foods. In this chapter, a reproducible and efficient method for the extraction of proteins from a traditionally fermented starchy food is described. The method can be applied to any fermented food and aims to enrich the extraction of proteins from microorganisms for their subsequent characterization.


Subject(s)
Fermented Foods , Proteomics , Fermented Foods/microbiology , Fermented Foods/analysis , Proteomics/methods , Fermentation , Proteins/isolation & purification , Proteins/analysis , Microbiota , Food Microbiology/methods
10.
Methods Mol Biol ; 2820: 29-39, 2024.
Article in English | MEDLINE | ID: mdl-38941012

ABSTRACT

Soil metaproteomics can explore the proteins involved in life activities and their abundance in soils, overcoming both the difficulty of obtaining pure cultures of soil microorganisms and the limitations of pure-culture proteomics. However, the complexity and heterogeneity of soil composition, the low abundance of soil proteins, and the presence of massive amounts of interfering substances (including humic compounds) generally lead to an extremely low extraction efficiency of soil proteins. Therefore, the efficient extraction of soil proteins is both a prerequisite and a bottleneck in soil metaproteomics. In this chapter, a low-cost, simple-to-operate soil protein extraction method suitable for most types of soil is described (about 150 µg of protein can be extracted from 5.0 g of soil). The quantity and purity of the extracted soil proteins can meet the requirements for further analysis using routine mass spectrometry-based proteomics.


Subject(s)
Proteomics , Soil , Soil/chemistry , Proteomics/methods , Proteins/isolation & purification , Proteins/analysis , Soil Microbiology , Mass Spectrometry/methods
11.
Methods Mol Biol ; 2820: 1-6, 2024.
Article in English | MEDLINE | ID: mdl-38941009

ABSTRACT

A method for the recovery of whole-cell protein extracts from biomass on membrane filters is provided here. The protein extraction method is ideal for biomass captured by filtration of large water volumes, including seawater from marine environments. The protein extraction method includes both chemical disruption and physical disruption to lyse cells and release protein for subsequent metaproteomic analysis.


Subject(s)
Filtration , Seawater , Filtration/methods , Seawater/microbiology , Microbiota , Proteomics/methods , Biomass , Bacterial Proteins/isolation & purification , Aquatic Organisms , Proteins/isolation & purification , Proteins/analysis
12.
Methods Mol Biol ; 2820: 49-56, 2024.
Article in English | MEDLINE | ID: mdl-38941014

ABSTRACT

The development of high-throughput methods has enabled the study of hundreds of samples, and metaproteomics is no exception. However, the study of thousands of proteins from different organisms poses challenges at every stage, from protein extraction to bioinformatic analysis. Here, sample preparation, protein extraction and protein purification for livestock microbiome research through metaproteomics are described. These methods are essential because the quality of the final protein pool depends on them. For that reason, the following workflow combines different chemical and physical methods intended to achieve an initial separation of the microbial organisms from the host cells and other organic materials, as well as the extraction of highly concentrated, pure samples.


Subject(s)
Livestock , Microbiota , Proteomics , Animals , Livestock/microbiology , Proteomics/methods , Proteins/isolation & purification , Proteins/analysis
13.
PLoS Comput Biol ; 20(6): e1012123, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38935611

ABSTRACT

AlphaFold2 is an Artificial Intelligence-based program developed to predict the 3D structure of proteins at atomic resolution given only their amino acid sequence. Due to the accuracy and efficiency with which AlphaFold2 can generate 3D structure predictions and its widespread adoption into various aspects of biochemical research, the technique of protein structure prediction should be considered for incorporation into the undergraduate biochemistry curriculum. A module for introducing AlphaFold2 into a senior-level biochemistry laboratory classroom was developed. The module's focus was to have students predict the structures of proteins from the MPOX 22 global outbreak virus isolate genome, which had no structures elucidated at that time. The goal of this study was both to determine the impact the module had on students and to develop a framework for introducing AlphaFold2 into the undergraduate curriculum so that instructors for biochemistry courses, regardless of their background in bioinformatics, could adapt the module into their classrooms.


Subject(s)
Artificial Intelligence , Biochemistry , Curriculum , Humans , Biochemistry/education , Computational Biology/education , Computational Biology/methods , Protein Conformation , Students , Software , Universities , Proteins/chemistry , Proteins/metabolism , Proteins/genetics , Amino Acid Sequence
14.
Nat Commun ; 15(1): 5459, 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38937468

ABSTRACT

Atomic-scale molecular modeling and simulation are powerful tools for computational biology. However, constructing models with large, densely packed molecules, non-water solvents, or with combinations of multiple biomembranes, polymers, and nanomaterials remains challenging and requires significant time and expertise. Furthermore, existing tools do not support such assemblies under the periodic boundary conditions (PBC) necessary for molecular simulation. Here, we describe Multicomponent Assembler in CHARMM-GUI that automates complex molecular assembly and simulation input preparation under the PBC. In this work, we demonstrate its versatility by preparing 6 challenging systems with varying density of large components: (1) solvated proteins, (2) solvated proteins with a pre-equilibrated membrane, (3) solvated proteins with a sheet-like nanomaterial, (4) solvated proteins with a sheet-like polymer, (5) a mixed membrane-nanomaterial system, and (6) a sheet-like polymer with gaseous solvent. Multicomponent Assembler is expected to be a unique cyberinfrastructure to study complex interactions between small molecules, biomacromolecules, polymers, and nanomaterials.


Subject(s)
Nanostructures , Polymers , Nanostructures/chemistry , Polymers/chemistry , Molecular Dynamics Simulation , Proteins/chemistry , Models, Molecular , Solvents/chemistry , Computational Biology/methods , Software
15.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38920341

ABSTRACT

Drug-target interactions (DTIs) are a key part of the drug development process, and their accurate and efficient prediction can significantly boost development efficiency and reduce development time. Recent years have witnessed the rapid advancement of deep learning, resulting in an abundance of deep learning-based models for DTI prediction. However, most of these models use a single representation of drugs and proteins, making it difficult to comprehensively represent their characteristics. Multimodal data fusion can effectively compensate for the limitations of single-modal data. However, existing multimodal models for DTI prediction do not take into account both intra- and inter-modal interactions simultaneously, resulting in limited representation capabilities of fused features and a reduction in DTI prediction accuracy. A hierarchical multimodal self-attention-based graph neural network for DTI prediction, called HMSA-DTI, is proposed to address multimodal feature fusion. Our proposed HMSA-DTI takes drug SMILES, drug molecular graphs, protein sequences and protein 2-mer sequences as inputs, and utilizes a hierarchical multimodal self-attention mechanism to achieve deep fusion of multimodal features of drugs and proteins, enabling the capture of intra- and inter-modal interactions between drugs and proteins. It is demonstrated that our proposed HMSA-DTI has significant advantages over other baseline methods on multiple evaluation metrics across five benchmark datasets.
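
One of the inputs named above, the protein 2-mer sequence, is easy to illustrate. The sketch below splits a sequence into overlapping k-mers; whether HMSA-DTI uses overlapping or non-overlapping 2-mers is not stated in the abstract, so the overlapping window, the function name, and the example sequence are assumptions.

```python
def kmers(sequence, k=2):
    """Split a sequence into overlapping k-mers (sliding window, stride 1)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Hypothetical five-residue fragment.
tokens = kmers("MKTAY")
```

The resulting tokens can then be mapped to integer ids and embedded in the same way as words in a sentence.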


Subject(s)
Deep Learning , Neural Networks, Computer , Proteins/chemistry , Proteins/metabolism , Humans , Algorithms , Computational Biology/methods
16.
J Chem Theory Comput ; 20(12): 4998-5011, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38830621

ABSTRACT

Phosphorylation is the most common and most extensively studied post-translational modification (PTM) of proteins in eukaryotes. It constitutes a major regulatory mechanism, modulating protein function, protein-protein interactions, as well as subcellular localization. Phosphorylation sites are preferably located in intrinsically disordered regions and have been shown to trigger structural rearrangements and order-to-disorder transitions. Phosphorylations can therefore have a significant effect on protein backbone dynamics or conformation, but only sparse experimental data are available. To obtain a more general description of how and when phosphorylations have a significant effect on protein behavior, molecular dynamics (MD) currently provides the only suitable framework to study these effects at a large scale in atomistic detail. This study develops a systematic MD simulation framework to explore the influence of phosphorylations on the local backbone dynamics and conformational propensities of proteins. Through a series of glycine-backbone peptides, we studied the effects of amino acid residues, including the three most common phosphorylations (Ser, Thr, and Tyr), on local backbone dynamics and conformational propensities. We further extended our study to investigate the interactions of all such residues between position i and positions i + 1, i + 2, i + 3, and i + 4 in such peptides. The final data set comprises structural ensembles for 3393 sequences with more than 1 µs of sampling for each ensemble. To validate the relevance of the results, the structural and conformational properties extracted from the MD simulations are compared to NMR data from the Biological Magnetic Resonance Data Bank. The systematic nature of this study enables the projection of the gained knowledge onto any phosphorylation site in the proteome and provides a general framework for the study of further PTMs. The full data set is publicly available as a training and reference set.


Subject(s)
Molecular Dynamics Simulation , Protein Conformation , Proteins , Phosphorylation , Proteins/chemistry , Proteins/metabolism , Peptides/chemistry , Peptides/metabolism
17.
J Chem Inf Model ; 64(12): 4651-4660, 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38847393

ABSTRACT

We present a novel and interpretable approach for assessing small-molecule binding using context explanation networks. Given the specific structure of a protein/ligand complex, our CENsible scoring function uses a deep convolutional neural network to predict the contributions of precalculated terms to the overall binding affinity. We show that CENsible can effectively distinguish active vs inactive compounds for many systems. Its primary benefit over related machine-learning scoring functions, however, is that it retains interpretability, allowing researchers to identify the contribution of each precalculated term to the final affinity prediction, with implications for subsequent lead optimization.


Subject(s)
Neural Networks, Computer , Protein Binding , Proteins , Small Molecule Libraries , Ligands , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology , Small Molecule Libraries/metabolism , Proteins/chemistry , Proteins/metabolism , Machine Learning
18.
J Control Release ; 371: 429-444, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38849096

ABSTRACT

Protein-based nanoparticles have garnered significant attention in theranostic applications due to their superior biocompatibility, exceptional biodegradability and ease of functionality. Compared to other nanocarriers, protein-based nanoparticles offer additional advantages, including biofunctionality and precise molecular recognition abilities, which make them highly effective in navigating complex biological environments. Moreover, proteins can serve as powerful tools with self-assembling structures and reagents that enhance cell penetration. Their derivation from abundant renewable sources and ability to degrade into harmless amino acids further enhance their suitability for biomedical applications. However, protein-based nanoparticles have so far not realized their full potential. In this review, we summarize recent advances in the use of protein nanoparticles in tumor diagnosis and treatment and outline typical methods for preparing protein nanoparticles. The review of protein nanoparticles may provide useful new insights into the development of biomaterial fabrication.


Subject(s)
Drug Delivery Systems , Nanoparticles , Neoplasms , Proteins , Theranostic Nanomedicine , Humans , Neoplasms/drug therapy , Theranostic Nanomedicine/methods , Nanoparticles/chemistry , Animals , Proteins/administration & dosage , Proteins/chemistry , Antineoplastic Agents/administration & dosage , Antineoplastic Agents/chemistry
19.
J Chem Theory Comput ; 20(12): 5352-5367, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38859575

ABSTRACT

Markov state models (MSMs) have proven valuable in studying the dynamics of protein conformational changes via statistical analysis of molecular dynamics simulations. In MSMs, the complex configuration space is coarse-grained into conformational states, with dynamics modeled by a series of Markovian transitions among these states at discrete lag times. Constructing the Markovian model at a specific lag time necessitates defining states that circumvent significant internal energy barriers, enabling internal dynamics relaxation within the lag time. This process effectively coarse-grains time and space, integrating out rapid motions within metastable states. Thus, MSMs possess a multiresolution nature, where the granularity of states can be adjusted according to the time-resolution, offering flexibility in capturing system dynamics. This work introduces a continuous embedding approach for molecular conformations using the state predictive information bottleneck (SPIB), a framework that unifies dimensionality reduction and state space partitioning via a continuous, machine learned basis set. Without explicit optimization of the VAMP-based scores, SPIB demonstrates state-of-the-art performance in identifying slow dynamical processes and constructing predictive multiresolution Markovian models. Through applications to well-validated mini-proteins, SPIB showcases unique advantages compared to competing methods. It autonomously and self-consistently adjusts the number of metastable states based on a specified minimal time resolution, eliminating the need for manual tuning. While maintaining efficacy in dynamical properties, SPIB excels in accurately distinguishing metastable states and capturing numerous well-populated macrostates. This contrasts with existing VAMP-based methods, which often emphasize slow dynamics at the expense of incorporating numerous sparsely populated states. 
Furthermore, SPIB's ability to learn a low-dimensional continuous embedding of the underlying MSMs enhances the interpretation of dynamic pathways. With these benefits, we propose SPIB as an easy-to-implement methodology for end-to-end MSM construction.
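
The Markovian transition model at the core of an MSM can be sketched as follows: count transitions between discrete states separated by a lag time, then normalize each row. This is the standard maximum-likelihood count estimate without a reversibility constraint, not SPIB itself; the function name and the toy trajectory are assumptions for illustration.

```python
def transition_matrix(states, n_states, lag=1):
    """Row-normalized transition counts between states separated by `lag` steps."""
    counts = [[0] * n_states for _ in range(n_states)]
    # Pair each frame with the frame `lag` steps later.
    for a, b in zip(states, states[lag:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left within the data gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical discretized trajectory over two conformational states.
T = transition_matrix([0, 0, 1, 1, 0, 0, 1, 1], n_states=2, lag=1)
```

Increasing `lag` coarse-grains time: fast motions inside a state are integrated out, which is the multiresolution property the abstract refers to.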


Subject(s)
Markov Chains , Molecular Dynamics Simulation , Proteins/chemistry , Protein Conformation
20.
Bioinformatics ; 40(6), 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38870521

ABSTRACT

MOTIVATION: Tools for pairwise alignments between 3D structures of proteins are of fundamental importance for structural biology and bioinformatics, enabling visual exploration of evolutionary and functional relationships. However, the absence of a user-friendly, browser-based tool for creating alignments and visualizing them at both 1D sequence and 3D structural levels makes this process unnecessarily cumbersome. RESULTS: We introduce a novel pairwise structure alignment tool (rcsb.org/alignment) that seamlessly integrates into the RCSB Protein Data Bank (RCSB PDB) research-focused RCSB.org web portal. Our tool and its underlying application programming interface (alignment.rcsb.org) empower users to align several protein chains with a reference structure by providing access to established alignment algorithms (FATCAT, CE, TM-align, or Smith-Waterman 3D). The user-friendly interface simplifies parameter setup and input selection. Within seconds, our tool enables visualization of results in both sequence (1D) and structural (3D) perspectives through the RCSB PDB RCSB.org Sequence Annotations viewer and Mol* 3D viewer, respectively. Users can effortlessly compare structures deposited in the PDB archive alongside more than a million incorporated Computed Structure Models coming from the ModelArchive and AlphaFold DB. Moreover, this tool can be used to align custom structure data by providing a link/URL or uploading atomic coordinate files directly. Importantly, alignment results can be bookmarked and shared with collaborators. By bridging the gap between 1D sequence and 3D structures of proteins, our tool facilitates deeper understanding of complex evolutionary relationships among proteins through comprehensive sequence and structural analyses. AVAILABILITY AND IMPLEMENTATION: The alignment tool is part of the RCSB PDB research-focused RCSB.org web portal and available at rcsb.org/alignment. Programmatic access is available via alignment.rcsb.org. 
Frontend code has been published at github.com/rcsb/rcsb-pecos-app. Visualization is powered by the open-source Mol* viewer (github.com/molstar/molstar and github.com/molstar/rcsb-molstar) plus the Sequence Annotations in 3D Viewer (github.com/rcsb/rcsb-saguaro-3d).


Subject(s)
Algorithms , Databases, Protein , Proteins , Sequence Alignment , Software , Proteins/chemistry , Sequence Alignment/methods , Protein Conformation , User-Computer Interface , Computational Biology/methods