1.
J Cheminform ; 16(1): 52, 2024 May 12.
Article in English | MEDLINE | ID: mdl-38735985

ABSTRACT

Protein-ligand binding affinity plays a pivotal role in drug development, particularly in identifying potential ligands for target disease-related proteins. Accurate affinity predictions can significantly reduce both the time and cost involved in drug development. However, highly precise affinity prediction remains a research challenge. A key to improving affinity prediction is to capture interactions between proteins and ligands effectively. Existing deep-learning-based computational approaches use 3D grids, 4D tensors, molecular graphs, or proximity-based adjacency matrices, which are either resource-intensive or do not directly represent potential interactions. In this paper, we propose atomic-level distance features and attention mechanisms to better capture specific protein-ligand interactions based on donor-acceptor relations, hydrophobicity, and π-stacking atoms. We argue that distances encompass both short-range direct and long-range indirect interaction effects, while attention mechanisms capture levels of interaction effects. On the well-known CASF-2016 dataset, our proposed method, named Distance plus Attention for Affinity Prediction (DAAP), significantly outperforms existing methods by achieving Correlation Coefficient (R) 0.909, Root Mean Squared Error (RMSE) 0.987, Mean Absolute Error (MAE) 0.745, Standard Deviation (SD) 0.988, and Concordance Index (CI) 0.876. The proposed method also shows substantial improvement, around 2% to 37%, on five other benchmark datasets. The program and data are publicly available on the website https://gitlab.com/mahnewton/daap.

Scientific Contribution Statement: This study introduces distance-based features to predict protein-ligand binding affinity, capitalizing on unique molecular interactions. Furthermore, the incorporation of protein sequence features of specific residues enhances the model's proficiency in capturing intricate binding patterns. The predictive capabilities are further strengthened through the use of a deep learning architecture with attention mechanisms, and an ensemble approach, averaging the outputs of five models, is implemented to ensure robust and reliable predictions.
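
The evaluation metrics named in this abstract (R, RMSE, MAE, and CI) have standard definitions, and a minimal sketch of them may help when reading the reported numbers. The function names below are illustrative and are not taken from the DAAP repository; the SD metric is omitted because it depends on a fitted regression line.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between measured and predicted affinities."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true values are skipped; tied predictions count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue
            comparable += 1
            agreement = (y_true[i] - y_true[j]) * (y_pred[i] - y_pred[j])
            if agreement > 0:
                concordant += 1.0
            elif agreement == 0:
                concordant += 0.5
    return concordant / comparable
```

Higher R and CI indicate better ranking and correlation, while lower RMSE and MAE indicate smaller errors.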

2.
Comput Biol Chem ; 104: 107834, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36863243

ABSTRACT

Protein Structure Prediction (PSP) has achieved significant progress lately. Prediction of inter-residue distances by machine learning and their exploitation during the conformational search are among the critical factors behind this progress. Real values, rather than bin probabilities, could more naturally represent inter-residue distances, whereas bin probabilities, via spline curves, more naturally help obtain differentiable objective functions. Consequently, PSP methods that exploit predicted binned distances perform better than those that exploit predicted real-valued distances. To leverage the advantage of bin probabilities in obtaining differentiable objective functions, in this work we propose techniques to convert real-valued distances into distance bin probabilities. Using standard benchmark proteins, we then show that our real-to-bin converted distances help PSP methods obtain three-dimensional structures with 4%-16% better root mean squared deviation (RMSD), template modeling score (TM-Score), and global distance test (GDT) values than existing similar PSP methods. Our proposed PSP method is named real to bin (R2B) inter-residue distance predictor, and its code is available from https://gitlab.com/mahnewton/r2b.


Subject(s)
Machine Learning , Proteins , Molecular Models , Protein Databases , Proteins/chemistry , Protein Conformation , Computational Biology/methods , Algorithms
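
One simple way to turn a real-valued distance into bin probabilities is to treat the prediction as the mean of a Gaussian and integrate that Gaussian over each bin. The sketch below illustrates only this idea. It is an assumption made for illustration, not necessarily the conversion used in R2B, and the bin edges and `sigma` are hypothetical values.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian with mean mu and std sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def real_to_bin_probs(distance, edges, sigma=1.0):
    """Spread one predicted distance over bins delimited by ascending `edges`.

    Returns len(edges) + 1 probabilities: one bin below the first edge,
    one bin between each pair of consecutive edges, and one bin above the
    last edge. The Gaussian uncertainty model is an illustrative assumption.
    """
    cdf = [normal_cdf(edge, distance, sigma) for edge in edges]
    inner = [cdf[k + 1] - cdf[k] for k in range(len(edges) - 1)]
    return [cdf[0]] + inner + [1.0 - cdf[-1]]
```

For example, with hypothetical edges at 4, 6, ..., 20 Å and `sigma=1.0`, a predicted distance of 7 Å puts most of its probability mass in the 6-8 Å bin.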
3.
Comput Biol Chem ; 101: 107773, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36182866

ABSTRACT

Protein structure prediction (PSP) is a crucial problem in bioinformatics, with important uses in many research areas, including drug discovery. One of the important intermediate steps in PSP is predicting a protein's beta-sheet structures. Because of non-local interactions among numerous irregular areas in beta-sheets, their highly accurate prediction is challenging. The challenge is compounded when a given protein's structure has a large number of beta-sheets. In this paper, we specifically refine the beta-sheets of a protein structure by using a local search method. Then, we use another local search method to refine the full structure. Our search methods analyse residue-residue distance-based scores and apply geometric restrictions obtained from deep learning models. Moreover, our search methods identify the regions of the current conformation that give rise to the worse scores and generate neighbouring conformations by making alterations focused on those regions. On a set of 88 standard proteins with sizes between 46 and 450 residues, our method outperforms state-of-the-art PSP search algorithms. The improvements are more than 12% in average root mean squared deviation (RMSD), template modelling score (TM-score), and global distance test (GDT) values.


Subject(s)
Computational Biology , Proteins , Beta-Strand Protein Conformation , Proteins/chemistry , Computational Biology/methods , Algorithms , Protein Conformation
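
The idea of focusing a local search on the worst-scoring region of a conformation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: it assumes a per-residue penalty equal to the total disagreement between current and predicted distances (higher is worse), and it picks a fixed-width window of consecutive residues with the largest total penalty. Both function names are hypothetical.

```python
def residue_penalties(current_dist, predicted_dist):
    """Per-residue penalty: summed absolute disagreement with predicted distances."""
    n = len(current_dist)
    return [
        sum(abs(current_dist[i][j] - predicted_dist[i][j]) for j in range(n) if j != i)
        for i in range(n)
    ]


def worst_window(penalties, width):
    """Return (start, end) of the consecutive window with the largest total penalty."""
    if width <= 0 or width > len(penalties):
        raise ValueError("width must be between 1 and len(penalties)")
    running = sum(penalties[:width])
    best_start, best_total = 0, running
    for start in range(1, len(penalties) - width + 1):
        # Slide the window by one residue: add the new one, drop the old one.
        running += penalties[start + width - 1] - penalties[start - 1]
        if running > best_total:
            best_start, best_total = start, running
    return best_start, best_start + width
```

A neighbourhood move would then alter only the residues inside the returned window and keep the rest of the conformation unchanged.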
4.
Comput Biol Med ; 148: 105824, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35863250

ABSTRACT

Predicted inter-residue distances are a key factor behind the recent success in high-quality protein structure prediction (PSP). However, predicting both short and long distance values together is challenging. Consequently, existing PSP methods mostly use predicted short distances. In this paper, we use a stacked meta-ensemble method to combine deep learning models trained for different ranges of real-valued distances. On five benchmark sets of proteins, our proposed inter-residue distance prediction method improves mean Local Distance Difference Test (LDDT) scores by at least 5% over existing methods of this kind. Moreover, using a real-valued distance-based conformational search algorithm, we also show that predicted long distances help obtain significantly better protein conformations than when only predicted short distances are used. Our method is named meta-ensemble for distance prediction (MDP) and its program is available from https://gitlab.com/mahnewton/mdp.


Subject(s)
Algorithms , Proteins , Protein Conformation
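
The LDDT score mentioned in the abstract above measures how well local distances are preserved. The sketch below is a simplified, distance-matrix version intended only as an illustration: it scores all residue pairs globally rather than per residue, uses one distance per residue pair, and assumes the commonly used 15 Å inclusion radius and 0.5, 1, 2, and 4 Å tolerance thresholds. It may differ from the exact evaluation protocol used in the paper.

```python
def lddt_from_distances(true_dist, pred_dist, inclusion_radius=15.0,
                        thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Simplified LDDT-like score between two symmetric distance matrices.

    Only pairs closer than `inclusion_radius` in the true structure count.
    For each such pair, every threshold that the absolute distance error
    stays below contributes one preserved distance.
    """
    n = len(true_dist)
    preserved = 0
    considered = 0
    for i in range(n):
        for j in range(i + 1, n):
            if true_dist[i][j] >= inclusion_radius:
                continue
            error = abs(true_dist[i][j] - pred_dist[i][j])
            considered += len(thresholds)
            preserved += sum(1 for t in thresholds if error < t)
    return preserved / considered if considered else 0.0
```

A score of 1.0 means every considered distance is reproduced within the tightest threshold; a score of 0.0 is returned when no pair falls inside the inclusion radius.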
5.
Comput Biol Chem ; 99: 107700, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35665657

ABSTRACT

Protein contact maps capture coevolutionary interactions between amino acid residue pairs that are spatially within a certain proximity threshold. Predicted contact maps are used in many protein-related problems, including drug design, protein design, protein function prediction, and protein structure prediction. Contact map prediction has achieved significant progress lately, but challenges remain in predicting contacts between residues that are separated in the amino acid sequence by large numbers of other residues. In this paper, with experimental results on 5 standard benchmark datasets that include membrane proteins, we show that contact map prediction could be significantly enhanced by using ensembles of various state-of-the-art short distance predictors and then by converting predicted distances into contact probabilities. Our program along with its data is available from https://gitlab.com/mahnewton/ecp.


Subject(s)
Computational Biology , Proteins , Algorithms , Amino Acid Sequence , Amino Acids/chemistry , Computational Biology/methods , Proteins/chemistry
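
Converting predicted distances into a contact probability can be sketched as below. The sketch assumes that each predictor outputs a probability distribution over distance bins for a residue pair, that the ensemble is a plain average, and that a contact means a distance of at most 8 Å (a widely used contact threshold). These choices and the function names are illustrative assumptions, not the ECP implementation.

```python
def contact_probability(bin_probs, bin_upper_edges, threshold=8.0):
    """Sum the probability mass of all bins that lie entirely within the threshold."""
    return sum(p for p, upper in zip(bin_probs, bin_upper_edges) if upper <= threshold)


def ensemble_contact_probability(predictions, bin_upper_edges, threshold=8.0):
    """Average several predictors' bin distributions, then convert to a contact probability."""
    n_predictors = len(predictions)
    averaged = [sum(column) / n_predictors for column in zip(*predictions)]
    return contact_probability(averaged, bin_upper_edges, threshold)
```

With hypothetical bins ending at 4, 6, 8, 10 Å and infinity, two predictors that individually give contact probabilities of 0.6 and 0.2 yield an ensemble contact probability of 0.4.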
6.
BMC Bioinformatics ; 23(1): 6, 2022 Jan 04.
Article in English | MEDLINE | ID: mdl-34983370

ABSTRACT

MOTIVATION: Protein backbone angle prediction has achieved significant accuracy improvement with the development of deep learning methods. Usually the same deep learning model is used to make predictions for all residues, regardless of the categories of secondary structures they belong to. In this paper, we propose to train separate deep learning models for each category of secondary structures. Machine learning methods strive to achieve generality over the training examples and consequently lose accuracy. In this work, we explicitly exploit classification knowledge to restrict generalisation within the specific class of training examples. This is to compensate for the loss of generalisation by exploiting specialisation knowledge in an informed way. RESULTS: The new method named SAP4SS obtains mean absolute error (MAE) values of 15.59, 18.87, 6.03, and 21.71 respectively for four types of backbone angles φ, ψ, θ, and τ. Consequently, SAP4SS significantly outperforms existing state-of-the-art methods SAP, OPUS-TASS, and SPOT-1D: the differences in MAE for all four types of angles range from 1.5% to 4.1% compared to the best known results. AVAILABILITY: SAP4SS along with its data is available from https://gitlab.com/mahnewton/sap4ss .


Subject(s)
Neural Networks (Computer) , Proteins , Machine Learning , Protein Secondary Structure
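
Two ideas from this abstract lend themselves to a short sketch: routing each residue to a model trained for its secondary structure class, and measuring angle errors in a way that respects periodicity (an angle of 170° and one of -170° differ by 20°, not 340°). The code below is an illustration under assumptions. The `models` mapping and its callables are hypothetical placeholders, not the SAP4SS networks.

```python
def angular_error(true_deg, pred_deg):
    """Smallest absolute difference between two angles given in degrees."""
    diff = abs(true_deg - pred_deg) % 360.0
    return min(diff, 360.0 - diff)


def angular_mae(true_angles, pred_angles):
    """Mean absolute error over angles, taking the 360-degree wrap-around into account."""
    errors = [angular_error(t, p) for t, p in zip(true_angles, pred_angles)]
    return sum(errors) / len(errors)


def predict_by_class(features, ss_labels, models):
    """Send each residue's features to the model for its secondary structure class.

    `models` maps a class label (for example "H", "E", "C") to any callable
    that returns a predicted angle; the callables are placeholders here.
    """
    return [models[label](x) for x, label in zip(features, ss_labels)]
```

In practice each entry in `models` would be a separately trained predictor, and `angular_mae` would be reported per angle type.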
7.
Sci Rep ; 12(1): 787, 2022 Jan 17.
Article in English | MEDLINE | ID: mdl-35039537

ABSTRACT

Protein structure prediction (PSP) has achieved significant progress lately via prediction of inter-residue distances using deep learning models and exploitation of the predictions during conformational search. In this context, prediction of large inter-residue distances and prediction of distances between residues that are far apart in the protein sequence remain challenging. To deal with these challenges, state-of-the-art inter-residue distance prediction algorithms have used large sets of coevolutionary and non-coevolutionary features. In this paper, we argue that the more types of features are used, the more kinds of noise are introduced, and the deep learning model then has to overcome the noise to improve the accuracy of the predictions. Also, multiple features capturing similar underlying characteristics might not necessarily have a significantly better cumulative effect. So we scrutinise the feature space to reduce the types of features to be used while, at the same time, striving to improve the prediction accuracy. Consequently, for inter-residue real distance prediction, we propose a deep learning model named scrutinised distance predictor (SDP), which uses only 2 coevolutionary and 3 non-coevolutionary features. On several sets of benchmark proteins, our proposed SDP method improves mean Local Distance Difference Test (LDDT) scores by at least 10% over existing state-of-the-art methods. The SDP program along with its data is available from the website https://gitlab.com/mahnewton/sdp .


Subject(s)
Deep Learning , Proteins/chemistry , Amino Acid Sequence , Datasets as Topic , Molecular Models , Neural Networks (Computer) , Protein Sequence Analysis
9.
ACS Omega ; 6(18): 12306-12317, 2021 May 11.
Article in English | MEDLINE | ID: mdl-34056383

ABSTRACT

Toxicity prediction using quantitative structure-activity relationship has achieved significant progress in recent years. However, most existing machine learning methods in toxicity prediction utilize only one type of feature representation and one type of neural network, which essentially restricts their performance. Moreover, methods that use more than one type of feature representation struggle with the aggregation of information captured within the features since they use predetermined aggregation formulas. In this paper, we propose a deep learning framework for quantitative toxicity prediction using five individual base deep learning models and their own base feature representations. We then propose to adopt a meta ensemble approach using another separate deep learning model to perform aggregation of the outputs of the individual base deep learning models. We train our deep learning models in a weighted multitask fashion combining four quantitative toxicity data sets of LD50, IGC50, LC50, and LC50-DM and minimizing the root-mean-square errors. Compared to the current state-of-the-art toxicity prediction method TopTox on LD50, IGC50, and LC50-DM, that is, three out of four data sets, our method, respectively, obtains 5.46, 16.67, and 6.34% better root-mean-square errors, 6.41, 11.80, and 12.16% better mean absolute errors, and 5.21, 7.36, and 2.54% better coefficients of determination. We named our method QuantitativeTox, and our implementation is available from the GitHub repository https://github.com/Abdulk084/QuantitativeTox.

10.
Sci Rep ; 10(1): 19430, 2020 Nov 10.
Article in English | MEDLINE | ID: mdl-33173130

ABSTRACT

Protein structure prediction is a grand challenge. Prediction of protein structures via representations using backbone dihedral angles has recently achieved significant progress along with the ongoing surge of deep neural network (DNN) research in general. However, we observe that in protein backbone angle prediction research, there is an overall trend to employ more and more complex neural networks and then to throw more and more features at the neural networks. While more features might add more predictive power to the neural network, we argue that redundant features could rather clutter the scenario and that more complex neural networks might then merely counterbalance the noise. From artificial intelligence and machine learning perspectives, problem representations and solution approaches do mutually interact and thus affect performance. We also argue that comparatively simpler predictors can more easily be reconstructed than the more complex ones. With these arguments in mind, we present a deep learning method named Simpler Angle Predictor (SAP) to train simpler DNN models that enhance protein backbone angle prediction. We then empirically show that SAP can significantly outperform existing state-of-the-art methods on well-known benchmark datasets: for some types of angles, the differences are 6-8 in terms of mean absolute error (MAE). The SAP program along with its data is available from the website https://gitlab.com/mahnewton/sap .


Subject(s)
Liver/drug effects , Liver/metabolism , Animals , Apoptosis/drug effects , High-Fat Diet/adverse effects , Dipeptidyl-Peptidase IV Inhibitors/therapeutic use , Hep G2 Cells , Hepatocytes/drug effects , Hepatocytes/metabolism , Humans , In Situ Nick-End Labeling , Male , Mice , Inbred C57BL Mice , Neural Networks (Computer) , TNF-Related Apoptosis-Inducing Ligand Receptors/metabolism
11.
Adv Bioinformatics ; 2014: 867179, 2014.
Article in English | MEDLINE | ID: mdl-24876837

ABSTRACT

Protein structure prediction (PSP) has been one of the most challenging problems in computational biology for several decades. The challenge is largely due to the complexity of the all-atomic details and the unknown nature of the energy function. Researchers have therefore used simplified energy models that consider interaction potentials only between the amino acid monomers in contact on discrete lattices. The restricted nature of the lattices and the energy models poses a twofold concern regarding the assessment of the models. Can a native or a very close structure be obtained when structures are mapped to lattices? Can the contact-based energy models on discrete lattices guide the search towards the native structures? In this paper, we use the protein chain lattice fitting (PCLF) problem to address the first concern; we develop a constraint-based local search algorithm for the PCLF problem for cubic and face-centered cubic lattices and find very close lattice fits for the native structures. For the second concern, we use a number of techniques to sample the conformation space and find correlations between energy functions and the root mean square deviation (RMSD) of the lattice-based structures from the native structures. Our analysis reveals weaknesses in several contact-based energy models that are popular in PSP.

12.
Adv Bioinformatics ; 2014: 985968, 2014.
Article in English | MEDLINE | ID: mdl-24744779

ABSTRACT

Protein structure prediction is computationally a very challenging problem. A large number of existing search algorithms attempt to solve the problem by exploring possible structures and finding the one with the minimum free energy. However, these algorithms perform poorly on large proteins because of an astronomically large search space. In this paper, we present a multipoint spiral search framework that uses parallel processing techniques to expedite exploration by starting from different points. In our approach, a set of random initial solutions is generated and distributed to different threads. We allow each thread to run for a predefined period of time. The improved solutions are stored threadwise. When the threads finish, the solutions are merged together and the duplicates are removed. A selected set of distinct solutions is then split among the threads again. In our ab initio protein structure prediction method, we use the three-dimensional face-centred-cubic lattice for structure-backbone mapping. We use both the low-resolution hydrophobic-polar energy model and the high-resolution 20 × 20 energy model for search guiding. The experimental results show that our new parallel framework significantly improves on the results obtained by the state-of-the-art single-point search approaches for both energy models on the three-dimensional face-centred-cubic lattice. We also experimentally show the effectiveness of mixing energy models within parallel threads.
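
The generate, distribute, improve, merge, deduplicate, and redistribute cycle described above can be sketched with Python's standard thread pool. Everything specific in this sketch is an assumption made for illustration: the toy `energy` function stands in for a lattice energy model, the simple hill climber stands in for spiral search, and a fixed step budget replaces the predefined running time so that results are reproducible.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def energy(solution):
    """Toy stand-in for an energy function: lower is better."""
    return sum(v * v for v in solution)


def local_search(start, seed, steps=200):
    """Hill climber that never accepts a worse solution (placeholder for spiral search)."""
    rng = random.Random(seed)
    best = start
    for _ in range(steps):
        i = rng.randrange(len(best))
        candidate = best[:i] + (best[i] + rng.choice((-1, 1)),) + best[i + 1:]
        if energy(candidate) <= energy(best):
            best = candidate
    return best


def multipoint_search(n_threads=4, rounds=3, dim=5, seed=0):
    """Run several searches in parallel, merging and redistributing between rounds."""
    rng = random.Random(seed)
    population = [tuple(rng.randint(-10, 10) for _ in range(dim))
                  for _ in range(n_threads)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for _ in range(rounds):
            seeds = [rng.randrange(10 ** 9) for _ in population]
            improved = list(pool.map(local_search, population, seeds))
            # Merge the per-thread results, drop duplicates, best first.
            distinct = sorted(set(improved), key=energy)
            # Hand the distinct solutions back out, cycling if there are few.
            population = [distinct[k % len(distinct)] for k in range(n_threads)]
    return min(population, key=energy)
```

Because CPython threads share one interpreter lock, a CPU-bound search of this kind would typically need processes or native code to gain real speed; the sketch only shows the control flow.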

13.
Biomed Res Int ; 2013: 924137, 2013.
Article in English | MEDLINE | ID: mdl-24224180

ABSTRACT

Protein structure prediction (PSP) is computationally a very challenging problem. The challenge largely comes from the fact that the energy function that needs to be minimised in order to obtain the native structure of a given protein is not clearly known. A high-resolution 20 × 20 energy model could better capture the behaviour of the actual energy function than a low-resolution energy model such as hydrophobic-polar (HP). However, the fine-grained details of the high-resolution interaction energy matrix are often not very informative for guiding the search. In contrast, a low-resolution energy model could effectively bias the search towards certain promising directions. In this paper, we develop a genetic algorithm that mainly uses a high-resolution energy model for protein structure evaluation but uses a low-resolution HP energy model to focus the search on exploring structures that have hydrophobic cores. We experimentally show that this mixing of energy models leads to significantly lower-energy structures compared to the state-of-the-art results.


Subject(s)
Computational Biology/methods , Molecular Models , Protein Conformation , Proteins/chemistry , Algorithms , Amino Acid Sequence , Hydrophobic and Hydrophilic Interactions , Protein Folding
14.
BMC Bioinformatics ; 14 Suppl 2: S16, 2013.
Article in English | MEDLINE | ID: mdl-23368706

ABSTRACT

BACKGROUND: Protein structure prediction is an important but unsolved problem in biological science. Predicted structures vary considerably with energy functions and structure-mapping spaces. In our simplified ab initio protein structure prediction methods, we use the hydrophobic-polar (HP) energy model for structure evaluation and the 3-dimensional face-centred-cubic lattice for structure mapping. For the HP energy model, developing a compact hydrophobic core (H-core) is essential for the progress of the search. The H-core helps find a stable structure with the lowest possible free energy. RESULTS: In order to build H-cores, we present a new Spiral Search algorithm based on tabu-guided local search. Our algorithm uses a novel H-core directed guidance heuristic that squeezes the structure around a dynamic hydrophobic-core centre. We applied random walks to break premature H-cores and thus to avoid early convergence. We also used a novel relay-restart technique to handle stagnation. CONCLUSIONS: We have tested our algorithms on a set of benchmark protein sequences. The experimental results show that our spiral search algorithm outperforms the state-of-the-art local search algorithms for simplified protein structure prediction. We also experimentally show the effectiveness of the relay-restart.


Subject(s)
Algorithms , Theoretical Models , Protein Conformation , Proteins/chemistry , Amino Acid Sequence , Hydrophobic and Hydrophilic Interactions
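
The HP energy model on the face-centred-cubic (FCC) lattice can be illustrated with a short sketch. It assumes the usual conventions: each lattice point has 12 neighbours (offsets with exactly two non-zero coordinates of ±1), and every pair of hydrophobic (H) residues that are lattice neighbours but not consecutive in the chain contributes -1 to the energy. The H-core centre is taken here as the plain centroid of the H residues, which is an assumption and not necessarily the paper's dynamic centre.

```python
from itertools import combinations

# The 12 FCC neighbour offsets: exactly two coordinates are +1 or -1.
FCC_NEIGHBOURS = {
    (dx, dy, dz)
    for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)
    if abs(dx) + abs(dy) + abs(dz) == 2
}


def offset(p, q):
    """Coordinate-wise difference q - p."""
    return tuple(b - a for a, b in zip(p, q))


def is_valid_walk(coords):
    """True if consecutive residues are FCC neighbours and no point is reused."""
    if len(set(coords)) != len(coords):
        return False
    return all(offset(p, q) in FCC_NEIGHBOURS for p, q in zip(coords, coords[1:]))


def hp_energy(sequence, coords):
    """HP energy: -1 per non-consecutive H-H pair on neighbouring lattice points."""
    total = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i > 1 and sequence[i] == "H" and sequence[j] == "H":
            if offset(coords[i], coords[j]) in FCC_NEIGHBOURS:
                total -= 1
    return total


def h_core_centre(sequence, coords):
    """Centroid of the hydrophobic residues (an assumed definition of the core centre)."""
    h_points = [c for residue, c in zip(sequence, coords) if residue == "H"]
    if not h_points:
        raise ValueError("sequence contains no H residues")
    return tuple(sum(axis) / len(h_points) for axis in zip(*h_points))
```

A guidance heuristic in this spirit would prefer moves that bring H residues closer to the value returned by `h_core_centre`.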
15.
BMC Bioinformatics ; 14 Suppl 2: S19, 2013.
Article in English | MEDLINE | ID: mdl-23368768

ABSTRACT

BACKGROUND: Given a protein's amino acid sequence, the protein structure prediction problem is to find a three-dimensional structure that has the native energy level. For many decades, it has been one of the most challenging problems in computational biology. A simplified version of the problem is to find an on-lattice self-avoiding walk that minimizes the interaction energy among the amino acids. Local search methods have been preferred for solving the protein structure prediction problem because of their efficiency in finding very good solutions quickly. However, they suffer mainly from two problems: re-visitation and stagnancy. RESULTS: In this paper, we present an efficient local search algorithm that deals with these two problems. During search, we select the best candidate at each iteration, but store the unexplored second-best candidates in a set of elite conformations, and explore them whenever the search faces stagnation. Moreover, we propose a new non-isomorphic encoding for the protein conformations to store the conformations and to check similarity when applied with a memory-based search. This new encoding helps eliminate conformations that are equivalent under rotation and translation, and thus results in better prevention of re-visitation. CONCLUSION: On standard benchmark proteins, our algorithm significantly outperforms the state-of-the-art approaches for Hydrophobic-Polar energy models and the Face Centered Cubic Lattice.


Subject(s)
Algorithms , Computational Biology/methods , Protein Conformation , Proteins/chemistry , Hydrophobic and Hydrophilic Interactions , Theoretical Models , Protein Folding
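
To show why an encoding that ignores rotation and translation helps a memory-based search, the sketch below uses one simple invariant: the tuple of squared distances between all residue pairs, in residue order. This is an assumed stand-in and not the paper's non-isomorphic encoding. One known difference is that it also treats mirror images as identical.

```python
def distance_signature(coords):
    """Squared distances between all residue pairs, listed in residue order.

    Rigid rotations and translations leave every pairwise distance unchanged,
    so equivalent conformations share the same signature.
    """
    n = len(coords)
    return tuple(
        sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
        for i in range(n) for j in range(i + 1, n)
    )


class VisitedMemory:
    """Remembers conformations by signature to prevent re-visitation."""

    def __init__(self):
        self._seen = set()

    def visit(self, coords):
        """Record a conformation; return True only if it had not been seen before."""
        signature = distance_signature(coords)
        if signature in self._seen:
            return False
        self._seen.add(signature)
        return True
```

A search loop would call `visit` on each candidate and skip any candidate for which it returns `False`.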