Results 1 - 20 of 171
1.
Mol Cell ; 82(17): 3193-3208.e8, 2022 09 01.
Article in English | MEDLINE | ID: mdl-35853451

ABSTRACT

Aberrant phase separation of globular proteins is associated with many diseases. Here, we use a model protein system to understand how the unfolded states of globular proteins drive phase separation and the formation of unfolded protein deposits (UPODs). We find that for UPODs to form, the concentrations of unfolded molecules must be above a threshold value. Additionally, unfolded molecules must possess appropriate sequence grammars to drive phase separation. While UPODs recruit molecular chaperones, their compositional profiles are also influenced by synergistic physicochemical interactions governed by the sequence grammars of unfolded proteins and cellular proteins. Overall, the driving forces for phase separation and the compositional profiles of UPODs are governed by the sequence grammars of unfolded proteins. Our studies highlight the need for uncovering the sequence grammars of unfolded proteins that drive UPOD formation and cause gain-of-function interactions whereby proteins are aberrantly recruited into UPODs.


Subject(s)
Molecular Chaperones , Protein Folding , Molecular Chaperones/metabolism
2.
Proc Natl Acad Sci U S A ; 121(21): e2322923121, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38739798

ABSTRACT

The ubiquitin-proteasome system is essential to all eukaryotes and has been shown to be critical to parasite survival as well, including Plasmodium falciparum, the causative agent of the deadliest form of malarial disease. Despite the central role of the ubiquitin-proteasome pathway to parasite viability across its entire life-cycle, specific inhibitors targeting the individual enzymes mediating ubiquitin attachment and removal do not currently exist. The ability to disrupt P. falciparum growth at multiple developmental stages is particularly attractive as this could potentially prevent both disease pathology, caused by asexually dividing parasites, as well as transmission which is mediated by sexually differentiated parasites. The deubiquitinating enzyme PfUCHL3 is an essential protein, transcribed across both human and mosquito developmental stages. PfUCHL3 is considered hard to drug by conventional methods given the high level of homology of its active site to human UCHL3 as well as to other UCH domain enzymes. Here, we apply the RaPID mRNA display technology and identify constrained peptides capable of binding to PfUCHL3 with nanomolar affinities. The two lead peptides were found to selectively inhibit the deubiquitinase activity of PfUCHL3 versus HsUCHL3. NMR spectroscopy revealed that the peptides do not act by binding to the active site but instead block binding of the ubiquitin substrate. We demonstrate that this approach can be used to target essential protein-protein interactions within the Plasmodium ubiquitin pathway, enabling the application of chemically constrained peptides as a novel class of antimalarial therapeutics.


Subject(s)
Peptides , Plasmodium falciparum , Protozoan Proteins , Ubiquitin Thiolesterase , Plasmodium falciparum/enzymology , Plasmodium falciparum/metabolism , Plasmodium falciparum/drug effects , Ubiquitin Thiolesterase/metabolism , Ubiquitin Thiolesterase/antagonists & inhibitors , Ubiquitin Thiolesterase/genetics , Humans , Peptides/chemistry , Peptides/metabolism , Peptides/pharmacology , Protozoan Proteins/metabolism , Protozoan Proteins/chemistry , Protozoan Proteins/genetics , Protozoan Proteins/antagonists & inhibitors , Antimalarials/pharmacology , Antimalarials/chemistry , Ubiquitin/metabolism , Malaria, Falciparum/parasitology , Malaria, Falciparum/drug therapy
3.
Hum Mol Genet ; 33(3): 224-232, 2024 Jan 20.
Article in English | MEDLINE | ID: mdl-37883464

ABSTRACT

BACKGROUND: Mutations within the Von Hippel-Lindau (VHL) tumor suppressor gene are known to cause VHL disease, which is characterized by the formation of cysts and tumors in multiple organs of the body, particularly clear cell renal cell carcinoma (ccRCC). A major challenge in clinical practice is determining tumor risk from a given mutation in the VHL gene. Previous efforts have been hindered by limited available clinical data and technological constraints. METHODS: To overcome this, we initially manually curated the largest set of clinically validated VHL mutations to date, enabling a robust assessment of existing predictive tools on an independent test set. Additionally, we comprehensively characterized the effects of mutations within VHL using in silico biophysical tools describing changes in protein stability, dynamics and affinity to binding partners to provide insights into the structure-phenotype relationship. These descriptive properties were used as molecular features for the construction of a machine learning model, designed to predict the risk of ccRCC development as a result of a VHL missense mutation. RESULTS: Analysis of our model showed an accuracy of 0.81 in the identification of ccRCC-causing missense mutations, and a Matthews correlation coefficient of 0.44 on a non-redundant blind test, a significant improvement over previously available approaches. CONCLUSION: This work highlights the power of using protein 3D structure to fully explore the range of molecular and functional consequences of genomic variants. We believe this optimized model will better enable its clinical implementation and assist guiding patient risk stratification and management.


Subject(s)
Machine Learning , Mutation, Missense , von Hippel-Lindau Disease , Humans , Carcinoma, Renal Cell/genetics , Carcinoma, Renal Cell/metabolism , Kidney Neoplasms/metabolism , Mutation, Missense/genetics , von Hippel-Lindau Disease/genetics , von Hippel-Lindau Disease/pathology , Von Hippel-Lindau Tumor Suppressor Protein/genetics , Von Hippel-Lindau Tumor Suppressor Protein/chemistry , Von Hippel-Lindau Tumor Suppressor Protein/metabolism
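
The abstract above reports both an accuracy and a Matthews correlation coefficient (MCC). As a minimal sketch of how these two metrics relate to a binary confusion matrix, the following Python snippet computes both from raw counts. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical counts (not from the paper), with more positives than negatives.
tp, tn, fp, fn = 60, 20, 12, 8
print(accuracy(tp, tn, fp, fn))
print(matthews_corrcoef(tp, tn, fp, fn))
```

Because the MCC uses all four cells of the confusion matrix, it tends to be lower than accuracy on imbalanced data, which is one reason the two are often reported together.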
4.
Nucleic Acids Res ; 52(W1): W207-W214, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38783112

ABSTRACT

Protein-protein interactions (PPIs) play a vital role in cellular functions and are essential for therapeutic development and understanding diseases. However, current predictive tools often struggle to balance efficiency and precision in predicting the effects of mutations on these complex interactions. To address this, we present DDMut-PPI, a deep learning model that efficiently and accurately predicts changes in PPI binding free energy upon single and multiple point mutations. Building on the robust Siamese network architecture with graph-based signatures from our prior work, DDMut, the DDMut-PPI model was enhanced with a graph convolutional network operated on the protein interaction interface. We used residue-specific embeddings from ProtT5 protein language model as node features, and a variety of molecular interactions as edge features. By integrating evolutionary context with spatial information, this framework enables DDMut-PPI to achieve a robust Pearson correlation of up to 0.75 (root mean squared error: 1.33 kcal/mol) in our evaluations, outperforming most existing methods. Importantly, the model demonstrated consistent performance across mutations that increase or decrease binding affinity. DDMut-PPI offers a significant advancement in the field and will serve as a valuable tool for researchers probing the complexities of protein interactions. DDMut-PPI is freely available as a web server and an application programming interface at https://biosig.lab.uq.edu.au/ddmut_ppi.


Subject(s)
Deep Learning , Protein Interaction Mapping , Protein Interaction Mapping/methods , Protein Binding , Mutation , Software , Protein Interaction Maps/genetics , Humans , Proteins/genetics , Proteins/metabolism , Proteins/chemistry , Point Mutation
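
The performance figures in the abstract above are a Pearson correlation and a root mean squared error (RMSE) between predicted and experimental changes in binding free energy. The sketch below shows how both are computed from paired values. The function names and the example numbers are hypothetical, and the snippet does not call the DDMut-PPI web server or API.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def rmse(xs, ys):
    """Root mean squared error, in the same units as the inputs."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical experimental and predicted values in kcal/mol.
experimental = [-1.2, 0.4, 2.1, -0.3, 1.5]
predicted = [-0.8, 0.1, 1.6, 0.2, 1.9]
print(pearson_r(experimental, predicted))
print(rmse(experimental, predicted))
```

The two metrics are complementary: correlation measures whether predictions rank mutations correctly, while RMSE measures how far they are from the measured values in absolute terms.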
5.
Nucleic Acids Res ; 52(W1): W469-W475, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38634808

ABSTRACT

Evaluating pharmacokinetic properties of small molecules is considered a key feature in most drug development and high-throughput screening processes. Generally, pharmacokinetics, which represent the fate of drugs in the human body, are described from four perspectives: absorption, distribution, metabolism and excretion, all of which are closely related to a fifth perspective, toxicity (ADMET). Since obtaining ADMET data from in vitro, in vivo or pre-clinical stages is time consuming and expensive, many efforts have been made to predict ADMET properties via computational approaches. However, the majority of available methods are limited in their ability to provide pharmacokinetics and toxicity for diverse targets, ensure good overall accuracy, and offer ease of use, interpretability and extensibility for further optimizations. Here, we introduce Deep-PK, a deep learning-based pharmacokinetic and toxicity prediction, analysis and optimization platform. We applied graph neural networks and graph-based signatures as a graph-level feature to yield the best predictive performance across 73 endpoints, including 64 ADMET and 9 general properties. With these powerful models, Deep-PK supports molecular optimization and interpretation, aiding users in optimizing and understanding pharmacokinetics and toxicity for given input molecules. Deep-PK is freely available at https://biosig.lab.uq.edu.au/deeppk/.


Subject(s)
Deep Learning , Humans , Pharmacokinetics , Neural Networks, Computer , Drug-Related Side Effects and Adverse Reactions , Small Molecule Libraries/pharmacokinetics , Small Molecule Libraries/toxicity
6.
Am J Hum Genet ; 109(12): 2253-2269, 2022 12 01.
Article in English | MEDLINE | ID: mdl-36413998

ABSTRACT

Heterozygous pathogenic variants in DNM1 cause developmental and epileptic encephalopathy (DEE) as a result of a dominant-negative mechanism impeding vesicular fission. Thus far, pathogenic variants in DNM1 have been studied with a canonical transcript that includes the alternatively spliced exon 10b. However, after performing RNA sequencing in 39 pediatric brain samples, we find the primary transcript expressed in the brain includes the downstream exon 10a instead. Using this information, we evaluated genotype-phenotype correlations of variants affecting exon 10a and identified a cohort of eleven previously unreported individuals. Eight individuals harbor a recurrent de novo splice site variant, c.1197-8G>A (GenBank: NM_001288739.1), which affects exon 10a and leads to DEE consistent with the classical DNM1 phenotype. We find this splice site variant leads to disease through an unexpected dominant-negative mechanism. Functional testing reveals an in-frame upstream splice acceptor causing insertion of two amino acids predicted to impair oligomerization-dependent activity. This is supported by neuropathological samples showing accumulation of enlarged synaptic vesicles adherent to the plasma membrane consistent with impaired vesicular fission. Two additional individuals with missense variants affecting exon 10a, p.Arg399Trp and p.Gly401Asp, had a similar DEE phenotype. In contrast, one individual with a missense variant affecting exon 10b, p.Pro405Leu, which is less expressed in the brain, had a correspondingly less severe presentation. Thus, we implicate variants affecting exon 10a as causing the severe DEE typically associated with DNM1-related disorders. We highlight the importance of considering relevant isoforms for disease-causing variants as well as the possibility of splice site variants acting through a dominant-negative mechanism.


Subject(s)
Brain Diseases , Dynamins , Epileptic Syndromes , Humans , Brain Diseases/genetics , Causality , Dynamins/genetics , Exons/genetics , Heterozygote , Mutation/genetics , Epileptic Syndromes/genetics
7.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37039696

ABSTRACT

The ability to identify B-cell epitopes is an essential step in vaccine design, immunodiagnostic tests and antibody production. Several computational approaches have been proposed to identify, from an antigen protein or peptide sequence, which residues are more likely to be part of an epitope, but have limited performance on relatively homogeneous data sets and lack interpretability, limiting biological insights that could otherwise be obtained. To address these limitations, we have developed epitope1D, an explainable machine learning method capable of accurately identifying linear B-cell epitopes, leveraging two new descriptors: a graph-based signature representation of protein sequences, based on our well-established Cutoff Scanning Matrix algorithm and Organism Ontology information. Our model achieved Areas Under the ROC curve of up to 0.935 on cross-validation and blind tests, demonstrating robust performance. A comprehensive comparison to alternative methods using distinct benchmark data sets was also employed, with our model outperforming state-of-the-art tools. epitope1D represents not only a significant advance in predictive performance, but also allows biologically meaningful features to be combined and used for model interpretation. epitope1D has been made available as a user-friendly web server interface and application programming interface at https://biosig.lab.uq.edu.au/epitope1d/.


Subject(s)
Algorithms , Epitopes, B-Lymphocyte , Amino Acid Sequence , ROC Curve
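
The abstract above summarizes performance as an area under the ROC curve (AUC). One way to read that number is as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes the AUC directly from that definition. The function name, labels and scores are hypothetical and are not related to the epitope1D implementation.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels; scores: predicted scores.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical per-residue labels (1 = epitope) and predicted scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.2, 0.3, 0.1]
print(roc_auc(labels, scores))
```

This pairwise form is quadratic in the number of examples, so it suits small illustrations; rank-based implementations give the same value more efficiently on large data sets.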
8.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-38018912

ABSTRACT

Dysfunctions caused by missense mutations in the tumour suppressor p53 have been extensively shown to be a leading driver of many cancers. Unfortunately, it is time-consuming and labour-intensive to experimentally elucidate the effects of all possible missense variants. Recent works presented a comprehensive dataset and machine learning model to predict the functional outcome of mutations in p53. Despite the well-established dataset and precise predictions, this tool was trained on a complicated model with limited predictions on p53 mutations. In this work, we first used computational biophysical tools to investigate the functional consequences of missense mutations in p53, informing a bias of deleterious mutations with destabilizing effects. Combining these insights with experimental assays, we present two interpretable machine learning models leveraging both experimental assays and in silico biophysical measurements to accurately predict the functional consequences on p53 and validate their robustness on clinical data. Our final model based on nine features obtained comparable predictive performance with the state-of-the-art p53 specific method and outperformed other generalized, widely used predictors. Interpreting our models revealed that information on residue p53 activity, polar atom distances and changes in p53 stability were instrumental in the decisions, consistent with a bias of the properties of deleterious mutations. Our predictions have been computed for all possible missense mutations in p53, offering clinical diagnostic utility, which is crucial for patient monitoring and the development of personalized cancer treatment.


Subject(s)
Mutation, Missense , Neoplasms , Humans , Tumor Suppressor Protein p53/genetics , Mutation , Neoplasms/genetics , Machine Learning
9.
Nucleic Acids Res ; 51(W1): W122-W128, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37283042

ABSTRACT

Understanding the effects of mutations on protein stability is crucial for variant interpretation and prioritisation, protein engineering, and biotechnology. Despite significant efforts, community assessments of predictive tools have highlighted ongoing limitations, including computational time, low predictive power, and biased predictions towards destabilising mutations. To fill this gap, we developed DDMut, a fast and accurate siamese network to predict changes in Gibbs Free Energy upon single and multiple point mutations, leveraging both forward and hypothetical reverse mutations to account for model anti-symmetry. Deep learning models were built by integrating graph-based representations of the localised 3D environment, with convolutional layers and transformer encoders. This combination better captured the distance patterns between atoms by extracting both short-range and long-range interactions. DDMut achieved Pearson's correlations of up to 0.70 (RMSE: 1.37 kcal/mol) on single point mutations, and 0.70 (RMSE: 1.84 kcal/mol) on double/triple mutants, outperforming most available methods across non-redundant blind test sets. Importantly, DDMut was highly scalable and demonstrated anti-symmetric performance on both destabilising and stabilising mutations. We believe DDMut will be a useful platform to better understand the functional consequences of mutations, and guide rational protein engineering. DDMut is freely available as a web server and API at https://biosig.lab.uq.edu.au/ddmut.


Subject(s)
Deep Learning , Protein Stability , Proteins , Software , Mutation , Point Mutation , Proteins/chemistry , Proteins/genetics
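
The abstract above refers to anti-symmetry between forward and hypothetical reverse mutations: thermodynamically, the stability change of a reverse mutation should be the negative of the forward one. A common way to check this for any predictor is to average the sum of forward and reverse predictions, which should be close to zero. The sketch below shows that check. The function name and the prediction values are hypothetical, and this is not necessarily the evaluation procedure the DDMut authors used.

```python
def antisymmetry_bias(forward, reverse):
    """Mean of (forward + reverse) / 2 over paired predictions.

    A perfectly anti-symmetric predictor returns reverse == -forward for
    every pair, so the bias is 0.0. A positive or negative value indicates
    a systematic shift toward one sign.
    """
    if len(forward) != len(reverse):
        raise ValueError("forward and reverse must be paired one-to-one")
    return sum(f + r for f, r in zip(forward, reverse)) / (2 * len(forward))


# Hypothetical predictions in kcal/mol for three mutation pairs.
forward_pred = [-1.4, 0.6, -2.0]
reverse_pred = [1.1, -0.2, 1.5]
print(antisymmetry_bias(forward_pred, reverse_pred))
```

A bias near zero alone is not sufficient, because errors of opposite sign can cancel; it is usually read alongside the correlation between forward and reverse predictions, which should be close to -1.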
10.
Proc Natl Acad Sci U S A ; 119(4)2022 01 25.
Article in English | MEDLINE | ID: mdl-35074914

ABSTRACT

Catabolism of sulfoquinovose (SQ; 6-deoxy-6-sulfoglucose), the ubiquitous sulfosugar produced by photosynthetic organisms, is an important component of the biogeochemical carbon and sulfur cycles. Here, we describe a pathway for SQ degradation that involves oxidative desulfurization to release sulfite and enable utilization of the entire carbon skeleton of the sugar to support the growth of the plant pathogen Agrobacterium tumefaciens. SQ or its glycoside sulfoquinovosyl glycerol are imported into the cell by an ATP-binding cassette transporter system with an associated SQ binding protein. A sulfoquinovosidase hydrolyzes the SQ glycoside and the liberated SQ is acted on by a flavin mononucleotide-dependent sulfoquinovose monooxygenase, in concert with an NADH-dependent flavin reductase, to release sulfite and 6-oxo-glucose. An NAD(P)H-dependent oxidoreductase reduces the 6-oxo-glucose to glucose, enabling entry into primary metabolic pathways. Structural and biochemical studies provide detailed insights into the recognition of key metabolites by proteins in this pathway. Bioinformatic analyses reveal that the sulfoquinovose monooxygenase pathway is distributed across Alpha- and Betaproteobacteria and is especially prevalent within the Rhizobiales order. This strategy for SQ catabolism is distinct from previously described pathways because it enables the complete utilization of all carbons within SQ by a single organism with concomitant production of inorganic sulfite.


Subject(s)
Bacteria/metabolism , Bacterial Physiological Phenomena , Metabolic Networks and Pathways , Methylglucosides/metabolism , Oxidative Stress , ATP-Binding Cassette Transporters/chemistry , ATP-Binding Cassette Transporters/genetics , ATP-Binding Cassette Transporters/metabolism , Carbohydrate Metabolism , Gene Expression Regulation, Bacterial , Models, Biological , Models, Molecular , Protein Binding , Protein Conformation , Structure-Activity Relationship , Sulfur/metabolism
11.
Hum Genet ; 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38227011

ABSTRACT

Missense mutations are known contributors to diverse genetic disorders, due to their subtle, single amino acid changes imparted on the resultant protein. Because of this, understanding the impact of these mutations on protein stability and function is crucial for unravelling disease mechanisms and developing targeted therapies. The Critical Assessment of Genome Interpretation (CAGI) provides a valuable platform for benchmarking state-of-the-art computational methods in predicting the impact of disease-related mutations on protein thermodynamics. Here we report the performance of our comprehensive platform of structure-based computational approaches to evaluate mutations impacting protein structure and function on 3 challenges from CAGI6: Calmodulin, MAPK1 and MAPK3. Our stability predictors have achieved correlations of up to 0.74 and AUCs of 1 when predicting changes in ΔΔG for MAPK1 and MAPK3, respectively, and AUC of up to 0.75 in the Calmodulin challenge. Overall, our study highlights the importance of structure-based approaches in understanding the effects of missense mutations on protein thermodynamics. The results obtained from the CAGI6 challenges contribute to the ongoing efforts to enhance our understanding of disease mechanisms and facilitate the development of personalised medicine approaches.

12.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35656714

ABSTRACT

Proteins are capable of highly specific interactions and are responsible for a wide range of functions, making them attractive in the pursuit of new therapeutic options. Previous studies focusing on overall geometry of protein-protein interfaces, however, concluded that PPI interfaces were generally flat. More recently, this idea has been challenged by their structural and thermodynamic characterisation, suggesting the existence of concave binding sites that are closer in character to traditional small-molecule binding sites, rather than exhibiting complete flatness. Here, we present a large-scale analysis of binding geometry and physicochemical properties of all protein-protein interfaces available in the Protein Data Bank. In this review, we provide a comprehensive overview of the protein-protein interface landscape, including evidence that even for overall larger, more flat interfaces that utilize discontinuous interacting regions, small and potentially druggable pockets are utilized at binding sites.


Subject(s)
Proteins , Binding Sites , Databases, Protein , Protein Binding , Proteins/chemistry
13.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35211724

ABSTRACT

Herbicides have revolutionised weed management, increased crop yields and improved profitability allowing for an increase in worldwide food security. Their widespread use, however, has also led to a rise in resistance and concerns about their environmental impact. Despite the need for potent and safe herbicidal molecules, no herbicide with a new mode of action has reached the market in 30 years. Although development of computational approaches has proven invaluable to guide rational drug discovery pipelines, leading to higher hit rates and lower attrition due to poor toxicity, little has been done in contrast for herbicide design. To fill this gap, we have developed cropCSM, a computational platform to help identify new, potent, nontoxic and environmentally safe herbicides. By using a knowledge-based approach, we identified physicochemical properties and substructures enriched in safe herbicides. By representing the small molecules as a graph, we leveraged these insights to guide the development of predictive models trained and tested on the largest collected data set of molecules with experimentally characterised herbicidal profiles to date (over 4500 compounds). In addition, we developed six new environmental and human toxicity predictors, spanning five different species to assist in molecule prioritisation. cropCSM was able to correctly identify 97% of herbicides currently available commercially, while predicting toxicity profiles with accuracies of up to 92%. We believe cropCSM will be an essential tool for the enrichment of screening libraries and to guide the development of potent and safe herbicides. We have made the method freely available through a user-friendly webserver at http://biosig.unimelb.edu.au/crop_csm.


Subject(s)
Herbicides , Drug Discovery , Herbicides/chemistry , Herbicides/toxicity , Humans
14.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35189634

ABSTRACT

Changes in protein sequence can have dramatic effects on how proteins fold, their stability and dynamics. Over the last 20 years, pioneering methods have been developed to try to estimate the effects of missense mutations on protein stability, leveraging growing availability of protein 3D structures. These, however, have been developed and validated using experimentally derived structures and biophysical measurements. A large proportion of protein structures remain to be experimentally elucidated and, while many studies have based their conclusions on predictions made using homology models, there has been no systematic evaluation of the reliability of these tools in the absence of experimental structural data. We have, therefore, systematically investigated the performance and robustness of ten widely used structural methods when presented with homology models built using templates at a range of sequence identity levels (from 15% to 95%) and contrasted performance with sequence-based tools, as a baseline. We found there is indeed performance deterioration on homology models built using templates with sequence identity below 40%, where sequence-based tools might become preferable. This was most marked for mutations in solvent exposed residues and stabilizing mutations. As structure prediction tools improve, the reliability of these predictors is expected to follow; however, we strongly suggest that these factors should be taken into consideration when interpreting results from structure-based predictors of mutation effects on protein stability.


Subject(s)
Computational Biology , Proteins , Computational Biology/methods , Databases, Protein , Mutation , Protein Stability , Proteins/chemistry , Proteins/genetics , Reproducibility of Results
15.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34676398

ABSTRACT

The ability to identify antigenic determinants of pathogens, or epitopes, is fundamental to guide rational vaccine development and immunotherapies, which are particularly relevant for rapid pandemic response. A range of computational tools has been developed over the past two decades to assist in epitope prediction; however, they have presented limited performance and generalization, particularly for the identification of conformational B-cell epitopes. Here, we present epitope3D, a novel scalable machine learning method capable of accurately identifying conformational epitopes trained and evaluated on the largest curated epitope data set to date. Our method uses the concept of graph-based signatures to model epitope and non-epitope regions as graphs and extract distance patterns that are used as evidence to train and test predictive models. We show epitope3D outperforms available alternative approaches, achieving Matthews correlation coefficient and F1-scores of 0.55 and 0.57 on cross-validation and 0.45 and 0.36 during independent blind tests, respectively.


Subject(s)
Epitopes, B-Lymphocyte , Support Vector Machine , Computational Biology/methods , Machine Learning , Molecular Conformation
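
The abstract above reports an F1-score, which is the harmonic mean of precision and recall. A minimal sketch of that calculation from confusion-matrix counts follows. The function name and the counts are hypothetical and do not come from the epitope3D evaluation.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from binary counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        # Convention: no true positives at all gives an F1 of 0.0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 6 epitope residues found, 2 false alarms, 4 missed.
print(f1_score(tp=6, fp=2, fn=4))
```

Unlike accuracy, F1 ignores true negatives, which matters for epitope prediction because non-epitope residues typically far outnumber epitope residues.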
16.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34882232

ABSTRACT

Protein-carbohydrate interactions are crucial for many cellular processes but can be challenging to biologically characterise. To improve our understanding and ability to model these molecular interactions, we used a carefully curated set of 370 protein-carbohydrate complexes with experimental structural and biophysical data in order to train and validate a new tool, cutoff scanning matrix (CSM)-carbohydrate, using machine learning algorithms to accurately predict their binding affinity and rank docking poses as a scoring function. Information on both protein and carbohydrate complementarity, in terms of shape and chemistry, was captured using graph-based structural signatures. Across both training and independent test sets, we achieved comparable Pearson's correlations of 0.72 under cross-validation [root mean square error (RMSE) of 1.58 kcal/mol] and 0.67 on the independent test (RMSE of 1.72 kcal/mol), providing confidence in the generalisability and robustness of the final model. Similar performance was obtained across mono-, di- and oligosaccharides, further highlighting the applicability of this approach to the study of larger complexes. We show CSM-carbohydrate significantly outperformed previous approaches and have implemented our method and make all data freely available through both a user-friendly web interface and application programming interface, to facilitate programmatic access at http://biosig.unimelb.edu.au/csm_carbohydrate/. We believe CSM-carbohydrate will be an invaluable tool for helping assess docking poses and the effects of mutations on protein-carbohydrate affinity, unravelling important aspects that drive binding recognition.


Subject(s)
Proteins , Software , Carbohydrates , Ligands , Machine Learning , Molecular Docking Simulation , Protein Binding , Proteins/chemistry
17.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35724625

ABSTRACT

The rate of biological data generation has increased dramatically in recent years, which has driven the importance of databases as a resource to guide innovation and the generation of biological insights. Given the complexity and scale of these databases, automatic data classification is often required. Biological data sets are often hierarchical in nature, with varying degrees of complexity, imposing different challenges to train, test and validate accurate and generalizable classification models. While some approaches to classify hierarchical data have been proposed, no guidelines regarding their utility, applicability and limitations have been explored or implemented. These include 'Local' approaches considering the hierarchy, building models per level or node, and 'Global' hierarchical classification, using a flat classification approach. To fill this gap, here we have systematically contrasted the performance of 'Local per Level' and 'Local per Node' approaches with a 'Global' approach applied to two different hierarchical datasets: BioLip and CATH. The results show how different components of hierarchical data sets, such as variation coefficient and prediction by depth, can guide the choice of appropriate classification schemes. Finally, we provide guidelines to support this process when embarking on a hierarchical classification task, which will help optimize computational resources and predictive performance.


Subject(s)
Deep Learning , Algorithms , Databases, Factual
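
The abstract above contrasts 'Local per Level', 'Local per Node' and 'Global' schemes for hierarchical classification. As a rough sketch of how the training targets differ between them, the snippet below derives the three kinds of targets from dot-separated hierarchical labels. The helper names are hypothetical, the labels are CATH-style codes used only as illustrative strings, and the mapping to the three schemes is an assumption based on the abstract's wording rather than the authors' code.

```python
def per_level_targets(label):
    """Split 'a.b.c.d' into cumulative prefixes, one target per depth."""
    parts = label.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


def per_node_targets(labels):
    """One binary target vector per node: 1 if the example lies under it."""
    nodes = sorted({p for label in labels for p in per_level_targets(label)})
    return {
        node: [int(node in per_level_targets(label)) for label in labels]
        for node in nodes
    }


# Illustrative CATH-style labels (class.architecture.topology.superfamily).
labels = ["1.10.8.10", "1.10.8.20", "3.40.50.300"]

# 'Global' as described in the abstract: one flat classifier over full labels.
global_targets = labels

# 'Local per Level': one multi-class target column per depth of the hierarchy.
local_per_level = list(zip(*(per_level_targets(label) for label in labels)))

# 'Local per Node': one binary classifier per node in the hierarchy.
local_per_node = per_node_targets(labels)

print(local_per_level[1])
print(local_per_node["1.10"])
```

The number of models grows from one (global) to one per depth (per level) to one per node, which is the resource trade-off the abstract's guidelines are meant to help with.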
18.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35998885

ABSTRACT

Drug discovery is a lengthy, costly and high-risk endeavour that is further convoluted by high attrition rates in later development stages. Toxicity has been one of the main causes of failure during clinical trials, increasing drug development time and costs. To facilitate early identification and optimisation of toxicity profiles, several computational tools emerged aiming at improving success rates by timely pre-screening drug candidates. Despite these efforts, there is an increasing demand for platforms capable of assessing both environmental as well as human-based toxicity properties at large scale. Here, we present toxCSM, a comprehensive computational platform for the study and optimisation of toxicity profiles of small molecules. toxCSM leverages the well-established concepts of graph-based signatures, molecular descriptors and similarity scores to develop 36 models for predicting a range of toxicity properties, which can assist in developing safer drugs and agrochemicals. toxCSM achieved an Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) of up to 0.99 and Pearson's correlation coefficients of up to 0.94 on 10-fold cross-validation, with comparable performance on blind test sets, outperforming all alternative methods. toxCSM is freely available as a user-friendly web server and API at http://biosig.lab.uq.edu.au/toxcsm.


Subject(s)
Agrochemicals , Drug Discovery , Drug Discovery/methods , Humans , ROC Curve
19.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35595534

ABSTRACT

Metals are present in >30% of proteins found in nature and assist them to perform important biological functions, including storage, transport, signal transduction and enzymatic activity. Traditional and experimental techniques for metal-binding site prediction are usually costly and time-consuming, making computational tools that can assist in these predictions of significant importance. Here we present Genetic Active Site Search (GASS)-Metal, a new method for protein metal-binding site prediction. The method relies on a parallel genetic algorithm to find candidate metal-binding sites that are structurally similar to curated templates from M-CSA and MetalPDB. GASS-Metal was thoroughly validated using homologous proteins and conservative mutations of residues, showing a robust performance. The ability of GASS-Metal to identify metal-binding sites was also compared with state-of-the-art methods, outperforming similar methods and achieving an MCC of up to 0.57 and detecting up to 96.1% of the sites correctly. GASS-Metal is freely available at https://gassmetal.unifei.edu.br. The GASS-Metal source code is available at https://github.com/sandroizidoro/gassmetal-local.


Subject(s)
Proteins , Software , Algorithms , Binding Sites , Catalytic Domain , Metals/chemistry , Metals/metabolism , Proteins/chemistry
20.
Bioinformatics ; 39(7)2023 07 01.
Article in English | MEDLINE | ID: mdl-37382557

ABSTRACT

MOTIVATION: While antibodies have been ground-breaking therapeutic agents, the structural determinants for antibody binding specificity remain to be fully elucidated, which is compounded by the virtually unlimited repertoire of antigens they can recognize. Here, we have explored the structural landscapes of antibody-antigen interfaces to identify the structural determinants driving target recognition by assessing concavity and interatomic interactions. RESULTS: We found that complementarity-determining regions utilized deeper concavity with their longer H3 loops, with the H3 loops of nanobodies showing the deepest use of concavity. Of all amino acid residues found in complementarity-determining regions, tryptophan used deeper concavity, especially in nanobodies, making it suitable for leveraging concave antigen surfaces. Similarly, antigens utilized arginine to bind to deeper pockets of the antibody surface. Our findings fill a gap in knowledge about antibody specificity, binding affinity, and the nature of antibody-antigen interface features, which will lead to a better understanding of how antibodies can be more effective to target druggable sites on antigen surfaces. AVAILABILITY AND IMPLEMENTATION: The data and scripts are available at: https://github.com/YoochanMyung/scripts.


Subject(s)
Antibodies , Complementarity Determining Regions , Complementarity Determining Regions/chemistry , Antibodies/chemistry , Antigens , Antibody Specificity , Binding Sites, Antibody