Results 1 - 20 of 121
1.
Protein Sci ; 33(6): e5000, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38747401

ABSTRACT

G protein-coupled receptors (GPCRs) are one of the most important families of targets for drug discovery. One of the limiting steps in the study of GPCRs has been their stability, with significant and time-consuming protein engineering often used to stabilize GPCRs for structural characterization and drug screening. Unfortunately, computational methods developed using globular soluble proteins have translated poorly to the rational engineering of GPCRs. To fill this gap, we propose GPCR-tm, a novel and personalized structurally driven web-based machine learning tool to study the impacts of mutations on GPCR stability. We show that GPCR-tm performs as well as or better than alternative methods, and that it can accurately rank the stability changes of a wide range of mutations occurring in various types of class A GPCRs. GPCR-tm achieved Pearson's correlation coefficients of 0.74 and 0.46 on 10-fold cross-validation and blind test sets, respectively. We observed that the (structural) graph-based signatures were the most important set of features for predicting destabilizing mutations, which points out that these signatures properly describe the changes in the environment where the mutations occur. More specifically, GPCR-tm was able to accurately rank mutations based on their effect on protein stability, guiding their rational stabilization. GPCR-tm is available through a user-friendly web server at https://biosig.lab.uq.edu.au/gpcr_tm/.


Subject(s)
Protein Engineering; Protein Stability; Receptors, G-Protein-Coupled; Receptors, G-Protein-Coupled/chemistry; Receptors, G-Protein-Coupled/genetics; Receptors, G-Protein-Coupled/metabolism; Protein Engineering/methods; Humans; Machine Learning; Mutation; Software; Models, Molecular
2.
Article in English | MEDLINE | ID: mdl-38180643

ABSTRACT

Glycoside hydrolases (GHs) are a diverse group of enzymes that catalyze the hydrolysis of glycosidic bonds. The Carbohydrate-Active enZymes (CAZy) classification organizes GHs into families based on sequence data and function, with fewer than 1% of the predicted proteins characterized biochemically. Consideration of genomic context can provide clues to infer possible enzyme activities for proteins of unknown function. We used the MultiGeneBLAST tool to discover a gene cluster in Marinovum sp., a member of the marine Roseobacter clade, that encodes homologues of enzymes belonging to the sulfoquinovose monooxygenase pathway for sulfosugar catabolism. This cluster lacks a gene encoding a classical family GH31 sulfoquinovosidase candidate but instead includes an uncharacterized family GH13 protein (MsGH13) that we hypothesized could be a non-classical sulfoquinovosidase. Surprisingly, recombinant MsGH13 lacks sulfoquinovosidase activity and is a broad-spectrum α-glucosidase that is active on a diverse array of α-linked disaccharides, including maltose, sucrose, nigerose, trehalose, isomaltose, and kojibiose. Using AlphaFold, a 3D model for the MsGH13 enzyme was constructed that predicted its active site shared close similarity with an α-glucosidase from Halomonas sp. H11 of the same GH13 subfamily that shows narrower substrate specificity.

3.
Curr Opin Pharmacol ; 74: 102427, 2024 02.
Article in English | MEDLINE | ID: mdl-38219398

ABSTRACT

This article investigates the role of recent advances in Artificial Intelligence (AI) to revolutionise the study of G protein-coupled receptors (GPCRs). AI has been applied to many areas of GPCR research, including the application of machine learning (ML) in GPCR classification, prediction of GPCR activation levels, modelling GPCR 3D structures and interactions, understanding G-protein selectivity, aiding elucidation of GPCR structures, and drug design. Despite progress, challenges in predicting GPCR structures and addressing the complex nature of GPCRs remain, providing avenues for future research and development.


Subject(s)
Artificial Intelligence; Receptors, G-Protein-Coupled; Humans; Receptors, G-Protein-Coupled/chemistry; Machine Learning
4.
Stud Health Technol Inform ; 310: 1241-1245, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38270013

ABSTRACT

The Learning Health Systems (LHS) framework demonstrates the potential for iterative interrogation of health data in real time and implementation of insights into practice. Yet the lack of an appropriately skilled workforce results in an inability to leverage existing data to design innovative solutions. We developed a tailored professional development program to foster a skilled workforce. The short course is wholly online, for interdisciplinary professionals working in the digital health arena. To transform healthcare systems, the workforce needs an understanding of LHS principles, data-driven approaches, and the need for diversely skilled learning communities that can tackle these complex problems together.


Subject(s)
Learning Health System; Digital Health; Interdisciplinary Studies; Learning; Workforce
5.
Hum Mol Genet ; 33(3): 224-232, 2024 Jan 20.
Article in English | MEDLINE | ID: mdl-37883464

ABSTRACT

BACKGROUND: Mutations within the Von Hippel-Lindau (VHL) tumor suppressor gene are known to cause VHL disease, which is characterized by the formation of cysts and tumors in multiple organs of the body, particularly clear cell renal cell carcinoma (ccRCC). A major challenge in clinical practice is determining tumor risk from a given mutation in the VHL gene. Previous efforts have been hindered by limited available clinical data and technological constraints. METHODS: To overcome this, we initially manually curated the largest set of clinically validated VHL mutations to date, enabling a robust assessment of existing predictive tools on an independent test set. Additionally, we comprehensively characterized the effects of mutations within VHL using in silico biophysical tools describing changes in protein stability, dynamics and affinity to binding partners to provide insights into the structure-phenotype relationship. These descriptive properties were used as molecular features for the construction of a machine learning model, designed to predict the risk of ccRCC development as a result of a VHL missense mutation. RESULTS: Analysis of our model showed an accuracy of 0.81 in the identification of ccRCC-causing missense mutations, and a Matthews correlation coefficient of 0.44 on a non-redundant blind test, a significant improvement in comparison to previously available approaches. CONCLUSION: This work highlights the power of using protein 3D structure to fully explore the range of molecular and functional consequences of genomic variants. We believe this optimized model will better enable its clinical implementation and assist in guiding patient risk stratification and management.


Subject(s)
Machine Learning; Mutation, Missense; von Hippel-Lindau Disease; Humans; Carcinoma, Renal Cell/genetics; Carcinoma, Renal Cell/metabolism; Kidney Neoplasms/metabolism; Mutation, Missense/genetics; von Hippel-Lindau Disease/genetics; von Hippel-Lindau Disease/pathology; Von Hippel-Lindau Tumor Suppressor Protein/genetics; Von Hippel-Lindau Tumor Suppressor Protein/chemistry; Von Hippel-Lindau Tumor Suppressor Protein/metabolism
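
The VHL record above reports both an accuracy and a Matthews correlation coefficient (MCC). As a minimal sketch of how the two metrics relate, the Python below computes both from binary labels. The function names and the toy labels are illustrative assumptions and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy data: an imbalanced set where every sample is predicted positive.
y_true = [1] * 9 + [0]
y_pred = [1] * 10
counts = confusion_counts(y_true, y_pred)
print(accuracy(*counts), mcc(*counts))
```

On this toy set the accuracy is high while the MCC is zero, which is one reason a classifier evaluation may report both figures.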
6.
J Biomed Inform ; 147: 104506, 2023 11.
Article in English | MEDLINE | ID: mdl-37769829

ABSTRACT

INTRODUCTION: Adequate methods to promptly translate digital health innovations for improved patient care are essential. Advances in Artificial Intelligence (AI) and Machine Learning (ML) have been sources of digital innovation and hold the promise to revolutionize the way we treat, manage and diagnose patients. Understanding the benefits but also the potential adverse effects of digital health innovations, particularly when these are made available or applied to healthier segments of the population, is essential. One such adverse effect is overdiagnosis. OBJECTIVE: To comprehensively analyze quantification strategies and data-driven definitions for overdiagnosis reported in the literature. METHODS: We conducted a scoping systematic review of manuscripts describing quantitative methods to estimate the proportion of overdiagnosed patients. RESULTS: We identified 46 studies that met our inclusion criteria. They covered a variety of clinical conditions, primarily breast and prostate cancer. Methods to quantify overdiagnosis included both prospective and retrospective methods, including randomized clinical trials and simulations. CONCLUSION: A variety of methods to quantify overdiagnosis have been published, producing widely diverging results. A standard method to quantify overdiagnosis is needed to allow its mitigation during the rapidly increasing development of new digital diagnostic tools.


Subject(s)
Artificial Intelligence; Prostatic Neoplasms; Male; Humans; Retrospective Studies; Overdiagnosis; Prospective Studies; Prostatic Neoplasms/diagnosis
7.
Bioinformatics ; 39(7)2023 07 01.
Article in English | MEDLINE | ID: mdl-37382557

ABSTRACT

MOTIVATION: While antibodies have been ground-breaking therapeutic agents, the structural determinants for antibody binding specificity remain to be fully elucidated, which is compounded by the virtually unlimited repertoire of antigens they can recognize. Here, we have explored the structural landscapes of antibody-antigen interfaces to identify the structural determinants driving target recognition by assessing concavity and interatomic interactions. RESULTS: We found that complementarity-determining regions utilized deeper concavity through their longer H3 loops, with nanobody H3 loops showing the deepest use of concavity. Of all amino acid residues found in complementarity-determining regions, tryptophan used deeper concavity, especially in nanobodies, making it suitable for leveraging concave antigen surfaces. Similarly, antigens utilized arginine to bind to deeper pockets of the antibody surface. Our findings fill a gap in knowledge about antibody specificity, binding affinity, and the nature of antibody-antigen interface features, which will lead to a better understanding of how antibodies can more effectively target druggable sites on antigen surfaces. AVAILABILITY AND IMPLEMENTATION: The data and scripts are available at: https://github.com/YoochanMyung/scripts.


Subject(s)
Antibodies; Complementarity Determining Regions; Complementarity Determining Regions/chemistry; Antibodies/chemistry; Antigens; Antibody Specificity; Binding Sites, Antibody
8.
Bioinformatics ; 39(7)2023 07 01.
Article in English | MEDLINE | ID: mdl-37382560

ABSTRACT

MOTIVATION: With the development of sequencing techniques, the discovery of new proteins significantly exceeds the human capacity and resources for experimentally characterizing protein functions. Localization, EC numbers, and GO terms with the structure-based Cutoff Scanning Matrix (LEGO-CSM) is a comprehensive web-based resource that fills this gap by coupling well-established and robust graph-based signatures with supervised learning models, using both protein sequence and structure information to accurately model protein function in terms of Subcellular Localization, Enzyme Commission (EC) numbers, and Gene Ontology (GO) terms. RESULTS: We show our models perform as well as or better than alternative approaches, achieving area under the receiver operating characteristic curve of up to 0.93 for subcellular localization, up to 0.93 for EC, and up to 0.81 for GO terms on independent blind tests. AVAILABILITY AND IMPLEMENTATION: LEGO-CSM's web server is freely available at https://biosig.lab.uq.edu.au/lego_csm. In addition, all datasets used to train and test LEGO-CSM's models can be downloaded at https://biosig.lab.uq.edu.au/lego_csm/data.


Subject(s)
Proteins; Software; Humans; Proteins/chemistry
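
Several records in this listing, including LEGO-CSM above, refer to graph-based signatures built on the Cutoff Scanning Matrix idea. The Python below is a heavily simplified sketch of that idea, not the published implementation: it counts atom pairs that fall within a series of increasing distance cutoffs, producing a cumulative distance-pattern vector. The function name, the cutoff range and step, and the toy coordinates are all assumptions made for illustration. Published variants are understood to also split the counts by atom or residue class, which is omitted here.

```python
import math
from itertools import combinations


def cutoff_scanning_signature(coords, d_min=2.0, d_max=10.0, step=2.0):
    """Return cumulative counts of point pairs within each distance cutoff.

    coords: iterable of (x, y, z) tuples (hypothetical atom positions, in angstroms).
    The cutoff range and step are illustrative defaults, not values from any paper.
    """
    # All pairwise Euclidean distances between the points.
    distances = [math.dist(a, b) for a, b in combinations(coords, 2)]
    # Build the cutoffs by index to avoid floating-point drift from repeated addition.
    n_steps = int(round((d_max - d_min) / step))
    cutoffs = [d_min + i * step for i in range(n_steps + 1)]
    # One feature per cutoff: how many pairs are at least this close.
    return [sum(1 for d in distances if d <= c) for c in cutoffs]


# Toy example: three points forming a 3-4-5 right triangle.
toy_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
print(cutoff_scanning_signature(toy_coords))
```

The resulting fixed-length vector can be fed to any supervised learner, which is the general role such signatures play in the tools listed here.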
9.
Nucleic Acids Res ; 51(W1): W122-W128, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37283042

ABSTRACT

Understanding the effects of mutations on protein stability is crucial for variant interpretation and prioritisation, protein engineering, and biotechnology. Despite significant efforts, community assessments of predictive tools have highlighted ongoing limitations, including computational time, low predictive power, and biased predictions towards destabilising mutations. To fill this gap, we developed DDMut, a fast and accurate siamese network to predict changes in Gibbs Free Energy upon single and multiple point mutations, leveraging both forward and hypothetical reverse mutations to account for model anti-symmetry. Deep learning models were built by integrating graph-based representations of the localised 3D environment, with convolutional layers and transformer encoders. This combination better captured the distance patterns between atoms by extracting both short-range and long-range interactions. DDMut achieved Pearson's correlations of up to 0.70 (RMSE: 1.37 kcal/mol) on single point mutations, and 0.70 (RMSE: 1.84 kcal/mol) on double/triple mutants, outperforming most available methods across non-redundant blind test sets. Importantly, DDMut was highly scalable and demonstrated anti-symmetric performance on both destabilising and stabilising mutations. We believe DDMut will be a useful platform to better understand the functional consequences of mutations, and guide rational protein engineering. DDMut is freely available as a web server and API at https://biosig.lab.uq.edu.au/ddmut.


Subject(s)
Deep Learning; Protein Stability; Proteins; Software; Mutation; Point Mutation; Proteins/chemistry; Proteins/genetics
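
The DDMut record above summarises performance with Pearson's correlation, RMSE, and anti-symmetry between forward and hypothetical reverse mutations. The Python below is a minimal sketch of how such figures could be computed from paired predictions. The function names and the example values are hypothetical, and the anti-symmetry check assumes the common convention that a perfectly anti-symmetric predictor gives a reverse-mutation value equal to the negative of the forward value.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rmse(predicted, observed):
    """Root-mean-square error, in the same units as the inputs (e.g. kcal/mol)."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted))


def antisymmetry_bias(forward, reverse):
    """Mean of forward + reverse predictions; 0.0 for a perfectly anti-symmetric model."""
    return sum(f + r for f, r in zip(forward, reverse)) / len(forward)


# Hypothetical values: experimental and predicted free-energy changes for four mutations.
experimental = [-1.2, 0.4, -2.5, 0.9]
predicted_forward = [-1.0, 0.1, -2.0, 1.1]
predicted_reverse = [1.1, -0.2, 2.1, -0.9]

print(pearson(predicted_forward, experimental))
print(rmse(predicted_forward, experimental))
print(antisymmetry_bias(predicted_forward, predicted_reverse))
```

A bias close to zero, together with a strongly negative correlation between forward and reverse predictions, is what one would look for when checking anti-symmetry under these assumptions.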
10.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37039696

ABSTRACT

The ability to identify B-cell epitopes is an essential step in vaccine design, immunodiagnostic tests and antibody production. Several computational approaches have been proposed to identify, from an antigen protein or peptide sequence, which residues are more likely to be part of an epitope, but have limited performance on relatively homogeneous data sets and lack interpretability, limiting biological insights that could otherwise be obtained. To address these limitations, we have developed epitope1D, an explainable machine learning method capable of accurately identifying linear B-cell epitopes, leveraging two new descriptors: a graph-based signature representation of protein sequences, based on our well-established Cutoff Scanning Matrix algorithm, and Organism Ontology information. Our model achieved areas under the ROC curve of up to 0.935 on cross-validation and blind tests, demonstrating robust performance. A comprehensive comparison to alternative methods using distinct benchmark data sets was also employed, with our model outperforming state-of-the-art tools. epitope1D represents not only a significant advance in predictive performance, but also allows biologically meaningful features to be combined and used for model interpretation. epitope1D has been made available as a user-friendly web server interface and application programming interface at https://biosig.lab.uq.edu.au/epitope1d/.


Subject(s)
Algorithms; Epitopes, B-Lymphocyte; Amino Acid Sequence; ROC Curve
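
Many records in this listing, including epitope1D above, report an area under the ROC curve (AUC). As a minimal sketch of what that number measures, the Python below computes AUC as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name and the example labels and scores are assumptions for illustration only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: sequence of 0/1 class labels; scores: matching classifier scores.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for two epitope (1) and two non-epitope (0) residues.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(example_labels, example_scores))
```

This pairwise form is quadratic in the number of examples, so it is suited to explanation and small test sets rather than large benchmarks.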
11.
J Chem Inf Model ; 63(2): 432-441, 2023 01 23.
Article in English | MEDLINE | ID: mdl-36595441

ABSTRACT

Teratogenic drugs can lead to extreme fetal malformation and consequently critically influence the fetus's health, yet the teratogenic risks associated with most approved drugs are unknown. Here, we propose a novel predictive tool, embryoTox, which utilizes a graph-based signature representation of the chemical structure of a small molecule to predict and classify molecules likely to be safe during pregnancy. embryoTox was trained and validated using in vitro bioactivity data of over 700 small molecules with characterized teratogenicity effects. Our final model achieved an area under the receiver operating characteristic curve (AUC) of up to 0.96 on 10-fold cross-validation and 0.82 on nonredundant blind tests, outperforming alternative approaches. We believe that our predictive tool will provide a practical resource for optimizing screening libraries to determine effective and safe molecules to use during pregnancy. To provide a simple and integrated platform to rapidly screen for potential safe molecules and their risk factors, we made embryoTox freely available online at https://biosig.lab.uq.edu.au/embryotox/.


Subject(s)
Research Design; Pregnancy; Female; Humans; ROC Curve
13.
J Biomed Inform ; 137: 104265, 2023 01.
Article in English | MEDLINE | ID: mdl-36464227

ABSTRACT

The detection of adverse drug reactions (ADRs) is critical to our understanding of the safety and risk-benefit profile of medications. With an incidence that has not changed over the last 30 years, ADRs are a significant source of patient morbidity, responsible for 5%-10% of acute care hospital admissions worldwide. Spontaneous reporting of ADRs has long been the standard method of reporting; however, this approach is known to have high rates of under-reporting, a problem that limits pharmacovigilance efforts. Automated ADR reporting presents an alternative pathway to increase reporting rates, although this may be limited by over-reporting of other drug-related adverse events. We developed a deep learning natural language processing algorithm to identify ADRs in discharge summaries at a single academic hospital centre. Our model was developed in two stages: first, a pre-trained model (DeBERTa) was further pre-trained on 1.1 million unlabelled clinical documents; secondly, this model was fine-tuned to detect ADR mentions in a corpus of 861 annotated discharge summaries. This model was compared to a version without the pre-training step, and a previously published RoBERTa model pretrained on MIMIC III, which has demonstrated strong performance on other pharmacovigilance tasks. To ensure that our algorithm could differentiate ADRs from other drug-related adverse events, the annotated corpus was enriched for both validated ADR reports and confounding drug-related adverse events using. The final model demonstrated good performance with a ROC-AUC of 0.955 (95% CI 0.933 - 0.978) for the task of identifying discharge summaries containing ADR mentions, significantly outperforming the two comparator models.


Subject(s)
Deep Learning; Drug-Related Side Effects and Adverse Reactions; Humans; Natural Language Processing; Adverse Drug Reaction Reporting Systems; Algorithms; Drug-Related Side Effects and Adverse Reactions/epidemiology; Pharmacovigilance
14.
Methods Mol Biol ; 2552: 375-397, 2023.
Article in English | MEDLINE | ID: mdl-36346604

ABSTRACT

Antibodies are essential experimental and diagnostic tools and as biotherapeutics have significantly advanced our ability to treat a range of diseases. With recent innovations in computational tools to guide protein engineering, we can now rationally design better antibodies with improved efficacy, stability, and pharmacokinetics. Here, we describe the use of the mCSM web-based in silico suite, which uses graph-based signatures to rapidly identify the structural and functional consequences of mutations, to guide rational antibody engineering to improve stability, affinity, and specificity.


Subject(s)
Antibodies; Software; Antibodies/genetics; Antibodies/chemistry; Protein Engineering; Mutation; Antibody Affinity/genetics
15.
J Med Genet ; 60(5): 484-490, 2023 05.
Article in English | MEDLINE | ID: mdl-36180205

ABSTRACT

BACKGROUND: Amyotrophic lateral sclerosis (ALS) is a progressively fatal, neurodegenerative disease associated with both motor and non-motor symptoms, including frontotemporal dementia. Approximately 10% of cases are genetically inherited (familial ALS), while the majority are sporadic. Mutations across a wide range of genes have been associated; however, the underlying molecular effects of these mutations and their relation to phenotypes remain poorly explored. METHODS: We initially curated an extensive list (n=1343) of missense mutations identified in the clinical literature, which spanned 111 unique genes. Of these, mutations in genes SOD1, FUS and TDP-43 were analysed using in silico biophysical tools, which characterised changes in protein stability, interactions, localisation and function. The effects of pathogenic and non-pathogenic mutations within these genes were statistically compared to highlight underlying molecular drivers. RESULTS: Compared with previous ALS-dedicated databases, we have curated the most extensive missense mutation database to date and observed a twofold increase in unique implicated genes, and almost a threefold increase in the number of mutations. Our gene-specific analysis identified distinct molecular drivers across the different proteins, where SOD1 mutations primarily reduced protein stability and dimer formation, and those in FUS and TDP-43 were present within disordered regions, suggesting different mechanisms of aggregate formation. CONCLUSION: Using our three genes as case studies, we identified distinct insights which can drive further research to better understand ALS. The information curated in our database can serve as a resource for similar gene-specific analyses, further improving the current understanding of disease, crucial for the development of treatment strategies.


Subject(s)
Amyotrophic Lateral Sclerosis; Neurodegenerative Diseases; Humans; Amyotrophic Lateral Sclerosis/genetics; Amyotrophic Lateral Sclerosis/metabolism; Amyotrophic Lateral Sclerosis/pathology; Mutation, Missense/genetics; Superoxide Dismutase-1/genetics; Mutation
16.
Nat Struct Mol Biol ; 29(11): 1056-1067, 2022 11.
Article in English | MEDLINE | ID: mdl-36344848

ABSTRACT

Most proteins fold into 3D structures that determine how they function and orchestrate the biological processes of the cell. Recent developments in computational methods for protein structure predictions have reached the accuracy of experimentally determined models. Although this has been independently verified, the implementation of these methods across structural-biology applications remains to be tested. Here, we evaluate the use of AlphaFold2 (AF2) predictions in the study of characteristic structural elements; the impact of missense variants; function and ligand binding site predictions; modeling of interactions; and modeling of experimental structural data. For 11 proteomes, an average of 25% additional residues can be confidently modeled when compared with homology modeling, identifying structural features rarely seen in the Protein Data Bank. AF2-based predictions of protein disorder and complexes surpass dedicated tools, and AF2 models can be used across diverse applications equally well compared with experimentally determined structures, when the confidence metrics are critically considered. In summary, we find that these advances are likely to have a transformative impact in structural biology and broader life-science research.


Subject(s)
Computational Biology; Furylfuramide; Computational Biology/methods; Binding Sites; Proteins/chemistry; Databases, Protein; Protein Conformation
17.
Am J Hum Genet ; 109(12): 2253-2269, 2022 12 01.
Article in English | MEDLINE | ID: mdl-36413998

ABSTRACT

Heterozygous pathogenic variants in DNM1 cause developmental and epileptic encephalopathy (DEE) as a result of a dominant-negative mechanism impeding vesicular fission. Thus far, pathogenic variants in DNM1 have been studied with a canonical transcript that includes the alternatively spliced exon 10b. However, after performing RNA sequencing in 39 pediatric brain samples, we find the primary transcript expressed in the brain includes the downstream exon 10a instead. Using this information, we evaluated genotype-phenotype correlations of variants affecting exon 10a and identified a cohort of eleven previously unreported individuals. Eight individuals harbor a recurrent de novo splice site variant, c.1197-8G>A (GenBank: NM_001288739.1), which affects exon 10a and leads to DEE consistent with the classical DNM1 phenotype. We find this splice site variant leads to disease through an unexpected dominant-negative mechanism. Functional testing reveals an in-frame upstream splice acceptor causing insertion of two amino acids predicted to impair oligomerization-dependent activity. This is supported by neuropathological samples showing accumulation of enlarged synaptic vesicles adherent to the plasma membrane consistent with impaired vesicular fission. Two additional individuals with missense variants affecting exon 10a, p.Arg399Trp and p.Gly401Asp, had a similar DEE phenotype. In contrast, one individual with a missense variant affecting exon 10b, p.Pro405Leu, which is less expressed in the brain, had a correspondingly less severe presentation. Thus, we implicate variants affecting exon 10a as causing the severe DEE typically associated with DNM1-related disorders. We highlight the importance of considering relevant isoforms for disease-causing variants as well as the possibility of splice site variants acting through a dominant-negative mechanism.


Subject(s)
Brain Diseases; Dynamins; Epileptic Syndromes; Humans; Brain Diseases/genetics; Causality; Dynamins/genetics; Exons/genetics; Heterozygote; Mutation/genetics; Epileptic Syndromes/genetics
18.
J Chem Inf Model ; 62(20): 4827-4836, 2022 Oct 24.
Article in English | MEDLINE | ID: mdl-36219164

ABSTRACT

The design of novel, safe, and effective drugs to treat human diseases is a challenging venture, with toxicity being one of the main sources of attrition at later stages of development. Failure due to toxicity incurs a significant increase in costs and time to market, with multiple drugs being withdrawn from the market due to their adverse effects. Cardiotoxicity, for instance, was responsible for the failure of drugs such as fenspiride, propoxyphene, and valdecoxib. While significant effort has been dedicated to mitigating this issue by developing computational approaches that aim to identify molecules likely to be toxic, including quantitative structure-activity relationship models and machine learning methods, current approaches present limited performance and interpretability. To overcome these limitations, we propose a new web-based computational method, cardioToxCSM, which can predict six types of cardiac toxicity outcomes, including arrhythmia, cardiac failure, heart block, hERG toxicity, hypertension, and myocardial infarction, efficiently and accurately. cardioToxCSM was developed using the concept of graph-based signatures, molecular descriptors, toxicophore matchings, and molecular fingerprints, leveraging explainable machine learning, and was validated internally via different cross-validation schemes and externally via low-redundancy blind sets. The models presented robust performances with areas under ROC curves of up to 0.898 on 5-fold cross-validation, consistent with metrics on blind tests. Additionally, our models provide interpretation of the predictions by identifying whether substructures that are commonly enriched in toxic compounds were present. We believe cardioToxCSM will provide valuable insight into the potential cardiotoxicity of small molecules early in drug screening efforts. The method is made freely available as a web server at https://biosig.lab.uq.edu.au/cardiotoxcsm.


Subject(s)
Cardiotoxicity; Dextropropoxyphene; Humans; Cardiotoxicity/etiology; Quantitative Structure-Activity Relationship; Machine Learning; ROC Curve; Arrhythmias, Cardiac
19.
Protein Sci ; 31(11): e4453, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36305769

ABSTRACT

Protein phosphorylation acts as an essential on/off switch in many cellular signaling pathways. This has led to ongoing interest in targeting kinases for therapeutic intervention. Computer-aided drug discovery has been proven a useful and cost-effective approach for facilitating prioritization and enrichment of screening libraries, but limited effort has been devoted to providing insights on what makes a potent kinase inhibitor. To fill this gap, here we developed kinCSM, an integrative computational tool capable of accurately identifying potent cyclin-dependent kinase 2 (CDK2) inhibitors, quantitatively predicting CDK2 ligand-kinase inhibition constants (pKi) and classifying different types of inhibitors based on their favorable binding modes. kinCSM predictive models were built using supervised learning and leveraged the concept of graph-based signatures to capture both physicochemical and geometric properties of small molecules. CDK2 inhibitors were accurately identified with Matthews correlation coefficients (MCC) of up to 0.74, and inhibition constants predicted with Pearson's correlation of up to 0.76, both with consistent performances of 0.66 and 0.68 on a nonredundant blind test, respectively. kinCSM was also able to identify the potential type of inhibition for a given molecule, achieving MCC of up to 0.80 on cross-validation and 0.73 on the blind test. Analysis of molecular composition revealed enriched chemical fragments in CDK2 inhibitors and different types of inhibitors, which provides insights into the molecular mechanisms behind ligand-kinase interactions. kinCSM will be an invaluable tool to guide future kinase drug discovery. To aid the fast and accurate screening of CDK2 inhibitors, kinCSM is freely available at https://biosig.lab.uq.edu.au/kin_csm/.


Subject(s)
Antineoplastic Agents; Protein Kinase Inhibitors; Cyclin-Dependent Kinase 2/chemistry; Ligands; Protein Kinase Inhibitors/pharmacology; Protein Kinase Inhibitors/chemistry; Drug Discovery; Antineoplastic Agents/chemistry
20.
Protein Sci ; 31(10): e4442, 2022 10.
Article in English | MEDLINE | ID: mdl-36173168

ABSTRACT

Peptides are attractive alternatives for the development of new therapeutic strategies due to their versatility and low complexity of synthesis. Increasing interest in these molecules has led to the creation of large collections of experimentally characterized therapeutic peptides, which greatly contributes to the development of data-driven computational approaches. Here we propose CSM-peptides, a novel machine learning method for rapid identification of eight different types of therapeutic peptides: anti-angiogenic, anti-bacterial, anti-cancer, anti-inflammatory, anti-viral, cell-penetrating, quorum sensing, and surface binding. Our method has been shown to outperform existing approaches, achieving an AUC of up to 0.92 on independent blind tests, and consistent performance on cross-validation. We anticipate CSM-peptides to be of great value in helping to screen large libraries to identify novel peptides with therapeutic potential and have made it freely available as a user-friendly web server and Application Programming Interface at https://biosig.lab.uq.edu.au/csm_peptides.


Subject(s)
Peptides; Software; Anti-Inflammatory Agents; Computational Biology/methods; Machine Learning; Peptides/chemistry