ABSTRACT
Protein function prediction is critical for understanding cellular physiological and biochemical processes, and it opens up new possibilities for advances in fields such as disease research and drug discovery. Over the past decades, with the exponential growth of protein sequence data, many computational methods for predicting protein function have been proposed, so a systematic review and comparison of these methods is necessary. In this study, we divide these methods into four categories: sequence-based methods, 3D structure-based methods, PPI network-based methods and hybrid information-based methods. We discuss their advantages and disadvantages, and comprehensively evaluate and compare their performance. Finally, we discuss the challenges and opportunities present in this field.
Subjects
Computational Biology , Proteins , Proteins/chemistry , Proteins/metabolism , Computational Biology/methods , Humans , Protein Sequence Analysis/methods , Algorithms
ABSTRACT
To attain promising pharmacotherapies, researchers have applied drug repurposing (DR) techniques to discover candidate medicines to combat the coronavirus disease 2019 (COVID-19) outbreak. Although many DR approaches have been introduced for treating different diseases, only structure-based DR (SBDR) methods can be employed as the first therapeutic option against the COVID-19 pandemic, because they rely on rudimentary information about the disease, such as the genome sequence of severe acute respiratory syndrome coronavirus 2. Hence, the first attempts to try out new treatments for the disease have been based on SBDR methods, which seem to be among the proper choices for discovering potential medications against emerging and re-emerging infectious diseases. Given the importance of SBDR approaches, this review summarizes well-known SBDR methods and examines their merits. Then, the databases and software applications used for repurposing drugs against COVID-19 are introduced, and the identified drugs are categorized by their targets. Finally, SBDR approaches are compared with other DR methods, and some possible future directions are proposed.
Subjects
Antiviral Agents/chemistry , COVID-19 Drug Treatment , Drug Repositioning , SARS-CoV-2/drug effects , Antiviral Agents/therapeutic use , COVID-19/virology , Humans , Pandemics , SARS-CoV-2/chemistry , SARS-CoV-2/pathogenicity
ABSTRACT
Histone deacetylase (HDAC) inhibitors have gained attention over the past three decades because of their potential in the treatment of different diseases, including various forms of cancer, neurodegenerative disorders, autoimmune and inflammatory diseases, and other metabolic disorders. To date, five HDAC inhibitor drugs are marketed for the treatment of hematological malignancies, and several drug-candidate HDAC inhibitors are at different stages of clinical trials. However, because these drugs have toxic side effects resulting from their lack of target selectivity, active studies are ongoing to design and develop either class-selective or isoform-selective inhibitors. Computational methods have aided the discovery of HDAC inhibitors with the desired potency and/or selectivity. These methods include ligand-based approaches such as scaffold hopping, pharmacophore modeling, and three-dimensional quantitative structure-activity relationships (3D-QSAR), as well as structure-based virtual screening (molecular docking). Current trends involve combining these methods and incorporating molecular dynamics simulations coupled with molecular mechanics Poisson-Boltzmann/generalized Born surface area (MM-PBSA/MM-GBSA) calculations to improve the prediction of ligand binding affinity. This review aims at understanding the current trends in applying these multilayered strategies and their contribution to the design and identification of HDAC inhibitors.
Subjects
Histone Deacetylase Inhibitors , Molecular Dynamics Simulation , Histone Deacetylase Inhibitors/pharmacology , Histone Deacetylase Inhibitors/therapeutic use , Molecular Docking Simulation , Ligands , Quantitative Structure-Activity Relationship
ABSTRACT
Virtual screening (VS) is a cornerstone of the drug discovery pipeline. A variety of computational approaches, generally classified as ligand-based (LB) and structure-based (SB) techniques, exploit key structural and physicochemical properties of ligands and targets to enable the screening of virtual libraries in search of active compounds. Though LB and SB methods have found widespread application in the discovery of novel drug-like candidates, their complementary natures have stimulated continued efforts toward the development of hybrid strategies that combine LB and SB techniques, integrating them in a holistic computational framework that exploits the available information on both ligand and target to enhance the success of drug discovery projects. In this review, we analyze the main strategies and concepts that have emerged in recent years for defining hybrid LB + SB computational schemes in VS studies. In particular, attention is focused on the combination of molecular similarity and docking, illustrated with selected applications taken from the literature.
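As an illustration of the ligand-based half of such a hybrid scheme, the following minimal sketch computes Tanimoto similarity between fingerprints and uses it to pre-filter a library before any docking step. It is not taken from the review: the fingerprint representation (sets of on-bit indices), the function names, the compound names and the 0.5 threshold are all assumptions made for this example.

```python
# Hypothetical sketch: similarity pre-filter that could precede docking.
# Fingerprints are modeled as sets of "on" bit indices (an assumption).

def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_prefilter(query: set, library: dict, threshold: float = 0.5) -> list:
    """Return names of library compounds at least `threshold` similar to the query."""
    return [name for name, fp in library.items() if tanimoto(query, fp) >= threshold]


# Toy data; compound names and bit indices are invented for illustration.
query_fp = {1, 2, 3}
library = {
    "cmpd_a": {2, 3, 4},     # similarity 2/4 = 0.5
    "cmpd_b": {7, 8, 9},     # similarity 0
    "cmpd_c": {1, 2, 3, 4},  # similarity 3/4 = 0.75
}
shortlist = similarity_prefilter(query_fp, library)
```

In a sequential hybrid workflow of this kind, only the shortlisted compounds would then be passed to a docking program, which reduces the cost of the structure-based step.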
Subjects
Drug Discovery/trends , Preclinical Drug Evaluation/trends , Small Molecule Libraries/chemistry , User-Computer Interface , Algorithms , Humans , Ligands , Molecular Docking Simulation/methods
ABSTRACT
BACKGROUND: Ligand-binding proteins play key roles in many biological processes. Identification of protein-ligand binding residues is important in understanding the biological functions of proteins. Existing computational methods can be roughly categorized as sequence-based or 3D-structure-based methods. All these methods are based on traditional machine learning. In a series of binding residue prediction tasks, 3D-structure-based methods are widely superior to sequence-based methods. However, due to the great number of proteins with known amino acid sequences, sequence-based methods have considerable room for improvement with the development of deep learning. Therefore, prediction of protein-ligand binding residues with deep learning requires study. RESULTS: In this study, we propose a new sequence-based approach called DeepCSeqSite for ab initio protein-ligand binding residue prediction. DeepCSeqSite includes a standard edition and an enhanced edition. The classifier of DeepCSeqSite is based on a deep convolutional neural network. Several convolutional layers are stacked on top of each other to extract hierarchical features. The size of the effective context scope is expanded as the number of convolutional layers increases. The long-distance dependencies between residues can be captured by the large effective context scope, and stacking several layers enables the maximum length of dependencies to be precisely controlled. The extracted features are ultimately combined through one-by-one convolution kernels and softmax to predict whether the residues are binding residues. The state-of-the-art ligand-binding method COACH and some of its submethods are selected as baselines. The methods are tested on a set of 151 nonredundant proteins and three extended test sets. Experiments show that the improvement of the Matthews correlation coefficient (MCC) is no less than 0.05. 
In addition, a training data augmentation method that slightly improves the performance is discussed in this study. CONCLUSIONS: Without using any templates that include 3D-structure data, DeepCSeqSite significantly outperforms existing sequence-based and 3D-structure-based methods, including COACH. Augmentation of the training sets slightly improves the performance. The model, code and datasets are available at https://github.com/yfCuiFaith/DeepCSeqSite .
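The statement that the effective context scope grows with the number of stacked convolutional layers can be made concrete with a short calculation. The sketch below assumes stride-1, undilated 1D convolutions, for which each layer with kernel size k adds k - 1 residues to the context. The kernel sizes are illustrative assumptions, not the settings used by DeepCSeqSite.

```python
# Hypothetical sketch: context size (in residues) seen by one output position
# of a stack of 1D convolutions with stride 1 and no dilation.

def receptive_field(kernel_sizes):
    """Each layer with kernel size k widens the context by k - 1 positions."""
    field = 1  # a single residue before any convolution is applied
    for k in kernel_sizes:
        field += k - 1
    return field


# Illustrative kernel sizes (assumed, not taken from the paper).
for depth in range(1, 5):
    print(depth, "layers of width 5 ->", receptive_field([5] * depth), "residues")
```

Because the context grows by a fixed amount per layer, choosing the depth and kernel widths fixes the maximum residue-to-residue distance the network can relate, which is the sense in which the maximum dependency length is controlled.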
Subjects
Deep Learning , Computer Neural Networks , Proteins/metabolism , Algorithms , Amino Acid Sequence , Protein Databases , Ligands , Protein Binding
ABSTRACT
The chapter emphasizes the importance of understanding protein-protein interactions in cellular mechanisms and highlights the role of computational modeling in predicting these interactions. It discusses sequence-based approaches such as evolutionary trace (ET), correlated mutation analysis (CMA), and subtractive correlated mutation (SCM) for identifying crucial amino acid residues, considering interface conservation or evolutionary changes. The chapter also explores methods like differential ET, hidden-site class model, and spatial cluster detection (SCD) for interface specificity and spatial clustering. Furthermore, it examines approaches combining structural and sequential methodologies and evaluates modeled predictions through initiatives like critical assessment of prediction of interactions (CAPRI). Additionally, the chapter provides an overview of various software programs used for molecular docking, detailing their search, sampling, refinement and scoring stages, along with innovative techniques and tools like normal mode analysis (NMA) and adaptive Poisson-Boltzmann solver (APBS) for electrostatic calculations. These computational and experimental approaches are crucial for unraveling protein-protein interactions and aid in developing potential therapeutics for various diseases.
Subjects
Computational Biology , Molecular Docking Simulation , Protein Binding , Proteins , Software , Computational Biology/methods , Proteins/metabolism , Proteins/chemistry , Protein Interaction Mapping/methods , Humans , Mutation , Algorithms , Protein Conformation
ABSTRACT
Plant genomes contain a particularly high proportion of repeated structures of various types. This chapter proposes a guided tour of the available software that can help biologists to scan automatically for these repeats in sequence data or to check hypothetical models intended to characterize their structures. Since transposable elements (TEs) are a major source of repeats in plants, many methods have been used or developed for this broad class of sequences. They are representative of the range of tools available for other classes of repeats, and we have provided two sections on this topic (for the analysis of genomes or directly of sequenced reads), as well as a selection of the main existing software. It may be hard to keep up with the profusion of proposals in this dynamic field, so the rest of the chapter is devoted to the foundations of an efficient search for repeats and more complex patterns. We first introduce the key concepts of the art of indexing and mapping or querying sequences. We end the chapter with the more prospective issue of building models of repeat families. We first present the machine learning approach, which seeks to build predictors automatically for some families of TEs from a set of sequences known to belong to the family. A second approach, the linguistic (or syntactic) approach, allows biologists to describe models of their favorite repeat family themselves and to check their validity.
Subjects
Plant Genome , Software , DNA Transposable Elements/genetics , Plants/genetics , Prospective Studies
ABSTRACT
Integrated in silico-in vitro-in vivo approaches have fostered the development of new treatment strategies for glioblastoma patients and improved diagnosis, establishing a bridge between biochemical research and clinical practice. These approaches have provided new insights into the identification of bioactive compounds and into the complex mechanisms underlying the interactions between glioblastoma cells and the tumor microenvironment. This review focuses on the key advances pertaining to computational modeling in glioblastoma, including predictive data on drug permeability across the blood-brain barrier, tumor growth and treatment responses. Structure- and ligand-based methods have been widely adopted, enabling the study of dynamic and evolutionary aspects of glioblastoma. Their potential applications as predictive tools and their advantages over other well-known methodologies are outlined. Challenges regarding in silico approaches for predicting tumor properties are also discussed.
Subjects
Algorithms , Antineoplastic Agents/pharmacology , Blood-Brain Barrier/drug effects , Brain Neoplasms/drug therapy , Glioblastoma/drug therapy , Antineoplastic Agents/chemistry , Blood-Brain Barrier/metabolism , Brain Neoplasms/diagnosis , Brain Neoplasms/metabolism , Glioblastoma/diagnosis , Glioblastoma/metabolism , Humans , Ligands , Molecular Models , Permeability/drug effects , Quantitative Structure-Activity Relationship
ABSTRACT
FGF23, CYP24A1 and VDR together play a significant role in genetic susceptibility to chronic kidney disease (CKD). Identifying possible causative mutations may provide therapeutic targets and diagnostic markers for CKD. We therefore adopted both sequence-based and sequence-structure-based SNP analysis algorithms in order to overcome the limitations of each method, and explored their functional significance for predicting risky SNPs associated with CKD. We assessed the performance of four widely used pathogenicity prediction methods. Comparing the programs by Matthews correlation coefficient (MCC), we found that their performance ranged from poor (MCC = 0.39) to reasonably good (MCC = 0.42). However, we obtained the best results with the combined sequence- and structure-based analysis method (MCC = 0.45). Four SNPs from the FGF23 gene, 8 SNPs from the VDR gene and 13 SNPs from the CYP24A1 gene were predicted to be causative agents of human disease. This study will be helpful in selecting potential SNPs from the SNP pool for experimental study and will also reduce the cost of identifying potential SNPs as genetic markers.
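For reference, the Matthews correlation coefficient used to compare the predictors can be computed from a confusion matrix as in the minimal sketch below. The function name, the zero-denominator convention and the example counts are assumptions for illustration, not data from the study.

```python
import math

# Hypothetical sketch: MCC from confusion-matrix counts.
# tp/tn/fp/fn = true positives, true negatives, false positives, false negatives.

def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Invented counts for a pathogenicity predictor evaluated on 12 labelled SNPs.
score = matthews_corrcoef(tp=6, tn=3, fp=1, fn=2)
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), and it remains informative when the pathogenic and neutral classes are unbalanced.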