Search | BVS Bolivia

1.

Deciphering the Lexicon of Protein Targets: A Review on Multifaceted Drug Discovery in the Era of Artificial Intelligence.

Nandi, Suvendu; Bhaduri, Soumyadeep; Das, Debraj; Ghosh, Priya; Mandal, Mahitosh; Mitra, Pralay.

Mol Pharm ; 21(4): 1563-1590, 2024 Apr 01.

Article in English | MEDLINE | ID: mdl-38466810

ABSTRACT

Understanding protein sequence and structure is essential for understanding protein-protein interactions (PPIs), which underlie many biological processes and diseases. Targeting protein binding hot spots, which regulate signaling and growth, with rational drug design is promising. Rational drug design uses structural data and computational tools to study protein binding sites and protein interfaces and to design inhibitors that can change these interactions, thereby potentially leading to therapeutic approaches. Artificial intelligence (AI), such as machine learning (ML) and deep learning (DL), has advanced drug discovery and design by providing computational resources and methods. Quantum chemistry is essential for studying drug reactivity, toxicology, drug screening, and quantitative structure-activity relationship (QSAR) properties. This review discusses the methodologies and challenges of identifying and characterizing hot spots and binding sites. It also explores the strategies and applications of AI-based rational drug design technologies that target proteins and PPI binding hot spots. It provides valuable insights for drug design with therapeutic implications. We have also demonstrated the pathological conditions of heat shock protein 27 (HSP27) and matrix metalloproteinases (MMP2 and MMP9) and designed inhibitors of these proteins using the drug discovery paradigm in a case study on the discovery of drug molecules for cancer treatment. Additionally, the implications of benzothiazole derivatives for anticancer drug design and discovery are discussed.

Subject(s)

Artificial Intelligence , Drug Discovery , Drug Discovery/methods , Drug Design , Machine Learning , Quantitative Structure-Activity Relationship

2.

MaTPIP: A deep-learning architecture with eXplainable AI for sequence-driven, feature mixed protein-protein interaction prediction.

Ghosh, Shubhrangshu; Mitra, Pralay.

Comput Methods Programs Biomed ; 244: 107955, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38064959

ABSTRACT

BACKGROUND AND OBJECTIVE: Protein-protein interaction (PPI) is a vital process in all living cells, controlling essential cell functions such as cell cycle regulation, signal transduction, and metabolic processes, with broad applications that include antibody therapeutics, vaccines, and drug discovery. The problem of sequence-based PPI prediction has been a long-standing issue in computational biology. METHODS: We introduce MaTPIP, a deep-learning framework for predicting PPI. MaTPIP fuses pre-trained Protein Language Model (PLM)-based features with manually curated protein sequence attributes, emphasizing the part-whole relationship by incorporating two-dimensional granular part (amino-acid) level features and one-dimensional whole-level (protein) features. MaTPIP integrates these features across three different input terminals. MaTPIP also includes a distinctive configuration of Convolutional Neural Network (CNN) with Transformer components for concurrent utilization of CNN and sequential characteristics in each iteration and a one-dimensional to two-dimensional converter followed by a unified embedding. The statistical significance of this classifier is validated using McNemar's test. RESULTS: MaTPIP outperformed the existing methods on both the Human PPI benchmark and cross-species PPI testing datasets, demonstrating its generalization capability for PPI prediction. We used seven diverse datasets with varying PPI target class distributions. Notably, within the novel PPI scenario, the most challenging category for the Human PPI Benchmark, MaTPIP improves the existing state-of-the-art score from 74.1% to 78.6% (measured in area under the ROC curve), from 23.2% to 32.8% (in average precision) and from 4.9% to 9.5% (in precision at 3% recall) for 50%, 10% and 0.3% target class distributions, respectively. In cross-species PPI evaluation, hybrid MaTPIP establishes a new benchmark score (measured in area under the precision-recall curve) of 81.1% from the previous 60.9% for Mouse, 80.9% from 56.2% for Fly, 78.1% from 55.9% for Worm, 59.9% from 41.7% for Yeast, and 66.2% from 58.8% for E. coli. Our eXplainable AI-based assessment reveals an average contribution of different feature families per prediction on these datasets. CONCLUSIONS: MaTPIP mixes manually curated features with the features extracted from the pre-trained PLM to predict sequence-based protein-protein association. Furthermore, MaTPIP demonstrates strong generalization capabilities for cross-species PPI predictions.

Subject(s)

Deep Learning , Humans , Animals , Mice , Neural Networks, Computer , Proteins/metabolism , Amino Acid Sequence , ROC Curve
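
The abstract above states that the classifier's statistical significance is validated with McNemar's test. As a minimal sketch of that test (not the authors' code), the exact two-sided variant can be computed from the discordant pairs of two classifiers scored on the same examples. The function names and the toy labels are illustrative assumptions.

```python
from math import comb


def discordant_counts(y_true, pred_a, pred_b):
    """Count examples where exactly one of the two classifiers is correct."""
    b = c = 0
    for truth, a, other in zip(y_true, pred_a, pred_b):
        a_ok, other_ok = (a == truth), (other == truth)
        if a_ok and not other_ok:
            b += 1  # A right, B wrong
        elif other_ok and not a_ok:
            c += 1  # B right, A wrong
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value (binomial test with p = 0.5)."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical labels and predictions, for illustration only.
    y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
    pred_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
    pred_b = [0, 1, 1, 0, 0, 0, 1, 0, 1, 0]
    b, c = discordant_counts(y_true, pred_a, pred_b)
    print(b, c, mcnemar_exact(b, c))
```

Only the discordant pairs enter the test, so two models with the same accuracy can still differ significantly if they err on different examples.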

3.

Correction to: ProFuMCell and ProModb: Web services for analyzing interaction-based functionally localized protein modules in a cell.

Das, Barnali; Mitra, Pralay.

J Mol Model ; 29(5): 148, 2023 Apr 19.

Article in English | MEDLINE | ID: mdl-37074480

4.

A novel computational predictive biological approach distinguishes Integrin β1 as a salient biomarker for breast cancer chemoresistance.

Das, Subhayan; Kundu, Moumita; Hassan, Atif; Parekh, Aditya; Jena, Bikash Ch; Mundre, Swati; Banerjee, Indranil; Yetirajam, Rajesh; Das, Chandan K; Pradhan, Anjan K; Das, Swadesh K; Emdad, Luni; Mitra, Pralay; Fisher, Paul B; Mandal, Mahitosh.

Biochim Biophys Acta Mol Basis Dis ; 1869(6): 166702, 2023 08.

Article in English | MEDLINE | ID: mdl-37044238

ABSTRACT

Chemoresistance is a primary cause of breast cancer treatment failure, and protein-protein interactions significantly contribute to chemoresistance during different stages of breast cancer progression. In pursuit of novel biomarkers and relevant protein-protein interactions occurring during the emergence of breast cancer chemoresistance, we used a computational predictive biological (CPB) approach. CPB identified associations of adhesion molecules with proteins connected with different breast cancer proteins associated with chemoresistance. This approach identified an association of Integrin β1 (ITGB1) with chemoresistance and breast cancer stem cell markers. ITGB1 activated the Focal Adhesion Kinase (FAK) pathway promoting invasion, migration, and chemoresistance in breast cancer by upregulating Erk phosphorylation. FAK also activated Wnt/Sox2 signaling, which enhanced self-renewal in breast cancer. Activation of the FAK pathway by ITGB1 represents a novel mechanism linked to breast cancer chemoresistance, which may lead to novel therapies capable of blocking breast cancer progression by intervening in ITGB1-regulated signaling pathways.

Subject(s)

Breast Neoplasms , Integrin beta1 , Female , Humans , Biomarkers , Breast Neoplasms/drug therapy , Cell Line, Tumor , Drug Resistance, Neoplasm , Focal Adhesion Protein-Tyrosine Kinases/metabolism , Integrin beta1/metabolism

5.

Genome surveillance of SARS-CoV-2 variants and their role in pathogenesis focusing on second wave of COVID-19 in India.

Sarkar, Poulomi; Banerjee, Sarthak; Saha, Sarbar Ali; Mitra, Pralay; Sarkar, Siddik.

Sci Rep ; 13(1): 4692, 2023 03 22.

Article in English | MEDLINE | ID: mdl-36949118

ABSTRACT

India witnessed an unprecedented surge in SARS-CoV-2 infections, with dire consequences, during the second wave of COVID-19, but a detailed report of the epidemiology-based spatiotemporal incidence of the disease is missing. In this manuscript, we apply various statistical approaches (correlation, hierarchical clustering) to decipher the pattern of pathogenesis of the circulating VoCs responsible for the surge in incidence. B.1.617.1 (Kappa) was the predominant VoC during the early phase of the second wave, whereas Delta (B.1.617.2) or Delta-like (AY.x) VoCs constituted the majority ([Formula: see text]%) of the cases during the peak of the second wave. The correlation plot of the Delta/Delta-like lineage demonstrates an inverse correlation with other lineages including B.1.617.1, B.1.1.7, B.1, B.1.36.29 and B.1.36. The spatiotemporal analysis shows that most of the Indian states were affected during the peak of the second wave due to the Delta surge, and fall under the same cluster. The second cluster, populated mostly by the north-eastern states and the islands of India, was minimally affected. The presence of signature mutations (T478K, D950N, E156G) along with L452R, D614G and P681R within the spike protein of Delta or Delta-like variants might cause elevated host cell attachment, increased transmission and altered antigenicity, which in due course led them to replace the other circulating variants. The timely assessment of new VoCs, including Delta-like ones, will provide a rationale for updating diagnostics and vaccine development by medical industries and for decision making by various agencies including government, educational institutions, and corporate industries.

Subject(s)

COVID-19 , SARS-CoV-2 , Humans , Asian People , COVID-19/epidemiology , COVID-19/virology , India/epidemiology , Mutation , SARS-CoV-2/genetics

6.

A sequence space search engine for computational protein design to modulate molecular functionality.

Malik, Ayush; Banerjee, Anupam; Pal, Abantika; Mitra, Pralay.

J Biomol Struct Dyn ; 41(7): 2937-2946, 2023 04.

Article in English | MEDLINE | ID: mdl-35220920

ABSTRACT

De-novo protein design explores the untapped sequence space that is otherwise less explored during the evolutionary process. This necessitates an efficient sequence space search engine for effective convergence in computational protein design. We propose a greedy simulated annealing-based Monte Carlo parallel search algorithm for better sequence-structure compatibility probing in protein design. The guidance provided by the evolutionary profile, the greedy approach, and the cooling schedule adopted in the Monte Carlo simulation ensures sufficient exploration and exploitation of the search space, leading to faster convergence. On evaluating the proposed algorithm on a dataset of 76 target scaffolds, we find that the modeled structures of the designed sequences show an average root-mean-square deviation (RMSD) of 1.07 Å and an average TM-score of 0.93. High sequence recapitulation of 48.7% (59.4%) observed in the design sequences for all (hydrophobic) solvent-inaccessible residues again establishes the goodness of the proposed algorithm. A high (93.4%) intra-group recapitulation of hydrophobic residues in the solvent-inaccessible region indicates that the proposed protein design algorithm preserves the core residues in the protein and provides alternative residue combinations in the solvent-accessible regions of the target protein. Furthermore, a COFACTOR-based protein functional analysis shows that the design sequences exhibit altered molecular functionality and introduce new molecular functions compared to the target scaffolds. Communicated by Ramaswamy H. Sarma.

Subject(s)

Proteins , Search Engine , Proteins/chemistry , Amino Acid Sequence , Computer Simulation , Solvents
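
The abstract above describes a simulated annealing-based Monte Carlo search over sequence space. The sketch below illustrates only the generic technique (Metropolis acceptance with a geometric cooling schedule) and is not the authors' algorithm: it omits the evolutionary profile, the greedy strategy, and the parallelism, and `toy_energy` is a hypothetical stand-in for a real sequence-structure scoring function.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_energy(seq, reference):
    """Stand-in score: number of positions that differ from a reference string."""
    return sum(1 for s, r in zip(seq, reference) if s != r)


def metropolis_accept(delta, temperature, rng):
    """Always accept downhill moves; accept uphill moves with prob. exp(-delta/T)."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def anneal(reference, steps=5000, t_start=2.0, t_end=0.05, seed=0):
    """Single-site mutation search; returns the best sequence seen and its energy."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in reference]
    energy = toy_energy(seq, reference)
    best_seq, best_energy = list(seq), energy
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)
        new_energy = toy_energy(seq, reference)
        if metropolis_accept(new_energy - energy, t, rng):
            energy = new_energy
            if energy < best_energy:
                best_seq, best_energy = list(seq), energy
        else:
            seq[pos] = old  # rejected: restore the previous residue
    return "".join(best_seq), best_energy


if __name__ == "__main__":
    print(anneal("ACDEFGHIKLMNPQRSTVWY"))
```

High temperatures early on let the search accept worse sequences and escape local minima; as the temperature falls, the search becomes increasingly greedy.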

7.

Therapeutic targeting of RBPJ, an upstream regulator of ETV6 gene, abrogates ETV6-NTRK3 fusion gene transformations in glioblastoma.

Biswas, Angana; Rajesh, Yetirajam; Das, Subhayan; Banerjee, Indranil; Kapoor, Neelkamal; Mitra, Pralay; Mandal, Mahitosh.

Cancer Lett ; 544: 215811, 2022 09 28.

Article in English | MEDLINE | ID: mdl-35787922

ABSTRACT

Fusion genes are abnormal genes resulting from chromosomal translocation, insertion, deletion, inversion, etc. ETV6, a rather promiscuous partner, forms fusions with several other genes, most commonly the NTRK3 gene. This fusion leads to the formation of a constitutively activated tyrosine kinase which activates the Ras-Raf-MEK and PI3K/AKT/MAPK pathways, leading the cells through cycles of uncontrolled division and ultimately resulting in cancer. Targeted therapies against this ETV6-NTRK3 fusion protein are much needed. Therefore, to find a targeted approach, a transcription factor, RBPJ, regulating the ETV6 gene was established; since the ETV6-NTRK3 fusion gene is downstream of the ETV6 promoter/enhancer, this fusion protein is also regulated. The regulation of the ETV6 gene via RBPJ was validated by ChIP analysis in human glioblastoma (GBM) cell lines and patient tissue samples. This study was followed by the identification of an inhibitor, Furamidine, against the transcription factor RBPJ. It was found to bind the DNA-binding domain of RBPJ, with antitumorigenic properties and minimal organ toxicity. Hence, a new target, RBPJ, regulating the production of ETV6 and the ETV6-NTRK3 fusion protein, was found along with a potent RBPJ inhibitor, Furamidine.

Subject(s)

DNA-Binding Proteins , Glioblastoma , DNA-Binding Proteins/genetics , Glioblastoma/drug therapy , Glioblastoma/genetics , Humans , Immunoglobulin J Recombination Signal Sequence-Binding Protein , Oncogene Proteins, Fusion/genetics , Oncogene Proteins, Fusion/metabolism , Phosphatidylinositol 3-Kinases/metabolism , Proto-Oncogene Proteins c-ets/genetics , Receptor, trkC/genetics , Receptor, trkC/metabolism , Repressor Proteins/chemistry , Repressor Proteins/genetics , Transcription Factors/genetics

8.

ProMoCell and ProModb: Web services for analyzing interaction-based functionally localized protein modules in a cell.

Das, Barnali; Mitra, Pralay.

J Mol Model ; 28(6): 167, 2022 May 25.

Article in English | MEDLINE | ID: mdl-35612652

ABSTRACT

The modular organization of a cell which can be determined by its interaction network allows us to understand a mesh of cooperation among the functional modules. Therefore, cellular-level identification of functional modules aids in understanding the functional and structural characteristics of the biological network of a cell and also assists in determining or comprehending the evolutionary signal. We develop ProMoCell that performs real-time Web scraping for generating clusters of the cellular level functional units of an organism. ProMoCell constructs the Protein Locality Graphs and clusters the cellular level functional units of an organism by utilizing experimentally verified data from various online sources. Also, we develop ProModb, a database service that houses precomputed whole-cell protein-protein interaction network-based functional modules of an organism using ProMoCell. Our Web service is entirely synchronized with the KEGG pathway database and allows users to generate spatially localized protein modules for any organism belonging to the KEGG genome using its real-time Web scraping characteristics. Hence, the server will host as many organisms as is maintained by the KEGG database. Our Web services provide the users a comprehensive and integrated tool for an efficient browsing and extraction of the spatial locality-based protein locality graph and the functional modules constructed by gathering experimental data from several interaction databases and pathway maps. We believe that our Web services will be beneficial in pharmacological research, where a novel research domain called modular pharmacology has initiated the study on the diagnosis, prevention, and treatment of deadly diseases using functional modules.

Subject(s)

Algorithms , Proteins , Protein Interaction Mapping , Protein Interaction Maps

9.

Human prion protein: exploring the thermodynamic stability and structural dynamics of its pathogenic mutants.

Halder, Puspita; Mitra, Pralay.

J Biomol Struct Dyn ; 40(21): 11274-11290, 2022.

Article in English | MEDLINE | ID: mdl-34338141

ABSTRACT

Human familial prion diseases are known to be associated with different single-point mutants of the gene coding for prion protein, with a primary focus at several locations of the globular domain. We have identified 12 different single-point pathogenic mutants of human prion protein (HuPrP) with the help of an extensive perturbation/mutation technique at multiple locations of the HuPrP sequence related to potentiality towards conformational disorders. Some of these mutants are pathogenic variants that corroborate well with proteins reported in the literature, while the majority are unique single-point mutants that either have not been explicitly studied earlier or have been studied for variants with different residues at the specific position. Primarily, our study sheds light in depth on the unfolding mechanism of the above-mentioned mutants. In addition, we could identify some mutants under investigation that demonstrate not only unfolding of the helical structures but also extension and generation of β-sheet structures, and/or simultaneously have a highly exposed hydrophobic surface, which is assumed to be linked with the production of aggregate/fibril structures of the prion protein. Among the identified mutants, Q212E needs special attention due to its maximum exposure of the hydrophobic core towards solvent, and E200Q is found to be important due to its maximum extent of β-content. We are also able to identify different structural conformations of the proteins according to their degree of structural unfolding; those conformations can be extracted and further studied in detail. Communicated by Ramaswamy H. Sarma.

Subject(s)

Prion Diseases , Prions , Humans , Prion Proteins/genetics , Prion Proteins/chemistry , Prions/genetics , Thermodynamics

10.

Modularity-based parallel protein design algorithm with an implementation using shared memory programming.

Pal, Abantika; Mulumudy, Rohith; Mitra, Pralay.

Proteins ; 90(3): 658-669, 2022 03.

Article in English | MEDLINE | ID: mdl-34651333

ABSTRACT

Given a target protein structure, the prime objective of protein design is to find amino acid sequences that will fold into the given three-dimensional structure. The protein design problem belongs to the non-deterministic polynomial-time-hard class, as the sequence search space increases exponentially with protein length. To ensure better search space exploration and faster convergence, we propose a protein modularity-based parallel protein design algorithm. The modular architecture of the protein structure is exploited by considering an intermediate structural organization between secondary structure and domain, defined as the protein unit (PU). Here, we have incorporated a divide-and-conquer approach where a protein is split into PUs and each PU region is explored in a parallel fashion. Our analysis further shows that our shared memory implementation of modularity-based parallel sequence search leads to better search space exploration compared to traditional full protein design. Sequence-based analysis of the design sequences shows an average of 39.7% sequence similarity on the benchmark data set. Structure-based comparison of the modeled structures of the design protein with the target structure exhibited an average root-mean-square deviation of 1.17 Å and an average template modeling score of 0.89. The selected modeled structures of the design protein sequences are validated using 100 ns molecular dynamics simulations, where 80% of the proteins have shown better or similar stability to the respective target proteins. Our study indicates that our modularity-based protein design algorithm can be extended to protein interaction design as well.

Subject(s)

Proteins/chemistry , Algorithms , Amino Acid Sequence , Benchmarking , Computational Biology , Databases, Protein , Molecular Dynamics Simulation , Protein Conformation , Structure-Activity Relationship
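
The abstract above compares modeled and target structures by root-mean-square deviation (RMSD). As a small illustration of that metric, the sketch below computes RMSD between two coordinate lists that are assumed to be already superimposed and matched atom by atom; it performs no optimal superposition, and the function name and input layout are illustrative assumptions rather than anything taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized, already superimposed (x, y, z) lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    target = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    model = [(0.0, 0.5, 0.0), (1.5, 0.5, 0.0), (3.0, 0.5, 0.0)]
    print(rmsd(target, model))
```

In practice the two structures are first superimposed so that the reported value reflects shape differences rather than a rigid-body offset.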

11.

Protein Interaction Network-based Deep Learning Framework for Identifying Disease-Associated Human Proteins.

Das, Barnali; Mitra, Pralay.

J Mol Biol ; 433(19): 167149, 2021 09 17.

Article in English | MEDLINE | ID: mdl-34271012

ABSTRACT

Infectious diseases in humans are among the foremost public health issues. Identification of novel disease-associated proteins will enable efficient recognition of novel therapeutic targets. Here, we develop a Graph Convolutional Network (GCN)-based model called PINDeL to identify the disease-associated host proteins by integrating the human Protein Locality Graph and its corresponding topological features. Because of the amalgamation of GCN with the protein interaction network, PINDeL achieves the highest accuracy of 83.45%, while AUROC and AUPRC values are 0.90 and 0.88, respectively. With high accuracy, recall, F1-score, specificity, AUROC, and AUPRC, PINDeL outperforms other existing machine-learning and deep-learning techniques for disease gene/protein identification in humans. Application of PINDeL on an independent dataset of 24320 proteins, which are not used for training, validation, or testing purposes, predicts 6448 new disease-protein associations, of which we verify 3196 disease-proteins through evidence such as disease ontology, Gene Ontology, and KEGG pathway enrichment analyses. Our investigation indicates that 748 experimentally verified proteins are indeed responsible for pathogen-host protein interactions, of which 22 disease-proteins share their association with multiple diseases such as cancer, aging, chem-dependency, pharmacogenomics, normal variation, infection, and immune-related diseases. This Graph Convolutional Network-based prediction model is of use in large-scale disease-protein association prediction and hence will provide crucial insights on disease pathogenesis and will further aid in developing novel therapeutics.

Subject(s)

Biomarkers/metabolism , Communicable Diseases/metabolism , Protein Interaction Mapping/methods , Deep Learning , Genetic Association Studies , Humans , Neural Networks, Computer , Protein Interaction Maps
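
The abstract above builds on a Graph Convolutional Network applied to a protein interaction graph. As a rough sketch of what one graph-convolution step does, the code below implements the widely used symmetric-normalized propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 H W), in plain Python. It is not the PINDeL architecture; the tiny graph, the feature matrix, and the weights are made-up values for illustration.

```python
import math


def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adjacency, features, weights):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    norm = [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]
    aggregated = matmul(norm, features)  # average over each node's neighborhood
    return [[max(0.0, v) for v in row] for row in matmul(aggregated, weights)]


if __name__ == "__main__":
    # Hypothetical 3-protein path graph: 0 - 1 - 2.
    adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights = [[0.5], [-0.25]]
    print(gcn_layer(adjacency, features, weights))
```

Each node's output mixes its own features with those of its interaction partners, which is how topological context enters the prediction.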

12.

A computational framework for modeling functional protein-protein interactions.

Pal, Abantika; Pal, Debnath; Mitra, Pralay.

Proteins ; 89(10): 1353-1364, 2021 10.

Article in English | MEDLINE | ID: mdl-34076296

ABSTRACT

Protein interactions and their assemblies assist in understanding the cellular mechanisms through the knowledge of interactome. Despite recent advances, a vast number of interacting protein complexes is not annotated by three-dimensional structures. Therefore, a computational framework is a suitable alternative to fill the large gap between identified interactions and the interactions with known structures. In this work, we develop an automated computational framework for modeling functionally related protein-complex structures utilizing GO-based semantic similarity technique and co-evolutionary information of the interaction sites. The framework can consider protein sequence and structure information as input and employ both rigid-body docking and template-based modeling exploiting the existing structural templates and sequence homology information from the PDB. Our framework combines geometric as well as physicochemical features for re-ranking the docking decoys. The proposed framework has an 83% success rate when tested on a benchmark dataset while considering Top1 models for template-based modeling and Top10 models for the docking pipeline. We believe that our computational framework can be used for any pair of proteins with higher confidence to identify the functional protein-protein interactions.

Subject(s)

Computational Biology/methods , Proteins/chemistry , Binding Sites , Databases, Protein , Protein Binding , Protein Interaction Mapping , Software , Structural Homology, Protein

13.

High-Performance Whole-Cell Simulation Exploiting Modular Cell Biology Principles.

Das, Barnali; Mitra, Pralay.

J Chem Inf Model ; 61(3): 1481-1492, 2021 03 22.

Article in English | MEDLINE | ID: mdl-33683902

ABSTRACT

One of the grand challenges of this century is modeling and simulating a whole cell. Extreme regulation of an extensive quantity of model and simulation data during whole-cell modeling and simulation renders it a computationally expensive research problem in systems biology. In this article, we present a high-performance whole-cell simulation exploiting modular cell biology principles. We prepare the simulation by dividing the unicellular bacterium, Escherichia coli (E. coli), into subcells utilizing the spatially localized densely connected protein clusters/modules. We set up a Brownian dynamics-based parallel whole-cell simulation framework by utilizing the Hamiltonian mechanics-based equations of motion. Though the velocity Verlet integration algorithm possesses the capability of solving the equations of motion, it lacks the ability to capture and deal with particle-collision scenarios. Hence, we propose an algorithm for detecting and resolving both elastic and inelastic collisions and subsequently modify the velocity Verlet integrator by incorporating our algorithm into it. Also, we address the boundary conditions to arrest the molecules' motion outside the subcell. For efficiency, we define one hashing-based data structure called the cellular dictionary to store all of the subcell-related information. A benchmark analysis of our CUDA C/C++ simulation code when tested on E. coli using the CPU-GPU cluster indicates that the computational time requirement decreases with the increase in the number of computing cores and becomes stable at around 128 cores. Additional testing on higher organisms such as rats and humans informs us that our proposed work can be extended to any organism and is scalable for high-end CPU-GPU clusters.

Subject(s)

Computer Graphics , Escherichia coli , Algorithms , Animals , Computer Simulation , Proteins , Rats
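
The abstract above refers to the velocity Verlet integrator and to resolving elastic collisions between particles. The sketch below shows the textbook forms of both in one dimension, assuming a deterministic force and a head-on two-body collision. It is not the authors' modified integrator or their CUDA code; the harmonic-oscillator test system and all function names are illustrative assumptions.

```python
def velocity_verlet_step(x, v, a, dt, accel):
    """Advance position x and velocity v by one time step dt."""
    x_new = x + v * dt + 0.5 * a * dt * dt
    a_new = accel(x_new)                # acceleration at the new position
    v_new = v + 0.5 * (a + a_new) * dt  # average of old and new accelerations
    return x_new, v_new, a_new


def elastic_collision_1d(m1, v1, m2, v2):
    """Post-collision velocities for a head-on elastic collision."""
    total = m1 + m2
    v1_new = ((m1 - m2) * v1 + 2.0 * m2 * v2) / total
    v2_new = ((m2 - m1) * v2 + 2.0 * m1 * v1) / total
    return v1_new, v2_new


def simulate_oscillator(steps=1000, dt=0.01):
    """Unit-mass particle on a unit-stiffness spring, so a(x) = -x."""
    def accel(x):
        return -x

    x, v = 1.0, 0.0
    a = accel(x)
    for _ in range(steps):
        x, v, a = velocity_verlet_step(x, v, a, dt, accel)
    return x, v


if __name__ == "__main__":
    x, v = simulate_oscillator()
    print("energy:", 0.5 * v * v + 0.5 * x * x)
    print(elastic_collision_1d(1.0, 2.0, 1.0, -1.0))
```

The plain integrator only follows the forces it is given, which is why a separate collision step, such as the momentum- and energy-conserving update above, has to be applied when two particles meet.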

14.

An Evolutionary Profile Guided Greedy Parallel Replica-Exchange Monte Carlo Search Algorithm for Rapid Convergence in Protein Design.

Banerjee, Anupam; Pal, Kuntal; Mitra, Pralay.

IEEE/ACM Trans Comput Biol Bioinform ; 18(2): 489-499, 2021.

Article in English | MEDLINE | ID: mdl-31329126

ABSTRACT

Protein design, also known as the inverse protein folding problem, is the identification of a protein sequence that folds into a target protein structure. Protein design has been proved to be an NP-hard problem. While researchers are working on designing heuristics with an emphasis on new scoring functions, we propose a replica-exchange Monte Carlo (REMC) search algorithm that ensures faster convergence using a greedy strategy. Using biological insights, we construct an evolutionary profile to encode the amino acid variability in different positions of the target protein from its structural homologs. The evolutionary profile guides the REMC search, and the greedy approach ensures appreciable exploration and exploitation of the sequence-structure fitness surface. We allow termination of a simulation trajectory once a stagnant situation is detected. A series of sequence- and structure-level validations establish the goodness of our design. On a benchmark dataset, our algorithm reports an average root-mean-square deviation of 1.21 Å between the target and the design proteins when modeled with existing protein folding software. In addition, our algorithm achieves a 6.16-fold overall speedup. In molecular dynamics simulations, we observe that four out of five selected design proteins show better or comparable stability relative to the corresponding target proteins.

Subject(s)

Algorithms , Computational Biology/methods , Molecular Dynamics Simulation , Protein Folding , Proteins , Monte Carlo Method , Protein Conformation , Proteins/chemistry , Proteins/genetics , Proteins/metabolism
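
The abstract above relies on replica-exchange Monte Carlo (REMC), in which copies of the system run at different temperatures and periodically try to exchange states. The sketch below shows only the standard exchange criterion, min(1, exp((1/T_i - 1/T_j)(E_i - E_j))), and one sweep over neighboring temperature slots. It leaves out the evolutionary profile and greedy strategy described by the authors, and the function names and example energies are illustrative assumptions.

```python
import math
import random


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis probability of exchanging the states of two replicas."""
    beta_i, beta_j = 1.0 / temp_i, 1.0 / temp_j
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)


def attempt_swaps(energies, temperatures, rng):
    """Try one swap per neighboring temperature slot.

    energies[c] is the energy of configuration c; the returned list gives,
    for each temperature slot, the index of the configuration now held there.
    """
    order = list(range(len(energies)))
    for k in range(len(order) - 1):
        i, j = order[k], order[k + 1]
        p = swap_probability(energies[i], temperatures[k],
                             energies[j], temperatures[k + 1])
        if rng.random() < p:
            order[k], order[k + 1] = j, i
    return order


if __name__ == "__main__":
    rng = random.Random(0)
    print(attempt_swaps([5.0, 1.0, 3.0], [1.0, 2.0, 4.0], rng))
```

A swap that moves the lower-energy state to the colder replica is always accepted, so good candidates migrate toward low temperatures while the hot replicas keep exploring.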

15.

Estimating Change in Foldability Due to Multipoint Deletions in Protein Structures.

Banerjee, Anupam; Kumar, Amit; Ghosh, Kushal Kanti; Mitra, Pralay.

J Chem Inf Model ; 60(12): 6679-6690, 2020 12 28.

Article in English | MEDLINE | ID: mdl-33225697

ABSTRACT

Insertions/deletions of amino acids in the protein backbone potentially result in altered structural/functional specifications. They can either contribute positively to the evolutionary process or can result in disease conditions. Despite being the second most prevalent form of protein modification, there are no databases or computational frameworks that delineate harmful multipoint deletions (MPD) from beneficial ones. We introduce a positive unlabeled learning-based prediction framework (PROFOUND) that utilizes fold-level attributes, environment-specific properties, and deletion site-specific properties to predict the change in foldability arising from such MPDs, both in the non-loop and loop regions of protein structures. In the absence of any protein structure dataset to study MPDs, we introduce a dataset with 153 MPD instances that lead to native-like folded structures and 7650 unlabeled MPD instances whose effect on the foldability of the corresponding proteins is unknown. PROFOUND on 10-fold cross-validation on our newly introduced dataset reports a recall of 82.2% (86.6%) and a fall out rate (FR) of 14.2% (20.6%), corresponding to MPDs in the protein loop (non-loop) region. The low FR suggests that the foldability in proteins subject to MPDs is not random and necessitates unique specifications of the deleted region. In addition, we find that additional evolutionary attributes contribute to higher recall and lower FR. The first of a kind foldability prediction system owing to MPD instances and the newly introduced dataset will potentially aid in novel protein engineering endeavors.

Subject(s)

Amino Acids , Proteins , Protein Engineering , Protein Folding , Proteins/genetics
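
The abstract above reports recall and fall-out rate (FR) for a positive-unlabeled learning setup. The sketch below shows one simple way such figures can be computed: recall over the labeled positives, and a fall-out proxy as the fraction of unlabeled instances predicted positive. Treating unlabeled instances as negatives is an assumption made here for illustration and may differ from how the authors estimate FR; the function name and toy data are hypothetical.

```python
def pu_metrics(labels, predictions):
    """Return (recall, fall_out_proxy) for positive-unlabeled data.

    labels: 1 for a labeled positive, 0 for an unlabeled instance.
    predictions: 1 if the model predicts positive, else 0.
    """
    pos_total = sum(1 for y in labels if y == 1)
    unl_total = sum(1 for y in labels if y == 0)
    if pos_total == 0 or unl_total == 0:
        raise ValueError("need at least one positive and one unlabeled instance")
    pos_hit = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    unl_hit = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return pos_hit / pos_total, unl_hit / unl_total


if __name__ == "__main__":
    labels = [1, 1, 1, 0, 0, 0, 0]
    predictions = [1, 1, 0, 1, 0, 0, 0]
    print(pu_metrics(labels, predictions))
```

Because some unlabeled instances may in fact be positive, this proxy tends to overstate the true false-positive rate.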

16.

Ebola Virus VP35 Protein: Modeling of the Tetrameric Structure and an Analysis of Its Interaction with Human PKR.

Banerjee, Anupam; Mitra, Pralay.

J Proteome Res ; 19(11): 4533-4542, 2020 11 06.

Article in English | MEDLINE | ID: mdl-32871072

ABSTRACT

The Viral Protein 35 (VP35), a crucial protein of the Zaire Ebolavirus (EBOV), interacts with a plethora of human proteins to cripple the human immune system. Despite its importance, the entire structure of the tetrameric assembly of EBOV VP35 and the means by which it antagonizes the autophosphorylation of the kinase domain of human protein kinase R (PKRK) is still elusive. We consult existing structural information to model a tetrameric assembly of the VP35 protein where 93% of the protein is modeled using crystal structure templates. We analyze our modeled tetrameric structure to identify interchain bonding networks and use molecular dynamics simulations and normal-mode analysis to unravel the flexibility and deformability of the different regions of the VP35 protein. We establish that the C-terminal of VP35 (VP35C) directly interacts with PKRK to prevent it from autophosphorylation. Further, we identify three plausible VP35C-PKRK complexes with better affinity than the PKRK dimer formed during autophosphorylation and use protein design to establish a new stretch in VP35C that interacts with PKRK. The proposed tetrameric assembly will aid in better understanding of the VP35 protein, and the reported VP35C-PKRK complexes along with their interacting sites will help in the shortlisting of small molecule inhibitors.

Subject(s)

Ebolavirus , Hemorrhagic Fever, Ebola , Humans , Nucleocapsid Proteins , Viral Proteins

17.

ETV6 gene aberrations in non-haematological malignancies: A review highlighting ETV6 associated fusion genes in solid tumors.

Biswas, Angana; Rajesh, Yetirajam; Mitra, Pralay; Mandal, Mahitosh.

Biochim Biophys Acta Rev Cancer ; 1874(1): 188389, 2020 08.

Article in English | MEDLINE | ID: mdl-32659251

ABSTRACT

The ETV6 (translocation-Ets-leukemia virus) gene is a transcriptional repressor mainly involved in haematopoiesis and maintenance of vascular networks, and it has emerged as a major oncogene with the ability to form fusions with many other genes, with carcinogenic consequences. ETV6 fusions function primarily by constitutive activation of the kinase activity of the fusion partners, modification of the normal functions of the ETV6 transcription factor, loss of function of ETV6 or the partner gene, and activation of a proto-oncogene near the site of translocation. The role of the ETV6 fusion gene in tumorigenesis has been well documented and is more variedly found in haematological malignancies. However, the role of the ETV6 oncogene in solid tumors has also risen to prominence due to an increasing number of cases being reported with this malignancy. Since solid tumors can be well targeted, the diagnosis of this genre of tumors based on ETV6 malignancy is of crucial importance for treatment. This review highlights the important ETV6-associated fusions in solid tumors along with critical insights into existing and novel means of targeting them. A consolidation of novel therapies such as immune, gene, RNAi, and stem cell therapy and protein degradation, hitherto unused in the case of ETV6 solid tumor malignancies, may open further therapeutic avenues.

Subject(s)

Neoplasms/genetics , Oncogene Proteins, Fusion/genetics , Proto-Oncogene Proteins c-ets/genetics , Repressor Proteins/genetics , Antineoplastic Agents/therapeutic use , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Chromosome Aberrations , Humans , Molecular Targeted Therapy , Mutation , Neoplasms/pathology , Neoplasms/therapy , Oncogene Proteins, Fusion/metabolism , Proto-Oncogene Mas , Proto-Oncogene Proteins c-ets/metabolism , Repressor Proteins/metabolism , ETS Translocation Variant 6 Protein

18.

Estimating the Effect of Single-Point Mutations on Protein Thermodynamic Stability and Analyzing the Mutation Landscape of the p53 Protein.

Banerjee, Anupam; Mitra, Pralay.

J Chem Inf Model ; 60(6): 3315-3323, 2020 06 22.

Article in English | MEDLINE | ID: mdl-32401507

ABSTRACT

Nonsynonymous single-nucleotide polymorphisms often result in altered protein stability while playing crucial roles both in the evolution process and in the development of human diseases. Prediction of change in the thermodynamic stability due to such missense mutations will help in protein engineering endeavors and will contribute to a better understanding of different disease conditions. Here, we develop a machine-learning-based framework, viz., ProTSPoM, to estimate the change in protein thermodynamic stability arising out of single-point mutations (SPMs). ProTSPoM outperforms existing methods on the S2648 and S1925 databases and reports a Pearson correlation coefficient of 0.82 (0.88) and a root-mean-squared-error of 0.92 (1.06) kcal/mol between the predicted and experimental ΔΔG values on the long-established S350 (tumor suppressor p53 protein) data set. Further, we estimate the change in thermodynamic stability for all possible SPMs in the DNA binding domain of the p53 protein. We identify single-nucleotide polymorphisms in p53 which are plausibly detrimental to its structural integrity and interaction affinity with the DNA molecule. ProTSPoM with its reliable estimates and time-efficient prediction is well suited to be integrated with existing protein engineering techniques. The ProTSPoM web server is accessible at http://cosmos.iitkgp.ac.in/ProTSPoM/.
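The two agreement measures quoted in this abstract, the Pearson correlation coefficient and the root-mean-squared error between predicted and experimental ΔΔG values, can be computed as in the minimal sketch below. The function names and the sample values are hypothetical illustrations and are not taken from ProTSPoM or its data sets.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance numerator and the two standard-deviation numerators
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rmse(predicted, observed):
    """Root-mean-squared error, in the same units as the inputs (e.g. kcal/mol)."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical ddG values in kcal/mol, for illustration only
experimental = [-1.2, 0.3, -2.5, 0.8, -0.4]
predicted = [-1.0, 0.1, -2.0, 1.1, -0.9]

r = pearson_r(predicted, experimental)
error = rmse(predicted, experimental)
```

A higher correlation together with a lower error indicates closer agreement between prediction and experiment; the two measures are reported together because a predictor can rank mutations well while still being biased in absolute terms.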

Subject(s)

Point Mutation , Tumor Suppressor Protein p53 , Humans , Mutation , Protein Stability , Thermodynamics , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism

19.

Boosting phosphorylation site prediction with sequence feature-based machine learning.

Maiti, Shyantani; Hassan, Atif; Mitra, Pralay.

Proteins ; 88(2): 284-291, 2020 02.

Article in English | MEDLINE | ID: mdl-31412138

ABSTRACT

Protein phosphorylation is one of the essential post-translational modifications, playing a vital role in the regulation of many fundamental cellular processes. We propose a LightGBM-based computational approach that uses evolutionary, geometric, sequence environment, and amino acid-specific features to decipher phosphate binding sites from a protein sequence. When compared with other existing methods on 2429 protein sequences taken from the standard Phospho.ELM (P.ELM) benchmark data set featuring 11 organisms, our method reports a higher F1 score = 0.504 (harmonic mean of the precision and recall) and ROC AUC = 0.836 (area under the curve of the receiver operating characteristics). The computation time of our proposed approach is much less than that of the recently developed deep learning-based framework. Structural analysis on selected protein sequences indicates that our predictions are a superset of the phosphorylation sites listed in the P.ELM data set. The foundation of our scheme is manual feature engineering and a decision tree-based classification. Hence, it is intuitive, and one can interpret the final tree as a set of rules, resulting in a deeper understanding of the relationships between biophysical features and phosphorylation sites. Our innovative problem transformation method permits more control over precision and recall, as is demonstrated by the fact that if we incorporate the output probability of the existing deep learning framework as an additional feature, then our prediction improves (F1 score = 0.546; ROC AUC = 0.849). The implementation of our method can be accessed at http://cse.iitkgp.ac.in/~pralay/resources/PPSBoost/ and is mirrored at https://cosmos.iitkgp.ac.in/PPSBoost.
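The F1 score reported in this abstract is the harmonic mean of precision and recall. The minimal sketch below shows that calculation from confusion-matrix counts. The function names and the counts are hypothetical and are not taken from the PPSBoost implementation or the P.ELM benchmark.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int):
    """Return (precision, recall) from confusion-matrix counts."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts of predicted phosphorylation sites, for illustration only
p, r = precision_recall(true_pos=40, false_pos=60, false_neg=10)
f1 = f1_score(p, r)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a classifier with high recall but low precision (as in the hypothetical counts above) still receives a modest F1 score.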

Subject(s)

Computational Biology/methods , Machine Learning , Protein Processing, Post-Translational , Proteins/chemistry , Sequence Analysis, Protein/methods , Algorithms , Animals , Binding Sites , Databases, Protein , Humans , Models, Molecular , Phosphorylation , Protein Conformation , Proteins/metabolism , Reproducibility of Results , Serine/chemistry , Serine/metabolism , Species Specificity , Threonine/chemistry , Threonine/metabolism , Tyrosine/chemistry , Tyrosine/metabolism

20.

Delineation of crosstalk between HSP27 and MMP-2/MMP-9: A synergistic therapeutic avenue for glioblastoma management.

Rajesh, Y; Banerjee, Anupam; Pal, Ipsita; Biswas, Angana; Das, Subhayan; Dey, Kaushik Kumar; Kapoor, Neelkamal; Ghosh, Ananta Kumar; Mitra, Pralay; Mandal, Mahitosh.

Biochim Biophys Acta Gen Subj ; 1863(7): 1196-1209, 2019 07.

Article in English | MEDLINE | ID: mdl-31028823

ABSTRACT

BACKGROUND: Epithelial to mesenchymal transition (EMT) and extracellular matrix (ECM) remodeling are the two elemental processes promoting glioblastoma (GBM). In the present work, we propose a mechanistic model of GBM and, in the process, establish a hypothesis elucidating critical crosstalk between heat shock proteins (HSPs) and matrix metalloproteinases (MMPs) with synergistic upregulation of the EMT-like process and ECM remodeling. METHODS: The interaction and the precise binding site between the HSP and MMP proteins were assayed computationally, in vitro, and in GBM clinical samples. RESULTS: A positive crosstalk of HSP27 with MMP-2 and MMP-9 was established in both GBM patient tissues and cell lines. This association was found to be of prime significance for ECM remodeling and promotion of EMT-like characteristics. In silico predictions revealed 3 plausible sites at which HSP27 interacts with MMP-2 and MMP-9. Site-directed mutagenesis followed by an in vitro immunoprecipitation (IP) assay with 3 mutated recombinant HSP27 proteins confirmed an interface stretch containing residues 29-40 of HSP27 to be a common interaction site for both MMP-2 and MMP-9. This was further validated by in vitro IP of truncated (lacking amino acids 29-40) recombinant HSP27 with MMP-2 and MMP-9. CONCLUSION: The association of HSP27 with the MMP-2 and MMP-9 proteins, along with the identified interacting stretch, has the potential to contribute towards drug development to inhibit GBM infiltration and migration. GENERAL SIGNIFICANCE: Current findings provide a novel therapeutic target for GBM, opening a new horizon in the field of GBM management.

Subject(s)

Brain Neoplasms/therapy , Glioblastoma/therapy , HSP27 Heat-Shock Proteins/metabolism , Matrix Metalloproteinase 2/metabolism , Matrix Metalloproteinase 9/metabolism , Brain Neoplasms/metabolism , Brain Neoplasms/pathology , Cell Line, Tumor , Disease Progression , Glioblastoma/metabolism , Glioblastoma/pathology , Humans
