Search | BVS CLAP/SMR-OPS/OMS

1.

Genome3D: integrating a collaborative data pipeline to expand the depth and breadth of consensus protein structure annotation.

Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Finn, Robert D; Gough, Julian; Jones, David; Kelley, Lawrence A; Paysan-Lafosse, Typhaine; Lam, Su Datt; Murzin, Alexey G; Pandurangan, Arun Prasad; Salazar, Gustavo A; Skwark, Marcin J; Sternberg, Michael J E; Velankar, Sameer; Orengo, Christine.

Nucleic Acids Res ; 48(D1): D314-D319, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31733063

ABSTRACT

Genome3D (https://www.genome3d.eu) is a freely available resource that provides consensus structural annotations for representative protein sequences taken from a selection of model organisms. Since the last NAR update in 2015, the method of data submission has been overhauled, with annotations now being 'pushed' to the database via an API. As a result, contributing groups are now able to manage their own structural annotations, making the resource more flexible and maintainable. The new submission protocol brings a number of additional benefits, including instant validation of data and no requirement to synchronise releases between resources. It also makes it possible to implement the submission of these structural annotations as an automated part of existing internal workflows. In turn, these improvements facilitate Genome3D being opened up to new prediction algorithms and groups. For the latest release of Genome3D (v2.1), the underlying dataset of sequences used as prediction targets has been updated using the latest reference proteomes available in UniProtKB. A number of new reference proteomes of particular interest to the wider scientific community have also been added: cow, pig, wheat and Mycobacterium tuberculosis. These additions, along with improvements to the underlying predictions from contributing resources, have ensured that the number of annotations in Genome3D has nearly doubled since the last NAR update article. The new API has also been used to facilitate the dissemination of Genome3D data into InterPro, thereby widening the visibility of both the annotation data and annotation algorithms.

Subject(s)

Proteins/chemistry; Databases, Protein; Proteins/classification; Proteins/genetics; User-Computer Interface

2.

Interacting networks of resistance, virulence and core machinery genes identified by genome-wide epistasis analysis.

Skwark, Marcin J; Croucher, Nicholas J; Puranen, Santeri; Chewapreecha, Claire; Pesonen, Maiju; Xu, Ying Ying; Turner, Paul; Harris, Simon R; Beres, Stephen B; Musser, James M; Parkhill, Julian; Bentley, Stephen D; Aurell, Erik; Corander, Jukka.

PLoS Genet ; 13(2): e1006508, 2017 02.

Article in English | MEDLINE | ID: mdl-28207813

ABSTRACT

Recent advances in the scale and diversity of population genomic datasets for bacteria now provide the potential for genome-wide patterns of co-evolution to be studied at the resolution of individual bases. Here we describe a new statistical method, genomeDCA, which uses recent advances in computational structural biology to identify the polymorphic loci under the strongest co-evolutionary pressures. We apply genomeDCA to two large population data sets representing the major human pathogens Streptococcus pneumoniae (pneumococcus) and Streptococcus pyogenes (group A Streptococcus). For pneumococcus we identified 5,199 putative epistatic interactions between 1,936 sites. Over three-quarters of the links were between sites within the pbp2x, pbp1a and pbp2b genes, the sequences of which are critical in determining non-susceptibility to beta-lactam antibiotics. A network-based analysis found these genes were also coupled to that encoding dihydrofolate reductase, changes to which underlie trimethoprim resistance. Distinct from these antibiotic resistance genes, a large network component of 384 protein coding sequences encompassed many genes critical in basic cellular functions, while another distinct component included genes associated with virulence. The group A Streptococcus (GAS) data set population represents a clonal population with relatively little genetic variation and a high level of linkage disequilibrium across the genome. Despite this, we were able to pinpoint two RNA pseudouridine synthases, which were each strongly linked to a separate set of loci across the chromosome, representing biologically plausible targets of co-selection. The population genomic analysis method applied here identifies statistically significantly co-evolving locus pairs, potentially arising from fitness selection interdependence reflecting underlying protein-protein interactions, or genes whose product activities contribute to the same phenotype. 
This discovery approach greatly enhances the future potential of epistasis analysis for systems biology, and can complement genome-wide association studies as a means of formulating hypotheses for targeted experimental work.

Subject(s)

Epistasis, Genetic; Selection, Genetic/genetics; Streptococcus pneumoniae/genetics; Streptococcus pyogenes/genetics; beta-Lactam Resistance/genetics; Aminoacyltransferases/genetics; Anti-Bacterial Agents/therapeutic use; Bacterial Proteins/genetics; Gene Regulatory Networks/genetics; Genetics, Population; Genome, Bacterial/genetics; Genomics; Genotype; Humans; Microbial Sensitivity Tests; Penicillin-Binding Proteins/chemistry; Penicillin-Binding Proteins/genetics; Peptidyl Transferases/genetics; Streptococcus pneumoniae/drug effects; Streptococcus pneumoniae/pathogenicity; Streptococcus pyogenes/drug effects; Streptococcus pyogenes/pathogenicity; beta-Lactams/metabolism

3.

Predicting accurate contacts in thousands of Pfam domain families using PconsC3.

Michel, Mirco; Skwark, Marcin J; Menéndez Hurtado, David; Ekeberg, Magnus; Elofsson, Arne.

Bioinformatics ; 33(18): 2859-2866, 2017 Sep 15.

Article in English | MEDLINE | ID: mdl-28535189

ABSTRACT

MOTIVATION: A few years ago it was shown that by using a maximum entropy approach to describe couplings between columns in a multiple sequence alignment it is possible to significantly increase the accuracy of residue contact predictions. For very large protein families with more than 1000 effective sequences, the accuracy is sufficient to produce accurate models of proteins as well as complexes. Today, for about half of all Pfam domain families no structure is known, but unfortunately most of these families have at most a few hundred members, i.e. are too small for such contact prediction methods. RESULTS: To extend accurate contact predictions to the thousands of smaller protein families we present PconsC3, a fast and improved method for protein contact predictions that can be used for families with as few as 100 effective sequence members. PconsC3 significantly outperforms direct coupling analysis (DCA) methods, independent of family size, secondary structure content, contact range, or the number of selected contacts. AVAILABILITY AND IMPLEMENTATION: PconsC3 is available as a web server and downloadable version at http://c3.pcons.net. The downloadable version is free for all to use and licensed under the GNU General Public License, version 2. Contact predictions for most Pfam families are also available at this site. We estimate that more than 4000 contact maps for Pfam families of unknown structure have more than 50% of the top-ranked contacts predicted correctly. CONTACT: arne@bioinfo.se. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Computational Biology/methods; Protein Structure, Secondary; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Software

4.

PconsFold: improved contact predictions improve protein models.

Michel, Mirco; Hayat, Sikander; Skwark, Marcin J; Sander, Chris; Marks, Debora S; Elofsson, Arne.

Bioinformatics ; 30(17): i482-8, 2014 Sep 01.

Article in English | MEDLINE | ID: mdl-25161237

ABSTRACT

MOTIVATION: Recently it has been shown that the quality of protein contact prediction from evolutionary information can be improved significantly if direct and indirect information is separated. Given sufficiently large protein families, the contact predictions contain sufficient information to predict the structure of many protein families. However, since the first studies, contact prediction methods have improved. Here, we ask how much the final models are improved if improved contact predictions are used. RESULTS: In a small benchmark of 15 proteins, we show that the TM-scores of top-ranked models are improved by 33% on average using PconsFold compared with the original version of EVfold. In a larger benchmark, we find that the quality is improved by 15-30% when using PconsC in comparison with earlier contact prediction methods. Further, using Rosetta instead of CNS does not significantly improve global model accuracy, but the chemistry of models generated with Rosetta is improved. AVAILABILITY: PconsFold is a fully automated pipeline for ab initio protein structure prediction based on evolutionary information. PconsFold is based on PconsC contact prediction and uses the Rosetta folding protocol. Due to its modularity, the contact prediction tool can be easily exchanged. The source code of PconsFold is available on GitHub at https://www.github.com/ElofssonLab/pcons-fold under the MIT license. PconsC is available from http://c.pcons.net/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Models, Molecular; Protein Conformation; Protein Folding; Software; Algorithms; Amino Acids/chemistry; Proteins/chemistry; Sequence Analysis, Protein

5.

Improving contact prediction along three dimensions.

Feinauer, Christoph; Skwark, Marcin J; Pagnani, Andrea; Aurell, Erik.

PLoS Comput Biol ; 10(10): e1003847, 2014 Oct.

Article in English | MEDLINE | ID: mdl-25299132

ABSTRACT

Correlation patterns in multiple sequence alignments of homologous proteins can be exploited to infer information on the three-dimensional structure of their members. The typical pipeline to address this task, which we in this paper refer to as the three dimensions of contact prediction, is to (i) filter and align the raw sequence data representing the evolutionarily related proteins; (ii) choose a predictive model to describe a sequence alignment; (iii) infer the model parameters and interpret them in terms of structural properties, such as an accurate contact map. We show here that all three dimensions are important for overall prediction success. In particular, we show that it is possible to improve significantly along the second dimension by going beyond the pair-wise Potts models from statistical physics, which have hitherto been the focus of the field. These (simple) extensions are motivated by multiple sequence alignments often containing long stretches of gaps which, as a data feature, would be rather untypical for independent samples drawn from a Potts model. Using a large test set of proteins we show that the combined improvements along the three dimensions are as large as any reported to date.
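For orientation, the pair-wise Potts model that this abstract takes as its starting point is conventionally written as below. The notation is the standard one from the statistical-physics literature and is an assumption here, since the abstract itself gives no symbols: a_i is the state (amino acid or gap) in alignment column i, L is the alignment length, h_i are single-site fields, J_ij are pairwise couplings, and Z is the normalising partition function.

```latex
P(a_1,\dots,a_L) \;=\; \frac{1}{Z}\,
\exp\!\Big( \sum_{i=1}^{L} h_i(a_i) \;+\; \sum_{1 \le i < j \le L} J_{ij}(a_i,a_j) \Big)
```

In this framing, column pairs with strong inferred couplings J_ij are read as candidate residue contacts; the extensions described in the abstract go beyond this pair-wise form to account for long stretches of gaps.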

Subject(s)

Computational Biology/methods; Proteins/chemistry; Sequence Analysis, Protein/methods; Models, Statistical; Sequence Alignment

6.

Improved contact predictions using the recognition of protein like contact patterns.

Skwark, Marcin J; Raimondi, Daniele; Michel, Mirco; Elofsson, Arne.

PLoS Comput Biol ; 10(11): e1003889, 2014 Nov.

Article in English | MEDLINE | ID: mdl-25375897

ABSTRACT

Given sufficiently large protein families, and using a global statistical inference approach, it is possible to obtain sufficient accuracy in protein residue contact predictions to predict the structure of many proteins. However, these approaches do not consider the fact that the contacts in a protein are neither randomly nor independently distributed, but actually follow precise rules governed by the structure of the protein and thus are interdependent. Here, we present PconsC2, a novel method that uses a deep learning approach to identify protein-like contact patterns to improve contact predictions. A substantial enhancement can be seen for all contacts independently of the number of aligned sequences, residue separation or secondary structure type, but it is largest for β-sheet-containing proteins. In addition to being superior to earlier methods based on statistical inference, PconsC2 is superior to state-of-the-art machine learning methods for families with more than 100 effective sequence homologs. The improved contact prediction enables improved structure prediction.

Subject(s)

Computational Biology/methods; Pattern Recognition, Automated/methods; Proteins/chemistry; Sequence Analysis, Protein/methods; Artificial Intelligence; Databases, Protein; Protein Conformation; Protein Structure, Secondary

7.

PconsD: ultra rapid, accurate model quality assessment for protein structure prediction.

Skwark, Marcin J; Elofsson, Arne.

Bioinformatics ; 29(14): 1817-8, 2013 Jul 15.

Article in English | MEDLINE | ID: mdl-23677942

ABSTRACT

SUMMARY: Clustering methods are often needed for accurately assessing the quality of modeled protein structures. Recent blind evaluation of quality assessment methods in CASP10 showed that there is little difference between many different methods as far as ranking models and selecting the best model are concerned. When comparing many models, the computational cost of the model comparison can become significant. Here, we present PconsD, a fast, stream-computing method for distance-driven model quality assessment that runs on consumer hardware. PconsD is at least one order of magnitude faster than other methods of comparable accuracy. AVAILABILITY: The source code for PconsD is freely available at http://d.pcons.net/. Supplementary benchmarking data are also available there. CONTACT: arne@bioinfo.se. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Models, Molecular; Protein Conformation; Software; Cluster Analysis; Computational Biology/methods; Proteins/chemistry

8.

PconsC: combination of direct information methods and alignments improves contact prediction.

Skwark, Marcin J; Abdel-Rehim, Abbi; Elofsson, Arne.

Bioinformatics ; 29(14): 1815-6, 2013 Jul 15.

Article in English | MEDLINE | ID: mdl-23658418

ABSTRACT

SUMMARY: Recently, several new contact prediction methods have been published. They (i) use large sets of multiple aligned sequences and (ii) assume that correlations between columns in these alignments can be the result of indirect interactions. These methods are clearly superior to earlier methods when it comes to predicting contacts in proteins. Here, we demonstrate that combining predictions from two prediction methods, PSICOV and plmDCA, and two alignment methods, HHblits and jackhmmer, at four different e-value cut-offs provides a relative improvement of 20% in comparison with the best single method, exceeding 70% correct predictions for one contact prediction per residue. AVAILABILITY: The source code for PconsC along with supplementary data is freely available at http://c.pcons.net/. CONTACT: arne@bioinfo.se. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
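The input grid and the evaluation metric mentioned in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the four e-value cut-offs apply to each alignment method, the cut-off values are placeholders, and `precision_at_n` and the toy scores are hypothetical. How PconsC actually combines the individual predictions is not described in the abstract and is not modelled here.

```python
from itertools import product

PREDICTORS = ("PSICOV", "plmDCA")
ALIGNERS = ("HHblits", "jackhmmer")
# Placeholder cut-offs (assumption): the abstract only says there are four.
E_VALUE_CUTOFFS = (1e-40, 1e-10, 1e-4, 1.0)

# Every (predictor, aligner, cut-off) combination is one input prediction.
inputs = list(product(PREDICTORS, ALIGNERS, E_VALUE_CUTOFFS))
print(len(inputs))  # 2 x 2 x 4 = 16


def precision_at_n(scores, true_contacts, n_predictions):
    """Fraction of the n highest-scoring residue pairs that are true contacts.

    Setting n_predictions to the protein length corresponds to
    'one contact prediction per residue'.
    """
    top = sorted(scores, key=scores.get, reverse=True)[:n_predictions]
    return sum(pair in true_contacts for pair in top) / len(top)


# Toy data (hypothetical): predicted scores per residue pair and known contacts.
scores = {(1, 8): 0.9, (2, 9): 0.8, (3, 12): 0.4, (4, 11): 0.7, (5, 14): 0.1}
true_contacts = {(1, 8), (2, 9), (3, 12)}
print(precision_at_n(scores, true_contacts, n_predictions=4))  # 0.75
```

In the toy example, three of the four top-ranked pairs are true contacts, giving a precision of 0.75.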

Subject(s)

Protein Conformation; Sequence Alignment/methods; Sequence Analysis, Protein; Algorithms; Proteins/chemistry

9.

The influence of manual semen collection in male trained dogs (Canis familiaris), in the presence or absence of a female in estrus, on the concentrations of cortisol, oxytocin, prolactin and testosterone.

Woszczylo, Martyna; Szumny, Antoni; Knap, Piotr; Jezierski, Tadeusz; Nizanski, Wojciech; Kokocinska, Agata; Skwark, Marcin J; Dzieciol, Michal.

PLoS One ; 18(2): e0278524, 2023.

Article in English | MEDLINE | ID: mdl-36730259

ABSTRACT

Sex pheromones are chemical substances secreted into the environment that affect the physiology and behavior of recipients. Females use these compounds during estrus to attract males, which leads to mating attempts. This study evaluates the influence of manual semen collection in male dogs, in the presence or absence of a female in estrus, on the blood concentrations of cortisol (CRT), oxytocin (OXT), prolactin (PRL) and testosterone (T), hormones involved in both the physiology of reproduction and stress. Ten male dogs were used in Experiment 1 to measure the serum and plasma concentrations of the aforementioned hormones in the absence of semen collection. Subsequently, in the same animals, the concentrations of these hormones were evaluated before and after semen collection in the presence (Exp. 2) or absence (Exp. 3) of a female in estrus. No significant changes in hormone concentration caused by semen collection were found, either with or without a female in estrus present. The results suggest that manual semen collection in dogs, probably due to its passive character, does not stimulate endocrine glands to secrete hormones, and that the process of ejaculation is probably controlled by a neural pathway. The lack of effect of semiochemical stimulation on CRT, PRL, OXT and T levels could be caused by the short contact with the female during semen collection. Further studies on the involvement of these hormones during natural mating, especially when preceded by long courtship similar to that observed under natural conditions, should shed light on the physiology of mating and the connection between the endocrine system and semiochemical stimulation in dogs.

Subject(s)

Hydrocortisone; Prolactin; Dogs; Female; Male; Animals; Oxytocin; Testosterone; Semen/physiology; Estrus/physiology; Pheromones

10.

Urinary Proteins of Female Domestic Dog (Canis familiaris) during Ovarian Cycle.

Woszczylo, Martyna; Pasikowski, Pawel; Devaraj, Sankarganesh; Kokocinska, Agata; Szumny, Antoni; Skwark, Marcin J; Nizanski, Wojciech; Dzieciol, Michal.

Vet Sci ; 10(4), 2023 Apr 14.

Article in English | MEDLINE | ID: mdl-37104448

ABSTRACT

The presence and identity of non-volatile chemical signals remain elusive in canines. In this study, we aim to evaluate the urinary proteins of female domestic dogs in the estrus and anestrus phases to evidence the presence of non-volatile chemical signals and to elucidate their identities. We collected urine samples from eight female dogs in the estrus and anestrus phases. A total of 240 proteins were identified in the urine samples using liquid chromatography-mass spectrometry (LC-MS) analysis. The comparison of the proteins revealed a significant difference between the estrus and anestrus urine. We identified proteins belonging to the lipocalin family of canines (beta-lactoglobulin-1 and beta-lactoglobulin-2, P33685 and P33686, respectively), one of whose functions is the transport of pheromones and which were present only in the estrus urine samples. Moreover, proteins such as Clusterin (CLU), Liver-expressed antimicrobial peptide 2 (LEAP2), and Proenkephalin (PENK) were more abundant in the estrus urine than in the anestrus urine. LEAP2 was recently described as a ghrelin receptor antagonist and implicated in regulating food intake and body weight in humans and mice. Proenkephalin, a polypeptide hormone cleaved into opioid peptides, was also recognized as a candidate to determine kidney function. As yet, none of these has been shown to play a role in chemical communication. Clusterin, an extracellular chaperone protecting from protein aggregation implicated in stress-induced cell apoptosis, is a plausible candidate in chemical communication, a claim that needs to be ascertained further. Data are available via ProteomeXchange with the identifier PXD040418.

11.

Early computational detection of potential high-risk SARS-CoV-2 variants.

Beguir, Karim; Skwark, Marcin J; Fu, Yunguan; Pierrot, Thomas; Carranza, Nicolas Lopez; Laterre, Alexandre; Kadri, Ibtissem; Korched, Abir; Lowegard, Anna U; Lui, Bonny Gaby; Sänger, Bianca; Liu, Yunpeng; Poran, Asaf; Muik, Alexander; Sahin, Ugur.

Comput Biol Med ; 155: 106618, 2023 03.

Article in English | MEDLINE | ID: mdl-36774893

ABSTRACT

The ongoing COVID-19 pandemic is leading to the discovery of hundreds of novel SARS-CoV-2 variants daily. While most variants do not impact the course of the pandemic, some variants pose an increased risk when the acquired mutations allow better evasion of antibody neutralisation or increased transmissibility. Early detection of such high-risk variants (HRVs) is paramount for the proper management of the pandemic. However, experimental assays to determine immune evasion and transmissibility characteristics of new variants are resource-intensive and time-consuming, potentially leading to delays in appropriate responses by decision makers. Presented herein is a novel in silico approach combining spike (S) protein structure modelling and large protein transformer language models on S protein sequences to accurately rank SARS-CoV-2 variants for immune escape and fitness potential. Both metrics were experimentally validated using in vitro pseudovirus-based neutralisation test and binding assays and were subsequently combined to explore the changing landscape of the pandemic and to create an automated Early Warning System (EWS) capable of evaluating new variants in minutes and risk-monitoring variant lineages in near real-time. The system accurately pinpoints the putatively dangerous variants by selecting on average less than 0.3% of the novel variants each week. The EWS flagged all 16 variants designated by the World Health Organization (WHO) as variants of interest (VOIs) if applicable or variants of concern (VOCs) otherwise with an average lead time of more than one and a half months ahead of their designation as such.

Subject(s)

COVID-19; SARS-CoV-2; Humans; Pandemics; Benchmarking; Mutation

12.

Improved predictions by Pcons.net using multiple templates.

Larsson, Per; Skwark, Marcin J; Wallner, Björn; Elofsson, Arne.

Bioinformatics ; 27(3): 426-7, 2011 Feb 01.

Article in English | MEDLINE | ID: mdl-21149277

ABSTRACT

Multiple templates can often be used to build more accurate homology models than models built from a single template. Here we introduce PconsM, an automated protocol that uses multiple templates to build protein models. PconsM has been among the top-performing methods in the recent CASP experiments and consistently performs better than the single-template models used in Pcons.net. In particular, for the easier targets with many alternative templates with a high degree of sequence identity, quality is readily improved by a few percent over the highest-ranked model built on a single template. PconsM is available as an additional pipeline within the Pcons.net protein structure prediction server. AVAILABILITY AND IMPLEMENTATION: PconsM is freely available from http://pcons.net/.

Subject(s)

Computational Biology/methods; Models, Molecular; Proteins/chemistry; Software

13.

The structural basis of Cdc7-Dbf4 kinase dependent targeting and phosphorylation of the MCM2-7 double hexamer.

Saleh, Almutasem; Noguchi, Yasunori; Aramayo, Ricardo; Ivanova, Marina E; Stevens, Kathryn M; Montoya, Alex; Sunidhi, S; Carranza, Nicolas Lopez; Skwark, Marcin J; Speck, Christian.

Nat Commun ; 13(1): 2915, 2022 05 25.

Article in English | MEDLINE | ID: mdl-35614055

ABSTRACT

The controlled assembly of replication forks is critical for genome stability. The Dbf4-dependent Cdc7 kinase (DDK) initiates replisome assembly by phosphorylating the MCM2-7 replicative helicase at the N-terminal tails of Mcm2, Mcm4 and Mcm6. At present, it remains poorly understood how DDK docks onto the helicase and how the kinase targets distal Mcm subunits for phosphorylation. Using cryo-electron microscopy and biochemical analysis, we discovered that an interaction between the HBRCT domain of Dbf4 and Mcm2 serves as an anchoring point, which supports binding of DDK across the MCM2-7 double-hexamer interface and phosphorylation of Mcm4 on the opposite hexamer. Moreover, a rotation of DDK along its anchoring point allows phosphorylation of Mcm2 and Mcm6. In summary, our work provides fundamental insights into DDK structure, control and selective activation of the MCM2-7 helicase during DNA replication. Importantly, these insights can be exploited for development of novel DDK inhibitors.

Subject(s)

Cell Cycle Proteins; Minichromosome Maintenance Proteins; Protein Serine-Threonine Kinases; Saccharomyces cerevisiae Proteins; Cell Cycle Proteins/metabolism; Cryoelectron Microscopy; DNA Replication; Minichromosome Maintenance Proteins/metabolism; Phosphorylation; Protein Serine-Threonine Kinases/metabolism; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/metabolism

14.

The Collagen Receptor Discoidin Domain Receptor 1b Enhances Integrin β1-Mediated Cell Migration by Interacting With Talin and Promoting Rac1 Activation.

Borza, Corina M; Bolas, Gema; Zhang, Xiuqi; Browning Monroe, Mary Beth; Zhang, Ming-Zhi; Meiler, Jens; Skwark, Marcin J; Harris, Raymond C; Lapierre, Lynne A; Goldenring, James R; Hook, Magnus; Rivera, Jose; Brown, Kyle L; Leitinger, Birgit; Tyska, Matthew J; Moser, Markus; Böttcher, Ralph T; Zent, Roy; Pozzi, Ambra.

Front Cell Dev Biol ; 10: 836797, 2022.

Article in English | MEDLINE | ID: mdl-35309920

ABSTRACT

Integrins and discoidin domain receptors (DDRs) 1 and 2 promote cell adhesion and migration on both fibrillar and non fibrillar collagens. Collagen I contains DDR and integrin selective binding motifs; however, the relative contribution of these two receptors in regulating cell migration is unclear. DDR1 has five isoforms (DDR1a-e), with most cells expressing the DDR1a and DDR1b isoforms. We show that human embryonic kidney 293 cells expressing DDR1b migrate more than DDR1a expressing cells on DDR selective substrata as well as on collagen I in vitro. In addition, DDR1b expressing cells show increased lung colonization after tail vein injection in nude mice. DDR1a and DDR1b differ from each other by an extra 37 amino acids in the DDR1b cytoplasmic domain. Interestingly, these 37 amino acids contain an NPxY motif which is a central control module within the cytoplasmic domain of β integrins and acts by binding scaffold proteins, including talin. Using purified recombinant DDR1 cytoplasmic tail proteins, we show that DDR1b directly binds talin with higher affinity than DDR1a. In cells, DDR1b, but not DDR1a, colocalizes with talin and integrin β1 to focal adhesions and enhances integrin β1-mediated cell migration. Moreover, we show that DDR1b promotes cell migration by enhancing Rac1 activation. Mechanistically DDR1b interacts with the GTPase-activating protein (GAP) Breakpoint cluster region protein (BCR) thus reducing its GAP activity and enhancing Rac activation. Our study identifies DDR1b as a major driver of cell migration and talin and BCR as key players in the interplay between integrins and DDR1b in regulating cell migration.

15.

HARP: a database of structural impacts of systematic missense mutations in drug targets of Mycobacterium leprae.

Vedithi, Sundeep Chaitanya; Malhotra, Sony; Skwark, Marcin J; Munir, Asma; Acebrón-García-De-Eulate, Marta; Waman, Vaishali P; Alsulami, Ali; Ascher, David B; Blundell, Tom L.

Comput Struct Biotechnol J ; 18: 3692-3704, 2020.

Article in English | MEDLINE | ID: mdl-33304465

ABSTRACT

Computational Saturation Mutagenesis is an in-silico approach that employs systematic mutagenesis of each amino acid residue in the protein to all other amino acid types, and predicts changes in thermodynamic stability and affinity to the other subunits/protein counterparts, ligands and nucleic acid molecules. The data thus generated are useful in understanding the functional consequences of mutations in antimicrobial resistance phenotypes. In this study, we applied computational saturation mutagenesis to three important drug targets in Mycobacterium leprae (M. leprae), namely Dihydropteroate Synthase (DHPS), RNA Polymerase (RNAP) and DNA Gyrase (GYR), the targets of the drugs dapsone, rifampin and ofloxacin, respectively. M. leprae causes leprosy and is an obligate intracellular bacillus with limited protein structural information associating mutations with phenotypic resistance outcomes in leprosy. Experimentally solved structures of DHPS, RNAP and GYR of M. leprae are not available in the Protein Data Bank; therefore, we modelled the structures of these proteins using template-based comparative modelling and introduced systematic mutations in each model, generating 80,902 mutations and mutant structures for the three proteins. Impacts of mutations on stability and protein-subunit, protein-ligand and protein-nucleic acid affinities were computed using various in-house developed and other published protein stability and affinity prediction software. A consensus impact was estimated for each mutation using qualitative scoring metrics for physicochemical properties and by a categorical grouping of stability and affinity predictions. We developed a web database named HARP (a database of Hansen's Disease Antimicrobial Resistance Profiles), which is accessible at https://harp-leprosy.org and provides the details of each of these predictions.

16.

Computational saturation mutagenesis to predict structural consequences of systematic mutations in the beta subunit of RNA polymerase in Mycobacterium leprae.

Vedithi, Sundeep Chaitanya; Rodrigues, Carlos H M; Portelli, Stephanie; Skwark, Marcin J; Das, Madhusmita; Ascher, David B; Blundell, Tom L; Malhotra, Sony.

Comput Struct Biotechnol J ; 18: 271-286, 2020.

Article in English | MEDLINE | ID: mdl-32042379

ABSTRACT

Rifampin resistance in leprosy may remain undetected due to the lack of rapid and effective diagnostic tools. A quick and reliable method is essential to determine the impacts of emerging detrimental mutations in the drug targets. The functional consequences of missense mutations in the β-subunit of RNA polymerase (RNAP) in Mycobacterium leprae (M. leprae) contribute to phenotypic resistance to rifampin in leprosy. Here, we report in-silico saturation mutagenesis of all residues in the β-subunit of RNAP to all other 19 amino acid types (generating 21,394 mutations for 1126 residues) and predict their impacts on overall thermodynamic stability, on interactions at subunit interfaces, and on β-subunit-RNA and rifampin affinities (only for the rifampin binding site) using state-of-the-art structure, sequence and normal mode analysis-based methods. Mutations in the conserved residues that line the active-site cleft show largely destabilizing effects, resulting in increased relative solvent accessibility and a concomitant decrease in residue-depth (the extent to which a residue is buried in the protein structure space) of the mutant residues. The mutations at residue positions S437, G459, H451, P489, K884 and H1035 are identified as extremely detrimental as they induce highly destabilizing effects on the overall protein stability, and nucleic acid and rifampin affinities. Destabilizing effects were predicted for all the clinically/experimentally identified rifampin-resistant mutations in M. leprae, indicating that this model can be used as a surveillance tool to monitor emerging detrimental mutations that destabilise RNAP-rifampin interactions and confer rifampin resistance in leprosy. AUTHOR SUMMARY: The emergence of primary and secondary drug resistance to rifampin in leprosy is a growing concern and poses a threat to the leprosy control and elimination measures globally.
In the absence of an effective in-vitro system to detect and monitor phenotypic resistance to rifampin in leprosy, diagnosis mainly relies on the presence of mutations in drug resistance determining regions of the rpoB gene that encodes the β-subunit of RNAP in M. leprae. Few labs in the world perform mouse footpad propagation of M. leprae in the presence of drugs (rifampin) to determine growth patterns and confirm resistance; however, these methods take 8 to 12 months, making them impractical for diagnosis. Understanding molecular mechanisms of drug resistance is vital to associating mutations with clinically detected drug resistance in leprosy. Here we propose an in-silico saturation mutagenesis approach to comprehensively elucidate the structural implications of any mutations that exist or that can arise in the β-subunit of RNAP in M. leprae. Most of the predicted mutations may not occur in M. leprae due to fitness costs, but the information generated by this approach helps decipher the impacts of mutations across the structure and conversely enables identification of stable regions in the protein that are least impacted by mutations (mutation coolspots), which can be a potential choice for small molecule binding and structure-guided drug discovery.

17.

Assessment of global and local model quality in CASP8 using Pcons and ProQ.

Larsson, Per; Skwark, Marcin J; Wallner, Björn; Elofsson, Arne.

Proteins ; 77 Suppl 9: 167-72, 2009.

Article in English | MEDLINE | ID: mdl-19544566

ABSTRACT

Model Quality Assessment Programs (MQAPs) are programs developed to rank protein models. These methods can be trained to predict the overall global quality of a model or which local regions of a model are likely to be incorrect. In CASP8, we participated with two predictors that predict both global and local quality using either consensus information (Pcons) or purely structural information (ProQ). Consistent with results in previous CASPs, the best performance in CASP8 was obtained using the Pcons method. Furthermore, the results show that the modification introduced into Pcons for CASP8 improved the predictions against GDT_TS, and a correlation coefficient above 0.9 is now achieved, whereas the correlation for ProQ is about 0.7. The correlation is better for the easier than for the harder targets, but it is not below 0.5 for a single target and below 0.7 only for three targets. The correlation coefficient for the best local quality MQAP is 0.68, showing that there is still clear room for improvement within this area. We also find that Pcons is still not always able to identify the best model. However, we show that using a linear combination of Pcons and ProQ it is possible to select models that are better than the models from the best single server. In particular, the average quality over the hard targets increases by about 6% compared with using Pcons alone.

Subject(s)

Computational Biology/methods; Models, Molecular; Proteins/chemistry; Sequence Analysis, Protein/methods; Software; Protein Conformation

18.

Mabellini: a genome-wide database for understanding the structural proteome and evaluating prospective antimicrobial targets of the emerging pathogen Mycobacterium abscessus.

Skwark, Marcin J; Torres, Pedro H M; Copoiu, Liviu; Bannerman, Bridget; Floto, R Andres; Blundell, Tom L.

Database (Oxford) ; 2019(1), 2019 01 01.

Article in English | MEDLINE | ID: mdl-31681953

ABSTRACT

Mycobacterium abscessus, a rapidly growing, multidrug-resistant, nontuberculous mycobacterium, can cause a wide range of opportunistic infections, particularly in immunocompromised individuals. M. abscessus has emerged as a growing threat to patients with cystic fibrosis, where it causes accelerated inflammatory lung damage, is difficult and sometimes impossible to treat, and can prevent safe transplantation. There is therefore an urgent unmet need to develop new therapeutic strategies. The elucidation of the M. abscessus genome in 2009 opened a wide range of research possibilities in the field of drug discovery that can be more effectively exploited upon the characterization of the structural proteome. Where there are no experimental structures, we have used the available amino acid sequences to create 3D models of the majority of the remaining proteins that constitute the M. abscessus proteome (3394 proteins and over 13 000 models) using a range of up-to-date computational tools, many developed by our own group. The models are freely available for download in an online database, together with quality data and functional annotation. Furthermore, we have developed an intuitive and user-friendly web interface (http://www.mabellinidb.science) that enables easy browsing, querying and retrieval of the proteins of interest. We believe that this resource will be of use in evaluating prospective targets for the design of antimicrobial agents and will serve as a cornerstone to support the development of new molecules to treat M. abscessus infections.

Subject(s)

Bacterial Proteins; Databases, Genetic; Genome, Bacterial; Models, Molecular; Mycobacterium abscessus; Bacterial Proteins/chemistry; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Genome-Wide Association Study; Mycobacterium Infections, Nontuberculous/genetics; Mycobacterium Infections, Nontuberculous/metabolism; Mycobacterium abscessus/chemistry; Mycobacterium abscessus/genetics; Mycobacterium abscessus/metabolism

19.

Mycobacterial genomics and structural bioinformatics: opportunities and challenges in drug discovery.

Waman, Vaishali P; Vedithi, Sundeep Chaitanya; Thomas, Sherine E; Bannerman, Bridget P; Munir, Asma; Skwark, Marcin J; Malhotra, Sony; Blundell, Tom L.

Emerg Microbes Infect ; 8(1): 109-118, 2019.

Article in English | MEDLINE | ID: mdl-30866765

ABSTRACT

Of the more than 190 distinct species of the genus Mycobacterium, many are economically and clinically important pathogens of humans or animals. Among the mycobacteria that infect humans, three species, namely Mycobacterium tuberculosis (causative agent of tuberculosis), Mycobacterium leprae (causative agent of leprosy) and Mycobacterium abscessus (causative agent of chronic pulmonary infections), pose a concern to global public health. Although antibiotics have been successfully developed to combat each of these, the emergence of drug-resistant strains is an increasing challenge for treatment and drug discovery. Here we describe the impact of the rapid expansion of genome sequencing and genome/pathway annotations that have greatly improved the progress of structure-guided drug discovery. We focus on the applications of comparative genomics, metabolomics, evolutionary bioinformatics and structural proteomics to identify potential drug targets. The opportunities and challenges for the design of drugs for M. tuberculosis, M. leprae and M. abscessus to combat resistance are discussed.

Subject(s)

Bacterial Proteins/chemistry; Computational Biology/methods; Mycobacterium/genetics; Sequence Analysis, DNA/methods; Animals; Bacterial Proteins/metabolism; Drug Discovery; Drug Resistance, Bacterial; Genome, Bacterial; Humans; Molecular Sequence Annotation; Mycobacterium/metabolism; Mycobacterium abscessus/genetics; Mycobacterium abscessus/metabolism; Mycobacterium leprae/genetics; Mycobacterium leprae/metabolism; Mycobacterium tuberculosis/genetics; Mycobacterium tuberculosis/metabolism; Protein Conformation; Proteomics

20.

Membrane protein contact and structure prediction using co-evolution in conjunction with machine learning.

Teixeira, Pedro L; Mendenhall, Jeff L; Heinze, Sten; Weiner, Brian; Skwark, Marcin J; Meiler, Jens.

PLoS One ; 12(5): e0177866, 2017.

Article in English | MEDLINE | ID: mdl-28542325

ABSTRACT

De novo membrane protein structure prediction is limited to small proteins because the conformational search space expands quickly with length. Long-range contacts (24+ amino acid separation), that is, residue positions distant in sequence but in close proximity in the structure, are arguably the most effective way to restrict this conformational space. Inverse methods for co-evolutionary analysis predict a global set of position-pair couplings that best explain the observed amino acid co-occurrences, thus distinguishing between evolutionarily explained co-variances and those arising from spurious transitive effects. Here, we show that applying machine learning approaches and custom descriptors improves evolutionary contact prediction accuracy, resulting in an improvement of average precision by 6 percentage points for the top 1L non-local contacts. Further, we demonstrate that predicted contacts improve protein folding with BCL::Fold. The mean RMSD100 metric for the top 10 models folded was reduced by an average of 2 Å for a benchmark of 25 membrane proteins.

Subject(s)

Machine Learning; Membrane Proteins/metabolism; Models, Molecular; Protein Folding; Protein Structure, Secondary/physiology; Algorithms; Amino Acid Sequence; Humans
