ABSTRACT
G-protein-coupled receptors (GPCRs) transduce physiological and sensory stimuli into appropriate cellular responses and mediate the actions of one-third of drugs. GPCR structural studies have revealed the general bases of receptor activation, signaling, drug action and allosteric modulation, but so far cover only 13% of nonolfactory receptors. We broadly surveyed the receptor modifications/engineering and methods used to produce all available GPCR crystal and cryo-electron microscopy (cryo-EM) structures, and present an interactive resource integrated in GPCRdb ( http://www.gpcrdb.org ) to assist users in designing constructs and browsing appropriate experimental conditions for structure studies.
Subjects
Computational Biology/methods, Internet, G-Protein-Coupled Receptors/genetics, Allosteric Site, Animals, Cattle, Cryoelectron Microscopy, X-Ray Crystallography, Protein Databases, Drug Design, Glycosylation, HEK293 Cells, Humans, Mutation, Phosphorylation, Protein Domains, Protein Engineering, Rhodopsin/chemistry, Signal Transduction, Software
ABSTRACT
Light-sensitive G protein-coupled receptors (GPCRs), the rhodopsins, absorb photons to isomerize their covalently bound retinal, triggering conformational changes that result in downstream signaling cascades. Monostable rhodopsins release retinal upon isomerization, whereas the retinal in bistable rhodopsins "reisomerizes" upon absorption of a second photon. Understanding the mechanistic differences between these light-sensitive GPCRs has been hindered by the scarcity of recombinant models of the latter. Here, we reveal the high-resolution crystal structure of a recombinant bistable rhodopsin, jumping spider rhodopsin-1, bound to the inverse agonist 9-cis retinal. We observe a water-mediated network around the ligand that hints at the basis of its bistable nature. In contrast to bovine rhodopsin (monostable), the transmembrane bundle of jumping spider rhodopsin-1, as well as that of the bistable squid rhodopsin, adopts a more "activation-ready" conformation often observed in other nonphotosensitive class A GPCRs. These similarities suggest a role for jumping spider rhodopsin-1 as a potential model system in the study of the structure-function relationship of both photosensitive and nonphotosensitive class A GPCRs.
Subjects
Arthropod Proteins/ultrastructure, Rhodopsin/ultrastructure, Signal Transduction/radiation effects, Spiders, Animals, Arthropod Proteins/isolation & purification, Arthropod Proteins/metabolism, X-Ray Crystallography, HEK293 Cells, Humans, Ligands, Light, Molecular Dynamics Simulation, Protein Isoforms/isolation & purification, Protein Isoforms/metabolism, Protein Isoforms/ultrastructure, Recombinant Proteins/isolation & purification, Recombinant Proteins/metabolism, Recombinant Proteins/ultrastructure, Rhodopsin/isolation & purification, Rhodopsin/metabolism, Stereoisomerism, Structure-Activity Relationship
ABSTRACT
Box jellyfish and vertebrates are separated by >500 million years of evolution yet have structurally analogous lens eyes that employ rhodopsin photopigments for vision. All opsins possess a negatively charged residue, the counterion, to maintain visible-light sensitivity and facilitate photoisomerization of their retinaldehyde chromophore. In vertebrate rhodopsins, the molecular evolution of the counterion position, from a highly conserved distal location in the second extracellular loop (E181) to a proximal location in the third transmembrane helix (E113), is established as a key driver of higher fidelity photoreception. Here, we use computational biology and heterologous action spectroscopy to determine whether the appearance of the advanced visual apparatus in box jellyfish was also accompanied by changes in the opsin tertiary structure. We found that the counterion in an opsin from the lens eye of the box jellyfish Carybdea rastonii (JellyOp) has also moved to a unique proximal location within the transmembrane bundle: E94 in TM2. Furthermore, we reveal that this Schiff base/counterion system includes an additional positive charge, R186, that has coevolved with E94 to functionally separate E94 and E181 in the chromophore-binding pocket of JellyOp. By engineering this pocket (neutralizing R186 and E94, or swapping E94 with the vertebrate counterion E113), we can recreate versions of the invertebrate and vertebrate counterion systems, respectively, supporting a relatively similar overall architecture in this region of animal opsins. In summary, our data establish only the third counterion site in animal opsins and reveal convergent evolution of tertiary structure in opsins from distantly related species with advanced visual systems.
Subjects
Cubozoa/genetics, Molecular Evolution, Rhodopsin, Ocular Vision/genetics, Animals, HEK293 Cells, Humans, Molecular Dynamics Simulation, Phylogeny, Recombinant Proteins/chemistry, Recombinant Proteins/genetics, Recombinant Proteins/metabolism, Rhodopsin/chemistry, Rhodopsin/genetics, Rhodopsin/metabolism
ABSTRACT
Moringa oleifera is a plant well known for its nutritional value, drought resistance and medicinal properties. cDNA libraries from five different tissues (leaf, root, stem, seed and flower) of M. oleifera cultivar Bhagya were generated and sequenced. We developed a bioinformatics pipeline to assemble the transcriptome and, together with the previously published M. oleifera genome, to predict 17,148 gene models. A few candidate genes related to the biosynthesis of secondary metabolites, vitamins and ion transporters were identified. Expression was further confirmed by real-time quantitative PCR experiments for a few promising leads. Quantitative estimation of metabolites, as well as elemental analysis, was also carried out to support our observations. Enzymes in the biosynthesis of vitamins and of metabolites such as quercetin and kaempferol are highly expressed in leaves, flowers and seeds. Expression of iron transporters and calcium storage proteins was observed in roots and leaves. In general, leaves retain the highest amount of small molecules of interest.
Subjects
Gene Expression Profiling, Plant Gene Expression Regulation/physiology, Moringa oleifera, Secondary Metabolism/physiology, Transcriptome/physiology, Gene Library, Moringa oleifera/genetics, Moringa oleifera/metabolism
ABSTRACT
MOTIVATION: In the post-genomic era, automatic annotation of protein sequences using computational homology-based methods is highly desirable. However, protein sequences often diverge to an extent where detection of homology and automatic annotation transfer are not straightforward. Sophisticated approaches to detect such distant relationships are needed. We propose a new approach to identify deep evolutionary relationships of proteins to overcome shortcomings of the available methods. RESULTS: We have developed a method to identify remote homologues more effectively from any protein sequence database by using several cascading events with Hidden Markov Models (C-HMM). We have implemented clustering of hits and profile generation of hit clusters to effectively reduce the computational time of the cascaded sequence searches. Our C-HMM approach achieved 94, 83 and 40% coverage at family, superfamily and fold levels, respectively, when applied to diverse protein folds. We have compared C-HMM with various remote homology detection methods and discuss the trade-offs between coverage and false positives. AVAILABILITY AND IMPLEMENTATION: A standalone package implemented in Java, along with detailed documentation, can be downloaded from https://github.com/RSLabNCBS/C-HMM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: mini@ncbs.res.in.
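The cascading idea in this abstract, in which hits from one search round seed the next round, can be sketched roughly as follows. This is an illustration only and not the C-HMM package: `toy_search` (k-mer Jaccard similarity) is a hypothetical stand-in for a profile HMM search, and the clustering and profile-generation steps mentioned in the abstract are omitted. All function names, the threshold and the example sequences are assumptions.

```python
from typing import Callable, Sequence, Set


def kmers(seq: str, k: int = 3) -> Set[str]:
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def toy_search(query: str, database: Sequence[str], threshold: float = 0.5) -> Set[str]:
    """Hypothetical stand-in for a homology search.

    Returns database sequences whose k-mer Jaccard similarity to the
    query is at least `threshold`. A real pipeline would call an HMM
    or profile search tool here instead.
    """
    q = kmers(query)
    hits = set()
    for candidate in database:
        c = kmers(candidate)
        union = q | c
        if union and len(q & c) / len(union) >= threshold:
            hits.add(candidate)
    return hits


def cascade_search(
    query: str,
    database: Sequence[str],
    search: Callable[[str, Sequence[str]], Set[str]] = toy_search,
    max_rounds: int = 3,
) -> Set[str]:
    """Cascaded search: new hits of each round become the next round's queries."""
    found: Set[str] = set()
    frontier = {query}
    for _ in range(max_rounds):
        new_hits: Set[str] = set()
        for q in frontier:
            new_hits |= search(q, database) - found
        if not new_hits:  # converged: no previously unseen hits
            break
        found |= new_hits
        frontier = new_hits
    return found


if __name__ == "__main__":
    db = ["AABBBBCC", "BBBBCCCC", "ZZZZZZZZ"]
    print(sorted(toy_search("AAAABBBB", db)))      # direct search only
    print(sorted(cascade_search("AAAABBBB", db)))  # cascaded search
```

In this toy database the direct search reaches only the closest sequence, while the cascade also reaches a more distant one through the intermediate hit. Each extra round can also admit false positives, which is the coverage versus false-positive trade-off the abstract refers to.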
Subjects
Protein Sequence Analysis/methods, Amino Acid Sequence Homology, Algorithms, Cluster Analysis, Protein Databases, Markov Chains, Proteins/chemistry, Proteins/classification, Proteins/genetics, Sequence Alignment
ABSTRACT
Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of observed length variations that were set among 731 protein families and after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.
Subjects
Protein Databases, INDEL Mutation, Protein Tertiary Structure, Genetic Variation, Internet, Molecular Models, Protein Tertiary Structure/genetics, Proteins/classification, Sequence Alignment, Protein Sequence Analysis
ABSTRACT
BACKGROUND: Krishna Tulsi, a member of the Lamiaceae family, is a herb well known for its spiritual, religious and medicinal importance in India. The common name of this plant is 'Tulsi' (or 'Tulasi' or 'Thulasi') and it is considered sacred by Hindus. We present the draft genome of Ocimum tenuiflorum L. (subtype Krishna Tulsi) in this report. Paired-end and mate-pair sequence libraries were generated for the whole genome and sequenced with the Illumina HiSeq 1000, resulting in an assembled genome of 374 Mb, with a genome coverage of 61% (612 Mb estimated genome size). We have also studied transcriptomes (RNA-Seq) of two subtypes of O. tenuiflorum, Krishna and Rama Tulsi, and report the relative expression of genes in both varieties. RESULTS: The pathways leading to the production of medicinally important specialized metabolites have been studied in detail, in relation to similar pathways in Arabidopsis thaliana and other plants. Expression levels of anthocyanin biosynthesis-related genes in leaf samples of Krishna Tulsi were observed to be relatively high, explaining the purple colouration of Krishna Tulsi leaves. The expression of six important genes identified from genome data was validated by performing q-RT-PCR in different tissues of five different species, which shows the high expression of ursolic acid-producing genes in young leaves of the Rama subtype. In addition, the presence of eugenol and ursolic acid, implicated as potential drugs in the cure of many diseases including cancer, was confirmed using mass spectrometry. CONCLUSIONS: The availability of the whole genome of O. tenuiflorum and our sequence analysis suggest that small amino acid changes at the functional sites of genes involved in metabolite synthesis pathways confer special medicinal properties on this herb.
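The genome coverage figure in this abstract follows from the two sizes it reports (assembled size divided by estimated genome size). A minimal arithmetic check, with a hypothetical helper name:

```python
def assembly_coverage_percent(assembled_mb: float, estimated_mb: float) -> float:
    """Percentage of the estimated genome size represented in the assembly."""
    if estimated_mb <= 0:
        raise ValueError("estimated genome size must be positive")
    return 100.0 * assembled_mb / estimated_mb


if __name__ == "__main__":
    # Sizes as reported in the abstract: 374 Mb assembled, 612 Mb estimated.
    print(round(assembly_coverage_percent(374, 612)))  # 61
```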
Subjects
Plant Gene Expression Regulation, Plant Genome, Ocimum/genetics, India, Ocimum/metabolism, Plant Leaves/metabolism, Medicinal Plants/genetics, Medicinal Plants/metabolism
ABSTRACT
Vibrio parahaemolyticus is one of the leading causative agents of foodborne diseases in humans. In this study, the proteome profiles of the pandemic strain V. parahaemolyticus SC192 belonging to the O3:K6 serovar during the planktonic and biofilm stages were analyzed by two-dimensional liquid chromatography coupled to tandem mass spectrometry. This non-gel-based multidimensional protein identification technology approach identified 45.5% of the proteome in the reference genome V. parahaemolyticus RIMD 2210633. This is the largest proteome coverage obtained so far in V. parahaemolyticus and provides evidence for expression of 27% of the hypothetical proteins. Comparison of the planktonic and biofilm proteomes based on their cluster of orthologous groups, gene ontologies and KEGG pathways provides basic information on biofilm specific functions and pathways. To the authors' knowledge, this is the first study to generate a global proteome profile of the pandemic strain of V. parahaemolyticus and the method reported here could be used to rapidly obtain a snapshot of the proteome of any microorganism at a given condition.
Subjects
Bacterial Proteins/metabolism, Biofilms, Pandemics, Plankton/microbiology, Proteome/genetics, Vibrio Infections/epidemiology, Vibrio parahaemolyticus/genetics, Bacterial Proteins/genetics, Computational Biology, Two-Dimensional Gel Electrophoresis, Gene Ontology, Mass Spectrometry, Confocal Microscopy, Plankton/genetics, Proteomics/methods
ABSTRACT
The developmentally active and cell-stress responsive hsrω locus in Drosophila melanogaster carries two exons, one omega intron, one short translatable open reading frame (ORFω), a long stretch of unique tandem repeats and an overlapping mir-4951 near its 3′ end. It produces multiple long noncoding RNAs (lncRNAs) using two transcription start sites and four termination sites. Earlier cytogenetic studies revealed functional conservation of hsrω in several Drosophila species. However, sequence analysis in three species showed poor conservation of ORFω, the tandem repeats and other regions, while the 16 nt at the 5′ and 60 nt at the 3′ splice junctions of the omega intron were found to be ultra-conserved. The present bioinformatic study, using the splice-junction landmarks in D. melanogaster hsrω, identified orthologues in 34 publicly available Drosophila species genomes. Each orthologue carries a short ORFω, ultra-conserved splice junctions of the omega intron, a repeat region, a conserved 3′ end located at mir-4951, and syntenic neighbours. Multiple copies of conserved nonamer motifs are seen in the tandem repeat region, despite high variability in the repeat sequences. Intriguingly, only the omega intron sequences in different species show evolutionary relationships matching the general phylogenetic history of the genus. A search of other known insect genomes did not reveal sequence homology, although a locus with similar functional properties is suggested in the Chironomus and Ceratitis genera. Amidst the high sequence divergence, the conserved organization of exons, ORFω and omega intron in this gene's proximal part and of tandem repeats in its distal part across the Drosophila genus is remarkable and possibly reflects the functional importance of the higher order structure of hsrω lncRNAs and the small omega peptide.
Subjects
Biological Evolution, Computer Simulation, Drosophila melanogaster/genetics, Introns, Long Noncoding RNA/genetics, Nucleic Acid Repetitive Sequences, Physiological Stress, Amino Acid Sequence, Animals, Base Sequence, Drosophila melanogaster/classification, Drosophila melanogaster/growth & development, Phylogeny, Sequence Homology, Species Specificity
ABSTRACT
This protocol describes a stepwise process to identify proteins of interest from a query proteome derived from NGS data. We implemented this protocol on the Moringa oleifera transcriptome to identify proteins involved in secondary metabolite and vitamin biosynthesis and ion transport. This knowledge-driven protocol identifies proteins using an integrated approach involving sensitive sequence search and evolutionary relationships. We make use of functionally important residues (FIR) specific to the query protein family, identified through its homologous sequences and the literature. We screen protein hits based on their clustering with true homologues through phylogenetic tree reconstruction, complemented with FIR mapping. The protocol was validated for the protein hits through qRT-PCR and transcriptome quantification. Our protocol demonstrated higher specificity than other methods, particularly in distinguishing cross-family hits. This protocol was effective in the transcriptome data analysis of M. oleifera as described in Pasha et al.
• Knowledge-driven protocol to identify secondary metabolite synthesizing proteins in a highly specific manner.
• Use of functionally important residues for screening of true hits.
• Beneficial for metabolite pathway reconstruction in any (species, metagenomics) NGS data.
ABSTRACT
In this paper, we present the data acquired during transcriptome analysis of the plant Moringa oleifera [1] from five different tissues (root, stem, leaf, flower and seed) by RNA sequencing. A total of 271 million reads were assembled with an N50 of 2094 bp. The combined transcriptome was assessed for transcript abundance across the five tissues. The protein-coding genes identified from the transcripts were annotated and used for orthology analysis. Further, enzymes involved in the biosynthesis of select medicinally important secondary metabolites, vitamins and ion transporters were identified and their expression levels across tissues were examined. The data generated by RNA sequencing have been deposited in the NCBI public repository under the accession number PRJNA394193 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA394193).
ABSTRACT
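N50, the assembly statistic quoted in this abstract, is the length L such that sequences of length L or longer together contain at least half of all assembled bases. A minimal sketch of the calculation, using made-up lengths rather than the Moringa data; the function name is an assumption:

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig/transcript lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:  # reached half of the assembled bases
            return length
    return ordered[-1]  # not reached for positive lengths; kept for safety


if __name__ == "__main__":
    # Illustrative lengths only: total 32, so the two 8s (sum 16) reach half.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # 8
```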
Animals sense light using photosensitive proteins, rhodopsins, which contain a chromophore, retinal, that intrinsically absorbs in the ultraviolet. Visible-light sensitivity depends primarily on protonation of the retinylidene Schiff base (SB), which requires a negatively charged amino acid residue, the counterion, for stabilization. Little is known about how the most common counterion among varied rhodopsins, Glu181, functions. Here, we demonstrate that in a spider visual rhodopsin, an orthologue of the mammalian melanopsins relevant to circadian rhythms, the Glu181 counterion likely functions by forming a hydrogen-bonding network in which Ser186 is a key mediator of the Glu181-SB interaction. We also suggest that upon light activation, the Glu181-SB interaction rearranges while Ser186 changes its contribution. This is in contrast to the counterion of vertebrate visual rhodopsins, Glu113, which forms a salt bridge with the SB. Our results shed light on the molecular mechanisms of visible-light sensitivity relevant to invertebrate vision and vertebrate non-visual photoreception.
Subjects
Arthropod Proteins/chemistry, Arthropod Proteins/radiation effects, Rhodopsin/chemistry, Rhodopsin/radiation effects, Amino Acid Substitution, Animals, Arthropod Proteins/genetics, Hydrogen Bonding, Light, Molecular Models, Site-Directed Mutagenesis, Photochemical Processes, Protein Stability, Rhodopsin/genetics, Schiff Bases/chemistry, Schiff Bases/radiation effects, Spiders/chemistry, Spiders/genetics
ABSTRACT
Arrestin-1 desensitizes the activated and phosphorylated photoreceptor rhodopsin by forming transient rhodopsin-arrestin-1 complexes that eventually decay to opsin, retinal and arrestin-1. Via a multi-dimensional screening setup, we identified and combined arrestin-1 mutants that form lasting complexes with light-activated and phosphorylated rhodopsin in harsh conditions, such as high ionic salt concentration. Two quadruple mutants, D303A + T304A + E341A + F375A and R171A + T304A + E341A + F375A share similar heterologous expression and thermo-stability levels with wild type (WT) arrestin-1, but are able to stabilize complexes with rhodopsin with more than seven times higher half-maximal inhibitory concentration (IC50) values for NaCl compared to the WT arrestin-1 protein. These quadruple mutants are also characterized by higher binding affinities to phosphorylated rhodopsin, light-activated rhodopsin and phosphorylated opsin, as compared with WT arrestin-1. Furthermore, the assessed arrestin-1 mutants are still specifically associating with phosphorylated or light-activated receptor states only, while binding to the inactive ground state of the receptor is not significantly altered. Additionally, we propose a novel functionality for R171 in stabilizing the inactive arrestin-1 conformation as well as the rhodopsin-arrestin-1 complex. The achieved stabilization of the active rhodopsin-arrestin-1 complex might be of great interest for future structure determination, antibody development studies as well as drug-screening efforts targeting G protein-coupled receptors (GPCRs).
Subjects
Arrestins/metabolism, Multiprotein Complexes/metabolism, Opsins/metabolism, Protein Engineering/methods, Rhodopsin/metabolism, Animals, Arrestins/chemistry, Arrestins/genetics, Cattle, HEK293 Cells, Humans, Molecular Models, Multiprotein Complexes/chemistry, Multiprotein Complexes/genetics, Mutation, Opsins/chemistry, Phosphorylation, Protein Binding, Protein Conformation, Protein Stability, Rhodopsin/chemistry
ABSTRACT
Cellular functions of arrestins are determined in part by the pattern of phosphorylation on the G protein-coupled receptors (GPCRs) to which arrestins bind. Despite high-resolution structural data of arrestins bound to phosphorylated receptor C-termini, the functional role of each phosphorylation site remains obscure. Here, we employ a library of synthetic phosphopeptide analogues of the GPCR rhodopsin C-terminus and determine the ability of these peptides to bind and activate arrestins using a variety of biochemical and biophysical methods. We further characterize how these peptides modulate the conformation of arrestin-1 by nuclear magnetic resonance (NMR). Our results indicate different functional classes of phosphorylation sites: 'key sites' required for arrestin binding and activation, an 'inhibitory site' that abrogates arrestin binding, and 'modulator sites' that influence the global conformation of arrestin. These functional motifs allow a better understanding of how different GPCR phosphorylation patterns might control how arrestin functions in the cell.
Subjects
Arrestin/metabolism, Phosphorylation/physiology, Rhodopsin/metabolism, beta-Arrestin 1/metabolism, beta-Arrestin 2/metabolism, Amino Acid Motifs/physiology, Animals, Arrestin/chemistry, Arrestin/genetics, Arrestin/isolation & purification, Biological Assay, Cattle, Cell Membrane/metabolism, Mutation, Biomolecular Nuclear Magnetic Resonance, Recombinant Proteins/chemistry, Recombinant Proteins/genetics, Recombinant Proteins/isolation & purification, Recombinant Proteins/metabolism, Rhodopsin/chemistry, Rod Cell Outer Segment/metabolism, beta-Arrestin 1/chemistry, beta-Arrestin 1/isolation & purification, beta-Arrestin 2/chemistry, beta-Arrestin 2/isolation & purification
ABSTRACT
Insertions/deletions are common evolutionary tools employed to alter the structural and functional repertoire of protein domains. An insert situated proximal to the active site or ligand binding site frequently impacts protein function; however, the effect of distal indels on protein activity and/or stability is often not studied. In this paper, we have investigated a distal insert that influences the function and stability of a unique DNA polymerase, terminal deoxynucleotidyl transferase (TdT). TdT (EC:2.7.7.31) is a monomeric 58 kDa protein belonging to family X of eukaryotic DNA polymerases and is known for its role in V(D)J recombination as well as in non-homologous end-joining (NHEJ) pathways. Two murine isoforms of TdT, differing in length by twenty residues and having different biochemical properties, have been studied. All-atom molecular dynamics simulations at different temperatures and interaction network analyses were performed on the short and long isoforms. We observed conformational changes in the regions distal to the insert position (thumb subdomain) in the longer isoform, which indirectly affect the activity and stability of the enzyme through a mediating loop (Loop1). A structural rationale could be provided to explain the reduced polymerization rate as well as the increased thermosensitivity of the longer isoform caused by peripherally located length variations within a DNA polymerase. These observations increase our understanding of the roles of length variants in introducing functional diversity in protein families in general.
Subjects
DNA Nucleotidylexotransferase/chemistry, DNA/chemistry, Amino Acid Sequence, Animals, Binding Sites, DNA/metabolism, DNA Nucleotidylexotransferase/classification, DNA Nucleotidylexotransferase/metabolism, Isoenzymes/chemistry, Isoenzymes/classification, Isoenzymes/metabolism, Mice, Insertional Mutagenesis, Phylogeny, Protein Binding, alpha-Helical Protein Conformation, Protein Folding, Protein Interaction Domains and Motifs, Sequence Alignment, Structure-Activity Relationship, Substrate Specificity, Thermodynamics
ABSTRACT
Prolyl oligopeptidases (POPs) are serine proteases found in prokaryotes and eukaryotes which hydrolyze the peptide bond containing proline. The current study focuses on the analysis of POP sequences, their distribution and domain architecture in Shewanella woodyi, a Gram-negative, luminous bacterium which causes celiac sprue and similar infections in marine organisms. The POP undergoes a large interdomain movement, which opens a possible route for substrate entry. Hence, it offers an opportunity to understand the mechanism of substrate gating by studying the domain architecture, and a possibility to identify a probable drug target. In the present study, the POP sequence was retrieved from the GenBank database and the best homologous templates were identified by PSI-BLAST search. The three-dimensional structures of the closed and open forms of POP from S. woodyi, which are not available in native form, were generated by homology modeling. The ideal lead molecules were screened by computer-aided virtual screening, and the binding potential of the best leads toward the target was studied by molecular docking. The domain architecture of the POP revealed that it has a propeller domain consisting of β-sheets, surrounded by α-helices, and an α/β hydrolase domain with a catalytic triad containing Ser-564, Asp-646 and His-681. The hypothetical models of open and closed POP showed backbone RMSD values of 0.56 and 0.65 Å, respectively. Ramachandran plots of the open and closed POP conformations show 99.4 and 98.7% of residues in the favoured region, respectively. Our study revealed that the propeller domain comes as an insert between the N-terminal and C-terminal α/β hydrolase domain. Molecular docking, drug-likeness properties and ADME prediction suggested that KUC-103481N and Pramiracetum can be used as probable lead molecules toward the POP from S. woodyi.
Subjects
Computational Biology/methods, Serine Endopeptidases/chemistry, Serine Endopeptidases/metabolism, Shewanella/enzymology, Molecular Docking Simulation, Prolyl Oligopeptidases, Serine Proteases/chemistry, Serine Proteases/metabolism
ABSTRACT
Prolyl oligopeptidases (POP) are serine proteases found in prokaryotes and eukaryotes which hydrolyze the peptide bond containing proline. The current study focuses on the analysis of POP sequences, their distribution and domain architecture in Shewanella woodyi, a Gram-negative, luminous bacterium which causes celiac sprue and similar infections in marine organisms. The POP undergoes a large inter-domain movement, which opens a possible route for substrate entry. Hence, it offers an opportunity to understand the mechanism of substrate gating by studying the domain architecture, and a possibility to identify a probable drug target. In the present study, the POP sequence was retrieved from the GenBank database and the best homologous templates were identified by PSI-BLAST search. The three-dimensional structures of the closed and open forms of POP from Shewanella woodyi, which are not available in native form, were generated by homology modeling. The ideal lead molecules were screened by computer-aided virtual screening, and the binding potential of the best leads towards the target was studied by molecular docking. The domain architecture of the POP revealed that it has a propeller domain consisting of β-sheets, surrounded by α-helices, and an α/β hydrolase domain with a catalytic triad containing Ser-564, Asp-646 and His-681. The hypothetical models of open and closed POP showed backbone RMSD values of 0.56 Å and 0.65 Å, respectively. Ramachandran plots of the open and closed POP conformations show 99.4% and 98.7% of residues in the favoured region, respectively. Our study revealed that the propeller domain comes as an insert between the N-terminal and C-terminal α/β hydrolase domain. Molecular docking, drug-likeness properties and ADME prediction suggested that KUC-103481N and Pramiracetum can be used as probable lead molecules towards the POP from Shewanella woodyi.
ABSTRACT
BACKGROUND: The influx of newly determined crystal structures into primary structural databases is increasing at a rapid pace. This leads to the updating of primary databases and their dependent secondary databases, which makes large-scale analysis of structures even more challenging. Hence, it becomes essential to compare and appreciate the replacement of data and the inclusion of new, critical data between two updates. PASS2 is a database that retains structure-based sequence alignments of protein domain superfamilies and relies on the SCOP database for its hierarchy and definition of superfamily members. Since accurate alignments of distantly related proteins are useful evolutionary models for depicting variations within protein superfamilies, this study aims to trace the changes in data between PASS2 updates. RESULTS: In this study, differences in superfamily compositions, family constituents and length variations between different versions of PASS2 have been tracked. Studying length variations in protein domains, which are introduced by indels (insertions/deletions), is important because these indels act as evolutionary signatures, introducing variations in substrate specificity and domain interactions and sometimes even regulating protein stability. While classifying the nature and source of variations in the superfamilies during transitions (between the different versions of PASS2), increasing length-rigidity of the superfamilies in the recent version is observed. In order to study such length-variant superfamilies in detail, an improved classification approach is also presented, which divides the superfamilies into distinct groups based on their extent of length variation.
CONCLUSIONS: An objective study in terms of transition between the database updates, detailed investigation of the new/old members and examination of their structural alignments is non-trivial and will help researchers in designing experiments on specific superfamilies, in various modelling studies, in linking representative superfamily members to rapidly expanding sequence space and in evaluating the effects of length variations of new members in drug target proteins. The improved objective classification scheme developed here would be useful in future for automatic analysis of length variation in cases of updates of databases or even within different secondary databases.
ABSTRACT
BACKGROUND: Development of sensitive sequence search procedures for the detection of distant relationships between proteins at the superfamily/fold level is still a big challenge. The intermediate sequence search approach is the most frequently employed means of identifying remote homologues effectively. In this study, serine proteases of the prolyl oligopeptidase, rhomboid and subtilisin protein families were examined using plant serine proteases from two genomes, A. thaliana and O. sativa, as queries, along with 13 other families of unrelated folds, to identify the distant homologues which could not be obtained using PSI-BLAST. METHODOLOGY/PRINCIPAL FINDINGS: We have proposed to start with multiple queries of classical serine protease members to identify remote homologues in families, using a rigorous approach like Cascade PSI-BLAST. We found that classical sequence-based approaches, like PSI-BLAST, showed very low sequence coverage in identifying plant serine proteases. The algorithm was applied to an enriched sequence database of homologous domains and we obtained an overall average coverage of 88% at family level and 77% at superfamily or fold level, along with a specificity of ~100% and a Matthews correlation coefficient of 0.91. A similar approach was also implemented on 13 other protein families representing every structural class in the SCOP database. Further investigation with statistical tests, like jackknifing, helped us to better understand the influence of neighbouring protein families. CONCLUSIONS/SIGNIFICANCE: Our study suggests that employing multiple queries of a family for the Cascade PSI-BLAST searches is useful for predicting distant relationships effectively even at superfamily level. We have proposed a generalized strategy to cover all the distant members of a particular family using multiple query sequences.
Our findings reveal that prior selection of sequences as query and the presence of neighbouring families can be important for covering the search space effectively in minimal computational time. This study also provides an understanding of the 'bridging' role of related families.
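The performance measures named in this abstract (coverage, specificity and the Matthews correlation coefficient) are all derived from the same confusion-matrix counts. A minimal sketch using the standard definitions; the counts in the usage example are invented for illustration and are not the study's data, and the function names are assumptions:

```python
import math


def coverage(tp: int, fn: int) -> float:
    """Coverage (sensitivity): fraction of true family members recovered."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-members correctly rejected."""
    return tn / (tn + fp)


def matthews_cc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Invented counts: 88 of 100 members found, no false positives among 100 non-members.
    tp, fn, tn, fp = 88, 12, 100, 0
    print(coverage(tp, fn), specificity(tn, fp), round(matthews_cc(tp, tn, fp, fn), 3))
```

Because the coefficient combines all four counts, it stays informative when non-members far outnumber members, a case in which specificity alone can look close to 100%.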