Results 1 - 20 of 24
1.
Nucleic Acids Res ; 50(3): 1687-1700, 2022 02 22.
Article in English | MEDLINE | ID: mdl-35018473

ABSTRACT

Toxin-antitoxin (TA) systems are proposed to play crucial roles in bacterial growth under stress conditions such as phage infection. The type III TA systems consist of a protein toxin whose activity is inhibited by a noncoding RNA antitoxin. The toxin is an endoribonuclease, while the antitoxin consists of multiple repeats of RNA. The toxin assembles with the individual antitoxin repeats into a cyclic complex in which the antitoxin forms a pseudoknot structure. While structure and functions of some type III TA systems are characterized, the complex assembly process is not well understood. Using bioinformatics analysis, we have identified type III TA systems belonging to the ToxIN family across different Escherichia coli strains and found them to be clustered into at least five distinct clusters. Furthermore, we report a 2.097 Å resolution crystal structure of the first E. coli ToxIN complex that revealed the overall assembly of the protein-RNA complex. Isothermal titration calorimetry experiments showed that toxin forms a high-affinity complex with antitoxin RNA resulting from two independent (5' and 3' sides of RNA) RNA binding sites on the protein. These results further our understanding of the assembly of type III TA complexes in bacteria.


Subjects
Antitoxins , Bacterial Toxins , Escherichia coli/chemistry , Toxin-Antitoxin Systems , Antitoxins/chemistry , Bacterial Proteins/metabolism , Bacterial Toxins/genetics , Bacterial Toxins/metabolism , Escherichia coli/metabolism , RNA/metabolism
2.
Proteins ; 2023 Oct 30.
Article in English | MEDLINE | ID: mdl-37902388

ABSTRACT

Proteins such as enzymes perform their functions predominantly through non-covalent interactions between transiently interacting units. Despite their transient nature, such interactions affect the overall structural topology of the protein and enable it to activate or deactivate. This alteration of the structural topology is studied by employing protein structural networks, which are node-edge representative models of protein structure, reported as a robust tool for capturing interactions between residues. Several methods have been optimized to collect meaningful, functionally relevant information by studying alteration of structural networks. In this article, different methods of comparing protein structural networks are employed, along with spectral decomposition of graphs, to study the subtle impact of protein-protein interactions. A detailed analysis of the structural network of interacting partners is performed across a dataset of around 900 pairs of bound complexes and corresponding unbound protein structures. The variation in network parameters at, around, and far away from the interface is analyzed. Finally, we present interesting case studies, where an allosteric mechanism of structural impact is understood from communication-path detection methods. The results of this analysis are beneficial in understanding protein stability, for future engineering, and docking studies.
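
As a rough illustration of the kind of bound-versus-unbound network comparison described in this abstract, the sketch below builds a residue contact network from one representative coordinate per residue and reports the per-residue change in degree. The 7 Å cutoff, the function names, and the toy coordinates are all assumptions made for illustration; the abstract does not specify how the authors construct or compare their networks, and the spectral decomposition step is not shown.

```python
from itertools import combinations
from math import dist


def contact_network(coords, cutoff=7.0):
    """Build adjacency sets: residues i and j share an edge when their
    representative atoms lie within `cutoff` angstroms (assumed cutoff)."""
    adjacency = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) <= cutoff:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


def degree_changes(unbound, bound, cutoff=7.0):
    """Per-residue change in node degree (bound minus unbound)."""
    net_unbound = contact_network(unbound, cutoff)
    net_bound = contact_network(bound, cutoff)
    return [len(net_bound[i]) - len(net_unbound[i]) for i in range(len(unbound))]


# Toy coordinates (hypothetical): the last residue moves closer on binding.
unbound = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
bound = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (16.0, 0.0, 0.0)]
changes = degree_changes(unbound, bound)
print(changes)
```

Degree is only one of many possible network parameters; it is used here because it needs nothing beyond the standard library.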

3.
J Biol Chem ; 294(23): 9048-9063, 2019 06 07.
Article in English | MEDLINE | ID: mdl-31018964

ABSTRACT

Mycobacterium tuberculosis possesses an unusually large representation of type II toxin-antitoxin (TA) systems, whose functions and targets are mostly unknown. To better understand the basis of their unique expansion and to probe putative functional similarities among these systems, here we computationally and experimentally investigated their sequence relationships. Bioinformatic and phylogenetic investigations revealed that 51 sequences of the VapBC toxin family group into paralogous sub-clusters. On the basis of conserved sequence fingerprints within paralogues, we predicted functional residues and residues at the putative TA interface that are useful to evaluate TA interactions. Substitution of these likely functional residues abolished the toxin's growth-inhibitory activity. Furthermore, conducting similarity searches in 101 mycobacterial and ∼4500 other prokaryotic genomes, we assessed the relative conservation of the M. tuberculosis TA systems and found that most TA orthologues are well-conserved among the members of the M. tuberculosis complex, which cause tuberculosis in animal hosts. We found that soil-inhabiting, free-living Actinobacteria also harbor as many as 12 TA pairs. Finally, we identified five novel putative TA modules in M. tuberculosis. For one of them, we demonstrate that overexpression of the putative toxin, Rv2514c, induces bacteriostasis and that co-expression of the cognate antitoxin Rv2515c restores bacterial growth. Taken together, our findings reveal that toxin sequences are more closely related than antitoxin sequences in M. tuberculosis. Furthermore, the identification of additional TA systems reported here expands the known repertoire of TA systems in M. tuberculosis.


Subjects
Antitoxins/metabolism , Bacterial Toxins/metabolism , Computational Biology/methods , Mycobacterium tuberculosis/metabolism , Toxin-Antitoxin Systems/genetics , Amino Acid Sequence , Antitoxins/genetics , Bacterial Toxins/chemistry , Bacterial Toxins/genetics , Mutagenesis, Site-Directed , Mycobacterium tuberculosis/genetics , Phylogeny , Prokaryotic Cells/metabolism , Sequence Alignment
4.
Proteins ; 88(12): 1688-1700, 2020 12.
Article in English | MEDLINE | ID: mdl-32725917

ABSTRACT

High divergence in protein sequences makes the detection of distant protein relationships through homology-based approaches challenging. Grouping protein sequences into families, through similarities in either sequence or 3-D structure, facilitates improved recognition of protein relationships. In addition, strategically designed protein-like sequences have been shown to bridge distant structural domain families by serving as artificial linkers. In this study, we have augmented a search database of known protein domain families with such designed sequences, with the intention of providing functional clues to domain families of unknown structure. When assessed using representative query sequences from each family, we obtain a success rate of 94% in protein domain families of known structure. Further, we demonstrate that the augmented search space enabled fold recognition for 582 families with no structural information available a priori. Additionally, we were able to provide reliable functional relationships for 610 orphan families. We discuss the application of our method in predicting functional roles through select examples for DUF4922, DUF5131, and DUF5085. Our approach also detects new associations between families that were previously not known to be related, as demonstrated through new sub-groups of the RNA polymerase domain among three distinct RNA viruses. Taken together, designed sequences-augmented search databases direct the detection of meaningful relationships between distant protein families. In turn, they enable fold recognition and offer reliable pointers to potential functional sites that may be probed further through direct mutagenesis studies.


Subjects
Databases, Protein , Hydrolases/metabolism , Multigene Family , Nucleotidyltransferases/metabolism , Amino Acid Sequence , Humans , Hydrolases/chemistry , Nucleotidyltransferases/chemistry , Protein Conformation , Sequence Homology
5.
Nucleic Acids Res ; 43(Database issue): D300-5, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25262355

ABSTRACT

NrichD (http://proline.biochem.iisc.ernet.in/NRICHD/) is a database of computationally designed protein-like sequences, augmented into natural sequence databases that can perform hops in protein sequence space to assist in the detection of remote relationships. Establishing protein relationships in the absence of structural evidence or natural 'intermediately related sequences' is a challenging task. Recently, we have demonstrated that the computational design of artificial intermediary sequences/linkers is an effective approach to fill naturally occurring voids in protein sequence space. Through a large-scale assessment we have demonstrated that such sequences can be plugged into commonly employed search databases to improve the performance of routinely used sequence search methods in detecting remote relationships. Since it is anticipated that such data sets will be employed to establish protein relationships, two databases that have already captured these relationships at the structural and functional domain level, namely, the SCOP database and the Pfam database, have been 'enriched' with these artificial intermediary sequences. NrichD database currently contains 3,611,010 artificial sequences that have been generated between 27,882 pairs of families from 374 SCOP folds. The data sets are freely available for download. Additional features include the design of artificial sequences between any two protein families of interest to the user.


Subjects
Databases, Protein , Sequence Homology, Amino Acid , Computational Biology , Internet , Molecular Sequence Annotation , Protein Structure, Tertiary , Sequence Analysis, Protein
6.
BMC Bioinformatics ; 16: 119, 2015 Apr 16.
Article in English | MEDLINE | ID: mdl-25888118

ABSTRACT

BACKGROUND: Understanding channel structures that lead to active sites or traverse the molecule is important in the study of molecular functions such as ion, ligand, and small molecule transport. Efficient methods for extracting, storing, and analyzing protein channels are required to support such studies. Further, there is a need for an integrated framework that supports computation of the channels, interactive exploration of their structure, and detailed visual analysis of their properties. RESULTS: We describe a method for molecular channel extraction based on the alpha complex representation. The method computes geometrically feasible channels, stores both the volume occupied by the channel and its centerline in a unified representation, and reports significant channels. The representation also supports efficient computation of channel profiles that help understand channel properties. We describe methods for effective visualization of the channels and their profiles. These methods and the visual analysis framework are implemented in a software tool, CHEXVIS. We apply the method on a number of known channel containing proteins to extract pore features. Results from these experiments on several proteins show that CHEXVIS performance is comparable to, and in some cases, better than existing channel extraction techniques. Using several case studies, we demonstrate how CHEXVIS can be used to study channels, extract their properties and gain insights into molecular function. CONCLUSION: CHEXVIS supports the visual exploration of multiple channels together with their geometric and physico-chemical properties thereby enabling the understanding of the basic biology of transport through protein channels. The CHEXVIS web-server is freely available at http://vgl.serc.iisc.ernet.in/chexvis/ . The web-server is supported on all modern browsers with latest Java plug-in.


Subjects
Computational Biology/methods , Computer Graphics , Ion Channels/chemistry , Ion Channels/isolation & purification , Software , Humans , Imaging, Three-Dimensional/methods , Ion Channels/metabolism , Membrane Proteins/chemistry , Membrane Proteins/metabolism , Models, Molecular , Protein Conformation , Static Electricity , Substrate Specificity , User-Computer Interface
7.
J Proteome Res ; 13(12): 5603-17, 2014 Dec 05.
Article in English | MEDLINE | ID: mdl-25252820

ABSTRACT

Histones regulate a variety of chromatin templated events by their post-translational modifications (PTMs). Although there are extensive reports on the PTMs of canonical histones, the information on the histone variants remains very scanty. Here, we report the identification of different PTMs, such as acetylation, methylation, and phosphorylation of a major mammalian histone variant TH2B. Our mass spectrometric analysis has led to the identification of both conserved and unique modifications across tetraploid spermatocytes and haploid spermatids. We have also computationally derived the 3-dimensional model of a TH2B containing nucleosome in order to study the spatial orientation of the PTMs identified and their effect on nucleosome stability and DNA binding potential. From our nucleosome model, it is evident that substitution of specific amino acid residues in TH2B results in both differential histone-DNA and histone-histone contacts. Furthermore, we have also observed that acetylation on the N-terminal tail of TH2B weakens the interactions with the DNA. These results provide direct evidence that, similar to somatic H2B, the testis specific histone TH2B also undergoes multiple PTMs, suggesting the possibility of chromatin regulation by such covalent modifications in mammalian male germ cells.


Subjects
Histones/metabolism , Nucleosomes/metabolism , Protein Processing, Post-Translational , Spermatids/metabolism , Amino Acid Sequence , Animals , Histones/chemistry , Histones/ultrastructure , Male , Models, Molecular , Molecular Sequence Data , Nucleosomes/ultrastructure , Peptide Mapping , Rats, Wistar , Spermatocytes/metabolism , Tetraploidy
8.
Methods Mol Biol ; 2449: 149-167, 2022.
Article in English | MEDLINE | ID: mdl-35507261

ABSTRACT

Sequence-based approaches are fundamental to guide experimental investigations in obtaining structural and/or functional insights into uncharacterized protein families. Powerful profile-based sequence search methods rely on a sequence space continuum to identify non-trivial relationships through homology detection. The computational design of protein-like sequences that serve as "artificial linkers" is useful in identifying relationships between distant members of a structural fold. Such sequences act as intermediates and guide homology searches between distantly related proteins. Here, we describe an approach that represents natural intermediate sequences and designed protein-like sequences as HMM (Hidden Markov Models) profiles, to improve the sensitivity of existing search methods. Searches made within the "Profile database" were shown to recognize the parent structural fold for 90% of the search queries at query coverage better than 60%. For 1040 protein families with no available structure, fold associations were made through searches in the database of natural and designed sequence profiles. Most of the associations were made with the Alpha-alpha superhelix, Transmembrane beta-barrels, TIM barrel, and Immunoglobulin-like beta-sandwich folds. For 11 domain families of unknown functions, we provide confident fold associations using the profiles of designed sequences and a consensus from other fold recognition methods. For two DUFs (Domain families of Unknown Functions), we performed detailed functional annotation through comparisons with characterized templates of families of known function.


Subjects
Computational Biology , Proteins , Amino Acid Sequence , Computational Biology/methods , Databases, Protein , Proteins/chemistry , Proteins/genetics
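
The abstract of this record reports fold recognition "at query coverage better than 60%". A minimal sketch of such a coverage filter follows, assuming coverage means the percentage of query residues spanned by the alignment. The hit records, their field names, and the fold labels are hypothetical toy data, not output of the authors' pipeline or of any particular search tool.

```python
def query_coverage(query_start, query_end, query_length):
    """Percent of the query spanned by an alignment
    (1-based, inclusive coordinates; an assumed definition)."""
    return 100.0 * (query_end - query_start + 1) / query_length


def confident_hits(hits, query_length, min_coverage=60.0):
    """Keep only hits whose query coverage exceeds `min_coverage` percent."""
    return [
        hit
        for hit in hits
        if query_coverage(hit["q_start"], hit["q_end"], query_length) > min_coverage
    ]


# Hypothetical profile-search hits for a 200-residue query.
hits = [
    {"fold": "fold_A", "q_start": 5, "q_end": 180},   # spans 176 residues
    {"fold": "fold_B", "q_start": 40, "q_end": 90},   # spans 51 residues
]
kept = confident_hits(hits, query_length=200)
print([hit["fold"] for hit in kept])
```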
9.
Front Mol Biosci ; 9: 954926, 2022.
Article in English | MEDLINE | ID: mdl-36275618

ABSTRACT

RNA is the key player in many cellular processes such as signal transduction, replication, transport, cell division, transcription, and translation. These diverse functions are accomplished through interactions of RNA with proteins. However, protein-RNA interactions are still poorly understood in contrast to protein-protein and protein-DNA interactions. This knowledge gap can be attributed to the limited availability of protein-RNA structures along with the experimental difficulties in studying these complexes. Recent progress in computational resources has expanded the number of tools available for studying protein-RNA interactions at various molecular levels. These include tools for predicting interacting residues from primary sequences, modelling protein-RNA complexes, predicting hotspots in these complexes, and understanding the dynamics of their interactions. Each of these tools has its strengths and limitations, which makes it important to select an optimal approach for the question of interest. Here we present a mini review of computational tools to study different aspects of protein-RNA interactions, with a focus on overall application, development of the field, and future perspectives.

10.
Sci Rep ; 11(1): 1011, 2021 01 13.
Article in English | MEDLINE | ID: mdl-33441654

ABSTRACT

Afrotheria is a clade of African-origin species with striking dissimilarities in appearance and habitat. In this study, we compared whole proteome sequences of six Afrotherian species to obtain a broad viewpoint of their underlying molecular make-up, to recognize potentially unique proteomic signatures. We find that 62% of the proteomes studied here, predominantly involved in metabolism, are orthologous, while the number of homologous proteins between individual species is as high as 99.5%. Further, we find that among Afrotheria, L. africana has several orphan proteins with 112 proteins showing < 30% sequence identity with their homologues. Rigorous sequence searches and complementary approaches were employed to annotate 156 uncharacterized protein sequences and 28 species-specific proteins. For 122 proteins we predicted potential functional roles, 43 of which we associated with protein- and nucleic-acid binding roles. Further, we analysed domain content and variations in their combinations within Afrotheria and identified 141 unique functional domain architectures, highlighting proteins with potential for specialized functions. Finally, we discuss the potential relevance of highly represented protein families such as MAGE-B2, olfactory receptor and ribosomal proteins in L. africana and E. edwardii, respectively. Taken together, our study reports the first comparative study of the Afrotherian proteomes and highlights salient molecular features.


Subjects
Eutheria/classification , Eutheria/genetics , Animals , Conserved Sequence , Databases, Protein , Elephants/classification , Elephants/genetics , Elephants/metabolism , Eutheria/metabolism , Evolution, Molecular , Hedgehogs/classification , Hedgehogs/genetics , Hedgehogs/metabolism , Molecular Sequence Annotation , Moles/classification , Moles/genetics , Moles/metabolism , Phylogeny , Protein Domains , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Proteome/genetics , Proteomics , Shrews/classification , Shrews/genetics , Shrews/metabolism , Species Specificity , Trichechus manatus/classification , Trichechus manatus/genetics , Trichechus manatus/metabolism
11.
Front Genet ; 12: 747344, 2021.
Article in English | MEDLINE | ID: mdl-35082828

ABSTRACT

Multi-protein assemblies are complex molecular systems that perform highly sophisticated biochemical functions in an orchestrated manner. They are subject to changes that are governed by the evolution of individual components. We performed a comparative analysis of the ancient and functionally conserved spliceosomal SF3b complex, to recognize molecular signatures that contribute to sequence divergence and functional specializations. For this, we recognized homologous sequences of individual SF3b proteins distributed across 10 supergroups of eukaryotes and identified all seven protein components of the complex in 578 eukaryotic species. Using sequence and structural analysis, we establish that proteins occurring on the surface of the SF3b complex harbor more sequence variation than the proteins that lie in the core. Further, we show through protein interface conservation patterns that the extent of conservation varies considerably between interacting partners. When we analyze phylogenetic distributions of individual components of the complex, we find that protein partners that are known to form independent subcomplexes are observed to share similar profiles, reaffirming the link between differential conservation of interface regions and their inter-dependence. When we extend our analysis to individual protein components of the complex, we find taxa-specific variability in molecular signatures of the proteins. These trends are discussed in the context of proline-rich motifs of SF3b4, functional and drug binding sites of SF3b1. Further, we report key protein-protein interactions between SF3b1 and SF3b6 whose presence is observed to be lineage-specific across eukaryotes. Together, our studies show the association of protein location within the complex and subcomplex formation patterns with the sequence conservation of SF3b proteins. 
In addition, our study underscores evolutionarily flexible elements that appear to confer adaptive features in individual components of the multi-protein SF3b complexes and may contribute to its functional adaptability.

12.
Curr Res Struct Biol ; 3: 133-145, 2021.
Article in English | MEDLINE | ID: mdl-35028595

ABSTRACT

The evolution of homologous and functionally equivalent multiprotein assemblies is intriguing considering sequence divergence of constituent proteins. Here, we studied the implications of protein sequence divergence on the structure, dynamics and function of homologous yeast and human SF3b spliceosomal subcomplexes. Human and yeast SF3b comprise 7 and 6 proteins, respectively, with all yeast proteins homologous to their human counterparts at moderate sequence identity. SF3b6, an additional component in the human SF3b, interacts with the N-terminal extension of SF3b1 while the yeast homologue Hsh155 lacks the equivalent region. Through detailed homology studies, we show that SF3b6 is absent not only in yeast but in multiple lineages of eukaryotes implying that it is critical in specific organisms. We probed for the potential role of SF3b6 in the spliceosome assembled form through structural and flexibility analyses. By analysing normal modes derived from anisotropic network models of SF3b1, we demonstrate that when SF3b1 is bound to SF3b6, similarities in the magnitude of residue motions (0.86) and inter-residue correlated motions (0.94) with Hsh155 are significantly higher than when SF3b1 is considered in isolation (0.21 and 0.89 respectively). We observed that SF3b6 promotes functionally relevant 'open-to-close' transition in SF3b1 by enhancing concerted residue motions. Such motions are found to occur in the Hsh155 without SF3b6. The presence of SF3b6 influences motions of 16 residues that interact with U2 snRNA/branchpoint duplex and supports the participation of its interface residues in long-range communication in the SF3b1. These results advocate that SF3b6 potentially acts as an allosteric regulator of SF3b1 for BPS selection and might play a role in alternative splicing. Furthermore, we observe variability in the relative orientation of SF3b4 and in the local structure of three β-propeller domains of SF3b3 with reference to their yeast counterparts.
Such differences influence the inter-protein interactions of SF3b between these two organisms. Together, our findings highlight features of SF3b evolution and suggest that the human SF3b may have evolved sophisticated mechanisms to fine-tune its molecular function.
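
The abstract quotes similarity scores between residue-motion profiles (for example 0.86 versus 0.21) without naming the measure. One common way to compare two per-residue fluctuation profiles over aligned residues is the Pearson correlation, sketched below. Treating the reported values as Pearson correlations is an assumption, and the fluctuation values are toy numbers rather than data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


# Toy per-residue fluctuation magnitudes for aligned positions (hypothetical).
profile_a = [0.2, 0.5, 0.9, 0.4]
profile_b = [0.4, 1.0, 1.8, 0.8]  # same shape as profile_a, twice the amplitude
score = pearson(profile_a, profile_b)
print(round(score, 3))
```

Because the correlation is insensitive to overall scale, two profiles with the same shape but different amplitudes score as fully similar; a measure that also penalizes amplitude differences would need a different formula.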

13.
Front Mol Biosci ; 8: 654164, 2021.
Article in English | MEDLINE | ID: mdl-34409066

ABSTRACT

Ribosomes play a critical role in maintaining cellular proteostasis. The binding of messenger RNA (mRNA) to the ribosome regulates kinetics of protein synthesis. To generate an understanding of the structural, mechanistic, and dynamical features of mRNA recognition in the ribosome, we have analysed mRNA-protein interactions through a structural comparison of the ribosomal complex in the presence and absence of mRNA. To do so, we compared the 3-Dimensional (3D) structures of components of the two assembly structures and analysed their structural differences because of mRNA binding, using elastic network models and structural network-based analysis. We observe that the head region of 30S ribosomal subunit undergoes structural displacement and subunit rearrangement to accommodate incoming mRNA. We find that these changes are observed in proteins that lie far from the mRNA-protein interface, implying allostery. Further, through perturbation response scanning, we show that the proteins S13, S19, and S20 act as universal sensors that are sensitive to changes in the inter protein network, upon binding of 30S complex with mRNA and other initiation factors. Our study highlights the significance of mRNA binding in the ribosome complex and identifies putative allosteric sites corresponding to alterations in structure and/or dynamics, in regions away from mRNA binding sites in the complex. Overall, our work provides fresh insights into mRNA association with the ribosome, highlighting changes in the interactions and dynamics of the ribosome assembly because of the binding.

14.
Toxins (Basel) ; 12(8), 2020 07 29.
Article in English | MEDLINE | ID: mdl-32751054

ABSTRACT

Mycobacterium tuberculosis genome encodes over 80 toxin-antitoxin (TA) systems. While each toxin interacts with its cognate antitoxin, the abundance of TA systems presents an opportunity for potential non-cognate interactions. TA systems mediate manifold interactions to manage pathogenicity and stress response network of the cell and non-cognate interactions may play vital roles as well. To address if non-cognate and heterologous interactions are feasible and to understand the structural basis of their interactions, we have performed comprehensive computational analyses on the available 3D structures and generated structural models of paralogous M. tuberculosis VapBC and MazEF TA systems. For a majority of the TA systems, we show that non-cognate toxin-antitoxin interactions are structurally incompatible except for complexes like VapBC15 and VapBC11, which show similar interfaces and potential for cross-reactivity. For TA systems which have been experimentally shown earlier to disfavor non-cognate interactions, we demonstrate that they are structurally and stereo-chemically incompatible. For selected TA systems, our detailed structural analysis identifies specificity conferring residues. Thus, our work improves the current understanding of TA interfaces and generates a hypothesis based on congenial binding site, geometric complementarity, and chemical nature of interfaces. Overall, our work offers a structure-based explanation for non-cognate toxin-antitoxin interactions in M. tuberculosis.


Subjects
Bacterial Proteins , Bacterial Toxins , Mycobacterium tuberculosis , Toxin-Antitoxin Systems , Amino Acid Sequence , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Toxins/chemistry , Bacterial Toxins/genetics , Models, Molecular , Point Mutation , Sequence Alignment , Toxin-Antitoxin Systems/genetics
15.
Sci Rep ; 9(1): 1163, 2019 02 04.
Article in English | MEDLINE | ID: mdl-30718534

ABSTRACT

Toxin-antitoxin (TA) systems are ubiquitously existing addiction modules with essential roles in bacterial persistence and virulence. The genome of Mycobacterium tuberculosis encodes approximately 79 TA systems. Through computational and experimental investigations, we report for the first time that Rv0366c-Rv0367c is a non-canonical PezAT-like toxin-antitoxin system in M. tuberculosis. Homology searches with known PezT homologues revealed that residues implicated in nucleotide, antitoxin-binding and catalysis are conserved in Rv0366c. Unlike canonical PezA antitoxins, the N-terminal of Rv0367c is predicted to adopt the ribbon-helix-helix (RHH) motif for deoxyribonucleic acid (DNA) recognition. Further, the modelled complex predicts that the interactions between PezT and PezA involve conserved residues. We performed a large-scale search in sequences encoded in 101 mycobacterial and 4500 prokaryotic genomes and show that such an atypical PezAT organization is conserved in 20 other mycobacterial organisms and in families of class Actinobacteria. We also demonstrate that overexpression of Rv0366c induces bacteriostasis and this growth defect could be restored upon co-expression of cognate antitoxin, Rv0367c. Further, we also observed that inducible expression of Rv0366c in Mycobacterium smegmatis results in decreased cell-length and enhanced tolerance against a front-line tuberculosis (TB) drug, ethambutol. Taken together, we have identified and functionally characterized a novel non-canonical TA system from M. tuberculosis.


Subjects
Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/metabolism , Toxin-Antitoxin Systems/genetics , Computational Biology , Conserved Sequence , Protein Conformation , Protein Interaction Maps
16.
BMC Struct Biol ; 8: 28, 2008 May 31.
Article in English | MEDLINE | ID: mdl-18513436

ABSTRACT

BACKGROUND: Distantly related proteins adopt and retain similar structural scaffolds despite length variations that could be as much as two-fold in some protein superfamilies. In this paper, we describe an analysis of indel regions that accommodate length variations amongst related proteins. We have developed an algorithm, CUSP, to examine multi-membered PASS2 superfamily alignments to identify indel regions in an automated manner. Further, we have used the method to characterize the length, structural type and biochemical features of indels in related protein domains. RESULTS: CUSP examines protein domain structural alignments to distinguish regions of conserved structure common to related proteins from structurally unconserved regions that vary in length and type of structure. On a non-redundant dataset of 353 domain superfamily alignments from PASS2, we find that 'length-deviant' protein superfamilies show > 30% length variation from their average domain length. 60% of additional lengths that occur in indels are short-length structures (< 5 residues) while 6% of indels are > 15 residues in length. Structural types in indels also show class-specific trends. CONCLUSION: The extent of length variation varies across different superfamilies and indels show class-specific trends for preferred lengths and structural types. Such indels of different lengths even within a single protein domain superfamily could have structural and functional consequences that drive their selection, underlying their importance in similarity detection and computational modelling. The availability of systematic algorithms, like CUSP, should enable decision making in a domain superfamily-specific manner.


Subjects
Algorithms , Conserved Sequence/genetics , Models, Molecular , Protein Structure, Tertiary , Proteins/genetics , Sequence Alignment/methods , Protein Conformation
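
To make the idea of indel regions and length variation in this record concrete, the sketch below scans a gapped multiple alignment for runs of columns that contain at least one gap and computes each member's percent deviation from the mean ungapped length. This is a deliberately simplified stand-in and not the CUSP algorithm, which the abstract describes as working on structural alignments and conserved structural regions. The function names and the toy alignment are hypothetical.

```python
def indel_regions(alignment):
    """Return (start, end) column ranges, end exclusive, where at least one
    aligned sequence carries the gap character '-'."""
    n_cols = len(alignment[0])
    regions = []
    start = None
    for col in range(n_cols):
        has_gap = any(seq[col] == "-" for seq in alignment)
        if has_gap and start is None:
            start = col                      # a gapped run begins here
        elif not has_gap and start is not None:
            regions.append((start, col))     # the run ended at the previous column
            start = None
    if start is not None:
        regions.append((start, n_cols))      # run extends to the alignment end
    return regions


def length_variation(alignment):
    """Percent deviation of each ungapped length from the mean ungapped length."""
    lengths = [len(seq.replace("-", "")) for seq in alignment]
    mean = sum(lengths) / len(lengths)
    return [100.0 * abs(n - mean) / mean for n in lengths]


# Toy alignment (hypothetical sequences of equal aligned length).
toy = [
    "MKT--LLVAGG",
    "MKTAALLV-GG",
    "MKT--LLVAGG",
]
regions = indel_regions(toy)
print(regions)
print([round(v, 1) for v in length_variation(toy)])
```

Under this simplification, a member would count as 'length-deviant' when its deviation exceeds a chosen threshold such as the 30% figure quoted in the abstract.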
17.
PLoS One ; 13(10): e0205267, 2018.
Article in English | MEDLINE | ID: mdl-30307988

ABSTRACT

BAF250a and BAF250b are subunits of the SWI/SNF chromatin-remodeling complex that recruit the complex to chromatin allowing transcriptional activation of several genes. Despite being the central subunits of the SWI/SNF complex, the structural and functional annotation of BAF250a/b remains poorly understood. BAF250a (nearly 2200 residues protein) harbors an N-terminal DNA binding ARID (~110 residues) and a C-terminal folded region (~250 residues) of unknown structure and function, recently annotated as BAF250_C. Using hydrophobic core analysis, fold prediction and comparative modeling, here we have defined a domain boundary and associated a β-catenin like ARM-repeat fold with the C-terminus of BAF250a that encompasses BAF250_C. The N-terminal DNA-binding ARID is found in diverse domain combinations in proteins imparting unique functions. We used a comparative sequence analysis based approach to study the ARIDs from diverse domain contexts and identified conserved residue positions that are important to preserve its core structure. Supporting this, mutation of one such conserved residue valine, at position 1067, to glycine, resulted in destabilization, loss of structural integrity and DNA binding affinity of ARID. Additionally, we identified a set of conserved and surface-exposed residues unique to the ARID when it co-occurs with the ARM repeat containing BAF250_C in BAF250a. Several of these residues are found mutated in somatic cancers. We predict that these residues in BAF250a may play important roles in mediating protein-DNA and protein-protein interactions in the BAF complex.


Subjects
Chromatin Assembly and Disassembly/genetics , Multienzyme Complexes/genetics , Nuclear Proteins/genetics , Protein Domains/genetics , Transcription Factors/genetics , Cell Differentiation , Computational Biology , DNA-Binding Proteins , Datasets as Topic , Glycine/genetics , Molecular Dynamics Simulation , Multienzyme Complexes/chemistry , Site-Directed Mutagenesis , Mutation , Biomolecular Nuclear Magnetic Resonance , Nuclear Proteins/chemistry , Nuclear Proteins/isolation & purification , Protein Binding/genetics , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/isolation & purification , Sequence Alignment , Transcription Factors/chemistry , Transcription Factors/isolation & purification , Valine/genetics
18.
Biol Direct ; 13(1): 8, 2018 05 09.
Article in English | MEDLINE | ID: mdl-29776380

ABSTRACT

BACKGROUND: Knowledge of the protein structure is a prerequisite for improved understanding of molecular function. The gap in the sequence-structure space has increased in the post-genomic era. Grouping related protein sequences into families can aid in narrowing the gap. In the Pfam database, structure description is provided for part or full-length proteins of 7726 families. For the remaining 52% of the families, information on 3-D structure is not yet available. We use computationally designed sequences that are intermediately related to two protein domain families already known to share the same fold. These strategically designed sequences enable detection of distant relationships, and here we have employed them for structure recognition of protein families of yet unknown structure. RESULTS: We first measured the success rate of our approach using a dataset of protein families of known fold and achieved a success rate of 88%. Next, for 1392 families of yet unknown structure, we made structural assignments for part/full length of the proteins. Fold associations for 423 domains of unknown function (DUFs) are provided as a step towards functional annotation. CONCLUSION: The results indicate that knowledge-based filling of gaps in protein sequence space is a lucrative approach for structure recognition. Such sequences assist in traversal through protein sequence space and effectively function as 'linkers', where natural linkers between distant proteins are unavailable. REVIEWERS: This article was reviewed by Oliviero Carugo, Christine Orengo and Srikrishna Subramanian.


Subjects
Proteins/chemistry , Amino Acid Sequence , Computational Biology , Protein Databases , Protein Folding , Protein Tertiary Structure , Proteins/genetics , Protein Sequence Analysis
19.
Curr Opin Struct Biol ; 37: 71-80, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26773478

ABSTRACT

Design of proteins has far-reaching potential in diverse areas that range from repurposing protein scaffolds for reactions and substrates that they were not naturally meant for, to catching a glimpse of the ephemeral proteins that nature might have sampled during evolution. These non-natural proteins, either in synthesized or virtual form, have opened the scope for the design of entities that not only rival their natural counterparts but also offer a chance to visualize the protein space continuum that might help to relate proteins and understand their associations. Here, we review the recent advances in protein engineering and design, in multiple areas, with a view to drawing attention to their future potential.


Subjects
Proteins/chemistry , Amino Acid Sequence , Nanotechnology , Protein Folding
20.
Biol Direct ; 10: 38, 2015 Jul 31.
Article in English | MEDLINE | ID: mdl-26228684

ABSTRACT

BACKGROUND: In the post-genomic era where sequences are being determined at a rapid rate, we are highly reliant on computational methods for their tentative biochemical characterization. The Pfam database currently contains 3,786 families corresponding to "Domains of Unknown Function" (DUF) or "Uncharacterized Protein Family" (UPF), of which 3,087 families have no reported three-dimensional structure, constituting almost one-fourth of the known protein families in search for both structure and function. RESULTS: We applied a 'computational structural genomics' approach using five state-of-the-art remote similarity detection methods to detect the relationship between uncharacterized DUFs and domain families of known structures. The association with a structural domain family could serve as a start point in elucidating the function of a DUF. Amongst these five methods, searches in SCOP-NrichD database have been applied for the first time. Predictions were classified into high, medium and low-confidence based on the consensus of results from various approaches and also annotated with enzyme and Gene ontology terms. 614 uncharacterized DUFs could be associated with a known structural domain, of which high confidence predictions, involving at least four methods, were made for 54 families. These structure-function relationships for the 614 DUF families can be accessed on-line at http://proline.biochem.iisc.ernet.in/RHD_DUFS/ . For potential enzymes in this set, we assessed their compatibility with the associated fold and performed detailed structural and functional annotation by examining alignments and extent of conservation of functional residues. Detailed discussion is provided for interesting assignments for DUF3050, DUF1636, DUF1572, DUF2092 and DUF659. CONCLUSIONS: This study provides insights into the structure and potential function for nearly 20% of the DUFs. Use of different computational approaches enables us to reliably recognize distant relationships, especially when they converge to a common assignment because the methods are often complementary. We observe that while pointers to the structural domain can offer the right clues to the function of a protein, recognition of its precise functional role is still 'non-trivial' with many DUF domains conserving only some of the critical residues. It is not clear whether these are functional vestiges or instances involving alternate substrates and interacting partners.


Subjects
Protein Databases , Molecular Evolution , Protein Tertiary Structure , Amino Acid Sequence Homology , Computational Biology , Genomics , Molecular Sequence Annotation