Results 1 - 20 of 52
1.
Methods Mol Biol ; 2447: 83-93, 2022.
Article in English | MEDLINE | ID: mdl-35583774

ABSTRACT

The physiological relevance of site-specific precursor processing for the biogenesis of peptide hormones and growth factors can be demonstrated in genetic complementation experiments, in which a gain of function is observed for the cleavable wild-type precursor, but not for a non-cleavable precursor mutant. Similarly, cleavable and non-cleavable synthetic peptides can be used in bioassays to test whether processing is required for bioactivity. In genetic complementation experiments, site-directed mutagenesis has to be used to mask a processing site against proteolysis. Peptide-based bioassays have the distinctive advantage that peptides can be protected against proteolytic cleavage by backbone modifications, i.e., without changing the amino acid sequence. Peptide backbone modifications have been employed to increase the metabolic stability of peptide drugs, and in basic research, to investigate whether processing at a certain site is required for precursor maturation and formation of the bioactive peptide. For this approach, it is important to show that modification of the peptide backbone has the desired effect and does indeed protect the respective peptide bond against proteolysis. This can be accomplished with the MALDI-TOF mass spectrometry-based assay we describe here.


Subjects
Peptide Hormones; Protein Processing, Post-Translational; Amino Acid Sequence; Peptide Hormones/metabolism; Protein Sorting Signals; Proteolysis
2.
Parasit Vectors ; 12(1): 508, 2019 Oct 30.
Article in English | MEDLINE | ID: mdl-31666116

ABSTRACT

BACKGROUND: New candidate protective antigens for tick vaccine development may be identified by selecting and testing antigen candidates that play key biological functions. After blood-feeding, tick midgut overexpresses proteins that play essential functions in tick survival and disease transmission. Herein, Ornithodoros erraticus midgut transcriptomic and proteomic data were examined in order to select functionally significant antigens upregulated after feeding to be tested as vaccine candidate antigens. METHODS: Transcripts annotated as chitinases, tetraspanins, ribosomal protein P0 and secreted proteins/peptides were mined from the recently published O. erraticus midgut transcriptome and filtered in a second selection step using criteria based on upregulation after feeding, predicted antigenicity and expression in the midgut proteome. Five theoretical candidate antigens were selected, obtained as recombinant proteins and used to immunise rabbits: one chitinase (CHI), two tetraspanins (TSPs), the ribosomal protein P0 (RPP0) and one secreted protein PK-4 (PK4). RESULTS: Rabbit vaccination with individual recombinant candidates induced strong humoral responses that mainly reduced nymph moulting and female reproduction, providing 30.2% (CHI), 56% (TSPs), 57.5% (RPP0) and 57.8% (PK4) protection to O. erraticus infestations and 19.6% (CHI), 11.1% (TSPs), 0% (RPP0) and 8.1% (PK4) cross-protection to infestations by the African tick Ornithodoros moubata. The joint vaccine efficacy of the candidates was assessed in a second vaccine trial reaching 66.3% protection to O. erraticus and 25.6% cross-protection to O. moubata. CONCLUSIONS: These results (i) indicate that argasid chitinases and RPP0 are promising protective antigens, as has already been demonstrated for ixodid chitinases and RPP0, and could be included in vaccines targeting multiple tick species; (ii) reveal novel protective antigens tetraspanins and secreted protein PK-4, never tested before as protective antigens in ticks; and (iii) demonstrate that multi-antigenic vaccines increased vaccine efficacy compared with individual antigens. Lastly, our data emphasize the value of the tick midgut as a source of protective candidate antigens in argasids for tick control.


Subjects
Arthropod Proteins/immunology; Ornithodoros/chemistry; Vaccines/immunology; Amino Acid Sequence; Animals; Antigens/immunology; Chitinases/chemistry; Epitopes/chemistry; Female; Glycoside Hydrolases/chemistry; Ornithodoros/classification; Ornithodoros/immunology; Phylogeny; Protein Sorting Signals; Rabbits; Recombinant Proteins/immunology; Ribosomal Proteins/immunology; Sequence Alignment; Tetraspanins/chemistry; Tetraspanins/immunology; Tetraspanins/isolation & purification
3.
Int J Mol Sci ; 19(12)2018 Nov 22.
Article in English | MEDLINE | ID: mdl-30469512

ABSTRACT

Signal peptides are N-terminal presequences responsible for targeting proteins to the endomembrane system, and subsequent subcellular or extracellular compartments, and consequently condition their proper function. The significance of signal peptides stimulates development of new computational methods for their detection. These methods employ learning systems trained on datasets comprising signal peptides from different types of proteins and taxonomic groups. As a result, the accuracy of predictions is high in the case of signal peptides that are well-represented in databases, but might be low in other, atypical cases. Such atypical signal peptides are present in proteins found in apicomplexan parasites, causative agents of malaria and toxoplasmosis. Apicomplexan proteins have a unique amino acid composition due to their AT-biased genomes. Therefore, we designed a new, more flexible and universal probabilistic model for recognition of atypical eukaryotic signal peptides. Our approach, called signalHsmm, includes knowledge about the structure of signal peptides and physicochemical properties of amino acids. It is able to recognize signal peptides from the malaria parasites and related species more accurately than popular programs. Moreover, it is still universal enough to provide prediction of other signal peptides on par with the best-performing predictors.


Subjects
Plasmodium/chemistry; Protein Sorting Signals; Protozoan Proteins/chemistry; Sequence Analysis, Protein/methods; Amino Acids/chemistry; Markov Chains; Sequence Analysis, Protein/standards
4.
J Bioinform Comput Biol ; 16(5): 1850019, 2018 10.
Article in English | MEDLINE | ID: mdl-30353782

ABSTRACT

Hidden Markov Models (HMMs) are probabilistic models widely used in computational molecular biology. However, the Markovian assumption regarding transition probabilities which dictates that the observed symbol depends only on the current state may not be sufficient for some biological problems. In order to overcome the limitations of the first order HMM, a number of extensions have been proposed in the literature to incorporate past information in HMMs conditioning either on the hidden states, or on the observations, or both. Here, we implement a simple extension of the standard HMM in which the current observed symbol (amino acid residue) depends both on the current state and on a series of observed previous symbols. The major advantage of the method is the simplicity in the implementation, which is achieved by properly transforming the observation sequence, using an extended alphabet. Thus, it can utilize all the available algorithms for the training and decoding of HMMs. We investigated the use of several encoding schemes and performed tests in a number of important biological problems previously studied by our team (prediction of transmembrane proteins and prediction of signal peptides). The evaluation shows that, when enough data are available, the performance increased by 1.8%-8.2% and the existing prediction methods may improve using this approach. The methods, for which the improvement was significant (PRED-TMBB2, PRED-TAT and HMM-TM), are available as web-servers freely accessible to academic users at www.compgen.org/tools/ .


Subjects
Computational Biology/methods; Markov Chains; Algorithms; Membrane Proteins/chemistry; Membrane Proteins/metabolism; Models, Molecular; Models, Statistical; Protein Sorting Signals
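
The abstract of record 4 describes transforming the observation sequence with an extended alphabet so that each symbol also carries information about previously observed residues. The following is a minimal sketch of one such encoding, not the authors' implementation: the function name `extend_alphabet`, the padding symbol `-`, and the positional integer coding are all assumptions made for illustration. The abstract mentions several encoding schemes without specifying them.

```python
# Hypothetical illustration: encode each residue together with the `order`
# residues before it, so a standard first-order HMM can be trained on the
# transformed sequence without changing its algorithms.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def extend_alphabet(sequence, order=1, alphabet=AMINO_ACIDS, pad="-"):
    """Return one integer per residue; each integer identifies the window
    (previous `order` residues + current residue) in the extended alphabet."""
    symbols = pad + alphabet          # pad symbol gets index 0
    base = len(symbols)
    index = {s: i for i, s in enumerate(symbols)}
    padded = pad * order + sequence   # pad the start so every residue has a full window
    encoded = []
    for i in range(order, len(padded)):
        window = padded[i - order:i + 1]
        code = 0
        for ch in window:             # positional (base-21) coding of the window
            code = code * base + index[ch]
        encoded.append(code)
    return encoded


if __name__ == "__main__":
    print(extend_alphabet("MKKLLA", order=1))
```

The encoded sequence has the same length as the input, and the extended alphabet grows as 21^(order + 1) in this sketch, which is one reason such an approach needs enough training data.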
5.
BMC Bioinformatics ; 17(1): 378, 2016 Sep 15.
Article in English | MEDLINE | ID: mdl-27634135

ABSTRACT

BACKGROUND: "Tail-anchored (TA) proteins" is a collective term for transmembrane proteins with a C-terminal transmembrane domain (TMD) and without an N-terminal signal sequence. TA proteins account for approximately 3-5 % of all transmembrane proteins that mediate membrane fusion, regulation of apoptosis, and vesicular transport. The combined use of TMD and signal sequence prediction tools is typically required to predict TA proteins. RESULTS: Here we developed a prediction system named TAPPM that predicted TA proteins solely from target amino acid sequences according to the knowledge of the sequence features of TMDs and the peripheral regions of TA proteins. Manually curated TA proteins were collected from published literature. We constructed hidden Markov models of TA proteins as well as three different types of transmembrane proteins with similar structures and compared their likelihoods as TA proteins. CONCLUSIONS: Using the HMM models, we achieved high prediction accuracy, with area under the receiver operating characteristic curve values reaching 0.963. A command line tool written in Python is available at https://github.com/davecao/tappm_cli .


Subjects
Membrane Proteins/chemistry; Sequence Analysis, Protein/methods; Humans; Markov Chains; Protein Domains; Protein Sorting Signals
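
Record 5 classifies sequences by comparing their likelihoods under several hidden Markov models. A minimal sketch of that general idea follows, using the standard forward algorithm with per-step scaling. It is not TAPPM: the two-letter alphabet (`H` for hydrophobic, `P` for polar), the two toy models, and every probability below are invented for illustration only.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm with per-step scaling; returns log P(obs | model)."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)            # rescale to avoid numerical underflow
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


# State 0 = polar region, state 1 = hydrophobic segment (toy parameters).
EMIT = [{"H": 0.3, "P": 0.7}, {"H": 0.9, "P": 0.1}]
MODELS = {
    # Polar region first, hydrophobic segment at the C-terminus.
    "tail_anchored_like": ([1.0, 0.0], [[0.9, 0.1], [0.0, 1.0]], EMIT),
    # Hydrophobic segment at the N-terminus, polar region afterwards.
    "n_terminal_anchor_like": ([0.0, 1.0], [[1.0, 0.0], [0.1, 0.9]], EMIT),
}


def classify(obs, models=MODELS):
    """Return the name of the highest-likelihood model and all log-likelihoods."""
    scores = {name: log_likelihood(obs, *params) for name, params in models.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    print(classify("PPPPPHHHHH"))
```

A real predictor would use the 20-letter amino acid alphabet and models trained on curated sequences; the sketch only shows how competing model likelihoods can be turned into a class decision.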
6.
Biol Direct ; 10: 31, 2015 May 28.
Article in English | MEDLINE | ID: mdl-26018427

ABSTRACT

BACKGROUND: Transmembrane proteins have important roles in cells, as they are involved in energy production, signal transduction, cell-cell interaction, cell-cell communication and more. In human cells, they are frequently targets for pharmaceuticals; therefore, knowledge about their properties and structure is crucial. The topology of transmembrane proteins provides low-resolution structural information, which can be a starting point for either laboratory experiments or modelling their 3D structures. RESULTS: Here, we present a database of the human α-helical transmembrane proteome, including the predicted and/or experimentally established topology of each transmembrane protein, together with the reliability of the prediction. In order to distinguish transmembrane proteins in the proteome as well as for topology prediction, we used a newly developed consensus method (CCTOP) that incorporates recent state of the art methods, with tested accuracies on a novel human benchmark protein set. CCTOP utilizes all available structure and topology data as well as bioinformatic evidence for topology prediction in a probabilistic framework provided by the hidden Markov model. This method shows the highest accuracy (98.5 % for discriminating between transmembrane and non-transmembrane proteins and 84 % for per protein topology prediction) among the dozen tested topology prediction methods. Analysis of the human proteome with the CCTOP indicates that it contains 4998 (26 %) transmembrane proteins. Besides predicting topology, reliability of the predictions is estimated as well, and it is demonstrated that the per protein prediction accuracies of more than 60 % of the predictions are over 98 % on the benchmark sets and most probably on the predicted human transmembrane proteome too. CONCLUSIONS: Here, we present the most accurate prediction of the human transmembrane proteome together with the experimental topology data. These data, as well as various statistics about the human transmembrane proteins and their topologies can be downloaded from and can be visualized at the website of the human transmembrane proteome ( http://htp.enzim.hu ).


Subjects
Membrane Proteins/chemistry; Proteome; Algorithms; Cell Communication; Computational Biology; Databases, Protein; Humans; Markov Chains; Probability; Protein Conformation; Protein Sorting Signals; Signal Transduction
7.
J Am Soc Mass Spectrom ; 25(5): 722-8, 2014 May.
Article in English | MEDLINE | ID: mdl-24526466

ABSTRACT

Coarse-grained simulations with charge hopping were performed for a positively charged tetrameric transthyretin (TTR) protein complex with a total charge of +20. Charges were allowed to move among basic amino acid sites as well as N-termini. Charge distributions and radii of gyration were calculated for complexes simulated at two temperatures, 300 and 600 K, under different scenarios. One scenario treated the complex in its normal state allowing charge to move to any basic site. Another scenario blocked protonation of all the N-termini except one. A final scenario used the complex in its normal state but added a basic-site containing tether (charge tag) near the N-terminus of one chain. The differences in monomer unfolding and charging were monitored in all three scenarios and compared. The simulation results show the importance of the N-terminus in leading the unfolding of the monomer units; a process that follows a zipper-like mechanism. Overall, experimentally modifying the complex by adding a tether or blocking the protonation of N-termini may give the potential for controlling the unraveling and subsequent dissociation of protein complexes.


Subjects
Models, Molecular; Prealbumin/chemistry; Amino Acid Substitution; Hot Temperature; Humans; Immobilized Proteins/chemistry; Immobilized Proteins/genetics; Kinetics; Molecular Dynamics Simulation; Monte Carlo Method; Mutant Proteins/chemistry; Prealbumin/genetics; Protein Sorting Signals; Protein Structure, Quaternary; Protein Unfolding; Surface Properties; Volatilization
8.
J Proteome Res ; 12(10): 4449-61, 2013 Oct 04.
Article in English | MEDLINE | ID: mdl-24007199

ABSTRACT

The secretion of certain proteins in Porphyromonas gingivalis is dependent on a C-terminal domain (CTD). After secretion, the CTD is cleaved prior to extensive modification of the mature protein, probably with lipopolysaccharide, therefore enabling attachment to the cell surface. In this study, bioinformatic analyses of the CTD demonstrated the presence of three conserved sequence motifs. These motifs were used to construct Hidden Markov Models (HMMs) that predicted 663 CTD-containing proteins in 21 fully sequenced species of the Bacteroidetes phylum, while no CTD-containing proteins were predicted in species outside this phylum. Further HMM searching of Cytophaga hutchinsonii led to a total of 171 predicted CTD proteins in that organism alone. Proteomic analyses of membrane fractions and culture fluid derived from P. gingivalis and four other species containing predicted CTDs (Parabacteroides distasonis, Prevotella intermedia, Tannerella forsythia, and C. hutchinsonii) demonstrated that membrane localization, extensive post-translational modification, and CTD-cleavage were conserved features of the secretion system. The CTD cleavage site of 10 different proteins from 3 different species was determined and found to be similar to the cleavage site previously determined in P. gingivalis, suggesting that homologues of the C-terminal signal peptidase (PG0026) are responsible for the cleavage in these species.


Subjects
Bacterial Proteins/metabolism; Membrane Proteins/metabolism; Porphyromonas gingivalis/metabolism; Prevotella intermedia/metabolism; Protein Processing, Post-Translational; Amino Acid Sequence; Bacterial Proteins/chemistry; Bacterial Secretion Systems; Bacteroidetes/metabolism; Markov Chains; Membrane Proteins/chemistry; Molecular Sequence Data; Phylogeny; Protein Sorting Signals; Sequence Homology, Amino Acid
9.
PLoS One ; 8(6): e65012, 2013.
Article in English | MEDLINE | ID: mdl-23762278

ABSTRACT

The problem of reconstruction of ancestral states given a phylogeny and data from extant species arises in a wide range of biological studies. The continuous-time Markov model for the discrete states evolution is generally used for the reconstruction of ancestral states. We modify this model to account for a case when the states of the extant species are uncertain. This situation appears, for example, if the states for extant species are predicted by some program and thus are known only with some level of reliability; this is common in the bioinformatics field. The main idea is the formulation of the problem as a hidden Markov model on a tree (tree HMM, tHMM), where the basic continuous-time Markov model is expanded with the introduction of emission probabilities of observed data (e.g. prediction scores) for each underlying discrete state. Our tHMM decoding algorithm allows us to predict states at the ancestral nodes as well as to refine states at the leaves on the basis of quantitative comparative genomics. The test on the simulated data shows that the tHMM approach applied to the continuous variable reflecting the probabilities of the states (i.e. prediction score) appears to be more accurate than the reconstruction from the discrete states assignment defined by the best score threshold. We provide examples of applying our model to the evolutionary analysis of N-terminal signal peptides and transcription factor binding sites in bacteria. The program is freely available at http://bioinf.fbb.msu.ru/~nadya/tHMM and via web-service at http://bioinf.fbb.msu.ru/treehmmweb.


Subjects
Biological Evolution; Genomics; Markov Chains; Models, Genetic; Algorithms; Asparaginase/metabolism; Bayes Theorem; Binding Sites; Computer Simulation; Phylogeny; Protein Sorting Signals/genetics; Transcription Factors/metabolism
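
Record 9 treats ancestral state reconstruction as a hidden Markov model on a tree, where leaves contribute emission probabilities instead of fixed states. The sketch below illustrates only the upward (pruning) pass for the root under strong simplifying assumptions that are not taken from the paper: two hidden states, a symmetric continuous-time Markov model with a single rate, and a nested-dictionary tree format invented here. It does not reproduce the authors' decoding algorithm, which also predicts internal nodes and refines leaf states.

```python
import math


def transition_matrix(rate, t):
    """P(t) for a symmetric two-state continuous-time Markov chain with
    generator [[-rate, rate], [rate, -rate]]."""
    decay = math.exp(-2.0 * rate * t)
    same, diff = 0.5 + 0.5 * decay, 0.5 - 0.5 * decay
    return [[same, diff], [diff, same]]


def partial_likelihood(node, rate):
    """Likelihood of the data below `node` for each hidden state of `node`."""
    if "emission" in node:
        # Leaf: P(observed score | state), i.e. the state is uncertain.
        return list(node["emission"])
    result = [1.0, 1.0]
    for child, branch_length in node["children"]:
        p = transition_matrix(rate, branch_length)
        child_like = partial_likelihood(child, rate)
        for s in range(2):
            result[s] *= sum(p[s][j] * child_like[j] for j in range(2))
    return result


def root_posterior(tree, rate=1.0, prior=(0.5, 0.5)):
    """Posterior probability of each state at the root."""
    like = partial_likelihood(tree, rate)
    joint = [prior[s] * like[s] for s in range(2)]
    total = sum(joint)
    return [x / total for x in joint]


# Toy tree: ((A:0.1, B:0.1):0.2, C:0.3); emission values are made up.
TREE = {
    "children": [
        ({"children": [({"emission": [0.2, 0.8]}, 0.1),
                       ({"emission": [0.1, 0.9]}, 0.1)]}, 0.2),
        ({"emission": [0.6, 0.4]}, 0.3),
    ]
}

if __name__ == "__main__":
    print(root_posterior(TREE))
```

In this toy tree, two closely related leaves favour state 1 and a more distant leaf weakly favours state 0, so the root posterior leans towards state 1.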
10.
Microbiology (Reading) ; 159(Pt 7): 1267-1275, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23704786

ABSTRACT

The facultatively anaerobic, thermophilic bacterium Geobacillus thermoglucosidasius is being developed as an industrial micro-organism for cellulosic bioethanol production. Process improvement would be gained by enhanced secretion of glycosyl hydrolases. Here we report the construction of a modular system for combining promoters, signal peptide encoding regions and glycosyl hydrolase genes to facilitate selection of the optimal combination in G. thermoglucosidasius. Initially, a minimal three-part E. coli-Geobacillus sp. shuttle vector pUCG3.8 was constructed using Gibson isothermal DNA assembly. The three PCR amplicons contained the pMB1 E. coli origin of replication and multiple cloning site (MCS) of pUC18, the Geobacillus sp. origin of replication pBST1 and the thermostable kanamycin nucleotidyltransferase gene (knt), respectively. G. thermoglucosidasius could be transformed with pUCG3.8 at an increased efficiency [2.8×10^5 c.f.u. (µg DNA)^-1] compared to a previously reported shuttle vector, pUCG18. A modular cassette for the inducible expression and secretion of proteins in G. thermoglucosidasius, designed to allow the simple interchange of parts, was demonstrated using the endoglucanase Cel5A from Thermotoga maritima as a secretion target. Expression of cel5A was placed under the control of a cellobiose-inducible promoter (Pβglu) together with a signal peptide encoding sequence from a G. thermoglucosidasius C56-YS93 endo-β-1,4-xylanase. The interchange of parts was demonstrated by exchanging the cel5A gene with the 3' region of a gene with homology to celA from Caldicellulosiruptor saccharolyticus and substituting Pβglu for the synthetic, constitutive promoter PUp2n38, which increased Cel5A activity five-fold. Cel5A and CelA activities were detected in culture supernatants indicating successful expression and secretion. N-terminal protein sequencing of Cel5A carrying a C-terminal FLAG epitope confirmed processing of the signal peptide sequence.


Subjects
Cellulase/metabolism; Genetic Vectors; Geobacillus/enzymology; Synthetic Biology/methods; Thermotoga maritima/enzymology; Thermotoga maritima/genetics; Cellobiose/metabolism; Cellulase/genetics; Cloning, Molecular; Endo-1,4-beta Xylanases/genetics; Endo-1,4-beta Xylanases/metabolism; Geobacillus/classification; Geobacillus/genetics; Promoter Regions, Genetic/genetics; Protein Sorting Signals
11.
Clin Chem ; 58(4): 757-67, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22291141

ABSTRACT

BACKGROUND: New biomarkers are needed to assist clinical decision making in cardiovascular disease. We have recently shown that signal peptides may represent a novel biomarker target in cardiovascular diseases. METHODS: We developed a novel immunoassay for the signal peptide of preproANP (ANPsp) and used it to document cardiac tissue levels of ANPsp in explant human hearts (n = 9), circulating venous concentrations of ANPsp in healthy volunteers (n = 65), temporal ANPsp concentrations in patients with ST-elevation myocardial infarction (STEMI) <4 h after chest pain onset (n = 23), and regional plasma ANPsp concentrations in patients undergoing clinically indicated catheterization (n = 10). We analyzed the structure and sequence of circulating ANPsp by tandem mass spectrometry (MS/MS). RESULTS: ANPsp levels in human heart tissue were 50-1000 times lower than those of ANP/NT-proANP. ANPsp was detectable in control human plasma at concentrations comparable with ANP itself (approximately 20 ng/L). In STEMI patients, plasma concentrations of ANPsp rose to peak values at 5 h after symptom onset, significantly earlier than myoglobin, creatine kinase-MB, and troponin (P < 0.001). There were significant arteriovenous increases in ANPsp concentrations (P < 0.05) across the heart and kidney; arterial and coronary sinus concentrations of ANPsp both negatively correlated with systolic and mean arterial blood pressures (both P < 0.01). MS/MS verified circulating ANPsp to be preproANP(16-25) and preproANP(18-25). CONCLUSIONS: ANPsp is a novel circulating natriuretic peptide with potential to act as a cardiovascular biomarker. The rapid increase of plasma ANPsp in STEMI and its significant relationship with blood pressure encourage further study of its potential clinical utility.


Subjects
Atrial Natriuretic Factor/blood; Protein Sorting Signals; Atrial Natriuretic Factor/chemistry; Biomarkers/blood; Chromatography, Gel; Chromatography, High Pressure Liquid; Humans; Immunoassay; Myocardial Infarction/blood; Myocardial Infarction/diagnosis; Myocardium/metabolism; Tandem Mass Spectrometry
12.
Biochim Biophys Acta ; 1824(3): 488-92, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22244925

ABSTRACT

Conopeptides are small toxins produced by predatory marine snails of the genus Conus. They are studied with increasing intensity due to their potential in neurosciences and pharmacology. The number of existing conopeptides is estimated to be 1 million, but only about 1000 have been described to date. Thanks to new high-throughput sequencing technologies the number of known conopeptides is likely to increase exponentially in the near future. There is therefore a need for a fast and accurate computational method for identification and classification of the novel conopeptides in large data sets. 62 profile Hidden Markov Models (pHMMs) were built for prediction and classification of all described conopeptide superfamilies and families, based on the different parts of the corresponding protein sequences. These models showed very high specificity in detection of new peptides. 56 out of 62 models do not give a single false positive in a test with the entire UniProtKB/Swiss-Prot protein sequence database. Our study demonstrates the usefulness of mature peptide models for automatic classification with accuracy of 96% for the mature peptide models and 100% for the pro- and signal peptide models. Our conopeptide profile HMMs can be used for finding and annotation of new conopeptides from large datasets generated by transcriptome or genome sequencing. To our knowledge this is the first time this kind of computational method has been applied to predict all known conopeptide superfamilies and some conopeptide families.


Subjects
Conotoxins/classification; Conus Snail/chemistry; Neurotoxins/classification; Protein Precursors/classification; Transcriptome; Amino Acid Sequence; Animals; Conotoxins/chemistry; Conotoxins/isolation & purification; Conus Snail/genetics; Databases, Protein; Markov Chains; Molecular Sequence Data; Neurotoxins/chemistry; Neurotoxins/isolation & purification; Phylogeny; Protein Precursors/chemistry; Protein Precursors/isolation & purification; Protein Sorting Signals/physiology; Sequence Analysis, Protein; Terminology as Topic
13.
J Comput Biol ; 18(11): 1709-22, 2011 Nov.
Article in English | MEDLINE | ID: mdl-21999284

ABSTRACT

Proper subcellular localization is critical for proteins to perform their roles in cellular functions. Proteins are transported by different cellular sorting pathways, some of which take a protein through several intermediate locations until reaching its final destination. The pathway a protein is transported through is determined by carrier proteins that bind to specific sequence motifs. In this article, we present a new method that integrates protein interaction and sequence motif data to model how proteins are sorted through these sorting pathways. We use a hidden Markov model (HMM) to represent protein sorting pathways. The model is able to determine intermediate sorting states and to assign carrier proteins and motifs to the sorting pathways. In simulation studies, we show that the method can accurately recover an underlying sorting model. Using data for yeast, we show that our model leads to accurate prediction of subcellular localization. We also show that the pathways learned by our model recover many known sorting pathways and correctly assign proteins to the path they utilize. The learned model identified new pathways and their putative carriers and motifs and these may represent novel protein sorting mechanisms. Supplementary results and software implementation are available from http://murphylab.web.cmu.edu/software/2010_RECOMB_pathways/.


Subjects
Protein Sorting Signals; Protein Transport; Software; Algorithms; Amino Acid Motifs; Artificial Intelligence; Computer Simulation; Fungal Proteins/chemistry; Markov Chains; Models, Biological; Protein Interaction Domains and Motifs
14.
Proc Natl Acad Sci U S A ; 108(9): 3596-601, 2011 Mar 01.
Article in English | MEDLINE | ID: mdl-21317362

ABSTRACT

Nascent membrane proteins typically insert in a sequential fashion into the membrane via a protein-conducting channel, the Sec translocon. How this process occurs is still unclear, although a thermodynamic partitioning between the channel and the membrane environment has been proposed. Experiment- and simulation-based scales for the insertion free energy of various amino acids are, however, at variance, the former appearing to lie in a narrower range than the latter. Membrane insertion of arginine, for instance, requires 14-17 kcal/mol according to molecular dynamics simulations, but only 2-3 kcal/mol according to experiment. We suggest that this disagreement is resolved by assuming a two-stage insertion process wherein the first step, the insertion into the translocon, is energized by protein synthesis and, therefore, has an effectively zero free-energy cost; the second step, the insertion into the membrane, invokes the translocon as an intermediary between the fully hydrated and the fully inserted locations. Using free-energy perturbation calculations, the effective transfer free energies from the translocon to the membrane have been determined for both arginine and leucine amino acids carried by a background polyleucine helix. Indeed, the insertion penalty for arginine as well as the insertion gain for leucine from the translocon to the membrane is found to be significantly reduced compared to direct insertion from water, resulting in the same compression as observed in the experiment-based scale.


Subjects
Membrane Proteins/metabolism; Protein Sorting Signals; Arginine/metabolism; Computer Simulation; Hydrophobic and Hydrophilic Interactions; Lipid Bilayers/metabolism; Models, Molecular; Peptides/chemistry; Protein Structure, Secondary; Thermodynamics; Water/chemistry
15.
Article in English | MEDLINE | ID: mdl-21233524

ABSTRACT

Many methods have been described to predict the subcellular location of proteins from sequence information. However, most of these methods either rely on global sequence properties or use a set of known protein targeting motifs to predict protein localization. Here, we develop and test a novel method that identifies potential targeting motifs using a discriminative approach based on hidden Markov models (discriminative HMMs). These models search for motifs that are present in a compartment but absent in other, nearby, compartments by utilizing a hierarchical structure that mimics the protein sorting mechanism. We show that both discriminative motif finding and the hierarchical structure improve localization prediction on a benchmark data set of yeast proteins. The motifs identified can be mapped to known targeting motifs and they are more conserved than the average protein sequence. Using our motif-based predictions, we can identify potential annotation errors in public databases for the location of some of the proteins. A software implementation and the data set described in this paper are available from http://murphylab.web.cmu.edu/software/2009_TCBB_motif/.


Subjects
Protein Sorting Signals; Proteins/analysis; Sequence Analysis, Protein/methods; Algorithms; Amino Acid Motifs; Computational Biology/methods; Databases, Protein; Fungal Proteins/analysis; Fungal Proteins/chemistry; Markov Chains; Proteins/chemistry; Sequence Alignment/methods
16.
J Immunol Methods ; 364(1-2): 77-82, 2011 Feb 01.
Article in English | MEDLINE | ID: mdl-21093446

ABSTRACT

Chemokines, a class of small secreted proteins, direct immune cells to their target sites and play an important role in chronic inflammations and allergies. To study their interactions with their cellular receptors or potential inhibitors large quantities of chemokines are required. Here we present a fast and efficient strategy to purify the human chemokine interleukin-8 (IL-8, CXCL8). The chemokine is expressed with a pelB-leader peptide that is cleaved off its N-terminus by an endogenous bacterial peptidase. This yields wild-type 72aa IL-8 with a serine at its N-terminus. IL-8 is recovered in the soluble fraction after lysis while pelB-IL8 fusion protein remains in the pellet. Interleukin-8 is purified via cation exchange chromatography and heparin affinity chromatography using a single inexpensive buffer system. No dialysis or membrane filtration steps are required and the final protein fractions may be used without any desalting steps. The use of 0.5% Triton X-114 in the lysis buffer leads to low endotoxin levels in the resulting protein. The protein can be eluted from the gel filtration column with a variety of buffers and is ready to be used in binding assays and activity assays.


Subjects
Interleukin-8/metabolism; Recombinant Fusion Proteins/metabolism; Buffers; Chromatography, Affinity; Cost-Benefit Analysis; Endotoxins/metabolism; Humans; Interleukin-8/genetics; Interleukin-8/isolation & purification; Octoxynol; Polyethylene Glycols/chemistry; Protein Engineering/methods; Protein Sorting Signals/genetics; Recombinant Fusion Proteins/genetics; Recombinant Fusion Proteins/isolation & purification
17.
Curr Protein Pept Sci ; 11(7): 550-61, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20887261

ABSTRACT

Transmembrane protein topology prediction methods play important roles in structural biology, because structure determination of these types of proteins is extremely difficult by the common biophysical, biochemical and molecular biological methods. The need for accurate prediction methods is high, as the number of known membrane protein structures falls far behind the estimated number of these proteins in various genomes. The accuracy of these prediction methods appears to be higher than that of most prediction methods applied to globular proteins; however, it decreases slightly with the increasing number of structures. Unfortunately, most prediction algorithms use common machine learning techniques, and they do not reveal why topologies are predicted with such a high success rate or which biophysical or biochemical properties are important to achieve this level of accuracy. Incorporating the topology data determined so far into the prediction methods as constraints helps to reach even higher prediction accuracy; therefore, collection of such topology data is also an important issue.


Subjects
Membrane Proteins/chemistry; Computer Simulation; Databases, Protein; Markov Chains; Membrane Proteins/classification; Models, Molecular; Neural Networks, Computer; Protein Sorting Signals; Protein Structure, Secondary; Protein Structure, Tertiary
18.
Bioinformatics ; 26(22): 2811-7, 2010 Nov 15.
Article in English | MEDLINE | ID: mdl-20847219

ABSTRACT

MOTIVATION: Computational prediction of signal peptides is of great importance in computational biology. In addition to the general secretory pathway (Sec), Bacteria, Archaea and chloroplasts possess another major pathway that utilizes the Twin-Arginine translocase (Tat), which recognizes longer and less hydrophobic signal peptides carrying a distinctive pattern of two consecutive Arginines (RR) in the n-region. A major functional difference between the Sec and Tat export pathways is that the former translocates secreted proteins unfolded through a protein-conducting channel, whereas the latter translocates completely folded proteins using an unknown mechanism. The purpose of this work is to develop a novel method for predicting Sec and Tat signal peptides and discriminating between them with better accuracy. RESULTS: We report the development of a novel method, PRED-TAT, which is capable of discriminating Sec from Tat signal peptides and predicting their cleavage sites. The method is based on Hidden Markov Models and possesses a modular architecture suitable for both Sec and Tat signal peptides. On an independent test set of experimentally verified Tat signal peptides, PRED-TAT clearly outperforms the previously proposed methods TatP and TATFIND, whereas, when evaluated as a Sec signal peptide predictor, it compares favorably to top-scoring predictors such as SignalP and Phobius. The method is freely available for academic users at http://www.compgen.org/tools/PRED-TAT/.
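
The twin-arginine pattern mentioned in this abstract can be illustrated with a very small scan. The sketch below is not PRED-TAT and does not use a Hidden Markov Model; it only reports where two consecutive arginines occur near the N-terminus. The function name `find_twin_arginine` and the default window of 35 residues are assumptions made for illustration, not values taken from the abstract.

```python
import re


def find_twin_arginine(seq, window=35):
    """Return 0-based start positions of 'RR' pairs in the first `window` residues.

    `window` is a hypothetical cutoff for the N-terminal region; both
    arginines must fall inside it to be reported.
    """
    head = seq[:window].upper()
    # A zero-width lookahead lets overlapping pairs (e.g. 'RRR') all be found.
    return [m.start() for m in re.finditer(r"(?=RR)", head)]


if __name__ == "__main__":
    # Made-up toy sequence, not a real signal peptide.
    print(find_twin_arginine("MSRRQFLKAAAA"))
```

A hit from such a scan is only a hint; the abstract's point is that a dedicated model is needed to separate Tat from Sec signal peptides and to place the cleavage site.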


Subjects
Computational Biology/methods; Markov Chains; Protein Sorting Signals; Databases, Protein; Membrane Transport Proteins/chemistry; Protein Folding; Secretory Pathway
19.
PLoS Comput Biol ; 6(7): e1000867, 2010 Jul 29.
Article in English | MEDLINE | ID: mdl-20686689

ABSTRACT

Large-scale genome sequencing gained general importance for life science because functional annotation of otherwise experimentally uncharacterized sequences is made possible by the theory of biomolecular sequence homology. Historically, the paradigm of similarity of protein sequences implying common structure, function and ancestry was generalized based on studies of globular domains. Having the same fold imposes strict conditions over the packing in the hydrophobic core, requiring similarity of hydrophobic patterns. The implications of sequence similarity among non-globular protein segments have not been studied to the same extent; nevertheless, homology considerations are silently extended to them. This appears especially detrimental in the case of transmembrane helices (TMs) and signal peptides (SPs), where sequence similarity is necessarily a consequence of physical requirements rather than common ancestry. Thus, matching of SPs/TMs creates the illusion of matching hydrophobic cores. Therefore, inclusion of SPs/TMs in domain models can give rise to wrong annotations. More than 1001 domains among the 10,340 models of Pfam release 23 and 18 domains of SMART version 6 (out of 809) contain SP/TM regions. As expected, fragment-mode HMM searches generate promiscuous hits limited solely to the SP/TM part among clearly unrelated proteins. More worryingly, we show explicit examples in which the scores of clearly false-positive hits, even in global-mode searches, can be elevated into the significance range just by matching the hydrophobic runs. In the PIR iProClass database v3.74, using conservative criteria, we find that at least between 2.1% and 13.6% of its annotated Pfam hits appear unjustified for a set of validated domain models. Thus, false-positive domain hits enforced by SP/TM regions can lead to dramatic annotation errors where the hit has nothing in common with the problematic domain model except the SP/TM region itself. We suggest a workflow of flagging problematic hits arising from SP/TM-containing models for critical reconsideration by annotation users.
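
One way to picture the flagging idea is to measure how much of a domain hit lies inside predicted SP/TM segments. The sketch below is a hypothetical illustration and not the authors' workflow: the function names, the 1-based inclusive coordinates, and the 0.8 threshold are all assumptions chosen for the example.

```python
def sp_tm_fraction(hit, segments):
    """Fraction of a hit's residues that fall inside SP/TM segments.

    `hit` and each entry of `segments` are (start, end) pairs in
    1-based inclusive sequence coordinates (an assumed convention).
    """
    start, end = hit
    covered = set()
    for seg_start, seg_end in segments:
        lo, hi = max(start, seg_start), min(end, seg_end)
        # range() is empty when the segment does not overlap the hit.
        covered.update(range(lo, hi + 1))
    return len(covered) / (end - start + 1)


def flag_hit(hit, segments, threshold=0.8):
    """Flag a hit for manual review if it is mostly SP/TM (hypothetical cutoff)."""
    return sp_tm_fraction(hit, segments) >= threshold


if __name__ == "__main__":
    # A hit at residues 5-24 against a predicted signal peptide at 1-22.
    print(flag_hit((5, 24), [(1, 22)]))
```

A flagged hit is not automatically wrong; the flag only marks it for the critical reconsideration the abstract recommends.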


Subjects
Computational Biology/methods; Databases, Protein; Protein Sorting Signals; Proteins/chemistry; Sequence Homology, Amino Acid; Animals; Humans; Markov Chains; Membrane Proteins/chemistry; Membrane Proteins/classification; Pattern Recognition, Automated; Protein Folding; Protein Structure, Tertiary; Proteins/classification; Reproducibility of Results
20.
Comput Biol Med ; 40(7): 621-8, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20488436

ABSTRACT

BACKGROUND: Hidden Markov models (HMMs) have been extensively used in computational molecular biology for modelling protein and nucleic acid sequences. The design of the model architecture and the algorithms for parameter estimation and decoding are extremely important for improving the performance of an HMM. In topology prediction of transmembrane beta-barrel proteins (TMBs), the Baum-Welch algorithm is widely adopted for HMM training but usually leads to a sub-optimal model in practice. In addition, all existing HMM-based predictors are designed to model only the transmembrane segment, without a submodel for the signal peptide (SP) of full-length sequences, which makes it inconvenient for users to investigate the structures of full-length TMB sequences. RESULTS: We present an HMM that combines a transmembrane barrel submodel and an SP submodel for both topology and SP prediction. A new genetic algorithm (GA) is used to train the model, and the Posterior-Viterbi algorithm is adopted for decoding. A dataset of 33 TMBs, the largest reported in the literature so far, was collected for model training and testing. Results of self-consistency and jackknife tests show that the GA has better global performance than the Baum-Welch algorithm. Results of jackknife tests show that this method performs better than all well-known existing methods for topology prediction. Furthermore, it provides a function to predict SPs in full-length TMB sequences with fair accuracy. CONCLUSION: We show that our combined HMM-based method is a better choice for TMB topology prediction, as it provides topology predictions with higher accuracy and additional SP predictions for full-length TMB sequences.
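
Posterior-Viterbi decoding, as named in this abstract, can be sketched in two steps: compute per-position posterior state probabilities with the forward-backward algorithm, then use dynamic programming to find the path that maximizes the product of those posteriors while using only transitions the model allows. The code below is a minimal illustration on a generic discrete HMM, not the authors' TMB model; the function names and the list-based parameter layout are assumptions. It works with unscaled probabilities, so it is suitable for short sequences only and would need scaling or log-space arithmetic in practice.

```python
def posteriors(obs, init, trans, emit):
    """Forward-backward posteriors P(state at t | obs) for a discrete HMM.

    init[s], trans[r][s] and emit[s][symbol] are plain probability lists.
    """
    n, k = len(obs), len(init)
    fwd = [[0.0] * k for _ in range(n)]
    bwd = [[1.0] * k for _ in range(n)]
    for s in range(k):
        fwd[0][s] = init[s] * emit[s][obs[0]]
    for t in range(1, n):
        for s in range(k):
            total_in = sum(fwd[t - 1][r] * trans[r][s] for r in range(k))
            fwd[t][s] = emit[s][obs[t]] * total_in
    for t in range(n - 2, -1, -1):
        for s in range(k):
            bwd[t][s] = sum(
                trans[s][r] * emit[r][obs[t + 1]] * bwd[t + 1][r] for r in range(k)
            )
    likelihood = sum(fwd[n - 1])
    return [[fwd[t][s] * bwd[t][s] / likelihood for s in range(k)] for t in range(n)]


def posterior_viterbi(obs, init, trans, emit):
    """Best allowed path under the product of posterior probabilities."""
    post = posteriors(obs, init, trans, emit)
    n, k = len(obs), len(init)
    score = [[0.0] * k for _ in range(n)]
    back = [[0] * k for _ in range(n)]
    for s in range(k):
        # Only states with non-zero initial probability may start a path.
        score[0][s] = post[0][s] if init[s] > 0 else 0.0
    for t in range(1, n):
        for s in range(k):
            best, best_r = 0.0, 0
            for r in range(k):
                # Only transitions with non-zero probability are allowed.
                if trans[r][s] > 0 and score[t - 1][r] > best:
                    best, best_r = score[t - 1][r], r
            score[t][s] = best * post[t][s]
            back[t][s] = best_r
    state = max(range(k), key=lambda s: score[n - 1][s])
    path = [state]
    for t in range(n - 1, 0, -1):
        state = back[t][state]
        path.append(state)
    return path[::-1]
```

The restriction to allowed transitions is what separates this from simply picking the most probable state at each position, which can produce paths the model grammar forbids.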


Subjects
Bacterial Proteins/chemistry; Computational Biology/methods; Markov Chains; Membrane Proteins/chemistry; Models, Genetic; Protein Sorting Signals; Algorithms; Databases, Protein; Models, Molecular; Models, Statistical; Protein Structure, Secondary; Sequence Analysis, Protein