Results 1 - 8 of 8
1.
IEEE/ACM Trans Comput Biol Bioinform ; 17(6): 1918-1931, 2020.
Article in English | MEDLINE | ID: mdl-30998480

ABSTRACT

As the first step of machine-learning-based protein structure and function prediction, amino acid encoding plays a fundamental role in the final success of those methods. Unlike protein sequence encoding, amino acid encoding can be used in both residue-level and sequence-level prediction of protein properties by combining it with different algorithms. However, it has not attracted enough attention in the past decades, and there are no comprehensive reviews and assessments of encoding methods so far. In this article, we make a systematic classification and present a comprehensive review and assessment of various amino acid encoding methods. Those methods are grouped into five categories according to their information sources and information extraction methodologies: binary encoding, physicochemical properties encoding, evolution-based encoding, structure-based encoding, and machine-learning encoding. Then, 16 representative methods from the five categories are selected and compared on protein secondary structure prediction and protein fold recognition tasks using large-scale benchmark datasets. The results show that the evolution-based position-dependent encoding method PSSM achieved the best performance, and the structure-based and machine-learning encoding methods also show some potential for further application; the neural-network-based distributed representation of amino acids in particular may bring new light to this area. We hope that the review and assessment are useful for future studies in amino acid encoding.


Subjects
Amino Acid Sequence/genetics, Amino Acids/chemistry, Computational Biology/methods, Proteins, Protein Sequence Analysis/methods, Algorithms, Protein Folding, Protein Secondary Structure/genetics, Proteins/chemistry, Proteins/genetics, Proteins/physiology
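
The abstract of entry 1 names binary encoding as the first of its five categories. As a rough illustration only, the sketch below shows one common form of binary encoding, a one-hot vector per residue. The alphabet order, the name `one_hot_encode`, and the all-zero handling of unknown residues are assumptions made here, not details taken from the article.

```python
# Minimal sketch of binary (one-hot) amino acid encoding.
# Alphabet order and function name are illustrative assumptions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return one 20-dimensional 0/1 vector per residue.

    Residues outside the 20-letter alphabet (for example 'X') are mapped
    to an all-zero vector; this is an assumption of the sketch.
    """
    encoded = []
    for residue in sequence.upper():
        vector = [0] * len(AMINO_ACIDS)
        position = INDEX.get(residue)
        if position is not None:
            vector[position] = 1
        encoded.append(vector)
    return encoded
```

Calling `one_hot_encode("ACD")` yields three vectors of length 20, each with a single 1 at the index of its residue. The resulting matrix can be fed to residue-level predictors directly or pooled for sequence-level tasks.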
2.
J Mol Biol ; 410(2): 357-67, 2011 Jul 08.
Article in English | MEDLINE | ID: mdl-21616081

ABSTRACT

Small soluble oligomers, and dimers in particular, of the amyloid β-peptide (Aβ) are believed to play an important pathological role in Alzheimer's disease. Here, we investigate the spontaneous dimerization of Aβ42, with 42 residues, by implicit solvent all-atom Monte Carlo simulations, for the wild-type peptide and the mutants F20E, E22G and E22G/I31E. The observed dimers of these variants share many overall conformational characteristics but differ in several aspects at a detailed level. In all four cases, the most common type of secondary structure is intramolecular antiparallel β-sheets. Parallel, in-register β-sheet structure, as in models for Aβ fibrils, is rare. The primary force driving the formation of dimers is hydrophobic attraction. The conformational differences that we do see involve turns centered in the 20-30 region. The probability of finding turns centered in the 25-30 region, where there is a loop in Aβ fibrils, is found to increase upon dimerization and to correlate with experimentally measured rates of fibril formation for the different Aβ42 variants. Our findings hint at reorganization of this part of the molecule as a potentially critical step in Aβ aggregation.


Subjects
Amyloid beta-Peptides/biosynthesis, Amyloid beta-Peptides/chemistry, Genetic Variation, Monte Carlo Method, Peptide Fragments/biosynthesis, Peptide Fragments/chemistry, Protein Multimerization, Amino Acid Substitution/genetics, Amyloid beta-Peptides/genetics, Computer Simulation, Humans, Molecular Dynamics Simulation, Mutation, Peptide Fragments/genetics, Protein Conformation, Protein Multimerization/genetics, Protein Secondary Structure/genetics
3.
Genome Biol ; 11(9): R98, 2010.
Article in English | MEDLINE | ID: mdl-20920270

ABSTRACT

BACKGROUND: The nature of the protein molecular clock, the protein-specific rate of amino acid substitutions, is among the central questions of molecular evolution. Protein expression level is the dominant determinant of the clock rate in a number of organisms. It has been suggested that highly expressed proteins evolve slowly in all species mainly to maintain robustness to translation errors that generate toxic misfolded proteins. Here we investigate this hypothesis experimentally by comparing the growth rate of Escherichia coli expressing wild type and misfolding-prone variants of the LacZ protein. RESULTS: We show that the cost of toxic protein misfolding is small compared to other costs associated with protein synthesis. Complementary computational analyses demonstrate that there is also a relatively weaker, but statistically significant, selection for increasing solubility and polarity in highly expressed E. coli proteins. CONCLUSIONS: Although we cannot rule out the possibility that selection against misfolding toxicity significantly affects the protein clock in species other than E. coli, our results suggest that it is unlikely to be the dominant and universal factor determining the clock rate in all organisms. We find that in this bacterium other costs associated with protein synthesis are likely to play an important role. Interestingly, our experiments also suggest significant costs associated with volume effects, such as jamming of the cellular environment with unnecessary proteins.


Subjects
Amino Acid Substitution, Escherichia coli Proteins/genetics, Escherichia coli, Molecular Evolution, Protein Folding, Protein Secondary Structure/genetics, beta-Galactosidase/genetics, Western Blotting, Polyacrylamide Gel Electrophoresis, Escherichia coli/genetics, Escherichia coli/growth & development, Escherichia coli/metabolism, Escherichia coli Proteins/chemistry, Escherichia coli Proteins/metabolism, Genomics, Lac Operon, Mutation, Protein Biosynthesis, Protein Stability, Solubility, Structure-Activity Relationship, Time Factors, beta-Galactosidase/chemistry, beta-Galactosidase/metabolism
4.
Biochim Biophys Acta ; 1798(2): 167-76, 2010 Feb.
Article in English | MEDLINE | ID: mdl-19615331

ABSTRACT

The first proton transfer of bacteriorhodopsin (bR) occurs from the protonated Schiff base to the anionic Asp 85 at the central part of the protein in the L to M states. Low-frequency dynamics accompanying this process can be revealed by suppressed or recovered intensities (SRI) analysis of site-directed ¹³C solid-state NMR spectra of 2D crystalline preparations. First, we examined the relationship between fluctuation frequencies available from [1-¹³C]Val- and [3-¹³C]Ala-labeled preparations, taking the effective correlation time of internal methyl rotations into account. We analyzed the SRI data of [1-¹³C]Val-labeled wild-type bR and D85N mutants as a function of temperature and pH, respectively, based on the peaks assigned so far, including newly assigned or revised ones. Global conformational change of the protein backbone, caused by neutralization of the anionic D85 by D85N, can be visualized by characteristic displacement of peaks due to the conformation-dependent ¹³C chemical shifts. Concomitant dynamics changes, if any, with fluctuation frequencies on the order of 10⁴ Hz, were evaluated from the decreased peak intensities in the B-C and D-E loops of the D85N mutant. The resulting fluctuation frequencies, owing to subsequent, accelerated dynamics changes in the M-like state by deprotonation of the Schiff base at alkaline pH, were successfully evaluated based on the SRI plots as a function of pH, which varied depending upon the extent of interference of the induced fluctuation frequency with the frequency of magic angle spinning, or escape from such interference. It is now possible to distinguish fluctuation frequencies higher than 10⁴ Hz from those lower than 10⁴ Hz, instead of the simple description of the data around 10⁴ Hz available from the one-point data analysis previously reported.


Subjects
Amino Acid Substitution, Bacteriorhodopsins/chemistry, Halobacterium salinarum/chemistry, Missense Mutation, Biomolecular Nuclear Magnetic Resonance/methods, Bacteriorhodopsins/genetics, Carbon Isotopes/chemistry, Halobacterium salinarum/genetics, Hydrogen-Ion Concentration, Site-Directed Mutagenesis, Protein Secondary Structure/genetics
5.
BMC Vet Res ; 2: 35, 2006 Dec 04.
Article in English | MEDLINE | ID: mdl-17144917

ABSTRACT

BACKGROUND: Foot-and-Mouth Disease (FMD) causes significant economic losses in Turkish livestock. We have analysed the genetic diversity of the 1D sequences, encoding the hypervariable surface protein VP1, of Turkish isolates of serotypes A and O collected from 1998 to 2004 in order to obtain epidemiological and immunological information. RESULTS: The 1D coding region of 33 serotype O and 20 serotype A isolates, obtained from outbreaks of FMD between 1998 and 2004, was sequenced. For serotype A, we confirmed the occurrence of the two subtypes IRN99 and IRN96. These subtypes are most divergent within the region encoding the immuno-dominant GH-loop. A close relationship to Foot-and-Mouth Disease virus (FMDV) serotype A isolates obtained from outbreaks in Iraq and Iran was also detected, and a clustering of isolates collected during the same period of time was found. The analysis of the deduced amino-acid sequences of these subtypes revealed evidence of positive selection in one site and one deletion, both within the GH-loop region. By inferring the ancestral history of the positively selected codon, two potential precursors were found. Furthermore, the structural alignment of IRN99 and IRN96 revealed differences between the tertiary structures of these subtypes. The similarity plot of the serotype O isolates suggested a more homogeneous group than the serotype A isolates. However, phylogenetic analysis revealed two major groups, each further divided into subgroups, some of which consisted only of Turkish isolates. Positively selected sites and structural differences were not found among the Turkish isolates analysed. CONCLUSION: The sequence and structural analysis of the IRN99 strains is indicative of positive selection, suggesting an immunological advantage compared to IRN96. However, results of antigenic comparison reported elsewhere do not substantiate such a conclusion. There is evidence that IRN99 was introduced to Turkey, in all probability from Iran. Since a member of the IRN96 lineage has been included as a component of the FMDV vaccine produced since 2000, the outbreaks caused by IRN96 strains in 2004 could be due to incomplete vaccine coverage. The Turkish type O strains, all with a VP1 structure similar to the O1/Manisa/69 vaccine, appear in several sublineages. Whether these sublineages reflect multiple samplings from a limited number of outbreaks or cross-boundary introductions is not clear.


Subjects
Capsid Proteins/genetics, Cattle Diseases/epidemiology, Foot-and-Mouth Disease Virus/genetics, Foot-and-Mouth Disease/epidemiology, Amino Acid Sequence, Animals, Bayes Theorem, Cattle, Cattle Diseases/economics, Cattle Diseases/virology, Cluster Analysis, Foot-and-Mouth Disease/economics, Foot-and-Mouth Disease/virology, Foot-and-Mouth Disease Virus/classification, Hydrophobic and Hydrophilic Interactions, Molecular Epidemiology/methods, Molecular Sequence Data, Phylogeny, Protein Secondary Structure/genetics, Viral RNA/chemistry, Genetic Selection, Sequence Alignment, Nucleic Acid Sequence Homology, Serotyping, Turkey/epidemiology
6.
Proteins ; 44(2): 97-109, 2001 Aug 01.
Article in English | MEDLINE | ID: mdl-11391772

ABSTRACT

To facilitate investigation of the molecular and biochemical functions of the adenovirus E4 Orf6 protein, we sought to derive three-dimensional structural information using computational methods, particularly threading and comparative protein modeling. The amino acid sequence of the protein was used for secondary structure and hidden Markov model (HMM) analyses, and for fold recognition by the ProCeryon program. Six alternative models were generated from the top-scoring folds identified by threading. These models were examined by 3D-1D analysis and evaluated in the light of available experimental evidence. The final model of the E4 protein derived from these and additional threading calculations was a chimera, with the tertiary structure of its C-terminal 226 residues derived from a TIM barrel template and a mainly alpha-nonbundle topology for its poorly conserved N-terminal 68 residues. To assess the accuracy of this model, additional threading calculations were performed with E4 Orf6 sequences altered as in previous experimental studies. The proposed structural model is consistent with the reported secondary structure of a functionally important C-terminal sequence and can account for the properties of proteins carrying alterations in functionally important sequences or of those that disrupt an unusual zinc-coordination motif.


Subjects
Adenovirus E4 Proteins/chemistry, Molecular Models, Open Reading Frames, Adenovirus E4 Proteins/genetics, Amino Acid Substitution/genetics, Computer Simulation, Humans, Markov Chains, Open Reading Frames/genetics, Protein Folding, Protein Secondary Structure/genetics, Recombinant Fusion Proteins/chemistry, Sequence Alignment/methods
7.
Cancer Res ; 61(10): 4092-7, 2001 May 15.
Article in English | MEDLINE | ID: mdl-11358831

ABSTRACT

Several groups have studied the molecular pathology of inherited breast cancer. By combining several such studies, we show in this study that somatic TP53 abnormalities are more common in breast cancer associated with BRCA1 or BRCA2 germ-line mutations than in sporadic breast cancers (odds ratio, 2.8; P = 0.0003). Then, we compared the spectrum of TP53 mutations for breast cancers in the IARC TP53 mutation database with the 82 mutations reported in BRCA1/2-associated breast cancers. The spectrum differed significantly both in distribution (P < 1 × 10⁻⁶) and in base changes (P = 0.025). Mutations at A:T bp were more common in BRCA1/2-associated tumors and strand bias suggesting DNA repair abnormalities was found. Changes were common at TP53 codons that are not mutation hotspots. Structural modeling showed that most of these p53 non-hotspot amino acids characterized in breast tumors isolated from patients with deficient BRCA1/2 function are distributed in a region of the protein on the opposite side of the p53 DNA-binding surface. Our results suggest that BRCA1/2 mutations influence the type and distribution of TP53 mutations seen in breast cancer.


Subjects
Breast Neoplasms/genetics, BRCA1 Genes/genetics, p53 Genes/genetics, Germ-Line Mutation, Missense Mutation, Neoplasm Proteins/genetics, Transcription Factors/genetics, BRCA2 Protein, Binding Sites, DNA/metabolism, DNA Mutational Analysis/methods, Female, Humans, Monte Carlo Method, Ovarian Neoplasms/genetics, Protein Secondary Structure/genetics, Tumor Suppressor Protein p53/chemistry, Tumor Suppressor Protein p53/genetics, Tumor Suppressor Protein p53/metabolism
8.
Bioinformatics ; 15(12): 1039-46, 1999 Dec.
Article in English | MEDLINE | ID: mdl-10745994

ABSTRACT

MOTIVATION: Prediction of protein secondary structure provides information that is useful for other prediction methods like fold recognition and ab initio 3D prediction. A consensus prediction constructed from the output of several methods should yield more reliable results than each of the individual methods. METHOD: We present an approach that reveals subtle but systematic differences in the output of different secondary structure prediction methods allowing the derivation of coherent consensus predictions. The method uses a machine learning technique that builds decision trees from existing data. RESULTS: The first results of our analysis show that consensus prediction of protein secondary structure may be improved both quantitatively and qualitatively.


Subjects
Decision Trees, Protein Secondary Structure/genetics, Algorithms, Artificial Intelligence, Reproducibility of Results, Systems Integration
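
The abstract of entry 8 describes building a consensus from the output of several secondary structure predictors using decision trees learned from existing data. The sketch below does not reproduce that method. It shows a simpler baseline, a per-residue majority vote, only to make the idea of a consensus prediction concrete. The function name `majority_consensus`, the three-state H/E/C strings, and the fallback to coil on ties are all assumptions of this sketch.

```python
from collections import Counter


def majority_consensus(predictions, tie_state="C"):
    """Combine equal-length secondary structure strings by majority vote.

    Each string holds one state per residue (for example H, E, C).
    This is a plain voting baseline, not the decision-tree approach
    described in the abstract.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    consensus = []
    for states in zip(*predictions):  # states predicted for one residue
        ranked = Counter(states).most_common()
        # A tie for first place falls back to tie_state (coil, by assumption).
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            consensus.append(tie_state)
        else:
            consensus.append(ranked[0][0])
    return "".join(consensus)
```

For the three hypothetical predictions `"HHEC"`, `"HHCC"` and `"HEEC"`, the vote returns `"HHEC"`. A learned combiner such as a decision tree could instead weight the methods differently depending on which of them tends to be right in a given context.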