Results 1 - 10 of 10
1.
Bioinformatics ; 39(39 Suppl 1): i357-i367, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387189

ABSTRACT

The tendency of an amino acid to adopt certain configurations in folded proteins is treated here as a statistical estimation problem. We model the joint distribution of the observed mainchain and sidechain dihedral angles (〈ϕ,ψ,χ1,χ2,…〉) of any amino acid by a mixture of a product of von Mises probability distributions. This mixture model maps any vector of dihedral angles to a point on a multi-dimensional torus. The continuous space it uses to specify the dihedral angles provides an alternative to the commonly used rotamer libraries. These rotamer libraries discretize the space of dihedral angles into coarse angular bins, and cluster combinations of sidechain dihedral angles (〈χ1,χ2,…〉) as a function of backbone 〈ϕ,ψ〉 conformations. A 'good' model is one that is both concise and explains (compresses) observed data. Competing models can be compared directly; in particular, our model is shown to outperform the Dunbrack rotamer library in terms of model complexity (by three orders of magnitude) and fidelity (on average 20% more compression) when losslessly explaining the observed dihedral angle data across experimental resolutions of structures. Our method is unsupervised (with parameters estimated automatically) and uses information theory to determine the optimal complexity of the statistical model, thus avoiding under/over-fitting, a common pitfall in model selection problems. Our models are computationally inexpensive to sample from and are geared to support a number of downstream studies, including experimental structure refinement, de novo protein design, and protein structure prediction. We call our collection of mixture models PhiSiCal (ϕψχal). AVAILABILITY AND IMPLEMENTATION: PhiSiCal mixture models and programs to sample from them are available for download at http://lcb.infotech.monash.edu.au/phisical.
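As a rough illustration of what "a mixture of a product of von Mises distributions" means, the sketch below evaluates such a density at one vector of dihedral angles. It is not the PhiSiCal implementation: the function names, the two components, and all parameter values are hypothetical, and independence of the angles within each component is simply the product form the abstract names.

```python
import math

def bessel_i0(x, tol=1e-12):
    """Modified Bessel function of the first kind, order 0, from its power series."""
    term, total, k = 1.0, 1.0, 0
    while term > tol * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total

def von_mises_pdf(theta, mu, kappa):
    """Von Mises density on the circle; angles in radians, kappa >= 0."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def mixture_pdf(angles, components):
    """Density of a mixture whose components are products of von Mises terms.

    components: list of (weight, means, kappas), with weights summing to 1.
    """
    total = 0.0
    for weight, means, kappas in components:
        prod = 1.0
        for theta, mu, kappa in zip(angles, means, kappas):
            prod *= von_mises_pdf(theta, mu, kappa)
        total += weight * prod
    return total

# Hypothetical two-component model over (phi, psi, chi1); values are illustrative only.
components = [
    (0.6, [math.radians(-60), math.radians(-45), math.radians(-65)], [8.0, 8.0, 4.0]),
    (0.4, [math.radians(-120), math.radians(130), math.radians(180)], [5.0, 5.0, 3.0]),
]
query = [math.radians(-63), math.radians(-43), math.radians(-60)]
density = mixture_pdf(query, components)
```

Because each factor depends on the angle only through a cosine, the density is periodic in every coordinate, which is what places the model on a torus rather than in a box of angular bins. The series used for the Bessel function is adequate for moderate concentrations; very large kappa values would need a scaled formulation.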


Subjects
Data Compression , Libraries , Amino Acids , Gene Library , Information Theory
2.
Bioinformatics ; 38(Suppl 1): i255-i263, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758808

ABSTRACT

MOTIVATION: Alignments are correspondences between sequences. How reliable are alignments of amino acid sequences of proteins, and what inferences about protein relationships can be drawn? Using techniques not previously applied to these questions, we weight every possible sequence alignment by its posterior probability to derive a formal mathematical expectation, and develop an efficient algorithm for computing the distance between alternative alignments, allowing quantitative comparisons of sequence-based alignments with corresponding reference structure alignments. RESULTS: By analyzing the sequences and structures of 1 million protein domain pairs, we report the variation of the expected distance between sequence-based and structure-based alignments, as a function of (Markov time of) sequence divergence. Our results clearly demarcate the 'daylight', 'twilight' and 'midnight' zones for interpreting residue-residue correspondences from sequence information alone. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Amino Acids , Proteins , Algorithms , Amino Acid Sequence , Proteins/chemistry , Reproducibility of Results , Sequence Alignment , Sequence Homology, Amino Acid
3.
Proteins ; 88(12): 1557-1558, 2020 12.
Article in English | MEDLINE | ID: mdl-32662915

ABSTRACT

We have modeled modifications of a known ligand to the SARS-CoV-2 (COVID-19) protease that can form a covalent adduct plus additional ligand-protein hydrogen bonds.


Subjects
Antiviral Agents , Aphids , Coronavirus Infections , Insecticides , Pandemics , Pneumonia, Viral , Acetylcholinesterase , Animals , Betacoronavirus , COVID-19 , Cysteine Endopeptidases , Humans , Molecular Docking Simulation , Protease Inhibitors , SARS-CoV-2 , Viral Nonstructural Proteins
4.
Bioinformatics ; 33(7): 1005-1013, 2017 04 01.
Article in English | MEDLINE | ID: mdl-28065899

ABSTRACT

Motivation: Structural molecular biology depends crucially on computational techniques that compare protein three-dimensional structures and generate structural alignments (the assignment of one-to-one correspondences between subsets of amino acids based on atomic coordinates). Despite its importance, the structural alignment problem has not been formulated, much less solved, in a consistent and reliable way. To overcome these difficulties, we present here a statistical framework for the precise inference of structural alignments, built on the Bayesian and information-theoretic principle of Minimum Message Length (MML). The quality of any alignment is measured by its explanatory power: the amount of lossless compression achieved to explain the protein coordinates using that alignment. Results: We have implemented this approach in MMLigner, the first program able to infer statistically significant structural alignments. We also demonstrate the reliability of MMLigner's alignment results when compared with the state of the art. Importantly, MMLigner can also discover different structural alignments of comparable quality, a challenging problem for oligomers and protein complexes. Availability and Implementation: Source code, binaries and an interactive web version are available at http://lcb.infotech.monash.edu.au/mmligner. Contact: arun.konagurthu@monash.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Data Compression , Models, Statistical , Proteins/chemistry , Sequence Alignment , Algorithms , Bayes Theorem , Reproducibility of Results , Software
5.
Bioinformatics ; 30(17): i512-8, 2014 Sep 01.
Article in English | MEDLINE | ID: mdl-25161241

ABSTRACT

MOTIVATION: Progress in protein biology depends on the reliability of results from a handful of computational techniques, structural alignments being one. Recent reviews have highlighted substantial inconsistencies and differences between alignment results generated by the ever-growing stock of structural alignment programs. The lack of consensus on how the quality of structural alignments must be assessed has been identified as the main cause for the observed differences. Current methods assess structural alignment quality by constructing a scoring function that attempts to balance conflicting criteria, mainly alignment coverage and fidelity of structures under superposition. This traditional approach to measuring alignment quality, the subject of considerable literature, has failed to solve the problem. Further development along the same lines is unlikely to rectify the current deficiencies in the field. RESULTS: This paper proposes a new statistical framework to assess structural alignment quality and significance based on lossless information compression. This is a radical departure from the traditional approach of formulating scoring functions. It links the structural alignment problem to the general class of statistical inductive inference problems, solved using the information-theoretic criterion of minimum message length. Based on this, we developed an efficient and reliable measure of structural alignment quality, I-value. The performance of I-value is demonstrated in comparison with a number of popular scoring functions, on a large collection of competing alignments. Our analysis shows that I-value provides a rigorous and reliable quantification of structural alignment quality, addressing a major gap in the field. AVAILABILITY: http://lcb.infotech.monash.edu.au/I-value. SUPPLEMENTARY INFORMATION: Online supplementary data are available at http://lcb.infotech.monash.edu.au/I-value/suppl.html.


Subjects
Structural Homology, Protein , Algorithms , Data Compression , Data Interpretation, Statistical
6.
Nucleic Acids Res ; 40(Web Server issue): W334-9, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22638586

ABSTRACT

Searching for well-fitting 3D oligopeptide fragments within a large collection of protein structures is an important task central to many analyses involving protein structures. This article reports a new web server, Super, dedicated to the task of rapidly screening the Protein Data Bank (PDB) to identify all fragments that superpose with a query under a prespecified threshold of root-mean-square deviation (RMSD). Super relies on efficiently computing a mathematical bound on the commonly used structural similarity measure, RMSD of superposition. This allows the server to filter out a large proportion of fragments that are unrelated to the query; >99% of the total number of fragments in some cases. For a typical query, Super scans the current PDB containing over 80,500 structures (with ∼40 million potential oligopeptide fragments to match) in under a minute. The Super web server is freely accessible from: http://lcb.infotech.monash.edu.au/super.
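The filtering idea (reject a fragment cheaply when a lower bound on its superposition RMSD already exceeds the threshold) can be sketched as follows. The abstract does not state which bound Super uses; this example swaps in a simple, well-known one as an assumption: by the Cauchy-Schwarz inequality, the RMSD after optimal superposition is never smaller than the absolute difference between the two fragments' radii of gyration. All function names and coordinates are hypothetical.

```python
import math

def centroid(points):
    """Mean position of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def radius_of_gyration(points):
    """Root-mean-square distance of the points from their centroid."""
    c = centroid(points)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in points) / len(points))

def rmsd_lower_bound(frag_a, frag_b):
    """Lower bound on the RMSD of optimal superposition of two equal-length fragments."""
    return abs(radius_of_gyration(frag_a) - radius_of_gyration(frag_b))

def prefilter(query, candidates, threshold):
    """Keep only candidates whose lower bound does not already exceed the threshold.

    Survivors still need an exact superposition to confirm they are true matches.
    """
    return [f for f in candidates if rmsd_lower_bound(query, f) <= threshold]

# Illustrative four-point fragments (coordinates in angstroms, made up for the example).
query = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
shifted_copy = [(x + 5.0, y - 2.0, z + 1.0) for x, y, z in query]
compact = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
survivors = prefilter(query, [shifted_copy, compact], threshold=1.0)
```

The bound costs one pass over each fragment and needs no rotation search, so it can discard candidates before any exact superposition is attempted. It never rejects a true match, but it can let false candidates through, which is why a confirmation step is still required.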


Subjects
Oligopeptides/chemistry , Software , Algorithms , Databases, Protein , Internet , Peptide Fragments/chemistry , User-Computer Interface
7.
Front Mol Biosci ; 7: 612920, 2020.
Article in English | MEDLINE | ID: mdl-33996891

ABSTRACT

What is the architectural "basis set" of the observed universe of protein structures? Using information-theoretic inference, we answer this question with a dictionary of 1,493 substructures, called concepts, typically at a subdomain level, based on an unbiased subset of known protein structures. Each concept represents a topologically conserved assembly of helices and strands that make contact. Any protein structure can be dissected into instances of concepts from this dictionary. We dissected the Protein Data Bank and completely inventoried all the concept instances. This yields many insights, including correlations between concepts and catalytic activities or binding sites, useful for rational drug design; local amino-acid sequence-structure correlations, useful for ab initio structure prediction methods; and information supporting the recognition and exploration of evolutionary relationships, useful for structural studies. An interactive site, Proçodic, at http://lcb.infotech.monash.edu.au/prosodic (click), provides access to and navigation of the entire dictionary of concepts and their usages, and all associated information. This report is part of a continuing programme with the goal of elucidating fundamental principles of protein architecture, in the spirit of the work of Cyrus Chothia.

8.
J Bioinform Comput Biol ; 3(3): 551-85, 2005 Jun.
Article in English | MEDLINE | ID: mdl-16108084

ABSTRACT

Proteases play a fundamental role in the control of intra- and extra-cellular processes by binding and cleaving specific amino acid sequences. Identifying these targets is extremely challenging. Current computational attempts to predict cleavage sites are limited, representing these amino acid sequences as patterns or frequency matrices. Here we present PoPS, a publicly accessible bioinformatics tool (http://pops.csse.monash.edu.au/) that provides a novel method for building computational models of protease specificity, which, while still being based on these amino acid sequences, can be built from any experimental data or expert knowledge available to the user. PoPS specificity models can be used to predict and rank likely cleavages within a single substrate, and within entire proteomes. Other factors, such as the secondary or tertiary structure of the substrate, can be used to screen unlikely sites. Furthermore, the tool also provides facilities to infer, compare and test models, and to store them in a publicly accessible database.
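To make "predict and rank likely cleavages within a single substrate" concrete, the sketch below slides a window along a sequence and sums per-subsite residue scores, in the style of the matrix representation the abstract mentions. It is not the PoPS model format or algorithm: the function name, the four-subsite layout (P2, P1, P1', P2'), and every score are hypothetical values chosen for illustration.

```python
def rank_cleavage_sites(sequence, subsite_scores, n_before):
    """Score every window of the substrate and rank candidate cleavage positions.

    subsite_scores: one dict per subsite mapping residue letter -> score
                    (residues missing from a dict score 0.0).
    n_before:       number of subsites on the N-terminal side of the scissile bond.
    Returns (score, position) pairs, best first, where position is the 0-based
    index of the residue immediately after the scissile bond.
    """
    width = len(subsite_scores)
    results = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col.get(res, 0.0) for col, res in zip(subsite_scores, window))
        results.append((score, start + n_before))
    # Highest score first; ties broken by position for a deterministic order.
    results.sort(key=lambda item: (-item[0], item[1]))
    return results

# Hypothetical model: P1 favours K or R, P1' penalises P; P2 and P2' are neutral.
toy_model = [{}, {"K": 1.0, "R": 1.0}, {"P": -2.0}, {}]
ranked = rank_cleavage_sites("AAKGGRPAA", toy_model, n_before=2)
best_score, best_position = ranked[0]
```

In this toy substrate the top-ranked bond follows the lysine, while the bond after the arginine is ranked last because the following proline is penalised. Structural screening of the kind the abstract describes would act as a further filter on such a ranked list.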


Subjects
Algorithms , Artificial Intelligence , Models, Chemical , Models, Molecular , Peptide Hydrolases/chemistry , Sequence Analysis, Protein/methods , Software , Binding Sites , Computer Simulation , Enzyme Activation , Peptide Hydrolases/analysis , Peptide Hydrolases/classification , Protein Binding , Structure-Activity Relationship , Substrate Specificity
9.
Mol Immunol ; 47(2-3): 493-505, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19783309

ABSTRACT

Our aim was to ascertain structural determinants of autoantigenicity based on the model of the diabetes autoantigen glutamic acid decarboxylase 65 kDa isoform (GAD65) in comparison with that of the non-autoantigenic isoform GAD67. This difference exists despite the two isoforms having the same fold and high sequence identity. Autoantibodies to GAD65 precede the development of type 1 diabetes and are clinical markers of this and certain neural autoimmune diseases. To date, epitope mapping has been based on particular amino acid differences between the two isoforms, and there is no explanation as to why autoantibodies that react with GAD65 only infrequently cross-react with GAD67. To characterize each isoform of the enzyme and gain insights into their contrasting autoantigenic properties, we have used the recently determined crystal structures of GAD65 and GAD67 to compare their structure, hydrophobicity, electrostatics, flexibility and physicochemical properties. The results revealed striking differences which appear almost exclusively at the C-terminal domain of the isoforms. Whereas GAD65 displayed a highly charged and flexible C-terminal domain containing numerous patches of high electrostatic and solvation energies, these characteristics were absent in the GAD67 molecule. Additionally, analysis indicated potential N-terminal and PLP domain binding sites surrounding the C-terminal domain of GAD65, a major region of autoantigenic activity, but not of GAD67. These features agree with good accuracy with published epitope-mapping data. Our analysis suggests that the high flexibility and charge of GAD65 in the C-terminal domain is coupled with the mobility of its catalytic loop, a property that is absolutely required for its enzymatic function. Thus, the structural features that distinguish GAD65 from GAD67 as a B cell autoantigen are related to functional requirements for its enzymatic mechanism. This could well apply to the various other enzyme autoantigens and, if so, these features could be used as the basis of future predictive strategies.


Subjects
Antigens/immunology , Glutamate Decarboxylase/chemistry , Glutamate Decarboxylase/immunology , Amino Acids/chemistry , Biocatalysis , Epitope Mapping , Epitopes, B-Lymphocyte/chemistry , Epitopes, B-Lymphocyte/immunology , Humans , Hydrophobic and Hydrophilic Interactions , Models, Molecular , Protein Structure, Secondary , Protein Structure, Tertiary , Pyridoxal Phosphate/metabolism , Static Electricity , Structural Homology, Protein , Surface Properties , Thermodynamics
10.
Article in English | MEDLINE | ID: mdl-16448030

ABSTRACT

Proteases play a fundamental role in the control of intra- and extracellular processes by binding and cleaving specific amino acid sequences. Identifying these targets is extremely challenging. Current computational attempts to predict cleavage sites are limited, representing these amino acid sequences as patterns or frequency matrices. Here we present PoPS, a publicly accessible bioinformatics tool (http://pops.csse.monash.edu.au/) which provides a novel method for building computational models of protease specificity that, while still being based on these amino acid sequences, can be built from any experimental data or expert knowledge available to the user. PoPS specificity models can be used to predict and rank likely cleavages within a single substrate, and within entire proteomes. Other factors, such as the secondary or tertiary structure of the substrate, can be used to screen unlikely sites. Furthermore, the tool also provides facilities to infer, compare and test models, and to store them in a publicly accessible database.


Subjects
Databases, Protein , Models, Chemical , Peptide Hydrolases/analysis , Peptide Hydrolases/chemistry , Sequence Analysis, Protein/methods , Software , Amino Acid Sequence , Artificial Intelligence , Binding Sites , Computer Simulation , Information Storage and Retrieval/methods , Molecular Sequence Data , Peptide Hydrolases/classification , Protein Binding , Structure-Activity Relationship , Substrate Specificity