ABSTRACT
The polyglutamine diseases are caused in part by a gain-of-function mechanism of neuronal toxicity involving protein conformational changes that result in the formation and deposition of β-sheet-rich aggregates. Recent evidence suggests that the misfolding mechanism is context-dependent, and that properties of the host protein, including the domain architecture and location of the repeat tract, can modulate aggregation. To allow bioinformatic investigation of the context of polyglutamines, we have constructed a database, PolyQ (http://pxgrid.med.monash.edu.au/polyq). We have collected the sequences of all human proteins containing runs of seven or more glutamine residues and annotated their sequences with domain information. PolyQ can be interrogated such that the sequence context of polyglutamine repeats in disease and non-disease associated proteins can be investigated.
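The selection criterion stated above (runs of seven or more glutamine residues) can be illustrated with a short sketch. This is not the PolyQ database's own code; the function name `find_polyq_tracts` and the example sequence are hypothetical, and only the threshold of seven comes from the abstract.

```python
import re

# Threshold taken from the abstract: runs of seven or more glutamines (Q).
MIN_RUN = 7


def find_polyq_tracts(sequence, min_run=MIN_RUN):
    """Return (start, end, length) for each run of >= min_run glutamines.

    Positions are 1-based and inclusive, as is usual for protein residues.
    The function name and return format are illustrative assumptions.
    """
    pattern = re.compile("Q{%d,}" % min_run)
    tracts = []
    for match in pattern.finditer(sequence.upper()):
        length = match.end() - match.start()
        tracts.append((match.start() + 1, match.end(), length))
    return tracts


# Hypothetical sequence: one run of 8 Q (reported) and one run of 6 Q (ignored).
example = "MA" + "Q" * 8 + "PLK" + "Q" * 6 + "A"
print(find_polyq_tracts(example))
```

A regular expression is used here because a tract is just a maximal run of one character; a real pipeline would also need to parse sequence files and attach domain annotations, which this sketch leaves out.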
Subject(s)
Protein Databases , Peptides/chemistry , Amino Acid Repetitive Sequences , Disease , Humans , Protein Tertiary Structure , Proteins/chemistry , Protein Sequence Analysis
ABSTRACT
The Protein Folding Database (PFD) is a publicly accessible repository of thermodynamic and kinetic protein folding data. Here we describe the first major revision of this work, featuring extensive restructuring that conforms to standards set out by the recently formed International Foldeomics Consortium. The database now adopts standards for data acquisition, analysis and reporting proposed by the consortium, which will facilitate the comparison of folding rates, energies and structure across diverse sets of proteins. Data can now be easily deposited using a rich set of deposition tools. Enhanced search tools allow sophisticated queries, and graphical tools support simple data analysis online. PFD can be accessed freely at http://www.foldeomics.org/pfd/.
Subject(s)
Protein Databases , Protein Conformation , Internet , Kinetics , Protein Folding , Thermodynamics , User-Computer Interface
ABSTRACT
The nine polyglutamine (polyQ) neurodegenerative diseases are caused in part by a gain-of-function mechanism involving protein misfolding, the deposition of β-sheet-rich aggregates and neuronal toxicity. While previous experimental evidence suggests that the polyQ-induced misfolding mechanism is context dependent, the properties of the host protein, including the domain architecture and location of the polyQ tract, have not been investigated. Here, we use variants of a model polyQ-containing protein to systematically determine the effect of the location and number of flanking folded domains on polyQ-mediated aggregation. Our data indicate that when a pathological-length polyQ tract is present between two domains, it aggregates more slowly than the same-length tract in a terminal location within the protein. We also demonstrate that increasing the number of flanking domains decreases the polyQ protein's aggregation rate. Our experimental data, together with a bioinformatic analysis of all human proteins possessing polyQ tracts, suggest that repeat location and protein domain architecture affect the disease susceptibility of human polyQ proteins.
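The distinction drawn above between a tract lying between two domains and a tract in a terminal location can be made concrete with a small classifier. This is only a sketch under stated assumptions: the function name, the category labels and the coordinate convention (1-based, inclusive residue ranges) are hypothetical, and "terminal" is taken here to mean that all annotated domains lie on one side of the tract, which may differ from the definition the authors used.

```python
def classify_tract(tract, domains):
    """Classify a polyQ tract relative to annotated folded domains.

    tract   -- (start, end) residue range of the repeat
    domains -- list of (start, end) residue ranges of folded domains
    Labels are illustrative, not the paper's terminology.
    """
    t_start, t_end = tract
    # Any domain that shares at least one residue with the tract.
    if any(d_start <= t_end and t_start <= d_end for d_start, d_end in domains):
        return "overlaps a domain"
    before = [d for d in domains if d[1] < t_start]  # domains N-terminal to tract
    after = [d for d in domains if d[0] > t_end]     # domains C-terminal to tract
    if before and after:
        return "inter-domain"
    if after:
        return "N-terminal to all domains"
    if before:
        return "C-terminal to all domains"
    return "no annotated domains"


# Hypothetical coordinates for illustration only.
print(classify_tract((50, 80), [(1, 40), (90, 150)]))
print(classify_tract((1, 30), [(40, 100), (110, 200)]))
```

The lengths of the `before` and `after` lists also give the number of flanking domains on each side, the second variable the abstract reports as affecting aggregation rate.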
Subject(s)
Peptides/chemistry , Peptides/metabolism , Protein Denaturation , Proteins/chemistry , Proteins/metabolism , Circular Dichroism , Humans , Peptides/genetics , Protein Folding , Protein Tertiary Structure , Proteins/genetics
ABSTRACT
BACKGROUND: The crystallographic determination of protein structures can be computationally demanding and, for difficult cases, can benefit from user-friendly interfaces to high-performance computing resources. Molecular replacement (MR) is a popular protein crystallographic technique that exploits the structural similarity between proteins that share some sequence similarity. The need to trial permutations of search models, space group symmetries and other parameters, however, makes MR time- and labour-intensive. Because MR calculations are embarrassingly parallel, they are ideally suited to distributed computing. To address this problem we have developed MrGrid, web-based software that allows multiple MR calculations to be executed across a grid of networked computers, allowing high-throughput MR. METHODOLOGY/PRINCIPAL FINDINGS: MrGrid is a portable web-based application written in Java/JSP and Ruby that takes advantage of Apple Xgrid technology. Designed to interface with a user-defined Xgrid resource, the package manages the distribution of multiple MR runs to the available nodes on the Xgrid. We evaluated MrGrid using 10 different protein test cases on a network of 13 computers, and achieved an average speed-up factor of 5.69. CONCLUSIONS: MrGrid enables the user to retrieve and manage the results of tens to hundreds of MR calculations quickly and via a single web interface, as well as broadening the range of strategies that can be attempted. This high-throughput approach allows parameter sweeps to be performed in parallel, improving the chances of MR success.
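The idea of an embarrassingly parallel parameter sweep, and of the speed-up factor quoted above, can be sketched as follows. This is not MrGrid's implementation (which the abstract describes as Java/JSP and Ruby on Apple Xgrid); it is a minimal local illustration in Python. The worker `placeholder_mr_job` is a hypothetical stand-in that runs no crystallographic software, and the search model names are invented.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def build_jobs(search_models, space_groups):
    """Every (search model, space group) pair is one independent MR job."""
    return list(product(search_models, space_groups))


def placeholder_mr_job(job):
    """Hypothetical stand-in for a single MR run; it performs no calculation."""
    model, space_group = job
    return f"{model} in {space_group}: not run (placeholder)"


def run_sweep(jobs, worker, max_workers=4):
    """Run independent jobs concurrently; results keep the order of `jobs`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, jobs))


def speedup(serial_time, parallel_time):
    """Speed-up factor: serial wall time divided by parallel wall time."""
    return serial_time / parallel_time


def efficiency(speedup_factor, n_workers):
    """Fraction of ideal linear scaling achieved."""
    return speedup_factor / n_workers


jobs = build_jobs(["model_A", "model_B", "model_C"], ["P 21 21 21", "P 21 21 2"])
for line in run_sweep(jobs, placeholder_mr_job):
    print(line)
```

Because the jobs share no state, they can be dispatched in any order to any node. As a rough reading of the reported figures, `efficiency(5.69, 13)` is about 0.44, assuming all 13 computers acted as workers, which the abstract does not state.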
Subject(s)
Computational Biology/methods , Software , Protein Structural Homology , X-Ray Crystallography , Internet , Programming Languages , Proteins/chemistry
ABSTRACT
There is a pressing need for the archiving and curation of raw X-ray diffraction data. This information is critical for validation, methods development and improvement of archived structures. However, the relatively large size of these data sets has presented challenges for storage in a single worldwide repository such as the Protein Data Bank archive. This problem can be avoided by using a federated approach, where each institution utilizes its institutional repository for storage, with a discovery service overlaid. Institutional repositories are relatively stable and adequately funded, ensuring persistence. Here, a simple repository solution is described, utilizing Fedora open-source database software and data-annotation and deposition tools that can be deployed at any site cheaply and easily. Data sets and associated metadata from federated repositories are given a unique and persistent handle, providing a simple mechanism for search and retrieval via web interfaces. In addition to ensuring that valuable data are not lost, the provision of raw data has several uses for the crystallographic community. Most importantly, structure determination can only be truly repeated or verified when the raw data are available. Moreover, the availability of raw data is extremely useful for the development of improved methods of image analysis and data processing.
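The federated arrangement described above (storage at each institution, a discovery service overlaid, and a unique persistent handle per data set) can be sketched as a small in-memory model. This is not the Fedora-based software from the abstract and uses none of its APIs; every name here (`DatasetRecord`, `DiscoveryService`, `make_handle`, the `example/` handle prefix, the URLs and metadata keys) is a hypothetical illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    handle: str        # unique, persistent identifier
    institution: str   # repository that physically stores the raw data
    location: str      # where that repository serves the data set
    metadata: dict     # searchable annotations


def make_handle(institution, local_id):
    """Derive a stable handle: the same inputs always give the same handle.

    The 'example/' prefix and hashing scheme are assumptions for illustration.
    """
    digest = hashlib.sha256(f"{institution}:{local_id}".encode()).hexdigest()
    return "example/" + digest[:12]


class DiscoveryService:
    """Central index only; the raw data stay at the institutions."""

    def __init__(self):
        self._records = {}

    def register(self, institution, local_id, location, metadata):
        handle = make_handle(institution, local_id)
        if handle in self._records:
            raise ValueError(f"handle already registered: {handle}")
        self._records[handle] = DatasetRecord(handle, institution, location, metadata)
        return handle

    def resolve(self, handle):
        return self._records[handle]

    def search(self, **criteria):
        return [
            record for record in self._records.values()
            if all(record.metadata.get(k) == v for k, v in criteria.items())
        ]


service = DiscoveryService()
h = service.register("Institution A", "run-001",
                     "https://repo.institution-a.example/run-001",
                     {"technique": "X-ray diffraction"})
print(service.resolve(h).location)
```

The design point is that the index holds only handles, locations and metadata, so its size stays small even when the diffraction images themselves are large.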