Search | BVS IEC

An Atlas of Peroxiredoxins Created Using an Active Site Profile-Based Approach to Functionally Relevant Clustering of Proteins.

Harper, Angela F; Leuthaeuser, Janelle B; Babbitt, Patricia C; Morris, John H; Ferrin, Thomas E; Poole, Leslie B; Fetrow, Jacquelyn S.

PLoS Comput Biol ; 13(2): e1005284, 2017 Feb.

Article in English | MEDLINE | ID: mdl-28187133

ABSTRACT

Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally relevant clusters. Superfamily members need not be identified initially; MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method's novelty lies in the manner in which isofunctional groups are selected: rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily. The feasibility of this novel method is demonstrated by the Prx superfamily results, laying the foundation for potential functionally relevant clustering of the universe of protein sequences.
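The self-identification criterion described in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual search or scoring procedure, so everything here is an assumption for illustration: `TOY_DB`, `toy_search`, and the three-letter "signatures" are hypothetical stand-ins for a real profile-based database search. The only point shown is the test itself: a group passes when a search seeded with the group returns the group and nothing else.

```python
from typing import Callable, Iterable, Set

def self_identifies(group: Iterable[str],
                    search: Callable[[Set[str]], Set[str]]) -> bool:
    """Return True when a search seeded with the group finds exactly the group."""
    members = set(group)
    return search(members) == members

# Hypothetical toy database: sequence ID -> made-up active-site signature.
TOY_DB = {"a1": "CPT", "a2": "CPT", "b1": "CPS", "b2": "CPS"}

def toy_search(query: Set[str]) -> Set[str]:
    """Stand-in for a database search: return every ID whose signature
    matches the signature of any query member."""
    signatures = {TOY_DB[seq_id] for seq_id in query}
    return {seq_id for seq_id, sig in TOY_DB.items() if sig in signatures}

complete_group = self_identifies({"a1", "a2"}, toy_search)
incomplete_group = self_identifies({"a1"}, toy_search)
mixed_group = self_identifies({"a1", "a2", "b1"}, toy_search)
```

In this toy setting the complete group passes, the incomplete group fails because the search pulls in an extra member (a case for agglomeration), and the partly mixed group fails because it pulls in sequences from a second signature (a case for division).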

Subjects

Databases, Protein; Peroxiredoxins/chemistry; Peroxiredoxins/classification; Protein Interaction Mapping/methods; Sequence Analysis, Protein/methods; Sequence Homology, Amino Acid; Amino Acid Sequence; Binding Sites; Database Management Systems; Enzyme Activation; High-Throughput Screening Assays/methods; Molecular Sequence Data; Multigene Family; Peroxiredoxins/ultrastructure; Protein Binding

DASP3: identification of protein sequences belonging to functionally relevant groups.

Leuthaeuser, Janelle B; Morris, John H; Harper, Angela F; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S.

BMC Bioinformatics ; 17(1): 458, 2016 Nov 11.

Article in English | MEDLINE | ID: mdl-27835946

ABSTRACT

BACKGROUND: Development of automatable processes for clustering proteins into functionally relevant groups is a critical hurdle as an increasing number of sequences are deposited into databases. Experimental function determination is exceptionally time-consuming and cannot keep pace with the identification of protein sequences. A tool, DASP (Deacon Active Site Profiler), was previously developed to identify protein sequences with active site similarity to a query set. Development of two iterative, automatable methods for clustering proteins into functionally relevant groups exposed algorithmic limitations of DASP. RESULTS: The accuracy and efficiency of DASP were significantly improved through six algorithmic enhancements implemented in two stages: DASP2 and DASP3. Validation demonstrated that DASP3 provides greater score separation between true positives and false positives than earlier versions. In addition, DASP3 shows similar performance to previous versions in clustering protein structures into isofunctional groups (validated against manual curation), but DASP3 gathers and clusters protein sequences into isofunctional groups more efficiently than DASP and DASP2. CONCLUSIONS: DASP algorithmic enhancements resulted in improved efficiency and accuracy of identifying proteins that contain active site features similar to those of the query set. These enhancements provide incremental improvement in structure database searches and initial sequence database searches; however, the enhancements show significant improvement in iterative sequence searches, suggesting DASP3 is an appropriate tool for the iterative processes required for clustering proteins into isofunctional groups.

Subjects

Algorithms; Sequence Analysis, Protein/methods; Amino Acid Motifs; Amino Acid Sequence; Catalytic Domain; Cluster Analysis; Databases, Protein; Proteins/chemistry

Modelling amorphous materials via a joint solid-state NMR and X-ray absorption spectroscopy and DFT approach: application to alumina.

Harper, Angela F; Emge, Steffen P; Magusin, Pieter C M M; Grey, Clare P; Morris, Andrew J.

Chem Sci ; 14(5): 1155-1167, 2023 Feb 01.

Article in English | MEDLINE | ID: mdl-36756318

ABSTRACT

Understanding a material's electronic structure is crucial to the development of many functional devices from semiconductors to solar cells and Li-ion batteries. A material's properties, including electronic structure, are dependent on the arrangement of its atoms. However, structure determination (the process of uncovering the atomic arrangement) is impeded, both experimentally and computationally, by disorder. The lack of a verifiable atomic model presents a huge challenge when designing functional amorphous materials. Such materials may be characterised through their local atomic environments using, for example, solid-state NMR and XAS. By using these two spectroscopy methods to inform the sampling of configurations from ab initio molecular dynamics we devise and validate an amorphous model, choosing amorphous alumina to illustrate the approach due to its wide range of technological uses. Our model predicts two distinct geometric environments of AlO5 coordination polyhedra and determines the origin of the pre-edge features in the Al K-edge XAS. From our model we construct an average electronic density of states for amorphous alumina, and identify localized states at the conduction band minimum (CBM). We show that the presence of a pre-edge peak in the XAS is a result of transitions from the Al 1s to Al 3s states at the CBM. Deconvoluting this XAS by coordination geometry reveals that contributions from both AlO4 and AlO5 geometries at the CBM give rise to the pre-edge, which provides insight into the role of AlO5 in the electronic structure of alumina. This work represents an important advance within the field of solid-state amorphous modelling, providing a method for developing amorphous models through the comparison of experimental and computationally derived spectra, which may then be used to determine the electronic structure of amorphous materials.

Computational Investigation of Copper Phosphides as Conversion Anodes for Lithium-Ion Batteries.

Harper, Angela F; Evans, Matthew L; Morris, Andrew J.

Chem Mater ; 32(15): 6629-6639, 2020 Aug 11.

Article in English | MEDLINE | ID: mdl-32905380

ABSTRACT

Using first-principles structure searching with density-functional theory (DFT), we identify a novel Fm3̄m phase of Cu2P and two low-lying metastable structures, an I4̄3d-Cu3P phase and a Cm-Cu3P11 phase. The computed pair distribution function of the novel Cm-Cu3P11 phase shows its structural similarity to the experimentally identified Cm-Cu2P7 phase. The relative stability of all Cu-P phases at finite temperatures is determined by calculating the Gibbs free energy using vibrational effects from phonon modes at 0 K. From this, a finite-temperature convex hull is created, on which Fm3̄m-Cu2P is dynamically stable and the Cu3-xP (x < 1) defect phase Cmc21-Cu8P3 remains metastable (within 20 meV/atom of the convex hull) across a temperature range from 0 to 600 K. Both CuP2 and Cu3P exhibit theoretical gravimetric capacities higher than contemporary graphite anodes for Li-ion batteries; the predicted Cu2P phase has a theoretical gravimetric capacity of 508 mAh/g as a Li-ion battery electrode, greater than both Cu3P (363 mAh/g) and graphite (372 mAh/g). Cu2P is also predicted to be both nonmagnetic and metallic, which should promote efficient electron transfer in the anode. Cu2P's favorable properties as a metallic, high-capacity material suggest its use as a future conversion anode for Li-ion batteries; with a volume expansion of 99% during complete cycling, Cu2P anodes could be more durable than other conversion anodes in the Cu-P system, with volume expansions greater than 150%. The structures and figures presented in this paper, and the code used to generate them, can be interactively explored online using Binder.
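The gravimetric capacities quoted in this abstract can be approximately reproduced from Faraday's law. The abstract does not state the formula used, so the following is a sketch under stated assumptions: each phosphorus atom takes up three Li on full conversion to Li3P, graphite takes up one Li per six carbon atoms (LiC6), and the capacity is normalized by the mass of the unlithiated electrode. The function name and atomic-mass table are illustrative, not taken from the paper.

```python
FARADAY = 96485.332  # Faraday constant in C/mol

# Standard atomic masses in g/mol (rounded).
ATOMIC_MASS = {"Cu": 63.546, "P": 30.974, "C": 12.011}

def gravimetric_capacity(composition: dict, n_li: float) -> float:
    """Theoretical capacity in mAh/g: n * F / (3.6 * M).

    composition: element -> count per formula unit of the unlithiated host.
    n_li: Li atoms (electrons) transferred per formula unit (an assumption).
    """
    molar_mass = sum(ATOMIC_MASS[el] * count for el, count in composition.items())
    # 1 mAh = 3.6 C, so dividing C/g by 3.6 gives mAh/g.
    return n_li * FARADAY / (3.6 * molar_mass)

# Assumed full conversion to Li3P: 3 Li per P atom.
cu2p = gravimetric_capacity({"Cu": 2, "P": 1}, n_li=3)
cu3p = gravimetric_capacity({"Cu": 3, "P": 1}, n_li=3)
# Assumed LiC6 stoichiometry for graphite: 1 Li per 6 C.
graphite = gravimetric_capacity({"C": 6}, n_li=1)
```

Under these assumptions the three results fall within about 1 mAh/g of the figures quoted in the abstract, with Cu2P highest because it carries less inactive copper mass per phosphorus atom than Cu3P.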

An approach to functionally relevant clustering of the protein universe: Active site profile-based clustering of protein structures and sequences.

Knutson, Stacy T; Westwood, Brian M; Leuthaeuser, Janelle B; Turner, Brandon E; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D; Harper, Angela F; Brown, Shoshana D; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S.

Protein Sci ; 26(4): 677-699, 2017 Apr.

Article in English | MEDLINE | ID: mdl-28054422

ABSTRACT

Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow identification of mechanistic determinants: amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two-Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure-Function Linkage Database, SFLD) self-identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self-identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well-curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP-identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F-measure and performance analysis on the enolase search results and comparison to GEMMA and SCI-PHY demonstrate that TuLIP avoids the over-division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results.
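The rates and F-measure mentioned in this abstract follow standard definitions, sketched below. The counts are illustrative only and are not taken from the paper; the abstract also does not define its "maximum false positive rate", so this sketch reports precision instead and makes no claim about how that figure was computed. The function name `search_metrics` is hypothetical.

```python
def search_metrics(tp: int, fp: int, fn: int) -> dict:
    """Standard retrieval metrics from true-positive, false-positive,
    and false-negative counts."""
    recall = tp / (tp + fn)      # true positive rate
    precision = tp / (tp + fp)   # fraction of hits that are correct
    f_measure = 2 * precision * recall / (precision + recall)
    return {
        "true_positive_rate": recall,
        "false_negative_rate": fn / (tp + fn),
        "precision": precision,
        "f_measure": f_measure,
    }

# Illustrative counts only (not the paper's data).
example = search_metrics(tp=96, fp=4, fn=4)
```

Note that the true positive rate and false negative rate always sum to one, which is consistent with the 96% and 4% pair reported above.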

Subjects

Databases, Protein; Glutathione Transferase/chemistry; Glutathione Transferase/genetics; Phosphopyruvate Hydratase/chemistry; Phosphopyruvate Hydratase/genetics; Sequence Analysis, Protein/methods
