An approach to functionally relevant clustering of the protein universe: Active site profile-based clustering of protein structures and sequences.
Knutson, Stacy T; Westwood, Brian M; Leuthaeuser, Janelle B; Turner, Brandon E; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D; Harper, Angela F; Brown, Shoshana D; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S.
Affiliation
  • Knutson ST; Department of Physics, Wake Forest University, Winston-Salem, North Carolina, 27106.
  • Westwood BM; Department of Computer Science, Wake Forest University, Winston-Salem, North Carolina, 27106.
  • Leuthaeuser JB; Department of Physics, Wake Forest University, Winston-Salem, North Carolina, 27106.
  • Turner BE; Department of Computer Science, Wake Forest University, Winston-Salem, North Carolina, 27106.
  • Nguyendac D; Molecular Genetics and Genomics Program, Wake Forest School of Medicine, Winston-Salem, North Carolina, 27157.
  • Shea G; Department of Physics, Wake Forest University, Winston-Salem, North Carolina, 27106.
  • Kumar K; Department of Physics, Wake Forest University, Winston-Salem, North Carolina, 27106.
  • Hayden JD; Department of Physics, Wake Forest University, Winston-Salem, North Carolina, 27106.
  • Harper AF; Department of Physics, Wake Forest University, Winston-Salem, North Carolina, 27106.
  • Brown SD; Biochemistry Program, Dickinson College, Carlisle, Pennsylvania, 17013.
  • Morris JH; Department of Physics, Wake Forest University, Winston-Salem, North Carolina, 27106.
  • Ferrin TE; Department of Pharmaceutical Chemistry, University of California, San Francisco, California, 94158.
  • Babbitt PC; Department of Pharmaceutical Chemistry, University of California, San Francisco, California, 94158.
  • Fetrow JS; Department of Pharmaceutical Chemistry, University of California, San Francisco, California, 94158.
Protein Sci; 26(4): 677-699, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28054422
ABSTRACT
Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow identification of mechanistic determinants: the amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two-Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure-Function Linkage Database, SFLD) self-identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self-identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well-curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP-identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F-measure and performance analysis on the enolase search results and comparison to GEMMA and SCI-PHY demonstrate that TuLIP avoids the over-division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results.
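
The divisive loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `divisive_cluster`, the toy scores, the largest-gap split rule, and the label-based `same_family` check are all hypothetical stand-ins. In particular, `same_family` only imitates the idea of a cluster that "self-identifies"; it does not run a DASP2 search, and the sketch does not model the two levels of TuLIP. It also assumes the whole superfamily is the first candidate cluster and that a cluster of one structure is always a singlet.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

Cluster = FrozenSet[str]


def divisive_cluster(
    structures: Iterable[str],
    split: Callable[[Cluster], List[Cluster]],
    self_identifies: Callable[[Cluster], bool],
) -> Tuple[List[Cluster], List[Cluster]]:
    """Split clusters until each one self-identifies or is a singlet."""
    relevant: List[Cluster] = []   # accepted functionally relevant groups
    singlets: List[Cluster] = []   # structures left on their own
    pending: List[Cluster] = [frozenset(structures)]
    while pending:
        cluster = pending.pop()
        if len(cluster) == 1:
            singlets.append(cluster)
        elif self_identifies(cluster):
            relevant.append(cluster)
        else:
            parts = [frozenset(p) for p in split(cluster) if p]
            if len(parts) < 2:
                raise ValueError("split must yield at least two non-empty parts")
            pending.extend(parts)
    return relevant, singlets


# Hypothetical toy data: a 1-D similarity score and a known family label.
SCORE = {"a1": 0.0, "a2": 0.1, "b1": 1.0, "b2": 1.2, "c1": 5.0}
FAMILY = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}


def split_at_largest_gap(cluster: Cluster) -> List[Cluster]:
    """Assumed split rule: cut the score-sorted cluster at its widest gap."""
    ordered = sorted(cluster, key=SCORE.__getitem__)
    gaps = [SCORE[b] - SCORE[a] for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return [frozenset(ordered[:cut]), frozenset(ordered[cut:])]


def same_family(cluster: Cluster) -> bool:
    """Stand-in for the self-identification test (not a DASP2 search)."""
    return len({FAMILY[s] for s in cluster}) == 1


groups, singlets = divisive_cluster(SCORE, split_at_largest_gap, same_family)
```

On this toy input the loop first separates `c1` as a singlet, then divides the remaining four structures into the A pair and the B pair, each of which passes the stand-in test and is kept as a group.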

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Phosphopyruvate Hydratase / Sequence Analysis, Protein / Databases, Protein / Glutathione Transferase Language: En Journal: Protein Sci Journal subject: BIOCHEMISTRY Year: 2017 Document type: Article
