Generalised sequence signatures through symbolic clustering.
Dorr, Dietmar H; Denton, Anne M.
Affiliation
  • Dorr DH; Research and Development, Thomson Reuters, St. Paul, MN 55123, USA. dietmar.dorr@thomsonreuters.com
Int J Data Min Bioinform ; 4(6): 656-74, 2010.
Article in English | MEDLINE | ID: mdl-21355500
Traditionally, sequence motifs and domains are defined such that insertions, deletions and mismatched regions are small compared with matched regions. We introduce an algorithm for the identification of Generalised Sequence Signatures (GSS), which can be composed of windows distributed throughout the sequence. Our approach is based on cluster analysis of recurring subsequences of a predefined length, which we refer to as symbols. Sequences are grouped so as to maximise the number of symbols they share. We show that using GSS to derive sequence annotations yields higher confidence values than other signature recognition approaches.
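The idea of grouping sequences by shared symbols can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify in detail; it assumes that a symbol is any length-k window occurring in at least `min_support` sequences, and it swaps in a simple greedy pairwise merge as the clustering step. The function names `extract_symbols` and `greedy_groups`, and the parameters `min_support` and `min_shared`, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def extract_symbols(sequences, k, min_support=2):
    """Return, per sequence, the set of length-k windows ("symbols")
    that occur in at least min_support of the input sequences."""
    windows = [{s[i:i + k] for i in range(len(s) - k + 1)} for s in sequences]
    # Count in how many sequences each window appears (sets avoid double counting).
    support = Counter(w for ws in windows for w in ws)
    return [{w for w in ws if support[w] >= min_support} for ws in windows]


def greedy_groups(symbol_sets, min_shared=2):
    """Assumed clustering step: repeatedly merge the two groups that share
    the most symbols, stopping when no pair shares at least min_shared.
    Each group is (member indices, symbols shared by all members)."""
    groups = [({i}, set(s)) for i, s in enumerate(symbol_sets)]
    while len(groups) > 1:
        (a, b), shared = max(
            (((i, j), groups[i][1] & groups[j][1])
             for i, j in combinations(range(len(groups)), 2)),
            key=lambda item: len(item[1]),
        )
        if len(shared) < min_shared:
            break
        merged = (groups[a][0] | groups[b][0], shared)
        groups = [g for idx, g in enumerate(groups) if idx not in (a, b)]
        groups.append(merged)
    return groups


if __name__ == "__main__":
    # Toy sequences: 0 and 1 share the windows ABC and DEF at different positions.
    seqs = ["ABCXXDEF", "DEFYYABC", "QRSTUVW", "ZZQRSZZTUV"]
    for members, signature in greedy_groups(extract_symbols(seqs, k=3)):
        print(sorted(members), sorted(signature))
```

In the toy input, the shared windows sit at different positions in each sequence, so the resulting signature is a set of windows rather than one contiguous matched region. A real implementation would also need a scoring scheme for annotation confidence, which this sketch leaves out.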
Subjects
Collections: 01-international | Database: MEDLINE | Main subject: Algorithms / Gene Expression Profiling / Genomics | Language: En | Publication year: 2010 | Document type: Article