Exclusive sequences of different genomes.
J Bioinform Comput Biol; 8(3): 519-34, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20556860
ABSTRACT
We studied the distribution of 1-7 bp words in a dataset that includes 139 complete eukaryotic genomes, 33 masked eukaryotic genomes and coding regions from 35 genomes. We tested different statistical models to determine over- and under-represented words. The method described by Karlin et al. has the strongest predictive power compared to other methods. Using this method we identified over- and under-represented words consistent within a large array of taxonomic groups. Some of those words have not yet been described as exclusive. For example, CGCG is over-represented in CG-deficient organisms. We also describe exceptions for widely known exclusive words, such as CG and TA.
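The abstract does not say which formulation from Karlin et al. was used. As an illustration only, the sketch below assumes the widely cited dinucleotide odds ratio, rho_xy = f_xy / (f_x * f_y), with counts pooled over the sequence and its reverse complement. The function names are hypothetical, and the study's 1-7 bp analysis would need a generalization to longer words that is not shown here. Values well below 1 suggest under-representation and values well above 1 suggest over-representation.

```python
from collections import Counter

# Assumption: plain DNA alphabet; any other symbol (e.g. N) is skipped.
_BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def dinucleotide_odds_ratios(seq):
    """Return {dinucleotide: rho}, where rho = f_xy / (f_x * f_y).

    Frequencies are pooled over both strands. Dinucleotides whose
    component bases never occur are left out of the result.
    """
    seq = seq.upper()
    mono = Counter()
    di = Counter()
    for strand in (seq, reverse_complement(seq)):
        for i, base in enumerate(strand):
            if base not in _BASES:
                continue
            mono[base] += 1
            # Count the overlapping dinucleotide starting at position i.
            if i + 1 < len(strand) and strand[i + 1] in _BASES:
                di[base + strand[i + 1]] += 1

    n_mono = sum(mono.values())
    n_di = sum(di.values())
    if n_mono == 0 or n_di == 0:
        return {}

    ratios = {}
    for x in _BASES:
        for y in _BASES:
            expected = (mono[x] / n_mono) * (mono[y] / n_mono)
            if expected == 0:
                continue  # ratio undefined when a base is absent
            ratios[x + y] = (di[x + y] / n_di) / expected
    return ratios
```

A call such as `dinucleotide_odds_ratios(genome_sequence)["CG"]` would give the CG ratio for a sequence held in a string named `genome_sequence` (a placeholder). Pooling both strands makes each word and its reverse complement share one value, which suits double-stranded DNA.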
Database: MEDLINE
Main subject: Algorithms / Sequence Alignment / Chromosome Mapping / Genome / Sequence Analysis, DNA
Type of study: Prognostic_studies / Risk_factors_studies
Limits: Animals / Humans
Language: English
Publication year: 2010
Document type: Article