Empirical analysis of protein insertions and deletions determining parameters for the correct placement of gaps in protein sequence alignments.
Chang, Mike S S; Benner, Steven A.
Affiliation
  • Chang MS; Foundation for Applied Molecular Evolution, Gainesville, FL 32601, USA.
J Mol Biol ; 341(2): 617-31, 2004 Aug 06.
Article in English | MEDLINE | ID: mdl-15276848
To understand how protein segments are inserted and deleted during divergent evolution, a set of pairwise alignments was examined, each containing exactly one gap and therefore arising from the first insertion-deletion (indel) event in the time separating the homologs. The alignments showed that "structure-breaking" amino acids (PGDNS) are preferred within and flanking gapped regions, as are two residues with hydrophilic side-chains (QE) that frequently occur at the surface of protein folds. Conversely, hydrophobic residues (FMILYVW) occur infrequently within and flanking the gapped region. These preferences differ modestly between protein pairs separated by an episode of adaptive evolution and pairs diverging under strong functional constraints. Surprisingly, regions near an indel have not evolved more rapidly than the sequence pair overall, showing no evidence that an indel event must be compensated by local amino acid replacement. The gap lengths are best approximated by a Zipfian distribution, with the probability of a gap of length L decreasing as L^(-1.8). These features are largely independent of the length of the gap and the extent of divergence (measured by both silent and non-silent sequence changes) separating the two proteins. Surprisingly, amino acid repeats were discovered in more than a third of the polypeptide segments in and around the gap. These correspond to repeats in the DNA sequence. This suggests that a signature of the mechanism by which indels occur in the DNA sequence remains in the encoded protein sequences. These data suggest specific tools to score gap placement in an alignment. They also suggest tools that distinguish true indels from gaps created by mistaken gene finding, including under-predicted and over-predicted introns. By providing mechanisms to identify errors, the tools will enhance the value of genome sequence databases in support of integrated paleogenomics strategies used to extract functional information in a post-genomic environment.
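The Zipfian gap-length relationship stated in the abstract can be illustrated with a short sketch. Only the exponent 1.8 comes from the abstract; the truncation at a maximum gap length, the normalization over that range, and the use of a negative log-probability as a gap score are assumptions made here for illustration, not the authors' scoring scheme. The function names are hypothetical.

```python
import math

ALPHA = 1.8  # exponent reported in the abstract


def zipf_gap_probabilities(max_len, alpha=ALPHA):
    """Return {L: P(L)} for L = 1..max_len with P(L) proportional to L**-alpha.

    Truncating at max_len and normalizing over that range is an assumption
    made for this sketch.
    """
    weights = [L ** -alpha for L in range(1, max_len + 1)]
    total = sum(weights)
    return {L: w / total for L, w in enumerate(weights, start=1)}


def gap_length_penalty(length, probs):
    """Negative natural-log probability of a gap of the given length.

    An illustrative score only; a longer gap is less probable, so it
    receives a larger penalty.
    """
    return -math.log(probs[length])


# Probabilities for gap lengths 1..50 (the cutoff of 50 is arbitrary).
probs = zipf_gap_probabilities(50)
```

Under this form, the penalty grows with the logarithm of the gap length rather than linearly: the difference between the penalties for lengths L and 1 is 1.8 * ln(L), whatever cutoff is chosen.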
Subject(s)
Collection: 01-internacional | Database: MEDLINE | Main subject: Genetic Variation / Proteins / Sequence Alignment / Sequence Deletion / Databases, Protein | Study type: Prognostic_studies | Limits: Animals / Humans | Language: English | Journal: J Mol Biol | Year: 2004 | Document type: Article | Country of affiliation: United States | Country of publication: Netherlands