ABSTRACT
Whole-genome shotgun assemblies have proven remarkably successful in reconstructing the bulk of euchromatic genes, with sequencing depth appearing to be the only limit. For genes embedded in heterochromatin, however, the low cloning efficiency of repetitive sequences, combined with the computational challenges, demands that additional clues be used to annotate the sequences. One approach that has proven very successful in identifying protein-coding genes in Y-linked heterochromatin of Drosophila melanogaster has been to make a BLASTable database of the small, unmapped contigs and fragments left over at the end of a shotgun assembly, and to attempt to capture these contigs by BLAST searching with an appropriate query sequence. This approach often yields a staggered alignment of contigs from the unmapped set to the query sequence, as though the disjoint contigs represent small portions of the gene. Further inspection frequently shows that the contigs are separated by very large, heterochromatic introns. Methods of this sort are being expanded to make the best use of all available clues to determine which unmapped contigs are associated with genes. These clues include EST libraries and, in the case of the Y chromosome, testing of male-specific genes and the reduced shotgun depth of relevant contigs. The prospect that whole-genome shotgun assemblies can recover the great bulk of even heterochromatic genes appears much more hopeful than anyone would have imagined.
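The staggered-alignment pattern described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes the BLAST results were saved in the standard 12-column tabular format, and the function names, the `max_overlap` threshold, and the example hits are all hypothetical, invented here for illustration only.

```python
from collections import namedtuple

# One BLAST hit of the query against an unmapped contig (hypothetical record type).
Hit = namedtuple("Hit", "contig qstart qend bitscore")


def parse_tabular(lines):
    """Parse 12-column BLAST tabular lines into Hit records."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Columns 7 and 8 are the query start and end; sort them defensively.
        qstart, qend = sorted((int(fields[6]), int(fields[7])))
        hits.append(Hit(fields[1], qstart, qend, float(fields[11])))
    return hits


def staggered_contigs(hits, max_overlap=10):
    """Keep the best-scoring hit per contig, then walk along the query and
    keep each hit that starts beyond the previous kept hit's end, allowing
    up to `max_overlap` positions of overlap (an assumed threshold)."""
    best = {}
    for h in hits:
        if h.contig not in best or h.bitscore > best[h.contig].bitscore:
            best[h.contig] = h
    chain = []
    for h in sorted(best.values(), key=lambda hit: hit.qstart):
        if not chain or h.qstart > chain[-1].qend - max_overlap:
            chain.append(h)
    return chain


def query_coverage(chain, query_length):
    """Fraction of query positions covered by at least one hit in the chain."""
    covered = set()
    for h in chain:
        covered.update(range(h.qstart, h.qend + 1))
    return len(covered) / query_length


# Invented example: three contigs tiling a 320-residue query, plus a weak
# secondary hit on contigA that the best-hit filter discards.
example = [
    "query\tcontigA\t95.0\t100\t5\t0\t1\t100\t1\t300\t1e-30\t200",
    "query\tcontigB\t92.0\t120\t9\t0\t101\t220\t1\t360\t1e-40\t230",
    "query\tcontigC\t90.0\t80\t8\t0\t230\t309\t240\t1\t1e-20\t150",
    "query\tcontigA\t80.0\t30\t6\t0\t50\t79\t400\t489\t1e-3\t40",
]

chain = staggered_contigs(parse_tabular(example))
for h in chain:
    print(h.contig, h.qstart, h.qend)
print("query coverage:", query_coverage(chain, 320))
```

A set of contigs that each cover a distinct, consecutive stretch of the query, as in this toy input, is the kind of pattern that would then merit closer inspection for intervening introns. The greedy chaining here is a deliberately simple choice and would likely need refinement for real data.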