RESUMO
The hypomethylated fraction of plant genomes is usually enriched in genes and can be selectively cloned using methylation filtration (MF). Therefore, MF has been used as a gene enrichment technology in sorghum and maize, where gene enrichment was proportional to genome size. Here we apply MF to a broad variety of plant species spanning a wide range of genome sizes. Differential methylation of genic and non-genic sequences was observed in all species tested, from non-vascular to vascular plants, but in some cases, such as wheat and pine, a lower than expected level of enrichment was observed. Remarkably, hexaploid wheat and pine show a dramatically large number of gene-like sequences relative to other plants. In hexaploid wheat, this apparent excess of genes may reflect an abundance of methylated pseudogenes, which may thus be more prevalent in recent polyploids.
Assuntos
Metilação de DNA , Genes de Plantas , Sequências Repetitivas de Ácido Nucleico , Duplicação Gênica , Dados de Sequência Molecular , Filogenia , PoliploidiaRESUMO
The completion of the mouse and other mammalian genome sequences will provide necessary, but not sufficient, knowledge for an understanding of much of mouse biology at the molecular level. As a requisite next step in this process, the genes in mouse and their structure must be elucidated. In particular, knowledge of the transcriptional start site of these genes will be necessary for further study of their regulatory regions. To assess the current state of mouse genome annotation to support this activity, we identified several hundred gene predictions in mouse with varying levels of supporting evidence and tested them using RACE-PCR. Modifications were made to the procedure allowing pooling of RNA samples, resulting in a scaleable procedure. The results illustrate potential errors or omissions in the current 5' end annotations in 58% of the genes detected. In testing experimentally unsupported gene predictions, we were able to identify 58 that are not usually annotated as genes but produced spliced transcripts (approximately 25% success rate). In addition, in many genes we were able to detect novel exons not predicted by any gene prediction algorithms. In 19.8% of the genes detected in this study, multiple transcript species were observed. These data show an urgent need to provide direct experimental validation of gene annotations. Moreover, these results show that direct validation using RACE-PCR can be an important component of genome-wide validation. This approach can be a useful tool in the ongoing efforts to increase the quality of gene annotations, especially transcriptional start sites, in complex genomes.
Assuntos
Genes/genética , Genoma , Camundongos/genética , Fases de Leitura Aberta/genética , Sítio de Iniciação de Transcrição , Animais , Sequência de Bases , Ilhas de CpG/genética , Primers do DNA , DNA Complementar/genética , Éxons/genética , Dados de Sequência Molecular , Reação em Cadeia da Polimerase/métodos , Análise de Sequência de DNARESUMO
Gene enrichment strategies offer an alternative to sequencing large and repetitive genomes such as that of maize. We report the generation and analysis of nearly 100,000 undermethylated (or methylation filtration) maize sequences. Comparison with the rice genome reveals that methylation filtration results in a more comprehensive representation of maize genes than those that result from expressed sequence tags or transposon insertion sites sequences. About 7% of the repetitive DNA is unmethylated and thus selected in our libraries, but potentially active transposons and unmethylated organelle genomes can be identified. Reverse transcription polymerase chain reaction can be used to finish the maize transcriptome.