ABSTRACT
Marine stickleback fish have colonized and adapted to thousands of streams and lakes formed since the last ice age, providing an exceptional opportunity to characterize genomic mechanisms underlying repeated ecological adaptation in nature. Here we develop a high-quality reference genome assembly for threespine sticklebacks. By sequencing the genomes of twenty additional individuals from a global set of marine and freshwater populations, we identify a genome-wide set of loci that are consistently associated with marine-freshwater divergence. Our results indicate that reuse of globally shared standing genetic variation, including chromosomal inversions, has an important role in repeated evolution of distinct marine and freshwater sticklebacks, and in the maintenance of divergent ecotypes during early stages of reproductive isolation. Both coding and regulatory changes occur in the set of loci underlying marine-freshwater evolution, but regulatory changes appear to predominate in this well-known example of repeated adaptive evolution in nature.
Subjects
Physiological Adaptation/genetics , Biological Evolution , Genome/genetics , Smegmamorpha/genetics , Alaska , Animals , Aquatic Organisms/genetics , Chromosome Inversion/genetics , Chromosomes/genetics , Conserved Sequence/genetics , Ecotype , Female , Fresh Water , Genetic Variation/genetics , Genomics , Molecular Sequence Data , Seawater , DNA Sequence Analysis
ABSTRACT
This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA Elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very rare splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.
Subjects
Chromosome Mapping , Exons , Human Genome , Genetic Promoter Regions , Quantitative Trait Loci , Genetic Transcription/physiology , Complementary DNA/genetics , Human Genome Project , Humans , Open Reading Frames
ABSTRACT
Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each composed of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization.
Subjects
Base Sequence , Gene Library , Polyploidy , Xenopus laevis/genetics , Xenopus/genetics , Animals , Molecular Evolution , Gene Expression , Duplicate Genes , Genome , Molecular Sequence Data , Open Reading Frames/genetics , Phylogeny , Nucleic Acid Sequence Homology
ABSTRACT
Large-scale genetic screens in zebrafish have identified thousands of mutations in hundreds of essential genes. The genetic mapping of these mutations is necessary to link DNA sequences to the gene functions defined by mutant phenotypes. Here, we report two advances that will accelerate the mapping of zebrafish mutations: (1) the construction of a first-generation single nucleotide polymorphism (SNP) map of the zebrafish genome comprising 2035 SNPs and 178 small insertions/deletions, and (2) the development of a method for mapping mutations in which hundreds of SNPs can be scored in parallel with an oligonucleotide microarray. We have demonstrated the utility of the microarray technique in crosses with haploid and diploid embryos by mapping two known mutations to their previously identified locations. We have also used this approach to localize four previously unmapped mutations. We expect that mapping with SNPs and oligonucleotide microarrays will accelerate the molecular analysis of zebrafish mutations.