ABSTRACT
As part of a larger project to sequence the Populus genome and generate genomic resources for this emerging model tree, we constructed a physical map of the Populus genome, representing one of the few such maps of an undomesticated, highly heterozygous plant species. The physical map, consisting of 2802 contigs, was constructed from fingerprinted bacterial artificial chromosome (BAC) clones. The map represents approximately 9.4-fold coverage of the Populus genome, which has been estimated from the genome sequence assembly to be 485 +/- 10 Mb in size. BAC ends were sequenced to assist long-range assembly of whole-genome shotgun sequence scaffolds and to anchor the physical map to the genome sequence. Simple sequence repeat-based markers were derived from the end sequences and used to initiate integration of the BAC and genetic maps. A total of 2411 physical map contigs, representing 97% of all clones assigned to contigs, were aligned to the sequence assembly (JGI Populus trichocarpa, version 1.0). These alignments represent a total coverage of 384 Mb (79%) of the entire poplar sequence assembly and 295 Mb (96%) of linkage group sequence assemblies. A striking result of the physical map contig alignments to the sequence assembly was the co-localization of multiple contigs across numerous regions of the 19 linkage groups. Targeted sequencing of BAC clones and genetic analysis in a small number of representative regions showed that these co-aligning contigs represent distinct haplotypes in the heterozygous individual sequenced, and revealed the nature of these haplotype sequence differences.
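The coverage figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original study: it assumes that the 79% figure is taken relative to a total close to the 485 Mb genome estimate, and all variable names are illustrative.

```python
# Figures quoted in the abstract.
GENOME_MB = 485.0            # estimated genome size (Mb)
FOLD_COVERAGE = 9.4          # coverage represented by the physical map
ALIGNED_TOTAL_MB = 384.0     # sequence assembly covered by aligned contigs (Mb)
ALIGNED_LG_MB = 295.0        # linkage-group sequence covered (Mb)
LG_COVERED_FRACTION = 0.96   # stated fraction of linkage-group sequence covered
CONTIGS_TOTAL = 2802
CONTIGS_ALIGNED = 2411


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Assumption: the denominator behind "79%" is close to the genome estimate.
total_pct = percent(ALIGNED_TOTAL_MB, GENOME_MB)

# Linkage-group sequence implied by "295 Mb (96%)".
implied_lg_mb = ALIGNED_LG_MB / LG_COVERED_FRACTION

# Total clone insert length implied by 9.4-fold coverage of 485 Mb.
implied_clone_mb = FOLD_COVERAGE * GENOME_MB

# Share of physical map contigs that were aligned to the assembly.
contigs_pct = percent(CONTIGS_ALIGNED, CONTIGS_TOTAL)

print(f"aligned share of genome estimate: {total_pct:.1f}%")
print(f"implied linkage-group sequence:   {implied_lg_mb:.0f} Mb")
print(f"implied total clone length:       {implied_clone_mb:.0f} Mb")
print(f"contigs aligned:                  {contigs_pct:.1f}%")
```

Note that the share of contigs aligned is lower than the 97% quoted for clones, which would be expected if the unaligned contigs are mostly small ones.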
Subject(s)
Genome, Plant , Physical Chromosome Mapping , Populus/genetics , Chromosomes, Artificial, Bacterial , Haplotypes , Minisatellite Repeats , Polymorphism, Genetic , Sequence Alignment , Sequence Analysis, DNA
ABSTRACT
A physical map of the Atlantic salmon (Salmo salar) genome was generated based on HindIII fingerprints of a publicly available BAC (bacterial artificial chromosome) library constructed from DNA isolated from a Norwegian male. Approximately 11.5 haploid genome equivalents (185,938 clones) were successfully fingerprinted. Contigs were first assembled in FPC at high stringency (cutoff 1e-16); subsequent end-to-end joins yielded 4354 contigs and 37,285 singletons. The accuracy of the contig assembly was verified by hybridization and PCR analysis using genetic markers. A subset of the BACs in the library contained few or no HindIII recognition sites in their insert DNA. BglI digestion fragment patterns of these BACs allowed us to identify three classes: (1) BACs containing histone genes, (2) BACs containing rDNA-repeating units, and (3) BACs lacking BglI recognition sites. End-sequence analysis of selected BACs representing these three classes confirmed the identification of the first two classes and suggested that the third class contained highly repetitive DNA corresponding to tRNAs and related sequences.
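The stringency value quoted above (1e-16) is a cutoff on the probability that two clones share their matching bands by chance. The sketch below shows a binomial coincidence score of the kind usually described for FPC (the Sulston score). The formula, the function names, and the `tolerance` and `gel_length` values are assumptions made for illustration; none of them comes from the abstract.

```python
from math import comb


def coincidence_score(n_low, n_high, n_match, tolerance=7, gel_length=3300):
    """Probability that at least n_match of the n_low bands in the smaller
    fingerprint match a band in the larger fingerprint purely by chance.

    tolerance and gel_length are illustrative values, not the study's settings.
    """
    # Chance that one band falls within tolerance of none of the n_high bands.
    per_band_miss = (1.0 - 2.0 * tolerance / gel_length) ** n_high
    per_band_hit = 1.0 - per_band_miss
    # Upper tail of a binomial distribution over the n_low bands.
    return sum(
        comb(n_low, m) * per_band_hit ** m * per_band_miss ** (n_low - m)
        for m in range(n_match, n_low + 1)
    )


def clones_overlap(n_low, n_high, n_match, cutoff=1e-16):
    """Call an overlap only when a chance match is less likely than the cutoff."""
    return coincidence_score(n_low, n_high, n_match) < cutoff


# Hypothetical clones with 30 and 40 bands.
print(clones_overlap(30, 40, 30))  # every band shared
print(clones_overlap(30, 40, 20))  # two thirds of the bands shared
```

A cutoff this small accepts only clone pairs that share nearly all of their bands. That fits the two-stage approach described above, where a strict first assembly is followed by end-to-end joins.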
Subject(s)
Genome , Physical Chromosome Mapping/methods , Salmo salar/genetics , Animals , Contig Mapping/methods , DNA Fingerprinting , Histones/genetics , Male , Physical Chromosome Mapping/standards , Restriction Mapping , Site-Specific DNA-Methyltransferase (Adenine-Specific)/genetics
ABSTRACT
Cryptococcus neoformans is a basidiomycetous yeast ubiquitous in the environment, a model for fungal pathogenesis, and an opportunistic human pathogen of global importance. We have sequenced its approximately 20-megabase genome, which contains approximately 6500 intron-rich gene structures and encodes a transcriptome abundant in alternatively spliced and antisense messages. The genome is rich in transposons, many of which cluster at candidate centromeric regions. The presence of these transposons may drive karyotype instability and phenotypic variation. C. neoformans encodes unique genes that may contribute to its unusual virulence properties, and comparison of two phenotypically distinct strains reveals variation in gene content in addition to sequence polymorphisms between the genomes.
Subject(s)
Cryptococcus neoformans/genetics , Genome, Fungal , Alternative Splicing , Cell Wall/metabolism , Chromosomes, Fungal/genetics , Computational Biology , Cryptococcus neoformans/pathogenicity , Cryptococcus neoformans/physiology , DNA Transposable Elements , Fungal Proteins/metabolism , Gene Library , Genes, Fungal , Humans , Introns , Molecular Sequence Data , Phenotype , Polymorphism, Genetic , Polymorphism, Single Nucleotide , Polysaccharides/metabolism , RNA, Antisense , Sequence Analysis, DNA , Transcription, Genetic , Virulence , Virulence Factors/metabolism
ABSTRACT
Here we describe software tools for the automated detection of DNA restriction fragments resolved on agarose fingerprinting gels. We present a mathematical model for the location and shape of the restriction fragments as a function of fragment size, with model parameters determined empirically from "marker" lanes containing molecular size standards. Automated identification of restriction fragments involves several steps, including: image preprocessing, to put the data in a form consistent with a linear model; marker lane analysis, for determination of the model parameters; and data lane analysis, a procedure for detecting restriction fragment multiplets while simultaneously determining the amplitude curve that describes restriction fragment amplitude as a function of mobility. In validation experiments conducted on fingerprinted and sequenced Bacterial Artificial Chromosome (BAC) clones, sensitivity and specificity of restriction fragment identification exceeded 96% on restriction fragments ranging in size from 600 base pairs (bp) to 30,000 bp. The integrated suite of software tools, written in MATLAB and collectively called BandLeader, is in use at the BC Cancer Agency Genome Sciences Centre (GSC) and the Washington University Genome Sequencing Center, and has been provided to the Wellcome Trust Sanger Institute and the Whitehead Institute. Employed in a production mode at the GSC, BandLeader has been used to perform automated restriction fragment identification for more than 850,000 BAC clones for mouse, rat, bovine, and poplar fingerprint mapping projects.
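The validation described above compares fragments called from the gel with fragments expected from the clone sequence. The sketch below shows one way such a comparison could be scored. It is not BandLeader code: the greedy matching rule and the 2% size tolerance are assumptions, and the fragment sizes are invented. "Specificity" is taken here as the fraction of called fragments that are real, which may differ from the definition used by the authors.

```python
def match_fragments(expected, detected, rel_tol=0.02):
    """Greedily pair sorted fragment sizes (bp) that agree within rel_tol.

    Returns (true_positives, false_positives, false_negatives).
    """
    expected = sorted(expected)
    detected = sorted(detected)
    i = j = tp = 0
    while i < len(expected) and j < len(detected):
        e, d = expected[i], detected[j]
        if abs(d - e) <= rel_tol * e:
            tp += 1          # called fragment agrees with an expected one
            i += 1
            j += 1
        elif d < e:
            j += 1           # called fragment with no expected partner
        else:
            i += 1           # expected fragment that was not called
    return tp, len(detected) - tp, len(expected) - tp


def sensitivity(tp, fn):
    """Fraction of expected fragments that were called."""
    return tp / (tp + fn)


def specificity(tp, fp):
    """Fraction of called fragments that are real (assumed definition)."""
    return tp / (tp + fp)


# Invented sizes spanning the 600 bp to 30,000 bp range mentioned above.
expected_bp = [600, 1500, 4200, 9800, 30000]
detected_bp = [605, 1490, 4300, 9750, 12000, 29800]

tp, fp, fn = match_fragments(expected_bp, detected_bp)
print(tp, fp, fn)
print(sensitivity(tp, fn), specificity(tp, fp))
```

A relative tolerance is used because sizing error on agarose gels tends to grow with fragment size. A fixed tolerance in mobility units would be an equally reasonable choice.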
Subject(s)
DNA Fingerprinting/methods , Gels , Software , Animals , Cattle , Chromosomes, Artificial, Bacterial/genetics , DNA/genetics , Mice , Models, Chemical , Rats , Sepharose
ABSTRACT
We describe an efficient high-throughput method for accurate DNA sequencing of entire cDNA clones. Developed as part of our involvement in the Mammalian Gene Collection full-length cDNA sequencing initiative, the method has been used and refined in our laboratory since September 2000. The method is amenable to large-scale projects: we have used it to generate >7 Mb of accurate sequence from 3695 candidate full-length cDNAs. Sequencing is accomplished through the insertion of a Mu transposon into the cDNAs, followed by sequencing reactions primed with Mu-specific sequencing primers. Transposon insertion reactions are performed not on individual cDNAs but on pools of up to 96 clones. This pooling strategy reduces the number of transposon insertion sequencing libraries that would otherwise be required, lowering the cost and improving the efficiency of transposon library construction. Sequences generated using transposon-specific sequencing primers are assembled to yield the full-length cDNA sequence, with sequence editing and other sequence finishing activities performed as required to resolve sequence ambiguities. Although analysis of the 22,785 sequenced Mu transposon insertion events revealed a weak sequence preference for Mu insertion, we observed insertion of the Mu transposon into 1015 of the 1024 possible 5-mer candidate insertion sites.
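The count of 1024 candidate sites follows from the four-letter DNA alphabet, since 4^5 = 1024 distinct 5-mers exist. The sketch below enumerates them and shows how observed insertion sites could be tallied against that set. The function names and the toy site list are hypothetical; only the counts 3695, 96, 1015, and 1024 come from the abstract.

```python
from itertools import product
from math import ceil


def all_kmers(k, alphabet="ACGT"):
    """Return every possible k-mer over the alphabet."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


def kmer_coverage(observed_sites, k):
    """Count distinct valid k-mers among the observed insertion sites.

    Returns (distinct_kmers_seen, distinct_kmers_possible).
    """
    possible = set(all_kmers(k))
    seen = {s.upper() for s in observed_sites if len(s) == k} & possible
    return len(seen), len(possible)


# Figures from the abstract.
min_pools = ceil(3695 / 96)      # fewest pools of up to 96 clones
fraction_hit = 1015 / 1024       # share of 5-mers with an observed insertion

# Toy data only: a duplicate in lower case and an ambiguous site are ignored.
toy_sites = ["ACGTA", "acgta", "TTTTT", "GGCAT", "NNNNN"]
seen, possible = kmer_coverage(toy_sites, 5)

print(min_pools, round(100 * fraction_hit, 1))
print(seen, possible)
```

Because pools hold up to 96 clones, `min_pools` is a lower bound on the number of pooled libraries, not the number actually constructed.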