Results 1 - 20 of 34
2.
Hum Mutat ; 38(9): 1182-1192, 2017 09.
Article in English | MEDLINE | ID: mdl-28634997

ABSTRACT

Precision medicine aims to predict a patient's disease risk and best therapeutic options by using that individual's genetic sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype-phenotype prediction challenges; participants build models, undergo assessment, and share key findings. For CAGI 4, three challenges involved using exome-sequencing data: Crohn's disease, bipolar disorder, and warfarin dosing. Previous CAGI challenges included prior versions of the Crohn's disease challenge. Here, we discuss the range of techniques used for phenotype prediction as well as the methods used for assessing predictive models. Additionally, we outline some of the difficulties associated with making predictions and evaluating them. The lessons learned from the exome challenges can be applied to both research and clinical efforts to improve phenotype prediction from genotype. In addition, these challenges serve as a vehicle for sharing clinical and research exome data in a secure manner with scientists who have a broad range of expertise, contributing to a collaborative effort to advance our understanding of genotype-phenotype relationships.


Subject(s)
Bipolar Disorder/genetics , Crohn Disease/genetics , Exome Sequencing/methods , Precision Medicine/methods , Warfarin/therapeutic use , Computational Biology/methods , Databases, Genetic , Genetic Predisposition to Disease , Humans , Information Dissemination , Pharmacogenomic Variants , Phenotype , Warfarin/pharmacology
3.
Hum Mutat ; 38(9): 1266-1276, 2017 09.
Article in English | MEDLINE | ID: mdl-28544481

ABSTRACT

The advent of next-generation sequencing has dramatically decreased the cost for whole-genome sequencing and increased the viability for its application in research and clinical care. The Personal Genome Project (PGP) provides unrestricted access to genomes of individuals and their associated phenotypes. This resource enabled the Critical Assessment of Genome Interpretation (CAGI) to create a community challenge to assess the bioinformatics community's ability to predict traits from whole genomes. In the CAGI PGP challenge, researchers were asked to predict whether an individual had a particular trait or profile based on their whole genome. Several approaches were used to assess submissions, including ROC AUC (area under receiver operating characteristic curve), probability rankings, the number of correct predictions, and statistical significance simulations. Overall, we found that prediction of individual traits is difficult, relying on a strong knowledge of trait frequency within the general population, whereas matching genomes to trait profiles relies heavily upon a small number of common traits including ancestry, blood type, and eye color. When a rare genetic disorder is present, profiles can be matched when one or more pathogenic variants are identified. Prediction accuracy has improved substantially over the last 6 years due to improved methodology and a better understanding of features.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Whole Genome Sequencing/methods , Area Under Curve , Genetic Predisposition to Disease , Human Genome Project , Humans , Phenotype , Quantitative Trait Loci
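
The ROC AUC measure named in the abstract above can be illustrated with a minimal sketch. It is not the assessors' code: the function name `roc_auc` and the toy inputs are assumptions made for illustration. It treats AUC as the probability that a randomly chosen individual with the trait receives a higher predicted score than a randomly chosen individual without it, counting ties as one half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 (1 = individual has the trait)
    scores: iterable of predicted probabilities or scores, same length
    Hypothetical helper for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: four individuals, two with the trait.
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the measure is often reported alongside simple counts of correct predictions.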
4.
Elife ; 4, 2015 Mar 31.
Article in English | MEDLINE | ID: mdl-25824290

ABSTRACT

Here, we document a collection of ∼7434 MiMIC (Minos Mediated Integration Cassette) insertions of which 2854 are inserted in coding introns. They allowed us to create a library of 400 GFP-tagged genes. We show that 72% of internally tagged proteins are functional, and that more than 90% can be imaged in unfixed tissues. Moreover, the tagged mRNAs can be knocked down by RNAi against GFP (iGFPi), and the tagged proteins can be efficiently knocked down by deGradFP technology. The phenotypes associated with RNA and protein knockdown typically correspond to severe loss of function or null mutant phenotypes. Finally, we demonstrate reversible, spatial, and temporal knockdown of tagged proteins in larvae and adult flies. This new strategy and collection of strains allows unprecedented in vivo manipulations in flies for many genes. These strategies will likely extend to vertebrates.


Subject(s)
DNA Transposable Elements/genetics , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Gene Library , Mutagenesis, Insertional , RNA Interference , Animals , Animals, Genetically Modified , Blotting, Western , Brain/metabolism , Drosophila Proteins/metabolism , Drosophila melanogaster/metabolism , Drosophila melanogaster/physiology , Gene Expression , Green Fluorescent Proteins/genetics , Green Fluorescent Proteins/metabolism , Larva/genetics , Larva/metabolism , Learning/physiology , Microscopy, Confocal , Time Factors , Tumor Suppressor Proteins/genetics , Tumor Suppressor Proteins/metabolism , alpha Catenin/genetics , alpha Catenin/metabolism
5.
Genome Res ; 25(3): 445-58, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25589440

ABSTRACT

Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads.


Subject(s)
Drosophila melanogaster/genetics , Genome , Animals , Chromosome Mapping , Chromosomes, Artificial, Bacterial , Computational Biology , Contig Mapping , High-Throughput Nucleotide Sequencing , In Situ Hybridization, Fluorescence , Molecular Sequence Data , Polytene Chromosomes , Restriction Mapping
6.
Nature ; 512(7515): 445-8, 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-25164755

ABSTRACT

The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.


Subject(s)
Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Gene Expression Profiling , Transcriptome/genetics , Animals , Caenorhabditis elegans/embryology , Caenorhabditis elegans/growth & development , Chromatin/genetics , Cluster Analysis , Drosophila melanogaster/growth & development , Gene Expression Regulation, Developmental/genetics , Histones/metabolism , Humans , Larva/genetics , Larva/growth & development , Models, Genetic , Molecular Sequence Annotation , Promoter Regions, Genetic/genetics , Pupa/genetics , Pupa/growth & development , RNA, Untranslated/genetics , Sequence Analysis, RNA
7.
Nat Biotechnol ; 32(4): 341-6, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24633242

ABSTRACT

The identification of full length transcripts entirely from short-read RNA sequencing data (RNA-seq) remains a challenge in the annotation of genomes. Here we describe an automated pipeline for genome annotation that integrates RNA-seq and gene-boundary data sets, which we call Generalized RNA Integration Tool, or GRIT. Applying GRIT to Drosophila melanogaster short-read RNA-seq, cap analysis of gene expression (CAGE) and poly(A)-site-seq data collected for the modENCODE project, we recovered the vast majority of previously annotated transcripts and doubled the total number of transcripts cataloged. We found that 20% of protein coding genes encode multiple protein-localization signals and that, in 20-d-old adult fly heads, genes with multiple polyadenylation sites are more common than genes with alternative splicing or alternative promoters. GRIT demonstrates 30% higher precision and recall than the most widely used transcript assembly tools. GRIT will facilitate the automated generation of high-quality genome annotations without the need for extensive manual annotation.


Subject(s)
Chromosome Mapping/methods , Genomics/methods , Molecular Sequence Annotation/methods , RNA/chemistry , RNA/genetics , Sequence Analysis, RNA/methods , Animals , Drosophila melanogaster/genetics , Genome, Insect/genetics , RNA/analysis
8.
Nature ; 512(7515): 393-9, 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-24670639

ABSTRACT

Animal transcriptomes are dynamic, with each cell type, tissue and organ system expressing an ensemble of transcript isoforms that give rise to substantial diversity. Here we have identified new genes, transcripts and proteins using poly(A)+ RNA sequencing from Drosophila melanogaster in cultured cell lines, dissected organ systems and under environmental perturbations. We found that a small set of mostly neural-specific genes has the potential to encode thousands of transcripts each through extensive alternative promoter usage and RNA splicing. The magnitudes of splicing changes are larger between tissues than between developmental stages, and most sex-specific splicing is gonad-specific. Gonads express hundreds of previously unknown coding and long non-coding RNAs (lncRNAs), some of which are antisense to protein-coding genes and produce short regulatory RNAs. Furthermore, previously identified pervasive intergenic transcription occurs primarily within newly identified introns. The fly transcriptome is substantially more complex than previously recognized, with this complexity arising from combinatorial usage of promoters, splice sites and polyadenylation sites.


Subject(s)
Drosophila melanogaster/genetics , Gene Expression Profiling , Transcriptome/genetics , Alternative Splicing/genetics , Animals , Drosophila melanogaster/anatomy & histology , Drosophila melanogaster/cytology , Female , Male , Molecular Sequence Annotation , Nerve Tissue/metabolism , Organ Specificity , Poly A/genetics , Polyadenylation , Promoter Regions, Genetic/genetics , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Sex Characteristics , Stress, Physiological/genetics
9.
Genetics ; 190(3): 931-40, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22174071

ABSTRACT

In Drosophila, collections of green fluorescent protein (GFP) trap lines have been used to probe the endogenous expression patterns of trapped genes or the subcellular localization of their protein products. Here, we describe a method, based on nonoverlapping, highly specific, shRNA transgenes directed against GFP, that extends the utility of these collections to loss-of-function studies. Furthermore, we used a MiMIC transposon to generate GFP traps in Drosophila cell lines with distinct subcellular localization patterns, which will permit high-throughput screens using fluorescently tagged proteins. Finally, we show that fluorescent traps, paired with recombinant nanobodies and mass spectrometry, allow the study of endogenous protein complexes in Drosophila.


Subject(s)
Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Drosophila/genetics , Drosophila/metabolism , Fluorescent Dyes , Green Fluorescent Proteins , Protein Interaction Mapping/methods , Animals , Cell Line , Cell Survival/genetics , Embryo, Nonmammalian/metabolism , Female , Gene Order , Gene Silencing , Green Fluorescent Proteins/genetics , Green Fluorescent Proteins/metabolism , Multiprotein Complexes/isolation & purification , Multiprotein Complexes/metabolism , Peptide Elongation Factors/genetics , Peptide Elongation Factors/metabolism , Protein Binding/physiology , RNA, Small Interfering/metabolism , Recombinant Fusion Proteins/genetics , Recombinant Fusion Proteins/metabolism , Stem Cells/metabolism
10.
Nat Methods ; 8(9): 737-43, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21985007

ABSTRACT

We demonstrate the versatility of a collection of insertions of the transposon Minos-mediated integration cassette (MiMIC), in Drosophila melanogaster. MiMIC contains a gene-trap cassette and the yellow+ marker flanked by two inverted bacteriophage ΦC31 integrase attP sites. MiMIC integrates almost at random in the genome to create sites for DNA manipulation. The attP sites allow the replacement of the intervening sequence of the transposon with any other sequence through recombinase-mediated cassette exchange (RMCE). We can revert insertions that function as gene traps and cause mutant phenotypes to revert to wild type by RMCE and modify insertions to control GAL4 or QF overexpression systems or perform lineage analysis using the Flp recombinase system. Insertions in coding introns can be exchanged with protein-tag cassettes to create fusion proteins to follow protein expression and perform biochemical experiments. The applications of MiMIC vastly extend the D. melanogaster toolkit.


Subject(s)
DNA Transposable Elements/genetics , Drosophila melanogaster/genetics , Animals , Bioengineering , Drosophila Proteins/genetics , Gene Expression Regulation , Introns , Mutagenesis, Insertional , Recombinant Fusion Proteins/analysis , Repetitive Sequences, Nucleic Acid
11.
Proc Natl Acad Sci U S A ; 108(38): 15948-53, 2011 Sep 20.
Article in English | MEDLINE | ID: mdl-21896744

ABSTRACT

The P transposable element recently invaded wild Drosophila melanogaster strains worldwide. A single introduced copy can multiply and spread throughout the fly genome in just a few generations, even though its cut-and-paste transposition mechanism does not inherently increase copy number. P element insertions preferentially target the promoters of a subset of genes, but why these sites are hotspots remains unknown. We show that P elements selectively target sites that in tissue-culture cells bind origin recognition complex proteins and function as replication origins. The association of origin recognition complex-binding sites with selected promoters and their absence near clustered differentiation genes may dictate P element site specificity. Inserting at unfired replication origins during S phase may allow P elements to be both repaired and reduplicated, thereby increasing element copy number. The advantage transposons gain by moving from replicated to unreplicated genomic regions may contribute to the association of heterochromatin with late-replicating genomic regions.


Subject(s)
DNA Transposable Elements/genetics , Drosophila melanogaster/genetics , Mutagenesis, Insertional , Replication Origin/genetics , Animals , Base Sequence , Binding Sites/genetics , Chromosomes, Insect/genetics , DNA Replication/genetics , Drosophila Proteins/genetics , Heterochromatin/genetics , Models, Genetic , Promoter Regions, Genetic/genetics , Time Factors
12.
Genetics ; 188(3): 731-43, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21515576

ABSTRACT

The Drosophila Gene Disruption Project (GDP) has created a public collection of mutant strains containing single transposon insertions associated with different genes. These strains often disrupt gene function directly, allow production of new alleles, and have many other applications for analyzing gene function. Here we describe the addition of ∼7600 new strains, which were selected from >140,000 additional P or piggyBac element integrations and 12,500 newly generated insertions of the Minos transposon. These additions nearly double the size of the collection and increase the number of tagged genes to at least 9440, approximately two-thirds of all annotated protein-coding genes. We also compare the site specificity of the three major transposons used in the project. All three elements insert only rarely within many Polycomb-regulated regions, a property that may contribute to the origin of "transposon-free regions" (TFRs) in metazoan genomes. Within other genomic regions, Minos transposes essentially at random, whereas P or piggyBac elements display distinctive hotspots and coldspots. P elements, as previously shown, have a strong preference for promoters. In contrast, piggyBac site selectivity suggests that it has evolved to reduce deleterious and increase adaptive changes in host gene expression. The propensity of Minos to integrate broadly makes possible a hybrid finishing strategy for the project that will bring >95% of Drosophila genes under experimental control within their native genomic contexts.


Subject(s)
DNA Transposable Elements , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Genes, Insect , Mutagenesis, Insertional/methods , Alleles , Animals , Gene Expression , Genome, Insect , Models, Genetic , Mutation , Phenotype
13.
Genome Res ; 21(2): 182-92, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21177961

ABSTRACT

Core promoters are critical regions for gene regulation in higher eukaryotes. However, the boundaries of promoter regions, the relative rates of initiation at the transcription start sites (TSSs) distributed within them, and the functional significance of promoter architecture remain poorly understood. We produced a high-resolution map of promoters active in the Drosophila melanogaster embryo by integrating data from three independent and complementary methods: 21 million cap analysis of gene expression (CAGE) tags, 1.2 million RNA ligase mediated rapid amplification of cDNA ends (RLM-RACE) reads, and 50,000 cap-trapped expressed sequence tags (ESTs). We defined 12,454 promoters of 8037 genes. Our analysis indicates that, due to non-promoter-associated RNA background signal, previous studies have likely overestimated the number of promoter-associated CAGE clusters by fivefold. We show that TSS distributions form a complex continuum of shapes, and that promoters active in the embryo and adult have highly similar shapes in 95% of cases. This suggests that these distributions are generally determined by static elements such as local DNA sequence and are not modulated by dynamic signals such as histone modifications. Transcription factor binding motifs are differentially enriched as a function of promoter shape, and peaked promoter shape is correlated with both temporal and spatial regulation of gene expression. Our results contribute to the emerging view that core promoters are functionally diverse and control patterning of gene expression in Drosophila and mammals.


Subject(s)
Computational Biology , Drosophila melanogaster/genetics , Genome, Insect/genetics , Promoter Regions, Genetic , 3' Untranslated Regions/genetics , Animals , Chromosome Mapping , Drosophila melanogaster/embryology , Expressed Sequence Tags , Gene Expression Profiling , Gene Expression Regulation/genetics , Genome-Wide Association Study , Transcription Initiation Site
14.
Genome Res ; 21(2): 301-14, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21177962

ABSTRACT

Drosophila melanogaster cell lines are important resources for cell biologists. Here, we catalog the expression of exons, genes, and unannotated transcriptional signals for 25 lines. Unannotated transcription is substantial (typically 19% of euchromatic signal). Conservatively, we identify 1405 novel transcribed regions; 684 of these appear to be new exons of neighboring, often distant, genes. Sixty-four percent of genes are expressed detectably in at least one line, but only 21% are detected in all lines. Each cell line expresses, on average, 5885 genes, including a common set of 3109. Expression levels vary over several orders of magnitude. Major signaling pathways are well represented: most differentiation pathways are "off" and survival/growth pathways "on." Roughly 50% of the genes expressed by each line are not part of the common set, and these show considerable individuality. Thirty-one percent are expressed at a higher level in at least one cell line than in any single developmental stage, suggesting that each line is enriched for genes characteristic of small sets of cells. Most remarkable is that imaginal disc-derived lines can generally be assigned, on the basis of expression, to small territories within developing discs. These mappings reveal unexpected stability of even fine-grained spatial determination. No two cell lines show identical transcription factor expression. We conclude that each line has retained features of an individual founder cell superimposed on a common "cell line" gene expression pattern.


Subject(s)
Drosophila melanogaster/genetics , Genetic Variation , Transcription, Genetic , Animals , Cell Line , Cluster Analysis , Exons , Female , Gene Expression Profiling , Male , Molecular Sequence Data , Signal Transduction/genetics , Transcription Factors/genetics
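
The per-line and shared gene counts quoted in the abstract above are set statistics, and a short sketch shows how such figures relate. The function names and the toy gene identifiers are hypothetical. Only the numbers 5885 and 3109 come from the abstract.

```python
def expression_summary(expressed_by_line):
    """Summarise which genes are expressed across cell lines.

    expressed_by_line: dict mapping a cell-line name to the set of
    gene identifiers detected in that line (hypothetical input format).
    """
    gene_sets = list(expressed_by_line.values())
    if not gene_sets:
        raise ValueError("need at least one cell line")
    in_any_line = set().union(*gene_sets)        # detected in at least one line
    in_all_lines = set.intersection(*gene_sets)  # the common set
    mean_per_line = sum(len(s) for s in gene_sets) / len(gene_sets)
    return {
        "any": len(in_any_line),
        "common": len(in_all_lines),
        "mean_per_line": mean_per_line,
    }


def non_common_fraction(mean_per_line, common):
    """Fraction of a line's expressed genes that lie outside the common set."""
    return (mean_per_line - common) / mean_per_line


# Toy data with made-up gene names.
toy = {"lineA": {"g1", "g2", "g3"}, "lineB": {"g1", "g2", "g4"}, "lineC": {"g1", "g5"}}
toy_summary = expression_summary(toy)

# Using the averages reported in the abstract: 5885 genes per line, 3109 shared.
reported_fraction = non_common_fraction(5885, 3109)
```

With the reported averages, (5885 - 3109) / 5885 is about 0.47, consistent with the abstract's statement that roughly 50% of the genes expressed by each line fall outside the common set.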
15.
Nature ; 471(7339): 473-9, 2011 Mar 24.
Article in English | MEDLINE | ID: mdl-21179090

ABSTRACT

Drosophila melanogaster is one of the most well studied genetic model organisms; nonetheless, its genome still contains unannotated coding and non-coding genes, transcripts, exons and RNA editing sites. Full discovery and annotation are pre-requisites for understanding how the regulation of transcription, splicing and RNA editing directs the development of this complex organism. Here we used RNA-Seq, tiling microarrays and cDNA sequencing to explore the transcriptome in 30 distinct developmental stages. We identified 111,195 new elements, including thousands of genes, coding and non-coding transcripts, exons, splicing and editing events, and inferred protein isoforms that previously eluded discovery using established experimental, prediction and conservation-based approaches. These data substantially expand the number of known transcribed elements in the Drosophila genome and provide a high-resolution view of transcriptome dynamics throughout development.


Subject(s)
Drosophila melanogaster/growth & development , Drosophila melanogaster/genetics , Gene Expression Profiling , Gene Expression Regulation, Developmental/genetics , Transcription, Genetic/genetics , Alternative Splicing/genetics , Animals , Base Sequence , Drosophila Proteins/genetics , Drosophila melanogaster/embryology , Exons/genetics , Female , Genes, Insect/genetics , Genome, Insect/genetics , Male , MicroRNAs/genetics , Oligonucleotide Array Sequence Analysis , Protein Isoforms/genetics , RNA Editing/genetics , RNA, Messenger/analysis , RNA, Messenger/genetics , RNA, Small Untranslated/analysis , RNA, Small Untranslated/genetics , Sequence Analysis , Sex Characteristics
16.
PLoS Genet ; 6(12): e1001228, 2010 Dec 02.
Article in English | MEDLINE | ID: mdl-21151956

ABSTRACT

Genome rearrangements often result from non-allelic homologous recombination (NAHR) between repetitive DNA elements dispersed throughout the genome. Here we systematically analyze NAHR between Ty retrotransposons using a genome-wide approach that exploits unique features of Saccharomyces cerevisiae purebred and Saccharomyces cerevisiae/Saccharomyces bayanus hybrid diploids. We find that DNA double-strand breaks (DSBs) induce NAHR-dependent rearrangements using Ty elements located 12 to 48 kilobases distal to the break site. This break-distal recombination (BDR) occurs frequently, even when allelic recombination can repair the break using the homolog. Robust BDR-dependent NAHR demonstrates that sequences very distal to DSBs can effectively compete with proximal sequences for repair of the break. In addition, our analysis of NAHR partner choice between Ty repeats shows that intrachromosomal Ty partners are preferred despite the abundance of potential interchromosomal Ty partners that share higher sequence identity. This competitive advantage of intrachromosomal Tys results from the relative efficiencies of different NAHR repair pathways. Finally, NAHR generates deleterious rearrangements more frequently when DSBs occur outside rather than within a Ty repeat. These findings yield insights into mechanisms of repeat-mediated genome rearrangements associated with evolution and cancer.


Subject(s)
DNA Repair , Recombination, Genetic , Repetitive Sequences, Nucleic Acid , Saccharomyces cerevisiae/genetics , DNA Breaks, Double-Stranded , DNA Transposable Elements , Genome, Fungal , Saccharomyces/genetics
17.
Genetics ; 186(4): 1111-25, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20876565

ABSTRACT

We describe a molecularly defined duplication kit for the X chromosome of Drosophila melanogaster. A set of 408 overlapping P[acman] BAC clones was used to create small duplications (average length 88 kb) covering the 22-Mb sequenced portion of the chromosome. The BAC clones were inserted into an attP docking site on chromosome 3L using ΦC31 integrase, allowing direct comparison of different transgenes. The insertions complement 92% of the essential and viable mutations and deletions tested, demonstrating that almost all Drosophila genes are compact and that the current annotations of the genome are reasonably accurate. Moreover, almost all genes are tolerated at twice the normal dosage. Finally, we more precisely mapped two regions at which duplications cause diplo-lethality in males. This collection comprises the first molecularly defined duplication set to cover a whole chromosome in a multicellular organism. The work presented removes a long-standing barrier to genetic analysis of the Drosophila X chromosome, will greatly facilitate functional assays of X-linked genes in vivo, and provides a model for functional analyses of entire chromosomes in other species.


Subject(s)
Drosophila melanogaster/genetics , Mutagenesis, Insertional , X Chromosome/genetics , Animals , Chromosome Mapping , Gene Dosage/genetics , Genes, Insect , Molecular Sequence Data
18.
Nat Methods ; 6(6): 431-4, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19465919

ABSTRACT

We constructed Drosophila melanogaster bacterial artificial chromosome libraries with 21-kilobase and 83-kilobase inserts in the P[acman] system. We mapped clones representing 12-fold coverage and encompassing more than 95% of annotated genes onto the reference genome. These clones can be integrated into predetermined attP sites in the genome using ΦC31 integrase to rescue mutations. They can be modified through recombineering, for example, to incorporate protein tags and assess expression patterns.


Subject(s)
Animals, Genetically Modified/genetics , Chromosome Mapping/methods , Chromosomes, Artificial, Bacterial/genetics , Cloning, Molecular/methods , Drosophila melanogaster/genetics , Gene Library , Animals , Base Sequence , Molecular Sequence Data
20.
Science ; 316(5831): 1625-8, 2007 Jun 15.
Article in English | MEDLINE | ID: mdl-17569867

ABSTRACT

Genome sequences for most metazoans and plants are incomplete because of the presence of repeated DNA in the heterochromatin. The heterochromatic regions of Drosophila melanogaster contain 20 million bases (Mb) of sequence amenable to mapping, sequence assembly, and finishing. We describe the generation of 15 Mb of finished or improved heterochromatic sequence with the use of available clone resources and assembly methods. We also constructed a bacterial artificial chromosome-based physical map that spans 13 Mb of the pericentromeric heterochromatin and a cytogenetic map that positions 11 Mb in specific chromosomal locations. We have approached a complete assembly and mapping of the nonsatellite component of Drosophila heterochromatin. The strategy we describe is also applicable to generating substantially more information about heterochromatin in other species, including humans.


Subject(s)
Drosophila melanogaster/genetics , Heterochromatin/genetics , Sequence Analysis, DNA , Animals , Chromosome Mapping , Chromosomes, Artificial, Bacterial , Contig Mapping , Genome , In Situ Hybridization, Fluorescence , Physical Chromosome Mapping