1.
Cell ; 187(10): 2411-2427.e25, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38608704

ABSTRACT

We set out to exhaustively characterize the impact of the cis-chromatin environment on prime editing, a precise genome engineering tool. Using a highly sensitive method for mapping the genomic locations of randomly integrated reporters, we discover massive position effects, exemplified by editing efficiencies ranging from ∼0% to 94% for an identical target site and edit. Position effects on prime editing efficiency are well predicted by chromatin marks, e.g., positively by H3K79me2 and negatively by H3K9me3. Next, we developed a multiplex perturbational framework to assess the interaction of trans-acting factors with the cis-chromatin environment on editing outcomes. Applying this framework to DNA repair factors, we identify HLTF as a context-dependent repressor of prime editing. Finally, several lines of evidence suggest that active transcriptional elongation enhances prime editing. Consistent with this, we show we can robustly decrease or increase the efficiency of prime editing by preceding it with CRISPR-mediated silencing or activation, respectively.


Subject(s)
CRISPR-Cas Systems , Chromatin , Epigenesis, Genetic , Gene Editing , Humans , Chromatin/metabolism , Chromatin/genetics , CRISPR-Cas Systems/genetics , Gene Editing/methods , Histones/metabolism , Transcription Factors/metabolism , Histone Code
2.
Cell ; 174(5): 1309-1324.e18, 2018 08 23.
Article in English | MEDLINE | ID: mdl-30078704

ABSTRACT

We applied a combinatorial indexing assay, sci-ATAC-seq, to profile genome-wide chromatin accessibility in ∼100,000 single cells from 13 adult mouse tissues. We identify 85 distinct patterns of chromatin accessibility, most of which can be assigned to cell types, and ∼400,000 differentially accessible elements. We use these data to link regulatory elements to their target genes, to define the transcription factor grammar specifying each cell type, and to discover in vivo correlates of heterogeneity in accessibility within cell types. We develop a technique for mapping single-cell gene expression data to single-cell chromatin accessibility data, facilitating the comparison of atlases. By intersecting mouse chromatin accessibility with human genome-wide association summary statistics, we identify cell-type-specific enrichments of the heritability signal for hundreds of complex traits. These data define the in vivo landscape of the regulatory genome for common mammalian cell types at single-cell resolution.


Subject(s)
Chromatin/chemistry , Single-Cell Analysis/methods , Animals , Cluster Analysis , Epigenesis, Genetic , Epigenomics , Gene Expression Regulation , Genome, Human , Genome-Wide Association Study , Humans , Male , Mammals , Mice , Mice, Inbred C57BL , Transcription Factors
3.
Nature ; 608(7921): 98-107, 2022 08.
Article in English | MEDLINE | ID: mdl-35794474

ABSTRACT

DNA is naturally well suited to serve as a digital medium for in vivo molecular recording. However, contemporary DNA-based memory devices are constrained in terms of the number of distinct 'symbols' that can be concurrently recorded and/or by a failure to capture the order in which events occur. Here we describe DNA Typewriter, a general system for in vivo molecular recording that overcomes these and other limitations. For DNA Typewriter, the blank recording medium ('DNA Tape') consists of a tandem array of partial CRISPR-Cas9 target sites, with all but the first site truncated at their 5' ends and therefore inactive. Short insertional edits serve as symbols that record the identity of the prime editing guide RNA mediating the edit while also shifting the position of the 'type guide' by one unit along the DNA Tape, that is, sequential genome editing. In this proof of concept of DNA Typewriter, we demonstrate recording and decoding of thousands of symbols, complex event histories and short text messages; evaluate the performance of dozens of orthogonal tapes; and construct 'long tape' potentially capable of recording as many as 20 serial events. Finally, we leverage DNA Typewriter in conjunction with single-cell RNA-seq to reconstruct a monophyletic lineage of 3,257 cells and find that the Poisson-like accumulation of sequential edits to multicopy DNA tape can be maintained across at least 20 generations and 25 days of in vitro clonal expansion.


Subject(s)
DNA , Gene Editing , Genome , CRISPR-Cas Systems/genetics , DNA/genetics , Gene Editing/methods , Genome/genetics , RNA, Guide, Kinetoplastida/genetics , RNA-Seq , Single-Cell Analysis , Time Factors
4.
Nat Methods ; 21(6): 983-993, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38724692

ABSTRACT

The inability to scalably and precisely measure the activity of developmental cis-regulatory elements (CREs) in multicellular systems is a bottleneck in genomics. Here we develop a dual RNA cassette that decouples the detection and quantification tasks inherent to multiplex single-cell reporter assays. The resulting measurement of reporter expression is accurate over multiple orders of magnitude, with a precision approaching the limit set by Poisson counting noise. Together with RNA barcode stabilization via circularization, these scalable single-cell quantitative expression reporters provide high-contrast readouts, analogous to classic in situ assays but entirely from sequencing. Screening >200 regions of accessible chromatin in a multicellular in vitro model of early mammalian development, we identify 13 (8 previously uncharacterized) autonomous and cell-type-specific developmental CREs. We further demonstrate that chimeric CRE pairs generate cognate two-cell-type activity profiles and assess gain- and loss-of-function multicellular expression phenotypes from CRE variants with perturbed transcription factor binding sites. Single-cell quantitative expression reporters can be applied in developmental and multicellular systems to quantitatively characterize native, perturbed and synthetic CREs at scale, with high sensitivity and at single-cell resolution.


Subject(s)
Gene Expression Regulation, Developmental , Single-Cell Analysis , Single-Cell Analysis/methods , Animals , Mice , Genes, Reporter , Regulatory Sequences, Nucleic Acid , Humans , Transcription Factors/genetics , Transcription Factors/metabolism , Chromatin/genetics , Chromatin/metabolism , Regulatory Elements, Transcriptional , Gene Expression Profiling/methods
5.
PLoS Comput Biol ; 16(9): e1008173, 2020 09.
Article in English | MEDLINE | ID: mdl-32946435

ABSTRACT

Single-cell Hi-C (scHi-C) interrogates genome-wide chromatin interaction in individual cells, allowing us to gain insights into 3D genome organization. However, the extremely sparse nature of scHi-C data poses a significant barrier to analysis, limiting our ability to tease out hidden biological information. In this work, we approach this problem by applying topic modeling to scHi-C data. Topic modeling is well-suited for discovering latent topics in a collection of discrete data. For our analysis, we generate nine different single-cell combinatorial indexed Hi-C (sci-Hi-C) libraries from five human cell lines (GM12878, H1Esc, HFF, IMR90, and HAP1), comprising over 19,000 cells. We demonstrate that topic modeling is able to successfully capture cell type differences from sci-Hi-C data in the form of "chromatin topics." We further show enrichment of particular compartment structures associated with locus pairs in these topics.


Subject(s)
Chromatin , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Single-Cell Analysis/methods , Cell Line , Chromatin/chemistry , Chromatin/genetics , Cluster Analysis , Gene Library , Humans , Natural Language Processing
6.
Methods ; 170: 61-68, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31536770

ABSTRACT

The highly dynamic nature of chromosome conformation and three-dimensional (3D) genome organization leads to cell-to-cell variability in chromatin interactions within a cell population, even if the cells of the population appear to be functionally homogeneous. Hence, although Hi-C is a powerful tool for mapping 3D genome organization, this heterogeneity of chromosome higher order structure among individual cells limits the interpretive power of population-based bulk Hi-C assays. Moreover, single-cell studies have the potential to enable the identification and characterization of rare cell populations or cell subtypes in a heterogeneous population. However, it may require surveying relatively large numbers of single cells to achieve statistically meaningful observations in single-cell studies. By applying combinatorial cellular indexing to chromosome conformation capture, we developed single-cell combinatorial indexed Hi-C (sci-Hi-C), a high throughput method that enables mapping chromatin interactomes in large numbers of single cells. We demonstrated the use of sci-Hi-C data to separate cells by karyotypic and cell-cycle state differences and to identify cellular variability in mammalian chromosomal conformation. Here, we provide a detailed description of method design and step-by-step working protocols for sci-Hi-C.


Subject(s)
Chromosome Mapping/methods , High-Throughput Nucleotide Sequencing/methods , Single-Cell Analysis/methods , Animals , Cell Line , Cell Nucleus/genetics , Cell Separation/methods , Chromatin/genetics , Chromatin/isolation & purification , Chromatin/metabolism , Computer Simulation , Gene Library , Humans , Mice , Nucleic Acid Conformation
7.
Am J Hum Genet ; 101(2): 192-205, 2017 Aug 03.
Article in English | MEDLINE | ID: mdl-28712454

ABSTRACT

The extent to which non-coding mutations contribute to Mendelian disease is a major unknown in human genetics. Relatedly, the vast majority of candidate regulatory elements have yet to be functionally validated. Here, we describe a CRISPR-based system that uses pairs of guide RNAs (gRNAs) to program thousands of kilobase-scale deletions that deeply scan across a targeted region in a tiling fashion ("ScanDel"). We applied ScanDel to HPRT1, the housekeeping gene underlying Lesch-Nyhan syndrome, an X-linked recessive disorder. Altogether, we programmed 4,342 overlapping 1 and 2 kb deletions that tiled 206 kb centered on HPRT1 (including 87 kb upstream and 79 kb downstream) with median 27-fold redundancy per base. We functionally assayed programmed deletions in parallel by selecting for loss of HPRT function with 6-thioguanine. As expected, sequencing gRNA pairs before and after selection confirmed that all HPRT1 exons are needed. However, HPRT1 function was robust to deletion of any intergenic or deeply intronic non-coding region, indicating that proximal regulatory sequences are sufficient for HPRT1 expression. Although our screen did identify the disruption of exon-proximal non-coding sequences (e.g., the promoter) as functionally consequential, long-read sequencing revealed that this signal was driven by rare, imprecise deletions that extended into exons. Our results suggest that no singular distal regulatory element is required for HPRT1 expression and that distal mutations are unlikely to contribute substantially to Lesch-Nyhan syndrome burden. Further application of ScanDel could shed light on the role of regulatory mutations in disease at other loci while also facilitating a deeper understanding of endogenous gene regulation.


Subject(s)
CRISPR-Cas Systems/genetics , Gene Expression Regulation/genetics , Hypoxanthine Phosphoribosyltransferase/genetics , Regulatory Sequences, Nucleic Acid/genetics , Sequence Deletion/genetics , Cell Line , HEK293 Cells , Humans , Hypoxanthine Phosphoribosyltransferase/biosynthesis , Lesch-Nyhan Syndrome/genetics , RNA, Guide, Kinetoplastida/genetics , Thioguanine/metabolism
9.
Methods ; 142: 59-73, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29382556

ABSTRACT

The folding and three-dimensional (3D) organization of chromatin in the nucleus critically impacts genome function. The past decade has witnessed rapid advances in genomic tools for delineating 3D genome architecture. Among them, chromosome conformation capture (3C)-based methods such as Hi-C are the most widely used techniques for mapping chromatin interactions. However, traditional Hi-C protocols rely on restriction enzymes (REs) to fragment chromatin and are therefore limited in resolution. We recently developed DNase Hi-C for mapping 3D genome organization, which uses DNase I for chromatin fragmentation. DNase Hi-C overcomes RE-related limitations associated with traditional Hi-C methods, leading to improved methodological resolution. Furthermore, combining this method with DNA capture technology provides a high-throughput approach (targeted DNase Hi-C) that allows for mapping fine-scale chromatin architecture at exceptionally high resolution. Hence, targeted DNase Hi-C will be valuable for delineating the physical landscapes of cis-regulatory networks that control gene expression and for characterizing phenotype-associated chromatin 3D signatures. Here, we provide a detailed description of method design and step-by-step working protocols for these two methods.


Subject(s)
Chromosome Mapping/methods , Deoxyribonuclease I/metabolism , High-Throughput Nucleotide Sequencing/methods , Imaging, Three-Dimensional/methods , Molecular Imaging/methods , Cell Culture Techniques/instrumentation , Cell Culture Techniques/methods , Cell Nucleus/genetics , Cell Nucleus/metabolism , Chromatin/chemistry , Chromatin/genetics , Chromosome Mapping/instrumentation , Cross-Linking Reagents/chemistry , DNA Restriction Enzymes/chemistry , DNA Restriction Enzymes/metabolism , Deoxyribonuclease I/chemistry , Formaldehyde/chemistry , Gene Library , High-Throughput Nucleotide Sequencing/instrumentation , Imaging, Three-Dimensional/instrumentation , Molecular Imaging/instrumentation , Tissue Culture Techniques/instrumentation , Tissue Culture Techniques/methods , Whole Genome Sequencing/instrumentation , Whole Genome Sequencing/methods
10.
Nature ; 500(7461): 207-11, 2013 Aug 08.
Article in English | MEDLINE | ID: mdl-23925245

ABSTRACT

The HeLa cell line was established in 1951 from cervical cancer cells taken from a patient, Henrietta Lacks. This was the first successful attempt to immortalize human-derived cells in vitro. The robust growth and unrestricted distribution of HeLa cells resulted in its broad adoption--both intentionally and through widespread cross-contamination--and for the past 60 years it has served a role analogous to that of a model organism. The cumulative impact of the HeLa cell line on research is demonstrated by its occurrence in more than 74,000 PubMed abstracts (approximately 0.3%). The genomic architecture of HeLa remains largely unexplored beyond its karyotype, partly because like many cancers, its extensive aneuploidy renders such analyses challenging. We carried out haplotype-resolved whole-genome sequencing of the HeLa CCL-2 strain, examined point- and indel-mutation variations, mapped copy-number variations and loss of heterozygosity regions, and phased variants across full chromosome arms. We also investigated variation and copy-number profiles for HeLa S3 and eight additional strains. We find that HeLa is relatively stable in terms of point variation, with few new mutations accumulating after early passaging. Haplotype resolution facilitated reconstruction of an amplified, highly rearranged region of chromosome 8q24.21 at which integration of the human papilloma virus type 18 (HPV-18) genome occurred and that is likely to be the event that initiated tumorigenesis. We combined these maps with RNA-seq and ENCODE Project data sets to phase the HeLa epigenome. This revealed strong, haplotype-specific activation of the proto-oncogene MYC by the integrated HPV-18 genome approximately 500 kilobases upstream, and enabled global analyses of the relationship between gene dosage and expression. These data provide an extensively phased, high-quality reference genome for past and future experiments relying on HeLa, and demonstrate the value of haplotype resolution for characterizing cancer genomes and epigenomes.


Subject(s)
Epigenomics , Genome, Human/genetics , Aneuploidy , DNA Copy Number Variations , Female , Genes, myc/genetics , Haplotypes , HeLa Cells , Human papillomavirus 18/genetics , Human papillomavirus 18/physiology , Humans , Molecular Sequence Data , Mutation , Proto-Oncogene Mas , Sequence Analysis, DNA , Transcriptional Activation/genetics , Uterine Cervical Neoplasms/genetics , Uterine Cervical Neoplasms/pathology , Uterine Cervical Neoplasms/virology
11.
Genome Res ; 25(1): 119-28, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25373147

ABSTRACT

Large-scale bacterial genome sequencing efforts to date have provided limited information on the most prevalent category of disease: sporadically acquired infections caused by common pathogenic bacteria. Here, we performed whole-genome sequencing and de novo assembly of 312 blood- or urine-derived isolates of extraintestinal pathogenic (ExPEC) Escherichia coli, a common agent of sepsis and community-acquired urinary tract infections, obtained during the course of routine clinical care at a single institution. We find that ExPEC E. coli are highly genomically heterogeneous, consistent with pan-genome analyses encompassing the larger species. Investigation of differential virulence factor content and antibiotic resistance phenotypes reveals markedly different profiles among lineages and among strains infecting different body sites. We use high-resolution molecular epidemiology to explore the dynamics of infections at the level of individual patients, including identification of possible person-to-person transmission. Notably, a limited number of discrete lineages caused the majority of bloodstream infections, including one subclone (ST131-H30) responsible for 28% of bacteremic E. coli infections over a 3-yr period. We additionally use a microbial genome-wide-association study (GWAS) approach to identify individual genes responsible for antibiotic resistance, successfully recovering known genes but notably not identifying any novel factors. We anticipate that in the near future, whole-genome sequencing of microorganisms associated with clinical disease will become routine. Our study reveals what kind of information can be obtained from sequencing clinical isolates on a large scale, even well-characterized organisms such as E. coli, and provides insight into how this information might be utilized in a healthcare setting.


Subject(s)
Escherichia coli/genetics , Genome, Bacterial , Sequence Analysis, DNA/methods , Adolescent , Adult , Aged , Aged, 80 and over , Child , Child, Preschool , DNA, Bacterial/genetics , Drug Resistance, Multiple, Bacterial/genetics , Escherichia coli/classification , Escherichia coli/isolation & purification , Female , Gene Library , Genetic Association Studies , Humans , Infant , Infant, Newborn , Logistic Models , Longitudinal Studies , Male , Middle Aged , Phenotype , Phylogeny , Urinary Tract Infections/microbiology , Virulence Factors/genetics , Young Adult
12.
Nat Methods ; 12(1): 71-8, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25437436

ABSTRACT

High-throughput methods based on chromosome conformation capture have greatly advanced our understanding of the three-dimensional (3D) organization of genomes but are limited in resolution by their reliance on restriction enzymes. Here we describe a method called DNase Hi-C for comprehensively mapping global chromatin contacts. DNase Hi-C uses DNase I for chromatin fragmentation, leading to greatly improved efficiency and resolution over that of Hi-C. Coupling this method with DNA-capture technology provides a high-throughput approach for targeted mapping of fine-scale chromatin architecture. We applied targeted DNase Hi-C to characterize the 3D organization of 998 large intergenic noncoding RNA (lincRNA) promoters in two human cell lines. Our results revealed that expression of lincRNAs is tightly controlled by complex mechanisms involving both super-enhancers and the Polycomb repressive complex. Our results provide the first glimpse of the cell type-specific 3D organization of lincRNA genes.


Subject(s)
Chromatin/physiology , RNA, Untranslated/genetics , Chromatin/chemistry , Chromatin/ultrastructure , Chromosome Mapping , Deoxyribonuclease I/metabolism , Genome , Humans , K562 Cells , Protein Conformation , Regulatory Elements, Transcriptional/genetics
13.
Nature ; 485(7397): 246-50, 2012 Apr 04.
Article in English | MEDLINE | ID: mdl-22495309

ABSTRACT

It is well established that autism spectrum disorders (ASD) have a strong genetic component; however, for at least 70% of cases, the underlying genetic cause is unknown. Under the hypothesis that de novo mutations underlie a substantial fraction of the risk for developing ASD in families with no previous history of ASD or related phenotypes--so-called sporadic or simplex families--we sequenced all coding regions of the genome (the exome) for parent-child trios exhibiting sporadic ASD, including 189 new trios and 20 that were previously reported. Additionally, we also sequenced the exomes of 50 unaffected siblings corresponding to these new (n = 31) and previously reported trios (n = 19), for a total of 677 individual exomes from 209 families. Here we show that de novo point mutations are overwhelmingly paternal in origin (4:1 bias) and positively correlated with paternal age, consistent with the modest increased risk for children of older fathers to develop ASD. Moreover, 39% (49 of 126) of the most severe or disruptive de novo mutations map to a highly interconnected β-catenin/chromatin remodelling protein network ranked significantly for autism candidate genes. In proband exomes, recurrent protein-altering mutations were observed in two genes: CHD8 and NTNG1. Mutation screening of six candidate genes in 1,703 ASD probands identified additional de novo, protein-altering mutations in GRIN2B, LAMC3 and SCN1A. Combined with copy number variant (CNV) data, these results indicate extreme locus heterogeneity but also provide a target for future discovery, diagnostics and therapeutics.


Subject(s)
Autistic Disorder/genetics , Exome/genetics , Exons/genetics , Point Mutation/genetics , Protein Interaction Maps/genetics , DNA-Binding Proteins/genetics , GPI-Linked Proteins/genetics , Genetic Predisposition to Disease/genetics , Humans , Laminin/genetics , NAV1.1 Voltage-Gated Sodium Channel , Nerve Tissue Proteins/genetics , Netrins , Parents , Receptors, N-Methyl-D-Aspartate/genetics , Reproducibility of Results , Siblings , Signal Transduction , Sodium Channels/genetics , Stochastic Processes , Transcription Factors/genetics , Tumor Suppressor Protein p53/metabolism , beta Catenin/metabolism
14.
PLoS Genet ; 11(7): e1005413, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26230489

ABSTRACT

Bacterial whole genome sequencing holds promise as a disruptive technology in clinical microbiology, but it has not yet been applied systematically or comprehensively within a clinical context. Here, over the course of one year, we performed prospective collection and whole genome sequencing of nearly all bacterial isolates obtained from a tertiary care hospital's intensive care units (ICUs). This unbiased collection of 1,229 bacterial genomes from 391 patients enables detailed exploration of several features of clinical pathogens. A sizable fraction of isolates identified as clinically relevant corresponded to previously undescribed species: 12% of isolates assigned a species-level classification by conventional methods actually qualified as distinct, novel genomospecies on the basis of genomic similarity. Pan-genome analysis of the most frequently encountered pathogens in the collection revealed substantial variation in pan-genome size (1,420 to 20,432 genes) and the rate of gene discovery (1 to 152 genes per isolate sequenced). Surprisingly, although potential nosocomial transmission of actively surveilled pathogens was rare, 8.7% of isolates belonged to genomically related clonal lineages that were present among multiple patients, usually with overlapping hospital admissions, and were associated with clinically significant infection in 62% of patients from which they were recovered. Multi-patient clonal lineages were particularly evident in the neonatal care unit, where seven separate Staphylococcus epidermidis clonal lineages were identified, including one lineage associated with bacteremia in 5/9 neonates. Our study highlights key differences in the information made available by conventional microbiological practices versus whole genome sequencing, and motivates the further integration of microbial genome sequencing into routine clinical care.


Subject(s)
Bacteria/isolation & purification , Bacterial Infections/transmission , Genome, Bacterial/genetics , Intensive Care Units , Microbiota/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Bacteria/classification , Bacteria/genetics , Bacterial Infections/microbiology , Bacterial Typing Techniques , Biodiversity , Cross Infection/microbiology , Cross Infection/transmission , DNA, Bacterial/genetics , Female , Genetic Variation , Humans , Infant , Infant, Newborn , Male , Middle Aged , Molecular Epidemiology , Prospective Studies , Tertiary Care Centers , Young Adult
15.
Genes Chromosomes Cancer ; 55(3): 278-87, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26650888

ABSTRACT

Investigation of the genetic lesions underlying classical Hodgkin lymphoma (CHL) has been challenging due to the rarity of Hodgkin and Reed-Sternberg (HRS) cells, the pathognomonic neoplastic cells of CHL. In an effort to catalog more comprehensively recurrent copy number alterations occurring during oncogenesis, we investigated somatic alterations involved in CHL using whole-genome sequencing-mediated copy number analysis of purified HRS cells. We performed low-coverage sequencing of small numbers of intact HRS cells and paired non-neoplastic B lymphocytes isolated by flow cytometric cell sorting from 19 primary cases, as well as two commonly used HRS-derived cell lines (KM-H2 and L1236). We found that HRS cells contain strikingly fewer copy number abnormalities than CHL cell lines. A subset of cases displayed nonintegral chromosomal copy number states, suggesting internal heterogeneity within the HRS cell population. Recurrent somatic copy number alterations involving known factors in CHL pathogenesis were identified (REL, the PD-1 pathway, and TNFAIP3). In eight cases (42%) we observed recurrent copy number loss of chr1:2,352,236-4,574,271, a region containing the candidate tumor suppressor TNFRSF14. Using flow cytometry, we demonstrated reduced TNFRSF14 expression in HRS cells from 5 of 22 additional cases (23%) and in two of three CHL cell lines. These studies suggest that TNFRSF14 dysregulation may contribute to the pathobiology of CHL in a subset of cases.


Subject(s)
Hodgkin Disease/genetics , Receptors, Tumor Necrosis Factor, Member 14/genetics , Cell Line, Tumor , Cell Separation , Flow Cytometry , Hodgkin Disease/metabolism , Humans , Oligonucleotide Array Sequence Analysis , Receptors, Tumor Necrosis Factor, Member 14/biosynthesis , Receptors, Tumor Necrosis Factor, Member 14/deficiency , Reed-Sternberg Cells
16.
Nature ; 465(7296): 363-7, 2010 May 20.
Article in English | MEDLINE | ID: mdl-20436457

ABSTRACT

Layered on top of information conveyed by DNA sequence and chromatin are higher order structures that encompass portions of chromosomes, entire chromosomes, and even whole genomes. Interphase chromosomes are not positioned randomly within the nucleus, but instead adopt preferred conformations. Disparate DNA elements co-localize into functionally defined aggregates or 'factories' for transcription and DNA replication. In budding yeast, Drosophila and many other eukaryotes, chromosomes adopt a Rabl configuration, with arms extending from centromeres adjacent to the spindle pole body to telomeres that abut the nuclear envelope. Nonetheless, the topologies and spatial relationships of chromosomes remain poorly understood. Here we developed a method to globally capture intra- and inter-chromosomal interactions, and applied it to generate a map at kilobase resolution of the haploid genome of Saccharomyces cerevisiae. The map recapitulates known features of genome organization, thereby validating the method, and identifies new features. Extensive regional and higher order folding of individual chromosomes is observed. Chromosome XII exhibits a striking conformation that implicates the nucleolus as a formidable barrier to interaction between DNA sequences at either end. Inter-chromosomal contacts are anchored by centromeres and include interactions among transfer RNA genes, among origins of early DNA replication and among sites where chromosomal breakpoints occur. Finally, we constructed a three-dimensional model of the yeast genome. Our findings provide a glimpse of the interface between the form and function of a eukaryotic genome.


Subject(s)
Chromosome Positioning/physiology , Chromosomes, Fungal/metabolism , Genome, Fungal , Imaging, Three-Dimensional , Intranuclear Space/metabolism , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Cell Nucleolus/genetics , Cell Nucleolus/metabolism , Cell Nucleus/genetics , Cell Nucleus/metabolism , Centromere/genetics , Centromere/metabolism , Chromosome Breakpoints , Chromosomes, Fungal/genetics , DNA Replication , Haploidy , RNA, Transfer/genetics , Replication Origin/genetics
17.
Am J Hum Genet ; 90(4): 599-613, 2012 Apr 06.
Article in English | MEDLINE | ID: mdl-22482802

ABSTRACT

Recurrent deletions have been associated with numerous diseases and genomic disorders. Few, however, have been resolved at the molecular level because their breakpoints often occur in highly copy-number-polymorphic duplicated sequences. We present an approach that uses a combination of somatic cell hybrids, array comparative genomic hybridization, and the specificity of next-generation sequencing to determine breakpoints that occur within segmental duplications. Applying our technique to the 17q21.31 microdeletion syndrome, we used genome sequencing to determine copy-number-variant breakpoints in three deletion-bearing individuals with molecular resolution. For two cases, we observed breakpoints consistent with nonallelic homologous recombination involving only H2 chromosomal haplotypes, as expected. Molecular resolution revealed that the breakpoints occurred at different locations within a 145 kbp segment of >99% identity and disrupt KANSL1 (previously known as KIAA1267). In the remaining case, we found that unequal crossover occurred interchromosomally between the H1 and H2 haplotypes and that this event was mediated by a homologous sequence that was once again missing from the human reference. Interestingly, the breakpoints mapped preferentially to gaps in the current reference genome assembly, which we resolved in this study. Our method provides a strategy for the identification of breakpoints within complex regions of the genome harboring high-identity and copy-number-polymorphic segmental duplication. The approach should become particularly useful as high-quality alternate reference sequences become available and genome sequencing of individuals' DNA becomes more routine.


Subject(s)
Chromosome Breakpoints , Chromosomes, Human, Pair 17/genetics , Sequence Analysis, DNA/methods , Base Sequence , Chromosome Deletion , Comparative Genomic Hybridization/methods , DNA Copy Number Variations , Haplotypes , Homologous Recombination , Humans , Molecular Sequence Data , Segmental Duplications, Genomic , Smith-Magenis Syndrome
18.
Nat Methods ; 9(9): 913-5, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22886093

ABSTRACT

We present dial-out PCR, a highly parallel method for retrieving accurate DNA molecules for gene synthesis. A complex library of DNA molecules is modified with unique flanking tags before massively parallel sequencing. Tag-directed primers then enable the retrieval of molecules with desired sequences by PCR. Dial-out PCR enables multiplex in vitro clone screening and is a compelling alternative to in vivo cloning and Sanger sequencing for accurate gene synthesis.


Subject(s)
DNA/genetics , Genes/genetics , Multiplex Polymerase Chain Reaction/methods , Oligonucleotide Array Sequence Analysis , DNA/biosynthesis , Escherichia coli/genetics , Oligonucleotides/genetics
19.
Nature ; 461(7261): 272-6, 2009 Sep 10.
Article in English | MEDLINE | ID: mdl-19684571

ABSTRACT

Genome-wide association studies suggest that common genetic variants explain only a modest fraction of heritable risk for common diseases, raising the question of whether rare variants account for a significant fraction of unexplained heritability. Although DNA sequencing costs have fallen markedly, they remain far from what is necessary for rare and novel variants to be routinely identified at a genome-wide scale in large cohorts. We have therefore sought to develop second-generation methods for targeted sequencing of all protein-coding regions ('exomes'), to reduce costs while enriching for discovery of highly penetrant variants. Here we report on the targeted capture and massively parallel sequencing of the exomes of 12 humans. These include eight HapMap individuals representing three populations, and four unrelated individuals with a rare dominantly inherited disorder, Freeman-Sheldon syndrome (FSS). We demonstrate the sensitive and specific identification of rare and common variants in over 300 megabases of coding sequence. Using FSS as a proof-of-concept, we show that candidate genes for Mendelian disorders can be identified by exome sequencing of a small number of unrelated, affected individuals. This strategy may be extendable to diseases with more complex genetics through larger sample sizes and appropriate weighting of non-synonymous variants by predicted functional impact.


Subject(s)
Exons/genetics , Genetic Predisposition to Disease/genetics , Genetic Testing/methods , Genetic Variation/genetics , Genome, Human/genetics , Sequence Analysis, DNA/methods , Gene Frequency/genetics , Gene Library , Genes, Dominant/genetics , Haplotypes/genetics , Humans , INDEL Mutation/genetics , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide/genetics , RNA Splice Sites/genetics , Sample Size , Sensitivity and Specificity , Syndrome
20.
Proc Natl Acad Sci U S A ; 109(46): 18749-54, 2012 Nov 13.
Article in English | MEDLINE | ID: mdl-23112150

ABSTRACT

The relatively short read lengths associated with the most cost-effective DNA sequencing technologies have limited their use in de novo genome assembly, structural variation detection, and haplotype-resolved genome sequencing. Consequently, there is a strong need for methods that capture various scales of contiguity information at a throughput commensurate with the current scale of massively parallel sequencing. We propose in situ library construction and optical sequencing on the flow cells of currently available massively parallel sequencing platforms as an efficient means of capturing both contiguity information and primary sequence with a single technology. In this proof-of-concept study, we demonstrate basic feasibility by generating >30,000 Escherichia coli paired-end reads separated by 1, 2, or 3 kb using in situ library construction on standard Illumina flow cells. We also show that it is possible to stretch single molecules ranging from 3 to 8 kb on the surface of a flow cell before in situ library construction, thereby enabling the production of clusters whose physical relationship to one another on the flow cell is related to genomic distance.


Subject(s)
DNA, Bacterial/genetics , Escherichia coli/genetics , Genome, Bacterial , Genomic Library , Sequence Analysis, DNA/methods