Results 1 - 20 of 40
1.
Cell ; 187(10): 2411-2427.e25, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38608704

ABSTRACT

We set out to exhaustively characterize the impact of the cis-chromatin environment on prime editing, a precise genome engineering tool. Using a highly sensitive method for mapping the genomic locations of randomly integrated reporters, we discover massive position effects, exemplified by editing efficiencies ranging from ∼0% to 94% for an identical target site and edit. Position effects on prime editing efficiency are well predicted by chromatin marks, e.g., positively by H3K79me2 and negatively by H3K9me3. Next, we developed a multiplex perturbational framework to assess the interaction of trans-acting factors with the cis-chromatin environment on editing outcomes. Applying this framework to DNA repair factors, we identify HLTF as a context-dependent repressor of prime editing. Finally, several lines of evidence suggest that active transcriptional elongation enhances prime editing. Consistent with this, we show we can robustly decrease or increase the efficiency of prime editing by preceding it with CRISPR-mediated silencing or activation, respectively.


Subject(s)
CRISPR-Cas Systems; Chromatin; Epigenesis, Genetic; Gene Editing; Humans; Chromatin/metabolism; Chromatin/genetics; CRISPR-Cas Systems/genetics; Gene Editing/methods; Histones/metabolism; Transcription Factors/metabolism; Histone Code
2.
Immunity ; 57(2): 271-286.e13, 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38301652

ABSTRACT

The immune system encodes information about the severity of a pathogenic threat in the quantity and type of memory cells it forms. This encoding emerges from lymphocyte decisions to maintain or lose self-renewal and memory potential during a challenge. By tracking CD8+ T cells at the single-cell and clonal lineage level using time-resolved transcriptomics, quantitative live imaging, and an acute infection model, we find that T cells will maintain or lose memory potential early after antigen recognition. However, following pathogen clearance, T cells may regain memory potential if initially lost. Mechanistically, this flexibility is implemented by a stochastic cis-epigenetic switch that tunably and reversibly silences the memory regulator, TCF1, in response to stimulation. Mathematical modeling shows how this flexibility allows memory T cell numbers to scale robustly with pathogen virulence and immune response magnitudes. We propose that flexibility and stochasticity in cellular decisions ensure optimal immune responses against diverse threats.


Subject(s)
CD8-Positive T-Lymphocytes; Memory T Cells; Epigenesis, Genetic; Clone Cells; Immunologic Memory; Cell Differentiation
3.
Cell ; 174(5): 1309-1324.e18, 2018 Aug 23.
Article in English | MEDLINE | ID: mdl-30078704

ABSTRACT

We applied a combinatorial indexing assay, sci-ATAC-seq, to profile genome-wide chromatin accessibility in ∼100,000 single cells from 13 adult mouse tissues. We identify 85 distinct patterns of chromatin accessibility, most of which can be assigned to cell types, and ∼400,000 differentially accessible elements. We use these data to link regulatory elements to their target genes, to define the transcription factor grammar specifying each cell type, and to discover in vivo correlates of heterogeneity in accessibility within cell types. We develop a technique for mapping single cell gene expression data to single-cell chromatin accessibility data, facilitating the comparison of atlases. By intersecting mouse chromatin accessibility with human genome-wide association summary statistics, we identify cell-type-specific enrichments of the heritability signal for hundreds of complex traits. These data define the in vivo landscape of the regulatory genome for common mammalian cell types at single-cell resolution.


Subject(s)
Chromatin/chemistry; Single-Cell Analysis/methods; Animals; Cluster Analysis; Epigenesis, Genetic; Epigenomics; Gene Expression Regulation; Genome, Human; Genome-Wide Association Study; Humans; Male; Mammals; Mice; Mice, Inbred C57BL; Transcription Factors
4.
Cell ; 164(1-2): 57-68, 2016 Jan 14.
Article in English | MEDLINE | ID: mdl-26771485

ABSTRACT

Nucleosome positioning varies between cell types. By deep sequencing cell-free DNA (cfDNA), isolated from circulating blood plasma, we generated maps of genome-wide in vivo nucleosome occupancy and found that short cfDNA fragments harbor footprints of transcription factors. The cfDNA nucleosome occupancies correlate well with the nuclear architecture, gene structure, and expression observed in cells, suggesting that they could inform the cell type of origin. Nucleosome spacing inferred from cfDNA in healthy individuals correlates most strongly with epigenetic features of lymphoid and myeloid cells, consistent with hematopoietic cell death as the normal source of cfDNA. We build on this observation to show how nucleosome footprints can be used to infer cell types contributing to cfDNA in pathological states such as cancer. Since this strategy does not rely on genetic differences to distinguish between contributing tissues, it may enable the noninvasive monitoring of a much broader set of clinical conditions than currently possible.


Subject(s)
DNA/chemistry; Nucleosomes/chemistry; Organ Specificity; CCCTC-Binding Factor; Cell Line; Chromatin Assembly and Disassembly; DNA/metabolism; DNA Footprinting; Genome, Human; Genome-Wide Association Study; Humans; Neoplasms/genetics; Repressor Proteins/metabolism; Sequence Analysis, DNA
5.
Nature ; 626(8001): 1084-1093, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38355799

ABSTRACT

The house mouse (Mus musculus) is an exceptional model system, combining genetic tractability with close evolutionary affinity to humans. Mouse gestation lasts only 3 weeks, during which the genome orchestrates the astonishing transformation of a single-cell zygote into a free-living pup composed of more than 500 million cells. Here, to establish a global framework for exploring mammalian development, we applied optimized single-cell combinatorial indexing to profile the transcriptional states of 12.4 million nuclei from 83 embryos, precisely staged at 2- to 6-hour intervals spanning late gastrulation (embryonic day 8) to birth (postnatal day 0). From these data, we annotate hundreds of cell types and explore the ontogenesis of the posterior embryo during somitogenesis and of kidney, mesenchyme, retina and early neurons. We leverage the temporal resolution and sampling depth of these whole-embryo snapshots, together with published data from earlier timepoints, to construct a rooted tree of cell-type relationships that spans the entirety of prenatal development, from zygote to birth. Throughout this tree, we systematically nominate genes encoding transcription factors and other proteins as candidate drivers of the in vivo differentiation of hundreds of cell types. Remarkably, the most marked temporal shifts in cell states are observed within one hour of birth and presumably underlie the massive physiological adaptations that must accompany the successful transition of a mammalian fetus to life outside the womb.


Subject(s)
Animals, Newborn; Embryo, Mammalian; Embryonic Development; Gastrula; Single-Cell Analysis; Time-Lapse Imaging; Animals; Female; Mice; Pregnancy; Animals, Newborn/embryology; Animals, Newborn/genetics; Cell Differentiation/genetics; Embryo, Mammalian/cytology; Embryo, Mammalian/embryology; Embryonic Development/genetics; Gastrula/cytology; Gastrula/embryology; Gastrulation/genetics; Kidney/cytology; Kidney/embryology; Mesoderm/cytology; Mesoderm/enzymology; Neurons/cytology; Neurons/metabolism; Retina/cytology; Retina/embryology; Somites/cytology; Somites/embryology; Time Factors; Transcription Factors/genetics; Transcription, Genetic; Organ Specificity/genetics
6.
Nature ; 622(7983): 584-593, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37369347

ABSTRACT

The human embryo undergoes morphogenetic transformations following implantation into the uterus, but our knowledge of this crucial stage is limited by the inability to observe the embryo in vivo. Models of the embryo derived from stem cells are important tools for interrogating developmental events and tissue-tissue crosstalk during these stages. Here we establish a model of the human post-implantation embryo, a human embryoid, comprising embryonic and extraembryonic tissues. We combine two types of extraembryonic-like cell generated by overexpression of transcription factors with wild-type embryonic stem cells and promote their self-organization into structures that mimic several aspects of the post-implantation human embryo. These self-organized aggregates contain a pluripotent epiblast-like domain surrounded by extraembryonic-like tissues. Our functional studies demonstrate that the epiblast-like domain robustly differentiates into amnion, extraembryonic mesenchyme and primordial germ cell-like cells in response to bone morphogenetic protein cues. In addition, we identify an inhibitory role for SOX17 in the specification of anterior hypoblast-like cells. Modulation of the subpopulations in the hypoblast-like compartment demonstrates that extraembryonic-like cells influence epiblast-like domain differentiation, highlighting functional tissue-tissue crosstalk. In conclusion, we present a modular, tractable, integrated model of the human embryo that will enable us to probe key questions of human post-implantation development, a critical window during which substantial numbers of pregnancies fail.


Subject(s)
Embryo Implantation; Embryo, Mammalian; Embryonic Development; Models, Biological; Pluripotent Stem Cells; Female; Humans; Pregnancy; Bone Morphogenetic Proteins; Cell Differentiation; Embryo, Mammalian/cytology; Embryo, Mammalian/embryology; Embryoid Bodies/cytology; Germ Layers/cytology; Germ Layers/embryology; Human Embryonic Stem Cells/cytology; Transcription Factors/genetics; Transcription Factors/metabolism; Pluripotent Stem Cells/cytology
7.
Nature ; 608(7921): 98-107, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35794474

ABSTRACT

DNA is naturally well suited to serve as a digital medium for in vivo molecular recording. However, contemporary DNA-based memory devices are constrained in terms of the number of distinct 'symbols' that can be concurrently recorded and/or by a failure to capture the order in which events occur. Here we describe DNA Typewriter, a general system for in vivo molecular recording that overcomes these and other limitations. For DNA Typewriter, the blank recording medium ('DNA Tape') consists of a tandem array of partial CRISPR-Cas9 target sites, with all but the first site truncated at their 5' ends and therefore inactive. Short insertional edits serve as symbols that record the identity of the prime editing guide RNA mediating the edit while also shifting the position of the 'type guide' by one unit along the DNA Tape, that is, sequential genome editing. In this proof of concept of DNA Typewriter, we demonstrate recording and decoding of thousands of symbols, complex event histories and short text messages; evaluate the performance of dozens of orthogonal tapes; and construct 'long tape' potentially capable of recording as many as 20 serial events. Finally, we leverage DNA Typewriter in conjunction with single-cell RNA-seq to reconstruct a monophyletic lineage of 3,257 cells and find that the Poisson-like accumulation of sequential edits to multicopy DNA tape can be maintained across at least 20 generations and 25 days of in vitro clonal expansion.


Subject(s)
DNA; Gene Editing; Genome; CRISPR-Cas Systems/genetics; DNA/genetics; Gene Editing/methods; Genome/genetics; RNA, Guide, Kinetoplastida/genetics; RNA-Seq; Single-Cell Analysis; Time Factors
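
Illustrative note on record 7: the abstract describes a tape on which only one site is writable at a time, and each edit both stores a symbol and moves the writable position forward by one unit. The following is a minimal toy sketch of that bookkeeping, not the molecular mechanism or the authors' software. The class name `DnaTape`, its methods, and the example symbols are hypothetical.

```python
class DnaTape:
    """Toy model of a sequentially written tape (hypothetical illustration)."""

    def __init__(self, n_sites):
        self.sites = [None] * n_sites  # None marks an unwritten site
        self.active = 0                # index of the only writable site

    def record(self, symbol):
        """Write a symbol at the active site and shift the active site by one."""
        if self.active >= len(self.sites):
            return False  # tape is full, so the event is not recorded
        self.sites[self.active] = symbol
        self.active += 1
        return True

    def decode(self):
        """Read the recorded symbols back in the order they were written."""
        return [s for s in self.sites if s is not None]


tape = DnaTape(n_sites=5)
for event in ["A", "C", "B"]:
    tape.record(event)
print(tape.decode())
```

Because writing is only possible at the active site, the position of a symbol on the tape encodes when it was recorded relative to the others.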
8.
Mol Cell ; 71(5): 858-871.e8, 2018 Sep 06.
Article in English | MEDLINE | ID: mdl-30078726

ABSTRACT

Linking regulatory DNA elements to their target genes, which may be located hundreds of kilobases away, remains challenging. Here, we introduce Cicero, an algorithm that identifies co-accessible pairs of DNA elements using single-cell chromatin accessibility data and so connects regulatory elements to their putative target genes. We apply Cicero to investigate how dynamically accessible elements orchestrate gene regulation in differentiating myoblasts. Groups of Cicero-linked regulatory elements meet criteria of "chromatin hubs": they are enriched for physical proximity, interact with a common set of transcription factors, and undergo coordinated changes in histone marks that are predictive of changes in gene expression. Pseudotemporal analysis revealed that most DNA elements remain in chromatin hubs throughout differentiation. A subset of elements bound by MYOD1 in myoblasts exhibit early opening in a PBX1- and MEIS1-dependent manner. Our strategy can be applied to dissect the architecture, sequence determinants, and mechanisms of cis-regulation on a genome-wide scale.


Subject(s)
Chromatin Assembly and Disassembly/genetics; Chromatin/genetics; DNA/genetics; Enhancer Elements, Genetic/genetics; Gene Expression Regulation/genetics; Adolescent; Cell Differentiation/genetics; Female; Genes, Homeobox/genetics; Histones/genetics; Humans; Myoblasts/physiology; Transcription Factors/genetics
9.
Genome Res ; 31(10): 1952-1969, 2021 Oct.
Article in English | MEDLINE | ID: mdl-33888511

ABSTRACT

Recently developed single-cell technologies allow researchers to characterize cell states at ever greater resolution and scale. Caenorhabditis elegans is a particularly tractable system for studying development, and recent single-cell RNA-seq studies characterized the gene expression patterns for nearly every cell type in the embryo and at the second larval stage (L2). Gene expression patterns give insight about gene function and into the biochemical state of different cell types; recent advances in other single-cell genomics technologies can now also characterize the regulatory context of the genome that gives rise to these gene expression levels at a single-cell resolution. To explore the regulatory DNA of individual cell types in C. elegans, we collected single-cell chromatin accessibility data using the sci-ATAC-seq assay in L2 larvae to match the available single-cell RNA-seq data set. By using a novel implementation of the latent Dirichlet allocation algorithm, we identify 37 clusters of cells that correspond to different cell types in the worm, providing new maps of putative cell type-specific gene regulatory sites, with promise for better understanding of cellular differentiation and gene regulation.


Subject(s)
Caenorhabditis elegans; Chromatin; Animals; Caenorhabditis elegans/genetics; Chromatin/genetics; Chromatin Immunoprecipitation Sequencing; DNA/genetics; Gene Expression Regulation
10.
Mol Syst Biol ; 19(6): e11517, 2023 Jun 12.
Article in English | MEDLINE | ID: mdl-37154091

ABSTRACT

Recent advances in multiplexed single-cell transcriptomics experiments facilitate the high-throughput study of drug and genetic perturbations. However, an exhaustive exploration of the combinatorial perturbation space is experimentally unfeasible. Therefore, computational methods are needed to predict, interpret, and prioritize perturbations. Here, we present the compositional perturbation autoencoder (CPA), which combines the interpretability of linear models with the flexibility of deep-learning approaches for single-cell response modeling. CPA learns to in silico predict transcriptional perturbation response at the single-cell level for unseen dosages, cell types, time points, and species. Using newly generated single-cell drug combination data, we validate that CPA can predict unseen drug combinations while outperforming baseline models. Additionally, the architecture's modularity enables incorporating the chemical representation of the drugs, allowing the prediction of cellular response to completely unseen drugs. Furthermore, CPA is also applicable to genetic combinatorial screens. We demonstrate this by imputing in silico 5,329 missing combinations (97.6% of all possibilities) in a single-cell Perturb-seq experiment with diverse genetic interactions. We envision CPA will facilitate efficient experimental design and hypothesis generation by enabling in silico response prediction at the single-cell level and thus accelerate therapeutic applications using single-cell technologies.


Subject(s)
Computational Biology; Gene Expression Profiling; High-Throughput Screening Assays; Single-Cell Gene Expression Analysis
12.
Nature ; 562(7726): 217-222, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30209399

ABSTRACT

Variants of uncertain significance fundamentally limit the clinical utility of genetic information. The challenge they pose is epitomized by BRCA1, a tumour suppressor gene in which germline loss-of-function variants predispose women to breast and ovarian cancer. Although BRCA1 has been sequenced in millions of women, the risk associated with most newly observed variants cannot be definitively assigned. Here we use saturation genome editing to assay 96.5% of all possible single-nucleotide variants (SNVs) in 13 exons that encode functionally critical domains of BRCA1. Functional effects for nearly 4,000 SNVs are bimodally distributed and almost perfectly concordant with established assessments of pathogenicity. Over 400 non-functional missense SNVs are identified, as well as around 300 SNVs that disrupt expression. We predict that these results will be immediately useful for the clinical interpretation of BRCA1 variants, and that this approach can be extended to overcome the challenge of variants of uncertain significance in additional clinically actionable genes.


Subject(s)
BRCA1 Protein/genetics; Gene Editing; Genetic Predisposition to Disease/classification; Genetic Variation/genetics; Genome, Human/genetics; Hereditary Breast and Ovarian Cancer Syndrome/genetics; Cell Line; Exons/genetics; Female; Genes, Essential/genetics; Humans; Loss of Function Mutation/genetics; Models, Molecular; Prognosis; RNA, Messenger/genetics; RNA, Messenger/metabolism; Recombinational DNA Repair/genetics
13.
Nature ; 555(7697): 538-542, 2018 Mar 22.
Article in English | MEDLINE | ID: mdl-29539636

ABSTRACT

Understanding how gene regulatory networks control the progressive restriction of cell fates is a long-standing challenge. Recent advances in measuring gene expression in single cells are providing new insights into lineage commitment. However, the regulatory events underlying these changes remain unclear. Here we investigate the dynamics of chromatin regulatory landscapes during embryogenesis at single-cell resolution. Using single-cell combinatorial indexing assay for transposase accessible chromatin with sequencing (sci-ATAC-seq), we profiled chromatin accessibility in over 20,000 single nuclei from fixed Drosophila melanogaster embryos spanning three landmark embryonic stages: 2-4 h after egg laying (predominantly stage 5 blastoderm nuclei), when each embryo comprises around 6,000 multipotent cells; 6-8 h after egg laying (predominantly stage 10-11), to capture a midpoint in embryonic development when major lineages in the mesoderm and ectoderm are specified; and 10-12 h after egg laying (predominantly stage 13), when each of the embryo's more than 20,000 cells is undergoing terminal differentiation. Our results show that there is spatial heterogeneity in the accessibility of the regulatory genome before gastrulation, a feature that aligns with future cell fate, and that nuclei can be temporally ordered along developmental trajectories. During mid-embryogenesis, tissue granularity emerges such that individual cell types can be inferred by their chromatin accessibility while maintaining a signature of their germ layer of origin. Analysis of the data reveals overlapping usage of regulatory elements between cells of the endoderm and non-myogenic mesoderm, suggesting a common developmental program that is reminiscent of the mesendoderm lineage in other species. We identify 30,075 distal regulatory elements that exhibit tissue-specific accessibility. We validated the germ-layer specificity of a subset of these predicted enhancers in transgenic embryos, achieving an accuracy of 90%. Overall, our results demonstrate the power of shotgun single-cell profiling of embryos to resolve dynamic changes in the chromatin landscape during development, and to uncover the cis-regulatory programs of metazoan germ layers and cell types.


Subject(s)
Drosophila melanogaster/cytology; Drosophila melanogaster/embryology; Embryonic Development/genetics; Gene Expression Regulation, Developmental; Single-Cell Analysis; Animals; Cell Differentiation/genetics; Cell Lineage/genetics; Chromatin/genetics; Chromatin/metabolism; Drosophila melanogaster/genetics; Endoderm/cytology; Endoderm/metabolism; Enhancer Elements, Genetic/genetics; Female; Gastrulation/genetics; Genome, Insect/genetics; Male; Mesoderm/cytology; Mesoderm/metabolism; Organ Specificity/genetics; Organisms, Genetically Modified/cytology; Organisms, Genetically Modified/genetics; Reproducibility of Results
14.
BMC Genomics ; 24(1): 737, 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38049719

ABSTRACT

Single-cell chromatin accessibility has emerged as a powerful means of understanding the epigenetic landscape of diverse tissues and cell types, but profiling cells from many independent specimens is challenging and costly. Here we describe a novel approach, sciPlex-ATAC-seq, which uses unmodified DNA oligos as sample-specific nuclear labels, enabling the concurrent profiling of chromatin accessibility within single nuclei from virtually unlimited specimens or experimental conditions. We first demonstrate our method with a chemical epigenomics screen, in which we identify drug-altered distal regulatory sites predictive of compound- and dose-dependent effects on transcription. We then analyze cell type-specific chromatin changes in PBMCs from multiple donors responding to synthetic and allogeneic immune stimulation. We quantify stimulation-altered immune cell compositions and isolate the unique effects of allogeneic stimulation on chromatin accessibility specific to T-lymphocytes. Finally, we observe that impaired global chromatin decondensation often coincides with chemical inhibition of allogeneic T-cell activation.


Subject(s)
Chromatin; DNA; Chromatin/genetics; DNA/genetics; Chromatin Immunoprecipitation Sequencing; Sequence Analysis, DNA/methods; Epigenomics/methods
15.
PLoS Genet ; 12(7): e1006162, 2016 Jul.
Article in English | MEDLINE | ID: mdl-27428049

ABSTRACT

Malignant tumors shed DNA into the circulation. The transient half-life of circulating tumor DNA (ctDNA) may afford the opportunity to diagnose, monitor recurrence, and evaluate response to therapy solely through a non-invasive blood draw. However, detecting ctDNA against the normally occurring background of cell-free DNA derived from healthy cells has proven challenging, particularly in non-metastatic solid tumors. In this study, distinct differences in fragment length size between ctDNAs and normal cell-free DNA are defined. Human ctDNA in rat plasma derived from human glioblastoma multiforme stem-like cells in the rat brain and human hepatocellular carcinoma in the rat flank were found to have a shorter principal fragment length than the background rat cell-free DNA (134-144 bp vs. 167 bp, respectively). Subsequently, a similar shift in the fragment length of ctDNA in humans with melanoma and lung cancer was identified compared to healthy controls. Comparison of fragment lengths from cell-free DNA between a melanoma patient and healthy controls found that the BRAF V600E mutant allele occurred more commonly at a shorter fragment length than the fragment length of the wild-type allele (132-145 bp vs. 165 bp, respectively). Moreover, size-selecting for shorter cell-free DNA fragment lengths substantially increased the EGFR T790M mutant allele frequency in human lung cancer. These findings provide compelling evidence that experimental or bioinformatic isolation of a specific subset of fragment lengths from cell-free DNA may improve detection of ctDNA.


Subject(s)
DNA, Neoplasm/blood; DNA, Neoplasm/genetics; Alleles; Animals; Biomarkers, Tumor/blood; Carcinoma, Hepatocellular/genetics; Carcinoma, Hepatocellular/metabolism; Cell Line, Tumor; Glioblastoma/blood; Glioblastoma/genetics; Hep G2 Cells; Humans; Liver Neoplasms/genetics; Liver Neoplasms/metabolism; Lung Neoplasms/genetics; Lung Neoplasms/metabolism; Magnetic Resonance Imaging; Male; Melanoma/genetics; Melanoma/metabolism; Mutation; Neoplasm Transplantation; Proto-Oncogene Proteins B-raf/genetics; Rats
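
Illustrative note on record 15: the abstract reports that selecting shorter cell-free DNA fragments raised the observed mutant allele frequency. The arithmetic behind that effect can be sketched as below. The fragment list, the 130-150 bp window, and the function names are all hypothetical and chosen only to make the example small; they are not data or thresholds from the study.

```python
# Hypothetical fragments: (length in bp, carries the mutant allele?)
fragments = [
    (134, True), (138, True), (140, True), (144, False), (150, False),
    (165, False), (166, False), (167, False), (167, False), (170, False),
]


def mutant_allele_fraction(frags):
    """Fraction of fragments that carry the mutant allele."""
    if not frags:
        return 0.0
    return sum(1 for _, is_mutant in frags if is_mutant) / len(frags)


def size_select(frags, min_bp, max_bp):
    """Keep only fragments whose length lies in [min_bp, max_bp]."""
    return [f for f in frags if min_bp <= f[0] <= max_bp]


before = mutant_allele_fraction(fragments)
after = mutant_allele_fraction(size_select(fragments, 130, 150))
print(f"all fragments: {before:.2f}, size-selected: {after:.2f}")
```

In this toy data set the mutant fragments are concentrated at shorter lengths, so discarding the longer fragments removes mostly wild-type background and the mutant fraction rises.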
16.
N Engl J Med ; 372(17): 1639-45, 2015 Apr 23.
Article in English | MEDLINE | ID: mdl-25830323

ABSTRACT

Investigations of noninvasive prenatal screening for aneuploidy by analysis of circulating cell-free DNA (cfDNA) have shown high sensitivity and specificity in both high-risk and low-risk cohorts. However, the overall low incidence of aneuploidy limits the positive predictive value of these tests. Currently, the causes of false positive results are poorly understood. We investigated four pregnancies with discordant prenatal test results and found in two cases that maternal duplications on chromosome 18 were the likely cause of the discordant results. Modeling based on population-level copy-number variation supports the possibility that some false positive results of noninvasive prenatal screening may be attributable to large maternal copy-number variants. (Funded by the National Institutes of Health and others.)


Subject(s)
Aneuploidy; Chromosome Disorders/diagnosis; DNA Copy Number Variations; DNA/blood; False Positive Reactions; Prenatal Diagnosis; Adult; Chromosomes, Human, Pair 13; Chromosomes, Human, Pair 18; Chromosomes, Human, Pair 21; DNA/analysis; Female; Humans; Models, Statistical; Pregnancy
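
Illustrative note on record 16: the abstract states that low incidence limits positive predictive value even when sensitivity and specificity are high. This follows from Bayes' rule, and a short worked example makes the dependence on prevalence concrete. The sensitivity, specificity, and prevalence values below are hypothetical and are not taken from the study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Bayes' rule: probability of being affected given a positive test."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical test characteristics, held fixed while prevalence varies.
for prevalence in (0.01, 0.001):
    ppv = positive_predictive_value(0.99, 0.999, prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.2f}")
```

With these assumed values, a tenfold drop in prevalence takes the positive predictive value from roughly 0.91 to just under 0.50, even though the test itself is unchanged.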
17.
Genome Res ; 24(12): 2041-9, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25327137

ABSTRACT

We describe a method that exploits contiguity preserving transposase sequencing (CPT-seq) to facilitate the scaffolding of de novo genome assemblies. CPT-seq is an entirely in vitro means of generating libraries composed of 9216 indexed pools, each of which contains thousands of sparsely sequenced long fragments ranging from 5 kilobases to > 1 megabase. These pools are "subhaploid," in that the lengths of fragments contained in each pool sum to ∼5% to 10% of the full genome. The scaffolding approach described here, termed fragScaff, leverages coincidences between the content of different pools as a source of contiguity information. Specifically, CPT-seq data is mapped to a de novo genome assembly, followed by the identification of pairs of contigs or scaffolds whose ends disproportionately co-occur in the same indexed pools, consistent with true adjacency in the genome. Such candidate "joins" are used to construct a graph, which is then resolved by a minimum spanning tree. As a proof-of-concept, we apply CPT-seq and fragScaff to substantially boost the contiguity of de novo assemblies of the human, mouse, and fly genomes, increasing the scaffold N50 of de novo assemblies by eight- to 57-fold with high accuracy. We also demonstrate that fragScaff is complementary to Hi-C-based contact probability maps, providing midrange contiguity to support robust, accurate chromosome-scale de novo genome assemblies without the need for laborious in vivo cloning steps. Finally, we demonstrate CPT-seq as a means of anchoring unplaced novel human contigs to the reference genome as well as for detecting misassembled sequences.


Subject(s)
High-Throughput Nucleotide Sequencing/methods; Transposases/metabolism; Animals; Computational Biology/methods; Gene Library; Genomics/methods; Humans; Mice; Software
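
Illustrative note on record 17: the abstract outlines three steps: find contig pairs that co-occur in the same indexed pools, build a graph of candidate joins, and resolve it with a minimum spanning tree. The sketch below walks through those steps on toy data. It is not the fragScaff implementation: the Jaccard score, the `MIN_SHARED` threshold, the contig names, and the pool assignments are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Shared pools divided by all pools seen for either contig."""
    return len(a & b) / len(a | b)


def minimum_spanning_tree(nodes, edges):
    """Kruskal's algorithm with union-find; edges are (weight, u, v) tuples."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    tree = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # skip edges that would close a cycle
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree


# Hypothetical data: indexed pools in which each contig was observed.
pools = {
    "contigA": {1, 2, 3, 4},
    "contigB": {2, 3, 4, 5},
    "contigC": {4, 5, 6, 7},
    "contigD": {20, 21, 22},
}

MIN_SHARED = 0.1  # assumed cutoff for calling a candidate join
edges = []
for u, v in combinations(sorted(pools), 2):
    score = jaccard(pools[u], pools[v])
    if score >= MIN_SHARED:
        edges.append((1.0 - score, u, v))  # strong co-occurrence = low weight

joins = minimum_spanning_tree(pools, edges)
print(joins)
```

Here the weak contigA-contigC link is a candidate join but is dropped by the tree because the stronger A-B and B-C links already connect those contigs, and contigD stays unplaced because it shares no pools with the others.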
18.
Hum Genet ; 135(5): 525-540, 2016 May.
Article in English | MEDLINE | ID: mdl-27023906

ABSTRACT

Ehlers-Danlos syndrome (EDS) describes a group of clinical entities in which the connective tissue, primarily that of the skin, joint and vessels, is abnormal, although the resulting clinical manifestations can vary widely between the different historical subtypes. Many cases of hereditary disorders of connective tissue that do not seem to fit these historical subtypes exist. The aim of this study is to describe a large series of patients with inherited connective tissue disorders evaluated by our clinical genetics service and for whom a likely causal variant was identified. In addition to clinical phenotyping, patients underwent various genetic tests including molecular karyotyping, candidate gene analysis, autozygome analysis, and whole-exome and whole-genome sequencing as appropriate. We describe a cohort of 69 individuals representing 40 families, all referred because of suspicion of an inherited connective tissue disorder by their primary physician. Molecular lesions included variants in the previously published disease genes B3GALT6, GORAB, ZNF469, B3GAT3, ALDH18A1, FKBP14, PYCR1, CHST14 and SPARC with interesting variations on the published clinical phenotypes. We also describe the first recessive EDS-like condition to be caused by a recessive COL1A1 variant. In addition, exome capture in a familial case identified a homozygous truncating variant in a novel and compelling candidate gene, AEBP1. Finally, we also describe a distinct novel clinical syndrome of cutis laxa and marked facial features and propose ATP6V1E1 and ATP6V0D2 (two subunits of vacuolar ATPase) as likely candidate genes based on whole-genome and whole-exome sequencing of the two families with this new clinical entity. Our study expands the clinical spectrum of hereditary disorders of connective tissue and adds three novel candidate genes including two that are associated with a highly distinct syndrome.


Subject(s)
Connective Tissue Diseases/genetics; Genetic Heterogeneity; Genetic Markers/genetics; Skin Abnormalities/genetics; Amino Acid Sequence; Cohort Studies; Connective Tissue Diseases/pathology; Exome/genetics; Female; High-Throughput Nucleotide Sequencing; Humans; Male; Molecular Sequence Data; Pedigree; Phenotype; Sequence Homology, Amino Acid
19.
Genet Med ; 18(7): 686-95, 2016 Jul.
Article in English | MEDLINE | ID: mdl-26633546

ABSTRACT

PURPOSE: Dysmorphology syndromes are among the most common referrals to clinical genetics specialists. Inability to match the dysmorphology pattern to a known syndrome can pose a major diagnostic challenge. With an aim to accelerate the establishment of new syndromes and their genetic etiology, we describe our experience with multiplex consanguineous families that appeared to represent novel autosomal recessive dysmorphology syndromes at the time of evaluation.

METHODS: Combined autozygome/exome analysis of multiplex consanguineous families with apparently novel dysmorphology syndromes.

RESULTS: Consistent with the apparent novelty of the phenotypes, our analysis revealed a strong candidate variant in genes that were novel at the time of the analysis in the majority of cases, and 10 of these genes are published here for the first time as novel candidates (CDK9, NEK9, ZNF668, TTC28, MBL2, CADPS, CACNA1H, HYAL2, CTU2, and C3ORF17). A significant minority of the phenotypes (6/31, 19%), however, were caused by genes known to cause Mendelian phenotypes, thus expanding the phenotypic spectrum of the diseases linked to these genes. The conspicuous inheritance pattern and the highly specific phenotypes appear to have contributed to the high yield (90%) of plausible molecular diagnoses in our study cohort.

CONCLUSION: Reporting detailed clinical and genomic analysis of a large series of apparently novel dysmorphology syndromes will likely lead to a trend to accelerate the establishment of novel syndromes and their underlying genes through open exchange of data for the benefit of patients, their families, health-care providers, and the research community.


Subject(s)
Abnormalities, Multiple/diagnosis; Exome/genetics; Genomics; Hypoglycemia/diagnosis; Microcephaly/diagnosis; Abnormalities, Multiple/genetics; Abnormalities, Multiple/physiopathology; Consanguinity; Disorders of Sex Development/diagnosis; Disorders of Sex Development/genetics; Disorders of Sex Development/physiopathology; Female; High-Throughput Nucleotide Sequencing; Humans; Hypoglycemia/genetics; Hypoglycemia/physiopathology; Male; Microcephaly/genetics; Microcephaly/physiopathology; Mutation; Pedigree; Phenotype; Sequence Analysis, DNA/methods
20.
Proc Natl Acad Sci U S A ; 108(4): 1513-8, 2011 Jan 25.
Article in English | MEDLINE | ID: mdl-21187386

ABSTRACT

Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (~100-base) sequence reads at very low cost. Whereas such data can be readily used for a wide range of biomedical applications, it has proven difficult to use them to generate high-quality de novo genome assemblies of large, repeat-rich vertebrate genomes. To date, the genome assemblies generated from such data have fallen far short of those obtained with the older (but much more expensive) capillary-based sequencing approach. Here, we report the development of an algorithm for genome assembly, ALLPATHS-LG, and its application to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft genome assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome. In particular, the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size = 11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with capillary-based sequencing. The combination of improved sequencing technology and improved computational methods should now make it possible to increase dramatically the de novo sequencing of large genomes. The ALLPATHS-LG program is available at http://www.broadinstitute.org/science/programs/genome-biology/crd.


Subject(s)
Algorithms; Genomics/methods; Sequence Analysis, DNA/methods; Software; Animals; Genome/genetics; Humans; Internet; Mice; Reproducibility of Results
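
Illustrative note on record 20: the abstract reports scaffold sizes as N50 values. N50 is the length L such that sequences of length L or longer together cover at least half of the total assembled length. A minimal sketch of that calculation follows. The function name `n50` and the example lengths are hypothetical, and this is not code from ALLPATHS-LG.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths, in arbitrary units.
example = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example))
```

For the example list the total is 54, and the three longest scaffolds (10, 9, 8) sum to 27, which is exactly half, so the N50 is 8.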