1.
Nature ; 526(7571): 75-81, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-26432246

ABSTRACT

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.


Subject(s)
Genetic Variation/genetics , Human Genome/genetics , Physical Chromosome Mapping , Amino Acid Sequence , Genetic Predisposition to Disease , Medical Genetics , Population Genetics , Genome-Wide Association Study , Genomics , Genotype , Haplotypes/genetics , Homozygote , Humans , Molecular Sequence Data , Mutation Rate , Single Nucleotide Polymorphism/genetics , Quantitative Trait Loci/genetics , DNA Sequence Analysis , Sequence Deletion/genetics
2.
Bioinformatics ; 34(13): i115-i123, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29949971

ABSTRACT

Motivation: Current sequencing technologies are able to produce reads orders of magnitude longer than ever possible before. Such long reads have sparked a new interest in de novo genome assembly, which removes reference biases inherent to re-sequencing approaches and allows for a direct characterization of complex genomic variants. However, even with the latest algorithmic advances, assembling a mammalian genome from long error-prone reads incurs a significant computational burden and does not preclude occasional misassemblies. Both problems could potentially be mitigated if assembly could commence for each chromosome separately. Results: To address this, we show how single-cell template strand sequencing (Strand-seq) data can be leveraged for this purpose. We introduce a novel latent variable model and a corresponding Expectation Maximization algorithm, termed SaaRclust, and demonstrate its ability to reliably cluster long reads by chromosome. For each long read, this approach produces a posterior probability distribution over all chromosomes of origin and read directionalities. In this way, it allows us to assess the amount of uncertainty inherent to sparse Strand-seq data on the level of individual reads. Among the reads that our algorithm confidently assigns to a chromosome, we observed more than 99% correct assignments on a subset of Pacific Biosciences reads with 30.1× coverage. To our knowledge, SaaRclust is the first approach for the in silico separation of long reads by chromosome prior to assembly. Availability and implementation: https://github.com/daewoooo/SaaRclust.


Subject(s)
Human Chromosomes , Computer Simulation , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Software , Algorithms , Female , Human Genome , Humans , DNA Sequence Analysis/methods
3.
BMC Genomics ; 17: 342, 2016 05 10.
Article in English | MEDLINE | ID: mdl-27161561

ABSTRACT

BACKGROUND: While active LINE-1 (L1) elements possess the ability to mobilize flanking sequences to different genomic loci through a process termed transduction, influencing genomic content and structure, an approach for detecting polymorphic germline non-reference transductions in massively parallel sequencing data has been lacking. RESULTS: Here we present the computational approach TIGER (Transduction Inference in GERmline genomes), enabling the discovery of non-reference L1-mediated transductions by combining L1 discovery with detection of unique insertion sequences and detailed characterization of insertion sites. We employed TIGER to characterize polymorphic transductions in fifteen genomes from non-human primate species (chimpanzee, orangutan and rhesus macaque), as well as in a human genome. We achieved high accuracy as confirmed by PCR and two single-molecule DNA sequencing techniques, and uncovered differences in relative rates of transduction between primate species. CONCLUSIONS: By enabling detection of polymorphic transductions, TIGER makes this form of relevant structural variation amenable for population and personal genome analysis.


Subject(s)
Germ Cells/metabolism , High-Throughput Nucleotide Sequencing , Long Interspersed Nucleotide Elements , Genetic Transduction , Animals , Base Sequence , Computational Biology/methods , Genome , Humans , Macaca mulatta/genetics , Pan troglodytes/genetics
4.
Nat Biotechnol ; 38(3): 343-354, 2020 03.
Article in English | MEDLINE | ID: mdl-31873213

ABSTRACT

Structural variation (SV), involving deletions, duplications, inversions and translocations of DNA segments, is a major source of genetic variability in somatic cells and can dysregulate cancer-related pathways. However, discovering somatic SVs in single cells has been challenging, with copy-number-neutral and complex variants typically escaping detection. Here we describe single-cell tri-channel processing (scTRIP), a computational framework that integrates read depth, template strand and haplotype phase to comprehensively discover SVs in individual cells. We surveyed SV landscapes of 565 single cells, including transformed epithelial cells and patient-derived leukemic samples, to discover abundant SV classes, including inversions, translocations and complex DNA rearrangements. Analysis of the leukemic samples revealed four times more somatic SVs than cytogenetic karyotyping, submicroscopic copy-number alterations, oncogenic copy-neutral rearrangements and a subclonal chromothripsis event. Advancing current methods, single-cell tri-channel processing can directly measure SV mutational processes in individual cells, such as breakage-fusion-bridge cycles, facilitating studies of clonal evolution, genetic mosaicism and SV formation mechanisms, which could improve disease classification for precision medicine.


Subject(s)
Computational Biology/methods , Genomic Structural Variation , Leukemia/genetics , Single-Cell Analysis/methods , Cell Line , Chromothripsis , Clonal Evolution , Gene Rearrangement , Humans , INDEL Mutation , Sequence Inversion , Genetic Translocation
5.
Nat Genet ; 51(8): 1272-1282, 2019 08.
Article in English | MEDLINE | ID: mdl-31308546

ABSTRACT

Chromatin topology is intricately linked to gene expression, yet its functional requirement remains unclear. Here, we comprehensively assessed the interplay between genome topology and gene expression using highly rearranged chromosomes (balancers) spanning ~75% of the Drosophila genome. Using transheterozygote (balancer/wild-type) embryos, we measured allele-specific changes in topology and gene expression in cis, while minimizing trans effects. Through genome sequencing, we resolved eight large nested inversions, smaller inversions, duplications and thousands of deletions. These extensive rearrangements caused many changes to chromatin topology, disrupting long-range loops, topologically associating domains (TADs) and promoter interactions, yet these are not predictive of changes in expression. Gene expression is generally not altered around inversion breakpoints, indicating that mis-appropriate enhancer-promoter activation is a rare event. Similarly, shuffling or fusing TADs, changing intra-TAD connections and disrupting long-range inter-TAD loops does not alter expression for the majority of genes. Our results suggest that properties other than chromatin topology ensure productive enhancer-promoter interactions.


Subject(s)
Chromatin/genetics , Insect Chromosomes/genetics , Drosophila melanogaster/genetics , Genetic Enhancer Elements , Gene Expression Regulation , Gene Rearrangement , Insect Genome , Animals , Chromosome Mapping , Female , Male , Genetic Promoter Regions