Results 1 - 20 of 56
1.
Cell ; 178(3): 624-639.e19, 2019 07 25.
Article in English | MEDLINE | ID: mdl-31348889

ABSTRACT

Recent breakthroughs with synthetic budding yeast chromosomes expedite the creation of synthetic mammalian chromosomes and genomes. Mammals, unlike budding yeast, depend on the histone H3 variant, CENP-A, to epigenetically specify the location of the centromere, the locus essential for chromosome segregation. Prior human artificial chromosomes (HACs) required large arrays of centromeric α-satellite repeats harboring binding sites for the DNA sequence-specific binding protein, CENP-B. We report the development of a type of HAC that functions independently of these constraints. Formed by an initial CENP-A nucleosome seeding strategy, a construct lacking repetitive centromeric DNA formed several self-sufficient HACs that showed no uptake of genomic DNA. In contrast to traditional α-satellite HAC formation, the non-repetitive construct can form functional HACs without CENP-B or initial CENP-A nucleosome seeding, revealing distinct paths to centromere formation for different DNA sequence types. Our developments streamline the construction and characterization of HACs to facilitate mammalian synthetic genome efforts.


Subject(s)
Centromere/metabolism , Human Artificial Chromosomes/metabolism , Satellite DNA/metabolism , Binding Sites , Tumor Cell Line , Centromere/genetics , Centromere Protein A/genetics , Centromere Protein A/metabolism , Centromere Protein B/deficiency , Centromere Protein B/genetics , Centromere Protein B/metabolism , Genetic Epigenesis , Humans , Nucleosomes/chemistry , Nucleosomes/metabolism , Plasmids/genetics , Plasmids/metabolism
2.
Nature ; 630(8016): 401-411, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38811727

ABSTRACT

Apes possess two sex chromosomes: the male-specific Y chromosome and the X chromosome, which is present in both males and females. The Y chromosome is crucial for male reproduction, with deletions being linked to infertility [1]. The X chromosome is vital for reproduction and cognition [2]. Variation in mating patterns and brain function among apes suggests corresponding differences in their sex chromosomes. However, owing to their repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the methodology developed for the telomere-to-telomere (T2T) human genome, we produced gapless assemblies of the X and Y chromosomes for five great apes (bonobo (Pan paniscus), chimpanzee (Pan troglodytes), western lowland gorilla (Gorilla gorilla gorilla), Bornean orangutan (Pongo pygmaeus) and Sumatran orangutan (Pongo abelii)) and a lesser ape (the siamang gibbon (Symphalangus syndactylus)), and untangled the intricacies of their evolution. Compared with the X chromosomes, the ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements, owing to the accumulation of lineage-specific ampliconic regions, palindromes, transposable elements and satellites. Many Y chromosome genes expand in multi-copy families and some evolve under purifying selection. Thus, the Y chromosome exhibits dynamic evolution, whereas the X chromosome is more stable. Mapping short-read sequencing data to these assemblies revealed diversity and selection patterns on sex chromosomes of more than 100 individual great apes. These reference assemblies are expected to inform human evolution and conservation genetics of non-human apes, all of which are endangered species.


Subject(s)
Hominidae , X Chromosome , Y Chromosome , Animals , Female , Male , Gorilla gorilla/genetics , Hominidae/genetics , Hominidae/classification , Hylobatidae/genetics , Pan paniscus/genetics , Pan troglodytes/genetics , Phylogeny , Pongo abelii/genetics , Pongo pygmaeus/genetics , Telomere/genetics , X Chromosome/genetics , Y Chromosome/genetics , Molecular Evolution , DNA Copy Number Variations/genetics , Humans , Endangered Species , Reference Standards
3.
Annu Rev Genet ; 55: 583-602, 2021 11 23.
Article in English | MEDLINE | ID: mdl-34813350

ABSTRACT

We are entering a new era in genomics where entire centromeric regions are accurately represented in human reference assemblies. Access to these high-resolution maps will enable new surveys of sequence and epigenetic variation in the population and offer new insight into satellite array genomics and centromere function. Here, we focus on the sequence organization and evolution of alpha satellites, which are credited as the genetic and genomic definition of human centromeres due to their interaction with inner kinetochore proteins and their importance in the development of human artificial chromosome assays. We provide an overview of alpha satellite repeat structure and array organization in the context of these high-quality reference data sets; discuss the emergence of variation-based surveys; and provide perspective on the role of this new source of genetic and epigenetic variation in the context of chromosome biology, genome instability, and human disease.


Subject(s)
Centromere , Genome , Centromere/genetics , Genomic Instability/genetics , Genomics , Humans
4.
Nat Rev Genet ; 24(7): 464-483, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37059810

ABSTRACT

Genetic variant calling from DNA sequencing has enabled understanding of germline variation in hundreds of thousands of humans. Sequencing technologies and variant-calling methods have advanced rapidly, routinely providing reliable variant calls in most of the human genome. We describe how advances in long reads, deep learning, de novo assembly and pangenomes have expanded access to variant calls in increasingly challenging, repetitive genomic regions, including medically relevant regions, and how new benchmark sets and benchmarking methods illuminate their strengths and limitations. Finally, we explore the possible future of more complete characterization of human genome variation in light of the recent completion of a telomere-to-telomere human genome reference assembly and human pangenomes, and we consider the innovations needed to benchmark their newly accessible repetitive regions and complex variants.


Subject(s)
Benchmarking , Human Genome , Humans , Genomics , DNA Sequence Analysis , High-Throughput Nucleotide Sequencing
5.
Nature ; 604(7906): 437-446, 2022 04.
Article in English | MEDLINE | ID: mdl-35444317

ABSTRACT

The human reference genome is the most widely used resource in human genetics and is due for a major update. Its current structure is a linear composite of merged haplotypes from more than 20 people, with a single individual comprising most of the sequence. It contains biases and errors within a framework that does not represent global human genomic variation. A high-quality reference with global representation of common variants, including single-nucleotide variants, structural variants and functional elements, is needed. The Human Pangenome Reference Consortium aims to create a more sophisticated and complete human reference genome with a graph-based, telomere-to-telomere representation of global genomic diversity. Here we leverage innovations in technology, study design and global partnerships with the goal of constructing the highest-possible quality human pangenome reference. Our goal is to improve data representation and streamline analyses to enable routine assembly of complete diploid genomes. With attention to ethical frameworks, the human pangenome reference will contain a more accurate and diverse representation of global genomic variation, improve gene-disease association studies across populations, expand the scope of genomics research to the most repetitive and polymorphic regions of the genome, and serve as the ultimate genetic resource for future biomedical research and precision medicine.


Subject(s)
Human Genome , Genomics , Human Genome/genetics , Haplotypes/genetics , High-Throughput Nucleotide Sequencing , Humans , DNA Sequence Analysis
6.
Article in English | MEDLINE | ID: mdl-38663087

ABSTRACT

The Human Genome Project was an enormous accomplishment, providing a foundation for countless explorations into the genetics and genomics of the human species. Yet for many years, the human genome reference sequence remained incomplete and lacked representation of human genetic diversity. Recently, two major advances have emerged to address these shortcomings: complete gap-free human genome sequences, such as the one developed by the Telomere-to-Telomere Consortium, and high-quality pangenomes, such as the one developed by the Human Pangenome Reference Consortium. Facilitated by advances in long-read DNA sequencing and genome assembly algorithms, complete human genome sequences resolve regions that have been historically difficult to sequence, including centromeres, telomeres, and segmental duplications. In parallel, pangenomes capture the extensive genetic diversity across populations worldwide. Together, these advances usher in a new era of genomics research, enhancing the accuracy of genomic analysis, paving the path for precision medicine, and contributing to deeper insights into human biology.

7.
Genome Res ; 34(3): 454-468, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38627094

ABSTRACT

Reference-free genome phasing is vital for understanding allele inheritance and the impact of single-molecule DNA variation on phenotypes. To achieve thorough phasing across homozygous or repetitive regions of the genome, long-read sequencing technologies are often used to perform phased de novo assembly. As a step toward reducing the cost and complexity of this type of analysis, we describe new methods for accurately phasing Oxford Nanopore Technologies (ONT) sequence data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of ONT PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.


Subject(s)
Nanopores , Humans , DNA Sequence Analysis/methods , Nanopore Sequencing/methods , High-Throughput Nucleotide Sequencing/methods , Software , Genomics/methods
8.
Nature ; 593(7857): 101-107, 2021 05.
Article in English | MEDLINE | ID: mdl-33828295

ABSTRACT

The complete assembly of each human chromosome is essential for understanding human biology and evolution [1,2]. Here we use complementary long-read sequencing technologies to complete the linear assembly of human chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08-Mb centromeric α-satellite array, a 644-kb copy number polymorphism in the β-defensin gene cluster that is important for disease risk, and an 863-kb variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73-kb hypomethylated region of diverse higher-order α-satellites enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. In addition, we confirm the overall organization and methylation pattern of the centromere in a diploid human genome. Using a dual long-read sequencing approach, we complete high-quality draft assemblies of the orthologous centromere from chromosome 8 in chimpanzee, orangutan and macaque to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved in the great ape ancestor with a layered symmetry, in which more ancient higher-order repeats locate peripherally to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated by more than 2.2-fold compared to the unique portions of the genome, and this acceleration extends into the flanking sequence.


Subject(s)
Human Chromosomes Pair 8/chemistry , Human Chromosomes Pair 8/genetics , Molecular Evolution , Animals , Cell Line , Centromere/chemistry , Centromere/genetics , Centromere/metabolism , Human Chromosomes Pair 8/physiology , DNA Methylation , Satellite DNA/genetics , Genetic Epigenesis , Female , Humans , Macaca mulatta/genetics , Male , Minisatellite Repeats/genetics , Pan troglodytes/genetics , Phylogeny , Pongo abelii/genetics , Telomere/chemistry , Telomere/genetics , Telomere/metabolism
9.
Am J Hum Genet ; 110(11): 1832-1840, 2023 11 02.
Article in English | MEDLINE | ID: mdl-37922882

ABSTRACT

Advances in long-read sequencing and assembly now mean that individual labs can generate phased genomes that are more accurate and more contiguous than the original human reference genome. With declining costs and increasing democratization of technology, we suggest that complete genome assemblies, where both parental haplotypes are phased telomere to telomere, will become standard in human genetics. Soon, even in clinical settings where rigorous sample-handling standards must be met, affected individuals could have reference-grade genomes fully sequenced and assembled in just a few hours given advances in technology, computational processing, and annotation. Complete genetic variant discovery will transform how we map, catalog, and associate variation with human disease and fundamentally change our understanding of the genetic diversity of all humans.


Subject(s)
Genomics , High-Throughput Nucleotide Sequencing , Humans , DNA Sequence Analysis , Human Genome/genetics , Telomere/genetics
10.
Nat Methods ; 20(10): 1483-1492, 2023 10.
Article in English | MEDLINE | ID: mdl-37710018

ABSTRACT

Long-read sequencing technologies substantially overcome the limitations of short-reads but have not been considered as a feasible replacement for population-scale projects, being a combination of too expensive, not scalable enough or too error-prone. Here we develop an efficient and scalable wet lab and computational protocol, Napu, for Oxford Nanopore Technologies long-read sequencing that seeks to address those limitations. We applied our protocol to cell lines and brain tissue samples as part of a pilot project for the National Institutes of Health Center for Alzheimer's and Related Dementias. Using a single PromethION flow cell, we can detect single nucleotide polymorphisms with F1-score comparable to Illumina short-read sequencing. Small indel calling remains difficult within homopolymers and tandem repeats, but achieves good concordance to Illumina indel calls elsewhere. Further, we can discover structural variants with F1-score on par with state-of-the-art de novo assembly methods. Our protocol phases small and structural variants at megabase scales and produces highly accurate, haplotype-specific methylation calls.


Subject(s)
Human Genome , Nanopore Sequencing , Humans , DNA Sequence Analysis/methods , Haplotypes , Methylation , Pilot Projects , High-Throughput Nucleotide Sequencing/methods
11.
Nature ; 585(7823): 79-84, 2020 09.
Article in English | MEDLINE | ID: mdl-32663838

ABSTRACT

After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist [1,2]. Here we present a human genome assembly that surpasses the continuity of GRCh38 [2], along with a gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome [3], we reconstructed the centromeric satellite DNA array (approximately 3.1 Mb) and closed the 29 remaining gaps in the current reference, including new sequences from the human pseudoautosomal regions and from cancer-testis ampliconic gene families (CT-X and GAGE). These sequences will be integrated into future human reference genome releases. In addition, the complete chromosome X, combined with the ultra-long nanopore data, allowed us to map methylation patterns across complex tandem repeats and satellite arrays. Our results demonstrate that finishing the entire human genome is now within reach, and the data presented here will facilitate ongoing efforts to complete the other human chromosomes.


Subject(s)
Human X Chromosomes/genetics , Human Genome/genetics , Telomere/genetics , Centromere/genetics , CpG Islands/genetics , DNA Methylation , Satellite DNA/genetics , Female , Humans , Hydatidiform Mole/genetics , Male , Pregnancy , Reproducibility of Results , Testis/metabolism
12.
Mol Cell ; 70(5): 842-853.e7, 2018 06 07.
Article in English | MEDLINE | ID: mdl-29861157

ABSTRACT

Heterochromatic repetitive satellite RNAs are extensively transcribed in a variety of human cancers, including BRCA1 mutant breast cancer. Aberrant expression of satellite RNAs in cultured cells induces the DNA damage response, activates cell cycle checkpoints, and causes defects in chromosome segregation. However, the mechanism by which satellite RNA expression leads to genomic instability is not well understood. Here we provide evidence that increased levels of satellite RNAs in mammary glands induce tumor formation in mice. Using mass spectrometry, we further show that genomic instability induced by satellite RNAs occurs through interactions with BRCA1-associated protein networks required for the stabilization of DNA replication forks. Additionally, de-stabilized replication forks likely promote the formation of RNA-DNA hybrids in cells expressing satellite RNAs. These studies lay the foundation for developing novel therapeutic strategies that block the effects of non-coding satellite RNAs in cancer cells.


Subject(s)
BRCA1 Protein/genetics , Breast Neoplasms/genetics , Neoplastic Cell Transformation/genetics , DNA Damage , Genomic Instability , Heterochromatin/genetics , Neoplasm RNA/genetics , Satellite RNA/genetics , Animals , BRCA1 Protein/metabolism , Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Cell Proliferation , Neoplastic Cell Transformation/metabolism , Neoplastic Cell Transformation/pathology , Female , Neoplastic Gene Expression Regulation , HEK293 Cells , Heterochromatin/metabolism , Humans , MCF-7 Cells , Mice , Protein Binding , Neoplasm RNA/metabolism , Satellite RNA/metabolism , Tumor Burden
13.
Nucleic Acids Res ; 52(D1): D1082-D1088, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37953330

ABSTRACT

The UCSC Genome Browser (https://genome.ucsc.edu) is a web-based genomic visualization and analysis tool that serves data to over 7,000 distinct users per day worldwide. It provides annotation data on thousands of genome assemblies, ranging from human to SARS-CoV2. This year, we have introduced new data from the Human Pangenome Reference Consortium and on viral genomes including SARS-CoV2. We have added 1,200 new genomes to our GenArk genome system, increasing the overall diversity of our genomic representation. We have added support for nine new user-contributed track hubs to our public hub system. Additionally, we have released 29 new tracks on the human genome and 11 new tracks on the mouse genome. Collectively, these new features expand both the breadth and depth of the genomic knowledge that we share publicly with users worldwide.


Subject(s)
Genetic Databases , Genomics , Viral RNA , Animals , Humans , Mice , Human Genome , Viral Genome , Internet , Molecular Sequence Annotation , Software
14.
Nat Methods ; 19(6): 711-723, 2022 06.
Article in English | MEDLINE | ID: mdl-35396487

ABSTRACT

Studies of genome regulation routinely use high-throughput DNA sequencing approaches to determine where specific proteins interact with DNA, and they rely on DNA amplification and short-read sequencing, limiting their quantitative application in complex genomic regions. To address these limitations, we developed directed methylation with long-read sequencing (DiMeLo-seq), which uses antibody-tethered enzymes to methylate DNA near a target protein's binding sites in situ. These exogenous methylation marks are then detected simultaneously with endogenous CpG methylation on unamplified DNA using long-read, single-molecule sequencing technologies. We optimized and benchmarked DiMeLo-seq by mapping chromatin-binding proteins and histone modifications across the human genome. Furthermore, we identified where centromere protein A localizes within highly repetitive regions that were unmappable with short sequencing reads, and we estimated the density of centromere protein A molecules along single chromatin fibers. DiMeLo-seq is a versatile method that provides multimodal, genome-wide information for investigating protein-DNA interactions.


Subject(s)
DNA Methylation , High-Throughput Nucleotide Sequencing , Centromere Protein A/genetics , Chromatin/genetics , DNA/chemistry , DNA/genetics , Human Genome , High-Throughput Nucleotide Sequencing/methods , Humans , DNA Sequence Analysis/methods
15.
Nat Methods ; 19(6): 687-695, 2022 06.
Article in English | MEDLINE | ID: mdl-35361931

ABSTRACT

Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first telomere-to-telomere human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Although derived from highly accurate sequences, evaluation revealed evidence of small errors and structural misassemblies in the initial draft assembly. To correct these errors, we designed a new repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly quality value from 70.2 to 73.9 measured from PacBio high-fidelity and Illumina k-mers. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both high-fidelity and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.


Subject(s)
High-Throughput Nucleotide Sequencing , Nanopores , Female , Human Genome , High-Throughput Nucleotide Sequencing/methods , Humans , Pregnancy , DNA Sequence Analysis/methods , Telomere/genetics
16.
Semin Cell Dev Biol ; 128: 15-25, 2022 08.
Article in English | MEDLINE | ID: mdl-35644878

ABSTRACT

Satellite DNAs are present on every chromosome in the cell and are typically enriched in repetitive, heterochromatic parts of the human genome. Sex chromosomes represent a unique genomic and epigenetic context. In this review, we first report what is known about satellite DNA biology on human X and Y chromosomes, including repeat content and organization, as well as satellite variation in typical euploid individuals. Then, we review sex chromosome aneuploidies that are among the most common types of aneuploidies in the general population, and are better tolerated than autosomal aneuploidies. This is demonstrated also by the fact that aging is associated with the loss of the X, and especially the Y chromosome. In addition, supernumerary sex chromosomes enable us to study general processes in a cell, such as analyzing heterochromatin dosage (i.e. additional Barr bodies and long heterochromatin arrays on Yq) and their downstream consequences. Finally, genomic and epigenetic organization and regulation of satellite DNA could influence chromosome stability and lead to aneuploidy. In this review, we argue that the complete annotation of satellite DNA on sex chromosomes in human, and especially in centromeric regions, will aid in explaining the prevalence and the consequences of sex chromosome aneuploidies.


Subject(s)
Satellite DNA , Heterochromatin , Aneuploidy , Centromere/genetics , Human Chromosomes , Satellite DNA/genetics , Heterochromatin/genetics , Humans , Sex Chromosomes/genetics
17.
Annu Rev Genomics Hum Genet ; 22: 81-102, 2021 08 31.
Article in English | MEDLINE | ID: mdl-33929893

ABSTRACT

The reference human genome sequence is inarguably the most important and widely used resource in the fields of human genetics and genomics. It has transformed the conduct of biomedical sciences and brought invaluable benefits to the understanding and improvement of human health. However, the commonly used reference sequence has profound limitations, because across much of its span, it represents the sequence of just one human haplotype. This single, monoploid reference structure presents a critical barrier to representing the broad genomic diversity in the human population. In this review, we discuss the modernization of the reference human genome sequence to a more complete reference of human genomic diversity, known as a human pangenome.


Subject(s)
Human Genome , Genomics , Humans
18.
EMBO J ; 39(2): e102924, 2020 01 15.
Article in English | MEDLINE | ID: mdl-31750958

ABSTRACT

Intrinsic genomic features of individual chromosomes can contribute to chromosome-specific aneuploidy. Centromeres are key elements for the maintenance of chromosome segregation fidelity via a specialized chromatin marked by CENP-A wrapped by repetitive DNA. These long stretches of repetitive DNA vary in length among human chromosomes. Using CENP-A genetic inactivation in human cells, we directly interrogate if differences in the centromere length reflect the heterogeneity of centromeric DNA-dependent features and whether this, in turn, affects the genesis of chromosome-specific aneuploidy. Using three distinct approaches, we show that mis-segregation rates vary among different chromosomes under conditions that compromise centromere function. Whole-genome sequencing and centromere mapping combined with cytogenetic analysis, small molecule inhibitors, and genetic manipulation revealed that inter-chromosomal heterogeneity of centromeric features, but not centromere length, influences chromosome segregation fidelity. We conclude that faithful chromosome segregation for most of human chromosomes is biased in favor of centromeres with high abundance of DNA-dependent centromeric components. These inter-chromosomal differences in centromere features can translate into non-random aneuploidy, a hallmark of cancer and genetic diseases.


Subject(s)
Aneuploidy , Centromere Protein A/metabolism , Centromere/metabolism , Chromatin/metabolism , Human Chromosomes/genetics , DNA/metabolism , Cultured Cells , Centromere/genetics , Centromere Protein A/genetics , Chromatin/genetics , Chromosome Segregation , DNA/genetics , Female , Humans , Male
19.
Nat Methods ; 18(11): 1322-1332, 2021 11.
Article in English | MEDLINE | ID: mdl-34725481

ABSTRACT

Long-read sequencing has the potential to transform variant detection by reaching currently difficult-to-map regions and routinely linking together adjacent variations to enable read-based phasing. Third-generation nanopore sequence data have demonstrated a long read length, but current interpretation methods for their novel pore-based signal have unique error profiles, making accurate analysis challenging. Here, we introduce a haplotype-aware variant calling pipeline, PEPPER-Margin-DeepVariant, that produces state-of-the-art variant calling results with nanopore data. We show that our nanopore-based method outperforms the short-read-based single-nucleotide-variant identification method at the whole-genome scale and produces high-quality single-nucleotide variants in segmental duplications and low-mappability regions where short-read-based genotyping fails. We show that our pipeline can provide highly contiguous phase blocks across the genome with nanopore reads, contiguously spanning between 85% and 92% of annotated genes across six samples. We also extend PEPPER-Margin-DeepVariant to PacBio HiFi data, providing an efficient solution with superior performance over the current WhatsHap-DeepVariant standard. Finally, we demonstrate de novo assembly polishing methods that use nanopore and PacBio HiFi reads to produce diploid assemblies with high accuracy (Q35+ nanopore-polished and Q40+ PacBio HiFi-polished).


Subject(s)
Genes , Haplotypes , High-Throughput Nucleotide Sequencing/methods , Nanopores , Single Nucleotide Polymorphism , DNA Sequence Analysis/methods , Software , Human Genome , Humans , Molecular Sequence Annotation
20.
Hum Mol Genet ; 30(R2): R198-R205, 2021 10 01.
Article in English | MEDLINE | ID: mdl-34302168

ABSTRACT

The recent accomplishment of a truly complete human genome has afforded a new view of chromosome structure and function that was limited 30 years ago. Here, we discuss the expansion of knowledge from the early cytological studies of the genome to the current high-resolution genomic, epigenetic and functional maps that have been achieved by recent technology and computational advances. These studies have revealed unexpected complexities of genome organization and function and uncovered new views of fundamental chromosomal elements. Comprehensive genomic maps will enable accurate diagnosis of human diseases caused by altered chromosome structure and function, facilitate development of chromosome-based therapies and shape the future of preventative medicine and healthcare.


Subject(s)
Chromosome Structures , Chromosomes/genetics , Human Genome , Genomics , Animals , Chromosome Mapping , Chromosomes/chemistry , Computational Biology/methods , Genetic Association Studies , Genetic Markers , Genetic Predisposition to Disease , Genomics/methods , Humans , Inheritance Patterns , Single-Cell Analysis/methods