Results 1 - 17 of 17
1.
Nat Methods ; 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730258

ABSTRACT

Despite advances in long-read sequencing technologies, constructing a near telomere-to-telomere assembly is still computationally demanding. Here we present hifiasm (UL), an efficient de novo assembly algorithm combining multiple sequencing technologies to scale up population-wide near telomere-to-telomere assemblies. Applied to 22 human and two plant genomes, our algorithm produces better diploid assemblies at a cost of an order of magnitude lower than existing methods, and it also works with polyploid genomes.

2.
Genome Res ; 34(3): 454-468, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38627094

ABSTRACT

Reference-free genome phasing is vital for understanding allele inheritance and the impact of single-molecule DNA variation on phenotypes. To achieve thorough phasing across homozygous or repetitive regions of the genome, long-read sequencing technologies are often used to perform phased de novo assembly. As a step toward reducing the cost and complexity of this type of analysis, we describe new methods for accurately phasing Oxford Nanopore Technologies (ONT) sequence data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of ONT PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.


Subject(s)
Nanopores; Humans; Sequence Analysis, DNA/methods; Nanopore Sequencing/methods; High-Throughput Nucleotide Sequencing/methods; Software; Genomics/methods
3.
Nature ; 629(8010): 136-145, 2024 May.
Article in English | MEDLINE | ID: mdl-38570684

ABSTRACT

Human centromeres have been traditionally very difficult to sequence and assemble owing to their repetitive nature and large size [1]. As a result, patterns of human centromeric variation and models for their evolution and function remain incomplete, despite centromeres being among the most rapidly mutating regions [2,3]. Here, using long-read sequencing, we completely sequenced and assembled all centromeres from a second human genome and compared it to the finished reference genome [4,5]. We find that the two sets of centromeres show at least a 4.1-fold increase in single-nucleotide variation when compared with their unique flanks and vary up to 3-fold in size. Moreover, we find that 45.8% of centromeric sequence cannot be reliably aligned using standard methods owing to the emergence of new α-satellite higher-order repeats (HORs). DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by >500 kb. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan and macaque genomes. Comparative analyses reveal a nearly complete turnover of α-satellite HORs, with characteristic idiosyncratic changes in α-satellite HORs for each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the short (p) and long (q) arms across centromeres and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.


Subject(s)
Centromere; Evolution, Molecular; Genetic Variation; Animals; Humans; Centromere/genetics; Centromere/metabolism; Centromere Protein A/metabolism; DNA Methylation/genetics; DNA, Satellite/genetics; Kinetochores/metabolism; Macaca/genetics; Pan troglodytes/genetics; Polymorphism, Single Nucleotide/genetics; Pongo/genetics; Male; Female; Reference Standards; Chromatin Immunoprecipitation; Haplotypes; Mutation; Gene Amplification; Sequence Alignment; Chromatin/genetics; Chromatin/metabolism; Species Specificity
4.
bioRxiv ; 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38529488

ABSTRACT

The combination of ultra-long Oxford Nanopore (ONT) sequencing reads with long, accurate PacBio HiFi reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, "telomere-to-telomere" genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT "Duplex" sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely-studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used "Pore-C" chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the ultra-long reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and has the potential to provide a single-instrument solution for the reconstruction of complete genomes.

5.
Nature ; 621(7978): 344-354, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37612512

ABSTRACT

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.


Subject(s)
Chromosomes, Human, Y; Genomics; Sequence Analysis, DNA; Humans; Base Sequence; Chromosomes, Human, Y/genetics; DNA, Satellite/genetics; Genetic Variation/genetics; Genetics, Population; Genomics/methods; Genomics/standards; Heterochromatin/genetics; Multigene Family/genetics; Reference Standards; Segmental Duplications, Genomic/genetics; Sequence Analysis, DNA/standards; Tandem Repeat Sequences/genetics; Telomere/genetics
6.
bioRxiv ; 2023 May 30.
Article in English | MEDLINE | ID: mdl-37398417

ABSTRACT

We completely sequenced and assembled all centromeres from a second human genome and used two reference sets to benchmark genetic, epigenetic, and evolutionary variation within centromeres from a diversity panel of humans and apes. We find that centromere single-nucleotide variation can increase by up to 4.1-fold relative to other genomic regions, with the caveat that up to 45.8% of centromeric sequence, on average, cannot be reliably aligned with current methods due to the emergence of new α-satellite higher-order repeat (HOR) structures and two to threefold differences in the length of the centromeres. The extent to which this occurs differs depending on the chromosome and haplotype. Comparing the two sets of complete human centromeres, we find that eight harbor distinctly different α-satellite HOR array structures and four contain novel α-satellite HOR variants in high abundance. DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by at least 500 kbp, a property not readily associated with novel α-satellite HORs. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan, and macaque genomes. Comparative analyses reveal nearly complete turnover of α-satellite HORs, but with idiosyncratic changes in structure characteristic to each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the p- and q-arms of human chromosomes and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.

7.
ArXiv ; 2023 Jun 06.
Article in English | MEDLINE | ID: mdl-37332563

ABSTRACT

Despite recent advances in the length and the accuracy of long-read data, building haplotype-resolved genome assemblies from telomere to telomere still requires considerable computational resources. In this study, we present an efficient de novo assembly algorithm that combines multiple sequencing technologies to scale up population-wide telomere-to-telomere assemblies. By utilizing twenty-two human and two plant genomes, we demonstrate that our algorithm is around an order of magnitude cheaper than existing methods, while producing better diploid and haploid assemblies. Notably, our algorithm is the only feasible solution to the haplotype-resolved assembly of polyploid genomes.

8.
Nature ; 617(7960): 325-334, 2023 05.
Article in English | MEDLINE | ID: mdl-37165237

ABSTRACT

Single-nucleotide variants (SNVs) in segmental duplications (SDs) have not been systematically assessed because of the limitations of mapping short-read sequencing data [1,2]. Here we constructed 1:1 unambiguous alignments spanning high-identity SDs across 102 human haplotypes and compared the pattern of SNVs between unique and duplicated regions [3,4]. We find that human SNVs are elevated 60% in SDs compared to unique regions and estimate that at least 23% of this increase is due to interlocus gene conversion (IGC) with up to 4.3 megabase pairs of SD sequence converted on average per human haplotype. We develop a genome-wide map of IGC donors and acceptors, including 498 acceptor and 454 donor hotspots affecting the exons of about 800 protein-coding genes. These include 171 genes that have 'relocated' on average 1.61 megabase pairs in a subset of human haplotypes. Using a coalescent framework, we show that SD regions are slightly evolutionarily older when compared to unique sequences, probably owing to IGC. SNVs in SDs, however, show a distinct mutational spectrum: a 27.1% increase in transversions that convert cytosine to guanine or the reverse across all triplet contexts and a 7.6% reduction in the frequency of CpG-associated mutations when compared to unique DNA. We reason that these distinct mutational properties help to maintain an overall higher GC content of SD DNA compared to that of unique DNA, probably driven by GC-biased conversion between paralogous sequences [5,6].


Subject(s)
Gene Conversion; Mutation; Segmental Duplications, Genomic; Humans; Gene Conversion/genetics; Genome, Human/genetics; Polymorphism, Single Nucleotide/genetics; Haplotypes/genetics; Exons/genetics; Cytosine/chemistry; Guanine/chemistry; CpG Islands/genetics
9.
Nature ; 617(7960): 312-324, 2023 05.
Article in English | MEDLINE | ID: mdl-37165242

ABSTRACT

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals [1]. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.


Subject(s)
Genome, Human; Genomics; Humans; Diploidy; Genome, Human/genetics; Haplotypes/genetics; Sequence Analysis, DNA; Genomics/standards; Reference Standards; Cohort Studies; Alleles; Genetic Variation
10.
bioRxiv ; 2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36865218

ABSTRACT

As a step towards simplifying and reducing the cost of haplotype-resolved de novo assembly, we describe new methods for accurately phasing nanopore data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of Oxford Nanopore Technologies' (ONT) PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.

11.
EXCLI J ; 22: 1155-1172, 2023.
Article in English | MEDLINE | ID: mdl-38204967

ABSTRACT

A current clinical challenge in cancer is multidrug resistance (MDR) mediated by ABC transporters. Breast cancer resistance protein (BCRP) or ABCG2 transporter is one of the most important ABC transporters implicated in MDR and the use of inhibitors is a promising approach to overcome the resistance in cancer. This study aimed to characterize the molecular mechanism of ABCG2 inhibitors identified by a repurposing drug strategy using antiviral, anti-inflammatory and antiparasitic agents. Lopinavir and ivermectin can be considered as pan-inhibitors of ABC transporters, since both compounds inhibited ABCG2, P-glycoprotein and MRP1. They inhibited ABCG2 activity showing IC50 values of 25.5 and 23.4 µM, respectively. These drugs were highly cytotoxic and not transported by ABCG2. Additionally, these drugs increased the 5D3 antibody binding and did not affect the mRNA and protein expression levels. Cell-based analysis of the type of inhibition suggested a non-competitive inhibition, which was further corroborated by in silico approaches of molecular docking and molecular dynamics simulations. These results showed an overlap of the lopinavir and ivermectin binding sites on ABCG2, mainly interacting with the E446 residue. However, the substrate mitoxantrone occupies a different site, binding to the F436 region, closer to the L554/L555 plug. In conclusion, these results revealed the mechanistic basis of lopinavir and ivermectin interaction with ABCG2. See also the graphical abstract (Fig. 1).

12.
Nature ; 611(7936): 519-531, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36261518

ABSTRACT

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society [1,2]. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals [3,4]. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome [5]. To address these limitations, the Human Pangenome Reference Consortium formed with the goal of creating high-quality, cost-effective, diploid genome assemblies for a pangenome reference that represents human genetic diversity [6]. Here, in our first scientific report, we determined which combination of current genome sequencing and assembly approaches yields the most complete and accurate diploid genome assembly with minimal manual curation. Approaches that used highly accurate long reads and parent-child data with graph-based haplotype phasing during assembly outperformed those that did not. Developing a combination of the top-performing methods, we generated our first high-quality diploid reference assembly, containing only approximately four gaps per chromosome on average, with most chromosomes within ±1% of the length of CHM13. Nearly 48% of protein-coding genes have non-synonymous amino acid changes between haplotypes, and centromeric regions showed the highest diversity. Our findings serve as a foundation for assembling near-complete diploid human genomes at scale for a pangenome reference to capture global genetic variation from single nucleotides to structural rearrangements.


Subject(s)
Chromosome Mapping; Diploidy; Genome, Human; Genomics; Humans; Chromosome Mapping/standards; Genome, Human/genetics; Haplotypes/genetics; High-Throughput Nucleotide Sequencing/methods; High-Throughput Nucleotide Sequencing/standards; Sequence Analysis, DNA/methods; Sequence Analysis, DNA/standards; Reference Standards; Genomics/methods; Genomics/standards; Chromosomes, Human/genetics; Genetic Variation/genetics
13.
Nature ; 604(7906): 437-446, 2022 04.
Article in English | MEDLINE | ID: mdl-35444317

ABSTRACT

The human reference genome is the most widely used resource in human genetics and is due for a major update. Its current structure is a linear composite of merged haplotypes from more than 20 people, with a single individual comprising most of the sequence. It contains biases and errors within a framework that does not represent global human genomic variation. A high-quality reference with global representation of common variants, including single-nucleotide variants, structural variants and functional elements, is needed. The Human Pangenome Reference Consortium aims to create a more sophisticated and complete human reference genome with a graph-based, telomere-to-telomere representation of global genomic diversity. Here we leverage innovations in technology, study design and global partnerships with the goal of constructing the highest-possible quality human pangenome reference. Our goal is to improve data representation and streamline analyses to enable routine assembly of complete diploid genomes. With attention to ethical frameworks, the human pangenome reference will contain a more accurate and diverse representation of global genomic variation, improve gene-disease association studies across populations, expand the scope of genomics research to the most repetitive and polymorphic regions of the genome, and serve as the ultimate genetic resource for future biomedical research and precision medicine.


Subject(s)
Genome, Human; Genomics; Genome, Human/genetics; Haplotypes/genetics; High-Throughput Nucleotide Sequencing; Humans; Sequence Analysis, DNA
14.
Science ; 376(6588): eabl4178, 2022 04.
Article in English | MEDLINE | ID: mdl-35357911

ABSTRACT

Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.


Subject(s)
Centromere/genetics; Chromosome Mapping; Epigenesis, Genetic; Genome, Human; Evolution, Molecular; Genomics; Humans; Repetitive Sequences, Nucleic Acid
15.
Mol Diagn Ther ; 23(4): 521-535, 2019 08.
Article in English | MEDLINE | ID: mdl-31209714

ABSTRACT

INTRODUCTION: Comprehensive genetic cancer profiling using circulating tumor DNA has enabled the detection of National Comprehensive Cancer Network (NCCN) guideline-recommended somatic alterations from a single, non-invasive blood draw. However, reliably detecting somatic variants at low variant allele fractions (VAFs) remains a challenge for next-generation sequencing (NGS)-based tests. We have developed the single-molecule sequencing (SMSEQ) platform to address these challenges. METHODS: The OncoLBx assay utilizes the SMSEQ platform to optimize cell-free DNA extraction and library preparation with variant type-specific calling algorithms to improve sensitivity and specificity. OncoLBx is a pan-cancer panel for solid tumors targeting 75 genes and five microsatellite sites analyzing five classes of NCCN-recommended somatic variants: single-nucleotide variants (SNVs), insertions and deletions (indels), copy number variants (CNVs), fusions and microsatellite instability (MSI). Circulating DNA was extracted from plasma, followed by library preparation using SMSEQ. Analytical validation was performed according to recently published American College of Medical Genetics and Genomics (ACMG)/Association for Molecular Pathology (AMP) guidelines and established the limit of detection (LOD), sensitivity, specificity, accuracy and reproducibility using 126 gold-standard reference samples, healthy donor samples verified by whole-exome sequencing by an external College of American Pathologists (CAP) reference lab and cell lines with known variants. Results were analyzed using a locus-specific modeling algorithm. RESULTS: We have demonstrated that OncoLBx detects VAFs of ≥ 0.1% for SNVs and indels, ≥ 0.5% for fusions, ≥ 4.5 copies for CNVs and ≥ 2% for MSI, with all variant types having specificity ≥ 99.999%. Diagnostic performance of paired samples displays 80% sensitivity and > 99.999% clinical specificity. Clinical utility and performance were assessed in 416 solid tumor samples. Variants were detected in 79% of samples, for which 87.34% of positive samples had available targeted therapy.


Subject(s)
Biomarkers, Tumor; Circulating Tumor DNA; Neoplasms/genetics; Polymorphism, Single Nucleotide; Cell Line, Tumor; Clinical Decision-Making; Computational Biology/methods; Disease Management; Genetic Variation; Genomics/methods; Humans; Molecular Targeted Therapy; Neoplasms/diagnosis; Neoplasms/therapy; Prognosis; Reproducibility of Results
16.
Vertex ; XXVII(126): 85-93, 2016 Mar.
Article in Spanish | MEDLINE | ID: mdl-28199423

ABSTRACT

OBJECTIVE: To describe the profile and course of treatment of outpatients seen by the psychiatry residents of the Rossi Hospital in La Plata. METHODS: The period between 2005 and 2010 (six years) was analyzed. The variables selected included gender, age, diagnosis (according to ICD-10), and duration and course of treatment. RESULTS: Of the total number of patients (n=341), 58.7% were women (n=200) and 41.3% were men (n=141). The most frequent diagnoses were anxiety disorders (13.6%), depressive disorders (12.8%), schizophrenia (12.2%), other non-affective psychoses (9%) and mental retardation (8.4%). Patients who dropped out accounted for 26.1%, those who continued treatment for 25.2%, and those discharged for 17%. DISCUSSION: The high frequency of psychoses (schizophrenia and other non-affective psychoses) is a reminder of the importance of the public hospital in the care of severe and chronic pathologies. There are similarities and differences with other publications. However, the population studied has distinctive characteristics that hamper comparisons, as our patients were already in psychiatric treatment, whereas other investigations are based on the general population or on first consultations at mental health services.


Subject(s)
Mental Disorders; Adolescent; Adult; Aged; Aged, 80 and over; Argentina; Female; Hospitals, General; Humans; Internship and Residency; Male; Mental Disorders/diagnosis; Mental Disorders/epidemiology; Mental Disorders/therapy; Middle Aged; Psychiatry/education; Young Adult
17.
Genes Dev ; 26(15): 1691-702, 2012 Aug 01.
Article in English | MEDLINE | ID: mdl-22810624

ABSTRACT

Forty years of classical biochemical analysis have identified the molecular players involved in initiation of transcription by eukaryotic RNA polymerase II (Pol II) and largely assigned their functions. However, a dynamic picture of Pol II transcription initiation and an understanding of the mechanisms of its regulation have remained elusive due in part to inherent limitations of conventional ensemble biochemistry. Here we have begun to dissect promoter-specific transcription initiation directed by a reconstituted human Pol II system at single-molecule resolution using fluorescence video-microscopy. We detected several stochastic rounds of human Pol II transcription from individual DNA templates, observed attenuation of transcription by promoter mutations, observed enhancement of transcription by activator Sp1, and correlated the transcription signals with real-time interactions of holo-TFIID molecules at individual DNA templates. This integrated single-molecule methodology should be applicable to studying other complex biological processes.


Subject(s)
Molecular Imaging/methods; RNA Polymerase II/chemistry; Transcription, Genetic; Humans; Microscopy, Fluorescence/methods; Microscopy, Video/methods; Mutation; Promoter Regions, Genetic; RNA Polymerase II/genetics; RNA Polymerase II/metabolism; Sp1 Transcription Factor/chemistry; Sp1 Transcription Factor/metabolism; Transcription Factor TFIID/chemistry; Transcription Factor TFIID/metabolism