ABSTRACT
After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist [1,2]. Here we present a human genome assembly that surpasses the continuity of GRCh38 [2], along with a gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome [3], we reconstructed the centromeric satellite DNA array (approximately 3.1 Mb) and closed the 29 remaining gaps in the current reference, including new sequences from the human pseudoautosomal regions and from cancer-testis ampliconic gene families (CT-X and GAGE). These sequences will be integrated into future human reference genome releases. In addition, the complete chromosome X, combined with the ultra-long nanopore data, allowed us to map methylation patterns across complex tandem repeats and satellite arrays. Our results demonstrate that finishing the entire human genome is now within reach, and the data presented here will facilitate ongoing efforts to complete the other human chromosomes.
Subject(s)
Chromosomes, Human, X/genetics; Genome, Human/genetics; Telomere/genetics; Centromere/genetics; CpG Islands/genetics; DNA Methylation; DNA, Satellite/genetics; Female; Humans; Hydatidiform Mole/genetics; Male; Pregnancy; Reproducibility of Results; Testis/metabolism
ABSTRACT
Auxin has a fundamental role throughout the life cycle of land plants. Previous studies showed that the tomato cyclophilin DIAGEOTROPICA (DGT) promotes auxin response, but its specific role in auxin signaling remains unknown. We sequenced candidate genes in auxin-insensitive mutants of Physcomitrella patens and identified mutations in highly conserved regions of the moss ortholog of tomato DGT. As P. patens and tomato diverged from a common ancestor more than 500 million years ago, this result suggests a conserved and central role for DGT in auxin signaling in land plants. In this study we characterize the P. patens dgt (Ppdgt) mutants and show that their response to auxin is altered, affecting the chloronema-to-caulonema transition and the development of rhizoids. To gain an understanding of PpDGT function we tested its interactions with the TIR1/AFB-dependent auxin signaling pathway. We did not observe a clear effect of the Ppdgt mutation on the degradation of Aux/IAA proteins. However, the induction of several auxin-regulated genes was reduced. Genetic analysis revealed that dgt can suppress the phenotype conferred by overexpression of an AFB auxin receptor. Our results indicate that the DGT protein affects auxin-induced transcription and has a conserved function in auxin regulation in land plants.
Subject(s)
Bryopsida/genetics; Cyclophilins/metabolism; Indoleacetic Acids/metabolism; Plant Proteins/metabolism; Solanum lycopersicum/genetics; Base Sequence; Bryopsida/embryology; Cyclophilins/genetics; Evolution, Molecular; F-Box Proteins/metabolism; Gene Expression Regulation, Plant; Plant Proteins/genetics; Plants, Genetically Modified; Receptors, Cell Surface/metabolism; Sequence Analysis, DNA; Signal Transduction; Transcription, Genetic
ABSTRACT
BACKGROUND: Insulin-producing beta cells and glucagon-producing alpha cells are colocalized in pancreatic islets in an arrangement that facilitates the coordinated release of the two principal hormones that regulate glucose homeostasis and prevent both hypoglycemia and diabetes. However, this intricate organization has also complicated the determination of the cellular source(s) of the expression of genes that are detected in the islet. This reflects a significant gap in our understanding of mouse islet physiology, which reduces the effectiveness with which mice model human islet disease. RESULTS: To overcome this challenge, we generated a bitransgenic reporter mouse that faithfully labels all beta and alpha cells in mouse islets to enable FACS-based purification and the generation of comprehensive transcriptomes of both populations. This facilitates systematic comparison across thousands of genes between the two major endocrine cell types of the islets of Langerhans, whose principal hormones are of cardinal importance for glucose homeostasis. Our data, leveraged against similar data for human beta cells, reveal a core common beta cell transcriptome of 9900+ genes. Against the backdrop of overall similar beta cell transcriptomes, we describe marked differences in the repertoire of receptors and long non-coding RNAs between mouse and human beta cells. CONCLUSIONS: The comprehensive mouse alpha and beta cell transcriptomes, complemented by the comparison of the global (dis)similarities between mouse and human beta cells, represent invaluable resources to boost the accuracy with which rodent models offer guidance in finding cures for human diabetes.
Subject(s)
Insulin-Secreting Cells/metabolism; RNA, Long Noncoding/metabolism; Animals; Flow Cytometry; Gene Library; Glucagon/genetics; Glucagon/metabolism; Glucagon-Secreting Cells/cytology; Glucagon-Secreting Cells/metabolism; Humans; Insulin-Secreting Cells/cytology; Luminescent Proteins/genetics; Luminescent Proteins/metabolism; Mice; RNA/genetics; RNA/metabolism; RNA, Long Noncoding/genetics; Receptors, Cell Surface/genetics; Receptors, Cell Surface/metabolism; Sequence Analysis, RNA; Transcription Factors/genetics; Transcription Factors/metabolism; Transcriptome; Red Fluorescent Protein
ABSTRACT
Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.
Subject(s)
Centromere/genetics; Chromosome Mapping; Epigenesis, Genetic; Genome, Human; Evolution, Molecular; Genomics; Humans; Repetitive Sequences, Nucleic Acid
ABSTRACT
De novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled 11 highly contiguous human genomes de novo in 9 d. We achieved roughly 63× coverage, 42-kb read N50 values and 6.5× coverage in reads >100 kb using three flow cells per sample. Shasta produced a complete haploid human genome assembly in under 6 h on a single commercial compute node. MarginPolish and HELEN polished haploid assemblies to more than 99.9% identity (Phred quality score QV = 30) with nanopore reads alone. Addition of proximity-ligation sequencing enabled near chromosome-level scaffolds for all 11 genomes. We compare our assembly performance to existing methods for diploid, haploid and trio-binned human samples and report superior accuracy and speed.
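The abstract above equates 99.9% identity with a Phred quality score of QV = 30 and reports a 42-kb read N50. As a minimal sketch of how these two metrics are conventionally computed (the function names `identity_to_qv`, `qv_to_identity`, and `n50` are hypothetical helpers written for illustration, not part of Shasta, MarginPolish, or HELEN):

```python
import math


def identity_to_qv(identity: float) -> float:
    """Convert a fractional identity (e.g. 0.999) to a Phred-scaled QV.

    Uses the standard Phred definition: QV = -10 * log10(error rate).
    """
    error = 1.0 - identity
    if error <= 0.0:
        return math.inf  # a perfect match has no finite Phred score
    return -10.0 * math.log10(error)


def qv_to_identity(qv: float) -> float:
    """Inverse of identity_to_qv: identity = 1 - 10^(-QV/10)."""
    return 1.0 - 10.0 ** (-qv / 10.0)


def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all bases.
    """
    if not lengths:
        raise ValueError("n50 requires at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    print(round(identity_to_qv(0.999), 3))    # QV for 99.9% identity
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))      # toy lengths, not real data
```

The toy lengths in the usage lines are illustrative only; real read lengths would come from the sequencing run itself.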