ABSTRACT
Structural variants contribute substantially to genetic diversity and are important evolutionarily and medically, but they are still understudied. Here we present a comprehensive analysis of structural variation in the Human Genome Diversity panel, a high-coverage dataset of 911 samples from 54 diverse worldwide populations. We identify, in total, 126,018 variants, 78% of which were not identified in previous global sequencing projects. Some reach high frequency and are private to continental groups or even individual populations, including regionally restricted runaway duplications and putatively introgressed variants from archaic hominins. By de novo assembly of 25 genomes using linked-read sequencing, we discover 1,643 breakpoint-resolved unique insertions, in aggregate accounting for 1.9 Mb of sequence absent from the GRCh38 reference. Our results illustrate the limitation of a single human reference and the need for high-quality genomes from diverse populations to fully discover and understand human genetic variation.
Subjects
Genetics, Population; Genomic Structural Variation; Alleles; Databases, Genetic; Gene Dosage; Gene Duplication; Gene Frequency/genetics; Genetic Variation; Genome, Human; Humans
ABSTRACT
Cancer is driven by somatically acquired point mutations and chromosomal rearrangements, conventionally thought to accumulate gradually over time. Using next-generation sequencing, we characterize a phenomenon, which we term chromothripsis, whereby tens to hundreds of genomic rearrangements occur in a one-off cellular crisis. Rearrangements involving one or a few chromosomes crisscross back and forth across involved regions, generating frequent oscillations between two copy number states. These genomic hallmarks are highly improbable if rearrangements accumulate over time and instead imply that nearly all occur during a single cellular catastrophe. The stamp of chromothripsis can be seen in at least 2%-3% of all cancers, across many subtypes, and is present in ~25% of bone cancers. We find that one, or indeed more than one, cancer-causing lesion can emerge out of the genomic crisis. This phenomenon has important implications for the origins of genomic remodeling and temporal emergence of cancer.
Subjects
Chromosome Aberrations; Neoplasms/genetics; Neoplasms/pathology; Bone Neoplasms/genetics; Cell Line, Tumor; Chromosome Painting; Female; Gene Rearrangement; Humans; Leukemia, Lymphocytic, Chronic, B-Cell/genetics; Middle Aged
ABSTRACT
The increasing prevalence of diabetes has resulted in a global epidemic [1]. Diabetes is a major cause of blindness, kidney failure, heart attacks, stroke and amputation of lower limbs. These are often caused by changes in blood vessels, such as the expansion of the basement membrane and a loss of vascular cells [2-4]. Diabetes also impairs the functions of endothelial cells [5] and disturbs the communication between endothelial cells and pericytes [6]. How dysfunction of endothelial cells and/or pericytes leads to diabetic vasculopathy remains largely unknown. Here we report the development of self-organizing three-dimensional human blood vessel organoids from pluripotent stem cells. These human blood vessel organoids contain endothelial cells and pericytes that self-assemble into capillary networks that are enveloped by a basement membrane. Human blood vessel organoids transplanted into mice form a stable, perfused vascular tree, including arteries, arterioles and venules. Exposure of blood vessel organoids to hyperglycaemia and inflammatory cytokines in vitro induces thickening of the vascular basement membrane. Human blood vessels, exposed in vivo to a diabetic milieu in mice, also mimic the microvascular changes found in patients with diabetes. DLL4 and NOTCH3 were identified as key drivers of diabetic vasculopathy in human blood vessels. Therefore, organoids derived from human stem cells faithfully recapitulate the structure and function of human blood vessels and are amenable systems for modelling and identifying the regulators of diabetic vasculopathy, a disease that affects hundreds of millions of patients worldwide.
Subjects
Basement Membrane/pathology; Blood Vessels/pathology; Diabetic Angiopathies/pathology; Models, Biological; Organoids/pathology; Organoids/transplantation; Adaptor Proteins, Signal Transducing; Amyloid Precursor Protein Secretases/antagonists & inhibitors; Amyloid Precursor Protein Secretases/metabolism; Animals; Arteries/cytology; Arteries/drug effects; Arterioles/cytology; Arterioles/drug effects; Basement Membrane/cytology; Basement Membrane/drug effects; Blood Vessels/cytology; Blood Vessels/drug effects; Blood Vessels/growth & development; Calcium-Binding Proteins; Diabetic Angiopathies/enzymology; Endothelial Cells/cytology; Endothelial Cells/drug effects; Humans; Hyperglycemia/complications; In Vitro Techniques; Inflammation Mediators/pharmacology; Intercellular Signaling Peptides and Proteins/metabolism; Mice; Organoids/cytology; Organoids/drug effects; Pericytes/cytology; Pericytes/drug effects; Pluripotent Stem Cells/cytology; Pluripotent Stem Cells/drug effects; Receptor, Notch3/metabolism; Signal Transduction; Venules/cytology; Venules/drug effects
ABSTRACT
Mutations in the SETX gene, which encodes Senataxin, are associated with the progressive neurodegenerative diseases ataxia with oculomotor apraxia 2 (AOA2) and amyotrophic lateral sclerosis 4 (ALS4). To identify the causal defect in AOA2, patient-derived cells and SETX knockouts (human and mouse) were analyzed using integrated genomic and transcriptomic approaches. A genome-wide increase in chromosome instability (gains and losses) within genes and at chromosome fragile sites was observed, resulting in changes to gene-expression profiles. Transcription stress near promoters correlated with high GC skew and the accumulation of R-loops at promoter-proximal regions, which localized with chromosomal regions where gains and losses were observed. In the absence of Senataxin, the Cockayne syndrome protein CSB was required for the recruitment of the transcription-coupled repair endonucleases (XPG and XPF) and RAD52 recombination protein to target and resolve transcription bubbles containing R-loops, leading to genomic instability. These results show that transcription stress is an important contributor to SETX mutation-associated chromosome fragility and AOA2.
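For readers unfamiliar with the GC skew measure mentioned in this abstract, the following is a minimal sketch of the standard definition, (G - C) / (G + C), computed on one strand over sliding windows. The function names, window size, and step are illustrative assumptions and are not taken from the study.

```python
def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for one sequence; 0.0 if it has no G or C."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(seq: str, window: int = 100, step: int = 50):
    """Return (start, skew) pairs for windows along seq.

    window and step are arbitrary illustrative defaults, not values
    used in the study.
    """
    last_start = max(len(seq) - window, 0)
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: G-rich first half, C-rich second half.
    print(windowed_gc_skew("GGGGCCCC", window=4, step=4))
```

A positive value indicates an excess of G over C on the strand examined; a negative value indicates the reverse.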
Subjects
Chromosomal Instability/genetics; DNA Helicases/metabolism; Multifunctional Enzymes/metabolism; RNA Helicases/metabolism; Spinocerebellar Ataxias/congenital; Animals; Apraxias/genetics; Ataxia/genetics; Cell Line; Cerebellar Ataxia/genetics; DNA Helicases/genetics; DNA Repair/genetics; Gene Expression Profiling/methods; Genomic Instability/genetics; Genomics/methods; Humans; Mice; Mouse Embryonic Stem Cells; Multifunctional Enzymes/genetics; Mutation/genetics; Neurodegenerative Diseases/genetics; Primary Cell Culture; Promoter Regions, Genetic/genetics; RNA Helicases/genetics; Spinocerebellar Ataxias/genetics; Spinocerebellar Ataxias/physiopathology; Transcriptome/genetics
ABSTRACT
Chromosome-scale genome assemblies based on ultralong-read sequencing technologies are able to illuminate previously intractable aspects of genome biology such as fine-scale centromere structure and large-scale variation in genome features such as heterochromatin, GC content, recombination rate, and gene content. We present here a new chromosome-scale genome of the Mongolian gerbil (Meriones unguiculatus), which includes the complete sequence of all centromeres. Gerbils are thus one of the first vertebrates to have their centromeres completely sequenced. Gerbil centromeres are composed of four different repeats of length 6, 37, 127, or 1,747 bp, which occur in simple alternating arrays and span 1-6 Mb. Gerbil genomes have both an extensive set of GC-rich genes and chromosomes strikingly enriched for constitutive heterochromatin. We sought to determine if there was a link between these two phenomena and found that the two heterochromatic chromosomes of the Mongolian gerbil have distinct underpinnings: Chromosome 5 has a large block of intraarm heterochromatin as the result of a massive expansion of centromeric repeats, while chromosome 13 is composed of extremely large (>150 kb) repeated sequences. In addition to characterizing centromeres, our results demonstrate the importance of including karyotypic features such as chromosome number and the locations of centromeres in the interpretation of genome sequence data and highlight novel patterns involved in the evolution of chromosomes.
Subjects
Centromere; Heterochromatin; Animals; Gerbillinae/genetics; Heterochromatin/genetics; Centromere/genetics; Genome; Repetitive Sequences, Nucleic Acid
ABSTRACT
Stem cells and regenerative medicine have recently become important research topics. However, the complex stem cell regulatory networks involved in various microRNA (miRNA)-mediated mechanisms have not yet been fully elucidated. Planarians are ideal animal models for studying stem cells owing to their rich stem cell populations (neoblasts) and extremely strong regeneration capacity. The roles of planarian miRNAs in stem cells and regeneration have long attracted attention. However, previous studies have generally provided simple datasets lacking integrative analysis. Here, we have summarized the miRNA family reported in planarians and highlighted conservation in both sequence and function. Furthermore, we summarized miRNA data related to planarian stem cells and regeneration and screened potential involved candidates. Nevertheless, the roles of these miRNAs in planarian regeneration and stem cells remain unclear. The identification of potential stem cell-related miRNAs offers more precise suggestions and references for future investigations of miRNAs in planarians. Furthermore, it provides potential research avenues for understanding the mechanisms of stem cell regulatory networks. Finally, we compiled a summary of the experimental methods employed for studying planarian miRNAs, with the aim of highlighting special considerations in certain procedures and providing more convenient technical support for future research endeavors.
Subjects
MicroRNAs; Planarians; Regeneration; Stem Cells; Animals; Planarians/genetics; MicroRNAs/genetics; MicroRNAs/metabolism; Stem Cells/metabolism; Regeneration/genetics; Gene Regulatory Networks
ABSTRACT
The poor correlation of mutational landscapes with phenotypes limits our understanding of the pathogenesis and metastasis of pancreatic ductal adenocarcinoma (PDAC). Here we show that oncogenic dosage-variation has a critical role in PDAC biology and phenotypic diversification. We find an increase in gene dosage of mutant KRAS in human PDAC precursors, which drives both early tumorigenesis and metastasis and thus rationalizes early PDAC dissemination. To overcome the limitations posed to gene dosage studies by the stromal richness of PDAC, we have developed large cell culture resources of metastatic mouse PDAC. Integration of cell culture genomes, transcriptomes and tumour phenotypes with functional studies and human data reveals additional widespread effects of oncogenic dosage variation on cell morphology and plasticity, histopathology and clinical outcome, with the highest KrasMUT levels underlying aggressive undifferentiated phenotypes. We also identify alternative oncogenic gains (Myc, Yap1 or Nfkb2), which collaborate with heterozygous KrasMUT in driving tumorigenesis, but have lower metastatic potential. Mechanistically, different oncogenic gains and dosages evolve along distinct evolutionary routes, licensed by defined allelic states and/or combinations of hallmark tumour suppressor alterations (Cdkn2a, Trp53, Tgfβ pathway). Thus, evolutionary constraints and contingencies direct oncogenic dosage gain and variation along defined routes to drive the early progression of PDAC and shape its downstream biology. Our study uncovers universal principles of Ras-driven oncogenesis that have potential relevance beyond pancreatic cancer.
Subjects
Carcinoma, Pancreatic Ductal/genetics; Carcinoma, Pancreatic Ductal/pathology; Evolution, Molecular; Gene Dosage; Pancreatic Neoplasms/genetics; Pancreatic Neoplasms/pathology; Proto-Oncogene Proteins p21(ras)/genetics; Adaptor Proteins, Signal Transducing/genetics; Alleles; Animals; Carcinogenesis/genetics; Cell Cycle Proteins; Cyclin-Dependent Kinase Inhibitor p16/genetics; Disease Progression; Female; Genes, myc; Genes, p53; Humans; Male; Mice; Mutation; NF-kappa B p52 Subunit/genetics; Neoplasm Metastasis/genetics; Nuclear Proteins/genetics; Phenotype; Phosphoproteins/genetics; Transcription Factors/genetics; Transcriptome/genetics; Transforming Growth Factor beta1/genetics; YAP-Signaling Proteins
ABSTRACT
TatD DNase, a key enzyme in vertebrates and invertebrates, plays a pivotal role in various physiological processes. Dugesia japonica (D. japonica), a flatworm species, has remarkable regenerative capabilities and possesses a simplified immune system. However, the existence and biological functions of TatD DNase in D. japonica require further investigation. Here, we obtained the open reading frame (ORF) of DjTatD and demonstrated its conservation. The three-dimensional structure of DjTatD revealed its active site and binding mechanism. To investigate its enzymological properties, we overexpressed, purified, and characterized recombinant DjTatD (rDjTatD). We observed that DjTatD was primarily expressed in the pharynx and that its expression changed significantly upon stimulation with lipopolysaccharide, peptidoglycan, and gram-positive and gram-negative bacteria. RNA interference results indicated that both DjTatD and DjDN2s play a role in pharyngeal regeneration and may serve as functional complements to each other. Additionally, we found that rDjTatD and recombinant T7DjTatD effectively reduce biofilm formation regardless of the bacterial origin of the biofilm. Together, our results demonstrate that DjTatD may be involved in the planarian immune response and pharyngeal regeneration. Furthermore, after further optimization, rDjTatD and T7DjTatD may be considered highly effective antibiofilm agents.
Subjects
Biofilms; Deoxyribonucleases; Planarians; Animals; Planarians/genetics; Planarians/physiology; Planarians/enzymology; Biofilms/drug effects; Deoxyribonucleases/metabolism; Deoxyribonucleases/genetics; Deoxyribonucleases/chemistry; Helminth Proteins/genetics; Helminth Proteins/metabolism; Helminth Proteins/chemistry; Helminth Proteins/pharmacology; Amino Acid Sequence
ABSTRACT
Cell junctions, which are typically associated with dynamic cytoskeletons, are essential for a wide range of cellular activities, including cell migration, cell communication, barrier function and signal transduction. Observing cell junctions in real time can help us understand the mechanisms by which they regulate these cellular activities. This study examined the binding capacity of a modified tridecapeptide from Connexin 43 (Cx43) to the cell junction protein zonula occludens-1 (ZO-1). The goal was to create a fluorescent peptide that can label cell junctions. A cell-penetrating peptide was linked to the modified tridecapeptide. The heterotrimeric peptide molecule was then synthesized. The binding of the modified tridecapeptide was tested using pulldown and immunoprecipitation assays. The ability of the peptide to label cell junctions was assessed by adding it to fixed or live Caco-2 cells. The assays revealed that the Cx43-derived peptide can bind to ZO-1. Additionally, the peptide was able to label cell junctions of fixed cells, although no obvious junction labeling was observed in live cells, probably because of inadequate affinity. These findings suggest that labeling cell junctions using a peptide-based strategy is feasible. Further efforts to improve its affinity are warranted.
Subjects
Connexin 43; Gap Junctions; Humans; Connexin 43/chemistry; Connexin 43/metabolism; Gap Junctions/metabolism; Membrane Proteins/metabolism; Caco-2 Cells; Peptides/metabolism; Phosphoproteins/metabolism
ABSTRACT
Matrix metalloproteinases (MMPs) are members of a family of zinc-dependent metallopeptidase proteins that are widely found in plants, animals, and microorganisms. As regulators of the extracellular matrix and basement membrane, MMPs play an important role in embryogenesis, development, innate immunity, and regeneration. However, the function of the MMP family in planarians, a model for regeneration research, remains unclear. Here, we cloned five MMP genes from Dugesia japonica and found that DjMMPA was associated with regeneration and with the disruption and loss of neoblast maintenance. Loss of DjMMPA disrupted homeostasis and eventually caused death, owing to disordered neoblast proliferation. Additionally, DjMMPA RNAi-treated animals had impaired regeneration after amputation. Furthermore, knockdown of DjMMPA caused noticeable defects in the differentiation of ectodermal cells, especially eye and neural progenitor cells, possibly by inhibiting Wnt signaling. Our results suggest that the extracellular matrix regulator MMPA is required for the orderly proliferation of neoblasts and the differentiation of ectodermal progenitor cells in the planarian, providing valuable information for further exploration of the molecular mechanisms of MMPs, stem cells, and regeneration.
Subjects
Planarians; Animals; Planarians/genetics; Ectoderm; Stem Cells; Cell Differentiation; Cell Proliferation; Matrix Metalloproteinases/genetics
ABSTRACT
Genome-wide association studies (GWAS) have identified a large number of single nucleotide polymorphism (SNP) sites associated with human diseases. In the annotation of human diseases, especially cancers, SNPs, as an important component of genetic factors, have gained increasing attention. Given that most SNPs are located in non-coding regions, their functional verification is a great challenge. The key to functional annotation of risk SNPs is to screen for SNPs with regulatory activity among thousands of disease-associated SNPs. In this review, we systematically recapitulate the characteristics and functional roles of SNP sites, discuss in detail three parallel reporter screening strategies based on barcode tag classification, and recommend common in silico strategies that supplement the annotation of SNP sites with epigenetic activity analysis and prediction of target genes and trans-acting factors. We hope that this review will contribute to this active research field by providing robust activity analysis strategies that can facilitate the translation of GWAS results into personalized diagnosis and prevention measures for human diseases.
Subjects
Genome-Wide Association Study; Neoplasms; Humans; Polymorphism, Single Nucleotide; Genetic Predisposition to Disease
ABSTRACT
Mouse embryonic stem cells derived from the epiblast contribute to the somatic lineages and the germline but are excluded from the extra-embryonic tissues that are derived from the trophectoderm and the primitive endoderm upon reintroduction to the blastocyst. Here we report that cultures of expanded potential stem cells can be established from individual eight-cell blastomeres, and by direct conversion of mouse embryonic stem cells and induced pluripotent stem cells. Remarkably, a single expanded potential stem cell can contribute both to the embryo proper and to the trophectoderm lineages in a chimaera assay. Bona fide trophoblast stem cell lines and extra-embryonic endoderm stem cells can be directly derived from expanded potential stem cells in vitro. Molecular analyses of the epigenome and single-cell transcriptome reveal enrichment for a blastomere-specific signature and a dynamic DNA methylome in expanded potential stem cells. The generation of mouse expanded potential stem cells highlights the feasibility of establishing expanded potential stem cells for other mammalian species.
Subjects
Blastomeres/cytology; Mouse Embryonic Stem Cells/cytology; Animals; Blastocyst/cytology; Blastomeres/metabolism; Cell Lineage; Cells, Cultured; Chimera; Embryo, Mammalian/cytology; Endoderm/cytology; Epigenesis, Genetic; Epigenomics; Female; Male; Mice; Mouse Embryonic Stem Cells/metabolism; Placenta/cytology; Pluripotent Stem Cells/cytology; Pluripotent Stem Cells/metabolism; Pregnancy; Single-Cell Analysis; Transcriptome; Trophoblasts/cytology
ABSTRACT
The ability to directly uncover the contributions of genes to a given phenotype is fundamental for biology research. However, ostensibly homogeneous cell populations exhibit large clonal variance that can confound analyses and undermine reproducibility. Here we used genome-saturated mutagenesis to create a biobank of over 100,000 individual haploid mouse embryonic stem (mES) cell lines targeting 16,970 genes with genetically barcoded, conditional and reversible mutations. This Haplobank is, to our knowledge, the largest resource of hemi/homozygous mutant mES cells to date and is available to all researchers. Reversible mutagenesis overcomes clonal variance by permitting functional annotation of the genome directly in sister cells. We use the Haplobank in reverse genetic screens to investigate the temporal resolution of essential genes in mES cells, and to identify novel genes that control sprouting angiogenesis and lineage specification of blood vessels. Furthermore, a genome-wide forward screen with Haplobank identified PLA2G16 as a host factor that is required for cytotoxicity by rhinoviruses, which cause the common cold. Therefore, clones from the Haplobank combined with the use of reversible technologies enable high-throughput, reproducible, functional annotation of the genome.
Subjects
Biological Specimen Banks; Genomics/methods; Haploidy; Mouse Embryonic Stem Cells/metabolism; Mutation; Animals; Blood Vessels/cytology; Cell Lineage/genetics; Common Cold/genetics; Common Cold/virology; Genes, Essential/genetics; Genetic Testing; HEK293 Cells; Homozygote; Humans; Mice; Mouse Embryonic Stem Cells/cytology; Neovascularization, Physiologic/genetics; Phospholipases A2, Calcium-Independent/genetics; Phospholipases A2, Calcium-Independent/metabolism; Rhinovirus/pathogenicity
ABSTRACT
The maternal mode of mitochondrial DNA (mtDNA) inheritance is central to human genetics. Recently, evidence for bi-parental inheritance of mtDNA was claimed for individuals of three pedigrees that suffered mitochondrial disorders. We sequenced mtDNA using both direct Sanger and Massively Parallel Sequencing in several tissues of eleven maternally related and other affiliated healthy individuals of a family pedigree and observed mixed mitotypes in eight individuals. Cells without nuclear DNA, i.e. thrombocytes and hair shafts, only showed the mitotype of haplogroup (hg) V. Skin biopsies were prepared to generate ρ⁰ cells devoid of mtDNA, sequencing of which resulted in a hg U4c1 mitotype. The position of the Mega-NUMT sequence was determined by fluorescence in situ hybridization, and two different quantitative PCR assays were used to determine the number of contributing mtDNA copies. Thus, evidence for the presence of repetitive, full mitogenome Mega-NUMTs matching haplogroup U4c1 in various tissues of eight maternally related individuals was provided. Multi-copy Mega-NUMTs mimic mixtures of mtDNA that cannot be experimentally avoided and thus may appear in diverse fields of mtDNA research and diagnostics. We demonstrate that hair shaft mtDNA sequencing provides a simple but reliable approach to exclude NUMTs as a source of misleading results.
Subjects
DNA, Mitochondrial; Genome, Human; Cell Nucleus/genetics; DNA Copy Number Variations; Female; Humans; Male; Pedigree; Sequence Analysis, DNA
ABSTRACT
Human RBMY1 genes are located in four variable-sized clusters on the Y chromosome, expressed in male germ cells and possibly associated with sperm motility. We have re-investigated the mutational background and evolutionary history of the RBMY1 copy number distribution in worldwide samples and its relevance to sperm parameters in an Estonian cohort of idiopathic male factor infertility subjects. We estimated approximate RBMY1 copy numbers in 1218 1000 Genomes Project phase 3 males from sequencing read-depth, then chose 14 for validation by multicolour fibre-FISH. These fibre-FISH samples provided accurate calibration standards for the entire panel and led to detailed insights into population variation and mutational mechanisms. RBMY1 copy number worldwide ranged from 3 to 13 with a mode of 8. The two larger proximal clusters were the most variable, and additional duplications, deletions and inversions were detected. Placing the copy number estimates onto the published Y-SNP-based phylogeny of the same samples suggested a minimum of 562 mutational changes, translating to a mutation rate of 2.20 × 10⁻³ (95% CI 1.94 × 10⁻³ to 2.48 × 10⁻³) per father-to-son Y-transmission, higher than many short tandem repeats (Y-STRs), and showed no evidence for selection for increased or decreased copy number, but possible copy number stabilizing selection. An analysis of RBMY1 copy numbers among 376 infertility subjects failed to replicate a previously reported association with sperm motility and showed no significant effect on sperm count and concentration, serum follicle stimulating hormone (FSH), luteinizing hormone (LH) and testosterone levels or testicular and semen volume. These results provide the first in-depth insights into the structural rearrangements underlying RBMY1 copy number variation across diverse human lineages.
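The read-depth step described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that reads from all gene copies pile up on one representative target region and that a single-copy Y-linked control region is available; the function name and the depth values are hypothetical and do not come from the study, which additionally calibrated its estimates against fibre-FISH.

```python
from statistics import median


def estimate_copy_number(target_depths, control_depths, control_copies=1):
    """Estimate copy number as a ratio of median read depths.

    target_depths: per-position depths over the collapsed target region
    control_depths: per-position depths over a region of known copy number
    control_copies: copy number of the control region (1 on the haploid Y)
    """
    control = median(control_depths)
    if control <= 0:
        raise ValueError("control depth must be positive")
    return control_copies * median(target_depths) / control


if __name__ == "__main__":
    # Hypothetical depths chosen only to illustrate the arithmetic.
    estimate = estimate_copy_number([240, 250, 236], [30, 31, 29])
    print(estimate, round(estimate))
```

Medians are used here rather than means so that a few positions with extreme depth do not dominate the ratio; that choice is an assumption of this sketch.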
Subjects
Chromosomes, Human, Y; DNA Copy Number Variations; Evolution, Molecular; Nuclear Proteins/genetics; RNA-Binding Proteins/genetics; Comparative Genomic Hybridization; Genome, Human; Genomics/methods; Humans; In Situ Hybridization, Fluorescence; Male; Multigene Family; Mutation; Phylogeny; Spermatozoa/metabolism
ABSTRACT
Glycophorin A and glycophorin B are red blood cell surface proteins and are both receptors for the parasite Plasmodium falciparum, which is the principal cause of malaria in sub-Saharan Africa. DUP4 is a complex structural genomic variant that carries extra copies of a glycophorin A-glycophorin B fusion gene and dramatically reduces the risk of severe malaria, by up to 40%. Using fiber-FISH and Illumina sequencing, we validate the structural arrangement of the glycophorin locus in the DUP4 variant and reveal somatic variation in copy number of the glycophorin B-glycophorin A fusion gene. By developing a simple, specific, PCR-based assay for DUP4, we show that the DUP4 variant reaches a frequency of 13% in the population of a malaria-endemic village in south-eastern Tanzania. We genotype a substantial proportion of that village and demonstrate an association of DUP4 genotype with hemoglobin levels, a phenotype related to malaria, using a family-based association test. Taken together, we show that DUP4 is a complex structural variant that may be susceptible to somatic variation and that DUP4 is associated with a malaria-related phenotype in a longitudinally followed population.
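As a worked illustration of how a variant frequency such as the 13% reported here relates to genotype counts, the sketch below computes an allele frequency for a diploid locus. It assumes heterozygous and homozygous carriers can be distinguished, which the abstract does not state; the function name and the counts are hypothetical and were chosen only so that the arithmetic yields 0.13.

```python
def variant_allele_frequency(n_noncarrier, n_het, n_hom):
    """Return the variant allele frequency at a diploid locus.

    Each individual contributes two alleles; heterozygotes carry one
    variant allele and homozygotes carry two.
    """
    total_alleles = 2 * (n_noncarrier + n_het + n_hom)
    if total_alleles == 0:
        raise ValueError("at least one genotyped individual is required")
    return (n_het + 2 * n_hom) / total_alleles


if __name__ == "__main__":
    # Hypothetical counts: 76 non-carriers, 22 heterozygotes, 2 homozygotes.
    print(variant_allele_frequency(76, 22, 2))
```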
Subjects
Genomic Structural Variation/genetics; Glycophorins/genetics; Hemoglobins/genetics; Malaria/genetics; Cell Line; Child; Child, Preschool; Erythrocytes/metabolism; Female; Genotype; Humans; Longitudinal Studies; Male; Mosaicism; Phenotype; Plasmodium falciparum/genetics; Tanzania
ABSTRACT
Despite the rapid development of sequencing technologies, the assembly of mammalian-scale genomes into complete chromosomes remains one of the most challenging problems in bioinformatics. To help address this difficulty, we developed Ragout 2, a reference-assisted assembly tool that works for large and complex genomes. By taking one or more target assemblies (generated from an NGS assembler) and one or multiple related reference genomes, Ragout 2 infers the evolutionary relationships between the genomes and builds the final assemblies using a genome rearrangement approach. By using Ragout 2, we transformed NGS assemblies of 16 laboratory mouse strains into sets of complete chromosomes, leaving <5% of sequence unlocalized per set. Various benchmarks, including PCR testing and realigning of long Pacific Biosciences (PacBio) reads, suggest only a small number of structural errors in the final assemblies, comparable with direct assembly approaches. We applied Ragout 2 to the Mus caroli and Mus pahari genomes, which exhibit karyotype-scale variations compared with other genomes from the Muridae family. Chromosome painting maps confirmed most large-scale rearrangements that Ragout 2 detected. We applied Ragout 2 to improve draft sequences of three ape genomes that have recently been published. Ragout 2 transformed three sets of contigs (generated using PacBio reads only) into chromosome-scale assemblies with accuracy comparable to chromosome assemblies generated in the original study using BioNano maps, Hi-C, BAC clones, and FISH.
Subjects
Contig Mapping/methods; Whole Genome Sequencing/methods; Animals; Contig Mapping/standards; Mice; Reference Standards; Whole Genome Sequencing/standards
ABSTRACT
Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology.
Subjects
Evolution, Molecular; Genome/genetics; Muridae/genetics; Phylogeny; Animals; Binding Sites; CCCTC-Binding Factor/genetics; Chromosomes/genetics; Karyotyping/methods; Long Interspersed Nucleotide Elements/genetics; Mice; Retroelements/genetics; Species Specificity
ABSTRACT
Dystroglycan, an extracellular matrix receptor, has essential functions in various tissues. Loss of α-dystroglycan-laminin interaction due to defective glycosylation of α-dystroglycan underlies a group of congenital muscular dystrophies often associated with brain malformations, referred to as dystroglycanopathies. The lack of isogenic human dystroglycanopathy cell models has limited our ability to test potential drugs in a human- and neural-specific context. Here, we generated induced pluripotent stem cells (iPSCs) from a severe dystroglycanopathy patient with homozygous FKRP (fukutin-related protein gene) mutation. We showed that CRISPR/Cas9-mediated gene correction of FKRP restored glycosylation of α-dystroglycan in iPSC-derived cortical neurons, whereas targeted gene mutation of FKRP in wild-type cells disrupted this glycosylation. In parallel, we screened 31,954 small molecule compounds using a mouse myoblast line for increased glycosylation of α-dystroglycan. Using human FKRP-iPSC-derived neural cells for hit validation, we demonstrated that compound 4-(4-bromophenyl)-6-ethylsulfanyl-2-oxo-3,4-dihydro-1H-pyridine-5-carbonitrile (4BPPNit) significantly augmented glycosylation of α-dystroglycan, in part through upregulation of LARGE1 glycosyltransferase gene expression. Together, isogenic human iPSC-derived cells represent a valuable platform for facilitating dystroglycanopathy drug discovery and therapeutic development.
Subjects
Preclinical Drug Evaluation , Dystroglycans/metabolism , Induced Pluripotent Stem Cells/metabolism , Base Sequence , CRISPR-Cas Systems , Cultured Cells , Preclinical Drug Evaluation/methods , Dystroglycans/genetics , Gene Editing , Gene Targeting , Genetic Loci , Glycosylation/drug effects , High-Throughput Nucleotide Sequencing , Humans , Molecular Imaging , Muscular Dystrophies/drug therapy , Muscular Dystrophies/etiology , Muscular Dystrophies/metabolism , Mutation , N-Acetylglucosaminyltransferases/genetics , N-Acetylglucosaminyltransferases/metabolism , Neural Stem Cells/metabolism , Neurons/metabolism , Pentosyltransferases/genetics , Pentosyltransferases/metabolism
ABSTRACT
BACKGROUND: Approximately 5% of the human genome shows common structural variation, which is enriched for genes involved in the immune response and cell-cell interactions. A well-established region of extensive structural variation is the glycophorin gene cluster, comprising three tandemly repeated regions about 120 kb in length and carrying the highly homologous genes GYPA, GYPB and GYPE. Glycophorin A (encoded by GYPA) and glycophorin B (encoded by GYPB) are glycoproteins present at high levels on the surface of erythrocytes, and they have been suggested to act as decoy receptors for viral pathogens. They are receptors for the invasion of the protist parasite Plasmodium falciparum, a causative agent of malaria. A particular complex structural variant, called DUP4, creates a GYPB-GYPA fusion gene known to confer resistance to malaria. Many other structural variants exist across the glycophorin gene cluster, and they remain poorly characterised. RESULTS: Here, we analyse sequences from 3234 diploid genomes from across the world for structural variation at the glycophorin locus, confirming 15 variants in the 1000 Genomes project cohort, discovering 9 new variants, and characterising a selection of these variants using fibre-FISH and breakpoint mapping at the sequence level. We identify variants predicted to create novel fusion genes and a common inversion duplication variant at appreciable frequencies in West Africans. We show that almost all variants can be explained by non-allelic homologous recombination and, by comparing the structural variant breakpoints with recombination hotspot maps, confirm the importance of a particular meiotic recombination hotspot for structural variant formation in this region. CONCLUSIONS: We identify and validate large structural variants in the human glycophorin A-B-E gene cluster which may be associated with different clinical aspects of malaria.