ABSTRACT
Orchidaceae is one of the largest families of angiosperms. Considering the large number of species in this family and its symbiotic relationship with fungi, Orchidaceae provide an ideal model to study the evolution of plant mitogenomes. However, to date, only one draft mitochondrial genome is available for this family. Here, we present a fully assembled and annotated sequence of the mitochondrial genome (mitogenome) of Paphiopedilum micranthum, a species with high economic and ornamental value. The mitogenome of P. micranthum was 447,368 bp in length and comprised 26 circular subgenomes ranging in size from 5,973 bp to 32,281 bp. The genome encoded 39 protein-coding genes of mitochondrial origin; 16 tRNAs (three of plastome origin); three rRNAs; and 16 ORFs, while rpl10 and sdh3 were lost from the mitogenome. Moreover, interorganellar DNA transfer was identified in 14 of the 26 chromosomes. These plastid-derived DNA fragments represented 28.32% (46,273 bp) of the P. micranthum plastome, including 12 intact genes of plastome origin. Remarkably, the mitogenomes of P. micranthum and Gastrodia elata shared 18% (about 81 kb) of their mitochondrial DNA sequences. Additionally, we found a positive correlation between repeat length and recombination frequency. The mitogenome of P. micranthum had more compact and fragmented chromosomes than those of other species with multichromosomal structures. We suggest that repeat-mediated homologous recombination enables the dynamic structure of mitochondrial genomes in Orchidaceae.
Subject(s)
Mitochondrial Genome , Magnoliopsida , Orchidaceae , Mitochondrial DNA , Mitochondria/genetics , Magnoliopsida/genetics , Orchidaceae/genetics , Phylogeny
ABSTRACT
Development of primary mediastinal B-cell lymphoma (PMBL) is driven by cumulative genomic aberrations. We discovered a unique copy-neutral loss of heterozygosity (CN-LOH) landscape of PMBL which distinguishes this tumor from other B-cell malignancies, including the biologically related diffuse large B-cell lymphoma. Using single nucleotide polymorphism array analysis, we identified large-scale CN-LOH lesions in 91% (30/33) of diagnostic PMBLs and in both investigated PMBL-derived cell lines. Altogether, the cohort showed 157 extra-large (25.3-248.4 Mb) CN-LOH lesions affecting up to 14 chromosomes per case (mean of 4.4) and resulting in a reduction of heterozygosity in an average of 9.9% (range 1.3-51%) of the genome. Predominant involvement of terminal chromosomal segments suggests the implication of B-cell specific crossover events in the pathogenesis of PMBL. Notably, CN-LOH stretches non-randomly clustered on 6p (60%), 15 (37.2%), and 17q (40%), and frequently co-occurred with homozygous mutations in the MHC I (6p21), B2M (15q15), and GNA13 (17q23) genes, respectively, as shown by preliminary whole-exome/genome sequencing data. Altogether, our findings implicate CN-LOH as a novel and distinct mutational process contributing to the molecular pathogenesis of PMBL. The aberration, acting as the "second hit" in the Knudson hypothesis, ranks as the major mechanism converting PMBL-related driver genes to homozygosity. Screening a cohort of 199 B-cell leukemia/lymphoma whole genomes revealed significant differences in the CN-LOH landscape of PMBL and other B-cell malignancies, including the biologically related diffuse large B-cell lymphoma.
Subject(s)
Diffuse Large B-Cell Lymphoma , Mediastinal Neoplasms , Genomics , Humans , Loss of Heterozygosity , Diffuse Large B-Cell Lymphoma/diagnosis , Mediastinal Neoplasms/genetics , Mutation
ABSTRACT
The evolution of next-generation sequencing (NGS) technology has led to the development of many different assembly algorithms, but few of them focus on assembling organelle genomes. These genomes are used in phylogenetic studies and food identification, and are the most deposited eukaryotic genomes in GenBank. Producing an organelle genome assembly from whole genome sequencing (WGS) data would be the most accurate and least laborious approach, but a tool specifically designed for this task is lacking. We developed a seed-and-extend algorithm that assembles organelle genomes from WGS data, starting from a related or distant single seed sequence. The algorithm has been tested on several new (Gonioctena intermedia and Avicennia marina) and public (Arabidopsis thaliana and Oryza sativa) whole genome Illumina data sets, where it outperforms known assemblers in assembly accuracy and coverage. In our benchmark, NOVOPlasty assembled all tested circular genomes in less than 30 min with a maximum memory requirement of 16 GB and an accuracy over 99.99%. In conclusion, NOVOPlasty is the sole de novo assembler that provides a fast and straightforward extraction of the extranuclear genomes from WGS data in one circular high quality contig. The software is open source and can be downloaded at https://github.com/ndierckx/NOVOPlasty.
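The general seed-and-extend idea named in this abstract can be illustrated with a toy sketch. This is not NOVOPlasty's implementation: the function name `extend_seed`, the parameters `k`, `min_support` and `max_len`, and the majority-vote rule are all assumptions chosen for illustration, and the sketch only extends to the right on error-free reads.

```python
from collections import Counter


def extend_seed(seed, reads, k=12, min_support=2, max_len=10_000):
    """Greedily extend `seed` to the right, one base at a time.

    Hypothetical toy: reads that contain the last k bases of the contig
    vote on the next base. Extension stops on missing support, on a tie,
    or when the contig's tail matches its own start (circularization).
    Returns (contig, is_circular).
    """
    contig = seed
    while len(contig) < max_len:
        anchor = contig[-k:]
        votes = Counter()
        for read in reads:
            pos = read.find(anchor)
            while pos != -1:
                nxt = pos + k
                if nxt < len(read):
                    votes[read[nxt]] += 1
                pos = read.find(anchor, pos + 1)
        if not votes:
            break
        (base, count), *rest = votes.most_common(2)
        # Stop on weak support or an ambiguous (tied) extension.
        if count < min_support or (rest and rest[0][1] == count):
            break
        contig += base
        # Circular genome: the tail has wrapped around to the start.
        if len(contig) > 2 * k and contig[-k:] == contig[:k]:
            return contig[:-k], True
    return contig, False


# Toy circular "genome" in which every 2-mer (and longer k-mer) is unique.
genome = "AACAGATCCGCTGGTT"
doubled = genome + genome
reads = [doubled[i:i + 8] for i in range(len(genome))]
contig, is_circular = extend_seed("AACAG", reads, k=4)
```

A real assembler also has to cope with sequencing errors, repeats, and extension in both directions, none of which this sketch attempts.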
Subject(s)
Algorithms , Chloroplast Genome , Mitochondrial Genome , Whole Genome Sequencing/methods , Animals , Avicennia/genetics , Coleoptera/genetics , Software
ABSTRACT
COLOMBOS is a database that integrates publicly available transcriptomics data for several prokaryotic model organisms. Compared to the previous version, it has more than doubled in size, both in terms of species and data available. The manually curated condition annotation has been overhauled as well, giving more complete information about samples' experimental conditions and their differences. In terms of functionality, cross-species analyses now enable users to analyse expression data for all species simultaneously and to identify candidate genes with evolutionarily conserved expression behaviour. All the expression-based query tools have been substantially improved, overcoming the limit of enforced co-expression data retrieval and instead enabling the return of more complex patterns of expression behaviour. COLOMBOS is freely available through a web application at http://colombos.net/. The complete database is also accessible via REST API or downloadable as tab-delimited text files.
Subject(s)
Genetic Databases , Gene Expression Profiling , Archaea/genetics , Archaea/metabolism , Bacteria/genetics , Bacteria/metabolism , Oligonucleotide Array Sequence Analysis , RNA Sequence Analysis , Software
ABSTRACT
Oikopleura dioica is a planktonic tunicate (Appendicularia class) found extensively across the marine waters of the globe. The genome of a single male individual collected from Okinawa, Japan was sequenced using the single-molecule PacBio Hi-Fi method and assembled with NOVOLoci. The mitogenome is 39,268 bp long, featuring a large control region of around 22,000 bp. We annotated the proteins atp6, cob, cox1, cox2, cox3, nad1, nad4, and nad5, and found one more open reading frame that did not match any known gene. This study marks the first complete mitogenome assembly for an appendicularian, and reveals that A and T homopolymers cumulatively account for nearly half of its length. This reference sequence will be an asset for environmental DNA and phylogenetic studies.
Subject(s)
Mitochondrial Genome , Urochordata , Animals , Urochordata/genetics , Male , Phylogeny
ABSTRACT
The 22q11.2 deletion syndrome (22q11.2DS) is the most common microdeletion disorder. Why the incidence of 22q11.2DS is much greater than that of other genomic disorders remains unknown. Short read sequencing cannot resolve the complex segmental duplicon structure to provide direct confirmation of the hypothesis that the rearrangements are caused by non-allelic homologous recombination between the low copy repeats on chromosome 22 (LCR22s). To enable haplotype-specific assembly and rearrangement mapping in LCR22 regions, we combined fiber-FISH optical mapping with whole genome (ultra-)long read sequencing or rearrangement-specific long-range PCR on 24 duos (22q11.2DS patient and parent-of-origin) comprising several different LCR22-mediated rearrangements. Unexpectedly, we demonstrate that not only different paralogous segmental duplicons but also palindromic AT-rich repeats (PATRRs) are driving 22q11.2 rearrangements. In addition, we show the existence of two different inversion polymorphisms preceding rearrangement, and somatic mosaicism. The existence of different recombination sites and mechanisms in paralogues and PATRRs, which are expanding in copy number in the human population, is a likely explanation for the high 22q11.2DS incidence.
ABSTRACT
BACKGROUND: During normal zygotic division, two haploid parental genomes replicate, unite and segregate into two biparental diploid blastomeres. RESULTS: Contrary to this fundamental biological tenet, we demonstrate here that parental genomes can segregate to distinct blastomeres during the zygotic division, resulting in haploid or uniparental diploid and polyploid cells, a phenomenon termed heterogoneic division. By mapping the genomic landscape of 82 blastomeres from 25 bovine zygotes, we show that multipolar zygotic division is a tell-tale sign of whole-genome segregation errors. Based on the haplotypes and live-imaging of zygotic divisions, we demonstrate that various combinations of androgenetic, gynogenetic, diploid, and polyploid blastomeres arise via distinct parental genome segregation errors, including the formation of additional paternal, private parental, or tripolar spindles, or by extrusion of paternal genomes. Hence, we provide evidence that private parental spindles, if failing to congress before anaphase, can lead to whole-genome segregation errors. In addition, anuclear blastomeres are common, indicating that cytokinesis can be uncoupled from karyokinesis. Dissociation of blastocyst-stage embryos further demonstrates that whole-genome segregation errors might lead to mixoploid or chimeric development in both human and cow. Yet, following multipolar zygotic division, fewer embryos reach the blastocyst stage and diploidization occurs frequently, indicating that, alternatively, blastomeres with genome-wide errors resulting from whole-genome segregation errors can be selected against or contribute to embryonic arrest. CONCLUSIONS: Heterogoneic zygotic division provides an overarching paradigm for the development of mixoploid and chimeric individuals and moles, and can be an important cause of embryonic and fetal arrest following natural conception or IVF.
Subject(s)
Blastomeres , Zygote , Animals , Blastocyst , Cattle , Female , Genome , Humans , Mitosis
ABSTRACT
Accurate simulations of structural variation distributions and sequencing data are crucial for the development and benchmarking of new tools. We develop Sim-it, a straightforward tool for the simulation of both structural variation and long-read data. These simulations from Sim-it reveal the strengths and weaknesses of currently available structural variation callers and long-read sequencing platforms. With these findings, we develop a new method (combiSV) that can combine the results from structural variation callers into a superior call set with increased recall and precision, which is also observed for the latest structural variation benchmark set developed by the GIAB Consortium.
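The idea of combining several callers' outputs into one call set can be sketched with a simple voting scheme. This is only an illustration under stated assumptions, not combiSV's actual merging rules: the `SVCall` record, the 500 bp `window`, and the "at least two distinct callers" threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVCall:
    caller: str   # name of the tool that produced the call (hypothetical field)
    chrom: str
    pos: int      # breakpoint start position
    svtype: str   # e.g. "DEL", "INS", "INV"


def merge_calls(calls, window=500, min_callers=2):
    """Cluster calls of the same type whose positions chain within `window` bp,
    then keep clusters supported by at least `min_callers` distinct callers."""
    clusters = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.svtype, c.pos)):
        last = clusters[-1] if clusters else None
        if (last
                and last[-1].chrom == call.chrom
                and last[-1].svtype == call.svtype
                and call.pos - last[-1].pos <= window):
            last.append(call)  # single-linkage: compare with previous call only
        else:
            clusters.append([call])

    consensus = []
    for cluster in clusters:
        callers = sorted({c.caller for c in cluster})
        if len(callers) >= min_callers:
            positions = sorted(c.pos for c in cluster)
            median_pos = positions[len(positions) // 2]
            consensus.append((cluster[0].chrom, median_pos, cluster[0].svtype, callers))
    return consensus


calls = [
    SVCall("a", "chr1", 1000, "DEL"),
    SVCall("b", "chr1", 1100, "DEL"),
    SVCall("c", "chr1", 1050, "DEL"),
    SVCall("a", "chr1", 5000, "INS"),   # single caller: dropped
    SVCall("b", "chr2", 1000, "DEL"),   # single caller: dropped
]
merged = merge_calls(calls)
```

Raising `min_callers` trades recall for precision; a production method would also compare SV lengths and weight callers by their measured performance.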
Subject(s)
Benchmarking , Computer Simulation , Human Genome , Sequence Analysis , Genomics , Humans , Nanopore Sequencing , Software
ABSTRACT
Segmental duplications or low copy repeats (LCRs) constitute duplicated regions interspersed in the human genome, currently neglected in standard analyses due to their extreme complexity. Recent functional studies have indicated the potential of genes within LCRs in synaptogenesis, neuronal migration, and neocortical expansion in the human lineage. One of the regions with the highest proportion of duplicated sequence is the 22q11.2 locus, carrying eight LCRs (LCR22-A through LCR22-H), and rearrangements between them cause the 22q11.2 deletion syndrome. The LCR22-A block was recently reported to be hypervariable in the human population. It remains unknown whether this variability also exists in non-human primates, since research is strongly hampered by the presence of sequence gaps in the human and non-human primate reference genomes. To chart the LCR22 haplotypes and the associated inter- and intra-species variability, we de novo assembled the region in non-human primates by a combination of optical mapping techniques. A minimal and likely ancient haplotype is present in the chimpanzee, bonobo, and rhesus monkey without intra-species variation. In addition, the optical maps identified assembly errors and closed gaps in the orthologous chromosome 22 reference sequences. These findings indicate the LCR22 expansion to be unique to the human population, which might indicate involvement of the region in human evolution and adaptation. These maps will enable LCR22-specific functional studies and the investigation of potential associations with the phenotypic variability in the 22q11.2 deletion syndrome.
ABSTRACT
Heteroplasmy, the existence of multiple mitochondrial haplotypes within an individual, has been studied across different scientific fields. Mitochondrial genome polymorphisms have been linked to multiple severe disorders and are of interest to evolutionary studies and forensic science. Before the development of massive parallel sequencing (MPS), most studies of mitochondrial genome variation were limited to short fragments and to heteroplasmic variants associated with a relatively high frequency (>10%). By utilizing ultra-deep sequencing, it has now become possible to uncover previously undiscovered patterns of intra-individual polymorphisms. Despite these technological advances, it is still challenging to determine the origin of the observed intra-individual polymorphisms. We therefore developed a new method that not only detects intra-individual polymorphisms within mitochondrial and chloroplast genomes more accurately, but also looks for linkage among polymorphic sites by assembling the sequence around each detected polymorphic site. Our benchmark study shows that this method is capable of detecting heteroplasmy more accurately than any method previously available and is the first tool that is able to completely or partially reconstruct the sequence for each mitochondrial haplotype (allele). The method is implemented in our open source software NOVOPlasty that can be downloaded at https://github.com/ndierckx/NOVOPlasty.
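The two steps described above, detecting intra-individual polymorphic sites and then looking for linkage between them, can be sketched on a toy pileup. This is a minimal illustration, not NOVOPlasty's method: it assumes error-free, gap-free reads already aligned to a reference as `(start, sequence)` pairs, and the names `call_heteroplasmy`, `linked_haplotypes`, `min_depth` and `min_maf` are hypothetical.

```python
from collections import Counter, defaultdict


def call_heteroplasmy(reads, min_depth=10, min_maf=0.01):
    """Return {position: (major_base, minor_base, minor_allele_frequency)}
    for sites where a second allele reaches `min_maf` at sufficient depth."""
    pileup = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pileup[start + offset][base] += 1

    sites = {}
    for pos, counts in sorted(pileup.items()):
        depth = sum(counts.values())
        if depth < min_depth or len(counts) < 2:
            continue
        (major, _), (minor, minor_n) = counts.most_common(2)
        maf = minor_n / depth
        if maf >= min_maf:
            sites[pos] = (major, minor, maf)
    return sites


def linked_haplotypes(reads, pos_a, pos_b):
    """Count allele pairs seen on reads that span both positions;
    pairs that always co-occur suggest the sites sit on one haplotype."""
    combos = Counter()
    for start, seq in reads:
        end = start + len(seq)
        if start <= pos_a < end and start <= pos_b < end:
            combos[(seq[pos_a - start], seq[pos_b - start])] += 1
    return combos


# Toy data: 8 reads of the major haplotype, 2 of a minor one that differs
# at positions 1 and 6.
reads = [(0, "ACGTACGT")] * 8 + [(0, "ATGTACCT")] * 2
sites = call_heteroplasmy(reads, min_depth=10, min_maf=0.1)
links = linked_haplotypes(reads, 1, 6)
```

With real data the frequency threshold has to sit above the sequencing error rate, and linkage can only be read directly from reads or read pairs that span both sites.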