Results 1 - 20 of 59
1.
bioRxiv ; 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38659906

ABSTRACT

Structural variants (SVs) contribute significantly to human genetic diversity and disease [1-4]. Previously, SVs have remained incompletely resolved by population genomics, with short-read sequencing facing limitations in capturing the whole spectrum of SVs at nucleotide resolution [5-7]. Here we leveraged nanopore sequencing [8] to construct an intermediate-coverage resource of 1,019 long-read genomes sampled within 26 human populations from the 1000 Genomes Project. By integrating linear and graph-based approaches for SV analysis via pangenome graph-augmentation, we uncover 167,291 sequence-resolved SVs in these samples, considerably advancing SV characterization compared to population-wide short-read sequencing studies [3,4]. Our analysis details diverse SV classes (deletions, duplications, insertions, and inversions) at population scale. LINE-1 and SVA retrotransposition activities frequently mediate transductions [9,10] of unique sequences, with both mobile element classes transducing sequences at either the 3'- or 5'-end, depending on the source element locus. Furthermore, analyses of SV breakpoint junctions suggest a continuum of homology-mediated rearrangement processes are integral to SV formation, and highlight evidence for SV recurrence involving repeat sequences. Our open-access dataset underscores the transformative impact of long-read sequencing in advancing the characterisation of polymorphic genomic architectures, and provides a resource for guiding variant prioritisation in future long-read sequencing-based disease studies.

2.
Infect Genet Evol ; 119: 105577, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38403035

ABSTRACT

In January 2021, the monitoring of circulating variants of SARS-CoV-2 was initiated in Germany under the Corona Surveillance Act, which was discontinued after July 2023. This initiative aimed to enhance pandemic containment, as specific amino acid changes, particularly in the spike protein, were associated with increased transmission and reduced vaccine efficacy. Our group conducted whole genome sequencing using the ARTIC protocol (currently V4) on Illumina's NextSeq 500 platform (and, starting in May 2023, on the MiSeq DX platform) for SARS-CoV-2-positive specimens from patients at Heidelberg University Hospital, associated hospitals, and the public health office in the Rhine-Neckar/Heidelberg region. In total, we sequenced 26,795 SARS-CoV-2-positive samples between January 2021 and July 2023. Valid sequences, meeting the requirements for upload to the German electronic sequencing data hub (DESH) operated by the Robert Koch Institute (RKI), were determined for 24,852 samples, and the lineage/clade could be identified for 25,912 samples. The year 2021 witnessed significant dynamics in the circulating variants in the Rhine-Neckar/Heidelberg region, including A.27.RN, followed by the emergence of B.1.1.7 (Alpha), subsequently displaced by B.1.617.2 (Delta), and the initial occurrences of B.1.1.529 (Omicron). By January 2022, B.1.1.529 had superseded B.1.617.2, dominating with over 90%. The years 2022 and 2023 were then characterized by the dominance of B.1.1.529 and its sublineages, particularly BA.5 and BA.2, and more recently, the emergence of recombinant variants like XBB.1.5. Since the global dominance of B.1.617.2, the identified variant distribution in our local study, apart from a time delay in the spread of new variants, can be considered largely representative of the global distribution.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , Germany/epidemiology , Hospitals, University
3.
Cell Genom ; 3(4): 100281, 2023 Apr 12.
Article in English | MEDLINE | ID: mdl-37082141

ABSTRACT

Cancer genomes harbor a broad spectrum of structural variants (SVs) driving tumorigenesis, a relevant subset of which escape discovery using short-read sequencing. We employed Oxford Nanopore Technologies (ONT) long-read sequencing in a paired diagnostic and post-therapy medulloblastoma to unravel the haplotype-resolved somatic genetic and epigenetic landscape. We assembled complex rearrangements, including a 1.55-Mbp chromothripsis event, and we uncover a complex SV pattern termed templated insertion (TI) thread, characterized by short (mostly <1 kb) insertions showing prevalent self-concatenation into highly amplified structures of up to 50 kbp in size. TI threads occur in 3% of cancers, with a prevalence up to 74% in liposarcoma, and frequent colocalization with chromothripsis. We also perform long-read-based methylome profiling and discover allele-specific methylation (ASM) effects, complex rearrangements exhibiting differential methylation, and differential promoter methylation in cancer-driver genes. Our study shows the advantage of long-read sequencing in the discovery and characterization of complex somatic rearrangements.

5.
Nat Biotechnol ; 41(6): 832-844, 2023 06.
Article in English | MEDLINE | ID: mdl-36424487

ABSTRACT

Somatic structural variants (SVs) are widespread in cancer, but their impact on disease evolution is understudied due to a lack of methods to directly characterize their functional consequences. We present a computational method, scNOVA, which uses Strand-seq to perform haplotype-aware integration of SV discovery and molecular phenotyping in single cells by using nucleosome occupancy to infer gene expression as a readout. Application to leukemias and cell lines identifies local effects of copy-balanced rearrangements on gene deregulation, and consequences of SVs on aberrant signaling pathways in subclones. We discovered distinct SV subclones with dysregulated Wnt signaling in a chronic lymphocytic leukemia patient. We further uncovered the consequences of subclonal chromothripsis in T cell acute lymphoblastic leukemia, which revealed c-Myb activation, enrichment of a primitive cell state and informed successful targeting of the subclone in cell culture, using a Notch inhibitor. By directly linking SVs to their functional effects, scNOVA enables systematic single-cell multiomic studies of structural variation in heterogeneous cell populations.


Subject(s)
Chromothripsis , Leukemia , Neoplasms , Humans , Neoplasms/genetics , Leukemia/genetics , Gene Rearrangement , Cell Line , Genomic Structural Variation
6.
Leukemia ; 36(7): 1759-1768, 2022 07.
Article in English | MEDLINE | ID: mdl-35585141

ABSTRACT

The mechanisms underlying T-ALL relapse remain essentially unknown. Multilevel-omics in 38 matched pairs of initial and relapsed T-ALL revealed 18 (47%) type-1 (defined by being derived from the major ancestral clone) and 20 (53%) type-2 relapses (derived from a minor ancestral clone). In both types of relapse, we observed known and novel drivers of multidrug resistance including MDR1 and MVP, NT5C2 and JAK-STAT activators. Patients with type-1 relapses were specifically characterized by IL7R upregulation. In remarkable contrast, type-2 relapses demonstrated (1) enrichment of constitutional cancer predisposition gene mutations, (2) divergent genetic and epigenetic remodeling, and (3) enrichment of somatic hypermutator phenotypes, related to BLM, BUB1B/PMS2 and TP53 mutations. T-ALLs that later progressed to type-2 relapses exhibited a complex subclonal architecture, unexpectedly, already at the time of initial diagnosis. Deconvolution analysis of ATAC-Seq profiles showed that T-ALLs later developing into type-1 relapses resembled a predominant immature thymic T-cell population, whereas T-ALLs developing into type-2 relapses resembled a mixture of normal T-cell precursors. In sum, our analyses revealed fundamentally different mechanisms driving either type-1 or type-2 T-ALL relapse and indicate that differential capacities of disease evolution are already inherent to the molecular setup of the initial leukemia.


Subject(s)
Precursor T-Cell Lymphoblastic Leukemia-Lymphoma , Child , Clonal Evolution/genetics , Humans , Mutation , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/metabolism , Recurrence
7.
Am J Transplant ; 22(7): 1873-1883, 2022 07.
Article in English | MEDLINE | ID: mdl-35384272

ABSTRACT

Seroconversion after COVID-19 vaccination is impaired in kidney transplant recipients. Emerging variants of concern such as the B.1.617.2 (delta) and the B.1.1.529 (omicron) variants pose an increasing threat to these patients. In this observational cohort study, we measured anti-S1 IgG, surrogate neutralizing, and anti-receptor-binding domain antibodies three weeks after a third mRNA vaccine dose in 49 kidney transplant recipients and compared results to 25 age-matched healthy controls. In addition, vaccine-induced neutralization of SARS-CoV-2 wild-type, the B.1.617.2 (delta), and the B.1.1.529 (omicron) variants was assessed using a live-virus assay. After a third vaccine dose, anti-S1 IgG, surrogate neutralizing, and anti-receptor-binding domain antibodies were significantly lower in kidney transplant recipients compared to healthy controls. Only 29/49 (59%) sera of kidney transplant recipients contained neutralizing antibodies against the SARS-CoV-2 wild-type or the B.1.617.2 (delta) variant and neutralization titers were significantly reduced compared to healthy controls (p < 0.001). Vaccine-induced cross-neutralization of the B.1.1.529 (omicron) variants was detectable in 15/35 (43%) kidney transplant recipients with seropositivity for anti-S1 IgG, surrogate neutralizing, and/or anti-RBD antibodies. Neutralization of the B.1.1.529 (omicron) variants was significantly reduced compared to neutralization of SARS-CoV-2 wild-type or the B.1.617.2 (delta) variant for both, kidney transplant recipients and healthy controls (p < .001 for all).


Subject(s)
COVID-19 , Kidney Transplantation , Antibodies, Neutralizing , Antibodies, Viral , COVID-19/prevention & control , COVID-19 Vaccines , Humans , Immunoglobulin G , RNA, Messenger , SARS-CoV-2 , Transplant Recipients , Vaccines, Synthetic , Viral Envelope Proteins/genetics , mRNA Vaccines
8.
Nat Genet ; 54(4): 518-525, 2022 04.
Article in English | MEDLINE | ID: mdl-35410384

ABSTRACT

Typical genotyping workflows map reads to a reference genome before identifying genetic variants. Generating such alignments introduces reference biases and comes with substantial computational burden. Furthermore, short-read lengths limit the ability to characterize repetitive genomic regions, which are particularly challenging for fast k-mer-based genotypers. In the present study, we propose a new algorithm, PanGenie, that leverages a haplotype-resolved pangenome reference together with k-mer counts from short-read sequencing data to genotype a wide spectrum of genetic variation, a process we refer to as genome inference. Compared with mapping-based approaches, PanGenie is more than 4 times faster at 30-fold coverage and achieves better genotype concordances for almost all variant types and coverages tested. Improvements are especially pronounced for large insertions (≥50 bp) and variants in repetitive regions, enabling the inclusion of these classes of variants in genome-wide association studies. PanGenie efficiently leverages the increasing amount of haplotype-resolved assemblies to unravel the functional impact of previously inaccessible variants while being faster compared with alignment-based workflows.
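The k-mer counts mentioned in this abstract can be illustrated with a minimal sketch. This is not PanGenie's algorithm or code; it only shows the alignment-free counting step in its simplest form. The function name `count_kmers` and the toy reads are assumptions made for illustration, and the sketch ignores reverse complements and sequencing errors.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across a collection of reads.

    Hypothetical helper for illustration only: no canonical k-mers,
    no error filtering, no pangenome reference.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    counts = Counter()
    for read in reads:
        # Slide a window of width k along the read.
        for start in range(len(read) - k + 1):
            counts[read[start:start + k]] += 1
    return counts


# Toy input: one read, k = 2 gives the windows AC, CG, GA, AC.
toy_counts = count_kmers(["ACGAC"], 2)
```

A genotyper built on such counts would compare the observed abundance of k-mers that are unique to each allele, which is why no read alignment is needed for this step.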


Subject(s)
Genetic Variation , Genome, Human , Genomics , Algorithms , Genome, Human/genetics , Genome-Wide Association Study , Genomics/methods , Genotype , High-Throughput Nucleotide Sequencing , Humans , Sequence Analysis, DNA
10.
J Pers Med ; 11(12)2021 Nov 25.
Article in English | MEDLINE | ID: mdl-34945722

ABSTRACT

The heritable component of schizophrenia (SCH) as a polygenic trait is represented by numerous variants from a heterogeneous group of genes each contributing a relatively small effect. Various SNPs have already been found and analyzed in genes encoding the NMDAR subunits. However, less is known about genetic variations of genes encoding the AMPA and kainate receptor subunits. We analyzed sixteen iGluR genes in full length to determine the sequence variability of iGluR genes. Our aim was to describe the rate of genetic variability, its distribution, and the co-occurrence of variants and to identify new candidate risk variants or haplotypes. The cumulative effect of genetic risk was then estimated using a simple scoring model. GRIN2A-B, GRIN3A-B, and GRIK4 genes showed significantly increased genetic variation in SCH patients. The fixation index statistic revealed eight intronic haplotypes and an additional four intronic SNPs within the sequences of iGluR genes associated with SCH (p < 0.05). The haplotypes were used in the proposed simple scoring model and moreover as a test for genetic predisposition to schizophrenia. The positive likelihood ratio for the scoring model test reached 7.11. We also observed 41 protein-altering variants (38 missense variants, four frameshifts, and one nonsense variant) that were not significantly associated with SCH. Our data suggest that some intronic regulatory regions of iGluR genes and their common variability are among the components from which the genetic predisposition to SCH is composed.
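The positive likelihood ratio reported in this abstract is, by its standard definition, sensitivity divided by the false-positive rate (1 - specificity). The sketch below shows that arithmetic only. The function name and the confusion-matrix counts are hypothetical and are not taken from the study, which does not report them in the abstract.

```python
def positive_likelihood_ratio(tp, fn, fp, tn):
    """Return LR+ = sensitivity / (1 - specificity) from confusion-matrix counts.

    tp/fn: cases the test flags / misses; fp/tn: controls flagged / cleared.
    """
    sensitivity = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    if false_positive_rate == 0:
        raise ValueError("LR+ is undefined when there are no false positives")
    return sensitivity / false_positive_rate


# Hypothetical counts, chosen only to make the arithmetic easy to follow:
# sensitivity = 80/100 = 0.8, false-positive rate = 10/100 = 0.1.
example_lr = positive_likelihood_ratio(tp=80, fn=20, fp=10, tn=90)
```

An LR+ well above 1 means a positive test result is correspondingly more likely in affected than in unaffected individuals.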

11.
Mol Oncol ; 15(12): 3363-3384, 2021 12.
Article in English | MEDLINE | ID: mdl-34328665

ABSTRACT

The paucity of microbiome studies at intestinal tissues has contributed to a yet limited understanding of potential viral and bacterial cofactors of colorectal cancer (CRC) carcinogenesis or progression. We analysed whole-genome sequences of CRC primary tumours, their corresponding metastases and matched normal tissue for sequences of viral, phage and bacterial species. Bacteriome analysis showed Fusobacterium nucleatum, Streptococcus sanguinis, F. hwasookii, Anaerococcus mediterraneensis and further species enriched in primary CRCs. The primary CRC of one patient was enriched for F. alocis, S. anginosus, Parvimonas micra and Gemella sp. 948. Enrichment of Escherichia coli strains IAI1, SE11, K-12 and M8 was observed in metastases together with coliphages enterobacteria phage φ80 and Escherichia phage VT2φ_272. Virome analysis showed that phages were the most preponderant viral species (46%), the main families being Myoviridae, Siphoviridae and Podoviridae. Primary CRCs were enriched for bacteriophages, showing five phages (Enterobacteria, Bacillus, Proteus, Streptococcus phages) together with their pathogenic hosts in contrast to normal tissues. The most frequently detected, and BLAST-confirmed, viruses included human endogenous retrovirus K113, human herpesviruses 7 and 6B, Megavirus chilensis, cytomegalovirus (CMV) and Epstein-Barr virus (EBV), with one patient showing EBV enrichment in primary tumour and metastases. EBV was PCR-validated in 80 pairs of CRC primary tumour and their corresponding normal tissues; in 21 of these pairs (26.3%), it was detectable in primary tumours only. The number of viral species was increased and bacterial species decreased in CRCs compared with normal tissues, and we could discriminate primary CRCs from metastases and normal tissues by applying the Hutcheson t-test on the Shannon indices based on viral and bacterial species. Taken together, our results descriptively support hypotheses on microorganisms as potential (co)risk factors of CRC and extend putative suggestions on critical microbiome species in CRC metastasis.
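The comparison of Shannon indices with the Hutcheson t-test named in this abstract can be sketched as follows. This uses the textbook form of the Shannon index and of Hutcheson's variance approximation, not the authors' pipeline. The function names and the species counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over species with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def shannon_variance(counts):
    """Hutcheson's approximate variance of H for one sample."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    sum_p_ln_sq = sum(p * math.log(p) ** 2 for p in props)
    sum_p_ln = sum(p * math.log(p) for p in props)
    return (sum_p_ln_sq - sum_p_ln ** 2) / total + (len(props) - 1) / (2 * total ** 2)


def hutcheson_t(counts_a, counts_b):
    """Return (t statistic, approximate degrees of freedom) for H_a versus H_b."""
    var_a, var_b = shannon_variance(counts_a), shannon_variance(counts_b)
    t = (shannon_index(counts_a) - shannon_index(counts_b)) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / sum(counts_a) + var_b ** 2 / sum(counts_b)
    )
    return t, df


# Hypothetical read counts per species in two tissue samples.
even_sample = [10, 10]    # two equally abundant species
skewed_sample = [19, 1]   # one dominant species
t_stat, deg_freedom = hutcheson_t(even_sample, skewed_sample)
```

The resulting statistic would then be compared against a t distribution with the returned degrees of freedom; that lookup is left out here to keep the sketch free of external dependencies.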


Subject(s)
Colorectal Neoplasms , Epstein-Barr Virus Infections , Microbiota , Colorectal Neoplasms/genetics , Herpesvirus 4, Human , Humans , Risk Factors
12.
Blood Cancer J ; 11(5): 102, 2021 05 26.
Article in English | MEDLINE | ID: mdl-34039950

ABSTRACT

Epstein-Barr virus (EBV)-associated diffuse large B-cell lymphoma not otherwise specified (DLBCL NOS) constitute a distinct clinicopathological entity in the current World Health Organization (WHO) classification. However, its genomic features remain sparsely characterized. Here, we combine whole-genome sequencing (WGS), targeted amplicon sequencing (tNGS), and fluorescence in situ hybridization (FISH) from 47 EBV + DLBCL (NOS) cases to delineate the genomic landscape of this rare disease. Integrated WGS and tNGS analysis clearly distinguished this tumor type from EBV-negative DLBCL due to frequent mutations in ARID1A (45%), KMT2A/KMT2D (32/30%), ANKRD11 (32%), or NOTCH2 (32%). WGS uncovered structural aberrations including 6q deletions (5/8 patients), which were subsequently validated by FISH (14/32 cases). Expanding on previous reports, we identified recurrent alterations in CCR6 (15%), DAPK1 (15%), TNFRSF21 (13%), CCR7 (11%), and YY1 (6%). Lastly, functional annotation of the mutational landscape by sequential gene set enrichment and network propagation predicted an effect on the nuclear factor κB (NFκB) pathway (CSNK2A2, CARD10), IL6/JAK/STAT (SOCS1/3, STAT3), and WNT signaling (FRAT1, SFRP5) alongside aberrations in immunological processes, such as interferon response. This first comprehensive description of EBV + DLBCL (NOS) tumors substantiates the evidence of its pathobiological independence and helps stratify the molecular taxonomy of aggressive lymphomas in the effort for future therapeutic strategies.


Subject(s)
Epstein-Barr Virus Infections/complications , Lymphoma, Large B-Cell, Diffuse/genetics , Lymphoma, Large B-Cell, Diffuse/virology , Adult , Aged , Aged, 80 and over , Chromosome Aberrations , Female , Gene Regulatory Networks , Herpesvirus 4, Human/isolation & purification , High-Throughput Nucleotide Sequencing , Humans , In Situ Hybridization, Fluorescence , Male , Middle Aged , Mutation , Whole Genome Sequencing , Young Adult
13.
Science ; 372(6537)2021 04 02.
Article in English | MEDLINE | ID: mdl-33632895

ABSTRACT

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.
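The contiguity figure quoted in this abstract (the minimum contig length needed to cover 50% of the genome) can be illustrated with a short sketch. The function name `n50`, the optional `genome_size` argument, and the toy contig lengths are assumptions for illustration and do not come from the paper. When `genome_size` is omitted the sketch uses the total assembled length as the denominator, which may differ from the convention the authors used.

```python
def n50(contig_lengths, genome_size=None):
    """Return the length L of the contig at which the cumulative length of
    contigs sorted longest-first reaches half of the target size.

    The target is `genome_size` if given, else the sum of all contig lengths.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    target = (genome_size if genome_size is not None else sum(lengths)) / 2
    covered = 0
    for length in lengths:
        covered += length
        if covered >= target:
            return length
    raise ValueError("contigs cover less than half of genome_size")


# Toy assembly: total length 30, so half is 15; 10 + 8 = 18 reaches it.
toy_n50 = n50([10, 8, 6, 4, 2])
# Same contigs against an assumed genome size of 40: half is 20; 10 + 8 + 6 = 24.
toy_ng50 = n50([10, 8, 6, 4, 2], genome_size=40)
```

A larger value means fewer, longer contigs are needed to span half the genome, which is why the metric is used as a summary of assembly contiguity.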


Subject(s)
Genetic Variation , Genome, Human , Haplotypes , Female , Genotype , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Interspersed Repetitive Sequences , Male , Population Groups/genetics , Quantitative Trait Loci , Retroelements , Sequence Analysis, DNA , Sequence Inversion , Whole Genome Sequencing
14.
Gigascience ; 9(10)2020 10 07.
Article in English | MEDLINE | ID: mdl-33034633

ABSTRACT

BACKGROUND: Tandem repeat sequences are widespread in the human genome, and their expansions cause multiple repeat-mediated disorders. Genome-wide discovery approaches are needed to fully elucidate their roles in health and disease, but resolving tandem repeat variation accurately remains a challenging task. While traditional mapping-based approaches using short-read data have severe limitations in the size and type of tandem repeats they can resolve, recent third-generation sequencing technologies exhibit substantially higher sequencing error rates, which complicates repeat resolution. RESULTS: We developed TRiCoLOR, a freely available tool for tandem repeat profiling using error-prone long reads from third-generation sequencing technologies. The method can identify repetitive regions in sequencing data without a prior knowledge of their motifs or locations and resolve repeat multiplicity and period size in a haplotype-specific manner. The tool includes methods to interactively visualize the identified repeats and to trace their Mendelian consistency in pedigrees. CONCLUSIONS: TRiCoLOR demonstrates excellent performance and improved sensitivity and specificity compared with alternative tools on synthetic data. For real human whole-genome sequencing data, TRiCoLOR achieves high validation rates, suggesting its suitability to identify tandem repeat variation in personal genomes.


Subject(s)
Genome, Human , Tandem Repeat Sequences , High-Throughput Nucleotide Sequencing , Humans , Sensitivity and Specificity , Sequence Analysis, DNA , Whole Genome Sequencing
15.
EMBO Mol Med ; 12(9): e12104, 2020 09 07.
Article in English | MEDLINE | ID: mdl-32755029

ABSTRACT

We aimed at identifying the developmental stage at which leukemic cells of pediatric T-ALLs are arrested and at defining leukemogenic mechanisms based on ATAC-Seq. Chromatin accessibility maps of seven developmental stages of human healthy T cells revealed progressive chromatin condensation during T-cell maturation. Developmental stages were distinguished by 2,823 signature chromatin regions with 95% accuracy. Open chromatin surrounding SAE1 was identified to best distinguish thymic developmental stages suggesting a potential role of SUMOylation in T-cell development. Deconvolution using signature regions revealed that T-ALLs, including those with mature immunophenotypes, resemble the most immature populations, which was confirmed by TF-binding motif profiles. We integrated ATAC-Seq and RNA-Seq and found DAB1, a gene not related to leukemia previously, to be overexpressed, abnormally spliced and hyper-accessible in T-ALLs. DAB1-negative patients formed a distinct subgroup with particularly immature chromatin profiles and hyper-accessible binding sites for SPI1 (PU.1), a TF crucial for normal T-cell maturation. In conclusion, our analyses of chromatin accessibility and TF-binding motifs showed that pediatric T-ALL cells are most similar to immature thymic precursors, indicating an early developmental arrest.


Subject(s)
Precursor Cells, T-Lymphoid , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma , Child , Chromatin , Humans , Oncogenes , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Protein Binding
16.
Nature ; 580(7803): 396-401, 2020 04.
Article in English | MEDLINE | ID: mdl-32296180

ABSTRACT

Cancer genomics has revealed many genes and core molecular processes that contribute to human malignancies, but the genetic and molecular bases of many rare cancers remain unclear. Genetic predisposition accounts for 5 to 10% of cancer diagnoses in children [1,2], and genetic events that cooperate with known somatic driver events are poorly understood. Pathogenic germline variants in established cancer predisposition genes have been recently identified in 5% of patients with the malignant brain tumour medulloblastoma [3]. Here, by analysing all protein-coding genes, we identify and replicate rare germline loss-of-function variants across ELP1 in 14% of paediatric patients with the medulloblastoma subgroup Sonic Hedgehog (MBSHH). ELP1 was the most common medulloblastoma predisposition gene and increased the prevalence of genetic predisposition to 40% among paediatric patients with MBSHH. Parent-offspring and pedigree analyses identified two families with a history of paediatric medulloblastoma. ELP1-associated medulloblastomas were restricted to the molecular SHHα subtype [4] and characterized by universal biallelic inactivation of ELP1 owing to somatic loss of chromosome arm 9q. Most ELP1-associated medulloblastomas also exhibited somatic alterations in PTCH1, which suggests that germline ELP1 loss-of-function variants predispose individuals to tumour development in combination with constitutive activation of SHH signalling. ELP1 is the largest subunit of the evolutionarily conserved Elongator complex, which catalyses translational elongation through tRNA modifications at the wobble (U34) position [5,6]. Tumours from patients with ELP1-associated MBSHH were characterized by a destabilized Elongator complex, loss of Elongator-dependent tRNA modifications, codon-dependent translational reprogramming, and induction of the unfolded protein response, consistent with loss of protein homeostasis due to Elongator deficiency in model systems [7-9]. Thus, genetic predisposition to proteome instability may be a determinant in the pathogenesis of paediatric brain cancers. These results support investigation of the role of protein homeostasis in other cancer types and potential for therapeutic interference.


Subject(s)
Cerebellar Neoplasms/metabolism , Germ-Line Mutation , Medulloblastoma/metabolism , Transcriptional Elongation Factors/metabolism , Cerebellar Neoplasms/genetics , Cerebellar Neoplasms/pathology , Child , Female , Humans , Male , Medulloblastoma/genetics , Pedigree , RNA, Transfer/metabolism , Transcriptional Elongation Factors/genetics
17.
BMC Genomics ; 21(1): 230, 2020 Mar 14.
Article in English | MEDLINE | ID: mdl-32171249

ABSTRACT

BACKGROUND: DNA sequencing is at the core of many molecular biology laboratories. Despite its long history, there is a lack of user-friendly Sanger sequencing data analysis tools that can be run interactively as a web application or at large-scale in batch from the command-line. RESULTS: We present Tracy, an efficient and versatile command-line application that enables basecalling, alignment, assembly and deconvolution of sequencing chromatogram files. Its companion web applications make all functionality of Tracy easily accessible using standard web browser technologies and interactive graphical user interfaces. Tracy can be easily integrated in large-scale pipelines and high-throughput settings, and it uses state-of-the-art file formats such as JSON and BCF for reporting chromatogram sequencing results and variant calls. The software is open-source and freely available at https://github.com/gear-genomics/tracy, the companion web applications are hosted at https://www.gear-genomics.com. CONCLUSIONS: Tracy can be routinely applied in large-scale validation efforts conducted in clinical genomics studies as well as for high-throughput genome editing techniques that require a fast and rapid method to confirm discovered variants or engineered mutations. Molecular biologists benefit from the companion web applications that enable installation-free Sanger chromatogram analyses using intuitive, graphical user interfaces.


Subject(s)
Computational Biology/methods , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Software , User-Computer Interface , Web Browser
18.
Bioinformatics ; 36(4): 1267-1269, 2020 02 15.
Article in English | MEDLINE | ID: mdl-31589307

ABSTRACT

SUMMARY: VISOR is a tool for haplotype-specific simulations of simple and complex structural variants (SVs). The method is applicable to haploid, diploid or higher ploidy simulations for bulk or single-cell sequencing data. SVs are implanted into FASTA haplotypes at single-basepair resolution, optionally with nearby single-nucleotide variants. Short or long reads are drawn at random from these haplotypes using standard error profiles. Double- or single-stranded data can be simulated and VISOR supports the generation of haplotype-tagged BAM files. The tool further includes methods to interactively visualize simulated variants in single-stranded data. The versatility of VISOR is unmet by comparable tools and it lays the foundation to simulate haplotype-resolved cancer heterogeneity data in bulk or at single-cell resolution. AVAILABILITY AND IMPLEMENTATION: VISOR is implemented in python 3.6, open-source and freely available at https://github.com/davidebolo1993/VISOR. Documentation is available at https://davidebolo1993.github.io/visordoc/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
High-Throughput Nucleotide Sequencing , Software , Diploidy , Haplotypes , Sequence Analysis, DNA
19.
Nat Biotechnol ; 38(3): 343-354, 2020 03.
Article in English | MEDLINE | ID: mdl-31873213

ABSTRACT

Structural variation (SV), involving deletions, duplications, inversions and translocations of DNA segments, is a major source of genetic variability in somatic cells and can dysregulate cancer-related pathways. However, discovering somatic SVs in single cells has been challenging, with copy-number-neutral and complex variants typically escaping detection. Here we describe single-cell tri-channel processing (scTRIP), a computational framework that integrates read depth, template strand and haplotype phase to comprehensively discover SVs in individual cells. We surveyed SV landscapes of 565 single cells, including transformed epithelial cells and patient-derived leukemic samples, to discover abundant SV classes, including inversions, translocations and complex DNA rearrangements. Analysis of the leukemic samples revealed four times more somatic SVs than cytogenetic karyotyping, submicroscopic copy-number alterations, oncogenic copy-neutral rearrangements and a subclonal chromothripsis event. Advancing current methods, single-cell tri-channel processing can directly measure SV mutational processes in individual cells, such as breakage-fusion-bridge cycles, facilitating studies of clonal evolution, genetic mosaicism and SV formation mechanisms, which could improve disease classification for precision medicine.


Subject(s)
Computational Biology/methods , Genomic Structural Variation , Leukemia/genetics , Single-Cell Analysis/methods , Cell Line , Chromothripsis , Clonal Evolution , Gene Rearrangement , Humans , INDEL Mutation , Sequence Inversion , Translocation, Genetic
20.
Nature ; 576(7786): 274-280, 2019 12.
Article in English | MEDLINE | ID: mdl-31802000

ABSTRACT

Embryonal tumours with multilayered rosettes (ETMRs) are aggressive paediatric embryonal brain tumours with a universally poor prognosis [1]. Here we collected 193 primary ETMRs and 23 matched relapse samples to investigate the genomic landscape of this distinct tumour type. We found that patients with tumours in which the proposed driver C19MC [2-4] was not amplified frequently had germline mutations in DICER1 or other microRNA-related aberrations such as somatic amplification of miR-17-92 (also known as MIR17HG). Whole-genome sequencing revealed that tumours had an overall low recurrence of single-nucleotide variants (SNVs), but showed prevalent genomic instability caused by widespread occurrence of R-loop structures. We show that R-loop-associated chromosomal instability can be induced by the loss of DICER1 function. Comparison of primary tumours and matched relapse samples showed a strong conservation of structural variants, but low conservation of SNVs. Moreover, many newly acquired SNVs are associated with a mutational signature related to cisplatin treatment. Finally, we show that targeting R-loops with topoisomerase and PARP inhibitors might be an effective treatment strategy for this deadly disease.


Subject(s)
MicroRNAs/genetics , Neoplasms, Germ Cell and Embryonal/genetics , DEAD-box RNA Helicases/genetics , DNA Topoisomerases, Type I/genetics , Humans , Mutation , Neoplasms, Germ Cell and Embryonal/diagnosis , Poly(ADP-ribose) Polymerase Inhibitors , Poly(ADP-ribose) Polymerases/genetics , Polymorphism, Single Nucleotide , RNA, Long Noncoding , Recurrence , Ribonuclease III/genetics