Results 1 - 20 of 64
1.
bioRxiv ; 2024 Sep 25.
Article in English | MEDLINE | ID: mdl-39372794

ABSTRACT

Diverse sets of complete human genomes are required to construct a pangenome reference and to understand the extent of complex structural variation. Here, we sequence 65 diverse human genomes and build 130 haplotype-resolved assemblies (130 Mbp median continuity), closing 92% of all previous assembly gaps and reaching telomere-to-telomere (T2T) status for 39% of the chromosomes. We highlight complete sequence continuity of complex loci, including the major histocompatibility complex (MHC), SMN1/SMN2, NBPF8, and AMY1/AMY2, and fully resolve 1,852 complex structural variants (SVs). In addition, we completely assemble and validate 1,246 human centromeres. We find up to 30-fold variation in α-satellite high-order repeat (HOR) array length and characterize the pattern of mobile element insertions into α-satellite HOR arrays. While most centromeres predict a single site of kinetochore attachment, epigenetic analysis suggests the presence of two hypomethylated regions for 7% of centromeres. Combining our data with the draft pangenome reference significantly enhances genotyping accuracy from short-read data, enabling whole-genome inference to a median quality value (QV) of 45. Using this approach, 26,115 SVs per sample are detected, substantially increasing the number of SVs now amenable to downstream disease association studies.

2.
Nat Commun ; 15(1): 8007, 2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39266513

ABSTRACT

Modern sequencing technology enables the systematic detection of complex structural variation (SV) across genomes. However, extensive DNA rearrangements arising through a series of mutations, a phenomenon we refer to as serial SV (sSV), remain underexplored, posing a challenge for SV discovery. Here, we present NAHRwhals ( https://github.com/WHops/NAHRwhals ), a method to infer repeat-mediated series of SVs in long-read genomic assemblies. Applying NAHRwhals to haplotype-resolved human genomes from 28 individuals reveals 37 sSV loci of various length and complexity. These sSVs explain otherwise cryptic variation in medically relevant regions such as the TPSAB1 gene, 8p23.1, 22q11 and Sotos syndrome regions. Comparisons with great ape assemblies indicate that most human sSVs formed recently, after the human-ape split, and involved non-repeat-mediated processes in addition to non-allelic homologous recombination. NAHRwhals reliably discovers and characterizes sSVs at scale and independent of species, uncovering their genomic abundance and suggesting broader implications for disease.


Subject(s)
Human Genome, Genomic Structural Variation, Hominidae, Humans, Animals, Hominidae/genetics, Human Genome/genetics, Genomics/methods, Haplotypes
3.
Data Brief ; 55: 110607, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39006345

ABSTRACT

In January 2021, Germany commenced surveillance of SARS-CoV-2 variants under the Corona Surveillance Act, which ceased in July 2023. The objective was to bolster pandemic control, as specific alterations in amino acids, particularly within the spike protein, were linked to heightened transmission and decreased vaccine effectiveness. Consequently, our team conducted whole genome sequencing using the commercially accessible ARTIC protocol on Illumina's NextSeq500 platform and MiSeq for SARS-CoV-2 positive samples obtained from patients at Heidelberg University Hospital, affiliated hospitals, and the public health office in the Rhine-Neckar/Heidelberg region. Throughout the pandemic, we refined the existing ARTIC V4 protocol as well as our bioinformatics pipeline, the details of which are outlined in this report. This report reflects the protocol for the MiSeq analysis; the protocol for the NextSeq500 can be found in our previous publication.

4.
Genome Med ; 16(1): 83, 2024 06 17.
Article in English | MEDLINE | ID: mdl-38886830

ABSTRACT

BACKGROUND: Somatic copy number alterations are a hallmark of cancer that offer unique opportunities for therapeutic exploitation. Here, we focused on the identification of specific vulnerabilities for tumors harboring chromosome 8p deletions. METHODS: We developed and applied an integrative analysis of The Cancer Genome Atlas (TCGA), the Cancer Dependency Map (DepMap), and the Cancer Cell Line Encyclopedia to identify chromosome 8p-specific vulnerabilities. We employ orthogonal gene targeting strategies, both in vitro and in vivo, including short hairpin RNA-mediated gene knockdown and CRISPR/Cas9-mediated gene knockout to validate vulnerabilities. RESULTS: We identified SLC25A28 (also known as MFRN2) as a specific vulnerability for tumors harboring chromosome 8p deletions. We demonstrate that vulnerability towards MFRN2 loss is dictated by the expression of its paralog, SLC25A37 (also known as MFRN1), which resides on chromosome 8p. In line with their function as mitochondrial iron transporters, MFRN1/2 paralog protein deficiency profoundly impaired mitochondrial respiration, induced global depletion of iron-sulfur cluster proteins, and resulted in DNA damage and cell death. MFRN2 depletion in MFRN1-deficient tumors led to impaired growth and even tumor eradication in preclinical mouse xenograft experiments, highlighting its therapeutic potential. CONCLUSIONS: Our data reveal MFRN2 as a therapeutic target of chromosome 8p-deleted cancers and nominate MFRN1 as the complementary biomarker for MFRN2-directed therapies.


Subject(s)
Chromosome Deletion, Human Chromosome Pair 8, Neoplasms, Humans, Human Chromosome Pair 8/genetics, Animals, Mice, Neoplasms/genetics, Tumor Cell Line, Synthetic Lethal Mutations, Mitochondria/metabolism, Mitochondria/genetics, Mitochondrial Proteins/genetics, Mitochondrial Proteins/metabolism, Neoplastic Gene Expression Regulation, DNA Copy Number Variations
5.
Nat Genet ; 56(6): 1134-1146, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38806714

ABSTRACT

The functional impact and cellular context of mosaic structural variants (mSVs) in normal tissues is understudied. Utilizing Strand-seq, we sequenced 1,133 single-cell genomes from 19 human donors of increasing age, and discovered the heterogeneous mSV landscapes of hematopoietic stem and progenitor cells. While mSVs are continuously acquired throughout life, expanded subclones in our cohort are confined to individuals >60. Cells already harboring mSVs are more likely to acquire additional somatic structural variants, including megabase-scale segmental aneuploidies. Capitalizing on comprehensive single-cell micrococcal nuclease digestion with sequencing reference data, we conducted high-resolution cell-typing for eight hematopoietic stem and progenitor cells. Clonally expanded mSVs disrupt normal cellular function by dysregulating diverse cellular pathways, and enriching for myeloid progenitors. Our findings underscore the contribution of mSVs to the cellular and molecular phenotypes associated with the aging hematopoietic system, and establish a foundation for deciphering the molecular links between mSVs, aging and disease susceptibility in normal tissues.


Subject(s)
Hematopoietic Stem Cells, Mosaicism, Humans, Hematopoietic Stem Cells/metabolism, Hematopoietic Stem Cells/cytology, Middle Aged, Adult, Single-Cell Analysis/methods, Aged, Female, Male, Aging/genetics, Aged 80 and Over, Stem Cells/metabolism, Genetic Variation
6.
bioRxiv ; 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38659906

ABSTRACT

Structural variants (SVs) contribute significantly to human genetic diversity and disease [1-4]. Previously, SVs have remained incompletely resolved by population genomics, with short-read sequencing facing limitations in capturing the whole spectrum of SVs at nucleotide resolution [5-7]. Here we leveraged nanopore sequencing [8] to construct an intermediate coverage resource of 1,019 long-read genomes sampled within 26 human populations from the 1000 Genomes Project. By integrating linear and graph-based approaches for SV analysis via pangenome graph-augmentation, we uncover 167,291 sequence-resolved SVs in these samples, considerably advancing SV characterization compared to population-wide short-read sequencing studies [3,4]. Our analysis details diverse SV classes (deletions, duplications, insertions, and inversions) at population scale. LINE-1 and SVA retrotransposition activities frequently mediate transductions [9,10] of unique sequences, with both mobile element classes transducing sequences at either the 3'- or 5'-end, depending on the source element locus. Furthermore, analyses of SV breakpoint junctions suggest that a continuum of homology-mediated rearrangement processes is integral to SV formation, and highlight evidence for SV recurrence involving repeat sequences. Our open-access dataset underscores the transformative impact of long-read sequencing in advancing the characterisation of polymorphic genomic architectures, and provides a resource for guiding variant prioritisation in future long-read sequencing-based disease studies.

7.
Infect Genet Evol ; 119: 105577, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38403035

ABSTRACT

In January 2021, the monitoring of circulating variants of SARS-CoV-2 was initiated in Germany under the Corona Surveillance Act, which was discontinued after July 2023. This initiative aimed to enhance pandemic containment, as specific amino acid changes, particularly in the spike protein, were associated with increased transmission and reduced vaccine efficacy. Our group conducted whole genome sequencing using the ARTIC protocol (currently V4) on Illumina's NextSeq 500 platform (and, starting in May 2023, on the MiSeq DX platform) for SARS-CoV-2 positive specimens from patients at Heidelberg University Hospital, associated hospitals, and the public health office in the Rhine-Neckar/Heidelberg region. In total, we sequenced 26,795 SARS-CoV-2-positive samples between January 2021 and July 2023. Valid sequences, meeting the requirements for upload to the German electronic sequencing data hub (DESH) operated by the Robert Koch Institute (RKI), were determined for 24,852 samples, and the lineage/clade could be identified for 25,912 samples. The year 2021 witnessed significant dynamics in the circulating variants in the Rhine-Neckar/Heidelberg region, including A.27.RN, followed by the emergence of B.1.1.7 (Alpha), subsequently displaced by B.1.617.2 (Delta), and the initial occurrences of B.1.1.529 (Omicron). By January 2022, B.1.1.529 had superseded B.1.617.2, dominating with over 90%. The years 2022 and 2023 were then characterized by the dominance of B.1.1.529 and its sublineages, particularly BA.5 and BA.2, and more recently, the emergence of recombinant variants like XBB.1.5. Since the global dominance of B.1.617.2, the identified variant distribution in our local study, apart from a time delay in the spread of new variants, can be considered largely representative of the global distribution.


Subject(s)
COVID-19, SARS-CoV-2, Humans, SARS-CoV-2/genetics, COVID-19/epidemiology, Germany/epidemiology, University Hospitals
8.
Cell Genom ; 3(4): 100281, 2023 Apr 12.
Article in English | MEDLINE | ID: mdl-37082141

ABSTRACT

Cancer genomes harbor a broad spectrum of structural variants (SVs) driving tumorigenesis, a relevant subset of which escape discovery using short-read sequencing. We employed Oxford Nanopore Technologies (ONT) long-read sequencing in a paired diagnostic and post-therapy medulloblastoma to unravel the haplotype-resolved somatic genetic and epigenetic landscape. We assembled complex rearrangements, including a 1.55-Mbp chromothripsis event, and we uncover a complex SV pattern termed templated insertion (TI) thread, characterized by short (mostly <1 kb) insertions showing prevalent self-concatenation into highly amplified structures of up to 50 kbp in size. TI threads occur in 3% of cancers, with a prevalence up to 74% in liposarcoma, and frequent colocalization with chromothripsis. We also perform long-read-based methylome profiling and discover allele-specific methylation (ASM) effects, complex rearrangements exhibiting differential methylation, and differential promoter methylation in cancer-driver genes. Our study shows the advantage of long-read sequencing in the discovery and characterization of complex somatic rearrangements.

10.
Nat Biotechnol ; 41(6): 832-844, 2023 06.
Article in English | MEDLINE | ID: mdl-36424487

ABSTRACT

Somatic structural variants (SVs) are widespread in cancer, but their impact on disease evolution is understudied due to a lack of methods to directly characterize their functional consequences. We present a computational method, scNOVA, which uses Strand-seq to perform haplotype-aware integration of SV discovery and molecular phenotyping in single cells by using nucleosome occupancy to infer gene expression as a readout. Application to leukemias and cell lines identifies local effects of copy-balanced rearrangements on gene deregulation, and consequences of SVs on aberrant signaling pathways in subclones. We discovered distinct SV subclones with dysregulated Wnt signaling in a chronic lymphocytic leukemia patient. We further uncovered the consequences of subclonal chromothripsis in T cell acute lymphoblastic leukemia, which revealed c-Myb activation, enrichment of a primitive cell state and informed successful targeting of the subclone in cell culture, using a Notch inhibitor. By directly linking SVs to their functional effects, scNOVA enables systematic single-cell multiomic studies of structural variation in heterogeneous cell populations.


Subject(s)
Chromothripsis, Leukemia, Neoplasms, Humans, Neoplasms/genetics, Leukemia/genetics, Gene Rearrangement, Cell Line, Genomic Structural Variation
11.
Leukemia ; 36(7): 1759-1768, 2022 07.
Article in English | MEDLINE | ID: mdl-35585141

ABSTRACT

The mechanisms underlying T-ALL relapse remain essentially unknown. Multilevel-omics in 38 matched pairs of initial and relapsed T-ALL revealed 18 (47%) type-1 (defined by being derived from the major ancestral clone) and 20 (53%) type-2 relapses (derived from a minor ancestral clone). In both types of relapse, we observed known and novel drivers of multidrug resistance including MDR1 and MVP, NT5C2 and JAK-STAT activators. Patients with type-1 relapses were specifically characterized by IL7R upregulation. In remarkable contrast, type-2 relapses demonstrated (1) enrichment of constitutional cancer predisposition gene mutations, (2) divergent genetic and epigenetic remodeling, and (3) enrichment of somatic hypermutator phenotypes, related to BLM, BUB1B/PMS2 and TP53 mutations. T-ALLs that later progressed to type-2 relapses exhibited a complex subclonal architecture, unexpectedly, already at the time of initial diagnosis. Deconvolution analysis of ATAC-Seq profiles showed that T-ALLs later developing into type-1 relapses resembled a predominant immature thymic T-cell population, whereas T-ALLs developing into type-2 relapses resembled a mixture of normal T-cell precursors. In sum, our analyses revealed fundamentally different mechanisms driving either type-1 or type-2 T-ALL relapse and indicate that differential capacities of disease evolution are already inherent to the molecular setup of the initial leukemia.


Subject(s)
Precursor T-Cell Lymphoblastic Leukemia-Lymphoma, Child, Clonal Evolution/genetics, Humans, Mutation, Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics, Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/metabolism, Recurrence
12.
Am J Transplant ; 22(7): 1873-1883, 2022 07.
Article in English | MEDLINE | ID: mdl-35384272

ABSTRACT

Seroconversion after COVID-19 vaccination is impaired in kidney transplant recipients. Emerging variants of concern such as the B.1.617.2 (delta) and the B.1.1.529 (omicron) variants pose an increasing threat to these patients. In this observational cohort study, we measured anti-S1 IgG, surrogate neutralizing, and anti-receptor-binding domain antibodies three weeks after a third mRNA vaccine dose in 49 kidney transplant recipients and compared results to 25 age-matched healthy controls. In addition, vaccine-induced neutralization of SARS-CoV-2 wild-type, the B.1.617.2 (delta), and the B.1.1.529 (omicron) variants was assessed using a live-virus assay. After a third vaccine dose, anti-S1 IgG, surrogate neutralizing, and anti-receptor-binding domain antibodies were significantly lower in kidney transplant recipients compared to healthy controls. Only 29/49 (59%) sera of kidney transplant recipients contained neutralizing antibodies against the SARS-CoV-2 wild-type or the B.1.617.2 (delta) variant, and neutralization titers were significantly reduced compared to healthy controls (p < 0.001). Vaccine-induced cross-neutralization of the B.1.1.529 (omicron) variants was detectable in 15/35 (43%) kidney transplant recipients with seropositivity for anti-S1 IgG, surrogate neutralizing, and/or anti-RBD antibodies. Neutralization of the B.1.1.529 (omicron) variants was significantly reduced compared to neutralization of SARS-CoV-2 wild-type or the B.1.617.2 (delta) variant for both kidney transplant recipients and healthy controls (p < 0.001 for all).


Subject(s)
COVID-19, Kidney Transplantation, Neutralizing Antibodies, Viral Antibodies, COVID-19/prevention & control, COVID-19 Vaccines, Humans, Immunoglobulin G, Messenger RNA, SARS-CoV-2, Transplant Recipients, Synthetic Vaccines, Viral Envelope Proteins/genetics, mRNA Vaccines
13.
Nat Genet ; 54(4): 518-525, 2022 04.
Article in English | MEDLINE | ID: mdl-35410384

ABSTRACT

Typical genotyping workflows map reads to a reference genome before identifying genetic variants. Generating such alignments introduces reference biases and comes with substantial computational burden. Furthermore, short-read lengths limit the ability to characterize repetitive genomic regions, which are particularly challenging for fast k-mer-based genotypers. In the present study, we propose a new algorithm, PanGenie, that leverages a haplotype-resolved pangenome reference together with k-mer counts from short-read sequencing data to genotype a wide spectrum of genetic variation, a process we refer to as genome inference. Compared with mapping-based approaches, PanGenie is more than 4 times faster at 30-fold coverage and achieves better genotype concordances for almost all variant types and coverages tested. Improvements are especially pronounced for large insertions (≥50 bp) and variants in repetitive regions, enabling the inclusion of these classes of variants in genome-wide association studies. PanGenie efficiently leverages the increasing amount of haplotype-resolved assemblies to unravel the functional impact of previously inaccessible variants while being faster compared with alignment-based workflows.


Subject(s)
Genetic Variation, Human Genome, Genomics, Algorithms, Human Genome/genetics, Genome-Wide Association Study, Genomics/methods, Genotype, High-Throughput Nucleotide Sequencing, Humans, DNA Sequence Analysis
15.
J Pers Med ; 11(12), 2021 Nov 25.
Article in English | MEDLINE | ID: mdl-34945722

ABSTRACT

The heritable component of schizophrenia (SCH) as a polygenic trait is represented by numerous variants from a heterogeneous group of genes each contributing a relatively small effect. Various SNPs have already been found and analyzed in genes encoding the NMDAR subunits. However, less is known about genetic variations of genes encoding the AMPA and kainate receptor subunits. We analyzed sixteen iGluR genes in full length to determine the sequence variability of iGluR genes. Our aim was to describe the rate of genetic variability, its distribution, and the co-occurrence of variants and to identify new candidate risk variants or haplotypes. The cumulative effect of genetic risk was then estimated using a simple scoring model. GRIN2A-B, GRIN3A-B, and GRIK4 genes showed significantly increased genetic variation in SCH patients. The fixation index statistic revealed eight intronic haplotypes and an additional four intronic SNPs within the sequences of iGluR genes associated with SCH (p < 0.05). The haplotypes were used in the proposed simple scoring model and moreover as a test for genetic predisposition to schizophrenia. The positive likelihood ratio for the scoring model test reached 7.11. We also observed 41 protein-altering variants (38 missense variants, four frameshifts, and one nonsense variant) that were not significantly associated with SCH. Our data suggest that some intronic regulatory regions of iGluR genes and their common variability are among the components from which the genetic predisposition to SCH is composed.

16.
Mol Oncol ; 15(12): 3363-3384, 2021 12.
Article in English | MEDLINE | ID: mdl-34328665

ABSTRACT

The paucity of microbiome studies at intestinal tissues has contributed to a yet limited understanding of potential viral and bacterial cofactors of colorectal cancer (CRC) carcinogenesis or progression. We analysed whole-genome sequences of CRC primary tumours, their corresponding metastases and matched normal tissue for sequences of viral, phage and bacterial species. Bacteriome analysis showed Fusobacterium nucleatum, Streptococcus sanguinis, F. hwasookii, Anaerococcus mediterraneensis and further species enriched in primary CRCs. The primary CRC of one patient was enriched for F. alocis, S. anginosus, Parvimonas micra and Gemella sp. 948. Enrichment of Escherichia coli strains IAI1, SE11, K-12 and M8 was observed in metastases together with coliphages enterobacteria phage φ80 and Escherichia phage VT2φ_272. Virome analysis showed that phages were the most preponderant viral species (46%), the main families being Myoviridae, Siphoviridae and Podoviridae. Primary CRCs were enriched for bacteriophages, showing five phages (Enterobacteria, Bacillus, Proteus, Streptococcus phages) together with their pathogenic hosts in contrast to normal tissues. The most frequently detected, and BLAST-confirmed, viruses included human endogenous retrovirus K113, human herpesviruses 7 and 6B, Megavirus chilensis, cytomegalovirus (CMV) and Epstein-Barr virus (EBV), with one patient showing EBV enrichment in primary tumour and metastases. EBV was PCR-validated in 80 pairs of CRC primary tumour and their corresponding normal tissues; in 21 of these pairs (26.3%), it was detectable in primary tumours only. The number of viral species was increased and bacterial species decreased in CRCs compared with normal tissues, and we could discriminate primary CRCs from metastases and normal tissues by applying the Hutcheson t-test on the Shannon indices based on viral and bacterial species. Taken together, our results descriptively support hypotheses on microorganisms as potential (co)risk factors of CRC and extend putative suggestions on critical microbiome species in CRC metastasis.


Subject(s)
Colorectal Neoplasms, Epstein-Barr Virus Infections, Microbiota, Colorectal Neoplasms/genetics, Human Herpesvirus 4, Humans, Risk Factors
17.
Blood Cancer J ; 11(5): 102, 2021 05 26.
Article in English | MEDLINE | ID: mdl-34039950

ABSTRACT

Epstein-Barr virus (EBV)-associated diffuse large B-cell lymphoma not otherwise specified (DLBCL NOS) constitutes a distinct clinicopathological entity in the current World Health Organization (WHO) classification. However, its genomic features remain sparsely characterized. Here, we combine whole-genome sequencing (WGS), targeted amplicon sequencing (tNGS), and fluorescence in situ hybridization (FISH) from 47 EBV+ DLBCL (NOS) cases to delineate the genomic landscape of this rare disease. Integrated WGS and tNGS analysis clearly distinguished this tumor type from EBV-negative DLBCL due to frequent mutations in ARID1A (45%), KMT2A/KMT2D (32/30%), ANKRD11 (32%), or NOTCH2 (32%). WGS uncovered structural aberrations including 6q deletions (5/8 patients), which were subsequently validated by FISH (14/32 cases). Expanding on previous reports, we identified recurrent alterations in CCR6 (15%), DAPK1 (15%), TNFRSF21 (13%), CCR7 (11%), and YY1 (6%). Lastly, functional annotation of the mutational landscape by sequential gene set enrichment and network propagation predicted an effect on the nuclear factor κB (NFκB) pathway (CSNK2A2, CARD10), IL6/JAK/STAT (SOCS1/3, STAT3), and WNT signaling (FRAT1, SFRP5) alongside aberrations in immunological processes, such as interferon response. This first comprehensive description of EBV+ DLBCL (NOS) tumors substantiates the evidence of its pathobiological independence and helps stratify the molecular taxonomy of aggressive lymphomas in the effort for future therapeutic strategies.


Subject(s)
Epstein-Barr Virus Infections/complications, Diffuse Large B-Cell Lymphoma/genetics, Diffuse Large B-Cell Lymphoma/virology, Adult, Aged, Aged 80 and Over, Chromosome Aberrations, Female, Gene Regulatory Networks, Human Herpesvirus 4/isolation & purification, High-Throughput Nucleotide Sequencing, Humans, Fluorescence In Situ Hybridization, Male, Middle Aged, Mutation, Whole Genome Sequencing, Young Adult
18.
Science ; 372(6537), 2021 04 02.
Article in English | MEDLINE | ID: mdl-33632895

ABSTRACT

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.


Subject(s)
Genetic Variation, Human Genome, Haplotypes, Female, Genotype, High-Throughput Nucleotide Sequencing, Humans, INDEL Mutation, Interspersed Repetitive Sequences, Male, Population Groups/genetics, Quantitative Trait Loci, Retroelements, DNA Sequence Analysis, Sequence Inversion, Whole Genome Sequencing
19.
Gigascience ; 9(10), 2020 10 07.
Article in English | MEDLINE | ID: mdl-33034633

ABSTRACT

BACKGROUND: Tandem repeat sequences are widespread in the human genome, and their expansions cause multiple repeat-mediated disorders. Genome-wide discovery approaches are needed to fully elucidate their roles in health and disease, but resolving tandem repeat variation accurately remains a challenging task. While traditional mapping-based approaches using short-read data have severe limitations in the size and type of tandem repeats they can resolve, recent third-generation sequencing technologies exhibit substantially higher sequencing error rates, which complicates repeat resolution. RESULTS: We developed TRiCoLOR, a freely available tool for tandem repeat profiling using error-prone long reads from third-generation sequencing technologies. The method can identify repetitive regions in sequencing data without prior knowledge of their motifs or locations and resolve repeat multiplicity and period size in a haplotype-specific manner. The tool includes methods to interactively visualize the identified repeats and to trace their Mendelian consistency in pedigrees. CONCLUSIONS: TRiCoLOR demonstrates excellent performance and improved sensitivity and specificity compared with alternative tools on synthetic data. For real human whole-genome sequencing data, TRiCoLOR achieves high validation rates, suggesting its suitability to identify tandem repeat variation in personal genomes.


Subject(s)
Human Genome, Tandem Repeat Sequences, High-Throughput Nucleotide Sequencing, Humans, Sensitivity and Specificity, DNA Sequence Analysis, Whole Genome Sequencing
20.
EMBO Mol Med ; 12(9): e12104, 2020 09 07.
Article in English | MEDLINE | ID: mdl-32755029

ABSTRACT

We aimed at identifying the developmental stage at which leukemic cells of pediatric T-ALLs are arrested and at defining leukemogenic mechanisms based on ATAC-Seq. Chromatin accessibility maps of seven developmental stages of human healthy T cells revealed progressive chromatin condensation during T-cell maturation. Developmental stages were distinguished by 2,823 signature chromatin regions with 95% accuracy. Open chromatin surrounding SAE1 was identified to best distinguish thymic developmental stages suggesting a potential role of SUMOylation in T-cell development. Deconvolution using signature regions revealed that T-ALLs, including those with mature immunophenotypes, resemble the most immature populations, which was confirmed by TF-binding motif profiles. We integrated ATAC-Seq and RNA-Seq and found DAB1, a gene not related to leukemia previously, to be overexpressed, abnormally spliced and hyper-accessible in T-ALLs. DAB1-negative patients formed a distinct subgroup with particularly immature chromatin profiles and hyper-accessible binding sites for SPI1 (PU.1), a TF crucial for normal T-cell maturation. In conclusion, our analyses of chromatin accessibility and TF-binding motifs showed that pediatric T-ALL cells are most similar to immature thymic precursors, indicating an early developmental arrest.


Subject(s)
Precursor T-Lymphoid Cells, Precursor T-Cell Lymphoblastic Leukemia-Lymphoma, Child, Chromatin, Humans, Oncogenes, Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics, Protein Binding