Results 1 - 20 of 182
1.
Nat Commun ; 15(1): 8007, 2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39266513

ABSTRACT

Modern sequencing technology enables the systematic detection of complex structural variation (SV) across genomes. However, extensive DNA rearrangements arising through a series of mutations, a phenomenon we refer to as serial SV (sSV), remain underexplored, posing a challenge for SV discovery. Here, we present NAHRwhals (https://github.com/WHops/NAHRwhals), a method to infer repeat-mediated series of SVs in long-read genomic assemblies. Applying NAHRwhals to haplotype-resolved human genomes from 28 individuals reveals 37 sSV loci of various length and complexity. These sSVs explain otherwise cryptic variation in medically relevant regions such as the TPSAB1 gene, 8p23.1, 22q11 and Sotos syndrome regions. Comparisons with great ape assemblies indicate that most human sSVs formed recently, after the human-ape split, and involved non-repeat-mediated processes in addition to non-allelic homologous recombination. NAHRwhals reliably discovers and characterizes sSVs at scale and independent of species, uncovering their genomic abundance and suggesting broader implications for disease.


Subject(s)
Genome, Human , Genomic Structural Variation , Hominidae , Humans , Animals , Hominidae/genetics , Genome, Human/genetics , Genomics/methods , Haplotypes
2.
Cell Genom ; 4(7): 100590, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38908378

ABSTRACT

The duplication-triplication/inverted-duplication (DUP-TRP/INV-DUP) structure is a complex genomic rearrangement (CGR). Although it has been identified as an important pathogenic DNA mutation signature in genomic disorders and cancer genomes, its architecture remains unresolved. Here, we studied the genomic architecture of DUP-TRP/INV-DUP by investigating the DNA of 24 patients identified by array comparative genomic hybridization (aCGH) on whom we found evidence for the existence of 4 out of 4 predicted structural variant (SV) haplotypes. Using a combination of short-read genome sequencing (GS), long-read GS, optical genome mapping, and single-cell DNA template strand sequencing (strand-seq), the haplotype structure was resolved in 18 samples. The point of template switching in 4 samples was shown to be a segment of ∼2.2-5.5 kb of 100% nucleotide similarity within inverted repeat pairs. These data provide experimental evidence that inverted low-copy repeats act as recombinant substrates. This type of CGR can result in multiple conformers generating diverse SV haplotypes in susceptible dosage-sensitive loci.


Subject(s)
Haplotypes , Humans , Haplotypes/genetics , Comparative Genomic Hybridization , Genomic Structural Variation/genetics , Genome, Human/genetics , Gene Duplication/genetics
3.
Genome Med ; 16(1): 83, 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38886830

ABSTRACT

BACKGROUND: Somatic copy number alterations are a hallmark of cancer that offer unique opportunities for therapeutic exploitation. Here, we focused on the identification of specific vulnerabilities for tumors harboring chromosome 8p deletions. METHODS: We developed and applied an integrative analysis of The Cancer Genome Atlas (TCGA), the Cancer Dependency Map (DepMap), and the Cancer Cell Line Encyclopedia to identify chromosome 8p-specific vulnerabilities. We employed orthogonal gene targeting strategies, both in vitro and in vivo, including short hairpin RNA-mediated gene knockdown and CRISPR/Cas9-mediated gene knockout, to validate vulnerabilities. RESULTS: We identified SLC25A28 (also known as MFRN2) as a specific vulnerability for tumors harboring chromosome 8p deletions. We demonstrate that vulnerability towards MFRN2 loss is dictated by the expression of its paralog, SLC25A37 (also known as MFRN1), which resides on chromosome 8p. In line with their function as mitochondrial iron transporters, MFRN1/2 paralog protein deficiency profoundly impaired mitochondrial respiration, induced global depletion of iron-sulfur cluster proteins, and resulted in DNA damage and cell death. MFRN2 depletion in MFRN1-deficient tumors led to impaired growth and even tumor eradication in preclinical mouse xenograft experiments, highlighting its therapeutic potential. CONCLUSIONS: Our data reveal MFRN2 as a therapeutic target of chromosome 8p-deleted cancers and nominate MFRN1 as the complementary biomarker for MFRN2-directed therapies.


Subject(s)
Chromosome Deletion , Chromosomes, Human, Pair 8 , Neoplasms , Humans , Chromosomes, Human, Pair 8/genetics , Animals , Mice , Neoplasms/genetics , Cell Line, Tumor , Synthetic Lethal Mutations , Mitochondria/metabolism , Mitochondria/genetics , Mitochondrial Proteins/genetics , Mitochondrial Proteins/metabolism , Gene Expression Regulation, Neoplastic , DNA Copy Number Variations
4.
Nat Genet ; 56(6): 1134-1146, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38806714

ABSTRACT

The functional impact and cellular context of mosaic structural variants (mSVs) in normal tissues are understudied. Utilizing Strand-seq, we sequenced 1,133 single-cell genomes from 19 human donors of increasing age, and discovered the heterogeneous mSV landscapes of hematopoietic stem and progenitor cells. While mSVs are continuously acquired throughout life, expanded subclones in our cohort are confined to individuals over 60 years of age. Cells already harboring mSVs are more likely to acquire additional somatic structural variants, including megabase-scale segmental aneuploidies. Capitalizing on comprehensive single-cell micrococcal nuclease digestion with sequencing reference data, we conducted high-resolution cell-typing for eight hematopoietic stem and progenitor cells. Clonally expanded mSVs disrupt normal cellular function by dysregulating diverse cellular pathways, and enriching for myeloid progenitors. Our findings underscore the contribution of mSVs to the cellular and molecular phenotypes associated with the aging hematopoietic system, and establish a foundation for deciphering the molecular links between mSVs, aging and disease susceptibility in normal tissues.


Subject(s)
Hematopoietic Stem Cells , Mosaicism , Humans , Hematopoietic Stem Cells/metabolism , Hematopoietic Stem Cells/cytology , Middle Aged , Adult , Single-Cell Analysis/methods , Aged , Female , Male , Aging/genetics , Aged, 80 and over , Stem Cells/metabolism , Genetic Variation
5.
bioRxiv ; 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38659906

ABSTRACT

Structural variants (SVs) contribute significantly to human genetic diversity and disease [1-4]. Previously, SVs have remained incompletely resolved by population genomics, with short-read sequencing facing limitations in capturing the whole spectrum of SVs at nucleotide resolution [5-7]. Here we leveraged nanopore sequencing [8] to construct an intermediate-coverage resource of 1,019 long-read genomes sampled within 26 human populations from the 1000 Genomes Project. By integrating linear and graph-based approaches for SV analysis via pangenome graph-augmentation, we uncover 167,291 sequence-resolved SVs in these samples, considerably advancing SV characterization compared to population-wide short-read sequencing studies [3,4]. Our analysis details diverse SV classes (deletions, duplications, insertions, and inversions) at population scale. LINE-1 and SVA retrotransposition activities frequently mediate transductions [9,10] of unique sequences, with both mobile element classes transducing sequences at either the 3'- or 5'-end, depending on the source element locus. Furthermore, analyses of SV breakpoint junctions suggest that a continuum of homology-mediated rearrangement processes is integral to SV formation, and highlight evidence for SV recurrence involving repeat sequences. Our open-access dataset underscores the transformative impact of long-read sequencing in advancing the characterisation of polymorphic genomic architectures, and provides a resource for guiding variant prioritisation in future long-read sequencing-based disease studies.

6.
Mol Oncol ; 18(2): 245-279, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38135904

ABSTRACT

Analyses of inequalities related to prevention and cancer therapeutics/care show disparities between countries with different economic standing, and within countries with high Gross Domestic Product. The development of basic technological and biological research provides clinical and prevention opportunities that make their implementation into healthcare systems more complex, mainly due to the growth of Personalized/Precision Cancer Medicine (PCM). Initiatives like the USA-Cancer Moonshot and the EU-Mission on Cancer and Europe's Beating Cancer Plan have been initiated to boost cancer prevention and therapeutics/care innovation and to mitigate present inequalities. The conference organized by the Pontifical Academy of Sciences in collaboration with the European Academy of Cancer Sciences discussed the inequality problem, dependent on the economic status of a country, the increasing demands for infrastructure supportive of innovative research and its implementation in healthcare and prevention programs. Establishing translational research defined as a coherent cancer research continuum is still a challenge. Research has to cover the entire continuum from basic to outcomes research for clinical and prevention modalities. Comprehensive Cancer Centres (CCCs) are of critical importance for integrating research innovations into preclinical and clinical research, as well as for ensuring state-of-the-art patient care within healthcare systems. International collaborative networks between CCCs are necessary to reach the critical mass of infrastructures and patients for PCM research, and for introducing prevention modalities and new treatments effectively. Outcomes and health economics research are required to assess the cost-effectiveness of new interventions, currently a missing element in the research portfolio. Data sharing and critical mass are essential for innovative research to develop PCM. Despite advances in cancer research, cancer incidence and prevalence are growing. Making cancer research infrastructures accessible for all patients, considering the increasing inequalities, requires science policy actions incentivizing research aimed at prevention and cancer therapeutics/care with an increased focus on patients' needs and cost-effective healthcare.


Subject(s)
Neoplasms , Humans , Vatican City , Neoplasms/prevention & control , Translational Research, Biomedical , Delivery of Health Care , Precision Medicine
7.
Bioinformatics ; 39(11), 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37851409

ABSTRACT

SUMMARY: Single-cell DNA template strand sequencing (Strand-seq) allows a range of genomic analyses, including chromosome-length haplotype phasing and structural variation (SV) calling in individual cells. Here, we present MosaiCatcher v2, a standardized workflow and reference framework for single-cell SV detection using Strand-seq. This framework introduces a range of functionalities, including: an automated upstream Quality Control (QC) and assembly sub-workflow that relies on multiple genome assemblies and incorporates a multistep normalization module, integration of the single-cell nucleosome occupancy and genetic variation analysis (scNOVA) SV functional characterization module and of the ArbiGent SV genotyping module, platform portability, as well as a user-friendly and shareable web report. These new features of MosaiCatcher v2 enable reproducible computational processing of Strand-seq data, which are increasingly used in human genetics and single-cell genomics, toward production environments. MosaiCatcher v2 is compatible with both container and conda environments, ensuring reproducibility and robustness and positioning the framework as a cornerstone in computational processing of Strand-seq data. AVAILABILITY AND IMPLEMENTATION: MosaiCatcher v2 is a standardized workflow, implemented using the Snakemake workflow management system. The pipeline is available on GitHub: https://github.com/friendsofstrandseq/mosaicatcher-pipeline/ and on the snakemake-workflow-catalog: https://snakemake.github.io/snakemake-workflow-catalog/?usage=friendsofstrandseq/mosaicatcher-pipeline. Strand-seq example input data used in the publication can be found in the Data availability statement. Additionally, a lightweight dataset for test purposes can be found on the GitHub repository.


Subject(s)
DNA Replication , Genomics , Humans , Reproducibility of Results , Sequence Analysis, DNA , Haplotypes , Software , Workflow , Single-Cell Analysis
8.
Sci Rep ; 13(1): 17160, 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37821491

ABSTRACT

We use a comprehensive longitudinal dataset on criminal acts over 6 years in a European country to study specialization in criminal careers. We present a method to cluster crime categories by their relative co-occurrence within criminal careers, deriving a natural, data-based taxonomy of criminal specialization. Defining specialists as active criminals who stay within one category of offending behavior, we study their socio-demographic attributes, geographic range, and positions in their collaboration networks relative to their generalist counterparts. Compared to generalists, specialists tend to be older, are more likely to be women, operate within a smaller geographic range, and collaborate in smaller, more tightly-knit local networks. We observe that specialists are more intensely embedded in criminal networks, suggesting a potential source of self-reinforcing dynamics in criminal careers.


Subject(s)
Criminals , Humans , Female , Male , Crime , Criminal Behavior , Specialization , Europe
9.
bioRxiv ; 2023 Oct 03.
Article in English | MEDLINE | ID: mdl-37873367

ABSTRACT

Background: The duplication-triplication/inverted-duplication (DUP-TRP/INV-DUP) structure is a type of complex genomic rearrangement (CGR) hypothesized to result from replicative repair of DNA due to replication fork collapse. It is often mediated by a pair of inverted low-copy repeats (LCR) followed by iterative template switches resulting in at least two breakpoint junctions in cis. Although it has been identified as an important mutation signature of pathogenicity for genomic disorders and cancer genomes, its architecture remains unresolved and is predicted to display at least four structural variation (SV) haplotypes. Results: Here we studied the genomic architecture of DUP-TRP/INV-DUP by investigating the genomic DNA of 24 patients with neurodevelopmental disorders identified by array comparative genomic hybridization (aCGH) on whom we found evidence for the existence of 4 out of 4 predicted SV haplotypes. Using a combination of short-read genome sequencing (GS), long-read GS, optical genome mapping, and Strand-seq, the haplotype structure was resolved in 18 samples. This approach refined the point of template switching between inverted LCRs in 4 samples, revealing a DNA segment of ∼2.2-5.5 kb of 100% nucleotide similarity. A prediction model was developed to infer the LCR used to mediate the non-allelic homology repair. Conclusions: These data provide experimental evidence supporting the hypothesis that inverted LCRs act as a recombinant substrate in replication-based repair mechanisms. Such inverted repeats are particularly relevant for formation of copy-number associated inversions, including the DUP-TRP/INV-DUP structures. Moreover, this type of CGR can result in multiple conformers, which contribute to generating diverse SV haplotypes in susceptible loci.

10.
Nature ; 621(7978): 355-364, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37612510

ABSTRACT

The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date [1] and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes [2]. Ampliconic sequences associated with these inversions show differing mutation rates that are sequence context dependent, and some ampliconic genes exhibit evidence for concerted evolution with the acquisition and purging of lineage-specific pseudogenes. The largest heterochromatic region in the human genome, Yq12, is composed of alternating repeat arrays that show extensive variation in the number, size and distribution, but retain a 1:1 copy-number ratio. Finally, our data suggest that the boundary between the recombining pseudoautosomal region 1 and the non-recombining portions of the X and Y chromosomes lies 500 kb away from the currently established [1] boundary. The availability of fully sequence-resolved Y chromosomes from multiple individuals provides a unique opportunity for identifying new associations of traits with specific Y-chromosomal variants and garnering insights into the evolution and function of complex regions of the human genome.


Subject(s)
Chromosomes, Human, Y , Evolution, Molecular , Humans , Male , Chromosomes, Human, Y/genetics , Genome, Human/genetics , Genomics , Mutation Rate , Phenotype , Euchromatin/genetics , Pseudogenes , Genetic Variation/genetics , Chromosomes, Human, X/genetics , Pseudoautosomal Regions/genetics
11.
Genome Med ; 15(1): 47, 2023 Jul 07.
Article in English | MEDLINE | ID: mdl-37420249

ABSTRACT

BACKGROUND: Cancer genome sequencing enables accurate classification of tumours and tumour subtypes. However, prediction performance is still limited using exome-only sequencing and for tumour types with low somatic mutation burden such as many paediatric tumours. Moreover, the ability to leverage deep representation learning in discovery of tumour entities remains unknown. METHODS: We introduce here Mutation-Attention (MuAt), a deep neural network to learn representations of simple and complex somatic alterations for prediction of tumour types and subtypes. In contrast to many previous methods, MuAt utilizes the attention mechanism on individual mutations instead of aggregated mutation counts. RESULTS: We trained MuAt models on 2587 whole cancer genomes (24 tumour types) from the Pan-Cancer Analysis of Whole Genomes (PCAWG) and 7352 cancer exomes (20 types) from the Cancer Genome Atlas (TCGA). MuAt achieved prediction accuracy of 89% for whole genomes and 64% for whole exomes, and a top-5 accuracy of 97% and 90%, respectively. MuAt models were found to be well-calibrated and perform well in three independent whole cancer genome cohorts with 10,361 tumours in total. We show MuAt to be able to learn clinically and biologically relevant tumour entities including acral melanoma, SHH-activated medulloblastoma, SPOP-associated prostate cancer, microsatellite instability, POLE proofreading deficiency, and MUTYH-associated pancreatic endocrine tumours without these tumour subtypes and subgroups being provided as training labels. Finally, scrutiny of MuAt attention matrices revealed both ubiquitous and tumour-type specific patterns of simple and complex somatic mutations. CONCLUSIONS: Integrated representations of somatic alterations learnt by MuAt were able to accurately identify histological tumour types and identify tumour entities, with potential to impact precision cancer medicine.


Subject(s)
Mutation , Neoplasms , Neoplasms/genetics , Neoplasms/pathology , Humans , Deep Learning , Benchmarking
12.
bioRxiv ; 2023 Jul 17.
Article in English | MEDLINE | ID: mdl-37503087

ABSTRACT

Single-cell DNA template strand sequencing (Strand-seq) allows a range of genomic analyses, including chromosome-length haplotype phasing and structural variation (SV) calling in individual cells. Here, we present MosaiCatcher v2, a standardised workflow and reference framework for single-cell SV detection using Strand-seq. This framework introduces a range of functionalities, including: an automated upstream Quality Control (QC) and assembly sub-workflow that relies on multiple genome assemblies and incorporates a multistep normalisation module, integration of the scNOVA SV functional characterization and of the ArbiGent SV genotyping modules, platform portability, as well as a user-friendly and shareable web report. These new features of MosaiCatcher v2 enable reproducible computational processing of Strand-seq data, which are increasingly used in human genetics and single-cell genomics, towards production environments.

13.
Genome Res ; 33(4): 496-510, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37164484

ABSTRACT

There has been tremendous progress in phased genome assembly production by combining long-read data with parental information or linked-read data. Nevertheless, a typical phased genome assembly generated by trio-hifiasm still generates more than 140 gaps. We perform a detailed analysis of gaps, assembly breaks, and misorientations from 182 haploid assemblies obtained from a diversity panel of 77 unique human samples. Although trio-based approaches using HiFi are the current gold standard, chromosome-wide phasing accuracy is comparable when using Strand-seq instead of parental data. Importantly, the majority of assembly gaps cluster near the largest and most identical repeats (including segmental duplications [35.4%], satellite DNA [22.3%], or regions enriched in GA/AT-rich DNA [27.4%]). Consequently, 1513 protein-coding genes overlap assembly gaps in at least one haplotype, and 231 are recurrently disrupted or missing from five or more haplotypes. Furthermore, we estimate that 6-7 Mbp of DNA are misorientated per haplotype irrespective of whether trio-free or trio-based approaches are used. Of these misorientations, 81% correspond to bona fide large inversion polymorphisms in the human species, most of which are flanked by large segmental duplications. We also identify large-scale alignment discontinuities consistent with 11.9 Mbp of deletions and 161.4 Mbp of insertions per haploid genome. Although 99% of this variation corresponds to satellite DNA, we identify 230 regions of euchromatic DNA with frequent expansions and contractions, nearly half of which overlap with 197 protein-coding genes. Such variable and incompletely assembled regions are important targets for future algorithmic development and pangenome representation.


Subject(s)
DNA, Satellite , Polymorphism, Genetic , Humans , DNA, Satellite/genetics , Haplotypes , Segmental Duplications, Genomic , Sequence Analysis, DNA
14.
Entropy (Basel) ; 25(4), 2023 Apr 18.
Article in English | MEDLINE | ID: mdl-37190466

ABSTRACT

The recent link discovered between generalized Legendre transforms and non-dually flat statistical manifolds suggests a fundamental reason behind the ubiquity of Rényi's divergence and entropy in a wide range of physical phenomena. However, these early findings still provide little intuition on the nature of this relationship and its implications for physical systems. Here we shed new light on the Legendre transform by revealing the consequences of its deformation via symplectic geometry and complexification. These findings reveal a novel common framework that leads to a principled and unified understanding of physical systems that are not well-described by classic information-theoretic quantities.

15.
Cell Genom ; 3(4): 100281, 2023 Apr 12.
Article in English | MEDLINE | ID: mdl-37082141

ABSTRACT

Cancer genomes harbor a broad spectrum of structural variants (SVs) driving tumorigenesis, a relevant subset of which escape discovery using short-read sequencing. We employed Oxford Nanopore Technologies (ONT) long-read sequencing in a paired diagnostic and post-therapy medulloblastoma to unravel the haplotype-resolved somatic genetic and epigenetic landscape. We assembled complex rearrangements, including a 1.55-Mbp chromothripsis event, and we uncover a complex SV pattern termed templated insertion (TI) thread, characterized by short (mostly <1 kb) insertions showing prevalent self-concatenation into highly amplified structures of up to 50 kbp in size. TI threads occur in 3% of cancers, with a prevalence up to 74% in liposarcoma, and frequent colocalization with chromothripsis. We also perform long-read-based methylome profiling and discover allele-specific methylation (ASM) effects, complex rearrangements exhibiting differential methylation, and differential promoter methylation in cancer-driver genes. Our study shows the advantage of long-read sequencing in the discovery and characterization of complex somatic rearrangements.

16.
Genome Biol ; 24(1): 100, 2023 Apr 30.
Article in English | MEDLINE | ID: mdl-37122002

ABSTRACT

The telomere-to-telomere (T2T) complete human reference has significantly improved our ability to characterize genome structural variation. To understand its impact on inversion polymorphisms, we remapped data from 41 genomes against the T2T reference genome and compared it to the GRCh38 reference. We find a ~21% increase in sensitivity, improving mapping of 63 inversions on the T2T reference. We identify 26 misorientations within GRCh38 and show that the T2T reference is three times more likely to represent the correct orientation of the major human allele. Analysis of 10 additional samples reveals novel rare inversions at chromosomes 15q25.2, 16p11.2, 16q22.1-23.1, and 22q11.21.


Subject(s)
Genome, Human , Polymorphism, Genetic , Humans , Genomic Structural Variation , Chromosome Inversion
20.
Phys Rev Lett ; 130(5): 057401, 2023 Feb 03.
Article in English | MEDLINE | ID: mdl-36800470

ABSTRACT

Homophily, the tendency of humans to attract each other when sharing similar features, traits, or opinions, has been identified as one of the main driving forces behind the formation of structured societies. Here we ask to what extent homophily can explain the formation of social groups, particularly their size distribution. We propose a spin-glass-inspired framework of self-assembly, where opinions are represented as multidimensional spins that dynamically self-assemble into groups; individuals within a group tend to share similar opinions (intragroup homophily), and opinions between individuals belonging to different groups tend to be different (intergroup heterophily). We compute the associated nontrivial phase diagram by solving a self-consistency equation for "magnetization" (combined average opinion). Below a critical temperature, there exist two stable phases: one ordered with nonzero magnetization and large clusters, the other disordered with zero magnetization and no clusters. The system exhibits a first-order transition to the disordered phase. We analytically derive the group-size distribution that successfully matches empirical group-size distributions from online communities.
