1.
Am J Hum Genet ; 110(8): 1229-1248, 2023 08 03.
Article in English | MEDLINE | ID: mdl-37541186

ABSTRACT

Despite advances in clinical genetic testing, including the introduction of exome sequencing (ES), more than 50% of individuals with a suspected Mendelian condition lack a precise molecular diagnosis. Clinical evaluation is increasingly undertaken by specialists outside of clinical genetics, often occurring in a tiered fashion and typically ending after ES. The current diagnostic rate reflects multiple factors, including technical limitations, incomplete understanding of variant pathogenicity, missing genotype-phenotype associations, complex gene-environment interactions, and reporting differences between clinical labs. Maintaining a clear understanding of the rapidly evolving landscape of diagnostic tests beyond ES, and their limitations, presents a challenge for non-genetics professionals. Newer tests, such as short-read genome or RNA sequencing, can be challenging to order, and emerging technologies, such as optical genome mapping and long-read DNA sequencing, are not available clinically. Furthermore, there is no clear guidance on the next best steps after inconclusive evaluation. Here, we review why a clinical genetic evaluation may be negative, discuss questions to be asked in this setting, and provide a framework for further investigation, including the advantages and disadvantages of new approaches that are nascent in the clinical sphere. We present a guide for the next best steps after inconclusive molecular testing based upon phenotype and prior evaluation, including when to consider referral to research consortia focused on elucidating the underlying cause of rare unsolved genetic disorders.


Subjects
Exome; Genetic Testing; Humans; Exome/genetics; Sequence Analysis, DNA; Phenotype; Exome Sequencing; Rare Diseases
2.
Genome Res ; 31(4): 635-644, 2021 04.
Article in English | MEDLINE | ID: mdl-33602693

ABSTRACT

The COVID-19 pandemic has sparked an urgent need to uncover the underlying biology of this devastating disease. Though RNA viruses mutate more rapidly than DNA viruses, there are a relatively small number of single nucleotide polymorphisms (SNPs) that differentiate the main SARS-CoV-2 lineages that have spread throughout the world. In this study, we investigated 129 RNA-seq data sets and 6928 consensus genomes to contrast the intra-host and inter-host diversity of SARS-CoV-2. Our analyses yielded three major observations. First, the mutational profile of SARS-CoV-2 highlights intra-host single nucleotide variant (iSNV) and SNP similarity, albeit with differences in C > U changes. Second, iSNV and SNP patterns in SARS-CoV-2 are more similar to MERS-CoV than SARS-CoV-1. Third, a significant fraction of insertions and deletions contribute to the genetic diversity of SARS-CoV-2. Altogether, our findings provide insight into SARS-CoV-2 genomic diversity, inform the design of detection tests, and highlight the potential of iSNVs for tracking the transmission of SARS-CoV-2.


Subjects
COVID-19/diagnosis; COVID-19/transmission; Genetic Variation; Genome, Viral; Real-Time Polymerase Chain Reaction/methods; SARS-CoV-2/genetics; COVID-19/virology; Host-Pathogen Interactions; Humans; Polymorphism, Single Nucleotide
3.
Ann Neurol ; 93(5): 1012-1022, 2023 05.
Article in English | MEDLINE | ID: mdl-36695634

ABSTRACT

OBJECTIVE: Identification of genetic risk factors for Parkinson disease (PD) has to date been primarily limited to the study of single nucleotide variants, which only represent a small fraction of the genetic variation in the human genome. Consequently, causal variants for most PD risk are not known. Here we focused on structural variants (SVs), which represent a major source of genetic variation in the human genome. We aimed to discover SVs associated with PD risk by performing the first large-scale characterization of SVs in PD. METHODS: We leveraged a recently developed computational pipeline to detect and genotype SVs from 7,772 Illumina short-read whole genome sequencing samples. Using this set of SVs, we performed a genome-wide association study using 2,585 cases and 2,779 controls and identified SVs associated with PD risk. Furthermore, to validate the presence of these variants, we generated a subset of matched whole-genome long-read sequencing data. RESULTS: We genotyped and tested 3,154 common SVs, representing over 412 million nucleotides of previously uncatalogued genetic variation. Using long-read sequencing data, we validated the presence of three novel deletion SVs that are associated with risk of PD from our initial association analysis, including a 2 kb intronic deletion within the gene LRRN4. INTERPRETATION: We identified three SVs associated with genetic risk of PD. This study represents the most comprehensive assessment of the contribution of SVs to the genetic risk of PD to date. ANN NEUROL 2023;93:1012-1022.


Subjects
Genome-Wide Association Study; Parkinson Disease; Humans; Parkinson Disease/genetics; Genome, Human; Whole Genome Sequencing; Genotype
4.
Hum Mutat ; 43(12): 2033-2053, 2022 12.
Article in English | MEDLINE | ID: mdl-36054313

ABSTRACT

Xia-Gibbs syndrome (XGS; MIM# 615829) is a rare Mendelian disorder characterized by developmental delay (DD), intellectual disability (ID), and hypotonia. Individuals with XGS typically harbor de novo protein-truncating mutations in the AT-Hook DNA binding motif containing 1 (AHDC1) gene, although some missense mutations can also cause XGS. Large de novo heterozygous deletions that encompass the AHDC1 gene have also been regarded as diagnostic for the disorder, without substantial evidence to support their pathogenicity. We analyzed 19 individuals with large contiguous deletions involving AHDC1, along with other genes. One individual bore the smallest known contiguous AHDC1 deletion (∼350 kb), encompassing eight other genes within chr1p36.11 (FGR, IFI6, FAM76A, STX12, PPP1R8, THEMIS2, RPA2, SMPDL3B) and terminating within the first intron of AHDC1. The breakpoint junctions and phase of the deletion were identified using both short- and long-read sequencing (Oxford Nanopore). Quantification of RNA expression patterns in whole blood revealed that AHDC1 exhibited a mono-allelic expression pattern with no deficiency in overall AHDC1 expression levels, in contrast to the other deleted genes, which exhibited a 50% reduction in mRNA expression. These results suggest that AHDC1 expression in this individual is compensated by a novel regulatory mechanism, and they advance the understanding of mutational and regulatory mechanisms in neurodevelopmental disorders.


Subjects
Abnormalities, Multiple; Intellectual Disability; Musculoskeletal Abnormalities; Neurodevelopmental Disorders; Humans; Abnormalities, Multiple/genetics; DNA-Binding Proteins/genetics; Endoribonucleases; Intellectual Disability/genetics; Neurodevelopmental Disorders/genetics; Phosphoprotein Phosphatases; Qa-SNARE Proteins; RNA-Binding Proteins; Sphingomyelin Phosphodiesterase
5.
Mem Cognit ; 48(4): 511-525, 2020 05.
Article in English | MEDLINE | ID: mdl-31755026

ABSTRACT

Previous research has shown that early-acquired words are produced faster than late-acquired words. Juhasz and colleagues (Juhasz, Lai & Woodcock, Behavior Research Methods, 47 (4), 1004-1019, 2015; Juhasz, The Quarterly Journal of Experimental Psychology, 1-10, 2018) argue that the Age-of-Acquisition (AoA) loci for complex words, specifically compound words, are found at the lexical/semantic level. In the current study, two experiments were conducted to evaluate this claim and investigate the influence of AoA in reading compound words aloud. In Experiment 1, 48 participants completed a word naming task. Using general linear mixed modelling, we found that the age at which the compound word was learned significantly affected the naming latencies beyond the other psycholinguistic properties measured. The second experiment required 48 participants to name the compound word when the two morphemes were presented with a space in-between (combinatorial naming, e.g. air plane). We found that the age at which the compound word was learned, as well as the AoA of the individual morphemes that formed the compound word, significantly influenced combinatorial naming latency. These findings are discussed in relation to theories of the AoA in language processing.


Subjects
Word Processing; Humans; Language Development; Psycholinguistics; Reaction Time; Reading; Semantics; Vocabulary
6.
Genomics ; 111(1): 43-49, 2019 01.
Article in English | MEDLINE | ID: mdl-29268960

ABSTRACT

Long sequencing reads offer unprecedented opportunities in the analysis and reconstruction of complex genomic regions. However, the gain in sequence length is often traded for quality. Several approaches have therefore been proposed recently (e.g. higher sequencing coverage, hybrid assembly or sequence correction) to enhance the quality of long sequencing reads. A simple and cost-effective approach is to use high-quality second-generation sequencing data to improve the quality of long reads. We designed a dedicated testing procedure and selected universal programs for long-read correction whose output sequences can be used in further genomic and transcriptomic studies. Our results show that HALC is the best choice for correction of long PacBio reads when both read size and quality are the main focus of the analysis. However, the tested tools show some unexpected behaviors, including read trimming and fragmentation.


Subjects
Algorithms; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; Animals; Databases, Genetic; Escherichia coli/genetics; Genomics; Humans; Oryza/genetics; Trypanosoma/genetics; Yeasts/genetics
8.
J Virol ; 89(23): 11899-908, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26378176

ABSTRACT

UNLABELLED: Infected peripheral blood mononuclear cells (PBMC) effectively transport equine herpesvirus type 1 (EHV-1), but not EHV-4, to endothelial cells (EC) lining the blood vessels of the pregnant uterus or central nervous system, a process that can result in abortion or myeloencephalopathy. We examined, using a dynamic in vitro model, the differences between EHV-1 and EHV-4 infection of PBMC and PBMC-EC interactions. In order to evaluate viral transfer between infected PBMC and EC, cocultivation assays were performed. Only EHV-1 was transferred from PBMC to EC, and viral glycoprotein B (gB) was shown to be mainly responsible for this form of cell-to-cell transfer. For addressing the more dynamic aspects of PBMC-EC interaction, infected PBMC were perfused through a flow channel containing EC in the presence of neutralizing antibodies. By simulating capillary blood flow and analyzing the behavior of infected PBMC through live fluorescence imaging and automated cell tracking, we observed that EHV-1 was able to maintain tethering and rolling of infected PBMC on EC more effectively than EHV-4. Deletion of US3 reduced the ability of infected PBMC to tether and roll compared to that of cells infected with parental virus, which resulted in a significant reduction in virus transfer from PBMC to EC. Taking the results together, we conclude that systemic spread and EC infection by EHV-1, but not EHV-4, is caused by its ability to infect and/or reprogram mononuclear cells with respect to their tethering and rolling behavior on EC and consequent virus transfer. IMPORTANCE: EHV-1 is widespread throughout the world and causes substantial economic losses through outbreaks of respiratory disease, abortion, and myeloencephalopathy. Despite many years of research, no fully protective vaccines have been developed, and several aspects of viral pathogenesis still need to be uncovered. 
In the current study, we investigated the molecular mechanisms that facilitate the cell-associated viremia, which is arguably the most important aspect of EHV-1 pathogenesis. The newly discovered functions of gB and pUS3 add new facets to their previously reported roles. Due to the conserved nature of cell-associated viremia among numerous herpesviruses, these results are also very relevant for viruses such as varicella-zoster virus, pseudorabies virus, human cytomegalovirus, and others. In addition, the constructed mutant and recombinant viruses exhibit potent in vitro replication but have significant defects in certain stages of the disease course. These viruses therefore show much promise as candidates for future live vaccines.


Subjects
Endothelial Cells/virology; Herpesviridae Infections/physiopathology; Herpesvirus 1, Equid/physiology; Herpesvirus 4, Equid/physiology; Leukocytes, Mononuclear/virology; Protein Serine-Threonine Kinases/metabolism; Viral Envelope Proteins/metabolism; Analysis of Variance; Animals; Cell Aggregation; Cells, Cultured; Fluorescence; Horses; In Vitro Techniques; Statistics, Nonparametric; Virus Internalization
9.
Nat Commun ; 15(1): 5327, 2024 Jun 22.
Article in English | MEDLINE | ID: mdl-38909018

ABSTRACT

The assignment of variants across haplotypes, phasing, is crucial for predicting the consequences, interaction, and inheritance of mutations and is a key step in improving our understanding of phenotype and disease. However, phasing is limited by read length and stretches of homozygosity along the genome. To overcome this limitation, we designed MethPhaser, a method that utilizes methylation signals from Oxford Nanopore Technologies to extend single nucleotide variation (SNV)-based phasing. We demonstrate that haplotype-specific methylation is widespread in human genomes and that the advent of long-read technologies has enabled direct reporting of methylation signals. For ONT R9 and R10 cell line data, we increase the phase length N50 by 78%-151% at a phasing accuracy of 83.4-98.7%. To assess the impact of tissue purity and random methylation signals due to inactivation, we also applied MethPhaser to blood samples from 4 patients, still showing improvements over SNV-only phasing. MethPhaser further improves phasing across HLA and multiple other medically relevant genes, improving our understanding of how mutations interact across multiple phenotypes. The concept of MethPhaser can also be extended to non-human diploid genomes. MethPhaser is available at https://github.com/treangenlab/methphaser .
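
The "phase length N50" reported above can be illustrated with a minimal sketch. It assumes the usual N50 definition (the block length at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total). The function names and the toy block lengths are hypothetical and are not taken from MethPhaser.

```python
def n50(lengths):
    """Return the N50 of a collection of phase-block lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first block that brings the running sum to at least half
        # of the total length defines the N50.
        if running >= half_total:
            return length


def percent_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Hypothetical phase blocks before and after joining two neighbouring blocks.
snv_only_blocks = [2, 3, 4, 5, 6]
extended_blocks = [2, 3, 4, 11]  # blocks of length 5 and 6 merged into one
print(n50(snv_only_blocks), n50(extended_blocks))
```

Merging adjacent blocks leaves the total phased length unchanged but shifts it into longer blocks, which is why the N50 rises.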


Subjects
DNA Methylation; Genome, Human; Haplotypes; Polymorphism, Single Nucleotide; Humans; Cell Line; Mutation
10.
Nat Biotechnol ; 42(10): 1571-1580, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38168980

ABSTRACT

Calling structural variations (SVs) is technically challenging, but using long reads remains the most accurate way to identify complex genomic alterations. Here we present Sniffles2, which improves over current methods by implementing repeat-aware clustering coupled with a fast consensus sequence and coverage-adaptive filtering. Sniffles2 is 11.8 times faster and 29% more accurate than state-of-the-art SV callers across different coverages (5-50×), sequencing technologies (ONT and HiFi) and SV types. Furthermore, Sniffles2 solves the problem of family-level to population-level SV calling to produce fully genotyped VCF files. Across 11 probands, we accurately identified causative SVs around MECP2, including highly complex alleles with three overlapping SVs. Sniffles2 also enables the detection of mosaic SVs in bulk long-read data. As a result, we identified multiple mosaic SVs in brain tissue from a patient with multiple system atrophy. The identified SVs showed a remarkable diversity within the cingulate cortex, impacting both genes involved in neuron function and repetitive elements.


Subjects
Mosaicism; Humans; High-Throughput Nucleotide Sequencing/methods; Genomic Structural Variation/genetics; Software; Sequence Analysis, DNA/methods
11.
medRxiv ; 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38712270

ABSTRACT

Both long-read genome sequencing (lrGS) and the recently published Telomere to Telomere (T2T) reference genome provide increased coverage and resolution across repetitive regions, promising heightened structural variant detection and improved mapping. Inversions (INV), intrachromosomal segments which are rotated 180° and inserted back into the same chromosome, are a class of structural variants particularly challenging to detect due to their copy-number neutral state and association with repetitive regions. Inversions represent about 1/20 of all balanced structural chromosome aberrations and can lead to disease by gene disruption or by altering regulatory regions of dosage-sensitive genes in cis. Here we remapped the genome data from six individuals carrying unsolved cytogenetically detected inversions. An INV6 and an INV10 were resolved using GRCh38 and T2T-CHM13. Finally, an INV9 required optical genome mapping, de novo assembly of lrGS data and T2T-CHM13. This inversion disrupted intron 25 of EHMT1, confirming a diagnosis of Kleefstra syndrome 1 (MIM#610253). These three inversions, only mappable in specific references, prompted us to investigate the presence and population frequencies of differential reference regions (DRRs) between T2T-CHM13, GRCh37, GRCh38, the chimpanzee and bonobo, and hundreds of megabases of DRRs were identified. Our results emphasize the significance of the chosen reference genome and the added benefits of lrGS and optical genome mapping in solving rearrangements in challenging regions of the genome. This is particularly important for inversions and may impact clinical diagnostics.

12.
NPJ Parkinsons Dis ; 10(1): 136, 2024 Jul 26.
Article in English | MEDLINE | ID: mdl-39060285

ABSTRACT

Parkinson's disease (PD) is a common neurodegenerative disorder with a significant risk proportion driven by genetics. While much progress has been made, most of the heritability remains unknown. This is in part because previous genetic studies have focused on the contribution of single nucleotide variants. More complex forms of variation, such as structural variants and tandem repeats, are already associated with several synucleinopathies. However, because more sophisticated sequencing methods are usually required to detect these regions, little is understood regarding their contribution to PD. One example is a polymorphic CT-rich region in intron 4 of the SNCA gene. This haplotype has been suggested to be associated with risk of Lewy Body (LB) pathology in Alzheimer's Disease and SNCA gene expression, but is yet to be investigated in PD. Here, we attempt to resolve this CT-rich haplotype and investigate its role in PD. We performed targeted PacBio HiFi sequencing of the region in 1375 PD cases and 959 controls. We replicate the previously reported associations and identify a novel association between two PD risk SNVs (rs356182 and rs5019538) and haplotype 4, the largest haplotype. Through quantitative trait locus analyses we identify a significant haplotype 4 association with alternative CAGE transcriptional start site usage, not leading to significant differential SNCA gene expression in post-mortem frontal cortex brain tissue. Therefore, disease association in this locus might not be biologically driven by this CT-rich repeat region. Our data demonstrate the complexity of this SNCA region and highlight that further follow-up functional studies are warranted.

13.
medRxiv ; 2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38562723

ABSTRACT

Comprehending the mechanism behind human diseases with an established heritable component represents the forefront of personalized medicine. Nevertheless, numerous medically important genes are inaccurately represented in short-read sequencing data analysis due to their complexity and repetitiveness (the so-called 'dark regions' of the human genome). The advent of PacBio as a long-read platform has provided new insights, yet the cost of HiFi whole-genome sequencing (WGS) frequently remains prohibitive. We introduce a targeted sequencing and analysis framework, Twist Alliance Dark Genes Panel (TADGP), designed to offer phased variants across 389 medically important yet complex autosomal genes. We highlight TADGP accuracy across eleven control samples and compare it to WGS. This demonstrates that TADGP achieves variant calling accuracy comparable to HiFi-WGS data, but at a fraction of the cost, thus enabling scalability and broad applicability for studying rare diseases or complementing previously sequenced samples to gain insights into these complex genes. TADGP revealed several candidate variants across all cases and provided insight into LPA diversity when tested on samples from rare disease and cardiovascular disease cohorts. In both cohorts, we identified novel variants affecting individual disease-associated genes (e.g., IKZF1, KCNE1). Nevertheless, the annotation of the variants across these 389 medically important genes remains challenging due to their underrepresentation in ClinVar and gnomAD. Consequently, we also offer an annotation resource to enhance the evaluation and prioritization of these variants. Overall, we demonstrate that TADGP offers a cost-efficient and scalable approach to routinely assess the dark regions of the human genome with clinical relevance.

14.
BMC Med Genomics ; 17(1): 255, 2024 Oct 24.
Article in English | MEDLINE | ID: mdl-39449055

ABSTRACT

The abundance of Lp(a) protein, which is directly affected by the copy number (CN) of KIV-2, a 5.5 kbp sub-region, has significant implications for the risk of cardiovascular disease (CVD). KIV-2 is highly polymorphic in the population and accurate analysis is challenging. In this study, we present the DRAGEN KIV-2 CN caller, which utilizes short reads. Data across 166 WGS show that the caller has high accuracy compared to optical mapping and can further phase approximately 50% of the samples. We compared KIV-2 CN to 24 previously postulated KIV-2 relevant SNVs, revealing that many are ineffective predictors of KIV-2 copy number. Population studies, including USA-based cohorts, showed distinct KIV-2 CN distributions for European-, African-, and Hispanic-American populations and further underscored the limitations of SNV predictors. We demonstrate that the CN estimates correlate significantly with the available Lp(a) protein levels and that phasing is highly important.


Subjects
Alleles; Cardiovascular Diseases; Lipoprotein(a); Humans; Cardiovascular Diseases/genetics; Lipoprotein(a)/genetics; Lipoprotein(a)/blood; DNA Copy Number Variations; Genetic Predisposition to Disease; Polymorphism, Single Nucleotide
15.
Cell Genom ; 4(7): 100590, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38908378

ABSTRACT

The duplication-triplication/inverted-duplication (DUP-TRP/INV-DUP) structure is a complex genomic rearrangement (CGR). Although it has been identified as an important pathogenic DNA mutation signature in genomic disorders and cancer genomes, its architecture remains unresolved. Here, we studied the genomic architecture of DUP-TRP/INV-DUP by investigating the DNA of 24 patients identified by array comparative genomic hybridization (aCGH) on whom we found evidence for the existence of 4 out of 4 predicted structural variant (SV) haplotypes. Using a combination of short-read genome sequencing (GS), long-read GS, optical genome mapping, and single-cell DNA template strand sequencing (strand-seq), the haplotype structure was resolved in 18 samples. The point of template switching in 4 samples was shown to be a segment of ∼2.2-5.5 kb of 100% nucleotide similarity within inverted repeat pairs. These data provide experimental evidence that inverted low-copy repeats act as recombinant substrates. This type of CGR can result in multiple conformers generating diverse SV haplotypes in susceptible dosage-sensitive loci.


Subjects
Haplotypes; Humans; Haplotypes/genetics; Comparative Genomic Hybridization; Genomic Structural Variation/genetics; Genome, Human/genetics; Gene Duplication/genetics
16.
G3 (Bethesda) ; 13(2)2023 02 09.
Article in English | MEDLINE | ID: mdl-36454082

ABSTRACT

Identifying selection on polygenic complex traits in crops and livestock is important for understanding evolution and helps prioritize important characteristics for breeding. Quantitative trait loci (QTL) that contribute to polygenic trait variation often exhibit small or infinitesimal effects. This hinders the ability to detect QTL controlling polygenic traits because enormously high statistical power is needed for their detection. Recently, we circumvented this challenge by introducing a method to identify selection on complex traits by evaluating the relationship between genome-wide changes in allele frequency and estimates of effect size. The approach involves calculating a composite statistic across all markers that captures this relationship, followed by implementing a linkage disequilibrium-aware permutation test to evaluate if the observed pattern differs from that expected due to drift during evolution and population stratification. In this manuscript, we describe "Ghat," an R package developed to implement this method to test for selection on polygenic traits. We demonstrate the package by applying it to test for polygenic selection on 15 published European wheat traits including yield, biomass, quality, morphological characteristics, and disease resistance traits. Moreover, we applied Ghat to different simulated populations with different breeding histories and genetic architectures. The results highlight the power of Ghat to identify selection on complex traits. The Ghat package is accessible on CRAN, the Comprehensive R Archive Network, and on GitHub.
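
The idea of a composite statistic combined with a permutation test can be sketched as follows. This is an illustrative simplification and not the Ghat package's API. It assumes the composite statistic is the sum over markers of allele-frequency change times estimated effect size, and it uses a plain shuffle of effect sizes rather than the linkage disequilibrium-aware permutation the abstract describes. All function names and the toy values are hypothetical.

```python
import random


def composite_statistic(freq_change, effect):
    """Sum over markers of (allele-frequency change x estimated effect size).

    Assumed form of the composite statistic, for illustration only.
    """
    return sum(d * a for d, a in zip(freq_change, effect))


def permutation_p_value(freq_change, effect, n_perm=1000, seed=0):
    """Two-sided p-value from shuffling effect sizes across markers.

    Simplification: a plain shuffle ignores linkage disequilibrium,
    unlike the LD-aware test described in the abstract.
    """
    rng = random.Random(seed)
    observed = composite_statistic(freq_change, effect)
    shuffled = list(effect)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(composite_statistic(freq_change, shuffled)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)


# Toy data: frequency changes that tend to share the sign of the effects.
delta_p = [0.10, -0.05, 0.08, -0.02, 0.04, -0.07]
alpha = [1.2, -0.8, 0.9, -0.1, 0.5, -1.0]
stat, p = permutation_p_value(delta_p, alpha)
print(stat, p)
```

A statistic far from zero relative to the permuted values suggests that alleles changed frequency in the direction of their effects, which is the pattern expected under selection on the trait.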


Subjects
Multifactorial Inheritance; Plant Breeding; Multifactorial Inheritance/genetics; Quantitative Trait Loci; Linkage Disequilibrium; Gene Frequency; Phenotype
17.
Genome Biol ; 24(1): 221, 2023 10 05.
Article in English | MEDLINE | ID: mdl-37798733

ABSTRACT

Genomic benchmark datasets are essential to driving the field of genomics and bioinformatics. They provide a snapshot of the performances of sequencing technologies and analytical methods and highlight future challenges. However, they depend on sequencing technology, reference genome, and available benchmarking methods. Thus, creating a genomic benchmark dataset is laborious and highly challenging, often involving multiple sequencing technologies, different variant calling tools, and laborious manual curation. In this review, we discuss the available benchmark datasets and their utility. Additionally, we focus on the most recent benchmark of genes with medical relevance and challenging genomic complexity.


Subjects
Benchmarking; Genomics; Genomics/methods; Computational Biology/methods; Genome; High-Throughput Nucleotide Sequencing/methods
18.
Genome Biol ; 24(1): 31, 2023 02 21.
Article in English | MEDLINE | ID: mdl-36810122

ABSTRACT

The current version of the human reference genome, GRCh38, contains a number of errors including 1.2 Mbp of falsely duplicated and 8.04 Mbp of collapsed regions. These errors impact the variant calling of 33 protein-coding genes, including 12 with medical relevance. Here, we present FixItFelix, an efficient remapping approach, together with a modified version of the GRCh38 reference genome that improves the subsequent analysis across these genes within minutes for an existing alignment file while maintaining the same coordinates. We showcase these improvements over multi-ethnic control samples, demonstrating improvements for population variant calling as well as eQTL studies.


Subjects
Genome, Human; Genomics; Humans; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA
19.
Cancer Discov ; 13(4): 910-927, 2023 04 03.
Article in English | MEDLINE | ID: mdl-36715691

ABSTRACT

The human papillomavirus (HPV) genome is integrated into host DNA in most HPV-positive cancers, but the consequences for chromosomal integrity are unknown. Continuous long-read sequencing of oropharyngeal cancers and cancer cell lines identified a previously undescribed form of structural variation, "heterocateny," characterized by diverse, interrelated, and repetitive patterns of concatemerized virus and host DNA segments within a cancer. Unique breakpoints shared across structural variants facilitated stepwise reconstruction of their evolution from a common molecular ancestor. This analysis revealed that virus and virus-host concatemers are unstable and, upon insertion into and excision from chromosomes, facilitate capture, amplification, and recombination of host DNA and chromosomal rearrangements. Evidence of heterocateny was detected in extrachromosomal and intrachromosomal DNA. These findings indicate that heterocateny is driven by the dynamic, aberrant replication and recombination of an oncogenic DNA virus, thereby extending known consequences of HPV integration to include promotion of intratumoral heterogeneity and clonal evolution. SIGNIFICANCE: Long-read sequencing of HPV-positive cancers revealed "heterocateny," a previously unreported form of genomic structural variation characterized by heterogeneous, interrelated, and repetitive genomic rearrangements within a tumor. Heterocateny is driven by unstable concatemerized HPV genomes, which facilitate capture, rearrangement, and amplification of host DNA, and promotes intratumoral heterogeneity and clonal evolution. See related commentary by McBride and White, p. 814. This article is highlighted in the In This Issue feature, p. 799.


Subjects
Oropharyngeal Neoplasms; Papillomavirus Infections; Humans; Human Papillomavirus; Gene Rearrangement; Clonal Evolution/genetics; Virus Integration/genetics; Papillomaviridae/genetics
20.
bioRxiv ; 2023 Oct 03.
Article in English | MEDLINE | ID: mdl-37873367

ABSTRACT

Background: The duplication-triplication/inverted-duplication (DUP-TRP/INV-DUP) structure is a type of complex genomic rearrangement (CGR) hypothesized to result from replicative repair of DNA due to replication fork collapse. It is often mediated by a pair of inverted low-copy repeats (LCR) followed by iterative template switches resulting in at least two breakpoint junctions in cis. Although it has been identified as an important mutation signature of pathogenicity for genomic disorders and cancer genomes, its architecture remains unresolved and is predicted to display at least four structural variation (SV) haplotypes. Results: Here we studied the genomic architecture of DUP-TRP/INV-DUP by investigating the genomic DNA of 24 patients with neurodevelopmental disorders identified by array comparative genomic hybridization (aCGH) on whom we found evidence for the existence of 4 out of 4 predicted SV haplotypes. Using a combination of short-read genome sequencing (GS), long-read GS, optical genome mapping and StrandSeq, the haplotype structure was resolved in 18 samples. This approach refined the point of template switching between inverted LCRs in 4 samples, revealing a DNA segment of ∼2.2-5.5 kb of 100% nucleotide similarity. A prediction model was developed to infer the LCR used to mediate the non-allelic homology repair. Conclusions: These data provide experimental evidence supporting the hypothesis that inverted LCRs act as a recombinant substrate in replication-based repair mechanisms. Such inverted repeats are particularly relevant for formation of copy-number associated inversions, including the DUP-TRP/INV-DUP structures. Moreover, this type of CGR can result in multiple conformers, which contribute to generating diverse SV haplotypes in susceptible loci.
