Results 1 - 20 of 44
1.
Am J Hum Genet ; 110(8): 1229-1248, 2023 08 03.
Article in English | MEDLINE | ID: mdl-37541186

ABSTRACT

Despite advances in clinical genetic testing, including the introduction of exome sequencing (ES), more than 50% of individuals with a suspected Mendelian condition lack a precise molecular diagnosis. Clinical evaluation is increasingly undertaken by specialists outside of clinical genetics, often occurring in a tiered fashion and typically ending after ES. The current diagnostic rate reflects multiple factors, including technical limitations, incomplete understanding of variant pathogenicity, missing genotype-phenotype associations, complex gene-environment interactions, and reporting differences between clinical labs. Maintaining a clear understanding of the rapidly evolving landscape of diagnostic tests beyond ES, and their limitations, presents a challenge for non-genetics professionals. Newer tests, such as short-read genome or RNA sequencing, can be challenging to order, and emerging technologies, such as optical genome mapping and long-read DNA sequencing, are not available clinically. Furthermore, there is no clear guidance on the next best steps after inconclusive evaluation. Here, we review why a clinical genetic evaluation may be negative, discuss questions to be asked in this setting, and provide a framework for further investigation, including the advantages and disadvantages of new approaches that are nascent in the clinical sphere. We present a guide for the next best steps after inconclusive molecular testing based upon phenotype and prior evaluation, including when to consider referral to research consortia focused on elucidating the underlying cause of rare unsolved genetic disorders.


Subject(s)
Exome , Genetic Testing , Humans , Exome/genetics , Sequence Analysis, DNA , Phenotype , Exome Sequencing , Rare Diseases
2.
Genome Res ; 31(4): 635-644, 2021 04.
Article in English | MEDLINE | ID: mdl-33602693

ABSTRACT

The COVID-19 pandemic has sparked an urgent need to uncover the underlying biology of this devastating disease. Though RNA viruses mutate more rapidly than DNA viruses, there are a relatively small number of single nucleotide polymorphisms (SNPs) that differentiate the main SARS-CoV-2 lineages that have spread throughout the world. In this study, we investigated 129 RNA-seq data sets and 6928 consensus genomes to contrast the intra-host and inter-host diversity of SARS-CoV-2. Our analyses yielded three major observations. First, the mutational profile of SARS-CoV-2 highlights intra-host single nucleotide variant (iSNV) and SNP similarity, albeit with differences in C > U changes. Second, iSNV and SNP patterns in SARS-CoV-2 are more similar to MERS-CoV than SARS-CoV-1. Third, a significant fraction of insertions and deletions contribute to the genetic diversity of SARS-CoV-2. Altogether, our findings provide insight into SARS-CoV-2 genomic diversity, inform the design of detection tests, and highlight the potential of iSNVs for tracking the transmission of SARS-CoV-2.


Subject(s)
COVID-19/diagnosis , COVID-19/transmission , Genetic Variation , Genome, Viral , Real-Time Polymerase Chain Reaction/methods , SARS-CoV-2/genetics , COVID-19/virology , Host-Pathogen Interactions , Humans , Polymorphism, Single Nucleotide
3.
Ann Neurol ; 93(5): 1012-1022, 2023 05.
Article in English | MEDLINE | ID: mdl-36695634

ABSTRACT

OBJECTIVE: Identification of genetic risk factors for Parkinson disease (PD) has to date been primarily limited to the study of single nucleotide variants, which only represent a small fraction of the genetic variation in the human genome. Consequently, causal variants for most PD risk are not known. Here we focused on structural variants (SVs), which represent a major source of genetic variation in the human genome. We aimed to discover SVs associated with PD risk by performing the first large-scale characterization of SVs in PD. METHODS: We leveraged a recently developed computational pipeline to detect and genotype SVs from 7,772 Illumina short-read whole genome sequencing samples. Using this set of SV variants, we performed a genome-wide association study using 2,585 cases and 2,779 controls and identified SVs associated with PD risk. Furthermore, to validate the presence of these variants, we generated a subset of matched whole-genome long-read sequencing data. RESULTS: We genotyped and tested 3,154 common SVs, representing over 412 million nucleotides of previously uncatalogued genetic variation. Using long-read sequencing data, we validated the presence of three novel deletion SVs that are associated with risk of PD from our initial association analysis, including a 2 kb intronic deletion within the gene LRRN4. INTERPRETATION: We identified three SVs associated with genetic risk of PD. This study represents the most comprehensive assessment of the contribution of SVs to the genetic risk of PD to date. ANN NEUROL 2023;93:1012-1022.


Subject(s)
Genome-Wide Association Study , Parkinson Disease , Humans , Parkinson Disease/genetics , Genome, Human , Whole Genome Sequencing , Genotype
4.
Hum Mutat ; 43(12): 2033-2053, 2022 12.
Article in English | MEDLINE | ID: mdl-36054313

ABSTRACT

Xia-Gibbs syndrome (XGS; MIM# 615829) is a rare Mendelian disorder characterized by developmental delay (DD), intellectual disability (ID), and hypotonia. Individuals with XGS typically harbor de novo protein-truncating mutations in the AT-Hook DNA binding motif containing 1 (AHDC1) gene, although some missense mutations can also cause XGS. Large de novo heterozygous deletions that encompass the AHDC1 gene have also been ascribed as diagnostic for the disorder, without substantial evidence to support their pathogenicity. We analyzed 19 individuals with large contiguous deletions involving AHDC1, along with other genes. One individual bore the smallest known contiguous AHDC1 deletion (∼350 Kb), encompassing eight other genes within chr1p36.11 (FGR, IFI6, FAM76A, STX12, PPP1R8, THEMIS2, RPA2, SMPDL3B) and terminating within the first intron of AHDC1. The breakpoint junctions and phase of the deletion were identified using both short and long read sequencing (Oxford Nanopore). Quantification of RNA expression patterns in whole blood revealed that AHDC1 exhibited a mono-allelic expression pattern with no deficiency in overall AHDC1 expression levels, in contrast to the other deleted genes, which exhibited a 50% reduction in mRNA expression. These results suggest that AHDC1 expression in this individual is compensated by a novel regulatory mechanism and advance understanding of mutational and regulatory mechanisms in neurodevelopmental disorders.


Subject(s)
Abnormalities, Multiple , Intellectual Disability , Musculoskeletal Abnormalities , Neurodevelopmental Disorders , Humans , Abnormalities, Multiple/genetics , DNA-Binding Proteins/genetics , Endoribonucleases , Intellectual Disability/genetics , Neurodevelopmental Disorders/genetics , Phosphoprotein Phosphatases , Qa-SNARE Proteins , RNA-Binding Proteins , Sphingomyelin Phosphodiesterase
5.
Mem Cognit ; 48(4): 511-525, 2020 05.
Article in English | MEDLINE | ID: mdl-31755026

ABSTRACT

Previous research has shown that early-acquired words are produced faster than late-acquired words. Juhasz and colleagues (Juhasz, Lai & Woodcock, Behavior Research Methods, 47 (4), 1004-1019, 2015; Juhasz, The Quarterly Journal of Experimental Psychology, 1-10, 2018) argue that the Age-of-Acquisition (AoA) loci for complex words, specifically compound words, are found at the lexical/semantic level. In the current study, two experiments were conducted to evaluate this claim and investigate the influence of AoA in reading compound words aloud. In Experiment 1, 48 participants completed a word naming task. Using general linear mixed modelling, we found that the age at which the compound word was learned significantly affected the naming latencies beyond the other psycholinguistic properties measured. The second experiment required 48 participants to name the compound word when the two morphemes were presented with a space in-between (combinatorial naming, e.g. air plane). We found that the age at which the compound word was learned, as well as the AoA of the individual morphemes that formed the compound word, significantly influenced combinatorial naming latency. These findings are discussed in relation to theories of the AoA in language processing.


Subject(s)
Word Processing , Humans , Language Development , Psycholinguistics , Reaction Time , Reading , Semantics , Vocabulary
6.
Genomics ; 111(1): 43-49, 2019 01.
Article in English | MEDLINE | ID: mdl-29268960

ABSTRACT

Long sequencing reads offer unprecedented opportunities in analysis and reconstruction of complex genomic regions. However, the gain in sequence length is often traded for quality. Therefore, several approaches have recently been proposed (e.g. higher sequencing coverage, hybrid assembly or sequence correction) to enhance the quality of long sequencing reads. A simple and cost-effective approach is to use high-quality 2nd generation sequencing data to improve the quality of long reads. We designed a dedicated testing procedure and selected universal programs for long read correction, which output sequences that can be used in further genomic and transcriptomic studies. Our results show that HALC is the best choice for correction of long PacBio reads when both read size and quality are the main focus of the analysis. However, the tested tools show some unexpected behaviors, including read trimming and fragmentation.


Subject(s)
Algorithms , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Animals , Databases, Genetic , Escherichia coli/genetics , Genomics , Humans , Oryza/genetics , Trypanosoma/genetics , Yeasts/genetics
8.
J Virol ; 89(23): 11899-908, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26378176

ABSTRACT

UNLABELLED: Infected peripheral blood mononuclear cells (PBMC) effectively transport equine herpesvirus type 1 (EHV-1), but not EHV-4, to endothelial cells (EC) lining the blood vessels of the pregnant uterus or central nervous system, a process that can result in abortion or myeloencephalopathy. We examined, using a dynamic in vitro model, the differences between EHV-1 and EHV-4 infection of PBMC and PBMC-EC interactions. In order to evaluate viral transfer between infected PBMC and EC, cocultivation assays were performed. Only EHV-1 was transferred from PBMC to EC, and viral glycoprotein B (gB) was shown to be mainly responsible for this form of cell-to-cell transfer. For addressing the more dynamic aspects of PBMC-EC interaction, infected PBMC were perfused through a flow channel containing EC in the presence of neutralizing antibodies. By simulating capillary blood flow and analyzing the behavior of infected PBMC through live fluorescence imaging and automated cell tracking, we observed that EHV-1 was able to maintain tethering and rolling of infected PBMC on EC more effectively than EHV-4. Deletion of US3 reduced the ability of infected PBMC to tether and roll compared to that of cells infected with parental virus, which resulted in a significant reduction in virus transfer from PBMC to EC. Taking the results together, we conclude that systemic spread and EC infection by EHV-1, but not EHV-4, is caused by its ability to infect and/or reprogram mononuclear cells with respect to their tethering and rolling behavior on EC and consequent virus transfer. IMPORTANCE: EHV-1 is widespread throughout the world and causes substantial economic losses through outbreaks of respiratory disease, abortion, and myeloencephalopathy. Despite many years of research, no fully protective vaccines have been developed, and several aspects of viral pathogenesis still need to be uncovered. 
In the current study, we investigated the molecular mechanisms that facilitate the cell-associated viremia, which is arguably the most important aspect of EHV-1 pathogenesis. The newly discovered functions of gB and pUS3 add new facets to their previously reported roles. Due to the conserved nature of cell-associated viremia among numerous herpesviruses, these results are also very relevant for viruses such as varicella-zoster virus, pseudorabies virus, human cytomegalovirus, and others. In addition, the constructed mutant and recombinant viruses exhibit potent in vitro replication but have significant defects in certain stages of the disease course. These viruses therefore show much promise as candidates for future live vaccines.


Subject(s)
Endothelial Cells/virology , Herpesviridae Infections/physiopathology , Herpesvirus 1, Equid/physiology , Herpesvirus 4, Equid/physiology , Leukocytes, Mononuclear/virology , Protein Serine-Threonine Kinases/metabolism , Viral Envelope Proteins/metabolism , Analysis of Variance , Animals , Cell Aggregation , Cells, Cultured , Fluorescence , Horses , In Vitro Techniques , Statistics, Nonparametric , Virus Internalization
9.
Nat Commun ; 15(1): 5327, 2024 Jun 22.
Article in English | MEDLINE | ID: mdl-38909018

ABSTRACT

The assignment of variants across haplotypes, phasing, is crucial for predicting the consequences, interaction, and inheritance of mutations and is a key step in improving our understanding of phenotype and disease. However, phasing is limited by read length and stretches of homozygosity along the genome. To overcome this limitation, we designed MethPhaser, a method that utilizes methylation signals from Oxford Nanopore Technologies to extend Single Nucleotide Variation (SNV)-based phasing. We demonstrate that haplotype-specific methylations extensively exist in human genomes and the advent of long-read technologies enabled direct report of methylation signals. For ONT R9 and R10 cell line data, we increase the phase length N50 by 78%-151% at a phasing accuracy of 83.4-98.7%. To assess the impact of tissue purity and random methylation signals due to inactivation, we also applied MethPhaser on blood samples from 4 patients, still showing improvements over SNV-only phasing. MethPhaser further improves phasing across HLA and multiple other medically relevant genes, improving our understanding of how mutations interact across multiple phenotypes. The concept of MethPhaser can also be extended to non-human diploid genomes. MethPhaser is available at https://github.com/treangenlab/methphaser .


Subject(s)
DNA Methylation , Genome, Human , Haplotypes , Polymorphism, Single Nucleotide , Humans , Cell Line , Mutation
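
The abstract above reports its gains as a percentage increase in phase length N50. As a reading aid, here is a minimal sketch of how that metric is conventionally computed: the N50 is the block length at which the blocks, sorted from longest to shortest, first cover half of the total phased length. The function names and toy block lengths below are illustrative assumptions; they are not taken from MethPhaser or its data.

```python
def n50(block_lengths):
    """Return the N50 of a collection of phase-block lengths (e.g. in bp)."""
    lengths = sorted(block_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one block length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first block that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length


def percent_increase(before, after):
    """Relative change from `before` to `after`, expressed in percent."""
    return 100.0 * (after - before) / before


# Toy example (made-up numbers): five SNV-only blocks are merged into two.
snv_only_blocks = [2, 3, 4, 5, 6]
extended_blocks = [6, 14]
gain = percent_increase(n50(snv_only_blocks), n50(extended_blocks))
```

Merging neighbouring blocks leaves the total phased length unchanged but raises the N50, which is why the metric is a natural summary for a method that extends existing phase blocks.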
10.
Nat Biotechnol ; 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38168980

ABSTRACT

Calling structural variations (SVs) is technically challenging, but using long reads remains the most accurate way to identify complex genomic alterations. Here we present Sniffles2, which improves over current methods by implementing a repeat aware clustering coupled with a fast consensus sequence and coverage-adaptive filtering. Sniffles2 is 11.8 times faster and 29% more accurate than state-of-the-art SV callers across different coverages (5-50×), sequencing technologies (ONT and HiFi) and SV types. Furthermore, Sniffles2 solves the problem of family-level to population-level SV calling to produce fully genotyped VCF files. Across 11 probands, we accurately identified causative SVs around MECP2, including highly complex alleles with three overlapping SVs. Sniffles2 also enables the detection of mosaic SVs in bulk long-read data. As a result, we identified multiple mosaic SVs in brain tissue from a patient with multiple system atrophy. The identified SV showed a remarkable diversity within the cingulate cortex, impacting both genes involved in neuron function and repetitive elements.

11.
medRxiv ; 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38712270

ABSTRACT

Both long-read genome sequencing (lrGS) and the recently published Telomere to Telomere (T2T) reference genome provide increased coverage and resolution across repetitive regions promising heightened structural variant detection and improved mapping. Inversions (INV), intrachromosomal segments which are rotated 180° and inserted back into the same chromosome, are a class of structural variants particularly challenging to detect due to their copy-number neutral state and association with repetitive regions. Inversions represent about 1/20 of all balanced structural chromosome aberrations and can lead to disease by gene disruption or altering regulatory regions of dosage sensitive genes in cis. Here we remapped the genome data from six individuals carrying unsolved cytogenetically detected inversions. An INV6 and INV10 were resolved using GRCh38 and T2T-CHM13. Finally, an INV9 required optical genome mapping, de novo assembly of lrGS data and T2T-CHM13. This inversion disrupted intron 25 of EHMT1, confirming a diagnosis of Kleefstra syndrome 1 (MIM#610253). These three inversions, only mappable in specific references, prompted us to investigate the presence and population frequencies of differential reference regions (DRRs) between T2T-CHM13, GRCh37, GRCh38, the chimpanzee and bonobo, and hundreds of megabases of DRRs were identified. Our results emphasize the significance of the chosen reference genome and the added benefits of lrGS and optical genome mapping in solving rearrangements in challenging regions of the genome. This is particularly important for inversions and may impact clinical diagnostics.

12.
medRxiv ; 2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38562723

ABSTRACT

Comprehending the mechanism behind human diseases with an established heritable component represents the forefront of personalized medicine. Nevertheless, numerous medically important genes are inaccurately represented in short-read sequencing data analysis due to their complexity and repetitiveness or the so-called 'dark regions' of the human genome. The advent of PacBio as a long-read platform has provided new insights, yet HiFi whole-genome sequencing (WGS) cost remains frequently prohibitive. We introduce a targeted sequencing and analysis framework, Twist Alliance Dark Genes Panel (TADGP), designed to offer phased variants across 389 medically important yet complex autosomal genes. We highlight TADGP accuracy across eleven control samples and compare it to WGS. This demonstrates that TADGP achieves variant calling accuracy comparable to HiFi-WGS data, but at a fraction of the cost, thus enabling scalability and broad applicability for studying rare diseases or complementing previously sequenced samples to gain insights into these complex genes. TADGP revealed several candidate variants across all cases and provided insight into LPA diversity when tested on samples from rare disease and cardiovascular disease cohorts. In both cohorts, we identified novel variants affecting individual disease-associated genes (e.g., IKZF1, KCNE1). Nevertheless, the annotation of the variants across these 389 medically important genes remains challenging due to their underrepresentation in ClinVar and gnomAD. Consequently, we also offer an annotation resource to enhance the evaluation and prioritization of these variants. Overall, we demonstrate that TADGP offers a cost-efficient and scalable approach to routinely assess the dark regions of the human genome with clinical relevance.

13.
Cell Genom ; 4(7): 100590, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38908378

ABSTRACT

The duplication-triplication/inverted-duplication (DUP-TRP/INV-DUP) structure is a complex genomic rearrangement (CGR). Although it has been identified as an important pathogenic DNA mutation signature in genomic disorders and cancer genomes, its architecture remains unresolved. Here, we studied the genomic architecture of DUP-TRP/INV-DUP by investigating the DNA of 24 patients identified by array comparative genomic hybridization (aCGH) on whom we found evidence for the existence of 4 out of 4 predicted structural variant (SV) haplotypes. Using a combination of short-read genome sequencing (GS), long-read GS, optical genome mapping, and single-cell DNA template strand sequencing (strand-seq), the haplotype structure was resolved in 18 samples. The point of template switching in 4 samples was shown to be a segment of ∼2.2-5.5 kb of 100% nucleotide similarity within inverted repeat pairs. These data provide experimental evidence that inverted low-copy repeats act as recombinant substrates. This type of CGR can result in multiple conformers generating diverse SV haplotypes in susceptible dosage-sensitive loci.


Subject(s)
Haplotypes , Humans , Haplotypes/genetics , Comparative Genomic Hybridization , Genomic Structural Variation/genetics , Genome, Human/genetics , Gene Duplication/genetics
14.
G3 (Bethesda) ; 13(2)2023 02 09.
Article in English | MEDLINE | ID: mdl-36454082

ABSTRACT

Identifying selection on polygenic complex traits in crops and livestock is important for understanding evolution and helps prioritize important characteristics for breeding. Quantitative trait loci (QTL) that contribute to polygenic trait variation often exhibit small or infinitesimal effects. This hinders the ability to detect QTL controlling polygenic traits because enormously high statistical power is needed for their detection. Recently, we circumvented this challenge by introducing a method to identify selection on complex traits by evaluating the relationship between genome-wide changes in allele frequency and estimates of effect size. The approach involves calculating a composite statistic across all markers that captures this relationship, followed by implementing a linkage disequilibrium-aware permutation test to evaluate if the observed pattern differs from that expected due to drift during evolution and population stratification. In this manuscript, we describe "Ghat," an R package developed to implement this method to test for selection on polygenic traits. We demonstrate the package by applying it to test for polygenic selection on 15 published European wheat traits including yield, biomass, quality, morphological characteristics, and disease resistance traits. Moreover, we applied Ghat to different simulated populations with different breeding histories and genetic architectures. The results highlight the power of Ghat to identify selection on complex traits. The Ghat package is accessible on CRAN, the Comprehensive R Archive Network, and on GitHub.


Subject(s)
Multifactorial Inheritance , Plant Breeding , Multifactorial Inheritance/genetics , Quantitative Trait Loci , Linkage Disequilibrium , Gene Frequency , Phenotype
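
The abstract above describes a composite statistic relating genome-wide allele-frequency change to estimated effect sizes, assessed with a permutation test. The sketch below illustrates that idea under two loudly stated assumptions: the statistic is taken to be the sum over markers of (frequency change × effect size), and the permutation simply shuffles effect sizes across markers. The published method uses a linkage disequilibrium-aware permutation, which this naive shuffle does not reproduce, and none of the names below come from the Ghat R package.

```python
import random


def composite_statistic(freq_change, effects):
    """Assumed form of the statistic: sum of (allele-frequency change * effect size)."""
    return sum(d * a for d, a in zip(freq_change, effects))


def naive_permutation_test(freq_change, effects, n_perm=1000, seed=0):
    """Two-sided permutation p-value obtained by shuffling effect sizes.

    Simplification: markers are treated as exchangeable, so linkage
    disequilibrium between neighbouring markers is ignored here.
    """
    rng = random.Random(seed)
    observed = composite_statistic(freq_change, effects)
    shuffled = list(effects)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(composite_statistic(freq_change, shuffled)) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy input (made-up numbers): four markers with frequency changes and effects.
delta_p = [0.10, -0.20, 0.05, 0.15]
alpha = [1.0, -2.0, 0.5, 1.5]
stat, p_value = naive_permutation_test(delta_p, alpha, n_perm=200)
```

A positive statistic means alleles that increase the trait tended to rise in frequency; the permutation step asks how often a value at least that extreme arises when the pairing of effects and frequency changes is random.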
15.
Genome Biol ; 24(1): 221, 2023 10 05.
Article in English | MEDLINE | ID: mdl-37798733

ABSTRACT

Genomic benchmark datasets are essential to driving the field of genomics and bioinformatics. They provide a snapshot of the performances of sequencing technologies and analytical methods and highlight future challenges. However, they depend on sequencing technology, reference genome, and available benchmarking methods. Thus, creating a genomic benchmark dataset is laborious and highly challenging, often involving multiple sequencing technologies, different variant calling tools, and laborious manual curation. In this review, we discuss the available benchmark datasets and their utility. Additionally, we focus on the most recent benchmark of genes with medical relevance and challenging genomic complexity.


Subject(s)
Benchmarking , Genomics , Genomics/methods , Computational Biology/methods , Genome , High-Throughput Nucleotide Sequencing/methods
16.
Cancer Discov ; 13(4): 910-927, 2023 04 03.
Article in English | MEDLINE | ID: mdl-36715691

ABSTRACT

The human papillomavirus (HPV) genome is integrated into host DNA in most HPV-positive cancers, but the consequences for chromosomal integrity are unknown. Continuous long-read sequencing of oropharyngeal cancers and cancer cell lines identified a previously undescribed form of structural variation, "heterocateny," characterized by diverse, interrelated, and repetitive patterns of concatemerized virus and host DNA segments within a cancer. Unique breakpoints shared across structural variants facilitated stepwise reconstruction of their evolution from a common molecular ancestor. This analysis revealed that virus and virus-host concatemers are unstable and, upon insertion into and excision from chromosomes, facilitate capture, amplification, and recombination of host DNA and chromosomal rearrangements. Evidence of heterocateny was detected in extrachromosomal and intrachromosomal DNA. These findings indicate that heterocateny is driven by the dynamic, aberrant replication and recombination of an oncogenic DNA virus, thereby extending known consequences of HPV integration to include promotion of intratumoral heterogeneity and clonal evolution. SIGNIFICANCE: Long-read sequencing of HPV-positive cancers revealed "heterocateny," a previously unreported form of genomic structural variation characterized by heterogeneous, interrelated, and repetitive genomic rearrangements within a tumor. Heterocateny is driven by unstable concatemerized HPV genomes, which facilitate capture, rearrangement, and amplification of host DNA, and promotes intratumoral heterogeneity and clonal evolution. See related commentary by McBride and White, p. 814. This article is highlighted in the In This Issue feature, p. 799.


Subject(s)
Oropharyngeal Neoplasms , Papillomavirus Infections , Humans , Human Papillomavirus Viruses , Gene Rearrangement , Clonal Evolution/genetics , Virus Integration/genetics , Papillomaviridae/genetics
17.
Genome Biol ; 24(1): 31, 2023 02 21.
Article in English | MEDLINE | ID: mdl-36810122

ABSTRACT

The current version of the human reference genome, GRCh38, contains a number of errors including 1.2 Mbp of falsely duplicated and 8.04 Mbp of collapsed regions. These errors impact the variant calling of 33 protein-coding genes, including 12 with medical relevance. Here, we present FixItFelix, an efficient remapping approach, together with a modified version of the GRCh38 reference genome that improves the subsequent analysis across these genes within minutes for an existing alignment file while maintaining the same coordinates. We showcase these improvements over multi-ethnic control samples, demonstrating improvements for population variant calling as well as eQTL studies.


Subject(s)
Genome, Human , Genomics , Humans , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA
18.
ArXiv ; 2023 Jan 18.
Article in English | MEDLINE | ID: mdl-36713248

ABSTRACT

Despite advances in clinical genetic testing, including the introduction of exome sequencing (ES), more than 50% of individuals with a suspected Mendelian condition lack a precise molecular diagnosis. Clinical evaluation is increasingly undertaken by specialists outside of clinical genetics, often occurring in a tiered fashion and typically ending after ES. The current diagnostic rate reflects multiple factors, including technical limitations, incomplete understanding of variant pathogenicity, missing genotype-phenotype associations, complex gene-environment interactions, and reporting differences between clinical labs. Maintaining a clear understanding of the rapidly evolving landscape of diagnostic tests beyond ES, and their limitations, presents a challenge for non-genetics professionals. Newer tests, such as short-read genome or RNA sequencing, can be challenging to order and emerging technologies, such as optical genome mapping and long-read DNA or RNA sequencing, are not available clinically. Furthermore, there is no clear guidance on the next best steps after inconclusive evaluation. Here, we review why a clinical genetic evaluation may be negative, discuss questions to be asked in this setting, and provide a framework for further investigation, including the advantages and disadvantages of new approaches that are nascent in the clinical sphere. We present a guide for the next best steps after inconclusive molecular testing based upon phenotype and prior evaluation, including when to consider referral to a consortium such as GREGoR, which is focused on elucidating the underlying cause of rare unsolved genetic disorders.

19.
bioRxiv ; 2023 Oct 03.
Article in English | MEDLINE | ID: mdl-37873367

ABSTRACT

Background: The duplication-triplication/inverted-duplication (DUP-TRP/INV-DUP) structure is a type of complex genomic rearrangement (CGR) hypothesized to result from replicative repair of DNA due to replication fork collapse. It is often mediated by a pair of inverted low-copy repeats (LCR) followed by iterative template switches resulting in at least two breakpoint junctions in cis. Although it has been identified as an important mutation signature of pathogenicity for genomic disorders and cancer genomes, its architecture remains unresolved and is predicted to display at least four structural variation (SV) haplotypes. Results: Here we studied the genomic architecture of DUP-TRP/INV-DUP by investigating the genomic DNA of 24 patients with neurodevelopmental disorders identified by array comparative genomic hybridization (aCGH) on whom we found evidence for the existence of 4 out of 4 predicted SV haplotypes. Using a combination of short-read genome sequencing (GS), long-read GS, optical genome mapping and StrandSeq, the haplotype structure was resolved in 18 samples. This approach refined the point of template switching between inverted LCRs in 4 samples revealing a DNA segment of ∼2.2-5.5 kb of 100% nucleotide similarity. A prediction model was developed to infer the LCR used to mediate the non-allelic homology repair. Conclusions: These data provide experimental evidence supporting the hypothesis that inverted LCRs act as a recombinant substrate in replication-based repair mechanisms. Such inverted repeats are particularly relevant for formation of copy-number associated inversions, including the DUP-TRP/INV-DUP structures. Moreover, this type of CGR can result in multiple conformers, which contribute to generating diverse SV haplotypes in susceptible loci.

20.
Nat Commun ; 13(1): 1321, 2022 03 14.
Article in English | MEDLINE | ID: mdl-35288552

ABSTRACT

Infectious disease monitoring on Oxford Nanopore Technologies (ONT) platforms offers rapid turnaround times and low cost. Tracking low frequency intra-host variants provides important insights with respect to elucidating within-host viral population dynamics and transmission. However, given the higher error rate of ONT, accurate identification of intra-host variants with low allele frequencies remains an open challenge with no viable computational solutions available. In response to this need, we present Variabel, a novel approach and first method designed for rescuing low frequency intra-host variants from ONT data alone. We evaluate Variabel on both synthetic data (SARS-CoV-2) and patient derived datasets (Ebola virus, norovirus, SARS-CoV-2); our results show that Variabel can accurately identify low frequency variants below 0.5 allele frequency, outperforming existing state-of-the-art ONT variant callers for this task. Variabel is open-source and available for download at: www.gitlab.com/treangenlab/variabel .


Subject(s)
COVID-19 , Nanopore Sequencing , Nanopores , High-Throughput Nucleotide Sequencing/methods , Humans , SARS-CoV-2/genetics