ABSTRACT
Fewer than half of individuals with a suspected Mendelian or monogenic condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control datasets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project ONT Sequencing Consortium aims to generate LRS data from at least 800 of the 1000 Genomes Project samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37x and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.
ABSTRACT
Studies of model organisms have provided important insights into how natural genetic differences shape trait variation. These discoveries are driven by the growing availability of genomes and the expansive experimental toolkits afforded to researchers using these species. For example, Caenorhabditis elegans is increasingly being used to identify and measure the effects of natural genetic variants on traits using quantitative genetics. Since 2016, the C. elegans Natural Diversity Resource (CeNDR) has facilitated many of these studies by providing an archive of wild strains, genome-wide sequence and variant data for each strain, and a genome-wide association (GWA) mapping portal for the C. elegans community. Here, we present an updated platform, the Caenorhabditis Natural Diversity Resource (CaeNDR), that enables quantitative genetics and genomics studies across the three Caenorhabditis species: C. elegans, C. briggsae and C. tropicalis. The CaeNDR platform hosts several databases that are continually updated by the addition of new strains, whole-genome sequence data and annotated variants. Additionally, CaeNDR provides new interactive tools to explore natural variation and enable GWA mappings. All CaeNDR data and tools are accessible through a freely available web portal located at caendr.org.
Subject(s)
Caenorhabditis , Databases, Genetic , Animals , Caenorhabditis/classification , Caenorhabditis/genetics , Caenorhabditis elegans/genetics , Genome , Genome-Wide Association Study , Genomics
ABSTRACT
Sequence-based genetic testing identifies causative variants in ~50% of individuals with developmental and epileptic encephalopathies (DEEs). Aberrant changes in DNA methylation are implicated in various neurodevelopmental disorders but remain unstudied in DEEs. We interrogate the diagnostic utility of genome-wide DNA methylation array analysis on peripheral blood samples from 582 individuals with genetically unsolved DEEs. We identify rare differentially methylated regions (DMRs) and explanatory episignatures to uncover causative and candidate genetic etiologies in 12 individuals. Using long-read sequencing, we identify DNA variants underlying rare DMRs, including one balanced translocation, three CG-rich repeat expansions, and four copy number variants. We also identify pathogenic variants associated with episignatures. Finally, we refine the CHD2 episignature using an 850K methylation array and bisulfite sequencing to investigate potential insights into CHD2 pathophysiology. Our study demonstrates the diagnostic yield of genome-wide DNA methylation analysis to identify causal and candidate variants as 2% (12/582) for unsolved DEE cases.
Subject(s)
DNA Copy Number Variations , DNA Methylation , Epilepsy , Humans , DNA Methylation/genetics , Female , Child , Male , Epilepsy/genetics , Epilepsy/diagnosis , DNA Copy Number Variations/genetics , Child, Preschool , DNA-Binding Proteins/genetics , Adolescent , Genetic Testing/methods , Infant
ABSTRACT
Fewer than half of individuals with a suspected Mendelian condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control datasets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project ONT Sequencing Consortium aims to generate LRS data from at least 800 of the 1000 Genomes Project samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37x and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.
ABSTRACT
Sequence-based genetic testing currently identifies causative genetic variants in ~50% of individuals with developmental and epileptic encephalopathies (DEEs). Aberrant changes in DNA methylation are implicated in various neurodevelopmental disorders but remain unstudied in DEEs. Rare epigenetic variations ("epivariants") can drive disease by modulating gene expression at single loci, whereas genome-wide DNA methylation changes can result in distinct "episignature" biomarkers for monogenic disorders in a growing number of rare diseases. Here, we interrogate the diagnostic utility of genome-wide DNA methylation array analysis on peripheral blood samples from 516 individuals with genetically unsolved DEEs who had previously undergone extensive genetic testing. We identified rare differentially methylated regions (DMRs) and explanatory episignatures to discover causative and candidate genetic etiologies in 10 individuals. We then used long-read sequencing to identify DNA variants underlying rare DMRs, including one balanced translocation, three CG-rich repeat expansions, and two copy number variants. We also identify pathogenic sequence variants associated with episignatures; some had been missed by previous exome sequencing. Although most DEE genes lack known episignatures, the increase in diagnostic yield for DNA methylation analysis in DEEs is comparable to the added yield of genome sequencing. Finally, we refine an episignature for CHD2 using an 850K methylation array which was further refined at higher CpG resolution using bisulfite sequencing to investigate potential insights into CHD2 pathophysiology. Our study demonstrates the diagnostic yield of genome-wide DNA methylation analysis to identify causal and candidate genetic causes as ~2% (10/516) for unsolved DEE cases.
ABSTRACT
Parasitic nematode infections cause an enormous global burden to both humans and livestock. Resistance to the limited arsenal of anthelmintic drugs used to combat these infections, including benzimidazole (BZ) compounds, is widespread. Previous studies using the free-living nematode Caenorhabditis elegans to model parasitic nematode resistance have shown that loss-of-function mutations in the beta-tubulin gene ben-1 confer resistance to BZ drugs. However, the mechanism of resistance and the tissue-specific susceptibility are not well known in any nematode species. To identify in which tissue(s) ben-1 function underlies BZ susceptibility, transgenic strains were generated that express ben-1 either ubiquitously or in specific tissues (hypodermis, muscles, neurons, or intestine). High-throughput fitness assays were performed to measure and compare the quantitative responses to BZ compounds among different transgenic lines. Significant BZ susceptibility was observed in animals expressing ben-1 in neurons, comparable to expression using the ben-1 promoter. This result suggests that ben-1 function in neurons underlies susceptibility to BZ. Subsetting neuronal expression of ben-1 based on the neurotransmitter system further narrowed the ben-1 function that causes BZ susceptibility to cholinergic neurons. These results better inform our current understanding of the cellular mode of action of BZs and also suggest additional treatments that might potentiate the effects of BZs in neurons.
Subject(s)
Anthelmintics , Nematoda , Animals , Humans , Tubulin/genetics , Caenorhabditis elegans , Drug Resistance/genetics , Anthelmintics/pharmacology , Anthelmintics/therapeutic use
ABSTRACT
The publication of the Caenorhabditis briggsae reference genome in 2003 enabled the first comparative genomics studies between C. elegans and C. briggsae, shedding light on the evolution of genome content and structure in the Caenorhabditis genus. However, despite being widely used, the currently available C. briggsae reference genome is substantially less complete and structurally accurate than the C. elegans reference genome. Here, we used high-coverage Oxford Nanopore long-read and chromosome-conformation capture data to generate chromosome-level reference genomes for two C. briggsae strains: QX1410, a new reference strain closely related to the laboratory AF16 strain, and VX34, a highly divergent strain isolated in China. We also sequenced 99 recombinant inbred lines generated from reciprocal crosses between QX1410 and VX34 to create a recombination map and identify chromosomal domains. Additionally, we used both short- and long-read RNA sequencing data to generate high-quality gene annotations. By comparing these new reference genomes to the current reference, we reveal that hyper-divergent haplotypes cover large portions of the C. briggsae genome, similar to recent reports in C. elegans and C. tropicalis. We also show that the genomes of selfing Caenorhabditis species have undergone more rearrangement than their outcrossing relatives, which has biased previous estimates of rearrangement rate in Caenorhabditis. These new genomes provide a substantially improved platform for comparative genomics in Caenorhabditis and narrow the gap between the quality of genomic resources available for C. elegans and C. briggsae.
Subject(s)
Caenorhabditis , Animals , Caenorhabditis/genetics , Caenorhabditis elegans/genetics , Chromosomes , Genome , Genomics
ABSTRACT
To better understand the mechanism of resistance caused by putative interactions between beta-tubulin and benzimidazole compounds, we sought to purify nematode-specific beta-tubulins using heterologous expression after replacement of the single Saccharomyces cerevisiae beta-tubulin gene. However, we found that haploid yeast cells containing nematode-specific beta-tubulin genes were not viable, suggesting that nematode beta-tubulin cannot substitute for the loss of the yeast beta-tubulin gene.