Search | VHL Regional Portal

A comparison of methods for detecting DNA methylation from long-read sequencing of human genomes.

Sigurpalsdottir, Brynja D; Stefansson, Olafur A; Holley, Guillaume; Beyter, Doruk; Zink, Florian; Hardarson, Marteinn Þ; Sverrisson, Sverrir Þ; Kristinsdottir, Nina; Magnusdottir, Droplaug N; Magnusson, Olafur Þ; Gudbjartsson, Daniel F; Halldorsson, Bjarni V; Stefansson, Kari.

Genome Biol ; 25(1): 69, 2024 Mar 11.

Article in English | MEDLINE | ID: mdl-38468278

ABSTRACT

BACKGROUND: Long-read sequencing can enable the detection of base modifications, such as CpG methylation, in single molecules of DNA. The most commonly used methods for long-read sequencing are nanopore sequencing, developed by Oxford Nanopore Technologies (ONT), and single-molecule real-time (SMRT) sequencing, developed by Pacific Biosciences (PacBio). In this study, we systematically compare the performance of CpG methylation detection from long-read sequencing. RESULTS: We demonstrate that CpG methylation detection from 7179 nanopore-sequenced DNA samples is highly accurate and consistent with 132 oxidative bisulfite-sequenced (oxBS) samples, isolated from the same blood draws. We introduce quality filters for CpGs that further enhance the accuracy of CpG methylation detection from nanopore-sequenced DNA, while removing at most 30% of CpGs. We evaluate the per-site performance of CpG methylation detection across different genomic features and CpG methylation rates and demonstrate how the latest R10.4 flowcell chemistry and base-calling algorithms improve methylation detection from nanopore sequencing. Additionally, we show how the methylation detection of 50 SMRT-sequenced genomes compares to nanopore sequencing and oxBS. CONCLUSIONS: This study provides the first systematic comparison of CpG methylation detection tools for long-read sequencing methods. We compare two commonly used computational methods for the detection of CpG methylation in a large number of nanopore genomes, including samples sequenced using the latest R10.4 nanopore flowcell chemistry and 50 SMRT-sequenced samples. We provide insights into the strengths and limitations of each sequencing method as well as recommendations for standardization and evaluation of tools designed for genome-scale modified base detection using long-read sequencing.

Subject(s)

DNA Methylation; Genome, Human; Humans; Sequence Analysis, DNA/methods; High-Throughput Nucleotide Sequencing/methods; DNA

Variants at the Interleukin 1 Gene Locus and Pericarditis.

Thorolfsdottir, Rosa B; Jonsdottir, Andrea B; Sveinbjornsson, Gardar; Aegisdottir, Hildur M; Oddsson, Asmundur; Stefansson, Olafur A; Halldorsson, Gisli H; Saevarsdottir, Saedis; Thorleifsson, Gudmar; Stefansdottir, Lilja; Pedersen, Ole B; Sørensen, Erik; Ghouse, Jonas; Raja, Anna Axelsson; Zheng, Chaoqun; Silajdzija, Elvira; Rand, Søren Albertsen; Erikstrup, Christian; Ullum, Henrik; Mikkelsen, Christina; Banasik, Karina; Brunak, Søren; Ivarsdottir, Erna V; Sigurdsson, Asgeir; Beyter, Doruk; Sturluson, Arni; Einarsson, Hafsteinn; Tragante, Vinicius; Helgason, Hannes; Lund, Sigrun H; Halldorsson, Bjarni V; Sigurpalsdottir, Brynja D; Olafsson, Isleifur; Arnar, David O; Thorgeirsson, Gudmundur; Knowlton, Kirk U; Nadauld, Lincoln D; Gretarsdottir, Solveig; Helgadottir, Anna; Ostrowski, Sisse R; Gudbjartsson, Daniel F; Jonsdottir, Ingileif; Bundgaard, Henning; Holm, Hilma; Sulem, Patrick; Stefansson, Kari.

JAMA Cardiol ; 9(2): 165-172, 2024 Feb 01.

Article in English | MEDLINE | ID: mdl-38150231

ABSTRACT

Importance: Recurrent pericarditis is a treatment challenge and often a debilitating condition. Drugs inhibiting interleukin 1 cytokines are a promising new treatment option, but their use is based on scarce biological evidence and clinical trials of modest sizes, and the contributions of innate and adaptive immune processes to the pathophysiology are incompletely understood. Objective: To use human genomics, transcriptomics, and proteomics to shed light on the pathogenesis of pericarditis. Design, Setting, and Participants: This was a meta-analysis of genome-wide association studies of pericarditis from 5 countries. Associations were examined between the pericarditis-associated variants and pericarditis subtypes (including recurrent pericarditis) and secondary phenotypes. To explore mechanisms, associations with messenger RNA expression (cis-eQTL), plasma protein levels (pQTL), and CpG methylation of DNA (ASM-QTL) were assessed. Data from Iceland (deCODE genetics, 1983-2020), Denmark (Copenhagen Hospital Biobank/Danish Blood Donor Study, 1977-2022), the UK (UK Biobank, 1953-2021), the US (Intermountain, 1996-2022), and Finland (FinnGen, 1970-2022) were included. Data were analyzed from September 2022 to August 2023. Exposure: Genotype. Main Outcomes and Measures: Pericarditis. Results: In this genome-wide association study of 4894 individuals with pericarditis (mean [SD] age at diagnosis, 51.4 [17.9] years, 2734 [67.6%] male, excluding the FinnGen cohort), associations were identified with 2 independent common intergenic variants at the interleukin 1 locus on chromosome 2q14. The lead variant was rs12992780 (T) (effect allele frequency [EAF], 31%-40%; odds ratio [OR], 0.83; 95% CI, 0.79-0.87; P = 6.67 × 10⁻¹⁶), downstream of IL1B, and the secondary variant was rs7575402 (A or T) (EAF, 45%-55%; adjusted OR, 0.89; 95% CI, 0.85-0.93; adjusted P = 9.6 × 10⁻⁸). The lead variant rs12992780 had a smaller odds ratio for recurrent pericarditis (0.76) than for the acute form (0.86) (P for heterogeneity = .03), and rs7575402 was associated with CpG methylation overlapping binding sites of 4 transcription factors known to regulate interleukin 1 production: PU.1 (encoded by SPI1), STAT1, STAT3, and CCAAT/enhancer-binding protein β (encoded by CEBPB). Conclusions and Relevance: This study found an association between pericarditis and 2 independent sequence variants at the interleukin 1 gene locus. This finding has the potential to contribute to development of more targeted and personalized therapy of pericarditis with interleukin 1-blocking drugs.

Subject(s)

Genome-Wide Association Study; Humans; Male; Adolescent; Female; Genotype; Phenotype; Gene Frequency; Finland

The sequences of 150,119 genomes in the UK Biobank.

Halldorsson, Bjarni V; Eggertsson, Hannes P; Moore, Kristjan H S; Hauswedell, Hannes; Eiriksson, Ogmundur; Ulfarsson, Magnus O; Palsson, Gunnar; Hardarson, Marteinn T; Oddsson, Asmundur; Jensson, Brynjar O; Kristmundsdottir, Snaedis; Sigurpalsdottir, Brynja D; Stefansson, Olafur A; Beyter, Doruk; Holley, Guillaume; Tragante, Vinicius; Gylfason, Arnaldur; Olason, Pall I; Zink, Florian; Asgeirsdottir, Margret; Sverrisson, Sverrir T; Sigurdsson, Brynjar; Gudjonsson, Sigurjon A; Sigurdsson, Gunnar T; Halldorsson, Gisli H; Sveinbjornsson, Gardar; Norland, Kristjan; Styrkarsdottir, Unnur; Magnusdottir, Droplaug N; Snorradottir, Steinunn; Kristinsson, Kari; Sobech, Emilia; Jonsson, Helgi; Geirsson, Arni J; Olafsson, Isleifur; Jonsson, Palmi; Pedersen, Ole Birger; Erikstrup, Christian; Brunak, Søren; Ostrowski, Sisse Rye; Thorleifsson, Gudmar; Jonsson, Frosti; Melsted, Pall; Jonsdottir, Ingileif; Rafnar, Thorunn; Holm, Hilma; Stefansson, Hreinn; Saemundsdottir, Jona; Gudbjartsson, Daniel F; Magnusson, Olafur T.

Nature ; 607(7920): 732-740, 2022 Jul.

Article in English | MEDLINE | ID: mdl-35859178

ABSTRACT

Detailed knowledge of how diversity in the sequence of the human genome affects phenotypic diversity depends on a comprehensive and reliable characterization of both sequences and phenotypic variation. Over the past decade, insights into this relationship have been obtained from whole-exome sequencing or whole-genome sequencing of large cohorts with rich phenotypic data [1,2]. Here we describe the analysis of whole-genome sequencing of 150,119 individuals from the UK Biobank [3]. This constitutes a set of high-quality variants, including 585,040,410 single-nucleotide polymorphisms, representing 7.0% of all possible human single-nucleotide polymorphisms, and 58,707,036 indels. This large set of variants allows us to characterize selection based on sequence variation within a population through a depletion rank score of windows along the genome. Depletion rank analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UK Biobank: a large British Irish cohort, a smaller African cohort and a South Asian cohort. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large-scale whole-genome sequencing studies. Using this formidable new resource, we provide several examples of trait associations for rare variants with large effects not found previously through studies based on whole-exome sequencing and/or imputation.

Subject(s)

Biological Specimen Banks; Databases, Genetic; Genetic Variation; Genome, Human; Genomics; Whole Genome Sequencing; Africa/ethnology; Asia/ethnology; Cohort Studies; Conserved Sequence; Exons/genetics; Genome, Human/genetics; Haplotypes/genetics; Humans; INDEL Mutation; Ireland/ethnology; Microsatellite Repeats; Polymorphism, Single Nucleotide/genetics; United Kingdom

popSTR: population-scale detection of STR variants.

Kristmundsdóttir, Snædís; Sigurpálsdóttir, Brynja D; Kehr, Birte; Halldórsson, Bjarni V.

Bioinformatics ; 33(24): 4041-4048, 2017 Dec 15.

Article in English | MEDLINE | ID: mdl-27591079

ABSTRACT

MOTIVATION: Microsatellites, also known as short tandem repeats (STRs), are tracts of repetitive DNA sequences containing motifs ranging from two to six bases. Microsatellites are one of the most abundant types of variation in the human genome, after single nucleotide polymorphisms (SNPs) and indels. Microsatellite analysis has a wide range of applications, including medical genetics, forensics and construction of genetic genealogy. However, microsatellite variations are rarely considered in whole-genome sequencing studies, in large part due to a lack of tools capable of analyzing them. RESULTS: Here we present a microsatellite genotyper, optimized for Illumina WGS data, which is both faster and more accurate than other methods previously presented. There are two main ingredients to our improvements. First, we reduce the amount of sequencing data necessary for creating microsatellite profiles by using previously aligned sequencing data. Second, we use population information to train microsatellite- and individual-specific error profiles. By comparing our genotyping results to genotypes generated by capillary electrophoresis, we show that our error rates are 50% lower than those of lobSTR, another program specifically developed to determine microsatellite genotypes. AVAILABILITY AND IMPLEMENTATION: Source code is available on GitHub: https://github.com/DecodeGenetics/popSTR. CONTACT: snaedis.kristmundsdottir@decode.is or bjarni.halldorsson@decode.is.

Subject(s)

Microsatellite Repeats; Genotype; Humans; Software; Whole Genome Sequencing
