Results 1 - 20 of 43
1.
Sci Total Environ ; 937: 173535, 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-38802021

ABSTRACT

Wastewater-based epidemiological surveillance at municipal wastewater treatment plants has proven to play an important role in COVID-19 surveillance. Considering international passenger hubs contribute extensively to global transmission of viruses, wastewater surveillance at this type of location may be of added value as well. The aim of this study is to explore the potential of long-term wastewater surveillance at a large passenger hub as an additional tool for public health surveillance during different stages of a pandemic. Here, we present an analysis of SARS-CoV-2 viral loads in airport wastewater by reverse-transcription quantitative polymerase chain reaction (RT-qPCR) from the beginning of the COVID-19 pandemic in Feb 2020, and an analysis of SARS-CoV-2 variants by whole-genome next-generation sequencing from Sep 2020, both until Sep 2022, in the Netherlands. Results are contextualized using (inter)national measures and data sources such as passenger numbers, clinical surveillance data and national wastewater surveillance data. Our findings show that wastewater surveillance was possible throughout the study period, irrespective of measures, as viral loads were detected and quantified in 98.6 % (273/277) of samples. Emergence of SARS-CoV-2 variants, identified in 91.0 % (161/177) of sequenced samples, coincided with increases in viral loads. Furthermore, trends in viral load and variant detection in airport wastewater closely followed, and in some cases preceded, trends in national daily average viral load in wastewater and variants detected in clinical surveillance. Wastewater-based epidemiology at a large international airport is a valuable addition to classical COVID-19 surveillance and the developed expertise can be applied in pandemic preparedness plans for other (emerging) pathogens in the future.


Subject(s)
Airports , COVID-19 , SARS-CoV-2 , Viral Load , Wastewater , COVID-19/epidemiology , Wastewater/virology , Netherlands/epidemiology , Humans , Wastewater-Based Epidemiological Monitoring , Environmental Monitoring/methods
2.
Sci Rep ; 13(1): 17870, 2023 10 19.
Article in English | MEDLINE | ID: mdl-37857658

ABSTRACT

The implementation and integration of wastewater-based epidemiology constitutes a valuable addition to existing pathogen surveillance systems, such as clinical surveillance for SARS-CoV-2. In the Netherlands, SARS-CoV-2 variant circulation is monitored by performing whole-genome sequencing on wastewater samples. In this manuscript, we describe the detection of an AY.43 lineage (Delta variant) amid a period of BA.5 (Omicron variant) dominance in wastewater samples from two wastewater treatment plants (WWTPs) during the months of August and September of 2022. Our results describe a temporary emergence, which was absent in samples from other WWTPs, and which coincided with peaks in viral load. We show how these lineage estimates can be traced back to lineage-specific substitution patterns. The absence of this variant from reported clinical data, but high associated viral loads suggest cryptic transmission. Our findings highlight the additional value of wastewater surveillance for generating insights into circulating pathogens.


Subject(s)
COVID-19 , Humans , COVID-19/epidemiology , SARS-CoV-2/genetics , Wastewater , Wastewater-Based Epidemiological Monitoring
3.
Microbiol Spectr ; 11(4): e0502222, 2023 08 17.
Article in English | MEDLINE | ID: mdl-37432120

ABSTRACT

Norovirus is the primary cause of viral gastroenteritis (GE). To investigate norovirus epidemiology, there is a need for whole-genome sequencing and reference sets consisting of complete genomes. To investigate the potential of shotgun metagenomic sequencing on the Illumina platform for whole-genome sequencing, 71 reverse transcriptase quantitative PCR (RT-qPCR) norovirus-positive feces (threshold cycle [CT], <30) samples from norovirus surveillance within The Netherlands were subjected to metagenomic sequencing. Data were analyzed through an in-house next-generation sequencing (NGS) analysis workflow. Additionally, we assessed the potential of metagenomic sequencing for the surveillance of off-target viruses that are of importance for public health, e.g., sapovirus, rotavirus A, enterovirus, parechovirus, aichivirus, adenovirus, and bocaparvovirus. A total of 60 complete and 10 partial norovirus genomes were generated, representing 7 genogroup I capsid genotypes and 12 genogroup II capsid genotypes. In addition to the norovirus genomes, the metagenomic approach yielded partial or complete genomes of other viruses for 39% of samples from children and 6.7% of samples from adults, including adenovirus 41 (N = 1); aichivirus 1 (N = 1); coxsackievirus A2 (N = 2), A4 (N = 2), A5 (N = 1), and A16 (N = 1); bocaparvovirus 1 (N = 1) and 3 (N = 1); human parechovirus 1 (N = 2) and 3 (N = 1); Rotavirus A (N = 1); and a sapovirus GI.7 (N = 1). The sapovirus GI.7 was initially not detected through RT-qPCR and warranted an update of the primer and probe set. Metagenomic sequencing on the Illumina platform robustly determines complete norovirus genomes and may be used to broaden gastroenteritis surveillance by capturing off-target enteric viruses. IMPORTANCE: Viral gastroenteritis results in significant morbidity and mortality in vulnerable individuals and is primarily caused by norovirus. To investigate norovirus epidemiology, there is a need for whole-genome sequencing and reference sets consisting of full genomes. Using surveillance samples sent to the Dutch National Institute for Public Health and the Environment (RIVM), we compared metagenomics against conventional techniques, such as RT-qPCR and Sanger sequencing, with norovirus as the target pathogen. We determined that metagenomics is a robust method to generate complete norovirus genomes, in parallel to many off-target pathogenic enteric virus genomes, thereby broadening our surveillance efforts. Moreover, we detected a sapovirus that was not detected by our validated gastroenteritis RT-qPCR panel, which exemplifies the strength of metagenomics. Our study shows that metagenomics can be used for public health gastroenteritis surveillance and for the generation of reference sets for molecular epidemiology, and shows how it compares to current surveillance strategies.


Subject(s)
Adenoviridae Infections , Adenovirus Infections, Human , Enteritis , Enterovirus Infections , Enterovirus , Gastroenteritis , Norovirus , Rotavirus , Sapovirus , Viruses , Child , Adult , Humans , Infant , Public Health , Metagenomics , RNA, Viral/genetics , Gastroenteritis/epidemiology , Rotavirus/genetics , Viruses/genetics , Norovirus/genetics , Adenoviridae/genetics , Sapovirus/genetics , Enterovirus/genetics , Feces
4.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36594541

ABSTRACT

MOTIVATION: Beyond identifying genetic variants, we introduce a set of Boolean relations, which allows for a comprehensive classification of the relations of every pair of variants by taking all minimal alignments into account. We present an efficient algorithm to compute these relations, including a novel way of efficiently computing all minimal alignments within the best theoretical complexity bounds. RESULTS: We show that these relations are common, and many non-trivial, for variants of the CFTR gene in dbSNP. Ultimately, we present an approach for the storing and indexing of variants in the context of a database that enables efficient querying for all these relations. AVAILABILITY AND IMPLEMENTATION: A Python implementation is available at https://github.com/mutalyzer/algebra/tree/v0.2.0 as well as an interface at https://mutalyzer.nl/algebra.


Subject(s)
Algorithms , Data Management , Databases, Factual , Software
5.
J Proteome Res ; 22(2): 514-519, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36173614

ABSTRACT

It has long been known that biological species can be identified from mass spectrometry data alone. Ten years ago, we described a method and software tool, compareMS2, for calculating a distance between sets of tandem mass spectra, as routinely collected in proteomics. This method has seen use in species identification and mixture characterization in food and feed products, as well as other applications. Here, we present the first major update of this software, including a new metric, a graphical user interface and additional functionality. The data have been deposited to ProteomeXchange with dataset identifier PXD034932.


Subject(s)
Software , Tandem Mass Spectrometry , Tandem Mass Spectrometry/methods , Proteomics/methods , Algorithms
6.
Sci Data ; 9(1): 169, 2022 04 13.
Article in English | MEDLINE | ID: mdl-35418585

ABSTRACT

The genomes of thousands of individuals are profiled within Dutch healthcare and research each year. However, this valuable genomic data, associated clinical data and consent are captured in different ways and stored across many systems and organizations. This makes it difficult to discover rare disease patients, reuse data for personalized medicine and establish research cohorts based on specific parameters. FAIR Genomes aims to enable NGS data reuse by developing metadata standards for the data descriptions needed to FAIRify genomic data while also addressing ELSI issues. We developed a semantic schema of essential data elements harmonized with international FAIR initiatives. The FAIR Genomes schema v1.1 contains 110 elements in 9 modules. It reuses common ontologies such as NCIT, DUO and EDAM, only introducing new terms when necessary. The schema is represented by a YAML file that can be transformed into templates for data entry software (EDC) and programmatic interfaces (JSON, RDF) to ease genomic data sharing in research and healthcare. The schema, documentation and MOLGENIS reference implementation are available at https://fairgenomes.org .


Subject(s)
High-Throughput Nucleotide Sequencing , Metadata , Delivery of Health Care , Genomics , Humans , Software
7.
Mol Ther Nucleic Acids ; 25: 342-354, 2021 Sep 03.
Article in English | MEDLINE | ID: mdl-34484861

ABSTRACT

Facioscapulohumeral muscular dystrophy (FSHD) is caused by chromatin relaxation of the D4Z4 repeat resulting in misexpression of the D4Z4-encoded DUX4 gene in skeletal muscle. One of the key genetic requirements for the stable production of full-length DUX4 mRNA in skeletal muscle is a functional polyadenylation signal (ATTAAA) in exon three of DUX4 that is used in somatic cells. Base editors hold great promise to treat DNA lesions underlying genetic diseases through their ability to carry out specific and rapid nucleotide mutagenesis even in postmitotic cells such as skeletal muscle. In this study, we present a simple and straightforward strategy for mutagenesis of the somatic DUX4 polyadenylation signal by adenine base editing in immortalized myoblasts derived from independent FSHD-affected individuals. We show that mutating this critical cis-regulatory element results in downregulation of DUX4 mRNA and its direct transcriptional target genes. Our findings identify the somatic DUX4 polyadenylation signal as a therapeutic target and represent the first step toward clinical application of the CRISPR-Cas9 base editing platform for FSHD gene therapy.

8.
Gastroenterology ; 161(4): 1218-1228.e5, 2021 10.
Article in English | MEDLINE | ID: mdl-34126062

ABSTRACT

BACKGROUND & AIMS: Patients with multiple recurrent Clostridioides difficile infection (rCDI) have a disturbed gut microbiota that can be restored by fecal microbiota transplantation (FMT). Despite extensive screening, healthy feces donors may carry bacteria in their intestinal tract that could have long-term health effects, such as potentially procarcinogenic polyketide synthase-positive (pks+) Escherichia coli. Here, we aim to determine whether the pks abundance and persistence of pks+E coli is influenced by pks status of the donor feces. METHODS: In a cohort of 49 patients with rCDI treated with FMT and matching donor samples-the largest cohort of its kind, to our knowledge-we retrospectively screened fecal metagenomes for pks+E coli and compared the presence of pks in patients before and after treatment and to their respective donors. RESULTS: The pks island was more prevalent (P = .026) and abundant (P < .001) in patients with rCDI (pre-FMT, 27 of 49 [55%]; median, 0.46 reads per kilobase per million [RPKM] pks) than in healthy donors (3 of 8 donors [37.5%], 11 of 38 samples [29%]; median, 0.01 RPKM pks). The pks status of patients post-FMT depended on the pks status of the donor suspension with which the patient was treated (P = .046). Particularly, persistence (8 of 9 cases) or clearance (13 of 18) of pks+E coli in pks+ patients was correlated to pks in the donor (P = .004). CONCLUSIONS: We conclude that FMT contributes to pks+E coli persistence or eradication in patients with rCDI but that donor-to-patient transmission of pks+E coli is unlikely.


Subject(s)
Clostridioides difficile/pathogenicity , Clostridium Infections/therapy , Escherichia coli/growth & development , Fecal Microbiota Transplantation , Gastrointestinal Microbiome , Adult , Aged , Aged, 80 and over , Clostridium Infections/diagnosis , Clostridium Infections/microbiology , Dysbiosis , Escherichia coli/enzymology , Escherichia coli/genetics , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Fecal Microbiota Transplantation/adverse effects , Female , Humans , Male , Metagenome , Metagenomics , Middle Aged , Polyketide Synthases/genetics , Polyketide Synthases/metabolism , Reinfection , Retrospective Studies , Time Factors , Treatment Outcome
9.
Bioinformatics ; 37(18): 2811-2817, 2021 09 29.
Article in English | MEDLINE | ID: mdl-33538839

ABSTRACT

MOTIVATION: Unambiguous variant descriptions are of utmost importance in clinical genetic diagnostics, scientific literature and genetic databases. The Human Genome Variation Society (HGVS) publishes a comprehensive set of guidelines on how variants should be correctly and unambiguously described. We present the implementation of the Mutalyzer 2 tool suite, designed to automatically apply the HGVS guidelines so users do not have to deal with the HGVS intricacies explicitly to check and correct their variant descriptions. RESULTS: Mutalyzer is profusely used by the community, having processed over 133 million descriptions since its launch. Over a five year period, Mutalyzer reported a correct input in ∼50% of cases. In 41% of the cases either a syntactic or semantic error was identified and for ∼7% of cases, Mutalyzer was able to automatically correct the description. AVAILABILITY AND IMPLEMENTATION: Mutalyzer is an Open Source project under the GNU Affero General Public License. The source code is available on GitHub (https://github.com/mutalyzer/mutalyzer) and a running instance is available at: https://mutalyzer.nl.


Subject(s)
Genetic Variation , Software , Humans , Genome, Human
10.
Sci Data ; 8(1): 10, 2021 01 15.
Article in English | MEDLINE | ID: mdl-33452270

ABSTRACT

Rett syndrome (RTT) is a rare neurological disorder mostly caused by a genetic variation in MECP2. Making new MECP2 variants and the related phenotypes available provides data for better understanding of disease mechanisms and faster identification of variants for diagnosis. This is, however, currently hampered by the lack of interoperability between genotype-phenotype databases. Here, we demonstrate on the example of MECP2 in RTT that by making the genotype-phenotype data more Findable, Accessible, Interoperable, and Reusable (FAIR), we can facilitate prioritization and analysis of variants. In total, 10,968 MECP2 variants were successfully integrated. Among these variants, 863 unique confirmed RTT-causing and 209 unique confirmed benign variants were found. This dataset was used for comparison of pathogenicity predicting tools, protein consequences, and identification of ambiguous variants. Prediction tools generally recognised the RTT-causing and benign variants; however, there was a broad range of overlap. Nineteen variants were identified that were annotated as both disease-causing and benign, suggesting that there are additional factors in these cases contributing to disease development.


Subject(s)
Methyl-CpG-Binding Protein 2/genetics , Mutation , Rett Syndrome/etiology , DNA Mutational Analysis , Data Analysis , Humans , Rett Syndrome/genetics
11.
Leukemia ; 35(1): 47-61, 2021 01.
Article in English | MEDLINE | ID: mdl-32127641

ABSTRACT

Acute myeloid leukemia (AML) is caused by genetic aberrations that also govern the prognosis of patients and guide risk-adapted and targeted therapy. Genetic aberrations in AML are structurally diverse and currently detected by different diagnostic assays. This study sought to establish whole transcriptome RNA sequencing as a single, comprehensive, and flexible platform for AML diagnostics. We developed HAMLET (Human AML Expedited Transcriptomics) as a bioinformatics pipeline for simultaneous detection of fusion genes, small variants, tandem duplications, and gene expression with all information assembled in an annotated, user-friendly output file. Whole transcriptome RNA sequencing was performed on 100 AML cases and HAMLET results were validated by reference assays and targeted resequencing. The data showed that HAMLET accurately detected all fusion genes and overexpression of EVI1 irrespective of 3q26 aberrations. In addition, small variants in 13 genes that are often mutated in AML were called with 99.2% sensitivity and 100% specificity, and tandem duplications in FLT3 and KMT2A were detected by a novel algorithm based on soft-clipped reads with 100% sensitivity and 97.1% specificity. In conclusion, HAMLET has the potential to provide accurate comprehensive diagnostic information relevant for AML classification, risk assessment and targeted therapy on a single technology platform.


Subject(s)
Exome Sequencing , Gene Expression Profiling , Leukemia, Myeloid, Acute/diagnosis , Leukemia, Myeloid, Acute/genetics , Transcriptome , Biomarkers, Tumor , Computational Biology/methods , Female , Gene Expression Profiling/methods , Gene Expression Regulation, Leukemic , Genetic Variation , Genomics/methods , Humans , Male , Molecular Diagnostic Techniques , Mutation , Oncogene Proteins, Fusion , Prognosis , Reproducibility of Results , Exome Sequencing/methods
12.
J Clin Virol ; 131: 104594, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32866812

ABSTRACT

INTRODUCTION: The SARS-CoV-2 pandemic of 2020 is a prime example of the omnipresent threat of emerging viruses that can infect humans. A protocol for the identification of novel coronaviruses by viral metagenomic sequencing in diagnostic laboratories may contribute to pandemic preparedness. AIM: The aim of this study is to validate a metagenomic virus discovery protocol as a tool for coronavirus pandemic preparedness. METHODS: The performance of a viral metagenomic protocol in a clinical setting for the identification of novel coronaviruses was tested using clinical samples containing SARS-CoV-2, SARS-CoV, and MERS-CoV, in combination with databases generated to contain only viruses known before the discovery dates of these coronaviruses, to mimic virus discovery. RESULTS: Classification of NGS reads using Centrifuge and Genome Detective resulted in assignment of the reads to the closest relatives of the emerging coronaviruses. Low nucleotide and amino acid identity (81% and 84%, respectively, for SARS-CoV-2) in combination with up to 98% genome coverage were indicative of a related, novel coronavirus. Capture probes targeting vertebrate viruses, designed in 2015, enhanced both sequencing depth and coverage of the SARS-CoV-2 genome, the latter increasing from 71% to 98%. CONCLUSION: The model used for simulation of virus discovery enabled validation of the metagenomic sequencing protocol. The metagenomic protocol with virus probes designed before the pandemic can assist the detection and identification of novel coronaviruses directly in clinical samples.


Subject(s)
Coronavirus Infections/virology , Genome, Viral , High-Throughput Nucleotide Sequencing , Metagenomics , Pneumonia, Viral/virology , Betacoronavirus/isolation & purification , COVID-19 , COVID-19 Testing , Clinical Laboratory Techniques/methods , Computational Biology , Coronavirus Infections/diagnosis , Humans , Middle East Respiratory Syndrome Coronavirus/isolation & purification , Nasopharynx/virology , Pandemics , Severe acute respiratory syndrome-related coronavirus/isolation & purification , SARS-CoV-2
13.
Forensic Sci Int Genet ; 46: 102257, 2020 05.
Article in English | MEDLINE | ID: mdl-32058299

ABSTRACT

The assessment of microbiome biodiversity is the most common application of metagenomics. While 16S sequencing remains standard procedure for taxonomic profiling of metagenomic data, a growing number of studies have clearly demonstrated biases associated with this method. By using Whole Genome Shotgun sequencing (WGS) metagenomics, most of the known restrictions associated with 16S data are alleviated. However, due to the computationally intensive data analyses and higher sequencing costs, WGS-based metagenomics remains a less popular option. Selecting the experiment type that provides a comprehensive, yet manageable amount of information is a challenge encountered in many metagenomics studies. In this work, we created a series of artificial bacterial mixes, each with a different distribution of skin-associated microbial species. These mixes were used to estimate the resolution of two different metagenomic experiments - 16S and WGS - and to evaluate several different bioinformatics approaches for taxonomic read classification. In all test cases, WGS approaches provide much more accurate results, in terms of taxa prediction and abundance estimation, in comparison to those of 16S. Furthermore, we demonstrate that a 16S dataset, analysed using different state-of-the-art techniques and reference databases, can produce widely different results. In light of the fact that most forensic metagenomic analyses are still performed using 16S data, our results are especially important.


Subject(s)
Bacteria/classification , Bacteria/genetics , RNA, Ribosomal, 16S/genetics , Whole Genome Sequencing , Datasets as Topic , High-Throughput Nucleotide Sequencing , Metagenomics , Real-Time Polymerase Chain Reaction
14.
Hum Mutat ; 40(12): 2230-2238, 2019 12.
Article in English | MEDLINE | ID: mdl-31433103

ABSTRACT

Each year diagnostic laboratories in the Netherlands profile thousands of individuals for heritable disease using next-generation sequencing (NGS). This requires pathogenicity classification of millions of DNA variants on the standard 5-tier scale. To reduce time spent on data interpretation and increase data quality and reliability, the nine Dutch labs decided to publicly share their classifications. Variant classifications of nearly 100,000 unique variants were catalogued and compared in a centralized MOLGENIS database. Variants classified by more than one center were labeled as "consensus" when classifications agreed, and shared internationally with LOVD and ClinVar. When classifications opposed (LB/B vs. LP/P), they were labeled "conflicting", while other nonconsensus observations were labeled "no consensus". We assessed our classifications using the InterVar software to compare to ACMG 2015 guidelines, showing 99.7% overall consistency with only 0.3% discrepancies. Differences in classifications between Dutch labs or between Dutch labs and ACMG were mainly present in genes with low penetrance or for late onset disorders and highlight limitations of the current 5-tier classification system. The data sharing boosted the quality of DNA diagnostics in Dutch labs, an initiative we hope will be followed internationally. Recently, a positive match with a case from outside our consortium resulted in a more definite disease diagnosis.
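
The labelling rule described in the abstract above can be sketched in a few lines of Python. This is an illustrative sketch only, not the consortium's implementation: the function name `label_variant`, the tier abbreviations (B, LB, VUS, LP, P), and the reading of "agreed" as "all labs reported exactly the same tier" are assumptions made for the example.

```python
# Tier groups on the standard 5-tier scale (abbreviations are assumed here).
BENIGN = {"B", "LB"}
PATHOGENIC = {"LP", "P"}


def label_variant(classifications):
    """Label a variant from the tiers reported by different labs.

    Hypothetical helper. Returns None when fewer than two labs
    classified the variant, since no comparison is possible.
    """
    if len(classifications) < 2:
        return None
    tiers = set(classifications)
    if len(tiers) == 1:
        # Assumption: "agreed" means every lab reported the same tier.
        return "consensus"
    if tiers & BENIGN and tiers & PATHOGENIC:
        # Opposing classifications: LB/B on one side, LP/P on the other.
        return "conflicting"
    return "no consensus"
```

For example, `label_variant(["LB", "VUS"])` falls through both checks and is labelled "no consensus", because the two tiers differ without being on opposite sides of the scale.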


Subject(s)
Genetic Diseases, Inborn/diagnosis , Genetic Variation , High-Throughput Nucleotide Sequencing/methods , Information Dissemination/methods , Data Accuracy , Databases, Genetic , Genetic Diseases, Inborn/genetics , Guidelines as Topic , Humans , Laboratories , Netherlands , Sequence Analysis, DNA
15.
Int J Paleopathol ; 27: 1-8, 2019 12.
Article in English | MEDLINE | ID: mdl-31430635

ABSTRACT

OBJECTIVE: We assessed whether Petrus Donders (died 1887), a Dutch priest who for 27 years cared for people with leprosy in the leprosarium Batavia, Suriname, had evidence of Mycobacterium (M.) leprae infection. A positive finding of M. leprae ancient (a)DNA would contribute to understanding the origin of leprosy in Suriname. MATERIALS: Skeletal remains of Father Petrus Donders; two additional skeletons excavated from the Batavia cemetery were used as controls. METHODS: Archival research, paleopathological evaluation and aDNA-based testing of skeletal remains. RESULTS: Neither archives nor inspection of Donders' skeletal remains revealed evidence of leprosy, and aDNA-based testing for M. leprae was negative. We detected M. leprae aDNA by RLEP PCR in one control skeleton, which also displayed pathological lesions compatible with leprosy. The M. leprae aDNA was genotyped by Sanger sequencing as SNP type 4; the skeleton displayed mitochondrial haplogroup L3. CONCLUSION: We found no evidence that Donders contracted leprosy despite years of intense leprosy contact, but we successfully isolated an archaeological M. leprae aDNA sample from a control skeleton from South America. SIGNIFICANCE: We successfully genotyped recovered aDNA to a M. leprae strain that likely originated in West Africa. The detected human mitochondrial haplogroup L3 is also associated with this geographical region. This suggests that slave trade contributed to leprosy in Suriname. LIMITATIONS: A limited number of skeletons was examined. SUGGESTIONS FOR FURTHER RESEARCH: Broader review of skeletal collections is advised to expand on diversity of the M. leprae aDNA database.


Subject(s)
Cemeteries/history , DNA, Bacterial/genetics , Genome, Bacterial/genetics , Mycobacterium leprae/pathogenicity , Skeleton/microbiology , DNA, Bacterial/history , Genotype , History, 19th Century , Humans , Paleopathology/methods , Suriname
16.
BMC Genomics ; 20(1): 338, 2019 May 06.
Article in English | MEDLINE | ID: mdl-31060512

ABSTRACT

BACKGROUND: Bacteria carry a wide array of genes, some of which have multiple alleles. These different alleles are often responsible for distinct types of virulence and can determine the classification at the subspecies levels (e.g., housekeeping genes for Multi Locus Sequence Typing, MLST). Therefore, it is important to rapidly detect not only the gene of interest, but also the relevant allele. Current sequencing-based methods are limited to mapping reads to each of the known allele references, which is a time-consuming procedure. RESULTS: To address this limitation, we developed BacTag - a pipeline that rapidly and accurately detects which genes are present in a sequencing dataset and reports the allele of each of the identified genes. We exploit the fact that different alleles of the same gene have a high similarity. Instead of mapping the reads to each of the allele reference sequences, we preprocess the database prior to the analysis, which makes the subsequent gene and allele identification efficient. During the preprocessing, we determine a representative reference sequence for each gene and store the differences between all alleles and this chosen reference. Throughout the analysis we estimate whether the gene is present in the sequencing data by mapping the reads to this reference sequence; if the gene is found, we compare the variants to those in the preprocessed database. This allows us to detect which specific allele is present in the sequencing data. Our pipeline was successfully tested on artificial WGS E. coli, S. pseudintermedius, P. gingivalis, M. bovis, Borrelia spp. and Streptomyces spp. data and real WGS E. coli and K. pneumoniae data in order to report alleles of MLST housekeeping genes. CONCLUSIONS: We developed a new pipeline for fast and accurate gene and allele recognition based on database preprocessing and parallel computing, which performed better than or comparably to the current popular tools. We believe that our approach can be useful for a wide range of projects, including bacterial subspecies classification, clinical diagnostics of bacterial infections, and epidemiological studies.
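
The preprocessing idea in this abstract (store each allele as its differences from one representative reference, then match observed variants against that store) can be illustrated with a toy Python sketch. It is not BacTag's code: the function names are hypothetical, and the sketch assumes all alleles have the same length as the reference and differ only by substitutions, which real alleles need not do.

```python
def diff_to_reference(reference, allele):
    """Return the substitutions that turn the reference into the allele.

    Assumption: both sequences have equal length (no insertions/deletions).
    Each difference is stored as (0-based position, allele base).
    """
    return frozenset(
        (pos, base)
        for pos, (ref_base, base) in enumerate(zip(reference, allele))
        if ref_base != base
    )


def build_allele_index(reference, alleles):
    """Preprocess once: map each allele's difference set to its name."""
    return {diff_to_reference(reference, seq): name for name, seq in alleles.items()}


def identify_allele(index, observed_variants):
    """Look up the allele whose stored differences equal the observed variants.

    Returns None when the observed variant set matches no known allele.
    """
    return index.get(frozenset(observed_variants))
```

The design point is that reads are mapped to a single reference per gene, and allele identification then becomes a dictionary lookup on the called variants rather than one mapping run per known allele.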


Subject(s)
Bacteria/classification , Bacteria/genetics , High-Throughput Nucleotide Sequencing/methods , Molecular Typing/methods , Sequence Analysis, DNA/methods , Alleles , Databases, Genetic , Genes, Bacterial , Genome, Bacterial
17.
Arthritis Rheumatol ; 71(4): 561-570, 2019 04.
Article in English | MEDLINE | ID: mdl-30298554

ABSTRACT

OBJECTIVE: Multiple single-nucleotide polymorphisms (SNPs) conferring susceptibility to osteoarthritis (OA) mark imbalanced expression of positional genes in articular cartilage, reflected by unequally expressed alleles among heterozygotes (allelic imbalance [AI]). We undertook this study to explore the articular cartilage transcriptome from OA patients for AI events to identify putative disease-driving genetic variation. METHODS: AI was assessed in 42 preserved and 5 lesioned OA cartilage samples (from the Research Arthritis and Articular Cartilage study) for which RNA sequencing data were available. The count fraction of the alternative alleles among the alternative and reference alleles together (φ) was determined for heterozygous individuals. A meta-analysis was performed to generate a meta-φ and P value for each SNP with a false discovery rate (FDR) correction for multiple comparisons. To further validate AI events, we explored them as a function of multiple additional OA features. RESULTS: We observed a total of 2,070 SNPs that consistently marked AI of 1,031 unique genes in articular cartilage. Of these genes, 49 were found to be significantly differentially expressed (fold change <0.5 or >2, FDR <0.05) between preserved and paired lesioned cartilage, and 18 had previously been reported to confer susceptibility to OA and/or related phenotypes. Moreover, we identified notable highly significant AI SNPs in the CRLF1, WWP2, and RPS3 genes that were related to multiple OA features. CONCLUSION: We present a framework and resulting data set for researchers in the OA research field to probe for disease-relevant genetic variation that affects gene expression in pivotal disease-affected tissue. This likely includes putative novel compelling OA risk genes such as CRLF1, WWP2, and RPS3.
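
The quantity φ defined in the Methods (the count fraction of alternative alleles among alternative and reference alleles together) is simple to compute per heterozygous sample. The sketch below is only an illustration of that definition; the function name and the zero-coverage handling are assumptions, and the meta-analysis and FDR correction mentioned in the abstract are not reproduced here.

```python
def alt_allele_fraction(ref_count, alt_count):
    """Compute phi = alt / (alt + ref) from allele-specific read counts.

    Hypothetical helper. Raises ValueError when the SNP has no coverage,
    because the fraction is undefined in that case.
    """
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return alt_count / total
```

Under balanced expression a heterozygote is expected to give φ close to 0.5; values that deviate consistently from 0.5 across individuals are what the abstract refers to as allelic imbalance.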


Subject(s)
Allelic Imbalance/genetics , Cartilage, Articular/metabolism , Osteoarthritis/genetics , Polymorphism, Single Nucleotide , Transcriptome/genetics , Adult , Female , Humans , Male , Middle Aged , Receptors, Cytokine/genetics , Ribosomal Proteins/genetics , Risk Factors , Sequence Analysis, RNA , Ubiquitin-Protein Ligases/genetics
18.
Forensic Sci Int Genet ; 35: 169-175, 2018 07.
Article in English | MEDLINE | ID: mdl-29852469

ABSTRACT

For two decades, short tandem repeats (STRs) have been the preferred markers for human identification, routinely analysed by fragment length analysis. Here we present a novel set of short hypervariable autosomal microhaplotypes (MH) that have four or more SNPs in a span of less than 70 nucleotides (nt). These MHs display a discriminating power approaching that of STRs and provide a powerful alternative for the analysis of forensic samples that are problematic when the STR fragment size range exceeds the integrity range of severely degraded DNA or when multiple donors contribute to an evidentiary stain and STR stutter artefacts complicate profile interpretation. MH typing was developed using the power of massively parallel sequencing (MPS) enabling new powerful, fast and efficient SNP-based approaches. MH candidates were obtained from queries in data of the 1000 Genomes and Genome of the Netherlands (GoNL) projects. Wet-lab analysis of 276 globally dispersed samples and 97 samples of nine large CEPH families assisted locus selection and corroboration of informative value. We infer that MHs represent an alternative marker type with good discriminating power per locus (allowing the use of a limited number of loci), small amplicon sizes and absence of stutter artefacts that can be especially helpful when unbalanced mixed samples are submitted for human identification.


Subject(s)
DNA Fingerprinting/methods , Haplotypes , Polymorphism, Single Nucleotide , Alleles , Artifacts , High-Throughput Nucleotide Sequencing , Humans , Multiplex Polymerase Chain Reaction , Sequence Analysis, DNA
19.
Front Aging Neurosci ; 10: 102, 2018.
Article in English | MEDLINE | ID: mdl-29706885

ABSTRACT

Hereditary cerebral hemorrhage with amyloidosis-Dutch type (HCHWA-D) is an early onset hereditary form of cerebral amyloid angiopathy (CAA) caused by a point mutation resulting in an amino acid change (NP_000475.1:p.Glu693Gln) in the amyloid precursor protein (APP). Post-mortem frontal and occipital cortical brain tissue from nine patients and nine age-related controls was used for RNA sequencing to identify biological pathways affected in HCHWA-D. Although previous studies indicated that pathology is more severe in the occipital lobe in HCHWA-D compared to the frontal lobe, the current study showed similar changes in gene expression in frontal and occipital cortex and the two brain regions were pooled for further analysis. Significantly altered pathways were analyzed using gene set enrichment analysis (GSEA) on 2036 significantly differentially expressed genes. Main pathways over-represented by down-regulated genes were related to cellular aerobic respiration (including ATP synthesis and carbon metabolism) indicating a mitochondrial dysfunction. Principal up-regulated pathways were extracellular matrix (ECM)-receptor interaction and ECM proteoglycans in relation with an increase in the transforming growth factor beta (TGFß) signaling pathway. Comparison with the publicly available dataset from pre-symptomatic APP-E693Q transgenic mice identified overlap for the ECM-receptor interaction pathway, indicating that ECM modification is an early disease specific pathomechanism.

20.
Hum Mutat ; 38(8): 912-921, 2017 08.
Article in English | MEDLINE | ID: mdl-28471515

ABSTRACT

Next-generation sequencing is radically changing how DNA diagnostic laboratories operate. What started as a single-gene profession is now developing into gene panel sequencing and whole-exome and whole-genome sequencing (WES/WGS) analyses. With further advances in sequencing technology and concomitant price reductions, WGS will soon become the standard and be routinely offered. Here, we focus on the critical steps involved in performing WGS, with a particular emphasis on points where WGS differs from WES, the important variables that should be taken into account, and the quality control measures that can be taken to monitor the process. The points discussed here, combined with recent publications on guidelines for reporting variants, will facilitate the routine implementation of WGS into a diagnostic setting.


Subject(s)
Genome, Human/genetics , Exome/genetics , High-Throughput Nucleotide Sequencing , Humans , Methyl-CpG-Binding Protein 2/genetics , Polymorphism, Single Nucleotide/genetics