Results 1 - 20 of 84
1.
BMC Microbiol ; 23(1): 299, 2023 10 20.
Article in English | MEDLINE | ID: mdl-37864136

ABSTRACT

The microbiota that colonize the human gut and other tissues are dynamic, varying both in composition and functional state between individuals and over time. Gene expression measurements can provide insights into microbiome composition and function. However, efficient and unbiased removal of microbial ribosomal RNA (rRNA) presents a barrier to acquiring metatranscriptomic data. Here we describe a probe set that achieves efficient enzymatic rRNA removal of complex human-associated microbial communities. We demonstrate that the custom probe set can be further refined through an iterative design process to efficiently deplete rRNA from a range of human microbiome samples. Using synthetic nucleic acid spike-ins, we show that the rRNA depletion process does not introduce substantial quantitative error in gene expression profiles. Successful rRNA depletion allows for efficient characterization of taxonomic and functional profiles, including during the development of the human gut microbiome. The pan-human microbiome enzymatic rRNA depletion probes described here provide a powerful tool for studying the transcriptional dynamics and function of the human microbiome.


Subject(s)
Gastrointestinal Microbiome , Microbiota , Humans , RNA, Ribosomal/genetics , Bacteria/genetics , RNA, Ribosomal, 16S/genetics , Microbiota/genetics , Gastrointestinal Microbiome/genetics
2.
Int J Mol Sci ; 23(19)2022 Sep 20.
Article in English | MEDLINE | ID: mdl-36232302

ABSTRACT

We assess the performance of mRNA capture sequencing to identify fusion transcripts in FFPE tissue of different sarcoma types, followed by RT-qPCR confirmation. To validate our workflow, six positive control tumors with a specific chromosomal rearrangement were analyzed using the TruSight RNA Pan-Cancer Panel. Fusion transcript calling by FusionCatcher confirmed these aberrations and enabled the identification of both fusion gene partners and breakpoints. Next, whole-transcriptome TruSeq RNA Exome sequencing was applied to 17 fusion gene-negative alveolar rhabdomyosarcoma (ARMS) or undifferentiated round cell sarcoma (URCS) tumors, for whom fluorescence in situ hybridization (FISH) did not identify the classical pathognomonic rearrangements. For six patients, a pathognomonic fusion transcript was readily detected, i.e., PAX3-FOXO1 in two ARMS patients, and EWSR1-FLI1, EWSR1-ERG, or EWSR1-NFATC2 in four URCS patients. For the 11 remaining patients, 11 newly identified fusion transcripts were confirmed by RT-qPCR, including COPS3-TOM1L2, NCOA1-DTNB, WWTR1-LINC01986, PLAA-MOB3B, AP1B1-CHEK2, and BRD4-LEUTX fusion transcripts in ARMS patients. Additionally, recurrently detected secondary fusion transcripts in patients diagnosed with EWSR1-NFATC2-positive sarcoma were confirmed (COPS4-TBC1D9, PICALM-SYTL2, SMG6-VPS53, and UBE2F-ALS2). In conclusion, this study shows that mRNA capture sequencing enhances the detection rate of pathognomonic fusions and enables the identification of novel and secondary fusion transcripts in sarcomas.


Subject(s)
Sarcoma , Soft Tissue Neoplasms , Adaptor Protein Complex 1/genetics , Adaptor Protein Complex beta Subunits , Cell Cycle Proteins/genetics , Dithionitrobenzoic Acid , Humans , In Situ Hybridization, Fluorescence , Nuclear Proteins/genetics , Oncogene Proteins, Fusion/genetics , RNA , RNA, Messenger/genetics , Reverse Transcriptase Polymerase Chain Reaction , Sarcoma/diagnosis , Sarcoma/genetics , Sarcoma/pathology , Soft Tissue Neoplasms/pathology , Transcription Factors/genetics
3.
Alzheimers Dement ; 18(11): 2117-2130, 2022 11.
Article in English | MEDLINE | ID: mdl-35084109

ABSTRACT

While amyloid-β (Aβ) plaques are considered a hallmark of Alzheimer's disease, clinical trials focused on targeting gamma secretase, an enzyme involved in aberrant Aβ peptide production, have not led to amelioration of AD symptoms or synaptic dysregulation. Screening strategies based on mechanistic, multi-omics approaches that go beyond pathological readouts can aid in the evaluation of therapeutics. Using PSEN1 A246E iPSC-derived neurons from an early-onset familial Alzheimer's disease (EOFAD) patient lineage, we performed RNA-seq to characterize AD-associated endotypes, which are in turn used as a screening evaluation metric for two gamma secretase drugs, the inhibitor Semagacestat and the modulator BPN-15606. We demonstrate that drug treatment partially restores the neuronal state while concomitantly inhibiting cell cycle re-entry and dedifferentiation endotypes to different degrees depending on the mechanism of gamma secretase engagement. Our endotype-centric screening approach offers a new paradigm by which candidate AD therapeutics can be evaluated for their overall ability to reverse disease endotypes.


Subject(s)
Alzheimer Disease , Induced Pluripotent Stem Cells , Humans , Alzheimer Disease/drug therapy , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Amyloid Precursor Protein Secretases/metabolism , Amyloid beta-Protein Precursor/metabolism , Amyloid beta-Peptides/metabolism , Plaque, Amyloid/pathology , Induced Pluripotent Stem Cells/metabolism
5.
Viruses ; 13(10)2021 10 14.
Article in English | MEDLINE | ID: mdl-34696495

ABSTRACT

Two serious public health challenges have emerged in the current COVID-19 pandemic, namely deficits in SARS-CoV-2 variant monitoring and neglect of other co-circulating respiratory viruses. Additionally, accurate assessment of the evolution, extent, and dynamics of the outbreak is required to understand the transmission of the virus. To address these challenges, we evaluated 533 samples using a high-throughput next-generation sequencing (NGS) respiratory viral panel (RVP) that includes 40 viral pathogens. The performance metrics revealed a PPA, NPA, and accuracy of 95.98%, 85.96%, and 94.4%, respectively. The clade for pangolin lineage B that contains certain distant variants, including P4715L in ORF1ab, Q57H in ORF3a, and S84L in ORF8 covarying with the D614G spike protein mutation, was the most prevalent early in the pandemic in Georgia, USA. The isolates from the same county formed paraphyletic groups, indicating virus transmission between counties. The study demonstrates the clinical and public health utility of the NGS-RVP to identify novel variants that can provide actionable information to prevent or mitigate emerging viral threats and models that provide insights into viral transmission patterns and predict transmission/resurgence of regional outbreaks as well as providing critical information on co-circulating respiratory viruses that might be independent factors contributing to the global disease burden.


Subject(s)
COVID-19/epidemiology , Genome, Viral/genetics , Respiratory Tract Infections/diagnosis , Respiratory Tract Infections/virology , SARS-CoV-2/genetics , COVID-19/diagnosis , COVID-19/transmission , High-Throughput Nucleotide Sequencing , Humans , Limit of Detection , Phylogeny , SARS-CoV-2/isolation & purification , Spike Glycoprotein, Coronavirus/genetics
6.
Nat Biotechnol ; 39(9): 1141-1150, 2021 09.
Article in English | MEDLINE | ID: mdl-34504346

ABSTRACT

Clinical applications of precision oncology require accurate tests that can distinguish true cancer-specific mutations from errors introduced at each step of next-generation sequencing (NGS). To date, no bulk sequencing study has addressed the effects of cross-site reproducibility, nor the biological, technical and computational factors that influence variant identification. Here we report a systematic interrogation of somatic mutations in paired tumor-normal cell lines to identify factors affecting detection reproducibility and accuracy at six different centers. Using whole-genome sequencing (WGS) and whole-exome sequencing (WES), we evaluated the reproducibility of different sample types with varying input amount and tumor purity, and multiple library construction protocols, followed by processing with nine bioinformatics pipelines. We found that read coverage and callers affected both WGS and WES reproducibility, but WES performance was influenced by insert fragment size, genomic copy content and the global imbalance score (GIV; G > T/C > A). Finally, taking into account library preparation protocol, tumor content, read coverage and bioinformatics processes concomitantly, we recommend actionable practices to improve the reproducibility and accuracy of NGS experiments for cancer mutation detection.


Subject(s)
Benchmarking , Exome Sequencing/standards , Neoplasms/genetics , Sequence Analysis, DNA/standards , Whole Genome Sequencing/standards , Cell Line , Cell Line, Tumor , High-Throughput Nucleotide Sequencing/methods , Humans , Mutation , Neoplasms/pathology , Reproducibility of Results
7.
Nat Biotechnol ; 39(9): 1129-1140, 2021 09.
Article in English | MEDLINE | ID: mdl-34504351

ABSTRACT

Assessing the reproducibility, accuracy and utility of massively parallel DNA sequencing platforms remains an ongoing challenge. Here the Association of Biomolecular Resource Facilities (ABRF) Next-Generation Sequencing Study benchmarks the performance of a set of sequencing instruments (HiSeq/NovaSeq/paired-end 2 × 250-bp chemistry, Ion S5/Proton, PacBio circular consensus sequencing (CCS), Oxford Nanopore Technologies PromethION/MinION, BGISEQ-500/MGISEQ-2000 and GS111) on human and bacterial reference DNA samples. Among short-read instruments, HiSeq 4000 and X10 provided the most consistent, highest genome coverage, while BGI/MGISEQ provided the lowest sequencing error rates. The long-read instrument PacBio CCS had the highest reference-based mapping rate and lowest non-mapping rate. The two long-read platforms PacBio CCS and PromethION/MinION showed the best sequence mapping in repeat-rich areas and across homopolymers. NovaSeq 6000 using 2 × 250-bp read chemistry was the most robust instrument for capturing known insertion/deletion events. This study serves as a benchmark for current genomics technologies, as well as a resource to inform experimental design and next-generation sequencing variant calling.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/standards , Base Pair Mismatch , Benchmarking , DNA/genetics , DNA, Bacterial/genetics , Genome, Bacterial , Genome, Human , Humans
8.
Nat Biotechnol ; 39(9): 1151-1160, 2021 09.
Article in English | MEDLINE | ID: mdl-34504347

ABSTRACT

The lack of samples for generating standardized DNA datasets for setting up a sequencing pipeline or benchmarking the performance of different algorithms limits the implementation and uptake of cancer genomics. Here, we describe reference call sets obtained from paired tumor-normal genomic DNA (gDNA) samples derived from a breast cancer cell line (which is highly heterogeneous, with an aneuploid genome, and enriched in somatic alterations) and a matched lymphoblastoid cell line. We partially validated both somatic mutations and germline variants in these call sets via whole-exome sequencing (WES) with different sequencing platforms and targeted sequencing with >2,000-fold coverage, spanning 82% of genomic regions with high confidence. Although the gDNA reference samples are not representative of primary cancer cells from a clinical sample, when setting up a sequencing pipeline, they not only minimize potential biases from technologies, assays and informatics but also provide a unique resource for benchmarking 'tumor-only' or 'matched tumor-normal' analyses.


Subject(s)
Benchmarking , Breast Neoplasms/genetics , DNA Mutational Analysis/standards , High-Throughput Nucleotide Sequencing/standards , Whole Genome Sequencing/standards , Cell Line, Tumor , Datasets as Topic , Germ Cells , Humans , Mutation , Reference Standards , Reproducibility of Results
10.
Nat Biotechnol ; 39(11): 1453-1465, 2021 11.
Article in English | MEDLINE | ID: mdl-34140680

ABSTRACT

Existing compendia of non-coding RNA (ncRNA) are incomplete, in part because they are derived almost exclusively from small and polyadenylated RNAs. Here we present a more comprehensive atlas of the human transcriptome, which includes small and polyA RNA as well as total RNA from 300 human tissues and cell lines. We report thousands of previously uncharacterized RNAs, increasing the number of documented ncRNAs by approximately 8%. To infer functional regulation by known and newly characterized ncRNAs, we exploited pre-mRNA abundance estimates from total RNA sequencing, revealing 316 microRNAs and 3,310 long non-coding RNAs with multiple lines of evidence for roles in regulating protein-coding genes and pathways. Our study both refines and expands the current catalog of human ncRNAs and their regulatory interactions. All data, analyses and results are available for download and interrogation in the R2 web portal, serving as a basis for future exploration of RNA biology and function.


Subject(s)
MicroRNAs , RNA, Long Noncoding , Humans , MicroRNAs/genetics , RNA, Long Noncoding/genetics , RNA, Messenger , RNA, Untranslated/genetics , Transcriptome/genetics
11.
STAR Protoc ; 2(2): 100475, 2021 06 18.
Article in English | MEDLINE | ID: mdl-33937877

ABSTRACT

Comprehensive transcriptome analysis of extracellular RNA (exRNA) purified from human biofluids is challenging because of the low RNA concentration and compromised RNA integrity. Here, we describe an optimized workflow to (1) isolate exRNA from different types of biofluids and (2) to prepare messenger RNA (mRNA)-enriched sequencing libraries using complementary hybridization probes. Importantly, the workflow includes 2 sets of synthetic spike-in RNA molecules as processing controls for RNA purification and sequencing library preparation and as an alternative data normalization strategy. For complete details on the use and execution of this protocol, please refer to Hulstaert et al. (2020).


Subject(s)
Gene Expression Profiling/methods , RNA, Messenger/blood , Sequence Analysis, RNA , Transcriptome/genetics , Extracellular Space/chemistry , Extracellular Space/genetics , Gene Expression Profiling/standards , Humans , Polymerase Chain Reaction , RNA, Messenger/isolation & purification , Reference Standards , Sequence Analysis, RNA/methods , Sequence Analysis, RNA/standards
12.
Cell Rep ; 33(13): 108552, 2020 12 29.
Article in English | MEDLINE | ID: mdl-33378673

ABSTRACT

Extracellular RNAs present in biofluids have emerged as potential biomarkers for disease. Where most studies focus on blood-derived fluids, other biofluids may be more informative. We present an atlas of messenger, circular, and small RNA transcriptomes of a comprehensive collection of 20 human biofluids. By means of synthetic spike-in controls, we compare RNA content across biofluids, revealing a 10,000-fold difference in concentration. The circular RNA fraction is increased in most biofluids compared to tissues. Each biofluid transcriptome is enriched for RNA molecules derived from specific tissues and cell types. Our atlas enables an informed selection of the most relevant biofluid to monitor particular diseases. To verify the biomarker potential in these biofluids, four validation cohorts representing a broad spectrum of diseases were profiled, revealing numerous differential RNAs between case and control subjects. Spike-normalized data are publicly available in the R2 web portal for further exploration.


Subject(s)
Biomarkers , Body Fluids/metabolism , RNA/metabolism , Transcriptome , Cohort Studies , Gene Expression Profiling/methods , Humans , RNA/genetics , Sequence Analysis, RNA/methods
13.
Sci Adv ; 6(46)2020 11.
Article in English | MEDLINE | ID: mdl-33188013

ABSTRACT

Identifying the systems-level mechanisms that lead to Alzheimer's disease, an unmet need, is an essential step toward the development of therapeutics. In this work, we report that the key disease-causative mechanisms, including dedifferentiation and repression of neuronal identity, are triggered by changes in chromatin topology. Here, we generated human induced pluripotent stem cell (hiPSC)-derived neurons from donor patients with early-onset familial Alzheimer's disease (EOFAD) and used a multiomics approach to mechanistically characterize the modulation of disease-associated gene regulatory programs. We demonstrate that EOFAD neurons dedifferentiate to a precursor-like state with signatures of ectoderm and nonectoderm lineages. RNA-seq, ATAC-seq, and ChIP-seq analysis reveals that transcriptional alterations in the cellular state are orchestrated by changes in histone methylation and chromatin topology. Furthermore, we demonstrate that these mechanisms are observed in EOFAD-patient brains, validating our hiPSC-derived neuron models. The mechanistic endotypes of Alzheimer's disease uncovered here offer key insights for therapeutic interventions.


Subject(s)
Alzheimer Disease , Induced Pluripotent Stem Cells , Alzheimer Disease/genetics , Chromatin/genetics , Humans , Mutation , Neurons
14.
Sci Rep ; 10(1): 3716, 2020 02 28.
Article in English | MEDLINE | ID: mdl-32111915

ABSTRACT

Sensitive and specific diagnostic and prognostic biomarkers for prostate cancer (PCa) are urgently needed. Urine samples are a non-invasive means to obtain abundant and readily accessible "liquid biopsies". Herein we used urine liquid biopsies to identify and characterize a novel group of urine-enriched RNAs and metabolites in patients with PCa and normal individuals with or without benign prostatic disease. Differentially expressed RNAs were identified in urine samples by deep sequencing and metabolites in urine were measured by mass spectrometry. mRNA and metabolite profiles were distinct in patients with benign and malignant disease. Integrated analysis of urinary gene expression and metabolite signatures unveiled an aberrant glutamate metabolism and tricarboxylic acid (TCA) cycle node in prostate cancer-derived cells. Functional validation supported a role for glutamate metabolism and glutamate oxaloacetate transaminase 1 (GOT1)-dependent redox balance in PCa, which could be exploited for novel biomarkers and therapies. In this study, we discovered cancer-specific changes in urinary RNAs and metabolites, paving the way for the development of sensitive and specific urinary PCa diagnostic biomarkers either alone or in combination. Our methodology was based on single void urine samples (i.e., without prostatic massage). The integrated analysis of metabolomic and transcriptomic data from these liquid biopsies revealed a glutamate metabolism and tricarboxylic acid cycle node that was specific to prostate-derived cancer cells and cancer-specific metabolic changes in urine.


Subject(s)
Biomarkers, Tumor/urine , Prostatic Neoplasms/urine , RNA, Messenger/urine , Citric Acid Cycle , Glutamic Acid/metabolism , Humans , Liquid Biopsy , Male , Prostate/metabolism , Prostatic Neoplasms/diagnosis , Prostatic Neoplasms/genetics , Prostatic Neoplasms/pathology , RNA, Messenger/genetics
15.
16.
Lancet Infect Dis ; 19(6): 648-657, 2019 06.
Article in English | MEDLINE | ID: mdl-31000464

ABSTRACT

BACKGROUND: The real-time generation of information about pathogen genomes has become a vital goal for transmission analysis and characterisation in rapid outbreak responses. In response to the recently established genomic capacity in the Democratic Republic of the Congo, we explored the real-time generation of genomic information at the start of the 2018 Ebola virus disease (EVD) outbreak in North Kivu Province. METHODS: We used targeted-enrichment sequencing to produce two coding-complete Ebola virus genomes 5 days after declaration of the EVD outbreak in North Kivu. Subsequent sequencing efforts yielded an additional 46 genomes. Genomic information was used to assess early transmission, medical countermeasures, and evolution of Ebola virus. FINDINGS: The genomic information demonstrated that the EVD outbreak in the North Kivu and Ituri Provinces was distinct from the 2018 EVD outbreak in Équateur Province of the Democratic Republic of the Congo. Primer and probe mismatches to Ebola virus were identified in silico for all deployed diagnostic PCR assays, with the exception of the Cepheid GeneXpert GP assay. INTERPRETATION: The first two coding-complete genomes provided actionable information in real-time for the deployment of the rVSVΔG-ZEBOV-GP Ebola virus envelope glycoprotein vaccine, available therapeutics, and sequence-based diagnostic assays. Based on the mutations identified in the Ebola virus surface glycoprotein (GP1,2) observed in all 48 genomes, deployed monoclonal antibody therapeutics (mAb114 and ZMapp) should be efficacious against the circulating Ebola virus variant. Rapid Ebola virus genomic characterisation should be included in routine EVD outbreak response procedures to ascertain efficacy of medical countermeasures. FUNDING: Defense Biological Product Assurance Office.


Subject(s)
Antibodies, Monoclonal/genetics , Antiviral Agents/therapeutic use , Ebola Vaccines/therapeutic use , Ebolavirus/genetics , Genomics , Hemorrhagic Fever, Ebola/drug therapy , Hemorrhagic Fever, Ebola/epidemiology , Democratic Republic of the Congo/epidemiology , Disease Outbreaks , Humans , Medical Countermeasures , Retrospective Studies
17.
Lancet Infect Dis ; 19(6): 641-647, 2019 06.
Article in English | MEDLINE | ID: mdl-31000465

ABSTRACT

BACKGROUND: The 2018 Ebola virus disease (EVD) outbreak in Équateur Province, Democratic Republic of the Congo, began on May 8, and was declared over on July 24; it resulted in 54 documented cases and 33 deaths. We did a retrospective genomic characterisation of the outbreak and assessed potential therapeutic agents and vaccine (medical countermeasures). METHODS: We used target-enrichment sequencing to produce Ebola virus genomes from samples obtained in the 2018 Équateur Province outbreak. Combining these genomes with genomes associated with known outbreaks from GenBank, we constructed a maximum-likelihood phylogenetic tree. In-silico analyses were used to assess potential mismatches between the outbreak strain and the probes and primers of diagnostic assays and the antigenic sites of the experimental rVSVΔG-ZEBOV-GP vaccine and therapeutics. An in-vitro flow cytometry assay was used to assess the binding capability of the individual components of the monoclonal antibody cocktail ZMapp. FINDINGS: A targeted sequencing approach produced 16 near-complete genomes. Phylogenetic analysis of these genomes and 1011 genomes from GenBank revealed a distinct cluster, confirming a new Ebola virus variant, for which we propose the name "Tumba". This new variant appears to have evolved at a slower rate than other Ebola virus variants (0·69 × 10⁻³ substitutions per site per year with "Tumba" vs 1·06 × 10⁻³ substitutions per site per year without "Tumba"). We found few sequence mismatches in the assessed assay target regions and antigenic sites. We identified nine amino acid changes in the Ebola virus surface glycoprotein, of which one resulted in reduced binding of the 13C6 antibody within the ZMapp cocktail. INTERPRETATION: Retrospectively, we show the feasibility of using genomics to rapidly characterise a new Ebola virus variant within the timeframe of an outbreak. Phylogenetic analysis provides further indications that these variants are evolving at differing rates. Rapid in-silico analyses can direct in-vitro experiments to quickly assess medical countermeasures. FUNDING: Defense Biological Product Assurance Office.


Subject(s)
Antiviral Agents/therapeutic use , Disease Outbreaks , Ebola Vaccines/therapeutic use , Ebolavirus/genetics , Genomics , Hemorrhagic Fever, Ebola/drug therapy , Hemorrhagic Fever, Ebola/epidemiology , Democratic Republic of the Congo/epidemiology , Humans , Retrospective Studies
18.
BMC Genomics ; 19(1): 722, 2018 Oct 01.
Article in English | MEDLINE | ID: mdl-30285621

ABSTRACT

BACKGROUND: Transposome-based technologies have enabled the streamlined production of sequencer-ready DNA libraries; however, current methods are highly sensitive to the amount and quality of input nucleic acid. RESULTS: We describe a new library preparation technology (Nextera DNA Flex) that utilizes a known concentration of transposomes conjugated directly to beads to bind a fixed amount of DNA, and enables direct input of blood and saliva using an integrated extraction protocol. We further report results from libraries generated outside the standard parameters of the workflow, highlighting novel applications for Nextera DNA Flex, including human genome builds and variant calling from below 1 ng DNA input, customization of insert size, and preparation of libraries from short fragments and severely degraded FFPE samples. Using this bead-linked library preparation method, library yield saturation was observed at an input amount of 100 ng. Preparation of libraries from a range of species with varying GC levels demonstrated uniform coverage of small genomes. For large and complex genomes, coverage across the genome, including difficult regions, was improved compared with other library preparation methods. Libraries were successfully generated from amplicons of varying sizes (from 50 bp to 11 kb); however, a decrease in efficiency was observed for amplicons smaller than 250 bp. This library preparation method was also compatible with poor-quality DNA samples, with sequenceable libraries prepared from formalin-fixed paraffin-embedded samples with varying levels of degradation. CONCLUSIONS: In contrast to solution-based library preparation, this bead-based technology produces a normalized, sequencing-ready library for a wide range of DNA input types and amounts, largely obviating the need for DNA quantitation. The robustness of this bead-based library preparation kit and flexibility of input DNA facilitates application across a wide range of fields.


Subject(s)
DNA Transposable Elements/genetics , Gene Library , High-Throughput Nucleotide Sequencing/methods , Microspheres , Workflow , Genome, Human/genetics , Humans , Magnets/chemistry , Plasmids/genetics
19.
Methods Mol Biol ; 1838: 125-140, 2018.
Article in English | MEDLINE | ID: mdl-30128994

ABSTRACT

A large number of viruses can individually and concurrently cause various respiratory illnesses. Metagenomic sequencing using next-generation sequencing (NGS) technology is capable of identifying a variety of pathogens. Here, we describe a method using a large panel of oligo probes to enrich sequence targets of 34 respiratory DNA and RNA viruses that reduces non-viral reads in NGS data and achieves high performance of sequencing-based pathogen identification. The approach can be applied to total nucleic acids purified from respiratory swabs stored in viral transport medium. Illumina TruSeq RNA Access Library procedure is used in targeted sequencing of respiratory viruses. The samples are subjected to RNA fragmentation, random reverse transcription, random PCR amplification, and ligation with barcoded library adaptors. The libraries are pooled and subjected to two rounds of enrichments by using a large panel of oligos designed to capture whole genomes of 34 respiratory viruses. The enriched libraries are amplified and sequenced using Illumina MiSeq sequencing system and reagents. This method can achieve viral detection sensitivity comparable with molecular assay and obtain partial to complete genome sequences for each virus to allow accurate genotyping and variant analysis.


Subject(s)
Genome, Viral , Metagenome , Metagenomics , Respiratory Tract Infections/virology , Viruses/genetics , Gene Library , High-Throughput Nucleotide Sequencing , Humans , Metagenomics/methods , Respiratory Tract Infections/diagnosis , Sequence Analysis, DNA
20.
Genome Res ; 28(6): 869-877, 2018 06.
Article in English | MEDLINE | ID: mdl-29703817

ABSTRACT

Next generation sequencing (NGS) technologies have revolutionized the genomics field and are becoming more commonplace for identification of human infectious diseases. However, due to the low abundance of viral nucleic acids (NAs) in relation to host, viral identification using direct NGS technologies often lacks sufficient sensitivity. Here, we describe an approach based on two complementary enrichment strategies that significantly improves the sensitivity of NGS-based virus identification. To start, we developed two sets of DNA probes to enrich virus NAs associated with respiratory diseases. The first set of probes spans the genomes, allowing for identification of known viruses and full genome sequencing, while the second set targets regions conserved among viral families or genera, providing the ability to detect both known and potentially novel members of those virus groups. Efficiency of enrichment was assessed by NGS testing reference virus and clinical samples with known infection. We show significant improvement in viral identification using enriched NGS compared to unenriched NGS. Without enrichment, we observed an average of 0.3% targeted viral reads per sample. However, after enrichment, 50%-99% of the reads per sample were the targeted viral reads for both the reference isolates and clinical specimens using both probe sets. Importantly, dramatic improvements on genome coverage were also observed following virus-specific probe enrichment. The methods described here provide improved sensitivity for virus identification by NGS, allowing for a more comprehensive analysis of disease etiology.


Subject(s)
Communicable Diseases/diagnosis , Communicable Diseases/virology , Nucleic Acids/genetics , Viruses/isolation & purification , Communicable Diseases/etiology , Communicable Diseases/genetics , DNA Probes/genetics , Genome, Viral/genetics , Genomics , High-Throughput Nucleotide Sequencing , Humans , Nucleic Acids/isolation & purification , Viruses/genetics , Viruses/pathogenicity