Results 1 - 20 of 571
1.
Nat Commun ; 15(1): 3972, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730241

ABSTRACT

The advancement of Long-Read Sequencing (LRS) techniques has significantly increased the length of sequencing to several kilobases, thereby facilitating the identification of alternative splicing events and isoform expressions. Recently, numerous computational tools for isoform detection using long-read sequencing data have been developed. Nevertheless, there remains a deficiency in comparative studies that systematically evaluate the performance of these tools, which are implemented with different algorithms, under various simulations that encompass potential influencing factors. In this study, we conducted a benchmark analysis of thirteen methods implemented in nine tools capable of identifying isoform structures from long-read RNA-seq data. We evaluated their performances using simulated data, which represented diverse sequencing platforms generated by an in-house simulator, RNA sequins (sequencing spike-ins) data, as well as experimental data. Our findings demonstrate IsoQuant as a highly effective tool for isoform detection with LRS, with Bambu and StringTie2 also exhibiting strong performance. These results offer valuable guidance for future research on alternative splicing analysis and the ongoing improvement of tools for isoform detection using LRS data.


Subjects
Algorithms , Alternative Splicing , Messenger RNA , RNA Sequence Analysis , Humans , Messenger RNA/genetics , Messenger RNA/analysis , RNA Sequence Analysis/methods , RNA Isoforms/genetics , Software , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Protein Isoforms/genetics
2.
Nat Commun ; 15(1): 3126, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605047

ABSTRACT

Long reads that cover more variants per read raise opportunities for accurate haplotype construction, whereas the genotype errors of single nucleotide polymorphisms pose great computational challenges for haplotyping tools. Here we introduce KSNP, an efficient haplotype construction tool based on the de Bruijn graph (DBG). KSNP leverages the ability of DBG in handling high-throughput erroneous reads to tackle the challenges. Compared to other notable tools in this field, KSNP achieves at least 5-fold speedup while producing comparable haplotype results. The time required for assembling human haplotypes is reduced to nearly the data-in time.


Subjects
Algorithms , Single Nucleotide Polymorphism , Humans , Haplotypes/genetics , DNA Sequence Analysis/methods , High-Throughput Nucleotide Sequencing/methods , Software
3.
Methods Mol Biol ; 2787: 107-122, 2024.
Article in English | MEDLINE | ID: mdl-38656485

ABSTRACT

Genetic diversity refers to the variety of genetic traits within a population or a species. It is an essential aspect of both plant ecology and plant breeding because it contributes to the adaptability, survival, and resilience of populations in changing environments. This chapter outlines a pipeline for estimating genetic diversity statistics from reduced representation or whole genome sequencing data. The pipeline involves obtaining DNA sequence reads, mapping the corresponding reads to a reference genome, calling variants from the alignments, and generating an unbiased estimation of nucleotide diversity and divergence between populations. The pipeline is suitable for single-end Illumina reads and can be adjusted for paired-end reads. The resulting pipeline provides a comprehensive approach for aligning and analyzing sequencing data to estimate genetic diversity.


Subjects
Genetic Variation , Plant Genome , Plants , Plants/genetics , Software , DNA Sequence Analysis/methods , High-Throughput Nucleotide Sequencing/methods , Computational Biology/methods , Genomics/methods
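
Illustrative note on record 3: the last step named in the abstract, estimating nucleotide diversity, can be sketched in Python. This is a minimal sketch, not the chapter's pipeline, whose tools the abstract does not name. The function names, the `callable_sites` parameter, and the toy counts are assumptions made here. The estimator is the standard average number of pairwise differences per site, with invariant callable sites counted in the denominator so that diversity is not inflated.

```python
from itertools import combinations


def site_pi(allele_counts):
    """Average pairwise difference at one site, from per-allele sample counts."""
    n = sum(allele_counts)
    if n < 2:
        return 0.0  # fewer than two sampled alleles: no pairs to compare
    # Pairs of sampled alleles that differ: sum of count products over allele pairs.
    differing_pairs = sum(a * b for a, b in combinations(allele_counts, 2))
    total_pairs = n * (n - 1) / 2
    return differing_pairs / total_pairs


def nucleotide_diversity(variant_site_counts, callable_sites):
    """Mean per-site diversity over all callable sites, variant or invariant."""
    if callable_sites <= 0 or callable_sites < len(variant_site_counts):
        raise ValueError("callable_sites must be positive and cover every variant site")
    return sum(site_pi(c) for c in variant_site_counts) / callable_sites


if __name__ == "__main__":
    # Hypothetical toy data: two variant sites among 10 callable sites,
    # four sampled alleles at each site (for example, two diploid individuals).
    print(nucleotide_diversity([[2, 2], [3, 1]], callable_sites=10))
```

Dividing by the number of callable sites rather than the number of variant sites is the design choice that matters: sites that were sequenced and found invariant still contribute zero differences to the average.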
4.
Bioinformatics ; 40(4), 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38565260

ABSTRACT

MOTIVATION: Automated chromatin segmentation based on ChIP-seq (chromatin immunoprecipitation followed by sequencing) data reveals insights into the epigenetic regulation of chromatin accessibility. Existing segmentation methods are constrained by simplifying modeling assumptions, which may have a negative impact on the segmentation quality. RESULTS: We introduce EpiSegMix, a novel segmentation method based on a hidden Markov model with flexible read count distribution types and state duration modeling, allowing for a more flexible modeling of both histone signals and segment lengths. In a comparison with existing tools, ChromHMM, Segway, and EpiCSeg, we show that EpiSegMix is more predictive of cell biology, such as gene expression. Its flexible framework enables it to fit an accurate probabilistic model, which has the potential to increase the biological interpretability of chromatin states. AVAILABILITY AND IMPLEMENTATION: Source code: https://gitlab.com/rahmannlab/episegmix.


Subjects
Chromatin , Genetic Epigenesis , DNA Sequence Analysis/methods , Histones/metabolism , Software , High-Throughput Nucleotide Sequencing/methods
5.
Methods Mol Biol ; 2744: 223-238, 2024.
Article in English | MEDLINE | ID: mdl-38683322

ABSTRACT

DNA barcodes are useful in biodiversity research, but sequencing barcodes with dye termination methods ("Sanger sequencing") has been so time-consuming and expensive that DNA barcodes are not as widely used as they should be. Fortunately, MinION sequencers from Oxford Nanopore Technologies have recently emerged as a cost-effective and efficient alternative for barcoding. MinION barcodes are now suitable for large-scale species discovery and enable specimen identification when the target species are represented in barcode databases. With a MinION, it is possible to obtain 10,000 barcodes from a single flow cell at a cost of less than 0.10 USD per specimen. Additionally, a Flongle flow cell can be used for small projects requiring up to 300 barcodes (0.50 USD per specimen). We here describe a cost-effective laboratory workflow for obtaining tagged amplicons, preparing ONT libraries, sequencing amplicon pools, and analyzing the MinION reads with the software ONTbarcoder. This workflow has been shown to yield highly accurate barcodes that are 99.99% identical to Sanger barcodes. Overall, we propose that the use of MinION for DNA barcoding is an attractive option for all researchers in need of a cost-effective and efficient solution for large-scale species discovery and specimen identification.


Subjects
Taxonomic DNA Barcoding , Nanopore Sequencing , Taxonomic DNA Barcoding/methods , Taxonomic DNA Barcoding/economics , Nanopore Sequencing/methods , Cost-Benefit Analysis , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/economics , Software , Gene Library , DNA Sequence Analysis/methods , DNA Sequence Analysis/economics , Workflow , DNA/genetics
6.
PLoS One ; 19(4): e0301446, 2024.
Article in English | MEDLINE | ID: mdl-38573983

ABSTRACT

Reductions in sequencing costs have enabled widespread use of shotgun metagenomics and amplicon sequencing, which have drastically improved our understanding of the microbial world. However, large sequencing projects are now hampered by the cost of library preparation and low sample throughput, relative to the actual sequencing costs. Here, we benchmarked three high-throughput DNA extraction methods: ZymoBIOMICS™ 96 MagBead DNA Kit, MP Biomedicals™ FastDNA™-96 Soil Microbe DNA Kit, and DNeasy® 96 PowerSoil® Pro QIAcube® HT Kit. The DNA extractions were evaluated based on length, quality, quantity, and the observed microbial community across five diverse soil types. DNA extraction of all soil types was successful for all kits; however, the DNeasy® 96 PowerSoil® Pro QIAcube® HT Kit excelled across all performance parameters. We further used the nanoliter dispensing system I.DOT One to miniaturize Illumina amplicon and metagenomic library preparation volumes by a factor of 5 and 10, respectively, with no significant impact on the observed microbial communities. With these protocols, DNA extraction, metagenomic library preparation, and amplicon library preparation for one 96-well plate take approximately 3, 5, and 6 hours, respectively. Furthermore, the miniaturization of amplicon and metagenome library preparation reduces the chemical and plastic costs from 5.0 to 3.6 and from 59 to 7.3 USD per sample, respectively. This enhanced efficiency and cost-effectiveness will enable researchers to undertake studies with greater sample sizes and diversity, thereby providing a richer, more detailed view of microbial communities and their dynamics.


Subjects
High-Throughput Nucleotide Sequencing , Metagenome , Cost-Benefit Analysis , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods , DNA , Soil , Metagenomics/methods
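
Illustrative note on record 6: the per-sample cost figures quoted in the abstract can be turned into relative savings with a few lines of Python. This is a small worked calculation, not part of the study. The helper name `percent_reduction` is an assumption made here, and the pairing of the two cost ranges with amplicon and metagenome preparation follows the order in which the abstract lists them.

```python
def percent_reduction(before, after):
    """Percentage by which a cost falls when it goes from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Per-sample chemical and plastic costs in USD, as quoted in the abstract
# (assumed order: amplicon first, metagenome second).
COSTS_USD = {
    "amplicon": (5.0, 3.6),
    "metagenome": (59.0, 7.3),
}

savings = {name: percent_reduction(b, a) for name, (b, a) in COSTS_USD.items()}

if __name__ == "__main__":
    for name, pct in savings.items():
        print(f"{name}: {pct:.1f}% lower cost per sample")
```

Under these figures the amplicon saving is about 28% and the metagenome saving is about 88%, so most of the benefit of miniaturization falls on metagenomic library preparation.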
7.
Microbiol Spectr ; 12(4): e0388523, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38451098

ABSTRACT

This manuscript describes the development of a streamlined, cost-effective laboratory workflow to meet the demands of increased whole genome sequence (WGS) capacity while achieving mandated quality metrics. From 2020 to 2021, the Wadsworth Center Bacteriology Laboratory (WCBL) used a streamlined workflow to sequence 5,743 genomes that contributed sequence data to nine different projects. The combined use of the QIAcube HT, Illumina DNA Prep using quarter volume reactions, and the NextSeq allowed the WCBL to process all samples that required WGS while also achieving a median turn-around time of 7 days (range 4 to 10 days) and meeting minimum sequence quality requirements. Public Health Laboratories should consider implementing these methods to aid in meeting testing requirements within budgetary restrictions. IMPORTANCE: Public Health Laboratories that implement whole genome sequencing (WGS) technologies may struggle to find the balance between sample volume and cost effectiveness. We present a method that allows for sequencing of a variety of bacterial isolates in a cost-effective manner. This report provides specific strategies to implement high-volume WGS, including an innovative, low-cost solution utilizing a novel quarter volume sequencing library preparation. The methods described support the use of high-throughput DNA extraction and WGS within budgetary constraints, strengthening public health responses to outbreaks and disease surveillance.


Subjects
Cost-Effectiveness Analysis , Public Health , Goals , Whole Genome Sequencing/methods , DNA , High-Throughput Nucleotide Sequencing/methods , Bacterial Genome
8.
Genome Res ; 34(2): 326-340, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38428994

ABSTRACT

Pacific Biosciences (PacBio) HiFi sequencing technology generates long reads (>10 kbp) with very high accuracy (<0.01% sequencing error). Although several de novo assembly tools are available for HiFi reads, there are no comprehensive studies on the evaluation of these assemblers. We evaluated the performance of 11 de novo HiFi assemblers on (1) real data for three eukaryotic genomes; (2) 34 synthetic data sets with different ploidy, sequencing coverage levels, heterozygosity rates, and sequencing error rates; (3) one real metagenomic data set; and (4) five synthetic metagenomic data sets with different composition abundance and heterozygosity rates. The 11 assemblers were evaluated using quality assessment tool (QUAST) and benchmarking universal single-copy ortholog (BUSCO). We also used several additional criteria, namely, completion rate, single-copy completion rate, duplicated completion rate, average proportion of largest category, average distance difference, quality value, run-time, and memory utilization. Results show that hifiasm and hifiasm-meta should be the first choice for assembling eukaryotic genomes and metagenomes with HiFi data. We performed a comprehensive benchmarking study of commonly used assemblers on complex eukaryotic genomes and metagenomes. Our study will help the research community to choose the most appropriate assembler for their data and identify possible improvements in assembly algorithms.


Subjects
Metagenome , Software , DNA Sequence Analysis/methods , Algorithms , Metagenomics/methods , High-Throughput Nucleotide Sequencing/methods
9.
PLoS One ; 19(3): e0300381, 2024.
Article in English | MEDLINE | ID: mdl-38489283

ABSTRACT

Water-borne plant pathogenic fungi and oomycetes are a major threat in greenhouse production systems. Early detection and quantification of these pathogens would enable us to ascertain both economic and biological thresholds required for a timely treatment, thus improving effective disease management. Here, we used Oxford nanopore MinION amplicon sequencing to analyze microbial communities in irrigation water collected from greenhouses used for growing tomato, cucumber and Aeschynanthus sp. Fungal and oomycete communities were characterized using primers that amplify the full internal transcribed spacer (ITS) region. To assess the sensitivity of the MinION sequencing, we spiked serially diluted mock DNA into the DNA isolated from greenhouse water samples prior to library preparation. Relative abundances of fungal and oomycete reads were distinct in the greenhouse irrigation water samples and in water samples from setups with tomato that was inoculated with Fusarium oxysporum. Sequence reads derived from fungal and oomycete mock communities were proportionate in the respective serial dilution samples, thus confirming the suitability of MinION amplicon sequencing for environmental monitoring. By using spike-ins as standards to test the reliability of quantification using the MinION, we found that the detection of spike-ins was highly affected by the background quantities of fungal or oomycete DNA in the sample. We observed that spike-ins having shorter length (538bp) produced reads across most of our dilutions compared to the longer spikes (>790bp). Moreover, the sequence reads were uneven with respect to dilution series and were least retrievable in the background samples having the highest DNA concentration, suggesting a narrow dynamic range of performance. We suggest continuous benchmarking of the MinION sequencing to improve quantitative metabarcoding efforts for rapid plant disease diagnostic and monitoring in the future.


Subjects
Nanopores , Oomycetes , Reproducibility of Results , Oomycetes/genetics , Fungi/genetics , DNA Sequence Analysis , DNA , High-Throughput Nucleotide Sequencing/methods
10.
Sci Rep ; 14(1): 6756, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38514891

ABSTRACT

Transposon directed insertion-site sequencing (TraDIS), a variant of transposon insertion sequencing commonly known as Tn-Seq, is a high-throughput assay that defines essential bacterial genes across diverse growth conditions. However, the variability between laboratory environments often requires laborious, time-consuming modifications to its protocol. In this technical study, we aimed to refine the protocol by identifying key parameters that can impact the complexity of mutant libraries. Firstly, we discovered that adjusting electroporation parameters including transposome concentration, transposome assembly conditions, and cell densities can significantly improve the recovery of viable mutants for different Escherichia coli strains. Secondly, we found that post-electroporation conditions, such as recovery time and the use of different media for selecting mutants, may also impact the complexity of viable mutants in the library. Finally, we developed a simplified sequencing library preparation workflow based on a Nextera-TruSeq hybrid design where ~80% of sequenced reads correspond to transposon-DNA junctions. The technical improvements presented in our study aim to streamline TraDIS protocols, making this powerful technique more accessible for a wider scientific audience.


Subjects
DNA Transposable Elements , Bacterial Genes , Insertional Mutagenesis , DNA Transposable Elements/genetics , Cost-Benefit Analysis , Base Sequence , Escherichia coli/genetics , High-Throughput Nucleotide Sequencing/methods , Gene Library
11.
BMC Biol ; 22(1): 43, 2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38378561

ABSTRACT

BACKGROUND: High tumor mutational burden (TMB) was reported to predict the efficacy of immune checkpoint inhibitors (ICIs). Pembrolizumab, an anti-PD-1, received FDA-approval for the treatment of unresectable/metastatic tumors with high TMB as determined by the FoundationOne®CDx test. It remains to be determined how TMB can also be calculated using other tests. RESULTS: FFPE/frozen tumor samples from various origins were sequenced in the frame of the Institut Curie (IC) Molecular Tumor Board using an in-house next-generation sequencing (NGS) panel. A TMB calculation method was developed at IC (IC algorithm) and compared to the FoundationOne® (FO) algorithm. Using IC algorithm, an optimal 10% variant allele frequency (VAF) cut-off was established for TMB evaluation on FFPE samples, compared to 5% on frozen samples. The median TMB score for MSS/POLE WT tumors was 8.8 mut/Mb versus 45 mut/Mb for MSI/POLE-mutated tumors. When focusing on MSS/POLE WT tumor samples, the highest median TMB scores were observed in lymphoma, lung, endometrial, and cervical cancers. After biological manual curation of these cases, 21% of them could be reclassified as MSI/POLE tumors and considered as "true TMB high." Higher TMB values were obtained using FO algorithm on FFPE samples compared to IC algorithm (40 mut/Mb [10-3927] versus 8.2 mut/Mb [2.5-897], p < 0.001). CONCLUSIONS: We herein propose a TMB calculation method and a bioinformatics tool that is customizable to different NGS panels and sample types. We were not able to retrieve TMB values from FO algorithm using our own algorithm and NGS panel.


Subjects
Neoplasms , Humans , Mutation , Neoplasms/genetics , High-Throughput Nucleotide Sequencing/methods
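
Illustrative note on record 11: the core arithmetic behind a TMB score, mutations per megabase after a variant allele frequency (VAF) filter, can be sketched in Python. This is a minimal sketch under stated assumptions, not the IC algorithm itself, whose other filtering steps the abstract does not describe. Only the 10% (FFPE) and 5% (frozen) cut-offs come from the abstract. The function name, the inclusive comparison at the cut-off, the example VAFs, and the 2.0 Mb panel size are hypothetical.

```python
# VAF cut-offs reported in the abstract for the IC algorithm.
VAF_CUTOFF = {"FFPE": 0.10, "frozen": 0.05}


def tumor_mutational_burden(variant_vafs, panel_size_mb, sample_type):
    """Mutations per megabase among variants at or above the sample-type VAF cut-off."""
    if panel_size_mb <= 0:
        raise ValueError("panel_size_mb must be positive")
    cutoff = VAF_CUTOFF[sample_type]
    # Assumption: a variant exactly at the cut-off is retained.
    retained = [vaf for vaf in variant_vafs if vaf >= cutoff]
    return len(retained) / panel_size_mb


if __name__ == "__main__":
    # Hypothetical variant calls and panel size, for illustration only.
    vafs = [0.04, 0.07, 0.12, 0.30, 0.55]
    for sample_type in VAF_CUTOFF:
        print(sample_type, tumor_mutational_burden(vafs, 2.0, sample_type))
```

The same variant list yields a lower score under the stricter FFPE cut-off, which is one reason scores from different pipelines and sample types are not directly interchangeable.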
12.
ACS Synth Biol ; 13(2): 683-686, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38329009

ABSTRACT

Biofoundries are automated high-throughput facilities specializing in the design, construction, and testing of engineered/synthetic DNA constructs (plasmids), often from genetic parts. A critical step of this process is assessing the fidelity of the assembled DNA construct to the desired design. Current methods utilized for this purpose are restriction digest or PCR followed by fragment analysis and sequencing. The Edinburgh Genome Foundry (EGF) has recently established a single-molecule sequencing quality control step using the Oxford Nanopore sequencing technology, along with a companion Nextflow pipeline and a Python package, to perform in-depth analysis and generate a detailed report. Our software enables researchers working with plasmids, including biofoundry scientists, to rapidly analyze and interpret sequencing data. In conclusion, we have created a laboratory and software protocol that validates assembled, cloned, or edited plasmids, using Nanopore long-reads, which can serve as a useful resource for the genetics, synthetic biology, and sequencing communities.


Subjects
DNA , Nanopores , DNA Sequence Analysis/methods , Cost-Benefit Analysis , DNA/genetics , Plasmids/genetics , High-Throughput Nucleotide Sequencing/methods
13.
J Clin Microbiol ; 62(3): e0010322, 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38315007

ABSTRACT

The ongoing COVID-19 pandemic necessitates cost-effective, high-throughput, and timely whole-genome sequencing (WGS) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) viruses for outbreak investigations, identifying variants of concern (VoC), characterizing vaccine breakthrough infections, and public health surveillance. In addition, the enormous demand for WGS on supply chains and the resulting shortages of laboratory supplies necessitated the use of low-reagent and low-consumable methods. Here, we report an optimized library preparation method (the BCCDC cutdown method) that can be used in a high-throughput scenario, where one technologist can perform 576 library preparations (6 plates of 96 samples) over the course of one 8-hour shift. The same protocol can also be used in a rapid turnaround time scenario, from primary samples (up to 96 samples) to loading on a sequencer in an 8-hour shift. This new method uses Freed et al.'s 1,200 bp primer sets (Biol Methods Protoc 5:bpaa014, 2020, https://doi.org/10.1093/biomethods/bpaa014) and a modified and condensed Illumina DNA Prep workflow (Illumina, CA, USA). Compared to the original protocol, the application of this new method using hundreds of clinical specimens demonstrated equivalent results to the full-length DNA Prep workflow at 45% of the cost, 15% of consumables required (such as pipet tips), 25% of manual hands-on time, and 15% of on-instrument time if performing on a liquid handler, with no compromise in sequence quality. Results demonstrate that this new method is a rapid, simple, cost-effective, and high-quality SARS-CoV-2 WGS protocol. IMPORTANCE: Sequencing has played an invaluable role in the response to the COVID-19 pandemic. Ongoing work in this area, however, demands optimization of laboratory workflow to increase sequencing capacity, improve turnaround time, and reduce cost without compromising sequence quality. This report describes an optimized DNA library preparation method for improved whole-genome sequencing of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pathogen. The workflow advantages summarized here include significant time, cost, and consumable savings, which suggest that this new method is an efficient, scalable, and pragmatic alternative for SARS-CoV-2 whole-genome sequencing.


Subjects
COVID-19 , SARS-CoV-2 , Humans , Cost-Benefit Analysis , Pandemics , Gene Library , DNA , High-Throughput Nucleotide Sequencing/methods
14.
Genes (Basel) ; 15(2), 2024 Jan 31.
Article in English | MEDLINE | ID: mdl-38397184

ABSTRACT

Mitochondrial (mt) DNA plays an important role in the fields of forensic and clinical genetics, molecular anthropology, and population genetics, with mixture interpretation being of particular interest in medical and forensic genetics. The high copy number, haploid state (only a single haplotype contributed per individual), high mutation rate, and well-known phylogeny of mtDNA make it an attractive marker for mixture deconvolution in damaged and low quantity samples of all types. Given the desire to deconvolute mtDNA mixtures, the goals of this study were to (1) create a new software, MixtureAceMT™, to deconvolute mtDNA mixtures by assessing and combining two existing software tools, MixtureAce™ and Mixemt, (2) create a dataset of in-silico MPS mixtures from whole mitogenome haplotypes representing a diverse set of population groups, and consisting of two and three contributors at different dilution ratios, and (3) since amplicon targeted sequencing is desirable, and is a commonly used approach in forensic laboratories, create biological mixture data associated with two amplification kits: PowerSeq™ Whole Genome Mito (Promega™, Madison, WI, USA) and Precision ID mtDNA Whole Genome Panel (Thermo Fisher Scientific by AB™, Waltham, MA, USA) to further validate the software for use in forensic laboratories. MixtureAceMT™ provides a user-friendly interface while reducing confounding features such as NUMTs and noise, reducing traditionally prohibitive processing times. The new software was able to detect the correct contributing haplogroups and closely estimate contributor proportions in sequencing data generated from small amplicons for mixtures with minor contributions of ≥5%. A challenge of mixture deconvolution using small amplicon sequencing is the potential generation of spurious haplogroups resulting from private mutations that differ from Phylotree. MixtureAceMT™ was able to resolve these additional haplogroups by including known haplotype/s in the evaluation. In addition, for some samples, the inclusion of known haplotypes was also able to resolve trace contributors (minor contribution 1-2%), which remain challenging to resolve even with deep sequencing.


Subjects
Mitochondrial DNA , High-Throughput Nucleotide Sequencing , Mitochondrial DNA/genetics , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis , Mitochondria/genetics , Haplotypes
15.
ACS Synth Biol ; 13(2): 457-465, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38295293

ABSTRACT

Modern biological science, especially synthetic biology, relies heavily on the construction of DNA elements, often in the form of plasmids. Plasmids are used for a variety of applications, including the expression of proteins for subsequent purification, the expression of heterologous pathways for the production of valuable compounds, and the study of biological functions and mechanisms. For all applications, a critical step after the construction of a plasmid is its sequence validation. The traditional method for sequence determination is Sanger sequencing, which is limited to approximately 1000 bp per reaction. Here, we present a highly scalable in-house method for rapid validation of amplified DNA sequences using long-read Nanopore sequencing. We developed two-step amplicon and transposase strategies to provide maximum flexibility for dual barcode sequencing. We also provide an automated analysis pipeline to quickly and reliably analyze sequencing results and provide easy-to-interpret results for each sample. The user-friendly DuBA.flow start-to-finish pipeline is widely applicable. Furthermore, we show that construct validation using DuBA.flow can be performed by barcoded colony PCR amplicon sequencing, thus accelerating research.


Subjects
DNA , High-Throughput Nucleotide Sequencing , Workflow , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods , Plasmids/genetics , DNA/genetics
16.
Biochem Biophys Res Commun ; 696: 149488, 2024 Feb 12.
Article in English | MEDLINE | ID: mdl-38219485

ABSTRACT

Enzymatic methyl-seq (EM-seq), an enzyme-based method, identifies genome-wide DNA methylation, which enables us to obtain reliable methylome data from purified genomic DNA by avoiding bisulfite-induced DNA damage. However, the loss of DNA during purification hinders the methylome analysis of limited samples. The crude DNA extraction method is the quickest and minimal sample loss approach for obtaining useable DNA without requiring additional dissolution and purification. However, it remains unclear whether crude DNA can be used directly for EM-seq library construction. In this study, we aimed to assess the quality of EM-seq libraries prepared directly using crude DNA. The crude DNA-derived libraries provided appropriate fragment sizes and concentrations for sequencing similar to those of the purified DNA-derived libraries. However, the sequencing results of crude samples exhibited lower reference sequence mapping efficiencies than those of the purified samples. Additionally, the lower-input crude DNA-derived sample exhibited a marginally lower cytosine-to-thymine conversion efficiency and hypermethylated pattern around gene regulatory elements than the higher-input crude DNA- or purified DNA-derived samples. In contrast, the methylation profiles of the crude and purified samples exhibited a significant correlation. Our findings indicate that crude DNA can be used as a raw material for EM-seq library construction.


Subjects
DNA Methylation , DNA , Gene Library , Base Sequence , DNA/genetics , DNA/analysis , Molecular Cloning , DNA Sequence Analysis/methods , High-Throughput Nucleotide Sequencing/methods , Sulfites
17.
BMC Bioinformatics ; 24(1): 461, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38062356

ABSTRACT

BACKGROUND: Basecalling long DNA sequences is a crucial step in nanopore-based DNA sequencing protocols. In recent years, the CTC-RNN model has become the leading basecalling model, supplanting preceding hidden Markov models (HMMs) that relied on pre-segmenting ion current measurements. However, the CTC-RNN model operates independently of prior biological and physical insights. RESULTS: We present a novel basecaller named Lokatt: explicit duration Markov model and residual-LSTM network. It leverages an explicit duration HMM (EDHMM) designed to model the nanopore sequencing processes. Trained on a newly generated library with methylation-free E. coli samples and MinION R9.4.1 chemistry, the Lokatt basecaller achieves basecalling performances with a median single read identity score of 0.930 and a genome coverage ratio of 99.750%, on par with existing state-of-the-art structure when trained on the same datasets. CONCLUSION: Our research underlines the potential of incorporating prior knowledge into the basecalling processes, particularly through integrating HMMs and recurrent neural networks. The Lokatt basecaller showcases the efficacy of a hybrid approach, emphasizing its capacity to achieve high-quality basecalling performance while accommodating the nuances of nanopore sequencing. These outcomes pave the way for advanced basecalling methodologies, with potential implications for enhancing the accuracy and efficiency of nanopore-based DNA sequencing protocols.


Subjects
Nanopores , DNA/genetics , DNA Sequence Analysis/methods , Computer Neural Networks , Base Sequence , High-Throughput Nucleotide Sequencing/methods
18.
Parasitol Res ; 122(12): 3243-3256, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37940706

ABSTRACT

We recently described a targeted amplicon deep sequencing (TADS) strategy that utilizes a nested PCR targeting the 18S rDNA gene of blood-borne parasites. The assay facilitates selective digestion of host DNA by targeting enzyme restriction sites present in vertebrates but absent in parasites. This enriching of parasite-derived amplicon drastically reduces the proportion of host-derived reads during sequencing and results in the sensitive detection of several clinically important blood parasites including Plasmodium spp., Babesia spp., kinetoplastids, and filarial nematodes. Despite these promising results, high costs and the laborious nature of metagenomics sequencing are prohibitive to the routine use of this assay in most laboratories. We describe and evaluate a new metagenomic approach that utilizes a set of primers modified from our original assay that incorporates Illumina barcodes and adapters during the PCR steps. This modification makes amplicons immediately compatible with sequencing on the Illumina MiSeq platform, removing the need for a separate library preparation, which is expensive and time-consuming. We compared this modified assay to our previous nested TADS assay in terms of preparation speed, limit of detection (LOD), and cost. Our modifications reduced assay turnaround times from 7 to 5 days. The cost decreased from approximately $40 per sample to $11 per sample. The modified assay displayed comparable performance in the detection and differentiation of human-infecting Plasmodium spp., Babesia spp., kinetoplastids, and filarial nematodes in clinical samples. The LOD of this modified approach was determined for malaria parasites and remained similar to that previously reported for our earlier assay (0.58 Plasmodium falciparum parasites/µL of blood). These modifications markedly reduced costs and turnaround times, making the assay more amenable to routine diagnostic applications.


Subjects
Babesia , Parasites , Plasmodium , Animals , Humans , Parasites/genetics , Cost-Benefit Analysis , Plasmodium/genetics , Plasmodium falciparum/genetics , Ribosomal DNA/genetics , High-Throughput Nucleotide Sequencing/methods , Babesia/genetics
19.
Nat Commun ; 14(1): 4760, 2023 Aug 08.
Article in English | MEDLINE | ID: mdl-37553321

ABSTRACT

Long-read RNA sequencing (RNA-seq) is a powerful technology for transcriptome analysis, but the relatively low throughput of current long-read sequencing platforms limits transcript coverage. One strategy for overcoming this bottleneck is targeted long-read RNA-seq for preselected gene panels. We present TEQUILA-seq, a versatile, easy-to-implement, and low-cost method for targeted long-read RNA-seq utilizing isothermally linear-amplified capture probes. When performed on the Oxford nanopore platform with multiple gene panels of varying sizes, TEQUILA-seq consistently and substantially enriches transcript coverage while preserving transcript quantification. We profile full-length transcript isoforms of 468 actionable cancer genes across 40 representative breast cancer cell lines. We identify transcript isoforms enriched in specific subtypes and discover novel transcript isoforms in extensively studied cancer genes such as TP53. Among cancer genes, tumor suppressor genes (TSGs) are significantly enriched for aberrant transcript isoforms targeted for degradation via mRNA nonsense-mediated decay, revealing a common RNA-associated mechanism for TSG inactivation. TEQUILA-seq reduces the per-reaction cost of targeted capture by 2-3 orders of magnitude, as compared to a standard commercial solution. TEQUILA-seq can be broadly used for targeted sequencing of full-length transcripts in diverse biomedical research settings.


Subjects
Gene Expression Profiling , High-Throughput Nucleotide Sequencing , High-Throughput Nucleotide Sequencing/methods , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , RNA/genetics , Protein Isoforms/genetics , Transcriptome/genetics
20.
Mutat Res Rev Mutat Res ; 792: 108466, 2023.
Article in English | MEDLINE | ID: mdl-37643677

ABSTRACT

Error-corrected Next Generation Sequencing (ecNGS) is rapidly emerging as a valuable, highly sensitive and accurate method for detecting and characterizing mutations in any cell type, tissue or organism from which DNA can be isolated. Recent mutagenicity and carcinogenicity studies have used ecNGS to quantify drug-/chemical-induced mutations and mutational spectra associated with cancer risk. ecNGS has potential applications in genotoxicity assessment as a new readout for traditional models, for mutagenesis studies in 3D organotypic cultures, and for detecting off-target effects of gene editing tools. Additionally, early data suggest that ecNGS can measure clonal expansion of mutations as a mechanism-agnostic early marker of carcinogenic potential and can evaluate mutational load directly in human biomonitoring studies. In this review, we discuss promising applications, challenges, limitations, and key data initiatives needed to enable regulatory testing and adoption of ecNGS - including for advancing safety assessment, augmenting weight-of-evidence for mutagenicity and carcinogenicity mechanisms, identifying early biomarkers of cancer risk, and managing human health risk from chemical exposures.


Subjects
High-Throughput Nucleotide Sequencing , Mutagens , Humans , High-Throughput Nucleotide Sequencing/methods , Mutagenicity Tests , Mutation , Mutagens/toxicity , Carcinogens/toxicity , Carcinogenesis , Risk Assessment