Results 1 - 20 of 2,695
1.
PLoS One ; 19(6): e0303938, 2024.
Article in English | MEDLINE | ID: mdl-38843147

ABSTRACT

Oxford Nanopore Technologies (ONT) sequencing is a promising technology. We assessed the performance of the new ONT R10 flowcells and V14 rapid sequencing chemistry for whole genome sequencing of Mycobacterium tuberculosis (Mtb) DNA extracted from clinical primary liquid cultures (CPLCs). Using the recommended protocols for MinION Mk1C, R10.4.1 MinION flowcells, and the ONT Rapid Sequencing Kit V14 on six CPLC samples, we obtained a pooled library yield of 10.9 ng/µl and generated 1.94 Gb of sequenced bases and 214k reads after 48 h in a first sequencing run. Only half (49%) of all generated reads met the Phred Quality score threshold (>8). To assess whether the low data output and sequence quality were due to impurities present in DNA extracted directly from CPLCs, we added a pre-library preparation bead clean-up step and included purified DNA obtained from an Mtb subculture as a control sample in a second sequencing run. The library yield for DNA extracted from four CPLCs and one Mtb subculture (control) was similar (10.0 ng/µl), and 2.38 Gb of bases and 822k reads were produced. The quality was slightly better, with 66% of the produced reads having a Phred Quality score >8. A third run of DNA from six CPLCs with bead clean-up pre-processing produced a low library yield (approximately 1 Gb of bases, 166k reads) of low quality (51% of reads with a Phred Quality score >8). A median depth of coverage above 10× was achieved for only five of 17 (29%) sequenced libraries. Compared to Illumina WGS of the same samples, accurate lineage predictions and full drug resistance profiles could not be determined by TBProfiler from the generated ONT data. Further optimization of the V14 ONT rapid sequencing chemistry and library preparation protocol is needed for clinical Mtb WGS applications.


Subjects
Bacterial DNA, Mycobacterium tuberculosis, Mycobacterium tuberculosis/genetics, Humans, Bacterial DNA/genetics, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods, Nanopores, Nanopore Sequencing/methods, Bacterial Genome, Whole Genome Sequencing/methods, Tuberculosis/microbiology, Tuberculosis/diagnosis, Gene Library
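
The abstract above reports the share of reads exceeding a Phred quality score of 8. As a rough illustration of how such a figure can be derived, the sketch below reads FASTQ text, computes a mean quality per read by averaging error probabilities, and counts the reads above a threshold. The averaging method, the Phred+33 encoding, the function names, and the toy reads are all assumptions made for illustration; the abstract does not say how its per-read score was calculated.

```python
import math

def mean_read_quality(quality_string, offset=33):
    """Mean Phred score of one read, averaged in error-probability space."""
    # Convert each quality character to an error probability (Phred+33 assumed).
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))

def fraction_passing(fastq_text, threshold=8.0):
    """Fraction of FASTQ records whose mean quality exceeds the threshold."""
    lines = fastq_text.strip().splitlines()
    # Assumes four-line records; the fourth line holds the quality string.
    quals = [lines[i + 3] for i in range(0, len(lines), 4)]
    passing = sum(1 for q in quals if mean_read_quality(q) > threshold)
    return passing / len(quals)

# Hypothetical toy input: one read at Q20 and one at Q5.
toy_fastq = "\n".join([
    "@read1", "ACGT", "+", "5555",   # '5' encodes Q20
    "@read2", "ACGT", "+", "&&&&",   # '&' encodes Q5
])

print(fraction_passing(toy_fastq))
```

With a threshold of 8, only the first toy read passes, so half of the reads are counted.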
2.
Methods Mol Biol ; 2744: 223-238, 2024.
Article in English | MEDLINE | ID: mdl-38683322

ABSTRACT

DNA barcodes are useful in biodiversity research, but sequencing barcodes with dye termination methods ("Sanger sequencing") has been so time-consuming and expensive that DNA barcodes are not as widely used as they should be. Fortunately, MinION sequencers from Oxford Nanopore Technologies have recently emerged as a cost-effective and efficient alternative for barcoding. MinION barcodes are now suitable for large-scale species discovery and enable specimen identification when the target species are represented in barcode databases. With a MinION, it is possible to obtain 10,000 barcodes from a single flow cell at a cost of less than 0.10 USD per specimen. Additionally, a Flongle flow cell can be used for small projects requiring up to 300 barcodes (0.50 USD per specimen). We here describe a cost-effective laboratory workflow for obtaining tagged amplicons, preparing ONT libraries, sequencing amplicon pools, and analyzing the MinION reads with the software ONTbarcoder. This workflow has been shown to yield highly accurate barcodes that are 99.99% identical to Sanger barcodes. Overall, we propose that the use of MinION for DNA barcoding is an attractive option for all researchers in need of a cost-effective and efficient solution for large-scale species discovery and specimen identification.


Subjects
Taxonomic DNA Barcoding, Nanopore Sequencing, Taxonomic DNA Barcoding/methods, Taxonomic DNA Barcoding/economics, Nanopore Sequencing/methods, Cost-Benefit Analysis, High-Throughput Nucleotide Sequencing/methods, High-Throughput Nucleotide Sequencing/economics, Software, Gene Library, DNA Sequence Analysis/methods, DNA Sequence Analysis/economics, Workflow, DNA/genetics
3.
Nat Commun ; 15(1): 3126, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605047

ABSTRACT

Long reads that cover more variants per read raise opportunities for accurate haplotype construction, whereas the genotype errors of single nucleotide polymorphisms pose great computational challenges for haplotyping tools. Here we introduce KSNP, an efficient haplotype construction tool based on the de Bruijn graph (DBG). KSNP leverages the ability of DBG in handling high-throughput erroneous reads to tackle the challenges. Compared to other notable tools in this field, KSNP achieves at least 5-fold speedup while producing comparable haplotype results. The time required for assembling human haplotypes is reduced to nearly the data-in time.


Subjects
Algorithms, Single Nucleotide Polymorphism, Humans, Haplotypes/genetics, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods, Software
4.
Methods Mol Biol ; 2787: 107-122, 2024.
Article in English | MEDLINE | ID: mdl-38656485

ABSTRACT

Genetic diversity refers to the variety of genetic traits within a population or a species. It is an essential aspect of both plant ecology and plant breeding because it contributes to the adaptability, survival, and resilience of populations in changing environments. This chapter outlines a pipeline for estimating genetic diversity statistics from reduced representation or whole genome sequencing data. The pipeline involves obtaining DNA sequence reads, mapping the corresponding reads to a reference genome, calling variants from the alignments, and generating an unbiased estimation of nucleotide diversity and divergence between populations. The pipeline is suitable for single-end Illumina reads and can be adjusted for paired-end reads. The resulting pipeline provides a comprehensive approach for aligning and analyzing sequencing data to estimate genetic diversity.


Subjects
Genetic Variation, Plant Genome, Plants, Plants/genetics, Software, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods, Computational Biology/methods, Genomics/methods
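
The pipeline described in the abstract above ends with an estimate of nucleotide diversity. A minimal sketch of the textbook estimator (the average number of pairwise differences per site) is shown below. It works on a small, gap-free toy alignment and does not handle missing data or unequal coverage as a real pipeline would; the function name and the example sequences are hypothetical and are not taken from the chapter.

```python
from itertools import combinations

def nucleotide_diversity(sequences):
    """Average pairwise differences per site (pi) for aligned sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(sequences, 2))
    # Count mismatching positions for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)

# Hypothetical toy alignment: three sequences of eight sites.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]

print(nucleotide_diversity(toy_alignment))
```

In the toy alignment the three pairs differ at 1, 1, and 2 sites, giving 4 differences over 3 pairs × 8 sites.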
5.
Bioinformatics ; 40(4), 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38565260

ABSTRACT

MOTIVATION: Automated chromatin segmentation based on ChIP-seq (chromatin immunoprecipitation followed by sequencing) data reveals insights into the epigenetic regulation of chromatin accessibility. Existing segmentation methods are constrained by simplifying modeling assumptions, which may have a negative impact on the segmentation quality. RESULTS: We introduce EpiSegMix, a novel segmentation method based on a hidden Markov model with flexible read count distribution types and state duration modeling, allowing for a more flexible modeling of both histone signals and segment lengths. In a comparison with existing tools, ChromHMM, Segway, and EpiCSeg, we show that EpiSegMix is more predictive of cell biology, such as gene expression. Its flexible framework enables it to fit an accurate probabilistic model, which has the potential to increase the biological interpretability of chromatin states. AVAILABILITY AND IMPLEMENTATION: Source code: https://gitlab.com/rahmannlab/episegmix.


Subjects
Chromatin, Genetic Epigenesis, DNA Sequence Analysis/methods, Histones/metabolism, Software, High-Throughput Nucleotide Sequencing/methods
6.
Genome Biol Evol ; 16(4), 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38669452

ABSTRACT

A pangenome captures the genomic diversity for a species, derived from a collection of genetic sequences of diverse populations. Advances in sequencing technologies have given rise to three primary methods for pangenome construction and analysis: de novo assembly and comparison, reference genome-based iterative assembly, and graph-based pangenome construction. Each method presents advantages and challenges in processing varying amounts and structures of DNA sequencing data. With the emergence of high-quality genome assemblies and advanced bioinformatic tools, the graph-based pangenome is emerging as an advanced reference for exploring the biological and functional implications of genetic variations.


Subjects
Plant Genome, Genomics/methods, Plants/genetics, DNA Sequence Analysis/methods, Genetic Variation, Computational Biology/methods
7.
PLoS One ; 19(4): e0301446, 2024.
Article in English | MEDLINE | ID: mdl-38573983

ABSTRACT

Reductions in sequencing costs have enabled widespread use of shotgun metagenomics and amplicon sequencing, which have drastically improved our understanding of the microbial world. However, large sequencing projects are now hampered by the cost of library preparation and low sample throughput, compared with the actual sequencing costs. Here, we benchmarked three high-throughput DNA extraction methods: ZymoBIOMICS™ 96 MagBead DNA Kit, MP Biomedicals™ FastDNA™-96 Soil Microbe DNA Kit, and DNeasy® 96 PowerSoil® Pro QIAcube® HT Kit. The DNA extractions were evaluated based on length, quality, quantity, and the observed microbial community across five diverse soil types. DNA extraction of all soil types was successful for all kits; however, the DNeasy® 96 PowerSoil® Pro QIAcube® HT Kit excelled across all performance parameters. We further used the nanoliter dispensing system I.DOT One to miniaturize Illumina amplicon and metagenomic library preparation volumes by a factor of 5 and 10, respectively, with no significant impact on the observed microbial communities. With these protocols, DNA extraction, metagenomic library preparation, and amplicon library preparation for one 96-well plate take approximately 3, 5, and 6 hours, respectively. Furthermore, the miniaturization of amplicon and metagenome library preparation reduces the chemical and plastic costs from 5.0 to 3.6 USD and from 59 to 7.3 USD per sample, respectively. This enhanced efficiency and cost-effectiveness will enable researchers to undertake studies with greater sample sizes and diversity, thereby providing a richer, more detailed view of microbial communities and their dynamics.


Subjects
High-Throughput Nucleotide Sequencing, Metagenome, Cost-Benefit Analysis, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods, DNA, Soil, Metagenomics/methods
8.
PLoS One ; 19(3): e0300381, 2024.
Article in English | MEDLINE | ID: mdl-38489283

ABSTRACT

Water-borne plant pathogenic fungi and oomycetes are a major threat in greenhouse production systems. Early detection and quantification of these pathogens would enable us to ascertain both economic and biological thresholds required for a timely treatment, thus improving effective disease management. Here, we used Oxford Nanopore MinION amplicon sequencing to analyze microbial communities in irrigation water collected from greenhouses used for growing tomato, cucumber and Aeschynanthus sp. Fungal and oomycete communities were characterized using primers that amplify the full internal transcribed spacer (ITS) region. To assess the sensitivity of the MinION sequencing, we spiked serially diluted mock DNA into the DNA isolated from greenhouse water samples prior to library preparation. Relative abundances of fungal and oomycete reads were distinct in the greenhouse irrigation water samples and in water samples from setups with tomato that was inoculated with Fusarium oxysporum. Sequence reads derived from fungal and oomycete mock communities were proportionate in the respective serial dilution samples, thus confirming the suitability of MinION amplicon sequencing for environmental monitoring. By using spike-ins as standards to test the reliability of quantification using the MinION, we found that the detection of spike-ins was highly affected by the background quantities of fungal or oomycete DNA in the sample. We observed that shorter spike-ins (538 bp) produced reads across most of our dilutions compared to the longer spike-ins (>790 bp). Moreover, the sequence reads were uneven with respect to the dilution series and were least retrievable in the background samples having the highest DNA concentration, suggesting a narrow dynamic range of performance. We suggest continuous benchmarking of the MinION sequencing to improve quantitative metabarcoding efforts for rapid plant disease diagnostics and monitoring in the future.


Subjects
Nanopores, Oomycetes, Reproducibility of Results, Oomycetes/genetics, Fungi/genetics, DNA Sequence Analysis, DNA, High-Throughput Nucleotide Sequencing/methods
9.
Genome Res ; 34(2): 326-340, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38428994

ABSTRACT

Pacific Biosciences (PacBio) HiFi sequencing technology generates long reads (>10 kbp) with very high accuracy (<0.01% sequencing error). Although several de novo assembly tools are available for HiFi reads, there are no comprehensive studies on the evaluation of these assemblers. We evaluated the performance of 11 de novo HiFi assemblers on (1) real data for three eukaryotic genomes; (2) 34 synthetic data sets with different ploidy, sequencing coverage levels, heterozygosity rates, and sequencing error rates; (3) one real metagenomic data set; and (4) five synthetic metagenomic data sets with different composition abundance and heterozygosity rates. The 11 assemblers were evaluated using quality assessment tool (QUAST) and benchmarking universal single-copy ortholog (BUSCO). We also used several additional criteria, namely, completion rate, single-copy completion rate, duplicated completion rate, average proportion of largest category, average distance difference, quality value, run-time, and memory utilization. Results show that hifiasm and hifiasm-meta should be the first choice for assembling eukaryotic genomes and metagenomes with HiFi data. We performed a comprehensive benchmarking study of commonly used assemblers on complex eukaryotic genomes and metagenomes. Our study will help the research community to choose the most appropriate assembler for their data and identify possible improvements in assembly algorithms.


Subjects
Metagenome, Software, DNA Sequence Analysis/methods, Algorithms, Metagenomics/methods, High-Throughput Nucleotide Sequencing/methods
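
Assembly quality tools such as QUAST, which the abstract above names, report contiguity statistics, and N50 is a common one. The abstract does not say which QUAST metrics were compared, so the sketch below is only a generic illustration of how N50 is computed from a list of contig lengths; the function name and the example lengths are hypothetical.

```python
def n50(contig_lengths):
    """Contig length at which the running total, taken from the longest
    contig downward, first reaches at least half of the assembly size."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs given")

# Hypothetical contig lengths (in bp) for a toy assembly of 300 bp.
toy_contigs = [100, 80, 60, 40, 20]

print(n50(toy_contigs))
```

For the toy assembly, the two longest contigs (100 + 80) already cover more than half of the 300 bp total, so the N50 is 80.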
10.
Genes (Basel) ; 15(2), 2024 Jan 31.
Article in English | MEDLINE | ID: mdl-38397184

ABSTRACT

Mitochondrial (mt) DNA plays an important role in the fields of forensic and clinical genetics, molecular anthropology, and population genetics, with mixture interpretation being of particular interest in medical and forensic genetics. The high copy number, haploid state (only a single haplotype contributed per individual), high mutation rate, and well-known phylogeny of mtDNA make it an attractive marker for mixture deconvolution in damaged and low quantity samples of all types. Given the desire to deconvolute mtDNA mixtures, the goals of this study were to (1) create new software, MixtureAceMT™, to deconvolute mtDNA mixtures by assessing and combining two existing software tools, MixtureAce™ and Mixemt, (2) create a dataset of in-silico MPS mixtures from whole mitogenome haplotypes representing a diverse set of population groups, and consisting of two and three contributors at different dilution ratios, and (3) since amplicon targeted sequencing is desirable, and is a commonly used approach in forensic laboratories, create biological mixture data associated with two amplification kits: PowerSeq™ Whole Genome Mito (Promega™, Madison, WI, USA) and Precision ID mtDNA Whole Genome Panel (Thermo Fisher Scientific by AB™, Waltham, MA, USA) to further validate the software for use in forensic laboratories. MixtureAceMT™ provides a user-friendly interface while reducing confounding features such as NUMTs and noise, reducing traditionally prohibitive processing times. The new software was able to detect the correct contributing haplogroups and closely estimate contributor proportions in sequencing data generated from small amplicons for mixtures with minor contributions of ≥5%. A challenge of mixture deconvolution using small amplicon sequencing is the potential generation of spurious haplogroups resulting from private mutations that differ from Phylotree. MixtureAceMT™ was able to resolve these additional haplogroups by including known haplotype(s) in the evaluation. In addition, for some samples, the inclusion of known haplotypes was also able to resolve trace contributors (minor contribution 1-2%), which remain challenging to resolve even with deep sequencing.


Subjects
Mitochondrial DNA, High-Throughput Nucleotide Sequencing, Mitochondrial DNA/genetics, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis, Mitochondria/genetics, Haplotypes
11.
ACS Synth Biol ; 13(2): 683-686, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38329009

ABSTRACT

Biofoundries are automated high-throughput facilities specializing in the design, construction, and testing of engineered/synthetic DNA constructs (plasmids), often from genetic parts. A critical step of this process is assessing the fidelity of the assembled DNA construct to the desired design. Current methods utilized for this purpose are restriction digest or PCR followed by fragment analysis and sequencing. The Edinburgh Genome Foundry (EGF) has recently established a single-molecule sequencing quality control step using the Oxford Nanopore sequencing technology, along with a companion Nextflow pipeline and a Python package, to perform in-depth analysis and generate a detailed report. Our software enables researchers working with plasmids, including biofoundry scientists, to rapidly analyze and interpret sequencing data. In conclusion, we have created a laboratory and software protocol that validates assembled, cloned, or edited plasmids, using Nanopore long-reads, which can serve as a useful resource for the genetics, synthetic biology, and sequencing communities.


Subjects
DNA, Nanopores, DNA Sequence Analysis/methods, Cost-Benefit Analysis, DNA/genetics, Plasmids/genetics, High-Throughput Nucleotide Sequencing/methods
12.
Article in English | MEDLINE | ID: mdl-38190241

ABSTRACT

Five strains of two novel species were isolated from the wastewater treatment systems of a pharmaceutical factory located in Zhejiang province, PR China. Strains ZM22T and Y6 were identified as belonging to a potential novel species of the genus Comamonas, whereas strains ZM23T, ZM24 and ZM25 were identified as belonging to a novel species of the genus Pseudomonas. These strains were characterized by polyphasic approaches including 16S rRNA gene analysis, multi-locus sequence analysis, average nucleotide identity (ANI), in silico DNA-DNA hybridization (isDDH), physiological and biochemical tests, as well as chemotaxonomic analysis. Genome-based phylogenetic analysis further confirmed that strains ZM22T and Y6 form a distinct clade closely related to Comamonas testosteroni ATCC 11996T and Comamonas thiooxydans DSM 17888T. Strains ZM23T, ZM24 and ZM25 were grouped as a separate clade closely related to Pseudomonas nitroreducens DSM 14399T and Pseudomonas nicosulfuronedens LAM1902T. The orthoANI and isDDH results indicated that strains ZM22T and Y6 belong to the same species. In addition, genomic DNA fingerprinting demonstrated that these strains do not originate from a single clone. The same results were observed for strains ZM23T, ZM24 and ZM25. Strains ZM22T and Y6 were resistant to multiple antibiotics, whereas strains ZM23T, ZM24 and ZM25 were able to degrade an emerging pollutant, triclosan. The phylogenetic, physiological and biochemical characteristics, as well as chemotaxonomy, allowed these strains to be distinguished from their genus, and we therefore propose the names Comamonas resistens sp. nov. (type strain ZM22=MCCC 1K08496T=KCTC 82561T) and Pseudomonas triclosanedens sp. nov. (type strain ZM23T=MCCC 1K08497T=JCM 36056T), respectively.


Subjects
Comamonas, Fatty Acids, Water Purification, Bacterial Typing Techniques, Base Composition, Comamonas/genetics, Bacterial DNA/genetics, Fatty Acids/chemistry, Phylogeny, Pseudomonas/genetics, 16S Ribosomal RNA/genetics, DNA Sequence Analysis, Drug Industry
13.
Biochem Biophys Res Commun ; 696: 149488, 2024 Feb 12.
Article in English | MEDLINE | ID: mdl-38219485

ABSTRACT

Enzymatic methyl-seq (EM-seq), an enzyme-based method, identifies genome-wide DNA methylation, which enables us to obtain reliable methylome data from purified genomic DNA by avoiding bisulfite-induced DNA damage. However, the loss of DNA during purification hinders the methylome analysis of limited samples. The crude DNA extraction method is the quickest and minimal sample loss approach for obtaining useable DNA without requiring additional dissolution and purification. However, it remains unclear whether crude DNA can be used directly for EM-seq library construction. In this study, we aimed to assess the quality of EM-seq libraries prepared directly using crude DNA. The crude DNA-derived libraries provided appropriate fragment sizes and concentrations for sequencing similar to those of the purified DNA-derived libraries. However, the sequencing results of crude samples exhibited lower reference sequence mapping efficiencies than those of the purified samples. Additionally, the lower-input crude DNA-derived sample exhibited a marginally lower cytosine-to-thymine conversion efficiency and hypermethylated pattern around gene regulatory elements than the higher-input crude DNA- or purified DNA-derived samples. In contrast, the methylation profiles of the crude and purified samples exhibited a significant correlation. Our findings indicate that crude DNA can be used as a raw material for EM-seq library construction.


Subjects
DNA Methylation, DNA, Gene Library, Base Sequence, DNA/genetics, DNA/analysis, Molecular Cloning, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods, Sulfites
14.
ACS Synth Biol ; 13(2): 457-465, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38295293

ABSTRACT

Modern biological science, especially synthetic biology, relies heavily on the construction of DNA elements, often in the form of plasmids. Plasmids are used for a variety of applications, including the expression of proteins for subsequent purification, the expression of heterologous pathways for the production of valuable compounds, and the study of biological functions and mechanisms. For all applications, a critical step after the construction of a plasmid is its sequence validation. The traditional method for sequence determination is Sanger sequencing, which is limited to approximately 1000 bp per reaction. Here, we present a highly scalable in-house method for rapid validation of amplified DNA sequences using long-read Nanopore sequencing. We developed two-step amplicon and transposase strategies to provide maximum flexibility for dual barcode sequencing. We also provide an automated analysis pipeline to quickly and reliably analyze sequencing results and provide easy-to-interpret results for each sample. The user-friendly DuBA.flow start-to-finish pipeline is widely applicable. Furthermore, we show that construct validation using DuBA.flow can be performed by barcoded colony PCR amplicon sequencing, thus accelerating research.


Subjects
DNA, High-Throughput Nucleotide Sequencing, Workflow, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods, Plasmids/genetics, DNA/genetics
15.
Mol Phylogenet Evol ; 192: 107988, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38072140

ABSTRACT

Phylogenetic inference has become a standard technique in integrative taxonomy and systematics, as well as in biogeography and ecology. DNA barcodes are often used for phylogenetic inference, despite being strongly limited due to their low number of informative sites. Also, because current DNA barcodes are based on a fraction of a single, fast-evolving gene, they are highly unsuitable for resolving deeper phylogenetic relationships due to saturation. In recent years, methods that analyse hundreds and thousands of loci at once have improved the resolution of the Tree of Life, but these methods require resources, experience and molecular laboratories that most taxonomists do not have. This paper introduces a PCR-based protocol that produces long amplicons of both slow- and fast-evolving unlinked mitochondrial and nuclear gene regions, which can be sequenced by the affordable and portable ONT MinION platform with low infrastructure or funding requirements. As a proof of concept, we inferred a phylogeny of a sample of 63 spider species from 20 families using our proposed protocol. The results were overall consistent with the results from approaches based on hundreds and thousands of loci, while requiring just a fraction of the cost and labour of such approaches, making our protocol accessible to taxonomists worldwide.


Subjects
Taxonomic DNA Barcoding, DNA, Humans, Phylogeny, Cost-Benefit Analysis, DNA/chemistry, DNA Sequence Analysis/methods, Taxonomic DNA Barcoding/methods
16.
Vet Res Commun ; 48(2): 827-837, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37955753

ABSTRACT

This study investigates suspected African swine fever (ASF) outbreaks in two villages of Kannur district in Kerala, India, with the aim of identifying the causative agent and its genotype, the source of infection, and estimating the economic losses due to the outbreaks. Clinically, the disease was acute with high mortality, while gross pathology was characterized by widespread haemorrhages in various organs, especially the spleen, which was dark, enlarged and had friable cut surfaces with diffuse haemorrhages. Notably, histopathological examination revealed multifocal, diffuse haemorrhages in the splenic parenchyma and lymphoid depletion accompanied by lymphoid cell necrosis. The clinico-pathological observations were suggestive of ASF, which was confirmed by PCR. The source of the outbreak was identified as swill, and epidemic curve analysis indicated that it was likely a point source infection. The phylogenetic analysis of the p72 gene identified the ASFV in the current outbreak as genotype II and IGR II variant, consistent with ASFVs detected in India thus far. However, the sequence analysis of the Central Variable Region (CVR) of the B602L gene showed that the ASFVs circulating in Kerala (South India) formed a separate clade along with those found in Mizoram (North East India), while ASFVs circulating in the Arunachal Pradesh and Assam states of India grouped into a different clade. This study represents the first investigation of an ASF outbreak in South India, establishing the genetic relatedness of the ASFV circulating in this region with that in other parts of the country. The study also underscores the utility of the CVR of the B602L gene in genetically characterizing highly similar Genotype II ASFVs to understand the spread of ASF within the country.


Subjects
African Swine Fever Virus, African Swine Fever, Swine Diseases, Swine, Animals, African Swine Fever/epidemiology, Sus scrofa, African Swine Fever Virus/genetics, Phylogeny, DNA Sequence Analysis/veterinary, Disease Outbreaks/veterinary, Genotype, Hemorrhage/epidemiology, Hemorrhage/veterinary, Swine Diseases/epidemiology
17.
Article in English | MEDLINE | ID: mdl-38015671

ABSTRACT

The field of tumor phylogenetics focuses on studying the differences within cancer cell populations. The scientific community has made many efforts to build cancer progression models in order to understand the heterogeneity of such diseases. These models are highly dependent on the kind of data used for their construction; therefore, as the experimental technologies evolve, it is of major importance to exploit their peculiarities. In this work we describe a cancer progression model based on Single Cell DNA Sequencing data. When constructing the model, we focus on tailoring the formalism to the specificity of the data. We operate by defining a minimal set of assumptions needed to reconstruct a flexible DAG-structured model, capable of identifying progression beyond the limitation of the infinite sites assumption. Our proposal is conservative in the sense that we aim to neither discard nor infer knowledge which is not represented in the data. We provide simulations and analytical results to show the features of our model, test it on real data, and show how it can be integrated with other approaches to cope with input noise. Moreover, our framework can be exploited to produce simulated data that follows our theoretical assumptions. Finally, we provide an open source R implementation of our approach, called CIMICE, that is publicly available on BioConductor.


Subjects
Neoplasms, Humans, Markov Chains, Neoplasms/genetics, Neoplasms/pathology, Phylogeny, DNA Sequence Analysis
18.
BMC Bioinformatics ; 24(1): 461, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38062356

ABSTRACT

BACKGROUND: Basecalling long DNA sequences is a crucial step in nanopore-based DNA sequencing protocols. In recent years, the CTC-RNN model has become the leading basecalling model, supplanting preceding hidden Markov models (HMMs) that relied on pre-segmenting ion current measurements. However, the CTC-RNN model operates independently of prior biological and physical insights. RESULTS: We present a novel basecaller named Lokatt: explicit duration Markov model and residual-LSTM network. It leverages an explicit duration HMM (EDHMM) designed to model the nanopore sequencing processes. Trained on a newly generated library with methylation-free E. coli samples and MinION R9.4.1 chemistry, the Lokatt basecaller achieves a median single-read identity score of 0.930 and a genome coverage ratio of 99.750%, on par with the existing state-of-the-art structure when trained on the same datasets. CONCLUSION: Our research underlines the potential of incorporating prior knowledge into the basecalling processes, particularly through integrating HMMs and recurrent neural networks. The Lokatt basecaller showcases the efficacy of a hybrid approach, emphasizing its capacity to achieve high-quality basecalling performance while accommodating the nuances of nanopore sequencing. These outcomes pave the way for advanced basecalling methodologies, with potential implications for enhancing the accuracy and efficiency of nanopore-based DNA sequencing protocols.


Subjects
Nanopores, DNA/genetics, DNA Sequence Analysis/methods, Computer Neural Networks, Base Sequence, High-Throughput Nucleotide Sequencing/methods
20.
PLoS One ; 18(11): e0294283, 2023.
Article in English | MEDLINE | ID: mdl-38032990

ABSTRACT

Early detection of SARS-CoV-2 infection is key to managing the current global pandemic, as evidence shows the virus is most contagious on or before symptom onset. Here, we introduce a low-cost, high-throughput method for diagnosing and studying SARS-CoV-2 infection. Dubbed Pathogen-Oriented Low-Cost Assembly & Re-Sequencing (POLAR), this method amplifies the entirety of the SARS-CoV-2 genome. This contrasts with typical RT-PCR-based diagnostic tests, which amplify only a few loci. To achieve this goal, we combine a SARS-CoV-2 enrichment method developed by the ARTIC Network (https://artic.network/) with short-read DNA sequencing and de novo genome assembly. Using this method, we can reliably (>95% accuracy) detect SARS-CoV-2 at a concentration of 84 genome equivalents per milliliter (GE/mL). The vast majority of diagnostic methods meeting our analytical criteria that are currently authorized for use by the United States Food and Drug Administration with the Coronavirus Disease 2019 (COVID-19) Emergency Use Authorization require higher concentrations of the virus to achieve this degree of sensitivity and specificity. In addition, we can reliably assemble the SARS-CoV-2 genome in the sample, often with no gaps and perfect accuracy given sufficient viral load. The genotypic data in these genome assemblies enable more effective analysis of disease spread than is possible with an ordinary binary diagnostic. These data can also help identify vaccine and drug targets. Finally, we show that the diagnoses obtained using POLAR of positive and negative clinical nasal mid-turbinate swab samples 100% match those obtained in a clinical diagnostic lab using the Centers for Disease Control and Prevention's 2019-Novel Coronavirus test. Using POLAR, a single person can manually process 192 samples over an 8-hour experiment at the cost of ~$36 per patient (as of December 7th, 2022), enabling a 24-hour turnaround with sequencing and data analysis time. We anticipate that further testing and refinement will allow greater sensitivity using this approach.


Subjects
COVID-19, SARS-CoV-2, United States, Humans, SARS-CoV-2/genetics, COVID-19/diagnosis, COVID-19 Testing, Sensitivity and Specificity, DNA Sequence Analysis