Results 1 - 20 of 72
1.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38349061

ABSTRACT

Extrachromosomal circular DNA (eccDNA) is currently attracting considerable attention from researchers due to its significant impact on tumor biogenesis. High-throughput sequencing (HTS) methods for eccDNA identification are continually evolving. However, an efficient pipeline for the integrative and comprehensive analysis of eccDNA obtained from HTS data is still lacking. Here, we introduce eccDNA-pipe, an accessible software package that offers a user-friendly pipeline for conducting eccDNA analysis starting from raw sequencing data. The input data can come from various sequencing techniques such as whole-genome sequencing (WGS), Circle-seq and Circulome-seq, obtained through short-read sequencing or long-read sequencing. eccDNA-pipe presents a comprehensive solution for both upstream and downstream analysis, encompassing quality control and eccDNA identification in upstream analysis and downstream tasks such as eccDNA length distribution analysis, differential analysis of genes enriched with eccDNA and visualization of eccDNA structures. Notably, eccDNA-pipe automatically generates high-quality publication-ready plots. In summary, eccDNA-pipe provides a comprehensive and user-friendly pipeline for customized eccDNA analysis.


Subjects
Circular DNA , Neoplasms , Humans , Circular DNA/genetics , DNA/genetics , High-Throughput Nucleotide Sequencing , Whole Genome Sequencing
2.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38605641

ABSTRACT

Simulation of RNA-seq reads is critical in the assessment, comparison, benchmarking and development of bioinformatics tools. Yet the field of RNA-seq simulators has progressed little in the last decade. To address this need we have developed BEERS2, which combines a flexible and highly configurable design with detailed simulation of the entire library preparation and sequencing pipeline. BEERS2 takes input transcripts (typically full-length messenger RNA transcripts with polyA tails) from either customizable input or from CAMPAREE simulated RNA samples. It produces realistic reads of these transcripts in FASTQ, SAM or BAM format, with the SAM and BAM output containing the true alignment to the reference genome. It also produces true transcript-level quantification values. BEERS2 is designed to include the effects of polyA selection and RiboZero for ribosomal depletion, hexamer priming sequence biases, GC-content biases in polymerase chain reaction (PCR) amplification, barcode read errors and errors during PCR amplification. These characteristics combine to make BEERS2 the most complete simulation of RNA-seq to date. Finally, we demonstrate the use of BEERS2 by measuring the effect of several settings on the popular Salmon pseudoalignment algorithm.


Subjects
Genome , RNA , RNA-Seq , RNA Sequence Analysis , Computer Simulation , RNA/genetics , High-Throughput Nucleotide Sequencing
3.
Brief Bioinform ; 24(2), 2023 Mar 19.
Article in English | MEDLINE | ID: mdl-36917471

ABSTRACT

Metagenome assembly is an efficient approach to reconstruct microbial genomes from metagenomic sequencing data. Although short-read sequencing has been widely used for metagenome assembly, linked- and long-read sequencing have shown advantages in assembly by providing long-range DNA connectedness. Many metagenome assembly tools were developed to simplify the assembly graphs and resolve the repeats in microbial genomes. However, there remains no comprehensive evaluation of metagenomic sequencing technologies, and there is a lack of practical guidance on selecting the appropriate metagenome assembly tools. This paper presents a comprehensive benchmark of 19 commonly used assembly tools applied to metagenomic sequencing datasets obtained from simulation, mock communities or human gut microbiomes. These datasets were generated using mainstream sequencing platforms, such as Illumina and BGISEQ short-read sequencing, 10x Genomics linked-read sequencing, and PacBio and Oxford Nanopore long-read sequencing. The assembly tools were extensively evaluated against many criteria, which revealed that long-read assemblers generated high contig contiguity but failed to reveal some medium- and high-quality metagenome-assembled genomes (MAGs). Linked-read assemblers obtained the highest number of overall near-complete MAGs from the human gut microbiomes. Hybrid assemblers using both short- and long-read sequencing were promising methods to improve both total assembly length and the number of near-complete MAGs. This paper also discusses the running time and peak memory consumption of these assembly tools and provides practical guidance on selecting them.


Subjects
Metagenome , Microbiota , Humans , Benchmarking , Microbiota/genetics , Metagenomics/methods , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods
4.
Vet Res ; 54(1): 95, 2023 Oct 18.
Article in English | MEDLINE | ID: mdl-37853447

ABSTRACT

When resequencing animal genomes, some short reads cannot be mapped to the reference genome and are usually discarded. In this study, unmapped reads from 302 German Black Pied cattle were analyzed to identify potential pathogenic DNA. These unmapped reads were assembled and blasted against NCBI's database to identify bacterial and viral sequences. The results provided evidence for the presence of pathogens. We found sequences of Bovine parvovirus 3 and Mycoplasma species. These findings emphasize the information content of unmapped reads for gaining insight into bacterial and viral infections, which is important for veterinarians and epidemiologists.
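
The first step described above, setting aside reads that did not map, can be sketched as follows. This is a minimal illustration, assuming SAM-formatted alignment text; it relies only on the SAM convention that bit 0x4 of the FLAG field marks an unmapped segment. The function name and the toy records are hypothetical, and the abstract does not state which tools the study actually used.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_read_names(sam_lines):
    """Return the names of reads whose FLAG has the unmapped bit set."""
    names = []
    for line in sam_lines:
        if line.startswith("@"):  # header lines carry no alignments
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # an alignment line has 11 mandatory fields
            continue
        flag = int(fields[1])
        if flag & UNMAPPED_FLAG:
            names.append(fields[0])
    return names


# Toy records for illustration only (not data from the study).
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII",
]
print(unmapped_read_names(example))
```

Reads selected this way would then be assembled and searched against a sequence database, as the abstract describes.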


Subjects
Cattle Diseases , Virus Diseases , Cattle , Animals , DNA Sequence Analysis/veterinary , Whole Genome Sequencing/veterinary , Virus Diseases/veterinary , Bacteria/genetics , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/veterinary
5.
Biol Res ; 56(1): 42, 2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37468985

ABSTRACT

The human genome contains regions that cannot be adequately assembled or aligned using next-generation short-read sequencing technologies. More than 2500 genes are known to contain such 'dark' regions. In this study, we investigate the negative consequences of dark regions on gene discovery across a range of disease and study types, showing that dark regions are likely preventing researchers from identifying genetic variants relevant to human disease.


Subjects
Human Genome , High-Throughput Nucleotide Sequencing , Humans , Human Genome/genetics , DNA Sequence Analysis
6.
Genomics ; 114(2): 110277, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35104609

ABSTRACT

Sexual reproduction is a diverse and widespread process. In gonochoristic species, the differentiation of sexes occurs through diverse mechanisms, influenced by environmental and genetic factors. In most vertebrates, a master-switch gene is responsible for triggering a sex determination network. However, only a few genes have acquired master-switch functions, and this process is associated with the evolution of sex chromosomes, which have a significant influence on evolution. Additionally, their highly repetitive regions impose challenges for high-quality sequencing, even using high-throughput, state-of-the-art techniques. Here, we review the mechanisms involved in sex determination and their role in the evolution of species, particularly vertebrates, focusing on sex chromosomes and the challenges involved in sequencing these genomic elements. We also address the improvements provided by the growth of sequencing projects, which are generating a massive number of near-gapless, telomere-to-telomere, chromosome-level, phased assemblies and increasing the number and quality of sex-chromosome sequences available for further studies.


Subjects
Sex Chromosomes , Telomere , Animals , Nucleic Acid Repetitive Sequences , Sex Chromosomes/genetics , Telomere/genetics , Vertebrates/genetics
7.
Biochem Biophys Res Commun ; 621: 67-73, 2022 Sep 17.
Article in English | MEDLINE | ID: mdl-35810593

ABSTRACT

Nonsense-mediated mRNA decay (NMD) and its regulation play an important role in eliminating faulty transcripts and controlling gene expression. However, measuring NMD activity and characterizing its targets remain challenging. In this study, we set out to establish Nanopore direct RNA sequencing in combination with quantitative real-time PCR (qPCR) as a method for analyzing NMD activity and its targets in cultured cell lines and clinical tissue samples. Nanopore RNA sequencing could detect more isoforms than short-read sequencing, especially in identifying novel isoforms and predicting isoforms annotated with a premature termination codon (PTC). Changes in transcriptional isoforms of five genes (PRS, RPL12, SRSF2, PPIA, and TMEM208) could faithfully reflect NMD activity in the three cell lines and prostate cancer (PCA) samples. NMD activity in PCA samples varied, but some patients showed a trend toward increased activity. Together, Nanopore sequencing was superior to short-read sequencing in identifying NMD targets and evaluating NMD activity, and the NMD markers we screened may be used for measuring NMD activity in clinical patients.


Subjects
Nanopore Sequencing , Nanopores , Humans , Male , Membrane Proteins/metabolism , Nonsense Mediated mRNA Decay , Protein Isoforms/metabolism , RNA/metabolism , RNA Stability/genetics , Messenger RNA/genetics , Messenger RNA/metabolism , RNA Sequence Analysis
8.
Int J Mol Sci ; 23(4), 2022 Feb 15.
Article in English | MEDLINE | ID: mdl-35216262

ABSTRACT

Copy number variations (CNVs) are the predominant class of structural genomic variations involved in the processes of evolutionary adaptation, genomic disorders, and disease progression. Compared with single-nucleotide variants, there have been challenges associated with the detection of CNVs owing to their diverse sizes. However, the field has seen significant progress in the past 20-30 years. This has been made possible due to the rapid development of molecular diagnostic methods which ensure a more detailed view of the genome structure, further complemented by recent advances in computational methods. Here, we review the major approaches that have been used to routinely detect CNVs, ranging from cytogenetics to the latest sequencing technologies, and then cover their specific features.


Subjects
DNA Copy Number Variations/genetics , Genome/genetics , Genomics/methods , Cytogenetics/methods , Disease Progression , Humans , Single Nucleotide Polymorphism/genetics
9.
BMC Bioinformatics ; 21(1): 149, 2020 Apr 19.
Article in English | MEDLINE | ID: mdl-32306895

ABSTRACT

BACKGROUND: Typical experimental design advice for expression analyses using RNA-seq generally assumes that single-end reads provide robust gene-level expression estimates in a cost-effective manner, and that the additional benefits obtained from paired-end sequencing are not worth the additional cost. However, in many cases (e.g., with Illumina NextSeq and NovaSeq instruments), shorter paired-end reads and longer single-end reads can be generated for the same cost, and it is not obvious which strategy should be preferred. Using publicly available data, we test whether short paired-end reads can achieve more robust expression estimates and differential expression results than single-end reads of approximately the same total number of sequenced bases. RESULTS: At both the transcript and gene levels, 2 × 40 paired-end reads unequivocally provide expression estimates that are more highly correlated with 2 × 125 than 1 × 75 reads; in nearly all cases, those correlations are also greater than for 1 × 125, despite the greater total number of sequenced bases for the latter. Across an array of metrics, differential expression tests based upon 2 × 40 consistently outperform those using 1 × 75. CONCLUSION: Researchers seeking a cost-effective approach for gene-level expression analysis should prefer short paired-end reads over a longer single-end strategy. Short paired-end reads will also give reasonably robust expression estimates and differential expression results at the isoform level.
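
The comparison above rests on the number of bases sequenced per fragment under each read layout. The short sketch below simply works through that arithmetic for the four layouts named in the abstract; the function and dictionary names are illustrative assumptions, not part of the study.

```python
def bases_per_fragment(reads_per_fragment, read_length):
    """Bases sequenced from one fragment: number of reads times read length."""
    return reads_per_fragment * read_length


# Layouts named in the abstract, written as (reads per fragment, read length).
LAYOUTS = {
    "2x40": (2, 40),
    "1x75": (1, 75),
    "1x125": (1, 125),
    "2x125": (2, 125),
}

for name, (n_reads, length) in LAYOUTS.items():
    print(name, bases_per_fragment(n_reads, length))
```

Under this count, 2 × 40 and 1 × 75 yield a similar number of bases per fragment (80 versus 75), while 1 × 125 yields more than 2 × 40, which matches the abstract's remark about the greater total number of sequenced bases for the latter.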


Subjects
Gene Expression Profiling/methods , Gene Expression/genetics
10.
BMC Genomics ; 21(1): 840, 2020 Nov 27.
Article in English | MEDLINE | ID: mdl-33246410

ABSTRACT

BACKGROUND: Copy number variations (CNVs) are a major form of genetic variation and are involved in animal domestication and genetic adaptation to local environments. We investigated CNVs in the domestic goat (Capra hircus) using Illumina short-read sequencing data, by comparing our lab data for 38 goats from three Chinese breeds (Chengdu Brown, Jintang Black, and Tibetan Cashmere) to public data for 26 individuals from three other breeds (two Moroccan and one Chinese) and 21 samples from Bezoar ibexes. RESULTS: We obtained a total of 2394 CNV regions (CNVRs) by merging 208,649 high-confidence CNVs, which spanned ~ 267 Mb of total length and accounted for 10.80% of the goat autosomal genome. Functional analyses showed that 2322 genes overlapping with the CNVRs were significantly enriched in 57 functional GO terms and KEGG pathways, most related to the nervous system, metabolic process, and reproduction system. Clustering patterns of all 85 samples generated separately from duplications and deletions were generally consistent with the results from SNPs, agreeing with the geographical origins of these goats. Based on genome-wide FST at each CNV locus, some genes overlapping with the highly divergent CNVs between domestic and wild goats were mainly enriched for several immunity-related pathways, whereas the genes overlapping with the highly differentiated CNVs between highland and lowland goats were mainly related to vitamin and lipid metabolism. Remarkably, a 507-bp deletion at ~ 14 kb downstream of FGF5 on chromosome 6 was highly divergent (FST = 0.973) between the highland and lowland goats. Together with an enhancer activity of this sequence shown previously, the function of this variant in regulating fiber growth deserves further detailed investigation. CONCLUSION: We generated a comprehensive map of CNVs in goats. Many genetically differentiated CNVs among various goat populations might be associated with the population characteristics of domestic goat breeds.
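
Collapsing individual CNV calls into CNV regions, as described above, is at its core an interval-merging problem. The sketch below is one simple way to do it, assuming that two calls are merged whenever they overlap on the same chromosome; the abstract does not state the exact merging rule used, and the function name and coordinates are hypothetical.

```python
def merge_cnvs(cnvs):
    """Merge overlapping (chrom, start, end) calls into non-overlapping regions."""
    regions = []
    # Sorting by chromosome, then start, lets us merge in a single pass.
    for chrom, start, end in sorted(cnvs):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Overlaps the previous region: extend it if this call reaches further.
            prev_chrom, prev_start, prev_end = regions[-1]
            regions[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            regions.append((chrom, start, end))
    return regions


# Toy calls for illustration only (not data from the study).
calls = [
    ("chr6", 100, 500),
    ("chr6", 400, 900),
    ("chr6", 2000, 2500),
    ("chr1", 50, 80),
]
print(merge_cnvs(calls))
```

Published pipelines often apply stricter criteria, such as a minimum reciprocal overlap, so this should be read as the simplest possible variant of the idea.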


Subjects
Bezoars , DNA Copy Number Variations , Animals , Population Genetics , Goats/genetics , High-Throughput Nucleotide Sequencing
11.
Article in English | MEDLINE | ID: mdl-32094139

ABSTRACT

Carbapenem resistance in Enterobacterales is a public health threat. Klebsiella pneumoniae carbapenemase (encoded by alleles of the blaKPC family) is one of the most common transmissible carbapenem resistance mechanisms worldwide. The dissemination of blaKPC historically has been associated with distinct K. pneumoniae lineages (clonal group 258 [CG258]), a particular plasmid family (pKpQIL), and a composite transposon (Tn4401). In the United Kingdom, blaKPC has represented a large-scale, persistent management challenge for some hospitals, particularly in North West England. The dissemination of blaKPC has evolved to be polyclonal and polyspecies, but the genetic mechanisms underpinning this evolution have not been elucidated in detail; this study used short-read whole-genome sequencing of 604 blaKPC-positive isolates (Illumina) and long-read assembly (PacBio)/polishing (Illumina) of 21 isolates for characterization. We observed the dissemination of blaKPC (predominantly blaKPC-2; 573/604 [95%] isolates) across eight species and more than 100 known sequence types. Although there was some variation at the transposon level (mostly Tn4401a, 584/604 [97%] isolates; predominantly with ATTGA-ATTGA target site duplications, 465/604 [77%] isolates), blaKPC spread appears to have been supported by highly fluid, modular exchange of larger genetic segments among plasmid populations dominated by IncFIB (580/604 isolates), IncFII (545/604 isolates), and IncR (252/604 isolates) replicons. The subset of reconstructed plasmid sequences (21 isolates, 77 plasmids) also highlighted modular exchange among non-blaKPC and blaKPC plasmids and the common presence of multiple replicons within blaKPC plasmid structures (>60%). The substantial genomic plasticity observed has important implications for our understanding of the epidemiology of transmissible carbapenem resistance in Enterobacterales for the implementation of adequate surveillance approaches and for control.


Subjects
Bacterial Proteins/genetics , Bacterial Drug Resistance/genetics , Enterobacteriaceae/drug effects , Enterobacteriaceae/genetics , Molecular Epidemiology , Plasmids/genetics , beta-Lactamases/genetics , Anti-Bacterial Agents/pharmacology , Carbapenems/pharmacology , Bacterial DNA/chemistry , Bacterial DNA/genetics , Enterobacteriaceae Infections/epidemiology , Enterobacteriaceae Infections/genetics , Enterobacteriaceae Infections/microbiology , Bacterial Genome , Humans , Klebsiella Infections/epidemiology , Retrospective Studies , United Kingdom/epidemiology , Whole Genome Sequencing
12.
J Clin Microbiol ; 58(10), 2020 Sep 22.
Article in English | MEDLINE | ID: mdl-32669382

ABSTRACT

Viral genetic sequencing can be used to monitor the spread of HIV drug resistance, identify appropriate antiretroviral regimes, and characterize transmission dynamics. Despite decreasing costs, next-generation sequencing (NGS) is still prohibitively costly for routine use in generalized HIV epidemics in low- and middle-income countries. Here, we present veSEQ-HIV, a high-throughput, cost-effective NGS method and computational pipeline tailored specifically to HIV, which can be performed using leftover blood drawn for routine CD4 cell count testing. This method overcomes several major technical challenges that have prevented HIV sequencing from being used routinely in public health efforts; it is fast, robust, and cost-efficient, and generates full genomic sequences of diverse strains of HIV without bias. The complete veSEQ-HIV pipeline provides viral load estimates and quantitative summaries of drug resistance mutations; it also exploits information on within-host viral diversity to construct directed transmission networks. We evaluated the method's performance using 1,620 plasma samples collected from individuals attending 10 large urban clinics in Zambia as part of the HPTN 071-2 study (PopART Phylogenetics). Whole HIV genomes were recovered from 91% of samples with a viral load of >1,000 copies/ml. The cost of the assay (30 GBP per sample) compares favorably with existing VL and HIV genotyping tests, providing an affordable option for combining HIV clinical monitoring with molecular epidemiology and drug resistance surveillance in low-income settings.


Subjects
Anti-HIV Agents , HIV Infections , HIV-1 , Anti-HIV Agents/therapeutic use , Viral Drug Resistance/genetics , Genomics , HIV Infections/diagnosis , HIV Infections/drug therapy , HIV Infections/epidemiology , Humans , Viral Load , Zambia
13.
Int J Mol Sci ; 20(17), 2019 Aug 23.
Article in English | MEDLINE | ID: mdl-31450745

ABSTRACT

Avocado (Persea americana Mill.) is an economically important crop because of its high nutritional value. However, the absence of a sequenced avocado reference genome has hindered investigations of secondary metabolism. Using next-generation high-throughput transcriptome sequencing, we obtained 365,615,152 and 348,623,402 clean reads as well as 109.13 and 104.10 Gb of sequencing data for avocado mesocarp and seed, respectively, during five developmental stages. High-quality reads were assembled into 100,837 unigenes with an average length of 847.40 bp (N50 = 1725 bp). Additionally, 16,903 differentially expressed genes (DEGs) were detected, 17 of which were related to carotenoid biosynthesis. The expression levels of most of these 17 DEGs were higher in the mesocarp than in the seed during five developmental stages. In this study, the avocado mesocarp and seed transcriptomes were also sequenced using single-molecule long-read sequencing to acquire 25.79 and 17.67 Gb of clean data, respectively. We identified 233,014 and 238,219 consensus isoforms in avocado mesocarp and seed, respectively. Furthermore, 104 and 59 isoforms were found to correspond to the putative 11 carotenoid biosynthetic-related genes in the avocado mesocarp and seed, respectively. The isoform numbers of 10 out of the putative 11 genes involved in the carotenoid biosynthetic pathway were higher in the mesocarp than those in the seed. In addition, alpha- and beta-carotene contents in the avocado mesocarp and seed during five developmental stages were also measured, and they were higher in the mesocarp than in the seed, which validated the results of transcriptome profiling. Gene expression changes and the associated variations in gene dosage could influence carotenoid biosynthesis. These results will help to further elucidate carotenoid biosynthesis in avocado.
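
The abstract reports an N50 alongside the average unigene length. As a reminder of what that statistic measures, the sketch below computes N50 under its usual definition: the length L such that sequences of length L or longer together hold at least half of all assembled bases. The function name and the toy lengths are assumptions for illustration; the study's own assembly software would report this value directly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy lengths for illustration only (not data from the study).
print(n50([2, 3, 4, 5, 6]))
```

Because N50 is weighted by sequence length, it generally sits above the arithmetic mean when the length distribution is skewed toward many short sequences, which is consistent with the reported 1725 bp N50 against an 847.40 bp average.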


Subjects
Carotenoids/metabolism , Plant Gene Expression Regulation , Persea/genetics , Persea/metabolism , Seeds/genetics , Seeds/metabolism , Transcriptome , Biosynthetic Pathways , Computational Biology/methods , Gene Dosage , Gene Expression Profiling , Gene Ontology , Metabolome , Metabolomics/methods , Molecular Sequence Annotation , Plant Development/genetics
14.
Clin Genet ; 93(3): 508-519, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29206278

ABSTRACT

High-throughput sequencing (HTS) has revolutionized genetics by enabling the detection of sequence variants at hitherto unprecedented large scale. Despite these advances, however, there are still remaining challenges in the complete coverage of targeted regions (genes, exome or genome) as well as in HTS data analysis and interpretation. Moreover, it is easy to get overwhelmed by the plethora of available methods and tools for HTS. Here, we review the step-by-step process from the generation of sequence data to molecular diagnosis of Mendelian diseases. Highlighting advantages and limitations, this review addresses the current state of (1) HTS technologies, considering targeted, whole-exome, and whole-genome sequencing on short- and long-read platforms; (2) read alignment, variant calling and interpretation; as well as (3) regulatory issues related to genetic counseling, reimbursement, and data storage.


Subjects
Human Genome , Genomics , High-Throughput Nucleotide Sequencing , Genetic Association Studies/methods , Genetic Counseling , Genetic Predisposition to Disease , Genetic Variation , Genome-Wide Association Study/methods , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Health Insurance Reimbursement , DNA Sequence Analysis
15.
Evol Appl ; 17(3): e13653, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38495945

ABSTRACT

Genomic structural variants (SVs) are now recognized as an integral component of intraspecific polymorphism and are known to contribute to evolutionary processes in various organisms. However, they are inherently difficult to detect and genotype from readily available short-read sequencing data, and therefore remain poorly documented in wild populations. Salmonid species displaying strong interpopulation variability in both life history traits and habitat characteristics, such as Atlantic salmon (Salmo salar), offer a prime context for studying adaptive polymorphism, but the contribution of SVs to fine-scale local adaptation has yet to be explored. Here, we performed a comparative analysis of SVs, single nucleotide polymorphisms (SNPs) and small indels (<50 bp) segregating in the Romaine and Puyjalon salmon, two putatively locally adapted populations inhabiting neighboring rivers (Québec, Canada) and showing pronounced variation in life history traits, namely growth, fecundity, and age at maturity and smoltification. We first catalogued polymorphism using a hybrid SV characterization approach pairing both short- (16X) and long-read sequencing (20X) for variant discovery with graph-based genotyping of SVs across 60 salmon genomes, along with characterization of SNPs and small indels from short reads. We thus identified 115,907 SVs, 8,777,832 SNPs and 1,089,321 short indels, with SVs covering 4.8 times more base pairs than SNPs. All three variant types revealed a highly congruent population structure and similar patterns of FST and density variation along the genome. Finally, we performed outlier detection and redundancy analysis (RDA) to identify variants of interest in the putative local adaptation of Romaine and Puyjalon salmon. Genes located near these variants were enriched for biological processes related to nervous system function, suggesting that observed variation in traits such as age at smoltification could arise from differences in neural development. This study therefore demonstrates the feasibility of large-scale SV characterization and highlights its relevance for salmonid population genomics.

16.
Methods Mol Biol ; 2822: 245-262, 2024.
Article in English | MEDLINE | ID: mdl-38907923

ABSTRACT

RNA sequencing (RNA-Seq) has emerged as a powerful and versatile tool for the comprehensive analysis of transcriptomes and has been widely used to investigate gene expression, copy number variation, alternative splicing, and novel transcript discovery. This chapter outlines the methodology for conducting short-read RNA-Seq, starting from RNA enrichment to library preparation and sequencing. Throughout the chapter, practical tips and best practices are provided to guide researchers in order to optimize each step of the RNA-Seq workflow. Multiple quality control steps throughout the workflow that are critical to obtain high-quality RNA-Seq data are also discussed.


Subjects
RNA-Seq , Humans , RNA-Seq/methods , Gene Expression Profiling/methods , Transcriptome/genetics , RNA Sequence Analysis/methods , Gene Library , High-Throughput Nucleotide Sequencing/methods , Quality Control , RNA/genetics , Workflow , Software , Alternative Splicing/genetics , Computational Biology/methods
17.
Methods Mol Biol ; 2833: 161-183, 2024.
Article in English | MEDLINE | ID: mdl-38949710

ABSTRACT

Outbreaks are a risk to public health particularly when pathogenic, hypervirulent, and/or multidrug-resistant organisms (MDROs) are involved. In a hospital setting, vulnerable populations such as the immunosuppressed, intensive care patients, and neonates are most at risk. Rapid and accurate outbreak detection is essential to implement effective interventions in clinical areas to control and stop further transmission. Advances in the field of whole genome sequencing (WGS) have resulted in lowered costs, increased capacity, and improved reproducibility of results. WGS now has the potential to revolutionize the investigation and management of outbreaks replacing conventional genotyping and other discrimination systems. Here, we outline specific procedures and protocols to implement WGS into investigation of outbreaks in healthcare settings.


Subjects
Disease Outbreaks , Genomics , Whole Genome Sequencing , Humans , Whole Genome Sequencing/methods , Genomics/methods , Bacterial Genome
18.
Article in English | MEDLINE | ID: mdl-38862430

ABSTRACT

Tandem duplication (TD) is a major type of structural variation (SV) that plays an important role in novel gene formation and human diseases. However, TDs are often missed or incorrectly classified as insertions by most modern SV detection methods due to the lack of specialized operation on TD-related mutational signals. Herein, we developed a TD detection module for the Pindel tool, referred to as Pindel-TD, based on a TD-specific pattern growth approach. Pindel-TD is capable of detecting TDs with a wide size range at single-nucleotide resolution. Using simulated and real read data from HG002, we demonstrated that Pindel-TD outperforms other leading methods in terms of precision, recall, F1-score, and robustness. Furthermore, by applying Pindel-TD to data generated from the K562 cancer cell line, we identified a TD located at the seventh exon of SAGE1, providing an explanation for its high expression. Pindel-TD is available for non-commercial use at https://github.com/xjtu-omics/pindel.
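
The benchmark metrics named above (precision, recall and F1-score) follow standard definitions based on counts of true positives, false positives and false negatives. The sketch below shows that calculation; the function name and the example counts are hypothetical and are not results from the study, and how a predicted TD is matched to a true one is left outside the sketch.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall and F1 from call counts.

    Each ratio is defined as 0.0 when its denominator is zero.
    """
    called = true_positives + false_positives
    real = true_positives + false_negatives
    precision = true_positives / called if called else 0.0
    recall = true_positives / real if real else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 90 correct calls, 10 spurious calls, 30 missed events.
precision, recall, f1 = precision_recall_f1(90, 10, 30)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```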


Subjects
Software , Humans , K562 Cells , Gene Duplication , Tandem Repeat Sequences/genetics , Algorithms
19.
Genome Biol ; 25(1): 274, 2024 Oct 17.
Article in English | MEDLINE | ID: mdl-39420419

ABSTRACT

The extremely high levels of genetic polymorphism within the human major histocompatibility complex (MHC) limit the usefulness of reference-based alignment methods for sequence assembly. We incorporate a short-read, de novo assembly algorithm into a workflow for novel application to the MHC. MHConstructor is a containerized pipeline designed for high-throughput, haplotype-informed, reproducible assembly of both whole genome sequencing and target capture short-read data in large population cohorts. To date, no other self-contained tool exists for the generation of de novo MHC assemblies from short-read data. MHConstructor facilitates widespread access to high-quality, alignment-free MHC sequence analysis.


Subjects
Haplotypes , Major Histocompatibility Complex , Humans , Major Histocompatibility Complex/genetics , Software , High-Throughput Nucleotide Sequencing/methods , Algorithms
20.
Front Vet Sci ; 11: 1443855, 2024.
Article in English | MEDLINE | ID: mdl-39144078

ABSTRACT

Introduction: Spillover events of Mycoplasma ovipneumoniae have devastating effects on the wild sheep populations. Multilocus sequence typing (MLST) is used to monitor spillover events and the spread of M. ovipneumoniae between the sheep populations. Most studies involving the typing of M. ovipneumoniae have used Sanger sequencing. However, this technology is time-consuming, expensive, and is not well suited to efficient batch sample processing. Methods: Our study aimed to develop and validate an MLST workflow for typing of M. ovipneumoniae using Nanopore Rapid Barcoding sequencing and multiplex polymerase chain reaction (PCR). We compare the workflow with Nanopore Native Barcoding library preparation and Illumina MiSeq amplicon protocols to determine the most accurate and cost-effective method for sequencing multiplex amplicons. A multiplex PCR was optimized for four housekeeping genes of M. ovipneumoniae using archived DNA samples (N = 68) from nasal swabs. Results: Sequences recovered from Nanopore Rapid Barcoding correctly identified all MLST types with the shortest total workflow time and lowest cost per sample when compared with Nanopore Native Barcoding and Illumina MiSeq methods. Discussion: Our proposed workflow is a convenient and effective method for strain typing of M. ovipneumoniae and can be applied to other bacterial MLST schemes. The workflow is suitable for diagnostic settings, where reduced hands-on time, cost, and multiplexing capabilities are important.
