Search | VHL Regional Portal

1.

Efficient inference of large prokaryotic pangenomes with PanTA.

Le, Duc Quang; Nguyen, Tien Anh; Nguyen, Son Hoang; Nguyen, Tam Thi; Nguyen, Canh Hao; Phung, Huong Thanh; Ho, Tho Huu; Vo, Nam S; Nguyen, Trang; Nguyen, Hoang Anh; Cao, Minh Duc.

Genome Biol ; 25(1): 209, 2024 Aug 06.

Article in English | MEDLINE | ID: mdl-39107817

ABSTRACT

Pangenome inference is an indispensable step in bacterial genomics, yet its scalability poses a challenge due to the rapid growth of genomic collections. This paper presents PanTA, a software package designed for constructing pangenomes of large bacterial datasets, showing unprecedented efficiency levels multiple times higher than existing tools. PanTA introduces a novel mechanism to construct the pangenome progressively without rebuilding the accumulated collection from scratch. The progressive mode is shown to consume orders of magnitude less computational resources than existing solutions in managing growing datasets. The software is open source and is publicly available at https://github.com/amromics/panta and at 10.6084/m9.figshare.23724705 .

Subject(s)

Genome, Bacterial , Software , Genomics/methods , Bacteria/genetics , Phylogeny

2.

AMRomics: a scalable workflow to analyze large microbial genome collections.

Le, Duc Quang; Nguyen, Tam Thi; Nguyen, Canh Hao; Ho, Tho Huu; Vo, Nam S; Nguyen, Trang; Nguyen, Hoang Anh; Vinh, Le Sy; Dang, Thanh Hai; Cao, Minh Duc; Nguyen, Son Hoang.

BMC Genomics ; 25(1): 709, 2024 Jul 22.

Article in English | MEDLINE | ID: mdl-39039439

ABSTRACT

Whole genome analysis for microbial genomics is critical to studying and monitoring antimicrobial resistance strains. The exponential growth of microbial sequencing data necessitates a fast and scalable computational pipeline to generate the desired outputs in a timely and cost-effective manner. Recent methods have been implemented to integrate individual genomes into large collections of specific bacterial populations and are widely employed for systematic genomic surveillance. However, they do not scale well when the population expands and turnaround time remains the main issue for this type of analysis. Here, we introduce AMRomics, an optimized microbial genomics pipeline that can work efficiently with big datasets. We use different bacterial data collections to compare AMRomics against competitive tools and show that our pipeline can generate similar results of interest but with better performance. The software is open source and is publicly available at https://github.com/amromics/amromics under an MIT license.

Subject(s)

Genome, Bacterial , Genomics , Software , Workflow , Genomics/methods , Computational Biology/methods , Bacteria/genetics , Genome, Microbial , Drug Resistance, Bacterial/genetics

3.

AMRViz enables seamless genomics analysis and visualization of antimicrobial resistance.

Le, Duc Quang; Nguyen, Son Hoang; Nguyen, Tam Thi; Nguyen, Canh Hao; Ho, Tho Huu; Vo, Nam S; Nguyen, Trang; Nguyen, Hoang Anh; Cao, Minh Duc.

BMC Bioinformatics ; 25(1): 193, 2024 May 16.

Article in English | MEDLINE | ID: mdl-38755527

ABSTRACT

We have developed AMRViz, a toolkit for analyzing, visualizing, and managing bacterial genomics samples. The toolkit is bundled with the current best practice analysis pipeline allowing researchers to perform comprehensive analysis of a collection of samples directly from raw sequencing data with a single command line. The analysis results in a report showing the genome structure, genome annotations, antibiotic resistance and virulence profile for each sample. The pan-genome of all samples of the collection is analyzed to identify core- and accessory-genes. Phylogenies of the whole genome as well as all gene clusters are also generated. The toolkit provides a web-based visualization dashboard allowing researchers to interactively examine various aspects of the analysis results. Availability: AMRViz is implemented in Python and NodeJS, and is publicly available under open source MIT license at https://github.com/amromics/amrviz .

Subject(s)

Genome, Bacterial , Genomics , Software , Genomics/methods , Drug Resistance, Bacterial/genetics , Phylogeny , Bacteria/genetics , Bacteria/drug effects , Anti-Bacterial Agents/pharmacology

4.

Pasa: leveraging population pangenome graph to scaffold prokaryote genome assemblies.

Do, Van Hoan; Nguyen, Son Hoang; Le, Duc Quang; Nguyen, Tam Thi; Nguyen, Canh Hao; Ho, Tho Huu; Vo, Nam S; Nguyen, Trang; Nguyen, Hoang Anh; Cao, Minh Duc.

Nucleic Acids Res ; 52(3): e15, 2024 Feb 09.

Article in English | MEDLINE | ID: mdl-38084888

ABSTRACT

Whole genome sequencing has increasingly become the essential method for studying the genetic mechanisms of antimicrobial resistance and for surveillance of drug-resistant bacterial pathogens. The majority of bacterial genomes sequenced to date have been sequenced with Illumina sequencing technology, owing to its high-throughput, excellent sequence accuracy, and low cost. However, because of the short-read nature of the technology, these assemblies are fragmented into large numbers of contigs, hindering the obtaining of full information of the genome. We develop Pasa, a graph-based algorithm that utilizes the pangenome graph and the assembly graph information to improve scaffolding quality. By leveraging the population information of the bacteria species, Pasa is able to utilize the linkage information of the gene families of the species to resolve the contig graph of the assembly. We show that our method outperforms the current state of the arts in terms of accuracy, and at the same time, is computationally efficient to be applied to a large number of existing draft assemblies.

Subject(s)

Algorithms , Bacteria , Genome, Bacterial , Bacteria/classification , Bacteria/genetics , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods

5.

Individualized, heterologous chimpanzee adenovirus and self-amplifying mRNA neoantigen vaccine for advanced metastatic solid tumors: phase 1 trial interim results.

Palmer, Christine D; Rappaport, Amy R; Davis, Matthew J; Hart, Meghan G; Scallan, Ciaran D; Hong, Sue-Jean; Gitlin, Leonid; Kraemer, Lauren D; Kounlavouth, Sonia; Yang, Aaron; Smith, Lindsey; Schenk, Desiree; Skoberne, Mojca; Taquechel, Kiara; Marrali, Martina; Jaroslavsky, Jason R; Nganje, Charmaine N; Maloney, Elizabeth; Zhou, Rita; Navarro-Gomez, Daniel; Greene, Adrienne C; Grotenbreg, Gijsbert; Greer, Renee; Blair, Wade; Cao, Minh Duc; Chan, Shawn; Bae, Kyounghwa; Spira, Alexander I; Roychowdhury, Sameek; Carbone, David P; Henick, Brian S; Drake, Charles G; Solomon, Benjamin J; Ahn, Daniel H; Mahipal, Amit; Maron, Steve B; Johnson, Benny; Rousseau, Raphael; Yelensky, Roman; Liao, Chih-Yi; Catenacci, Daniel V T; Allen, Andrew; Ferguson, Andrew R; Jooss, Karin.

Nat Med ; 28(8): 1619-1629, 2022 08.

Article in English | MEDLINE | ID: mdl-35970920

ABSTRACT

Checkpoint inhibitor (CPI) therapies provide limited benefit to patients with tumors of low immune reactivity. T cell-inducing vaccines hold promise to exert long-lasting disease control in combination with CPI therapy. Safety, tolerability and recommended phase 2 dose (RP2D) of an individualized, heterologous chimpanzee adenovirus (ChAd68) and self-amplifying mRNA (samRNA)-based neoantigen vaccine in combination with nivolumab and ipilimumab were assessed as primary endpoints in an ongoing phase 1/2 study in patients with advanced metastatic solid tumors (NCT03639714). The individualized vaccine regimen was safe and well tolerated, with no dose-limiting toxicities. Treatment-related adverse events (TRAEs) >10% included pyrexia, fatigue, musculoskeletal and injection site pain and diarrhea. Serious TRAEs included one count each of pyrexia, duodenitis, increased transaminases and hyperthyroidism. The RP2D was 10^12 viral particles (VP) ChAd68 and 30 µg samRNA. Secondary endpoints included immunogenicity, feasibility of manufacturing and overall survival (OS). Vaccine manufacturing was feasible, with vaccination inducing long-lasting neoantigen-specific CD8 T cell responses. Several patients with microsatellite-stable colorectal cancer (MSS-CRC) had improved OS. Exploratory biomarker analyses showed decreased circulating tumor DNA (ctDNA) in patients with prolonged OS. Although small study size limits statistical and translational analyses, the increased OS observed in MSS-CRC warrants further exploration in larger randomized studies.

Subject(s)

Colorectal Neoplasms , Pan troglodytes , Adenoviridae/genetics , Animals , Colorectal Neoplasms/drug therapy , Fever , Humans , RNA, Messenger/therapeutic use

6.

Whole genome deep sequencing analysis of cell-free DNA in samples with low tumour content.

Ganesamoorthy, Devika; Robertson, Alan James; Chen, Wenhan; Hall, Michael B; Cao, Minh Duc; Ferguson, Kaltin; Lakhani, Sunil R; Nones, Katia; Simpson, Peter T; Coin, Lachlan J M.

BMC Cancer ; 22(1): 85, 2022 Jan 20.

Article in English | MEDLINE | ID: mdl-35057759

ABSTRACT

BACKGROUND: Circulating cell-free DNA (cfDNA) in the plasma of cancer patients contains cell-free tumour DNA (ctDNA) derived from tumour cells and it has been widely recognized as a non-invasive source of tumour DNA for diagnosis and prognosis of cancer. Molecular profiling of ctDNA is often performed using targeted sequencing or low-coverage whole genome sequencing (WGS) to identify tumour specific somatic mutations or somatic copy number aberrations (sCNAs). However, these approaches cannot efficiently detect all tumour-derived genomic changes in ctDNA. METHODS: We performed WGS analysis of cfDNA from 4 breast cancer patients and 2 patients with benign tumours. We sequenced matched germline DNA for all 6 patients and tumour samples from the breast cancer patients. All samples were sequenced on Illumina HiSeqXTen sequencing platform and achieved approximately 30x, 60x and 100x coverage on germline, tumour and plasma DNA samples, respectively. RESULTS: The mutational burden of the plasma samples (1.44 somatic mutations/Mb of genome) was higher than the matched tumour samples. However, 90% of high confidence somatic cfDNA variants were not detected in matched tumour samples and were found to comprise two background plasma mutational signatures. In contrast, cfDNA from the di-nucleosome fraction (300 bp-350 bp) had much higher proportion (30%) of variants shared with tumour. Despite high coverage sequencing we were unable to detect sCNAs in plasma samples. CONCLUSIONS: Deep sequencing analysis of plasma samples revealed higher fraction of unique somatic mutations in plasma samples, which were not detected in matched tumour samples. Sequencing of di-nucleosome bound cfDNA fragments may increase recovery of tumour mutations from plasma.

Subject(s)

Breast Neoplasms/genetics , Circulating Tumor DNA/blood , DNA Mutational Analysis/methods , High-Throughput Nucleotide Sequencing/methods , Whole Genome Sequencing/methods , Adult , Biomarkers, Tumor/genetics , Breast Neoplasms/blood , Female , Humans , Mutation , Prognosis

7.

Real-time resolution of short-read assembly graph using ONT long reads.

Nguyen, Son Hoang; Cao, Minh Duc; Coin, Lachlan J M.

PLoS Comput Biol ; 17(1): e1008586, 2021 01.

Article in English | MEDLINE | ID: mdl-33471816

ABSTRACT

A streaming assembly pipeline utilising real-time Oxford Nanopore Technology (ONT) sequencing data is important for saving sequencing resources and reducing time-to-result. A previous approach implemented in npScarf provided an efficient streaming algorithm for hybrid assembly but was relatively prone to mis-assemblies compared to other graph-based methods. Here we present npGraph, a streaming hybrid assembly tool using the assembly graph instead of the separated pre-assembly contigs. It is able to produce more complete genome assembly by resolving the path finding problem on the assembly graph using long reads as the traversing guide. Application to synthetic and real data from bacterial isolate genomes show improved accuracy while still maintaining a low computational cost. npGraph also provides a graphical user interface (GUI) which provides a real-time visualisation of the progress of assembly. The tool and source code is available at https://github.com/hsnguyen/assembly.

Subject(s)

Nanopores , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Algorithms , Computational Biology , DNA, Bacterial/analysis , DNA, Bacterial/genetics , Genome, Bacterial/genetics , Nanotechnology , Software , User-Computer Interface

8.

Assembly of whole-chromosome pseudomolecules for polyploid plant genomes using outbred mapping populations.

Zhou, Chenxi; Olukolu, Bode; Gemenet, Dorcus C; Wu, Shan; Gruneberg, Wolfgang; Cao, Minh Duc; Fei, Zhangjun; Zeng, Zhao-Bang; George, Andrew W; Khan, Awais; Yencho, G Craig; Coin, Lachlan J M.

Nat Genet ; 52(11): 1256-1264, 2020 11.

Article in English | MEDLINE | ID: mdl-33128049

ABSTRACT

Despite advances in sequencing technologies, assembly of complex plant genomes remains elusive due to polyploidy and high repeat content. Here we report PolyGembler for grouping and ordering contigs into pseudomolecules by genetic linkage analysis. Our approach also provides an accurate method with which to detect and fix assembly errors. Using simulated data, we demonstrate that our approach is of high accuracy and outperforms three existing state-of-the-art genetic mapping tools. Particularly, our approach is more robust to the presence of missing genotype data and genotyping errors. We used our method to construct pseudomolecules for allotetraploid lawn grass utilizing PacBio long reads in combination with restriction site-associated DNA sequencing, and for diploid Ipomoea trifida and autotetraploid potato utilizing contigs assembled from Illumina reads in combination with genotype data generated by single-nucleotide polymorphism arrays and genotyping by sequencing, respectively. We resolved 13 assembly errors for a published I. trifida genome assembly and anchored eight unplaced scaffolds in the published potato genome.

Subject(s)

Algorithms , Chromosomes, Plant , Genetic Linkage , Genome, Plant , Polyploidy , Computer Simulation , Genotype , Ipomoea/genetics , Plant Breeding , Poaceae/genetics , Protein Array Analysis , Solanum tuberosum/genetics

9.

Retooling phage display with electrohydrodynamic nanomixing and nanopore sequencing.

Raftery, Lyndon J; Howard, Christopher B; Grewal, Yadveer S; Vaidyanathan, Ramanathan; Jones, Martina L; Anderson, Will; Korbie, Darren; Duarte, Tania; Cao, Minh Duc; Nguyen, Son Hoang; Coin, Lachlan J M; Mahler, Stephen M; Trau, Matt.

Lab Chip ; 19(24): 4083-4092, 2019 12 21.

Article in English | MEDLINE | ID: mdl-31712799

ABSTRACT

Phage display methodologies offer a versatile platform for the isolation of single-chain Fv (scFv) molecules which may be rebuilt into monoclonal antibodies. Herein, we report on a complete workflow termed PhageXpress, for rapid selection of single-chain Fv sequences by leveraging electrohydrodynamic-manipulation of a solution containing phage library particles to enhance target binding whilst minimizing non-specific interactions. Our PhageXpress technique is combined with Oxford Nanopore Technologies' MinION sequencer and custom bioinformatics to achieve high-throughput screening of phage libraries. We performed 4 rounds of biopanning against Dengue virus (DENV) non-structural protein 1 (NS1) using traditional methods (4 week turnaround), which resulted in the isolation of 19 unique scFv clones. We validated the feasibility and efficiency of the PhageXpress method utilizing the same phage library and antigen target. Notably, we successfully mapped 14 of the 19 anti-NS1 scFv sequences (~74%) with our new method, despite using ~30-fold less particles during screening and conducting only a single round of biopanning. We believe this approach supersedes traditional methods for the discovery of bio-recognition molecules such as antibodies by speeding up the process for the development of therapeutic and diagnostic biologics.

Subject(s)

Antibodies, Viral , Nanopore Sequencing , Peptide Library , Single-Chain Antibodies , Antibodies, Viral/chemistry , Antibodies, Viral/genetics , Dengue Virus/chemistry , Humans , Single-Chain Antibodies/chemistry , Single-Chain Antibodies/genetics , Viral Nonstructural Proteins/chemistry

10.

Correction to: Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning.

Teng, Haotian; Cao, Minh Duc; Hall, Michael B; Duarte, Tania; Wang, Sheng; Coin, Lachlan J M.

Gigascience ; 8(5)2019 05 01.

Article in English | MEDLINE | ID: mdl-31077312

11.

Octapeptin C4 and polymyxin resistance occur via distinct pathways in an epidemic XDR Klebsiella pneumoniae ST258 isolate.

Pitt, Miranda E; Cao, Minh Duc; Butler, Mark S; Ramu, Soumya; Ganesamoorthy, Devika; Blaskovich, Mark A T; Coin, Lachlan J M; Cooper, Matthew A.

J Antimicrob Chemother ; 74(3): 582-593, 2019 03 01.

Article in English | MEDLINE | ID: mdl-30445429

ABSTRACT

BACKGROUND: Polymyxin B and E (colistin) have been pivotal in the treatment of XDR Gram-negative bacterial infections; however, resistance has emerged. A structurally related lipopeptide, octapeptin C4, has shown significant potency against XDR bacteria, including polymyxin-resistant strains, but its mode of action remains undefined. OBJECTIVES: We sought to compare and contrast the acquisition of resistance in an XDR Klebsiella pneumoniae (ST258) clinical isolate in vitro with all three lipopeptides to potentially unveil variations in their mode of action. METHODS: The isolate was exposed to increasing concentrations of polymyxins and octapeptin C4 over 20 days. Day 20 strains underwent WGS, complementation assays, antimicrobial susceptibility testing and lipid A analysis. RESULTS: Twenty days of exposure to the polymyxins resulted in a 1000-fold increase in the MIC, whereas for octapeptin C4 a 4-fold increase was observed. There was no cross-resistance observed between the polymyxin- and octapeptin-resistant strains. Sequencing of polymyxin-resistant isolates revealed mutations in previously known resistance-associated genes, including crrB, mgrB, pmrB, phoPQ and yciM, along with novel mutations in qseC. Octapeptin C4-resistant isolates had mutations in mlaDF and pqiB, genes related to phospholipid transport. These genetic variations were reflected in distinct phenotypic changes to lipid A. Polymyxin-resistant isolates increased 4-amino-4-deoxyarabinose fortification of lipid A phosphate groups, whereas the lipid A of octapeptin C4-resistant strains harboured a higher abundance of hydroxymyristate and palmitoylate. CONCLUSIONS: Octapeptin C4 has a distinct mode of action compared with the polymyxins, highlighting its potential as a future therapeutic agent to combat the increasing threat of XDR bacteria.

Subject(s)

Anti-Bacterial Agents/pharmacology , Colistin/pharmacology , Drug Resistance, Multiple, Bacterial , Klebsiella pneumoniae/drug effects , Lipopeptides/pharmacology , Peptides, Cyclic/pharmacology , Polymyxin B/pharmacology , Humans , Klebsiella Infections/microbiology , Klebsiella pneumoniae/isolation & purification , Microbial Sensitivity Tests , Mutation , Whole Genome Sequencing

12.

Ongoing human chromosome end extension revealed by analysis of BioNano and nanopore data.

Shao, Haojing; Zhou, Chenxi; Cao, Minh Duc; Coin, Lachlan J M.

Sci Rep ; 8(1): 16616, 2018 11 09.

Article in English | MEDLINE | ID: mdl-30413723

ABSTRACT

The majority of human chromosome ends remain incompletely assembled due to their highly repetitive structure. In this study, we use BioNano data to anchor and extend chromosome ends from two European trios as well as two unrelated Asian genomes. At least 11 BioNano assembled chromosome ends are structurally divergent from the reference genome, including both missing sequence and extensions. These extensions are heritable and in some cases divergent between Asian and European samples. Six out of nine predicted extension sequences from NA12878 can be confirmed and filled by nanopore data. We identify two multi-kilobase sequence families both enriched more than 100-fold in extension sequence (p-values < 1e-5) whose origins can be traced to interstitial sequence on ancestral primate chromosome 7. Extensive sub-telomeric duplication of these families has occurred in the human lineage subsequent to divergence from chimpanzees.

Subject(s)

Biotechnology/methods , Chromosomes, Human , Genomics/methods , Nanopores , Telomere/genetics , Databases, Factual , Humans , Reference Standards

13.

GtTR: Bayesian estimation of absolute tandem repeat copy number using sequence capture and high throughput sequencing.

Ganesamoorthy, Devika; Cao, Minh Duc; Duarte, Tania; Chen, Wenhan; Coin, Lachlan.

BMC Bioinformatics ; 19(1): 267, 2018 07 16.

Article in English | MEDLINE | ID: mdl-30012093

ABSTRACT

BACKGROUND: Tandem repeats comprise a significant proportion of the human genome including coding and regulatory regions. They are highly prone to repeat number variation and nucleotide mutation due to their repetitive and unstable nature, making them a major source of genomic variation between individuals. Despite recent advances in high throughput sequencing, analysis of tandem repeats in the context of complex diseases is still hindered by technical limitations. We report a novel targeted sequencing approach, which allows simultaneous analysis of hundreds of repeats. We developed a Bayesian algorithm, GtTR, which combines information from a reference long-read dataset with a short read counting approach to genotype tandem repeats at population scale. PCR sizing analysis was used for validation. RESULTS: We used a PacBio long-read sequenced sample to generate a reference tandem repeat genotype dataset with on average 13% absolute deviation from PCR sizing results. Using this reference dataset GtTR generated estimates of VNTR copy number with accuracy within 95% high posterior density (HPD) intervals of 68 and 83% for capture sequence data and 200X WGS data respectively, improving to 87 and 94% with use of a PCR reference. We show that the genotype resolution increases as a function of depth, such that the median 95% HPD interval lies within 25, 14, 12 and 8% of its midpoint copy number value for 30X, 200X WGS, 395X and 800X capture sequence data respectively. We validated nine targets by PCR sizing analysis and genotype estimates from sequencing results correlated well with PCR results. CONCLUSIONS: The novel genotyping approach described here presents a new cost-effective method to explore a previously unrecognized class of repeat variation in GWAS studies of complex diseases at the population level. Further improvements in accuracy can be obtained by improving accuracy of the reference dataset.

Subject(s)

Algorithms , Gene Dosage , High-Throughput Nucleotide Sequencing/methods , Tandem Repeat Sequences/genetics , Alleles , Base Sequence , Bayes Theorem , Computer Simulation , Genome, Human , Genotype , Humans , Minisatellite Repeats/genetics , Whole Genome Sequencing

14.

npInv: accurate detection and genotyping of inversions using long read sub-alignment.

Shao, Haojing; Ganesamoorthy, Devika; Duarte, Tania; Cao, Minh Duc; Hoggart, Clive J; Coin, Lachlan J M.

BMC Bioinformatics ; 19(1): 261, 2018 07 13.

Article in English | MEDLINE | ID: mdl-30001702

ABSTRACT

BACKGROUND: Detection of genomic inversions remains challenging. Many existing methods primarily target inversions with a non-repetitive breakpoint, leaving inverted repeat (IR) mediated non-allelic homologous recombination (NAHR) inversions largely unexplored. RESULT: We present npInv, a novel tool specifically for detecting and genotyping NAHR inversion using long read sub-alignment of long read sequencing data. We benchmark npInv with other tools in both simulation and real data. We use npInv to generate a whole-genome inversion map for NA12878 consisting of 30 NAHR inversions (of which 15 are novel), including all previously known NAHR mediated inversions in NA12878 with flanking IR less than 7kb. Our genotyping accuracy on this dataset was 94%. We used PCR to confirm the presence of two of these novel inversions. We show that there is a near linear relationship between the length of flanking IR and the minimum inversion size, without inverted repeats. CONCLUSION: The application of npInv shows high accuracy in both simulation and real data. The results give deeper insight into understanding inversion.

Subject(s)

Chromosome Inversion/genetics , Genotype , Humans

15.

Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning.

Teng, Haotian; Cao, Minh Duc; Hall, Michael B; Duarte, Tania; Wang, Sheng; Coin, Lachlan J M.

Gigascience ; 7(5)2018 05 01.

Article in English | MEDLINE | ID: mdl-29648610

ABSTRACT

Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology that offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling and directly translate the raw signal to DNA sequence without the error-prone segmentation step. Trained with only a small set of 4,000 reads, we show that our model provides state-of-the-art basecalling accuracy, even on previously unseen species. Chiron achieves basecalling speeds of more than 2,000 bases per second using desktop computer graphics processing units.

Subject(s)

Machine Learning , Nanopores , Nucleotides/genetics , Sequence Analysis, DNA/methods , Signal Processing, Computer-Assisted , Software , Base Pairing , Base Sequence , Escherichia coli/genetics , Mycobacterium tuberculosis/genetics , Neural Networks, Computer , Probability , Reproducibility of Results

16.

Multifactorial chromosomal variants regulate polymyxin resistance in extensively drug-resistant Klebsiella pneumoniae.

Pitt, Miranda E; Elliott, Alysha G; Cao, Minh Duc; Ganesamoorthy, Devika; Karaiskos, Ilias; Giamarellou, Helen; Abboud, Cely S; Blaskovich, Mark A T; Cooper, Matthew A; Coin, Lachlan J M.

Microb Genom ; 4(3)2018 03.

Article in English | MEDLINE | ID: mdl-29431605

ABSTRACT

Extensively drug-resistant Klebsiella pneumoniae (XDR-KP) infections cause high mortality and are disseminating globally. Identifying the genetic basis underpinning resistance allows for rapid diagnosis and treatment. XDR isolates sourced from Greece and Brazil, including 19 polymyxin-resistant and five polymyxin-susceptible strains, were subjected to whole genome sequencing. Seventeen of the 19 polymyxin-resistant isolates harboured variations upstream or within mgrB. The most common mutation identified was an insertion at nucleotide position 75 in mgrB via an ISKpn26-like element in the ST258 lineage and ISKpn13 in one ST11 isolate. Three strains acquired an IS1 element upstream of mgrB and another strain had an ISKpn25 insertion at 133 bp. Other isolates had truncations (C28STOP, Q30STOP) or a missense mutation (D29E) affecting mgrB. Complementation assays revealed all mgrB perturbations contributed to resistance. Missense mutations in phoQ (T281M, G385C) were also found to facilitate resistance. Several variants in phoPQ co-segregating with the ISKpn26-like insertion were identified as potential partial suppressor mutations. Three ST258 samples were found to contain subpopulations with different resistance-conferring mutations, including the ISKpn26-like insertion colonizing with a novel mutation in pmrB (P158R), both confirmed via complementation assays. These findings highlight the broad spectrum of chromosomal modifications which can facilitate and regulate resistance against polymyxins in K. pneumoniae.

Subject(s)

Chromosomes, Bacterial/genetics , DNA, Bacterial/isolation & purification , Drug Resistance, Bacterial/genetics , Klebsiella pneumoniae/drug effects , Polymyxins/pharmacology , Anti-Bacterial Agents/pharmacology , Brazil , Colistin/pharmacology , DNA, Bacterial/genetics , Escherichia coli/drug effects , Escherichia coli/genetics , Gene Expression Regulation, Bacterial , Gene Library , Genes, Bacterial , Genetic Variation , Greece , Klebsiella pneumoniae/genetics , Klebsiella pneumoniae/isolation & purification , Microbial Sensitivity Tests , Mutation , Sequence Analysis, DNA

17.

Simulating the dynamics of targeted capture sequencing with CapSim.

Cao, Minh Duc; Ganesamoorthy, Devika; Zhou, Chenxi; Coin, Lachlan J M.

Bioinformatics ; 34(5): 873-874, 2018 03 01.

Article in English | MEDLINE | ID: mdl-29092025

ABSTRACT

Motivation: Targeted sequencing using capture probes has become increasingly popular in clinical applications due to its scalability and cost-effectiveness. The approach also allows for higher sequencing coverage of the targeted regions resulting in better analysis statistical power. However, because of the dynamics of the hybridization process, it is difficult to evaluate the efficiency of the probe design prior to the experiments which are time consuming and costly. Results: We developed CapSim, a software package for simulation of targeted sequencing. Given a genome sequence and a set of probes, CapSim simulates the fragmentation, the dynamics of probe hybridization and the sequencing of the captured fragments on Illumina and PacBio sequencing platforms. The simulated data can be used for evaluating the performance of the analysis pipeline, as well as the efficiency of the probe design. Parameters of the various stages in the sequencing process can also be evaluated in order to optimize the experiments. Availability and implementation: CapSim is publicly available under BSD license at https://github.com/Devika1/capsim. Contact: l.coin@imb.uq.edu.au. Supplementary information: Supplementary data are available at Bioinformatics online.

Subject(s)

High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Genomics/methods , Software

18.

Multifactorial chromosomal variants regulate polymyxin resistance in extensively drug-resistant Klebsiella pneumoniae

Pitt, Miranda E; Elliott, Alysha G; Cao, Minh Duc; Ganesamoorthy, Devika; Karaiskos, Ilias; Giamarellou, Helen; Abboud, Cely S; Blaskovich, Mark A. T; Cooper, Matthew A; Coin, Lachlan J. M.

Microbial Genomics ; 4(3): 1-35, 2018. ilus

Article in English | Sec. Est. Saúde SP, SESSP-IDPCPROD, Sec. Est. Saúde SP | ID: biblio-1064745

ABSTRACT

Extensively drug-resistant Klebsiella pneumoniae (XDR-KP) infections cause high mortality and are disseminating globally. Identifying the genetic basis underpinning resistance allows for rapid diagnosis and treatment. XDR isolates sourced from Greece and Brazil, including 19 polymyxin-resistant and five polymyxin-susceptible strains, were subjected to whole genome sequencing. Seventeen of the 19 polymyxin-resistant isolates harboured variations upstream or within mgrB. The most common mutation identified was an insertion at nucleotide position 75 in mgrB via an ISKpn26-like element in the ST258 lineage and ISKpn13 in one ST11 isolate. Three strains acquired an IS1 element upstream of mgrB and another strain had an ISKpn25 insertion at 133 bp...

Subject(s)

Klebsiella pneumoniae , Polymyxins

19.

Real-time demultiplexing Nanopore barcoded sequencing data with npBarcode.

Nguyen, Son Hoang; Duarte, Tania P S; Coin, Lachlan J M; Cao, Minh Duc.

Bioinformatics ; 33(24): 3988-3990, 2017 Dec 15.

Article in English | MEDLINE | ID: mdl-28961965

ABSTRACT

MOTIVATION: The recent introduction of a barcoding protocol for Oxford Nanopore sequencing has increased the versatility of the technology. Several bioinformatics tools have been developed to demultiplex barcoded reads, but none of them supports streaming analysis. This limits the use of multiplexed sequencing in real-time applications, which is one of the main advantages of the technology. RESULTS: We introduced npBarcode, an open source and cross-platform tool for barcode demultiplexing in streaming fashion that can be used to pipe data to further real-time analyses. The tool also provides a friendly graphical user interface by integrating the module into npReader, making possible to monitor the progress concurrently when the sequencing is still in progress. We show that our algorithm achieves accuracies at least as good as competing tools. AVAILABILITY AND IMPLEMENTATION: npBarcode is bundled in Japsa-a Java tools kit for genome analysis, and is freely available at https://github.com/mdcao/japsa. CONTACT: s.nguyen@uq.edu.au or l.coin@imb.uq.edu.au. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Electronic Data Processing , Nanopores , Sequence Analysis, DNA/methods , Software , Algorithms , Reproducibility of Results

20.

Scaffolding and completing genome assemblies in real-time with nanopore sequencing.

Cao, Minh Duc; Nguyen, Son Hoang; Ganesamoorthy, Devika; Elliott, Alysha G; Cooper, Matthew A; Coin, Lachlan J M.

Nat Commun ; 8: 14515, 2017 02 20.

Article in English | MEDLINE | ID: mdl-28218240

ABSTRACT

Third generation sequencing technologies provide the opportunity to improve genome assemblies by generating long reads spanning most repeat sequences. However, current analysis methods require substantial amounts of sequence data and computational resources to overcome the high error rates. Furthermore, they can only perform analysis after sequencing has completed, resulting in either over-sequencing, or in a low quality assembly due to under-sequencing. Here we present npScarf, which can scaffold and complete short read assemblies while the long read sequencing run is in progress. It reports assembly metrics in real-time so the sequencing run can be terminated once an assembly of sufficient quality is obtained. In assembling four bacterial and one eukaryotic genomes, we show that npScarf can construct more complete and accurate assemblies while requiring less sequencing data and computational resources than existing methods. Our approach offers a time- and resource-effective strategy for completing short read assemblies.

Subject(s)

Algorithms , Computational Biology/methods , Genome, Bacterial/genetics , Klebsiella pneumoniae/genetics , Nanopores , Sequence Analysis, DNA/methods , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Klebsiella pneumoniae/classification , Reproducibility of Results , Species Specificity
