Search | VHL Regional Portal

1.

Parsnp 2.0: Scalable Core-Genome alignment for massive microbial datasets.

Kille, Bryce; Nute, Michael G; Huang, Victor; Kim, Eddie; Phillippy, Adam M; Treangen, Todd J.

Bioinformatics ; 2024 May 09.

Article in English | MEDLINE | ID: mdl-38724243

ABSTRACT

MOTIVATION: Since 2016, the number of microbial species with available reference genomes in NCBI has more than tripled. Multiple genome alignment, the process of identifying nucleotides across multiple genomes which share a common ancestor, is used as the input to numerous downstream comparative analysis methods. Parsnp is one of the few multiple genome alignment methods able to scale to the current era of genomic data; however, there has been no major release since its initial release in 2014. RESULTS: To address this gap, we developed Parsnp v2, which significantly improves on its original release. Parsnp v2 provides users with more control over executions of the program, allowing Parsnp to be better tailored for different use-cases. We introduce a partitioning option to Parsnp, which allows the input to be broken up into multiple parallel alignment processes which are then combined into a final alignment. The partitioning option can reduce memory usage by over 4x and reduce runtime by over 2x, all while maintaining a precise core-genome alignment. The partitioning workflow is also less susceptible to complications caused by assembly artifacts and minor variation, as alignment anchors only need to be conserved within their partition and not across the entire input set. We highlight the performance on datasets involving thousands of bacterial and viral genomes. AVAILABILITY: Parsnp v2 is available at https://github.com/marbl/parsnp.

2.

Microbial Community Profiling Protocol with Full-length 16S rRNA Sequences and Emu.

Curry, Kristen D; Soriano, Sirena; Nute, Michael G; Villapol, Sonia; Dilthey, Alexander; Treangen, Todd J.

Curr Protoc ; 4(3): e978, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38511467

ABSTRACT

16S rRNA targeted amplicon sequencing is an established standard for elucidating microbial community composition. While high-throughput short-read sequencing can capture only a portion of the 16S rRNA gene because of its limited read length, third-generation sequencing can read the 16S rRNA gene in its entirety and thus provide more precise taxonomic classification. Here, we present a protocol for generating full-length 16S rRNA sequences with Oxford Nanopore Technologies (ONT) and a microbial community profile with Emu. We select Emu for analyzing ONT sequences as it leverages information from the entire community to overcome errors due to incomplete reference databases and hardware limitations to ultimately obtain species-level resolution. This pipeline provides a low-cost solution for characterizing microbiome composition by exploiting real-time, long-read ONT sequencing and tailored software for accurate characterization of microbial communities. © 2024 Wiley Periodicals LLC.

Basic Protocol: Microbial community profiling with Emu
Support Protocol 1: Full-length 16S rRNA microbial sequences with Oxford Nanopore Technologies sequencing platform
Support Protocol 2: Building a custom reference database for Emu

Subject(s)

Dromaiidae; Microbiota; Animals; RNA, Ribosomal, 16S/genetics; Dromaiidae/genetics; Bacteria/genetics; Sequence Analysis, DNA/methods; Microbiota/genetics

3.

Parsnp 2.0: Scalable Core-Genome Alignment for Massive Microbial Datasets.

Kille, Bryce; Nute, Michael G; Huang, Victor; Kim, Eddie; Phillippy, Adam M; Treangen, Todd J.

bioRxiv ; 2024 Jan 31.

Article in English | MEDLINE | ID: mdl-38352342

ABSTRACT

Motivation: Since 2016, the number of microbial species with available reference genomes in NCBI has more than tripled. Multiple genome alignment, the process of identifying nucleotides across multiple genomes which share a common ancestor, is used as the input to numerous downstream comparative analysis methods. Parsnp is one of the few multiple genome alignment methods able to scale to the current era of genomic data; however, there has been no major release since its initial release in 2014. Results: To address this gap, we developed Parsnp v2, which significantly improves on its original release. Parsnp v2 provides users with more control over executions of the program, allowing Parsnp to be better tailored for different use-cases. We introduce a partitioning option to Parsnp, which allows the input to be broken up into multiple parallel alignment processes which are then combined into a final alignment. The partitioning option can reduce memory usage by over 4x and reduce runtime by over 2x, all while maintaining a precise core-genome alignment. The partitioning workflow is also less susceptible to complications caused by assembly artifacts and minor variation, as alignment anchors only need to be conserved within their partition and not across the entire input set. We highlight the performance on datasets involving thousands of bacterial and viral genomes. Availability: Parsnp is available at https://github.com/marbl/parsnp.

4.

KOMB: K-core based de novo characterization of copy number variation in microbiomes.

Balaji, Advait; Sapoval, Nicolae; Seto, Charlie; Elworth, R A Leo; Fu, Yilei; Nute, Michael G; Savidge, Tor; Segarra, Santiago; Treangen, Todd J.

Comput Struct Biotechnol J ; 20: 3208-3222, 2022.

Article in English | MEDLINE | ID: mdl-35832621

ABSTRACT

Characterizing metagenomes via k-mer-based, database-dependent taxonomic classification has yielded key insights into underlying microbiome dynamics. However, novel approaches are needed to track community dynamics and genomic flux within metagenomes, particularly in response to perturbations. We describe KOMB, a novel method for tracking genome-level dynamics within microbiomes. KOMB utilizes K-core decomposition to identify structural variations (SVs), specifically population-level copy number variation (CNV), within microbiomes. K-core decomposition partitions the graph into shells containing nodes of induced degree at least K, yielding reduced computational complexity compared to prior approaches. Through validation on a synthetic community, we show that KOMB recovers and profiles repetitive genomic regions in the sample. KOMB is shown to identify functionally important regions in Human Microbiome Project datasets, and was used to analyze longitudinal data and identify keystone taxa in Fecal Microbiota Transplantation (FMT) samples. In summary, KOMB represents a novel graph-based, taxonomy-oblivious, and reference-free approach for tracking CNV within microbiomes. KOMB is open source and available for download at https://gitlab.com/treangenlab/komb.
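
The K-core decomposition mentioned in the abstract above can be illustrated with a minimal sketch. This is a generic peeling procedure written for readability, not KOMB's implementation; the function names `core_numbers` and `shells` and the toy edge list are hypothetical. A node's core number is the largest K for which it belongs to a subgraph where every node has degree at least K, and a shell groups the nodes that share one core number.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected simple graph.

    Illustrative peeling algorithm: repeatedly remove a node of minimum
    remaining degree. Quadratic in the number of nodes; linear-time
    variants exist but are longer.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {node: len(neigh) for node, neigh in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.__getitem__)
        k = max(k, degree[node])  # core numbers never decrease while peeling
        core[node] = k
        remaining.remove(node)
        for neigh in adj[node]:
            if neigh in remaining:
                degree[neigh] -= 1
    return core


def shells(core):
    """Group nodes by core number: {k: set of nodes in the k-shell}."""
    grouped = defaultdict(set)
    for node, k in core.items():
        grouped[k].add(node)
    return dict(grouped)


# Toy graph (hypothetical): a triangle a-b-c with a pendant node d on a.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")]
toy_core = core_numbers(toy_edges)
toy_shells = shells(toy_core)
```

In the toy graph the triangle forms the 2-shell and the pendant node the 1-shell. The K-core itself is the union of all shells with core number at least K.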

5.

Emu: species-level microbial community profiling of full-length 16S rRNA Oxford Nanopore sequencing data.

Curry, Kristen D; Wang, Qi; Nute, Michael G; Tyshaieva, Alona; Reeves, Elizabeth; Soriano, Sirena; Wu, Qinglong; Graeber, Enid; Finzer, Patrick; Mendling, Werner; Savidge, Tor; Villapol, Sonia; Dilthey, Alexander; Treangen, Todd J.

Nat Methods ; 19(7): 845-853, 2022 Jul.

Article in English | MEDLINE | ID: mdl-35773532

ABSTRACT

16S ribosomal RNA-based analysis is the established standard for elucidating the composition of microbial communities. While short-read 16S rRNA analyses are largely confined to genus-level resolution at best, given that only a portion of the gene is sequenced, full-length 16S rRNA gene amplicon sequences have the potential to provide species-level accuracy. However, existing taxonomic identification algorithms are not optimized for the increased read length and error rate often observed in long-read data. Here we present Emu, an approach that uses an expectation-maximization algorithm to generate taxonomic abundance profiles from full-length 16S rRNA reads. Results produced from simulated datasets and mock communities show that Emu is capable of accurate microbial community profiling while obtaining fewer false positives and false negatives than alternative methods. Additionally, we illustrate a real-world application of Emu by comparing clinical sample composition estimates generated by an established whole-genome shotgun sequencing workflow with those returned by full-length 16S rRNA gene sequences processed with Emu.

Subject(s)

Dromaiidae; Microbiota; Nanopore Sequencing; Animals; Bacteria/genetics; Dromaiidae/genetics; High-Throughput Nucleotide Sequencing/methods; Microbiota/genetics; Phylogeny; RNA, Ribosomal, 16S/genetics; Sequence Analysis, DNA/methods
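
The expectation-maximization idea named in the Emu abstract above can be sketched generically as follows. This is the textbook EM update for mixture proportions, not Emu's code: the function name `em_abundances`, its parameters, and the toy likelihood matrix are hypothetical, and the per-read likelihoods are assumed to be given rather than derived from alignments.

```python
def em_abundances(likelihoods, n_iter=100, tol=1e-9):
    """Estimate relative taxon abundances from read-to-taxon likelihoods.

    likelihoods[r][t] is an assumed, precomputed P(read r | taxon t).
    Returns a list of abundances that sums to 1.
    """
    n_taxa = len(likelihoods[0])
    freqs = [1.0 / n_taxa] * n_taxa  # start from a uniform profile

    for _ in range(n_iter):
        totals = [0.0] * n_taxa
        # E-step: split each read across taxa in proportion to
        # likelihood * current abundance.
        for row in likelihoods:
            weighted = [p * f for p, f in zip(row, freqs)]
            norm = sum(weighted)
            if norm == 0.0:
                continue  # read not explained by any taxon; skip it
            for t, w in enumerate(weighted):
                totals[t] += w / norm

        used = sum(totals)
        if used == 0.0:
            break  # no read could be assigned at all

        # M-step: new abundance is the expected share of assigned reads.
        new_freqs = [x / used for x in totals]
        delta = max(abs(a - b) for a, b in zip(new_freqs, freqs))
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


# Toy input (hypothetical): two reads fit only taxon 0, one fits only
# taxon 1, and one is equally compatible with both.
toy_likelihoods = [
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
]
toy_profile = em_abundances(toy_likelihoods)
```

The ambiguous read is shared between the two taxa according to the current abundance estimate, so evidence from the unambiguous reads shapes how it is assigned. On the toy input the estimate settles near two thirds for taxon 0 and one third for taxon 1.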

6.

Current progress and open challenges for applying deep learning across the biosciences.

Sapoval, Nicolae; Aghazadeh, Amirali; Nute, Michael G; Antunes, Dinler A; Balaji, Advait; Baraniuk, Richard; Barberan, C J; Dannenfelser, Ruth; Dun, Chen; Edrisi, Mohammadamin; Elworth, R A Leo; Kille, Bryce; Kyrillidis, Anastasios; Nakhleh, Luay; Wolfe, Cameron R; Yan, Zhi; Yao, Vicky; Treangen, Todd J.

Nat Commun ; 13(1): 1728, 2022 Apr 01.

Article in English | MEDLINE | ID: mdl-35365602

ABSTRACT

Deep Learning (DL) has recently enabled unprecedented advances in one of the grand challenges in computational biology: the half-century-old problem of protein structure prediction. In this paper we discuss recent advances, limitations, and future perspectives of DL on five broad areas: protein structure prediction, protein function prediction, genome engineering, systems biology and data integration, and phylogenetic inference. We discuss each application area and cover the main bottlenecks of DL approaches, such as training data, problem scope, and the ability to leverage existing DL architectures in new contexts. To conclude, we provide a summary of the subject-specific and general challenges for DL across the biosciences.

Subject(s)

Deep Learning; Computational Biology; Phylogeny; Proteins; Systems Biology

7.

It takes guts to learn: machine learning techniques for disease detection from the gut microbiome.

Curry, Kristen D; Nute, Michael G; Treangen, Todd J.

Emerg Top Life Sci ; 5(6): 815-827, 2021 Dec 21.

Article in English | MEDLINE | ID: mdl-34779841

ABSTRACT

Associations between the human gut microbiome and expression of host illness have been noted in a variety of conditions ranging from gastrointestinal dysfunctions to neurological deficits. Machine learning (ML) methods have generated promising results for disease prediction from gut metagenomic information for diseases including liver cirrhosis and irritable bowel disease, but have lacked efficacy when predicting other illnesses. Here, we review current ML methods designed for disease classification from microbiome data. We highlight the computational challenges these methods have effectively overcome and discuss the biological components that have been overlooked to offer perspectives on future work in this area.

Subject(s)

Gastrointestinal Microbiome; Microbiota; Humans; Machine Learning; Metagenome; Metagenomics/methods

8.

Misunderstood parameter of NCBI BLAST impacts the correctness of bioinformatics workflows.

Shah, Nidhi; Nute, Michael G; Warnow, Tandy; Pop, Mihai.

Bioinformatics ; 35(9): 1613-1614, 2019 May 01.

Article in English | MEDLINE | ID: mdl-30247621

Subject(s)

Computational Biology; Workflow; Sequence Alignment; Software
