Search | BVS IEC

FragGeneScanRs: faster gene prediction for short reads.

Van der Jeugt, Felix; Dawyndt, Peter; Mesuere, Bart.

BMC Bioinformatics ; 23(1): 198, 2022 May 28.

Article in English | MEDLINE | ID: mdl-35643462

ABSTRACT

BACKGROUND: FragGeneScan is currently the most accurate and popular tool for gene prediction in short and error-prone reads, but its execution speed is insufficient for use on larger data sets. The parallelization which should have addressed this is inefficient. Its alternative implementation FragGeneScan+ is faster, but introduced a number of bugs related to memory management, race conditions and even output accuracy. RESULTS: This paper introduces FragGeneScanRs, a faster Rust implementation of the FragGeneScan gene prediction model. Its command line interface is backward compatible and adds extra features for more flexible usage. Its output is equivalent to the original FragGeneScan implementation. CONCLUSIONS: Compared to the current C implementation, shotgun metagenomic reads are processed up to 22 times faster using a single thread, with better scaling for multithreaded execution. The Rust code of FragGeneScanRs is freely available from GitHub under the GPL-3.0 license with instructions for installation, usage and other documentation ( https://github.com/unipept/FragGeneScanRs ).

Subjects

Algorithms , Software , Metagenome , Metagenomics

UMGAP: the Unipept MetaGenomics Analysis Pipeline.

Van der Jeugt, Felix; Maertens, Rien; Steyaert, Aranka; Verschaffelt, Pieter; De Tender, Caroline; Dawyndt, Peter; Mesuere, Bart.

BMC Genomics ; 23(1): 433, 2022 Jun 10.

Article in English | MEDLINE | ID: mdl-35689184

ABSTRACT

BACKGROUND: Shotgun metagenomics yields ever richer and larger data volumes on the complex communities living in diverse environments. Extracting deep insights from the raw reads heavily depends on the availability of fast, accurate and user-friendly biodiversity analysis tools. RESULTS: Because environmental samples may contain strains and species that are not covered in reference databases and because protein sequences are more conserved than the genes encoding them, we explore the alternative route of taxonomic profiling based on protein coding regions translated from the shotgun metagenomics reads, instead of directly processing the DNA reads. We therefore developed the Unipept MetaGenomics Analysis Pipeline (UMGAP), a highly versatile suite of open source tools that are implemented in Rust and support parallelization to achieve optimal performance. Six preconfigured pipelines with different performance trade-offs were carefully selected, and benchmarked against a selection of state-of-the-art shotgun metagenomics taxonomic profiling tools. CONCLUSIONS: UMGAP's protein space detour for taxonomic profiling makes it competitive with state-of-the-art shotgun metagenomics tools. Despite our design choices of an extra protein translation step, a broad-spectrum index that can identify archaea, bacteria, eukaryotes and viruses, and a highly configurable non-monolithic design, UMGAP achieves low runtime, manageable memory footprint and high accuracy. Its interactive visualizations allow for easy exploration and comparison of complex communities.

Subjects

Metagenomics , Viruses , Algorithms , Bacteria/genetics , DNA Sequence Analysis , Software , Viruses/genetics
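
The UMGAP abstract above describes profiling on protein sequences translated from DNA reads. As a rough illustration of that translation step only, the following minimal Python sketch performs a plain six-frame translation with the standard genetic code. It is an assumption-laden stand-in, not UMGAP's Rust implementation; the function names are hypothetical, and the abstract does not say which translation or gene prediction strategy each preconfigured pipeline uses.

```python
from itertools import product

# Standard genetic code in the conventional compact TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(read):
    """Translate a read in three forward and three reverse frames."""
    read = read.upper()
    frames = []
    for strand in (read, reverse_complement(read)):
        for offset in range(3):
            frames.append(translate(strand[offset:]))
    return frames


if __name__ == "__main__":
    for frame in six_frame_translation("ATGGCCTAA"):
        print(frame)
```

A six-frame translation keeps every possible reading frame and leaves it to later steps to discard frames that yield no database hits, whereas a gene predictor narrows the candidates before lookup.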

Unipept CLI 2.0: adding support for visualizations and functional annotations.

Verschaffelt, Pieter; Van Thienen, Philippe; Van Den Bossche, Tim; Van der Jeugt, Felix; De Tender, Caroline; Martens, Lennart; Dawyndt, Peter; Mesuere, Bart.

Bioinformatics ; 36(14): 4220-4221, 2020 08 15.

Article in English | MEDLINE | ID: mdl-32492134

ABSTRACT

SUMMARY: Unipept is an ecosystem of tools developed for fast metaproteomics data-analysis consisting of a web application, a set of web services (application programming interface, API) and a command-line interface (CLI). After the successful introduction of version 4 of the Unipept web application, we here introduce version 2.0 of the API and CLI. Next to the existing taxonomic analysis, version 2.0 of the API and CLI provides access to Unipept's powerful functional analysis for metaproteomics samples. The functional analysis pipeline supports retrieval of Enzyme Commission numbers, Gene Ontology terms and InterPro entries for the individual peptides in a metaproteomics sample. This paves the way for other applications and developers to integrate these new information sources into their data processing pipelines, which greatly increases insight into the functions performed by the organisms in a specific environment. Both the API and CLI have also been expanded with the ability to render interactive visualizations from a list of taxon ids. These visualizations are automatically made available on a dedicated website and can easily be shared by users. AVAILABILITY AND IMPLEMENTATION: The API is available at http://api.unipept.ugent.be. Information regarding the CLI can be found at https://unipept.ugent.be/clidocs. Both interfaces are freely available and open-source under the MIT license. CONTACT: pieter.verschaffelt@ugent.be. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Ecosystem , Software , Data Analysis , Peptides

Unipept 4.0: Functional Analysis of Metaproteome Data.

Gurdeep Singh, Robbert; Tanca, Alessandro; Palomba, Antonio; Van der Jeugt, Felix; Verschaffelt, Pieter; Uzzau, Sergio; Martens, Lennart; Dawyndt, Peter; Mesuere, Bart.

J Proteome Res ; 18(2): 606-615, 2019 02 01.

Article in English | MEDLINE | ID: mdl-30465426

ABSTRACT

Unipept ( https://unipept.ugent.be ) is a web application for metaproteome data analysis, with an initial focus on tryptic-peptide-based biodiversity analysis of MS/MS samples. Because the true potential of metaproteomics lies in gaining insight into the expressed functions of complex environmental samples, the 4.0 release of Unipept introduces complementary functional analysis based on GO terms and EC numbers. Integration of this new functional analysis with the existing biodiversity analysis is an important asset of the extended pipeline. As a proof of concept, a human faecal metaproteome data set from 15 healthy subjects was reanalyzed with Unipept 4.0, yielding fast, detailed, and straightforward characterization of taxon-specific catalytic functions that is shown to be consistent with previous results from a BLAST-based functional analysis of the same data.

Subjects

Data Analysis , Proteomics/methods , Software , Biodiversity , Complex Mixtures/analysis , Feces/chemistry , Healthy Volunteers , Humans , Proof of Concept Study , Tandem Mass Spectrometry

Unipept web services for metaproteomics analysis.

Mesuere, Bart; Willems, Toon; Van der Jeugt, Felix; Devreese, Bart; Vandamme, Peter; Dawyndt, Peter.

Bioinformatics ; 32(11): 1746-8, 2016 06 01.

Article in English | MEDLINE | ID: mdl-26819472

ABSTRACT

UNLABELLED: Unipept is an open source web application that is designed for metaproteomics analysis with a focus on interactive data visualization. It is underpinned by a fast index built from UniProtKB and the NCBI taxonomy that enables quick retrieval of all UniProt entries in which a given tryptic peptide occurs. Unipept version 2.4 introduced web services that provide programmatic access to the metaproteomics analysis features. This enables integration of Unipept functionality in custom applications and data processing pipelines. AVAILABILITY AND IMPLEMENTATION: The web services are freely available at http://api.unipept.ugent.be and are open sourced under the MIT license. CONTACT: Unipept@ugent.be. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Metabolomics , Computational Biology , Genetic Databases , Information Storage and Retrieval , Internet , Knowledge Bases , Peptides , Software , User-Computer Interface , Controlled Vocabulary
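
The abstract above mentions an index that returns all UniProt entries in which a given tryptic peptide occurs. The idea can be sketched in Python as an in-silico tryptic digest followed by an inverted index from peptide to entry accessions. This is a toy illustration under stated assumptions, not Unipept's actual index: the cleavage rule is the common textbook one (cut after K or R unless the next residue is P), the minimum peptide length of 5 is an assumed filter, and the function names and accessions are hypothetical.

```python
import re
from collections import defaultdict


def tryptic_digest(protein, min_length=5):
    """Split a protein after K or R, except when the next residue is P.

    min_length is an assumed filter that drops very short peptides.
    """
    peptides = re.split(r"(?<=[KR])(?!P)", protein)
    return [peptide for peptide in peptides if len(peptide) >= min_length]


def build_peptide_index(proteins):
    """Map each tryptic peptide to the set of accessions that contain it.

    `proteins` maps an accession (hypothetical here) to its sequence.
    """
    index = defaultdict(set)
    for accession, sequence in proteins.items():
        for peptide in tryptic_digest(sequence):
            index[peptide].add(accession)
    return index


if __name__ == "__main__":
    # Made-up sequences and accessions, for illustration only.
    proteins = {
        "ENTRY1": "MKTAYIAKQRPLEEGK",
        "ENTRY2": "GGRTAYIAKAAAAA",
    }
    index = build_peptide_index(proteins)
    print(sorted(index["TAYIAK"]))
```

Looking up a peptide is then a single dictionary access, which is what makes this kind of precomputed index fast compared with scanning every protein sequence per query.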

The unique peptidome: Taxon-specific tryptic peptides as biomarkers for targeted metaproteomics.

Mesuere, Bart; Van der Jeugt, Felix; Devreese, Bart; Vandamme, Peter; Dawyndt, Peter.

Proteomics ; 16(17): 2313-8, 2016 09.

Article in English | MEDLINE | ID: mdl-27380722

ABSTRACT

The Unique Peptide Finder (http://unipept.ugent.be/peptidefinder) is an interactive web application to quickly hunt for tryptic peptides that are unique to a particular species, genus, or any other taxon. Biodiversity within the target taxon is represented by a set of proteomes selected from a monthly updated list of complete and nonredundant UniProt proteomes, supplemented with proprietary proteomes loaded into persistent local browser storage. The software computes and visualizes pan and core peptidomes as unions and intersections of tryptic peptides occurring in the selected proteomes. In addition, it also computes and displays unique peptidomes as the set of all tryptic peptides that occur in all selected proteomes but not in any UniProt record not assigned to the target taxon. As a result, the unique peptides can serve as robust biomarkers for the target taxon, for example, in targeted metaproteomics studies. Computations are extremely fast since they are underpinned by the Unipept database, the lowest common ancestor algorithm implemented in Unipept and modern web technologies that facilitate in-browser data storage and parallel processing.

Subjects

Peptides/analysis , Proteome/chemistry , Proteomics/methods , Animals , Bacteria/chemistry , Bacterial Proteins/chemistry , Protein Databases , Humans , Software
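
The abstract above defines the pan peptidome as a union, the core peptidome as an intersection, and the unique peptidome as the core peptides that occur in no UniProt record outside the target taxon. Those three definitions map directly onto set operations, as in the minimal Python sketch below. The function names, the dictionary layout, and the `outside_peptides` argument are assumptions made for illustration; the real application works on the Unipept database and runs in the browser.

```python
def pan_peptidome(proteomes):
    """Union of the tryptic peptides of all selected proteomes."""
    return set().union(*proteomes.values())


def core_peptidome(proteomes):
    """Intersection of the tryptic peptides of all selected proteomes."""
    peptide_sets = list(proteomes.values())
    if not peptide_sets:
        return set()
    return set.intersection(*peptide_sets)


def unique_peptidome(proteomes, outside_peptides):
    """Core peptides that do not occur outside the target taxon.

    `outside_peptides` is assumed to hold every tryptic peptide found in
    records that are not assigned to the target taxon.
    """
    return core_peptidome(proteomes) - set(outside_peptides)


if __name__ == "__main__":
    # Toy peptide sets keyed by hypothetical proteome names.
    proteomes = {
        "proteome_a": {"PEPTIDEA", "PEPTIDEB", "PEPTIDEC"},
        "proteome_b": {"PEPTIDEB", "PEPTIDEC", "PEPTIDED"},
    }
    outside = {"PEPTIDEC", "PEPTIDEZ"}
    print(sorted(pan_peptidome(proteomes)))
    print(sorted(core_peptidome(proteomes)))
    print(sorted(unique_peptidome(proteomes, outside)))
```

Requiring a peptide to be in the core and absent everywhere else is what makes it usable as a biomarker: it is expected in every member of the taxon and in nothing outside it.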

Introducing SPeDE: High-Throughput Dereplication and Accurate Determination of Microbial Diversity from Matrix-Assisted Laser Desorption-Ionization Time of Flight Mass Spectrometry Data.

Dumolin, Charles; Aerts, Maarten; Verheyde, Bart; Schellaert, Simon; Vandamme, Tim; Van der Jeugt, Felix; De Canck, Evelien; Cnockaert, Margo; Wieme, Anneleen D; Cleenwerck, Ilse; Peiren, Jindrich; Dawyndt, Peter; Vandamme, Peter; Carlier, Aurélien.

mSystems ; 4(5), 2019 Sep 10.

Article in English | MEDLINE | ID: mdl-31506264

ABSTRACT

The isolation of microorganisms from microbial community samples often yields a large number of conspecific isolates. Increasing the diversity covered by an isolate collection entails the implementation of methods and protocols to minimize the number of redundant isolates. Matrix-assisted laser desorption-ionization time-of-flight (MALDI-TOF) mass spectrometry methods are ideally suited to this dereplication problem because of their low cost and high throughput. However, the available software tools are cumbersome and rely either on the prior development of reference databases or on global similarity analyses, which are inconvenient and offer low taxonomic resolution. We introduce SPeDE, a user-friendly spectral data analysis tool for the dereplication of MALDI-TOF mass spectra. Rather than relying on global similarity approaches to classify spectra, SPeDE determines the number of unique spectral features by a mix of global and local peak comparisons. This approach allows the identification of a set of nonredundant spectra linked to operational isolation units. We evaluated SPeDE on a data set of 5,228 spectra representing 167 bacterial strains belonging to 132 genera across six phyla and on a data set of 312 spectra of 78 strains measured before and after lyophilization and subculturing. SPeDE was able to dereplicate with high efficiency by identifying redundant spectra while retrieving reference spectra for all strains in a sample. SPeDE can identify distinguishing features between spectra, and its performance exceeds that of established methods in speed and precision. SPeDE is open source under the MIT license and is available from https://github.com/LM-UGent/SPeDE. IMPORTANCE: Estimation of the operational isolation units present in a MALDI-TOF mass spectral data set involves an essential dereplication step to identify redundant spectra in a rapid manner and without sacrificing biological resolution. We describe SPeDE, a new algorithm which facilitates culture-dependent clinical or environmental studies. SPeDE enables the rapid analysis and dereplication of isolates, a critical feature when long-term storage of cultures is limited or not feasible. We show that SPeDE can efficiently identify sets of similar spectra at the level of the species or strain, exceeding the taxonomic resolution of other methods. The high-throughput capacity, speed, and low cost of MALDI-TOF mass spectrometry and SPeDE dereplication over traditional gene marker-based sequencing approaches should facilitate adoption of the culturomics approach to bacterial isolation campaigns.

High-throughput metaproteomics data analysis with Unipept: A tutorial.

Mesuere, Bart; Van der Jeugt, Felix; Willems, Toon; Naessens, Tom; Devreese, Bart; Martens, Lennart; Dawyndt, Peter.

J Proteomics ; 171: 11-22, 2018 01 16.

Article in English | MEDLINE | ID: mdl-28552653

ABSTRACT

In recent years, shotgun metaproteomics has established itself as an important tool to study the composition of complex ecosystems and microbial communities. Two key steps in metaproteomics data analysis are the inference of proteins from the identified peptides, and the determination of the taxonomic origin and function of these proteins. This tutorial therefore introduces the Unipept command line interface (http://unipept.ugent.be/clidocs) as a platform-independent tool for such metaproteomics data analyses. First, a detailed overview is given of the available Unipept commands and their functions. Next, the power of the Unipept command line interface is illustrated using two case studies that analyze a single tryptic peptide, and a set of peptides retrieved from a shotgun metaproteomics experiment, respectively. Finally, the analysis results obtained using these command line tools are compared with the interactive taxonomic analysis that is available on the Unipept website.

Subjects

Metagenome , Proteome/analysis , Proteomics/methods , Software , Protein Databases , Feces/microbiology , Female , Humans , Metadata , Microbiota , Peptides/analysis , Proteome/classification