1.
Bioinformatics ; 34(17): i766-i772, 2018 09 01.
Article in English | MEDLINE | ID: mdl-30423080

ABSTRACT

Motivation: Mapping-based approaches have become limited in their application to very large sets of references since computing an FM-index for very large databases (e.g. >10 GB) has become a bottleneck. This affects many analyses that need such an index as an essential step for approximate matching of NGS reads to reference databases. For instance, in typical metagenomics analysis, the size of the reference sequences has become prohibitive for computing a single full-text index on standard machines. Even on large-memory machines, computing such an index takes about 1 day of computing time. As a result, indices are rarely updated. Hence, it is desirable to create an alternative way of indexing while preserving fast search times. Results: To solve the index construction and update problem, we propose the DREAM (Dynamic seaRchablE pArallel coMpressed index) framework and provide an implementation. The first main contribution is the introduction of an approximate search distributor via a novel use of Bloom filters. We combine several Bloom filters to form an interleaved Bloom filter and use this new data structure to quickly exclude reads for parts of the databases where they cannot match. This allows us to keep the databases in several indices, which can be easily rebuilt if parts are updated, while maintaining a fast search time. The second main contribution is an implementation of DREAM-Yara, a distributed version of a fully sensitive read mapper under the DREAM framework. Availability and implementation: https://gitlab.com/pirovc/dream_yara/.


Subjects
Databases, Factual , Software , Humans , Time Factors
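
The interleaved Bloom filter described in the abstract of entry 1 can be illustrated with a minimal Python sketch. This is not the DREAM-Yara implementation: the class name, the hash scheme (salted BLAKE2b), the parameter names, and the k-mer counting threshold used to select candidate bins are all assumptions made for illustration. The idea shown is that the bits of all per-bin Bloom filters at the same hash position are stored together, so one lookup per hash function answers the membership question for every bin at once.

```python
import hashlib


class InterleavedBloomFilter:
    """Toy interleaved Bloom filter: one Bloom filter per database bin,
    stored so that the bits of all bins at one position sit together."""

    def __init__(self, num_bins, bits_per_bin, num_hashes=3):
        self.num_bins = num_bins
        self.bits_per_bin = bits_per_bin
        self.num_hashes = num_hashes
        # rows[p] is an integer bitmask; bit b is set if bin b has position p set.
        self.rows = [0] * bits_per_bin

    def _positions(self, kmer):
        # Hypothetical hash scheme: BLAKE2b with a different salt per hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.bits_per_bin

    def insert(self, kmer, bin_index):
        for pos in self._positions(kmer):
            self.rows[pos] |= 1 << bin_index

    def query(self, kmer):
        """Return a bitmask of the bins that may contain the k-mer."""
        mask = (1 << self.num_bins) - 1
        for pos in self._positions(kmer):
            mask &= self.rows[pos]
        return mask


def kmers(seq, k):
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def candidate_bins(ibf, read, k, max_errors):
    """Bins in which the read could still match with at most max_errors errors.

    Assumption: the k-mer counting lemma is used as the threshold, i.e. a read
    with e errors shares at least len(read) - k + 1 - k * e k-mers with its origin.
    """
    threshold = max(1, len(read) - k + 1 - k * max_errors)
    counts = [0] * ibf.num_bins
    for kmer in kmers(read, k):
        mask = ibf.query(kmer)
        for b in range(ibf.num_bins):
            if (mask >> b) & 1:
                counts[b] += 1
    return [b for b, c in enumerate(counts) if c >= threshold]


# Build a filter over two small, made-up reference bins.
references = ["ACGTACGTTAGCCGATAGGCTTAACG", "TTTTGGGGCCCCAAAATTGGCCAATT"]
ibf = InterleavedBloomFilter(num_bins=2, bits_per_bin=4096)
for bin_index, ref in enumerate(references):
    for kmer in kmers(ref, 5):
        ibf.insert(kmer, bin_index)

read = references[0][3:18]  # an error-free read taken from bin 0
bins = candidate_bins(ibf, read, k=5, max_errors=0)
```

Because Bloom filters produce no false negatives, the bin a read originates from is always reported; bins reported in addition are false positives that a downstream mapper would simply search without finding a match.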
2.
Microorganisms ; 7(8), 2019 Jul 26.
Article in English | MEDLINE | ID: mdl-31357520

ABSTRACT

Clostridium (syn. Clostridioides) difficile is considered a pioneer colonizer and may cause gut infection in neonatal piglets. The aim of this study was to explore the microbiota-C. difficile associations in pigs. We used the DNA from the faeces of four sows collected during the periparturient period and from two to three of their piglets (collected weekly until nine weeks of age) for the determination of bacterial community composition (sequencing) and C. difficile concentration (qPCR). Furthermore, C. difficile-negative faeces were enriched in a growth medium, followed by qPCR to verify the presence of this bacterium. Clostridium-sensu-stricto-1 and Lactobacillus spp. predominated in the gut microbiota of the sows and their offspring. C. difficile was detected at least once in the faeces of all sows during the entire sampling period, albeit at low concentrations. Suckling piglets harboured C. difficile in high concentrations (up to log10 9.29 copy number/g faeces), which gradually decreased as the piglets aged. Enrichment revealed the presence of C. difficile in previously C. difficile-negative sow and offspring faeces. In suckling piglets, the C. difficile level was negatively correlated with carbohydrate-fermenting bacteria, and it was positively associated with potential pathogens. The Shannon and richness diversity indices were negatively associated with the C. difficile counts in suckling piglets. This study showed that the gut microbiota seems to set conditions for colonisation resistance against C. difficile in the offspring. However, this conclusion requires further research to include host-specific factors.
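
For readers unfamiliar with the two diversity measures named in the abstract of entry 2, the following Python sketch computes them from a vector of taxon counts. It uses the standard textbook definitions (richness as the number of observed taxa, the Shannon index as H = -Σ p_i ln p_i); the abstract does not state which software or logarithm base the authors used, so the natural logarithm and the function names here are assumptions.

```python
import math


def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over observed taxa."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Made-up read counts per taxon for two hypothetical samples.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]
```

Both samples have the same richness, but the evenly distributed one has the higher Shannon index, which is why the two indices are usually reported together.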

3.
PeerJ ; 5: e3138, 2017.
Article in English | MEDLINE | ID: mdl-28367376

ABSTRACT

Identification and quantification of microorganisms is a significant step in studying the alpha and beta diversities within and between microbial communities, respectively. Both identification and quantification of a given microbial community can be carried out using whole genome shotgun sequences with less bias than when using 16S-rDNA sequences. However, shared regions of DNA among reference genomes and taxonomic units pose a significant challenge in assigning reads correctly to their true origins. The existing microbial community profiling tools commonly deal with this problem by either preparing signature-based unique references or assigning an ambiguous read to its least common ancestor in a taxonomic tree. The former method is limited to making use of the reads which can be mapped to the curated regions, while the latter suffers from the lack of uniquely mapped reads at lower (more specific) taxonomic ranks. Moreover, even if the tools exhibit good performance in calling the organisms present in a sample, there is still room for improvement in determining the correct relative abundance of the organisms. We present a new method, Species Level Identification of Microorganisms from Metagenomes (SLIMM), which addresses the above issues by using coverage information of reference genomes to remove unlikely genomes from the analysis and subsequently gain more uniquely mapped reads to assign at lower ranks of a taxonomic tree. SLIMM is based on a few seemingly easy steps which, when combined, create a tool that outperforms state-of-the-art tools in run-time and memory usage while being on par or better in computing quantitative and qualitative information at species level.
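
The two steps named in the abstract of entry 3 (dropping poorly covered genomes, then assigning each read to the lowest common ancestor of its remaining hits) can be sketched in Python as follows. This is a simplified illustration and not SLIMM's actual algorithm: average read depth stands in for whatever coverage statistic the tool uses, and the function names, the `min_coverage` parameter, and the toy taxonomy are assumptions.

```python
def filter_genomes(read_hits, genome_lengths, read_length, min_coverage):
    """Keep genomes whose average depth (mapped bases / genome length)
    reaches min_coverage. Average depth is a stand-in coverage measure."""
    mapped = {}
    for hits in read_hits.values():
        for genome in hits:
            mapped[genome] = mapped.get(genome, 0) + 1
    return {
        g for g, n in mapped.items()
        if n * read_length / genome_lengths[g] >= min_coverage
    }


def lineage(taxon, parent):
    """Path from a taxon up to the root of the taxonomy."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa, parent):
    paths = [lineage(t, parent) for t in taxa]
    common = set(paths[0]).intersection(*paths[1:])
    for node in paths[0]:  # walk upwards; the first shared node is the LCA
        if node in common:
            return node
    return None


def assign_reads(read_hits, kept, parent):
    """Assign each read to the LCA of its hits among the kept genomes."""
    assignments = {}
    for read, hits in read_hits.items():
        remaining = [g for g in hits if g in kept]
        if remaining:
            assignments[read] = lowest_common_ancestor(remaining, parent)
    return assignments


# Toy taxonomy: genomes gA and gB belong to species sp1, gC to sp2.
parent = {"gA": "sp1", "gB": "sp1", "gC": "sp2", "sp1": "genus", "sp2": "genus"}
genome_lengths = {"gA": 100, "gB": 100, "gC": 1000}
read_hits = {"r1": ["gA"], "r2": ["gA"], "r3": ["gA", "gC"], "r4": ["gA", "gB"]}

kept = filter_genomes(read_hits, genome_lengths, read_length=50, min_coverage=1.0)
assignments = assign_reads(read_hits, kept, parent)
```

In this toy input, read `r3` would be pushed up to the genus level by a plain LCA approach, but once the weakly covered genomes are removed it maps uniquely to `gA`, which is the effect the abstract describes.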

4.
J Biotechnol ; 261: 157-168, 2017 Nov 10.
Article in English | MEDLINE | ID: mdl-28888961

ABSTRACT

BACKGROUND: The use of novel algorithmic techniques is pivotal to many important problems in life science. For example, the sequencing of the human genome (Venter et al., 2001) would not have been possible without advanced assembly algorithms, and the development of practical BWT-based read mappers has been instrumental for NGS analysis. However, owing to the high speed of technological progress and the urgent need for bioinformatics tools, there was a widening gap between state-of-the-art algorithmic techniques and the actual algorithmic components of tools that are in widespread use. We previously addressed this by introducing the SeqAn library of efficient data types and algorithms in 2008 (Döring et al., 2008). RESULTS: The SeqAn library has matured considerably since its first publication 9 years ago. In this article we review its status as an established resource for programmers in the field of sequence analysis and its contributions to many analysis tools. CONCLUSIONS: We anticipate that SeqAn will continue to be a valuable resource, especially since it has started to actively support various hardware acceleration techniques in a systematic manner.


Subjects
Databases, Genetic , Genomics/methods , Sequence Analysis, DNA/methods , Software , Algorithms , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Sequence Alignment