1.
Front Genet ; 13: 643592, 2022.
Article in English | MEDLINE | ID: mdl-35295949

ABSTRACT

We present a novel approach to the Metagenomic Geolocation Challenge based on random projection of the sample reads from each location. This approach explores the direct use of k-mer composition to characterise samples so that we can avoid the computationally demanding step of aligning reads to available microbial reference sequences. Each variable-length read is converted into a fixed-length, k-mer-based read signature. Read signatures are then clustered into location signatures which provide a more compact characterisation of the reads at each location. Classification is then treated as a problem in ranked retrieval of locations, where signature similarity is used as a measure of similarity in microbial composition. We evaluate our approach using the CAMDA 2020 Challenge dataset and obtain promising results based on nearest neighbour classification. The main findings of this study are that k-mer representations carry sufficient information to reveal the origin of many of the CAMDA 2020 Challenge metagenomic samples, and that this reference-free approach can be achieved with much less computation than methods that need reads to be assigned to operational taxonomic units. These advantages become clear through comparison with previously published work on the CAMDA 2019 Challenge data.
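The signature idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the hashing scheme, the defaults `k=5` and `width=64`, and the function names `kmer_vector`, `read_signature`, `hamming` and `rank_locations` are all assumptions chosen for illustration, and the clustering of read signatures into location signatures is omitted. Each k-mer is mapped to a pseudo-random +1/-1 vector, the vectors are summed over the read, and the signs of the sums form a fixed-length binary signature. Locations are then ranked by Hamming distance to a query signature.

```python
import hashlib


def kmer_vector(kmer, width):
    """Map a k-mer to a deterministic pseudo-random +1/-1 vector (assumed scheme)."""
    values = []
    counter = 0
    while len(values) < width:
        digest = hashlib.sha256(f"{kmer}:{counter}".encode()).digest()
        for byte in digest:
            for shift in range(8):
                values.append(1 if (byte >> shift) & 1 else -1)
        counter += 1
    return values[:width]


def read_signature(read, k=5, width=64):
    """Sum the +1/-1 vectors of every k-mer in the read and keep only the signs."""
    totals = [0] * width
    for i in range(len(read) - k + 1):
        for j, value in enumerate(kmer_vector(read[i:i + k], width)):
            totals[j] += value
    return tuple(1 if total > 0 else 0 for total in totals)


def hamming(a, b):
    """Count the positions at which two equal-length signatures differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_locations(query_signature, location_signatures):
    """Return location names ordered from most to least similar signature."""
    return sorted(
        location_signatures,
        key=lambda name: hamming(query_signature, location_signatures[name]),
    )
```

Reads of any length produce signatures of the same width, which is what allows them to be compared directly without alignment to reference sequences.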

2.
BMC Bioinformatics ; 19(Suppl 20): 509, 2018 Dec 21.
Article in English | MEDLINE | ID: mdl-30577803

ABSTRACT

BACKGROUND: Sequencing highly-variable 16S regions is a common and often effective approach to the study of microbial communities, and next-generation sequencing (NGS) technologies provide abundant quantities of data for analysis. However, the speed of existing analysis pipelines may limit our ability to work with these quantities of data. Furthermore, the limited coverage of existing 16S databases may hamper our ability to characterise these communities, particularly in the context of complex or poorly studied environments. RESULTS: In this article we present the SigClust algorithm, a novel clustering method involving the transformation of sequence reads into binary signatures. When compared to other published methods, SigClust yields superior cluster coherence and separation of metagenomic read data, while operating within substantially reduced timeframes. We demonstrate its utility on published Illumina datasets and on a large collection of labelled wound reads sourced from patients in a wound clinic. The temporal analysis is based on tracking the dominant clusters of wound samples over time. The analysis can identify markers of both healing and non-healing wounds in response to treatment. Prominent clusters are found, corresponding to bacterial species known to be associated with unfavourable healing outcomes, including a number of strains of Staphylococcus aureus. CONCLUSIONS: SigClust identifies clusters rapidly and supports an improved understanding of the wound microbiome without reliance on a reference database. The results indicate a promising use for a SigClust-based pipeline in wound analysis and prediction, and a possible novel method for wound management and treatment.
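The abstract does not describe how SigClust groups binary signatures, so the following is only a generic sketch of clustering binary signatures, not the SigClust algorithm itself. It assumes a k-means-style loop with Hamming distance and majority-vote centroids; the names `majority_centroid` and `cluster_signatures` and the choice of caller-supplied initial centroids are hypothetical.

```python
def hamming(a, b):
    """Count the positions at which two equal-length binary signatures differ."""
    return sum(x != y for x, y in zip(a, b))


def majority_centroid(members, width):
    """Set each bit to 1 only if a strict majority of members have it set."""
    return tuple(
        1 if 2 * sum(member[j] for member in members) > len(members) else 0
        for j in range(width)
    )


def cluster_signatures(signatures, initial_centroids, max_iterations=20):
    """Assign signatures to the nearest centroid, then update centroids, until stable."""
    centroids = list(initial_centroids)
    width = len(centroids[0])
    assignment = None
    for _ in range(max_iterations):
        new_assignment = [
            min(range(len(centroids)), key=lambda c: hamming(sig, centroids[c]))
            for sig in signatures
        ]
        if new_assignment == assignment:
            break  # no signature changed cluster, so the result is stable
        assignment = new_assignment
        for c in range(len(centroids)):
            members = [s for s, a in zip(signatures, assignment) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = majority_centroid(members, width)
    return assignment, centroids
```

Because the signatures are binary, distances reduce to bit comparisons, which is one plausible reason signature-based clustering can run quickly on large read sets.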


Subject(s)
Data Analysis , Metagenomics/methods , Algorithms , Cluster Analysis , Humans , Microbiota/genetics
3.
Neural Comput ; 3(4): 623-632, 1991.
Article in English | MEDLINE | ID: mdl-31167339

ABSTRACT

By using artificial neurons with exponential transfer functions, one can design perfect autoassociative and heteroassociative memory networks, with virtually unlimited storage capacity, for real or binary valued input and output. The autoassociative network has two layers: input and memory, with feedback between the two. The exponential response neurons are in the memory layer. By adding an encoding layer of conventional neurons, the network becomes a heteroassociator and classifier. Because for real valued input vectors the dot product with the weight vector is no longer a measure of similarity, we also consider a Euclidean distance based neuron excitation and present Lyapunov functions for both cases. The network has energy minima corresponding only to stored prototype vectors. The exponential neurons make it simpler to build fast adaptive learning directly into classification networks that map real valued input to any class structure at its output.
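A minimal sketch of the two-layer feedback idea, under stated assumptions: each memory neuron stores one prototype and responds with exp(-beta * squared Euclidean distance), and the input layer is updated to the activation-weighted average of the prototypes. The normalisation, the sharpness parameter `beta`, the fixed number of feedback steps and the function name `recall` are illustrative choices and are not taken from the paper.

```python
import math


def recall(x, prototypes, beta=8.0, steps=5):
    """Iterate input -> memory -> input feedback with exponential memory neurons."""
    state = list(x)
    for _ in range(steps):
        # Memory layer: excitation is the squared Euclidean distance to each prototype.
        distances = [
            sum((s - p) ** 2 for s, p in zip(state, prototype))
            for prototype in prototypes
        ]
        # Subtract the smallest distance before exponentiating for numerical stability.
        nearest = min(distances)
        activations = [math.exp(-beta * (d - nearest)) for d in distances]
        total = sum(activations)
        # Feedback: the new input state is the activation-weighted mean of the prototypes.
        state = [
            sum(a * prototype[j] for a, prototype in zip(activations, prototypes)) / total
            for j in range(len(state))
        ]
    return state
```

With a large `beta`, the exponential response lets the closest prototype dominate the others, so a noisy input is pulled towards the nearest stored vector after a few feedback steps.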
