1.
Brief Bioinform ; 24(4), 2023 07 20.
Article in English | MEDLINE | ID: mdl-37369636

ABSTRACT

Untargeted metabolomics is gaining widespread application. Key aspects of the data analysis include modeling the complex activities of the metabolic network, selecting metabolites associated with clinical outcomes and finding critical metabolic pathways to reveal biological mechanisms. One key roadblock in data analysis remains poorly addressed: the matching uncertainty between data features and known metabolites. Given the limitations of the experimental technology, the identities of data features cannot be directly revealed in the data. The predominant approach for mapping features to metabolites is to match the mass-to-charge ratio (m/z) of each data feature to theoretical values derived from known metabolites. The relationship between features and metabolites is not one-to-one, since some metabolites share a molecular composition and various adduct ions can be derived from the same metabolite. This matching uncertainty makes metabolite selection and functional analysis results unreliable. Here we introduce an integrated deep learning framework for metabolomics data that takes matching uncertainty into consideration. The model is built as a gradual sparsification neural network based on the known metabolic network and the annotation relationship between features and metabolites. This architecture characterizes metabolomics data and reflects the modular structure of biological systems. Three goals can be achieved simultaneously without requiring much complex inference or additional assumptions: (1) evaluate metabolite importance, (2) infer feature-metabolite matching likelihood and (3) select disease sub-networks. When applied to a COVID metabolomics dataset and an aging mouse brain dataset, our method found metabolic sub-networks that were easily interpretable.
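The m/z matching step described in this abstract, and the reason it is not one-to-one, can be illustrated with a minimal sketch. This is not the authors' code: the function name `match_features`, the three positive-mode adducts, the 10 ppm tolerance and the small metabolite table are all assumptions chosen for illustration.

```python
# Illustrative sketch only: match observed m/z values to candidate
# (metabolite, adduct) pairs within a ppm tolerance.

# Mass shifts (Da) for three common singly charged positive-mode adducts.
# The choice of adducts is an assumption for this example.
ADDUCTS = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def match_features(feature_mzs, metabolites, ppm_tol=10.0):
    """Return {observed m/z: [(metabolite, adduct), ...]} for all matches."""
    matches = {}
    for mz in feature_mzs:
        hits = []
        for name, mono_mass in metabolites.items():
            for adduct, shift in ADDUCTS.items():
                theoretical = mono_mass + shift
                ppm_error = abs(mz - theoretical) / theoretical * 1e6
                if ppm_error <= ppm_tol:
                    hits.append((name, adduct))
        matches[mz] = hits
    return matches


# Toy metabolite table (monoisotopic masses in Da). Glucose and fructose
# share the composition C6H12O6, so m/z alone cannot separate them.
metabolites = {
    "glucose": 180.06339,
    "fructose": 180.06339,
    "alanine": 89.04768,
}

result = match_features([181.0707, 203.0526], metabolites)
for mz, hits in result.items():
    print(mz, hits)
```

In this toy table each observed feature matches both isomers, and the two features may also be different adducts of a single metabolite, which is the kind of ambiguity the abstract refers to as matching uncertainty.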


Subject(s)
COVID-19 , Deep Learning , Animals , Mice , Metabolomics/methods , Metabolome , Metabolic Networks and Pathways
2.
Bioinformatics ; 38(14): 3662-3664, 2022 07 11.
Article in English | MEDLINE | ID: mdl-35639952

ABSTRACT

MOTIVATION: Testing for pathway enrichment is an important aspect of the analysis of untargeted metabolomics data. Due to the unique characteristics of untargeted metabolomics data, some key issues have not been fully addressed in existing pathway testing algorithms: (i) matching uncertainty between data features and metabolites; (ii) the lack of a method to analyze positive-mode and negative-mode liquid chromatography-mass spectrometry (LC/MS) data simultaneously on the same set of subjects; (iii) the incompleteness of pathways in individual software packages. RESULTS: We developed an innovative R/Bioconductor package, metabolic pathway testing with positive and negative mode data (metapone), which can perform two novel statistical tests that take matching uncertainty into consideration: (i) a weighted gene set enrichment analysis-type test and (ii) a permutation-based weighted hypergeometric test. The package is capable of combining positive- and negative-ion mode results in a single testing scheme. For comprehensiveness, the built-in pathways were manually curated from three sources: Kyoto Encyclopedia of Genes and Genomes, Mummichog and The Small Molecule Pathway Database. AVAILABILITY AND IMPLEMENTATION: The package is available at https://bioconductor.org/packages/devel/bioc/html/metapone.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
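The idea of a permutation-based, weighted overlap test can be sketched in a few lines. This is a hedged illustration, not the metapone implementation or its API: the function `weighted_pathway_pvalue`, the 1/(number of candidates) weighting and the toy identifiers are all assumptions. Each significant feature contributes the fraction of its candidate metabolites that lie in the pathway, and the null distribution comes from drawing random feature sets of the same size.

```python
import random


def weighted_pathway_pvalue(feature_matches, significant, pathway,
                            n_perm=2000, seed=0):
    """Permutation p-value for a weighted feature-pathway overlap.

    feature_matches: dict mapping feature ID -> list of candidate metabolites
    significant:     set of feature IDs flagged as significant
    pathway:         set of metabolite IDs belonging to the pathway
    """
    def score(features):
        total = 0.0
        for f in features:
            candidates = feature_matches[f]
            if candidates:
                # Assumed weighting: each candidate match counts 1/len(candidates).
                in_pathway = sum(1 for m in candidates if m in pathway)
                total += in_pathway / len(candidates)
        return total

    rng = random.Random(seed)
    all_features = sorted(feature_matches)
    observed = score(significant)
    exceed = 0
    for _ in range(n_perm):
        # Null: a random feature set of the same size as the significant set.
        if score(rng.sample(all_features, len(significant))) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data with hypothetical identifiers: f2 is ambiguous between A and X.
feature_matches = {"f1": ["A"], "f2": ["A", "X"], "f3": ["B"]}
feature_matches.update({f"f{i}": [f"Y{i}"] for i in range(4, 11)})

observed, p_value = weighted_pathway_pvalue(
    feature_matches, significant={"f1", "f2", "f3"}, pathway={"A", "B"}
)
print(observed, p_value)
```

Down-weighting the ambiguous feature `f2` means an uncertain match contributes less evidence than an unambiguous one, which is the general motivation for weighting such tests by matching uncertainty.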


Subject(s)
Metabolomics , Software , Humans , Genome , Algorithms , Metabolic Networks and Pathways
3.
Biomolecules ; 13(7), 2023 07 20.
Article in English | MEDLINE | ID: mdl-37509188

ABSTRACT

Random Forest (RF) is a widely used machine learning method with good performance on classification and regression tasks. It works well in small-sample settings, which benefits applications in biology. For example, gene expression data often involve far more features (p) than samples (n). Though the predictive accuracy of RF is often high, there are some problems when selecting important genes using RF. The important genes selected by RF are usually scattered across the gene network, which conflicts with the biological assumption of functional consistency between effective features. To improve feature selection by incorporating external topological information between genes, we propose the Graph Random Forest (GRF), which identifies highly connected important features by using the known biological network when constructing the forest. The algorithm can identify effective features that form highly connected sub-graphs and achieve classification accuracy equivalent to that of RF. To evaluate the capability of our proposed method, we conducted simulation experiments and applied the method to two real datasets: non-small cell lung cancer RNA-seq data from The Cancer Genome Atlas and a human embryonic stem cell RNA-seq dataset (GSE93593). The resulting high classification accuracy, connectivity of selected sub-graphs, and interpretable feature selection results suggest the method is a helpful addition to graph-based classification models and feature selection procedures.
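The contrast between "scattered" and "highly connected" selected features can be made concrete with a small sketch. This is not the GRF algorithm and not necessarily the connectivity measure used in the paper: it is one simple, assumed metric (the size of the largest connected component among the selected genes), and the function name, gene names and edges are hypothetical.

```python
from collections import deque


def largest_component_size(edges, selected):
    """Size of the largest connected component of the sub-graph
    induced by the selected nodes."""
    selected = set(selected)
    # Keep only edges whose two endpoints are both selected.
    adjacency = {node: set() for node in selected}
    for u, v in edges:
        if u in selected and v in selected:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen = set()
    best = 0
    for start in selected:
        if start in seen:
            continue
        # Breadth-first search over one component.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


# Hypothetical gene network: one chain g1-g2-g3-g4 and two separate pairs.
edges = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g5", "g6"), ("g7", "g8")]

connected = largest_component_size(edges, ["g1", "g2", "g3"])
scattered = largest_component_size(edges, ["g1", "g5", "g7"])
print(connected, scattered)
```

A selection that stays within one neighbourhood of the network yields a large component, while a selection spread over unrelated regions yields only isolated nodes.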


Subject(s)
Non-Small-Cell Lung Carcinoma , Lung Neoplasms , Humans , Random Forest , Algorithms , Machine Learning
4.
Huan Jing Ke Xue ; 37(3): 1070-4, 2016 Mar 15.
Article in Chinese | MEDLINE | ID: mdl-27337902

ABSTRACT

The mechanism of activated sludge bulking at a wastewater treatment plant in Zhengzhou was studied by measuring water quality parameters and by high-throughput sequencing. The sludge volume index (SVI) was significantly negatively correlated with seasonal temperature variation, and sludge bulking tended to occur from December through the following April, but water quality was not affected. High-throughput sequencing showed that the microbial community structure of the bulking sludge was significantly different from that of the non-bulking sludge. The dominant filamentous bacteria in the bulking sludge in this plant were Saprospiraceae and Flavobacterium. Therefore, the activated sludge bulking in this wastewater treatment plant was caused by the propagation of filamentous bacteria at low temperature.


Subject(s)
Bacteria/classification , Cold Temperature , Microbial Consortia , Sewage/microbiology , Seasons , Wastewater