Results 1 - 5 of 5
1.
Sci Rep ; 12(1): 21740, 2022 12 16.
Article in English | MEDLINE | ID: mdl-36526702

ABSTRACT

Due to the increasing prevalence of chronic kidney disease and its high mortality rate, the study of risk factors affecting the progression of the disease is of great importance. In this work, we aim to develop a framework for using machine learning methods to identify factors affecting kidney function. To this end, classification methods are trained to predict the serum creatinine level, expressed as one of three classes representing different ranges of the variable, from the numerical values of other blood test parameters. Models are trained using blood test results from healthy subjects and patients, covering 46 different blood test parameters. The best-performing models are random forest and LightGBM. Interpretation of the resulting model reveals a direct relationship between vitamin D and blood creatinine level. The detected relationship between these two parameters is considered reliable, given the relatively high predictive performance of the random forest model, which reaches an AUC of 0.90 and an accuracy of 0.74. Moreover, we develop a Bayesian network to infer the direct relationships between blood test parameters, and its results are consistent with those of the classification models. The proposed framework uses an inclusive set of advanced imputation methods to deal with missing values, the main challenge of working with electronic health data. Hence, it can be applied to similar clinical studies to investigate and discover relationships between the factors under study.
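The two preprocessing ideas mentioned in this abstract, filling in missing values and turning a continuous creatinine level into one of three classes, can be illustrated with a minimal sketch. It is not the authors' pipeline: column-mean imputation is a deliberately simple stand-in for the advanced imputation methods the abstract refers to, and the cut points, function names, and sample values are all hypothetical (the abstract does not state how the three ranges are defined).

```python
def impute_column_means(rows):
    """Replace None with the mean of the observed values in the same column.

    Assumes every column has at least one observed (non-None) value.
    """
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [r[c] for r in rows if r[c] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[c] if r[c] is None else r[c] for c in range(n_cols)]
            for r in rows]


def to_three_classes(values, low_cut, high_cut):
    """Map each value to class 0 (< low_cut), 1 (in between), or 2 (>= high_cut)."""
    return [0 if v < low_cut else 1 if v < high_cut else 2 for v in values]


# Made-up feature rows with missing entries (None); not real blood test data.
features = [[1.0, None], [3.0, 4.0], [None, 8.0]]
imputed = impute_column_means(features)

# Made-up target values and hypothetical cut points (not clinical thresholds).
creatinine = [0.6, 1.0, 1.8]
labels = to_three_classes(creatinine, low_cut=0.8, high_cut=1.4)
```

The imputed rows and class labels produced this way could then be passed to any classifier; the abstract names random forest and LightGBM as the best-performing ones.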


Subject(s)
Kidney , Machine Learning , Humans , Bayes Theorem , Risk Factors
2.
Front Syst Neurosci ; 16: 904770, 2022.
Article in English | MEDLINE | ID: mdl-36817947

ABSTRACT

Introduction: Can we apply graph representation learning algorithms to identify autism spectrum disorder (ASD) patients within a large brain imaging dataset? ASD is mainly identified by brain functional connectivity patterns. Attempts to unveil the common neural patterns that emerge in ASD are the essence of ASD classification. We claim that graph representation learning methods can appropriately extract the connectivity patterns of the brain, in such a way that the method generalizes across recording conditions and subjects' phenotypic information. These methods can capture the whole structure of the brain, including both local and global properties. Methods: The investigation uses the worldwide multi-site brain imaging database known as ABIDE I and II (Autism Brain Imaging Data Exchange). Among different graph representation techniques, we used AWE, Node2vec, Struct2vec, multi node2vec, and Graph2Img. The best approach was Graph2Img, in which, after extracting the feature vectors representing the brain nodes, the PCA algorithm is applied to the matrix of feature vectors. The classifier adapted to the features embedded in graphs is a LeNet deep neural network. Results and discussion: Although we could not outperform the previously reported accuracy for 10-fold cross-validation in distinguishing ASD patients from control subjects in this dataset, we obtained better results for leave-one-site-out cross-validation (our accuracy: 80%). We conclude that graph embedding methods can make the connectivity matrix more suitable as input to a deep network.
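Leave-one-site-out cross-validation, the evaluation scheme reported above, holds out all subjects from one acquisition site per fold and trains on the remaining sites. The following is a minimal sketch of how such splits can be generated; the function name and the site labels are hypothetical and do not correspond to actual ABIDE site names.

```python
from collections import defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_indices, test_indices) for each distinct site.

    `sites` is a list giving the site label of each subject, in subject order.
    """
    by_site = defaultdict(list)
    for idx, site in enumerate(sites):
        by_site[site].append(idx)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = sorted(
            i for site, idxs in by_site.items() if site != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical site labels for six subjects.
sites = ["A", "A", "B", "C", "B", "C"]
splits = list(leave_one_site_out(sites))
```

Because no subject from the held-out site appears in training, this scheme tests generalization to an unseen recording condition, which is a stricter check than a random 10-fold split that mixes sites.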

3.
Data Brief ; 38: 107360, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34522726

ABSTRACT

This dataset provides information related to the outbreak of COVID-19 disease in the United States, including data from each of 3142 US counties from the beginning of the outbreak (January 2020) until June 2021. This data is collected from many public online databases and includes the daily number of COVID-19 confirmed cases and deaths, as well as 46 features that may be relevant to the pandemic dynamics: demographic, geographic, climatic, traffic, public-health, social-distancing-policy adherence, and political characteristics of each county. We anticipate many researchers will use this dataset to train models that can predict the spread of COVID-19 and to identify the key driving factors.

4.
Sci Rep ; 11(1): 13822, 2021 07 05.
Article in English | MEDLINE | ID: mdl-34226584

ABSTRACT

The need for improved models that can accurately predict COVID-19 dynamics is vital to managing the pandemic and its consequences. We use machine learning techniques to design an adaptive learner that, based on epidemiological data available at any given time, produces a model that accurately forecasts the number of reported COVID-19 deaths and cases in the United States, up to 10 weeks into the future with a mean absolute percentage error of 9%. In addition to being the most accurate long-range COVID predictor so far developed, it captures the observed periodicity in daily reported numbers. Its effectiveness is based on three design features: (1) producing different model parameters to predict the number of COVID deaths (and cases) from each time and for a given number of weeks into the future, (2) systematically searching over the available covariates and their historical values to find an effective combination, and (3) training the model using "last-fold partitioning", where each proposed model is validated on only the last instance of the training dataset, rather than being cross-validated. Assessments against many other published COVID predictors show that this predictor is 19-48% more accurate.
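The "last-fold partitioning" idea in design feature (3), scoring each candidate model only on the final instance of the training data, can be sketched as follows. This is an illustration under stated assumptions, not the authors' learner: the candidates here are simple moving-average forecasters that differ only in window length, and the function names and weekly counts are made up.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def moving_average_forecast(history, window):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def select_by_last_fold(series, windows):
    """Pick the window whose forecast of the final training point is most accurate.

    Each candidate sees everything except the last point and is scored on
    that single held-out point, instead of being cross-validated.
    """
    fit_part, last_value = series[:-1], series[-1]
    scores = {
        w: mape([last_value], [moving_average_forecast(fit_part, w)])
        for w in windows
    }
    best = min(scores, key=scores.get)
    return best, scores


# Made-up weekly counts; not real COVID-19 data.
weekly_deaths = [100, 110, 120, 130, 140, 150]
best_window, scores = select_by_last_fold(weekly_deaths, windows=[1, 2, 3])

# After selection, the chosen candidate is applied to the full series.
next_forecast = moving_average_forecast(weekly_deaths, best_window)
```

Validating on the most recent point favours candidates that track the current regime of a non-stationary series, at the cost of a noisier selection signal than a multi-fold average would give.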


Subject(s)
COVID-19/mortality , Communicable Diseases/mortality , Forecasting , SARS-CoV-2/pathogenicity , Humans , Machine Learning , Statistical Models , United States
5.
J Biomed Inform ; 115: 103688, 2021 03.
Article in English | MEDLINE | ID: mdl-33545331

ABSTRACT

One of the key missions of biology and medical science is to find disease-related genes. Recent research uses gene/protein networks to find such genes. Due to false positive interactions in these networks, the results are often neither accurate nor reliable. Integrating multiple gene/protein networks could overcome this drawback, yielding a network with fewer false positive interactions. The integration method plays a crucial role in the quality of the constructed network. In this paper, we integrate several sources to build a reliable heterogeneous network, i.e., a network that includes nodes of different types. Because the gene/protein sources differ, four gene-gene similarity networks are constructed first and then integrated by applying the type-II fuzzy voter scheme. The resulting gene-gene network is linked to a disease-disease similarity network (itself the outcome of integrating four sources) through a two-part disease-gene network. We propose a novel algorithm, namely random walk with restart on the heterogeneous network method with fuzzy fusion (RWRHN-FF). By running RWRHN-FF over the heterogeneous network, disease-related genes are determined. Experimental results using leave-one-out cross-validation indicate that RWRHN-FF outperforms existing methods. The proposed algorithm can be applied to find new genes for prostate, breast, gastric, and colon cancers. Since the RWRHN-FF algorithm converges slowly on large heterogeneous networks, we propose a parallel implementation of the RWRHN-FF algorithm on the Apache Spark platform for high-throughput and reliable network inference. Experiments run on heterogeneous networks of different sizes indicate faster convergence compared to other non-distributed modes of implementation.
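The building block behind RWRHN-FF is the standard random walk with restart (RWR), which can be sketched on a small single-type graph. This is plain RWR only: it includes neither the heterogeneous gene-disease network nor the fuzzy fusion step, and the function name, restart probability, and gene labels are assumptions made for illustration. At each step the walker follows a random edge with probability 1 - restart or jumps back to a seed node with probability restart.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk restarting at `seeds`.

    `adjacency` maps each node to a list of its neighbours (undirected graph,
    every edge listed in both directions). Nodes without neighbours would
    leak probability mass, so the sketch assumes there are none.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: probability mass returned to the seed nodes.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: each node spreads its mass evenly over its neighbours.
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical four-gene path graph: g1 - g2 - g3 - g4, with g1 as the known seed.
graph = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2", "g4"], "g4": ["g3"]}
scores = random_walk_with_restart(graph, seeds={"g1"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Candidate genes are then ranked by their steady-state score, so nodes closer to the seeds in the network rank higher. Each iteration shrinks the error by roughly a factor of 1 - restart, which is consistent with the slow convergence on large networks that motivates the parallel implementation described in the abstract.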


Subject(s)
Computational Biology , Gene Regulatory Networks , Algorithms , Humans , Male