Search | BVS - Ministry of Health

Python data odyssey: Mining user feedback from google play store.

Yasin, Affan; Fatima, Rubia; Ghazi, Ahmad Nauman; Wei, Ziqi.

Data Brief ; 54: 110499, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38770040

ABSTRACT

Context: The Google Play Store is widely recognized as one of the largest platforms for downloading applications, both free and paid. Every day, millions of users visit this marketplace and share their thoughts through star ratings, comments, suggestions, and feedback. These comments are a valuable resource for organizations, competitors, and emerging companies seeking to expand their market presence: they reveal app deficiencies, suggestions for new features, identified issues, and potential enhancements. Unlocking the potential of this repository of suggestions holds significant value. Objective: This study sought to gather and analyze user reviews from the Google Play Store for leading game apps. The primary aim was to construct a dataset for subsequent analysis using requirements engineering, machine learning, and competitive assessment. Methodology: The authors employed a Python-based web scraping method to extract more than 429,000 reviews from the Google Play pages of selected apps. The scraped data encompassed reviewer names (removed for privacy), ratings, and the textual content of the reviews. Results: The outcome was a dataset comprising the extracted user reviews, ratings, and associated metadata. More than 429,000 reviews were acquired through the scraping process for popular apps such as Subway Surfers, Candy Crush Saga, and PUBG Mobile, among others. This dataset serves as an educational resource for instructors training students in data analysis, and it also gives practitioners the opportunity for in-depth examination of historical data from top apps.
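The privacy step mentioned in the methodology (dropping reviewer names while keeping ratings and review text) can be illustrated with a minimal sketch. This is not the authors' code: the record layout, the field names (`app`, `reviewer`, `rating`, `text`), and the sample reviews are all assumptions made up for illustration.

```python
import csv
import io

# Hypothetical raw records, shaped like what a scraper might return.
# The reviewer names and review texts are invented for this example.
raw_reviews = [
    {"app": "Subway Surfers", "reviewer": "Alice", "rating": 5, "text": "Great game!"},
    {"app": "Candy Crush Saga", "reviewer": "Bob", "rating": 2, "text": "Too many ads."},
]

# Fields kept in the published dataset; the reviewer name is deliberately excluded.
KEPT_FIELDS = ["app", "rating", "text"]


def anonymize(records):
    """Return copies of the records containing only the non-identifying fields."""
    return [{field: rec[field] for field in KEPT_FIELDS} for rec in records]


def to_csv(records):
    """Serialize the records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=KEPT_FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


clean_reviews = anonymize(raw_reviews)
csv_text = to_csv(clean_reviews)
```

Keeping an explicit allow-list of fields, rather than deleting the name field, means that any other identifying field a scraper might return is dropped by default.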

Prediction of dementia based on older adults' sleep disturbances using machine learning.

Nyholm, Joel; Ghazi, Ahmad Nauman; Ghazi, Sarah Nauman; Sanmartin Berglund, Johan.

Comput Biol Med ; 171: 108126, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38342045

ABSTRACT

BACKGROUND: The most common degenerative condition in older adults is dementia, which can be predicted using a number of indicators and whose progression can be slowed down. One of the indicators of an increased risk of dementia is sleep disturbances. This study aims to examine whether machine learning can predict dementia and which sleep disturbance factors impact dementia. METHODS: This study uses five machine learning algorithms (gradient boosting, logistic regression, Gaussian naive Bayes, random forest, and support vector machine) and data on the older population (60+) in Sweden from the Swedish National Study on Ageing and Care - Blekinge (n=4175). Each algorithm uses 10-fold stratified cross-validation to obtain the results, which consist of the Brier score for checking accuracy and the feature importance for examining the factors which impact dementia. The algorithms use 16 features covering personal and sleep disturbance factors. RESULTS: Logistic regression found an association between dementia and sleep disturbances. However, the association is slight for the features in the study. Gradient boosting was the most accurate algorithm with 92.9% accuracy, 0.926 F1-score, 0.974 ROC AUC, and 0.056 Brier score. The significant factors were different in each machine learning algorithm. Whether the person sleeps more than two hours during the day, sex, education level, age, waking up during the night, and whether the person snores are the variables that most consistently have the highest feature importance across all algorithms. CONCLUSION: There is an association between sleep disturbances and dementia, which machine learning algorithms can predict. Furthermore, the risk factors for dementia are different across the algorithms, but sleep disturbances can predict dementia.

Subjects

Dementia, Machine Learning, Humans, Aged, Bayes Theorem, Algorithms, Support Vector Machine, Dementia/epidemiology
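Two evaluation ideas named in the abstract above, stratified 10-fold splitting and the Brier score, can be sketched in plain Python. This is an illustrative sketch only, not the study's pipeline: the fold-assignment strategy (round-robin within each class), the function names, and the toy labels and probabilities are assumptions.

```python
from collections import defaultdict


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    Lower is better; 0.0 means perfectly confident, correct predictions.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def stratified_folds(labels, k):
    """Assign sample indices to k folds, round-robin within each class,
    so every fold keeps roughly the overall class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Toy labels (1 = dementia, 0 = no dementia), invented for illustration.
labels = [0] * 80 + [1] * 20
folds = stratified_folds(labels, k=10)

# Toy predicted probabilities for four samples, also invented.
score = brier_score([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3])
```

With these toy labels, each of the 10 folds holds 8 negative and 2 positive samples, mirroring the 80/20 split of the full list. In practice a library implementation with shuffling would normally be preferred over this hand-rolled version.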

Breaking barriers: a statistical and machine learning-based hybrid system for predicting dementia.

Javeed, Ashir; Anderberg, Peter; Ghazi, Ahmad Nauman; Noor, Adeeb; Elmståhl, Sölve; Berglund, Johan Sanmartin.

Front Bioeng Biotechnol ; 11: 1336255, 2023.

Article in English | MEDLINE | ID: mdl-38260734

ABSTRACT

Introduction: Dementia is a condition (a collection of related signs and symptoms) that causes a continuing deterioration in cognitive function, and millions of people are impacted by dementia every year as the world population continues to rise. Conventional approaches for determining dementia rely primarily on clinical examinations, analyzing medical records, and administering cognitive and neuropsychological testing. However, these methods are time-consuming and costly in terms of treatment. Therefore, this study aims to present a noninvasive method for the early prediction of dementia so that preventive steps can be taken to avoid dementia. Methods: We developed a hybrid diagnostic system based on statistical and machine learning (ML) methods that used patient electronic health records to predict dementia. The dataset used for this study was obtained from the Swedish National Study on Aging and Care (SNAC), with a sample size of 43,040 and 75 features. The newly constructed diagnostic system extracts a subset of useful features from the dataset through a statistical method (F-score). For the classification, we developed an ensemble voting classifier based on five different ML models: decision tree (DT), naive Bayes (NB), logistic regression (LR), support vector machines (SVM), and random forest (RF). To address the problem of ML model overfitting, we used a cross-validation approach to evaluate the performance of the proposed diagnostic system. Various assessment measures, such as accuracy, sensitivity, specificity, receiver operating characteristic (ROC) curve, and Matthews correlation coefficient (MCC), were used to thoroughly validate the devised diagnostic system's efficiency. Results: According to the experimental results, the proposed diagnostic method achieved the best accuracy of 98.25%, as well as sensitivity of 97.44%, specificity of 95.744%, and MCC of 0.7535. Discussion: The effectiveness of the proposed diagnostic approach is compared to various cutting-edge feature selection techniques and baseline ML models. From the experimental results, it is evident that the proposed diagnostic system outperformed the prior feature selection strategies and baseline ML models regarding accuracy.
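The ensemble voting and the assessment measures named in the abstract above (accuracy, sensitivity, specificity, MCC) can be made concrete with a small plain-Python sketch. It is not the authors' system: the abstract does not say whether voting was hard or soft, so hard (majority) voting is assumed here, and the five prediction lists are invented stand-ins for the outputs of the five models.

```python
import math


def majority_vote(predictions_per_model):
    """Hard voting: each sample gets the label predicted by most models."""
    voted = []
    for sample_votes in zip(*predictions_per_model):
        voted.append(1 if sum(sample_votes) * 2 > len(sample_votes) else 0)
    return voted


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and MCC.

    Assumes both classes occur in y_true, as in the toy data below.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Invented true labels and per-model predictions for eight samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
model_predictions = [
    [1, 1, 0, 0, 0, 1, 0, 1],
    [1, 0, 1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0, 1],
    [1, 1, 1, 0, 0, 0, 0, 0],
]

ensemble = majority_vote(model_predictions)
results = metrics(y_true, ensemble)
```

MCC uses all four confusion-matrix cells, so it can stay modest even when accuracy is high on imbalanced data. That may be relevant when reading an accuracy near 98% next to an MCC near 0.75.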
