1.
Inform Health Soc Care ; 47(3): 243-257, 2022 Jul 03.
Article En | MEDLINE | ID: mdl-34672859

Type 2 diabetes is a chronic, costly disease and a serious global population health problem. Yet, the disease is manageable and preventable if there is an early warning. This study aims to apply supervised machine learning algorithms for developing predictive models for type 2 diabetes using administrative claim data. Following guidelines from the Elixhauser Comorbidity Index, 31 variables were considered. Five supervised machine learning algorithms were used for developing type 2 diabetes prediction models. Principal component analysis was applied to rank variables' importance in predictive models. Random forest (RF) showed the highest accuracy (85.06%) among the algorithms, closely followed by the k-nearest neighbor (84.48%). The analysis further revealed RF as a high-performing algorithm irrespective of data imbalance. As revealed by the principal component analysis, patient age is the most important predictor for type 2 diabetes, followed by a comorbid condition (i.e., solid tumor without metastasis). This study's finding of RF as the best-performing classifier is consistent with the promise of tree-based algorithms for public data in other works. Thus, the outcome can guide the design of automated surveillance of patients at risk of developing diabetes from administrative claim information and will be useful to health regulators and insurers.


Diabetes Mellitus, Type 2 , Machine Learning , Algorithms , Cluster Analysis , Diabetes Mellitus, Type 2/epidemiology , Humans
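
The accuracy figures reported in this abstract come from training a classifier on labelled records and scoring it on held-out records. As a rough illustration only, the sketch below implements a tiny k-nearest-neighbour classifier (one of the algorithm families the abstract names) and computes its accuracy. It is not the study's pipeline: the three binary variables, the toy records, the Hamming distance, and the function names are all hypothetical assumptions chosen to keep the example self-contained.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length flag vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=3):
    """Majority label among the k training rows closest to the query."""
    nearest = sorted(train, key=lambda row: hamming(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Share of held-out rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, features, k) == label for features, label in test)
    return hits / len(test)


# Hypothetical toy records (NOT study data):
# (age_over_50, hypertension, obesity) -> 1 if type 2 diabetes, else 0
TRAIN = [
    ((1, 1, 1), 1),
    ((1, 1, 0), 1),
    ((1, 0, 1), 1),
    ((0, 0, 0), 0),
    ((0, 1, 0), 0),
    ((0, 0, 1), 0),
]
TEST = [
    ((1, 1, 1), 1),
    ((0, 0, 0), 0),
    ((1, 0, 0), 0),  # an older patient without diabetes; the sketch misclassifies this row
    ((0, 1, 1), 0),
]

if __name__ == "__main__":
    print(f"accuracy = {accuracy(TRAIN, TEST):.2%}")
```

With real claim data, the same accuracy function would be applied to each candidate algorithm on the same held-out split so that the resulting percentages are comparable.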
2.
Article En | MEDLINE | ID: mdl-31478869

Disease prediction has the potential to benefit stakeholders such as the government and health insurance companies. It can identify patients at risk of disease or health conditions. Clinicians can then take appropriate measures to avoid or minimize the risk and in turn, improve quality of care and avoid potential hospital admissions. Due to the recent advancement of tools and techniques for data analytics, disease risk prediction can leverage large amounts of semantic information, such as demographics, clinical diagnosis and measurements, health behaviours, laboratory results, prescriptions and care utilisation. In this regard, electronic health data can be a potential choice for developing disease prediction models. A significant number of such disease prediction models have been proposed in the literature over time utilizing large-scale electronic health databases, different methods, and healthcare variables. The goal of this comprehensive literature review was to discuss different risk prediction models that have been proposed based on electronic health data. Search terms were designed to find relevant research articles that utilized electronic health data to predict disease risks. Online scholarly databases were searched to retrieve results, which were then reviewed and compared in terms of the method used, disease type, and prediction accuracy. This paper provides a comprehensive review of the use of electronic health data for risk prediction models. A comparison of the results from different techniques for three frequently modelled diseases using electronic health data was also discussed in this study. In addition, the advantages and disadvantages of different risk prediction models, as well as their performance, were presented. Electronic health data have been widely used for disease prediction. A few modelling approaches show very high accuracy in predicting different diseases using such data. These modelling approaches have been used to inform the clinical decision process to achieve better outcomes.


Disease Susceptibility , Electronic Health Records , Medical Informatics/methods , Models, Statistical , Data Mining , Humans , Machine Learning , Risk
3.
Article En | MEDLINE | ID: mdl-31963383

The prevalence of chronic disease comorbidity has increased worldwide. Comorbidity (i.e., the presence of multiple chronic diseases) is associated with adverse health outcomes in terms of mobility and quality of life as well as financial burden. Understanding the progression of comorbidities can provide valuable insights towards the prevention and better management of chronic diseases. Administrative data can be used in this regard as they contain semantic information on patients' health conditions. Most studies in this field are focused on understanding the progression of one chronic disease rather than multiple diseases. This study aims to understand the progression of two chronic diseases in the Australian health context. It specifically focuses on the comorbidity progression of cardiovascular disease (CVD) in patients with type 2 diabetes mellitus (T2DM), as the prevalence of these chronic diseases in Australians is high. A research framework is proposed to understand and represent the progression of CVD in patients with T2DM using graph theory and social network analysis techniques. Two study cohorts (i.e., patients with both T2DM and CVD and patients with only T2DM) were selected from an administrative dataset obtained from an Australian health insurance company. Two baseline disease networks were constructed from these two selected cohorts. A final disease network was then generated from the two baseline disease networks by weight adjustments in a normalized way. The prevalence of renal failure, fluid and electrolyte disorders, hypertension and obesity was significantly higher in patients with both CVD and T2DM than in patients with only T2DM. This showed that these chronic diseases occurred frequently during the progression of CVD in patients with T2DM. The proposed network-based model may potentially help the healthcare provider to understand high-risk diseases and the progression patterns between the recurrence of T2DM and CVD. Also, the framework could be useful for stakeholders including governments and private health insurers to adopt appropriate preventive health management programs for patients at a high risk of developing multiple chronic diseases.


Cardiovascular Diseases/complications , Diabetes Mellitus, Type 2/pathology , Australia/epidemiology , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/pathology , Cohort Studies , Comorbidity , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/epidemiology , Disease Progression , Female , Humans , Male , Middle Aged , Quality of Life
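
The abstract describes building one baseline disease network per cohort and then combining them "by weight adjustments in a normalized way", without giving the exact rule. The sketch below shows one plausible reading, stated as an assumption: edge weights count transitions between consecutive diagnoses, each network is divided by its cohort size, and the final network keeps the positive difference between the two. The diagnosis histories, function names, and the subtraction rule are hypothetical and are not taken from the paper.

```python
from collections import Counter


def baseline_network(cohort):
    """Count directed transitions between consecutive, distinct diagnoses."""
    edges = Counter()
    for history in cohort:
        for src, dst in zip(history, history[1:]):
            if src != dst:
                edges[(src, dst)] += 1
    return edges


def normalize(edges, cohort_size):
    """Scale raw counts by cohort size so cohorts of different sizes compare."""
    return {edge: count / cohort_size for edge, count in edges.items()}


def final_network(net_a, net_b):
    """ASSUMED adjustment rule: weight in cohort A minus weight in cohort B,
    keeping only edges that remain positive."""
    result = {}
    for edge, weight in net_a.items():
        adjusted = weight - net_b.get(edge, 0.0)
        if adjusted > 0:
            result[edge] = adjusted
    return result


# Hypothetical diagnosis histories (NOT study data), one list per patient.
T2DM_AND_CVD = [
    ["hypertension", "renal failure"],
    ["obesity", "hypertension", "renal failure"],
]
T2DM_ONLY = [
    ["obesity", "hypertension"],
    ["hypertension"],
]

net = final_network(
    normalize(baseline_network(T2DM_AND_CVD), len(T2DM_AND_CVD)),
    normalize(baseline_network(T2DM_ONLY), len(T2DM_ONLY)),
)

if __name__ == "__main__":
    for (src, dst), weight in net.items():
        print(f"{src} -> {dst}: {weight:.2f}")
```

Under this assumed rule, transitions that are equally common in both cohorts cancel out, leaving only those that are more frequent when CVD is also present.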
4.
BMC Med Inform Decis Mak ; 19(1): 281, 2019 12 21.
Article En | MEDLINE | ID: mdl-31864346

BACKGROUND: Supervised machine learning algorithms have been a dominant method in the data mining field. Disease prediction using health data has recently shown a potential application area for these methods. This study aims to identify the key trends among different types of supervised machine learning algorithms, and their performance and usage for disease risk prediction. METHODS: In this study, extensive research efforts were made to identify those studies that applied more than one supervised machine learning algorithm on single disease prediction. Two databases (i.e., Scopus and PubMed) were searched for different types of search items. Thus, we selected 48 articles in total for the comparison among variants of supervised machine learning algorithms for disease prediction. RESULTS: We found that the Support Vector Machine (SVM) algorithm is applied most frequently (in 29 studies) followed by the Naïve Bayes algorithm (in 23 studies). However, the Random Forest (RF) algorithm showed superior accuracy comparatively. Of the 17 studies where it was applied, RF showed the highest accuracy in 9 of them, i.e., 53%. This was followed by SVM which topped in 41% of the studies in which it was considered. CONCLUSION: This study provides a wide overview of the relative performance of different variants of supervised machine learning algorithms for disease prediction. This important information of relative performance can be used to aid researchers in the selection of an appropriate supervised machine learning algorithm for their studies.


Algorithms , Clinical Decision Rules , Machine Learning , Bayes Theorem , Data Mining , Humans , Risk Factors , Support Vector Machine
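
The percentages in the RESULTS can be checked with simple arithmetic. The sketch below recomputes the RF share from the counts given in the abstract (9 of 17). For SVM the abstract gives only the percentage (41%); assuming the denominator is the 29 studies in which SVM was applied, that corresponds to roughly 12 studies. That count is an inference for illustration, not a figure stated in the abstract.

```python
def share_percent(top_count, applied_count):
    """Percentage of studies in which an algorithm had the highest accuracy."""
    return round(100 * top_count / applied_count)


# Counts stated in the abstract: RF was best in 9 of the 17 studies that used it.
rf_share = share_percent(9, 17)

# ASSUMPTION: SVM's 41% refers to the 29 studies in which SVM was applied.
svm_top_estimate = round(0.41 * 29)
svm_share_check = share_percent(svm_top_estimate, 29)

if __name__ == "__main__":
    print(f"RF best in {rf_share}% of its studies")
    print(f"SVM best in about {svm_top_estimate} of 29 studies ({svm_share_check}%)")
```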
...