Results 1 - 16 of 16
1.
Geriatr Gerontol Int ; 2024 Jul 14.
Article in English | MEDLINE | ID: mdl-39004920

ABSTRACT

AIM: Chronic diseases are influential components of stroke, one of the dominant reasons for dementia and premature mortality. Environmental risks are risk factors for transitioning from stroke to dementia. This study addresses the transition behaviors in stroke and dementia development associated with chronic diseases and environmental risks. METHODS: This study is an integrated survey of medical and environmental informatics concerning stroke patients' quality of life. A total of 10 627 stroke patients diagnosed in Taiwan were surveyed in this study. A covariate model and subgroup analysis were used to evaluate the influence of chronic diseases and environmental risk factors (i.e., divorce rate, unemployment rate, solitariness rate, temperature, and air pollution rate) on stroke and the corresponding dementia transition behaviors. RESULTS: This study constructed a total of 98 covariate analysis models, consisting of 14 transition types [10 transitions from chronic diseases to stroke (5 metabolic risk states × 2 stroke states) and 4 transitions from stroke to dementia (2 stroke states × 2 dementia states)] by 7 covariates (i.e., sex, age, divorce rate, unemployment rate, temperature, air pollution, and solitariness rate). Among the 98 transitions, 26 were statistically significant. CONCLUSIONS: Sex, age, divorce rate, unemployment rate, temperature, and air pollution rate exerted a partially significant influence on the transition from chronic diseases to stroke. Sex, age, unemployment rate, and temperature partially influenced the transition from stroke to dementia. This study also considered high-risk sub-populations of stroke patients, particularly males aged 65 years and below. Geriatr Gerontol Int 2024; ••: ••-••.
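
The count of 98 models reported under RESULTS is a simple product, which the sketch below enumerates. The state labels are placeholders assumed for illustration; the abstract gives only the number of metabolic, stroke, and dementia states, not their names.

```python
from itertools import product

# Placeholder labels (assumptions): the abstract reports only how many states exist.
metabolic_states = [f"metabolic_{i}" for i in range(1, 6)]  # 5 metabolic risk states
stroke_states = ["stroke_A", "stroke_B"]                    # 2 stroke states
dementia_states = ["dementia_A", "dementia_B"]              # 2 dementia states
covariates = ["sex", "age", "divorce rate", "unemployment rate",
              "temperature", "air pollution", "solitariness rate"]

# 10 chronic-disease-to-stroke transitions plus 4 stroke-to-dementia transitions.
transitions = (list(product(metabolic_states, stroke_states))
               + list(product(stroke_states, dementia_states)))

# One covariate analysis model per (transition, covariate) pair.
models = list(product(transitions, covariates))

print(len(transitions), len(models))  # 14 98
```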

2.
Article in English | MEDLINE | ID: mdl-35886298

ABSTRACT

The lung cancer threat has become a critical issue for public health. Research has been devoted to its clinical study but only a few studies have addressed the issue from a holistic perspective that included social, economic, and environmental dimensions. Therefore, in this study, risk factors or features, such as air pollution, tobacco use, socioeconomic status, employment status, marital status, and environment, were comprehensively considered when constructing a predictive model. These risk factors were analyzed and selected using stepwise regression and the variance inflation factor to eliminate the possibility of multicollinearity. To build efficient and informative prediction models of lung cancer incidence rates, several machine learning algorithms with cross-validation were adopted, namely, linear regression, support vector regression, random forest, K-nearest neighbor, and cubist model tree. A case study in Taiwan showed that the cubist model tree with feature selection was the best model with an RMSE of 3.310 and an R-squared of 0.960. Through these predictive models, we also found that apart from smoking, the average NO2 concentration, employment percentage, and number of factories were also important factors that had significant impacts on the incidence of lung cancer. In addition, the random forest model without feature selection and with feature selection could support the interpretation of the most contributing variables. The predictive model proposed in the present study can help to precisely analyze and estimate lung cancer incidence rates so that effective preventative measures can be developed. Furthermore, the risk factors involved in the predictive model can help with the future analysis of lung cancer incidence rates from a holistic perspective.
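
The two scores quoted for the best model, RMSE and R-squared, can be computed as in the minimal sketch below. The function names and the toy incidence rates are assumptions made for illustration; they are not the study's code or data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up incidence rates for illustration only (not data from the study).
observed = [30.0, 42.0, 55.0, 61.0]
predicted = [32.0, 40.0, 54.0, 63.0]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

A lower RMSE and an R-squared closer to 1 both indicate predictions that track the observed rates more closely.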


Subjects
Air Pollution , Lung Neoplasms , Air Pollution/adverse effects , Air Pollution/analysis , Algorithms , Benchmarking , Humans , Incidence , Lung Neoplasms/epidemiology , Machine Learning
3.
Comput Biol Med ; 138: 104888, 2021 11.
Article in English | MEDLINE | ID: mdl-34610552

ABSTRACT

BACKGROUND: There is an increasing number of patients with a first primary cancer who are diagnosed with a second primary cancer, but prognosis methods to predict the survivability of a patient with multiple primary cancers have not been fully benchmarked. METHODS: This study investigated the five-year survivability prognosis performances of six machine learning approaches. These approaches are: artificial neural network, decision tree (DT), logistic regression, support vector machine, naïve Bayes (NB), and Bayesian network (BN). The synthetic minority over-sampling technique (SMOTE) was used to address the class imbalance problem, and a nationwide cancer patient database containing 7,845 subjects in Taiwan was used as a sample source. Ten primary and secondary cancers and their key variables affecting the survivability of the patients were identified. RESULTS: All the models using SMOTE improved sensitivity and specificity significantly. NB has the highest performance in terms of accuracy and specificity, whereas BN has the highest performance in terms of sensitivity. Further, the computational time and the power of knowledge representation of NB, BN, and DT outperformed the others. CONCLUSIONS: Selecting the appropriate prognosis models to predict survivability of patients with two contingent primary cancers can aid precise prediction and can support appropriate treatment advice.
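
The core idea of SMOTE, interpolating between a minority-class sample and one of its nearest minority-class neighbours, can be sketched as follows. This is a simplified illustration under assumed parameters (`k`, `seed`, toy data), not the implementation used in the study.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating minority samples
    toward one of their k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Toy two-feature minority class (made-up values, not study data).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_new=6)
print(len(new_points))  # 6
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays within the region the minority class already occupies.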


Subjects
Benchmarking , Neoplasms , Bayes Theorem , Humans , Logistic Models , Neural Networks, Computer , Support Vector Machine
4.
Comput Methods Programs Biomed ; 196: 105686, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32777652

ABSTRACT

BACKGROUND AND OBJECTIVE: Multiple primary cancers significantly threaten patient survivability. Predicting the survivability of patients with two cancers is challenging because its stochastic pattern relates to numerous variables. METHODS: In this study, a Bayesian network (BN) model was proposed to describe the occurrence of two primary cancers and predict the five-year survivability of patients using probabilistic evidence. Eleven types of major primary cancers and contingent occurrences of secondary cancers were investigated. A nationwide two-cancer database involving 7,845 patients in Taiwan was investigated. The BN topology was rigorously examined, and the imbalanced dataset was processed by the synthetic minority oversampling technique. The proposed BN survivability prognosis model was compared with benchmark approaches. RESULTS: The proposed model significantly outperformed the back-propagation neural network, logistic regression, support vector machine, and naïve Bayes in terms of sensitivity, which is a critical performance index for the non-survival group. CONCLUSIONS: Using the proposed BN model, one can estimate the posterior probabilities for every query, provided appropriate prior evidence is available. The potential survivability information of patients, treatment effects, and sociodemographic factor effects predicted by the proposed model can help in cancer treatment assessment and cancer development monitoring.


Subjects
Neoplasms , Neural Networks, Computer , Bayes Theorem , Humans , Logistic Models , Neoplasms/epidemiology , Prognosis , Taiwan/epidemiology
5.
Article in English | MEDLINE | ID: mdl-32188138

ABSTRACT

BACKGROUND: Most stroke cases lead to serious mental and physical disabilities, such as dementia and sensory impairment. Chronic diseases are contributory risk factors for stroke. However, few studies have considered the transition behaviors of stroke to dementia associated with chronic diseases and environmental risks. OBJECTIVE: This study aims to develop a prognosis model to address the issue of stroke transitioning to dementia associated with environmental risks. DESIGN: This cohort study used the data from the National Health Insurance Research Database in Taiwan. SETTING: Healthcare data were obtained from more than 25 million enrollees and covered over 99% of Taiwan's entire population. PARTICIPANTS: In this study, 10,627 stroke patients diagnosed from 2000 to 2010 in Taiwan were surveyed. METHODS: A Cox regression model and corresponding semi-Markov process were constructed to evaluate the influence of risk factors on stroke, corresponding dementia, and their transition behaviors. MAIN OUTCOME MEASURE: Relative risk and sojourn time were the main outcome measures. RESULTS: Multivariate analysis showed that certain environmental risks, medication, and rehabilitation factors highly influenced the transition of stroke from a chronic disease to dementia. This study also highlighted the high-risk populations of stroke patients against the environmental risk factors; males below 65 years of age were the most sensitive population. CONCLUSION: Experiments showed that the proposed semi-Markovian model outperformed other benchmark diagnosis algorithms (i.e., linear regression, decision tree, random forest, and support vector machine), with a high R² of 90%. The proposed model also facilitated an accurate prognosis of the transition time of stroke from chronic diseases to dementia against environmental risks and rehabilitation factors.


Subjects
Dementia , Environmental Pollutants , Stroke , Aged , Cohort Studies , Dementia/epidemiology , Environmental Pollutants/toxicity , Female , Humans , Male , Risk Factors , Stroke/epidemiology , Taiwan
6.
J Med Syst ; 44(3): 65, 2020 Feb 10.
Article in English | MEDLINE | ID: mdl-32040648

ABSTRACT

Lung cancer is a major cause of mortality. Estimating the survivability for this disease has become a key issue for families, hospitals, and countries. A conditional Gaussian Bayesian network model was presented in this study. This model considered 15 risk factors to predict the survivability of a lung cancer patient at 4 severity stages. We surveyed 1075 patients. The presented model was constructed by using the demographic, diagnosis-based, and prior-utilization variables. The proposed model for the survivability prognosis at the four different stages achieved R² values of 93.57%, 86.83%, 67.22%, and 52.94%, respectively. The model predicted the lung cancer survivability with high accuracy compared with the reported models. Our model also shows that it reached the ceiling of an ideal Bayesian network.


Subjects
Cancer Survivors/statistics & numerical data , Lung Neoplasms/mortality , Severity of Illness Index , Bayes Theorem , Databases, Factual/statistics & numerical data , Female , Humans , Male , Models, Biological , Prognosis , Survival Analysis
7.
Comput Biol Med ; 106: 97-105, 2019 03.
Article in English | MEDLINE | ID: mdl-30708222

ABSTRACT

Lung cancer is one of the leading causes of mortality, and its medical expenditure has increased dramatically. Estimating the expenditure for this disease has become an urgent concern of the supporting families, medical institutes, and government. In this study, a conditional Gaussian Bayesian network (CGBN) model was developed to incorporate the comprehensive risk factors to estimate the medical expenditure of a lung cancer patient at different stages. A total of 961 patients were surveyed by the four severity stages of lung cancer. The proposed CGBN model identified the correlation and association of 15 risk factors to the medical expenditure of different severity stages of lung cancer patients. The relationships among the demographic, diagnosis-based, and prior-utilization variables were constructed. The model predicted the lung cancer-related medical expenditure with high accuracy of 32.63%, 50.30%, 50.36%, and 66.58%, respectively for stages 1-4, as compared with the reported models. A greedy search was also applied to find the upper threshold of R², and our model was shown to approach this upper threshold.


Subjects
Health Expenditures , Lung Neoplasms/economics , Models, Economic , Aged , Bayes Theorem , Female , Humans , Lung Neoplasms/diagnosis , Lung Neoplasms/therapy , Male , Middle Aged , Neoplasm Staging , Retrospective Studies
8.
Biomed Res Int ; 2018: 1252897, 2018.
Article in English | MEDLINE | ID: mdl-30519567

ABSTRACT

The effect of comorbidity on lung cancer patients' survival has been widely reported. The aim of this study was to investigate the effects of comorbidity on the establishment of the diagnosis of lung cancer and survival in lung cancer patients in Taiwan by using a nationwide population-based study design. This study collected data on patients with various comorbidities and analyzed data regarding the lung cancer diagnosis and survival during a 16-year follow-up period (1995-2010). In total, 101,776 lung cancer patients were included, comprising 44,770 with and 57,006 without comorbidity. Kaplan-Meier analyses were used to compare overall survival between lung cancer patients with and without comorbidity. In our cohort, chronic bronchitis patients who developed lung cancer had the lowest overall survival at one (45%), five (28.6%), and ten years (26.2%) after lung cancer diagnosis. Among lung cancer patients with nonpulmonary comorbidities, patients with hypertension had the lowest overall survival at one (47.9%), five (30.5%), and ten (28.2%) years after lung cancer diagnosis. In 2010, patients with and without comorbidity had 14.86 and 9.31 clinical visits, respectively. Lung cancer patients with preexisting comorbidity had a higher frequency of physician visits. The presence of comorbid conditions was associated with early diagnosis of lung cancer.
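
The Kaplan-Meier (product-limit) estimate behind the overall survival figures can be sketched as below. The function and the toy follow-up data are illustrative assumptions, not the study's code or data.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    durations: follow-up time for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    """
    survival = 1.0
    curve = []
    for t in sorted({d for d, e in zip(durations, events) if e == 1}):
        # Patients still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Made-up follow-up times in years (not data from the study).
durations = [1, 2, 2, 3, 5, 5]
events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(durations, events):
    print(t, round(s, 3))
```

Censored patients count toward the at-risk group until they leave follow-up, which is what distinguishes this estimate from a simple proportion of survivors.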


Subjects
Lung Diseases/diagnosis , Lung Diseases/mortality , Lung Neoplasms/diagnosis , Lung Neoplasms/mortality , Adult , Aged , Cohort Studies , Comorbidity , Disease-Free Survival , Female , Humans , Kaplan-Meier Estimate , Lung Diseases/complications , Lung Diseases/pathology , Lung Neoplasms/complications , Lung Neoplasms/pathology , Male , Middle Aged , Risk Assessment , Risk Factors , Taiwan/epidemiology
9.
J Thorac Dis ; 8(Suppl 3): S272-8, 2016 Mar.
Article in English | MEDLINE | ID: mdl-27014474

ABSTRACT

BACKGROUND: A comparison of the degree of postoperative pain associated with different thoracoscopic surgical techniques for spontaneous pneumothorax has never been reported. In this study, we compared perioperative outcomes and degrees of postoperative pain associated with single-incision subxiphoid thoracoscopic surgery, single-incision transthoracic thoracoscopic surgery, and three-incision transthoracic thoracoscopic surgery for spontaneous pneumothorax. METHODS: During the period August 2013 to September 2015, fifty-seven consecutive patients with spontaneous pneumothorax were treated via single-incision subxiphoid thoracoscopic surgery, single-incision transthoracic thoracoscopic surgery, or three-incision transthoracic thoracoscopic surgery. Demographic data, operative time, operative blood loss, length of hospital stay, duration of chest tube drainage, postoperative complications, and numeric pain rating scale scores were collected from the medical records for analysis. RESULTS: Among the 57 patients, 14 received single-incision subxiphoid thoracoscopic surgery, 26 underwent single-incision transthoracic surgery, and 17 received three-incision thoracoscopic surgery. In all patients, surgeries were completed without the need for conversion to open surgery. Patients who underwent the single-incision subxiphoid procedure had significantly lower 1-, 8-, 24- and 32-hour postoperative pain scale scores than patients who underwent the other two procedures. The average and maximum pain scale scores during the first 24 hours were lowest in the single-incision subxiphoid group (P<0.0001). CONCLUSIONS: Single-incision subxiphoid thoracoscopic surgery is associated with significantly lower postoperative pain intensity than transthoracic approaches and therefore may provide an alternative surgical technique for patients with spontaneous pneumothorax.

10.
J Med Syst ; 40(1): 35, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26573656

ABSTRACT

Brain metastases are commonly found in patients who are diagnosed with a primary malignancy of the lung. Lung cancer patients with brain metastasis tend to have poor survivability, with a median of less than 6 months. Therefore, an early and effective detection system for this disease is needed to help prolong the patients' survivability and improve their quality of life. A modified electromagnetism-like mechanism (EM) algorithm, MEM-SVM, is proposed by combining the EM algorithm with a support vector machine (SVM) as the classifier and the opposite sign test (OST) as the local search technique. The proposed method is applied to 44 UCI and IDA datasets and 5 cancer microarray datasets as a preliminary experiment. In addition, this method is tested on 4 public lung cancer microarray datasets. Further, we tested our method on a nationwide dataset of brain metastasis from lung cancer (BMLC) in Taiwan. Since real medical datasets tend to be highly imbalanced, the synthetic minority over-sampling technique (SMOTE) is utilized to handle this problem. The proposed method is compared against 8 other popular benchmark classifiers and feature selection methods. The performance evaluation is based on the accuracy and Kappa index. For the 44 UCI and IDA datasets and 5 cancer microarray datasets, a non-parametric statistical test confirmed that MEM-SVM outperformed the other methods. For the 4 public lung cancer microarray datasets, MEM-SVM still achieved the highest mean value for accuracy and Kappa index. Due to the imbalanced property of the real BMLC dataset, all methods achieve good accuracy without significant differences among the methods. However, on the balanced BMLC dataset, MEM-SVM appears to be the best method with higher accuracy and Kappa index. We successfully developed MEM-SVM to predict the occurrence of brain metastasis from lung cancer, combined with the SMOTE technique to handle the class imbalance. The results confirmed that MEM-SVM has good diagnostic power and can be applied as an alternative diagnostic tool together with other medical tests for the early detection of brain metastasis from lung cancer.


Subjects
Algorithms , Brain Neoplasms/diagnosis , Brain Neoplasms/secondary , Lung Neoplasms/pathology , Support Vector Machine , Aged , Female , Humans , Male , Middle Aged , Taiwan
11.
Comput Methods Programs Biomed ; 119(2): 63-76, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25823851

ABSTRACT

Classifying imbalanced data in medical informatics is challenging. Motivated by this issue, this study develops a classifier approach denoted as BSMAIRS. This approach combines borderline synthetic minority oversampling technique (BSM) and artificial immune recognition system (AIRS) as global optimization searcher with the nearest neighbor algorithm used as a local classifier. Eight electronic medical datasets collected from University of California, Irvine (UCI) machine learning repository were used to evaluate the effectiveness and to justify the performance of the proposed BSMAIRS. Comparisons with several well-known classifiers were conducted based on accuracy, sensitivity, specificity, and G-mean. Statistical results concluded that BSMAIRS can be used as an efficient method to handle imbalanced class problems. To further confirm its performance, BSMAIRS was applied to real imbalanced medical data of lung cancer metastasis to the brain that were collected from National Health Insurance Research Database, Taiwan. This application can function as a supplementary tool for doctors in the early diagnosis of brain metastasis from lung cancer.
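
The four comparison measures named in this abstract (accuracy, sensitivity, specificity, and G-mean) can be computed from a confusion matrix as in the sketch below. The function name and the toy labels are assumptions for illustration, with class 1 taken as the minority (positive) class.

```python
import math

def imbalance_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity, and G-mean for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)  # recall on the positive (minority) class
    specificity = tn / (tn + fp)  # recall on the negative (majority) class
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": math.sqrt(sensitivity * specificity),
    }

# Toy labels: 2 positives among 10 cases (made-up, not study data).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(imbalance_metrics(y_true, y_pred))
```

On imbalanced data, accuracy can stay high while sensitivity is poor; the G-mean penalizes that case because it is the geometric mean of the two per-class recalls.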


Subjects
Algorithms , Brain Neoplasms/secondary , Lung Neoplasms/pathology , Humans , Taiwan
12.
Comput Methods Programs Biomed ; 119(3): 142-62, 2015 May.
Article in English | MEDLINE | ID: mdl-25804445

ABSTRACT

The prediction of substantially short survivability in patients is extremely risky. In this study, we proposed a probabilistic model using Bayesian network (BN) to predict the short survivability of patients with brain metastasis from lung cancer. A nationwide cancer patient database from 1996 to 2010 in Taiwan was used. The cohort consisted of 438 patients with brain metastasis from lung cancer. We utilized synthetic minority over-sampling technique (SMOTE) to solve the imbalanced property embedded in the problem. The proposed BN was compared with three competitive models, namely, naive Bayes (NB), logistic regression (LR), and support vector machine (SVM). Statistical analysis showed that performances of BN, LR, NB, and SVM were statistically the same in terms of all indices with low sensitivity when these models were applied on an imbalanced data set. Results also showed that SMOTE can improve the performance of the four models in terms of sensitivity, while keeping high accuracy and specificity. Further, the proposed BN is more effective as compared with NB, LR, and SVM from two perspectives: the transparency and ability to show the relation of factors affecting brain metastasis from lung cancer; it allows decision makers to find the probability despite incomplete evidence and information; and the sensitivity of the proposed BN is the highest among all standard machine learning methods.


Subjects
Brain Neoplasms/secondary , Lung Neoplasms , Models, Statistical , Aged , Bayes Theorem , Brain Neoplasms/mortality , Databases, Factual/statistics & numerical data , Female , Humans , Logistic Models , Male , Middle Aged , Models, Biological , Prognosis , Support Vector Machine , Survival Analysis , Taiwan/epidemiology
13.
J Biomed Inform ; 54: 220-9, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25677947

ABSTRACT

Recently, the use of artificial intelligence-based data mining techniques for massive medical data classification and diagnosis has gained popularity, whereas the effectiveness and efficiency of feature selection are worthy of further investigation. In this paper, we present a novel method for feature selection that uses the opposite sign test (OST) as a local search for the electromagnetism-like mechanism (EM) algorithm, denoted as the improved electromagnetism-like mechanism (IEM) algorithm. The nearest neighbor algorithm serves as the classifier for the wrapper method. The proposed IEM algorithm is compared with nine popular feature selection and classification methods. Forty-six datasets from the UCI repository and eight gene expression microarray datasets are collected for comprehensive evaluation. Non-parametric statistical tests are conducted to justify the performance of the methods in terms of classification accuracy and Kappa index. The results confirm that the proposed IEM method is superior to the common state-of-the-art methods. Furthermore, we apply IEM to predict the occurrence of Type 2 diabetes mellitus (DM) after a gestational DM. Our research helps identify the risk factors for this disease; accordingly, accurate diagnosis and prognosis can be achieved to reduce the morbidity and mortality rate caused by DM.
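
Assuming the Kappa index mentioned here is Cohen's kappa (the abstract does not spell this out), it corrects raw classification accuracy for agreement expected by chance. A minimal sketch with made-up labels:

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(y_true)
    observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal class frequencies of both labelings.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Toy labels for illustration only (not data from the study).
y_true = ["a", "a", "a", "b", "b", "b"]
y_pred = ["a", "a", "b", "b", "b", "a"]
print(cohen_kappa(y_true, y_pred))
```

A kappa of 1 means perfect agreement, while 0 means the classifier agrees with the true labels no more often than chance would.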


Subjects
Algorithms , Data Mining/methods , Diabetes Mellitus, Type 2/diagnosis , Diagnosis, Computer-Assisted/methods , Databases, Factual , Electromagnetic Fields , Humans , Models, Theoretical , Pattern Recognition, Automated , Risk Factors
14.
Comput Biol Med ; 47: 147-60, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24607682

ABSTRACT

The Bayesian network (BN) is a promising method for modeling cancer metastasis under uncertainty. BN is graphically represented using bioinformatics variables and can be used to support an informative medical decision/observation by using probabilistic reasoning. In this study, we propose such a BN to describe and predict the occurrence of brain metastasis from lung cancer. A nationwide database containing more than 50,000 cases of cancer patients from 1996 to 2010 in Taiwan was used in this study. The BN topology for studying brain metastasis from lung cancer was rigorously examined by domain experts/doctors. We used three statistical measures, namely, the accuracy, sensitivity, and specificity, to evaluate the performances of the proposed BN model and to compare it with three competitive approaches, namely, naive Bayes (NB), logistic regression (LR) and support vector machine (SVM). Experimental results show that no significant differences are observed in accuracy or specificity among the four models, while the proposed BN outperforms the others in terms of sampled average sensitivity. Moreover, the proposed BN has advantages compared with the other approaches in interpreting how brain metastasis develops from lung cancer. It is shown to be easily understood by physicians, efficient in modeling non-linear situations, and capable of solving stochastic medical problems and handling situations wherein information is missing in the context of the occurrence of brain metastasis from lung cancer.


Subjects
Bayes Theorem , Brain Neoplasms/secondary , Computational Biology/methods , Lung Neoplasms/pathology , Aged , Algorithms , Brain Neoplasms/epidemiology , Female , Humans , Lung Neoplasms/epidemiology , Male , Middle Aged , Models, Statistical , Sensitivity and Specificity , Taiwan/epidemiology
15.
BMC Bioinformatics ; 15: 49, 2014 Feb 20.
Article in English | MEDLINE | ID: mdl-24555567

ABSTRACT

BACKGROUND: In the application of microarray data, how to select a small number of informative genes from thousands of genes that may contribute to the occurrence of cancers is an important issue. Many researchers use various computational intelligence methods to analyze gene expression data. RESULTS: To achieve efficient gene selection from thousands of candidate genes that can contribute to identifying cancers, this study aims at developing a novel method utilizing particle swarm optimization combined with a decision tree as the classifier. This study also compares the performance of our proposed method with other well-known benchmark classification methods (support vector machine, self-organizing map, back propagation neural network, C4.5 decision tree, Naive Bayes, CART decision tree, and artificial immune recognition system) and conducts experiments on 11 gene expression cancer datasets. CONCLUSION: Based on statistical analysis, our proposed method outperforms other popular classifiers for all test datasets, and is comparable to SVM for certain specific datasets. Further, the housekeeping genes with various expression patterns and tissue-specific genes are identified. These genes provide a high discrimination power on cancer classification.


Subjects
Algorithms , Computational Biology/methods , Decision Trees , Gene Expression Profiling/methods , Neoplasms/genetics , Artificial Intelligence , Bayes Theorem , Databases, Factual , Female , Humans , Male , Neoplasms/classification , Neoplasms/metabolism , Reproducibility of Results , Support Vector Machine
16.
BMC Med Inform Decis Mak ; 13: 124, 2013 Nov 09.
Article in English | MEDLINE | ID: mdl-24207108

ABSTRACT

BACKGROUND: Breast cancer is one of the most critical cancers and is a major cause of cancer death among women. It is essential to know the survivability of the patients in order to ease the decision-making process regarding medical treatment and financial preparation. The available breast cancer data sets are imbalanced (i.e., the number of surviving patients exceeds the number of non-surviving patients), whereas standard classifiers are not applicable to imbalanced data sets. Methods to improve the survivability prognosis of breast cancer need to be studied. METHODS: Two well-known five-year prognosis models/classifiers [i.e., logistic regression (LR) and decision tree (DT)] are constructed by combining synthetic minority over-sampling technique (SMOTE), cost-sensitive classifier technique (CSC), under-sampling, bagging, and boosting. The feature selection method is used to select relevant variables, while the pruning technique is applied to obtain low information-burden models. These methods are applied on data obtained from the Surveillance, Epidemiology, and End Results database. The improvements of survivability prognosis of breast cancer are investigated based on the experimental results. RESULTS: Experimental results confirm that the DT and LR models combined with SMOTE, CSC, and under-sampling generate higher predictive performance consecutively than the original ones. Most of the time, DT and LR models combined with SMOTE and CSC carry a lower information burden (fewer features) when a feature selection method and a pruning technique are applied. CONCLUSIONS: LR is found to have better statistical power than DT in predicting five-year survivability. CSC is superior to SMOTE, under-sampling, bagging, and boosting in improving the prognostic performance of DT and LR.


Subjects
Breast Neoplasms/mortality , Models, Statistical , Prognosis , Adult , Classification/methods , Decision Trees , Disease-Free Survival , Female , Humans , Logistic Models