1.
Sci Rep; 14(1): 25723, 2024 Oct 28.
Article in English | MEDLINE | ID: mdl-39468113

ABSTRACT

Breast cancer (BC) is a major contributor to female mortality worldwide, particularly in young women with aggressive tumors. Despite the need for accurate prognosis in this demographic, existing studies primarily focus on broader age groups, often using the SEER database, which has limitations in variable selection. This study aimed to develop a machine learning (ML)-based model to predict survival outcomes in young BC patients using the BC public staging database. A total of 3,401 patients with BC were included in the study. Patients were categorized as younger (n = 1574) and older (n = 1827). We applied several survival models (Random Survival Forest, Gradient Boosting Survival, Extra Survival Trees [EST], and penalized Cox models [Lasso and ElasticNet]) to compare mortality characteristics. The EST model outperformed the others in predicting mortality for both age groups. Older patients had a higher prevalence of comorbidities than younger patients. Tumor stage was the primary variable used to train the model for mortality prediction in both groups. Chronic obstructive pulmonary disease (COPD) was a significant variable only in younger patients with BC. Other variables showed varying degrees of consistency between the groups. By predicting the risk of mortality, these findings can help identify high-risk young female patients with BC who require aggressive treatment.


Subjects
Breast Neoplasms; Databases, Factual; Neoplasm Staging; Humans; Breast Neoplasms/mortality; Breast Neoplasms/pathology; Female; Adult; Middle Aged; Prognosis; Aged; Age Factors; SEER Program; Proportional Hazards Models; Survival Analysis; Young Adult
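
The abstract above compares several survival models but does not say which metric was used to rank them. As an illustration only, here is a minimal sketch of Harrell's concordance index, a metric commonly used for this kind of comparison; its use here is an assumption, not something the record states. The function name `concordance_index` and the toy inputs are hypothetical, and tied survival times are simply skipped.

```python
def concordance_index(times, events, risks):
    """Harrell's C for right-censored data (simplified: tied times are skipped).

    times  -- observed follow-up time per patient
    events -- 1 if death was observed, 0 if the patient was censored
    risks  -- model risk score per patient (higher means earlier expected death)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier death received the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented for illustration): three patients, the last one censored.
toy_times = [1, 2, 3]
toy_events = [1, 1, 0]
perfect = concordance_index(toy_times, toy_events, [3, 2, 1])
reversed_order = concordance_index(toy_times, toy_events, [1, 2, 3])
uninformative = concordance_index(toy_times, toy_events, [1, 1, 1])
```

A value near 1 means the model ranks patients in the order they died, 0.5 is no better than chance, and values below 0.5 indicate an inverted ranking.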
2.
JMIR Med Inform; 11: e47859, 2023 Nov 24.
Article in English | MEDLINE | ID: mdl-37999942

ABSTRACT

BACKGROUND: Synthetic data generation (SDG) based on generative adversarial networks (GANs) is used in health care, but preserving logical relationships in synthetic tabular data (STD) remains challenging. Filtering methods for SDG can lead to the loss of important information.

OBJECTIVE: This study proposed a divide-and-conquer (DC) method to generate STD based on the GAN algorithm while preserving data with logical relationships.

METHODS: The proposed method was evaluated on data from the Korea Association for Lung Cancer Registry (KALC-R) and 2 benchmark data sets (breast cancer and diabetes). The DC-based SDG strategy comprises 3 steps: (1) we used 2 different partitioning methods (the class-specific criterion distinguished between survival and death groups, while the Cramer V criterion identified the highest correlation between columns in the original data); (2) the entire data set was divided into a number of subsets, which were then used as input for the conditional tabular generative adversarial network and the copula generative adversarial network to generate synthetic data; and (3) the generated synthetic data were consolidated into a single entity. For validation, we compared DC-based SDG and conditional sampling (CS)-based SDG through the performances of machine learning models. In addition, we generated imbalanced and balanced synthetic data for each of the 3 data sets and compared their performance using 4 classifiers: decision tree (DT), random forest (RF), Extreme Gradient Boosting (XGBoost), and light gradient-boosting machine (LGBM) models.

RESULTS: The synthetic data of the 3 diseases (non-small cell lung cancer [NSCLC], breast cancer, and diabetes) generated by our proposed model outperformed the 4 classifiers (DT, RF, XGBoost, and LGBM). The CS- versus DC-based model performances were compared using the mean area under the curve (SD) values: 74.87 (SD 0.77) versus 63.87 (SD 2.02) for NSCLC, 73.31 (SD 1.11) versus 67.96 (SD 2.15) for breast cancer, and 61.57 (SD 0.09) versus 60.08 (SD 0.17) for diabetes (DT); 85.61 (SD 0.29) versus 79.01 (SD 1.20) for NSCLC, 78.05 (SD 1.59) versus 73.48 (SD 4.73) for breast cancer, and 59.98 (SD 0.24) versus 58.55 (SD 0.17) for diabetes (RF); 85.20 (SD 0.82) versus 76.42 (SD 0.93) for NSCLC, 77.86 (SD 2.27) versus 68.32 (SD 2.37) for breast cancer, and 60.18 (SD 0.20) versus 58.98 (SD 0.29) for diabetes (XGBoost); and 85.14 (SD 0.77) versus 77.62 (SD 1.85) for NSCLC, 78.16 (SD 1.52) versus 70.02 (SD 2.17) for breast cancer, and 61.75 (SD 0.13) versus 61.12 (SD 0.23) for diabetes (LGBM). In addition, we found that balanced synthetic data performed better.

CONCLUSIONS: This study is the first attempt to generate and validate STD based on a DC approach and shows improved performance using STD. The necessity for balanced SDG was also demonstrated.
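
The partitioning step in the abstract above relies on Cramér's V to find the most strongly associated pair of columns. The abstract does not give the exact procedure, so the following is only a minimal sketch of the statistic itself (without bias correction) and of a pairwise search over columns. The helper names and the toy column names (`stage`, `surgery`, `sex`) are hypothetical and do not come from the study.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V between two categorical columns (no bias correction)."""
    if len(xs) != len(ys):
        raise ValueError("columns must have the same length")
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        return 0.0  # undefined for a constant column; treated as no association
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


def most_associated_pair(table):
    """Return (column_a, column_b, V) for the pair of columns with the largest V."""
    best = max(
        combinations(table, 2),
        key=lambda pair: cramers_v(table[pair[0]], table[pair[1]]),
    )
    return best[0], best[1], cramers_v(table[best[0]], table[best[1]])


# Toy table (invented for illustration): "stage" fully determines "surgery",
# while "sex" is independent of both.
toy_table = {
    "stage": ["I", "I", "II", "II"],
    "surgery": ["yes", "yes", "no", "no"],
    "sex": ["F", "M", "F", "M"],
}
pair = most_associated_pair(toy_table)
```

In a divide-and-conquer pipeline of the kind described, the rows could then be grouped by the values of one of the selected columns, each group passed to the generator separately, and the outputs concatenated; that part is omitted here because it depends on the GAN implementation.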

3.
Article in English | MEDLINE | ID: mdl-33572855

ABSTRACT

In this cross-sectional study, we investigated the baseline risk factors of diabetes mellitus (DM) in patients with undiagnosed DM (UDM). We used data from the Korean National Health and Nutrition Examination Survey (KNHANES) 2010-2017. Data on the participants' demographic characteristics, health status, health determinants, healthcare accessibility, and laboratory tests were gathered to explore the differences between the DM, UDM, and without-DM groups. Among the 64,759 individuals who participated in the KNHANES 2010-2017, 32,611 individuals aged ≥20 years with fasting plasma glucose levels of <100 or ≥126 mg/dL were selected. The odds ratios (ORs) for family history of diabetes and for undergoing national health and cancer screening tests were lower in the UDM group than in the DM group (adjusted OR: 0.54; 95% confidence interval (CI): 0.43, 0.66; adjusted OR: 0.74; 95% CI: 0.62, 0.89; and adjusted OR: 0.71; 95% CI: 0.60, 0.85, respectively). The ORs of hypertension and obesity were higher in the UDM group than in the DM group (adjusted OR: 1.32; 95% CI: 1.06, 1.64; and adjusted OR: 1.80; 95% CI: 1.37, 2.36, respectively). Patients with UDM were more likely to be exposed to DM-related risk factors than those with and without DM. Public health interventions to prevent UDM development are necessary.


Subjects
Diabetes Mellitus; Adult; Aged; Cross-Sectional Studies; Diabetes Mellitus/epidemiology; Humans; Nutrition Surveys; Prevalence; Republic of Korea/epidemiology; Risk Factors
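
The abstract above reports adjusted odds ratios with 95% confidence intervals, which would normally come from a multivariable regression model. The sketch below shows only the simpler unadjusted case: an odds ratio from a 2x2 table with a Wald interval on the log scale. It is meant to clarify how such figures are read, not to reproduce the study's analysis. The function name and the counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a -- exposed, in the group of interest      b -- exposed, in the comparison group
    c -- unexposed, in the group of interest    d -- unexposed, in the comparison group
    z -- normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

An interval that excludes 1 (as with the intervals quoted in the abstract) indicates an association that is statistically significant at the corresponding level.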
4.
Article in English | MEDLINE | ID: mdl-33266117

ABSTRACT

A screening model for estimating undiagnosed diabetes mellitus (UDM) is important for early medical care. There is minimal research on, and a serious lack of, screening models for people with a family history of diabetes (FHD), especially models that incorporate gender characteristics. Therefore, the primary objective of our study was to develop and validate a screening model for estimating UDM among people with FHD. We used data from the Korean National Health and Nutrition Examination Survey (KNHANES). KNHANES 2010-2016 data were used as the development cohort (n = 5939), and the model was then evaluated in a validation cohort (n = 1047) from KNHANES 2017. We developed screening models for UDM in men (SMM), women (SMF), and men and women combined (SMP) with FHD using backward stepwise logistic regression analysis. In the validation cohort, the SMM and SMF performed better (area under the curve (AUC) = 76.2% and 77.9%, respectively) than the SMP (AUC = 72.9%). Consequently, simple screening models for estimating UDM among people with FHD were developed and validated, which is expected to reduce the burden on the national health care system.


Subjects
Diabetes Mellitus, Type 2; Diabetes Mellitus; Area Under Curve; Diabetes Mellitus/diagnosis; Diabetes Mellitus/epidemiology; Female; Humans; Male; Mass Screening; Nutrition Surveys; Risk Factors
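
The abstract above compares screening models by their area under the ROC curve (AUC) in a validation cohort. As a minimal sketch of what that number measures, the function below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The function name and the toy labels and scores are hypothetical; the study's own computation is not described in the record.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels -- 1 for a case (e.g. undiagnosed diabetes), 0 otherwise
    scores -- model output per person (higher means more likely to be a case)
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0   # the case was ranked above the non-case
            elif p == q:
                wins += 0.5   # tied scores count as half
    return wins / (len(positives) * len(negatives))


# Toy validation data (invented for illustration).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

To mirror the sex-specific comparison in the abstract, the same function could be applied separately to the male and female subsets of a validation cohort and to the combined cohort.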