ABSTRACT
This paper provides a comprehensive overview of the Cancer Public Library Database (CPLD), established under the Korean Clinical Data Utilization for Research Excellence project (K-CURE). The CPLD links data from four major population-based public sources: the Korea National Cancer Incidence Database in the Korea Central Cancer Registry, cause-of-death data in Statistics Korea, the National Health Information Database in the National Health Insurance Service, and the National Health Insurance Research Database in the Health Insurance Review & Assessment Service. These databases are linked using an encrypted resident registration number. The CPLD, established in 2022 and updated annually, comprises 1,983,499 men and women newly diagnosed with cancer between 2012 and 2019. It contains data on cancer registration and death, demographics, medical claims, general health checkups, and national cancer screening. The most common cancers among men in the CPLD were stomach (16.1%), lung (14.0%), colorectal (13.3%), prostate (9.6%), and liver (9.3%) cancers. The most common cancers among women were thyroid (20.4%), breast (16.6%), colorectal (9.0%), stomach (7.8%), and lung (6.2%) cancers. Among them, 571,285 died between 2012 and 2020 owing to cancer (89.2%) or other causes (10.8%). Upon approval, the CPLD is accessible to researchers through the K-CURE portal. The CPLD is a unique resource for diverse cancer research to investigate medical use before a cancer diagnosis, during initial diagnosis and treatment, and long-term follow-up. This offers expanded insight into healthcare delivery across the cancer continuum, from screening to end-of-life care.
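The linkage step described above can be sketched as a join on a pseudonymized key. This is a minimal illustration, not the CPLD's actual procedure: the abstract says only that an encrypted resident registration number is used, so the keyed hash (HMAC-SHA-256), the secret key, the identifiers, and all field names below are assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would be held only by the linking agency.
SECRET_KEY = b"demo-key-not-a-real-secret"


def pseudonymize(identifier: str) -> str:
    """Return a keyed hash so raw identifiers never appear in the linked data."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def link(cohort: dict, *others: dict) -> dict:
    """Attach records from other sources to each cohort member (left join on the key)."""
    linked = {}
    for key, record in cohort.items():
        merged = dict(record)
        for source in others:
            merged.update(source.get(key, {}))
        linked[key] = merged
    return linked


# Toy records standing in for registry, cause-of-death, and claims data.
registry = {
    pseudonymize("id-001"): {"cancer_site": "stomach", "dx_year": 2015},
    pseudonymize("id-002"): {"cancer_site": "thyroid", "dx_year": 2018},
}
deaths = {pseudonymize("id-001"): {"death_year": 2019}}
claims = {
    pseudonymize("id-001"): {"n_claims": 42},
    pseudonymize("id-002"): {"n_claims": 7},
}

linked = link(registry, deaths, claims)
```

The cancer registry is treated as the cohort here, so every registered patient is kept and fields such as `death_year` are simply absent for patients with no matching death record.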
Subject(s)
Databases, Factual; Neoplasms; Registries; Humans; Republic of Korea/epidemiology; Neoplasms/epidemiology; Male; Female; Incidence
ABSTRACT
BACKGROUND: Venous thromboembolism (VTE) is a hospital-associated severe complication that may adversely affect patient prognosis. In this study, we evaluated the incidence of VTE and its risk factors in patients with epithelial ovarian cancer (EOC). METHODS: We retrospectively analyzed the electronic health record data of 1268 patients with EOC who received primary treatment at the National Cancer Center, Korea between January 2007 and December 2017 to identify patients who developed VTE. Demographic, clinical, and surgical characteristics of these patients were ascertained. Competing risks analyses were performed to estimate the cumulative incidence of VTE according to the treatment type. The associations between putative risk factors and the incidence of VTE were evaluated using Fine-Gray regression models accounting for the competing risk of death. RESULTS: VTE was the most prevalent cardiovascular event, found in 9.6% (n = 122) of all patients. Of these VTE events, 115 (94.3%) occurred within 2 years of EOC diagnosis. Advanced cancer stage at diagnosis (distant vs. localized: hazard ratio [HR] = 14.49, p = 0.015) and extended hospital stay (≥15 days: HR = 3.87, p = 0.004) were associated with the incidence of VTE. There was no significant difference in the cumulative incidence of VTE between primary cytoreductive surgery followed by adjuvant chemotherapy and neoadjuvant chemotherapy followed by interval cytoreductive surgery (HR = 0.81, p = 0.390). CONCLUSIONS: Approximately 10% of patients with EOC were diagnosed with VTE, which was the most common cardiovascular disease found in this study. The assessment of VTE risks in patients with advanced-stage EOC with an extended hospital stay is needed to facilitate adequate prophylactic treatment.
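To illustrate what a cumulative incidence estimate under competing risks involves, the sketch below implements the standard nonparametric (Aalen-Johansen type) estimator in plain Python. It is not the authors' analysis code and it is not the Fine-Gray regression itself; the event coding (0 = censored, 1 = VTE, 2 = death without VTE) and the toy follow-up data are assumptions made for the example.

```python
def cumulative_incidence(times, events, cause=1):
    """Estimate the cumulative incidence of `cause` when other events compete.

    events: 0 = censored, 1 = event of interest, 2 = competing event.
    Returns a list of (time, cumulative incidence) at each time with an event.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0  # probability of being free of ANY event just before the current time
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = d_any = n_here = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            event = data[i][1]
            n_here += 1
            if event != 0:
                d_any += 1
            if event == cause:
                d_cause += 1
            i += 1
        if d_any:
            # Only patients still event-free can experience the event of interest.
            cif += surv * d_cause / n_at_risk
            surv *= 1 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= n_here
    return curve


# Toy follow-up times (in months) for five hypothetical patients.
times = [1, 2, 3, 4, 5]
events = [1, 2, 0, 1, 0]
curve = cumulative_incidence(times, events)
```

Unlike one minus a Kaplan-Meier estimate that treats death as censoring, this estimator does not count patients who died as still being at risk of VTE, which is why it is preferred when death is a competing event.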
Subject(s)
Carcinoma, Ovarian Epithelial/complications; Ovarian Neoplasms/complications; Venous Thromboembolism/epidemiology; Aged; Carcinoma, Ovarian Epithelial/drug therapy; Carcinoma, Ovarian Epithelial/pathology; Carcinoma, Ovarian Epithelial/surgery; Cardiovascular Diseases/epidemiology; Cardiovascular Diseases/etiology; Chemotherapy, Adjuvant/adverse effects; Cytoreduction Surgical Procedures/adverse effects; Female; Humans; Incidence; Length of Stay; Middle Aged; Neoadjuvant Therapy/adverse effects; Ovarian Neoplasms/drug therapy; Ovarian Neoplasms/pathology; Ovarian Neoplasms/surgery; Postoperative Complications/epidemiology; Postoperative Complications/etiology; Republic of Korea/epidemiology; Retrospective Studies; Risk Factors; Venous Thromboembolism/etiology
ABSTRACT
BACKGROUND: Postoperative length of stay is a key indicator in the management of medical resources and an indirect predictor of the incidence of surgical complications and the degree of recovery of the patient after cancer surgery. Recently, machine learning has been used to predict complex medical outcomes, such as prolonged length of hospital stay, using extensive medical information. OBJECTIVE: The objective of this study was to develop a prediction model for prolonged length of stay after cancer surgery using a machine learning approach. METHODS: In our retrospective study, electronic health records (EHRs) from 42,751 patients who underwent primary surgery for 17 types of cancer between January 1, 2000, and December 31, 2017, were sourced from a single cancer center. The EHRs included numerous variables such as surgical factors, cancer factors, underlying diseases, functional laboratory assessments, general assessments, medications, and social factors. To predict prolonged length of stay after cancer surgery, we employed an extreme gradient boosting classifier, a multilayer perceptron, and logistic regression models. Prolonged postoperative length of stay for cancer was defined as bed-days of the group of patients who accounted for the top 50% of the distribution of bed-days by cancer type. RESULTS: In the prediction of prolonged length of stay after cancer surgery, extreme gradient boosting classifier models demonstrated excellent performance for kidney and bladder cancer surgeries (area under the receiver operating characteristic curve [AUC] >0.85). A moderate performance (AUC 0.70-0.85) was observed for stomach, breast, colon, thyroid, prostate, cervix uteri, corpus uteri, and oral cancers. For stomach, breast, colon, thyroid, and lung cancers, with more than 4000 cases each, the extreme gradient boosting classifier model showed slightly better performance than the logistic regression model, although the logistic regression model also performed adequately. We identified risk variables for the prediction of prolonged postoperative length of stay for each type of cancer, and the importance of the variables differed depending on the cancer type. After we added operative time to the models trained on preoperative factors, the models generally outperformed the corresponding models using only preoperative variables. CONCLUSIONS: A machine learning approach using EHRs may improve the prediction of prolonged length of hospital stay after primary cancer surgery. This algorithm may help to provide a more effective allocation of medical resources in cancer surgery.
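The outcome definition and the evaluation metric above can be sketched in a few lines. This is a hedged illustration rather than the study's pipeline: it assumes that "top 50% of the distribution of bed-days by cancer type" means bed-days strictly above the per-type median, and the records and model scores below are invented toy values.

```python
from collections import defaultdict
from statistics import median


def label_prolonged(records):
    """Label each (cancer_type, bed_days) record 1 if above its type's median, else 0."""
    by_type = defaultdict(list)
    for cancer_type, bed_days in records:
        by_type[cancer_type].append(bed_days)
    cutoffs = {cancer_type: median(days) for cancer_type, days in by_type.items()}
    return [int(bed_days > cutoffs[cancer_type]) for cancer_type, bed_days in records]


def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: the cutoff differs by cancer type, so 9 bed-days is "prolonged"
# after thyroid surgery but would not be after kidney surgery.
records = [
    ("kidney", 5), ("kidney", 7), ("kidney", 12), ("kidney", 20),
    ("thyroid", 2), ("thyroid", 3), ("thyroid", 4), ("thyroid", 9),
]
labels = label_prolonged(records)

# Hypothetical predicted probabilities from some classifier.
scores = [0.2, 0.6, 0.7, 0.9, 0.1, 0.3, 0.4, 0.8]
model_auc = auc(labels, scores)
```

Computing the cutoff within each cancer type keeps the classes roughly balanced for every surgery type, which makes AUC values comparable across cancers with very different typical lengths of stay.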
ABSTRACT
BACKGROUND: The analytical capacity and speed of next-generation sequencing (NGS) technology have improved. Many genetic variants associated with various diseases have been discovered using NGS. Applying NGS to clinical practice therefore enables precision or personalized medicine. However, as clinical sequencing reports in electronic health records (EHRs) are not structured according to recommended standards, clinical decision support systems have not been fully utilized. In addition, integrating genomic data with clinical data for translational research remains a great challenge. OBJECTIVE: To apply international standards to clinical sequencing reports and to develop a clinical research information system to integrate standardized genomic data with clinical data. METHODS: We applied the recently published ISO/TS 20428 standard to 367 clinical sequencing reports generated by panel (91 genes) sequencing in EHRs and implemented a clinical NGS research system by extending the clinical data warehouse to integrate the necessary clinical data for each patient. We also developed a user interface with a clinical research portal and an NGS result viewer. RESULTS: A single clinical sequencing report with 28 items was restructured into four database tables and 49 entities. As a result, 367 patients' clinical sequencing data were connected with clinical data in EHRs, such as diagnosis, surgery, and death information. This system can support the development of cohort or case-control datasets as well. CONCLUSIONS: The standardized clinical sequencing data are not only useful for clinical practice but could also be applied to translational research.
ABSTRACT
We applied the inverse problem approach to locate a known source in a uniformly distributed sensor network from simultaneous received signal strength indicator (RSSI) measurements between sensors and sources. We also proposed a new sensing model to calculate the RSSI between sensors and a specific source, carefully considering the orientation vector of the source. We detected the original source by solving a linear inverse problem using the RSSI calculated at the target source from the improved sensing model. Finally, we simulated the proposed sensing model to verify its ability to detect the original source. The calculated results remained largely stable under changes in the initial source. Moreover, the norm of the detected source was significantly larger than the norm of any other source.
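The idea of detecting a source through a linear inverse problem can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's sensing model: the abstract does not give the model, so the isotropic gain 1 / (1 + d^2), the 3 x 3 sensor grid, and the four candidate source positions are all hypothetical, and the orientation vector of the source is ignored.

```python
def gain(sensor, source):
    """Hypothetical path-loss model: strength falls off as 1 / (1 + distance^2)."""
    dx, dy = sensor[0] - source[0], sensor[1] - source[1]
    return 1.0 / (1.0 + dx * dx + dy * dy)


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def least_squares(a, b):
    """Minimize ||a x - b|| through the normal equations (a^T a) x = a^T b."""
    cols = len(a[0])
    ata = [[sum(row[i] * row[j] for row in a) for j in range(cols)] for i in range(cols)]
    atb = [sum(row[i] * b_i for row, b_i in zip(a, b)) for i in range(cols)]
    return solve(ata, atb)


# Uniform 3 x 3 sensor grid and four candidate source positions between sensors.
sensors = [(float(i), float(j)) for i in range(3) for j in range(3)]
candidates = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]

# Forward model: row i holds the RSSI at sensor i from a unit source at each candidate.
A = [[gain(s, c) for c in candidates] for s in sensors]

# Simulated noise-free measurement with the true source at candidate index 2.
true_strengths = [0.0, 0.0, 1.0, 0.0]
rssi = [sum(a_ij * x_j for a_ij, x_j in zip(row, true_strengths)) for row in A]

# Inverse problem: recover the strengths, then pick the largest in magnitude.
estimate = least_squares(A, rssi)
detected = max(range(len(estimate)), key=lambda j: abs(estimate[j]))
```

Picking the candidate whose recovered strength has the largest magnitude mirrors the abstract's observation that the detected source has a much larger norm than the others. With noisy measurements, a regularized solver would usually be preferable to plain normal equations.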