Results 1 - 20 of 64
1.
J Biomed Inform ; 149: 104576, 2024 01.
Article in English | MEDLINE | ID: mdl-38101690

ABSTRACT

INTRODUCTION: Machine learning algorithms are expected to work side-by-side with humans in decision-making pipelines. Thus, the ability of classifiers to make reliable decisions is of paramount importance. Deep neural networks (DNNs) represent the state-of-the-art models to address real-world classification. Although the strength of activation in DNNs is often correlated with the network's confidence, in-depth analyses are needed to establish whether they are well calibrated. METHOD: In this paper, we demonstrate the use of DNN-based classification tools to benefit cancer registries by automating information extraction of disease at diagnosis and at surgery from electronic text pathology reports from the US National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) population-based cancer registries. In particular, we introduce multiple methods for selective classification to achieve a target level of accuracy on multiple classification tasks while minimizing the rejection amount, that is, the number of electronic pathology reports for which the model's predictions are unreliable. We evaluate the proposed methods by comparing our approach with the current in-house deep learning-based abstaining classifier. RESULTS: Overall, all the proposed selective classification methods effectively allow for achieving the targeted level of accuracy or higher in a trade-off analysis aimed at minimizing the rejection rate. On in-distribution validation and holdout test data, with all the proposed methods, we achieve on all tasks the required target level of accuracy with a lower rejection rate than the deep abstaining classifier (DAC). Interpreting the results for the out-of-distribution test data is more complex; nevertheless, in this case as well, the rejection rate from the best among the proposed methods achieving 97% accuracy or higher is lower than the rejection rate based on the DAC. CONCLUSIONS: We show that although both approaches can flag those samples that should be manually reviewed and labeled by human annotators, the newly proposed methods retain a larger fraction and do so without retraining, thus offering a reduced computational cost compared with the in-house deep learning-based abstaining classifier.
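
Illustrative sketch (not taken from the paper): the abstract does not spell out the proposed selective classification methods, so the code below shows only the general idea under a simple assumption, namely that each prediction comes with a confidence score and that low-confidence reports are rejected for manual review. The function name `pick_threshold` and the toy data are hypothetical.

```python
def pick_threshold(confidences, correct, target_accuracy):
    """Return (threshold, rejection_rate) for the smallest confidence
    threshold whose retained predictions reach target_accuracy.

    confidences: one confidence score per validation sample
    correct: True where the prediction matched the label
    Returns None if no threshold reaches the target.
    """
    n = len(confidences)
    # Candidate thresholds are the observed confidences, lowest first,
    # so the first one that works rejects the fewest samples.
    for threshold in sorted(set(confidences)):
        kept = [ok for c, ok in zip(confidences, correct) if c >= threshold]
        accuracy = sum(kept) / len(kept)
        if accuracy >= target_accuracy:
            return threshold, 1 - len(kept) / n
    return None


# Toy validation set (made-up values, for illustration only).
confidences = [0.99, 0.95, 0.90, 0.80, 0.60, 0.55]
correct = [True, True, True, False, True, False]
result = pick_threshold(confidences, correct, target_accuracy=0.97)
```

Because the threshold is chosen on held-out predictions, a scheme of this kind needs no retraining of the underlying classifier, which is the property the conclusions highlight.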


Subjects
Deep Learning , Humans , Uncertainty , Neural Networks, Computer , Algorithms , Machine Learning
2.
BMC Med Inform Decis Mak ; 24(Suppl 5): 262, 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39289714

ABSTRACT

BACKGROUND: Applying graph convolutional networks (GCN) to the classification of free-form natural language texts leveraged by graph-of-words features (TextGCN) was studied and confirmed to be an effective means of describing complex natural language texts. However, the text classification models based on the TextGCN possess weaknesses in terms of memory consumption and model dissemination and distribution. In this paper, we present a fast message passing network (FastMPN), implementing a GCN with message passing architecture that provides versatility and flexibility by allowing trainable node embeddings and edge weights, helping the GCN model find a better solution. We applied the FastMPN model to the task of clinical information extraction from cancer pathology reports, extracting the following six properties: main site, subsite, laterality, histology, behavior, and grade. RESULTS: We evaluated the clinical task performance of the FastMPN models in terms of micro- and macro-averaged F1 scores. A comparison was performed with the multi-task convolutional neural network (MT-CNN) model. Results show that the FastMPN model is equivalent to or better than the MT-CNN. CONCLUSIONS: Our implementation revealed that our FastMPN model, which is based on the PyTorch platform, can train on a large corpus (667,290 training samples) with 202,373 unique words in less than 3 minutes per epoch using one NVIDIA V100 hardware accelerator. Our experiments demonstrated that using this implementation, the clinical task performance scores of information extraction related to tumors from cancer pathology reports were highly competitive.
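
Illustrative sketch (not the FastMPN implementation): the abstract describes message passing over a graph of words with trainable node embeddings and edge weights. The plain-Python function below shows one generic weighted-sum message passing round so the roles of the two parameter sets are visible. The function name, the word nodes, and all numeric values are hypothetical, and in a real model both dictionaries would be learned parameters.

```python
def message_passing_step(embeddings, edges):
    """One round of weighted-sum message passing.

    embeddings: dict node -> list of floats (trainable in a real model)
    edges: dict (source, target) -> float weight (also trainable)
    Each target node receives weight * source embedding over every in-edge.
    """
    dim = len(next(iter(embeddings.values())))
    updated = {node: [0.0] * dim for node in embeddings}
    for (source, target), weight in edges.items():
        for i, value in enumerate(embeddings[source]):
            updated[target][i] += weight * value
    return updated


# Tiny graph of words with made-up 2-dimensional embeddings.
embeddings = {"tumor": [1.0, 0.0], "left": [0.0, 2.0], "breast": [1.0, 1.0]}
edges = {("tumor", "breast"): 0.5, ("left", "breast"): 0.25, ("breast", "tumor"): 1.0}
updated = message_passing_step(embeddings, edges)
```

Nodes with no incoming edge, such as "left" here, end the round with a zero vector; practical architectures usually add a self-connection or residual term to avoid that.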


Subjects
Natural Language Processing , Neoplasms , Neural Networks, Computer , Humans , Neoplasms/classification , Data Mining
3.
Cancer ; 129(12): 1821-1835, 2023 06 15.
Article in English | MEDLINE | ID: mdl-37063057

ABSTRACT

BACKGROUND: Depression is common among breast cancer patients and can affect concordance with guideline-recommended treatment plans. Yet, the impact of depression on cancer treatment and survival is understudied, particularly in relation to the timing of the depression diagnosis. METHODS: Kentucky Cancer Registry data were used to identify female patients diagnosed with primary invasive breast cancer who were 20 years of age or older in 2007-2011. Patients were classified as having no depression, depression pre-cancer diagnosis only, depression post-cancer diagnosis only, or persistent depression. The impact of depression on receiving guideline-recommended treatment and survival was examined using multivariable logistic regression and Cox regression, respectively. RESULTS: Of 6054 eligible patients, 4.1%, 3.7%, and 6.2% had persistent depression, depression pre-diagnosis only, and depression post-diagnosis only, respectively. A total of 1770 (29.2%) patients did not receive guideline-recommended cancer treatment. Compared to patients with no depression, the odds of receiving guideline-recommended treatment were decreased in patients with depression pre-diagnosis only (odds ratio [OR], 0.75; 95% confidence interval [CI], 0.54-1.04) but not in patients with post-diagnosis only or persistent depression. Depression post-diagnosis only (hazard ratio, 1.51; 95% CI, 1.24-1.83) and depression pre-diagnosis only (hazard ratio, 1.26; 95% CI, 0.99-1.59) were associated with worse survival. No significant difference in survival was found between patients with persistent depression and patients with no depression (p > .05). CONCLUSIONS: Neglecting depression management after a breast cancer diagnosis may result in poorer cancer treatment concordance and worse survival. Early detection and consistent management of depression are critical in improving patient survival.


Subjects
Breast Neoplasms , Humans , Female , Breast Neoplasms/complications , Breast Neoplasms/therapy , Breast Neoplasms/diagnosis , Kentucky/epidemiology , Proportional Hazards Models , Registries
4.
BMC Bioinformatics ; 23(Suppl 12): 386, 2022 Sep 23.
Article in English | MEDLINE | ID: mdl-36151511

ABSTRACT

BACKGROUND: Public Data Commons (PDC) have been highlighted in the scientific literature for their capacity to collect and harmonize big data. On the other hand, local data commons (LDC), located within an institution or organization, have been underrepresented in the scientific literature, even though they are a critical part of research infrastructure. Being closest to the sources of data, LDCs provide the ability to collect and maintain the most up-to-date, high-quality data within an organization. As data providers, LDCs face many challenges in both collecting and standardizing data; moreover, as consumers of PDC, they face problems of data harmonization stemming from the monolithic harmonization pipeline designs commonly adopted by many PDCs. Unfortunately, existing guidelines and resources for building and maintaining data commons exclusively focus on PDC and provide very little information on LDC. RESULTS: This article focuses on four important observations. First, there are three different types of LDC service models that are defined based on their roles and requirements. These can be used as guidelines for building new LDC or enhancing the services of existing LDC. Second, the seven core services of LDC are discussed, including cohort identification and facilitation of genomic sequencing, the management of molecular reports and associated infrastructure, quality control, data harmonization, data integration, data sharing, and data access control. Third, instead of commonly developed monolithic systems, we propose a new data sharing method for data harmonization that combines both divide-and-conquer and bottom-up approaches. Finally, an end-to-end LDC implementation is introduced with real-world examples. CONCLUSIONS: Although LDCs are an optimal place to identify and address data quality issues, they have traditionally been relegated to the role of passive data provider for much larger PDC. Indeed, many LDCs limit their functions to only conducting routine data storage and transmission tasks due to a lack of information on how to design, develop, and improve their services using limited resources. We hope that this work will be the first small step in raising awareness among the LDCs of their expanded utility and in publicizing to a wider audience the importance of LDC.


Subjects
Big Data , Information Dissemination , Developing Countries , Humans
5.
J Biomed Inform ; 125: 103957, 2022 01.
Article in English | MEDLINE | ID: mdl-34823030

ABSTRACT

In the last decade, the widespread adoption of electronic health record documentation has created huge opportunities for information mining. Natural language processing (NLP) techniques using machine and deep learning are becoming increasingly widespread for information extraction tasks from unstructured clinical notes. Disparities in performance when deploying machine learning models in the real world have recently received considerable attention. In the clinical NLP domain, the robustness of convolutional neural networks (CNNs) for classifying cancer pathology reports under natural distribution shifts remains understudied. In this research, we aim to quantify and improve the performance of the CNN for text classification on out-of-distribution (OOD) datasets resulting from the natural evolution of clinical text in pathology reports. We identified class imbalance due to different prevalence of cancer types as one of the sources of performance drop and analyzed the impact of previous methods for addressing class imbalance when deploying models in real-world domains. Our results show that our novel class-specialized ensemble technique outperforms other methods for the classification of rare cancer types in terms of macro F1 scores. We also found that traditional ensemble methods perform better in top classes, leading to higher micro F1 scores. Based on our findings, we formulate a series of recommendations for other ML practitioners on how to build robust models with extremely imbalanced datasets in biomedical NLP applications.
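
Illustrative sketch (not from the paper): the abstract contrasts macro F1, which rewards performance on rare classes, with micro F1, which is dominated by the most frequent classes. The plain-Python function below computes both for single-label predictions using the standard definitions. The function name and the toy labels are hypothetical.

```python
def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp_total = fp_total = fn_total = 0
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        tp_total, fp_total, fn_total = tp_total + tp, fp_total + fp, fn_total + fn
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly contributes an F1 of 0.
        per_class.append(2 * tp / denom if denom else 0.0)
    micro = 2 * tp_total / (2 * tp_total + fp_total + fn_total)
    macro = sum(per_class) / len(per_class)
    return micro, macro


# A classifier that always predicts the frequent class (made-up data).
y_true = ["common"] * 8 + ["rare"] * 2
y_pred = ["common"] * 10
micro, macro = f1_scores(y_true, y_pred)
```

In this toy case the micro F1 is 0.8 while the macro F1 is roughly 0.44, because the rare class scores zero and counts as much as the common class in the macro average.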


Subjects
Natural Language Processing , Neoplasms , Electronic Health Records , Humans , Machine Learning , Neural Networks, Computer
6.
BMC Bioinformatics ; 22(1): 113, 2021 Mar 09.
Article in English | MEDLINE | ID: mdl-33750288

ABSTRACT

BACKGROUND: Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. Active learning techniques may mitigate this challenge by reducing the amount of labelled data required to effectively train a model. In this study, we analyze the effectiveness of 11 active learning algorithms on classifying subsite and histology from cancer pathology reports using a Convolutional Neural Network as the text classification model. RESULTS: We compare the performance of each active learning strategy using two differently sized datasets and two different classification tasks. Our results show that on all tasks and dataset sizes, all active learning strategies except diversity-sampling strategies outperformed random sampling, i.e., no active learning. On our large dataset (15K initial labelled samples, adding 15K additional labelled samples each iteration of active learning), there was no clear winner between the different active learning strategies. On our small dataset (1K initial labelled samples, adding 1K additional labelled samples each iteration of active learning), marginal and ratio uncertainty sampling performed better than all other active learning techniques. We found that compared to random sampling, active learning strongly helps performance on rare classes by focusing on underrepresented classes. CONCLUSIONS: Active learning can save annotation cost by helping human annotators efficiently and intelligently select which samples to label. Our results show that a dataset constructed using effective active learning techniques requires less than half the amount of labelled data to achieve the same performance as a dataset constructed using random sampling.
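
Illustrative sketch (not from the paper): the abstract names marginal and ratio uncertainty sampling without defining them. The code below uses the formulations common in the active learning literature, which is an assumption about what the study used: margin is the gap between the two highest class probabilities, and ratio is the second-highest probability divided by the highest. Function names and the toy probabilities are hypothetical.

```python
def margin_score(probs):
    """Gap between the two most likely classes; a small gap means uncertain."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def ratio_score(probs):
    """Second-highest over highest probability; close to 1 means uncertain."""
    top, second = sorted(probs, reverse=True)[:2]
    return second / top


def select_for_labeling(pool_probs, k, strategy="margin"):
    """Return indices of the k most uncertain samples in the unlabeled pool."""
    indices = range(len(pool_probs))
    if strategy == "margin":
        ranked = sorted(indices, key=lambda i: margin_score(pool_probs[i]))
    else:
        ranked = sorted(indices, key=lambda i: -ratio_score(pool_probs[i]))
    return ranked[:k]


# Predicted class probabilities for three unlabeled reports (made-up values).
pool = [[0.90, 0.05, 0.05], [0.50, 0.45, 0.05], [0.60, 0.30, 0.10]]
by_margin = select_for_labeling(pool, k=2, strategy="margin")
by_ratio = select_for_labeling(pool, k=2, strategy="ratio")
```

Both strategies send the second report to the annotator first, since its top two classes are nearly tied; the confidently classified first report is left unlabeled.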


Subjects
Machine Learning , Neoplasms , Algorithms , Humans , Neoplasms/genetics , Neoplasms/pathology , Neural Networks, Computer
7.
Blood ; 131(26): 2943-2954, 2018 06 28.
Article in English | MEDLINE | ID: mdl-29695515

ABSTRACT

Prostate apoptosis response-4 (Par-4), a proapoptotic tumor suppressor protein, is downregulated in many cancers including renal cell carcinoma, glioblastoma, endometrial, and breast cancer. Par-4 induces apoptosis selectively in various types of cancer cells but not normal cells. We found that chronic lymphocytic leukemia (CLL) cells from human patients and from Eµ-Tcl1 mice constitutively express Par-4 in greater amounts than normal B-1 or B-2 cells. Interestingly, knockdown of Par-4 in human CLL-derived Mec-1 cells results in a robust increase in p21/WAF1 expression and decreased growth due to delayed G1-to-S cell-cycle transition. Lack of Par-4 also increased the expression of p21 and delayed CLL growth in Eµ-Tcl1 mice. Par-4 expression in CLL cells required constitutively active B-cell receptor (BCR) signaling, as inhibition of BCR signaling with US Food and Drug Administration (FDA)-approved drugs caused a decrease in Par-4 messenger RNA and protein, and an increase in apoptosis. In particular, activities of Lyn, a Src family kinase, spleen tyrosine kinase, and Bruton tyrosine kinase are required for Par-4 expression in CLL cells, suggesting a novel regulation of Par-4 through BCR signaling. Together, these results suggest that Par-4 may play a novel progrowth rather than proapoptotic role in CLL and could be targeted to enhance the therapeutic effects of BCR-signaling inhibitors.


Subjects
Apoptosis Regulatory Proteins/metabolism , Gene Expression Regulation, Leukemic , Leukemia, Lymphocytic, Chronic, B-Cell/metabolism , Animals , Apoptosis Regulatory Proteins/genetics , Cell Cycle , Cell Line, Tumor , Cyclin-Dependent Kinase Inhibitor p21/genetics , Cyclin-Dependent Kinase Inhibitor p21/metabolism , Gene Deletion , Humans , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Mice, Inbred C57BL , Mice, Inbred NOD , Receptors, Antigen, B-Cell/metabolism , Signal Transduction , Up-Regulation
8.
J Biomed Inform ; 110: 103564, 2020 10.
Article in English | MEDLINE | ID: mdl-32919043

ABSTRACT

OBJECTIVE: In machine learning, it is evident that classification task performance increases if bootstrap aggregation (bagging) is applied. However, the bagging of deep neural networks takes tremendous amounts of computational resources and training time. The research question that we aimed to answer in this research is whether we could achieve higher task performance scores and accelerate the training by dividing a problem into sub-problems. MATERIALS AND METHODS: The data used in this study consist of free text from electronic cancer pathology reports. We applied bagging and partitioned data training using Multi-Task Convolutional Neural Network (MT-CNN) and Multi-Task Hierarchical Convolutional Attention Network (MT-HCAN) classifiers. We split a big problem into 20 sub-problems, resampled the training cases 2,000 times, and trained the deep learning model for each bootstrap sample and each sub-problem, thus generating up to 40,000 models. We performed the training of many models concurrently in a high-performance computing environment at Oak Ridge National Laboratory (ORNL). RESULTS: We demonstrated that aggregation of the models improves task performance compared with the single-model approach, which is consistent with other research studies; and we demonstrated that the two proposed partitioned bagging methods achieved higher classification accuracy scores on four tasks. Notably, the improvements were significant for the extraction of cancer histology data, which had more than 500 class labels in the task; these results show that data partition may alleviate the complexity of the task. On the contrary, the methods did not achieve superior scores for the tasks of site and subsite classification. Intrinsically, since data partitioning was based on the primary cancer site, the accuracy depended on the determination of the partitions, which needs further investigation and improvement. CONCLUSION: Results in this research demonstrate that (1) the data partitioning and bagging strategy achieved higher performance scores and (2) we achieved faster training by leveraging the high-performance Summit supercomputer at ORNL.
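
Illustrative sketch (not the authors' pipeline): the abstract combines bootstrap resampling with a split into sub-problems, giving up to 20 x 2,000 = 40,000 models. The code below shows the two generic building blocks of bagging, drawing a bootstrap replicate and aggregating ensemble predictions. Majority voting is an assumption here, since the abstract does not say how predictions were aggregated, and the function names and labels are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(cases, rng):
    """Draw len(cases) items with replacement (one bootstrap replicate)."""
    return [rng.choice(cases) for _ in cases]


def majority_vote(predictions):
    """Aggregate the labels predicted by ensemble members for one report."""
    return Counter(predictions).most_common(1)[0][0]


n_partitions = 20      # sub-problems, as stated in the abstract
n_bootstraps = 2000    # bootstrap resamples, as stated in the abstract
max_models = n_partitions * n_bootstraps  # upper bound on trained models

rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicate = bootstrap_sample(list(range(10)), rng)
winner = majority_vote(["A", "A", "B"])  # placeholder class labels
```

Each model in such a scheme is trained independently, which is why the workload parallelizes well on a high-performance computing system.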


Subjects
Neoplasms , Neural Networks, Computer , Computing Methodologies , Humans , Information Storage and Retrieval , Machine Learning
9.
BMC Med Inform Decis Mak ; 20(Suppl 10): 271, 2020 12 15.
Article in English | MEDLINE | ID: mdl-33319710

ABSTRACT

BACKGROUND: The Kentucky Cancer Registry (KCR) is a central cancer registry for the state of Kentucky that receives data about incident cancer cases from all healthcare facilities in the state within 6 months of diagnosis. Similar to all other U.S. and Canadian cancer registries, KCR uses a data dictionary provided by the North American Association of Central Cancer Registries (NAACCR) for standardized data entry. The NAACCR data dictionary is not an ontological system. Mapping between the NAACCR data dictionary and the National Cancer Institute (NCI) Thesaurus (NCIt) will facilitate the enrichment, dissemination and utilization of cancer registry data. We introduce a web-based system, called Interactive Mapping Interface (IMI), for creating mappings from data dictionaries to ontologies, in particular from NAACCR to NCIt. METHOD: IMI has been designed as a general approach with three components: (1) ontology library; (2) mapping interface; and (3) recommendation engine. The ontology library provides a list of ontologies as targets for building mappings. The mapping interface consists of six modules: project management, mapping dashboard, access control, logs and comments, hierarchical visualization, and result review and export. The built-in recommendation engine automatically identifies a list of candidate concepts to facilitate the mapping process. RESULTS: We report the architecture design and interface features of IMI. To validate our approach, we implemented an IMI prototype and pilot-tested features using the IMI interface to map a sample set of NAACCR data elements to NCIt concepts. Forty-seven out of 301 NAACCR data elements have been mapped to NCIt concepts. Five branches of the hierarchical tree have been identified from these mapped concepts for visual inspection. CONCLUSIONS: IMI provides an interactive, web-based interface for building mappings from data dictionaries to ontologies. Although our pilot-testing scope is limited, our results demonstrate the feasibility of using IMI for semantic enrichment of cancer registry data by mapping NAACCR data elements to NCIt concepts.
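
Illustrative sketch (not the IMI recommendation engine): the abstract says the engine proposes candidate concepts for each data element but does not describe how. One simple possibility, assumed here purely for illustration, is ranking concept labels by string similarity with Python's standard `difflib` module. The function name and the labels below are placeholders, not an actual NAACCR or NCIt extract.

```python
import difflib


def recommend_concepts(element_name, concept_labels, limit=3):
    """Rank candidate ontology concepts by label similarity to a data element."""
    # Compare case-insensitively, then map back to the original labels.
    lowered = {label.lower(): label for label in concept_labels}
    matches = difflib.get_close_matches(
        element_name.lower(), list(lowered), n=limit, cutoff=0.5
    )
    return [lowered[m] for m in matches]


# Placeholder concept labels for the sketch.
labels = ["Laterality", "Histologic Grade", "Primary Site"]
candidates = recommend_concepts("PRIMARY SITE", labels)
no_match = recommend_concepts("zzzz", labels)
```

A curator would still confirm or reject each suggestion through the mapping interface; the ranking only narrows the list to review.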


Subjects
Biological Ontologies , Neoplasms , Canada/epidemiology , Humans , Internet , Neoplasms/diagnosis , Neoplasms/epidemiology , Registries , Vocabulary, Controlled
10.
Cancer ; 125(21): 3729-3737, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31381143

ABSTRACT

Population-based cancer registries have improved dramatically over the last 2 decades. These central cancer registries provide a critical framework that can elevate the science of cancer research. There have also been important technical and scientific advances that help to unlock the potential of population-based cancer registries. These advances include improvements in probabilistic record linkage, refinements in natural language processing, the ability to perform genomic sequencing on formalin-fixed, paraffin-embedded (FFPE) tissue, and improvements in the ability to identify activity levels of many different signaling molecules in FFPE tissue. This article describes how central cancer registries can provide a population-based sample frame that will lead to studies with strong external validity, how central cancer registries can link with public and private health insurance claims to obtain complete treatment information, how central cancer registries can use informatics techniques to provide population-based rapid case ascertainment, how central cancer registries can serve as a population-based virtual tissue repository, and how population-based cancer registries are essential for guiding the implementation of evidence-based interventions and measuring changes in the cancer burden after the implementation of these interventions.


Subjects
Neoplasms/diagnosis , Neoplasms/therapy , Population Surveillance/methods , Registries/statistics & numerical data , Biomedical Research/methods , Biomedical Research/statistics & numerical data , Fixatives/chemistry , Formaldehyde/chemistry , High-Throughput Nucleotide Sequencing/methods , Humans , Paraffin Embedding/methods , Tissue Fixation/methods
11.
Cancer Control ; 26(1): 1073274819845873, 2019.
Article in English | MEDLINE | ID: mdl-31014079

ABSTRACT

Recent metabolic and genetic research has demonstrated that risk for specific histological types of lung cancer varies in relation to cigarette smoking and obesity. This study investigated the spatial and temporal distribution of lung cancer histological types in Kentucky, a largely rural state with high rates of smoking and obesity, to discern population-level trends that might reflect variation in these and other risk factors. The Kentucky Cancer Registry provided residential geographic coordinates for lung cancer cases diagnosed from 1995 through 2014. We used multinomial and discrete Poisson spatiotemporal scan statistics, adjusted for age, gender, and race, to characterize risk for specific histological types (small cell, adenocarcinoma, squamous cell, and other types) throughout Kentucky and compared the results to maps of risk factors. Toward the end of the study period, adenocarcinoma was more common among all population subgroups in north-central Kentucky, where smoking and obesity are less prevalent. During the same time frame, squamous cell, small cell, and other types were more common in rural Appalachia, where smoking and obesity are more prevalent, and in some high poverty urban areas. Spatial and temporal patterns in the distribution of histological types of lung cancer are likely related to regional variation in multiple risk factors. High smoking and obesity rates in the Appalachian region, and likely in high poverty urban areas, appeared to coincide with high rates of squamous cell and small cell lung cancer. In north-central Kentucky, environmental exposures might have resulted in higher risk for adenocarcinoma specifically.


Subjects
Adenocarcinoma of Lung/epidemiology , Cigarette Smoking/epidemiology , Lung Neoplasms/epidemiology , Obesity/epidemiology , Small Cell Lung Carcinoma/epidemiology , Adenocarcinoma of Lung/pathology , Aged , Cluster Analysis , Female , Humans , Kentucky/epidemiology , Lung Neoplasms/pathology , Male , Middle Aged , Risk Factors , Small Cell Lung Carcinoma/pathology , Spatio-Temporal Analysis
12.
J Biomed Inform ; 97: 103267, 2019 09.
Article in English | MEDLINE | ID: mdl-31401235

ABSTRACT

OBJECTIVE: We study the performance of machine learning (ML) methods, including neural networks (NNs), to extract mutational test results from pathology reports collected by cancer registries. Given the lack of hand-labeled datasets for mutational test result extraction, we focus on the particular use-case of extracting Epidermal Growth Factor Receptor mutation results in non-small cell lung cancers. We explore the generalization of NNs across different registries where our goals are twofold: (1) to assess how well models trained on a registry's data port to test data from a different registry and (2) to assess whether and to what extent such models can be improved using state-of-the-art neural domain adaptation techniques under different assumptions about what is available (labeled vs unlabeled data) at the target registry site. MATERIALS AND METHODS: We collected data from two registries: the Kentucky Cancer Registry (KCR) and the Fred Hutchinson Cancer Research Center (FH) Cancer Surveillance System. We combine NNs with adversarial domain adaptation to improve cross-registry performance. We compare to other classifiers in the standard supervised classification, unsupervised domain adaptation, and supervised domain adaptation scenarios. RESULTS: The performance of ML methods varied between registries. To extract positive results, the basic convolutional neural network (CNN) had an F1 of 71.5% on the KCR dataset and 95.7% on the FH dataset. For the KCR dataset, the CNN F1 results were low when trained on FH data (Positive F1: 23%). Using our proposed adversarial CNN, without any labeled data, we match the F1 of the models trained directly on each target registry's data. The adversarial CNN F1 improved when trained on FH and applied to the KCR dataset (Positive F1: 70.8%). We found similar performance improvements when we trained on KCR and tested on FH reports (Positive F1: 45% to 96%). CONCLUSION: Adversarial domain adaptation improves the performance of NNs applied to pathology reports. In the unsupervised domain adaptation setting, we match the performance of models that are trained directly on the target registry's data by using the source registry's labeled data and unlabeled examples from the target registry.


Subjects
Machine Learning , Mutation , Neoplasms/genetics , Neoplasms/pathology , Registries/statistics & numerical data , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/pathology , Computational Biology , Data Mining , Deep Learning , ErbB Receptors/genetics , Humans , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Neural Networks, Computer
13.
J Community Health ; 44(3): 552-560, 2019 06.
Article in English | MEDLINE | ID: mdl-30767102

ABSTRACT

PURPOSE: To examine smoking and use of smoking cessation aids among tobacco-associated cancer (TAC) or non-tobacco-associated cancer (nTAC) survivors. Understanding when and if specific types of cessation resources are used can help with planning interventions to more effectively decrease smoking among all cancer survivors, but there is a lack of research on smoking cessation modalities used among cancer survivors. METHODS: Kentucky Cancer Registry data on incident lung, colorectal, pancreatic, breast, ovarian, and prostate cancer cases diagnosed 2007-2011 were linked with health administrative claims data (Medicaid, Medicare, private insurers) to examine the prevalence of smoking and use of smoking cessation aids 1 year prior and 1 year following the cancer diagnosis. TACs included colorectal, pancreatic, and lung cancers; nTAC included breast, ovarian, and prostate cancers. RESULTS: There were 10,033 TAC and 13,670 nTAC survivors. Smoking before diagnosis was significantly higher among TAC survivors (p < 0.0001). Among TAC survivors, smoking before diagnosis was significantly higher among persons who: were males (83%), aged 45-64 (83%), of unknown marital status (84%), had very low education (78%), had public insurance (89%), Medicaid (85%) or were uninsured (84%). Smoking cessation counseling and pharmacotherapy were more common among TAC than nTAC survivors (p < 0.01 and p = 0.05, respectively). DISCUSSION: While smoking cessation counseling and pharmacotherapy were higher among TAC survivors, reducing smoking among all cancer survivors remains a priority, given cancer survivors are at increased risk for subsequent chronic diseases, including cancer. Tobacco cessation among all cancer survivors (not just those with TAC) can help improve prognosis and quality of life and reduce the risk of further disease. Health care providers can recommend individual, group, and telephone counseling and/or pharmacotherapy. These recommendations could also be included in survivorship care plans.


Subjects
Cancer Survivors/statistics & numerical data , Smoking Cessation/methods , Smoking/epidemiology , Adult , Age Factors , Aged , Cancer Survivors/psychology , Counseling , Female , Humans , Male , Middle Aged , Prevalence , Quality of Life , Smoking Cessation/psychology , Socioeconomic Factors , Tobacco Products , United States , Young Adult
14.
South Med J ; 111(11): 649-653, 2018 11.
Article in English | MEDLINE | ID: mdl-30391998

ABSTRACT

OBJECTIVE: The purpose of this study was to assess associations between individual and social factors and late-stage melanoma in Kentucky from 1995 to 2013. METHODS: The study combines three datasets: individual-level data from the Kentucky Cancer Registry, census tract-level data from the US Census, and county-level physician licensure data from the Kentucky Department for Public Health. The study population is described by all cases, early stage, and late stage. Logistic regression was used to evaluate the unadjusted associations between each covariate and early-stage and late-stage disease groups. All of the significant variables were assessed for interaction effect, and the significant interaction terms were used in the final model. Multiple logistic regression provided the final model of late-stage disease. RESULTS: In this study population, a dramatic increase in melanoma incidence is seen from 1995 to 2013 with a threefold increase in the number of cases per year. Of the 10,109 cases reported, 13.6% have late-stage disease, with a mean age for all cases at 56.9 years and the majority being men. Late-stage cases are more commonly uninsured or insured with Medicaid or Medicare compared with cases with early-stage lesions. Having a spouse or partner is clearly protective against being diagnosed as having late-stage melanoma, whereas being uninsured or having Medicaid increases the odds of late-stage melanoma. CONCLUSIONS: The incidence of melanoma is increasing dramatically. With no screening recommendation for the general population from the US Preventive Services Task Force, clinicians should focus on those at increased risk of late-stage melanoma: unmarried men who are uninsured or receiving Medicaid.


Subjects
Medically Uninsured , Melanoma/epidemiology , Melanoma/pathology , Single Person , Skin Neoplasms/epidemiology , Skin Neoplasms/pathology , Adolescent , Adult , Aged , Humans , Incidence , Kentucky/epidemiology , Male , Medicaid , Middle Aged , Neoplasm Staging , Registries , United States
15.
J Neurooncol ; 132(3): 507-512, 2017 05.
Article in English | MEDLINE | ID: mdl-28285334

ABSTRACT

We sought to determine whether the risk of astrocytomas in Appalachian children is higher than the national average. We compared the incidence of pediatric brain tumors in Appalachia versus non-Appalachia regions, covering years 2000-2011. The North American Association of Central Cancer Registries (NAACCR) collects population-based data from 55 cancer registries throughout the U.S. and Canada. All invasive primary (i.e., non-metastatic) tumors with age at diagnosis of 0-19 years were included. Nearly 27,000 and 2200 central nervous system (CNS) tumors from non-Appalachia and Appalachia, respectively, comprise the cohorts. Age-adjusted incidence rates of each main brain tumor subtype were compared. The incidence rate of pediatric CNS tumors was 8% higher in Appalachia, 3.31 [95% CI 3.17-3.45], versus non-Appalachia, 3.06 [95% CI 3.02-3.09], for the years 2001-2011; all rates are per 100,000 population. Astrocytomas accounted for the majority of this difference, with the rate being 16% higher in Appalachian children, 1.77 [95% CI 1.67-1.87], versus non-Appalachian children, 1.52 [95% CI 1.50-1.55]. Among astrocytomas, World Health Organization (WHO) grade I astrocytomas were 41% higher in Appalachia, 0.63 [95% CI 0.56-0.70], versus non-Appalachia, 0.44 [95% CI 0.43-0.46], for the years 2004-2011. This is the first study to demonstrate that Appalachian children are at greater risk of CNS neoplasms, and that much of this difference is in WHO grade I astrocytomas, which were 41% more common. The cause of this increased incidence is unknown, and we discuss the importance of this in relation to genetic and environmental findings in Appalachia.


Subjects
Brain Neoplasms/epidemiology , Adolescent , Appalachian Region/epidemiology , Child , Child, Preschool , Female , Humans , Incidence , Infant , Infant, Newborn , Male , Registries , Young Adult
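
The percentage differences quoted in the Appalachian pediatric CNS tumor abstract can be roughly reproduced from the printed rates. The sketch below is only an illustration: the helper name `percent_higher` and the dictionary layout are assumptions, not anything from the study. It uses the rounded age-adjusted rates per 100,000 as they appear in the abstract.

```python
def percent_higher(rate: float, reference: float) -> float:
    """Relative excess of `rate` over `reference`, expressed as a percentage."""
    if reference <= 0:
        raise ValueError("reference rate must be positive")
    return (rate / reference - 1.0) * 100.0


# Rounded age-adjusted rates per 100,000 as printed in the abstract:
# (Appalachia, non-Appalachia). The dictionary layout is illustrative only.
reported_rates = {
    "all CNS tumors": (3.31, 3.06),
    "astrocytomas": (1.77, 1.52),
    "WHO grade I astrocytomas": (0.63, 0.44),
}

excess = {
    name: percent_higher(appalachia, non_appalachia)
    for name, (appalachia, non_appalachia) in reported_rates.items()
}

for name, value in excess.items():
    print(f"{name}: {value:.1f}% higher in Appalachia")
```

With these rounded inputs, the first two figures agree with the reported 8% and 16%. The grade I figure comes out near 43% instead of the reported 41%, presumably because the authors worked from unrounded rates (an assumption, since the abstract does not say).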
16.
J Surg Res ; 214: 1-8, 2017 06 15.
Article in English | MEDLINE | ID: mdl-28624029

ABSTRACT

BACKGROUND: Although adjuvant therapy (AT) is a necessary component of multimodality therapy for pancreatic ductal adenocarcinoma (PDAC), its application can be hindered by post-pancreaticoduodenectomy (PD) complications. The primary aim of this study was to evaluate the impact of post-PD complications on AT utilization and overall survival (OS). METHODS: Patients undergoing PD without neoadjuvant therapy for stages I-III PDAC at a single institution (2007-2015) were evaluated. Ninety-day postoperative major complications (PMCs) were defined as grade ≥3. Records were linked to the Kentucky Cancer Registry for AT/OS data. Early AT was given <8 wk; late 8-16 wk. Initiation >16 wk was not considered to be AT. Complication effects on AT timing/utilization and OS were evaluated. RESULTS: Of 93 consecutive patients treated with surgery upfront with AT data, 64 (69%) received AT (41 [44%] early; 23 [25%] late). There were 32 patients (34%) with low-grade complications and 24 (26%) with PMC. With PMC, only six of 24 patients (25%) received early AT and 13 of 24 (54%) received any (early/late) AT versus 35 of 69 (51%) early AT and 51 of 69 (74%) any AT without PMC. PMCs were associated with worse median OS (7.1 versus 24.6 mo, without PMC, P < 0.001). Independent predictors of OS included AT (hazard ratio [HR]: 0.48), tumor >2 cm (HR: 3.39), node-positivity (HR: 2.16), and PMC (HR: 3.69, all P < 0.02). CONCLUSIONS: Independent of AT utilization and biologic factors, PMC negatively impacted OS in patients treated with surgery first. These data suggest that strategies to decrease PMC and treatment sequencing alternatives to increase multimodality therapy rates may improve oncologic outcomes for PDAC.


Subjects
Carcinoma, Pancreatic Ductal/therapy , Pancreatic Neoplasms/therapy , Pancreaticoduodenectomy , Postoperative Complications , Adult , Aged , Aged, 80 and over , Carcinoma, Pancreatic Ductal/mortality , Carcinoma, Pancreatic Ductal/surgery , Chemotherapy, Adjuvant , Female , Follow-Up Studies , Humans , Male , Middle Aged , Pancreatic Neoplasms/mortality , Pancreatic Neoplasms/surgery , Radiotherapy, Adjuvant , Retrospective Studies , Survival Analysis , Time Factors , Treatment Outcome
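
The adjuvant therapy (AT) timing categories defined in the pancreaticoduodenectomy abstract (early under 8 weeks, late 8 to 16 weeks, initiation after 16 weeks not counted as AT) can be written as a small classifier. This is a minimal sketch under stated assumptions: the function name, the use of `None` for patients who never started therapy, and the label strings are all hypothetical, and the study's actual data handling is not described in the abstract.

```python
from typing import Optional


def classify_at_timing(weeks_after_surgery: Optional[float]) -> str:
    """Map time from surgery to AT start onto the abstract's categories.

    Assumption: `None` means adjuvant therapy was never started.
    """
    if weeks_after_surgery is None:
        return "none"
    if weeks_after_surgery < 0:
        raise ValueError("weeks_after_surgery must be non-negative")
    if weeks_after_surgery < 8:
        return "early"  # started before 8 weeks
    if weeks_after_surgery <= 16:
        return "late"  # started at 8-16 weeks
    return "none"  # initiation after 16 weeks was not considered AT


# Hypothetical start times in weeks, for illustration only (not study data).
example_starts = [6.0, 8.0, 16.0, 17.5, None]
labels = [classify_at_timing(weeks) for weeks in example_starts]
print(labels)
```

Treating initiation after 16 weeks the same as no therapy mirrors the abstract's definition. An analysis that needed to distinguish the two groups would want a separate label.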
17.
J Surg Oncol ; 114(4): 451-5, 2016 Sep.
Article in English | MEDLINE | ID: mdl-27238300

ABSTRACT

BACKGROUND: Long-term results of the ESPAC-3 trial suggest that while completing adjuvant therapy (AT) is necessary after resection of pancreatic ductal adenocarcinoma (PDAC), early initiation (within 8 weeks) may not be associated with improved overall survival (OS). The primary aim of this study was to evaluate the OS impact of early versus late AT in a statewide analysis. METHODS: Patients with stages I-III PDAC in the Kentucky Cancer Registry (KCR) from 2004 to 2013 were evaluated. Those undergoing pancreatectomy were stratified into two groups ("early," <8 weeks, vs. "late," 8-16 weeks). RESULTS: Of 2,221 patients diagnosed with stages I-III PDAC, 831 (37.4%) underwent pancreatectomy upfront. Of these, only 420 (50.5%) received AT. Initiation date of AT was not associated with OS (median OS: early, 20.2 vs. late, 19.0 months, P = 0.97). On multivariate analysis, factors that affected OS included stage (II, HR 1.82, P = 0.017; III, HR 3.77, P < 0.001), node positivity (HR 1.51, P = 0.004), and poorly differentiated or undifferentiated grade (HR 1.34, P = 0.011), but not AT initiation date. CONCLUSIONS: In this statewide analysis, there was no difference in OS between early and late AT initiation for resected PDAC. The ideal window for AT initiation remains unknown as tumor biology continues to trump regimens from the past decade. J. Surg. Oncol. 2016;114:451-455. © 2016 Wiley Periodicals, Inc.


Subjects
Adenocarcinoma/therapy , Carcinoma, Pancreatic Ductal/therapy , Pancreatectomy , Pancreatic Neoplasms/therapy , Adenocarcinoma/mortality , Adenocarcinoma/pathology , Adult , Aged , Aged, 80 and over , Carcinoma, Pancreatic Ductal/mortality , Carcinoma, Pancreatic Ductal/pathology , Combined Modality Therapy , Female , Humans , Male , Middle Aged , Neoplasm Staging , Pancreatic Neoplasms/mortality , Pancreatic Neoplasms/pathology , Registries , Time Factors
18.
Cancer Inform ; 23: 11769351231223806, 2024.
Article in English | MEDLINE | ID: mdl-38322427

ABSTRACT

Large-scale, multi-site collaboration is becoming indispensable for a wide range of research and clinical activities in oncology. To facilitate the next generation of advances in cancer biology, precision oncology, and the population sciences, it will be necessary to develop and implement data management and analytic tools that empower investigators to reliably and objectively detect, characterize, and chronicle the phenotypic and genomic changes that occur during the transformation from the benign to the cancerous state and throughout the course of disease progression. To facilitate these efforts, it is incumbent upon the informatics community to establish the workflows and architectures that automate the aggregation and organization of a growing range and number of clinical data types and modalities, ranging from new molecular and laboratory tests to sophisticated diagnostic imaging studies. In an attempt to meet those challenges, leading health care centers across the country are making steep investments to establish enterprise-wide data warehouses. A significant limitation of many data warehouses, however, is that they are designed to support only alphanumeric information. In contrast to those traditional designs, the system that we have developed supports automated collection and mining of multimodal data, including genomics, digital pathology, and radiology images. In this paper, our team describes the design, development, and implementation of a multimodal Clinical & Research Data Warehouse (CRDW) that is tightly integrated with a suite of computational and machine-learning tools to provide actionable insight into the underlying characteristics of the tumor environment that would not be revealed using standard methods and tools. The system features a flexible Extract, Transform and Load (ETL) interface that enables it to aggregate data originating from different clinical and research sources, depending on the specific EHR and other data sources utilized at a given deployment site.

19.
JCO Oncol Pract ; 20(5): 631-642, 2024 May.
Article in English | MEDLINE | ID: mdl-38194612

ABSTRACT

PURPOSE: Database linkage between cancer registries and clinical trial consortia has the potential to elucidate referral patterns of children and adolescents with newly diagnosed cancer, including enrollment into cancer clinical trials. This study's primary objective was to assess the feasibility of this linkage approach. METHODS: Patients younger than 20 years diagnosed with incident cancer during 2012-2017 in the Kentucky Cancer Registry (KCR) were linked with patients enrolled in a Children's Oncology Group (COG) study. Patients matched between the databases were described by sex, age, race and ethnicity, geographical location at diagnosis, and cancer type. Logistic regression modeling identified factors associated with COG study enrollment. Timeliness of patient identification by KCR was reported through the Centers for Disease Control and Prevention's Early Case Capture (ECC) program. RESULTS: Of 1,357 patients reported to KCR, 47% were determined by matching to be enrolled in a COG study. Patients had greater odds of enrollment if they were age 0-4 years (vs 15-19 years), were reported from a COG-affiliated institution, or had renal cancer, neuroblastoma, or leukemia. Patients had lower odds of enrollment if they were Hispanic (vs non-Hispanic White) or had epithelial (eg, thyroid, melanoma) cancer. Most (59%) patients were reported to KCR within 10 days of pathologic diagnosis. CONCLUSION: Linkage of clinical trial data with cancer registries is a feasible approach for tracking patient referral and clinical trial enrollment patterns. Adolescents had lower enrollment compared with younger age groups, independent of cancer type. Population-based early case capture could guide interventions designed to increase cancer clinical trial enrollment.


Subjects
Clinical Trials as Topic , Neoplasms , Humans , Adolescent , Child , Female , Male , Neoplasms/therapy , Neoplasms/epidemiology , Child, Preschool , Infant , Infant, Newborn , Registries , Young Adult , Patient Selection , Information Storage and Retrieval
20.
J Natl Cancer Inst Monogr ; 2024(65): 168-179, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39102888

ABSTRACT

BACKGROUND: Precision medicine has become a mainstay of cancer care in recent years. The National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) Program has been an authoritative source of cancer statistics and data since 1973. However, tumor genomic information has not been adequately captured in the cancer surveillance data, which impedes population-based research on molecular subtypes. To address this, the SEER Program has developed and implemented a centralized process to link SEER registries' tumor cases with genomic test results that are provided by molecular laboratories to the registries. METHODS: Data linkages were carried out following operating procedures for centralized linkages established by the SEER Program. The linkages used Match*Pro, a probabilistic linkage software package, and were facilitated by the registries' trusted third party (an honest broker). The SEER registries provide to NCI limited datasets that undergo preliminary evaluation prior to their release to the research community. RESULTS: Recently conducted genomic linkages included OncotypeDX Breast Recurrence Score, OncotypeDX Breast Ductal Carcinoma in Situ, OncotypeDX Genomic Prostate Score, Decipher Prostate Genomic Classifier, DecisionDX Uveal Melanoma, DecisionDX Preferentially Expressed Antigen in Melanoma, DecisionDX Melanoma, and germline test results in the Georgia and California SEER registries. CONCLUSIONS: The linkages of cancer cases from SEER registries with genomic test results obtained from molecular laboratories offer an effective approach for data collection in cancer surveillance. By providing de-identified data to the research community, the NCI's SEER Program enables scientists to investigate numerous research questions.


Subjects
Genomics , Neoplasms , Registries , SEER Program , Humans , SEER Program/statistics & numerical data , United States/epidemiology , Neoplasms/genetics , Neoplasms/epidemiology , Neoplasms/diagnosis , Genomics/methods , Registries/statistics & numerical data , Female , Male , Genetic Testing/methods , Genetic Testing/statistics & numerical data , Medical Record Linkage/methods , National Cancer Institute (U.S.)