Results 1 - 5 of 5
1.
Elife ; 12, 2023 Nov 21.
Article in English | MEDLINE | ID: mdl-37988407

ABSTRACT

Pancreatic cancer is one of the deadliest cancer types with poor treatment options. Better detection of early symptoms and relevant disease correlations could improve pancreatic cancer prognosis. In this retrospective study, we used symptom and disease codes (ICD-10) from the Danish National Patient Registry (NPR) encompassing 6.9 million patients from 1994 to 2018, of whom 23,592 were diagnosed with pancreatic cancer. The Danish cancer registry included 18,523 of these patients. To complement and compare the registry diagnosis codes with deeper clinical data, we used a text mining approach to extract symptoms from free text clinical notes in electronic health records (3078 pancreatic cancer patients and 30,780 controls). We used both data sources to generate and compare symptom disease trajectories to uncover temporal patterns of symptoms prior to pancreatic cancer diagnosis for the same patients. We show that the text mining of the clinical notes was able to complement the registry-based symptoms by capturing more symptoms prior to pancreatic cancer diagnosis. For example, 'Blood pressure reading without diagnosis', 'Abnormalities of heartbeat', and 'Intestinal obstruction' were not found in the registry-based analysis. Chaining symptoms together in trajectories identified two groups of patients with lower median survival (<90 days) following the trajectories 'Cough→Jaundice→Intestinal obstruction' and 'Pain→Jaundice→Abnormal results of function studies'. These results provide a comprehensive comparison of the two types of pancreatic cancer symptom trajectories, which in combination can leverage the full potential of the health data and ultimately provide a fuller picture for detection of early risk factors for pancreatic cancer.
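
To illustrate what "chaining symptoms together in trajectories" can look like in practice, here is a minimal Python sketch. It is not the authors' pipeline: the abstract does not describe the statistical procedure, so this example only counts how many patients show a given time-ordered triple of symptoms. The patient records, the `(days_before_diagnosis, symptom)` layout, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: per-patient events as (days_before_diagnosis, symptom).
patients = {
    "p1": [(300, "Cough"), (120, "Jaundice"), (30, "Intestinal obstruction")],
    "p2": [(200, "Pain"), (90, "Jaundice"), (10, "Abnormal results of function studies")],
    "p3": [(250, "Cough"), (100, "Jaundice"), (20, "Intestinal obstruction")],
}


def ordered_symptoms(events):
    """Return symptoms from earliest to latest, keeping first occurrences only."""
    seen, ordered = set(), []
    # A larger days_before_diagnosis value means an earlier event.
    for _, symptom in sorted(events, key=lambda e: -e[0]):
        if symptom not in seen:
            seen.add(symptom)
            ordered.append(symptom)
    return ordered


def count_trajectories(patients, length=3):
    """Count, per ordered symptom tuple, the number of patients who follow it."""
    counts = Counter()
    for events in patients.values():
        seq = ordered_symptoms(events)
        # combinations() preserves input order, so each tuple respects time order.
        for trajectory in set(combinations(seq, length)):
            counts[trajectory] += 1
    return counts


trajectory_counts = count_trajectories(patients)
```

A real analysis would additionally need to test whether each ordered pair or triple occurs more often than expected by chance and compare cases against matched controls; the sketch skips both steps.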


Pancreatic cancer is one of the deadliest cancer types. Scientists predict it will become the second largest cause of cancer-related deaths in 2030. It has few or no symptoms at early stages and often goes undetected for an extended period. As a result, patients are often diagnosed at an advanced stage when they have few treatment options and lower survival rates. Only 11 percent of patients with pancreatic cancer survive five years past their diagnosis. Earlier detection and surgery to remove the tumor increase patient survival to 42% at five years. Those who undergo surgery at the earliest stage have an 84% survival rate at five years. Developing ways to screen for and detect pancreatic cancer early could improve patient survival. Identifying early symptoms is critical. So far, studies show links between weight loss, abdominal pain, lower back pain, and new-onset diabetes and pancreatic cancer. But clinicians often overlook these symptoms or do not associate them with cancer. National health registries may be data sources that scientists can use to zoom in on early pancreatic symptoms and create alerts for clinicians. Hjaltelin, Novitski et al. identified potential pancreatic cancer symptoms using patient registry data and electronic health records. Hjaltelin, Novitski et al. extracted potential pancreatic cancer-related disease or symptom trajectories from 7 million patients listed in the Danish National Patient Registry. They also scoured clinical notes in 34,000 patients' electronic health records for symptoms. The electronic health records yielded more promising symptoms than the registry. But both data sources produced complementary information. The analysis showed that some symptoms, like jaundice, were associated with higher survival rates because they may lead to earlier diagnosis. The data so far suggest that symptoms leading up to a pancreatic cancer diagnosis may be nonspecific and not occur in a particular order. 
As the cancer progresses, symptoms may become more specific and severe. Further assessment of the study's results is necessary. Tools like artificial intelligence or advanced text mining may allow scientists to identify more definitive early symptom trajectories and help clinicians identify patients earlier.


Subject(s)
Jaundice , Pancreatic Neoplasms , Humans , Electronic Health Records , Retrospective Studies , Routinely Collected Health Data , Pancreatic Neoplasms/diagnosis , Pancreatic Neoplasms/epidemiology , Denmark/epidemiology
2.
PLoS Comput Biol ; 16(9): e1008244, 2020 09.
Article in English | MEDLINE | ID: mdl-32960884

ABSTRACT

Alcohol-related liver disease (ALD) is the cause of more than half of all liver-related deaths. Sustained excess drinking causes fatty liver and alcohol-related steatohepatitis, which may progress to alcoholic liver fibrosis (ALF) and eventually to alcohol-related liver cirrhosis (ALC). Unfortunately, it is difficult to identify patients with early-stage ALD, as these are largely asymptomatic. Consequently, the majority of ALD patients are only diagnosed by the time ALD has reached decompensated cirrhosis, a symptomatic phase marked by the development of complications such as bleeding and ascites. The main goal of this study is to discover relevant upstream diagnoses helping to understand the development of ALD, and to highlight meaningful downstream diagnoses that represent its progression to liver failure. Here, we use data from the Danish health registries covering the entire population of Denmark over nineteen years (1996-2014), to examine if it is possible to identify patients likely to develop ALF or ALC based on their past medical history. To this end, we explore a knowledge discovery approach by using high-dimensional statistical and machine learning techniques to extract and analyze data from the Danish National Patient Registry. Consistent with the late diagnoses of ALD, we find that ALC is the most common form of ALD in the registry data and that ALC patients have a strong over-representation of diagnoses associated with liver dysfunction. By contrast, we identify a small number of patients diagnosed with ALF who appear to be much less sick than those with ALC. We perform a matched case-control study using the group of patients with ALC as cases and their matched patients with non-ALD as controls. Machine learning models (SVM, RF, LightGBM and NaiveBayes) trained and tested on the set of ALC patients achieve a high performance for data classification (AUC = 0.89). 
When testing the same trained models on the small set of ALF patients, their performance unsurprisingly drops substantially (AUC = 0.67 for NaiveBayes). The statistical and machine learning results underscore small groups of upstream and downstream comorbidities that accurately detect ALC patients and show promise in prediction of ALF. Some of these groups are conditions either caused by alcohol or caused by malnutrition associated with alcohol overuse. Others are comorbidities either related to trauma and lifestyle or to complications of cirrhosis, such as oesophageal varices. Our findings highlight the potential of this approach to uncover knowledge in registry data related to ALD.
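
The AUC values quoted above (0.89 and 0.67) summarize how well a model's scores rank cases above controls. As a reading aid, the following Python sketch computes AUC directly from that definition: the fraction of case/control pairs in which the case receives the higher score, with ties counted as half. The labels and scores in the usage lines are made up for illustration and are unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney formulation).

    labels: iterable of 0 (control) or 1 (case)
    scores: iterable of model scores, higher meaning "more likely a case"
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0   # case correctly ranked above control
            elif case_score == control_score:
                wins += 0.5   # ties count as half a win
    return wins / (len(cases) * len(controls))


# Hypothetical scores for two cases and two controls.
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why a drop from 0.89 to 0.67 indicates a considerably weaker signal in the ALF group.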


Subject(s)
Liver Diseases, Alcoholic/epidemiology , Liver Diseases, Alcoholic/pathology , Machine Learning , Models, Statistical , Aged , Aged, 80 and over , Comorbidity , Denmark , Female , Humans , Liver Failure/prevention & control , Male , Middle Aged , Registries , Risk Factors
3.
Elife ; 8, 2019 12 10.
Article in English | MEDLINE | ID: mdl-31818369

ABSTRACT

Diabetes is a diverse and complex disease, with considerable variation in phenotypic manifestation and severity. This variation hampers the study of etiological differences and reduces the statistical power of analyses of associations to genetics, treatment outcomes, and complications. We address these issues through deep, fine-grained phenotypic stratification of a diabetes cohort. Text mining the electronic health records of 14,017 patients, we matched two controlled vocabularies (ICD-10 and a custom vocabulary developed at the clinical center Steno Diabetes Center Copenhagen) to clinical narratives spanning a 19-year period. The two matched vocabularies comprise over 20,000 medical terms describing symptoms, other diagnoses, and lifestyle factors. The cohort is genetically homogeneous (Caucasian diabetes patients from Denmark), so the resulting stratification is not driven by ethnic differences, but rather by inherently dissimilar progression patterns and lifestyle-related risk factors. Using unsupervised Markov clustering, we defined 71 clusters of at least 50 individuals within the diabetes spectrum. The clusters display both distinct and shared longitudinal glycemic dysregulation patterns, temporal co-occurrences of comorbidities, and associations to single nucleotide polymorphisms in or near genes relevant for diabetes comorbidities.
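
For readers unfamiliar with Markov clustering (MCL), the sketch below shows the core loop of the general algorithm in plain Python: add self-loops, column-normalize, then alternate expansion (matrix squaring) and inflation (element-wise power followed by renormalization) until the matrix stops changing. This is a generic textbook-style illustration, not the study's implementation. The abstract does not say how the patient similarity graph was built or which parameters were used, so the toy adjacency matrix, the inflation value of 2.0, and the thresholds are all assumptions.

```python
def normalize_columns(matrix):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Return a set of clusters (frozensets of node indices) for an undirected graph."""
    n = len(adjacency)
    # Self-loops keep every column sum positive and stabilize the iteration.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)                                 # expansion
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded])  # inflation
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Rows that keep non-zero mass act as attractors; their non-zero columns
    # are the members of the corresponding cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return clusters


# Hypothetical similarity graph: two triangles of mutually similar patients.
toy_graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
toy_clusters = markov_cluster(toy_graph)
```

Higher inflation values generally produce more, smaller clusters; a cohort-scale analysis would use a sparse-matrix implementation instead of nested lists.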


Subject(s)
Data Mining , Diabetes Complications/epidemiology , Diabetes Mellitus/epidemiology , Terminology as Topic , Adolescent , Adult , Aged , Aged, 80 and over , Algorithms , Child , Cohort Studies , Denmark/epidemiology , Diabetes Complications/diagnosis , Diabetes Complications/genetics , Diabetes Complications/therapy , Diabetes Mellitus/diagnosis , Diabetes Mellitus/genetics , Diabetes Mellitus/therapy , Electronic Health Records , Female , Humans , Male , Middle Aged , Risk Factors , Treatment Outcome , Vocabulary , Young Adult
4.
J Proteomics ; 75(13): 3886-97, 2012 Jul 16.
Article in English | MEDLINE | ID: mdl-22634085

ABSTRACT

Deubiquitylating enzymes (DUBs) are a large group of proteases that regulate ubiquitin-dependent metabolic pathways by cleaving ubiquitin-protein bonds. Here we present a global study aimed at elucidating the effects DUBs have on protein abundance changes in eukaryotic cells. To this end, we compare wild-type Saccharomyces cerevisiae to 20 DUB knock-out strains using quantitative proteomics to measure proteome-wide expression of isotope-labeled proteins, and analyze the data in the context of known transcription-factor regulatory networks. Overall, we find that protein abundances differ widely between individual deletion strains, demonstrating that removing just a single component from the complex ubiquitin system causes major changes in cellular protein expression. The outcome of our analysis confirms many of the known biological roles for characterized DUBs such as Ubp3p and Ubp8p, and we demonstrate that Sec28p is a novel Ubp3p substrate. In addition, we find strong associations for several uncharacterized DUBs, providing clues for their possible cellular roles. Hierarchical clustering of all deletion strains reveals pronounced similarities between various DUBs, which corroborate current DUB knowledge and uncover novel functional aspects for uncharacterized DUBs. Observations in our analysis support that DUBs induce both direct and indirect effects on protein abundances.


Subject(s)
Proteome/genetics , Saccharomyces cerevisiae/genetics , Ubiquitin/metabolism , Endopeptidases/physiology , Gene Knockout Techniques , Proteomics , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/physiology
5.
Br J Clin Pharmacol ; 73(5): 674-84, 2012 May.
Article in English | MEDLINE | ID: mdl-22122057

ABSTRACT

This literature review included studies that use text-mining techniques in narrative documents stored in electronic patient records (EPRs) to investigate adverse drug reactions (ADRs). We searched PubMed, Embase, Web of Science and International Pharmaceutical Abstracts without restrictions from origin until July 2011. We included empirically based studies on text mining of EPRs that focused on detecting ADRs, excluding those that investigated adverse events not related to medicine use. We extracted information on study populations, EPR data sources, frequencies and types of the identified ADRs, medicines associated with ADRs, text-mining algorithms used and their performance. Seven studies, all from the United States, were eligible for inclusion in the review. Studies were published from 2001, the majority between 2009 and 2010. Text-mining techniques varied over time from simple free text searching of outpatient visit notes and inpatient discharge summaries to more advanced techniques involving natural language processing (NLP) of inpatient discharge summaries. Performance appeared to increase with the use of NLP, although many ADRs were still missed. Due to differences in study design and populations, various types of ADRs were identified and thus we could not make comparisons across studies. The review underscores the feasibility and potential of text mining to investigate narrative documents in EPRs for ADRs. However, more empirical studies are needed to evaluate whether text mining of EPRs can be used systematically to collect new information about ADRs.
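
The "simple free text searching" end of the spectrum described above can be sketched in a few lines of Python. The example is illustrative only and does not reproduce any of the reviewed studies: the term list, the sample note, and the function name are hypothetical, and a real system would rely on a curated vocabulary.

```python
import re

# Hypothetical, deliberately tiny list of adverse-reaction terms.
ADR_TERMS = ["rash", "nausea", "dizziness"]

# One case-insensitive pattern that matches any listed term as a whole word.
ADR_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in ADR_TERMS) + r")\b",
    re.IGNORECASE,
)


def find_adr_mentions(note):
    """Return the sorted, lower-cased set of listed terms mentioned in a note."""
    return sorted({match.group(1).lower() for match in ADR_PATTERN.finditer(note)})


# Made-up clinical note for illustration.
sample_note = "Patient developed a Rash and nausea after the new drug; no dizziness."
mentions = find_adr_mentions(sample_note)
```

Note that the negated phrase "no dizziness" is still reported as a mention, and misspellings or synonyms outside the list are missed entirely. Limitations of this kind are one reason the review reports better performance when NLP techniques are used.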


Subject(s)
Adverse Drug Reaction Reporting Systems/statistics & numerical data , Data Mining/methods , Drug-Related Side Effects and Adverse Reactions , Medical Records Systems, Computerized/statistics & numerical data , Natural Language Processing , Pharmacovigilance , Algorithms , Humans