Search | Biblioteca Virtual en Salud Odontología (Virtual Health Library, Dentistry). Uruguay

1.

Geometric Features Associated with Middle Cerebral Artery Bifurcation Aneurysm Formation: A Matched Case-Control Study.

Zhang, Jian; Can, Anil; Lai, Pui Man Rosalind; Mukundan, Srinivasan; Castro, Victor M; Dligach, Dmitriy; Finan, Sean; Gainer, Vivian S; Shadick, Nancy A; Savova, Guergana; Murphy, Shawn N; Cai, Tianxi; Weiss, Scott T; Du, Rose.

J Stroke Cerebrovasc Dis ; 31(3): 106268, 2022 Mar.

Article in English | MEDLINE | ID: mdl-34974241

ABSTRACT

OBJECTIVES: The pathogenesis of intracranial aneurysms is multifactorial and includes genetic, environmental, and anatomic influences. We aimed to identify image-based morphological parameters that were associated with middle cerebral artery (MCA) bifurcation aneurysms. MATERIALS AND METHODS: We evaluated three-dimensional morphological parameters obtained from CT angiography (CTA) or digital subtraction angiography (DSA) from 317 patients with unilateral MCA bifurcation aneurysms diagnosed at the Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016. We chose the contralateral unaffected MCA bifurcation as the control group, in order to control for genetic and environmental risk factors. Diameters and angles of surrounding parent and daughter vessels of 634 MCAs were examined. RESULTS: Univariable and multivariable statistical analyses were performed to determine statistical significance. Sensitivity analyses with smaller (≤ 3 mm) aneurysms only and with angles excluded, were also performed. In a multivariable conditional logistic regression model we showed that smaller diameter size ratio (OR 0.0004, 95% CI 0.0001-0.15), larger daughter-daughter angles (OR 1.08, 95% CI 1.06-1.11) and larger parent-daughter angle ratios (OR 4.24, 95% CI 1.77-10.16) were significantly associated with MCA aneurysm presence after correcting for other variables. In order to account for possible changes to the vasculature by the aneurysm, a subgroup analysis of small aneurysms (≤ 3 mm) was performed and showed that the results were similar. CONCLUSIONS: Easily measurable morphological parameters of the surrounding vasculature of the MCA may provide objective metrics to assess MCA aneurysm formation risk in high-risk patients.
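
The odds ratio of 1.08 for daughter-daughter angle looks small until it is rescaled. The sketch below assumes the ratio is reported per one-unit (presumably one-degree) increase and that the model is log-linear in the angle; the abstract does not state the unit, so both are assumptions. The function name `scale_odds_ratio` is hypothetical.

```python
import math


def scale_odds_ratio(or_per_unit: float, delta: float) -> float:
    """Odds ratio implied by a `delta`-unit change, given a per-unit odds ratio.

    Assumes a log-linear model: log-odds change by log(or_per_unit) per unit.
    """
    return math.exp(math.log(or_per_unit) * delta)


# Point estimate and 95% CI bounds from the abstract (unit assumed to be 1 degree).
or_10 = scale_odds_ratio(1.08, 10)
ci_low_10 = scale_odds_ratio(1.06, 10)
ci_high_10 = scale_odds_ratio(1.11, 10)

print(f"OR per 10 units: {or_10:.2f} ({ci_low_10:.2f}-{ci_high_10:.2f})")
```

Under those assumptions, a 10-unit difference in angle corresponds to roughly a doubling of the odds.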

Subject(s)

Intracranial Aneurysm , Middle Cerebral Artery , Case-Control Studies , Computed Tomography Angiography , Female , Humans , Intracranial Aneurysm/diagnostic imaging , Middle Cerebral Artery/diagnostic imaging

2.

Open-source Software Sustainability Models: Initial White Paper From the Informatics Technology for Cancer Research Sustainability and Industry Partnership Working Group.

Ye, Ye; Barapatre, Seemran; Davis, Michael K; Elliston, Keith O; Davatzikos, Christos; Fedorov, Andrey; Fillion-Robin, Jean-Christophe; Foster, Ian; Gilbertson, John R; Lasso, Andras; Miller, James V; Morgan, Martin; Pieper, Steve; Raumann, Brigitte E; Sarachan, Brion D; Savova, Guergana; Silverstein, Jonathan C; Taylor, Donald P; Zelnis, Joyce B; Zhang, Guo-Qiang; Cuticchia, Jamie; Becich, Michael J.

J Med Internet Res ; 23(12): e20028, 2021 12 02.

Article in English | MEDLINE | ID: mdl-34860667

ABSTRACT

BACKGROUND: The National Cancer Institute Informatics Technology for Cancer Research (ITCR) program provides a series of funding mechanisms to create an ecosystem of open-source software (OSS) that serves the needs of cancer research. As the ITCR ecosystem substantially grows, it faces the challenge of the long-term sustainability of the software being developed by ITCR grantees. To address this challenge, the ITCR sustainability and industry partnership working group (SIP-WG) was convened in 2019. OBJECTIVE: The charter of the SIP-WG is to investigate options to enhance the long-term sustainability of the OSS being developed by ITCR, in part by developing a collection of business model archetypes that can serve as sustainability plans for ITCR OSS development initiatives. The working group assembled models from the ITCR program, from other studies, and from the engagement of its extensive network of relationships with other organizations (eg, Chan Zuckerberg Initiative, Open Source Initiative, and Software Sustainability Institute) in support of this objective. METHODS: This paper reviews the existing sustainability models and describes 10 OSS use cases disseminated by the SIP-WG and others, including 3D Slicer, Bioconductor, Cytoscape, Globus, i2b2 (Informatics for Integrating Biology and the Bedside) and tranSMART, Insight Toolkit, Linux, Observational Health Data Sciences and Informatics tools, R, and REDCap (Research Electronic Data Capture), in 10 sustainability aspects: governance, documentation, code quality, support, ecosystem collaboration, security, legal, finance, marketing, and dependency hygiene. RESULTS: Information available to the public reveals that all 10 OSS have effective governance, comprehensive documentation, high code quality, reliable dependency hygiene, strong user and developer support, and active marketing. 
These OSS include a variety of licensing models (eg, general public license version 2, general public license version 3, Berkeley Software Distribution, and Apache 3) and financial models (eg, federal research funding, industry and membership support, and commercial support). However, detailed information on ecosystem collaboration and security is not publicly provided by most OSS. CONCLUSIONS: We recommend 6 essential attributes for research software: alignment with unmet scientific needs, a dedicated development team, a vibrant user community, a feasible licensing model, a sustainable financial model, and effective product management. We also stress important actions to be considered in future ITCR activities that involve the discussion of the sustainability and licensing models for ITCR OSS, the establishment of a central library, the allocation of consulting resources to code quality control, ecosystem collaboration, security, and dependency hygiene.

Subject(s)

Ecosystem , Neoplasms , Humans , Informatics , Neoplasms/therapy , Research , Software , Technology

3.

Use of Narrative Concepts in Electronic Health Records to Validate Associations Between Genetic Factors and Response to Treatment of Inflammatory Bowel Diseases.

Ananthakrishnan, Ashwin N; Cagan, Andrew; Cai, Tianxi; Gainer, Vivian S; Savova, Guergana; Shaw, Stanley Y; Churchill, Susanne; Burke, Kristin E; Karlson, Elizabeth W; Murphy, Shawn N; Kohane, Isaac; Liao, Katherine P; Xavier, Ramnik J.

Clin Gastroenterol Hepatol ; 18(8): 1890-1892, 2020 07.

Article in English | MEDLINE | ID: mdl-31404664

ABSTRACT

Crohn's disease (CD) and ulcerative colitis (UC) are heterogeneous. With the availability of therapeutic classes with distinct immunologic mechanisms of action, it has become imperative to identify markers that predict the likelihood of response to each drug class. However, robust development of such tools has been challenging because it requires large prospective cohorts with systematic and careful assessment of treatment response using validated indices. Most hospitals in the United States use electronic health records (EHRs) that warehouse a large amount of narrative (free-text) and codified (administrative) data generated during routine clinical care. These data have been used to construct virtual disease cohorts for epidemiologic research as well as for defining the genetic basis of disease states or discrete laboratory values [1-3]. Whether EHR-based data can be used to validate genetic associations for more nuanced outcomes such as treatment response has not been examined previously.

Subject(s)

Colitis, Ulcerative , Crohn Disease , Inflammatory Bowel Diseases , Electronic Health Records , Humans , Inflammatory Bowel Diseases/drug therapy , Prospective Studies , United States

4.

Supervised methods to extract clinical events from cardiology reports in Italian.

Viani, Natalia; Miller, Timothy A; Napolitano, Carlo; Priori, Silvia G; Savova, Guergana K; Bellazzi, Riccardo; Sacchi, Lucia.

J Biomed Inform ; 95: 103219, 2019 07.

Article in English | MEDLINE | ID: mdl-31150777

ABSTRACT

Clinical narratives are a valuable source of information for both patient care and biomedical research. Given the unstructured nature of medical reports, specific automatic techniques are required to extract relevant entities from such texts. In the natural language processing (NLP) community, this task is often addressed by using supervised methods. To develop such methods, both reliably-annotated corpora and elaborately designed features are needed. Despite the recent advances on corpora collection and annotation, research on multiple domains and languages is still limited. In addition, to compute the features required for supervised classification, suitable language- and domain-specific tools are needed. In this work, we propose a novel application of recurrent neural networks (RNNs) for event extraction from medical reports written in Italian. To train and evaluate the proposed approach, we annotated a corpus of 75 cardiology reports for a total of 4365 mentions of relevant events and their attributes (e.g., the polarity). For the annotation task, we developed specific annotation guidelines, which are provided together with this paper. The RNN-based classifier was trained on a training set including 3335 events (60 documents). The resulting model was integrated into an NLP pipeline that uses a dictionary lookup approach to search for relevant concepts inside the text. A test set of 1030 events (15 documents) was used to evaluate and compare different pipeline configurations. As a main result, using the RNN-based classifier instead of the dictionary lookup approach allowed increasing recall from 52.4% to 88.9%, and precision from 81.1% to 88.2%. Further, using the two methods in combination, we obtained final recall, precision, and F1 score of 91.7%, 88.6%, and 90.1%, respectively. 
These experiments indicate that integrating a well-performing RNN-based classifier with a standard knowledge-based approach can be a good strategy to extract information from clinical text in non-English languages.
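
The reported F1 score can be checked directly from the reported precision and recall, since F1 is their harmonic mean. The short sketch below does that arithmetic for the combined configuration; the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Combined RNN + dictionary lookup configuration, as reported in the abstract.
combined_f1 = f1_score(precision=0.886, recall=0.917)
print(f"Combined F1: {combined_f1 * 100:.1f}%")
```

The result agrees with the 90.1% given in the abstract.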

Subject(s)

Data Mining/methods , Electronic Health Records , Natural Language Processing , Heart Diseases , Humans , Italy , Neural Networks, Computer , Semantics

5.

Elevated International Normalized Ratio Is Associated With Ruptured Aneurysms.

Can, Anil; Castro, Victor M; Dligach, Dmitriy; Finan, Sean; Yu, Sheng; Gainer, Vivian; Shadick, Nancy A; Savova, Guergana; Murphy, Shawn; Cai, Tianxi; Weiss, Scott T; Du, Rose.

Stroke ; 49(9): 2046-2052, 2018 09.

Article in English | MEDLINE | ID: mdl-30354989

ABSTRACT

BACKGROUND AND PURPOSE: The effects of anticoagulation therapy and elevated international normalized ratio (INR) values on the risk of aneurysmal subarachnoid hemorrhage are unknown. We aimed to investigate the association between anticoagulation therapy, elevated INR values, and rupture of intracranial aneurysms. METHODS: We conducted a case-control study of 4696 patients with 6403 intracranial aneurysms, including 1198 prospective patients, diagnosed at the Massachusetts General Hospital and the Brigham and Women's Hospital between 1990 and 2016 who were on no anticoagulant therapy or on warfarin for anticoagulation. Patients were divided into ruptured and nonruptured groups. Univariable and multivariable logistic regression analyses were performed to evaluate the association of anticoagulation therapy, INR values, and presentation with a ruptured intracranial aneurysm, taking into account the interaction between anticoagulant use and INR. Inverse probability weighting using propensity scores was used to minimize differences in baseline demographic characteristics. The marginal effects of anticoagulant use on rupture risk stratified by INR values were calculated. RESULTS: In unweighted and weighted multivariable analyses, elevated INR values were significantly associated with rupture status among patients who were not anticoagulated (unweighted odds ratio, 22.78; 95% CI, 10.85-47.81 and weighted odds ratio, 28.16; 95% CI, 12.44-63.77). In anticoagulated patients, warfarin use interacted significantly with INR when INR was ≥1.2, decreasing the effect of INR on rupture risk. CONCLUSIONS: INR elevation is associated with intracranial aneurysm rupture, but the effects may be moderated by warfarin. INR values should, therefore, be taken into consideration when counseling patients with intracranial aneurysms.

Subject(s)

Aneurysm, Ruptured/epidemiology , Anticoagulants/therapeutic use , International Normalized Ratio , Intracranial Aneurysm , Subarachnoid Hemorrhage/epidemiology , Warfarin/therapeutic use , Adult , Aged , Aneurysm, Ruptured/blood , Case-Control Studies , Female , Humans , Logistic Models , Male , Middle Aged , Multivariate Analysis , Odds Ratio , Propensity Score , Risk Factors , Rupture, Spontaneous , Subarachnoid Hemorrhage/blood

6.

Antihyperglycemic Agents Are Inversely Associated With Intracranial Aneurysm Rupture.

Can, Anil; Castro, Victor M; Yu, Sheng; Dligach, Dmitriy; Finan, Sean; Gainer, Vivian S; Shadick, Nancy A; Savova, Guergana; Murphy, Shawn; Cai, Tianxi; Weiss, Scott T; Du, Rose.

Stroke ; 49(1): 34-39, 2018 01.

Article in English | MEDLINE | ID: mdl-29203688

ABSTRACT

BACKGROUND AND PURPOSE: Previous studies have suggested a protective effect of diabetes mellitus on aneurysmal subarachnoid hemorrhage risk. However, reports are inconsistent, and objective measures of hyperglycemia in these studies are lacking. Our aim was to investigate the association between aneurysmal subarachnoid hemorrhage and antihyperglycemic agent use and glycated hemoglobin levels. METHODS: The medical records of 4701 patients with 6411 intracranial aneurysms, including 1201 prospective patients, diagnosed at the Massachusetts General Hospital and Brigham and Women's Hospital between 1990 and 2016 were reviewed and analyzed. Patients were separated into ruptured and nonruptured groups. Univariate and multivariate logistic regression analyses were performed to determine the association between aneurysmal subarachnoid hemorrhage and antihyperglycemic agents and glycated hemoglobin levels. Propensity score weighting was used to account for selection bias. RESULTS: In both unweighted and weighted multivariate analysis, antihyperglycemic agent use was inversely and significantly associated with ruptured aneurysms (unweighted odds ratio, 0.58; 95% confidence interval, 0.39-0.87; weighted odds ratio, 0.57; 95% confidence interval, 0.34-0.96). In contrast, glycated hemoglobin levels were not significantly associated with rupture status. CONCLUSIONS: Antihyperglycemic agent use rather than hyperglycemia is associated with decreased risk of aneurysmal subarachnoid hemorrhage, suggesting a possible protective effect of glucose-lowering agents in the pathogenesis of aneurysm rupture.

Subject(s)

Aneurysm, Ruptured , Glycated Hemoglobin/metabolism , Hypoglycemic Agents/administration & dosage , Intracranial Aneurysm , Subarachnoid Hemorrhage , Adult , Aged , Aneurysm, Ruptured/blood , Aneurysm, Ruptured/epidemiology , Aneurysm, Ruptured/etiology , Aneurysm, Ruptured/physiopathology , Female , Humans , Hypoglycemic Agents/adverse effects , Intracranial Aneurysm/blood , Intracranial Aneurysm/epidemiology , Intracranial Aneurysm/etiology , Intracranial Aneurysm/physiopathology , Male , Middle Aged , Risk Factors , Subarachnoid Hemorrhage/blood , Subarachnoid Hemorrhage/epidemiology , Subarachnoid Hemorrhage/etiology , Subarachnoid Hemorrhage/physiopathology

7.

Low Serum Calcium and Magnesium Levels and Rupture of Intracranial Aneurysms.

Can, Anil; Rudy, Robert F; Castro, Victor M; Dligach, Dmitriy; Finan, Sean; Yu, Sheng; Gainer, Vivian; Shadick, Nancy A; Savova, Guergana; Murphy, Shawn; Cai, Tianxi; Weiss, Scott T; Du, Rose.

Stroke ; 49(7): 1747-1750, 2018 07.

Article in English | MEDLINE | ID: mdl-29844027

ABSTRACT

BACKGROUND AND PURPOSE: Both low serum calcium and magnesium levels have been associated with the extent of bleeding in patients with intracerebral hemorrhage, suggesting hypocalcemia- and hypomagnesemia-induced coagulopathy as a possible underlying mechanism. We hypothesized that serum albumin-corrected total calcium and magnesium levels are associated with ruptured intracranial aneurysms. METHODS: The medical records of 4701 patients, including 1201 prospective patients, diagnosed at the Brigham and Women's Hospital and Massachusetts General Hospital between 1990 and 2016 were reviewed and analyzed. One thousand two hundred seventy-five patients had available serum calcium, magnesium, and albumin values within 1 day of diagnosis. Individuals were divided into cases with ruptured aneurysms and controls with unruptured aneurysms. Univariable and multivariable logistic regression analyses were performed to determine the association between serum albumin-corrected total calcium and magnesium levels and ruptured aneurysms. RESULTS: In multivariable analysis, both albumin-corrected calcium (odds ratio, 0.33; 95% confidence interval, 0.27-0.40) and magnesium (odds ratio, 0.40; 95% confidence interval, 0.28-0.55) were significantly and inversely associated with ruptured intracranial aneurysms. CONCLUSIONS: In this large case-control study, hypocalcemia and hypomagnesemia at diagnosis were significantly associated with ruptured aneurysms. Impaired hemostasis caused by hypocalcemia and hypomagnesemia may explain this association.
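
The abstract does not say which albumin correction was applied. As an illustration only, the sketch below uses one widely used convention: corrected calcium (mg/dL) = measured calcium + 0.8 × (4.0 − albumin in g/dL). The formula choice, the reference albumin of 4.0 g/dL, and the function name are assumptions, not the authors' stated method.

```python
def albumin_corrected_calcium(
    total_calcium_mg_dl: float,
    albumin_g_dl: float,
    reference_albumin_g_dl: float = 4.0,  # assumed reference value
    slope: float = 0.8,  # assumed mg/dL of calcium per g/dL of albumin
) -> float:
    """Albumin-corrected total calcium under a common linear correction.

    This is an assumed convention; the study may have used a different one.
    """
    return total_calcium_mg_dl + slope * (reference_albumin_g_dl - albumin_g_dl)


# A measured calcium of 8.0 mg/dL with albumin 3.0 g/dL is corrected upward.
corrected = albumin_corrected_calcium(8.0, 3.0)
print(f"Corrected calcium: {corrected:.1f} mg/dL")
```

With albumin at the reference value the correction is zero, so the measured and corrected values coincide.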

Subject(s)

Aneurysm, Ruptured/blood , Calcium/blood , Intracranial Aneurysm/blood , Magnesium/blood , Case-Control Studies , Female , Humans , Male , Prospective Studies

8.

Lipid-Lowering Agents and High HDL (High-Density Lipoprotein) Are Inversely Associated With Intracranial Aneurysm Rupture.

Can, Anil; Castro, Victor M; Dligach, Dmitriy; Finan, Sean; Yu, Sheng; Gainer, Vivian; Shadick, Nancy A; Savova, Guergana; Murphy, Shawn; Cai, Tianxi; Weiss, Scott T; Du, Rose.

Stroke ; 49(5): 1148-1154, 2018 05.

Article in English | MEDLINE | ID: mdl-29622625

ABSTRACT

BACKGROUND AND PURPOSE: Growing evidence from experimental animal models and clinical studies suggests the protective effect of statin use against rupture of intracranial aneurysms; however, results from large studies detailing the relationship between intracranial aneurysm rupture and total cholesterol, HDL (high-density lipoprotein), LDL (low-density lipoprotein), and lipid-lowering agent use are lacking. METHODS: The medical records of 4701 patients with 6411 intracranial aneurysms diagnosed at the Massachusetts General Hospital and the Brigham and Women's Hospital between 1990 and 2016 were reviewed and analyzed. Patients were separated into ruptured and nonruptured groups. Univariable and multivariable logistic regression analyses were performed to determine the effects of lipids (total cholesterol, LDL, and HDL) and lipid-lowering medications on intracranial aneurysm rupture risk. Propensity score weighting was used to account for differences in baseline characteristics of the cohorts. RESULTS: Lipid-lowering agent use was significantly inversely associated with rupture status (odds ratio, 0.58; 95% confidence interval, 0.47-0.71). In a subgroup analysis of complete cases that includes both lipid-lowering agent use and lipid values, higher HDL levels (odds ratio, 0.95; 95% confidence interval, 0.93-0.98) and lipid-lowering agent use (odds ratio, 0.41; 95% confidence interval, 0.23-0.73) were both significantly and inversely associated with rupture status, whereas total cholesterol and LDL levels were not significant. A monotonic exposure-response curve between HDL levels and risk of aneurysmal rupture was obtained. CONCLUSIONS: Higher HDL values and the use of lipid-lowering agents are significantly inversely associated with ruptured intracranial aneurysms.

Subject(s)

Aneurysm, Ruptured/epidemiology , Cholesterol, HDL/blood , Hypolipidemic Agents/therapeutic use , Intracranial Aneurysm/epidemiology , Adult , Aged , Aneurysm, Ruptured/blood , Benzimidazoles/therapeutic use , Cholesterol, LDL/blood , Cholestyramine Resin/therapeutic use , Colestipol/therapeutic use , Ezetimibe/therapeutic use , Female , Fibric Acids/therapeutic use , Humans , Hydroxymethylglutaryl-CoA Reductase Inhibitors/therapeutic use , Intracranial Aneurysm/blood , Logistic Models , Male , Middle Aged , Multivariate Analysis , Odds Ratio , Oligonucleotides/therapeutic use , PCSK9 Inhibitors , Propensity Score , Protective Factors

9.

Phelan-McDermid syndrome data network: Integrating patient reported outcomes with clinical notes and curated genetic reports.

Kothari, Cartik; Wack, Maxime; Hassen-Khodja, Claire; Finan, Sean; Savova, Guergana; O'Boyle, Megan; Bliss, Geraldine; Cornell, Andria; Horn, Elizabeth J; Davis, Rebecca; Jacobs, Jacquelyn; Kohane, Isaac; Avillach, Paul.

Am J Med Genet B Neuropsychiatr Genet ; 177(7): 613-624, 2018 10.

Article in English | MEDLINE | ID: mdl-28862395

ABSTRACT

The heterogeneity of patient phenotype data is an impediment to research into the origins and progression of neuropsychiatric disorders. This difficulty is compounded in the case of rare disorders such as Phelan-McDermid Syndrome (PMS) by the paucity of patient clinical data. PMS is a rare syndromic genetic cause of autism and intellectual deficiency. In this paper, we describe the Phelan-McDermid Syndrome Data Network (PMS_DN), a platform that facilitates research into phenotype-genotype correlation and progression of PMS by: a) integrating knowledge of patient phenotypes extracted from Patient Reported Outcomes (PRO) data and clinical notes (two heterogeneous, underutilized sources of knowledge about patient phenotypes) with curated genetic information from the same patient cohort and b) making this integrated knowledge, along with a suite of statistical tools, available free of charge to authorized investigators on a Web portal (https://pmsdn.hms.harvard.edu). PMS_DN is a Patient Centric Outcomes Research Initiative (PCORI) where patients and their families are involved in all aspects of the management of patient data in driving research into PMS. To foster collaborative research, PMS_DN also makes patient aggregates from this knowledge available to authorized investigators using distributed research networks such as the PCORnet PopMedNet. PMS_DN is hosted on a scalable cloud-based environment and complies with all patient data privacy regulations. As of October 31, 2016, PMS_DN integrates high-quality knowledge extracted from the clinical notes of 112 patients and curated genetic reports of 176 patients with preprocessed PRO data from 415 patients.

Subject(s)

Data Mining/methods , Genetic Association Studies/methods , Information Storage and Retrieval/methods , Autism Spectrum Disorder/genetics , Chromosome Deletion , Chromosome Disorders/genetics , Chromosome Disorders/physiopathology , Chromosomes, Human, Pair 22/genetics , Cohort Studies , Databases, Genetic , Female , Humans , Intellectual Disability/genetics , Male , Medical Records , Nerve Tissue Proteins/genetics , Patient Reported Outcome Measures , Phenotype

10.

Towards generalizable entity-centric clinical coreference resolution.

Miller, Timothy; Dligach, Dmitriy; Bethard, Steven; Lin, Chen; Savova, Guergana.

J Biomed Inform ; 69: 251-258, 2017 05.

Article in English | MEDLINE | ID: mdl-28438706

ABSTRACT

OBJECTIVE: This work investigates the problem of clinical coreference resolution in a model that explicitly tracks entities, and aims to measure the performance of that model in both traditional in-domain train/test splits and cross-domain experiments that measure the generalizability of learned models. METHODS: The two methods we compare are a baseline mention-pair coreference system that operates over pairs of mentions with best-first conflict resolution and a mention-synchronous system that incrementally builds coreference chains. We develop new features that incorporate distributional semantics, discourse features, and entity attributes. We use two new coreference datasets with similar annotation guidelines - the THYME colon cancer dataset and the DeepPhe breast cancer dataset. RESULTS: The mention-synchronous system performs similarly on in-domain data but performs much better on new data. Part of speech tag features prove superior in feature generalizability experiments over other word representations. Our methods show generalization improvement but there is still a performance gap when testing in new domains. DISCUSSION: Generalizability of clinical NLP systems is important and under-studied, so future work should attempt to perform cross-domain and cross-institution evaluations and explicitly develop features and training regimens that favor generalizability. A performance-optimized version of the mention-synchronous system will be included in the open source Apache cTAKES software.
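
To make the baseline concrete, here is a minimal sketch of mention-pair coreference with best-first linking: each mention is attached to its single highest-scoring earlier mention, provided the score clears a threshold. The token-overlap scorer, the threshold, and all names are hypothetical stand-ins; the paper's system uses a trained pairwise classifier with much richer features.

```python
from typing import Callable, Dict, List


def best_first_chains(
    mentions: List[str],
    score: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[List[str]]:
    """Group mentions into chains by linking each to its best-scoring antecedent."""
    chain_of: Dict[int, int] = {}  # mention index -> chain index
    chains: List[List[int]] = []
    for j, mention in enumerate(mentions):
        best_i, best_s = None, threshold
        for i in range(j):  # candidate antecedents are all earlier mentions
            s = score(mentions[i], mention)
            if s > best_s:
                best_i, best_s = i, s
        if best_i is None:  # no antecedent is good enough: start a new chain
            chain_of[j] = len(chains)
            chains.append([j])
        else:  # join the chain of the best antecedent
            chain_of[j] = chain_of[best_i]
            chains[chain_of[j]].append(j)
    return [[mentions[k] for k in chain] for chain in chains]


def toy_score(a: str, b: str) -> float:
    """Hypothetical pair scorer: Jaccard overlap of lowercase tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)


chains = best_first_chains(
    ["the tumor", "a colonoscopy", "the tumor mass", "colonoscopy"],
    toy_score,
    threshold=0.3,
)
print(chains)
```

A mention-synchronous system differs in that it scores each new mention against the partially built chains rather than against individual earlier mentions.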

Subject(s)

APACHE , Electronic Health Records , Natural Language Processing , Semantics , Software , Humans

11.

An information model for computable cancer phenotypes.

Hochheiser, Harry; Castine, Melissa; Harris, David; Savova, Guergana; Jacobson, Rebecca S.

BMC Med Inform Decis Mak ; 16(1): 121, 2016 09 15.

Article in English | MEDLINE | ID: mdl-27629872

ABSTRACT

BACKGROUND: Standards, methods, and tools supporting the integration of clinical data and genomic information are an area of significant need and rapid growth in biomedical informatics. Integration of cancer clinical data and cancer genomic information poses unique challenges, because of the high volume and complexity of clinical data, as well as the heterogeneity and instability of cancer genome data when compared with germline data. Current information models of clinical and genomic data are not sufficiently expressive to represent individual observations and to aggregate those observations into longitudinal summaries over the course of cancer care. These models are acutely needed to support the development of systems and tools for generating the so called clinical "deep phenotype" of individual cancer patients, a process which remains almost entirely manual in cancer research and precision medicine. METHODS: Reviews of existing ontologies and interviews with cancer researchers were used to inform iterative development of a cancer phenotype information model. We translated a subset of the Fast Healthcare Interoperability Resources (FHIR) models into the OWL 2 Description Logic (DL) representation, and added extensions as needed for modeling cancer phenotypes with terms derived from the NCI Thesaurus. Models were validated with domain experts and evaluated against competency questions. RESULTS: The DeepPhe Information model represents cancer phenotype data at increasing levels of abstraction from mention level in clinical documents to summaries of key events and findings. We describe the model using breast cancer as an example, depicting methods to represent phenotypic features of cancers, tumors, treatment regimens, and specific biologic behaviors that span the entire course of a patient's disease. 
CONCLUSIONS: We present a multi-scale information model for representing individual document mentions, document level classifications, episodes along a disease course, and phenotype summarization, linking individual observations to high-level summaries in support of subsequent integration and analysis.

Subject(s)

Computational Biology/methods , Models, Theoretical , Neoplasms/classification , Phenotype , Humans

12.

Identification of subjects with polycystic ovary syndrome using electronic health records.

Castro, Victor; Shen, Yuanyuan; Yu, Sheng; Finan, Sean; Pau, Cindy Ta; Gainer, Vivian; Keefe, Candace C; Savova, Guergana; Murphy, Shawn N; Cai, Tianxi; Welt, Corrine K.

Reprod Biol Endocrinol ; 13: 116, 2015 Oct 29.

Article in English | MEDLINE | ID: mdl-26510685

ABSTRACT

BACKGROUND: Polycystic ovary syndrome (PCOS) is a heterogeneous disorder because of the variable criteria used for diagnosis. Therefore, International Classification of Diseases 9 (ICD-9) codes may not accurately capture the diagnostic criteria necessary for large scale PCOS identification. We hypothesized that use of electronic medical records text and data would more specifically capture PCOS subjects. METHODS: Subjects with PCOS were identified in the Partners Healthcare Research Patients Data Registry by searching for the term "polycystic ovary syndrome" using natural language processing (n = 24,930). A training subset of 199 identified charts was reviewed and categorized based on likelihood of a true Rotterdam PCOS diagnosis, i.e. two out of three of the following: irregular menstrual cycles, hyperandrogenism and/or polycystic ovary morphology. Data from the history, physical exam, laboratory and radiology results were codified and extracted from notes of definite PCOS subjects. Thirty-two terms were used to build an algorithm for identifying definite PCOS cases and applied to the rest of the dataset. The positive predictive value cutoff was set at 76.8 % to maximize the number of subjects available for study. A true positive predictive value for the algorithm was calculated after review of 100 charts from subjects identified as definite PCOS cases with at least two documented Rotterdam criteria. The positive predictive value was compared to that calculated using 200 charts identified using the ICD-9 code for PCOS (256.4; n = 13,670). In addition, a cohort of previously recruited PCOS subjects was submitted for algorithm validation. RESULTS: Chart review demonstrated that 64 % were confirmed as definitely PCOS using the algorithm, with a 9 % false positive rate. 66 % of subjects identified by ICD-9 code for PCOS could be confirmed as definitely PCOS, with an 8.5 % false positive rate. 
There was no significant difference in the positive predictive values using the two methods (p = 0.2). However, the number of charts that had insufficient confirmatory data was lower using the algorithm (5 % vs 11 %; p < 0.04). Of 477 subjects with PCOS recruited and examined individually and present in the database as patients, 451 were found within the algorithm dataset. CONCLUSIONS: Extraction of text parameters along with codified data improves the confidence in PCOS patient cohorts identified using the electronic medical record. However, the positive predictive value was not significantly different when using ICD-9 codes or the specific algorithm. Further studies are needed to determine the positive predictive value of the two methods in additional electronic medical record datasets.

Subject(s)

Electronic Health Records , Adult , Algorithms , Databases, Factual , Female , Humans , Middle Aged , Polycystic Ovary Syndrome/diagnosis

13.

An Introduction to Natural Language Processing: How You Can Get More From Those Electronic Notes You Are Generating.

Kimia, Amir A; Savova, Guergana; Landschaft, Assaf; Harper, Marvin B.

Pediatr Emerg Care ; 31(7): 536-41, 2015 Jul.

Article in English | MEDLINE | ID: mdl-26148107

ABSTRACT

Electronically stored clinical documents may contain both structured data and unstructured data. The use of structured clinical data varies by facility, but clinicians are familiar with coded data such as International Classification of Diseases, Ninth Revision and Systematized Nomenclature of Medicine-Clinical Terms codes, and commonly with other data including patient chief complaints or laboratory results. Most electronic health records have much more clinical information stored as unstructured data; for example, clinical narratives such as history of present illness, procedure notes, and clinical decision making are stored this way. Despite the importance of this information, electronic capture or retrieval of unstructured clinical data has been challenging. The field of natural language processing (NLP) is undergoing rapid development, and existing tools can be successfully used for quality improvement, research, healthcare coding, and even billing compliance. In this brief review, we provide examples of successful uses of NLP using emergency medicine physician visit notes for various projects, describe the challenges of retrieving specific data, and finally present practical methods that can run on a standard personal computer as well as high-end state-of-the-art funded processes run by leading NLP informatics researchers.

Subject(s)

Clinical Coding , Electronic Health Records , Information Storage and Retrieval , Natural Language Processing , Humans

14.

Using natural language processing to improve efficiency of manual chart abstraction in research: the case of breast cancer recurrence.

Carrell, David S; Halgrim, Scott; Tran, Diem-Thy; Buist, Diana S M; Chubak, Jessica; Chapman, Wendy W; Savova, Guergana.

Am J Epidemiol ; 179(6): 749-58, 2014 Mar 15.

Article in English | MEDLINE | ID: mdl-24488511

ABSTRACT

The increasing availability of electronic health records (EHRs) creates opportunities for automated extraction of information from clinical text. We hypothesized that natural language processing (NLP) could substantially reduce the burden of manual abstraction in studies examining outcomes, like cancer recurrence, that are documented in unstructured clinical text, such as progress notes, radiology reports, and pathology reports. We developed an NLP-based system using open-source software to process electronic clinical notes from 1995 to 2012 for women with early-stage incident breast cancers to identify whether and when recurrences were diagnosed. We developed and evaluated the system using clinical notes from 1,472 patients receiving EHR-documented care in an integrated health care system in the Pacific Northwest. A separate study provided the patient-level reference standard for recurrence status and date. The NLP-based system correctly identified 92% of recurrences and estimated diagnosis dates within 30 days for 88% of these. Specificity was 96%. The NLP-based system overlooked 5 of 65 recurrences, 4 because electronic documents were unavailable. The NLP-based system identified 5 other recurrences incorrectly classified as nonrecurrent in the reference standard. If used in similar cohorts, NLP could reduce by 90% the number of EHR charts abstracted to identify confirmed breast cancer recurrence cases at a rate comparable to traditional abstraction.
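The sensitivity figure in this abstract can be checked from the counts it reports (65 recurrences, 5 overlooked). The snippet below is only an arithmetic sketch; the variable names are illustrative and do not come from the paper.

```python
# Counts taken from the abstract: 65 reference-standard recurrences,
# 5 of which the NLP-based system overlooked.
total_recurrences = 65
missed_recurrences = 5

# Sensitivity (recall) = correctly identified recurrences / all recurrences.
identified = total_recurrences - missed_recurrences
sensitivity = identified / total_recurrences

print(f"identified {identified} of {total_recurrences}: {sensitivity:.1%}")
```

The result rounds to the 92% stated in the abstract.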

Subject(s)

Breast Neoplasms/diagnosis , Electronic Health Records/statistics & numerical data , Natural Language Processing , Neoplasm Recurrence, Local/diagnosis , Age Factors , Aged , Breast Neoplasms/physiopathology , Breast Neoplasms/therapy , Female , Humans , Middle Aged , Neoplasm Grading , Neoplasm Recurrence, Local/physiopathology , Neoplasm Recurrence, Local/therapy , Reference Standards , Reproducibility of Results

15.

Evaluating the ChatGPT family of models for biomedical reasoning and classification.

Chen, Shan; Li, Yingya; Lu, Sheng; Van, Hoang; Aerts, Hugo J W L; Savova, Guergana K; Bitterman, Danielle S.

J Am Med Inform Assoc ; 31(4): 940-948, 2024 Apr 03.

Article in English | MEDLINE | ID: mdl-38261400

ABSTRACT

OBJECTIVE: Large language models (LLMs) have shown impressive ability in biomedical question-answering, but have not been adequately investigated for more specific biomedical applications. This study investigates the ChatGPT family of models (GPT-3.5, GPT-4) in biomedical tasks beyond question-answering. MATERIALS AND METHODS: We evaluated model performance with 11 122 samples for two fundamental tasks in the biomedical domain: classification (n = 8676) and reasoning (n = 2446). The first task involves classifying health advice in scientific literature, while the second task is detecting causal relations in biomedical literature. We used 20% of the dataset for prompt development, including zero- and few-shot settings with and without chain-of-thought (CoT). We then evaluated the best prompts from each setting on the remaining dataset, comparing them to models using simple features (BoW with logistic regression) and fine-tuned BioBERT models. RESULTS: Fine-tuning BioBERT produced the best classification (F1: 0.800-0.902) and reasoning (F1: 0.851) results. Among LLM approaches, few-shot CoT achieved the best classification (F1: 0.671-0.770) and reasoning (F1: 0.682) results, comparable to the BoW model (F1: 0.602-0.753 and 0.675 for classification and reasoning, respectively). It took 78 h to obtain the best LLM results, compared to 0.078 and 0.008 h for the top-performing BioBERT and BoW models, respectively. DISCUSSION: The simple BoW model performed similarly to the most complex LLM prompting. Prompt engineering required significant investment. CONCLUSION: Despite the excitement around viral ChatGPT, fine-tuning for two fundamental biomedical natural language processing tasks remained the best strategy.

Subject(s)

Language , Natural Language Processing

16.

Family history as the strongest predictor of aortic and peripheral aneurysms in patients with intracranial aneurysms.

Lai, Pui Man Rosalind; Akama-Garren, Elliot; Can, Anil; Tirado, Selena-Rae; Castro, Victor M; Dligach, Dmitriy; Finan, Sean; Gainer, Vivian S; Shadick, Nancy A; Savova, Guergana; Murphy, Shawn N; Cai, Tianxi; Weiss, Scott T; Du, Rose.

J Clin Neurosci ; 126: 128-134, 2024 Jun 12.

Article in English | MEDLINE | ID: mdl-38870642

ABSTRACT

OBJECTIVE: Intracranial aneurysms (IA) and aortic aneurysms (AA) are both abnormal dilations of arteries with familial predisposition and have been proposed to share co-prevalence and pathophysiology. Associations of IA and non-aortic peripheral aneurysms are less well-studied. The goal of the study was to understand the patterns of aortic and peripheral (extracranial) aneurysms in patients with IA, and risk factors associated with the development of these aneurysms. METHODS: 4701 patients were included in our retrospective analysis of all patients with intracranial aneurysms at our institution over the past 26 years. Patient demographics, comorbidities, and aneurysmal locations were analyzed. Univariate and multivariate analyses were performed to study associations with and without extracranial aneurysms. RESULTS: A total of 3.4% of patients (161 of 4701) with IA had at least one extracranial aneurysm. 2.8% had thoracic or abdominal aortic aneurysms. Age, male sex, hypertension, coronary artery disease, history of ischemic cerebral infarction, connective tissue disease, and family history of extracranial aneurysms in a first-degree relative were associated with the presence of extracranial aneurysms and a higher number of extracranial aneurysms. In addition, family history of extracranial aneurysms in a second-degree relative was associated with the presence of extracranial aneurysms, and atrial fibrillation was associated with a higher number of extracranial aneurysms. CONCLUSION: Significant comorbidities are associated with extracranial aneurysms in patients with IA. Family history of extracranial aneurysms has the strongest association and suggests that IA patients with a family history of extracranial aneurysms may benefit from screening.

17.

Large language models to identify social determinants of health in electronic health records.

Guevara, Marco; Chen, Shan; Thomas, Spencer; Chaunzwa, Tafadzwa L; Franco, Idalid; Kann, Benjamin H; Moningi, Shalini; Qian, Jack M; Goldstein, Madeleine; Harper, Susan; Aerts, Hugo J W L; Catalano, Paul J; Savova, Guergana K; Mak, Raymond H; Bitterman, Danielle S.

NPJ Digit Med ; 7(1): 6, 2024 Jan 11.

Article in English | MEDLINE | ID: mdl-38200151

ABSTRACT

Social determinants of health (SDoH) play a critical role in patient outcomes, yet their documentation is often missing or incomplete in the structured data of electronic health records (EHRs). Large language models (LLMs) could enable high-throughput extraction of SDoH from the EHR to support research and clinical care. However, class imbalance and data limitations present challenges for this sparsely documented yet critical information. Here, we investigated the optimal methods for using LLMs to extract six SDoH categories from narrative text in the EHR: employment, housing, transportation, parental status, relationship, and social support. The best-performing models were fine-tuned Flan-T5 XL for any SDoH mentions (macro-F1 0.71), and Flan-T5 XXL for adverse SDoH mentions (macro-F1 0.70). The effect of adding LLM-generated synthetic data to training varied across models and architectures, but it improved the performance of smaller Flan-T5 models (delta F1 +0.12 to +0.23). Our best fine-tuned models outperformed ChatGPT-family models in the zero- and few-shot settings, except GPT4 with 10-shot prompting for adverse SDoH. Fine-tuned models were less likely than ChatGPT to change their prediction when race/ethnicity and gender descriptors were added to the text, suggesting less algorithmic bias (p < 0.05). Our models identified 93.8% of patients with adverse SDoH, while ICD-10 codes captured 2.0%. These results demonstrate the potential of LLMs in improving real-world evidence on SDoH and assisting in identifying patients who could benefit from resource support.

18.

Similar risk of depression and anxiety following surgery or hospitalization for Crohn's disease and ulcerative colitis.

Ananthakrishnan, Ashwin N; Gainer, Vivian S; Cai, Tianxi; Perez, Raul Guzman; Cheng, Su-Chun; Savova, Guergana; Chen, Pei; Szolovits, Peter; Xia, Zongqi; De Jager, Philip L; Shaw, Stanley; Churchill, Susanne; Karlson, Elizabeth W; Kohane, Isaac; Perlis, Roy H; Plenge, Robert M; Murphy, Shawn N; Liao, Katherine P.

Am J Gastroenterol ; 108(4): 594-601, 2013 Apr.

Article in English | MEDLINE | ID: mdl-23337479

ABSTRACT

OBJECTIVES: Psychiatric comorbidity is common in Crohn's disease (CD) and ulcerative colitis (UC). Inflammatory bowel disease (IBD)-related surgery or hospitalizations represent major events in the natural history of the disease. The objective of this study is to examine whether there is a difference in the risk of psychiatric comorbidity following surgery in CD and UC. METHODS: We used a multi-institution cohort of IBD patients without a diagnosis code for anxiety or depression preceding their IBD-related surgery or hospitalization. Demographic-, disease-, and treatment-related variables were retrieved. Multivariate logistic regression analysis was performed to individually identify risk factors for depression and anxiety. RESULTS: Our study included a total of 707 CD and 530 UC patients who underwent bowel resection surgery and did not have depression before surgery. The risk of depression 5 years after surgery was 16% and 11% in CD and UC patients, respectively. We found no difference in the risk of depression following surgery in the CD and UC patients (adjusted odds ratio, 1.11; 95% confidence interval, 0.84-1.47). Female gender, comorbidity, immunosuppressant use, perianal disease, stoma surgery, and early surgery within 3 years of care predicted depression after CD surgery; only the female gender and comorbidity predicted depression in UC patients. Only 12% of the CD cohort had ≥4 risk factors for depression, but among them nearly 44% subsequently received a diagnosis code for depression. CONCLUSIONS: IBD-related surgery or hospitalization is associated with a significant risk for depression and anxiety, with a similar magnitude of risk in both diseases.

Subject(s)

Anxiety Disorders/etiology , Colitis, Ulcerative/surgery , Crohn Disease/surgery , Depressive Disorder/etiology , Hospitalization , Postoperative Complications , Adult , Aged , Anxiety Disorders/epidemiology , Cohort Studies , Colitis, Ulcerative/complications , Colitis, Ulcerative/psychology , Crohn Disease/complications , Crohn Disease/psychology , Depressive Disorder/epidemiology , Female , Humans , Logistic Models , Male , Middle Aged , Risk Factors , Sex Factors

19.

Improved de-identification of physician notes through integrative modeling of both public and private medical text.

McMurry, Andrew J; Fitch, Britt; Savova, Guergana; Kohane, Isaac S; Reis, Ben Y.

BMC Med Inform Decis Mak ; 13: 112, 2013 Oct 02.

Article in English | MEDLINE | ID: mdl-24083569

ABSTRACT

BACKGROUND: Physician notes routinely recorded during patient care represent a vast and underutilized resource for human disease studies on a population scale. Their use in research is primarily limited by the need to separate confidential patient information from clinical annotations, a process that is resource-intensive when performed manually. This study seeks to create an automated method for de-identifying physician notes that does not require large amounts of private information: in addition to training a model to recognize Protected Health Information (PHI) within private physician notes, we reverse the problem and train a model to recognize non-PHI words and phrases that appear in public medical texts. METHODS: Public and private medical text sources were analyzed to distinguish common medical words and phrases from Protected Health Information. Patient identifiers are generally nouns and numbers that appear infrequently in medical literature. To quantify this relationship, term frequencies and part of speech tags were compared between journal publications and physician notes. Standard medical concepts and phrases were then examined across ten medical dictionaries. Lists and rules were included from the US census database and previously published studies. In total, 28 features were used to train decision tree classifiers. RESULTS: The model successfully recalled 98% of PHI tokens from 220 discharge summaries. Cost sensitive classification was used to weight recall over precision (98% F10 score, 76% F1 score). More than half of the false negatives were the word "of" appearing in a hospital name. All patient names, phone numbers, and home addresses were at least partially redacted. Medical concepts such as "elevated white blood cell count" were informative for de-identification. The results exceed the previously approved criteria established by four Institutional Review Boards. CONCLUSIONS: The results indicate that distributional differences between private and public medical text can be used to accurately classify PHI. The data and algorithms reported here are made freely available for evaluation and improvement.
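The gap between the F10 and F1 scores quoted in this abstract follows from how the F-beta measure weights recall. The sketch below uses the standard F-beta definition, and backs out an approximate precision from the rounded recall (98%) and F1 (76%). That precision is an inference from rounded figures, not a value reported in the abstract, and the function names are illustrative.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score; beta > 1 weights recall more than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def precision_from_f1(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    return f1 * recall / (2 * recall - f1)


# Rounded figures from the abstract.
recall = 0.98
f1 = 0.76

# Assumption: precision inferred from the rounded values above.
precision = precision_from_f1(f1, recall)
f10 = f_beta(precision, recall, beta=10)

print(f"implied precision ~ {precision:.2f}")
print(f"F10 from rounded inputs ~ {f10:.2f}")
```

With beta = 10, recall carries 100 times the weight of precision in the denominator, so a precision of roughly 0.62 barely lowers the score. The value computed from rounded inputs is about 0.97, close to the reported 98%.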

Subject(s)

Computer Simulation/standards , Electronic Health Records/standards , Natural Language Processing , Physicians , Algorithms , Confidentiality/standards , Humans , Reproducibility of Results

20.

End-to-end clinical temporal information extraction with multi-head attention.

Miller, Timothy; Bethard, Steven; Dligach, Dmitriy; Savova, Guergana.

Proc Conf Assoc Comput Linguist Meet ; 2023: 313-319, 2023 Jul.

Article in English | MEDLINE | ID: mdl-37780680

ABSTRACT

Understanding temporal relationships in text from electronic health records can be valuable for many important downstream clinical applications. Since Clinical TempEval 2017, there has been little work on end-to-end systems for temporal relation extraction, with most work focused on the setting where gold standard events and time expressions are given. In this work, we make use of a novel multi-headed attention mechanism on top of a pre-trained transformer encoder to allow the learning process to attend to multiple aspects of the contextualized embeddings. Our system achieves state of the art results on the THYME corpus by a wide margin, in both the in-domain and cross-domain settings.
