Results 1 - 20 of 89
1.
J Biomed Inform ; 137: 104265, 2023 01.
Article in English | MEDLINE | ID: mdl-36464227

ABSTRACT

The detection of adverse drug reactions (ADRs) is critical to our understanding of the safety and risk-benefit profile of medications. With an incidence that has not changed over the last 30 years, ADRs are a significant source of patient morbidity, responsible for 5%-10% of acute care hospital admissions worldwide. Spontaneous reporting of ADRs has long been the standard method of reporting; however, this approach is known to have high rates of under-reporting, a problem that limits pharmacovigilance efforts. Automated ADR reporting presents an alternative pathway to increase reporting rates, although this may be limited by over-reporting of other drug-related adverse events. We developed a deep learning natural language processing algorithm to identify ADRs in discharge summaries at a single academic hospital centre. Our model was developed in two stages: first, a pre-trained model (DeBERTa) was further pre-trained on 1.1 million unlabelled clinical documents; second, this model was fine-tuned to detect ADR mentions in a corpus of 861 annotated discharge summaries. This model was compared to a version without the pre-training step, and to a previously published RoBERTa model pre-trained on MIMIC III, which has demonstrated strong performance on other pharmacovigilance tasks. To ensure that our algorithm could differentiate ADRs from other drug-related adverse events, the annotated corpus was enriched for both validated ADR reports and confounding drug-related adverse events using. The final model demonstrated good performance with a ROC-AUC of 0.955 (95% CI 0.933 - 0.978) for the task of identifying discharge summaries containing ADR mentions, significantly outperforming the two comparator models.
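
The ROC-AUC reported above can be read as the probability that a randomly chosen positive document receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that rank-based definition, not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC-AUC as P(score of a random positive > score of a random negative).

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = document contains an ADR mention, 0 = it does not.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.7, 0.6, 0.1, 0.4]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy ROC-AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of documents, which is acceptable for illustration; a sort-based computation would be preferable for large corpora.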


Subject(s)
Deep Learning , Drug-Related Side Effects and Adverse Reactions , Humans , Natural Language Processing , Adverse Drug Reaction Reporting Systems , Algorithms , Drug-Related Side Effects and Adverse Reactions/epidemiology , Pharmacovigilance
2.
J Sch Nurs ; 38(1): 74-83, 2022 Feb.
Article in English | MEDLINE | ID: mdl-33944636

ABSTRACT

School nurses are the most accessible health care providers for many young people including adolescents and young adults. Early identification of depression results in improved outcomes, but little information is available comprehensively describing depressive symptoms specific to this population. The aim of this study was to develop a taxonomy of depressive symptoms that were manifested and described by young people based on a scoping review and content analysis. Twenty-five journal articles that included narrative descriptions of depressive symptoms in young people were included. A total of 60 depressive symptoms were identified and categorized into five dimensions: behavioral (n = 8), cognitive (n = 14), emotional (n = 15), interpersonal (n = 13), and somatic (n = 10). This comprehensive depression symptom taxonomy can help school nurses to identify young people who may experience depression and will support future research to better screen for depression.


Subject(s)
Depression , Adolescent , Humans , Young Adult
3.
Curr Opin Pulm Med ; 27(6): 544-553, 2021 11 01.
Article in English | MEDLINE | ID: mdl-34431789

ABSTRACT

PURPOSE OF REVIEW: At many institutions, the Covid-19 pandemic made it necessary to rapidly change the way services are provided to patients, including those with cystic fibrosis (CF). The purpose of this review is to explore the past, present and future of telehealth and virtual monitoring in CF and to highlight certain challenges/considerations in developing such services. RECENT FINDINGS: The Covid-19 pandemic has proven that telehealth and virtual monitoring are a feasible means for safely providing services to CF patients when traditional care is not possible. However, both telehealth and virtual monitoring can also provide further support in the future in a post-Covid era through a hybrid model incorporating traditional care, remote data collection and sophisticated platforms to manage and share data with CF teams. SUMMARY: We provide a detailed overview of telehealth and virtual monitoring, including examples of how paediatric and adult CF services adapted to the need for rapid change. Such services have proven popular with people with CF, meaning that co-design with stakeholders will likely improve systems further. In the future, telehealth and virtual monitoring will become more sophisticated by harnessing increasingly powerful technologies such as artificial intelligence, connected monitoring devices and wearables. In this review, we harmonise definitions and terminologies before highlighting considerations and limitations for the future of telehealth and virtual monitoring in CF.


Subject(s)
COVID-19 , Cystic Fibrosis , Telemedicine , Adult , Artificial Intelligence , Child , Cystic Fibrosis/therapy , Humans , Pandemics , SARS-CoV-2
4.
Ann Surg ; 272(4): 629-636, 2020 10.
Article in English | MEDLINE | ID: mdl-32773639

ABSTRACT

OBJECTIVES: We present the development and validation of a portable NLP approach for automated surveillance of SSIs. SUMMARY OF BACKGROUND DATA: The surveillance of SSIs is labor-intensive, limiting the generalizability and scalability of surgical quality surveillance programs. METHODS: We abstracted patient clinical text notes after surgical procedures from 2 independent healthcare systems using different electronic healthcare records. An SSI detected as part of the American College of Surgeons' National Surgical Quality Improvement Program was used as the reference standard. We developed a rules-based NLP system (Easy Clinical Information Extractor [CIE]-SSI) for operative event-level detection of SSIs using a training cohort (4574 operative events) from 1 healthcare system and then conducted internal validation on a blind cohort from the same healthcare system (1850 operative events) and external validation on a blind cohort from the second healthcare system (15,360 operative events). EasyCIE-SSI performance was measured using sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). RESULTS: The prevalence of SSI was 4% and 5% in the internal and external validation corpora, respectively. In internal validation, EasyCIE-SSI had a sensitivity, specificity, and AUC of 94%, 88%, and 0.912, respectively, for the detection of SSI. In external validation, EasyCIE-SSI had a sensitivity, specificity, and AUC of 79%, 92%, and 0.852, respectively, for the detection of SSI. The sensitivity of EasyCIE-SSI decreased in clean, skin/subcutaneous, and outpatient procedures in the external validation compared to internal validation. CONCLUSION: Automated surveillance of SSIs can be achieved using NLP of clinical notes with high sensitivity and specificity.


Subject(s)
Mobile Applications , Natural Language Processing , Surgical Wound Infection/diagnosis , Adult , Aged , Cohort Studies , Female , Humans , Male , Middle Aged , Population Surveillance/methods , Quality Improvement , Surgical Procedures, Operative/standards
5.
Am J Epidemiol ; 179(6): 749-58, 2014 Mar 15.
Article in English | MEDLINE | ID: mdl-24488511

ABSTRACT

The increasing availability of electronic health records (EHRs) creates opportunities for automated extraction of information from clinical text. We hypothesized that natural language processing (NLP) could substantially reduce the burden of manual abstraction in studies examining outcomes, like cancer recurrence, that are documented in unstructured clinical text, such as progress notes, radiology reports, and pathology reports. We developed an NLP-based system using open-source software to process electronic clinical notes from 1995 to 2012 for women with early-stage incident breast cancers to identify whether and when recurrences were diagnosed. We developed and evaluated the system using clinical notes from 1,472 patients receiving EHR-documented care in an integrated health care system in the Pacific Northwest. A separate study provided the patient-level reference standard for recurrence status and date. The NLP-based system correctly identified 92% of recurrences and estimated diagnosis dates within 30 days for 88% of these. Specificity was 96%. The NLP-based system overlooked 5 of 65 recurrences, 4 because electronic documents were unavailable. The NLP-based system identified 5 other recurrences incorrectly classified as nonrecurrent in the reference standard. If used in similar cohorts, NLP could reduce by 90% the number of EHR charts abstracted to identify confirmed breast cancer recurrence cases at a rate comparable to traditional abstraction.


Subject(s)
Breast Neoplasms/diagnosis , Electronic Health Records/statistics & numerical data , Natural Language Processing , Neoplasm Recurrence, Local/diagnosis , Age Factors , Aged , Breast Neoplasms/physiopathology , Breast Neoplasms/therapy , Female , Humans , Middle Aged , Neoplasm Grading , Neoplasm Recurrence, Local/physiopathology , Neoplasm Recurrence, Local/therapy , Reference Standards , Reproducibility of Results
6.
J Biomed Inform ; 50: 162-72, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24859155

ABSTRACT

The Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor method requires removal of 18 types of protected health information (PHI) from clinical documents for them to be considered "de-identified" prior to use for research purposes. Human review of PHI elements from a large corpus of clinical documents can be tedious and error-prone. Indeed, multiple annotators may be required to consistently redact information that represents each PHI class. Automated de-identification has the potential to improve annotation quality and reduce annotation time. For instance, machine-assisted annotation combines de-identification system outputs, used as pre-annotations, with an interactive annotation interface to provide annotators with PHI annotations for "curation" rather than manual annotation from "scratch" on raw clinical documents. In order to assess whether machine-assisted annotation improves the reliability and accuracy of the reference standard and reduces annotation effort, we conducted an annotation experiment. In this annotation study, we assessed the generalizability of the VA Consortium for Healthcare Informatics Research (CHIR) annotation schema and guidelines applied to a corpus of publicly available clinical documents called MTSamples. Specifically, our goals were to (1) characterize a heterogeneous corpus of clinical documents manually annotated for risk-ranked PHI and other annotation types (clinical eponyms and person relations), (2) evaluate how well annotators apply the CHIR schema to the heterogeneous corpus, (3) compare whether machine-assisted annotation (experiment) improves annotation quality and reduces annotation time compared to manual annotation (control), and (4) assess the change in quality of reference standard coverage with each added annotator's annotations.


Subject(s)
Electronic Health Records , User-Computer Interface , Health Insurance Portability and Accountability Act , United States
7.
Stud Health Technol Inform ; 310: 289-293, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269811

ABSTRACT

We analyzed PubMed citations since 1988 to explore the dissemination of medical/health informatics concepts between countries and across medical domains. We extracted countries from the PubMed author affiliation field to identify and analyze the top 10 informatics publishing countries. We found that the informatics publications are becoming more similar over time and that the rate of exchange across countries has increased with the introduction of e-publishing. Nonetheless, with the exception of machine learning, the impact of core informatics concepts on mainstream medicine and radiology publications remains small.


Subject(s)
Medical Informatics , Radiology , Machine Learning , Mainstreaming, Education , PubMed
8.
Stud Health Technol Inform ; 310: 579-583, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269875

ABSTRACT

The reliable identification of skin and soft tissue infections (SSTIs) from electronic health records is important for a number of applications, including quality improvement, clinical guideline construction, and epidemiological analysis. However, in the United States, types of SSTIs (e.g. is the infection purulent or non-purulent?) are not captured reliably in structured clinical data. With this work, we trained and evaluated a rule-based clinical natural language processing system using 6,576 manually annotated clinical notes derived from the United States Veterans Health Administration (VA) with the goal of automatically extracting and classifying SSTI subtypes from clinical notes. The trained system achieved performance metrics in the range 0.39 to 0.80 for mention-level classification and 0.49 to 0.98 for document-level classification.


Subject(s)
Soft Tissue Infections , United States , Humans , Soft Tissue Infections/diagnosis , Skin , Benchmarking , Electronic Health Records , Natural Language Processing
9.
Stud Health Technol Inform ; 310: 1241-1245, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38270013

ABSTRACT

The Learning Health Systems (LHS) framework demonstrates the potential for iterative interrogation of health data in real time and implementation of insights into practice. Yet, the lack of an appropriately skilled workforce results in an inability to leverage existing data to design innovative solutions. We developed a tailored professional development program to foster a skilled workforce. The short course is wholly online, for interdisciplinary professionals working in the digital health arena. To transform healthcare systems, the workforce needs an understanding of LHS principles, data-driven approaches, and the need for diversely skilled learning communities that can tackle these complex problems together.


Subject(s)
Learning Health System , Digital Health , Interdisciplinary Studies , Learning , Workforce
10.
J Biomed Inform ; 46(4): 734-43, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23602781

ABSTRACT

A major goal of Natural Language Processing in the public health informatics domain is the automatic extraction and encoding of data stored in free text patient records. This extracted data can then be utilized by computerized systems to perform syndromic surveillance. In particular, the chief complaint--a short string that describes a patient's symptoms--has come to be a vital resource for syndromic surveillance in the North American context due to its near ubiquity. This paper reviews fifteen systems in North America--at the city, county, state and federal level--that use chief complaints for syndromic surveillance.


Subject(s)
Population Surveillance , Humans , North America , Syndrome
11.
Pharmacoepidemiol Drug Saf ; 22(8): 834-41, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23554109

ABSTRACT

PURPOSE: This study aimed to develop Natural Language Processing (NLP) approaches to supplement manual outcome validation, specifically to validate pneumonia cases from chest radiograph reports. METHODS: We trained one NLP system, ONYX, using radiograph reports from children and adults that were previously manually reviewed. We then assessed its validity on a test set of 5000 reports. We aimed to substantially decrease manual review, not replace it entirely, and so, we classified reports as follows: (1) consistent with pneumonia; (2) inconsistent with pneumonia; or (3) requiring manual review because of complex features. We developed processes tailored either to optimize accuracy or to minimize manual review. Using logistic regression, we jointly modeled sensitivity and specificity of ONYX in relation to patient age, comorbidity, and care setting. We estimated positive and negative predictive value (PPV and NPV) assuming pneumonia prevalence in the source data. RESULTS: Tailored for accuracy, ONYX identified 25% of reports as requiring manual review (34% of true pneumonias and 18% of non-pneumonias). For the remainder, ONYX's sensitivity was 92% (95% CI 90-93%), specificity 87% (86-88%), PPV 74% (72-76%), and NPV 96% (96-97%). Tailored to minimize manual review, ONYX classified 12% as needing manual review. For the remainder, ONYX had sensitivity 75% (72-77%), specificity 95% (94-96%), PPV 86% (83-88%), and NPV 91% (90-91%). CONCLUSIONS: For pneumonia validation, ONYX can replace almost 90% of manual review while maintaining low to moderate misclassification rates. It can be tailored for different outcomes and study needs and thus warrants exploration in other settings.
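
The abstract above estimates PPV and NPV from sensitivity, specificity, and an assumed pneumonia prevalence. A minimal sketch of that conversion via Bayes' rule follows. The prevalence of 0.29 is an assumption chosen only for illustration (the abstract does not report the value used), and the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity are the "tailored for accuracy" figures from the abstract;
# the prevalence is an ASSUMED value, not one reported in the abstract.
ASSUMED_PREVALENCE = 0.29
ppv, npv = predictive_values(0.92, 0.87, ASSUMED_PREVALENCE)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Because PPV falls and NPV rises as prevalence drops, the same sensitivity and specificity would yield different predictive values in a cohort with a different pneumonia prevalence.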


Subject(s)
Natural Language Processing , Pharmacoepidemiology , Pneumonia/diagnosis , Adolescent , Adult , Age Factors , Aged , Aged, 80 and over , Child , Child, Preschool , Humans , Infant , Logistic Models , Middle Aged , Pneumonia/diagnostic imaging , Pneumonia/epidemiology , Predictive Value of Tests , Prevalence , Radiography , Young Adult
12.
Front Digit Health ; 5: 1196442, 2023.
Article in English | MEDLINE | ID: mdl-37214343

ABSTRACT

Cystic Fibrosis (CF) is a chronic life-limiting condition that affects multiple organs within the body. Patients must adhere to strict medication regimens, physiotherapy, diet, and attend regular clinic appointments to manage their condition effectively. This necessary but burdensome requirement has prompted investigations into how different digital health technologies can enhance current care by providing the opportunity to virtually monitor patients. This review explores how virtual monitoring has been harnessed for assessment or performance of physiotherapy/exercise, diet/nutrition, symptom monitoring, medication adherence, and wellbeing/mental-health in people with CF. This review will also briefly discuss the potential future of CF virtual monitoring and some common barriers to its current adoption and implementation within CF. Due to the multifaceted nature of CF, it is anticipated that this review will be relevant to not only the CF community, but also those investigating and developing digital health solutions for the management of other chronic diseases.

13.
J Cyst Fibros ; 22(4): 598-606, 2023 07.
Article in English | MEDLINE | ID: mdl-37230808

ABSTRACT

The ongoing development and integration of telehealth within CF care has been accelerated in response to the Covid-19 pandemic, with many centres publishing their experiences. Now, as the restrictions of the pandemic ease, the use of telehealth appears to be waning, with many centres returning to routine traditional face-to-face services. For most, telehealth is not integrated into clinical care models, and there is a lack of guidance on how to integrate such a service into clinical care. The aims of this systematic review were first to identify manuscripts which may inform best CF telehealth practices, and second, to analyse these findings to determine how the CF community may use telehealth to improve care for patients, families, and Multidisciplinary Teams into the future. To achieve this, the PRISMA review methodology was utilised, in combination with a modified novel scoring system that consolidates expert weighting from key CF stakeholders, allowing for the manuscripts to be placed in a hierarchy in accordance with their scientific robustness. From the 39 found manuscripts, the top ten are presented and further analysed. The top ten manuscripts are exemplars of where telehealth is used effectively within CF care at this time, and demonstrate specific use cases of its potential best practices. However, there is a lack of guidance for implementation and clinical decision making, which remains an area for improvement. Thus, it is suggested that further work explores and provides guidance for standardised implementation into CF clinical practice.


Subject(s)
COVID-19 , Cystic Fibrosis , Telemedicine , Humans , Cystic Fibrosis/diagnosis , Cystic Fibrosis/epidemiology , Cystic Fibrosis/therapy , Pandemics , COVID-19/epidemiology
14.
J Biomed Inform ; 45(1): 71-81, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21925286

ABSTRACT

Information extraction applications that extract structured event and entity information from unstructured text can leverage knowledge of clinical report structure to improve performance. The Subjective, Objective, Assessment, Plan (SOAP) framework, used to structure progress notes to facilitate problem-specific, clinical decision making by physicians, is one example of a well-known, canonical structure in the medical domain. Although its applicability to structuring data is understood, its contribution to information extraction tasks has not yet been determined. The first step to evaluating the SOAP framework's usefulness for clinical information extraction is to apply the model to clinical narratives and develop an automated SOAP classifier that classifies sentences from clinical reports. In this quantitative study, we applied the SOAP framework to sentences from emergency department reports, and trained and evaluated SOAP classifiers built with various linguistic features. We found the SOAP framework can be applied manually to emergency department reports with high agreement (Cohen's kappa coefficients over 0.70). Using a variety of features, we found classifiers for each SOAP class can be created with moderate to outstanding performance with F1 scores of 93.9 (subjective), 94.5 (objective), 75.7 (assessment), and 77.0 (plan). We look forward to expanding the framework and applying the SOAP classification to clinical information extraction tasks.
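
The two statistics cited in the abstract above, Cohen's kappa for annotator agreement and the per-class F1 score for classifier performance, can be sketched as follows. This is an illustrative implementation on hypothetical sentence labels (S, O, A, P), not the study's data or code.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if both annotators labelled independently at their own rates.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def f1_for_class(gold, predicted, label):
    """F1 (harmonic mean of precision and recall) for one class."""
    tp = sum(g == label and p == label for g, p in zip(gold, predicted))
    fp = sum(g != label and p == label for g, p in zip(gold, predicted))
    fn = sum(g == label and p != label for g, p in zip(gold, predicted))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight sentences.
annotator_1 = ["S", "S", "O", "O", "A", "P", "P", "A"]
annotator_2 = ["S", "S", "O", "O", "A", "P", "A", "A"]
print(f"kappa = {cohens_kappa(annotator_1, annotator_2):.3f}")
print(f"F1(assessment) = {f1_for_class(annotator_1, annotator_2, 'A'):.3f}")
```

Note that the abstract reports F1 on a 0 to 100 scale, whereas this sketch returns values between 0 and 1.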


Subject(s)
Data Mining/methods , Emergency Service, Hospital , Automation , Databases, Factual , Decision Making , Diagnosis , Humans , Research Report
15.
J Biomed Inform ; 45(3): 507-21, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22343015

ABSTRACT

MOTIVATION: Expressions that refer to a real-world entity already mentioned in a narrative are often considered anaphoric. For example, in the sentence "The pain comes and goes," the expression "the pain" is probably referring to a previous mention of pain. Interpretation of meaning involves resolving the anaphoric reference: deciding which expression in the text is the correct antecedent of the referring expression, also called an anaphor. We annotated a set of 180 clinical reports (surgical pathology, radiology, discharge summaries, and emergency department) from two institutions to indicate all anaphor-antecedent pairs. OBJECTIVE: The objective of this study is to describe the characteristics of the corpus in terms of the frequency of anaphoric relations, the syntactic and semantic nature of the members of the pairs, and the types of anaphoric relations that occur. Understanding how anaphoric reference is exhibited in clinical reports is critical to developing reference resolution algorithms and to identifying peculiarities of clinical text that may alter the features and methodologies that will be successful for automated anaphora resolution. RESULTS: We found that anaphoric reference is prevalent in all types of clinical reports, that annotations of noun phrases, semantic type, and section headings may be especially important for automated resolution of anaphoric reference, and that separate modules for reference resolution may be required for different report types, different institutions, and different types of anaphors. Accurate resolution will probably require extensive domain knowledge, especially for pathology and radiology reports with more part/whole and set/subset relations. CONCLUSION: We hope researchers will leverage the annotations in this corpus to develop automated algorithms and will add to the annotations to generate a more extensive corpus.


Subject(s)
Electronic Health Records/standards , Semantics , Algorithms , Data Mining/methods , Humans
16.
J Biomed Inform ; 45(4): 651-7, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22210167

ABSTRACT

Mapping medical test names into a standardized vocabulary is a prerequisite to sharing test-related data between health care entities. One major barrier in this process is the inability to describe tests in sufficient detail to assign the appropriate name in Logical Observation Identifiers, Names, and Codes (LOINC®). Approaches to address mapping of test names with incomplete information have not been well described. We developed a process of "enhancing" local test names by incorporating information required for LOINC mapping into the test names themselves. When using the Regenstrief LOINC Mapping Assistant (RELMA) we found that 73/198 (37%) of "enhanced" test names were successfully mapped to LOINC, compared to 41/191 (21%) of original names (p=0.001). Our approach led to a significantly higher proportion of test names with successful mapping to LOINC, but further efforts are required to achieve more satisfactory results.
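
The comparison of mapping rates above (73/198 versus 41/191) can be checked with a simple test of two proportions. The abstract does not say which test the authors used; the pooled two-proportion z-test below is an assumption made for illustration, and the function name is hypothetical.

```python
import math


def two_proportion_z_test(successes_1, total_1, successes_2, total_2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1 = successes_1 / total_1
    p2 = successes_2 / total_2
    pooled = (successes_1 + successes_2) / (total_1 + total_2)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / total_1 + 1.0 / total_2))
    z = (p1 - p2) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Counts taken from the abstract: "enhanced" names vs. original names mapped to LOINC.
z_stat, p_val = two_proportion_z_test(73, 198, 41, 191)
print(f"enhanced: {73 / 198:.0%}, original: {41 / 191:.0%}")
print(f"z = {z_stat:.2f}, two-sided p = {p_val:.4f}")
```

Under this assumed test, the result is of the same order as the p-value reported in the abstract, although an exact or chi-square test could give a slightly different figure.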


Subject(s)
Diagnostic Techniques and Procedures , Electronic Health Records , Logical Observation Identifiers Names and Codes , Humans , User-Computer Interface
17.
Arthritis Rheumatol ; 74(12): 1893-1905, 2022 12.
Article in English | MEDLINE | ID: mdl-35857865

ABSTRACT

Deep learning has emerged as the leading method in machine learning, spawning a rapidly growing field of academic research and commercial applications across medicine. Deep learning could have particular relevance to rheumatology if correctly utilized. The greatest benefits of deep learning methods are seen with unstructured data frequently found in rheumatology, such as images and text, where traditional machine learning methods have struggled to unlock the trove of information held within these data formats. The basis for this success comes from the ability of deep learning to learn the structure of the underlying data. It is no surprise that the first areas of medicine that have started to experience impact from deep learning heavily rely on interpreting visual data, such as triaging radiology workflows and computer-assisted colonoscopy. Applications in rheumatology are beginning to emerge, with recent successes in areas as diverse as detecting joint erosions on plain radiography, predicting future rheumatoid arthritis disease activity, and identifying halo sign on temporal artery ultrasound. Given the important role deep learning methods are likely to play in the future of rheumatology, it is imperative that rheumatologists understand the methods and assumptions that underlie the deep learning algorithms in widespread use today, their limitations and the landscape of deep learning research that will inform algorithm development, and clinical decision support tools of the future. The best applications of deep learning in rheumatology must be informed by the clinical experience of rheumatologists, so that algorithms can be developed to tackle the most relevant clinical problems.


Subject(s)
Artificial Intelligence , Deep Learning , Humans , Rheumatologists , Machine Learning , Algorithms
18.
J Am Heart Assoc ; 11(7): e024198, 2022 04 05.
Article in English | MEDLINE | ID: mdl-35322668

ABSTRACT

BACKGROUND: Social risk factors influence rehospitalization rates yet are challenging to incorporate into prediction models. Integration of social risk factors using natural language processing (NLP) and machine learning could improve risk prediction of 30-day readmission following an acute myocardial infarction. METHODS AND RESULTS: Patients were enrolled into derivation and validation cohorts. The derivation cohort included inpatient discharges from Vanderbilt University Medical Center between January 1, 2007, and December 31, 2016, with a primary diagnosis of acute myocardial infarction, who were discharged alive, and not transferred from another facility. The validation cohort included patients from Dartmouth-Hitchcock Health Center between April 2, 2011, and December 31, 2016, meeting the same eligibility criteria described above. Data from both sites were linked to Centers for Medicare & Medicaid Services administrative data to supplement 30-day hospital readmissions. Clinical notes from each cohort were extracted, and an NLP model was deployed, counting mentions of 7 social risk factors. Five machine learning models were run using clinical and NLP-derived variables. Model discrimination and calibration were assessed, and receiver operating characteristic comparison analyses were performed. The 30-day rehospitalization rates among the derivation (n=6165) and validation (n=4024) cohorts were 15.1% (n=934) and 10.2% (n=412), respectively. The derivation models demonstrated no statistical improvement in model performance with the addition of the selected NLP-derived social risk factors. CONCLUSIONS: Social risk factors extracted using NLP did not significantly improve 30-day readmission prediction among hospitalized patients with acute myocardial infarction. Alternative methods are needed to capture social risk factors.


Subject(s)
Myocardial Infarction , Natural Language Processing , Aged , Electronic Health Records , Humans , Information Storage and Retrieval , Medicare , Myocardial Infarction/diagnosis , Myocardial Infarction/therapy , Patient Readmission , Retrospective Studies , United States/epidemiology
19.
J Biomed Inform ; 44(5): 728-37, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21459155

ABSTRACT

In this paper we describe an application called peFinder for document-level classification of CT pulmonary angiography reports. peFinder is based on a generalized version of the ConText algorithm, a simple text processing algorithm for identifying features in clinical report documents. peFinder was used to answer questions about the disease state (pulmonary emboli present or absent), the certainty state of the diagnosis (uncertainty present or absent), the temporal state of an identified pulmonary embolus (acute or chronic), and the technical quality state of the exam (diagnostic or not diagnostic). Gold standard answers for each question were determined from the consensus classifications of three human annotators. peFinder results were compared to naive Bayes' classifiers using unigrams and bigrams. The sensitivities (and positive predictive values) for peFinder were 0.98(0.83), 0.86(0.96), 0.94(0.93), and 0.60(0.90) for disease state, quality state, certainty state, and temporal state respectively, compared to 0.68(0.77), 0.67(0.87), 0.62(0.82), and 0.04(0.25) for the naive Bayes' classifier using unigrams, and 0.75(0.79), 0.52(0.69), 0.59(0.84), and 0.04(0.25) for the naive Bayes' classifier using bigrams.
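
The general idea behind ConText-style processing, as described in the abstract above, is that trigger phrases modify the status of a nearby clinical concept within a limited scope. The sketch below illustrates only that idea for the disease-state question. It is not peFinder or the ConText algorithm itself: the trigger list, terminator list, target pattern, and function names are all hypothetical simplifications.

```python
import re

# Hypothetical, deliberately tiny lexicons; real systems use far larger curated lists.
NEGATION_TRIGGERS = ("no evidence of", "negative for", "without", "no")
SCOPE_TERMINATORS = re.compile(r"\b(?:but|however|although)\b")
TARGET = re.compile(r"pulmonary (?:embolus|emboli|embolism)", re.IGNORECASE)


def classify_sentence(sentence):
    """Return 'absent', 'present', or None when the target concept is not mentioned."""
    match = TARGET.search(sentence)
    if match is None:
        return None
    preceding = sentence[: match.start()].lower()
    # A terminator closes the scope of earlier triggers, so keep only the text after it.
    scope = SCOPE_TERMINATORS.split(preceding)[-1]
    for trigger in NEGATION_TRIGGERS:
        if re.search(rf"\b{re.escape(trigger)}\b", scope):
            return "absent"
    return "present"


def classify_report(report):
    """Document-level rule: positive if any sentence asserts the target as present."""
    sentences = re.split(r"(?<=[.!?])\s+", report.strip())
    results = [classify_sentence(s) for s in sentences]
    return "positive" if "present" in results else "negative"


print(classify_sentence("No evidence of pulmonary embolism."))
print(classify_sentence("No pleural effusion, but there is a pulmonary embolus."))
print(classify_report("Lungs are clear. No evidence of pulmonary embolism."))
```

This sketch handles only triggers that precede the concept and ignores certainty, temporality, and exam quality, which the abstract lists as separate questions answered by peFinder.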


Subject(s)
Algorithms , Lung/diagnostic imaging , Angiography/classification , Bayes Theorem , Humans , Pulmonary Embolism/diagnostic imaging , Research Report , Semantics
20.
J Biomed Inform ; 44(6): 1113-22, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21856441

ABSTRACT

Coreference resolution is the task of determining linguistic expressions that refer to the same real-world entity in natural language. Research on coreference resolution in the general English domain dates back to the 1960s and 1970s. However, research on coreference resolution in clinical free text has not seen major development. The recent US government initiatives that promote the use of electronic health records (EHRs) provide opportunities to mine patient notes as more and more health care institutions adopt EHRs. Our goal was to review recent advances in general purpose coreference resolution to lay the foundation for methodologies in the clinical domain, facilitated by the availability of a shared lexical resource of gold standard coreference annotations, the Ontology Development and Information Extraction (ODIE) corpus.


Subject(s)
Medical Informatics/methods , Natural Language Processing , Electronic Health Records , Humans , Information Storage and Retrieval , Linguistics