Results 1 - 7 of 7
1.
NPJ Digit Med ; 6(1): 26, 2023 Feb 21.
Article in English | MEDLINE | ID: mdl-36810915

ABSTRACT

In supervised learning model development, domain experts are often used to provide the class labels (annotations). Annotation inconsistencies commonly occur when even highly experienced clinical experts annotate the same phenomenon (e.g., medical image, diagnostics, or prognostic status), due to inherent expert bias, judgments, and slips, among other factors. While their existence is relatively well-known, the implications of such inconsistencies are largely understudied in real-world settings, when supervised learning is applied on such 'noisy' labelled data. To shed light on these issues, we conducted extensive experiments and analyses on three real-world Intensive Care Unit (ICU) datasets. Specifically, individual models were built from a common dataset, annotated independently by 11 Glasgow Queen Elizabeth University Hospital ICU consultants, and model performance estimates were compared through internal validation (Fleiss' κ = 0.383 i.e., fair agreement). Further, broad external validation (on both static and time series datasets) of these 11 classifiers was carried out on a HiRID external dataset, where the models' classifications were found to have low pairwise agreements (average Cohen's κ = 0.255 i.e., minimal agreement). Moreover, they tend to disagree more on making discharge decisions (Fleiss' κ = 0.174) than predicting mortality (Fleiss' κ = 0.267). Given these inconsistencies, further analyses were conducted to evaluate the current best practices in obtaining gold-standard models and determining consensus. The results suggest that: (a) there may not always be a "super expert" in acute clinical settings (using internal and external validation model performances as a proxy); and (b) standard consensus seeking (such as majority vote) consistently leads to suboptimal models. Further analysis, however, suggests that assessing annotation learnability and using only 'learnable' annotated datasets for determining consensus achieves optimal models in most cases.
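
The agreement statistics quoted in this abstract (Cohen's κ for pairs of annotators, Fleiss' κ for a whole panel) and the majority-vote consensus it evaluates can be illustrated with a minimal Python sketch. The function names and the toy labels in the usage note are hypothetical and are not taken from the paper; this shows the standard textbook formulas, not the authors' pipeline.

```python
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's kappa for two annotators' label sequences of equal length."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each annotator's own label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    # Undefined (division by zero) if both annotators use a single identical label.
    return (observed - expected) / (1 - expected)

def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i] holds the labels every annotator gave item i."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(row) for row in ratings]
    categories = set().union(*counts)
    # Mean per-item agreement across all pairs of raters.
    p_bar = sum(
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ) / n_items
    # Chance agreement from the pooled category proportions.
    p_e = sum(
        (sum(cnt[cat] for cnt in counts) / (n_items * n_raters)) ** 2
        for cat in categories
    )
    return (p_bar - p_e) / (1 - p_e)

def majority_vote(ratings):
    """Consensus label per item; ties go to the label seen first in the row."""
    return [Counter(row).most_common(1)[0][0] for row in ratings]
```

For example, `majority_vote([[1, 1, 0], [0, 0, 1]])` gives one consensus label per item, while `fleiss_kappa` on the same toy input is negative, signalling agreement below chance.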

2.
Br J Gen Pract ; 67(665): e816-e823, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29109114

ABSTRACT

BACKGROUND: Endometriosis is a condition with relatively non-specific symptoms, and in some cases a long time elapses from first-symptom presentation to diagnosis. AIM: To develop and test new composite pointers to a diagnosis of endometriosis in primary care electronic records. DESIGN AND SETTING: This is a nested case-control study using the Practice Team Information database of anonymised primary care electronic health records from Scotland. Data were analysed from 366 cases of endometriosis between 1994 and 2010, and two sets of age and GP practice matched controls: (a) 1453 randomly selected females and (b) 610 females whose records contained codes indicating consultation for gynaecological symptoms. METHOD: Composite pointers comprised patterns of symptoms, prescribing, or investigations, in combination or over time. Conditional logistic regression was used to examine the presence of both new and established pointers during the 3 years before diagnosis of endometriosis and to identify time of appearance. RESULTS: A number of composite pointers that were strongly predictive of endometriosis were observed. These included pain and menstrual symptoms occurring within the same year (odds ratio [OR] 6.5, 95% confidence interval [CI] = 3.9 to 10.6), and lower gastrointestinal symptoms occurring within 90 days of gynaecological pain (OR 6.1, 95% CI = 3.6 to 10.6). Although the association of infertility with endometriosis was only detectable in the year before diagnosis, several pain-related features were associated with endometriosis several years earlier. CONCLUSION: Useful composite pointers to a diagnosis of endometriosis in GP records were identified. Some of these were present several years before the diagnosis and may be valuable targets for diagnostic support systems.
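
To make the reported "OR, 95% CI" figures concrete, the sketch below computes an odds ratio with a Wald confidence interval from a plain 2x2 table. This is a deliberate simplification: the study used conditional logistic regression on matched sets, which this unmatched calculation does not reproduce. The function name and all counts are hypothetical.

```python
import math

def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unmatched odds ratio with a Wald interval (z=1.96 gives about 95%)."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    # Standard error of log(OR) from the four cell counts (all must be > 0).
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only (not data from the study).
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

An interval that excludes 1, as in both pointers quoted above, indicates an association unlikely to be due to chance at the 5% level.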


Subject(s)
Dysmenorrhea/diagnosis , Electronic Health Records , Endometriosis/diagnosis , Gastroenteritis/diagnosis , Pelvic Pain/diagnosis , Primary Health Care , Adolescent , Adult , Age Distribution , Analgesics/therapeutic use , Anti-Inflammatory Agents, Non-Steroidal/therapeutic use , Case-Control Studies , Dysmenorrhea/etiology , Endometriosis/physiopathology , Female , Gastroenteritis/etiology , Humans , Odds Ratio , Pelvic Pain/etiology , Practice Guidelines as Topic , Referral and Consultation , Risk Assessment , Scotland/epidemiology , Young Adult
3.
Artif Intell Med ; 58(1): 1-13, 2013 May.
Article in English | MEDLINE | ID: mdl-23522940

ABSTRACT

OBJECTIVE: While EIRA has proved to be successful in the detection of anomalous patient responses to treatments in the Intensive Care Unit, it could not describe to clinicians the rationales behind the anomalous detections. The aim of this paper is to address this problem. METHODS: Few attempts have been made in the past to build knowledge-based medical systems that possess both argumentation and explanation capabilities. Here we propose an approach based on Dung's seminal calculus of opposition. RESULTS: We have developed a new tool, arguEIRA, which is an extension of the existing EIRA system. In this paper we extend EIRA by providing it with an argumentation-based justification system that formalizes and communicates to the clinicians the reasons why a patient response is anomalous. CONCLUSION: Our comparative evaluation of the EIRA system against the newly developed tool highlights the multiple benefits that the use of argumentation-logic can bring to the field of medical decision support and explanation.
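
The abstract names Dung's framework, in which arguments attack one another and a semantics selects the acceptable ones. As a rough illustration, the Python sketch below computes the grounded extension of a small abstract argumentation framework. Which semantics arguEIRA actually uses is not stated here, so the choice of grounded semantics, the function name, and the argument labels are all assumptions.

```python
def grounded_extension(arguments, attacks):
    """Grounded extension of an abstract argumentation framework.

    `attacks` is a set of (attacker, target) pairs over `arguments`.
    """
    attackers = {a: {x for x, y in attacks if y == a} for a in arguments}
    accepted = set()
    while True:
        # An argument is defended when each of its attackers is itself
        # attacked by some already-accepted argument.
        defended = {
            a for a in arguments
            if all(attackers[b] & accepted for b in attackers[a])
        }
        if defended == accepted:
            return accepted
        accepted = defended

# Hypothetical example: "a" attacks "b", and "b" attacks "c".
result = grounded_extension({"a", "b", "c"}, {("a", "b"), ("b", "c")})
```

Here `a` is unattacked and therefore accepted, which defeats `b` and in turn reinstates `c`; two arguments that only attack each other would both be left out.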


Subject(s)
Decision Support Systems, Clinical/organization & administration , Intensive Care Units/organization & administration , Knowledge Bases , Treatment Outcome , Algorithms
4.
IEEE J Biomed Health Inform ; 17(4): 843-52, 2013 Jul.
Article in English | MEDLINE | ID: mdl-25055313

ABSTRACT

We present a Bayesian analysis of ordinal annotations made by clinicians of patients in intensive care. In particular, we investigate the different ways in which clinicians can disagree and how their disagreement is reduced once they take part in a recently proposed procedure (INSIGHT) that aims at improving consistency. The model combines a nonparametric function (loosely interpretable as the health of the patient) with clinician-specific generative procedures for producing the observed ordinal values. Our analysis provides valuable details of the rating behavior of the individual clinicians and shows that the INSIGHT procedure is particularly effective at removing (some) clinician-specific inconsistencies and biases.


Subject(s)
Intensive Care Units/statistics & numerical data , Medical Records/standards , Physicians/statistics & numerical data , Artificial Intelligence , Computer Simulation , Humans , Models, Statistical
5.
Artif Intell Med ; 55(2): 71-86, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22483422

ABSTRACT

OBJECTIVES: The work reported here focuses on developing novel techniques which enable an expert to detect inconsistencies in 2 (or more) perspectives that the expert might have on the same (classification) task. The high level task which the experts (physicians) had set themselves was to classify, on a 5-point severity scale (A-E), the hourly reports produced by an intensive care unit's patient management system. METHOD: The INSIGHT system has been developed to support domain experts exploring, and removing inconsistencies in their conceptualization of a task. We report here a study of intensive care physicians reconciling 2 perspectives on their patients. The 2 perspectives provided to INSIGHT were an annotated set of patient records where the expert had selected the appropriate category to describe that snapshot of the patient, and a set of rules which are able to classify the various time points on the same 5-point scale. Inconsistencies between these 2 perspectives are displayed as a confusion matrix; moreover INSIGHT then allows the expert to revise both the annotated datasets (correcting data errors, or changing the assigned categories) and the actual rule-set. RESULTS: Each of the 3 experts achieved a very high degree of consensus (~97%) between his refined knowledge sources (i.e., annotated hourly patient records and the rule-set). We then had the experts produce a common rule-set and then refine their several sets of annotations against it; this again resulted in inter-expert agreements of ~97%. The resulting rule-set can then be used in applications with considerable confidence. CONCLUSION: This study has shown that under some circumstances, it is possible for domain experts to achieve a high degree of correlation between 2 perspectives of the same task. The experts agreed that the immediate feedback provided by INSIGHT was a significant contribution to this successful outcome.
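
The comparison described in the METHOD section, an expert's annotations set against a rule-set's classifications on the same A-E scale, can be sketched as a confusion matrix plus an agreement rate. This is an illustration only, not the INSIGHT implementation; the function names and the sample labels are hypothetical.

```python
from collections import Counter

CATEGORIES = "ABCDE"  # the 5-point severity scale named in the abstract

def confusion_matrix(annotated, rule_based):
    """Rows: expert annotation; columns: rule-set classification."""
    pairs = Counter(zip(annotated, rule_based))
    return [[pairs[(row, col)] for col in CATEGORIES] for row in CATEGORIES]

def agreement_rate(matrix):
    """Fraction of time points where both perspectives give the same category."""
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    return diagonal / total

def disagreements(annotated, rule_based):
    """Indices of the time points an expert would need to review."""
    return [i for i, (a, r) in enumerate(zip(annotated, rule_based)) if a != r]

# Hypothetical hourly labels from the two perspectives.
expert = list("AABCE")
rules = list("AABDE")
matrix = confusion_matrix(expert, rules)
```

Off-diagonal cells point to records where either the annotation or the rule-set may need revising, which is the kind of feedback loop the abstract describes.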


Subject(s)
Artificial Intelligence , Database Management Systems/instrumentation , Electronic Health Records/instrumentation , Expert Testimony , Classification/methods , Diagnosis, Computer-Assisted/methods , Information Storage and Retrieval/methods , Intensive Care Units
6.
Health Informatics J ; 16(4): 260-73, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21216806

ABSTRACT

Treatment and survival for patients with lung cancer vary between and within countries. We have undertaken a multifaceted study of a clinical dataset of 635 patients, to see if clinician treatment decisions were being made consistently and in accordance with the appropriate Scottish Intercollegiate Guidelines Network (SIGN) document. Subsequently, we created a dataset of 117 patients who should have undergone surgery according to the SIGN guideline. As analyses of this dataset did not provide clear distinctions between the main treatment groups, a clinician reviewed the case notes and dataset, checking for inconsistencies. The revised dataset was processed by a decision tree algorithm which suggests clinically plausible decisions. Further, statistical analyses compared the 54 patients offered surgery with the 52 who were not. These analyses suggest that there are significant differences: the most discriminating feature is significant co-morbidity (p < 0.001). The article concludes with suggestions for how future guidelines might be enhanced.


Subject(s)
Decision Making , Guideline Adherence , Lung Neoplasms/therapy , Practice Guidelines as Topic , Algorithms , Decision Trees , Humans , Randomized Controlled Trials as Topic , Scotland
7.
J Neurosurg ; 97(2): 326-36, 2002 Aug.
Article in English | MEDLINE | ID: mdl-12186460

ABSTRACT

OBJECT: Decision tree analysis highlights patient subgroups and critical values in variables assessed. Importantly, the results are visually informative and often present clear clinical interpretation about risk factors faced by patients in these subgroups. The aim of this prospective study was to compare results of logistic regression with those of decision tree analysis of an observational, head-injury data set, including a wide range of secondary insults and 12-month outcomes. METHODS: One hundred twenty-four adult head-injured patients were studied during their stay in an intensive care unit by using a computerized data collection system. Verified values falling outside threshold limits were analyzed according to insult grade and duration with the aid of logistic regression. A decision tree was automatically produced from root node to target classes (Glasgow Outcome Scale [GOS] score). Among 69 patients, in whom eight insult categories could be assessed, outcome at 12 months was analyzed using logistic regression to determine the relative influence of patient age, admission Glasgow Coma Scale score, Injury Severity Score (ISS), pupillary response on admission, and insult duration. The most significant predictors of mortality in this patient set were duration of hypotensive, pyrexic, and hypoxemic insults. When good and poor outcomes were compared, hypotensive insults and pupillary response on admission were significant. Using decision tree analysis, the authors found that hypotension and low cerebral perfusion pressure (CPP) are the best predictors of death, with a 9.2% improvement in predictive accuracy (PA) over that obtained by simply predicting the largest outcome category as the outcome for each patient. Hypotension was a significant predictor of poor outcome (GOS Score 1-3). Low CPP, patient age, hypocarbia, and pupillary response were also good predictors of outcome (good/poor), with a 5.1% improvement in PA. In certain subgroups of patients pyrexia was a predictor of good outcome. CONCLUSIONS: Decision tree analysis confirmed some of the results of logistic regression and challenged others. This investigation shows that there is knowledge to be gained from analyzing observational data with the aid of decision tree analysis.
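
The "improvement in predictive accuracy" reported above is measured against a baseline that predicts the largest outcome category for every patient. A minimal Python sketch of that comparison follows. It assumes the improvement is an absolute difference in accuracy, which the abstract does not specify; the function names and outcome labels are hypothetical.

```python
from collections import Counter

def baseline_accuracy(outcomes):
    """Accuracy of always predicting the most frequent outcome category."""
    most_common_count = Counter(outcomes).most_common(1)[0][1]
    return most_common_count / len(outcomes)

def accuracy(predicted, outcomes):
    """Fraction of patients whose predicted outcome matches the observed one."""
    return sum(p == o for p, o in zip(predicted, outcomes)) / len(outcomes)

def improvement_over_baseline(predicted, outcomes):
    """Absolute gain in accuracy over the majority-category baseline (assumed)."""
    return accuracy(predicted, outcomes) - baseline_accuracy(outcomes)

# Hypothetical observed and predicted outcomes, for illustration only.
observed = ["alive"] * 7 + ["dead"] * 3
predicted = ["alive"] * 7 + ["dead"] * 2 + ["alive"]
gain = improvement_over_baseline(predicted, observed)
```

Reporting the gain over this baseline matters when outcome classes are imbalanced, because a model can score a high raw accuracy simply by echoing the majority class.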


Subject(s)
Brain Injuries/mortality , Brain Injuries/physiopathology , Decision Trees , Logistic Models , Outcome Assessment, Health Care , Patient Admission/statistics & numerical data , Recovery of Function/physiology , Adult , Brain Injuries/therapy , Female , Glasgow Coma Scale , Humans , Injury Severity Score , Male , Predictive Value of Tests , Prospective Studies , Survival Rate , Time Factors