Results 1 - 17 of 17
1.
Nat Commun ; 15(1): 1887, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38424096

ABSTRACT

While it is common to monitor deployed clinical artificial intelligence (AI) models for performance degradation, it is less common for the input data to be monitored for data drift (systemic changes to input distributions). However, when real-time evaluation may not be practical (e.g., because of labeling costs) or when gold labels are automatically generated, we argue that tracking data drift becomes a vital addition for AI deployments. In this work, we perform empirical experiments on real-world medical imaging to evaluate three data drift detection methods' ability to detect data drift caused (a) naturally (emergence of COVID-19 in X-rays) and (b) synthetically. We find that monitoring performance alone is not a good proxy for detecting data drift and that drift detection depends heavily on sample size and patient features. Our work discusses the need and utility of data drift detection in various scenarios and highlights gaps in knowledge for the practical application of existing methods.


Subject(s)
Artificial Intelligence , COVID-19 , Humans , Diagnostic Imaging , COVID-19/diagnostic imaging , Radiography
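
Illustrative note on record 1: the abstract does not name its three drift detection methods, so the sketch below swaps in a generic technique, the two-sample Kolmogorov-Smirnov test on a single numeric feature, purely to show why drift detection depends on sample size. The function names and the example feature values are hypothetical, and the threshold is the standard large-sample approximation.

```python
import math


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    i = j = 0
    largest_gap = 0.0
    while i < len(ref) and j < len(cur):
        x = min(ref[i], cur[j])
        # Step past every value equal to x in both samples (handles ties).
        while i < len(ref) and ref[i] == x:
            i += 1
        while j < len(cur) and cur[j] == x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(ref) - j / len(cur)))
    return largest_gap


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample rejection threshold for the two-sample KS statistic."""
    c = math.sqrt(-0.5 * math.log(alpha / 2))
    return c * math.sqrt((n + m) / (n * m))


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the approximate threshold."""
    statistic = ks_statistic(reference, current)
    return statistic > ks_critical_value(len(reference), len(current), alpha)


# Hypothetical feature values: the same shift of +30 at two sample sizes.
large_reference = list(range(100))
large_current = [value + 30 for value in large_reference]
small_reference = list(range(0, 100, 10))
small_current = [value + 30 for value in small_reference]

large_sample_flag = drift_detected(large_reference, large_current)
small_sample_flag = drift_detected(small_reference, small_current)
```

In this toy setup the KS statistic is about 0.3 in both cases, but the threshold shrinks as the samples grow, so the identical shift is flagged with 100 values per window and missed with 10.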
2.
Radiol Artif Intell ; 5(5): e220270, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37795140

ABSTRACT

Purpose: To externally test four chest radiograph classifiers on a large, diverse, real-world dataset with robust subgroup analysis. Materials and Methods: In this retrospective study, adult posteroanterior chest radiographs (January 2016-December 2020) and associated radiology reports from Trillium Health Partners in Ontario, Canada, were extracted and de-identified. An open-source natural language processing tool was locally validated and used to generate ground truth labels for the 197 540-image dataset based on the associated radiology report. Four classifiers generated predictions on each chest radiograph. Performance was evaluated using accuracy, positive predictive value, negative predictive value, sensitivity, specificity, F1 score, and Matthews correlation coefficient for the overall dataset and for patient, setting, and pathology subgroups. Results: Classifiers demonstrated 68%-77% accuracy, 64%-75% sensitivity, and 82%-94% specificity on the external testing dataset. Algorithms showed decreased sensitivity for solitary findings (43%-65%), patients younger than 40 years (27%-39%), and patients in the emergency department (38%-60%) and decreased specificity on normal chest radiographs with support devices (59%-85%). Differences in sex and ancestry represented movements along an algorithm's receiver operating characteristic curve. Conclusion: Performance of deep learning chest radiograph classifiers was subject to patient, setting, and pathology factors, demonstrating that subgroup analysis is necessary to inform implementation and monitor ongoing performance to ensure optimal quality, safety, and equity. Keywords: Conventional Radiography, Thorax, Ethics, Supervised Learning, Convolutional Neural Network (CNN), Machine Learning Algorithms. Supplemental material is available for this article. © RSNA, 2023. See also the commentary by Huisman and Hannink in this issue.
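
Illustrative note on record 2: the metrics listed in this abstract all derive from the four confusion-matrix counts, and subgroup analysis amounts to computing them separately per group. The sketch below is a minimal version under that reading; the function names, the record layout `(subgroup, truth, prediction)`, and the example records are assumptions, not the study's code or data.

```python
import math
from collections import defaultdict


def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def classification_metrics(tp, fp, tn, fn):
    """Compute the abstract's seven metrics from confusion-matrix counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


def metrics_by_subgroup(records):
    """records: iterable of (subgroup, truth, prediction) with 0/1 labels."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for subgroup, truth, prediction in records:
        if prediction:
            key = "tp" if truth else "fp"
        else:
            key = "fn" if truth else "tn"
        counts[subgroup][key] += 1
    return {group: classification_metrics(**c) for group, c in counts.items()}


# Hypothetical records, for illustration only.
example_records = [
    ("emergency", 1, 1), ("emergency", 1, 0), ("emergency", 0, 0),
    ("inpatient", 1, 1), ("inpatient", 0, 0),
]
example_subgroup_metrics = metrics_by_subgroup(example_records)
```

Returning `None` for an undefined ratio (for example, sensitivity in a subgroup with no positive cases) is a design choice here; it keeps small subgroups from raising errors or silently reporting zero.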

3.
Radiol Artif Intell ; 5(2): e220056, 2023 Mar.
Article in English | MEDLINE | ID: mdl-37035427

ABSTRACT

Despite frequent reports of imaging artificial intelligence (AI) that parallels human performance, clinicians often question the safety and robustness of AI products in practice. This work explores two underreported sources of noise that negatively affect imaging AI: (a) variation in labeling schema definitions and (b) noise in the labeling process. First, the overlap between the schemas of two publicly available datasets and a third-party vendor is compared, showing there is low agreement (<50%) between them. The authors also highlight the problem of label inconsistency, where different annotation schemas are selected for the same clinical prediction task; this results in inconsistent use of medical ontologies through intermingling or duplicate observations and diseases. Second, the individual radiologist annotations for the CheXpert test set are used to quantify noise in the labeling process. The analysis demonstrated that label noise varies by class, as agreement was high for pneumothorax and medical devices (percent agreement > 90%). Among low agreement classes (pneumonia, consolidation), the labels assigned as "ground truth" were unreliable, suggesting that the result of majority voting is highly dependent on which group of radiologists is assigned to annotation. Noise in labeling schemas and in gold label annotations is pervasive in medical imaging classification and affects downstream clinical deployment. Possible solutions (eg, changes to task design, annotation methods, and model training) and their potential to improve trust in clinical AI are discussed. Keywords: Radiology AI, Dataset Creation, Noise in Datasets. Supplemental material is available for this article. © RSNA, 2023. See also the commentary by Ursprung and Woitek in this issue.
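
Illustrative note on record 3: the abstract does not define how percent agreement was computed, so the sketch below assumes pairwise agreement across readers, and it shows how a majority-vote label can flip depending on which readers form the panel. All names and label values are hypothetical.

```python
from collections import Counter
from itertools import combinations


def percent_agreement(annotations):
    """Share of reader pairs that agree, pooled over all images.

    annotations: list of per-image label lists, one label per reader.
    """
    agreeing_pairs = total_pairs = 0
    for labels in annotations:
        for first, second in combinations(labels, 2):
            agreeing_pairs += first == second
            total_pairs += 1
    return agreeing_pairs / total_pairs


def majority_vote(labels):
    """Most frequent label; use an odd panel with binary labels to avoid ties."""
    return Counter(labels).most_common(1)[0][0]


def majority_outcomes(labels, panel_size):
    """Set of majority labels obtainable from every possible reader panel."""
    return {majority_vote(panel) for panel in combinations(labels, panel_size)}


# Hypothetical reads of one image by five readers (1 = finding present).
contested_reads = [1, 1, 0, 0, 0]
consistent_reads = [1, 1, 1, 1, 0]
contested_outcomes = majority_outcomes(contested_reads, panel_size=3)
consistent_outcomes = majority_outcomes(consistent_reads, panel_size=3)
```

With the contested reads, some three-reader panels vote "present" and others "absent", while the consistent reads give the same majority label for every panel.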

4.
Can Assoc Radiol J ; 74(2): 314-325, 2023 May.
Article in English | MEDLINE | ID: mdl-36189838

ABSTRACT

Purpose: To observe interactions of practicing radiologists with a chest x-ray AI tool and evaluate its usability and impact on workflow efficiency. Methods: Using a simulated clinical workflow and remote multi-monitor screensharing, we prospectively assessed the interactions of 10 staff radiologists (5-33 years of experience) with a PACS-embedded, regulatory-approved chest x-ray AI tool. Qualitatively, we collected feedback using a think-aloud method and post-testing semi-structured interview; transcript themes were categorized by: (1) AI tool features, (2) deployment considerations, and (3) broad human-AI interactions. Quantitatively, we used time-stamped video recordings to compare reporting and decision-making efficiency with and without AI assistance. Results: For AI tool features, radiologists appreciated the simple binary classification (normal vs abnormal) and found the heatmap essential to understand what the AI considered abnormal; users were uncertain of how to interpret confidence values. Regarding deployment considerations, radiologists thought the tool would be especially helpful for identifying subtle diagnoses; opinions were mixed on whether the tool impacted perceived efficiency, accuracy, and confidence. Considering general human-AI interactions, radiologists shared concerns about automation bias, especially when relying on an automated triage function. Regarding decision-making and workflow efficiency, participants began dictating 5 seconds later (42% increase, P = .02) and took 14 seconds longer to complete cases (33% increase, P = .09) with AI assistance. Conclusions: Radiologist usability testing provided insights into effective AI tool features, deployment considerations, and human-AI interactions that can guide successful AI deployment. Early AI adoption may increase radiologists' decision-making and total reporting time, though this improves with experience.


Subject(s)
User-Centered Design , User-Computer Interface , Humans , Workflow , X-Rays , Radiologists
5.
Can Assoc Radiol J ; 74(2): 326-333, 2023 May.
Article in English | MEDLINE | ID: mdl-36341574

ABSTRACT

Artificial intelligence (AI) software in radiology is becoming increasingly prevalent and performance is improving rapidly with new applications for given use cases being developed continuously, oftentimes with development and validation occurring in parallel. Several guidelines have provided reporting standards for publications of AI-based research in medicine and radiology. Yet, there is an unmet need for recommendations on the assessment of AI software before adoption and after commercialization. As the radiology AI ecosystem continues to grow and mature, a formalization of system assessment and evaluation is paramount to ensure patient safety, relevance and support to clinical workflows, and optimal allocation of limited AI development and validation resources before broader implementation into clinical practice. To fulfil these needs, we provide a glossary for AI software types, use cases and roles within the clinical workflow; list healthcare needs, key performance indicators and required information about software prior to assessment; and lay out examples of software performance metrics per software category. This conceptual framework is intended to streamline communication with the AI software industry and provide healthcare decision makers and radiologists with tools to assess the potential use of such software. The proposed software evaluation framework lays the foundation for a radiologist-led prospective validation network of radiology AI software. Learning Points: The rapid expansion of AI applications in radiology requires standardization of AI software specification, classification, and evaluation. The Canadian Association of Radiologists' AI Tech & Apps Working Group proposes an AI Specification document format and supports the implementation of a clinical expert evaluation process for Radiology AI software.


Subject(s)
Artificial Intelligence , Radiology , Humans , Ecosystem , Canada , Radiologists , Software
6.
Can Urol Assoc J ; 16(6): 213-221, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35099382

ABSTRACT

INTRODUCTION: We aimed to develop an explainable machine learning (ML) model to predict side-specific extraprostatic extension (ssEPE) to identify patients who can safely undergo nerve-sparing radical prostatectomy using preoperative clinicopathological variables. METHODS: A retrospective sample of clinicopathological data from 900 prostatic lobes at our institution was used as the training cohort. Primary outcome was the presence of ssEPE. The baseline model for comparison had the highest performance out of current biopsy-derived predictive models for ssEPE. A separate logistic regression (LR) model was built using the same variables as the ML model. All models were externally validated using a testing cohort of 122 lobes from another institution. Models were assessed by area under receiver-operating-characteristic curve (AUROC), area under precision-recall curve (AUPRC), calibration, and decision curve analysis. Model predictions were explained using SHapley Additive exPlanations. This tool was deployed as a publicly available web application. RESULTS: Incidence of ssEPE in the training and testing cohorts was 30.7% and 41.8%, respectively. The ML model achieved AUROC 0.81 (LR 0.78, baseline 0.74) and AUPRC 0.69 (LR 0.64, baseline 0.59) on the training cohort. On the testing cohort, the ML model achieved AUROC 0.81 (LR 0.76, baseline 0.75) and AUPRC 0.78 (LR 0.75, baseline 0.70). The ML model was explainable, well-calibrated, and achieved the highest net benefit for clinically relevant cutoffs of 10-30%. CONCLUSIONS: We developed a user-friendly application that enables physicians without prior ML experience to assess ssEPE risk and understand factors driving these predictions to aid surgical planning and patient counselling (https://share.streamlit.io/jcckwong/ssepe/main/ssEPE_V2.py).
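
Illustrative note on record 6: two of the assessment measures named in this abstract can be written compactly. The sketch below computes AUROC as the probability that a positive case outscores a negative one, and net benefit as used in decision curve analysis, TP/n minus FP/n weighted by the odds of the risk threshold. The function names and the example labels and scores are hypothetical and unrelated to the study's data.

```python
def auroc(labels, scores):
    """Probability that a random positive scores above a random negative.

    Ties count as half. Assumes at least one positive and one negative label.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for positive_score in positives:
        for negative_score in negatives:
            if positive_score > negative_score:
                wins += 1.0
            elif positive_score == negative_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def net_benefit(labels, scores, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    n = len(labels)
    true_positives = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    false_positives = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return true_positives / n - (false_positives / n) * threshold / (1 - threshold)


# Hypothetical outcomes and predicted risks, for illustration only.
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.6, 0.4, 0.2, 0.3, 0.7]
example_auroc = auroc(example_labels, example_scores)
example_net_benefit = net_benefit(example_labels, example_scores, threshold=0.2)
```

Evaluating `net_benefit` over a grid of thresholds (for instance 0.10 to 0.30, the cutoff range the abstract calls clinically relevant) and plotting the results per model gives a decision curve.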

7.
Bioelectron Med ; 7(1): 5, 2021 Apr 21.
Article in English | MEDLINE | ID: mdl-33879255

ABSTRACT

The overuse of low-value medical tests and treatments drives costs and patient harm. Efforts to address overuse, such as Choosing Wisely campaigns, typically rely on passive implementation strategies, a form of low-reliability system change. Embedding guidelines into clinical decision support (CDS) software is a higher-leverage approach that provides ordering suggestions through an interface embedded within the clinical workflow. Growth in computing power is increasingly enabling artificial intelligence (AI) to augment such decision-making tools. This article offers a roadmap of opportunities for AI-enabled CDS to reduce overuse, which are presented according to a patient's journey of care.

8.
CMAJ Open ; 8(3): E545-E553, 2020.
Article in English | MEDLINE | ID: mdl-32873583

ABSTRACT

BACKGROUND: Nonpharmaceutical interventions (NPIs) are the primary tools to mitigate early spread of the coronavirus disease 2019 (COVID-19) pandemic; however, such policies are implemented variably at the federal, provincial or territorial, and municipal levels without centralized documentation. We describe the development of the comprehensive open Canadian Non-Pharmaceutical Intervention (CAN-NPI) data set, which identifies and classifies all NPIs implemented in regions across Canada in response to COVID-19, and provides an accompanying description of geographic and temporal heterogeneity. METHODS: We performed an environmental scan of government websites, news media and verified government social media accounts to identify NPIs implemented in Canada between Jan. 1 and Apr. 19, 2020. The CAN-NPI data set contains information about each intervention's timing, location, type, target population and alignment with a response stringency measure. We conducted descriptive analyses to characterize the temporal and geographic variation in early NPI implementation. RESULTS: We recorded 2517 NPIs grouped in 63 distinct categories during this period. The median date of NPI implementation in Canada was Mar. 24, 2020. Most jurisdictions heightened the stringency of their response following the World Health Organization's global pandemic declaration on Mar. 11, 2020. However, there was variation among provinces or territories in the timing and stringency of NPI implementation, with 8 out of 13 provinces or territories declaring a state of emergency by Mar. 18, and all by Mar. 22, 2020. INTERPRETATION: There was substantial geographic and temporal heterogeneity in NPI implementation across Canada, highlighting the importance of a subnational lens in evaluating the COVID-19 pandemic response. Our comprehensive open-access data set will enable researchers to conduct robust interjurisdictional analyses of NPI impact in curtailing COVID-19 transmission.


Subject(s)
COVID-19/therapy , Pandemics/prevention & control , Social Media/statistics & numerical data , COVID-19/diagnosis , COVID-19/epidemiology , COVID-19/virology , COVID-19 Testing/methods , Canada/epidemiology , Geography , Government , Humans , Infection Control/methods , Pandemics/legislation & jurisprudence , Physical Distancing , Policy , SARS-CoV-2/genetics , Time Factors
9.
AMIA Jt Summits Transl Sci Proc ; 2020: 383-392, 2020.
Article in English | MEDLINE | ID: mdl-32477659

ABSTRACT

Seamless sharing between imaging facilities of medical images obtained on the same patient is crucial in providing accurate and efficient care to patients. However, the terminology used to describe semantically similar examinations can vary widely between facilities. Current practice is manual table-based mapping to a standard terminology, which has substantial potential for mislabelled and missing examinations. In this work, we establish several baseline methods for automating the mapping of radiology imaging procedure descriptions to a SNOMED CT-based standard terminology. Our best performing baseline, consisting of a bag-of-words representation and a shallow neural network, achieved 96.3% accuracy. In addition, we investigate an unsupervised clustering method that explores relevancy matching without the need for an intervening standard. Lastly, we make the procedure name dataset used in this work available to encourage extension of this application.
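
Illustrative note on record 9: a bag-of-words representation turns each procedure description into a vector of token counts, discarding word order, which is why differently ordered local names can map to the same vector. The sketch below shows only that representation step, not the classifier; the tokenizer, the function names, and the example descriptions are assumptions.

```python
from collections import Counter


def tokenize(description):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return description.lower().split()


def build_vocabulary(descriptions):
    """Sorted list of every distinct token seen in the descriptions."""
    return sorted({token for text in descriptions for token in tokenize(text)})


def bag_of_words(description, vocabulary):
    """Count vector aligned with the vocabulary; unseen tokens are ignored."""
    counts = Counter(tokenize(description))
    return [counts[token] for token in vocabulary]


# Hypothetical procedure descriptions from different facilities.
example_descriptions = ["CT chest with contrast", "Chest CT contrast", "MRI brain"]
example_vocabulary = build_vocabulary(example_descriptions)
example_vector = bag_of_words("Chest CT contrast", example_vocabulary)
```

The resulting vectors would be the input features to a classifier such as the shallow neural network the abstract mentions.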

10.
J Am Coll Radiol ; 17(9): 1149-1158, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32278847

ABSTRACT

PURPOSE: The aim of this study was to enhance multispecialty CT and MRI protocol assignment quality and efficiency through development, testing, and proposed workflow design of a natural language processing (NLP)-based machine learning classifier. METHODS: NLP-based machine learning classification models were developed using order entry input data and radiologist-assigned protocols from more than 18,000 unique CT and MRI examinations obtained during routine clinical use. k-Nearest neighbor, random forest, and deep neural network classification models were evaluated at baseline and after applying class frequency and confidence thresholding techniques. To simulate performance in real-world deployment, the model was evaluated in two operating modes in combination: automation (automated assignment of the top result) and clinical decision support (CDS; top-three protocol suggestion for clinical review). Finally, model-radiologist discordance was subjectively reviewed to guide explainability and safe use. RESULTS: Baseline protocol assignment performance achieved weighted precision of 0.757 to 0.824. Simulating real-world deployment using combined thresholding techniques, the optimized deep neural network model assigned 69% of protocols in automation mode with 95% accuracy. In the remaining 31% of cases, the model achieved 92% accuracy in CDS mode. Analysis of discordance with subspecialty radiologist labels revealed both more and less appropriate model predictions. CONCLUSIONS: A multiclass NLP-based classification algorithm was designed to drive local operational improvement in CT and MR radiology protocol assignment at subspecialist quality. The results demonstrate a simulated workflow deployment enabling automated assignment of protocols in nearly 7 of 10 cases with very few errors combined with top-three CDS for remaining cases supporting a high-quality, efficient radiology workflow.


Subject(s)
Automation , Machine Learning , Radiology , Natural Language Processing , Neural Networks, Computer
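
Illustrative note on record 10: the two operating modes described in this abstract can be read as a routing rule on the classifier's confidence: assign the top protocol automatically when confidence is high, otherwise show the top three suggestions for review. The sketch below follows that reading; the 0.9 threshold, the function names, and the protocol names are hypothetical, and the study's actual thresholding also involved class frequency, which is not modeled here.

```python
def route_order(class_probabilities, automation_threshold=0.9, top_k=3):
    """Choose automation or clinical decision support (CDS) for one order.

    class_probabilities: dict mapping protocol name to predicted probability.
    """
    ranked = sorted(class_probabilities.items(), key=lambda item: item[1], reverse=True)
    best_protocol, best_probability = ranked[0]
    if best_probability >= automation_threshold:
        # Confident enough: assign the top protocol without review.
        return {"mode": "automation", "protocols": [best_protocol]}
    # Otherwise suggest the top-k protocols for radiologist review.
    return {"mode": "cds", "protocols": [name for name, _ in ranked[:top_k]]}


def automation_rate(batch, automation_threshold=0.9):
    """Fraction of orders in a batch that would be assigned automatically."""
    decisions = [route_order(order, automation_threshold) for order in batch]
    return sum(d["mode"] == "automation" for d in decisions) / len(decisions)


# Hypothetical model outputs for two orders.
confident_order = {"CT head routine": 0.95, "CT head angio": 0.03, "CT neck": 0.02}
uncertain_order = {"MRI knee": 0.45, "MRI knee arthrogram": 0.30,
                   "MRI femur": 0.15, "MRI tibia": 0.10}
confident_decision = route_order(confident_order)
uncertain_decision = route_order(uncertain_order)
```

Raising the threshold trades a lower automation rate for higher accuracy among automated assignments, which is the trade-off behind the 69% automated and 31% CDS split reported above.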
12.
CMAJ Open ; 5(4): E760-E767, 2017 Oct 13.
Article in English | MEDLINE | ID: mdl-29042408

ABSTRACT

BACKGROUND: In 2012, the Ontario government withdrew public insurance coverage of imaging tests for uncomplicated low back pain. We studied the impact of this restriction on test ordering by physicians. METHODS: We compared the numbers of lumbar spine radiography, computed tomography (CT) and single-segment magnetic resonance imaging (MRI) studies ordered by physicians in the 3 years before and after the policy change. We linked claims data from the Ontario Health Insurance Program with physician details to calculate rates per test-ordering physician. We compared changes in rates of monthly test ordering by family physicians and specialists before and after the policy change using segmented regression analysis of interrupted time series data. RESULTS: The number of lumbar spine radiography and spine CT studies ordered by family physicians decreased by 98 597 (28.7%) and 17 499 (28.7%), respectively, in the year after the policy change; there was little change in ordering by specialists. The number of lumbar spine radiography studies ordered per family physician by month decreased by 0.81 tests (p < 0.001) after the intervention, followed by a smaller rebound increase that remained below baseline. Monthly ordering of spine CT per family physician declined by 0.1 tests (p < 0.001), and that of limited spine MRI rose before the intervention, decreased by 0.18 tests (p < 0.001) after the intervention, then started to rise again. Monthly ordering of limited spine MRI by specialists, which had been stable before the policy change, decreased by 0.1 tests per specialist (p < 0.001) afterward, then rose to preintervention levels. INTERPRETATION: The restriction in coverage of imaging tests caused a larger decrease in test ordering by family physicians than by specialists and a larger, more sustained reduction in the use of lumbar spine radiography and spine CT than of spine MRI.
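
Illustrative note on record 12: the abstract does not give its model specification, but segmented regression of an interrupted time series is commonly written in the form below. The symbols are assumptions introduced here: $Y_t$ is the monthly ordering rate per physician, $t$ is the month index, $T_0$ is the month of the policy change, and $D_t$ indicates the post-policy period.

```latex
Y_t = \beta_0 + \beta_1 t + \beta_2 D_t + \beta_3 (t - T_0) D_t + \varepsilon_t,
\qquad
D_t =
\begin{cases}
0, & t < T_0 \\
1, & t \ge T_0
\end{cases}
```

Under this form, $\beta_1$ is the pre-policy trend, $\beta_2$ is the immediate level change at the policy date (the kind of quantity reported above as a drop of 0.81 radiography tests per family physician per month), and $\beta_3$ is the change in trend afterward, which would capture a rebound.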

13.
ScientificWorldJournal ; 2015: 824268, 2015.
Article in English | MEDLINE | ID: mdl-26171421

ABSTRACT

Discriminating groups were introduced by G. Baumslag, A. Myasnikov, and V. Remeslennikov as an outgrowth of their theory of algebraic geometry over groups. Algebraic geometry over groups became the main method of attack on the solution of the celebrated Tarski conjectures. In this paper we explore the notion of discrimination in a general universal algebra context. As an application we provide a different proof of a theorem of Malcev on axiomatic classes of Ω-algebras.

15.
Conn Med ; 74(3): 149-56, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20391821

ABSTRACT

BACKGROUND: The amount of literature dealing with the diagnosis and treatment of adolescent concussions is considerable. Most articles focus on the athlete. This study examines both sports-related and nonsports-related concussions in adolescents, their etiology, mechanisms of injury (categorized by sport), symptoms exhibited, physical findings, computerized tomography scan results and the problem of prolonged recovery (persistent postconcussion syndrome, used in this article to mean symptoms lasting over four weeks). OBJECTIVE: The purpose of this study is to present the data, their significance and a new method of management that has successfully allowed the author's concussed patients to recover more rapidly. METHOD: A retrospective review of 863 adolescent concussions, in 11-year-old to 19-year-old patients, from July 2004 through December 31, 2008. Subjects were seen as a result of referrals largely from the author's practice (Pediatric Healthcare Associates), other physicians, athletic trainers or patients previously treated. All concussions, including nonsports-related concussions, were included in the study. Some patients had multiple concussions; 774 individuals accounted for the 863 concussions. The number of patients by age and the number of concussions they sustained are listed below.


Subject(s)
Brain Concussion/complications , Adolescent , Age Factors , Amnesia/etiology , Athletic Injuries/complications , Athletic Injuries/diagnosis , Athletic Injuries/epidemiology , Brain Concussion/diagnosis , Brain Concussion/epidemiology , Brain Concussion/etiology , Child , Female , Humans , Male , Retrospective Studies , Risk Factors , Time Factors , Unconsciousness/etiology , United States/epidemiology , Young Adult
16.
Healthc Q ; 12(3): 32-41, 2009.
Article in English | MEDLINE | ID: mdl-19553764

ABSTRACT

Canadian healthcare organizations are increasingly asked to do more with less, and too often this has resulted in demands on staff to simply work harder and longer. Lean methodologies, originating from Japanese industrial organizations and most notably Toyota, offer an alternative - tried and tested approaches to working smarter. Lean, with its systematic approaches to reducing waste, has found its way to Canadian healthcare organizations with promising results. This article reports on a study of five Canadian healthcare providers that have recently implemented Lean. We offer stories of success but also identify potential obstacles and ways by which they may be surmounted to provide better value for our healthcare investments.


Subject(s)
Efficiency, Organizational , Hospital Administration/methods , Leadership , Technology Transfer , Canada , Health Services Research
17.
Cardiovasc Res ; 71(1): 40-9, 2006 Jul 01.
Article in English | MEDLINE | ID: mdl-16566911

ABSTRACT

Nearly 20 years have passed since Weinberg and Bell attempted to make the first tissue-engineered blood vessels. Following this early attempt, vascular tissue engineering has emerged as one of the most promising approaches to fabricate orderly and mechanically competent vascular substitutes. In elastic and muscular arteries, elastin is a critical structural and regulatory matrix protein and plays a dominant role by conferring elasticity to the vessel wall. Elastin also regulates vascular smooth muscle cell activity and phenotype. Despite the great promise that tissue-engineered blood vessels have to offer, little research in the last two decades has addressed the importance of elastin incorporation into these vessels. Although cardiovascular tissue engineering has been reviewed in the past, very little attention has been given to elastin. Thus, this review focuses on the recent advances made towards elastogenesis and the challenges we face in the quest for appropriate functional vascular substitutes.


Subject(s)
Blood Vessel Prosthesis , Elastin/biosynthesis , Muscle, Smooth, Vascular/metabolism , Tissue Engineering , Arteries , Bioreactors , Elasticity , Elastin/chemical synthesis , Extracellular Matrix/metabolism , Humans , Stem Cells/physiology , Stress, Mechanical