ABSTRACT
With radiology shortages affecting over half of the global population, the potential of artificial intelligence to revolutionize medical diagnosis and treatment is ever more important. However, a lack of trust from medical professionals hinders the widespread adoption of AI models in the health sciences. Explainable AI (XAI) aims to increase trust and understanding of black-box models by identifying biases and providing transparent explanations. This is the first survey that explores explainable user interfaces (XUI) from a medical domain perspective, analysing the visualization and interaction methods employed in current medical XAI systems. We analysed 42 explainable interfaces following the PRISMA methodology, emphasizing the critical role of effectively conveying information to users as part of the explanation process. We contribute a taxonomy of interface design properties and identify five distinct clusters of research papers. Future research directions include contestability in medical decision support, counterfactual explanations for images, and leveraging Large Language Models to enhance XAI interfaces in healthcare.
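As a minimal illustration of the kind of explanation such interfaces surface, the sketch below computes an occlusion-based saliency map: patches of the input are masked and the drop in a black-box model's score marks the regions the prediction depends on. The patch size and the `predict` callable are illustrative assumptions, not elements of any surveyed system.

```python
import numpy as np

def occlusion_saliency(image, predict, patch=16):
    """image: (H, W) array; predict: callable returning a scalar score."""
    base = predict(image)
    heat = np.zeros_like(image, dtype=np.float32)
    for y in range(0, image.shape[0], patch):
        for x in range(0, image.shape[1], patch):
            masked = image.copy()
            masked[y:y + patch, x:x + patch] = 0.0   # occlude one patch
            heat[y:y + patch, x:x + patch] = base - predict(masked)
    return heat  # high values = regions the model relied on

# Toy usage with a stand-in "model" that scores the center region.
demo = occlusion_saliency(np.random.rand(64, 64),
                          predict=lambda im: im[24:40, 24:40].mean())
```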
Subjects
Artificial Intelligence, Humans, User-Computer Interface, Surveys and Questionnaires, Delivery of Health Care

ABSTRACT
X-ray imaging plays a crucial role in diagnostic medicine. Yet, a significant portion of the global population lacks access to this essential technology due to a shortage of trained radiologists. Eye-tracking data and deep learning models can enhance X-ray analysis by mapping expert focus areas, guiding automated anomaly detection, optimizing workflow efficiency, and bolstering training methods for novice radiologists. However, the literature shows contradictory results regarding the usefulness of eye-tracking data in deep learning architectures for abnormality detection. We argue that these discrepancies between studies are due to (a) the way eye-tracking data is (or is not) processed, (b) the types of deep learning architectures chosen, and (c) the type of application that these architectures will have. We conducted a systematic literature review using PRISMA to address these contradictory results. We analyzed 60 studies that incorporated eye-tracking data in a deep learning approach for different application goals in radiology. We performed a comparative analysis to understand whether eye gaze data provides feature maps that can be useful under a deep learning approach and whether they can promote more interpretable predictions. To the best of our knowledge, this is the first survey in the area that performs a thorough investigation of eye gaze data processing techniques and their impact on different deep learning architectures for applications such as error detection, classification, object detection, expertise level analysis, fatigue estimation, and human attention prediction in medical imaging data. Our analysis resulted in two main contributions: (1) a taxonomy that first divides the literature by task, enabling us to analyze the value eye movement data can bring to each case and to build guidelines on the architectures and gaze processing techniques adequate for each application, and (2) an overall analysis of how eye gaze data can promote explainability in radiology.
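As a concrete example of one common gaze processing step discussed across these studies, the sketch below rasterizes fixations into a dwell-time-weighted heatmap that a deep model can consume as an extra input channel or attention prior. The (x, y, duration) fixation format and the Gaussian smoothing are assumptions for illustration; the surveyed papers differ precisely in how (or whether) they perform this processing.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixations_to_heatmap(fixations, height, width, sigma=25.0):
    """fixations: iterable of (x, y, duration_seconds) in image coordinates."""
    heatmap = np.zeros((height, width), dtype=np.float32)
    for x, y, duration in fixations:
        if 0 <= int(y) < height and 0 <= int(x) < width:
            heatmap[int(y), int(x)] += duration   # weight by dwell time
    heatmap = gaussian_filter(heatmap, sigma=sigma)  # spread each point mass
    if heatmap.max() > 0:
        heatmap /= heatmap.max()                  # normalize to [0, 1]
    return heatmap

# Example: stack the heatmap with a grayscale chest X-ray as a 2-channel input.
xray = np.random.rand(1024, 1024).astype(np.float32)   # placeholder image
gaze = fixations_to_heatmap([(512, 400, 0.8), (300, 700, 0.3)], 1024, 1024)
model_input = np.stack([xray, gaze], axis=0)            # shape (2, 1024, 1024)
```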
Subjects
Deep Learning, Eye-Tracking Technology, Radiology, Humans, Radiology/education, Eye Movements/physiology, Ocular Fixation/physiology

ABSTRACT
This study investigates the effects of including patients' clinical information on the performance of deep learning (DL) classifiers for disease location in chest X-ray images. Although current classifiers achieve high performance using chest X-ray images alone, consultations with practicing radiologists indicate that clinical data is highly informative and essential for interpreting medical images and making proper diagnoses. In this work, we propose a novel architecture consisting of two fusion methods that enable the model to simultaneously process patients' clinical data (structured data) and chest X-rays (image data). Since these data modalities are in different dimensional spaces, we propose a spatial arrangement strategy, spatialization, to facilitate the multimodal learning process in a Mask R-CNN model. We performed an extensive experimental evaluation using MIMIC-Eye, a dataset comprising different modalities: MIMIC-CXR (chest X-ray images), MIMIC IV-ED (patients' clinical data), and REFLACX (annotations of disease locations in chest X-rays). Results show that incorporating patients' clinical data in a DL model together with the proposed fusion methods improves disease localization in chest X-rays by 12% in terms of Average Precision compared to a standard Mask R-CNN using chest X-rays alone. Further ablation studies also emphasize the importance of multimodal DL architectures and the incorporation of patients' clinical data in disease localization. In the interest of fostering scientific reproducibility, the architecture proposed within this investigation has been made publicly accessible (https://github.com/ChihchengHsieh/multimodal-abnormalities-detection).
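A minimal sketch of the spatialization idea follows, assuming fusion happens on a backbone feature map: the clinical feature vector is embedded, tiled over the spatial grid, and concatenated with the image features so a convolutional detector can process both modalities jointly. The dimensions and fusion point are illustrative, not the paper's exact Mask R-CNN configuration (see the repository above for that).

```python
import torch
import torch.nn as nn

class SpatializedFusion(nn.Module):
    def __init__(self, clinical_dim, image_channels, fused_channels):
        super().__init__()
        self.project = nn.Linear(clinical_dim, image_channels)  # embed tabular data
        self.fuse = nn.Conv2d(2 * image_channels, fused_channels, kernel_size=1)

    def forward(self, image_feats, clinical):
        # image_feats: (B, C, H, W); clinical: (B, clinical_dim)
        b, c, h, w = image_feats.shape
        clin = self.project(clinical)                     # (B, C)
        clin = clin.view(b, c, 1, 1).expand(b, c, h, w)   # tile over the grid
        return self.fuse(torch.cat([image_feats, clin], dim=1))

fusion = SpatializedFusion(clinical_dim=12, image_channels=256, fused_channels=256)
out = fusion(torch.randn(2, 256, 32, 32), torch.randn(2, 12))  # (2, 256, 32, 32)
```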
Subjects
Radiologists, Humans, X-Rays, Reproducibility of Results, Radiography

ABSTRACT
The recent pandemic, war, and oil crises have caused many to reconsider their need to travel for education, training, and meetings. Providing assistance and training remotely has thus gained importance for many applications, from industrial maintenance to surgical telemonitoring. Current solutions such as video conferencing platforms lack essential communication cues such as spatial referencing, which negatively impacts both completion time and task performance. Mixed Reality (MR) offers opportunities to improve remote assistance and training, as it opens the way to increased spatial clarity and a larger interaction space. We contribute a survey of remote assistance and training in MR environments through a systematic literature review to provide a deeper understanding of current approaches, benefits, and challenges. We analyze 62 articles and contextualize our findings along a taxonomy based on degree of collaboration, perspective sharing, MR space symmetry, time, input and output modality, visual display, and application domain. We identify the main gaps and opportunities in this research area, such as exploring collaboration scenarios beyond one-expert-to-one-trainee, enabling users to move across the reality-virtuality spectrum during a task, or exploring advanced interaction techniques that resort to hand or eye tracking. Our survey informs and helps researchers in different domains, including maintenance, medicine, engineering, and education, build and evaluate novel MR approaches to remote training and assistance. All supplemental materials are available at https://augmented-perception.org/publications/2023-training-survey.html.
ABSTRACT
One of the most promising applications of Optical See-Through Augmented Reality is minimally invasive laparoscopic surgery, which currently suffers from problems such as surgeon discomfort and fatigue caused by looking at a display positioned outside the surgeon's visual field, made worse by the length of the procedure. This fatigue is especially felt in the surgeon's neck, which is strained from adopting unnatural postures in order to visualise the laparoscopic video feed. In this paper, we present work in Augmented Reality, as well as developments in surgery and in Augmented Reality applied both to surgery in general and to laparoscopy in particular, to address these issues. We applied user and task analysis methods to learn about practices performed in the operating room by observing surgeons in their working environment, in order to understand, in detail, how they performed their tasks and achieved their intended goals. Drawing on observations and analysis of video recordings of laparoscopic surgeries, we identified relevant constraints and design requirements. Besides proposals to address the ergonomic issues, we present the design and implementation of a multimodal interface to enhance the laparoscopic procedure. Our method makes the procedure more comfortable for surgeons by allowing them to keep the laparoscopic video in their viewing area regardless of neck posture. Our interface also makes it possible to access patient imaging data without interrupting the operation, and to communicate with team members through a pointing reticle. We evaluated how surgeons perceived the implemented prototype in terms of usefulness and usability through qualitative evaluation sessions conducted with a think-aloud protocol, which we describe in detail in this paper. In addition to assessing the advantages of the prototype over traditional laparoscopic settings, we administered a System Usability Scale questionnaire to measure usability and a NASA Task Load Index questionnaire to rate perceived workload and assess the prototype's effectiveness. Our results show that surgeons consider that our prototype can improve surgeon-to-surgeon communication by using head pose as a means of pointing. Surgeons also believe that our approach can afford a more comfortable posture throughout the surgery and enhance hand-eye coordination, as physicians no longer need to twist their necks to look at screens placed outside the field of operation.
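A minimal sketch of how head-pose pointing could drive such a reticle: intersect the head's forward ray with the plane of the shared video feed and draw the reticle at the hit point. The coordinate conventions and plane placement are illustrative assumptions, not the prototype's actual implementation.

```python
import numpy as np

def reticle_position(head_pos, head_forward, plane_point, plane_normal):
    """Return the intersection of the head ray with the display plane, or None."""
    denom = np.dot(head_forward, plane_normal)
    if abs(denom) < 1e-6:          # ray parallel to the plane: no reticle
        return None
    t = np.dot(plane_point - head_pos, plane_normal) / denom
    if t < 0:                      # plane is behind the viewer
        return None
    return head_pos + t * head_forward

hit = reticle_position(
    head_pos=np.array([0.0, 1.7, 0.0]),        # head at eye height
    head_forward=np.array([0.0, -0.1, 1.0]),   # looking slightly down
    plane_point=np.array([0.0, 1.2, 2.0]),     # virtual screen 2 m away
    plane_normal=np.array([0.0, 0.0, -1.0]),
)
```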
Subjects
Augmented Reality, Laparoscopy, Ergonomics, Humans, Posture, Video Recording

ABSTRACT
Locomotion in virtual environments is currently a difficult and unnatural task to perform. Normally, researchers tend to devise ground- or floor-based metaphors to constrain the degrees of freedom (DoFs) during motion. These restrictions enable interactions that accurately emulate the human gait to provide high interaction fidelity. However, flying allows users to reach specific locations in a virtual scene more expeditiously. Our experience suggests that even though flying is not innate to humans, high-interaction-fidelity techniques may also improve the flying experience since flying requires simultaneously controlling additional DoFs. We present the Magic Carpet, an approach to flying that combines a floor proxy with a full-body representation to avoid balance and cybersickness issues. This design space enables DoF separation by treating direction indication and speed control as two separate phases of travel, thereby enabling techniques with higher interaction fidelity. To validate our design space, we conducted two complementary studies, one for each of the travel phases. In this paper, we present the results of both studies and report the best techniques for use within the Magic Carpet design space. To this end, we use both objective and subjective measures to evaluate the efficiency, embodiment effect, and side effects, such as physical fatigue and cybersickness, of the tested techniques in our design space. Our results show that the proposed approach enables high-interaction-fidelity techniques while improving the user experience.
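A minimal sketch of the DoF separation under stated assumptions: direction comes from a point the user indicates on the floor proxy relative to the carpet center, while speed is a separately controlled, clamped scalar (here, a lean amount), so the two travel phases never interfere. The input mappings are hypothetical and shown in 2D for brevity; they are not the specific techniques evaluated in the studies.

```python
import numpy as np

def travel_velocity(floor_point, carpet_center, lean_amount, max_speed=3.0):
    """Direction phase from the indicated floor point; speed phase from lean."""
    direction = np.asarray(floor_point, float) - np.asarray(carpet_center, float)
    norm = np.linalg.norm(direction)
    if norm < 1e-6:                                       # no direction indicated
        return np.zeros(2)
    speed = np.clip(lean_amount, 0.0, 1.0) * max_speed    # speed phase, m/s
    return direction / norm * speed                       # direction phase

v = travel_velocity(floor_point=(0.4, 1.0), carpet_center=(0.0, 0.0),
                    lean_amount=0.5)                      # half of max speed
```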
ABSTRACT
Feet input can support mid-air hand gestures for touchless medical image manipulation to prevent unintended activations, especially in sterile contexts. However, foot interaction has yet to be investigated in dental settings. In this paper, we conducted a mixed methods research study with medical dentistry professionals. To this end, we developed a touchless medical image system that can be used in either sitting or standing configurations. Clinicians could use both hands as 3D cursors and a minimalist single-foot gesture vocabulary to activate manipulations. First, we performed a qualitative evaluation with 18 medical dentists to assess the utility and usability of our system. Second, we used quantitative methods to compare foot-pedal-supported hand interaction against hands-only conditions with 22 medical dentists. We expand on previous work by characterizing a range of potential limitations of foot-supported touchless 3D interaction in the dental domain. Our findings suggest that clinicians are open to using their feet for simple, fast, and easy access to image data during surgical procedures, such as dental implant placement. Furthermore, 3D hand cursors, supported by foot gestures for activation events, were considered useful and easy to employ for medical image manipulation. Even though most clinicians preferred hands-only manipulation for pragmatic purposes, foot-supported interaction was found to provide more precise control and, most importantly, to decrease the number of unintended activations during manipulation. Finally, we provide design considerations for future work exploring foot-supported touchless interfaces for sterile settings in Dental Medicine, regarding interaction design, foot input devices, the learning process, and camera occlusions.
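The activation-gating idea can be sketched as a simple clutch, where the foot gesture decides whether hand-cursor motion is applied at all; this is what suppresses unintended activations. The event names and the pan-only manipulation are illustrative assumptions, not the system's full gesture vocabulary.

```python
class FootClutch:
    """Foot gesture gates whether hand-cursor motion manipulates the image."""

    def __init__(self):
        self.engaged = False
        self.pan = [0.0, 0.0]        # current image pan offset

    def on_foot_event(self, event):
        if event == "tap_down":
            self.engaged = True      # begin manipulation
        elif event == "tap_up":
            self.engaged = False     # freeze the image

    def on_hand_move(self, dx, dy):
        # Hand motion only pans the image while the foot clutch is engaged.
        if self.engaged:
            self.pan[0] += dx
            self.pan[1] += dy

clutch = FootClutch()
clutch.on_foot_event("tap_down")   # engage with a foot tap
clutch.on_hand_move(4.0, -2.0)     # hand motion now pans the image
clutch.on_foot_event("tap_up")     # disengage
clutch.on_hand_move(10.0, 0.0)     # ignored: no unintended activation
```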
Subjects
Dentistry, Foot, Dental Radiography, User-Computer Interface, Computer Graphics, Humans, Three-Dimensional Imaging

ABSTRACT
Extensive research has been devoted to discovering new techniques and methods to model protein-ligand interactions. In particular, considerable effort has focused on identifying candidate binding sites, which quite often are active sites that correspond to protein pockets or cavities. These cavities thus play an important role in molecular docking. However, there is no established benchmark to assess the accuracy of new cavity detection methods. In practice, each new technique is evaluated using a small set of proteins with known binding sites as ground truth. However, studies supported by large datasets of known cavities and/or binding sites and statistical classification (i.e., false positives, false negatives, true positives, and true negatives) would yield much stronger and more reliable assessments. To this end, we propose CavBench, a generic and extensible benchmark to compare different cavity detection methods relative to diverse ground truth datasets (e.g., PDBsum) using statistical classification methods.
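A minimal sketch of benchmark-style scoring under stated assumptions: predicted cavity centers are greedily matched to ground-truth binding-site centers within a distance threshold, yielding the classification counts from which precision and recall follow. The 4 Å threshold and greedy matching are illustrative choices, not CavBench's exact matching rules.

```python
import numpy as np

def score_cavities(predicted, ground_truth, threshold=4.0):
    """predicted, ground_truth: (N, 3) arrays of cavity/site centers in Å."""
    predicted = np.asarray(predicted, float)
    ground_truth = np.asarray(ground_truth, float)
    unmatched = set(range(len(ground_truth)))
    tp = 0
    for p in predicted:
        # Greedily match each prediction to the nearest unclaimed true site.
        best, best_d = None, threshold
        for j in unmatched:
            d = np.linalg.norm(p - ground_truth[j])
            if d <= best_d:
                best, best_d = j, d
        if best is not None:
            unmatched.discard(best)
            tp += 1
    fp = len(predicted) - tp            # predictions with no true site nearby
    fn = len(unmatched)                 # true sites no prediction covered
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return tp, fp, fn, precision, recall

tp, fp, fn, prec, rec = score_cavities(
    predicted=[[1.0, 2.0, 3.0], [30.0, 0.0, 0.0]],
    ground_truth=[[2.0, 2.0, 3.0]],
)  # -> 1 TP, 1 FP, 0 FN
```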
Subjects
Molecular Models, Proteins/chemistry, Software, Algorithms, Protein Conformation, Reproducibility of Results, Sensitivity and Specificity

ABSTRACT
Understanding the morphological features that characterize the normal hip joint is critical and necessary for a more comprehensive definition of pathological presentations, such as femoroacetabular impingement and hip dysplasia. Based on anatomical observations that the articular surfaces of synovial joints are better represented by ovoidal shapes than by spheres, the aim of this study is to computationally test this morphological classification for the femoral head and acetabular cavity of asymptomatic, dysplastic, and impinged hips by comparing spherical, ellipsoidal, and ovoidal shapes. An image-based surface fitting framework was used to assess the goodness-of-fit of spherical, ellipsoidal, and tapered ellipsoidal (i.e., egg-like) shapes. The framework involved image segmentation with active contour methods, mesh smoothing and decimation, and surface fitting to point clouds performed with genetic algorithms. Image data of the hip region was obtained from computed tomography and magnetic resonance imaging scans. Shape analyses were performed on image data from 20 asymptomatic, 20 dysplastic, and 20 impinged (cam, pincer, and mixed) hips of patients with ages ranging between 18 and 45 years (28 male and 32 female). Tapered ellipsoids presented the lowest fitting errors (i.e., were more oval), followed by ellipsoids, with spheres showing the worst goodness-of-fit. Ovoidal geometries are also more representative of cam, pincer, and mixed impinged hips than spherical or ellipsoidal shapes. The statistical analysis of the surface fitting errors reveals that ovoidal shapes better represent both articular surfaces of the hip joint, showing a closer approximation to the overall features of asymptomatic, dysplastic, and impinged cases.
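For the simplest shape in the comparison, goodness-of-fit can be illustrated with an algebraic least-squares sphere fit to a segmented point cloud, scoring the RMS radial residual. The study itself fits ellipsoids and tapered ellipsoids with genetic algorithms; this sketch only shows the baseline idea.

```python
import numpy as np

def fit_sphere(points):
    """points: (N, 3) array. Solves |p|^2 = 2 c.p + (r^2 - |c|^2) linearly."""
    A = np.hstack([2 * points, np.ones((len(points), 1))])
    b = (points ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    radius = np.sqrt(sol[3] + center @ center)
    residuals = np.linalg.norm(points - center, axis=1) - radius
    return center, radius, np.sqrt((residuals ** 2).mean())  # RMS fit error

# Points sampled on the unit sphere recover center ~(0,0,0), radius ~1, RMS ~0.
pts = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0],
                [0.0, -1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])
center, radius, rms = fit_sphere(pts)
```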
Subjects
Asymptomatic Diseases, Hip Dislocation/pathology, Hip Joint/pathology, Adult, Female, Hip Dislocation/diagnostic imaging, Hip Joint/diagnostic imaging, Humans, Three-Dimensional Imaging, Magnetic Resonance Imaging, Male, Surface Properties, X-Ray Computed Tomography, Young Adult

ABSTRACT
Analyzing medical volume datasets requires interactive visualization so that users can extract anatomo-physiological information in real time. Conventional volume rendering systems rely on 2D input devices, such as mice and keyboards, which are known to hamper 3D analysis, as users often struggle to obtain the desired orientation, which is only achieved after several attempts. In this paper, we address which 3D analysis tools are better performed with 3D hand cursors operating on a touchless interface compared to 2D input devices running on a conventional WIMP interface. The main goals of this paper are to explore the capabilities of (simple) hand gestures to facilitate sterile manipulation of 3D medical data on a touchless interface, without resorting to wearables, and to evaluate the surgical feasibility of the proposed interface with senior surgeons (N=5) and interns (N=2). To this end, we developed a touchless interface controlled via hand gestures and body postures to rapidly rotate and position medical volume images in three dimensions, where each hand acts as an interactive 3D cursor. User studies were conducted with laypeople, while informal evaluation sessions were carried out with senior surgeons, radiologists, and professional biomedical engineers. Results demonstrate its usability: the proposed touchless interface improves spatial awareness and affords more fluent interaction with the 3D volume than traditional 2D input devices, as it requires fewer attempts to achieve the desired orientation by avoiding the composition of several cumulative rotations, which is typically necessary in WIMP interfaces. However, tasks requiring precision, such as clipping plane visualization and tagging, are best performed with mouse-based systems due to noise, incorrect gesture detection, and problems in skeleton tracking that need to be addressed before tests in real medical environments can be performed.
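One plausible way to realize two-handed volume rotation, assuming hand positions come from a skeleton tracker: apply to the volume the rotation that takes the previous inter-hand vector to the current one, so the dataset follows the hands like a held object. The axis-angle construction below is a common choice, not necessarily the one used in the prototype.

```python
import numpy as np

def rotation_between(v_prev, v_curr):
    """Return (axis, angle) rotating unit-normalized v_prev onto v_curr."""
    a = v_prev / np.linalg.norm(v_prev)
    b = v_curr / np.linalg.norm(v_curr)
    axis = np.cross(a, b)
    norm = np.linalg.norm(axis)
    if norm < 1e-8:                     # hands moved (anti)parallel: no rotation
        return np.array([0.0, 0.0, 1.0]), 0.0
    angle = np.arctan2(norm, np.dot(a, b))
    return axis / norm, angle

axis, angle = rotation_between(
    np.array([1.0, 0.0, 0.0]),   # left-to-right hand vector last frame
    np.array([0.9, 0.1, 0.0]),   # same vector this frame
)  # apply (axis, angle) to the volume's orientation each frame
```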
Subjects
Gestures, Three-Dimensional Imaging, User-Computer Interface, Factual Databases, Statistics as Topic

ABSTRACT
Detecting and analyzing protein cavities provides significant information about active sites for biological processes (e.g., protein-protein or protein-ligand binding) in molecular graphics and modeling. Using the three-dimensional structure of a given protein (i.e., atom types and their locations in 3D) as retrieved from a PDB (Protein Data Bank) file, it is now computationally viable to determine a description of these cavities. Such cavities correspond to pockets, clefts, invaginations, voids, tunnels, channels, and grooves on the surface of a given protein. In this work, we survey the literature on protein cavity computation and classify algorithmic approaches into three categories: evolution-based, energy-based, and geometry-based. Our survey focuses on geometric algorithms, whose taxonomy is extended to include not only sphere-, grid-, and tessellation-based methods, but also surface-based, hybrid geometric, consensus, and time-varying methods. Finally, we detail those techniques that have been customized for GPU (Graphics Processing Unit) computing.
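As a flavor of the grid-based category, the sketch below implements a simplified protein-solvent-protein (PSP) scan in the spirit of LIGSITE-style methods: a grid cell is flagged as buried when protein-occupied cells enclose it along a scan axis. Real detectors scan several directions and cluster the hits; one axis is shown for brevity, and the occupancy grid is assumed to be built elsewhere from the PDB atom coordinates.

```python
import numpy as np

def psp_events_along_x(occupied):
    """occupied: 3D boolean grid of protein-occupied cells.

    Returns a boolean grid marking empty cells that lie between protein
    cells along the x axis (one protein-solvent-protein scan direction).
    """
    cavity = np.zeros_like(occupied, dtype=bool)
    for y in range(occupied.shape[1]):
        for z in range(occupied.shape[2]):
            column = occupied[:, y, z]
            filled = np.flatnonzero(column)
            if len(filled) >= 2:
                lo, hi = filled[0], filled[-1]
                # Empty cells strictly between the first and last protein cell.
                cavity[lo:hi + 1, y, z] = ~column[lo:hi + 1]
    return cavity

# Toy 5x1x1 grid: protein at both ends, candidate cavity cells in between.
grid = np.zeros((5, 1, 1), dtype=bool)
grid[0], grid[4] = True, True
buried = psp_events_along_x(grid)   # flags cells 1..3
```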