Búsqueda | Portal Regional de la BVS

1.

Generative Artificial Intelligence: Enhancing Patient Education in Cardiovascular Imaging.

Marey, Ahmed; Saad, Abdelrahman M; Killeen, Benjamin D; Gomez, Catalina; Tregubova, Mariia; Unberath, Mathias; Umair, Muhammad.

BJR Open ; 6(1): tzae018, 2024 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-39086557

RESUMEN

Cardiovascular disease (CVD) is a major cause of mortality worldwide, especially in resource-limited countries with limited access to healthcare resources. Early detection and accurate imaging are vital for managing CVD, emphasizing the significance of patient education. Generative artificial intelligence (AI), including algorithms to synthesize text, speech, images, and combinations thereof given a specific scenario or prompt, offers promising solutions for enhancing patient education. By combining vision and language models, generative AI enables personalized multimedia content generation through natural language interactions, benefiting patient education in cardiovascular imaging. Simulations, chat-based interactions, and voice-based interfaces can enhance accessibility, especially in resource-limited settings. Despite its potential benefits, implementing generative AI in resource-limited countries faces challenges like data quality, infrastructure limitations, and ethical considerations. Addressing these issues is crucial for successful adoption. Ethical challenges related to data privacy and accuracy must also be overcome to ensure better patient understanding, treatment adherence, and improved healthcare outcomes. Continued research, innovation, and collaboration in generative AI have the potential to revolutionize patient education. This can empower patients to make informed decisions about their cardiovascular health, ultimately improving healthcare outcomes in resource-limited settings.

2.

Interpretable Severity Scoring of Pelvic Trauma Through Automated Fracture Detection and Bayesian Inference.

Chen, Haomin; Dreizin, David; Gomez, Catalina; Zapaishchykova, Anna; Unberath, Mathias.

IEEE Trans Med Imaging ; PP2024 Jul 22.

Artículo en Inglés | MEDLINE | ID: mdl-39037876

RESUMEN

Pelvic ring disruptions result from blunt injury mechanisms and are potentially lethal mainly due to associated injuries and massive pelvic hemorrhage. The severity of pelvic fractures in trauma victims is frequently assessed by grading the fracture according to the Tile AO/OTA classification in whole-body Computed Tomography (CT) scans. Due to the high volume of whole-body CT scans generated in trauma centers, the overall information content of a single whole-body CT scan and low manual CT reading speed, an automatic approach to Tile classification would provide substantial value, e. g., to prioritize the reading sequence of the trauma radiologists or enable them to focus on other major injuries in multi-trauma patients. In such a high-stakes scenario, an automated method for Tile grading should ideally be transparent such that the symbolic information provided by the method follows the same logic a radiologist or orthopedic surgeon would use to determine the fracture grade. This paper introduces an automated yet interpretable pelvic trauma decision support system to assist radiologists in fracture detection and Tile grading. To achieve interpretability despite processing high-dimensional whole-body CT images, we design a neurosymbolic algorithm that operates similarly to human interpretation of CT scans. The algorithm first detects relevant pelvic fractures on CTs with high specificity using Faster-RCNN. To generate robust fracture detections and associated detection (un)certainties, we perform test-time augmentation of the CT scans to apply fracture detection several times in a self-ensembling approach. The fracture detections are interpreted using a structural causal model based on clinical best practices to infer an initial Tile grade. We apply a Bayesian causal model to recover likely co-occurring fractures that may have been rejected initially due to the highly specific operating point of the detector, resulting in an updated list of detected fractures and corresponding final Tile grade. Our method is transparent in that it provides fracture location and types, as well as information on important counterfactuals that would invalidate the system's recommendation. Our approach achieves an AUC of 0.89/0.74 for translational and rotational instability,which is comparable to radiologist performance. Despite being designed for human-machine teaming, our approach does not compromise on performance compared to previous black-box methods.

3.

Explainable AI decision support improves accuracy during telehealth strep throat screening.

Gomez, Catalina; Smith, Brittany-Lee; Zayas, Alisa; Unberath, Mathias; Canares, Therese.

Commun Med (Lond) ; 4(1): 149, 2024 Jul 24.

Artículo en Inglés | MEDLINE | ID: mdl-39048726

RESUMEN

BACKGROUND: Artificial intelligence-based (AI) clinical decision support systems (CDSS) using unconventional data, like smartphone-acquired images, promise transformational opportunities for telehealth; including remote diagnosis. Although such solutions' potential remains largely untapped, providers' trust and understanding are vital for effective adoption. This study examines how different human-AI interaction paradigms affect clinicians' responses to an emerging AI CDSS for streptococcal pharyngitis (strep throat) detection from smartphone throat images. METHODS: In a randomized experiment, we tested explainable AI strategies using three AI-based CDSS prototypes for strep throat prediction. Participants received clinical vignettes via an online survey to predict the disease state and offer clinical recommendations. The first set included a validated CDSS prediction (Modified Centor Score) and the second introduced an explainable AI prototype randomly. We used linear models to assess explainable AI's effect on clinicians' accuracy, confirmatory testing rates, and perceived trust and understanding of the CDSS. RESULTS: The study, involving 121 telehealth providers, shows that compared to using the Centor Score, AI-based CDSS can improve clinicians' predictions. Despite higher agreement with AI, participants report lower trust in its advice than in the Centor Score, leading to more requests for in-person confirmatory testing. CONCLUSIONS: Effectively integrating AI is crucial in the telehealth-based diagnosis of infectious diseases, given the implications of antibiotic over-prescriptions. We demonstrate that AI-based CDSS can improve the accuracy of remote strep throat screening yet underscores the necessity to enhance human-machine collaboration, particularly in trust and intelligibility. This ensures providers and patients can capitalize on AI interventions and smartphones for virtual healthcare.

Strep pharyngitis, or strep throat, is a bacterial infection that can cause a sore throat. Artificial intelligence (AI) can use photos taken on a person's phone to help diagnose strep throat, offering an additional way for doctors to screen patients during virtual appointments. However, it is currently unclear whether doctors will trust AI recommendations or how they might use them in decision-making. We surveyed clinicians about their use of an AI system for strep throat screening with smartphone images. We compared different ways of providing AI recommendations to standard medical guidelines. We found that all tested AI methods helped clinicians to identify strep throat cases. However, clinicians trusted AI less than their usual clinical guidelines, leading to more requests for follow-up in-person testing. Our results show how AI may improve the accuracy of pharyngitis assessment. Still, further research is needed to ensure doctors trust and collaborate with AI to improve remote healthcare.

4.

Vessel-targeted compensation of deformable motion in interventional cone-beam CT.

Lu, Alexander; Huang, Heyuan; Hu, Yicheng; Zbijewski, Wojciech; Unberath, Mathias; Siewerdsen, Jeffrey H; Weiss, Clifford R; Sisniega, Alejandro.

Med Image Anal ; 97: 103254, 2024 Jun 26.

Artículo en Inglés | MEDLINE | ID: mdl-38968908

RESUMEN

The present standard of care for unresectable liver cancer is transarterial chemoembolization (TACE), which involves using chemotherapeutic particles to selectively embolize the arteries supplying hepatic tumors. Accurate volumetric identification of intricate fine vascularity is crucial for selective embolization. Three-dimensional imaging, particularly cone-beam CT (CBCT), aids in visualization and targeting of small vessels in such highly variable anatomy, but long image acquisition time results in intra-scan patient motion, which distorts vascular structures and tissue boundaries. To improve clarity of vascular anatomy and intra-procedural utility, this work proposes a targeted motion estimation and compensation framework that removes the need for any prior information or external tracking and for user interaction. Motion estimation is performed in two stages: (i) a target identification stage that segments arteries and catheters in the projection domain using a multi-view convolutional neural network to construct a coarse 3D vascular mask; and (ii) a targeted motion estimation stage that iteratively solves for the time-varying motion field via optimization of a vessel-enhancing objective function computed over the target vascular mask. The vessel-enhancing objective is derived through eigenvalues of the local image Hessian to emphasize bright tubular structures. Motion compensation is achieved via spatial transformer operators that apply time-dependent deformations to partial angle reconstructions, allowing efficient minimization via gradient backpropagation. The framework was trained and evaluated in anatomically realistic simulated motion-corrupted CBCTs mimicking TACE of hepatic tumors, at intermediate (3.0 mm) and large (6.0 mm) motion magnitudes. Motion compensation substantially improved median vascular DICE score (from 0.30 to 0.59 for large motion), image SSIM (from 0.77 to 0.93 for large motion), and vessel sharpness (0.189 mm-1 to 0.233 mm-1 for large motion) in simulated cases. Motion compensation also demonstrated increased vessel sharpness (0.188 mm-1 before to 0.205 mm-1 after) and reconstructed vessel length (median increased from 37.37 to 41.00 mm) on a clinical interventional CBCT. The proposed anatomy-aware motion compensation framework presented a promising approach for improving the utility of CBCT for intra-procedural vascular imaging, facilitating selective embolization procedures.

5.

Neural digital twins: reconstructing complex medical environments for spatial planning in virtual reality.

Kleinbeck, Constantin; Zhang, Han; Killeen, Benjamin D; Roth, Daniel; Unberath, Mathias.

Int J Comput Assist Radiol Surg ; 19(7): 1301-1312, 2024 Jul.

Artículo en Inglés | MEDLINE | ID: mdl-38709423

RESUMEN

PURPOSE: Specialized robotic and surgical tools are increasing the complexity of operating rooms (ORs), requiring elaborate preparation especially when techniques or devices are to be used for the first time. Spatial planning can improve efficiency and identify procedural obstacles ahead of time, but real ORs offer little availability to optimize space utilization. Methods for creating reconstructions of physical setups, i.e., digital twins, are needed to enable immersive spatial planning of such complex environments in virtual reality. METHODS: We present a neural rendering-based method to create immersive digital twins of complex medical environments and devices from casual video capture that enables spatial planning of surgical scenarios. To evaluate our approach we recreate two operating rooms and ten objects through neural reconstruction, then conduct a user study with 21 graduate students carrying out planning tasks in the resulting virtual environment. We analyze task load, presence, perceived utility, plus exploration and interaction behavior compared to low visual complexity versions of the same environments. RESULTS: Results show significantly increased perceived utility and presence using the neural reconstruction-based environments, combined with higher perceived workload and exploratory behavior. There's no significant difference in interactivity. CONCLUSION: We explore the feasibility of using modern reconstruction techniques to create digital twins of complex medical environments and objects. Without requiring expert knowledge or specialized hardware, users can create, explore and interact with objects in virtual environments. Results indicate benefits like high perceived utility while being technically approachable, which may indicate promise of this approach for spatial planning and beyond.

Asunto(s)

Quirófanos , Realidad Virtual , Humanos , Interfaz Usuario-Computador , Femenino , Masculino , Adulto , Estudios de Factibilidad , Procedimientos Quirúrgicos Robotizados/métodos

6.

Cognitive load in tele-robotic surgery: a comparison of eye tracker designs.

Soberanis-Mukul, Roger D; Puentes, Paola Ruiz; Acar, Ayberk; Gupta, Iris; Bhowmick, Joyraj; Li, Yizhou; Ghazi, Ahmed; Wu, Jie Ying; Unberath, Mathias.

Int J Comput Assist Radiol Surg ; 19(7): 1281-1284, 2024 Jul.

Artículo en Inglés | MEDLINE | ID: mdl-38704792

RESUMEN

PURPOSE: Eye gaze tracking and pupillometry are evolving areas within the field of tele-robotic surgery, particularly in the context of estimating cognitive load (CL). However, this is a recent field, and current solutions for gaze and pupil tracking in robotic surgery require assessment. Considering the necessity of stable pupillometry signals for reliable cognitive load estimation, we compare the accuracy of three eye trackers, including head and console-mounted designs. METHODS: We conducted a user study with the da Vinci Research Kit (dVRK), to compare the three designs. We collected eye tracking and dVRK video data while participants observed nine markers distributed over the dVRK screen. We compute and analyze pupil detection stability and gaze prediction accuracy for the three designs. RESULTS: Head-worn devices present better stability and accuracy of gaze prediction and pupil detection compared to console-mounted systems. Tracking stability along the field of view varies between trackers, with gaze predictions detected at invalid zones of the image with high confidence. CONCLUSION: While head-worn solutions show benefits in confidence and stability, our results demonstrate the need to improve eye tacker performance regarding pupil detection, stability, and gaze accuracy in tele-robotic scenarios.

Asunto(s)

Cognición , Tecnología de Seguimiento Ocular , Procedimientos Quirúrgicos Robotizados , Humanos , Cognición/fisiología , Procedimientos Quirúrgicos Robotizados/métodos , Masculino , Femenino , Adulto , Diseño de Equipo , Movimientos Oculares/fisiología , Telemedicina/instrumentación , Fijación Ocular/fisiología , Pupila/fisiología

7.

An endoscopic chisel: intraoperative imaging carves 3D anatomical models.

Mangulabnan, Jan Emily; Soberanis-Mukul, Roger D; Teufel, Timo; Sahu, Manish; Porras, Jose L; Vedula, S Swaroop; Ishii, Masaru; Hager, Gregory; Taylor, Russell H; Unberath, Mathias.

Int J Comput Assist Radiol Surg ; 19(7): 1359-1366, 2024 Jul.

Artículo en Inglés | MEDLINE | ID: mdl-38753135

RESUMEN

PURPOSE: Preoperative imaging plays a pivotal role in sinus surgery where CTs offer patient-specific insights of complex anatomy, enabling real-time intraoperative navigation to complement endoscopy imaging. However, surgery elicits anatomical changes not represented in the preoperative model, generating an inaccurate basis for navigation during surgery progression. METHODS: We propose a first vision-based approach to update the preoperative 3D anatomical model leveraging intraoperative endoscopic video for navigated sinus surgery where relative camera poses are known. We rely on comparisons of intraoperative monocular depth estimates and preoperative depth renders to identify modified regions. The new depths are integrated in these regions through volumetric fusion in a truncated signed distance function representation to generate an intraoperative 3D model that reflects tissue manipulation RESULTS: We quantitatively evaluate our approach by sequentially updating models for a five-step surgical progression in an ex vivo specimen. We compute the error between correspondences from the updated model and ground-truth intraoperative CT in the region of anatomical modification. The resulting models show a decrease in error during surgical progression as opposed to increasing when no update is employed. CONCLUSION: Our findings suggest that preoperative 3D anatomical models can be updated using intraoperative endoscopy video in navigated sinus surgery. Future work will investigate improvements to monocular depth estimation as well as removing the need for external navigation systems. The resulting ability to continuously update the patient model may provide surgeons with a more precise understanding of the current anatomical state and paves the way toward a digital twin paradigm for sinus surgery.

Asunto(s)

Endoscopía , Imagenología Tridimensional , Modelos Anatómicos , Cirugía Asistida por Computador , Tomografía Computarizada por Rayos X , Imagenología Tridimensional/métodos , Humanos , Endoscopía/métodos , Tomografía Computarizada por Rayos X/métodos , Cirugía Asistida por Computador/métodos , Senos Paranasales/cirugía , Senos Paranasales/diagnóstico por imagen

8.

OneSLAM to map them all: a generalized approach to SLAM for monocular endoscopic imaging based on tracking any point.

Teufel, Timo; Shu, Hongchao; Soberanis-Mukul, Roger D; Mangulabnan, Jan Emily; Sahu, Manish; Vedula, S Swaroop; Ishii, Masaru; Hager, Gregory; Taylor, Russell H; Unberath, Mathias.

Int J Comput Assist Radiol Surg ; 19(7): 1259-1266, 2024 Jul.

Artículo en Inglés | MEDLINE | ID: mdl-38775904

RESUMEN

PURPOSE: Monocular SLAM algorithms are the key enabling technology for image-based surgical navigation systems for endoscopic procedures. Due to the visual feature scarcity and unique lighting conditions encountered in endoscopy, classical SLAM approaches perform inconsistently. Many of the recent approaches to endoscopic SLAM rely on deep learning models. They show promising results when optimized on singular domains such as arthroscopy, sinus endoscopy, colonoscopy or laparoscopy, but are limited by an inability to generalize to different domains without retraining. METHODS: To address this generality issue, we propose OneSLAM a monocular SLAM algorithm for surgical endoscopy that works out of the box for several endoscopic domains, including sinus endoscopy, colonoscopy, arthroscopy and laparoscopy. Our pipeline builds upon robust tracking any point (TAP) foundation models to reliably track sparse correspondences across multiple frames and runs local bundle adjustment to jointly optimize camera poses and a sparse 3D reconstruction of the anatomy. RESULTS: We compare the performance of our method against three strong baselines previously proposed for monocular SLAM in endoscopy and general scenes. OneSLAM presents better or comparable performance over existing approaches targeted to that specific data in all four tested domains, generalizing across domains without the need for retraining. CONCLUSION: OneSLAM benefits from the convincing performance of TAP foundation models but generalizes to endoscopic sequences of different anatomies all while demonstrating better or comparable performance over domain-specific SLAM approaches. Future research on global loop closure will investigate how to reliably detect loops in endoscopic scenes to reduce accumulated drift and enhance long-term navigation capabilities.

Asunto(s)

Algoritmos , Endoscopía , Humanos , Endoscopía/métodos , Imagenología Tridimensional/métodos , Cirugía Asistida por Computador/métodos , Procesamiento de Imagen Asistido por Computador/métodos

9.

IJCARS: IPCAI 2024 special issue-15th international conference on information processing in computer-assisted interventions 2024-part 1.

Collins, Toby; Moccia, Sara; Unberath, Mathias.

Int J Comput Assist Radiol Surg ; 19(6): 983-984, 2024 Jun.

Artículo en Inglés | MEDLINE | ID: mdl-38814527

Asunto(s)

Cirugía Asistida por Computador , Humanos , Cirugía Asistida por Computador/métodos , Congresos como Asunto , Procesamiento de Imagen Asistido por Computador/métodos

10.

Deformable motion compensation in interventional cone-beam CT with a context-aware learned autofocus metric.

Huang, Heyuan; Liu, Yixuan; Siewerdsen, Jeffrey H; Lu, Alexander; Hu, Yicheng; Zbijewski, Wojciech; Unberath, Mathias; Weiss, Clifford R; Sisniega, Alejandro.

Med Phys ; 51(6): 4158-4180, 2024 Jun.

Artículo en Inglés | MEDLINE | ID: mdl-38733602

RESUMEN

PURPOSE: Interventional Cone-Beam CT (CBCT) offers 3D visualization of soft-tissue and vascular anatomy, enabling 3D guidance of abdominal interventions. However, its long acquisition time makes CBCT susceptible to patient motion. Image-based autofocus offers a suitable platform for compensation of deformable motion in CBCT, but it relies on handcrafted motion metrics based on first-order image properties and that lack awareness of the underlying anatomy. This work proposes a data-driven approach to motion quantification via a learned, context-aware, deformable metric, VI F D L ${\bm{VI}}{{\bm{F}}}_{DL}$ , that quantifies the amount of motion degradation as well as the realism of the structural anatomical content in the image. METHODS: The proposed VI F D L ${\bm{VI}}{{\bm{F}}}_{DL}$ was modeled as a deep convolutional neural network (CNN) trained to recreate a reference-based structural similarity metric-visual information fidelity (VIF). The deep CNN acted on motion-corrupted images, providing an estimation of the spatial VIF map that would be obtained against a motion-free reference, capturing motion distortion, and anatomic plausibility. The deep CNN featured a multi-branch architecture with a high-resolution branch for estimation of voxel-wise VIF on a small volume of interest. A second contextual, low-resolution branch provided features associated to anatomical context for disentanglement of motion effects and anatomical appearance. The deep CNN was trained on paired motion-free and motion-corrupted data obtained with a high-fidelity forward projection model for a protocol involving 120 kV and 9.90 mGy. The performance of VI F D L ${\bm{VI}}{{\bm{F}}}_{DL}$ was evaluated via metrics of correlation with ground truth VIF ${\bm{VIF}}$ and with the underlying deformable motion field in simulated data with deformable motion fields with amplitude ranging from 5 to 20 mm and frequency from 2.4 up to 4 cycles/scan. Robustness to variation in tissue contrast and noise levels was assessed in simulation studies with varying beam energy (90-120 kV) and dose (1.19-39.59 mGy). Further validation was obtained on experimental studies with a deformable phantom. Final validation was obtained via integration of VI F D L ${\bm{VI}}{{\bm{F}}}_{DL}$ on an autofocus compensation framework, applied to motion compensation on experimental datasets and evaluated via metric of spatial resolution on soft-tissue boundaries and sharpness of contrast-enhanced vascularity. RESULTS: The magnitude and spatial map of VI F D L ${\bm{VI}}{{\bm{F}}}_{DL}$ showed consistent and high correlation levels with the ground truth in both simulation and real data, yielding average normalized cross correlation (NCC) values of 0.95 and 0.88, respectively. Similarly, VI F D L ${\bm{VI}}{{\bm{F}}}_{DL}$ achieved good correlation values with the underlying motion field, with average NCC of 0.90. In experimental phantom studies, VI F D L ${\bm{VI}}{{\bm{F}}}_{DL}$ properly reflects the change in motion amplitudes and frequencies: voxel-wise averaging of the local VI F D L ${\bm{VI}}{{\bm{F}}}_{DL}$ across the full reconstructed volume yielded an average value of 0.69 for the case with mild motion (2 mm, 12 cycles/scan) and 0.29 for the case with severe motion (12 mm, 6 cycles/scan). Autofocus motion compensation using VI F D L ${\bm{VI}}{{\bm{F}}}_{DL}$ resulted in noticeable mitigation of motion artifacts and improved spatial resolution of soft tissue and high-contrast structures, resulting in reduction of edge spread function width of 8.78% and 9.20%, respectively. Motion compensation also increased the conspicuity of contrast-enhanced vascularity, reflected in an increase of 9.64% in vessel sharpness. CONCLUSION: The proposed VI F D L ${\bm{VI}}{{\bm{F}}}_{DL}$ , featuring a novel context-aware architecture, demonstrated its capacity as a reference-free surrogate of structural similarity to quantify motion-induced degradation of image quality and anatomical plausibility of image content. The validation studies showed robust performance across motion patterns, x-ray techniques, and anatomical instances. The proposed anatomy- and context-aware metric poses a powerful alternative to conventional motion estimation metrics, and a step forward for application of deep autofocus motion compensation for guidance in clinical interventional procedures.

Asunto(s)

Tomografía Computarizada de Haz Cónico , Procesamiento de Imagen Asistido por Computador , Movimiento , Tomografía Computarizada de Haz Cónico/métodos , Procesamiento de Imagen Asistido por Computador/métodos , Humanos

11.

MoViT: Memorizing Vision Transformers for Medical Image Analysis.

Shen, Yiqing; Guo, Pengfei; Wu, Jingpu; Huang, Qianqi; Le, Nhat; Zhou, Jinyuan; Jiang, Shanshan; Unberath, Mathias.

Mach Learn Med Imaging ; 14349: 205-213, 2024.

Artículo en Inglés | MEDLINE | ID: mdl-38617846

RESUMEN

The synergy of long-range dependencies from transformers and local representations of image content from convolutional neural networks (CNNs) has led to advanced architectures and increased performance for various medical image analysis tasks due to their complementary benefits. However, compared with CNNs, transformers require considerably more training data, due to a larger number of parameters and an absence of inductive bias. The need for increasingly large datasets continues to be problematic, particularly in the context of medical imaging, where both annotation efforts and data protection result in limited data availability. In this work, inspired by the human decision-making process of correlating new "evidence" with previously memorized "experience", we propose a Memorizing Vision Transformer (MoViT) to alleviate the need for large-scale datasets to successfully train and deploy transformer-based architectures. MoViT leverages an external memory structure to cache history attention snapshots during the training stage. To prevent overfitting, we incorporate an innovative memory update scheme, attention temporal moving average, to update the stored external memories with the historical moving average. For inference speedup, we design a prototypical attention learning method to distill the external memory into smaller representative subsets. We evaluate our method on a public histology image dataset and an in-house MRI dataset, demonstrating that MoViT applied to varied medical image analysis tasks, can outperform vanilla transformer models across varied data regimes, especially in cases where only a small amount of annotated data is available. More importantly, MoViT can reach a competitive performance of ViT with only 3.0% of the training data. In conclusion, MoViT provides a simple plug-in for transformer architectures which may contribute to reducing the training data needed to achieve acceptable models for a broad range of medical image analysis tasks.

12.

A Deep Learning Framework for Analysis of the Eustachian Tube and the Internal Carotid Artery.

Amanian, Ameen; Jain, Aseem; Xiao, Yuliang; Kim, Chanha; Ding, Andy S; Sahu, Manish; Taylor, Russell; Unberath, Mathias; Ward, Bryan K; Galaiya, Deepa; Ishii, Masaru; Creighton, Francis X.

Otolaryngol Head Neck Surg ; 171(3): 731-739, 2024 Sep.

Artículo en Inglés | MEDLINE | ID: mdl-38686594

RESUMEN

OBJECTIVE: Obtaining automated, objective 3-dimensional (3D) models of the Eustachian tube (ET) and the internal carotid artery (ICA) from computed tomography (CT) scans could provide useful navigational and diagnostic information for ET pathologies and interventions. We aim to develop a deep learning (DL) pipeline to automatically segment the ET and ICA and use these segmentations to compute distances between these structures. STUDY DESIGN: Retrospective cohort. SETTING: Tertiary referral center. METHODS: From a database of 30 CT scans, 60 ET and ICA pairs were manually segmented and used to train an nnU-Net model, a DL segmentation framework. These segmentations were also used to develop a quantitative tool to capture the magnitude and location of the minimum distance point (MDP) between ET and ICA. Performance metrics for the nnU-Net automated segmentations were calculated via the average Hausdorff distance (AHD) and dice similarity coefficient (DSC). RESULTS: The AHD for the ET and ICA were 0.922 and 0.246 mm, respectively. Similarly, the DSC values for the ET and ICA were 0.578 and 0.884. The mean MDP from ET to ICA in the cartilaginous region was 2.6 mm (0.7-5.3 mm) and was located on average 1.9 mm caudal from the bony cartilaginous junction. CONCLUSION: This study describes the first end-to-end DL pipeline for automated ET and ICA segmentation and analyzes distances between these structures. In addition to helping to ensure the safe selection of patients for ET dilation, this method can facilitate large-scale studies exploring the relationship between ET pathologies and the 3D shape of the ET.

Asunto(s)

Arteria Carótida Interna , Aprendizaje Profundo , Trompa Auditiva , Imagenología Tridimensional , Tomografía Computarizada por Rayos X , Humanos , Trompa Auditiva/diagnóstico por imagen , Trompa Auditiva/anatomía & histología , Estudios Retrospectivos , Arteria Carótida Interna/diagnóstico por imagen , Arteria Carótida Interna/anatomía & histología , Femenino , Masculino , Persona de Mediana Edad , Adulto

13.

Take a shot! Natural language control of intelligent robotic X-ray systems in surgery.

Killeen, Benjamin D; Chaudhary, Shreayan; Osgood, Greg; Unberath, Mathias.

Int J Comput Assist Radiol Surg ; 19(6): 1165-1173, 2024 Jun.

Artículo en Inglés | MEDLINE | ID: mdl-38619790

RESUMEN

PURPOSE: The expanding capabilities of surgical systems bring with them increasing complexity in the interfaces that humans use to control them. Robotic C-arm X-ray imaging systems, for instance, often require manipulation of independent axes via joysticks, while higher-level control options hide inside device-specific menus. The complexity of these interfaces hinder "ready-to-hand" use of high-level functions. Natural language offers a flexible, familiar interface for surgeons to express their desired outcome rather than remembering the steps necessary to achieve it, enabling direct access to task-aware, patient-specific C-arm functionality. METHODS: We present an English language voice interface for controlling a robotic X-ray imaging system with task-aware functions for pelvic trauma surgery. Our fully integrated system uses a large language model (LLM) to convert natural spoken commands into machine-readable instructions, enabling low-level commands like "Tilt back a bit," to increase the angular tilt or patient-specific directions like, "Go to the obturator oblique view of the right ramus," based on automated image analysis. RESULTS: We evaluate our system with 212 prompts provided by an attending physician, in which the system performed satisfactory actions 97% of the time. To test the fully integrated system, we conduct a real-time study in which an attending physician placed orthopedic hardware along desired trajectories through an anthropomorphic phantom, interacting solely with an X-ray system via voice. CONCLUSION: Voice interfaces offer a convenient, flexible way for surgeons to manipulate C-arms based on desired outcomes rather than device-specific processes. As LLMs grow increasingly capable, so too will their applications in supporting higher-level interactions with surgical assistance systems.

Asunto(s)

Procedimientos Quirúrgicos Robotizados , Humanos , Procedimientos Quirúrgicos Robotizados/métodos , Procedimientos Quirúrgicos Robotizados/instrumentación , Interfaz Usuario-Computador , Pelvis/cirugía , Procesamiento de Lenguaje Natural

14.

Stand in surgeon's shoes: virtual reality cross-training to enhance teamwork in surgery.

Killeen, Benjamin D; Zhang, Han; Wang, Liam J; Liu, Zixuan; Kleinbeck, Constantin; Rosen, Michael; Taylor, Russell H; Osgood, Greg; Unberath, Mathias.

Int J Comput Assist Radiol Surg ; 19(6): 1213-1222, 2024 Jun.

Artículo en Inglés | MEDLINE | ID: mdl-38642297

RESUMEN

PURPOSE: Teamwork in surgery depends on a shared mental model of success, i.e., a common understanding of objectives in the operating room. A shared model leads to increased engagement among team members and is associated with fewer complications and overall better outcomes for patients. However, clinical training typically focuses on role-specific skills, leaving individuals to acquire a shared model indirectly through on-the-job experience. METHODS: We investigate whether virtual reality (VR) cross-training, i.elet@tokeneonedotexposure to other roles, can enhance a shared mental model for non-surgeons more directly. Our study focuses on X-ray guided pelvic trauma surgery, a procedure where successful communication depends on the shared model between the surgeon and a C-arm technologist. We present a VR environment supporting both roles and evaluate a cross-training curriculum in which non-surgeons swap roles with the surgeon. RESULTS: Exposure to the surgical task resulted in higher engagement with the C-arm technologist role in VR, as measured by the mental demand and effort expended by participants ( p < 0.001 ). It also has a significant effect on non-surgeon's mental model of the overall task; novice participants' estimation of the mental demand and effort required for the surgeon's task increases after training, while their perception of overall performance decreases ( p < 0.05 ), indicating a gap in understanding based solely on observation. This phenomenon was also present for a professional C-arm technologist. CONCLUSION: Until now, VR applications for clinical training have focused on virtualizing existing curricula. We demonstrate how novel approaches which are not possible outside of a virtual environment, such as role swapping, may enhance the shared mental model of surgical teams by contextualizing each individual's role within the overall task in a time- and cost-efficient manner. As workflows grow increasingly sophisticated, we see VR curricula as being able to directly foster a shared model for success, ultimately benefiting patient outcomes through more effective teamwork in surgery.

Asunto(s)

Grupo de Atención al Paciente , Realidad Virtual , Humanos , Femenino , Masculino , Curriculum , Competencia Clínica , Adulto , Cirugía Asistida por Computador/métodos , Cirugía Asistida por Computador/educación , Cirujanos/educación , Cirujanos/psicología

15.

Cognitive effort detection for tele-robotic surgery via personalized pupil response modeling.

Büter, Regine; Soberanis-Mukul, Roger D; Shankar, Rohit; Ruiz Puentes, Paola; Ghazi, Ahmed; Wu, Jie Ying; Unberath, Mathias.

Int J Comput Assist Radiol Surg ; 19(6): 1113-1120, 2024 Jun.

Artículo en Inglés | MEDLINE | ID: mdl-38589579

RESUMEN

PURPOSE: Gaze tracking and pupillometry are established proxies for cognitive load, giving insights into a user's mental effort. In tele-robotic surgery, knowing a user's cognitive load can inspire novel human-machine interaction designs, fostering contextual surgical assistance systems and personalized training programs. While pupillometry-based methods for estimating cognitive effort have been proposed, their application in surgery is limited by the pupil's sensitivity to brightness changes, which can mask pupil's response to cognitive load. Thus, methods considering pupil and brightness conditions are essential for detecting cognitive effort in unconstrained scenarios. METHODS: To contend with this challenge, we introduce a personalized pupil response model integrating pupil and brightness-based features. Discrepancies between predicted and measured pupil diameter indicate dilations due to non-brightness-related sources, i.e., cognitive effort. Combined with gaze entropy, it can detect cognitive load using a random forest classifier. To test our model, we perform a user study with the da Vinci Research Kit, where 17 users perform pick-and-place tasks in addition to auditory tasks known to generate cognitive effort responses. RESULTS: We compare our method to two baselines (BCPD and CPD), demonstrating favorable performance in varying brightness conditions. Our method achieves an average true positive rate of 0.78, outperforming the baselines (0.57 and 0.64). CONCLUSION: We present a personalized brightness-aware model for cognitive effort detection able to operate under unconstrained brightness conditions, comparing favorably to competing approaches, contributing to the advancement of cognitive effort detection in tele-robotic surgery. Future work will consider alternative learning strategies, handling the difficult positive-unlabeled scenario in user studies, where only some positive and no negative events are reliably known.

Asunto(s)

Cognición , Pupila , Procedimientos Quirúrgicos Robotizados , Humanos , Pupila/fisiología , Cognición/fisiología , Procedimientos Quirúrgicos Robotizados/métodos , Telemedicina , Masculino , Adulto , Femenino

16.

An ASER AI/ML expert panel formative user research study for an interpretable interactive splenic AAST grading graphical user interface prototype.

Sarkar, Nathan; Kumagai, Mitsuo; Meyr, Samantha; Pothapragada, Sriya; Unberath, Mathias; Li, Guang; Ahmed, Sagheer Rauf; Smith, Elana Beth; Davis, Melissa Ann; Khatri, Garvit Devmohan; Agrawal, Anjali; Delproposto, Zachary Scott; Chen, Haomin; Caballero, Catalina Gómez; Dreizin, David.

Emerg Radiol ; 31(2): 167-178, 2024 Apr.

Artículo en Inglés | MEDLINE | ID: mdl-38302827

RESUMEN

PURPOSE: The AAST Organ Injury Scale is widely adopted for splenic injury severity but suffers from only moderate inter-rater agreement. This work assesses SpleenPro, a prototype interactive explainable artificial intelligence/machine learning (AI/ML) diagnostic aid to support AAST grading, for effects on radiologist dwell time, agreement, clinical utility, and user acceptance. METHODS: Two trauma radiology ad hoc expert panelists independently performed timed AAST grading on 76 admission CT studies with blunt splenic injury, first without AI/ML assistance, and after a 2-month washout period and randomization, with AI/ML assistance. To evaluate user acceptance, three versions of the SpleenPro user interface with increasing explainability were presented to four independent expert panelists with four example cases each. A structured interview consisting of Likert scales and free responses was conducted, with specific questions regarding dimensions of diagnostic utility (DU); mental support (MS); effort, workload, and frustration (EWF); trust and reliability (TR); and likelihood of future use (LFU). RESULTS: SpleenPro significantly decreased interpretation times for both raters. Weighted Cohen's kappa increased from 0.53 to 0.70 with AI/ML assistance. During user acceptance interviews, increasing explainability was associated with improvement in Likert scores for MS, EWF, TR, and LFU. Expert panelists indicated the need for a combined early notification and grading functionality, PACS integration, and report autopopulation to improve DU. CONCLUSIONS: SpleenPro was useful for improving objectivity of AAST grading and increasing mental support. Formative user research identified generalizable concepts including the need for a combined detection and grading pipeline and integration with the clinical workflow.

Asunto(s)

Tomografía Computarizada por Rayos X , Heridas no Penetrantes , Humanos , Tomografía Computarizada por Rayos X/métodos , Inteligencia Artificial , Reproducibilidad de los Resultados , Aprendizaje Automático

17.

Artificial Intelligence and Technology Collaboratories: Innovating aging research and Alzheimer's care.

Abadir, Peter; Oh, Esther; Chellappa, Rama; Choudhry, Niteesh; Demiris, George; Ganesan, Deepak; Karlawish, Jason; Marlin, Benjamin; Li, Rose M; Dehak, Najim; Arbaje, Alicia; Unberath, Mathias; Cudjoe, Thomas; Chute, Christopher; Moore, Jason H; Phan, Phillip; Samus, Quincy; Schoenborn, Nancy L; Battle, Alexis; Walston, Jeremy D.

Alzheimers Dement ; 20(4): 3074-3079, 2024 04.

Artículo en Inglés | MEDLINE | ID: mdl-38324244

RESUMEN

This perspective outlines the Artificial Intelligence and Technology Collaboratories (AITC) at Johns Hopkins University, University of Pennsylvania, and University of Massachusetts, highlighting their roles in developing AI-based technologies for older adult care, particularly targeting Alzheimer's disease (AD). These National Institute on Aging (NIA) centers foster collaboration among clinicians, gerontologists, ethicists, business professionals, and engineers to create AI solutions. Key activities include identifying technology needs, stakeholder engagement, training, mentoring, data integration, and navigating ethical challenges. The objective is to apply these innovations effectively in real-world scenarios, including in rural settings. In addition, the AITC focuses on developing best practices for AI application in the care of older adults, facilitating pilot studies, and addressing ethical concerns related to technology development for older adults with cognitive impairment, with the ultimate aim of improving the lives of older adults and their caregivers. HIGHLIGHTS: Addressing the complex needs of older adults with Alzheimer's disease (AD) requires a comprehensive approach, integrating medical and social support. Current gaps in training, techniques, tools, and expertise hinder uniform access across communities and health care settings. Artificial intelligence (AI) and digital technologies hold promise in transforming care for this demographic. Yet, transitioning these innovations from concept to marketable products presents significant challenges, often stalling promising advancements in the developmental phase. The Artificial Intelligence and Technology Collaboratories (AITC) program, funded by the National Institute on Aging (NIA), presents a viable model. These Collaboratories foster the development and implementation of AI methods and technologies through projects aimed at improving care for older Americans, particularly those with AD, and promote the sharing of best practices in AI and technology integration. Why Does This Matter? The National Institute on Aging (NIA) Artificial Intelligence and Technology Collaboratories (AITC) program's mission is to accelerate the adoption of artificial intelligence (AI) and new technologies for the betterment of older adults, especially those with dementia. By bridging scientific and technological expertise, fostering clinical and industry partnerships, and enhancing the sharing of best practices, this program can significantly improve the health and quality of life for older adults with Alzheimer's disease (AD).

Asunto(s)

Enfermedad de Alzheimer , Isotiocianatos , Estados Unidos , Humanos , Anciano , Enfermedad de Alzheimer/terapia , Inteligencia Artificial , Gerociencia , Calidad de Vida , Tecnología

18.

Deep learning-based identification of eyes at risk for glaucoma surgery.

Wang, Ruolin; Bradley, Chris; Herbert, Patrick; Hou, Kaihua; Ramulu, Pradeep; Breininger, Katharina; Unberath, Mathias; Yohannan, Jithin.

Sci Rep ; 14(1): 599, 2024 01 05.

Artículo en Inglés | MEDLINE | ID: mdl-38182701

RESUMEN

To develop and evaluate the performance of a deep learning model (DLM) that predicts eyes at high risk of surgical intervention for uncontrolled glaucoma based on multimodal data from an initial ophthalmology visit. Longitudinal, observational, retrospective study. 4898 unique eyes from 4038 adult glaucoma or glaucoma-suspect patients who underwent surgery for uncontrolled glaucoma (trabeculectomy, tube shunt, xen, or diode surgery) between 2013 and 2021, or did not undergo glaucoma surgery but had 3 or more ophthalmology visits. We constructed a DLM to predict the occurrence of glaucoma surgery within various time horizons from a baseline visit. Model inputs included spatially oriented visual field (VF) and optical coherence tomography (OCT) data as well as clinical and demographic features. Separate DLMs with the same architecture were trained to predict the occurrence of surgery within 3 months, within 3-6 months, within 6 months-1 year, within 1-2 years, within 2-3 years, within 3-4 years, and within 4-5 years from the baseline visit. Included eyes were randomly split into 60%, 20%, and 20% for training, validation, and testing. DLM performance was measured using area under the receiver operating characteristic curve (AUC) and precision-recall curve (PRC). Shapley additive explanations (SHAP) were utilized to assess the importance of different features. Model prediction of surgery for uncontrolled glaucoma within 3 months had the best AUC of 0.92 (95% CI 0.88, 0.96). DLMs achieved clinically useful AUC values (> 0.8) for all models that predicted the occurrence of surgery within 3 years. According to SHAP analysis, all 7 models placed intraocular pressure (IOP) within the five most important features in predicting the occurrence of glaucoma surgery. Mean deviation (MD) and average retinal nerve fiber layer (RNFL) thickness were listed among the top 5 most important features by 6 of the 7 models. DLMs can successfully identify eyes requiring surgery for uncontrolled glaucoma within specific time horizons. Predictive performance decreases as the time horizon for forecasting surgery increases. Implementing prediction models in a clinical setting may help identify patients that should be referred to a glaucoma specialist for surgical evaluation.

Asunto(s)

Aprendizaje Profundo , Glaucoma , Oftalmología , Trabeculectomía , Adulto , Humanos , Estudios Retrospectivos , Glaucoma/cirugía , Retina

19.

Assessment of linear regression of peripapillary optical coherence tomography retinal nerve fiber layer measurements to forecast glacuoma trajectory.

Bradley, Chris; Hou, Kaihua; Herbert, Patrick; Unberath, Mathias; Hager, Greg; Boland, Michael V; Ramulu, Pradeep; Yohannan, Jithin.

PLoS One ; 19(1): e0296674, 2024.

Artículo en Inglés | MEDLINE | ID: mdl-38215176

RESUMEN

Linear regression of optical coherence tomography measurements of peripapillary retinal nerve fiber layer thickness is often used to detect glaucoma progression and forecast future disease course. However, current measurement frequencies suggest that clinicians often apply linear regression to a relatively small number of measurements (e.g., less than a handful). In this study, we estimate the accuracy of linear regression in predicting the next reliable measurement of average retinal nerve fiber layer thickness using Zeiss Cirrus optical coherence tomography measurements of average retinal nerve fiber layer thickness from a sample of 6,471 eyes with glaucoma or glaucoma-suspect status. Linear regression is compared to two null models: no glaucoma worsening, and worsening due to aging. Linear regression on the first M ≥ 2 measurements was significantly worse at predicting a reliable M+1st measurement for 2 ≤ M ≤ 6. This range was reduced to 2 ≤ M ≤ 5 when retinal nerve fiber layer thickness measurements were first "corrected" for scan quality. Simulations based on measurement frequencies in our sample-on average 393 ± 190 days between consecutive measurements-show that linear regression outperforms both null models when M ≥ 5 and the goal is to forecast moderate (75th percentile) worsening, and when M ≥ 3 for rapid (90th percentile) worsening. If linear regression is used to assess disease trajectory with a small number of measurements over short time periods (e.g., 1-2 years), as is often the case in clinical practice, the number of optical coherence tomography examinations needs to be increased.

Asunto(s)

Glaucoma , Tomografía de Coherencia Óptica , Humanos , Tomografía de Coherencia Óptica/métodos , Modelos Lineales , Células Ganglionares de la Retina , Glaucoma/diagnóstico por imagen , Fibras Nerviosas , Presión Intraocular

20.

Opportunities for Improving Glaucoma Clinical Trials via Deep Learning-Based Identification of Patients with Low Visual Field Variability.

Wang, Ruolin; Bradley, Chris; Herbert, Patrick; Hou, Kaihua; Hager, Gregory D; Breininger, Katharina; Unberath, Mathias; Ramulu, Pradeep; Yohannan, Jithin.

Ophthalmol Glaucoma ; 7(3): 222-231, 2024.

Artículo en Inglés | MEDLINE | ID: mdl-38296108

RESUMEN

PURPOSE: Develop and evaluate the performance of a deep learning model (DLM) that forecasts eyes with low future visual field (VF) variability, and study the impact of using this DLM on sample size requirements for neuroprotective trials. DESIGN: Retrospective cohort and simulation study. METHODS: We included 1 eye per patient with baseline reliable VFs, OCT, clinical measures (demographics, intraocular pressure, and visual acuity), and 5 subsequent reliable VFs to forecast VF variability using DLMs and perform sample size estimates. We estimated sample size for 3 groups of eyes: all eyes (AE), low variability eyes (LVE: the subset of AE with a standard deviation of mean deviation [MD] slope residuals in the bottom 25th percentile), and DLM-predicted low variability eyes (DLPE: the subset of AE predicted to be low variability by the DLM). Deep learning models using only baseline VF/OCT/clinical data as input (DLM1), or also using a second VF (DLM2) were constructed to predict low VF variability (DLPE1 and DLPE2, respectively). Data were split 60/10/30 into train/val/test. Clinical trial simulations were performed only on the test set. We estimated the sample size necessary to detect treatment effects of 20% to 50% in MD slope with 80% power. Power was defined as the percentage of simulated clinical trials where the MD slope was significantly worse from the control. Clinical trials were simulated with visits every 3 months with a total of 10 visits. RESULTS: A total of 2817 eyes were included in the analysis. Deep learning models 1 and 2 achieved an area under the receiver operating characteristic curve of 0.73 (95% confidence interval [CI]: 0.68, 0.76) and 0.82 (95% CI: 0.78, 0.85) in forecasting low VF variability. When compared with including AE, using DLPE1 and DLPE2 reduced sample size to achieve 80% power by 30% and 38% for 30% treatment effect, and 31% and 38% for 50% treatment effect. CONCLUSIONS: Deep learning models can forecast eyes with low VF variability using data from a single baseline clinical visit. This can reduce sample size requirements, and potentially reduce the burden of future glaucoma clinical trials. FINANCIAL DISCLOSURE(S): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.

Asunto(s)

Aprendizaje Profundo , Presión Intraocular , Campos Visuales , Humanos , Campos Visuales/fisiología , Estudios Retrospectivos , Presión Intraocular/fisiología , Femenino , Masculino , Ensayos Clínicos como Asunto , Glaucoma/fisiopatología , Glaucoma/diagnóstico , Agudeza Visual/fisiología , Anciano , Pruebas del Campo Visual/métodos , Persona de Mediana Edad , Tomografía de Coherencia Óptica/métodos

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA