Results 1 - 20 of 422
1.
IEEE Trans Med Imaging ; PP, 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39012730

ABSTRACT

Automatic report generation has arisen as a significant research area in computer-aided diagnosis, aiming to alleviate the burden on clinicians by generating reports automatically based on medical images. In this work, we propose a novel framework for automatic ultrasound report generation, leveraging a combination of unsupervised and supervised learning methods to aid the report generation process. Our framework incorporates unsupervised learning methods to extract potential knowledge from ultrasound text reports, serving as prior information to guide the model in aligning visual and textual features, thereby addressing the challenge of feature discrepancy. Additionally, we design a global semantic comparison mechanism to enhance the generation of more comprehensive and accurate medical reports. To enable the implementation of ultrasound report generation, we constructed three large-scale ultrasound image-text datasets from different organs for training and validation purposes. Extensive comparisons with other state-of-the-art approaches demonstrate superior performance across all three datasets. Code and datasets are available at this link.
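The global semantic comparison mechanism is not specified in detail here; as a rough illustration only, such an alignment between pooled image features and report embeddings could be realized as a contrastive cosine-similarity objective. The sketch below is a hypothetical PyTorch formulation (feature dimensions, names, and the temperature value are assumptions, not the authors' implementation).

```python
import torch
import torch.nn.functional as F

def global_semantic_loss(image_feats: torch.Tensor, report_feats: torch.Tensor,
                         temperature: float = 0.07) -> torch.Tensor:
    """Align pooled image features (B, D) with report embeddings (B, D) via a
    symmetric contrastive objective; matching pairs lie on the diagonal."""
    img = F.normalize(image_feats, dim=-1)
    txt = F.normalize(report_feats, dim=-1)
    logits = img @ txt.t() / temperature                      # (B, B) cosine similarities
    targets = torch.arange(img.size(0), device=img.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Example with random 512-dimensional features for a batch of 8 image-report pairs.
loss = global_semantic_loss(torch.randn(8, 512), torch.randn(8, 512))
```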

2.
IEEE Trans Med Imaging ; PP, 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38829753

ABSTRACT

Registering pre-operative modalities, such as magnetic resonance imaging or computed tomography, to ultrasound images is crucial for guiding clinicians during surgeries and biopsies. Recently, deep-learning approaches have been proposed to increase the speed and accuracy of this registration problem. However, all of these approaches need expensive supervision from the ultrasound domain. In this work, we propose a multitask generative framework that needs weak supervision only from the pre-operative imaging domain during training. To perform a deformable registration, the proposed framework translates a magnetic resonance image to the ultrasound domain while preserving the structural content. To demonstrate the efficacy of the proposed method, we tackle the registration problem of pre-operative 3D MR to transrectal ultrasonography images as necessary for targeted prostate biopsies. We use an in-house dataset of 600 patients, divided into 540 for training, 30 for validation, and the remaining 30 for testing. An expert manually segmented the prostate in both modalities for the validation and test sets to assess the performance of our framework. The proposed framework achieves a target registration error of 3.58 mm on expert-selected landmarks, a Dice score of 89.2%, and a 95th-percentile Hausdorff distance of 1.81 mm on the prostate masks of the test set. Our experiments demonstrate that the proposed generative model successfully translates magnetic resonance images into the ultrasound domain. The translated image contains the structural content and fine details due to an ultrasound-specific two-path design of the generative model. The proposed framework enables the training of learning-based registration methods when only weak supervision from the pre-operative domain is available.
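For reference, the evaluation metrics reported above (target registration error, Dice score, 95th-percentile Hausdorff distance) are commonly computed as in the minimal NumPy/SciPy sketch below; this is a generic illustration of the metrics, not the authors' evaluation code.

```python
import numpy as np
from scipy.spatial.distance import cdist

def target_registration_error(landmarks_fixed, landmarks_warped):
    """Mean Euclidean distance (mm) between corresponding landmark pairs, shape (N, 3)."""
    return np.linalg.norm(landmarks_fixed - landmarks_warped, axis=1).mean()

def dice_score(mask_a, mask_b):
    """Dice overlap between two boolean masks of identical shape."""
    inter = np.logical_and(mask_a, mask_b).sum()
    return 2.0 * inter / (mask_a.sum() + mask_b.sum())

def hausdorff_95(points_a, points_b):
    """Symmetric 95th-percentile Hausdorff distance between two surface point sets."""
    d = cdist(points_a, points_b)
    return max(np.percentile(d.min(axis=1), 95), np.percentile(d.min(axis=0), 95))
```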

3.
Article in English | MEDLINE | ID: mdl-38831175

ABSTRACT

PURPOSE: Acoustic signals can carry valuable information in medicine and specifically in surgery. While laparoscopy relies mainly on visual information, our goal is to develop the means to capture and process acoustic information during laparoscopic surgery. METHODS: To achieve this, we iteratively developed three prototypes that overcome the abdominal wall as a sound barrier and can be used with standard trocars. We evaluated them in terms of clinical applicability and sound transmission quality. Furthermore, we evaluated the applicability of each prototype for sound classification based on machine learning. RESULTS: Our prototypes for recording airborne sound from the intraperitoneal cavity represent a promising solution suitable for real-world clinical usage. All three prototypes fulfill our requirements in terms of clinical applicability (i.e., air-tightness, invasiveness, sterility) and show promising results regarding their acoustic characteristics and the associated results on ML-based sound classification. CONCLUSION: In summary, our prototypes for capturing acoustic information during laparoscopic surgeries integrate seamlessly with existing procedures and have the potential to augment the surgeon's perception. This advancement could change how surgeons interact with and understand the surgical field.
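The ML-based sound classification mentioned above could follow a standard audio-classification recipe, for example MFCC summary features with a random-forest classifier, as in the hedged sketch below; the feature choice, classifier, and the `recordings` structure are assumptions for illustration, not the authors' pipeline.

```python
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def mfcc_features(path: str, sr: int = 22050, n_mfcc: int = 13) -> np.ndarray:
    """Load a recording and summarize it by the mean and std of its MFCCs."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def train_classifier(recordings):
    """`recordings` is assumed to be a list of (wav_path, label) pairs."""
    X = np.stack([mfcc_features(p) for p, _ in recordings])
    y = np.array([label for _, label in recordings])
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X, y)
    return clf
```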

4.
EJNMMI Phys ; 11(1): 51, 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38922372

ABSTRACT

BACKGROUND: Dosimetry-based personalized therapy has shown clinical benefits, e.g., in liver selective internal radiation therapy (SIRT). Yet, there is no consensus about its introduction into clinical practice, mainly because Monte Carlo simulations (the gold standard for dosimetry) involve massive computation time. We addressed the problem of computation time and tested a patch-based approach for Monte Carlo simulations for internal dosimetry to improve parallelization. We introduce a physics-inspired cropping layout for patch-based MC dosimetry and compare it to cropping layouts from the literature as well as to dosimetry using organ S-values and dose kernels, taking whole-body Monte Carlo simulations as ground truth. This was evaluated in five patients receiving Yttrium-90 liver SIRT. RESULTS: The patch-based Monte Carlo approach yielded the closest results to the ground truth, making it a valid alternative to the conventional approach. Our physics-inspired cropping layout and mosaicking scheme yielded a voxel-wise error of < 2% compared to whole-body Monte Carlo in soft tissue, while requiring only ≈ 10% of the time. CONCLUSIONS: This work demonstrates the feasibility and accuracy of physics-inspired cropping layouts for patch-based Monte Carlo simulations.
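As a rough illustration of the patch-based idea, the sketch below splits a volume into overlapping patches and mosaics per-patch dose maps back together by averaging in the overlaps; the patch size, overlap, and blending are assumptions and do not reproduce the paper's physics-inspired cropping layout.

```python
import numpy as np

def split_into_patches(volume, patch=64, overlap=16):
    """Yield (slices, sub-volume) pairs covering a 3D array with overlapping patches."""
    step = patch - overlap
    for z in range(0, volume.shape[0], step):
        for y in range(0, volume.shape[1], step):
            for x in range(0, volume.shape[2], step):
                sl = (slice(z, min(z + patch, volume.shape[0])),
                      slice(y, min(y + patch, volume.shape[1])),
                      slice(x, min(x + patch, volume.shape[2])))
                yield sl, volume[sl]

def mosaic(dose_patches, shape):
    """Average overlapping per-patch dose maps back into a whole-body dose volume."""
    dose = np.zeros(shape)
    weight = np.zeros(shape)
    for sl, patch_dose in dose_patches:   # each item: (slices, dose simulated on that patch)
        dose[sl] += patch_dose
        weight[sl] += 1.0
    return dose / np.maximum(weight, 1e-8)
```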

5.
Int J Comput Assist Radiol Surg ; 19(7): 1339-1347, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38748052

ABSTRACT

PURPOSE: Ultrasound (US) imaging, while advantageous for its radiation-free nature, is challenging to interpret due to only partially visible organs and a lack of complete 3D information. While performing US-based diagnosis or investigation, medical professionals therefore create a mental map of the 3D anatomy. In this work, we aim to replicate this process and enhance the visual representation of anatomical structures. METHODS: We introduce a point cloud-based probabilistic deep learning (DL) method to complete occluded anatomical structures through 3D shape completion and choose US-based spine examinations as our application. To enable training, we generate synthetic 3D representations of partially occluded spinal views by mimicking US physics and accounting for inherent artifacts. RESULTS: The proposed model performs consistently on synthetic and patient data, with mean and median differences of 2.02 and 0.03 in Chamfer Distance (CD), respectively. Our ablation study demonstrates the importance of US physics-based data generation, reflected in large mean and median differences of 11.8 and 9.55 CD, respectively. Additionally, we demonstrate that anatomical landmarks, such as the spinous process (with a reconstruction CD of 4.73) and the facet joints (mean distance to ground truth (GT) of 4.96 mm), are preserved in the 3D completion. CONCLUSION: Our work establishes the feasibility of 3D shape completion for lumbar vertebrae, ensuring the preservation of level-wise characteristics and successful generalization from synthetic to real data. The incorporation of US physics contributes to more accurate patient data completions. Notably, our method preserves essential anatomical landmarks and reconstructs crucial injection sites at their correct locations.
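The Chamfer Distance (CD) used as the evaluation metric above can be computed as the symmetric mean nearest-neighbor distance between two point clouds; a minimal NumPy/SciPy sketch follows (note that CD conventions vary, e.g., squared vs. non-squared distances, sum vs. mean).

```python
import numpy as np
from scipy.spatial.distance import cdist

def chamfer_distance(pc_a: np.ndarray, pc_b: np.ndarray) -> float:
    """Symmetric Chamfer Distance between point clouds of shape (N, 3) and (M, 3)."""
    d = cdist(pc_a, pc_b)                        # (N, M) pairwise Euclidean distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```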


Subject(s)
Deep Learning , Imaging, Three-Dimensional , Ultrasonography , Humans , Imaging, Three-Dimensional/methods , Ultrasonography/methods , Spine/diagnostic imaging , Spine/anatomy & histology , Anatomic Landmarks
6.
Int J Comput Assist Radiol Surg ; 19(7): 1409-1417, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38780829

ABSTRACT

PURPOSE: The modern operating room is becoming increasingly complex, requiring innovative intra-operative support systems. While the focus of surgical data science has largely been on video analysis, integrating surgical computer vision with natural language capabilities is emerging as a necessity. Our work aims to advance visual question answering (VQA) in the surgical context with scene graph knowledge, addressing two main challenges in the current surgical VQA systems: removing question-condition bias in the surgical VQA dataset and incorporating scene-aware reasoning in the surgical VQA model design. METHODS: First, we propose a surgical scene graph-based dataset, SSG-VQA, generated by employing segmentation and detection models on publicly available datasets. We build surgical scene graphs using spatial and action information of instruments and anatomies. These graphs are fed into a question engine, generating diverse QA pairs. We then propose SSG-VQA-Net, a novel surgical VQA model incorporating a lightweight Scene-embedded Interaction Module, which integrates geometric scene knowledge in the VQA model design by employing cross-attention between the textual and the scene features. RESULTS: Our comprehensive analysis shows that our SSG-VQA dataset provides a more complex, diverse, geometrically grounded, unbiased and surgical action-oriented dataset compared to existing surgical VQA datasets and SSG-VQA-Net outperforms existing methods across different question types and complexities. We highlight that the primary limitation in the current surgical VQA systems is the lack of scene knowledge to answer complex queries. CONCLUSION: We present a novel surgical VQA dataset and model and show that results can be significantly improved by incorporating geometric scene features in the VQA model design. We point out that the bottleneck of the current surgical visual question-answer model lies in learning the encoded representation rather than decoding the sequence. Our SSG-VQA dataset provides a diagnostic benchmark to test the scene understanding and reasoning capabilities of the model. The source code and the dataset will be made publicly available at: https://github.com/CAMMA-public/SSG-VQA .
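The scene-embedded interaction via cross-attention between textual and scene features could, in generic form, look like the PyTorch sketch below; the module name, embedding size, and head count are assumptions, and this is not the SSG-VQA-Net implementation.

```python
import torch
import torch.nn as nn

class SceneTextCrossAttention(nn.Module):
    """Question tokens attend over scene-graph node embeddings (a generic sketch)."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_tokens: torch.Tensor, scene_nodes: torch.Tensor) -> torch.Tensor:
        # text_tokens: (B, T, dim) question embeddings; scene_nodes: (B, N, dim) node features
        attended, _ = self.attn(query=text_tokens, key=scene_nodes, value=scene_nodes)
        return self.norm(text_tokens + attended)   # residual fusion of scene context

# Example: 8 questions of 12 tokens attending over 20 scene-graph nodes.
fused = SceneTextCrossAttention()(torch.randn(8, 12, 256), torch.randn(8, 20, 256))
```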


Subject(s)
Operating Rooms , Humans , Surgery, Computer-Assisted/methods , Natural Language Processing , Video Recording
7.
Int J Comput Assist Radiol Surg ; 19(7): 1419-1427, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38789884

ABSTRACT

PURPOSE: Segmenting ultrasound images is important for precise area and/or volume calculations, ensuring reliable diagnosis and effective treatment evaluation for diseases. Recently, many segmentation methods have been proposed and have shown impressive performance. However, there is currently no deeper understanding of how networks segment target regions or how they define the boundaries. In this paper, we present a new approach that analyzes ultrasound segmentation networks in terms of learned borders, because border delimitation is challenging in ultrasound. METHODS: We propose a way to split the boundaries of ultrasound images into distinct and completed. By exploiting the Grad-CAM of the split borders, we analyze the areas each network pays attention to. Further, we calculate the ratio of correct predictions for distinct and completed borders. We conducted experiments on an in-house leg ultrasound dataset (LEG-3D-US), on two additional public datasets of the thyroid and nerves, and on one private prostate dataset. RESULTS: Quantitatively, the networks exhibit around 10% improvement in handling completed borders compared to distinct borders. Similar to doctors, the network struggles to define the borders in less visible areas. Additionally, the Seg-Grad-CAM analysis underscores how completion uses distinct borders and landmarks, while distinct border prediction focuses mainly on the shiny structures. We also observe variations depending on the attention mechanism of each architecture. CONCLUSION: In this work, we highlight the importance of studying ultrasound borders differently than other modalities such as MRI or CT. We split the borders into distinct and completed, similar to clinicians, and show the quality of the network-learned information for these two types of borders. Additionally, we open-source a 3D leg ultrasound dataset to the community https://github.com/Al3xand1a/segmentation-border-analysis .
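A Grad-CAM analysis restricted to a border region, as used above, can in general be obtained by back-propagating a region-restricted segmentation score to an intermediate feature map. The sketch below is a generic Seg-Grad-CAM-style illustration; it assumes a model returning per-pixel logits with class 1 as foreground, which may differ from the authors' setup.

```python
import torch
import torch.nn.functional as F

def seg_grad_cam(model, image, region_mask, target_layer):
    """Grad-CAM for a segmentation network, restricted to a border region.

    image:        (1, C, H, W) input tensor
    region_mask:  (1, 1, H, W) binary mask selecting the border pixels of interest
    target_layer: the convolutional layer whose activations are explained
    """
    feats, grads = {}, {}
    h1 = target_layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

    logits = model(image)                              # (1, K, H, W) segmentation logits
    score = (logits[:, 1:2] * region_mask).sum()       # foreground score inside the region
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()

    weights = grads["a"].mean(dim=(2, 3), keepdim=True)          # GAP of the gradients
    cam = F.relu((weights * feats["a"]).sum(dim=1, keepdim=True))
    return F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
```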


Subject(s)
Ultrasonography , Humans , Ultrasonography/methods , Male , Thyroid Gland/diagnostic imaging , Prostate/diagnostic imaging , Leg/diagnostic imaging , Imaging, Three-Dimensional/methods
8.
Sci Data ; 11(1): 494, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38744868

ABSTRACT

The standard of care for brain tumors is maximal safe surgical resection. Neuronavigation augments the surgeon's ability to achieve this but loses validity as surgery progresses due to brain shift. Moreover, gliomas are often indistinguishable from surrounding healthy brain tissue. Intraoperative magnetic resonance imaging (iMRI) and ultrasound (iUS) help visualize the tumor and brain shift. iUS is faster and easier to incorporate into surgical workflows but offers a lower contrast between tumorous and healthy tissues than iMRI. With the success of data-hungry Artificial Intelligence algorithms in medical image analysis, the benefits of sharing well-curated data cannot be overstated. To this end, we provide the largest publicly available MRI and iUS database of surgically treated brain tumors, including gliomas (n = 92), metastases (n = 11), and others (n = 11). This collection contains 369 preoperative MRI series, 320 3D iUS series, 301 iMRI series, and 356 segmentations collected from 114 consecutive patients at a single institution. This database is expected to help brain shift and image analysis research and neurosurgical training in interpreting iUS and iMRI.


Subject(s)
Brain Neoplasms , Databases, Factual , Magnetic Resonance Imaging , Multimodal Imaging , Humans , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/surgery , Brain/diagnostic imaging , Brain/surgery , Glioma/diagnostic imaging , Glioma/surgery , Ultrasonography , Neuronavigation/methods
9.
Int J Comput Assist Radiol Surg ; 19(6): 1085-1091, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38570373

ABSTRACT

PURPOSE: Automated endoscopy video analysis is essential for assisting surgeons during medical procedures, but it faces challenges due to complex surgical scenes and limited annotated data. Large-scale pretraining has shown great success in natural language processing and computer vision communities in recent years. These approaches reduce the need for annotated data, which is of great interest in the medical domain. In this work, we investigate endoscopy domain-specific self-supervised pretraining on large collections of data. METHODS: To this end, we first collect Endo700k, the largest publicly available corpus of endoscopic images, extracted from nine public Minimally Invasive Surgery (MIS) datasets. Endo700k comprises more than 700,000 images. Next, we introduce EndoViT, an endoscopy-pretrained Vision Transformer (ViT), and evaluate it on a diverse set of surgical downstream tasks. RESULTS: Our findings indicate that domain-specific pretraining with EndoViT yields notable advantages in complex downstream tasks. In the case of action triplet recognition, our approach outperforms ImageNet pretraining. In semantic segmentation, we surpass the state-of-the-art (SOTA) performance. These results demonstrate the effectiveness of our domain-specific pretraining approach in addressing the challenges of automated endoscopy video analysis. CONCLUSION: Our study contributes to the field of medical computer vision by showcasing the benefits of domain-specific large-scale self-supervised pretraining for vision transformers. We release both our code and pretrained models to facilitate further research in this direction: https://github.com/DominikBatic/EndoViT .


Subject(s)
Endoscopy , Humans , Endoscopy/methods , Endoscopy/education , Image Processing, Computer-Assisted/methods , Video Recording , Minimally Invasive Surgical Procedures/education , Minimally Invasive Surgical Procedures/methods
10.
APL Bioeng ; 8(2): 021501, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38572313

ABSTRACT

Cancer, with high morbidity and high mortality, is one of the major burdens threatening human health globally. Interventional procedures via percutaneous puncture have been widely used by physicians due to their minimally invasive nature. However, traditional manual puncture intervention depends on personal experience and faces challenges in terms of puncture precision, learning curve, safety, and efficacy. The development of puncture interventional surgery robotic (PISR) systems could alleviate the aforementioned problems to a certain extent. This paper reviews the current status and prospects of PISR systems for thoracic and abdominal applications. In this review, the key technologies related to these robotic systems, including spatial registration, positioning navigation, puncture guidance feedback, respiratory motion compensation, and motion control, are discussed in detail.

11.
Int J Comput Assist Radiol Surg ; 19(5): 861-869, 2024 May.
Article in English | MEDLINE | ID: mdl-38270811

ABSTRACT

PURPOSE: The detection and treatment of abdominal aortic aneurysm (AAA), a vascular disorder with life-threatening consequences, is challenging due to its lack of symptoms until it reaches a critical size. Abdominal ultrasound (US) is utilized for diagnosis; however, its inherent low image quality and reliance on operator expertise make computed tomography (CT) the preferred choice for monitoring and treatment. Moreover, CT datasets have been effectively used for training deep neural networks for aorta segmentation. In this work, we demonstrate how leveraging CT labels can be used to improve segmentation in ultrasound and hence save manual annotations. METHODS: We introduce CACTUSS: a common anatomical CT-US space that inherits properties from both CT and ultrasound modalities to produce an image in intermediate representation (IR) space. CACTUSS acts as a virtual third modality between CT and US to address the scarcity of annotated ultrasound training data. The generation of IR images is facilitated by re-parametrizing a physics-based US simulator. In CACTUSS we use IR images as training data for ultrasound segmentation, eliminating the need for manual labeling. In addition, an image-to-image translation network is employed for the model's application on real B-modes. RESULTS: The model's performance is evaluated quantitatively for the task of aorta segmentation by comparison against a fully supervised method in terms of Dice Score and diagnostic metrics. CACTUSS outperforms the fully supervised network in segmentation and meets clinical requirements for AAA screening and diagnosis. CONCLUSION: CACTUSS provides a promising approach to improve US segmentation accuracy by leveraging CT labels, reducing the need for manual annotations. We generate IRs that inherit properties from both modalities while preserving the anatomical structure and are optimized for the task of aorta segmentation. Future work involves integrating CACTUSS into robotic ultrasound platforms for automated screening and conducting clinical feasibility studies.


Subject(s)
Aortic Aneurysm, Abdominal , Tomography, X-Ray Computed , Ultrasonography , Humans , Aortic Aneurysm, Abdominal/diagnostic imaging , Tomography, X-Ray Computed/methods , Ultrasonography/methods , Aorta, Abdominal/diagnostic imaging , Multimodal Imaging/methods
12.
medRxiv ; 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37745329

ABSTRACT

The standard of care for brain tumors is maximal safe surgical resection. Neuronavigation augments the surgeon's ability to achieve this but loses validity as surgery progresses due to brain shift. Moreover, gliomas are often indistinguishable from surrounding healthy brain tissue. Intraoperative magnetic resonance imaging (iMRI) and ultrasound (iUS) help visualize the tumor and brain shift. iUS is faster and easier to incorporate into surgical workflows but offers a lower contrast between tumorous and healthy tissues than iMRI. With the success of data-hungry Artificial Intelligence algorithms in medical image analysis, the benefits of sharing well-curated data cannot be overstated. To this end, we provide the largest publicly available MRI and iUS database of surgically treated brain tumors, including gliomas (n=92), metastases (n=11), and others (n=11). This collection contains 369 preoperative MRI series, 320 3D iUS series, 301 iMRI series, and 356 segmentations collected from 114 consecutive patients at a single institution. This database is expected to help brain shift and image analysis research and neurosurgical training in interpreting iUS and iMRI.

13.
Med Phys ; 51(3): 2044-2056, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37708456

ABSTRACT

BACKGROUND: Ultrasound (US) has been shown to be an effective guidance technique for lumbar spine injections, enabling precise needle placement without exposing the surgeon or the patient to ionizing radiation. However, noise and acoustic shadowing artifacts make US data interpretation challenging. To mitigate these problems, many authors suggested using computed tomography (CT)-to-US registration to align the spine in pre-operative CT to intra-operative US data, thus providing localization of spinal landmarks. PURPOSE: In this paper, we propose a deep learning (DL) pipeline for CT-to-US registration and address the need for annotated medical data for network training. Firstly, we design a data generation method to generate paired CT-US data where the spine is deformed in a physically consistent manner. Secondly, we train a point cloud (PC) registration network using anatomy-aware losses to enforce anatomically consistent predictions. METHODS: Our proposed pipeline relies on training the network on realistically generated data. In our data generation method, we model the properties of the joints and disks between vertebrae based on biomechanical measurements in previous studies. We simulate the supine and prone position deformation by applying forces on the spine models. We choose the spine models from 35 patients in the VerSe dataset. Each spine is deformed 10 times to create noise-free data with ground-truth segmentations. In our experiments, we use a leave-one-out cross-validation strategy to measure the performance and stability of the proposed method. For each experiment, we choose generated PCs from three spines as the test set. From the remaining spines, data from three act as the validation set, and the rest are used for training. To train our network, we introduce anatomy-aware losses and constraints on the movement to match the physics of the spine, namely, a rigidity loss and a bio-mechanical loss. We define the rigidity loss based on the fact that each vertebra can only transform rigidly, while the disks and the surrounding tissue are deformable. Second, using the bio-mechanical loss, we prevent the network from inferring extreme movements by penalizing the force needed to reach a certain pose. RESULTS: To validate the effectiveness of our fully automated data generation pipeline, we qualitatively assess the fidelity of the generated data. This assessment involves verifying the realism of the spinal deformation and subsequently confirming the plausibility of the simulated ultrasound images. Next, we demonstrate that the introduction of the anatomy-aware losses brings us closer to the state of the art (SOTA) and yields a reduction of 0.25 mm in terms of target registration error (TRE) compared to using only a mean squared error (MSE) loss on the generated dataset. Furthermore, by using the proposed losses, the rigidity loss at inference decreases, which shows that the inferred deformation respects the rigidity of the vertebrae and only introduces deformations in the soft-tissue area to compensate for the difference to the target PC. We also show that our results are close to the SOTA for the simulated US dataset, with a TRE of 3.89 mm and 3.63 mm for the proposed method and the SOTA, respectively. In addition, we show that our method is more robust against errors in the initialization than the SOTA and achieves significantly better results (TRE of 4.88 mm compared to 5.66 mm) in this experiment.
CONCLUSIONS: We present a pipeline for spine CT-to-US registration and explore the potential benefits of utilizing anatomy-aware losses to enhance registration results. Additionally, we propose a fully automatic method to synthesize paired CT-US data with physically consistent deformations, which offers the opportunity to generate extensive datasets for network training. The generated dataset and the source code for the data generation and registration pipeline can be accessed via https://github.com/mfazampour/medphys_ct_us_registration.
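One plausible way to encode the rigidity constraint described above is to penalize changes of intra-vertebra pairwise distances, since a rigid transform preserves them; the PyTorch sketch below illustrates this idea and is not necessarily the exact loss used in the paper.

```python
import torch

def rigidity_loss(source_pts, deformed_pts, vertebra_ids):
    """Penalize non-rigid distortion within each vertebra (one plausible formulation).

    source_pts, deformed_pts: (N, 3) point clouds before/after the predicted deformation
    vertebra_ids:             (N,) integer label assigning each point to a vertebra
    """
    labels = torch.unique(vertebra_ids)
    loss = source_pts.new_tensor(0.0)
    for v in labels:
        src = source_pts[vertebra_ids == v]
        dfm = deformed_pts[vertebra_ids == v]
        d_src = torch.cdist(src, src)        # intra-vertebra pairwise distances
        d_dfm = torch.cdist(dfm, dfm)
        # A rigid transform preserves pairwise distances, so any change is penalized.
        loss = loss + (d_src - d_dfm).abs().mean()
    return loss / len(labels)
```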


Subject(s)
Spine , Tomography, X-Ray Computed , Humans , Tomography, X-Ray Computed/methods , Spine/diagnostic imaging , Algorithms , Lumbar Vertebrae , Software , Radiation, Ionizing , Image Processing, Computer-Assisted/methods
14.
Ultrasonics ; 137: 107179, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37939413

ABSTRACT

Ultrasound is an adjunct tool to mammography that can quickly and safely aid physicians in diagnosing breast abnormalities. Clinical ultrasound often assumes a constant sound speed to form diagnostic B-mode images. However, the components of breast tissue, such as glandular tissue, fat, and lesions, differ in sound speed. Given a constant sound speed assumption, these differences can degrade the quality of reconstructed images via phase aberration. Sound speed images can be a powerful tool for improving image quality and identifying diseases if properly estimated. To this end, we propose a supervised deep-learning approach for sound speed estimation from analytic ultrasound signals. We develop a large-scale simulated ultrasound dataset that generates representative breast tissue samples by modeling breast gland, skin, and lesions with varying echogenicity and sound speed. We adopt a fully convolutional neural network architecture trained on a simulated dataset to produce an estimated sound speed map. The simulated tissue is interrogated with a plane wave transmit sequence, and the complex-valued reconstructed images are used as input for the convolutional network. The network is trained on the sound speed distribution map of the simulated data, and the trained model can estimate sound speed given reconstructed pulse-echo signals. We further incorporate thermal noise augmentation during training to enhance model robustness to artifacts found in real ultrasound data. To highlight the ability of our model to provide accurate sound speed estimations, we evaluate it on simulated, phantom, and in-vivo breast ultrasound data.
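The thermal noise augmentation mentioned above can be approximated by adding complex white Gaussian noise to the analytic (IQ) signals at a randomly drawn SNR, as in the sketch below; the SNR range is an assumed illustration value, not the paper's setting.

```python
import numpy as np

def add_thermal_noise(iq_data: np.ndarray, snr_db_range=(10.0, 40.0), rng=None):
    """Add complex white Gaussian noise to IQ/analytic ultrasound data at a random SNR.

    iq_data: complex-valued array of beamformed (analytic) signals.
    """
    rng = rng if rng is not None else np.random.default_rng()
    snr_db = rng.uniform(*snr_db_range)
    signal_power = np.mean(np.abs(iq_data) ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = np.sqrt(noise_power / 2.0) * (rng.standard_normal(iq_data.shape)
                                          + 1j * rng.standard_normal(iq_data.shape))
    return iq_data + noise
```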


Subject(s)
Deep Learning , Humans , Female , Algorithms , Ultrasonography, Mammary , Sound , Ultrasonography/methods , Phantoms, Imaging , Image Processing, Computer-Assisted/methods
15.
Article in English | MEDLINE | ID: mdl-38083453

ABSTRACT

The field of robotic microsurgery and micro-manipulation has undergone a profound evolution in recent years, particularly with regard to accuracy, precision, versatility, and dexterity. These advancements have the potential to revolutionize high-precision biomedical procedures, such as neurosurgery, vitreoretinal surgery, and cell micro-manipulation. However, a critical challenge in developing micron-precision robotic systems is accurately verifying the end-effector motion in 3D. Such verification is complicated by environmental vibrations, inaccuracies of the mechanical assembly, and other physical uncertainties. To overcome these challenges, this paper proposes a novel single-camera framework that utilizes mirrors with known geometric parameters to estimate the 3D position of the microsurgical instrument. The Euclidean distance between points reconstructed by the algorithm and the robot movement recorded by highly accurate encoders is taken as the error. Our method exhibits accurate estimation with a mean absolute error of 0.044 mm when tested on a 23G surgical cannula with a diameter of 0.640 mm and operates at a resolution of 4024 × 3036 at 30 frames per second.


Subject(s)
Robotics , Surgery, Computer-Assisted , Microsurgery , Motion , Movement
16.
IEEE Int Conf Robot Autom ; 2023: 4724-4731, 2023.
Article in English | MEDLINE | ID: mdl-38125032

ABSTRACT

In the last decade, various robotic platforms have been introduced that could support delicate retinal surgeries. Concurrently, to provide semantic understanding of the surgical area, recent advances have enabled microscope-integrated intraoperative Optical Coherence Tomography (iOCT) with high-resolution 3D imaging at near video rate. The combination of robotics and semantic understanding enables task autonomy in robotic retinal surgery, such as for subretinal injection. This procedure requires precise needle insertion for the best treatment outcomes. However, merging robotic systems with iOCT introduces new challenges. These include, but are not limited to, high demands on data processing rates and dynamic registration of these systems during the procedure. In this work, we propose a framework for autonomous robotic navigation for subretinal injection, based on intelligent real-time processing of iOCT volumes. Our method consists of an instrument pose estimation method, an online registration between the robotic and the iOCT system, and trajectory planning tailored for navigation to an injection target. We also introduce intelligent virtual B-scans, a volume slicing approach for rapid instrument pose estimation, which is enabled by Convolutional Neural Networks (CNNs). Our experiments on ex-vivo porcine eyes demonstrate the precision and repeatability of the method. Finally, we discuss identified challenges in this work and suggest potential solutions to further the development of such systems.
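A virtual B-scan, i.e., an arbitrary plane resampled from the iOCT volume around the estimated instrument pose, can in general be extracted with linear interpolation as in the generic sketch below; the plane parametrization and sampling settings are assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def virtual_b_scan(volume, center, u_axis, v_axis, size=(256, 256), spacing=1.0):
    """Resample an arbitrary plane ("virtual B-scan") from a 3D iOCT volume.

    volume: (Z, Y, X) array; center: plane center in voxel coordinates;
    u_axis, v_axis: orthonormal in-plane direction vectors (voxel space).
    """
    center = np.asarray(center, float)
    u = np.asarray(u_axis, float)
    v = np.asarray(v_axis, float)
    uu = (np.arange(size[0]) - size[0] / 2) * spacing
    vv = (np.arange(size[1]) - size[1] / 2) * spacing
    grid_u, grid_v = np.meshgrid(uu, vv, indexing="ij")
    coords = center[:, None, None] + u[:, None, None] * grid_u + v[:, None, None] * grid_v
    return map_coordinates(volume, coords, order=1, mode="nearest")   # (size[0], size[1])
```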

17.
Robotica ; 41(5): 1536-1549, 2023 May.
Article in English | MEDLINE | ID: mdl-37982126

ABSTRACT

Retinal surgery is widely considered to be a complicated and challenging task even for specialists. Image-guided robot-assisted intervention is among the novel and promising solutions that may enhance human capabilities therein. In this paper, we demonstrate the possibility of using spotlights for 5D guidance of a microsurgical instrument. The theoretical basis of localizing the instrument based on the projection of a single spotlight is analyzed to deduce the position and orientation of the spotlight source. The use of multiple spotlights is also proposed to explore possible further improvements in the performance boundaries. The proposed method is verified within a high-fidelity simulation environment using the 3D creation suite Blender. Experimental results show that the average positioning error is 0.029 mm using a single spotlight and 0.025 mm with three spotlights, while the corresponding rotational errors are 0.124 and 0.101, which shows the approach to be promising for instrument localization in retinal surgery.

18.
Sci Rep ; 13(1): 19539, 2023 11 09.
Article in English | MEDLINE | ID: mdl-37945590

ABSTRACT

When dealing with a newly emerging disease such as COVID-19, the impact of patient- and disease-specific factors (e.g., body weight or known co-morbidities) on the immediate course of the disease is largely unknown. An accurate prediction of the most likely individual disease progression can improve the planning of limited resources and finding the optimal treatment for patients. In the case of COVID-19, the need for intensive care unit (ICU) admission of pneumonia patients can often only be determined on short notice by acute indicators such as vital signs (e.g., breathing rate, blood oxygen levels), whereas statistical analysis and decision support systems that integrate all of the available data could enable an earlier prognosis. To this end, we propose a holistic, multimodal graph-based approach combining imaging and non-imaging information. Specifically, we introduce a multimodal similarity metric to build a population graph that shows a clustering of patients. For each patient in the graph, we extract radiomic features from a segmentation network that also serves as a latent image feature encoder. Together with clinical patient data like vital signs, demographics, and lab results, these modalities are combined into a multimodal representation of each patient. This feature extraction is trained end-to-end with an image-based Graph Attention Network to process the population graph and predict the COVID-19 patient outcomes: admission to ICU, need for ventilation, and mortality. To combine multiple modalities, radiomic features are extracted from chest CTs using a segmentation neural network. Results on a dataset collected at Klinikum rechts der Isar in Munich, Germany and the publicly available iCTCF dataset show that our approach outperforms single-modality and non-graph baselines. Moreover, our clustering and graph attention increase understanding of the patient relationships within the population graph and provide insight into the network's decision-making process.
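The population graph built from a multimodal similarity metric could, for illustration, combine cosine similarities of imaging and clinical feature vectors and threshold the result into an adjacency matrix, as sketched below; the weighting and threshold values are assumptions, not the study's actual metric.

```python
import numpy as np

def build_population_graph(image_feats, clinical_feats, alpha=0.5, threshold=0.7):
    """Build a patient population graph from a weighted multimodal similarity.

    image_feats:    (P, D1) radiomic/latent image features per patient
    clinical_feats: (P, D2) normalized clinical features (vitals, demographics, labs)
    Returns a (P, P) binary adjacency matrix (self-loops excluded).
    """
    def cosine_sim(x):
        x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
        return x @ x.T

    sim = alpha * cosine_sim(image_feats) + (1.0 - alpha) * cosine_sim(clinical_feats)
    adj = (sim >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)
    return adj
```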


Subject(s)
COVID-19 , Humans , Prognosis , Lung , Disease Progression , Hospitalization
19.
Article in English | MEDLINE | ID: mdl-37823976

ABSTRACT

PURPOSE: Surgical procedures take place in highly complex operating rooms (OR), involving medical staff, patients, devices and their interactions. Until now, only medical professionals have been capable of comprehending these intricate links and interactions. This work advances the field toward automated, comprehensive and semantic understanding and modeling of the OR domain by introducing semantic scene graphs (SSG) as a novel approach to describing and summarizing surgical environments in a structured and semantically rich manner. METHODS: We create the first open-source 4D SSG dataset. 4D-OR includes simulated total knee replacement surgeries captured by RGB-D sensors in a realistic OR simulation center. It includes annotations for SSGs, human and object pose, clinical roles and surgical phase labels. We introduce a neural network-based SSG generation pipeline for semantic reasoning in the OR and apply our approach to two downstream tasks: clinical role prediction and surgical phase recognition. RESULTS: We show that our pipeline can successfully reason within the OR domain. The capabilities of our scene graphs are further highlighted by their successful application to clinical role prediction and surgical phase recognition tasks. CONCLUSION: This work paves the way for multimodal holistic operating room modeling, with the potential to significantly enhance the state of the art in surgical data analysis, such as enabling more efficient and precise decision-making during surgical procedures, and ultimately improving patient safety and surgical outcomes. We release our code and dataset at github.com/egeozsoy/4D-OR.
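For illustration, a semantic scene graph of the kind described above can be represented as typed nodes (people, devices, the patient) connected by predicate edges; the sketch below shows one possible lightweight data structure with hypothetical labels, not the 4D-OR schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SceneNode:
    node_id: int
    label: str                      # e.g. "surgeon", "drill", "patient" (hypothetical labels)
    pose: List[float] = field(default_factory=list)   # 3D position / keypoints if available

@dataclass
class SceneRelation:
    subject_id: int
    predicate: str                  # e.g. "holding", "drilling", "lying on"
    object_id: int

@dataclass
class SemanticSceneGraph:
    timestamp: float                # one time point of the 4D sequence
    nodes: List[SceneNode] = field(default_factory=list)
    relations: List[SceneRelation] = field(default_factory=list)

# Hypothetical snapshot: the surgeon holds a drill that is drilling into the patient.
graph = SemanticSceneGraph(
    timestamp=12.4,
    nodes=[SceneNode(0, "surgeon"), SceneNode(1, "drill"), SceneNode(2, "patient")],
    relations=[SceneRelation(0, "holding", 1), SceneRelation(1, "drilling", 2)],
)
```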

20.
Biomed Opt Express ; 14(10): 5466-5483, 2023 Oct 01.
Article in English | MEDLINE | ID: mdl-37854552

ABSTRACT

With the increasing popularity of ophthalmic imaging techniques, anonymization of clinical image datasets is becoming a critical issue, especially for fundus images, which carry unique patient-specific biometric content. Towards a framework for anonymizing ophthalmic images, we propose an image-specific de-identification method for the vascular structure of retinal fundus images that preserves important clinical features such as hard exudates. Our method calculates the contribution of the latent code to the vascular structure by computing the gradient map of the generated image with respect to the latent space and then computing the overlap between the vascular mask and the gradient map. The proposed method is designed to specifically target and effectively manipulate the latent code with the highest contribution score in vascular structures. Extensive experimental results show that our proposed method is competitive with other state-of-the-art approaches in terms of both identity similarity and lesion similarity. Additionally, our approach allows for a better balance between identity similarity and lesion similarity, thus ensuring optimal performance in a trade-off manner.
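A simplified version of the contribution score described above can be obtained by differentiating the vessel-masked generator output with respect to the latent code, which aggregates the per-pixel gradient map over the vascular region; the PyTorch sketch below is an approximation for illustration, not the authors' exact procedure.

```python
import torch

def vascular_contribution(generator, latent, vessel_mask):
    """Gradient of the vessel-masked generator output w.r.t. the latent code.

    generator:   module mapping a latent code (1, L) to an image (1, 1, H, W)
    latent:      (1, L) latent code of the image to de-identify
    vessel_mask: (1, 1, H, W) binary vessel segmentation of the generated image
    Returns a per-entry contribution score of shape (L,).
    """
    z = latent.detach().clone().requires_grad_(True)
    image = generator(z)
    # Summing the vessel-masked image and differentiating w.r.t. z aggregates the
    # per-pixel gradient map over the vascular region (an approximation of the
    # mask/gradient-map overlap described in the abstract).
    masked_sum = (image * vessel_mask).sum()
    grad = torch.autograd.grad(masked_sum, z)[0]          # (1, L)
    return grad.abs().squeeze(0)
```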
