Self-supervised learning for interventional image analytics: toward robust device trackers.

Islam, Saahil; Murthy, Venkatesh N; Neumann, Dominik; Das, Badhan Kumar; Sharma, Puneet; Maier, Andreas; Comaniciu, Dorin; Ghesu, Florin C

Affiliation

Islam S; Friedrich-Alexander-Universität Erlangen-Nürnberg, Pattern Recognition Lab, Erlangen, Germany.
Murthy VN; Siemens Healthineers, Digital Technology and Innovation, Erlangen, Germany.
Neumann D; Siemens Healthineers, Digital Technology and Innovation, Princeton, New Jersey, United States.
Das BK; Siemens Healthineers, Digital Technology and Innovation, Erlangen, Germany.
Sharma P; Friedrich-Alexander-Universität Erlangen-Nürnberg, Pattern Recognition Lab, Erlangen, Germany.
Maier A; Siemens Healthineers, Digital Technology and Innovation, Erlangen, Germany.
Comaniciu D; Siemens Healthineers, Digital Technology and Innovation, Princeton, New Jersey, United States.
Ghesu FC; Friedrich-Alexander-Universität Erlangen-Nürnberg, Pattern Recognition Lab, Erlangen, Germany.

J Med Imaging (Bellingham) ; 11(3): 035001, 2024 May.

Article in English | MEDLINE | ID: mdl-38756438

ABSTRACT

Purpose:

The accurate detection and tracking of devices, such as guiding catheters, in live X-ray image acquisitions are essential prerequisites for endovascular cardiac interventions. This information is leveraged for procedural guidance, e.g., directing stent placements. To ensure procedural safety and efficacy, tracking must be highly robust, ideally without any failures. Achieving this requires efficiently handling several challenges: device obscuration by the contrast agent or by other external devices or wires, changes in the field-of-view or acquisition angle, and continuous movement due to cardiac and respiratory motion.

Approach:

To overcome these challenges, we propose an approach to learn spatio-temporal features from a very large data cohort of over 16 million interventional X-ray frames using self-supervision for image sequence data. Our approach is based on a masked image modeling technique that leverages frame interpolation-based reconstruction to learn fine inter-frame temporal correspondences. The features encoded in the resulting model are then fine-tuned in a lightweight downstream model.
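As a rough illustration of masked modeling on an image sequence, the following sketch builds a patch mask in which intermediate frames are hidden completely, so that they can only be recovered by interpolating between their neighbours, and scores a reconstruction on the hidden patches only. The abstract does not specify the masking scheme, so the alternating-frame pattern, the names `make_sequence_mask` and `masked_mse`, and the `mask_ratio` parameter are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def make_sequence_mask(num_frames, patches_per_frame, mask_ratio, rng):
    """Return a boolean mask of shape (num_frames, patches_per_frame).

    True marks a patch that is hidden from the encoder.
    """
    mask = np.zeros((num_frames, patches_per_frame), dtype=bool)
    num_hidden = int(round(mask_ratio * patches_per_frame))
    for t in range(num_frames):
        if t % 2 == 1:
            # Assumption: every second frame is hidden entirely, which turns
            # its reconstruction into a frame interpolation problem.
            mask[t] = True
        else:
            # Remaining frames get ordinary random patch masking.
            hidden = rng.choice(patches_per_frame, size=num_hidden, replace=False)
            mask[t, hidden] = True
    return mask


def masked_mse(pred, target, mask):
    """Mean squared error computed over hidden patches only.

    pred and target have shape (num_frames, patches_per_frame, patch_dim).
    """
    squared_error = (pred - target) ** 2
    return float(squared_error[mask].mean())
```

In a training loop of this kind, the encoder would see only the visible patches and a decoder would predict pixel values for the hidden ones; restricting the loss to hidden patches is the usual choice in masked image modeling.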

Results:

Our approach achieves state-of-the-art performance, in particular for robustness, compared to highly optimized reference solutions (that use multi-stage feature fusion or multi-task and flow regularization). The experiments show that our method achieves a 66.31% reduction in the maximum tracking error against the reference solutions (23.20% when flow regularization is used), achieving a success score of 97.95% at a 3× faster inference speed of 42 frames-per-second (on GPU). In addition, we achieve a 20% reduction in the standard deviation of errors, which indicates a much more stable tracking performance.
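For readers who want to relate these figures to raw measurements, the sketch below shows how a relative reduction and a per-frame latency are typically computed. The helper names are hypothetical, the error values in the usage lines are placeholders rather than data from the study, and reading "3× faster" as a reference speed of one third of 42 frames-per-second is an assumption.

```python
def relative_reduction(reference, ours):
    """Percentage by which `ours` is lower than `reference`."""
    return 100.0 * (reference - ours) / reference


def frame_latency_ms(frames_per_second):
    """Average time spent per frame, in milliseconds."""
    return 1000.0 / frames_per_second


# Placeholder maximum errors (arbitrary units), NOT values from the study.
example_reduction = relative_reduction(reference=10.0, ours=3.369)

# Reported speed and the reference speed implied by a 3x speed-up (assumed).
ours_latency = frame_latency_ms(42.0)
reference_latency = frame_latency_ms(42.0 / 3.0)
```

Under that reading, 42 frames-per-second corresponds to roughly 24 ms per frame, against roughly 71 ms for a reference running at 14 frames-per-second.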

Conclusions:

The proposed data-driven approach achieves superior performance, particularly in robustness and speed, compared with the frequently used multi-modular approaches for device tracking. The results encourage the use of our approach in other tasks within interventional image analytics that require an effective understanding of spatio-temporal semantics.

Keywords

device tracking; interventional imaging; self-supervised learning

Full text: 1 Collection: 01-international Database: MEDLINE Language: English Journal: J Med Imaging (Bellingham) Year: 2024 Document type: Article Affiliation country: Germany
