ABSTRACT
Screening colonoscopy is an important clinical application for several 3D computer vision techniques, including depth estimation, surface reconstruction, and missing-region detection. However, the development, evaluation, and comparison of these techniques in real colonoscopy videos remain largely qualitative due to the difficulty of acquiring ground truth data. In this work, we present a Colonoscopy 3D Video Dataset (C3VD) acquired with a high-definition clinical colonoscope and high-fidelity colon models for benchmarking computer vision methods in colonoscopy. We introduce a novel multimodal 2D-3D registration technique to register optical video sequences with ground truth rendered views of a known 3D model. The different modalities are registered by transforming optical images to depth maps with a Generative Adversarial Network and aligning edge features with an evolutionary optimizer. This registration method achieves an average translation error of 0.321 millimeters and an average rotation error of 0.159 degrees in simulation experiments where error-free ground truth is available. The method also leverages video information, improving registration accuracy by 55.6% for translation and 60.4% for rotation compared to single-frame registration. Twenty-two short video sequences were registered to generate 10,015 total frames with paired ground truth depth, surface normals, optical flow, occlusion, six-degree-of-freedom pose, coverage maps, and 3D models. The dataset also includes screening videos acquired by a gastroenterologist with paired ground truth pose and 3D surface models. The dataset and registration source code are available at https://durr.jhu.edu/C3VD.
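The registration loop described above (GAN-predicted depth maps, edge features, evolutionary optimization) can be sketched in a few lines. The sketch below is illustrative only: the GAN depth predictor and the 3D model renderer are replaced with a synthetic surface, and SciPy's differential evolution stands in for whichever evolutionary optimizer the authors used; all parameter values are hypothetical.

```python
# Minimal sketch of edge-based 2D-3D pose registration via an evolutionary
# optimizer. A toy analytic surface stands in for the rendered colon model,
# and a rendering at a hidden "true" pose stands in for GAN-predicted depth.
import numpy as np
from scipy import ndimage
from scipy.optimize import differential_evolution  # evolutionary optimizer

def edge_map(depth):
    """Gradient-magnitude edge features of a depth map."""
    return np.hypot(ndimage.sobel(depth, axis=0), ndimage.sobel(depth, axis=1))

def render_depth(pose, size=64):
    """Toy stand-in for rendering the known 3D model at a 6-DoF pose:
    (tx, ty, tz) translate, (rx, ry) tilt the surface, rz rotates in-plane."""
    tx, ty, tz, rx, ry, rz = pose
    u, v = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size))
    ur = u * np.cos(rz) - v * np.sin(rz)
    vr = u * np.sin(rz) + v * np.cos(rz)
    return 10.0 + tz + rx * ur + ry * vr \
        + 0.2 * np.sin(6 * (ur + tx)) * np.cos(6 * (vr + ty))

# Target edges; in the paper these come from GAN-transformed optical frames.
true_pose = np.array([0.20, -0.10, 0.50, 0.30, -0.20, 0.15])
target_edges = edge_map(render_depth(true_pose))

def cost(pose):
    """Dissimilarity between rendered-view edges and the target edges."""
    return np.mean((edge_map(render_depth(pose)) - target_edges) ** 2)

result = differential_evolution(cost, bounds=[(-1, 1)] * 6, seed=0, maxiter=150)
print("recovered pose:", np.round(result.x, 3))  # should approximate true_pose
```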
ABSTRACT
Purpose: A method for fluoroscopic guidance of a robotic assistant is presented for instrument placement in pelvic trauma surgery. The solution uses fluoroscopic images acquired in the standard clinical workflow and helps avoid the repeat fluoroscopy commonly performed during implant guidance. Approach: Images acquired from a mobile C-arm are used to perform 3D-2D registration of both the patient (via the patient CT) and the robot (via a CAD model of a surgical instrument attached to its end effector, e.g., a drill guide), guiding the robot to target trajectories defined in the patient CT. The proposed approach avoids C-arm gantry motion, instead manipulating the robot to acquire disparate views of the instrument. Phantom and cadaver studies were performed to determine operating parameters and assess the accuracy of the proposed approach in aligning a standard drill guide instrument. Results: The proposed approach achieved average drill guide tip placement accuracy of 1.57 ± 0.47 mm and angular alignment of 0.35 ± 0.32 deg in phantom studies. The errors remained within 2 mm and 1 deg in cadaver experiments, comparable to the margins of error provided by surgical trackers (but operating without the need for external tracking). Conclusions: By operating at a fixed fluoroscopic perspective and eliminating the need for encoded C-arm gantry movement, the proposed approach simplifies and expedites the registration of image-guided robotic assistants and can be used with simple, non-calibrated, non-encoded, and non-isocentric C-arm systems to accurately guide a robotic device in a manner that is compatible with the surgical workflow.
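The key geometric idea is that registering both the patient CT and the robot's instrument model to the same C-arm frame lets a trajectory planned in CT be expressed directly in the robot's coordinates. The sketch below illustrates that transform chain only; all frame names, rotations, and coordinates are hypothetical, and the actual 3D-2D registrations that produce these transforms are not shown.

```python
# Illustrative transform chain: map a CT-defined drill trajectory into the
# robot frame once patient and robot are each registered to the C-arm.
import numpy as np
from scipy.spatial.transform import Rotation

def se3(rot_deg, t):
    """Homogeneous 4x4 transform from xyz Euler angles (deg) + translation (mm)."""
    T = np.eye(4)
    T[:3, :3] = Rotation.from_euler("xyz", rot_deg, degrees=True).as_matrix()
    T[:3, 3] = t
    return T

T_carm_from_ct = se3([5, -2, 90], [120, -40, 300])    # from patient 3D-2D registration
T_carm_from_robot = se3([0, 1, -88], [-80, 10, 310])  # from instrument-CAD 3D-2D registration

# Target trajectory defined in the preoperative CT (entry point + direction).
entry_ct = np.array([55.0, 20.0, 110.0, 1.0])         # homogeneous point, mm
direction_ct = np.array([0.0, 0.6, 0.8, 0.0])         # homogeneous direction vector

T_robot_from_ct = np.linalg.inv(T_carm_from_robot) @ T_carm_from_ct
entry_robot = T_robot_from_ct @ entry_ct
direction_robot = T_robot_from_ct @ direction_ct
print("entry point in robot frame (mm):", np.round(entry_robot[:3], 2))
print("drill axis in robot frame:", np.round(direction_robot[:3], 3))
```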
ABSTRACT
Purpose: Data-intensive modeling could provide insight into the broad variability in outcomes in spine surgery. Previous studies were limited to analysis of demographic and clinical characteristics. We report an analytic framework called "SpineCloud" that incorporates quantitative features extracted from perioperative images to predict spine surgery outcome. Approach: A retrospective study was conducted in which patient demographics, imaging, and outcome data were collected. Image features were automatically computed from perioperative CT. Postoperative 3- and 12-month functional and pain outcomes were analyzed in terms of improvement relative to the preoperative state. A boosted decision tree classifier was trained to predict outcome using demographic and image features as predictor variables. Predictions were computed based on SpineCloud and conventional demographic models, and features associated with poor outcome were identified from weighting terms evident in the boosted tree. Results: Neither approach was predictive of 3- or 12-month outcomes based on preoperative data alone in the current, preliminary study. However, SpineCloud predictions incorporating image features obtained during and immediately following surgery (i.e., intraoperative and immediate postoperative images) exhibited significant improvement in area under the receiver operating characteristic curve (AUC): AUC = 0.72 (95% CI: 0.59 to 0.83) at 3 months and AUC = 0.69 (95% CI: 0.55 to 0.82) at 12 months. Conclusions: Predictive modeling of lumbar spine surgery outcomes was improved by the incorporation of image-based features compared to analysis based on conventional demographic data. The SpineCloud framework could improve understanding of factors underlying outcome variability and warrants further investigation and validation in a larger patient cohort.
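A minimal sketch of the SpineCloud-style prediction step follows: a boosted decision tree trained on concatenated demographic and image features, scored by AUC, with feature importances inspected in the spirit of the weighting terms mentioned above. The data are synthetic and the feature counts are hypothetical; scikit-learn's gradient-boosted trees stand in for whatever boosting implementation the authors used.

```python
# Boosted decision tree outcome classifier on demographic + image features.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
demographics = rng.normal(size=(n, 4))     # e.g., age, BMI, ... (hypothetical)
image_features = rng.normal(size=(n, 12))  # e.g., perioperative CT metrics
X = np.hstack([demographics, image_features])
# Synthetic "improved outcome" label, weakly driven by two image features.
y = (image_features[:, 0] + 0.5 * image_features[:, 3]
     + rng.normal(size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
# Analogue of inspecting boosted-tree weighting terms for influential features:
print("top features by importance:", np.argsort(clf.feature_importances_)[::-1][:5])
```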
ABSTRACT
Purpose: Measurement of global spinal alignment (GSA) is an important aspect of diagnosis and treatment evaluation for spinal deformity but is subject to a high level of inter-reader variability. Approach: Two methods for automatic GSA measurement are proposed to mitigate such variability and reduce the burden of manual measurement. Both approaches use vertebral labels in spine computed tomography (CT) as input: the first (EndSeg) segments vertebral endplates using the input labels as seed points, and the second (SpNorm) computes a two-dimensional curvilinear fit to the input labels. Studies were performed to characterize the performance of EndSeg and SpNorm in comparison to manual GSA measurement by five clinicians, including measurements of proximal thoracic kyphosis, main thoracic kyphosis, and lumbar lordosis. Results: For the automatic methods, 93.8% of endplate angle estimates were within the inter-reader 95% confidence interval (CI95). All GSA measurements for the automatic methods were within the inter-reader CI95, and there was no statistically significant difference between the automatic and manual methods. The SpNorm method appears particularly robust as it operates without segmentation. Conclusions: Such methods could improve the reproducibility and reliability of GSA measurements and are potentially suitable for application to large datasets, e.g., for outcome assessment in surgical data science.
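To make the SpNorm idea concrete, the sketch below fits a 2D curve to sagittal vertebral centroid labels and reads a regional angle (a lordosis-like measure) from the difference of the curve's tangents at two vertebral levels. The centroid coordinates, polynomial order, and level choices are hypothetical; the paper's actual fit and angle definitions may differ.

```python
# Curvilinear-fit (SpNorm-style) regional angle from vertebral labels.
import numpy as np

# Synthetic (z, y) sagittal centroids for seven vertebrae, superior to inferior (mm).
z = np.array([0.0, 35.0, 70.0, 104.0, 137.0, 168.0, 196.0])
y = np.array([0.0, -4.0, -10.0, -14.0, -12.0, -4.0, 8.0])

coeffs = np.polyfit(z, y, deg=4)   # 2D curvilinear fit to the input labels
slope = np.polyder(coeffs)         # tangent slope dy/dz along the fitted curve

def tangent_deg(zi):
    """Tangent angle of the fitted curve at level zi, in degrees."""
    return np.degrees(np.arctan(np.polyval(slope, zi)))

# Regional angle between tangents at the second and last labeled levels.
angle = tangent_deg(z[-1]) - tangent_deg(z[1])
print(f"tangent-based regional angle: {angle:.1f} deg")
```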
ABSTRACT
The fidelity of image-guided neurosurgical procedures is often compromised by the mechanical deformations that occur during surgery. In recent work, a framework was developed to predict the extent of this brain shift in brain-tumor resection procedures. The approach uses preoperatively determined surgical variables to predict brain shift and then corrects the patient's preoperative image volume to more closely match the intraoperative state of the patient's brain. However, a practical difficulty in the clinical workflow is the preoperative acquisition of these surgical variables. To simplify and expedite this process, an Android, Java-based tablet application was developed to provide neurosurgeons with the ability to manipulate three-dimensional models of the patient's neuroanatomy and determine an expected head orientation, craniotomy size and location, and trajectory to be taken into the tumor. These variables can then be exported for use as inputs to the biomechanical model associated with the correction framework. A multisurgeon, multicase mock trial was conducted to compare the accuracy of the virtual plan to that of a mock physical surgery. It was concluded that the Android application was an accurate, efficient, and timely method for planning surgical variables.
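The export step can be illustrated as packaging the planned variables (head orientation, craniotomy geometry, tumor trajectory) into a serialized record for the biomechanical correction model. The schema, field names, and values below are entirely hypothetical, not the application's actual export format.

```python
# Illustrative export of planned surgical variables for the correction model.
import json
import numpy as np

entry = np.array([42.0, 71.5, 88.0])   # craniotomy center on scalp (mm, image frame)
target = np.array([48.5, 60.0, 61.0])  # tumor target point (mm)
trajectory = (target - entry) / np.linalg.norm(target - entry)  # unit approach vector

plan = {
    "head_orientation_quat_wxyz": [0.924, 0.0, 0.383, 0.0],  # hypothetical orientation
    "craniotomy": {"center_mm": entry.tolist(), "diameter_mm": 50.0},
    "trajectory_unit_vector": trajectory.round(4).tolist(),
}
print(json.dumps(plan, indent=2))  # serialized inputs for the biomechanical model
```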