1.
Article in English | MEDLINE | ID: mdl-32396084

ABSTRACT

Single plane wave transmissions are promising for automated imaging tasks requiring high ultrasound frame rates over an extended field of view. However, a single plane wave insonification typically produces suboptimal image quality. To address this limitation, we are exploring the use of deep neural networks (DNNs) as an alternative to delay-and-sum (DAS) beamforming. The objectives of this work are to obtain information directly from raw channel data and to simultaneously generate both a segmentation map for automated ultrasound tasks and a corresponding ultrasound B-mode image for interpretable supervision of the automation. We focus on visualizing and segmenting anechoic targets surrounded by tissue and ignoring or deemphasizing less important surrounding structures. DNNs trained with Field II simulations were tested with simulated, experimental phantom, and in vivo data sets that were not included during training. With unfocused input channel data (i.e., prior to the application of receive time delays), simulated, experimental phantom, and in vivo test data sets achieved mean ± standard deviation Dice similarity coefficients of 0.92 ± 0.13, 0.92 ± 0.03, and 0.77 ± 0.07, respectively, and generalized contrast-to-noise ratios (gCNRs) of 0.95 ± 0.08, 0.93 ± 0.08, and 0.75 ± 0.14, respectively. With subaperture beamformed channel data and a modification to the input layer of the DNN architecture to accept these data, the fidelity of image reconstruction increased (e.g., mean gCNR of multiple acquisitions of two in vivo breast cysts ranged from 0.89 to 0.96), but DNN display frame rates were reduced from 395 to 287 Hz. Overall, the DNNs successfully translated feature representations learned from simulated data to phantom and in vivo data, which is promising for this novel approach to simultaneous ultrasound image formation and segmentation.
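
As a point of reference, the two segmentation-quality metrics quoted above (Dice similarity coefficient and gCNR) can be computed from a predicted binary mask and a reconstructed image roughly as follows; this is a generic NumPy sketch with illustrative array names, not the authors' code.

```python
import numpy as np

def dice_coefficient(pred_mask, true_mask):
    """Dice similarity coefficient between two binary masks."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    denom = pred.sum() + true.sum()
    return 2.0 * intersection / denom if denom > 0 else 1.0

def gcnr(image, target_mask, background_mask, bins=256):
    """Generalized contrast-to-noise ratio: 1 minus the overlap of the
    pixel-amplitude histograms inside and outside the target region."""
    lo = min(image[target_mask].min(), image[background_mask].min())
    hi = max(image[target_mask].max(), image[background_mask].max())
    p_t, edges = np.histogram(image[target_mask], bins=bins, range=(lo, hi))
    p_b, _ = np.histogram(image[background_mask], bins=edges)
    p_t = p_t / p_t.sum()
    p_b = p_b / p_b.sum()
    return 1.0 - np.minimum(p_t, p_b).sum()
```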


Subject(s)
Deep Learning , Image Processing, Computer-Assisted/methods , Ultrasonography/methods , Algorithms , Breast/diagnostic imaging , Breast Neoplasms/diagnostic imaging , Female , Humans , Phantoms, Imaging
2.
IEEE Trans Med Imaging ; 39(5): 1438-1447, 2020 05.
Article in English | MEDLINE | ID: mdl-31689184

ABSTRACT

We present a self-supervised approach to training convolutional neural networks for dense depth estimation from monocular endoscopy data without a priori modeling of anatomy or shading. Our method only requires monocular endoscopic videos and a multi-view stereo method, e.g., structure from motion, to supervise learning in a sparse manner. Consequently, our method requires neither manual labeling nor patient computed tomography (CT) scans in the training and application phases. In a cross-patient experiment using CT scans as ground truth, the proposed method achieved submillimeter mean residual error. In a comparison on in vivo sinus endoscopy data against recent self-supervised depth estimation methods designed for natural video, we demonstrate that the proposed approach outperforms the previous methods by a large margin. The source code for this work is publicly available online at https://github.com/lppllppl920/EndoscopyDepthEstimation-Pytorch.
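
A minimal sketch of the sparse-supervision idea described above, assuming a network that predicts a dense depth map and a sparse depth image rasterized from the SfM reconstruction (PyTorch; variable names are illustrative and this is not the released EndoscopyDepthEstimation-Pytorch code):

```python
import torch

def sparse_depth_loss(pred_depth, sparse_depth, sparse_mask, eps=1e-6):
    """Supervise the dense predicted depth only at pixels where SfM produced a
    reconstructed point. Both depths are rescaled by their medians first,
    because monocular prediction and SfM are each only defined up to scale."""
    pred = pred_depth[sparse_mask]
    gt = sparse_depth[sparse_mask]
    pred = pred / (pred.median() + eps)
    gt = gt / (gt.median() + eps)
    return torch.mean(torch.abs(pred - gt))
```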


Subject(s)
Algorithms , Endoscopy , Humans , Neural Networks, Computer , Supervised Machine Learning , Tomography, X-Ray Computed
3.
Crit Care Med ; 47(9): 1232-1234, 2019 09.
Article in English | MEDLINE | ID: mdl-31162207

ABSTRACT

OBJECTIVES: To compare noninvasive mobility sensor patient motion signature to direct observations by physicians and nurses. DESIGN: Prospective, observational study. SETTING: Academic hospital surgical ICU. PATIENTS AND MEASUREMENTS: A total of 2,426 1-minute clips from six ICU patients (development dataset) and 4,824 1-minute clips from five patients (test dataset). INTERVENTIONS: None. MAIN RESULTS: Noninvasive mobility sensor achieved a minute-level accuracy of 94.2% (2,138/2,272) and an hour-level accuracy of 81.4% (70/86). CONCLUSIONS: The automated noninvasive mobility sensor system represents a significant departure from current manual measurement and reporting used in clinical care, lowering the burden of measurement and documentation on caregivers.


Subject(s)
Early Ambulation/instrumentation , Intensive Care Units/organization & administration , Remote Sensing Technology/instrumentation , Academic Medical Centers , Aged , Aged, 80 and over , Algorithms , Female , Humans , Male , Prospective Studies
4.
Med Image Anal ; 55: 148-164, 2019 07.
Article in English | MEDLINE | ID: mdl-31078111

ABSTRACT

In this paper, we present three deformable registration algorithms designed within a paradigm that uses 3D statistical shape models to accomplish two tasks simultaneously: 1) register point features from previously unseen data to a statistically derived shape (e.g., mean shape), and 2) deform the statistically derived shape to estimate the shape represented by the point features. This paradigm, called the deformable most-likely-point paradigm, is motivated by the idea that generative shape models built from available data can be used to estimate previously unseen data. We developed three deformable registration algorithms within this paradigm using statistical shape models built from reliably segmented objects with correspondences. Results from several experiments show that our algorithms produce accurate registrations and reconstructions in a variety of applications with errors up to CT resolution on medical datasets. Our code is available at https://github.com/AyushiSinha/cisstICP.
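
The shape-estimation half of this paradigm can be illustrated with a generic point-distribution-model fit; the sketch below assumes a PCA-style model (mean shape plus modes of variation) and known point-to-vertex correspondences, and is not the cisstICP implementation.

```python
import numpy as np

def fit_shape_modes(points, mean_shape, modes, n_modes=10):
    """Least-squares estimate of shape-model mode weights given point features
    already matched to model vertices (correspondences assumed known here; a
    full deformable-ICP loop would re-estimate them at every iteration).
    points, mean_shape: (V, 3); modes: (K, V, 3)."""
    A = modes[:n_modes].reshape(n_modes, -1).T        # (3V, n_modes)
    b = (points - mean_shape).reshape(-1)             # residual from the mean shape
    weights, *_ = np.linalg.lstsq(A, b, rcond=None)
    deformed = mean_shape + (A @ weights).reshape(mean_shape.shape)
    return weights, deformed
```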


Subject(s)
Algorithms , Imaging, Three-Dimensional/methods , Nasal Cavity/diagnostic imaging , Pelvis/diagnostic imaging , Radiographic Image Interpretation, Computer-Assisted/methods , Tomography, X-Ray Computed/methods , Turbinates/diagnostic imaging , Computer Simulation , Humans , Models, Statistical
5.
JAMA Netw Open ; 2(4): e191860, 2019 04 05.
Article in English | MEDLINE | ID: mdl-30951163

ABSTRACT

Importance: Competence in cataract surgery is a public health necessity, and videos of cataract surgery are routinely available to educators and trainees but currently are of limited use in training. Machine learning and deep learning techniques can yield tools that efficiently segment videos of cataract surgery into constituent phases for subsequent automated skill assessment and feedback. Objective: To evaluate machine learning and deep learning algorithms for automated phase classification of manually presegmented phases in videos of cataract surgery. Design, Setting, and Participants: This was a cross-sectional study using a data set of videos from a convenience sample of 100 cataract procedures performed by faculty and trainee surgeons in an ophthalmology residency program from July 2011 to December 2017. Demographic characteristics for surgeons and patients were not captured. Ten standard labels in the procedure and 14 instruments used during surgery were manually annotated, which served as the ground truth. Exposures: Five algorithms with different input data: (1) a support vector machine input with cross-sectional instrument label data; (2) a recurrent neural network (RNN) input with a time series of instrument labels; (3) a convolutional neural network (CNN) input with cross-sectional image data; (4) a CNN-RNN input with a time series of images; and (5) a CNN-RNN input with time series of images and instrument labels. Each algorithm was evaluated with 5-fold cross-validation. Main Outcomes and Measures: Accuracy, area under the receiver operating characteristic curve, sensitivity, specificity, and precision. Results: Unweighted accuracy for the 5 algorithms ranged between 0.915 and 0.959. Area under the receiver operating characteristic curve for the 5 algorithms ranged between 0.712 and 0.773, with small differences among them. The area under the receiver operating characteristic curve for the image-only CNN-RNN (0.752) was significantly greater than that of the CNN with cross-sectional image data (0.712) (difference, -0.040; 95% CI, -0.049 to -0.033) and the CNN-RNN with images and instrument labels (0.737) (difference, 0.016; 95% CI, 0.014 to 0.018). While specificity was uniformly high for all phases with all 5 algorithms (range, 0.877 to 0.999), sensitivity ranged between 0.005 (95% CI, 0.000 to 0.015) for the support vector machine for wound closure (corneal hydration) and 0.974 (95% CI, 0.957 to 0.991) for the RNN for main incision. Precision ranged between 0.283 and 0.963. Conclusions and Relevance: Time series modeling of instrument labels and video images using deep learning techniques may yield potentially useful tools for the automated detection of phases in cataract surgery procedures.
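
The area-under-the-curve figures reported for multiclass phase classification are typically macro-averaged over one-vs-rest curves, one per phase; a hedged scikit-learn sketch with hypothetical prediction arrays:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical outputs: per-frame class probabilities over 10 surgical phases
# and the corresponding ground-truth phase indices.
probs = np.random.rand(500, 10)
probs /= probs.sum(axis=1, keepdims=True)
labels = np.random.randint(0, 10, size=500)

# Macro-averaged one-vs-rest AUC across phases (one such figure per algorithm).
auc = roc_auc_score(labels, probs, multi_class="ovr", average="macro")
print(f"macro one-vs-rest AUC = {auc:.3f}")
```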


Subject(s)
Cataract Extraction/instrumentation , Image Processing, Computer-Assisted/methods , Video Recording/methods , Algorithms , Cataract/epidemiology , Cross-Sectional Studies , Deep Learning , Humans , Machine Learning , Neural Networks, Computer , Observational Studies as Topic , Retrospective Studies , Sensitivity and Specificity
6.
IEEE Trans Med Imaging ; 37(10): 2185-2195, 2018 10.
Article in English | MEDLINE | ID: mdl-29993881

ABSTRACT

Functional endoscopic sinus surgery (FESS) is one of the most common outpatient surgical procedures performed in the head and neck region. It is used to treat chronic sinusitis, a disease characterized by inflammation in the nose and surrounding paranasal sinuses, affecting about 15% of the adult population. During FESS, the nasal cavity is visualized using an endoscope, and instruments are used to remove tissues that are often within a millimeter of critical anatomical structures, such as the optic nerve, carotid arteries, and nasolacrimal ducts. To maintain orientation and to minimize the risk of damage to these structures, surgeons use surgical navigation systems to visualize the 3-D position of their tools on patients' preoperative computed tomography (CT) scans. This paper presents an image-based method for enhanced endoscopic navigation. The main contributions are: (1) a system that enables a surgeon to asynchronously register a sequence of endoscopic images to a CT scan with higher accuracy than other reported solutions using no additional hardware; (2) the ability to report the robustness of the registration; and (3) evaluation on in vivo human data. The system also enables the overlay of anatomical structures, visible or occluded, on top of video images. The methods are validated on four different data sets using multiple evaluation metrics. First, for experiments on synthetic data, we observe a mean absolute position error of 0.21 mm and a mean absolute orientation error of 2.8° compared with ground truth. Second, for phantom data, we observe a mean absolute position error of 0.97 mm and a mean absolute orientation error of 3.6° compared with the same motion tracked by an electromagnetic tracker. Third, for cadaver data, we use fiducial landmarks and observe an average reprojection distance error of 0.82 mm. Finally, for in vivo clinical data, we report an average ICP residual error of 0.88 mm in areas that are not composed of erectile tissue and an average ICP residual error of 1.09 mm in areas that are composed of erectile tissue.
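
The position and orientation errors quoted above can be computed from an estimated camera pose and a reference pose roughly as in this generic sketch (not the authors' evaluation code):

```python
import numpy as np

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Absolute position error (same units as t, e.g. mm) and absolute
    orientation error (degrees) between an estimated pose and a reference."""
    pos_err = np.linalg.norm(t_est - t_gt)
    R_delta = R_est @ R_gt.T
    # Rotation angle of the relative rotation, clipped for numerical safety.
    cos_theta = np.clip((np.trace(R_delta) - 1.0) / 2.0, -1.0, 1.0)
    ang_err = np.degrees(np.arccos(cos_theta))
    return pos_err, ang_err
```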


Subject(s)
Imaging, Three-Dimensional/methods , Paranasal Sinuses , Video-Assisted Surgery/methods , Algorithms , Databases, Factual , Humans , Paranasal Sinuses/diagnostic imaging , Paranasal Sinuses/surgery , Phantoms, Imaging , Sinusitis/diagnostic imaging , Sinusitis/surgery , Tomography, X-Ray Computed
7.
IEEE Trans Med Imaging ; 37(6): 1464-1477, 2018 06.
Article in English | MEDLINE | ID: mdl-29870374

ABSTRACT

Interventional applications of photoacoustic imaging typically require visualization of point-like targets, such as the small, circular, cross-sectional tips of needles, catheters, or brachytherapy seeds. When these point-like targets are imaged in the presence of highly echogenic structures, the resulting photoacoustic wave creates a reflection artifact that may appear as a true signal. We propose to use deep learning techniques to identify these types of noise artifacts for removal in experimental photoacoustic data. To achieve this goal, a convolutional neural network (CNN) was first trained to locate and classify sources and artifacts in pre-beamformed data simulated with k-Wave. Simulations initially contained one source and one artifact with various medium sound speeds and 2-D target locations. Based on 3,468 test images, we achieved a 100% success rate in classifying both sources and artifacts. After adding noise to assess potential performance in more realistic imaging environments, we achieved at least 98% success rates for channel signal-to-noise ratios (SNRs) of -9 dB or greater, with a severe decrease in performance below -21 dB channel SNR. We then explored training with multiple sources and two types of acoustic receivers and achieved similar success with detecting point sources. Networks trained with simulated data were then transferred to experimental waterbath and phantom data with 100% and 96.67% source classification accuracy, respectively (particularly when networks were tested at depths that were included during training). The corresponding mean ± one standard deviation of the point source location error was 0.40 ± 0.22 mm and 0.38 ± 0.25 mm for waterbath and phantom experimental data, respectively, which provides some indication of the resolution limits of our new CNN-based imaging system. We finally show that the CNN-based information can be displayed in a novel artifact-free image format, enabling us to effectively remove reflection artifacts from photoacoustic images, which is not possible with traditional geometry-based beamforming.
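
The channel SNR thresholds cited above follow the usual signal-to-noise power ratio in decibels; a small sketch, assuming access to the clean simulated channel data and the noise that was added to it:

```python
import numpy as np

def channel_snr_db(signal, noise):
    """Channel SNR in dB from a clean channel-data array and the additive
    noise array (both the same shape)."""
    p_signal = np.mean(signal.astype(np.float64) ** 2)
    p_noise = np.mean(noise.astype(np.float64) ** 2)
    return 10.0 * np.log10(p_signal / p_noise)
```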


Subject(s)
Deep Learning , Image Processing, Computer-Assisted/methods , Photoacoustic Techniques/methods , Artifacts , Humans
8.
Crit Care Med ; 45(4): 630-636, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28291092

ABSTRACT

OBJECTIVES: To develop and validate a noninvasive mobility sensor to automatically and continuously detect and measure patient mobility in the ICU. DESIGN: Prospective, observational study. SETTING: Surgical ICU at an academic hospital. PATIENTS: Three hundred sixty-two hours of sensor color and depth image data were recorded and curated into 109 segments, each containing 1,000 images, from eight patients. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Three Microsoft Kinect sensors (Microsoft, Beijing, China) were deployed in one ICU room to collect continuous patient mobility data. We developed software that automatically analyzes the sensor data to measure mobility and assign the highest level within a time period. To characterize the highest mobility level, a validated 11-point mobility scale was collapsed into four categories: nothing in bed, in-bed activity, out-of-bed activity, and walking. Of the 109 sensor segments, the noninvasive mobility sensor was developed using 26 segments from three ICU patients and validated on the remaining 83 segments from five different patients. Three physicians annotated each segment for the highest mobility level. The weighted Kappa (κ) statistic for agreement between automated noninvasive mobility sensor output versus manual physician annotation was 0.86 (95% CI, 0.72-1.00). Disagreement primarily occurred in the "nothing in bed" versus "in-bed activity" categories because the sensor assessed movement continuously, which was significantly more sensitive to motion than physician annotations using a discrete manual scale. CONCLUSIONS: The noninvasive mobility sensor is a novel and feasible method for automating evaluation of ICU patient mobility.
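
The weighted Kappa agreement statistic used here is available in scikit-learn; a short sketch, assuming integer-coded mobility categories for sensor output and physician annotation (the choice of linear weighting is an assumption, not taken from the study):

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical per-segment highest-mobility labels, coded 0..3 for
# "nothing in bed", "in-bed activity", "out-of-bed activity", "walking".
sensor_labels    = [0, 1, 1, 2, 3, 0, 2]
physician_labels = [0, 1, 2, 2, 3, 0, 2]

# Linear weighting penalizes disagreements by how far apart the categories are.
kappa = cohen_kappa_score(sensor_labels, physician_labels, weights="linear")
print(f"weighted kappa = {kappa:.2f}")
```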


Subject(s)
Intensive Care Units , Monitoring, Physiologic/methods , Movement , Aged , Algorithms , Female , Humans , Male , Middle Aged , Monitoring, Physiologic/instrumentation , Prospective Studies , Video Recording/instrumentation , Walking
9.
Med Image Comput Comput Assist Interv ; 9900: 482-490, 2016 Oct.
Article in English | MEDLINE | ID: mdl-29170766

ABSTRACT

Throughout a patient's stay in the Intensive Care Unit (ICU), accurate measurement of patient mobility, as part of routine care, is helpful in understanding the harmful effects of bedrest [1]. However, mobility is typically measured through observation by a trained and dedicated observer, which is extremely limiting. In this work, we present a video-based automated mobility measurement system called NIMS (Non-Invasive Mobility Sensor). Our main contributions are: (1) a novel multi-person tracking methodology designed for complex environments with occlusion and pose variations, and (2) an application of human-activity attributes in a clinical setting. We demonstrate NIMS on data collected from an active patient room in an adult ICU and show high inter-rater reliability, with a weighted Kappa statistic of 0.86 for automatic prediction of the highest level of patient mobility compared with clinical experts.


Subject(s)
Actigraphy/instrumentation , Algorithms , Movement , Video Recording , Actigraphy/methods , Adult , Humans , Intensive Care Units , Reproducibility of Results , Sensitivity and Specificity
10.
Article in English | MEDLINE | ID: mdl-29225400

ABSTRACT

Functional Endoscopic Sinus Surgery (FESS) is a challenging procedure for otolaryngologists and is the main surgical approach for treating chronic sinusitis, to remove nasal polyps and open up passageways. To reach the source of the problem and to ultimately remove it, the surgeons must often remove several layers of cartilage and tissues. Often, the cartilage occludes or is within a few millimeters of critical anatomical structures such as nerves, arteries and ducts. To make FESS safer, surgeons use navigation systems that register a patient to his/her CT scan and track the position of the tools inside the patient. Current navigation systems, however, suffer from tracking errors greater than 1 mm, which is large when compared to the scale of the sinus cavities, and errors of this magnitude prevent the accurate overlay of virtual structures on the endoscope images. In this paper, we present a method to facilitate this task by 1) registering endoscopic images to CT data and 2) overlaying areas of interest on endoscope images to improve the safety of the procedure. First, our system uses structure from motion (SfM) to generate a small cloud of 3D points from a short video sequence. Then, it uses the iterative closest point (ICP) algorithm to register the points to a 3D mesh that represents a section of a patient's sinuses. The scale of the point cloud is approximated by measuring the magnitude of the endoscope's motion during the sequence. We have recorded several video sequences from five patients and, given a reasonable initial registration estimate, our results demonstrate an average registration error of 1.21 mm when the endoscope is viewing erectile tissues and an average registration error of 0.91 mm when the endoscope is viewing non-erectile tissues. Our SfM + ICP implementation can execute in less than 7 seconds and can use as few as 15 frames (0.5 seconds of video). Future work will involve clinical validation of our results and strengthening the robustness to initial guesses and erectile tissues.
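
A generic point-to-point ICP loop of the kind described above (an SfM point cloud registered to points sampled from the CT surface mesh) might look like the following; this is an illustrative sketch, not the authors' implementation, and it assumes the point cloud has already been scaled and roughly initialized.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_rigid(source_pts, target_pts, iterations=30):
    """Basic point-to-point ICP: align an (N, 3) source point cloud to an
    (M, 3) set of target surface points. Returns a 4x4 rigid transform."""
    T = np.eye(4)
    src = source_pts.copy()
    tree = cKDTree(target_pts)
    for _ in range(iterations):
        _, idx = tree.query(src)                       # closest-point correspondences
        matched = target_pts[idx]
        # Best rigid transform for the current correspondences (Kabsch algorithm).
        mu_s, mu_t = src.mean(0), matched.mean(0)
        H = (src - mu_s).T @ (matched - mu_t)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T
        t = mu_t - R @ mu_s
        src = src @ R.T + t
        step = np.eye(4)
        step[:3, :3], step[:3, 3] = R, t
        T = step @ T
    return T
```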

11.
Med Image Comput Comput Assist Interv ; 9902: 133-141, 2016 Oct.
Article in English | MEDLINE | ID: mdl-29226285

ABSTRACT

Functional endoscopic sinus surgery (FESS) is a surgical procedure used to treat acute cases of sinusitis and other sinus diseases. FESS is fast becoming the preferred choice of treatment due to its minimally invasive nature. However, due to the limited field of view of the endoscope, surgeons rely on navigation systems to guide them within the nasal cavity. State-of-the-art navigation systems report registration errors of over 1 mm, which is large compared to the size of the nasal airways. We present an anatomically constrained video-CT registration algorithm that incorporates multiple video features. Our algorithm is robust in the presence of outliers. We also test our algorithm on simulated and in vivo data and evaluate its accuracy under degrading initializations.


Subject(s)
Algorithms , Sinusitis/diagnostic imaging , Sinusitis/surgery , Tomography, X-Ray Computed/methods , Video-Assisted Surgery/methods , Endoscopy , Humans , Reproducibility of Results , Sensitivity and Specificity
12.
Article in English | MEDLINE | ID: mdl-29238119

ABSTRACT

We present an automatic segmentation and statistical shape modeling system for the paranasal sinuses which allows us to locate structures in and around the sinuses, as well as to observe the natural variations that occur in these structures. This system involves deformably registering a given patient image to a manually segmented template image, and using the resulting deformation field to transfer labels from the template to the patient. We use 3D snake splines to correct errors in the deformable registration. Once we have several accurately segmented images, we build statistical shape models for each structure in the sinus, allowing us to observe the mean shape of the population as well as the variations observed in the population. These shape models are useful in several ways. First, regular video-CT registration methods are insufficient to accurately register pre-operative computed tomography (CT) images with intra-operative endoscopy video because of deformations that occur in structures containing high amounts of erectile tissue. Our aim is to estimate these deformations using our shape models in order to improve video-CT registration, as well as to distinguish normal variations in anatomy from abnormal variations and to automatically detect and stage pathology. We can also compare the mean shapes and variances of different populations, such as different genders or ethnicities, to observe their differences and similarities, and of different age groups, to observe the developmental changes that occur in the sinuses.
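
A statistical shape model of the kind described here is commonly built by applying PCA to corresponded shapes; a generic sketch under that assumption (not the authors' pipeline):

```python
import numpy as np

def build_shape_model(shapes):
    """Point-distribution model from N corresponded shapes, each (V, 3):
    returns the mean shape, principal modes of variation, and mode variances."""
    X = shapes.reshape(shapes.shape[0], -1)            # (N, 3V)
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    modes = Vt                                         # rows are modes of variation
    variances = (s ** 2) / (X.shape[0] - 1)
    return mean.reshape(-1, 3), modes, variances

def synthesize(mean, modes, variances, weights):
    """New shape instance: mean plus a weighted sum of modes, with weights
    expressed in standard deviations of each mode."""
    m = len(weights)
    offset = (np.asarray(weights) * np.sqrt(variances[:m])) @ modes[:m]
    return mean + offset.reshape(mean.shape)
```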

13.
Comput Assist Robot Endosc (2014) ; 8899: 88-98, 2014 Jan 01.
Article in English | MEDLINE | ID: mdl-26539567

ABSTRACT

Camera motion estimation is a standard yet critical step in endoscopic visualization. It is affected by the variation of locations and correspondences of features detected in 2D images. Feature detectors and descriptors vary, though one of the most widely used remains SIFT. Practitioners usually also adopt its feature matching strategy, which defines inliers as the feature pairs subject to a global affine transformation. However, for endoscopic videos, we are curious whether it is more suitable to cluster features into multiple groups. We can still enforce the same transformation as in SIFT within each group. Such a multi-model idea has been recently examined in the Multi-Affine work, which outperforms Lowe's SIFT in terms of re-projection error on minimally invasive endoscopic images with manually labelled ground-truth matches of SIFT features. Since their difference lies in matching, the accuracy gain of the estimated motion is attributed to the holistic Multi-Affine feature matching algorithm. More concretely, however, the matching criterion and point searching can be the same as those built into SIFT. We argue that the real variation is only the motion model verification: we either enforce a single global motion model or employ a group of multiple local ones. In this paper, we investigate how sensitive the estimated motion is to the number of motion models assumed in feature matching. While the sensitivity can be analytically evaluated, we present an empirical analysis in a leave-one-out cross-validation setting without requiring labels of ground-truth matches. The sensitivity is then characterized by the variance of a sequence of motion estimates. We present a series of quantitative comparisons, such as accuracy and variance, between the Multi-Affine motion models and the global affine model.
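
The baseline single-model pipeline discussed above (Lowe ratio test followed by verification against one global affine transformation) can be sketched with OpenCV as below; the multi-model alternative would cluster the tentative matches and fit one affine model per group. This is an illustrative sketch, not the code evaluated in the paper.

```python
import cv2
import numpy as np

def sift_matches_global_affine(img1, img2, ratio=0.75):
    """SIFT matching with Lowe's ratio test, then a single global affine
    verification step via RANSAC. Returns the 2x3 affine and inlier matches."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    tentative = []
    for pair in matcher.knnMatch(d1, d2, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            tentative.append(pair[0])
    src = np.float32([k1[m.queryIdx].pt for m in tentative])
    dst = np.float32([k2[m.trainIdx].pt for m in tentative])
    A, inlier_mask = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC,
                                          ransacReprojThreshold=3.0)
    inliers = [m for m, keep in zip(tentative, inlier_mask.ravel()) if keep]
    return A, inliers
```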

14.
Med Image Comput Comput Assist Interv ; 15(Pt 2): 592-600, 2012.
Article in English | MEDLINE | ID: mdl-23286097

ABSTRACT

Tool tracking is an accepted capability for computer-aided surgical intervention, with numerous applications in both robotic and manual minimally invasive procedures. In this paper, we describe a tracking system that learns visual feature descriptors as class-specific landmarks on an articulated tool. The features are localized in 3D using stereo vision and are fused with the robot kinematics to track all of the joints of the dexterous manipulator. Experiments are performed using previously collected porcine data from a surgical robot.
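
Localizing a detected landmark in 3D from a calibrated stereo pair, the first step of the tracking described above, can be sketched with OpenCV's triangulation routine; the fusion with robot kinematics (e.g., in a filter) is not shown, and the function and variable names here are illustrative.

```python
import cv2
import numpy as np

def triangulate_landmark(P_left, P_right, uv_left, uv_right):
    """3D position of one tracked feature from its pixel locations in a
    calibrated stereo pair (P_left / P_right are 3x4 projection matrices)."""
    pt4 = cv2.triangulatePoints(P_left, P_right,
                                np.float32(uv_left).reshape(2, 1),
                                np.float32(uv_right).reshape(2, 1))
    return (pt4[:3] / pt4[3]).ravel()    # convert from homogeneous coordinates
```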


Subject(s)
Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Robotics/instrumentation , Robotics/methods , Surgery, Computer-Assisted/methods , Surgical Instruments , Algorithms , Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity , Surgery, Computer-Assisted/instrumentation