Results 1 - 17 of 17
1.
IEEE Trans Pattern Anal Mach Intell ; 45(9): 10615-10631, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37079402

ABSTRACT

Deep convolutional neural networks for dense prediction tasks are commonly optimized using synthetic data, as generating pixel-wise annotations for real-world data is laborious. However, synthetically trained models do not generalize well to real-world environments. We address this poor "synthetic to real" (S2R) generalization through the lens of shortcut learning. We demonstrate that the learning of feature representations in deep convolutional networks is heavily influenced by synthetic data artifacts (shortcut attributes). To mitigate this issue, we propose an Information-Theoretic Shortcut Avoidance (ITSA) approach that automatically restricts shortcut-related information from being encoded into the feature representations. Specifically, our method minimizes the sensitivity of latent features to input variations, regularizing the learning of robust, shortcut-invariant features in synthetically trained models. To avoid the prohibitive computational cost of directly optimizing input sensitivity, we propose a practical and feasible algorithm for achieving this robustness. Our results show that the proposed method effectively improves S2R generalization across multiple distinct dense prediction tasks, including stereo matching, optical flow, and semantic segmentation. Importantly, the proposed method enhances the robustness of synthetically trained networks and outperforms their counterparts fine-tuned on real data in challenging out-of-domain applications.
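
The core regularizer is easy to picture in code. The sketch below is a minimal, hypothetical rendering of the idea of penalizing latent-feature sensitivity to small input perturbations; the names (`encoder`, `lambda_s`) and the Gaussian perturbation scheme are assumptions, not the authors' exact ITSA procedure.

```python
import torch

def sensitivity_penalty(encoder, x, eps=1e-2):
    """Penalize the sensitivity of latent features to small input
    perturbations (the core idea behind shortcut-invariant features).
    Illustrative sketch, not the authors' exact ITSA update."""
    z = encoder(x)
    # Perturb the input with small Gaussian noise.
    x_pert = x + eps * torch.randn_like(x)
    z_pert = encoder(x_pert)
    # Sensitivity ~ feature displacement per unit of input displacement.
    num = (z - z_pert).flatten(1).norm(dim=1)
    den = (x - x_pert).flatten(1).norm(dim=1) + 1e-8
    return (num / den).mean()

# Usage inside a training loop (task_loss is the dense-prediction loss):
# loss = task_loss + lambda_s * sensitivity_penalty(model.encoder, images)
```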

2.
Sci Rep ; 11(1): 7956, 2021 04 12.
Article in English | MEDLINE | ID: mdl-33846450

ABSTRACT

Prostate cancer (PCa) is the second most frequent type of cancer found in men worldwide, with around one in nine men being diagnosed with PCa within their lifetime. PCa often shows no symptoms in its early stages, and its diagnosis techniques are either invasive, resource intensive, or have low efficacy, making widespread early detection onerous. Inspired by the recent success of deep convolutional neural networks (CNN) in computer-aided detection (CADe), we propose a new CNN-based framework for incidental detection of clinically significant prostate cancer (csPCa) in patients who had a CT scan of the abdomen/pelvis for other reasons. While CT is generally considered insufficient for diagnosing PCa due to its inferior soft tissue characterisation, our evaluations on a relatively large dataset consisting of 139 clinically significant PCa patients and 432 controls show that the proposed deep neural network pipeline can detect csPCa patients at a level that is suitable for incidental detection. The proposed pipeline achieved an area under the receiver operating characteristic curve (ROC-AUC) of 0.88 (95% confidence interval: 0.86-0.90) for patient-level csPCa detection on CT, significantly higher than the AUCs achieved by two radiologists (0.61 and 0.70) on the same task.


Subject(s)
Incidental Findings , Prostatic Neoplasms/diagnostic imaging , Tomography, X-Ray Computed , Artifacts , Confidence Intervals , Humans , Male , Middle Aged , Neural Networks, Computer , Prostatic Neoplasms/pathology , ROC Curve
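
For readers reproducing this kind of evaluation, a patient-level ROC-AUC with a 95% confidence interval can be estimated as in the sketch below. The percentile bootstrap shown is a common choice, though the paper's exact CI method is not stated here.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_with_bootstrap_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """ROC-AUC with a percentile-bootstrap confidence interval.
    Illustrative; not necessarily the CI method used in the paper."""
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    aucs, n = [], len(y_true)
    while len(aucs) < n_boot:
        idx = rng.integers(0, n, n)          # resample patients with replacement
        if len(np.unique(y_true[idx])) < 2:  # a resample needs both classes
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return roc_auc_score(y_true, y_score), (lo, hi)
```
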
3.
J Synchrotron Radiat ; 28(Pt 2): 566-575, 2021 Mar 01.
Article in English | MEDLINE | ID: mdl-33650569

ABSTRACT

In recent years, major capability improvements at synchrotron beamlines have given researchers the ability to capture more complex structures at higher resolution within a very short time. This opens up the possibility of studying dynamic processes and observing the resulting structural changes over time. However, such studies can create a huge quantity of 3D image data, which presents a major challenge for segmentation and analysis. Here, tomography experiments at the Australian Synchrotron are examined, which were used to study bread dough formulations during rising and baking, resulting in over 460 individual 3D datasets. The current pipeline for segmentation and analysis involves semi-automated methods using commercial software that require a large amount of user input. This paper focuses on exploring machine learning methods to automate this process. The main challenge is generating adequate training datasets for the machine learning model. Creating training data by manually segmenting real images is very labour-intensive, so methods of automatically creating synthetic training datasets that have the same attributes as the original images have been tested instead. The generated synthetic images are used to train a U-Net model, which is then used to segment the original bread dough images. The trained U-Net outperformed the previously used segmentation techniques while requiring less manual effort. This automated model for data segmentation would alleviate the time-consuming aspects of the experimental workflow and would open the door to 4D characterization experiments with smaller time steps.
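
A toy version of this synthetic-data strategy is sketched below: random dark "bubbles" on a brighter matrix with imaging noise, paired with the ground-truth mask for free. The recipe and parameter values are hypothetical; the paper's generator is tuned to match the attributes of the real tomograms.

```python
import numpy as np

def synthetic_bubble_sample(size=128, n_bubbles=30, noise=0.05, seed=None):
    """Generate one synthetic image/mask pair mimicking gas bubbles in a
    dough matrix. Hypothetical recipe for illustration only."""
    rng = np.random.default_rng(seed)
    img = np.full((size, size), 0.7)           # dough-matrix intensity
    mask = np.zeros((size, size), dtype=np.uint8)
    yy, xx = np.mgrid[:size, :size]
    for _ in range(n_bubbles):
        cy, cx = rng.integers(0, size, 2)
        r = rng.integers(3, 12)
        bubble = (yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2
        img[bubble] = 0.2                      # air appears darker
        mask[bubble] = 1
    img += rng.normal(0, noise, img.shape)     # imaging noise
    return img.astype(np.float32), mask
```

Pairs like these can feed a standard U-Net training loop, with the mask as the segmentation target.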

4.
Article in English | MEDLINE | ID: mdl-32275594

ABSTRACT

This paper presents an innovative method for motion segmentation in RGB-D dynamic videos with multiple moving objects. The focus is on finding static, small, or slow-moving objects (often overlooked by other methods) whose inclusion can improve the motion segmentation results. In our approach, semantic object-based segmentation and motion cues are combined to estimate the number of moving objects and their motion parameters, and to perform segmentation. Selective object-based sampling and correspondence matching are used to estimate object-specific motion parameters. The main issue with such an approach is the over-segmentation of moving parts, because different objects can share the same motion (e.g., background objects). To resolve this issue, we propose to identify objects with similar motions by characterizing each motion by the distribution of a simple metric and using statistical inference to assess their similarities. To demonstrate the significance of the proposed statistical inference, we present an ablation study of SLAM accuracy, with and without the inclusion of static objects, using the TUM RGB-D dataset. To test the effectiveness of the proposed method for finding small or slow-moving objects, we applied the method to the RGB-D MultiBody and SBM-RGBD motion segmentation datasets. The results showed that we can improve the accuracy of motion segmentation for small objects while remaining competitive on overall measures.
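
As an illustration of the inference step, the sketch below groups objects whose motion-metric samples are statistically indistinguishable, using a two-sample Kolmogorov-Smirnov test and union-find merging. The choice of metric and test is an assumption, not necessarily the paper's.

```python
from itertools import combinations
from scipy.stats import ks_2samp

def merge_similar_motions(residuals_per_object, alpha=0.05):
    """Group objects whose motion-metric distributions cannot be told apart
    by a two-sample KS test. Illustrative stand-in for the inference step."""
    n = len(residuals_per_object)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i
    for i, j in combinations(range(n), 2):
        _, p = ks_2samp(residuals_per_object[i], residuals_per_object[j])
        if p > alpha:                       # cannot reject "same motion"
            parent[find(i)] = find(j)       # merge the two objects
    return [find(i) for i in range(n)]      # cluster label per object
```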

5.
Sensors (Basel) ; 20(3)2020 Feb 10.
Article in English | MEDLINE | ID: mdl-32050574

ABSTRACT

One of the core challenges in visual multi-target tracking is occlusion. This is especially important in applications such as video surveillance and sports analytics. While offline batch-processing algorithms can utilise future measurements to handle occlusion effectively, online algorithms have to rely on current and past measurements only. As such, it is markedly more challenging to handle occlusion in online applications. To address this problem, we propagate information over time in a way that generates a sense of déjà vu when similar visual and motion features are observed. To achieve this, we extend the Generalized Labeled Multi-Bernoulli (GLMB) filter, originally designed for tracking point-sized targets, to visual multi-target tracking. The proposed algorithm includes novel false-alarm detection/removal and label recovery methods capable of reliably recovering tracks that are lost, even for a substantial period of time. We compare the performance of the proposed method with state-of-the-art methods on challenging datasets using standard visual tracking metrics. Our comparisons show that the proposed method performs favourably, particularly in terms of the ID-switch and fragmentation metrics, which are indicative of occlusion handling.
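
To make the label-recovery idea concrete, the toy function below matches a re-appearing detection's appearance feature against recently lost tracks by cosine similarity. The data structures and threshold are hypothetical simplifications of the filter's actual mechanism.

```python
import numpy as np

def recover_label(new_feat, lost_tracks, sim_thresh=0.8):
    """Recover the label of a re-appearing target by matching its appearance
    feature to those of recently lost tracks (cosine similarity). Toy
    stand-in for the label-recovery step described above."""
    best_label, best_sim = None, sim_thresh
    v = new_feat / np.linalg.norm(new_feat)
    for label, feat in lost_tracks.items():
        sim = float(v @ (feat / np.linalg.norm(feat)))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label   # None -> treat as a new target (assign a new label)
```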

6.
IEEE Trans Med Imaging ; 39(4): 854-865, 2020 04.
Article in English | MEDLINE | ID: mdl-31425069

ABSTRACT

Volumetric imaging is an essential diagnostic tool for medical practitioners. The use of popular techniques such as convolutional neural networks (CNN) for the analysis of volumetric images is constrained by the availability of detailed (locally annotated) training data and GPU memory. In this paper, the volumetric image classification problem is posed as a multi-instance classification problem, and a novel method is proposed to adaptively select positive instances from positive bags during the training phase. This method uses extreme value theory to model the feature distribution of images without a pathology and uses it to identify positive instances of an imaged pathology. The experimental results, on three separate image classification tasks (i.e., classifying retinal OCT images according to the presence or absence of fluid build-ups, detecting emphysema in pulmonary 3D-CT images, and detecting cancerous regions in 2D histopathology images), show that the proposed method produces classifiers that perform comparably to fully supervised methods and achieves state-of-the-art performance in all examined test cases.


Subject(s)
Deep Learning , Imaging, Three-Dimensional/methods , Tomography, X-Ray Computed/methods , Algorithms , Emphysema/diagnostic imaging , Humans , Lung/diagnostic imaging , Pulmonary Disease, Chronic Obstructive/diagnostic imaging , Supervised Machine Learning
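
A simplified rendering of the extreme-value selection step is given below: fit a generalized extreme value distribution to instance scores from pathology-free data and flag bag instances that fall in its far tail. The score definition and the `tail_prob` cut-off are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import genextreme

def evt_instance_selector(normal_scores, bag_scores, tail_prob=0.05):
    """Fit an extreme-value model to per-instance scores from pathology-free
    data, then flag instances in a positive bag whose scores are unlikely
    under that model. Simplified sketch of the selection idea."""
    params = genextreme.fit(normal_scores)                 # (shape, loc, scale)
    threshold = genextreme.ppf(1.0 - tail_prob, *params)   # tail cut-off
    return np.asarray(bag_scores) > threshold              # mask of positives
```
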
7.
Sensors (Basel) ; 19(17)2019 Sep 01.
Article in English | MEDLINE | ID: mdl-31480502

ABSTRACT

In many multi-object tracking applications, the sensor(s) may have controllable states. Examples include movable sensors in multi-target tracking applications in defence, and unmanned air vehicles (UAVs) used as sensors in multi-object systems for civil applications such as inspection and fault detection. Uncertainties in the number of objects (due to random appearances and disappearances), as well as false alarms and detection uncertainties, collectively make the above problem a highly challenging stochastic sensor control problem. Numerous solutions have been proposed to tackle the problem of precisely controlling sensor(s) for multi-object detection and tracking, and in this work, recent contributions to the domain are comprehensively reviewed. After an introduction, we provide an overview of the sensor control problem and present the key components of sensor control solutions in general. Then, we present a categorization of the existing methods and review the methods under each category. The categorization includes a new generation of solutions, called selective sensor control, recently developed for applications where particular objects of interest need to be accurately detected and tracked by controllable sensors.

8.
Sensors (Basel) ; 19(9)2019 Apr 29.
Article in English | MEDLINE | ID: mdl-31035720

ABSTRACT

This paper presents a novel Track-Before-Detect (TBD) Labeled Multi-Bernoulli (LMB) filter tailored for industrial mobile platform safety applications. At the core of the developed solution are two techniques for fusing color and edge information in visual tracking. We derive an application-specific separable likelihood function that captures the geometric shape of human targets wearing safety vests. We use this novel geometric-shape likelihood along with a color likelihood to devise two Bayesian update steps that fuse shape- and color-related information. One approach is sequential; the other is based on the weighted Kullback-Leibler average (KLA). Experimental results show that the KLA-based fusion variant of the proposed algorithm outperforms both the sequential-update variant and a state-of-the-art method in terms of the performance metrics commonly used in the computer vision literature.
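
For discrete distributions, the weighted KLA reduces to a normalized weighted geometric mean, as the sketch below shows. This illustrates the fusion rule only, not the surrounding LMB filter machinery.

```python
import numpy as np

def weighted_kla(dists, weights):
    """Weighted Kullback-Leibler average of discrete distributions: the
    normalized weighted geometric mean. Sketch of the fusion rule used to
    combine shape- and color-based updates."""
    dists = np.asarray(dists, dtype=float)          # shape (k, n)
    w = np.asarray(weights, dtype=float)[:, None]   # shape (k, 1)
    log_avg = (w * np.log(dists + 1e-300)).sum(axis=0)
    fused = np.exp(log_avg)
    return fused / fused.sum()

# e.g. fused = weighted_kla([p_shape, p_color], [0.5, 0.5])
```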

9.
Sensors (Basel) ; 19(7)2019 Apr 03.
Article in English | MEDLINE | ID: mdl-30987259

ABSTRACT

There is a large body of literature on solving the SLAM problem for various autonomous vehicle applications. A substantial part of the solutions is formulated using statistical (mainly Bayesian) filters such as the Kalman filter and its extended version. In such solutions, the measurements are commonly point features or detections collected by the sensor(s) on board the autonomous vehicle. With the increasing use of scanners on autonomous cars, and the availability of 3D point clouds in real time and at fast rates, it is now possible to use more sophisticated features extracted from the point clouds for filtering. This paper presents the idea of using planar features with multi-object Bayesian filters for SLAM. With Bayesian filters, the first step is prediction, where the object states are propagated to the next time step based on a stochastic transition model. We first present how such a transition model can be developed, and then propose a solution for state prediction. In the simulation studies, using a dataset of measurements acquired from real vehicle sensors, we apply the proposed model to predict the next planar features and vehicle states. The results show reasonable accuracy and efficiency for statistical filtering-based SLAM applications.
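
The deterministic core of such a transition model is plain geometry: a plane n·x = d expressed in the current sensor frame maps under a rigid motion (R, t) as below. This sketch omits the process noise that a stochastic transition model would add.

```python
import numpy as np

def predict_plane(n, d, R, t):
    """Propagate a planar feature (unit normal n, offset d, i.e. n.x = d)
    into the next sensor frame given the predicted vehicle motion (R, t),
    where a point maps as x = R @ x_next + t. Deterministic part only;
    the stochastic model would add process noise to these parameters."""
    n_next = R.T @ n              # rotate the normal into the new frame
    d_next = d - float(n @ t)     # account for the sensor translation
    return n_next, d_next
```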

10.
IEEE Trans Med Imaging ; 38(8): 1858-1874, 2019 08.
Article in English | MEDLINE | ID: mdl-30835214

ABSTRACT

Retinal swelling due to the accumulation of fluid is associated with the most vision-threatening retinal diseases. Optical coherence tomography (OCT) is the current standard of care for assessing the presence and quantity of retinal fluid and for image-guided treatment management. Deep learning methods have made their impact across medical imaging, and many retinal OCT analysis methods have been proposed. However, it is currently not clear how successful they are in interpreting retinal fluid on OCT, owing to the lack of standardized benchmarks. To address this, we organized the RETOUCH challenge in conjunction with MICCAI 2017, with eight teams participating. The challenge consisted of two tasks: fluid detection and fluid segmentation. For the first time, it featured all three retinal fluid types, with annotated images provided by two clinical centers, acquired with devices from the three most common OCT vendors, from patients with two different retinal diseases. The analysis revealed that, in the detection task, the performance of automated fluid detection was within the inter-grader variability. However, in the segmentation task, fusing the automated methods produced segmentations superior to all individual methods, indicating the need for further improvements in segmentation performance.


Subject(s)
Image Interpretation, Computer-Assisted/methods , Retina/diagnostic imaging , Tomography, Optical Coherence/methods , Algorithms , Databases, Factual , Humans , Retinal Diseases/diagnostic imaging
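
The simplest fusion consistent with that observation is a per-voxel majority vote over the participants' binary masks, sketched below; the challenge's actual fusion scheme may have been more elaborate.

```python
import numpy as np

def fuse_segmentations(masks):
    """Fuse binary fluid segmentations from several methods by per-voxel
    majority vote. Illustrative ensemble only; not necessarily the fusion
    used in the challenge analysis."""
    masks = np.stack(masks).astype(np.uint8)    # (k, ...) binary masks
    return (masks.sum(axis=0) * 2 > len(masks)).astype(np.uint8)
```
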
11.
IEEE Trans Image Process ; 27(9): 4182-4194, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29870340

ABSTRACT

Identifying the underlying models in a set of data points contaminated by noise and outliers leads to a highly complex multi-model fitting problem. This problem can be posed as a clustering problem by projecting higher-order affinities between data points onto a graph, which can then be clustered using spectral clustering. Calculating all possible higher-order affinities is computationally expensive; hence, in most cases, only a subset is used. In this paper, we propose an effective sampling method for obtaining a highly accurate approximation of the full graph, which is required to solve multi-structural model fitting problems in computer vision. The proposed method is based on the observation that the usefulness of a graph for segmentation improves as the distribution of the hypotheses used to build the graph approaches the distribution of the actual parameters for the given data. In this paper, we approximate this actual parameter distribution using a k-th-order statistics-based cost function, and the samples are generated using a greedy algorithm coupled with a data sub-sampling strategy. The experimental analysis shows that the proposed method is both accurate and computationally efficient compared with the state-of-the-art robust multi-model fitting techniques. The implementation of the method is publicly available from https://github.com/RuwanT/model-fitting-cbs.
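
As a concrete reading of the cost, the k-th smallest absolute residual of a hypothesis is small exactly when at least k points fit it well, which makes it robust to outliers; the sketch below assumes that interpretation.

```python
import numpy as np

def kth_order_cost(residuals, k):
    """Cost of a model hypothesis as the k-th smallest absolute residual:
    low when at least k points fit well, robust to outliers. Sketch of a
    k-th-order-statistics cost for steering hypothesis sampling."""
    r = np.sort(np.abs(residuals))
    return r[k - 1]

# A hypothesis whose cost falls below the noise scale likely lies near a
# true structure; a greedy sampler can keep such hypotheses and resample
# near their inliers.
```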

12.
IEEE Trans Pattern Anal Mach Intell ; 38(2): 350-62, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26761739

ABSTRACT

Identifying the underlying model in a set of data contaminated by noise and outliers is a fundamental task in computer vision. The cost function associated with such tasks is often highly complex; hence, in most cases, only an approximate solution is obtained by evaluating the cost function at discrete locations in the parameter (hypothesis) space. To be successful, at least one hypothesis has to be in the vicinity of the solution. Due to noise, hypotheses generated by minimal subsets can be far from the underlying model, even when the samples are drawn from the said structure. In this paper, we investigate the feasibility of using higher-than-minimal subset sampling for hypothesis generation. Our empirical studies showed that increasing the sample size beyond the minimal size (p), in particular up to p+2, significantly increases the probability of generating a hypothesis closer to the true model when the subsets are selected from inliers. On the other hand, the probability of selecting an all-inlier sample decreases rapidly with the sample size, making a direct extension of existing methods infeasible. Hence, we propose a new computationally tractable method for robust model fitting that uses higher-than-minimal subsets. Here, one starts from an arbitrary hypothesis (which does not need to be in the vicinity of the solution) and moves until either a structure in the data is found or the process is re-initialized. The method also has the ability to identify when the algorithm has reached a hypothesis with adequate accuracy, and it stops appropriately, saving computational time. The experimental analysis, carried out using synthetic and real data, shows that the proposed method is both accurate and efficient compared with state-of-the-art robust model fitting techniques.
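
A stripped-down version of the "move until a structure is found" idea, for a linear model, is sketched below: repeatedly refit on the m points currently closest to the hypothesis (with m above the minimal size) until the support set stops changing. The names and the convergence rule are illustrative assumptions, not the authors' full algorithm.

```python
import numpy as np

def refine_hypothesis(X, y, idx, m, iters=20):
    """Iteratively refit a linear model on the m points currently closest
    to it, moving an arbitrary initial hypothesis toward a structure in
    the data. Illustrative sketch of higher-than-minimal subset fitting."""
    for _ in range(iters):
        theta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        residuals = np.abs(X @ theta - y)
        new_idx = np.argsort(residuals)[:m]     # m best-fitting points
        if set(new_idx) == set(idx):            # support set stabilized
            break
        idx = new_idx
    return theta, idx
```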

13.
IEEE Trans Med Imaging ; 33(2): 422-32, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24144657

ABSTRACT

Nonrigid image registration techniques using intensity-based similarity measures are widely used in medical imaging applications. Due to the high computational complexity of these techniques, particularly for volumetric images, finding registration methods that both reduce the computational burden and increase the registration accuracy has become an intensive area of research. In this paper, we propose a fast and accurate nonrigid registration method for intra-modality volumetric images. Our approach exploits the information provided by an order-statistics-based segmentation method to find the regions important for registration, and uses an appropriate sampling scheme to target those areas and reduce the registration computation time. A unique advantage of the proposed method is its ability to identify the point of diminishing returns and stop the registration process. Our experiments on the registration of end-inhale to end-exhale lung CT scan pairs, with expert-annotated landmarks, show that the new method is both faster and more accurate than state-of-the-art sampling-based techniques, particularly for the registration of images with large deformations.


Subject(s)
Imaging, Three-Dimensional/methods , Tomography, X-Ray Computed/methods , Algorithms , Exhalation , Humans , Inhalation , Lung/diagnostic imaging
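
The targeted-sampling step can be rendered as weighted sampling without replacement from an importance map, as below. How the map is derived from the order-statistics segmentation is left abstract here, and the function names are assumptions.

```python
import numpy as np

def sample_voxels(importance, n_samples, seed=0):
    """Draw voxel indices with probability proportional to an importance
    map, concentrating the similarity evaluation on regions that matter
    for registration. Illustrative sampling step only."""
    rng = np.random.default_rng(seed)
    p = importance.ravel().astype(float)
    p /= p.sum()
    flat = rng.choice(p.size, size=n_samples, replace=False, p=p)
    return np.unravel_index(flat, importance.shape)
```
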
14.
ScientificWorldJournal ; 2013: 878417, 2013.
Article in English | MEDLINE | ID: mdl-24348191

ABSTRACT

Motion segmentation is an important task in computer vision, and several practical approaches have already been developed. A common approach to motion segmentation is to use the optical flow and formulate the segmentation problem using a linear approximation of the brightness constancy constraints. Although there are numerous solutions to this problem, and their accuracies and reliabilities have been studied, the exact definition of the segmentation problem, its theoretical feasibility, and the conditions for successful motion segmentation are yet to be derived. This paper presents a simplified theoretical framework for predicting the feasibility of segmenting a two-dimensional linear equation system. A statistical definition of a separable motion (structure) is presented, and a relatively straightforward criterion for predicting the separability of two different motions in this framework is derived. The applicability of the proposed criterion for predicting the existence of multiple motions in practice is examined using both synthetic and real image sequences. The prescribed separability criterion is useful in designing computer vision applications, as it is based solely on the amount of relative motion and the scale of measurement noise.


Subject(s)
Models, Theoretical , Motion , Algorithms , Computer Simulation
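
In the spirit of a criterion that depends only on relative motion and noise scale, a toy check might look like the following; the factor k and the norm used are purely hypothetical stand-ins for the paper's derived threshold.

```python
import numpy as np

def motions_separable(u1, u2, sigma, k=3.0):
    """Toy separability check: declare two 2-D motions separable when their
    relative motion exceeds k times the measurement-noise scale.
    Hypothetical simplification of the paper's statistical criterion."""
    return np.linalg.norm(np.asarray(u1) - np.asarray(u2)) > k * sigma
```
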
15.
IEEE Trans Image Process ; 22(6): 2128-37, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23412610

ABSTRACT

Complexities of dynamic volumetric imaging challenge the available computer vision techniques on a number of different fronts. This paper examines the relationship between the estimation accuracy and the required amount of smoothness for a general solution, from a robust statistics perspective. We show that a (surprisingly) small amount of local smoothing is required to satisfy both the necessary and sufficient conditions for accurate optic flow estimation. This notion is called "just enough" smoothing, and its proper implementation has a profound effect on the preservation of local information when processing 3D dynamic scans. To demonstrate the effect of "just enough" smoothing, a robust 3D optic flow method with quantized local smoothing is presented, and the effect of local smoothing on the accuracy of motion estimation in dynamic lung CT images is examined using both synthetic and real image sequences with ground truth.


Subject(s)
Four-Dimensional Computed Tomography/methods , Image Processing, Computer-Assisted/methods , Algorithms , Humans , Lung/diagnostic imaging , Lung/physiology
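
To fix ideas, a least-squares 3-D flow estimate over a deliberately small local window looks like the sketch below (Lucas-Kanade style); the window radius stands in for the "just enough" amount of local support, and the implementation is illustrative rather than the paper's robust estimator.

```python
import numpy as np

def local_flow_3d(I1, I2, center, radius=2):
    """Estimate 3-D optic flow in a small local window by least squares
    on the brightness-constancy constraint grad(I).u = -(I2 - I1).
    Assumes `center` lies away from the volume borders. Sketch only."""
    z, y, x = center
    s = np.s_[z-radius:z+radius+1, y-radius:y+radius+1, x-radius:x+radius+1]
    gz, gy, gx = np.gradient(I1)
    A = np.stack([gz[s].ravel(), gy[s].ravel(), gx[s].ravel()], axis=1)
    b = -(I2[s] - I1[s]).ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow                     # (dz, dy, dx) for this neighbourhood
```
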
16.
Med Image Comput Comput Assist Interv ; 13(Pt 2): 193-200, 2010.
Article in English | MEDLINE | ID: mdl-20879315

ABSTRACT

Emphysema is one of the most widespread diseases in subjects with a smoking history. The gold-standard method for estimating the severity of emphysema is a lung function test, such as forced expiratory volume in the first second (FEV1). However, several clinical studies have shown that chest CT scans offer more sensitive estimates of emphysema progression. The standard CT densitometric score of emphysema is the relative area of voxels below a threshold (RA). The RA score is a global measurement and reflects the overall emphysema progression. In this work, we propose a framework for estimating local emphysema progression from longitudinal chest CT scans. First, images are registered to a common coordinate system, and local image dissimilarities are then computed at corresponding anatomical locations. Finally, the obtained dissimilarity representation is converted into a single emphysema progression score. We applied the proposed algorithm to 27 patients with severe emphysema, with CT scans acquired at five time points: at baseline and after 3, 12, 21, and 24 or 30 months. The results showed consistent emphysema progression over time, and the overall progression score correlates significantly with the increase in the RA score.


Subject(s)
Emphysema/diagnostic imaging , Lung/diagnostic imaging , Pattern Recognition, Automated/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Radiography, Thoracic/methods , Subtraction Technique , Tomography, X-Ray Computed/methods , Algorithms , Disease Progression , Early Diagnosis , Humans , Radiographic Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
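
After registration, the pipeline's last two steps can be caricatured as below: voxel-wise dissimilarity maps reduced to one number per follow-up scan. Mean absolute intensity change is a deliberately simple stand-in for the paper's dissimilarity representation.

```python
import numpy as np

def progression_score(baseline, followups):
    """Aggregate local dissimilarities between a registered baseline scan
    and follow-up scans into one score per time point (mean absolute local
    intensity change). Simplified stand-in for the paper's representation."""
    scores = []
    for scan in followups:
        local_diff = np.abs(scan - baseline)    # voxel-wise dissimilarity
        scores.append(float(local_diff.mean()))
    return scores    # expected to increase with emphysema progression
```
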
17.
IEEE Trans Image Process ; 15(7): 2006-18, 2006 Jul.
Article in English | MEDLINE | ID: mdl-16830920

ABSTRACT

In this paper, we address the problem of recovering the true underlying model of a surface while performing the segmentation. First, to solve the model selection problem, we introduce a novel criterion based on minimising the strain energy of fitted surfaces. We then evaluate its performance and compare it with many other existing model selection techniques. Using this criterion, we present a robust range data segmentation algorithm capable of segmenting complex objects with planar and curved surfaces. The presented algorithm simultaneously identifies the type (order and geometric shape) of each surface and separates all the points that are part of that surface. This paper includes the segmentation results for a large collection of range images obtained from objects with planar and curved surfaces. The resulting segmentation algorithm successfully segments various types of curved objects. More importantly, the new technique is capable of detecting the association between separated parts of a surface that have the same Cartesian equation while segmenting a scene. This aspect is very useful in some industrial applications of range data analysis.


Subject(s)
Algorithms , Artificial Intelligence , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods , Reproducibility of Results , Sensitivity and Specificity , Surface Properties
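
One natural reading of a strain-energy criterion is the thin-plate bending energy of the fitted surface, approximated from second derivatives on a grid as below; whether the paper uses exactly this form is an assumption, and the penalty would be traded off against fit residuals during model selection.

```python
import numpy as np

def strain_energy(fxx, fxy, fyy):
    """Thin-plate bending (strain) energy of a fitted surface, approximated
    from second-derivative grids. Illustrative form only."""
    return float(np.mean(fxx**2 + 2 * fxy**2 + fyy**2))

def select_model(candidates, lam=1.0):
    """Pick the surface model minimizing fit residual plus a strain-energy
    penalty. `candidates` is a list of (residual, (fxx, fxy, fyy)) tuples;
    `lam` is a hypothetical trade-off weight."""
    costs = [r + lam * strain_energy(*d2) for r, d2 in candidates]
    return int(np.argmin(costs))
```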