1.
Article in English | MEDLINE | ID: mdl-38466603

ABSTRACT

Analysis of 3-D texture is indispensable for various tasks, such as retrieval, segmentation, classification, and inspection of sculptures, knit fabrics, and biological tissues. A 3-D texture represents a locally repeated surface variation (SV) that is independent of the overall shape of the surface and can be determined using the local neighborhood and its characteristics. Existing methods mostly employ computer vision techniques that analyze a 3-D mesh globally, derive features, and then utilize them for classification or retrieval tasks. While several traditional and learning-based methods have been proposed in the literature, only a few have addressed 3-D texture analysis, and none have considered unsupervised schemes so far. This article proposes an original framework for the unsupervised segmentation of 3-D texture on the mesh manifold. The problem is approached as a binary surface segmentation task, where the mesh surface is partitioned into textured and nontextured regions without prior annotation. The proposed method comprises a mutual transformer-based system consisting of a label generator (LG) and a label cleaner (LC). Both models take geometric image representations of the surface mesh facets and label them as texture or nontexture using an iterative mutual learning scheme. Extensive experiments on three publicly available datasets with diverse texture patterns demonstrate that the proposed framework outperforms standard and state-of-the-art unsupervised techniques and performs reasonably well compared to supervised methods.

2.
Sci Data; 11(1): 15, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38167525

ABSTRACT

Drone-person tracking in uniform appearance crowds poses unique challenges due to the difficulty in distinguishing individuals with similar attire and multi-scale variations. To address this issue and facilitate the development of effective tracking algorithms, we present a novel dataset named D-PTUAC (Drone-Person Tracking in Uniform Appearance Crowd). The dataset comprises 138 sequences totaling over 121K frames, each manually annotated with bounding boxes and attributes. During dataset creation, we carefully consider 18 challenging attributes encompassing a wide range of viewpoints and scene complexities. These attributes are annotated to facilitate attribute-specific performance analysis. Extensive experiments are conducted using 44 state-of-the-art (SOTA) trackers, and the gap between the performance of these visual object trackers on existing benchmarks and on our proposed dataset demonstrates the need for a dedicated end-to-end aerial visual object tracker that accounts for the inherent properties of the aerial environment.

3.
IEEE Trans Image Process; 33: 241-256, 2024.
Article in English | MEDLINE | ID: mdl-38064329

ABSTRACT

Accurate classification of nuclei communities is an important step towards the timely treatment of cancer spread. Graph theory provides an elegant way to represent and analyze nuclei communities within the histopathological landscape in order to perform tissue phenotyping and tumor profiling tasks. Many researchers have worked on recognizing nuclei regions within histology images in order to grade cancerous progression. However, due to the high structural similarities between nuclei communities, defining a model that can accurately differentiate between nuclei pathological patterns remains an open problem. To surmount this challenge, we present a novel approach, dubbed neural graph refinement, that enhances the capabilities of existing models to perform nuclei recognition tasks by employing graph representational learning and broadcasting processes. Based on the physical interaction of the nuclei, we first construct a fully connected graph in which nodes represent nuclei and adjacent nodes are connected to each other via an undirected edge. For each edge and node pair, appearance and geometric features are computed and then utilized for generating the neural graph embeddings. These embeddings are used for diffusing contextual information to the neighboring nodes, all along a path traversing the whole graph, to infer global information over an entire nuclei network and predict pathologically meaningful communities. Through rigorous evaluation of the proposed scheme across four public datasets, we show that learning such communities through neural graph refinement produces results that outperform state-of-the-art methods.


Subjects
Cell Nucleus; Learning; Histological Techniques
4.
IEEE J Biomed Health Inform; 28(2): 952-963, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37999960

ABSTRACT

Early-stage cancer diagnosis potentially improves the chances of survival for many cancer patients worldwide. Manual examination of Whole Slide Images (WSIs) to analyze the tumor microenvironment is a time-consuming task. To overcome this limitation, the conjunction of deep learning with computational pathology has been proposed to assist pathologists in efficiently prognosing the cancerous spread. Nevertheless, the existing deep learning methods are ill-equipped to handle fine-grained histopathology datasets. This is because these models are constrained by the conventional softmax loss function, which does not drive them to learn distinct representational embeddings of similarly textured WSIs with an imbalanced data distribution. To address this problem, we propose a novel center-focused affinity loss (CFAL) function that 1) constructs uniformly distributed class prototypes in the feature space, 2) penalizes difficult samples, 3) minimizes intra-class variations, and 4) places greater emphasis on learning minority class features. We evaluated the performance of the proposed CFAL function on two publicly available breast and colon cancer datasets with varying levels of class imbalance. The proposed CFAL function shows better discrimination abilities than popular loss functions such as ArcFace, CosFace, and Focal loss. Moreover, it outperforms several SOTA methods for histology image classification across both datasets.


Subjects
Breast; Neoplasms; Humans; Breast/diagnostic imaging; Histological Techniques; Tumor Microenvironment; Neoplasms/diagnostic imaging
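
Note on the baselines named in this record: the abstract does not give the CFAL formula, so the sketch below only illustrates one of the loss functions it is compared against, the focal loss, next to plain cross-entropy. It assumes the commonly used form FL = -(1 - p)^gamma * log(p), where p is the probability assigned to the true class; the default gamma = 2.0 is a conventional choice and not a value taken from the paper.

```python
import math


def cross_entropy(p_true: float) -> float:
    """Cross-entropy for the probability assigned to the true class."""
    return -math.log(p_true)


def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Focal loss: cross-entropy scaled by (1 - p_true) ** gamma.

    gamma = 0 recovers plain cross-entropy; larger gamma shrinks the
    contribution of samples the model already classifies confidently.
    """
    return ((1.0 - p_true) ** gamma) * -math.log(p_true)


if __name__ == "__main__":
    # Illustrative probabilities only (not data from the paper).
    for label, p in (("easy sample", 0.95), ("hard sample", 0.30)):
        print(label, cross_entropy(p), focal_loss(p))
```

The scaling factor is what shifts the training signal toward hard samples, which is the property the abstract lists as item 2 of CFAL; how CFAL itself achieves this is not stated in the record.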
5.
Article in English | MEDLINE | ID: mdl-37021915

ABSTRACT

Automatic tissue classification is a fundamental task in computational pathology for profiling tumor micro-environments. Deep learning has advanced tissue classification performance at the cost of significant computational power. Shallow networks have also been trained end-to-end using direct supervision; however, their performance degrades because they fail to capture tissue heterogeneity robustly. Knowledge distillation has recently been employed to improve the performance of shallow networks, used as student networks, through additional supervision from deep neural networks, used as teacher networks. In the current work, we propose a novel knowledge distillation algorithm to improve the performance of shallow networks for tissue phenotyping in histology images. For this purpose, we propose multi-layer feature distillation such that a single layer in the student network gets supervision from multiple teacher layers. In the proposed algorithm, the sizes of the feature maps of two layers are matched using a learnable multi-layer perceptron. The distance between the feature maps of the two layers is then minimized during the training of the student network. The overall objective function is computed by summing the loss over multiple layer combinations, weighted by a learnable attention-based parameter. The proposed algorithm is named Knowledge Distillation for Tissue Phenotyping (KDTP). Experiments are performed on five different publicly available histology image classification datasets using several teacher-student network combinations within the KDTP algorithm. Our results demonstrate a significant performance increase in the student networks trained with the proposed KDTP algorithm compared to direct supervision-based training methods.
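
Note: the multi-layer distillation objective described above can be sketched roughly as follows. This is not the KDTP implementation. It assumes a single linear map in place of the learnable multi-layer perceptron, squared Euclidean distance as the feature distance, and a softmax over raw scores as the attention weighting; the abstract specifies none of these, and all function names are hypothetical.

```python
import math


def project(features, weights):
    """Linear map standing in for the learnable projection (one row per output dim)."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def squared_distance(a, b):
    """Squared Euclidean distance between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax, used here as the attention weighting."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_feat, teacher_feats, projections, attention_scores):
    """Attention-weighted sum of distances from one student layer to several teacher layers."""
    alphas = softmax(attention_scores)
    loss = 0.0
    for teacher, weights, alpha in zip(teacher_feats, projections, alphas):
        loss += alpha * squared_distance(project(student_feat, weights), teacher)
    return loss


if __name__ == "__main__":
    # Toy values only: one 2-D student feature, two teacher layers of size 3 and 1.
    student = [1.0, 2.0]
    teachers = [[1.0, 2.0, 3.0], [5.0]]
    projections = [[[1, 0], [0, 1], [1, 1]], [[1, 1]]]
    print(distillation_loss(student, teachers, projections, [0.0, 0.0]))
```

In a real training loop the projection weights and attention scores would be learned jointly with the student network; here they are fixed inputs so that the arithmetic stays visible.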

6.
J Digit Imaging; 36(4): 1653-1662, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37059892

ABSTRACT

Tissue phenotyping is a fundamental step in computational pathology for the analysis of the tumor micro-environment in whole slide images (WSIs). Automatic tissue phenotyping in WSIs of colorectal cancer (CRC) assists pathologists in better cancer grading and prognostication. In this paper, we propose a novel algorithm for the identification of distinct tissue components in colon cancer histology images by blending a comprehensive learning system with deep feature extraction. First, we extract features from the pre-trained VGG19 network, which are then transformed into a mapped feature space for generating enhancement nodes. Utilizing both mapped features and enhancement nodes, the proposed algorithm classifies seven distinct tissue components: stroma, tumor, complex stroma, necrotic, normal benign, lymphocytes, and smooth muscle. To validate our proposed model, experiments are performed on two publicly available colorectal cancer histology datasets. We show that our approach surpasses existing state-of-the-art methods by (1.3% AvTP, 2% F1) and (7% AvTP, 6% F1) on CRCD-1 and CRCD-2, respectively.


Subjects
Algorithms; Colorectal Neoplasms; Humans; Learning; Pathologists; Colorectal Neoplasms/diagnostic imaging; Tumor Microenvironment
7.
IEEE Trans Pattern Anal Mach Intell; 45(5): 6552-6574, 2023 May.
Article in English | MEDLINE | ID: mdl-36215368

ABSTRACT

Accurate and robust visual object tracking is one of the most challenging and fundamental computer vision problems. It entails estimating the trajectory of the target in an image sequence, given only its initial location and segmentation, or a rough approximation of these in the form of a bounding box. Discriminative Correlation Filters (DCFs) and deep Siamese Networks (SNs) have emerged as the dominant tracking paradigms and have led to significant progress. Following the rapid evolution of visual object tracking in the last decade, this survey presents a systematic and thorough review of more than 90 DCF and Siamese trackers, based on results in nine tracking benchmarks. First, we present the background theory of both the DCF and Siamese tracking core formulations. Then, we distinguish and comprehensively review the shared as well as specific open research challenges in both these tracking paradigms. Furthermore, we thoroughly analyze the performance of DCF and Siamese trackers on nine benchmarks, covering different experimental aspects of visual tracking: datasets, evaluation metrics, performance, and speed comparisons. We finish the survey by presenting recommendations and suggestions for distinguished open challenges based on our analysis.

8.
Article in English | MEDLINE | ID: mdl-36107888

ABSTRACT

Neuromorphic vision is a bio-inspired technology that has triggered a paradigm shift in the computer vision community and is serving as a key enabler for a wide range of applications. This technology has offered significant advantages, including reduced power consumption, reduced processing needs, and communication speedups. However, neuromorphic cameras suffer from significant amounts of measurement noise. This noise deteriorates the performance of neuromorphic event-based perception and navigation algorithms. In this article, we propose a novel noise filtration algorithm to eliminate events that do not represent real log-intensity variations in the observed scene. We employ a graph neural network (GNN)-driven transformer algorithm, called GNN-Transformer, to classify every active event pixel in the raw stream as either real log-intensity variation or noise. Within the GNN, a message-passing framework, referred to as EventConv, is carried out to reflect the spatiotemporal correlation among the events while preserving their asynchronous nature. We also introduce the known-object ground-truth labeling (KoGTL) approach for generating approximate ground-truth labels of event streams under various illumination conditions. KoGTL is used to generate labeled datasets from experiments recorded in challenging lighting conditions, including moonlight. These datasets are used to train and extensively test our proposed algorithm. When tested on unseen datasets, the proposed algorithm outperforms state-of-the-art methods by at least 8.8% in terms of filtration accuracy. Additional tests are also conducted on publicly available datasets (ETH Zürich Color-DAVIS346 datasets) to demonstrate the generalization capabilities of the proposed algorithm in the presence of illumination variations and different motion dynamics. Qualitative results verified that, compared to state-of-the-art solutions, the proposed algorithm is better able to eliminate noise while preserving meaningful events in the scene.

9.
Med Image Anal; 79: 102480, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35598521

ABSTRACT

Identification of nuclear components in the histology landscape is an important step towards developing computational pathology tools for the profiling of the tumor micro-environment. Most existing methods for the identification of such components are limited in scope due to the heterogeneous nature of the nuclei. Graph-based methods offer a natural way to formulate the nucleus classification problem so as to incorporate both the appearance and the geometric locations of the nuclei. The main challenge is to define models that can handle such an unstructured domain. Current approaches focus on learning better features and then employ well-known classifiers for identifying distinct nuclear phenotypes. In contrast, we propose a message passing network, a fully learnable framework built on a classical network flow formulation. Based on the physical interaction of the nuclei, a nearest neighbor graph is constructed such that the nodes represent the nuclei centroids. For each edge and node, appearance and geometric features are computed, which are then used to construct the messages that diffuse contextual information to the neighboring nodes. Such an algorithm can infer global information over an entire network and predict biologically meaningful nuclear communities. We show that learning such communities improves the performance of the nucleus classification task in histology images. The proposed algorithm can be used as a component in existing state-of-the-art methods, resulting in improved nucleus classification performance across four different publicly available datasets.


Subjects
Histological Techniques; Neural Networks, Computer; Algorithms; Cell Nucleus; Humans
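
Note: the graph construction and neighbor-to-neighbor diffusion described in this record can be illustrated with a minimal sketch. It is not the authors' learnable network: it assumes a symmetric k-nearest-neighbor graph over 2-D centroids, a scalar feature per nucleus, and a fixed averaging rule where the paper uses learned messages. The function names and the `self_weight` parameter are hypothetical.

```python
import math


def knn_graph(centroids, k):
    """Undirected neighbor map linking each centroid to its k nearest centroids."""
    neighbors = {i: set() for i in range(len(centroids))}
    for i, point in enumerate(centroids):
        ranked = sorted(
            (j for j in range(len(centroids)) if j != i),
            key=lambda j: math.dist(point, centroids[j]),
        )
        for j in ranked[:k]:
            neighbors[i].add(j)
            neighbors[j].add(i)  # keep the graph symmetric
    return neighbors


def propagate(features, neighbors, self_weight=0.5):
    """One round: blend each node's feature with the mean of its neighbors' features."""
    updated = []
    for i, value in enumerate(features):
        linked = neighbors[i]
        if not linked:
            updated.append(value)  # isolated node keeps its own feature
            continue
        mean = sum(features[j] for j in linked) / len(linked)
        updated.append(self_weight * value + (1.0 - self_weight) * mean)
    return updated


if __name__ == "__main__":
    # Toy centroids forming two well-separated pairs (not data from the paper).
    centroids = [(0, 0), (1, 0), (10, 0), (11, 0)]
    graph = knn_graph(centroids, k=1)
    print(graph)
    print(propagate([1.0, 0.0, 0.0, 1.0], graph))
```

Repeating `propagate` several times spreads information beyond immediate neighbors, which is the intuition behind inferring network-wide context from local messages.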
10.
Comput Biol Med; 143: 105281, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35139456

ABSTRACT

Nucleus detection is an important step for the analysis of histology images in the field of computational pathology. Pathologists use quantitative nuclear morphology for better cancer grading and prognostication. Nucleus detection is very challenging because of the large morphological variations across different types of nuclei, nuclei clutter, and heterogeneity. To address these challenges, we aim to improve nucleus detection using multi-level feature fusion based on discriminative correlation filters. The proposed algorithm employs multiple feature pools, based on varying feature combinations. Early fusion is employed to integrate multi-feature information within a pool, and inter-pool fusion is proposed to fuse information across multiple pools. Inter-pool consistency is proposed to find the pools that are consistent and complement each other to improve performance. For this purpose, the relative standard deviation is used as an inter-pool consistency measure. Pool robustness to noise is also estimated using relative standard deviation as a robustness measure. High-level pool fusion is proposed using inter-pool consistency and pool-robustness scores. The proposed algorithm facilitates a robust and reliable appearance model for nucleus detection. The proposed algorithm is evaluated on three publicly available datasets and compared with several existing state-of-the-art methods. Our proposed algorithm has consistently outperformed existing methods on a wide range of experiments.
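
Note: the relative standard deviation used above as a consistency and robustness measure is the standard deviation divided by the mean. The sketch below assumes the population standard deviation and a list of per-pool scores as input; the abstract does not say which quantities the measure is applied to or how the scores enter the final fusion, so the inputs shown are purely illustrative.

```python
import statistics


def relative_std(values):
    """Relative standard deviation: population std divided by the absolute mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("relative standard deviation is undefined for a zero mean")
    return statistics.pstdev(values) / abs(mean)


if __name__ == "__main__":
    # Hypothetical response scores from two feature pools.
    consistent_pool = [0.80, 0.82, 0.78]
    erratic_pool = [0.90, 0.50, 0.70]
    print(relative_std(consistent_pool), relative_std(erratic_pool))
```

A lower value means the scores agree more closely relative to their magnitude, which makes the measure comparable across pools whose responses sit on different scales.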

11.
J Digit Imaging; 35(2): 281-301, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35013827

ABSTRACT

Hypertensive retinopathy (HR) refers to changes in the morphological diameter of the retinal vessels due to persistent high blood pressure. Early detection of such changes helps in preventing blindness or even death due to stroke. These changes can be quantified by computing the arteriovenous ratio and the tortuosity severity in the retinal vasculature. This paper presents a decision support system for detecting and grading HR using morphometric analysis of the retinal vasculature, particularly the arteriovenous ratio (AVR) and retinal vessel tortuosity. In the first step, the retinal blood vessels are segmented and classified as arteries and veins. Then, the width of arteries and veins is measured within the region of interest around the optic disk. Next, a new iterative method is proposed to compute the AVR from the caliber measurements of arteries and veins using the Parr-Hubbard and Knudtson methods. Moreover, the retinal vessel tortuosity severity index is computed for each image using 14 tortuosity severity metrics. Finally, a hybrid decision support system is proposed for the detection and grading of HR using the AVR and the tortuosity severity index. Furthermore, we present a new publicly available retinal vessel morphometry (RVM) dataset to evaluate the proposed methodology. The RVM dataset contains 504 retinal images with pixel-level annotations for vessel segmentation, artery/vein classification, and optic disk localization. Image-level labels for the vessel tortuosity index and HR grade are also available. The proposed methods of iterative AVR measurement, tortuosity index, and HR grading are evaluated using the new RVM dataset. The results indicate that the proposed method outperforms existing methods. The presented methodology is a novel advancement in the automated detection and grading of HR, which can potentially be used as a clinical decision support system.


Subjects
Hypertensive Retinopathy; Optic Disk; Humans; Hypertensive Retinopathy/diagnostic imaging; Image Processing, Computer-Assisted/methods; Retinal Vessels/diagnostic imaging
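
Note: the AVR is the ratio of a summary arterial caliber to a summary venous caliber. The sketch below shows the pairwise reduction usually attributed to the Knudtson revised formulas, in which the widest and narrowest vessels are combined repeatedly until one value remains. It is not the new iterative method proposed in the paper. The coefficients 0.88 (arteries) and 0.95 (veins) are the values commonly quoted for those formulas and should be checked against the original publication; the function names and the sample widths are hypothetical.

```python
import math


def knudtson_equivalent(widths, coefficient):
    """Combine widths pairwise (widest with narrowest) until a single value remains."""
    if not widths:
        raise ValueError("at least one vessel width is required")
    values = sorted(widths)
    while len(values) > 1:
        combined = []
        while len(values) > 1:
            narrowest = values.pop(0)
            widest = values.pop()
            combined.append(coefficient * math.sqrt(narrowest ** 2 + widest ** 2))
        combined.extend(values)  # an unpaired middle value is carried to the next round
        values = sorted(combined)
    return values[0]


def arteriovenous_ratio(artery_widths, vein_widths):
    """AVR as the arterial equivalent divided by the venous equivalent."""
    crae = knudtson_equivalent(artery_widths, 0.88)  # assumed arterial coefficient
    crve = knudtson_equivalent(vein_widths, 0.95)    # assumed venous coefficient
    return crae / crve


if __name__ == "__main__":
    # Hypothetical widths in pixels for the six largest arteries and veins.
    arteries = [12.0, 11.5, 10.8, 10.1, 9.6, 9.0]
    veins = [15.2, 14.8, 14.1, 13.5, 12.9, 12.2]
    print(arteriovenous_ratio(arteries, veins))
```

Because the result is a ratio of two calibers measured in the same image, it does not depend on the physical size of a pixel, which is one reason the AVR is convenient for comparing images.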
12.
IEEE Trans Cybern; 52(11): 12259-12274, 2022 Nov.
Article in English | MEDLINE | ID: mdl-34232902

ABSTRACT

Visual object tracking is a fundamental and challenging task in many high-level vision and robotics applications. It is typically formulated by estimating the target appearance model between consecutive frames. Discriminative correlation filters (DCFs) and their variants have achieved promising speed and accuracy for visual tracking in many challenging scenarios. However, because of unwanted boundary effects and a lack of geometric constraints, these methods suffer from performance degradation. In the current work, we propose hierarchical spatiotemporal graph-regularized correlation filters for robust object tracking. The target sample is decomposed into a large number of deep channels, which are then used to construct a spatial graph such that each graph node corresponds to a particular target location across all channels. Such a graph effectively captures the spatial structure of the target object. In order to capture the temporal structure of the target object, the information in the deep channels obtained from a temporal window is compressed using principal component analysis, and a temporal graph is then constructed such that each graph node corresponds to a particular target location in the temporal dimension. Both spatial and temporal graphs span different subspaces such that the target and the background become linearly separable. The learned correlation filter is constrained to act as an eigenvector of the Laplacian of these spatiotemporal graphs. We propose a novel objective function that incorporates these spatiotemporal constraints into the DCF framework. We solve the objective function using the alternating direction method of multipliers such that each subproblem has a closed-form solution. We evaluate our proposed algorithm on six challenging benchmark datasets and compare it with 33 existing state-of-the-art trackers. Our results demonstrate excellent performance of the proposed algorithm compared to the existing trackers.
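
Note: the graph Laplacian that appears in the constraint above can be made concrete with a small sketch. It assumes the unnormalized Laplacian L = D - W built from a symmetric weight matrix; the abstract does not say which Laplacian variant the authors use, and the code shows only the Laplacian and its quadratic form, not the eigenvector constraint or the tracking objective. Function names are hypothetical.

```python
def laplacian(weights):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix."""
    n = len(weights)
    return [
        [(sum(weights[i]) if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


def smoothness(signal, lap):
    """Quadratic form x^T L x; small values mean the signal varies little across edges."""
    size = len(signal)
    return sum(
        signal[i] * lap[i][j] * signal[j]
        for i in range(size)
        for j in range(size)
    )


if __name__ == "__main__":
    # Path graph on three nodes: 0 - 1 - 2, unit edge weights.
    path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    lap = laplacian(path)
    print(smoothness([1.0, 1.0, 1.0], lap))   # constant signal
    print(smoothness([1.0, 0.0, -1.0], lap))  # signal that changes across each edge
```

For a symmetric W the quadratic form equals the sum over edges of w_ij * (x_i - x_j)^2, so penalizing it favors filters whose values agree at strongly connected locations.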

13.
IEEE Trans Pattern Anal Mach Intell; 44(5): 2485-2503, 2022 May.
Article in English | MEDLINE | ID: mdl-33296300

ABSTRACT

Moving Object Segmentation (MOS) is a fundamental task in computer vision. Due to undesirable variations in the background scene, MOS becomes very challenging for static and moving camera sequences. Several deep learning methods have been proposed for MOS with impressive performance. However, these methods show performance degradation in the presence of unseen videos, and deep learning models usually require large amounts of data to avoid overfitting. Recently, graph learning has attracted significant attention in many computer vision applications since it provides tools to exploit the geometrical structure of data. In this work, concepts of graph signal processing are introduced for MOS. First, we propose a new algorithm that is composed of segmentation, background initialization, graph construction, unseen sampling, and a semi-supervised learning method inspired by the theory of recovery of graph signals. Second, theoretical developments are introduced, showing one bound for the sample complexity in semi-supervised learning and two bounds for the condition number of the Sobolev norm. Our algorithm has the advantage of requiring less labeled data than deep learning methods while achieving competitive results on both static and moving camera videos. Our algorithm is also adapted for Video Object Segmentation (VOS) tasks and is evaluated on six publicly available datasets, outperforming several state-of-the-art methods in challenging conditions.

14.
Med Image Anal; 72: 102104, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34242872

ABSTRACT

Nucleus detection in histology images is a fundamental step for cellular-level analysis in computational pathology. In clinical practice, quantitative nuclear morphology can be used for diagnostic decision making, prognostic stratification, and treatment outcome prediction. Nucleus detection is a challenging task because of large variations in the shape of different types of nuclei, as well as nuclear clutter, heterogeneous chromatin distribution, and irregular and fuzzy boundaries. To address these challenges, we aim to accurately detect nuclei with spatially constrained context-aware correlation filters that use hierarchical deep features extracted from multiple layers of a pre-trained network. During training, we extract contextual patches around each nucleus, which are used as negative examples, while the actual nucleus patch is used as a positive example. In order to spatially constrain the correlation filters, we propose to construct a spatial structural graph across different nucleus components, encoding pairwise similarities. The correlation filters are constrained to act as eigenvectors of the Laplacian of the spatial graphs, forcing them to capture the nucleus structure. A novel objective function is proposed by embedding graph-based structural information as well as contextual information within the discriminative correlation filter framework. The learned filters are constrained to be orthogonal to both the contextual patches and the spatial graph-Laplacian basis to improve localization and discriminative performance. The proposed objective function trains a hierarchy of correlation filters on different deep feature layers to capture the heterogeneity in nuclear shape and texture. The proposed algorithm is evaluated on three publicly available datasets and compared with 15 current state-of-the-art methods, demonstrating competitive performance in terms of accuracy, speed, and generalization.


Subjects
Histological Techniques; Neural Networks, Computer; Algorithms; Cell Nucleus; Humans; Prognosis
15.
Article in English | MEDLINE | ID: mdl-32966218

ABSTRACT

In computational pathology, automated tissue phenotyping in cancer histology images is a fundamental tool for profiling tumor microenvironments. Current tissue phenotyping methods use features derived from image patches, which may not carry biological significance. In this work, we propose a novel multiplex cellular community-based algorithm for tissue phenotyping that integrates cell-level features within a graph-based hierarchical framework. We demonstrate that such integration offers better performance than prior deep learning and texture-based methods, as well as cellular community-based methods using uniplex networks. To this end, we construct cell-level graphs using texture, alpha diversity, and multi-resolution deep features. Using these graphs, we compute cellular connectivity features, which are then employed for the construction of a patch-level multiplex network. Over this network, we compute multiplex cellular communities using a novel objective function. The proposed objective function computes a low-dimensional subspace from each cellular network and subsequently seeks a common low-dimensional subspace using the Grassmann manifold. We evaluate our proposed algorithm on three publicly available datasets for tissue phenotyping, demonstrating a significant improvement over existing state-of-the-art methods.

16.
Med Image Anal; 63: 101696, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32330851

ABSTRACT

Classification of various types of tissue in cancer histology images based on their cellular composition is an important step towards the development of computational pathology tools for systematic digital profiling of the spatial tumor microenvironment. Most existing methods for tissue phenotyping are limited to the classification of tumor and stroma and require large amounts of annotated histology images, which are often not available. In the current work, we pose the problem of identifying distinct tissue phenotypes as finding communities in cellular graphs or networks. First, we train a deep neural network for cell detection and classification into five distinct cellular components. Considering the detected nuclei as nodes, potential cell-cell connections are assigned using Delaunay triangulation, resulting in a cell-level graph. Based on this cell graph, a feature vector capturing the potential cell-cell connections between different types of cells is computed. These feature vectors are used to construct a patch-level graph based on chi-square distance. We map patch-level nodes to the geometric space by representing each node as a vector of geodesic distances from other nodes in the network and iteratively drifting the patch nodes in the direction of positive density gradients towards maximum density regions. The proposed algorithm is evaluated on a publicly available dataset and on a new large-scale dataset consisting of 280K patches of seven tissue phenotypes. The estimated communities have significant biological meaning, as verified by expert pathologists. A comparison with current state-of-the-art methods reveals a significant performance improvement in tissue phenotyping.


Subjects
Colorectal Neoplasms; Neural Networks, Computer; Algorithms; Cell Nucleus; Colorectal Neoplasms/diagnostic imaging; Histological Techniques; Humans; Tumor Microenvironment
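
Note: the chi-square distance used in this record to compare patch-level feature vectors can be sketched as below. It assumes the symmetric form with a factor of 0.5, one of several conventions in use, and a small `eps` term to avoid division by zero when both entries are zero. The abstract does not state which variant the authors use, and the histograms shown are hypothetical.

```python
def chi_square_distance(p, q, eps=1e-12):
    """Symmetric chi-square distance between two non-negative feature vectors."""
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(p, q))


if __name__ == "__main__":
    # Hypothetical normalized counts of cell-cell connection types in two patches.
    patch_a = [0.50, 0.30, 0.20]
    patch_b = [0.45, 0.35, 0.20]
    patch_c = [0.05, 0.15, 0.80]
    print(chi_square_distance(patch_a, patch_b))
    print(chi_square_distance(patch_a, patch_c))
```

Each squared difference is divided by the size of the entries involved, so a given absolute change in a rare connection type counts for more than the same change in a common one.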
17.
Neural Netw; 117: 8-66, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31129491

ABSTRACT

Conventional neural networks have been demonstrated to be a powerful framework for background subtraction in video acquired by static cameras. Indeed, the well-known Self-Organizing Background Subtraction (SOBS) method and its variants based on neural networks were for a long time the leading methods on the large-scale CDnet 2012 dataset. Convolutional neural networks, which are used in deep learning, have recently been employed extensively for background initialization, foreground detection, and deep learned features. The top background subtraction methods currently used in CDnet 2014 are based on deep neural networks, and have demonstrated a large performance improvement in comparison to conventional unsupervised approaches based on multi-feature or multi-cue strategies. Furthermore, since the seminal work of Braham and Van Droogenbroeck in 2016, a large number of studies on convolutional neural networks applied to background subtraction have been published, and a continual gain in performance has been achieved. In this context, we provide the first review of deep neural network concepts in background subtraction for novices and experts in order to analyze this success and to provide further directions. To do so, we first survey the background initialization and background subtraction methods based on deep neural network concepts, as well as deep learned features. We then discuss the adequacy of deep neural networks for the task of background subtraction. Finally, experimental results are presented for the CDnet 2014 dataset.


Subjects
Deep Learning/standards
18.
Article in English | MEDLINE | ID: mdl-30296231

ABSTRACT

Moving object detection is a fundamental step in various computer vision applications. Robust Principal Component Analysis (RPCA) based methods have often been employed for this task. However, the performance of these methods deteriorates in the presence of dynamic background scenes, camera jitter, camouflaged moving objects, and/or variations in illumination. This is because of an underlying assumption that the elements in the sparse component are mutually independent, and thus the spatiotemporal structure of the moving objects is lost. To address this issue, we propose a spatiotemporal structured sparse RPCA algorithm for moving object detection, where we impose spatial and temporal regularization on the sparse component in the form of graph Laplacians. Each Laplacian corresponds to a multi-feature graph constructed over superpixels in the input matrix. We enforce the sparse component to act as eigenvectors of the spatial and temporal graph Laplacians while minimizing the RPCA objective function. These constraints incorporate a spatiotemporal subspace structure within the sparse component. Thus, we obtain a novel objective function for separating moving objects in the presence of complex backgrounds. The proposed objective function is solved using batch optimization based on a linearized alternating direction method of multipliers. Moreover, we also propose an online optimization algorithm for real-time applications. We evaluated both the batch and online solutions using six publicly available datasets that included most of the aforementioned challenges. Our experiments demonstrated the superior performance of the proposed algorithms compared with the current state-of-the-art methods.
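
Note: RPCA separates a data matrix into a low-rank part (background) and a sparse part (moving objects). Solvers based on the alternating direction method of multipliers commonly update the sparse part with element-wise soft-thresholding, the proximal operator of the L1 norm. The sketch below shows only that standard building block. It does not include the graph-Laplacian regularization that is the contribution of this paper, and the abstract does not state that the authors' updates take this exact form. Function names and the threshold value are hypothetical.

```python
def soft_threshold(value, tau):
    """Shrink a scalar toward zero by tau; values within [-tau, tau] become zero."""
    if value > tau:
        return value - tau
    if value < -tau:
        return value + tau
    return 0.0


def shrink(matrix, tau):
    """Apply soft-thresholding to every entry of a matrix given as a list of rows."""
    return [[soft_threshold(entry, tau) for entry in row] for row in matrix]


if __name__ == "__main__":
    # Toy residual between a frame and its background estimate (illustrative values).
    residual = [[3.0, -0.2], [0.1, -4.0]]
    print(shrink(residual, tau=1.0))
```

Small residuals, which are more likely to be noise or mild background change, are set to zero, while large residuals survive with reduced magnitude; this is what makes the recovered foreground component sparse.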

19.
IEEE Trans Image Process; 26(12): 5840-5854, 2017 Dec.
Article in English | MEDLINE | ID: mdl-28866495

ABSTRACT

Background estimation and foreground segmentation are important steps in many high-level vision tasks. Many existing methods estimate the background as a low-rank component and the foreground as a sparse matrix without incorporating structural information. Therefore, these algorithms exhibit degraded performance in the presence of dynamic backgrounds, photometric variations, jitter, shadows, and large occlusions. We observe that these backgrounds often span multiple manifolds. Therefore, constraints that ensure continuity on those manifolds will result in better background estimation. Hence, we propose to incorporate spatial and temporal sparse subspace clustering into the robust principal component analysis (RPCA) framework. To that end, we compute a spatial and a temporal graph for a given sequence using a motion-aware correlation coefficient. The information captured by both graphs is utilized by estimating the proximity matrices using both the normalized Euclidean and geodesic distances. The low-rank component must be able to efficiently partition the spatiotemporal graphs using these Laplacian matrices. Embedded in the RPCA objective function, these Laplacian matrices constrain the background model to be spatially and temporally consistent, on both linear and nonlinear manifolds. The solution of the proposed objective function is computed using the linearized alternating direction method with an adaptive penalty optimization scheme. Experiments are performed on challenging sequences from five publicly available datasets and are compared with 23 existing state-of-the-art methods. The results demonstrate excellent performance of the proposed algorithm for both background estimation and foreground segmentation.
