1.
Sensors (Basel) ; 23(18), 2023 Sep 11.
Article in English | MEDLINE | ID: mdl-37765864

ABSTRACT

Long-range target detection in thermal infrared imagery is a challenging research problem due to the low resolution and limited detail captured by thermal sensors. The limited size and variability in thermal image datasets for small target detection is also a major constraint for the development of accurate and robust detection algorithms. To address both the sensor and data constraints, we propose a novel convolutional neural network (CNN) feature extraction architecture designed for small object detection in data-limited settings. More specifically, we focus on long-range ground-based thermal vehicle detection, but also show the effectiveness of the proposed algorithm on drone and satellite aerial imagery. The design of the proposed architecture is inspired by an analysis of popular object detectors as well as custom-designed networks. We find that restricted receptive fields (rather than more globalized features, as is the trend), along with less downsampling of feature maps and attenuated processing of fine-grained features, lead to greatly improved detection rates while mitigating the model's capacity to overfit on small or poorly varied datasets. Our approach achieves state-of-the-art results on the Defense Systems Information Analysis Center (DSIAC) automated target recognition (ATR) and the Tiny Object Detection in Aerial Images (AI-TOD) datasets.
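
The effect of less downsampling on receptive field size can be checked with the standard receptive-field recurrence for stacked convolutions. The sketch below is illustrative only: the two layer stacks are hypothetical and are not the architectures evaluated in the paper, and dilation is assumed to be 1.

```python
from typing import List, Tuple


def receptive_field(layers: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (receptive_field, cumulative_stride) for a stack of layers.

    Each layer is given as (kernel_size, stride); dilation is assumed to be 1.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) input-space steps
        jump *= stride             # striding spreads later taps further apart in the input
    return rf, jump


# Hypothetical stacks, chosen only to contrast the two design styles.
aggressive = [(3, 2), (3, 2), (3, 2), (3, 2)]  # four stride-2 3x3 layers
restrained = [(3, 1), (3, 1), (3, 2), (3, 1)]  # a single downsampling step

print(receptive_field(aggressive))  # (31, 16)
print(receptive_field(restrained))  # (11, 2)
```

With the same number of 3x3 layers, the stack with a single stride-2 step sees an 11-pixel window at 1/2 resolution, while the heavily strided stack sees 31 pixels at 1/16 resolution, which leaves very few feature-map cells for a target only a few pixels wide.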

2.
IEEE J Biomed Health Inform ; 26(4): 1614-1627, 2022 Apr.
Article in English | MEDLINE | ID: mdl-34516380

ABSTRACT

Optical coherence tomography (OCT) has been identified as a non-invasive and inexpensive imaging modality to discover potential biomarkers for Alzheimer's diagnosis and progress determination. Current hypotheses presume the thickness of the retinal layers, which are analyzable within OCT scans, as an effective biomarker for the presence of Alzheimer's. As a logical first step, this work concentrates on the accurate segmentation of retinal layers to isolate the layers for further analysis. This paper proposes a generative adversarial network (GAN) that concurrently learns to increase the image resolution for higher clarity and then segment the retinal layers. We propose a multi-stage and multi-discriminatory generative adversarial network (MultiSDGAN) specifically for superresolution and segmentation of OCT scans of the retinal layer. The resulting generator is adversarially trained against multiple discriminator networks at multiple stages. We aim to avoid early saturation of generator model training leading to poor segmentation accuracies and enhance the process of OCT domain translation by satisfying all the discriminators in multiple scales. We also investigated incorporating the Dice loss and Structural Similarity Index Measure (SSIM) as additional loss functions to specifically target and improve our proposed GAN architecture's segmentation and superresolution performance, respectively. The ablation study results conducted on our data set suggest that the proposed MultiSDGAN with ten-fold cross-validation (10-CV) provides a reduced equal error rate with 44.24% and 34.09% relative improvements, respectively (p-values of the improvement level tests < .01). Furthermore, our experimental results also demonstrate that the addition of the new terms to the loss function improves the segmentation results significantly by relative improvements of 31.33% (p-value < .01).
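
The Dice loss mentioned in the abstract above is commonly defined as one minus the soft Dice overlap between a predicted mask and the ground-truth mask. The following is a minimal sketch of that common definition on flattened masks; the function names and the smoothing constant `eps` are assumptions made for illustration, and the paper's exact formulation may differ.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[float], target: Sequence[float],
                     eps: float = 1e-6) -> float:
    """Soft Dice overlap between two flattened masks with values in [0, 1]."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (total + eps)


def dice_loss(pred: Sequence[float], target: Sequence[float],
              eps: float = 1e-6) -> float:
    """Loss that decreases as the overlap between the two masks grows."""
    return 1.0 - dice_coefficient(pred, target, eps)


# Toy example: two pixels predicted as layer, one of them correct.
print(dice_loss([1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]))
```

Because the score is a ratio of overlap to total mask area, it is less dominated by the large background region than a per-pixel loss, which is one usual motivation for using it in thin-layer segmentation.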


Subjects
Alzheimer Disease , Tomography, Optical Coherence , Alzheimer Disease/diagnostic imaging , Humans , Image Processing, Computer-Assisted/methods , Retina/diagnostic imaging

3.
Article in English | MEDLINE | ID: mdl-31331886

ABSTRACT

In this paper, we propose a novel deep sparse coding network (SCN) capable of efficiently adapting its own regularization parameters for a given application. The network is trained end-to-end with a supervised task-driven learning algorithm via error backpropagation. During training, the network learns both the dictionaries and the regularization parameters of each sparse coding layer so that the reconstructive dictionaries are smoothly transformed into increasingly discriminative representations. In addition, the adaptive regularization also offers the network more flexibility to adjust sparsity levels. Furthermore, we have devised a sparse coding layer utilizing a 'skinny' dictionary. Integral to computational efficiency, these skinny dictionaries compress the high dimensional sparse codes into lower dimensional structures. The adaptivity and discriminability of our fifteen-layer sparse coding network are demonstrated on five benchmark datasets, namely CIFAR-10, CIFAR-100, STL-10, SVHN and MNIST, most of which are considered difficult for sparse coding models. Experimental results show that our architecture overwhelmingly outperforms traditional one-layer sparse coding architectures while using far fewer parameters. Moreover, our multilayer architecture exploits the benefits of depth with sparse coding's characteristic ability to operate on smaller datasets. In such data-constrained scenarios, our technique demonstrates highly competitive performance compared with deep neural networks.
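
To illustrate how a regularization parameter controls the sparsity level, the sketch below applies the soft-thresholding operator that arises in L1-regularized sparse coding. That the network in the abstract above uses an L1 penalty is an assumption made here for illustration; the abstract does not state the penalty, and the values are toy data.

```python
import math
from typing import List, Sequence


def soft_threshold(values: Sequence[float], lam: float) -> List[float]:
    """Shrink each coefficient toward zero by lam; anything within lam becomes zero."""
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in values]


def count_nonzero(values: Sequence[float]) -> int:
    """Number of active (nonzero) coefficients in a code."""
    return sum(1 for v in values if v != 0.0)


# Toy coefficients; a larger regularization parameter yields a sparser code.
coeffs = [0.5, -0.2, 1.5, -0.9]
for lam in (0.1, 0.3, 1.0):
    print(lam, count_nonzero(soft_threshold(coeffs, lam)))
```

A layer that learns `lam` rather than fixing it can therefore trade reconstruction fidelity against sparsity on its own, which is the flexibility the abstract attributes to adaptive regularization.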

4.
IEEE Trans Image Process ; 27(11): 5214-5224, 2018 Nov.
Article in English | MEDLINE | ID: mdl-29994676

ABSTRACT

Domain adaptation is a promising technique when addressing limited or no labeled target data by borrowing well-labeled knowledge from the auxiliary source data. Recently, researchers have exploited multi-layer structures for discriminative feature learning to reduce the domain discrepancy. However, there are limited research efforts on simultaneously building a deep structure and a discriminative classifier over both labeled source and unlabeled target. In this paper, we propose a semi-supervised deep domain adaptation framework, in which the multi-layer feature extractor and a multi-class classifier are jointly learned to benefit from each other. Specifically, we develop a novel semi-supervised class-wise adaptation manner to fight off the conditional distribution mismatch between two domains by assigning a probabilistic label to each target sample, i.e., multiple class labels with different probabilities. Furthermore, a multi-class classifier is simultaneously trained on labeled source and unlabeled target samples in a semi-supervised fashion. In this way, the deep structure can formally alleviate the domain divergence and enhance the feature transferability. Experimental evaluations on several standard cross-domain benchmarks verify the superiority of our proposed approach.

5.
IEEE Trans Image Process ; 25(1): 24-38, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26540686

ABSTRACT

Dictionary learning algorithms have been successfully used for both reconstructive and discriminative tasks, where an input signal is represented with a sparse linear combination of dictionary atoms. While these methods are mostly developed for single-modality scenarios, recent studies have demonstrated the advantages of feature-level fusion based on the joint sparse representation of the multimodal inputs. In this paper, we propose a multimodal task-driven dictionary learning algorithm under the joint sparsity constraint (prior) to enforce collaborations among multiple homogeneous/heterogeneous sources of information. In this task-driven formulation, the multimodal dictionaries are learned simultaneously with their corresponding classifiers. The resulting multimodal dictionaries can generate discriminative latent features (sparse codes) from the data that are optimized for a given task such as binary or multiclass classification. Moreover, we present an extension of the proposed formulation using a mixed joint and independent sparsity prior, which facilitates more flexible fusion of the modalities at feature level. The efficacy of the proposed algorithms for multimodal classification is illustrated on four different applications: multimodal face recognition, multi-view face recognition, multi-view action recognition, and multimodal biometric recognition. It is also shown that, compared with the counterpart reconstructive-based dictionary learning algorithms, the task-driven formulations are more computationally efficient in the sense that they can be equipped with more compact dictionaries and still achieve superior performance.


Subjects
Image Processing, Computer-Assisted/methods , Machine Learning , Pattern Recognition, Automated/methods , Algorithms , Biometric Identification , Face/anatomy & histology , Humans

6.
IEEE Trans Cybern ; 45(3): 576-87, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25014986

ABSTRACT

Advances in acoustic sensing have enabled the simultaneous acquisition of multiple measurements of the same physical event via co-located acoustic sensors. We exploit the inherent correlation among such multiple measurements for acoustic signal classification, to identify the launch/impact of munition (i.e., rockets, mortars). Specifically, we propose a probabilistic graphical model framework that can explicitly learn the class conditional correlations between the cepstral features extracted from these different measurements. Additionally, we employ symbolic dynamic filtering-based features, which offer improvements over the traditional cepstral features in terms of robustness to signal distortions. Experiments on real acoustic data sets show that our proposed algorithm outperforms conventional classifiers as well as the recently proposed joint sparsity models for multisensor acoustic classification. Additionally, our proposed algorithm is less sensitive to a shortage of training samples than competing approaches.

7.
IEEE Trans Pattern Anal Mach Intell ; 36(1): 113-26, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24231870

ABSTRACT

Traditional biometric recognition systems rely on a single biometric signature for authentication. While the advantage of using multiple sources of information for establishing the identity has been widely recognized, computational models for multimodal biometrics recognition have only recently received attention. We propose a multimodal sparse representation method, which represents the test data by a sparse linear combination of training data, while constraining the observations from different modalities of the test subject to share their sparse representations. Thus, we simultaneously take into account correlations as well as coupling information among biometric modalities. A multimodal quality measure is also proposed to weigh each modality as it gets fused. Furthermore, we kernelize the algorithm to handle nonlinearity in data. The optimization problem is solved using an efficient alternating direction method. Various experiments show that the proposed method compares favorably with competing fusion-based methods.


Subjects
Biometric Identification/methods , Algorithms , Databases, Factual , Dermatoglyphics/classification , Face/anatomy & histology , Humans , Iris/anatomy & histology

8.
IEEE Trans Image Process ; 22(12): 5123-35, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24058027

ABSTRACT

In this paper, we present dictionary learning methods for sparse signal representations in a high dimensional feature space. Using the kernel method, we describe how the well known dictionary learning approaches, such as the method of optimal directions and KSVD, can be made nonlinear. We analyze their kernel constructions and demonstrate their effectiveness through several experiments on classification problems. It is shown that nonlinear dictionary learning approaches can provide significantly better performance compared with their linear counterparts and kernel principal component analysis, especially when the data is corrupted by different types of degradations.

9.
IEEE Trans Syst Man Cybern B Cybern ; 42(6): 1586-98, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22614692

ABSTRACT

This paper investigates the joint-structured-sparsity-based methods for transient acoustic signal classification with multiple measurements. By joint structured sparsity, we not only use the sparsity prior for each measurement but we also exploit the structural information across the sparse representation vectors of multiple measurements. Several different sparse prior models are investigated in this paper to exploit the correlations among the multiple measurements with the notion of the joint structured sparsity for improving the classification accuracy. Specifically, we propose models with the joint structured sparsity under different assumptions: same sparse code model, common sparse pattern model, and a newly proposed joint dynamic sparse model. For the joint dynamic sparse model, we also develop an efficient greedy algorithm to solve it. Extensive experiments are carried out on real acoustic data sets, and the results are compared with the conventional discriminative classifiers in order to verify the effectiveness of the proposed method.

10.
Appl Opt ; 50(17): 2744-51, 2011 Jun 10.
Article in English | MEDLINE | ID: mdl-21673780

ABSTRACT

This paper describes a new kernel wavelet-based anomaly detection technique for long-wave (LW) forward-looking infrared imagery. The proposed approach called kernel wavelet-Reed-Xiaoli (wavelet-RX) algorithm is essentially an extension of the wavelet-RX algorithm (combination of wavelet transform and RX anomaly detector) to a high-dimensional feature space (possibly infinite) via a certain nonlinear mapping function of the input data. The wavelet-RX algorithm in this high-dimensional feature space can easily be implemented in terms of kernels that implicitly compute dot products in the feature space (kernelizing the wavelet-RX algorithm). In the proposed kernel wavelet-RX algorithm, a two-dimensional wavelet transform is first applied to decompose the input image into uniform subbands. A number of significant subbands (high-energy subbands) are concatenated together to form a subband-image cube. The kernel RX algorithm is then applied to this subband-image cube. Experimental results are presented for the proposed kernel wavelet-RX, wavelet-RX, and the classical constant false alarm rate (CFAR) algorithm for detecting anomalies (targets) in a large database of LW imagery. The receiver operating characteristic plots show that the proposed kernel wavelet-RX algorithm outperforms the wavelet-RX as well as the classical CFAR detector.

11.
Appl Opt ; 50(10): 1425-33, 2011 Apr 01.
Article in English | MEDLINE | ID: mdl-21460910

ABSTRACT

We present an automatic target recognition algorithm using the recently developed theory of sparse representations and compressive sensing. We show how sparsity can be helpful for efficient utilization of data for target recognition. We verify the efficacy of the proposed algorithm in terms of the recognition rate and confusion matrices on the well known Comanche (Boeing-Sikorsky, USA) forward-looking IR data set consisting of ten different military targets at different orientations.

12.
Appl Opt ; 49(24): 4621-32, 2010 Aug 20.
Article in English | MEDLINE | ID: mdl-20733634

ABSTRACT

This paper describes a new wavelet-based anomaly detection technique for a dual-band forward-looking infrared (FLIR) sensor consisting of a coregistered longwave (LW) with a midwave (MW) sensor. The proposed approach, called the wavelet-RX (Reed-Xiaoli) algorithm, consists of a combination of a two-dimensional (2D) wavelet transform and a well-known multivariate anomaly detector called the RX algorithm. In our wavelet-RX algorithm, a 2D wavelet transform is first applied to decompose the input image into uniform subbands. A subband-image cube is formed by concatenating together a number of significant subbands (high-energy subbands). The RX algorithm is then applied to the subband-image cube obtained from a wavelet decomposition of the LW or MW sensor data. In the case of the dual band, the RX algorithm is applied to a subband-image cube constructed by concatenating together the high-energy subbands of the LW and MW subband-image cubes. Experimental results are presented for the proposed wavelet-RX and the classical constant false alarm rate (CFAR) algorithm for detecting anomalies (targets) in a single broadband FLIR (LW or MW) or in a coregistered dual-band FLIR sensor. The results show that the proposed wavelet-RX algorithm outperforms the classical CFAR detector for both single-band and dual-band FLIR sensors.
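
The RX detector named in the abstract above scores each pixel by its squared Mahalanobis distance from the background statistics. The sketch below shows only that scoring step, in its global form, on hypothetical two-component pixel vectors such as a coregistered (LW, MW) pair. The wavelet decomposition and subband selection are omitted, the data are invented for illustration, and the two-component restriction is an assumption made so that the covariance inverse has a closed form.

```python
from typing import List, Tuple

Pixel = Tuple[float, float]  # hypothetical two-component feature, e.g. (LW, MW)


def rx_scores(pixels: List[Pixel]) -> List[float]:
    """Global RX: squared Mahalanobis distance of each pixel from the scene mean."""
    n = len(pixels)
    if n < 3:
        raise ValueError("need at least three pixels to estimate a 2x2 covariance")
    mean_a = sum(p[0] for p in pixels) / n
    mean_b = sum(p[1] for p in pixels) / n
    # Entries of the 2x2 sample covariance matrix of the scene.
    caa = sum((p[0] - mean_a) ** 2 for p in pixels) / (n - 1)
    cbb = sum((p[1] - mean_b) ** 2 for p in pixels) / (n - 1)
    cab = sum((p[0] - mean_a) * (p[1] - mean_b) for p in pixels) / (n - 1)
    det = caa * cbb - cab * cab
    if det <= 0.0:
        raise ValueError("covariance matrix is singular")
    scores = []
    for a, b in pixels:
        da, db = a - mean_a, b - mean_b
        # d^T C^{-1} d, using the closed-form inverse of a symmetric 2x2 matrix.
        scores.append((cbb * da * da - 2.0 * cab * da * db + caa * db * db) / det)
    return scores


# Toy scene: five background-like pixels and one bright pixel in both bands.
scene = [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0), (1.1, 1.2), (1.0, 0.8), (5.0, 4.0)]
scores = rx_scores(scene)
print(scores.index(max(scores)))  # index of the most anomalous pixel
```

In a detector, the scores would be compared against a threshold chosen for a target false-alarm rate; stacking more subbands per pixel only changes the dimension of the mean vector and covariance matrix.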

13.
IEEE Trans Pattern Anal Mach Intell ; 28(2): 178-94, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16468616

ABSTRACT

In this paper, we present a kernel realization of a matched subspace detector (MSD) that is based on a subspace mixture model defined in a high-dimensional feature space associated with a kernel function. The linear subspace mixture model for the MSD is first reformulated in a high-dimensional feature space and then the corresponding expression for the generalized likelihood ratio test (GLRT) is obtained for this model. The subspace mixture model in the feature space and its corresponding GLRT expression are equivalent to a nonlinear subspace mixture model with a corresponding nonlinear GLRT expression in the original input space. In order to address the intractability of the GLRT in the feature space, we kernelize the GLRT expression using the kernel eigenvector representations as well as the kernel trick where dot products in the feature space are implicitly computed by kernels. The proposed kernel-based nonlinear detector, so-called kernel matched subspace detector (KMSD), is applied to several hyperspectral images to detect targets of interest. KMSD showed superior detection performance over the conventional MSD when tested on several synthetic data and real hyperspectral imagery.
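
The kernel trick referred to in the abstract above replaces explicit dot products in the feature space with kernel evaluations in the input space. The sketch below verifies this identity for a homogeneous degree-2 polynomial kernel on two-dimensional inputs. The polynomial kernel is chosen here only because its feature map can be written out explicitly; it is an illustrative assumption and not necessarily the kernel used in the paper.

```python
import math
from typing import Sequence, Tuple


def poly2_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Homogeneous degree-2 polynomial kernel k(x, y) = (x . y)^2."""
    return sum(a * b for a, b in zip(x, y)) ** 2


def explicit_map(x: Sequence[float]) -> Tuple[float, float, float]:
    """Explicit feature map for 2-D inputs whose dot product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


x, y = (1.0, 2.0), (3.0, -1.0)
implicit = poly2_kernel(x, y)                                           # input-space evaluation
explicit = sum(a * b for a, b in zip(explicit_map(x), explicit_map(y)))  # feature-space dot product
print(implicit, explicit)  # the two values agree up to rounding
```

For kernels whose feature space is very high-dimensional or infinite-dimensional, only the implicit form is computable, which is why a detector expressed purely through dot products can be kernelized.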


Subjects
Algorithms , Artificial Intelligence , Colorimetry/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Spectrum Analysis/methods , Image Enhancement/methods , Information Storage and Retrieval/methods , Reproducibility of Results , Sensitivity and Specificity

14.
IEEE Trans Syst Man Cybern B Cybern ; 35(4): 670-81, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16128452

ABSTRACT

Many image recognition algorithms based on data-learning perform dimensionality reduction before the actual learning and classification because the high dimensionality of raw imagery would require enormous training sets to achieve satisfactory performance. A potential problem with this approach is that most dimensionality reduction techniques, such as principal component analysis (PCA), seek to maximize the representation of data variation into a small number of PCA components, without considering interclass discriminability. This paper presents a neural-network-based transformation that simultaneously seeks to provide dimensionality reduction and a high degree of discriminability by combining together the learning mechanism of a neural-network-based PCA and a backpropagation learning algorithm. The joint discrimination-compression algorithm is applied to infrared imagery to detect military vehicles.


Subjects
Algorithms , Image Interpretation, Computer-Assisted/methods , Information Storage and Retrieval/methods , Neural Networks, Computer , Pattern Recognition, Automated/methods , Subtraction Technique , Artificial Intelligence , Cluster Analysis , Computer Simulation , Image Enhancement/methods , Models, Statistical , Principal Component Analysis , Radar