Results 1 - 18 of 18
1.
Annu Int Conf IEEE Eng Med Biol Soc ; 2022: 4599-4603, 2022 Jul.
Article in English | MEDLINE | ID: mdl-36085895

ABSTRACT

The COVID-19 pandemic has fueled exponential growth in the adoption of remote delivery of primary, specialty, and urgent health care services. One major challenge is the lack of access to a physical exam, including accurate and inexpensive remote measurement of vital signs. Here we present a novel method for machine learning-based estimation of patient respiratory rate from audio. Non-learning methods exist, but their accuracy is limited, and the machine learning work known to us is either not directly useful or uses non-public datasets. We are aware of only one publicly available dataset, which is small and which we use to evaluate our algorithm. To avoid overfitting, we expand its effective size with a new data augmentation method. Our algorithm uses the spectrogram representation and requires labels for breathing cycles, which are used to train a recurrent neural network to recognize the cycles. Our augmentation method exploits the independence property of the most periodic frequency components of the spectrogram and permutes their order to create multiple signal representations. Our experiments show that our method almost halves the errors obtained by the existing (non-learning) methods. Clinical relevance: We achieve a Mean Absolute Error (MAE) of 1.0 for the respiratory rate while relying only on an audio signal of a patient breathing. This signal can be collected from a smartphone, so that physicians can automatically and reliably determine respiratory rate in a remote setting.
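
The abstract reports accuracy as a Mean Absolute Error and relies on labeled breathing cycles. As a minimal sketch of how those two quantities relate (not the authors' pipeline), the Python below converts cycle start times into a rate and scores predictions with MAE. The function names, the use of breaths per minute as the unit, and the sample values are all assumptions made for illustration.

```python
from typing import Sequence


def rate_from_cycle_starts(cycle_starts_s: Sequence[float]) -> float:
    """Breaths per minute from start times (in seconds) of consecutive cycles.

    Assumption: the rate is expressed in breaths per minute; the abstract
    does not state the unit.
    """
    if len(cycle_starts_s) < 2:
        raise ValueError("need at least two cycle starts")
    duration_s = cycle_starts_s[-1] - cycle_starts_s[0]
    if duration_s <= 0:
        raise ValueError("cycle starts must be increasing")
    # N start times delimit N - 1 complete cycles.
    return 60.0 * (len(cycle_starts_s) - 1) / duration_s


def mean_absolute_error(predicted: Sequence[float],
                        reference: Sequence[float]) -> float:
    """Average of |predicted - reference| over paired measurements."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


# Hypothetical values, for illustration only.
rate = rate_from_cycle_starts([0.0, 4.0, 8.0, 12.0])
mae = mean_absolute_error([15.0, 18.0, 12.0], [16.0, 17.0, 13.0])
```

Under these assumptions, an MAE of 1.0 would mean the estimate is off by one breath per minute on average.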


Subjects
COVID-19, Respiratory Rate, COVID-19/diagnosis, Humans, Machine Learning, Pandemics, Respiration
2.
IEEE Trans Pattern Anal Mach Intell ; 41(8): 1909-1923, 2019 Aug.
Article in English | MEDLINE | ID: mdl-30605094

ABSTRACT

Joint image filters leverage the guidance image as a prior and transfer the structural details from the guidance image to the target image for suppressing noise or enhancing spatial resolution. Existing methods rely on either explicit filter constructions or hand-designed objective functions, which makes it difficult to understand, improve, and accelerate these filters in a coherent framework. In this paper, we propose a learning-based approach for constructing joint filters based on Convolutional Neural Networks. In contrast to existing methods that consider only the guidance image, the proposed algorithm can selectively transfer salient structures that are consistent with both guidance and target images. We show that the model trained on a certain type of data, e.g., RGB and depth images, generalizes well to other modalities, e.g., flash/non-flash and RGB/NIR images. We validate the effectiveness of the proposed joint filter through extensive experimental evaluations with state-of-the-art methods.

3.
IEEE Trans Pattern Anal Mach Intell ; 41(11): 2599-2613, 2019 Nov.
Article in English | MEDLINE | ID: mdl-30106708

ABSTRACT

Convolutional neural networks have recently demonstrated high-quality reconstruction for single image super-resolution. However, existing methods often require a large number of network parameters and entail heavy computational loads at runtime for generating high-accuracy super-resolution results. In this paper, we propose the deep Laplacian Pyramid Super-Resolution Network for fast and accurate image super-resolution. The proposed network progressively reconstructs the sub-band residuals of high-resolution images at multiple pyramid levels. In contrast to existing methods that involve the bicubic interpolation for pre-processing (which results in large feature maps), the proposed method directly extracts features from the low-resolution input space and thereby entails low computational loads. We train the proposed network with deep supervision using the robust Charbonnier loss functions and achieve high-quality image reconstruction. Furthermore, we utilize the recursive layers to share parameters across as well as within pyramid levels, and thus drastically reduce the number of parameters. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in terms of run-time and image quality.
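
The abstract names the Charbonnier loss without defining it. A common form is rho(x) = sqrt(x^2 + eps^2), a smooth approximation of the absolute error. The sketch below implements that form over flat lists of pixel values; the function name, the averaging over elements, and the default eps = 1e-3 are assumptions for illustration and are not taken from the abstract.

```python
import math
from typing import Sequence


def charbonnier_loss(predicted: Sequence[float],
                     target: Sequence[float],
                     eps: float = 1e-3) -> float:
    """Mean Charbonnier penalty sqrt(r^2 + eps^2) over residuals r.

    eps is an assumed smoothing constant; it keeps the penalty
    differentiable at r = 0.
    """
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(predicted, target):
        residual = p - t
        total += math.sqrt(residual * residual + eps * eps)
    return total / len(predicted)
```

For large residuals the penalty grows like |r| rather than r^2, which is why it is usually described as more robust to outliers than a squared-error loss; at zero residual it equals eps rather than 0.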

4.
IEEE Trans Image Process ; 27(10): 4838-4849, 2018 Oct.
Article in English | MEDLINE | ID: mdl-29969395

ABSTRACT

Superpixel segmentation has been one of the most important tasks in computer vision. In practice, an object can be represented by a number of segments at finer levels with consistent details or included in a surrounding region at coarser levels. Thus, a superpixel segmentation hierarchy is of great importance for applications that require different levels of image details. However, there is no method that can generate all scales of superpixels accurately in real time. In this paper, we propose the superhierarchy algorithm which is able to generate multi-scale superpixels as accurately as the state-of-the-art methods but with one to two orders of magnitude speed-up. The proposed algorithm can be directly integrated with recent efficient edge detectors to significantly outperform the state-of-the-art methods in terms of segmentation accuracy. Quantitative and qualitative evaluations on a number of applications demonstrate that the proposed algorithm is accurate and efficient in generating a hierarchy of superpixels.

5.
IEEE Trans Neural Netw Learn Syst ; 28(6): 1373-1385, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28113825

ABSTRACT

In this paper, we propose a clustering algorithm based on a two-phased neural network architecture. We combine the strength of an autoencoder-like network for unsupervised representation learning with the discriminative power of a support vector machine (SVM) network for fine-tuning the initial clusters. The first network is referred to as the prototype encoding network, where the data reconstruction error is minimized in an unsupervised manner. The second phase, i.e., the SVM network, endeavors to maximize the margin between cluster boundaries in a supervised way, making use of the output of the first. Both networks update the cluster centroids successively by establishing a topology-preserving scheme, like a self-organizing map, on the latent space of each network. Cluster fine-tuning is accomplished in a network structure by the alternate usage of the encoding part of both networks. In the experiments, challenging data sets from two popular repositories with different patterns, dimensionality, and numbers of clusters are used. The proposed hybrid architecture achieves comparatively better results, both visually and analytically, than the previous neural network-based approaches available in the literature.

6.
IEEE Trans Cybern ; 46(1): 51-63, 2016 Jan.
Article in English | MEDLINE | ID: mdl-25680224

ABSTRACT

In this paper, we formulate particle filter-based object tracking as an exclusive sparse learning problem that exploits contextual information. To achieve this goal, we propose the context-aware exclusive sparse tracker (CEST) to model particle appearances as linear combinations of dictionary templates that are updated dynamically. Learning the representation of each particle is formulated as an exclusive sparse representation problem, where the overall dictionary is composed of multiple group dictionaries that can contain contextual information. With context, CEST is less prone to tracker drift. Interestingly, we show that the popular L1 tracker is a special case of our CEST formulation. The proposed learning problem is efficiently solved using an accelerated proximal gradient method that yields a sequence of closed form updates. To make the tracker much faster, we reduce the number of learning problems to be solved by using the dual problem to quickly and systematically rank and prune particles in each frame. We test our CEST tracker on challenging benchmark sequences that involve heavy occlusion, drastic illumination changes, and large pose variations. Experimental results show that CEST consistently outperforms state-of-the-art trackers.

7.
IEEE Trans Pattern Anal Mach Intell ; 37(6): 1304-11, 2015 Jun.
Article in English | MEDLINE | ID: mdl-26357351

ABSTRACT

A robust and effective specular highlight removal method is proposed in this paper. It is based on a key observation: the maximum fraction of the diffuse colour component in diffuse local patches in colour images changes smoothly. The specular pixels can thus be treated as noise in this case. This property allows the specular highlights to be removed in an image denoising fashion: an edge-preserving low-pass filter (e.g., the bilateral filter) can be used to smooth the maximum fraction of the colour components of the original image to remove the noise contributed by the specular pixels. Recent developments in fast bilateral filtering techniques enable the proposed method to run over 200× faster than state-of-the-art techniques on a standard CPU, which differentiates it from previous work.

8.
IEEE Trans Pattern Anal Mach Intell ; 36(9): 1900-6, 2014 Sep.
Article in English | MEDLINE | ID: mdl-26352241

ABSTRACT

This paper is aimed at obtaining the statistics as a probabilistic model pertaining to the geometric, topological and photometric structure of natural images. The image structure is represented by its segmentation graph derived from the low-level hierarchical multiscale image segmentation. We first estimate the statistics of a number of segmentation graph properties from a large number of images. Our estimates confirm some findings reported in the past work, as well as provide some new ones. We then obtain a Markov random field based model of the segmentation graph which subsumes the observed statistics. To demonstrate the value of the model and the statistics, we show how its use as a prior impacts three applications: image classification, semantic image segmentation and object detection.

9.
IEEE Trans Image Process ; 22(12): 4841-52, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23963228

ABSTRACT

We present a new upsampling method to enhance the spatial resolution of depth images. Given a low-resolution depth image from an active depth sensor and a potentially high-resolution color image from a passive RGB camera, we formulate upsampling as an adaptive cost aggregation problem and solve it using the bilateral filter. The formulation synergistically combines the median and bilateral filters, and thus better preserves the depth edges and is more robust to noise. Numerical and visual evaluations on a total of 37 Middlebury data sets demonstrate the effectiveness of our method. A real-time high-resolution depth capturing system based on the proposed upsampling method is also developed using a commercial active depth sensor.
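
To illustrate the general idea of letting a guidance signal steer the smoothing of a depth signal, here is a minimal 1-D joint bilateral filter in Python. It is a generic textbook filter, not the cost-aggregation formulation described in the abstract, and it does not include the median component. The function name, the parameters (`radius`, `sigma_s`, `sigma_r`), and the sample data are assumptions for illustration.

```python
import math
from typing import List, Sequence


def joint_bilateral_1d(depth: Sequence[float],
                       guide: Sequence[float],
                       radius: int = 2,
                       sigma_s: float = 1.0,
                       sigma_r: float = 0.1) -> List[float]:
    """Smooth `depth`, weighting neighbours by distance and guide similarity."""
    if len(depth) != len(guide):
        raise ValueError("depth and guide must have the same length")
    n = len(depth)
    out: List[float] = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Spatial weight: nearby samples count more.
            w_s = math.exp(-((i - j) ** 2) / (2.0 * sigma_s ** 2))
            # Range weight from the guide: samples across a guide edge
            # contribute almost nothing, so depth edges stay sharp.
            w_r = math.exp(-((guide[i] - guide[j]) ** 2) / (2.0 * sigma_r ** 2))
            weighted_sum += w_s * w_r * depth[j]
            weight_total += w_s * w_r
        # weight_total >= 1 because the j == i term has weight 1.
        out.append(weighted_sum / weight_total)
    return out


# Hypothetical noisy depth step with a clean guide edge at index 3.
guide = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
depth = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
smoothed = joint_bilateral_1d(depth, guide)
```

In this toy input the noise on each side of the step is averaged out, while the step itself is kept because the guide values differ sharply across it.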

10.
IEEE Trans Image Process ; 21(10): 4361-8, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22829402

ABSTRACT

In this paper, we propose a simple but effective shadow removal method using a single input image. We first derive a 2-D intrinsic image from a single RGB camera image based solely on colors, particularly chromaticity. We next present a method to recover a 3-D intrinsic image based on bilateral filtering and the 2-D intrinsic image. The luminance contrast in regions with similar surface reflectance due to geometry and illumination variances is effectively reduced in the derived 3-D intrinsic image, while the contrast in regions with different surface reflectance is preserved. However, the intrinsic image contains incorrect luminance values. To obtain the correct luminance, we decompose the input RGB image and the intrinsic image. Each image is decomposed into a base layer and a detail layer. We obtain a shadow-free image by combining the base layer from the input RGB image and the detail layer from the intrinsic image such that the details of the intrinsic image are transferred to the input RGB image from which the correct luminance values can be obtained. Unlike previous methods, the presented technique is fully automatic and does not require shadow detection.

11.
IEEE Trans Image Process ; 21(10): 4410-9, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22801509

ABSTRACT

In this paper, we propose a simple but effective image transform, called the epipolar distance transform, for matching low-texture regions. It converts image intensity values to a relative location inside a planar segment along the epipolar line, such that pixels in the low-texture regions become distinguishable. We theoretically prove that the transform is affine invariant; thus, the transformed images can be directly used for stereo matching. Any existing stereo algorithm can be directly used with the transformed images to improve reconstruction accuracy for low-texture regions. Results on real indoor and outdoor images demonstrate the effectiveness of the proposed transform for matching low-texture regions, keypoint detection, and description for low-texture scenes. Our experimental results on Middlebury images also demonstrate the robustness of our transform for highly textured scenes. A major advantage of the proposed transform is its low computational complexity. Tested on a MacBook Air laptop computer with a 1.8 GHz Core i7 processor, it ran at about 9 frames per second on a video graphics array (VGA)-sized image.

12.
IEEE Trans Image Process ; 20(1): 53-63, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21172743

ABSTRACT

Based upon a new correspondence matching invariant called illumination chromaticity constancy, we present a new solution for illumination chromaticity estimation, correspondence searching, and specularity removal. Using as few as two images, the core of our method is the computation of a vote distribution for a number of illumination chromaticity hypotheses via correspondence matching. The hypothesis with the highest vote is accepted as correct. The estimated illumination chromaticity is then used together with the new matching invariant to match highlights, which inherently provides solutions for correspondence searching and specularity removal. Our method differs from the previous approaches: those treat these vision problems separately and generally require that specular highlights be detected in a preprocessing step. Also, our method uses more images than previous illumination chromaticity estimation methods, which increases its robustness because more inputs/constraints are used. Experimental results on both synthetic and real images demonstrate the effectiveness of the proposed method.

13.
IEEE Trans Pattern Anal Mach Intell ; 30(12): 2158-74, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18988949

ABSTRACT

Suppose a set of arbitrary (unlabeled) images contains frequent occurrences of 2D objects from an unknown category. This paper is aimed at simultaneously solving the following related problems: (1) unsupervised identification of photometric, geometric, and topological properties of multiscale regions comprising instances of the 2D category; (2) learning a region-based structural model of the category in terms of these properties; and (3) detection, recognition and segmentation of objects from the category in new images. To this end, each image is represented by a tree that captures a multiscale image segmentation. The trees are matched to extract the maximally matching subtrees across the set, which are taken as instances of the target category. The extracted subtrees are then fused into a tree-union that represents the canonical category model. Detection, recognition, and segmentation of objects from the learned category are achieved simultaneously by finding matches of the category model with the segmentation tree of a new image. Experimental validation on benchmark datasets demonstrates the robustness and high accuracy of the learned category models, when only a few training examples are used for learning without any human supervision.


Subjects
Algorithms, Artificial Intelligence, Computer-Assisted Image Interpretation/methods, Three-Dimensional Imaging/methods, Automated Pattern Recognition/methods, Subtraction Technique, Computer Simulation, Image Enhancement/methods, Theoretical Models, Reproducibility of Results, Sensitivity and Specificity
14.
IEEE Trans Pattern Anal Mach Intell ; 29(7): 1244-61, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17496381

ABSTRACT

The analysis of periodic or repetitive motions is useful in many applications, such as the recognition and classification of human and animal activities. Existing methods for the analysis of periodic motions first extract motion trajectories using spatial information and then determine if they are periodic. These approaches are mostly based on feature matching or spatial correlation, which are often infeasible, unreliable, or computationally demanding. In this paper, we present a new approach, based on the time-frequency analysis of the video sequence as a whole. Multiple periodic trajectories are extracted and their periods are estimated simultaneously. The objects that are moving in a periodic manner are extracted using the spatial domain information. Experiments with synthetic and real sequences display the capabilities of this approach.
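
As a simplified illustration of estimating a period from frequency content, the Python sketch below finds the dominant period of a single 1-D signal (for example, one pixel's intensity over time) using a plain discrete Fourier transform. This is a global DFT on one signal, not the time-frequency analysis of a whole video sequence that the abstract describes. The function name and the test signal are assumptions for illustration.

```python
import cmath
import math
from typing import Sequence


def dominant_period(samples: Sequence[float], sample_rate_hz: float) -> float:
    """Period in seconds of the strongest non-DC frequency component."""
    n = len(samples)
    if n < 4:
        raise ValueError("need at least four samples")
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC offset
    best_bin, best_mag = 0, 0.0
    # Only bins 1 .. n/2 are needed for a real-valued signal.
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    if best_bin == 0:
        raise ValueError("signal is constant; no period to estimate")
    # Bin k corresponds to frequency k * sample_rate / n, so period = n / (k * rate).
    return n / (best_bin * sample_rate_hz)


# Hypothetical signal: a 2 Hz sine sampled at 32 Hz for two seconds.
signal = [math.sin(2.0 * math.pi * 2.0 * t / 32.0) for t in range(64)]
period_s = dominant_period(signal, 32.0)
```

The direct DFT here costs O(n^2) and is chosen only to keep the sketch dependency-free; an FFT would be the usual choice in practice.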


Subjects
Artificial Intelligence, Computer-Assisted Image Interpretation/methods, Three-Dimensional Imaging/methods, Movement/physiology, Oscillometry/methods, Automated Pattern Recognition/methods, Video Recording/methods, Algorithms, Humans, Image Enhancement/methods, Motion
15.
IEEE Trans Pattern Anal Mach Intell ; 29(2): 356-61, 2007 Feb.
Article in English | MEDLINE | ID: mdl-17170487

ABSTRACT

Wide field of view (FOV) and high-resolution image acquisition is highly desirable in many vision-based applications. Several systems have reported the use of reflections off mirror pyramids to capture high-resolution, single-viewpoint, and wide-FOV images. Using a dual mirror pyramid (DMP) panoramic camera as an example, in this paper, we examine how the pyramid geometry, and the selection and placement of imager clusters can be optimized to maximize the overall panoramic FOV, sensor utilization efficiency, and image uniformity. The analysis can be generalized and applied to other pyramid-based designs.


Subjects
Algorithms, Computer-Aided Design, Image Enhancement/instrumentation, Computer-Assisted Image Interpretation/methods, Lenses, Theoretical Models, Photography/instrumentation, Computer Simulation, Equipment Design, Equipment Failure Analysis, Image Enhancement/methods
16.
IEEE Trans Pattern Anal Mach Intell ; 26(7): 941-6, 2004 Jul.
Article in English | MEDLINE | ID: mdl-18579952

ABSTRACT

A mirror pyramid consists of a set of planar mirror faces arranged around an axis of symmetry and inclined to form a pyramid. By strategically positioning a number of conventional cameras around a mirror pyramid, the viewpoints of the cameras' mirror images can be located at a single point within the pyramid and their optical axes pointed in different directions to effectively form a virtual camera with a panoramic field of view. Mirror pyramid-based panoramic cameras have a number of attractive properties, including single-viewpoint imaging, high resolution, and video rate capture. It is also possible to place multiple viewpoints within a single mirror pyramid, yielding compact designs for simultaneous multiview panoramic video rate imaging. Nalwa [4] first described some of the basic ideas behind mirror pyramid cameras. In this paper, we analyze the general class of multiview panoramic cameras, provide a method for designing these cameras, and present experimental results using a prototype we have developed to validate single-pyramid multiview designs. We first give a description of mirror pyramid cameras, including the imaging geometry, and investigate the relationship between the placement of viewpoints within the pyramid and the cameras' field of view (FOV), using simulations to illustrate the concepts. A method for maximizing sensor utilization in a mirror pyramid-based multiview panoramic camera is also presented. Images acquired using the experimental prototype for two viewpoints are shown.
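
The geometric fact behind locating "the viewpoints of the cameras' mirror images" is that a planar mirror maps a camera centre to its reflection across the mirror plane: p' = p - 2((p - q)·n)n, where q is any point on the plane and n is its unit normal. The Python sketch below implements only this reflection. The function name and the example coordinates are assumptions for illustration, and the sketch does not reproduce the design method of the paper.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def reflect_point(point: Vec3, plane_point: Vec3, normal: Vec3) -> Vec3:
    """Reflect `point` across the plane through `plane_point` with `normal`."""
    length = math.sqrt(sum(c * c for c in normal))
    if length == 0.0:
        raise ValueError("normal must be non-zero")
    unit = tuple(c / length for c in normal)
    # Signed distance from the point to the plane along the unit normal.
    dist = sum((p - q) * u for p, q, u in zip(point, plane_point, unit))
    reflected = tuple(p - 2.0 * dist * u for p, u in zip(point, unit))
    return (reflected[0], reflected[1], reflected[2])


# Hypothetical setup: a mirror face in the plane z = 1 and a camera at z = 3.
camera: Vec3 = (0.0, 0.0, 3.0)
virtual_viewpoint = reflect_point(camera, (0.0, 0.0, 1.0), (0.0, 0.0, 2.0))
```

Because reflection is its own inverse, the same function can be used in the other direction: given a desired common virtual viewpoint, reflecting it across each mirror face gives where the corresponding physical camera centre would have to sit.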


Subjects
Algorithms, Artificial Intelligence, Image Enhancement/methods, Computer-Assisted Image Interpretation/methods, Three-Dimensional Imaging/methods, Automated Pattern Recognition/methods, Photography/methods, Image Enhancement/instrumentation, Photography/instrumentation
17.
Neural Comput ; 14(5): 1071-103, 2002 May.
Article in English | MEDLINE | ID: mdl-11972908

ABSTRACT

A learning account for the problem of object recognition is developed within the probably approximately correct (PAC) model of learnability. The key assumption underlying this work is that objects can be recognized (or discriminated) using simple representations in terms of syntactically simple relations over the raw image. Although the potential number of these simple relations could be huge, only a few of them are actually present in each observed image, and a fairly small number of those observed are relevant to discriminating an object. We show that these properties can be exploited to yield an efficient learning approach in terms of sample and computational complexity within the PAC model. No assumptions are needed on the distribution of the observed objects, and the learning performance is quantified relative to its experience. Most important, the success of learning an object representation is naturally tied to the ability to represent it as a function of some intermediate representations extracted from the image. We evaluate this approach in a large-scale experimental study in which the SNoW learning architecture is used to learn representations for the 100 objects in the Columbia Object Image Library. Experimental results exhibit good generalization and robustness properties of the SNoW-based method relative to other approaches. SNoW's recognition rate degrades more gracefully when the training data contains fewer views, and it shows similar behavior in some preliminary experiments with partially occluded objects.


Subjects
Artificial Intelligence, Form Perception, Neurological Models, Computer Systems
18.
IEEE Trans Image Process ; 11(11): 1228-37, 2002.
Article in English | MEDLINE | ID: mdl-18249693

ABSTRACT

This paper is concerned with developing a lossless image compression method which employs an optimal amount of segmentation information to exploit spatial redundancies inherent in image data. Multiscale segmentation is obtained using a previously proposed transform which provides a tree-structured segmentation of the image into regions characterized by grayscale homogeneity. In the proposed algorithm we prune the tree to control the size and number of regions thus obtaining a rate-optimal balance between the overhead inherent in coding the segmented data and the coding gain that we derive from it. Another novelty of the proposed approach is that we use an image model comprising separate descriptions of pixels lying near the edges of a region and those lying in the interior. Results show that the proposed algorithm can provide performance comparable to the best available methods and 15-20% better compression when compared with the JPEG lossless compression standard for a wide range of images.
