Results 1 - 12 of 12
1.
Article in English | MEDLINE | ID: mdl-38656859

ABSTRACT

Urban safety plays an essential role in the quality of citizens' lives and in the sustainable development of cities. In recent years, researchers have attempted to apply machine learning techniques to identify the role of location-specific attributes in urban safety. However, existing studies have mainly relied on limited images (e.g., map images, single- or four-directional images) of areas based on a relatively large geographical unit and have narrowly focused on severe crime rates, which limits their predictive performance and implications for urban safety. In this work, we propose a novel method that predicts "deviance," which includes formal deviant crimes (e.g., murders) and informal deviant behaviors (e.g., loud parties at night). To do this, we first collect a large-scale geo-tagged dataset consisting of incident report data for seven metropolitan cities, along with corresponding sequential images around incident sites obtained from Google Street View. We then design a convolutional neural network that learns spatio-temporal visual attributes of deviant streets. Experimental results show that our framework reliably recognizes real-world deviance in various cities. Furthermore, we analyze which visual attributes are important for deviance identification and severity estimation, drawing on social science as well as the activated feature maps of the neural network. We have released our dataset and source code at https://github.com/JinhwiPark/DevianceNet/.

2.
Adv Sci (Weinh) ; 10(32): e2304310, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37691086

ABSTRACT

Fano resonance, known for its unique asymmetric line shape, has gained significant attention in photonics, particularly in sensing applications. However, it remains difficult to achieve controllable Fano parameters with a simple geometric structure. Here, a novel approach is proposed and experimentally demonstrated: a thin-film optical Fano resonator with a porous layer that generates the entire range of spectral shapes, from quasi-Lorentzian to Lorentzian to Fano. The glancing angle deposition technique is utilized to create a polarization-dependent Fano resonator. By switching the linear polarization between s- and p-polarization, a Fano device switchable between a quasi-Lorentzian state and a negative Fano state is demonstrated. This change in spectral shape is advantageous for detecting materials with a low refractive index. A bio-particle sensing experiment demonstrates an enhanced signal-to-noise ratio and prediction accuracy. Finally, the challenge of optimizing the film-based Fano resonator, which arises from the intricate interplay among numerous parameters including layer thicknesses, porosity, and materials selection, is addressed. An inverse design tool is developed based on a multilayer perceptron model that allows fast computation over the full range of Fano parameters. The method provides improved accuracy of the mean validation factor (MVF = 0.07, q-q') compared with the conventional exhaustive enumeration method (MVF = 0.37).
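The family of spectral shapes the abstract describes can be illustrated with the textbook Fano formula (a generic illustration, not the resonator model from the paper): F(ε) = (q + ε)² / (1 + ε²), where ε is the reduced detuning and q the asymmetry parameter. q = 0 gives a symmetric anti-resonance dip, while large |q| approaches a Lorentzian peak.

```python
import numpy as np

def fano(eps, q):
    """Fano line shape F(eps) = (q + eps)**2 / (1 + eps**2), with
    reduced detuning eps = 2*(w - w0)/Gamma and asymmetry parameter q."""
    return (q + eps) ** 2 / (1 + eps ** 2)

eps = np.linspace(-10.0, 10.0, 2001)
dip = fano(eps, 0.0)                  # q = 0: symmetric anti-resonance
lorentzian = 1.0 / (1.0 + eps ** 2)   # reference Lorentzian
peak = fano(eps, 50.0) / 50.0 ** 2    # large |q|: nearly Lorentzian peak
```

Sweeping q continuously between these regimes traces out exactly the quasi-Lorentzian-to-Fano transition the resonator is designed to access.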

3.
IEEE Trans Pattern Anal Mach Intell ; 45(6): 6766-6782, 2023 Jun.
Article in English | MEDLINE | ID: mdl-34232862

ABSTRACT

With the increasing social demands of disaster response, methods of visual observation for rescue and safety have become increasingly important. However, because of the shortage of datasets for disaster scenarios, there has been little progress in computer vision and robotics in this field. With this in mind, we present the first large-scale synthetic dataset of egocentric viewpoints for disaster scenarios. We simulate pre- and post-disaster cases with drastic changes in appearance, such as buildings on fire and earthquakes. The dataset consists of more than 300K high-resolution stereo image pairs, all annotated with ground-truth data for the semantic label, depth in metric scale, optical flow with sub-pixel precision, and surface normal, as well as their corresponding camera poses. To create realistic disaster scenes, we manually augment the effects with 3D models using physically-based graphics tools. We train various state-of-the-art methods to perform computer vision tasks using our dataset, evaluate how well these methods recognize disaster situations, and show that they produce reliable results on virtual scenes as well as real-world images. We also present a convolutional neural network-based egocentric localization method that is robust to drastic appearance changes, such as texture changes in a fire and layout changes from a collapse. To address these key challenges, we propose a new model that learns a shape-based representation by training on stylized images and incorporates the dominant planes of query images as approximate scene coordinates. We evaluate the proposed method using various scenes, including a simulated disaster dataset, to demonstrate its effectiveness when confronted with significant changes in scene layout. Experimental results show that our method provides reliable camera pose predictions despite vastly changed conditions.

4.
Ocul Surf ; 26: 283-294, 2022 10.
Article in English | MEDLINE | ID: mdl-35753666

ABSTRACT

PURPOSE: To develop a deep learning-based automated method to segment meibomian glands (MG) and eyelids, quantitatively analyze the MG area and MG ratio, estimate the meiboscore, and remove specular reflections from infrared images. METHODS: A total of 1600 meibography images were captured in a clinical setting. 1000 images were precisely annotated, with multiple revisions, by investigators and graded 6 times by meibomian gland dysfunction (MGD) experts. Two deep learning (DL) models were trained separately to segment areas of the MG and eyelid. These segmentations were used to estimate the MG ratio and meiboscores using a classification-based DL model. A generative adversarial network was implemented to remove specular reflections from the original images. RESULTS: The mean MG ratios calculated from investigator annotations and DL segmentations were consistent: 26.23% vs. 25.12% in the upper eyelids and 32.34% vs. 32.29% in the lower eyelids, respectively. Our DL model achieved 73.01% accuracy for meiboscore classification on the validation set and 59.17% accuracy when tested on images from an independent center, compared with 53.44% validation accuracy achieved by MGD experts. The DL-based approach successfully removes specular reflections from the original MG images without affecting meiboscore grading. CONCLUSIONS: DL with infrared meibography provides a fully automated, fast quantitative evaluation of MG morphology (MG segmentation, MG area, MG ratio, and meiboscore) that is sufficiently accurate for diagnosing dry eye disease. The DL model also removes specular reflections from images, allowing ophthalmologists to perform distraction-free assessment.


Subjects
Deep Learning, Dry Eye Syndromes, Eyelid Diseases, Meibomian Gland Dysfunction, Ophthalmologists, Humans, Meibomian Glands/diagnostic imaging, Dry Eye Syndromes/diagnostic imaging, Tears, Eyelid Diseases/diagnostic imaging
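As a rough sketch of the kind of quantity reported above, the MG ratio can be computed from a pair of segmentation masks. The definition below (gland pixels as a fraction of eyelid pixels) is an assumption for illustration, not the paper's exact protocol:

```python
import numpy as np

def mg_ratio(gland_mask, eyelid_mask):
    """MG ratio: fraction of the segmented eyelid area covered by
    segmented glands; gland pixels outside the eyelid are ignored."""
    eyelid = eyelid_mask.astype(bool)
    glands = gland_mask.astype(bool) & eyelid
    return glands.sum() / eyelid.sum()

# Toy 4x4 masks: the eyelid covers 8 pixels, glands cover 2 of them.
eyelid = np.zeros((4, 4), dtype=bool); eyelid[1:3, :] = True
glands = np.zeros((4, 4), dtype=bool); glands[1, 0:2] = True
ratio = mg_ratio(glands, eyelid)   # 2 / 8 = 0.25
```

In practice the two masks would come from the paper's separate MG and eyelid segmentation networks rather than hand-drawn arrays.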
5.
Micromachines (Basel) ; 12(12)2021 Nov 26.
Article in English | MEDLINE | ID: mdl-34945303

ABSTRACT

The light field camera provides a robust way to capture both spatial and angular information within a single shot. One of its important applications is 3D depth sensing, which extracts depth information from the acquired scene. However, conventional light field cameras suffer from a shallow depth of field (DoF). Here, a vari-focal light field camera (VF-LFC) with an extended DoF is proposed for mid-range 3D depth sensing applications. As the main lens of the system, a vari-focal lens with four different focal lengths is adopted to extend the DoF up to ~15 m. The focal length of the micro-lens array (MLA) is optimized by considering the DoF in both the image plane and the object plane for each focal length. By dividing the measurement range among the focal lengths, highly reliable depth estimation is available within the entire DoF. The proposed VF-LFC is evaluated using disparity data extracted from images at different distances. Moreover, depth measurement in an outdoor environment demonstrates that our VF-LFC could be applied in various fields such as delivery robots, autonomous vehicles, and remote sensing drones.
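The depth-of-field budget behind such a design follows standard thin-lens optics. The sketch below computes the near and far limits of acceptable sharpness via the hyperfocal distance; the numeric parameters are purely illustrative and are not taken from the paper:

```python
def depth_of_field(f_mm, N, c_mm, s_mm):
    """Near/far limits of acceptable sharpness for a lens of focal
    length f at f-number N with circle of confusion c, focused at
    distance s (all lengths in mm). H = f**2/(N*c) + f is the
    hyperfocal distance; focusing at or beyond H pushes the far
    limit to infinity."""
    H = f_mm ** 2 / (N * c_mm) + f_mm
    near = H * s_mm / (H + (s_mm - f_mm))
    far = float('inf') if s_mm >= H else H * s_mm / (H - (s_mm - f_mm))
    return near, far

# Illustrative numbers: 25 mm lens at f/4, 0.01 mm circle of
# confusion, focused at 3 m.
near, far = depth_of_field(25.0, 4.0, 0.01, 3000.0)
```

Repeating this calculation for each of the vari-focal lens's focal lengths shows how four overlapping near/far intervals can tile an extended measurement range.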

6.
IEEE Trans Pattern Anal Mach Intell ; 43(4): 1225-1238, 2021 Apr.
Article in English | MEDLINE | ID: mdl-31613749

ABSTRACT

We propose a novel approach to infer a high-quality depth map from a set of images with small viewpoint variations. In general, techniques for depth estimation from small motion consist of camera pose estimation and dense reconstruction. In contrast to prior approaches that recover scene geometry and camera motions using pre-calibrated cameras, we introduce in this paper a self-calibrating bundle adjustment method tailored for small motion which enables computation of camera poses without the need for camera calibration. For dense depth reconstruction, we present a convolutional neural network called DPSNet (Deep Plane Sweep Network) whose design is inspired by best practices of traditional geometry-based approaches. Rather than directly estimating depth or optical flow correspondence from image pairs as done in many previous deep learning methods, DPSNet takes a plane sweep approach that involves building a cost volume from deep features using the plane sweep algorithm, regularizing the cost volume, and regressing the depth map from the cost volume. The cost volume is constructed using a differentiable warping process that allows for end-to-end training of the network. Through the effective incorporation of conventional multiview stereo concepts within a deep learning framework, the proposed method achieves state-of-the-art results on a variety of challenging datasets.
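The plane sweep step at the heart of DPSNet can be sketched in miniature. For a rectified stereo pair, sweeping fronto-parallel planes reduces to testing integer disparities; the toy below uses raw intensities and absolute differences in place of the paper's deep features, differentiable warping, and learned cost regularization:

```python
import numpy as np

def plane_sweep_cost_volume(left, right, max_disp):
    """Cost volume C[d, y, x] = |left(y, x) - right(y, x - d)|.
    Each candidate disparity d corresponds to one swept plane (depth
    proportional to 1/d); warping the right image by d aligns pixels
    that lie on that plane. Out-of-bounds pixels get infinite cost."""
    H, W = left.shape
    cost = np.full((max_disp + 1, H, W), np.inf)
    cost[0] = np.abs(left - right)
    for d in range(1, max_disp + 1):
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, :-d])
    return cost

# Toy rectified pair: the whole scene sits on one plane at disparity 3,
# so the right view is the left view shifted left by 3 pixels.
pattern = np.arange(19.0) % 7
left = np.tile(pattern[:16], (8, 1))
right = np.tile(pattern[3:], (8, 1))

cost = plane_sweep_cost_volume(left, right, max_disp=5)
disparity = cost.argmin(axis=0)   # winner-take-all over the sweep
```

DPSNet replaces the argmin with cost-volume regularization and soft regression, which is what makes the whole pipeline trainable end-to-end.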

7.
Article in English | MEDLINE | ID: mdl-31478856

ABSTRACT

Depth from focus (DfF) is a method of estimating the depth of a scene using information acquired through changes in the focus of a camera. Within the DfF framework, the focus measure (FM) forms the foundation that determines the accuracy of the output. Given the results of the FM, the role of a DfF pipeline is to identify and recalculate unreliable measurements while enhancing those that are reliable. In this paper, we propose a new FM, which we call the "ring difference filter" (RDF), that can measure focus more accurately and robustly. FMs can usually be categorized as confident local methods or noise-robust non-local methods. The RDF's unique ring-and-disk structure allows it to have the advantages of both local and non-local FMs. We then describe an efficient pipeline that utilizes the RDF's properties. Part of this pipeline is our proposed RDF-based cost aggregation method, which is able to robustly refine the initial results in the presence of image noise. Our method reproduces results that are on par with or better than those of state-of-the-art methods, while requiring less computation time.
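A minimal sketch of a ring-and-disk focus measure in the spirit of the RDF (the radii and exact filter form here are simplifications, not the paper's filter): the response compares the mean intensity of an inner disk against that of its surrounding ring, which is large at sharp detail and small under defocus blur.

```python
import numpy as np

def rdf_response(img, y, x, r_in=1, r_out=3):
    """Simplified ring-difference response at (y, x): the absolute
    difference between the mean intensity of an inner disk (radius
    <= r_in) and that of the surrounding ring (r_in < radius <= r_out).
    Sharp, in-focus detail yields a large disk/ring contrast; defocus
    blur averages both regions toward the same value."""
    dy, dx = np.mgrid[-r_out:r_out + 1, -r_out:r_out + 1]
    dist = np.hypot(dy, dx)
    patch = img[y - r_out:y + r_out + 1, x - r_out:x + r_out + 1]
    disk = patch[dist <= r_in].mean()
    ring = patch[(dist > r_in) & (dist <= r_out)].mean()
    return abs(disk - ring)

# An in-focus point light vs. the same energy fully defocused.
sharp = np.zeros((9, 9)); sharp[4, 4] = 1.0
blurred = np.full((9, 9), 1.0 / 81.0)
```

Evaluating this response per pixel across a focal stack, and taking the focus position that maximizes it, is the basic DfF loop that the paper's aggregation step then refines.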

8.
IEEE Trans Pattern Anal Mach Intell ; 41(4): 775-787, 2019 Apr.
Article in English | MEDLINE | ID: mdl-29993773

ABSTRACT

Structure from small motion has become an important topic in 3D computer vision as a method for estimating depth, since capturing the input is highly user-friendly. However, major limitations exist in the form of depth uncertainty, due to the narrow baseline and the rolling shutter effect. In this paper, we present a dense 3D reconstruction method for small motion clips captured with commercial hand-held cameras, which typically exhibit the undesired rolling shutter artifact. To address these problems, we introduce a novel small motion bundle adjustment that effectively compensates for the rolling shutter effect. Moreover, we propose a pipeline for fine-scale dense 3D reconstruction that models the rolling shutter effect by utilizing both sparse 3D points and the camera trajectory from narrow-baseline images. In this reconstruction, the sparse 3D points are propagated to obtain an initial depth hypothesis using a geometry guidance term. Then, the depth information at each pixel is obtained by sweeping planes through the depth search space near the hypothesis. The proposed framework produces accurate dense reconstruction results suitable for various sought-after applications. Both qualitative and quantitative evaluations show that our method consistently generates better depth maps than state-of-the-art methods.
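A rolling shutter exposes each image row at a slightly different time, so a common way to model it, plausibly in the spirit of the paper's bundle adjustment though not its exact formulation, is to assign each scanline its own interpolated camera pose. A minimal sketch, interpolating translation only:

```python
import numpy as np

def row_translations(t_first, t_last, n_rows):
    """Per-scanline camera translation under a rolling shutter: row r
    is exposed at a time proportional to r, so its pose is linearly
    interpolated between the first- and last-row poses. (Translation
    only; a full model would interpolate rotation as well.)"""
    alpha = np.linspace(0.0, 1.0, n_rows)[:, None]
    return (1.0 - alpha) * np.asarray(t_first) + alpha * np.asarray(t_last)

# Camera drifts 4 mm along x during the readout of a 5-row frame.
poses = row_translations([0.0, 0.0, 0.0], [4.0, 0.0, 0.0], 5)
```

Reprojecting each 3D point with the pose of the row it lands on, instead of one pose per frame, is what lets a bundle adjustment absorb the rolling shutter distortion.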

9.
IEEE Trans Pattern Anal Mach Intell ; 41(2): 297-310, 2019 Feb.
Article in English | MEDLINE | ID: mdl-29994179

ABSTRACT

One of the core applications of light field imaging is depth estimation. To acquire a depth map, existing approaches apply a single photo-consistency measure to an entire light field. However, this is not an optimal choice because of the non-uniform light field degradations produced by limitations in the hardware design. In this paper, we introduce a pipeline that automatically determines the best configuration for photo-consistency measure, which leads to the most reliable depth label from the light field. We analyzed the practical factors affecting degradation in lenslet light field cameras, and designed a learning based framework that can retrieve the best cost measure and optimal depth label. To enhance the reliability of our method, we augmented an existing light field benchmark to simulate realistic source dependent noise, aberrations, and vignetting artifacts. The augmented dataset was used for the training and validation of the proposed approach. Our method was competitive with several state-of-the-art methods for the benchmark and real-world light field datasets.

10.
Article in English | MEDLINE | ID: mdl-30571628

ABSTRACT

As the computing power of hand-held devices grows, there has been increasing interest in the capture of depth information, to enable a variety of photographic applications. However, under low-light conditions, most devices still suffer from low imaging quality and inaccurate depth acquisition. To address the problem, we present a robust depth estimation method from a short burst shot with varied intensity (i.e., auto-exposure bracketing) and/or strong noise (i.e., high ISO). Our key idea is to synergistically combine deep convolutional neural networks with a geometric understanding of the scene. We introduce a geometric transformation between optical flow and depth tailored for burst images, enabling our learning-based multi-view stereo matching to be performed effectively. We then describe our depth estimation pipeline that incorporates this geometric transformation into our residual-flow network. It allows our framework to produce an accurate depth map even with a bracketed image sequence. We demonstrate that our method outperforms state-of-the-art methods on various datasets captured by a smartphone and a DSLR camera. Moreover, we show that the estimated depth is applicable for image quality enhancement and photographic editing.
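The flow-depth relationship underlying such a transformation can be illustrated in its simplest case, a purely lateral translation and a pinhole camera (an illustrative simplification, not the paper's full model): the flow of a static point is inversely proportional to its depth.

```python
import numpy as np

def flow_from_depth(Z, focal_px, baseline_m):
    """For a camera translating laterally by `baseline_m` between two
    burst frames, a static point at depth Z produces horizontal flow
    u = focal_px * baseline_m / Z (in pixels)."""
    return focal_px * baseline_m / Z

def depth_from_flow(u, focal_px, baseline_m):
    """Inverse of the above: Z = focal_px * baseline_m / u."""
    return focal_px * baseline_m / u

f_px, b_m = 1200.0, 0.01              # illustrative: 1200 px focal, 1 cm baseline
Z = np.array([0.5, 1.0, 2.0, 4.0])    # ground-truth depths in meters
u = flow_from_depth(Z, f_px, b_m)     # flow shrinks as depth grows
Z_hat = depth_from_flow(u, f_px, b_m)
```

For general small motions the transformation also involves the rotational flow component, which is depth-independent and must be removed first.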

11.
IEEE Trans Image Process ; 26(5): 2311-2326, 2017 May.
Article in English | MEDLINE | ID: mdl-28252398

ABSTRACT

We present a novel coded exposure video technique for multi-image motion deblurring. The key idea of this paper is to capture video frames with a set of complementary fluttering patterns, which enables us to preserve all spectrum bands of a latent image and recover a sharp latent image. To achieve this, we introduce an algorithm for generating a complementary set of binary sequences based on modern communication theory, and we implement the coded exposure video system with an off-the-shelf machine vision camera. To demonstrate the effectiveness of our method, we provide in-depth analyses of the theoretical bounds and the spectral gains of our method and other state-of-the-art computational imaging approaches. We further show deblurring results on various challenging examples, with quantitative and qualitative comparisons to other computational image capturing methods used for image deblurring, and show how our method can be applied to protect privacy in videos.
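The complementary-spectrum property the abstract relies on can be illustrated with a classic Golay complementary pair, a standard construction from communication theory (not necessarily the sequences used in the paper): each binary pattern alone has spectral nulls, but the pair's summed power spectrum is perfectly flat, so no band of the latent image is lost across the set of exposures.

```python
import numpy as np

def golay_pair(n):
    """Golay complementary pair of length 2**n via the standard
    recursive construction (a, b) -> (a | b, a | -b)."""
    a, b = np.array([1.0]), np.array([1.0])
    for _ in range(n):
        a, b = np.concatenate([a, b]), np.concatenate([a, -b])
    return a, b

a, b = golay_pair(3)             # two length-8 binary (+/-1) patterns
Pa = np.abs(np.fft.fft(a)) ** 2  # each pattern alone has spectral
Pb = np.abs(np.fft.fft(b)) ** 2  # nulls (bands lost to blur) ...
total = Pa + Pb                  # ... but the pair's summed power
                                 # spectrum is flat (= 2N everywhere)
```

A frame blurred under one pattern alone would be impossible to deconvolve at its null frequencies; capturing consecutive frames with complementary patterns keeps the joint inverse problem well-posed.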

12.
IEEE Trans Pattern Anal Mach Intell ; 39(2): 287-300, 2017 02.
Article in English | MEDLINE | ID: mdl-26978556

ABSTRACT

We present a novel method for the geometric calibration of micro-lens-based light field cameras. Accurate geometric calibration is the basis of various applications. Instead of using sub-aperture images, we directly utilize raw images for calibration. We select appropriate regions in raw images and extract line features from micro-lens images in those regions. For the entire process, we formulate a new projection model of a micro-lens-based light field camera, which contains a smaller number of parameters than previous models. The model is transformed into a linear form using line features. We compute the initial solution of both the intrinsic and the extrinsic parameters by a linear computation and refine them via non-linear optimization. Experimental results demonstrate the accuracy of the correspondences between rays and pixels in raw images, as estimated by the proposed method.
