1.
Sensors (Basel) ; 23(2)2023 Jan 11.
Article in English | MEDLINE | ID: mdl-36679643

ABSTRACT

Estimating accurate depth from 2D images is important, and depth estimation has been studied for a long time. Recently, deep-learning-based depth estimation from monocular camera images has advanced, and various techniques for estimating accurate depth are being investigated. However, predicting the boundaries between objects remains a weakness of depth estimation from 2D images. In this paper, we aim to predict refined depth maps by emphasizing precise boundaries between objects. We propose a depth estimation network with an encoder-decoder structure that uses the Laplacian pyramid and the local planar guidance method. When the features learned by the encoder are upsampled in the decoder, these two techniques guide sharper object boundaries and thereby produce a clearer depth map. We train and test our models on the KITTI and NYU Depth V2 datasets. The proposed network is a DNN built from convolutions only and uses ConvNeXt as its backbone. The trained model achieves an absolute relative error (Abs_rel) of 0.054 and a root mean square error (RMSE) of 2.252 on the KITTI dataset, and an Abs_rel of 0.102 and an RMSE of 0.355 on the NYU Depth V2 dataset. Among state-of-the-art monocular depth estimation methods, our network ranks fifth on the KITTI Eigen split and eighth on NYU Depth V2.
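
As a concrete illustration of the boundary-preserving idea, the following is a minimal Python sketch of Laplacian pyramid decomposition and reconstruction: high-frequency residuals (object boundaries) are kept separate from the coarse structure and added back during upsampling. The function names are illustrative; this is a sketch of the general technique, not the authors' code.

    # Minimal Laplacian pyramid sketch (illustrative, not the paper's code).
    import cv2
    import numpy as np

    def build_laplacian_pyramid(img, levels=4):
        """Split an image into per-level high-frequency residuals plus a coarse base."""
        pyramid, current = [], img.astype(np.float32)
        for _ in range(levels):
            down = cv2.pyrDown(current)                          # blur + 2x downsample
            up = cv2.pyrUp(down, dstsize=current.shape[1::-1])   # back to current size
            pyramid.append(current - up)                         # boundary detail
            current = down
        pyramid.append(current)                                  # coarse base
        return pyramid

    def reconstruct(pyramid):
        """Invert the decomposition: upsample the base and add residuals back."""
        current = pyramid[-1]
        for residual in reversed(pyramid[:-1]):
            current = cv2.pyrUp(current, dstsize=residual.shape[1::-1]) + residual
        return current

Because the residual levels carry exactly the detail lost by downsampling, adding them back level by level restores sharp edges; a decoder can exploit the same structure to recover object boundaries at each upsampling step.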


Subject(s)
Algorithms , Depth Perception
2.
Sensors (Basel) ; 22(5)2022 Mar 03.
Article in English | MEDLINE | ID: mdl-35271133

ABSTRACT

In this paper, I propose a bird's-eye-view image method that detects parking areas and collision-risk areas simultaneously in parking situations. Deep learning algorithms for area detection and semantic segmentation were used. The main architecture of the method is based on a harmonic densely connected network (HarDNet) and a cross-stage partial (CSP) network. The training dataset was generated from four calibrated 190° wide-angle cameras producing around view monitor (AVM) images of the Chungbuk National University parking lot, and the experiments were performed on this dataset. In the experiments, the available parking area was visualized by detecting the parking lines, parking areas, and drivable area in the AVM images. Furthermore, regions left undetected by the semantic segmentation were visualized as collision-risk areas. The proposed attention CSPHarDNet model achieved 81.89% mIoU at 18.36 FPS on an NVIDIA Xavier. These results demonstrate that the algorithm can run in real time in a parking situation and outperforms the conventional HarDNet.
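
For reference, the following is a minimal Python sketch of the mean intersection-over-union (mIoU) metric reported above, computed from a confusion matrix. The integer label-map format and class count are assumptions of this sketch; it is not the authors' evaluation code.

    # Standard mIoU computation sketch (assumed label-map format).
    import numpy as np

    def mean_iou(pred, target, num_classes):
        """Compute mean IoU over classes from integer label maps."""
        valid = (target >= 0) & (target < num_classes)           # drop ignore labels
        conf = np.bincount(num_classes * target[valid] + pred[valid],
                           minlength=num_classes ** 2).reshape(num_classes, num_classes)
        intersection = np.diag(conf)
        union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
        iou = intersection / np.maximum(union, 1)                # guard empty classes
        return iou.mean()

The per-class IoU is the diagonal of the confusion matrix divided by the union of each class's row and column; averaging over classes gives the single mIoU figure quoted in the results.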


Subject(s)
Automobile Driving , Deep Learning , Algorithms , Humans , Neural Networks, Computer
3.
Sensors (Basel) ; 21(22)2021 Nov 17.
Article in English | MEDLINE | ID: mdl-34833698

ABSTRACT

Although numerous road segmentation studies have utilized vision data, obtaining robust classification is still challenging due to vision sensor noise and target object deformation. Long-distance images remain problematic because of blur and low resolution, which make it difficult to distinguish roads from objects. This study utilizes light detection and ranging (LiDAR), which provides information that camera images lack, such as distance, height, and intensity, as a reliable supplement to address this problem. In contrast to conventional approaches, an additional domain transformation to a bird's eye view space is executed to obtain long-range data with resolutions comparable to those of short-range data. This study proposes a convolutional neural network architecture that processes data transformed to the bird's eye view plane. The network's pathways are split into two parts to resolve calibration errors between the transformed image and the point cloud. The network, which has modules that operate sequentially at various scaled dilated convolution rates, is designed to handle a wide range of data quickly and accurately. Comprehensive empirical studies using the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) road detection benchmarks demonstrate that this study's approach takes advantage of both camera and LiDAR information, achieving robust road detection with short runtimes. Our result ranks 22nd on the KITTI leaderboard and shows real-time performance.
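
To illustrate the domain transformation step, here is a minimal Python sketch that rasterizes a LiDAR point cloud into a bird's-eye-view grid with height and intensity channels. The grid extents, resolution, and channel choices are assumptions of this sketch, not values from the paper.

    # Point-cloud-to-BEV rasterization sketch (grid parameters are assumed).
    import numpy as np

    def lidar_to_bev(points, x_range=(0.0, 40.0), y_range=(-10.0, 10.0), res=0.1):
        """Rasterize (N, 4) LiDAR points (x, y, z, intensity) into a 2-channel BEV map."""
        x, y, z, intensity = points.T
        keep = (x >= x_range[0]) & (x < x_range[1]) & \
               (y >= y_range[0]) & (y < y_range[1])              # crop to the grid
        x, y, z, intensity = x[keep], y[keep], z[keep], intensity[keep]
        rows = ((y - y_range[0]) / res).astype(int)              # lateral axis
        cols = ((x - x_range[0]) / res).astype(int)              # forward axis
        h = int(round((y_range[1] - y_range[0]) / res))
        w = int(round((x_range[1] - x_range[0]) / res))
        height_map = np.full((h, w), -np.inf, np.float32)
        inten_map = np.zeros((h, w), np.float32)
        np.maximum.at(height_map, (rows, cols), z)               # tallest return per cell
        np.maximum.at(inten_map, (rows, cols), intensity)        # strongest return per cell
        height_map[np.isinf(height_map)] = 0.0                   # empty cells -> 0
        return np.stack([height_map, inten_map])                 # (2, H, W)

Because every grid cell covers a fixed metric footprint, distant and nearby regions end up at the same resolution in the BEV plane, which is the property the paper exploits for long-range road detection.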


Subject(s)
Neural Networks, Computer
4.
IEEE Trans Image Process ; 20(4): 1152-65, 2011 Apr.
Article in English | MEDLINE | ID: mdl-20923738

ABSTRACT

The authors present a robust face recognition system for large-scale data sets captured under uncontrolled illumination variations. The proposed system consists of a novel illumination-insensitive preprocessing method, hybrid Fourier-based facial feature extraction, and a score fusion scheme. First, in the preprocessing stage, a face image is transformed into an illumination-insensitive image, called an "integral normalized gradient image," by normalizing and integrating the smoothed gradients of the facial image. Then, for feature extraction by complementary classifiers, multiple face models based upon hybrid Fourier features are applied. The hybrid Fourier features are extracted from different Fourier domains in different frequency bandwidths, and each feature is individually classified by linear discriminant analysis. In addition, multiple face models are generated from plural normalized face images that have different eye distances. Finally, to combine the scores from the multiple complementary classifiers, a log-likelihood-ratio-based score fusion scheme is applied. The proposed system is evaluated using the experimental protocols of the Face Recognition Grand Challenge (FRGC), a large publicly available data set. Experimental results on the FRGC version 2.0 data sets show that the proposed method achieves an average verification rate of 81.49% on 2-D face images under various environmental variations such as illumination changes, expression changes, and elapsed time.
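
For illustration, the following is a minimal Python sketch of log-likelihood-ratio score fusion across complementary classifiers. It assumes one-dimensional Gaussian genuine and impostor score densities per classifier; that Gaussian assumption, and all parameter names, belong to this sketch rather than to the paper, where the densities would be estimated from training data.

    # Log-likelihood-ratio score fusion sketch (Gaussian densities are an
    # assumption of this sketch, not a detail stated in the abstract).
    import numpy as np
    from scipy.stats import norm

    def llr_fuse(scores, genuine_params, impostor_params):
        """Sum per-classifier log-likelihood ratios into one fused match score."""
        fused = 0.0
        for s, (mu_g, sd_g), (mu_i, sd_i) in zip(scores, genuine_params, impostor_params):
            fused += norm.logpdf(s, mu_g, sd_g) - norm.logpdf(s, mu_i, sd_i)
        return fused   # accept the claimed identity if fused exceeds a threshold

    # Hypothetical usage: two classifiers, each with fitted (mean, std) pairs.
    # llr_fuse([0.8, 0.6], [(0.9, 0.1), (0.7, 0.15)], [(0.3, 0.2), (0.2, 0.2)])

Summing the per-classifier log ratios treats the classifiers' scores as independent evidence, so a classifier whose genuine and impostor distributions overlap heavily contributes little to the fused decision.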


Subject(s)
Artifacts , Biometry/methods , Face/anatomy & histology , Image Interpretation, Computer-Assisted/methods , Lighting/methods , Pattern Recognition, Automated/methods , Photography/methods , Algorithms , Computer Simulation , Fourier Analysis , Humans , Image Enhancement/methods , Models, Anatomic , Reproducibility of Results , Sensitivity and Specificity