Results 1 - 20 of 108
1.
Sensors (Basel) ; 24(10)2024 May 12.
Article in English | MEDLINE | ID: mdl-38793931

ABSTRACT

Image fusion enriches an image and improves its quality, facilitating subsequent image processing and analysis. As image fusion technology has grown in importance, the fusion of infrared and visible images has received extensive attention, and deep learning is now widely used in the field. However, in some applications it is not possible to obtain a large amount of training data. Because certain specialized organs of snakes can receive and process both infrared and visible information, fusion methods that simulate the snake visual mechanism have emerged. This paper therefore approaches image fusion from the perspective of visual bionics; such methods do not require a large amount of training data. However, most fusion methods that simulate snake vision suffer from unclear details, so this paper combines the approach with a pulse-coupled neural network (PCNN). By studying two receptive-field models of retinal nerve cells, six dual-mode cell imaging mechanisms of rattlesnakes with their mathematical models, and the PCNN model, an improved fusion method for infrared and visible images is proposed. The method was evaluated on eleven groups of source images using three no-reference image quality metrics against seven other fusion methods. The experimental results show that the proposed algorithm outperforms the comparison methods overall on all three metrics.
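A minimal PCNN can be sketched as a firing-count model: each pixel is a neuron whose dynamic threshold jumps when it fires and decays otherwise, with firing propagated to neighbours through a linking kernel. This is an illustrative baseline with arbitrary parameter values and invented names (`pcnn_firing_map`, the 3×3 kernel, the decay constants), not the paper's improved fusion method.

```python
import numpy as np

def pcnn_firing_map(img, iterations=30, beta=0.1, alpha_theta=0.2, v_theta=2.0):
    """Count how many times each pixel 'fires' in a minimal PCNN.
    Parameter values are illustrative, not taken from the paper."""
    img = img.astype(float)
    rng = img.max() - img.min()
    img = (img - img.min()) / (rng + 1e-12)       # normalize stimulus to [0, 1]
    kernel = np.array([[0.5, 1.0, 0.5],
                       [1.0, 0.0, 1.0],
                       [0.5, 1.0, 0.5]])           # linking weights (3x3 neighbourhood)
    Y = np.zeros_like(img)                         # current firing state
    theta = np.zeros_like(img)                     # dynamic threshold
    fire_count = np.zeros_like(img)
    h, w = img.shape
    for _ in range(iterations):
        # linking input: weighted sum of neighbouring firings
        padded = np.pad(Y, 1)
        L = np.zeros_like(img)
        for di in range(3):
            for dj in range(3):
                L += kernel[di, dj] * padded[di:di + h, dj:dj + w]
        U = img * (1.0 + beta * L)                 # internal activity
        Y = (U > theta).astype(float)              # fire where activity beats threshold
        theta = np.exp(-alpha_theta) * theta + v_theta * Y  # decay, then boost where fired
        fire_count += Y
    return fire_count
```

Brighter stimuli beat the decaying threshold more often, so the firing-count map encodes intensity and local structure, which is what fusion rules built on PCNNs exploit.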


Subject(s)
Image Processing, Computer-Assisted ; Neural Networks, Computer ; Snakes ; Animals ; Image Processing, Computer-Assisted/methods ; Algorithms ; Deep Learning ; Infrared Rays
2.
Sensors (Basel) ; 24(14)2024 Jul 19.
Article in English | MEDLINE | ID: mdl-39066083

ABSTRACT

Infrared images hold significant value in applications such as remote sensing and fire safety. However, infrared detectors often carry high hardware costs, which limits their widespread use. Advancements in deep learning have spurred innovative approaches to image super-resolution (SR), but comparatively few efforts have been dedicated to infrared images. To address this, we design the Residual Swin Transformer and Average Pooling Block (RSTAB) and propose SwinAIR, which can effectively extract and fuse the diverse frequency features in infrared images and achieve superior SR reconstruction performance. By further integrating SwinAIR with U-Net, we propose SwinAIR-GAN for real infrared image SR reconstruction. SwinAIR-GAN extends the degradation space to better simulate the degradation process of real infrared images. Additionally, it incorporates spectral normalization, dropout, and an artifact discrimination loss to reduce potential image artifacts. Qualitative and quantitative evaluations on various datasets confirm the effectiveness of our method in reconstructing realistic textures and details of infrared images.

3.
Sensors (Basel) ; 24(5)2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38475150

ABSTRACT

Because capturing real optical flow is complex, existing research has not yet produced optical flow ground truth for infrared (IR) images, which limits deep learning-based optical flow computation to the RGB domain. Therefore, in this paper, we propose a method to produce an optical flow dataset of IR images. We use an RGB-IR cross-modal image transformation network to transform existing RGB optical flow datasets. The transformation network is based on an improved Pix2Pix implementation and is validated and evaluated on the RGB-IR aligned bimodal dataset M3FD. RGB-IR cross-modal transformation is then applied to the existing RGB optical flow dataset KITTI, and an optical flow computation network is trained on the IR images generated by the transformation. Finally, the optical flow network's results before and after this training are analyzed on the RGB-IR aligned bimodal data.

4.
Sensors (Basel) ; 24(5)2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38475212

ABSTRACT

Steel surfaces often display intricate texture patterns that can resemble defects, posing a challenge in accurately identifying actual defects. It is therefore crucial to develop a highly robust defect detection model. This study proposes a defect detection method for steel infrared images based on a regularized YOLO framework. First, Coordinate Attention (CA) is embedded within the C2F module, using a lightweight attention mechanism to enhance the feature extraction capability of the backbone network. Second, the neck incorporates a Bi-directional Feature Pyramid Network (BiFPN) for weighted fusion of multi-scale feature maps, yielding a model called BiFPN-Concat with stronger feature fusion capability. Finally, the loss function of the model is regularized to improve generalization. The experimental results indicate that the model has only 3.03 M parameters, yet achieves a mAP@0.5 of 80.77% on the NEU-DET dataset and 99.38% on the ECTI dataset, improvements of 2.3% and 1.6% over the baseline model, respectively. The method is well suited to industrial applications involving non-destructive testing of steel using infrared imagery.
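The weighted fusion the BiFPN neck relies on is a small normalized sum over same-resolution feature maps. A sketch following the original BiFPN paper's "fast normalized fusion" convention (the ReLU clipping and epsilon are that paper's, not this article's):

```python
import numpy as np

def bifpn_weighted_fusion(features, weights):
    """Fast normalized fusion: per-input learnable weights are clipped to be
    non-negative (ReLU) and normalized to sum to ~1 before the weighted sum."""
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # ReLU keeps weights >= 0
    w = w / (w.sum() + 1e-4)                               # fast normalization
    fused = np.zeros_like(features[0], dtype=float)
    for wi, fi in zip(w, features):
        fused += wi * fi
    return fused
```

In a real network the weights are learned parameters and the inputs are resized feature maps; the normalization keeps training stable without a softmax.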

5.
Sensors (Basel) ; 24(4)2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38400285

ABSTRACT

Infrared image processing is an effective method for diagnosing faults in electrical equipment, in which target device segmentation and temperature feature extraction are key steps. Target device segmentation separates the device to be diagnosed from the image, while temperature feature extraction analyzes whether the device is overheating and has potential faults. However, segmentation of infrared images of electrical equipment is slow owing to high computational complexity, and the extracted temperature information lacks accuracy because the non-linear relationship between image grayscale and temperature is insufficiently considered. Therefore, in this study, we propose a maximum between-class variance (OTSU) segmentation algorithm optimized with the Grey Wolf Optimizer (GWO), which accelerates segmentation by optimizing the threshold search. The experimental results show that, compared to the non-optimized method, the optimized approach reduces the threshold calculation time by more than 83.99% while maintaining similar segmentation results. On this basis, to address the insufficient accuracy of temperature feature extraction, we propose a temperature value extraction method for infrared images based on the K-nearest neighbor (KNN) algorithm. The experimental results demonstrate that, compared to traditional linear methods, this method improves the maximum absolute residual of the extracted temperature values by 73.68% and the average absolute residual by 78.95%.
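For reference, the exhaustive OTSU threshold search that GWO is used to accelerate can be sketched directly; this is only the plain baseline, and the GWO wrapper itself is omitted:

```python
import numpy as np

def otsu_threshold(img, bins=256):
    """Maximum between-class variance (Otsu) threshold via exhaustive search
    over histogram bins. GWO replaces this brute-force scan in the paper."""
    hist, edges = np.histogram(img.ravel(), bins=bins)
    p = hist.astype(float) / hist.sum()          # bin probabilities
    centers = (edges[:-1] + edges[1:]) / 2.0
    best_t, best_var = centers[0], -1.0
    for k in range(1, bins):
        w0, w1 = p[:k].sum(), p[k:].sum()        # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (p[:k] * centers[:k]).sum() / w0   # class means
        mu1 = (p[k:] * centers[k:]).sum() / w1
        between = w0 * w1 * (mu0 - mu1) ** 2     # between-class variance
        if between > best_var:
            best_var, best_t = between, centers[k]
    return best_t
```

A metaheuristic such as GWO evaluates the same between-class variance objective at a handful of candidate thresholds instead of all 256, which is where the reported speedup comes from.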

6.
Sensors (Basel) ; 24(17)2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39275771

ABSTRACT

Infrared and visible image fusion can integrate rich edge details and salient infrared targets, resulting in high-quality images suitable for advanced tasks. However, most available algorithms struggle to fully extract detailed features and overlook the interaction of complementary features across different modal images during the feature fusion process. To address this gap, this study presents a novel fusion method based on multi-scale edge enhancement and a joint attention mechanism (MEEAFusion). Initially, convolution kernels of varying scales were utilized to obtain shallow features with multiple receptive fields unique to the source image. Subsequently, a multi-scale gradient residual block (MGRB) was developed to capture the high-level semantic information and low-level edge texture information of the image, enhancing the representation of fine-grained features. Then, the complementary feature between infrared and visible images was defined, and a cross-transfer attention fusion block (CAFB) was devised with joint spatial attention and channel attention to refine the critical supplemental information. This allowed the network to obtain fused features that were rich in both common and complementary information, thus realizing feature interaction and pre-fusion. Lastly, the features were reconstructed to obtain the fused image. Extensive experiments on three benchmark datasets demonstrated that the MEEAFusion proposed in this research has considerable strengths in terms of rich texture details, significant infrared targets, and distinct edge contours, and it achieves superior fusion performance.

7.
Sensors (Basel) ; 24(4)2024 Feb 19.
Article in English | MEDLINE | ID: mdl-38400490

ABSTRACT

This paper presents an FPGA-based lightweight and real-time infrared image processor built from a series of hardware-oriented lightweight algorithms. A two-point correction algorithm based on blackbody radiation is introduced to calibrate the non-uniformity of the sensor. With precomputed gain and offset matrices, the design achieves real-time non-uniformity correction at a resolution of 640×480. The blind pixel detection algorithm employs a first-level approximation to simplify multiple iterative computations. The blind pixel compensation algorithm is built on the side-window-filtering method, computing the results of eight side-window convolution kernels in parallel to improve processing speed. Thanks to this side-window-filtering-based compensation, blind pixels are effectively corrected while image details are preserved. Before image output, we also incorporate lightweight histogram equalization to make the processed image more easily observable to the human eye. The processor is implemented on a Xilinx XC7A100T-2 and uses 10,894 LUTs, 9367 FFs, 4 BRAMs, and 5 DSP48 slices. Under a 50 MHz clock, it achieves 30 frames per second at a power cost of 1800 mW, and its maximum operating frequency reaches 186 MHz. Compared with existing similar works, our infrared image processor incurs minimal resource overhead and has lower power consumption.
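Two-point correction reduces to a per-pixel affine map: the gain and offset matrices are derived once from a low- and a high-temperature blackbody frame, then applied to every raw frame. A minimal sketch with invented function names and user-chosen target levels:

```python
import numpy as np

def calibrate_two_point(low_frame, high_frame, t_low, t_high):
    """Derive per-pixel gain/offset so each pixel maps its two blackbody
    responses onto the desired target levels t_low and t_high."""
    gain = (t_high - t_low) / (high_frame - low_frame)
    offset = t_low - gain * low_frame
    return gain, offset

def two_point_nuc(raw, gain, offset):
    """Apply the precomputed correction: one multiply-add per pixel,
    which is why the FPGA version runs in real time."""
    return gain * raw.astype(float) + offset
```

On hardware, the gain/offset matrices live in memory and the per-frame work is exactly the single multiply-add shown here.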

8.
Sensors (Basel) ; 24(4)2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38400227

ABSTRACT

Among the numerous gaze-estimation methods currently available, appearance-based methods predominantly use RGB images as input and employ convolutional neural networks (CNNs) to detect facial images to regressively obtain gaze angles or gaze points. Model-based methods require high-resolution images to obtain a clear eyeball geometric model. These methods face significant challenges in outdoor environments and practical application scenarios. This paper proposes a model-based gaze-estimation algorithm using a low-resolution 3D TOF camera. This study uses infrared images instead of RGB images as input to overcome the impact of varying illumination intensity in the environment on gaze estimation. We utilized a trained YOLOv8 neural network model to detect eye landmarks in captured facial images. Combined with the depth map from a time-of-flight (TOF) camera, we calculated the 3D coordinates of the canthus points of a single eye of the subject. Based on this, we fitted a 3D geometric model of the eyeball to determine the subject's gaze angle. Experimental validation showed that our method achieved a root mean square error of 6.03° and 4.83° in the horizontal and vertical directions, respectively, for the detection of the subject's gaze angle. We also tested the proposed method in a real car driving environment, achieving stable driver gaze detection at various locations inside the car, such as the dashboard, driver mirror, and the in-vehicle screen.
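Once the eyeball centre and pupil centre are known in 3-D, the gaze angles follow from simple trigonometry. A toy stand-in for the paper's fitted eye model, assuming a camera-centred frame looking along +z (the coordinate convention is ours, not the paper's):

```python
import numpy as np

def gaze_angles(eyeball_center, pupil_center):
    """Horizontal (yaw) and vertical (pitch) gaze angles in degrees, from
    the 3-D eyeball centre to the pupil centre."""
    v = np.asarray(pupil_center, float) - np.asarray(eyeball_center, float)
    yaw = np.degrees(np.arctan2(v[0], v[2]))                     # left/right
    pitch = np.degrees(np.arctan2(v[1], np.hypot(v[0], v[2])))   # up/down
    return yaw, pitch
```

The hard part in practice is estimating the two 3-D points robustly from the TOF depth map and detected landmarks; the angle computation itself is this small.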

9.
Entropy (Basel) ; 26(8)2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39202151

ABSTRACT

In order to minimize the disparity between visible and infrared modalities and enhance pedestrian feature representation, a cross-modality person re-identification method is proposed, which integrates modality generation and feature enhancement. Specifically, a lightweight network is used for dimension reduction and augmentation of visible images, and intermediate modalities are generated to bridge the gap between visible images and infrared images. The Convolutional Block Attention Module is embedded into the ResNet50 backbone network to selectively emphasize key features sequentially from both channel and spatial dimensions. Additionally, the Gradient Centralization algorithm is introduced into the Stochastic Gradient Descent optimizer to accelerate convergence speed and improve generalization capability of the network model. Experimental results on SYSU-MM01 and RegDB datasets demonstrate that our improved network model achieves significant performance gains, with an increase in Rank-1 accuracy of 7.12% and 6.34%, as well as an improvement in mAP of 4.00% and 6.05%, respectively.
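Gradient Centralization itself is a one-line transform applied to each gradient tensor just before the SGD update. A minimal numpy sketch of the published rule:

```python
import numpy as np

def gradient_centralization(grad):
    """Subtract from each output-channel slice of the gradient its mean over
    the remaining axes, so each slice has zero mean."""
    if grad.ndim <= 1:
        return grad  # GC is typically skipped for biases / 1-D parameters
    axes = tuple(range(1, grad.ndim))
    return grad - grad.mean(axis=axes, keepdims=True)
```

Centralizing the gradient constrains the weight update to a zero-mean hyperplane, which is the mechanism behind the faster convergence and better generalization the abstract reports.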

10.
Sensors (Basel) ; 23(9)2023 Apr 24.
Article in English | MEDLINE | ID: mdl-37177444

ABSTRACT

Currently, infrared small target detection and tracking under complex backgrounds remains challenging because of the low resolution of infrared images and the lack of shape and texture features in these small targets. This study proposes a framework for infrared vehicle small target detection and tracking, comprising three components: full-image object detection, cropped-image object detection and tracking, and object trajectory prediction. We designed a CNN-based real-time detection model with a high recall rate for the first component to detect potential object regions in the entire image. The KCF algorithm and the designed lightweight CNN-based target detection model, which parallelly lock on the target more precisely in the target potential area, were used in the second component. In the final component, we designed an optimized Kalman filter to estimate the target's trajectory. We validated our method on a public dataset. The results show that the proposed real-time detection and tracking framework for infrared vehicle small targets could steadily track vehicle targets and adapt well in situations such as the temporary disappearance of targets and interference from other vehicles.
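The trajectory-prediction component can be illustrated with a textbook constant-velocity Kalman filter. This is only the unoptimized baseline with arbitrary noise settings, not the paper's optimized filter:

```python
import numpy as np

class ConstantVelocityKF:
    """Minimal 2-D constant-velocity Kalman filter: state [px, py, vx, vy],
    position-only measurements. Noise levels are illustrative."""
    def __init__(self, dt=1.0, q=1e-2, r=1.0):
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)   # state transition
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)   # observe position only
        self.Q = q * np.eye(4)       # process noise
        self.R = r * np.eye(2)       # measurement noise
        self.x = np.zeros(4)
        self.P = np.eye(4) * 10.0    # initial state covariance

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]            # predicted position

    def update(self, z):
        y = np.asarray(z, float) - self.H @ self.x       # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)          # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
```

During a temporary target disappearance, the tracker can keep calling `predict()` without `update()`, coasting on the estimated velocity until detections resume.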

11.
Sensors (Basel) ; 23(16)2023 Aug 21.
Article in English | MEDLINE | ID: mdl-37631844

ABSTRACT

Infrared ship target detection is a crucial technology in marine scenarios. Ship targets vary in scale throughout navigation because the distance between the ship and the infrared camera constantly changes. Furthermore, complex backgrounds, such as sea clutter, can cause significant interference during detection. In this paper, multiscale morphological reconstruction-based saliency mapping combined with a two-branch compensation strategy (MMRSM-TBC) is proposed for detecting ship targets of various sizes against complex backgrounds. First, a multiscale morphological reconstruction method is proposed to enhance the ship targets in the infrared image and suppress irrelevant background. Then, by introducing a structure tensor with two feature-based filter templates, we use the contour information of the ship targets to further increase their intensities in the saliency map. After that, a two-branch compensation strategy is proposed to handle the uneven distribution of image grayscale. Finally, the target is extracted using an adaptive threshold. The experimental results show that the proposed algorithm performs strongly in detecting ships of different sizes and achieves higher accuracy than existing methods.

12.
Sensors (Basel) ; 23(20)2023 Oct 12.
Article in English | MEDLINE | ID: mdl-37896518

ABSTRACT

In the context of non-uniformity correction (NUC) within infrared imaging systems, current methods frequently concentrate solely on high-frequency stripe non-uniformity noise, neglecting the impact of global low-frequency non-uniformity on image quality, and are susceptible to ghosting artifacts from neighboring frames. In response to such challenges, we propose a method for the correction of non-uniformity in single-frame infrared images based on noise separation in the wavelet domain. More specifically, we commence by decomposing the noisy image into distinct frequency components through wavelet transformation. Subsequently, we employ a clustering algorithm to extract high-frequency noise from the vertical components within the wavelet domain, concurrently employing a method of surface fitting to capture low-frequency noise from the approximate components within the wavelet domain. Ultimately, the restored image is obtained by subtracting the combined noise components. The experimental results demonstrate that the proposed method, when applied to simulated noisy images, achieves the optimal levels among seven compared methods in terms of MSE, PSNR, and SSIM metrics. After correction on three sets of real-world test image sequences, the average non-uniformity index is reduced by 75.54%. Moreover, our method does not impose significant computational overhead in the elimination of superimposed noise, which is particularly suitable for applications necessitating stringent requirements in both image quality and processing speed.
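The wavelet decomposition underlying the noise separation can be sketched with a single-level Haar transform; the clustering and surface-fitting steps are omitted, and the LL/LH/HL/HH subband naming is the usual convention rather than the paper's:

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar transform. Returns the approximation (LL) and
    the detail subbands; the paper fits low-frequency noise on LL and
    extracts stripe noise from the vertical detail band."""
    a, b = img[0::2, :], img[1::2, :]
    lo_r, hi_r = (a + b) / 2.0, (a - b) / 2.0      # row pass
    def split_cols(x):
        return (x[:, 0::2] + x[:, 1::2]) / 2.0, (x[:, 0::2] - x[:, 1::2]) / 2.0
    LL, LH = split_cols(lo_r)                       # column pass
    HL, HH = split_cols(hi_r)
    return LL, LH, HL, HH

def haar_idwt2(LL, LH, HL, HH):
    """Exact inverse of haar_dwt2 for even-sized inputs: once noise is
    subtracted from the subbands, this reassembles the restored image."""
    lo_r = np.empty((LL.shape[0], LL.shape[1] * 2))
    hi_r = np.empty_like(lo_r)
    lo_r[:, 0::2], lo_r[:, 1::2] = LL + LH, LL - LH
    hi_r[:, 0::2], hi_r[:, 1::2] = HL + HH, HL - HH
    img = np.empty((LL.shape[0] * 2, lo_r.shape[1]))
    img[0::2, :], img[1::2, :] = lo_r + hi_r, lo_r - hi_r
    return img
```

Because the transform is perfectly invertible, any correction applied in the subbands maps back to the image domain without additional loss.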

13.
Sensors (Basel) ; 23(21)2023 Oct 31.
Article in English | MEDLINE | ID: mdl-37960559

ABSTRACT

Real-time compression of high-dynamic-range images into low-dynamic-range images while preserving maximum detail remains a critical technology in infrared image processing. We propose a dynamic range compression and enhancement algorithm for infrared images with local optimal contrast (DRCE-LOC). The algorithm has four steps. First, the original image is divided into blocks to determine the optimal stretching coefficient from local block information. Second, a low-pass filter splits the original image into a background layer and a detail layer; the background layer is compressed with adaptive gain, and the detail layer is enhanced to suit the visual characteristics of the human eye. Third, using the original image as input and the compressed background layer as a brightness guide, dynamic range compression is applied with the local optimal stretching coefficient. Fourth, an 8-bit image is created (from typical 14-bit input) by merging the enhanced detail and the compressed background. Implemented on an FPGA, the design uses 2.2554 Mb of block RAM, five dividers, and a root calculator, with a total image delay of 0.018 s. We compared mainstream algorithms in various scenarios (rich scenes, small targets, and indoor scenes), confirming the proposed algorithm's superiority in real-time processing, resource utilization, detail preservation, and visual quality.
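The base/detail split in step two can be sketched with a box low-pass filter. This toy version compresses the base layer globally instead of using the paper's local optimal stretching and brightness guidance, so it only illustrates the structure of the pipeline:

```python
import numpy as np

def box_blur(img, k=5):
    """Simple box low-pass filter via a padded sliding sum (edge-replicated)."""
    pad = k // 2
    p = np.pad(img.astype(float), pad, mode='edge')
    out = np.zeros_like(img, dtype=float)
    for di in range(k):
        for dj in range(k):
            out += p[di:di + img.shape[0], dj:dj + img.shape[1]]
    return out / (k * k)

def compress_dynamic_range(img14, detail_gain=2.0):
    """Base/detail split followed by range compression: the base layer is
    squeezed into 8 bits and the detail layer is boosted before merging."""
    img = img14.astype(float)
    base = box_blur(img)
    detail = img - base                                # high-frequency layer
    lo, hi = base.min(), base.max()
    base8 = (base - lo) / (hi - lo + 1e-12) * 255.0    # global base compression
    out = np.clip(base8 + detail_gain * detail, 0, 255)
    return out.astype(np.uint8)
```

Keeping the detail layer out of the compression step is what preserves fine texture when a 14-bit range is squeezed into 8 bits.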

14.
Sensors (Basel) ; 23(19)2023 Sep 27.
Article in English | MEDLINE | ID: mdl-37836931

ABSTRACT

Infrared sensors capture the thermal radiation emitted by objects. They can operate in all weather conditions and are thus employed in fields such as military surveillance, autonomous driving, and medical diagnostics. However, infrared imagery poses challenges such as low contrast and indistinct textures, due to the long wavelength of infrared radiation and susceptibility to interference; in addition, complex enhancement algorithms make real-time processing difficult. To address these problems and improve visual quality, we propose a multi-scale FPGA-based method for real-time enhancement of infrared images using a rolling guidance filter (RGF) and contrast-limited adaptive histogram equalization (CLAHE). Specifically, the original image is first decomposed into detail layers at various scales and a base layer using RGF. Second, we fuse the detail layers across scales and enhance the detail information with gain coefficients, while CLAHE improves the contrast of the base layer. Third, we fuse the detail layers and base layer to obtain an image that preserves the global detail of the input. Finally, the algorithm is implemented on an FPGA using high-level synthesis tools. Comprehensive testing on the AXU15EG board demonstrates that the method significantly improves image contrast and enhances detail, achieving real-time enhancement at 147 FPS for infrared images with a resolution of 640 × 480.

15.
Sensors (Basel) ; 23(22)2023 Nov 09.
Article in English | MEDLINE | ID: mdl-38005458

ABSTRACT

Infrared image sensing technology has received widespread attention because it is insensitive to environmental conditions and offers good target recognition and high anti-interference ability. However, as infrared focal-plane integration improves, the dynamic range of the photoelectric system becomes difficult to extend; that is, the restrictive trade-off between noise and full-well capacity becomes particularly prominent. Since the capacitance of an inversion-mode MOS capacitor changes adaptively with the gate-source voltage, using it as the capacitor in the infrared pixel circuit can resolve the conflict between noise at low illumination and full-well capacity at high illumination. To this end, a high-dynamic-range pixel structure based on adaptive capacitance is proposed, allowing the capacitance of the infrared image sensor to vary automatically from 6.5 fF to 37.5 fF as light intensity increases. Based on 55 nm CMOS process technology, the performance parameters of an infrared image sensor with a 12,288 × 12,288 pixel array are studied. The results show that a small 5.5 µm × 5.5 µm pixel achieves a large full-well capacity of 1.31 Me- with variable conversion gain, noise of less than 0.43 e-, and a dynamic range of more than 130 dB.
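The quoted dynamic range follows directly from the reported full-well capacity and noise floor, using the standard 20·log10 definition:

```python
import math

# Dynamic range in dB from full-well capacity and noise floor,
# with the figures quoted in the abstract.
full_well_e = 1.31e6   # electrons
noise_e = 0.43         # electrons
dr_db = 20 * math.log10(full_well_e / noise_e)
print(round(dr_db, 1))  # → 129.7
```

About 129.7 dB, consistent with the abstract's claim of more than 130 dB once the noise is strictly below 0.43 e-.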

16.
Sensors (Basel) ; 23(6)2023 Mar 11.
Article in English | MEDLINE | ID: mdl-36991744

ABSTRACT

As the demand for thermal information increases in industrial fields, numerous studies have focused on enhancing the quality of infrared images. Previous studies have attempted to independently overcome one of the two main degradations of infrared images, fixed pattern noise (FPN) and blurring artifacts, neglecting the other problems, to reduce the complexity of the problems. However, this is infeasible for real-world infrared images, where two degradations coexist and influence each other. Herein, we propose an infrared image deconvolution algorithm that jointly considers FPN and blurring artifacts in a single framework. First, an infrared linear degradation model that incorporates a series of degradations of the thermal information acquisition system is derived. Subsequently, based on the investigation of the visual characteristics of the column FPN, a strategy to precisely estimate FPN components is developed, even in the presence of random noise. Finally, a non-blind image deconvolution scheme is proposed by analyzing the distinctive gradient statistics of infrared images compared with those of visible-band images. The superiority of the proposed algorithm is experimentally verified by removing both artifacts. Based on the results, the derived infrared image deconvolution framework successfully reflects a real infrared imaging system.
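A naive version of column FPN estimation, ignoring the random-noise handling the paper develops, simply measures each column's mean offset relative to the image mean:

```python
import numpy as np

def estimate_column_fpn(img):
    """Per-column fixed-pattern-noise estimate: the column mean offset
    relative to the global mean. Illustrates the column structure of FPN
    only; a practical estimator must separate it from random noise."""
    col_means = img.mean(axis=0)
    return col_means - img.mean()

def remove_column_fpn(img):
    return img - estimate_column_fpn(img)[None, :]
```

Averaging down each column suppresses zero-mean random noise by roughly the square root of the row count, which is why column FPN is visible in the column means at all.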

17.
Sensors (Basel) ; 23(6)2023 Mar 20.
Article in English | MEDLINE | ID: mdl-36991995

ABSTRACT

Aiming at reducing the image detail loss and edge blur of existing nonuniformity correction (NUC) methods, a new visible-image-assisted NUC algorithm based on a dual-discriminator generative adversarial network (GAN) with SEBlock (VIA-NUC) is proposed. The algorithm uses the visible image as a reference for better uniformity. The generative model downsamples the infrared and visible images separately for multiscale feature extraction, then reconstructs the image by decoding the infrared feature maps with the assistance of the visible features at the same scale. During decoding, SEBlock (a channel attention mechanism) and skip connections ensure that more distinctive channel and spatial features are extracted from the visible input. Two discriminators, based on a vision transformer (ViT) and the discrete wavelet transform (DWT), perform global and local judgments on the generated image from its texture features and frequency-domain features, respectively, and their results are fed back to the generator for adversarial learning. This approach effectively removes nonuniform noise while preserving texture. The performance of the proposed method was validated on public datasets: the average structural similarity (SSIM) and average peak signal-to-noise ratio (PSNR) of the corrected images exceeded 0.97 and 37.11 dB, respectively, improving on the compared methods by more than 3%.

18.
Sensors (Basel) ; 24(1)2023 Dec 20.
Article in English | MEDLINE | ID: mdl-38202904

ABSTRACT

Removing noise from acquired images is a crucial step in many image processing and computer vision tasks. However, existing methods primarily target specific noise types and ignore cross-modality operation, which limits their generalization. Inspired by the iterative procedure professionals follow in image processing, we propose a pixel-wise cross-modal image-denoising method based on deep reinforcement learning that handles noise across modalities. We propose a similarity reward that teaches an optimal action sequence, explicitly modeling the step-wise nature of human processing. In addition, we design an action set capable of handling multiple types of noise to construct the action space, thereby achieving cross-modal denoising. Extensive experiments against state-of-the-art methods on publicly available RGB, infrared, and terahertz datasets demonstrate the superiority of our method in cross-modal image denoising.

19.
Sensors (Basel) ; 23(9)2023 May 01.
Article in English | MEDLINE | ID: mdl-37177648

ABSTRACT

Infrared thermography (IRT) is one of the most useful techniques for identifying defects, such as delamination and damage, in the quality management of materials. Object detection and segmentation algorithms from deep learning have been widely applied in image processing, but rarely in the IRT field. In this paper, spatial deep-learning image processing methods for defect detection and identification are discussed and investigated. The aim of this work is to integrate such deep-learning (DL) models to interpret thermal images automatically for quality management (QM), which requires each model to reach a high enough accuracy to assist human inspectors after training. Several deep convolutional neural network alternatives were employed: (1) the instance segmentation methods Mask-RCNN (Mask Region-based Convolutional Neural Networks) and CenterMask; (2) the semantic segmentation methods U-Net and ResNet-U-Net; and (3) the object localization methods You Only Look Once (YOLOv3) and Faster Region-based Convolutional Neural Networks (Faster-RCNN). In addition, a conventional infrared image segmentation pipeline (absolute thermal contrast (ATC) with a global threshold) was included for comparison. A series of academic samples made of different materials and containing artificial defects of different shapes and natures (flat-bottom holes, Teflon inserts) was evaluated, and the results were analyzed to assess the efficacy and performance of the proposed algorithms.
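The ATC-plus-global-threshold baseline is compact enough to sketch directly; the sound-region coordinates and the threshold below are illustrative choices, not values from the study:

```python
import numpy as np

def absolute_thermal_contrast(thermogram, sound_region):
    """Absolute thermal contrast (ATC): per-pixel temperature minus the mean
    temperature of a defect-free ('sound') reference region."""
    t_sound = thermogram[sound_region].mean()
    return thermogram - t_sound

def defect_mask(thermogram, sound_region, threshold):
    """Flag pixels whose contrast against the sound region exceeds a global
    threshold -- the conventional pipeline the DL models are compared with."""
    return absolute_thermal_contrast(thermogram, sound_region) > threshold
```

Defects such as flat-bottom holes retain or shed heat differently from the sound material, so their ATC stands out and a single global threshold can already segment them in clean laboratory thermograms.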

20.
Entropy (Basel) ; 25(5)2023 May 22.
Article in English | MEDLINE | ID: mdl-37238581

ABSTRACT

With the ongoing development of image technology, deploying intelligent applications on embedded devices has attracted increasing attention in industry. One such application is automatic image captioning for infrared images, which converts images into text. This task is widely used in night security and for understanding night scenes, among other scenarios. However, due to differences in image features and the complexity of semantic information, generating captions for infrared images remains challenging. From the perspective of deployment, and to improve the correlation between descriptions and objects, we adopt YOLOv6 and an LSTM as an encoder-decoder structure and propose infrared image captioning based on object-oriented attention. First, to improve the domain adaptability of the detector, we optimize the pseudo-label learning process. Second, we propose an object-oriented attention method to address the alignment problem between complex semantic information and embedded words. This method selects the most important features of the object region and guides the caption model to generate words that are more relevant to the object. Our methods perform well on infrared images and can produce words explicitly associated with the object regions located by the detector. Their robustness and effectiveness were demonstrated through evaluation on various datasets alongside other state-of-the-art methods. Our approach achieves BLEU-4 scores of 31.6 and 41.2 on the KAIST and Infrared City and Town datasets, respectively, providing a feasible solution for deployment on embedded devices in industrial applications.
