Results 1 - 20 of 307
1.
Rev Cardiovasc Med ; 25(6): 211, 2024 Jun.
Article in English | MEDLINE | ID: mdl-39076307

ABSTRACT

This article reviews four new technologies for the assessment of coronary hemodynamics based on medical imaging and artificial intelligence: quantitative flow ratio (QFR), optical flow ratio (OFR), computed tomography-derived fractional flow reserve (CT-FFR), and artificial intelligence (AI)-based instantaneous wave-free ratio (iFR). These technologies use medical imaging such as coronary angiography, computed tomography angiography (CTA), and optical coherence tomography (OCT) to reconstruct three-dimensional vascular models through AI algorithms, simulate and calculate hemodynamic parameters in the coronary arteries, and thereby achieve non-invasive, rapid assessment of the functional significance of coronary stenosis. The article details each technology's working principles, advantages (non-invasiveness, efficiency, and accuracy), and limitations (image dependency and modeling assumptions), and compares the four technologies in terms of image dependency, calculation accuracy, calculation speed, and ease of operation. The results show that these technologies agree closely with the traditional invasive pressure-wire method and offer distinct advantages in accuracy, reliability, convenience, and cost-effectiveness, although several factors can still affect accuracy. The review finds AI-based iFR to be among the most promising of these technologies. The main challenges and directions for future development are also discussed. These technologies bring new ideas to the non-invasive assessment of coronary artery disease and are expected to drive technological progress in this field.
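All four indices approximate the invasive, pressure-wire fractional flow reserve. As a point of reference, here is a minimal sketch (with hypothetical pressure values) of the wire-based quantity these imaging-based technologies estimate:

```python
import numpy as np

def fractional_flow_reserve(p_distal, p_aortic, threshold=0.80):
    """Classic pressure-wire FFR: mean distal coronary pressure divided by
    mean aortic pressure. The imaging-based indices reviewed above (QFR,
    OFR, CT-FFR) estimate this same ratio without a pressure wire."""
    ffr = float(np.mean(p_distal) / np.mean(p_aortic))
    # A value at or below ~0.80 is conventionally read as a
    # functionally significant stenosis.
    return ffr, ffr <= threshold

# Hypothetical pressure samples (mmHg) over one cardiac cycle.
pa = np.array([95.0, 100.0, 105.0, 100.0])
pd = np.array([66.0, 70.0, 74.0, 70.0])
ffr, significant = fractional_flow_reserve(pd, pa)
```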

2.
Sensors (Basel) ; 24(5)2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38475150

ABSTRACT

Because capturing real optical flow is difficult, no existing work has captured real optical flow for infrared (IR) images or produced an optical flow dataset based on IR images, which confines research on and applications of deep learning-based optical flow computation to the RGB domain. In this paper, we therefore propose a method for producing an optical flow dataset of IR images. We use an RGB-IR cross-modal image transformation network to transform existing RGB optical flow datasets in a principled way. The RGB-IR cross-modal transformation is based on an improved Pix2Pix implementation; in our experiments, the network is validated and evaluated on M3FD, an aligned RGB-IR bimodal dataset. RGB-IR cross-modal transformation is then performed on the existing RGB optical flow dataset KITTI, and an optical flow computation network is trained on the IR images generated by the transformation. Finally, the outputs of the optical flow network before and after this training are analyzed on the aligned RGB-IR bimodal data.
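One common sanity check when producing flow datasets like this is to warp one frame by the flow field and compare it with the other frame. A minimal nearest-neighbour backward warp, sketched here in plain NumPy on a synthetic image (not code from the paper):

```python
import numpy as np

def warp_by_flow(img, flow):
    """Backward-warp an image with a dense flow field (nearest-neighbour).
    Warping frame t+1 by the flow should approximately recover frame t,
    which makes this a cheap check on generated optical-flow ground truth.
    img: (H, W) array; flow: (H, W, 2) array of (dx, dy) per pixel."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return img[src_y, src_x]

# A tiny example: a uniform flow of (+1, 0) shifts content one pixel left.
img = np.arange(16, dtype=float).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0
warped = warp_by_flow(img, flow)
```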

3.
Sensors (Basel) ; 24(2)2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38257410

ABSTRACT

Detecting violent behavior in videos to ensure public safety and security poses a significant challenge. Precisely identifying and categorizing instances of violence in real-life closed-circuit television footage, which varies across camera specifications and locations, requires comprehensive understanding and processing of the sequential information embedded in the videos. This study introduces a model that adeptly grasps the spatiotemporal context of videos across diverse settings and specifications of violent scenarios. We propose a method to accurately capture spatiotemporal features linked to violent behaviors using optical flow and RGB data. The approach leverages a Conv3D-based ResNet-3D model as the foundational network, capable of handling high-dimensional video data. The efficiency and accuracy of violence detection are enhanced by integrating an attention mechanism, which assigns greater weight to the most crucial frames within the RGB and optical-flow sequences during instances of violence. Our model was evaluated on the UBI-Fight, Hockey, Crowd, and Movie-Fights datasets, where it outperformed existing state-of-the-art techniques, achieving area-under-the-curve scores of 95.4, 98.1, 94.5, and 100.0, respectively. Moreover, this work not only has potential applications in real-time surveillance systems but also promises to contribute to a broader spectrum of research in video analysis and understanding.
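The frame-weighting idea can be illustrated in a few lines: score each frame's feature vector, softmax the scores, and pool. The L2-norm scoring below is only a stand-in for a learned scoring layer, so this is a toy illustration of the attention mechanism rather than the paper's model:

```python
import numpy as np

def temporal_attention(frame_feats):
    """Weight each frame by the softmax of a per-frame score and return
    the attention-pooled clip feature, so frames with stronger responses
    (e.g. violent motion) dominate. frame_feats: (T, D) array."""
    scores = np.linalg.norm(frame_feats, axis=1)   # stand-in scoring
    scores = scores - scores.max()                 # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    clip_feat = weights @ frame_feats
    return weights, clip_feat

feats = np.array([[0.1, 0.1],
                  [3.0, 4.0],    # "high-motion" frame (norm 5)
                  [0.2, 0.1]])
weights, clip_feat = temporal_attention(feats)
```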


Subject(s)
Optical Flow, Violence, Computer Systems
4.
Sensors (Basel) ; 24(13)2024 Jun 21.
Article in English | MEDLINE | ID: mdl-39000817

ABSTRACT

Parallax handling and structure preservation have long been important and challenging tasks in image stitching. This paper proposes an image stitching method that uses a sliding camera to eliminate perspective deformation and asymmetric optical flow to resolve parallax. By keeping the viewpoints of the two input images in the non-overlapping areas of the mosaic and interpolating a virtual camera in the overlapping area, the viewpoint is gradually transformed from one image to the other, completing a smooth transition between the two viewpoints and reducing perspective deformation. Two coarsely aligned warped images are generated with the help of a global projection plane. Optical flow propagation and gradient descent are then used to quickly compute the bidirectional asymmetric optical flow between the two warped images, which are further aligned by the optical flow to reduce parallax. In the blending stage, the softmax function and the registration error are used to adjust the width of the blending area, further eliminating ghosting and reducing parallax. Comparisons with APAP, AANAP, SPHP, SPW, TFT, and REW show that our method not only effectively resolves perspective deformation but also produces more natural transitions between images. At the same time, it robustly reduces local misalignment across various scenarios, with a higher structural similarity index. A scoring method combining subjective and objective evaluations of perspective deformation, local alignment, and runtime is defined and used to rate all methods; ours ranks first.
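The blending step can be pictured as a feathered alpha ramp across the overlap. The sketch below fixes the ramp width as a parameter; the paper's contribution is adapting that width per region from the softmax of the registration error, which is not reproduced here:

```python
import numpy as np

def feather_blend(left, right, blend_width):
    """Feathered blending across an overlap: a linear alpha ramp of the
    given width, centred in the overlap, mixes the two aligned images.
    Outside the ramp each side contributes its image unchanged."""
    h, w = left.shape
    start = (w - blend_width) // 2
    alpha = np.clip((np.arange(w) - start) / blend_width, 0.0, 1.0)
    return left * (1.0 - alpha) + right * alpha

# Two constant "aligned overlap crops": blended values follow the ramp.
left = np.zeros((1, 6))
right = np.ones((1, 6))
blended = feather_blend(left, right, blend_width=2)
```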

5.
Sensors (Basel) ; 24(9)2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38732964

ABSTRACT

Moving object detection (MOD) with freely moving cameras is a challenging task in computer vision. To extract moving objects, most studies have focused on the difference in motion features between foreground and background, which works well for dynamic scenes with relatively regular movements and variations. However, abrupt illumination changes and occlusions often occur in real-world scenes, and the camera may also pan, tilt, rotate, and jitter, producing local irregular variations and global discontinuities in motion features. Such complex, changing scenes make moving objects difficult to detect. To solve this problem, this paper proposes a new MOD method that effectively leverages local and global visual information for foreground/background segmentation. On the global side, to support a wider range of camera motion, the relative inter-frame transformations are optimized, after enriching the inter-frame matching pairs, into absolute transformations referenced to intermediate frames; the global transformation is then fine-tuned using a spatial transformer network (STN). On the local side, to handle dynamic background scenes, foreground object detection is optimized using the pixel differences between the current frame and a local background model, together with the consistency of local spatial variations. The spatial information is then combined using optical flow segmentation, enhancing the precision of the object information. The experimental results show that our method improves detection accuracy by over 1.5% compared with state-of-the-art methods on the CDNET2014, FBMS-59, and CBD datasets, and it is markedly effective in challenging scenarios such as shadows, abrupt illumination changes, camera jitter, occlusion, and moving backgrounds.

6.
Sensors (Basel) ; 24(10)2024 May 09.
Article in English | MEDLINE | ID: mdl-38793871

ABSTRACT

The sky may seem too big for two flying vehicles to collide, but the facts show that mid-air collisions still occur occasionally and remain a significant concern. Pilots learn manual tactics such as see-and-avoid to prevent collisions, but these rules have limitations. Automated solutions have reduced collisions, yet these technologies are not mandatory in all countries or airspaces, and they are expensive. These problems have prompted researchers to continue searching for low-cost solutions. One attractive option is computer-vision-based obstacle detection, owing to its low cost and weight. A well-trained deep learning solution is appealing because object detection is usually fast, but it relies entirely on its training data set. The algorithm chosen for this study is therefore optical flow: optical flow vectors can help separate the motion caused by the camera from the motion caused by incoming objects without relying on training data. This paper describes the development of an optical flow-based airborne obstacle detection algorithm for avoiding mid-air collisions. The approach uses visual information from a monocular camera and detects obstacles using morphological filters, optical flow, the focus of expansion, and a data clustering algorithm. The proposal was evaluated using realistic vision data obtained with a self-developed simulator, which provides different environments, trajectories, and altitudes of flying objects. In the experiments, the optical flow-based algorithm detected all incoming obstacles along their trajectories, with an F-score greater than 75% and a good balance between precision and recall.
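The focus of expansion (FoE) at the heart of such detectors can be recovered from flow vectors by least squares: under pure camera translation, every flow vector lies on a ray through the FoE. A minimal sketch with synthetic, noise-free flow:

```python
import numpy as np

def focus_of_expansion(points, flows):
    """Least-squares focus of expansion for a purely translating camera.
    Each pixel p = (x, y) with flow (u, v) satisfies v*fx - u*fy = v*x - u*y,
    so stacking these rows gives an overdetermined linear system in (fx, fy)."""
    x, y = points[:, 0], points[:, 1]
    u, v = flows[:, 0], flows[:, 1]
    A = np.stack([v, -u], axis=1)
    b = v * x - u * y
    foe, *_ = np.linalg.lstsq(A, b, rcond=None)
    return foe

# Synthetic flow radiating from a true FoE at (2, 3).
pts = np.array([[4.0, 3.0], [2.0, 5.0], [4.0, 5.0]])
flo = pts - np.array([2.0, 3.0])
foe = focus_of_expansion(pts, flo)
```

With noisy real flow one would solve this robustly (e.g. RANSAC); objects whose flow diverges from the FoE-consistent field can then be clustered as potential obstacles.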

7.
Sensors (Basel) ; 24(11)2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38894396

ABSTRACT

The growing use of Unmanned Aerial Vehicles (UAVs) raises the need to improve their autonomous navigation capabilities. Visual odometry allows dispensing with positioning systems such as GPS, especially in indoor flight. This paper reports an effort toward UAV autonomous navigation by proposing a translational velocity observer for a quadrotor based on inertial and visual measurements. The proposed observer complementarily fuses the available measurements from the different domains and is synthesized following the Immersion and Invariance observer design technique. A formal Lyapunov-based proof of observer error convergence to zero is provided. The observer is evaluated in numerical simulations using the Parrot Mambo Minidrone app in Simulink/MATLAB.
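The fusion structure (IMU prediction corrected by visual velocity measurements) can be sketched with a plain Luenberger-style discrete observer. This is a deliberately simplified stand-in for the Immersion and Invariance design, with illustrative gain and timestep values:

```python
import numpy as np

def velocity_observer(v0, accels, v_meas, dt=0.01, L=0.5):
    """Discrete predict/correct velocity observer:
        v[k+1] = v[k] + a[k]*dt + L*(v_vis[k] - v[k])
    IMU acceleration drives the prediction; the visual-odometry velocity
    measurement corrects it. L is an illustrative observer gain."""
    v_hat = v0
    for a, vm in zip(accels, v_meas):
        v_hat = v_hat + a * dt + L * (vm - v_hat)
    return v_hat

# Hovering quadrotor: zero acceleration, vision reports 1.0 m/s; the
# estimate converges from 0 toward the measured velocity.
n = 50
v_hat = velocity_observer(0.0, np.zeros(n), np.ones(n))
```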

8.
Sensors (Basel) ; 24(7)2024 Apr 07.
Article in English | MEDLINE | ID: mdl-38610557

ABSTRACT

Relative localization (RL) and circumnavigation are highly challenging problems that are crucial for the safe flight of multi-UAVs (multiple unmanned aerial vehicles). Most methods depend on external infrastructure for positioning; however, in complex environments such as forests, it is difficult to set up such infrastructure. In this paper, an approach to infrastructure-free RL estimation for multi-UAVs is investigated for circumnavigating a slowly drifting unmanned ground vehicle, UGV0, which serves as the RL and circumnavigation target. Firstly, a discrete-time direct RL estimator is proposed to ascertain the coordinates of each UAV relative to UGV0 based on intelligent sensing. Secondly, an RL fusion estimation method is proposed to obtain the final estimate of UGV0. Thirdly, an integrated estimation-control scheme is proposed to apply the RL fusion estimation method to circumnavigation. The convergence and performance of the approach are analyzed, and simulation results validate the effectiveness of the proposed RL fusion estimation algorithm and integrated scheme.

9.
J Environ Manage ; 367: 122048, 2024 Jul 31.
Article in English | MEDLINE | ID: mdl-39088903

ABSTRACT

Monitoring suspended sediment concentration (SSC) in rivers is pivotal for water quality management and sustainable river ecosystem development. However, continuous and precise SSC monitoring is fraught with challenges, including low automation, lengthy measurement processes, and high cost. This study proposes an innovative approach to SSC identification in rivers using multimodal data fusion. We developed a robust model by harnessing colour features from video images, motion characteristics from the Lucas-Kanade (LK) optical flow method, and temperature data. By integrating ResNet with a mixture density network (MDN), our method fuses the image field, the optical flow field, and temperature data to enhance accuracy and reliability. Validated at a hydropower station in the Xinjiang Uygur Autonomous Region, China, the results demonstrated that while the image field alone offers a baseline level of SSC identification, it suffers local errors under specific conditions. Incorporating optical flow and water temperature information enhanced model robustness: coupling the image and optical flow fields yielded a Nash-Sutcliffe efficiency (NSE) of 0.91, and combining all three data types attained an NSE of 0.93. This integrated approach offers a more accurate SSC identification solution, enabling non-contact, low-cost measurements, facilitating remote online monitoring, and supporting water resource management and river water-sediment monitoring.
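The Lucas-Kanade step estimates a constant flow inside a window from the brightness-constancy equation Ix·u + Iy·v = -It. A single-window version on synthetic frames (an illustrative sketch, not the paper's implementation) looks like this:

```python
import numpy as np

def lucas_kanade_window(I1, I2, win):
    """Single-window Lucas-Kanade: assume one flow (u, v) for the whole
    window and solve the overdetermined brightness-constancy system by
    least squares. win = (lo, hi) index range used on both axes."""
    Iy, Ix = np.gradient(I1)          # spatial gradients (rows=y, cols=x)
    It = I2 - I1                      # temporal derivative
    sl = (slice(*win), slice(*win))
    ix, iy, it = Ix[sl].ravel(), Iy[sl].ravel(), It[sl].ravel()
    A = np.stack([ix, iy], axis=1)
    uv, *_ = np.linalg.lstsq(A, -it, rcond=None)
    return uv

# Synthetic quadratic frame translated by one pixel in x; the estimate is
# approximate because brightness constancy is linearized.
x = np.arange(16.0)
X, Y = np.meshgrid(x, x)
I1 = 0.5 * (X**2 + Y**2)
I2 = 0.5 * ((X - 1.0)**2 + Y**2)      # content moved +1 px in x
u, v = lucas_kanade_window(I1, I2, (1, 15))
```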

10.
Eur Radiol ; 33(11): 8203-8213, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37286789

ABSTRACT

OBJECTIVES: To evaluate the performance of a deep learning-based multi-source model for survival prediction and risk stratification in patients with heart failure. METHODS: Patients with heart failure with reduced ejection fraction (HFrEF) who underwent cardiac magnetic resonance between January 2015 and April 2020 were retrospectively included. Baseline electronic health record data, including clinical demographic, laboratory, and electrocardiographic information, were collected. Short-axis non-contrast cine images of the whole heart were acquired to estimate the cardiac function parameters and the motion features of the left ventricle. Model accuracy was evaluated using Harrell's concordance index. All patients were followed up for major adverse cardiac events (MACEs), and survival prediction was assessed using Kaplan-Meier curves. RESULTS: A total of 329 patients were evaluated (age 54 ± 14 years; 254 men). During a median follow-up of 1041 days, 62 patients experienced MACEs, with a median survival time of 495 days. Compared with conventional Cox hazard prediction models, the deep learning models showed better survival prediction performance. The multi-data denoising autoencoder (DAE) model reached a concordance index of 0.8546 (95% CI: 0.7902-0.8883). Furthermore, when patients were divided into phenogroups, the multi-data DAE model discriminated between the survival outcomes of the high-risk and low-risk groups significantly better than the other models (p < 0.001). CONCLUSIONS: The proposed deep learning (DL) model based on non-contrast cardiac cine magnetic resonance imaging could independently predict the outcome of patients with HFrEF and showed better prediction efficiency than conventional methods. CLINICAL RELEVANCE STATEMENT: The proposed multi-source deep learning model based on cardiac magnetic resonance enables survival prediction in patients with heart failure.
KEY POINTS: • A multi-source deep learning model based on non-contrast cardiovascular magnetic resonance (CMR) cine images was built to make robust survival predictions in patients with heart failure. • The ground truth definition contains electronic health record data as well as DL-based motion data, and cardiac motion information is extracted by an optical flow method from non-contrast CMR cine images. • The DL-based model exhibits better prognostic value and stratification performance than conventional prediction models and could aid risk stratification in patients with HF.
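Harrell's concordance index used to evaluate these models is straightforward to compute directly; a minimal O(n²) version on a toy cohort:

```python
import numpy as np

def concordance_index(time, event, risk):
    """Harrell's C-index: among usable pairs (the earlier time is an
    observed event, not a censoring), the fraction where the subject who
    failed earlier was assigned the higher risk score; risk ties count 0.5."""
    num = den = 0.0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if event[i] and time[i] < time[j]:   # usable pair: i fails first
                den += 1
                if risk[i] > risk[j]:
                    num += 1
                elif risk[i] == risk[j]:
                    num += 0.5
    return num / den

# Perfectly ranked toy cohort: earlier events get higher risk scores.
t = np.array([2.0, 4.0, 6.0, 8.0])
e = np.array([1, 1, 0, 1])        # third subject is censored
r = np.array([0.9, 0.7, 0.5, 0.3])
c = concordance_index(t, e, r)
```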


Subject(s)
Deep Learning, Heart Failure, Ventricular Dysfunction, Left, Male, Humans, Adult, Middle Aged, Aged, Magnetic Resonance Imaging, Cine, Prognosis, Retrospective Studies, Risk Factors, Ventricular Function, Left, Stroke Volume, Predictive Value of Tests
11.
BMC Med Imaging ; 23(1): 108, 2023 08 17.
Article in English | MEDLINE | ID: mdl-37592200

ABSTRACT

OBJECTIVES: To develop a quantitative analysis method for right-diaphragm deformation, based on optical flow and applied to diaphragm ultrasound imaging. METHODS: This study enrolled six healthy subjects and eight patients under mechanical ventilation. Dynamic images spanning 3-5 breathing cycles were acquired from three directions of the right diaphragm with a portable ultrasound system. Filtering and density-clustering algorithms were used to denoise the Digital Imaging and Communications in Medicine (DICOM) data. An optical flow-based method was applied to track the movements of the right diaphragm, and an improved drift correction algorithm was used to optimize the results. The method automatically analyzes the respiratory cycle, the inter-frame/cumulative vertical and horizontal displacements, and the strain of the input right-diaphragm ultrasound images. RESULTS: The optical flow-based motion tracking algorithm accurately tracked the right diaphragm during respiratory motion. There were significant differences in horizontal and vertical displacements in each section (p < 0.05 for all). Significant differences were found between healthy subjects and mechanically ventilated patients for both horizontal and vertical displacements in Section III (p < 0.05 for both), whereas no significant difference in global strain was found in any section (p > 0.05 for all). CONCLUSIONS: The developed method can quantitatively evaluate the inter-frame/cumulative displacement of the diaphragm in both horizontal and vertical directions, as well as the global strain, in three different imaging planes. These indicators can be used to evaluate diaphragmatic dynamics.
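The drift-correction idea can be illustrated simply: integrating frame-to-frame optical flow displacements accumulates small biases, which a detrending step removes. The linear detrend below is a simplified stand-in for the paper's improved drift correction algorithm:

```python
import numpy as np

def cumulative_with_drift_correction(frame_disp):
    """Sum per-frame displacements into a cumulative trace, then remove
    the linear trend. Over periodic breathing, the diaphragm returns to
    baseline each cycle, so any linear trend in the summed flow is drift."""
    cum = np.cumsum(frame_disp)
    t = np.arange(len(cum))
    slope, intercept = np.polyfit(t, cum, 1)
    return cum - (slope * t + intercept)

# Synthetic periodic motion plus a constant per-frame measurement bias.
t = np.arange(100)
true_motion = np.sin(2 * np.pi * t / 20)                  # 5 breathing cycles
measured = np.diff(true_motion + 0.05 * t, prepend=0.0)   # flow + drift
drifted = np.cumsum(measured)        # raw integration drifts away
corrected = cumulative_with_drift_correction(measured)
```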


Subject(s)
Diaphragm, Optical Flow, Humans, Diaphragm/diagnostic imaging, Thorax, Ultrasonography, Ultrasonography, Interventional
12.
BMC Med Imaging ; 23(1): 88, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37407909

ABSTRACT

BACKGROUND: Ultrasound echocardiography is commonly used to monitor myocardial dysfunction. However, it has limitations, such as the poor quality of echocardiographic images and the subjectivity of physicians' judgments. METHODS: In this paper, a calculation model based on optical flow tracking of the echocardiogram is proposed for quantitative estimation of segmental wall motion. To improve the accuracy of optical flow estimation, a confidence-optimized multiresolution (COM) optical flow model is proposed to reduce estimation errors caused by large-amplitude myocardial motion and by image quality problems such as "shadows". In addition, motion vector decomposition and dynamic tracking of the ventricular region of interest are used to extract information on segmental myocardial motion. The proposed method was validated on simulated images and 50 clinical cases (25 patients and 25 healthy volunteers). RESULTS: The results demonstrate that the proposed method tracks the motion of myocardial segments well and reduces optical flow estimation errors caused by low-quality echocardiographic images. CONCLUSIONS: The proposed method improves the accuracy of motion estimation for the ventricular wall.
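Multiresolution optical flow schemes of this kind rest on an image pyramid: flow is estimated at a coarse level, where large motions shrink, and refined at finer levels. A minimal 2x2-averaging pyramid (an illustrative sketch, not the COM model itself):

```python
import numpy as np

def build_pyramid(img, levels):
    """Coarse-to-fine image pyramid by 2x2 block averaging. A motion of
    2^k pixels at full resolution is ~1 pixel at level k, which keeps the
    linearized optical-flow assumptions valid for large wall motions."""
    pyr = [img]
    for _ in range(levels - 1):
        a = pyr[-1]
        h, w = (a.shape[0] // 2) * 2, (a.shape[1] // 2) * 2
        a = a[:h, :w]                       # crop odd edges
        pyr.append(0.25 * (a[0::2, 0::2] + a[1::2, 0::2]
                           + a[0::2, 1::2] + a[1::2, 1::2]))
    return pyr

pyr = build_pyramid(np.ones((8, 8)), 3)
```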


Subject(s)
Heart Ventricles, Ultrasonics, Humans, Heart Ventricles/diagnostic imaging, Heart, Echocardiography/methods, Myocardium
13.
Sensors (Basel) ; 23(10)2023 May 15.
Article in English | MEDLINE | ID: mdl-37430667

ABSTRACT

Fetal movement (FM) is an important indicator of fetal health, yet current FM detection methods are unsuitable for ambulatory or long-term observation. This paper proposes a non-contact method for monitoring FM. We recorded abdominal videos from pregnant women and detected the maternal abdominal region within each frame. FM signals were acquired through optical flow color-coding, ensemble empirical mode decomposition, an energy ratio, and correlation analysis. FM spikes, indicating the occurrence of FMs, were recognized using the differential threshold method. FM parameters including number, interval, duration, and percentage were calculated and showed good agreement with manual labeling by professionals, achieving a true detection rate, positive predictive value, sensitivity, accuracy, and F1 score of 95.75%, 95.26%, 95.75%, 91.40%, and 95.50%, respectively. The changes in FM parameters with gestational week were consistent with pregnancy progress. Overall, this study provides a novel contactless FM monitoring technology for use at home.
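The differential threshold step can be sketched in a few lines: flag frames where the motion signal jumps by more than a threshold and keep only the onsets. The signal and threshold below are hypothetical:

```python
import numpy as np

def detect_spikes(signal, threshold):
    """Differential-threshold spike detection: flag samples whose
    frame-to-frame increase exceeds the threshold, keeping only rising
    edges so each movement is counted once."""
    d = np.diff(signal)
    above = d > threshold
    onsets = np.flatnonzero(above & ~np.r_[False, above[:-1]])
    return onsets

# Hypothetical motion-strength signal with two movement bursts.
sig = np.array([0.0, 0.1, 0.1, 2.0, 2.0, 0.1, 0.1, 3.0, 0.0])
onsets = detect_spikes(sig, threshold=1.0)
```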


Subject(s)
Abdomen, Fetal Movement, Pregnancy, Female, Humans, Video Recording, Videotape Recording, Fetal Monitoring
14.
Sensors (Basel) ; 23(10)2023 May 17.
Article in English | MEDLINE | ID: mdl-37430742

ABSTRACT

Reconstruction-based and prediction-based approaches are widely used for video anomaly detection (VAD) in smart-city surveillance applications. However, neither approach can effectively utilize the rich contextual information present in videos, which makes it difficult to accurately perceive anomalous activities. In this paper, we exploit the idea of training a model on a "Cloze Test" strategy from natural language processing (NLP) and introduce a novel unsupervised learning framework that encodes both motion and appearance information at the object level. Specifically, to store the normal modes of video activity reconstructions, we first design an optical stream memory network with skip connections. Secondly, we build a space-time cube (STC) as the basic processing unit of the model and erase a patch in the STC to form the frame to be reconstructed, enabling a so-called "incomplete event" (IE) to be completed. On this basis, a conditional autoencoder captures the high correspondence between the optical flow and the STC. The model predicts erased patches in IEs based on the context of the preceding and following frames. Finally, we employ a generative adversarial network (GAN)-based training method to improve VAD performance: by distinguishing the predicted erased optical flow and the erased video frame, the anomaly detection results become more reliable, and the method helps reconstruct the original video in the IE. Comparative experiments on the benchmark UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets demonstrate AUROC scores of 97.7%, 89.7%, and 75.8%, respectively.

15.
Sensors (Basel) ; 23(24)2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38139658

ABSTRACT

SLAM (simultaneous localization and mapping) plays a crucial role in autonomous robot navigation. A challenging aspect of visual SLAM systems is determining the 3D camera orientation of the motion trajectory. In this paper, we introduce an end-to-end network structure, InertialNet, which establishes the correlation between the image sequence and the IMU signals. Our network model is built upon inertial measurement learning and is employed to predict the camera's general motion pose. By incorporating an optical flow substructure, InertialNet is independent of the appearance of training sets and can be adapted to new environments. It maintains stable predictions even in the presence of image blur, changes in illumination, and low-texture scenes. In our experiments, we evaluated InertialNet on the public EuRoC dataset and our dataset, demonstrating its feasibility with faster training convergence and fewer model parameters for inertial measurement prediction.

16.
Sensors (Basel) ; 23(2)2023 Jan 14.
Article in English | MEDLINE | ID: mdl-36679763

ABSTRACT

Epilepsy is a debilitating neurological condition characterized by intermittent paroxysmal states called fits or seizures. In particular, major motor seizures of a convulsive nature, such as tonic-clonic seizures, can have aggravating consequences. Timely alerting for these convulsive epileptic states can therefore prevent numerous complications during or following the fit. Building on our previous research, a non-contact method using automated video camera observation and optical flow analysis underwent field trials in clinical settings. Here, we propose a novel adaptive learning paradigm that optimizes the seizure detection algorithm for each individual application. The main objective of the study was to minimize the false detection rate while avoiding undetected seizures. The system continuously updated its detection parameters retrospectively using the data from the generated alerts. It can be used under supervision or, alternatively, with autonomous validation of the alerts; in the latter case, it achieves self-adaptive, unsupervised learning. The learning algorithm improved detector performance, yielding a personalized seizure alerting device that adapts to the specific patient and environment. The system can operate in a fully automated mode while still allowing a human observer to monitor and override the decision process as the algorithm provides suggestions in the manner of an expert system.
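The adaptive loop can be caricatured as a multiplicative threshold update driven by alert validation; the update factors below are illustrative, not taken from the study:

```python
def update_threshold(threshold, was_true_seizure, up=1.05, down=0.99):
    """After each alert is validated (by staff, or autonomously), nudge
    the detection threshold up following a false alarm and slightly down
    following a confirmed seizure, trading fewer false detections against
    the risk of missed seizures. Factors are illustrative only."""
    return threshold * (down if was_true_seizure else up)

# Validation outcomes for four consecutive alerts (False = false alarm).
th = 1.0
for genuine in [False, True, False, False]:
    th = update_threshold(th, genuine)
```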


Subject(s)
Epilepsy, Tonic-Clonic, Epilepsy, Humans, Retrospective Studies, Remote Sensing Technology, Electroencephalography/methods, Seizures/diagnosis, Epilepsy/diagnosis, Algorithms
17.
Sensors (Basel) ; 23(3)2023 Jan 29.
Article in English | MEDLINE | ID: mdl-36772527

ABSTRACT

In the Information Age, the widespread use of black-box algorithms makes it difficult to understand how data are used. Sensor fusion is widely practiced, as there are many tools for further improving the robustness and performance of a model. In this study, we demonstrate the use of a Long Short-Term Memory model with Canonical Correlation Analysis (LSTM-CCA) for the fusion of passive RF (P-RF) and electro-optical (EO) data, in order to gain insights into how the P-RF data are utilized. The P-RF data are constructed from in-phase and quadrature component (I/Q) data processed via histograms and are combined with EO data enhanced via dense optical flow (DOF). The preprocessed data are then used to train the LSTM-CCA model for object detection and tracking. To determine the impact of the different data inputs, a greedy algorithm (explainX.ai) is used to determine, scenario by scenario, the weight and impact of the canonical variates provided to the fusion model. This research introduces an explainable LSTM-CCA framework for P-RF and EO sensor fusion, providing novel insights into the fusion process that can assist in detecting and differentiating targets and help decision-makers determine the weights for each input.

18.
Sensors (Basel) ; 23(3)2023 Feb 02.
Article in English | MEDLINE | ID: mdl-36772697

ABSTRACT

The uncertainty of target sizes and the complexity of backgrounds are the main reasons for poor detection performance on small infrared targets. Focusing on this issue, this paper presents a robust and accurate algorithm that combines multiscale kurtosis map fusion with the optical flow method to detect small infrared targets in complex natural scenes. The paper makes three main contributions. First, it proposes a detection structure based on multiscale kurtosis maps and optical flow fields that represents the shape, size, and motion of the target well, aiding target enhancement and background suppression. Second, it presents a multiscale kurtosis map fusion strategy matched to the shape and size of the small target, which effectively enhances small targets of different sizes while suppressing highlighted noise points and residual background edges; during fusion, a novel weighting mechanism combines the kurtosis maps across scales so that the scale matching the true target is effectively enhanced. Third, an improved optical flow method further suppresses the non-target residual clutter that multiscale kurtosis map fusion cannot completely remove. Using the scale confidence parameter obtained during the fusion step, the optical flow method selects the optimal neighborhood best matching the target size and shape, improving the integrity of the detected target and the ability to suppress residual clutter. As a result, the proposed method achieves superior performance: experiments on eleven typical complex infrared natural scenes show that, compared with seven state-of-the-art methods, it performs better in subjective visual effect as well as in the main objective evaluation indicators, such as BSF, SCRG, and ROC.
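The per-pixel statistic behind a kurtosis map is the sample kurtosis of a local window: a lone bright target makes the window's distribution heavy-tailed. A brute-force single-scale sketch (the paper fuses several window scales, which is not reproduced here):

```python
import numpy as np

def kurtosis_map(img, win=3):
    """Sliding-window sample kurtosis (4th central moment / variance^2).
    A small bright target inside a flat window is a heavy-tailed outlier,
    so its kurtosis is high; smooth background stays low (0 where the
    window variance vanishes)."""
    h, w = img.shape
    r = win // 2
    out = np.zeros_like(img, dtype=float)
    for i in range(r, h - r):
        for j in range(r, w - r):
            patch = img[i - r:i + r + 1, j - r:j + r + 1].ravel()
            m = patch.mean()
            var = ((patch - m) ** 2).mean()
            out[i, j] = ((patch - m) ** 4).mean() / var**2 if var > 0 else 0.0
    return out

scene = np.zeros((7, 7))
scene[3, 3] = 10.0       # a lone point target on flat background
kmap = kurtosis_map(scene)
```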

19.
Sensors (Basel) ; 23(8)2023 Apr 18.
Article in English | MEDLINE | ID: mdl-37112421

ABSTRACT

Regularization is an important technique for training deep neural networks. In this paper, we propose a novel shared-weight teacher-student strategy and a content-aware regularization (CAR) module. Based on a tiny, learnable, content-aware mask, CAR is randomly applied to some channels of the convolutional layers during training to guide predictions within the shared-weight teacher-student strategy. CAR prevents co-adaptation in unsupervised motion estimation methods. Extensive experiments on optical flow and scene flow estimation show that our method significantly improves on the performance of the original networks, surpassing other popular regularization methods, all variants with similar architectures, and the supervised PWC-Net on MPI-Sintel and KITTI. Our method also shows strong cross-dataset generalization: trained solely on MPI-Sintel, it outperforms a similarly trained supervised PWC-Net by 27.9% and 32.9% on the KITTI benchmarks. It uses fewer parameters, requires less computation, and has faster inference times than the original PWC-Net.

20.
Sensors (Basel) ; 23(20)2023 Oct 23.
Article in English | MEDLINE | ID: mdl-37896748

ABSTRACT

In this paper, we propose a robust, integrated visual odometry framework that exploits both optical flow and feature-point methods to achieve faster pose estimation with considerable accuracy and robustness. Our method uses optical flow tracking to accelerate feature point matching. The odometry combines two methods: a global feature point method and a local feature point method. When optical flow tracking is good and enough key points are successfully matched by it, the local feature point method uses prior information from the optical flow to estimate the relative pose transformation. When optical flow tracking is poor and only a small number of key points match successfully, the feature point method with a filtering mechanism is used for pose estimation. By coupling these two methods, the visual odometry greatly accelerates relative pose estimation, reducing its computation time to 40% of that of the ORB_SLAM3 front-end odometry while remaining close to the ORB_SLAM3 front-end in accuracy and robustness. The method was validated and analyzed on the EuRoC dataset within the ORB_SLAM3 open-source framework, and the experimental results support the efficacy of the proposed approach.
