Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 59
Filtrar
1.
Sensors (Basel) ; 24(19)2024 Oct 09.
Artigo em Inglês | MEDLINE | ID: mdl-39409528

RESUMO

In this paper, we focus on the multi-target tracking (MOT) task in satellite videos. To achieve efficient and accurate tracking, we propose a transformer-distillation-based end-to-end joint detection and tracking (JDT) method. Specifically, (1) considering that targets in satellite videos usually have small scales and are shot from a bird's-eye view, we propose a pixel-wise transformer-based feature distillation module through which useful object representations are learned via pixel-wise distillation using a strong teacher detection network; (2) targets in satellite videos, such as airplanes, ships, and vehicles, usually have similar appearances, so we propose a temperature-controllable key feature learning objective function, and by highlighting the learning of similar features during distilling, the tracking accuracy for such objects can be further improved; (3) we propose a method that is based on an end-to-end network but simultaneously learns from a highly precise teacher network and tracking head during training so that the tracking accuracy of the end-to-end network can be improved via distillation without compromising efficiency. The experimental results on three recently released publicly available datasets demonstrated the superior performance of the proposed method for satellite videos. The proposed method achieved over 90% overall tracking performance on the AIR-MOT dataset.

2.
Sensors (Basel) ; 24(4)2024 Feb 12.
Artigo em Inglês | MEDLINE | ID: mdl-38400356

RESUMO

Models based on joint detection and re-identification (ReID), which significantly increase the efficiency of online multi-object tracking (MOT) systems, are an evolution from separate detection and ReID models in the tracking-by-detection (TBD) paradigm. It is observed that these joint models are typically one-stage, while the two-stage models become obsolete because of their slow speed and low efficiency. However, the two-stage models have naive advantages over the one-stage anchor-based and anchor-free models in handling feature misalignment and occlusion, which suggests that the two-stage models, via meticulous design, could be on par with the state-of-the-art one-stage models. Following this intuition, we propose a robust and efficient two-stage joint model based on R-FCN, whose backbone and neck are fully convolutional, and the RoI-wise process only involves simple calculations. In the first stage, an adaptive sparse anchoring scheme is utilized to produce adequate, high-quality proposals to improve efficiency. To boost both detection and ReID, two key elements-feature aggregation and feature disentanglement-are taken into account. To improve robustness against occlusion, the position-sensitivity is exploited, first to estimate occlusion and then to direct the post-process for anti-occlusion. Finally, we link the model to a hierarchical association algorithm to form a complete MOT system called PSMOT. Compared to other cutting-edge systems, PSMOT achieves competitive performance while maintaining time efficiency.

3.
Sensors (Basel) ; 24(8)2024 Apr 11.
Artigo em Inglês | MEDLINE | ID: mdl-38676054

RESUMO

Modern vehicles equipped with Advanced Driver Assistance Systems (ADAS) rely heavily on sensor fusion to achieve a comprehensive understanding of their surrounding environment. Traditionally, the Kalman Filter (KF) has been a popular choice for this purpose, necessitating complex data association and track management to ensure accurate results. To address errors introduced by these processes, the application of the Gaussian Mixture Probability Hypothesis Density (GM-PHD) filter is a good choice. This alternative filter implicitly handles the association and appearance/disappearance of tracks. The approach presented here allows for the replacement of KF frameworks in many applications while achieving runtimes below 1 ms on the test system. The key innovations lie in the utilization of sensor-based parameter models to implicitly handle varying Fields of View (FoV) and sensing capabilities. These models represent sensor-specific properties such as detection probability and clutter density across the state space. Additionally, we introduce a method for propagating additional track properties such as classification with the GM-PHD filter, further contributing to its versatility and applicability. The proposed GM-PHD filter approach surpasses a KF approach on the KITTI dataset and another custom dataset. The mean OSPA(2) error could be reduced from 1.56 (KF approach) to 1.40 (GM-PHD approach), showcasing its potential in ADAS perception.

4.
Sensors (Basel) ; 24(17)2024 Sep 04.
Artigo em Inglês | MEDLINE | ID: mdl-39275667

RESUMO

Detecting and tracking personnel onboard is an important measure to prevent ships from being invaded by outsiders and ensure ship security. Ships are characterized by more cabins, numerous equipment, and dense personnel, so there are problems such as unpredictable personnel trajectories, frequent occlusions, and many small targets, which lead to the poor performance of existing multi-target-tracking algorithms on shipboard surveillance videos. This study conducts research in the context of onboard surveillance and proposes a multi-object detection and tracking algorithm for anti-intrusion on ships. First, this study designs the BR-YOLO network to provide high-quality object-detection results for the tracking algorithm. The shallow layers of its backbone network use the BiFormer module to capture dependencies between distant objects and reduce information loss. Second, the improved C2f module is used in the deep layer of BR-YOLO to introduce the RepGhost structure to achieve model lightweighting through reparameterization. Then, the Part OSNet network is proposed, which uses different pooling branches to focus on multi-scale features, including part-level features, thereby obtaining strong Re-ID feature representations and providing richer appearance information for personnel tracking. Finally, by integrating the appearance information for association matching, the tracking trajectory is generated in Tracking-By-Detection mode and validated on the self-constructed shipboard surveillance dataset. The experimental results show that the algorithm in this paper is effective in shipboard surveillance. Compared with the present mainstream algorithms, the MOTA, HOTP, and IDF1 are enhanced by about 10 percentage points, the MOTP is enhanced by about 7 percentage points, and IDs are also significantly reduced, which is of great practical significance for the prevention of intrusion by ship personnel.

5.
Sensors (Basel) ; 24(14)2024 Jul 19.
Artigo em Inglês | MEDLINE | ID: mdl-39066088

RESUMO

The presence of fog in the background can prevent small and distant objects from being detected, let alone tracked. Under safety-critical conditions, multi-object tracking models require faster tracking speed while maintaining high object-tracking accuracy. The original DeepSORT algorithm used YOLOv4 for the detection phase and a simple neural network for the deep appearance descriptor. Consequently, the feature map generated loses relevant details about the track being matched with a given detection in fog. Targets with a high degree of appearance similarity on the detection frame are more likely to be mismatched, resulting in identity switches or track failures in heavy fog. We propose an improved multi-object tracking model based on the DeepSORT algorithm to improve tracking accuracy and speed under foggy weather conditions. First, we employed our camera-radar fusion network (CR-YOLOnet) in the detection phase for faster and more accurate object detection. We proposed an appearance feature network to replace the basic convolutional neural network. We incorporated GhostNet to take the place of the traditional convolutional layers to generate more features and reduce computational complexities and costs. We adopted a segmentation module and fed the semantic labels of the corresponding input frame to add rich semantic information to the low-level appearance feature maps. Our proposed method outperformed YOLOv5 + DeepSORT with a 35.15% increase in multi-object tracking accuracy, a 32.65% increase in multi-object tracking precision, a speed increase by 37.56%, and identity switches decreased by 46.81%.

6.
Sensors (Basel) ; 23(6)2023 Mar 08.
Artigo em Inglês | MEDLINE | ID: mdl-36991667

RESUMO

Multi-object tracking (MOT) is a topic of great interest in the field of computer vision, which is essential in smart behavior-analysis systems for healthcare, such as human-flow monitoring, crime analysis, and behavior warnings. Most MOT methods achieve stability by combining object-detection and re-identification networks. However, MOT requires high efficiency and accuracy in complex environments with occlusions and interference. This often increases the algorithm's complexity, affects the speed of tracking calculations, and reduces real-time performance. In this paper, we present an improved MOT method combining an attention mechanism and occlusion sensing as a solution. A convolutional block attention module (CBAM) calculates the weights of space and channel attention from the feature map. The attention weights are used to fuse the feature maps to extract adaptively robust object representations. An occlusion-sensing module detects an object's occlusion, and the appearance characteristics of an occluded object are not updated. This can enhance the model's ability to extract object features and improve appearance feature pollution caused by the short-term occlusion of an object. Experiments on public datasets demonstrate the competitive performance of the proposed method compared with the state-of-the-art MOT methods. The experimental results show that our method has powerful data association capability, e.g., 73.2% MOTA and 73.9% IDF1 on the MOT17 dataset.

7.
Sensors (Basel) ; 23(8)2023 Apr 21.
Artigo em Inglês | MEDLINE | ID: mdl-37112491

RESUMO

This study proposes a visual tracking system that can detect and track multiple fast-moving appearance-varying targets simultaneously with 500 fps image processing. The system comprises a high-speed camera and a pan-tilt galvanometer system, which can rapidly generate large-scale high-definition images of the wide monitored area. We developed a CNN-based hybrid tracking algorithm that can robustly track multiple high-speed moving objects simultaneously. Experimental results demonstrate that our system can track up to three moving objects with velocities lower than 30 m per second simultaneously within an 8-m range. The effectiveness of our system was demonstrated through several experiments conducted on simultaneous zoom shooting of multiple moving objects (persons and bottles) in a natural outdoor scene. Moreover, our system demonstrates high robustness to target loss and crossing situations.

8.
Sensors (Basel) ; 23(23)2023 Nov 29.
Artigo em Inglês | MEDLINE | ID: mdl-38067875

RESUMO

Pig husbandry constitutes a significant segment within the broader framework of livestock farming, with porcine well-being emerging as a paramount concern due to its direct implications on pig breeding and production. An easily observable proxy for assessing the health of pigs lies in their daily patterns of movement. The daily movement patterns of pigs can be used as an indicator of their health, in which more active pigs are usually healthier than those who are not active, providing farmers with knowledge of identifying pigs' health state before they become sick or their condition becomes life-threatening. However, the conventional means of estimating pig mobility largely rely on manual observations by farmers, which is impractical in the context of contemporary centralized and extensive pig farming operations. In response to these challenges, multi-object tracking and pig behavior methods are adopted to monitor pig health and welfare closely. Regrettably, these existing methods frequently fall short of providing precise and quantified measurements of movement distance, thereby yielding a rudimentary metric for assessing pig health. This paper proposes a novel approach that integrates optical flow and a multi-object tracking algorithm to more accurately gauge pig movement based on both qualitative and quantitative analyses of the shortcomings of solely relying on tracking algorithms. The optical flow records accurate movement between two consecutive frames and the multi-object tracking algorithm offers individual tracks for each pig. By combining optical flow and the tracking algorithm, our approach can accurately estimate each pig's movement. Moreover, the incorporation of optical flow affords the capacity to discern partial movements, such as instances where only the pig's head is in motion while the remainder of its body remains stationary. The experimental results show that the proposed method has superiority over the method of solely using tracking results, i.e., bounding boxes. The reason is that the movement calculated based on bounding boxes is easily affected by the size fluctuation while the optical flow data can avoid these drawbacks and even provide more fine-grained motion information. The virtues inherent in the proposed method culminate in the provision of more accurate and comprehensive information, thus enhancing the efficacy of decision-making and management processes within the realm of pig farming.


Assuntos
Fluxo Óptico , Suínos , Animais , Movimento/fisiologia , Algoritmos , Movimento (Física) , Fazendas
9.
Sensors (Basel) ; 23(1)2023 Jan 03.
Artigo em Inglês | MEDLINE | ID: mdl-36617130

RESUMO

Effective livestock management is critical for cattle farms in today's competitive era of smart modern farming. To ensure farm management solutions are efficient, affordable, and scalable, the manual identification and detection of cattle are not feasible in today's farming systems. Fortunately, automatic tracking and identification systems have greatly improved in recent years. Moreover, correctly identifying individual cows is an integral part of predicting behavior during estrus. By doing so, we can monitor a cow's behavior, and pinpoint the right time for artificial insemination. However, most previous techniques have relied on direct observation, increasing the human workload. To overcome this problem, this paper proposes the use of state-of-the-art deep learning-based Multi-Object Tracking (MOT) algorithms for a complete system that can automatically and continuously detect and track cattle using an RGB camera. This study compares state-of-the-art MOTs, such as Deep-SORT, Strong-SORT, and customized light-weight tracking algorithms. To improve the tracking accuracy of these deep learning methods, this paper presents an enhanced re-identification approach for a black cattle dataset in Strong-SORT. For evaluating MOT by detection, the system used the YOLO v5 and v7, as a comparison with the instance segmentation model Detectron-2, to detect and classify the cattle. The high cattle-tracking accuracy with a Multi-Object Tracking Accuracy (MOTA) was 96.88%. Using these methods, the findings demonstrate a highly accurate and robust cattle tracking system, which can be applied to innovative monitoring systems for agricultural applications. The effectiveness and efficiency of the proposed system were demonstrated by analyzing a sample of video footage. The proposed method was developed to balance the trade-off between costs and management, thereby improving the productivity and profitability of dairy farms; however, this method can be adapted to other domestic species.


Assuntos
Aprendizado Profundo , Feminino , Bovinos , Humanos , Animais , Indústria de Laticínios/métodos , Algoritmos , Agricultura , Fazendas
10.
Sensors (Basel) ; 23(14)2023 Jul 13.
Artigo em Inglês | MEDLINE | ID: mdl-37514671

RESUMO

Although Short-Wave Infrared (SWIR) sensors have advantages in terms of robustness in bad weather and low-light conditions, the SWIR images have not been well studied for automated object detection and tracking systems. The majority of previous multi-object tracking studies have focused on pedestrian tracking in visible-spectrum images, but tracking different types of vehicles is also important in city-surveillance scenarios. In addition, the previous studies were based on high-computing-power environments such as GPU workstations or servers, but edge computing should be considered to reduce network bandwidth usage and privacy concerns in city-surveillance scenarios. In this paper, we propose a fast and effective multi-object tracking method, called Multi-Class Distance-based Tracking (MCDTrack), on SWIR images of city-surveillance scenarios in a low-power and low-computation edge-computing environment. Eight-bit integer quantized object detection models are used, and simple distance and IoU-based similarity scores are employed to realize effective multi-object tracking in an edge-computing environment. Our MCDTrack is not only superior to previous multi-object tracking methods but also shows high tracking accuracy of 77.5% MOTA and 80.2% IDF1 although the object detection and tracking are performed on the edge-computing device. Our study results indicate that a robust city-surveillance solution can be developed based on the edge-computing environment and low-frame-rate SWIR images.

11.
Sensors (Basel) ; 23(14)2023 Jul 14.
Artigo em Inglês | MEDLINE | ID: mdl-37514694

RESUMO

In marine surveillance, distinguishing between normal and anomalous vessel movement patterns is critical for identifying potential threats in a timely manner. Once detected, it is important to monitor and track these vessels until a necessary intervention occurs. To achieve this, track association algorithms are used, which take sequential observations comprising the geological and motion parameters of the vessels and associate them with respective vessels. The spatial and temporal variations inherent in these sequential observations make the association task challenging for traditional multi-object tracking algorithms. Additionally, the presence of overlapping tracks and missing data can further complicate the trajectory tracking process. To address these challenges, in this study, we approach this tracking task as a multivariate time series problem and introduce a 1D CNN-LSTM architecture-based framework for track association. This special neural network architecture can capture the spatial patterns as well as the long-term temporal relations that exist among the sequential observations. During the training process, it learns and builds the trajectory for each of these underlying vessels. Once trained, the proposed framework takes the marine vessel's location and motion data collected through the automatic identification system (AIS) as input and returns the most likely vessel track as output in real-time. To evaluate the performance of our approach, we utilize an AIS dataset containing observations from 327 vessels traveling in a specific geographic region. We measure the performance of our proposed framework using standard performance metrics such as accuracy, precision, recall, and F1 score. When compared with other competitive neural network architectures, our approach demonstrates a superior tracking performance.

12.
Sensors (Basel) ; 23(7)2023 Mar 23.
Artigo em Inglês | MEDLINE | ID: mdl-37050449

RESUMO

Multi-object tracking (MOT) is a prominent and important study in point cloud processing and computer vision. The main objective of MOT is to predict full tracklets of several objects in point cloud. Occlusion and similar objects are two common problems that reduce the algorithm's performance throughout the tracking phase. The tracking performance of current MOT techniques, which adopt the 'tracking-by-detection' paradigm, is degrading, as evidenced by increasing numbers of identification (ID) switch and tracking drifts because it is difficult to perfectly predict the location of objects in complex scenes that are unable to track. Since the occluded object may have been visible in former frames, we manipulated the speed and location position of the object in the previous frames in order to guess where the occluded object might have been. In this paper, we employed a unique intersection over union (IoU) method in three-dimension (3D) planes, namely a distance IoU non-maximum suppression (DIoU-NMS) to accurately detect objects, and consequently we use 3D-DIoU for an object association process in order to increase tracking robustness and speed. By using a hybrid 3D DIoU-NMS and 3D-DIoU method, the tracking speed improved significantly. Experimental findings on the Waymo Open Dataset and nuScenes dataset, demonstrate that our multistage data association and tracking technique has clear benefits over previously developed algorithms in terms of tracking accuracy. In comparison with other 3D MOT tracking methods, our proposed approach demonstrates significant enhancement in tracking performances.

13.
Sensors (Basel) ; 23(7)2023 Apr 06.
Artigo em Inglês | MEDLINE | ID: mdl-37050842

RESUMO

The current popular one-shot multi-object tracking (MOT) algorithms are dominated by the joint detection and embedding paradigm, which have high inference speeds and accuracy, but their tracking performance is unstable in crowded scenes. Not only does the detection branch have difficulty in obtaining the accurate object position, but the ambiguous appearance of features extracted by the re-identification (re-ID) branch also leads to identity switches. Focusing on the above problems, this paper proposes a more robust MOT algorithm, named CSMOT, based on FairMOT. First, on the basis of the encoder-decoder network, a coordinate attention module is designed to enhance the information interaction between channels (horizontal and vertical coordinates), which improves its object-detection abilities. Then, an angle-center loss that effectively maximizes intra-class similarity is proposed to optimize the re-ID branch, and the extracted re-ID features are made more discriminative. We further redesign the re-ID feature dimension to balance the detection and re-ID tasks. Finally, a simple and effective data association mechanism is introduced, which associates each detection instead of just the high-score detections during the tracking process. The experimental results show that our one-shot MOT algorithm achieves excellent tracking performance on multiple public datasets and can be effectively applied to crowded scenes. In particular, CSMOT decreases the number of ID switches by 11.8% and 33.8% on the MOT16 and MOT17 test datasets, respectively, compared to the baseline.

14.
Sensors (Basel) ; 22(22)2022 Nov 18.
Artigo em Inglês | MEDLINE | ID: mdl-36433519

RESUMO

Propagation and association tasks in Multi-Object Tracking (MOT) play a pivotal role in accurately linking the trajectories of moving objects. Recently, modern deep learning models have been addressing these tasks by introducing fragmented solutions for each different problem such as appearance modeling, motion modeling, and object associations. To bring unification in the MOT task, we introduce a pixel-guided approach to efficiently build the joint-detection and tracking framework for multi-object tracking. Specifically, the up-sampled multi-scale features from consecutive frames are queued to detect the object locations by using a transformer-decoder, and per-pixel distributions are utilized to compute the association matrix according to object queries. Additionally, we introduce a long-term appearance association on track features to learn the long-term association of tracks against detections to compute the similarity matrix. Finally, a similarity matrix is jointly integrated with the Byte-Tracker resulting in a state-of-the-art MOT performance. The experiments with the standard MOT15 and MOT17 benchmarks show that our approach achieves significant tracking performance.

15.
Sensors (Basel) ; 22(15)2022 Jul 29.
Artigo em Inglês | MEDLINE | ID: mdl-35957245

RESUMO

A catalogue of over 22,000 objects in Earth's orbit is currently maintained, and that number is expected to double within the next decade. Novel data collection regimes are needed to scale our ability to detect, track, classify and characterize resident space objects in a crowded low Earth orbit. This research presents RSOnet, an image-processing framework for space domain awareness using star trackers. Star trackers are cost-effective, flight proven, and require basic image processing to be used as an attitude-determination sensor. RSOnet is designed to augment the capabilities of a star tracker by becoming an opportunistic space-surveillance sensor. Our research demonstrates that star trackers are a feasible source for RSO detections in LEO by demonstrating the performance of RSOnet on real detections from a star-tracker-like imager in space. RSOnet convolutional-neural-network model architecture, graph-based multi-object classifier and characterization results are described in this paper.

16.
Sensors (Basel) ; 22(22)2022 Nov 09.
Artigo em Inglês | MEDLINE | ID: mdl-36433246

RESUMO

Three-dimensional multimodality multi-object tracking has attracted great attention due to the use of complementary information. However, such a framework generally adopts a one-stage association approach, which fails to perform precise matching between detections and tracklets, and, thus, cannot robustly track objects in complex scenes. To address this matching problem caused by one-stage association, we propose a novel multi-stage association method, which consists of a hierarchical matching module and a customized track management module. Specifically, the hierarchical matching module defines the reliability of the objects by associating multimodal detections, and matches detections with trajectories based on the reliability in turn, which increases the utilization of true detections, and, thus, guides accurate association. Then, based on the reliability of the trajectories provided by the matching module, the customized track management module sets maximum missing frames with differences for tracks, which decreases the number of identity switches of the same object and, thus, further improves the association accuracy. By using the proposed multi-stage association method, we develop a tracker called MSA-MOT for the 3D multi-object tracking task, alleviating the inherent matching problem in one-stage association. Extensive experiments are conducted on the challenging KITTI benchmark, and the results show that our tracker outperforms the previous state-of-the-art methods in terms of both accuracy and speed. Moreover, the ablation and exploration analysis results demonstrate the effectiveness of the proposed multi-stage association method.


Assuntos
Algoritmos , Atenção , Reprodutibilidade dos Testes
17.
Sensors (Basel) ; 22(22)2022 Nov 15.
Artigo em Inglês | MEDLINE | ID: mdl-36433416

RESUMO

Economic and social progress in the Republic of Korea resulted in an increased standard of living, which subsequently produced more waste. The Korean government implemented a volume-based trash disposal system that may modify waste disposal characteristics to handle vast volumes of waste efficiently. However, the inconvenience of having to purchase standard garbage bags on one's own led to passive participation by citizens and instances of illegally dumping waste in non-standard plastic bags. As a result, there is a need for the development of automatic detection and reporting of illegal acts of garbage dumping. To achieve this, we suggest a system for tracking unlawful rubbish disposal that is based on deep neural networks. The proposed monitoring approach obtains the articulation points (joints) of a dumper through OpenPose and identifies the type of garbage bag through the object detection model, You Only Look Once (YOLO), to determine the distance of the dumper's wrist to the garbage bag and decide whether it is illegal dumping. Additionally, we introduced a method of tracking the IDs issued to the waste bags using the multi-object tracking (MOT) model to reduce the false detection of illegal dumping. To evaluate the efficacy of the proposed illegal dumping monitoring system, we compared it with the other systems based on behavior recognition. As a result, it was validated that the suggested approach had a higher degree of accuracy and a lower percentage of false alarms, making it useful for a variety of upcoming applications.


Assuntos
Eliminação de Resíduos , Veículos Automotores , Redes Neurais de Computação , Atenção , República da Coreia
18.
Sensors (Basel) ; 22(6)2022 Mar 09.
Artigo em Inglês | MEDLINE | ID: mdl-35336294

RESUMO

Multi-object tracking in video surveillance is subjected to illumination variation, blurring, motion, and similarity variations during the identification process in real-world practice. The previously proposed applications have difficulties in learning the appearances and differentiating the objects from sundry detections. They mostly rely heavily on local features and tend to lose vital global structured features such as contour features. This contributes to their inability to accurately detect, classify or distinguish the fooling images. In this paper, we propose a paradigm aimed at eliminating these tracking difficulties by enhancing the detection quality rate through the combination of a convolutional neural network (CNN) and a histogram of oriented gradient (HOG) descriptor. We trained the algorithm with an input of 120 × 32 images size and cleaned and converted them into binary for reducing the numbers of false positives. In testing, we eliminated the background on frames size and applied morphological operations and Laplacian of Gaussian model (LOG) mixture after blobs. The images further underwent feature extraction and computation with the HOG descriptor to simplify the structural information of the objects in the captured video images. We stored the appearance features in an array and passed them into the network (CNN) for further processing. We have applied and evaluated our algorithm for real-time multiple object tracking on various city streets using EPFL multi-camera pedestrian datasets. The experimental results illustrate that our proposed technique improves the detection rate and data associations. Our algorithm outperformed the online state-of-the-art approach by recording the highest in precisions and specificity rates.


Assuntos
Redes Neurais de Computação , Pedestres , Algoritmos , Humanos , Movimento (Física) , Distribuição Normal
19.
Sensors (Basel) ; 22(20)2022 Oct 18.
Artigo em Inglês | MEDLINE | ID: mdl-36298293

RESUMO

Effective multi-object tracking is still challenging due to the trade-off between tracking accuracy and speed. Because the recent multi-object tracking (MOT) methods leverage object appearance and motion models so as to associate detections between consecutive frames, the key for effective multi-object tracking is to reduce the computational complexity of learning both models. To this end, this work proposes global appearance and motion models to discriminate multiple objects instead of learning local object-specific models. In concrete detail, it learns a global appearance model using contrastive learning between object appearances. In addition, we learn a global relation motion model using relative motion learning between objects. Moreover, this paper proposes object constraint learning for improving tracking efficiency. This study considers the discriminability of the models as a constraint, and learns both models when inconsistency with the constraint occurs. Therefore, object constraint learning differs from the conventional online learning for multi-object tracking which updates learnable parameters per frame. This work incorporates global models and object constraint learning into the confidence-based association method, and compare our tracker with the state-of-the-art methods on public available MOT Challenge datasets. As a result, we achieve 64.5% MOTA (multi-object tracking accuracy) and 6.54 Hz tracking speed on the MOT16 test dataset. The comparison results show that our methods can contribute to improve tracking accuracy and tracking speed together.


Assuntos
Algoritmos , Aprendizagem , Gravação em Vídeo , Movimento (Física)
20.
Sensors (Basel) ; 22(23)2022 Nov 23.
Artigo em Inglês | MEDLINE | ID: mdl-36501808

RESUMO

As an essential part of intelligent monitoring, behavior recognition, automatic driving, and others, the challenge of multi-object tracking is still to ensure tracking accuracy and robustness, especially in complex occlusion environments. Aiming at the issues of the occlusion, background noise, and motion state violent change for multi-object in a complex scene, an improved DeepSORT algorithm based on YOLOv5 is proposed for multi-object tracking to enhance the speed and accuracy of tracking. Firstly, a general object motion model is devised, which is similar to the variable acceleration motion model, and a multi-object tracking framework with the general motion model is established. Then, the latest YOLOv5 algorithm, which has satisfactory detection accuracy, is utilized to obtain the object information as the input of multi-object tracking. An unscented Kalman filter (UKF) is proposed to estimate the motion state of multi-object to solve nonlinear errors. In addition, the adaptive factor is introduced to evaluate observation noise and detect abnormal observations so as to adaptively adjust the innovation covariance matrix. Finally, an improved DeepSORT algorithm for multi-object tracking is formed to promote robustness and accuracy. Extensive experiments are carried out on the MOT16 data set, and we compare the proposed algorithm with the DeepSORT algorithm. The results indicate that the speed and precision of the improved DeepSORT are increased by 4.75% and 2.30%, respectively. Especially in the MOT16 of the dynamic camera, the improved DeepSORT shows better performance.

SELEÇÃO DE REFERÊNCIAS
Detalhe da pesquisa