Results 1 - 11 of 11
1.
Sci Rep ; 14(1): 18439, 2024 08 08.
Article in English | MEDLINE | ID: mdl-39117714

ABSTRACT

Accurate diagnosis of white blood cells from cytopathological images is a crucial step in evaluating leukaemia. In recent years, image classification methods based on fully convolutional networks have drawn extensive attention and achieved competitive performance in medical image classification. In this paper, we propose a white blood cell classification network called ResNeXt-CC for cytopathological images. First, we transform cytopathological images from the RGB color space to the HSV color space to more precisely extract the texture features, color changes and other details of white blood cells. Second, since cell classification primarily relies on distinguishing local characteristics, we design a cross-layer deep-feature fusion module to enhance the extraction of discriminative information. Third, an efficient attention mechanism based on the ECANet module is used to strengthen the extraction of fine cell details. Finally, we combine a modified softmax loss function with the center loss function to train the network, thereby effectively addressing the class-imbalance problem and improving network performance. Experimental results on the C-NMC 2019 dataset show that the proposed method has clear advantages over existing classification methods, including ResNet-50, Inception-V3, DenseNet121, VGG16, CrossViT, Token-to-Token ViT, Deep ViT, and Simple ViT, outperforming them by 5.5-20.43% in accuracy, 3.6-23.56% in F1-score, 3.5-25.71% in AUROC, and 8.1-36.98% in specificity, respectively.
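
[Editor's note: the abstract does not include code. As a rough sketch of the joint objective it describes (softmax cross-entropy plus a center loss term), the following minimal PyTorch example may help; all dimensions and the weighting factor are illustrative placeholders, not values from the paper.]

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CenterLoss(nn.Module):
    """Penalizes the squared distance between each feature and its class center."""
    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        # One learnable center per class, updated by the optimizer.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        batch_centers = self.centers[labels]          # center of each sample's class
        return ((feats - batch_centers) ** 2).sum(dim=1).mean()

# Joint objective: softmax cross-entropy + weighted center loss.
num_classes, feat_dim, lam = 2, 512, 0.01             # placeholder hyperparameters
center_loss = CenterLoss(num_classes, feat_dim)
feats = torch.randn(8, feat_dim, requires_grad=True)  # dummy backbone features
logits = torch.randn(8, num_classes, requires_grad=True)
labels = torch.randint(0, num_classes, (8,))
loss = F.cross_entropy(logits, labels) + lam * center_loss(feats, labels)
loss.backward()
```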


Subject(s)
Leukocytes , Humans , Leukocytes/cytology , Neural Networks, Computer , Image Processing, Computer-Assisted/methods , Leukemia/pathology , Leukemia/classification , Algorithms , Deep Learning
2.
Sensors (Basel) ; 24(4)2024 Feb 19.
Article in English | MEDLINE | ID: mdl-38400488

ABSTRACT

To address the challenges of high parameter counts and limited training samples in finger vein recognition, we present the FV-MViT model, a lightweight deep learning solution emphasizing high accuracy, portable design, and low latency. FV-MViT introduces two key components. The Mul-MV2 Block uses a dual-path inverted residual connection structure for multi-scale convolutions, extracting additional local features. The Enhanced MobileViT Block eliminates the large-scale convolution block at the beginning of the original MobileViT Block, converts the Transformer's self-attention into separable self-attention with linear complexity, and optimizes the back end of the original MobileViT Block with depth-wise separable convolutions, thereby extracting global features while reducing parameter counts and feature extraction times. Additionally, we introduce a soft target center cross-entropy loss function to enhance generalization and increase accuracy. Experimental results show that FV-MViT achieves recognition accuracies of 99.53% and 100.00% on the Shandong University (SDU) and Universiti Teknologi Malaysia (USM) datasets, with equal error rates of 0.47% and 0.02%, respectively. The model has 5.26 million parameters and a latency of 10.00 milliseconds from sample input to recognition output. Comparison with state-of-the-art (SOTA) methods shows that FV-MViT is competitive.
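
[Editor's note: the separable self-attention idea, replacing the quadratic token-to-token attention map with a single context vector, can be sketched as below, in the spirit of MobileViT v2; this is not the paper's code and all dimensions are placeholders.]

```python
import torch
import torch.nn as nn

class SeparableSelfAttention(nn.Module):
    """Linear-complexity attention: a single context vector replaces the
    full token-by-token attention matrix."""
    def __init__(self, dim: int):
        super().__init__()
        self.to_scores = nn.Linear(dim, 1)   # one scalar score per token
        self.to_key = nn.Linear(dim, dim)
        self.to_value = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        scores = self.to_scores(x).softmax(dim=1)       # (B, N, 1)
        context = (scores * self.to_key(x)).sum(dim=1)  # (B, D), one vector per batch
        out = torch.relu(self.to_value(x)) * context.unsqueeze(1)  # broadcast mix
        return self.out(out)

x = torch.randn(2, 196, 64)                  # dummy token sequence
print(SeparableSelfAttention(64)(x).shape)   # torch.Size([2, 196, 64])
```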


Subject(s)
Electric Power Supplies , Extremities , Humans , Entropy , Recognition, Psychology , Veins
3.
Sensors (Basel) ; 23(7)2023 Apr 06.
Article in English | MEDLINE | ID: mdl-37050842

ABSTRACT

Current popular one-shot multi-object tracking (MOT) algorithms are dominated by the joint detection and embedding paradigm, which offers high inference speed and accuracy, but whose tracking performance is unstable in crowded scenes. Not only does the detection branch have difficulty obtaining accurate object positions, but the ambiguous appearance features extracted by the re-identification (re-ID) branch also lead to identity switches. Focusing on these problems, this paper proposes a more robust MOT algorithm, named CSMOT, based on FairMOT. First, on the basis of the encoder-decoder network, a coordinate attention module is designed to enhance the information interaction between channels along the horizontal and vertical coordinates, which improves the object-detection ability. Then, an angle-center loss that effectively maximizes intra-class similarity is proposed to optimize the re-ID branch, making the extracted re-ID features more discriminative. We further redesign the re-ID feature dimension to balance the detection and re-ID tasks. Finally, a simple and effective data association mechanism is introduced, which associates every detection instead of only the high-score detections during tracking. Experimental results show that our one-shot MOT algorithm achieves excellent tracking performance on multiple public datasets and can be effectively applied to crowded scenes. In particular, CSMOT decreases the number of ID switches by 11.8% and 33.8% on the MOT16 and MOT17 test datasets, respectively, compared to the baseline.
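
[Editor's note: the coordinate attention design, with attention factorized along the horizontal and vertical axes, can be sketched as follows. This follows the generic coordinate attention formulation, not CSMOT's exact module; layer sizes are placeholders.]

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Factorizes spatial attention into horizontal and vertical pooling
    so channel interactions keep positional information."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        mid = max(channels // reduction, 8)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Pool along each spatial axis separately.
        pooled_h = x.mean(dim=3, keepdim=True)                      # (B, C, H, 1)
        pooled_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (B, C, W, 1)
        y = self.act(self.bn(self.conv1(torch.cat([pooled_h, pooled_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        attn_h = torch.sigmoid(self.conv_h(y_h))                        # (B, C, H, 1)
        attn_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))    # (B, C, 1, W)
        return x * attn_h * attn_w   # position-aware channel reweighting

x = torch.randn(1, 64, 32, 32)
print(CoordinateAttention(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```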

4.
J Neural Eng ; 20(2)2023 03 13.
Article in English | MEDLINE | ID: mdl-36854181

ABSTRACT

Objective. A motor imagery-based brain-computer interface (MI-BCI) translates spontaneous movement intention from the brain to outside devices. A multimodal MI-BCI that uses multiple neural signals contains rich common and complementary information and is promising for enhancing decoding accuracy. However, the heterogeneity of different modalities makes the multimodal decoding task difficult, and how to effectively utilize multimodal information remains to be studied. Approach. In this study, a multimodal MI decoding neural network was proposed. Spatial feature alignment losses were designed to enhance the feature representations extracted from the heterogeneous data and to guide the fusion of features from different modalities. An attention-based modality fusion module was built to align and fuse the features in the temporal dimension. To evaluate the proposed decoding method, a five-class MI electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) dataset was constructed. Main results and significance. The comparison experiments showed that the proposed decoding method achieved higher decoding accuracy than the compared methods on both the self-collected dataset and a public dataset. The ablation results verified the effectiveness of each part of the proposed method. Feature distribution visualizations showed that the proposed losses enhance the feature representations of the EEG and fNIRS modalities. The proposed method based on EEG and fNIRS modalities has significant potential for improving the decoding performance of MI tasks.
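
[Editor's note: as a toy illustration of attention-based temporal fusion of two heterogeneous feature streams, not the paper's architecture, one can let EEG features attend to fNIRS features with standard multi-head attention; all shapes are placeholders.]

```python
import torch
import torch.nn as nn

class ModalityFusion(nn.Module):
    """Toy attention-based fusion: EEG tokens attend to fNIRS tokens along
    the temporal dimension, then both streams are pooled and concatenated."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, eeg: torch.Tensor, fnirs: torch.Tensor) -> torch.Tensor:
        # eeg: (B, T_eeg, D), fnirs: (B, T_fnirs, D)
        aligned, _ = self.attn(query=eeg, key=fnirs, value=fnirs)
        fused = self.norm(eeg + aligned)   # residual fusion of the aligned stream
        return torch.cat([fused.mean(dim=1), fnirs.mean(dim=1)], dim=-1)

eeg = torch.randn(4, 200, 64)    # dummy EEG feature sequence
fnirs = torch.randn(4, 50, 64)   # dummy fNIRS feature sequence
print(ModalityFusion(64)(eeg, fnirs).shape)  # torch.Size([4, 128])
```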


Subject(s)
Brain-Computer Interfaces , Imagination , Electroencephalography/methods , Brain , Movement , Neural Networks, Computer , Algorithms
5.
J Digit Imaging ; 36(2): 547-561, 2023 04.
Article in English | MEDLINE | ID: mdl-36401132

ABSTRACT

Localization of anatomical landmarks is essential for clinical diagnosis, treatment planning, and research. This paper proposes a novel deep network named feature aggregation and refinement network (FARNet) for automatically detecting anatomical landmarks. FARNet employs an encoder-decoder architecture. To alleviate the problem of limited training data in the medical domain, we adopt a backbone network pre-trained on natural images as the encoder. The decoder includes a multi-scale feature aggregation module for multi-scale feature fusion and a feature refinement module for high-resolution heatmap regression. Coarse-to-fine supervision is applied to the two modules to facilitate end-to-end training. We further propose a novel loss function named the Exponential Weighted Center loss for accurate heatmap regression, which focuses on the losses from pixels near landmarks and suppresses those from pixels far away. We evaluate FARNet on three publicly available anatomical landmark detection datasets, comprising cephalometric, hand, and spine radiographs. Our network achieves state-of-the-art performance on all three datasets. Code is available at https://github.com/JuvenileInWind/FARNet.
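
[Editor's note: the paper defines the Exponential Weighted Center loss precisely; the sketch below only captures the general idea stated above (exponentially up-weighting pixels near the landmark in a heatmap regression loss), and its exact weighting form is an assumption, not the paper's formula.]

```python
import torch

def exp_weighted_heatmap_loss(pred, target, alpha: float = 10.0):
    """Per-pixel squared error weighted by exp(alpha * target): pixels near
    the landmark, where the target Gaussian approaches 1, dominate the loss."""
    weights = torch.exp(alpha * target)
    return (weights * (pred - target) ** 2).mean()

# Dummy 64x64 heatmap with a Gaussian peak at (32, 32) as the target.
ys, xs = torch.meshgrid(torch.arange(64.), torch.arange(64.), indexing="ij")
target = torch.exp(-((xs - 32) ** 2 + (ys - 32) ** 2) / (2 * 3.0 ** 2))
pred = torch.rand(64, 64, requires_grad=True)
exp_weighted_heatmap_loss(pred, target).backward()
```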


Subject(s)
Hand , Spine , Humans
6.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 39(5): 928-936, 2022 Oct 25.
Article in Chinese | MEDLINE | ID: mdl-36310481

ABSTRACT

Considering the small inter-class differences in the diabetic retinopathy (DR) grading task, a retinopathy grading algorithm based on cross-layer bilinear pooling is proposed. First, the input image is cropped according to the Hough circle transform (HCT) and its contrast is improved by preprocessing; then the squeeze-and-excitation group residual network (SEResNeXt) is used as the backbone of the model, and a cross-layer bilinear pooling module is introduced for classification. Finally, a random puzzle generator is introduced during training for progressive learning, and the center loss (CL) and focal loss (FL) are used to further improve the final classification. The quadratic weighted kappa (QWK) reaches 90.84% on the Indian Diabetic Retinopathy Image Dataset (IDRiD), and the area under the receiver operating characteristic curve (AUC) reaches 88.54% on the Messidor-2 dataset. Experiments show that the proposed algorithm has practical value in the field of diabetic retinopathy grading.
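
[Editor's note: the HCT cropping step can be sketched with OpenCV's built-in Hough circle detector; all detector parameters below are illustrative guesses, not the paper's values.]

```python
import cv2
import numpy as np

def crop_fundus_by_hough(img: np.ndarray) -> np.ndarray:
    """Detect the circular fundus region and crop to its bounding box."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 5)
    circles = cv2.HoughCircles(
        gray, cv2.HOUGH_GRADIENT, dp=2, minDist=gray.shape[0],
        param1=100, param2=30,
        minRadius=gray.shape[0] // 4, maxRadius=gray.shape[0] // 2)
    if circles is None:
        return img                         # fall back to the full image
    x, y, r = np.round(circles[0, 0]).astype(int)
    h, w = img.shape[:2]
    return img[max(y - r, 0):min(y + r, h), max(x - r, 0):min(x + r, w)]

# Synthetic test image: a filled disc standing in for the fundus.
img = np.zeros((512, 512, 3), dtype=np.uint8)
cv2.circle(img, (256, 256), 200, (30, 60, 120), thickness=-1)
print(crop_fundus_by_hough(img).shape)  # cropped to the detected disc
```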


Subject(s)
Diabetes Mellitus , Diabetic Retinopathy , Humans , Diabetic Retinopathy/diagnostic imaging , Algorithms , ROC Curve
7.
Comput Biol Med ; 140: 105088, 2021 Nov 30.
Article in English | MEDLINE | ID: mdl-34864582

ABSTRACT

Fat accumulation in liver cells can increase the risk of cardiac complications and cardiovascular disease mortality. A way to quickly and accurately detect hepatic steatosis is therefore critically important. However, current methods, e.g., liver biopsy, magnetic resonance imaging, and computerized tomography scans, are subject to high cost and/or medical complications. In this paper, we propose a deep neural network to estimate the degree of hepatic steatosis (low, mid, high) using only body shapes. The proposed network adopts dilated residual network blocks to extract refined features from input body shape maps by expanding the receptive field. Furthermore, to classify the degree of steatosis more accurately, we create a hybrid of the center loss and the cross-entropy loss to compact intra-class variations and separate inter-class differences. We performed extensive tests on a public medical dataset with various network parameters. Our experimental results show that the proposed network achieves a total accuracy of over 82% and offers an accurate and accessible assessment of hepatic steatosis.
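
[Editor's note: a minimal sketch of a dilated residual block, the generic mechanism the abstract describes for enlarging the receptive field; this is not the paper's exact configuration, and all sizes are placeholders.]

```python
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    """Residual block with dilated 3x3 convolutions, enlarging the
    receptive field without extra downsampling."""
    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        pad = dilation  # keeps spatial size constant for 3x3 kernels
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=pad, dilation=dilation),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=pad, dilation=dilation),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x + self.body(x))   # identity shortcut

x = torch.randn(1, 32, 64, 64)   # dummy body-shape feature map
print(DilatedResidualBlock(32)(x).shape)  # torch.Size([1, 32, 64, 64])
```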

8.
Front Neurosci ; 15: 760979, 2021.
Article in English | MEDLINE | ID: mdl-34744622

ABSTRACT

Convolutional neural networks (CNNs) have been widely applied to motor imagery (MI) classification, significantly improving the state-of-the-art (SoA) classification accuracy. Although innovative model structures have been thoroughly explored, little attention has been paid to the objective function. In most of the available CNNs in the MI area, the standard cross-entropy loss is used as the objective function, which only ensures deep feature separability. To address this limitation of current objective functions, a new loss function combining smoothed cross-entropy (with label smoothing) and center loss is proposed as the supervision signal for the model in the MI recognition task. Specifically, the smoothed cross-entropy is calculated between the predicted labels and one-hot hard labels regularized by uniformly distributed noise. The center loss learns a deep feature center for each class and minimizes the distance between deep features and their corresponding centers. The proposed loss optimizes the model toward two learning objectives, preventing overconfident predictions and increasing the discriminative capacity of deep features (inter-class separability and intra-class invariance), which together guarantee the effectiveness of MI recognition models. We conduct extensive experiments on two well-known benchmarks (BCI competition IV-2a and IV-2b) to evaluate our method. The results indicate that the proposed approach achieves better performance than other SoA models on both datasets. The proposed learning scheme offers a more robust optimization of the CNN model for the MI classification task, simultaneously decreasing the risk of overfitting and increasing the discriminative power of the deeply learned features.
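
[Editor's note: both pieces of this supervision signal are standard and can be sketched compactly in PyTorch, which provides label smoothing in its cross-entropy; the loss weights below are placeholders, not the paper's values.]

```python
import torch
import torch.nn.functional as F

def smoothed_ce_plus_center(logits, feats, centers, labels,
                            smoothing=0.1, lam=0.003):
    """Label-smoothed cross-entropy (uniform-noise regularized targets)
    plus a center term pulling each deep feature toward its class center."""
    ce = F.cross_entropy(logits, labels, label_smoothing=smoothing)
    center = ((feats - centers[labels]) ** 2).sum(dim=1).mean()
    return ce + lam * center

num_classes, feat_dim = 4, 128   # e.g. four MI classes; dims are dummies
logits = torch.randn(16, num_classes, requires_grad=True)
feats = torch.randn(16, feat_dim, requires_grad=True)
centers = torch.randn(num_classes, feat_dim, requires_grad=True)
labels = torch.randint(0, num_classes, (16,))
smoothed_ce_plus_center(logits, feats, centers, labels).backward()
```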

9.
Entropy (Basel) ; 23(7)2021 Jul 20.
Article in English | MEDLINE | ID: mdl-34356460

ABSTRACT

Cross-modality person re-identification studies the matching of person images across different modalities (RGB and infrared). Given an RGB image of a pedestrian collected under visible light in the daytime, cross-modality person re-identification aims to determine whether the same pedestrian appears in infrared (IR) images collected by infrared cameras at night, and vice versa; it thus addresses pedestrian recognition in low light or at night. This paper aims to improve the similarity of the same pedestrian across the two modalities by improving the feature expression ability of the network and designing appropriate loss functions. To implement our approach, we introduce a deep neural network structure combining heterogeneous center loss (HC loss) and a non-local mechanism. On the one hand, this heightens the feature representation of the feature learning module; on the other hand, it improves the cross-modality similarity within each class. Experimental results show that the network achieves excellent performance on the SYSU-MM01 dataset.
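
[Editor's note: a rough sketch of the heterogeneous-center idea described above, pulling together the per-identity feature centers of the two modalities; it follows the general formulation of HC loss, not the paper's exact code.]

```python
import torch

def heterogeneous_center_loss(rgb_feats, ir_feats, labels):
    """For every identity in the batch, penalize the squared distance
    between its RGB feature center and its IR feature center."""
    loss, count = 0.0, 0
    for c in labels.unique():
        mask = labels == c
        rgb_center = rgb_feats[mask].mean(dim=0)
        ir_center = ir_feats[mask].mean(dim=0)
        loss = loss + ((rgb_center - ir_center) ** 2).sum()
        count += 1
    return loss / count

rgb = torch.randn(8, 256, requires_grad=True)    # RGB-branch features
ir = torch.randn(8, 256, requires_grad=True)     # IR-branch features
labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])  # identity labels
heterogeneous_center_loss(rgb, ir, labels).backward()
```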

10.
Plant Methods ; 17(1): 65, 2021 Jun 22.
Article in English | MEDLINE | ID: mdl-34158091

ABSTRACT

BACKGROUND: The study of plant phenotypes with deep learning has received increasing interest in recent years, and impressive progress has been made in plant breeding. In plant phenotype classification and recognition tasks, deep learning relies heavily on a large amount of training data to extract and recognize target features. However, for flower cultivar identification tasks with a huge number of cultivars, it is difficult for traditional deep learning methods to achieve good recognition results with limited sample data. Thus, a method based on metric learning for flower cultivar identification is proposed to solve this problem. RESULTS: We added center loss to the classification network to make inter-class samples disperse and intra-class samples compact; ResNet18, ResNet50, and DenseNet121 were used for feature extraction. To evaluate the effectiveness of the proposed method, the public Oxford 102 Flowers dataset and two novel datasets constructed by us were chosen. With joint supervision of the center loss and the L2-softmax loss, the test accuracy is 91.88%, 97.34%, and 99.82% on the three datasets, respectively. Feature distributions observed with t-distributed stochastic neighbor embedding (t-SNE) verify the effectiveness of the method. CONCLUSIONS: An efficient metric learning method has been described for the flower cultivar identification task, which not only provides high recognition rates but also makes the features extracted by the recognition network interpretable. This study demonstrates that the proposed method offers new ideas for applying small amounts of data in the identification field and has important reference value for flower cultivar identification research.
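
[Editor's note: a minimal sketch of the L2-softmax head used in the joint supervision, assuming a fixed scale alpha; the value of alpha and all dimensions are placeholders, and the center loss term is sketched under entry 1 above.]

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class L2Softmax(nn.Module):
    """L2-softmax head: features are L2-normalized and rescaled by a fixed
    alpha before the linear classifier, so all embeddings lie on a
    hypersphere of radius alpha."""
    def __init__(self, feat_dim: int, num_classes: int, alpha: float = 16.0):
        super().__init__()
        self.alpha = alpha
        self.fc = nn.Linear(feat_dim, num_classes)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.fc(self.alpha * F.normalize(feats, dim=1))

head = L2Softmax(feat_dim=512, num_classes=102)   # e.g. Oxford 102 Flowers
feats = torch.randn(8, 512)
loss = F.cross_entropy(head(feats), torch.randint(0, 102, (8,)))
print(loss.item())
```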

11.
Sensors (Basel) ; 22(1)2021 Dec 23.
Article in English | MEDLINE | ID: mdl-35009614

ABSTRACT

In the past few years, there has been a leap from traditional palmprint recognition methodologies, which use handcrafted features, to deep-learning approaches that are able to automatically learn feature representations from the input data. However, the information that is extracted from such deep-learning models typically corresponds to the global image appearance, where only the most discriminative cues from the input image are considered. This characteristic is especially problematic when data is acquired in unconstrained settings, as in the case of contactless palmprint recognition systems, where visual artifacts caused by elastic deformations of the palmar surface are typically present in spatially local parts of the captured images. In this study we address the problem of elastic deformations by introducing a new approach to contactless palmprint recognition based on a novel CNN model, designed as a two-path architecture, where one path processes the input in a holistic manner, while the second path extracts local information from smaller image patches sampled from the input image. As elastic deformations can be assumed to most significantly affect the global appearance, while having a lesser impact on spatially local image areas, the local processing path addresses the issues related to elastic deformations, thereby supplementing the information from the global processing path. The model is trained with a learning objective that combines the Additive Angular Margin (ArcFace) loss and the well-known center loss. By using the proposed model design, the discriminative power of the learned image representation is significantly enhanced compared to standard holistic models, which, as we show in the experimental section, leads to state-of-the-art performance for contactless palmprint recognition. Our approach is tested on two publicly available contactless palmprint datasets, namely IITD and CASIA, and is demonstrated to perform favorably against state-of-the-art methods from the literature. The source code for the proposed model is made publicly available.
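
[Editor's note: as a hedged illustration of the ArcFace component of this learning objective (the center loss term is sketched under entry 1 above), here is a minimal additive-angular-margin head in PyTorch following the standard formulation; s and m are common defaults, not necessarily the paper's settings.]

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcFaceHead(nn.Module):
    """Cosine logits with an additive angular margin m on the target class,
    scaled by s before the softmax cross-entropy."""
    def __init__(self, feat_dim: int, num_classes: int,
                 s: float = 64.0, m: float = 0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.s, self.m = s, m

    def forward(self, feats, labels):
        # Cosine similarity between normalized features and class weights.
        cos = F.linear(F.normalize(feats), F.normalize(self.weight))
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        target = torch.cos(theta + self.m)    # add the margin to the angle
        one_hot = F.one_hot(labels, cos.size(1)).bool()
        return self.s * torch.where(one_hot, target, cos)

head = ArcFaceHead(feat_dim=256, num_classes=100)
feats = torch.randn(8, 256, requires_grad=True)
labels = torch.randint(0, 100, (8,))
F.cross_entropy(head(feats, labels), labels).backward()
```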


Subject(s)
Algorithms , Hand , Learning , Recognition, Psychology