Search | BVS Regional Portal

1.

Travel-mode inference based on GPS-trajectory data through multi-scale mixed attention mechanism.

Pei, Xiaohui; Yang, Xianjun; Wang, Tao; Ding, Zenghui; Xu, Yang; Jia, Lin; Sun, Yining.

Heliyon ; 10(15): e35572, 2024 Aug 15.

Article in English | MEDLINE | ID: mdl-39170500

ABSTRACT

Identifying travel modes is essential for modern urban transportation planning and management. Recent advancements in data collection, especially those involving Global Positioning System (GPS) technology, offer promising opportunities for rapidly and accurately inferring users' travel modes. This study presents an innovative method for inferring travel modes from GPS trajectory data. The method utilizes multi-scale convolutional techniques to capture and analyze both temporal and spatial information of the data, thereby revealing the underlying spatiotemporal relationships inherent in user movement and behavior patterns. In addition, an attention mechanism is integrated into the model to enable autonomous learning. This mechanism enhances the model's capacity to identify and emphasize key information across different time periods and spatial locations, thus improving the accuracy of travel mode inference. Evaluation on the open-source GPS trajectory dataset, GeoLife, demonstrates that the proposed method attained an accuracy of 83.3%. This result highlights the effectiveness of the method, demonstrating that the model can more accurately understand and predict user travel modes through the integration of multi-scale convolutional technologies and attention mechanisms.
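The combination of multi-scale convolution and attention described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the kernel sizes, the fixed averaging kernels, and the scoring rule below are hypothetical stand-ins for weights that a real network would learn, and the input is assumed to be a 1D series of speeds derived from a GPS trajectory.

```python
import math

def conv1d_same(signal, kernel):
    """Slide an odd-length kernel over a 1D signal, zero-padded to keep the length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multi_scale_attention(signal, kernel_sizes=(3, 5, 7)):
    """Return per-scale feature maps, attention weights over time, and pooled features."""
    # One branch per (odd) kernel size; averaging kernels are an assumption,
    # a trained model would learn these weights.
    branches = [conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes]
    # Hypothetical attention score: mean branch response at each time step.
    scores = [sum(b[t] for b in branches) / len(branches) for t in range(len(signal))]
    weights = softmax(scores)
    # Attention-weighted sum over time gives one pooled feature per scale.
    pooled = [sum(w * v for w, v in zip(weights, b)) for b in branches]
    return branches, weights, pooled

# Example: a short speed series (m/s) that switches from walking to driving and back.
speeds = [1.2, 1.4, 1.3, 8.0, 9.5, 9.1, 1.1]
branches, weights, pooled = multi_scale_attention(speeds)
```

The attention weights sum to 1 and concentrate on the high-speed segment, so the pooled features emphasize the time steps that differ most from the rest of the trajectory.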

2.

Interactive Multi-scale Fusion: Advancing Brain Tumor Detection Through Trans-IMSM Model.

Durairaj, Vasanthi; Uthirapathy, Palani.

J Imaging Inform Med ; 2024 Aug 15.

Article in English | MEDLINE | ID: mdl-39147889

ABSTRACT

Multi-modal medical image (MI) fusion generates composite images that combine complementary features from distinct images acquired under several conditions. These images help physicians diagnose disease accurately. Hence, this research proposes a novel multi-modal MI fusion model, the guided filter-based interactive multi-scale and multi-modal transformer (Trans-IMSM) fusion approach, to produce high-quality computed tomography-magnetic resonance imaging (CT-MRI) fused images for brain tumor detection. This research uses the CT and MRI brain scan dataset to gather the input CT and MRI images. First, the input images are preprocessed to improve image quality and generalization ability for further analysis. Then, the preprocessed CT and MRI images are decomposed into detail and base components using the guided filter-based MI decomposition approach. This approach involves two phases: acquiring the image guidance and decomposing the images with the guided filter. A Canny operator is employed to acquire the image guidance, which comprises robust edges for the CT and MRI images, and the guided filter is applied to decompose the guidance and preprocessed images. Then, the detail components are fused by applying the Trans-IMSM model, while a weighting approach is used for the base components. The fused detail and base components are subsequently processed through a gated fusion and reconstruction network, and the final fused images for brain tumor detection are generated. Extensive tests are carried out to assess the Trans-IMSM method's efficacy. The evaluation results demonstrate its robustness and effectiveness, with an accuracy of 98.64% and an SSIM of 0.94.

3.

Leadwise clustering multi-branch network for multi-label ECG classification.

Zhou, Feiyan; Chen, Lingzhi.

Med Eng Phys ; 130: 104196, 2024 Aug.

Article in English | MEDLINE | ID: mdl-39160024

ABSTRACT

The 12-lead electrocardiogram (ECG) is widely used for diagnosing cardiovascular diseases in clinical practice. Recently, deep learning methods have become increasingly effective for automatically classifying ECG signals. However, most current research simply combines the 12-lead ECG signals into a matrix without fully considering the intrinsic relationships between the leads and the heart's structure. To better utilize medical domain knowledge, we propose a multi-branch network for multi-label ECG classification and introduce an intuitive and effective lead grouping strategy. Correspondingly, we design multi-branch networks where each branch employs a multi-scale convolutional network structure to extract more comprehensive features, with each branch corresponding to a lead combination. To better integrate features from different leads, we propose a feature weighting fusion module. We evaluate our method on the PTB-XL dataset for classifying 4 arrhythmia types and normal rhythm, and on the China Physiological Signal Challenge 2018 (CPSC2018) database for classifying 8 arrhythmia types and normal rhythm. Experimental results on multiple multi-label datasets demonstrate that our proposed multi-branch network outperforms state-of-the-art networks in multi-label classification tasks.

Subjects

Electrocardiography , Signal Processing, Computer-Assisted , Humans , Cluster Analysis , Arrhythmias, Cardiac/diagnosis , Arrhythmias, Cardiac/physiopathology , Deep Learning , Neural Networks, Computer

4.

A novel approach for automatic classification of macular degeneration OCT images.

Pang, Shilong; Zou, Beiji; Xiao, Xiaoxia; Peng, Qinghua; Yan, Junfeng; Zhang, Wensheng; Yue, Kejuan.

Sci Rep ; 14(1): 19285, 2024 Aug 20.

Article in English | MEDLINE | ID: mdl-39164445

ABSTRACT

Age-related macular degeneration (AMD) and diabetic macular edema (DME) are significant causes of blindness worldwide. The prevalence of these diseases is steadily increasing due to population aging. Therefore, early diagnosis and prevention are crucial for effective treatment. Classification of macular degeneration OCT images is a widely used method for assessing retinal lesions. However, there are two main challenges in OCT image classification: incomplete image feature extraction and lack of prominence in important positional features. To address these challenges, we proposed a deep learning neural network model called MSA-Net, which incorporates our proposed multi-scale architecture and spatial attention mechanism. Our multi-scale architecture is based on depthwise separable convolution, which ensures comprehensive feature extraction from multiple scales while minimizing the growth of model parameters. The spatial attention mechanism aims to highlight the important positional features in the images, emphasizing the representation of macular region features in OCT images. We test MSA-Net on the NEH dataset and the UCSD dataset, performing three-class (CNV, DRUSEN, and NORMAL) and four-class (CNV, DRUSEN, DME, and NORMAL) classification tasks. On the NEH dataset, the accuracy, sensitivity, and specificity are 98.1%, 97.9%, and 98.0%, respectively. After fine-tuning on the UCSD dataset, the accuracy, sensitivity, and specificity are 96.7%, 96.7%, and 98.9%, respectively. Experimental results demonstrate the excellent classification performance and generalization ability of our model compared to previous models and recent well-known OCT classification models, establishing it as a highly competitive intelligent classification approach in the field of macular degeneration.

Subjects

Deep Learning , Macular Degeneration , Neural Networks, Computer , Tomography, Optical Coherence , Humans , Macular Degeneration/diagnostic imaging , Macular Degeneration/classification , Macular Degeneration/pathology , Tomography, Optical Coherence/methods , Macular Edema/diagnostic imaging , Macular Edema/classification , Macular Edema/pathology , Diabetic Retinopathy/diagnostic imaging , Diabetic Retinopathy/classification , Diabetic Retinopathy/pathology , Diabetic Retinopathy/diagnosis , Image Processing, Computer-Assisted/methods

5.

MARes-Net: multi-scale attention residual network for jaw cyst image segmentation.

Ding, Xiaokang; Jiang, Xiaoliang; Zheng, Huixia; Shi, Hualuo; Wang, Ban; Chan, Sixian.

Front Bioeng Biotechnol ; 12: 1454728, 2024.

Article in English | MEDLINE | ID: mdl-39161348

ABSTRACT

A jaw cyst is a fluid-containing cystic lesion that can occur in any part of the jaw and cause facial swelling, dental lesions, jaw fractures, and other associated issues. Due to the diversity and complexity of jaw images, existing deep-learning methods still face challenges in segmentation. To this end, we propose MARes-Net, an innovative multi-scale attentional residual network architecture. Firstly, the residual connection is used to optimize the encoder-decoder process, which effectively addresses the vanishing gradient problem and improves the training efficiency and optimization ability. Secondly, the scale-aware feature extraction module (SFEM) significantly enhances the network's perceptual abilities by extending its receptive field across various scales, spaces, and channel dimensions. Thirdly, the multi-scale compression excitation module (MCEM) compresses and excites the feature map and combines it with contextual information to improve model performance. Furthermore, an attention gate module is introduced to refine the feature map output. Finally, rigorous experiments were conducted on the original jaw cyst dataset provided by Quzhou People's Hospital to verify the validity of the MARes-Net architecture. The experimental data showed that the precision, recall, IoU, and F1-score of MARes-Net reached 93.84%, 93.70%, 86.17%, and 93.21%, respectively. Compared with existing models, MARes-Net delineates and localizes anatomical structures more accurately in jaw cyst image segmentation.
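For reference, the four reported metrics can be computed from a predicted and a ground-truth binary mask as in the sketch below. This uses the standard pixel-count definitions; the function name and the flat 0/1 list representation are assumptions for illustration, and the abstract does not state how the authors aggregated the metrics across images.

```python
def segmentation_metrics(pred, truth):
    """Compute precision, recall, IoU and F1 from two flat sequences of 0/1 labels."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    # Count true positives, false positives and false negatives pixel by pixel.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"precision": precision, "recall": recall, "iou": iou, "f1": f1}

# Example with five pixels: two true positives, one false positive, one false negative.
metrics = segmentation_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```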

6.

mm3DSNet: multi-scale and multi-feedforward self-attention 3D segmentation network for CT scans of hepatobiliary ducts.

Zhou, Yinghong; Xie, Yiying; Cai, Nian; Liang, Yuchen; Gong, Ruifeng; Wang, Ping.

Med Biol Eng Comput ; 2024 Aug 23.

Article in English | MEDLINE | ID: mdl-39177918

ABSTRACT

Image segmentation is a key step in the 3D reconstruction of the hepatobiliary duct tree, which is significant for preoperative planning. In this paper, a novel 3D U-Net variant is designed for segmenting hepatobiliary ducts in abdominal CT scans; it is composed of a 3D encoder-decoder and a 3D multi-feedforward self-attention module (MFSAM). To extract sufficient semantic and spatial features with high inference speed, the 3D ConvNeXt block is designed as the 3D extension of the 2D ConvNeXt. To improve the ability of semantic feature extraction, the MFSAM is designed to transfer the semantic and spatial features at different scales from the encoder to the decoder. Also, to balance the losses for the voxels and the edges of the hepatobiliary ducts, a boundary-aware overlap cross-entropy loss is proposed by combining the cross-entropy loss, the Dice loss, and the boundary loss. Experimental results indicate that the proposed method is superior to some existing deep networks, as well as to a radiologist without rich experience, in CT segmentation of hepatobiliary ducts, with a segmentation performance of 76.54% Dice and 6.56 HD.
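The idea of combining several loss terms can be sketched as a weighted sum. The example below covers only the cross-entropy and soft Dice terms in their standard binary form; the boundary term is omitted, and the weights `w_ce` and `w_dice`, the smoothing constant, and the flat-list input format are assumptions, since the abstract does not give the exact formulation.

```python
import math

def cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)

def soft_dice_loss(probs, targets, smooth=1.0):
    """One minus the smoothed Dice overlap between probabilities and targets."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)

def combined_loss(probs, targets, w_ce=1.0, w_dice=1.0):
    """Weighted sum of cross-entropy and soft Dice (boundary term not included)."""
    return w_ce * cross_entropy(probs, targets) + w_dice * soft_dice_loss(probs, targets)

# Example: four voxels, two of which belong to the duct.
loss = combined_loss([0.9, 0.2, 0.8, 0.1], [1, 0, 1, 0])
```

Cross-entropy penalizes each voxel independently, while the Dice term measures overlap of the whole foreground, which is why the two are often paired when the foreground is a small fraction of the volume.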

7.

Insights into impact of chlorogenic acid on multi-scale structure and digestive properties of lotus seed starch under autoclaving treatment.

Wang, Xiaoying; Liu, Lu; Chen, Wenjing; Jia, Ru; Zheng, Baodong; Guo, Zebin.

Int J Biol Macromol ; 278(Pt 2): 134863, 2024 Aug 19.

Article in English | MEDLINE | ID: mdl-39168208

ABSTRACT

The interaction between polyphenols and starch is an important factor affecting the structure and function of starch. In this study, the impact of chlorogenic acid on the multi-scale structure and digestive properties of lotus seed starch under autoclaving treatment was evaluated. The results showed that lotus seed starch granules were destroyed under autoclaving treatment, and chlorogenic acid promoted the formation of a loose gel structure in lotus seed starch. In particular, the long- and short-range ordered structures of lotus seed starch-chlorogenic acid complexes were reduced compared with lotus seed starch under autoclaving treatment. The relative crystallinity of A-LS-CA complexes decreased from 23.4% to 20.3%, the value of R1047/1022 decreased from 0.87 to 0.80, and the proportion of amorphous region increased from 10.26% to 13.85%. In addition, the thermal stability, storage modulus, and loss modulus of lotus seed starch-chlorogenic acid complexes were reduced, indicating that the viscoelasticity of lotus seed starch gel was weakened by the addition of chlorogenic acid. Notably, chlorogenic acid increased the proportion of resistant starch from 58.25 ± 1.43% to 63.85 ± 0.96% compared with lotus seed starch under autoclaving treatment. These results provide theoretical guidance for the development of functional foods containing lotus seed starch.

8.

DEAF-Net: Detail-Enhanced Attention Feature Fusion Network for Retinal Vessel Segmentation.

Cai, Pengfei; Li, Biyuan; Sun, Gaowei; Yang, Bo; Wang, Xiuwei; Lv, Chunjie; Yan, Jun.

J Imaging Inform Med ; 2024 Aug 05.

Article in English | MEDLINE | ID: mdl-39103564

ABSTRACT

Retinal vessel segmentation is crucial for the diagnosis of ophthalmic and cardiovascular diseases. However, retinal vessels are densely and irregularly distributed, with many capillaries blending into the background, and exhibit low contrast. Moreover, encoder-decoder-based networks for retinal vessel segmentation suffer from irreversible loss of detailed features due to multiple encoding and decoding steps, leading to incorrect segmentation of the vessels. Meanwhile, single-dimensional attention mechanisms are limited because they neglect the importance of multidimensional features. To solve these issues, we propose a detail-enhanced attention feature fusion network (DEAF-Net) for retinal vessel segmentation. First, the detail-enhanced residual block (DERB) module is proposed to strengthen the capacity for detailed representation, ensuring that intricate features are efficiently maintained during the segmentation of delicate vessels. Second, the multidimensional collaborative attention encoder (MCAE) module is proposed to optimize the extraction of multidimensional information. Then, the dynamic decoder (DYD) module is introduced to preserve spatial information during the decoding process and reduce the information loss caused by upsampling operations. Finally, the proposed detail-enhanced feature fusion (DEFF) module, composed of the DERB, MCAE, and DYD modules, fuses feature maps from both encoding and decoding and achieves effective aggregation of multi-scale contextual information. Experiments on the DRIVE, CHASEDB1, and STARE datasets yield Sen of 0.8305, 0.8784, and 0.8654 and AUC of 0.9886, 0.9913, and 0.9911, respectively, demonstrating the performance of our proposed network, particularly in the segmentation of fine retinal vessels.

9.

A hybrid quantum-classical classification model based on branching multi-scale entanglement renormalization ansatz.

Hou, Yan-Yan; Li, Jian; Xu, Tao; Liu, Xin-Yu.

Sci Rep ; 14(1): 18521, 2024 Aug 09.

Article in English | MEDLINE | ID: mdl-39122811

ABSTRACT

Tensor networks are emerging architectures for implementing quantum classification models. The branching multi-scale entanglement renormalization ansatz (BMERA) is a tensor network known for its enhanced entanglement properties. This paper introduces a hybrid quantum-classical classification model based on BMERA and explores the correlation between circuit layout, expressiveness, and classification accuracy. Additionally, we present an autodifferentiation method for computing the cost function gradient, which serves as a viable option for other hybrid quantum-classical models. Through numerical experiments, we demonstrate the accuracy and robustness of our classification model in tasks such as image recognition and cluster excitation discrimination, offering a novel approach for designing quantum classification models.

10.

Low-light image enhancement using generative adversarial networks.

Wang, Litian; Zhao, Liquan; Zhong, Tie; Wu, Chunming.

Sci Rep ; 14(1): 18489, 2024 Aug 09.

Article in English | MEDLINE | ID: mdl-39122932

ABSTRACT

In low-light environments, the amount of light captured by the camera sensor is reduced, resulting in lower image brightness. This makes details in the image difficult to recognize or causes them to be lost entirely, which affects subsequent processing of low-light images. Low-light image enhancement methods can increase image brightness while better restoring color and detail information. A generative adversarial network is proposed to improve the quality of low-light images. This network consists of a generative network and an adversarial network. In the generative network, a multi-scale feature extraction module, which consists of dilated convolutions, regular convolutions, max pooling, and average pooling, is designed. This module can extract low-light image features from multiple scales, thereby obtaining richer feature information. Secondly, an illumination attention module is designed to reduce the interference of redundant features. This module assigns greater weight to important illumination features, enabling the network to extract illumination features more effectively. Finally, an encoder-decoder generative network is designed. It uses the multi-scale feature extraction module, illumination attention module, and other conventional modules to enhance low-light images and improve quality. Regarding the adversarial network, a dual-discriminator structure is designed. This network has a global adversarial network and a local adversarial network. They determine whether the input image is real or generated based on global and local features, enhancing the performance of the generator network. Additionally, an improved loss function is proposed by introducing color loss and perceptual loss into the conventional loss function. It can better measure the color loss between the generated image and a normally illuminated image, thus reducing color distortion during the enhancement process.
The proposed method, along with other methods, is tested using both synthesized and real low-light images. Experimental results show that, compared to other methods, the images enhanced by the proposed method are closer to normally illuminated images for synthetic low-light images. For real low-light images, the images enhanced by the proposed method retain more details, are clearer, and exhibit higher performance metrics. Overall, compared to other methods, the proposed method demonstrates better image enhancement capabilities for both synthetic and real low-light images.

11.

M³: using mask-attention and multi-scale for multi-modal brain MRI classification.

Kong, Guanqing; Wu, Chuanfu; Zhang, Zongqiu; Yin, Chuansheng; Qin, Dawei.

Front Neuroinform ; 18: 1403732, 2024.

Article in English | MEDLINE | ID: mdl-39139696

ABSTRACT

Introduction: Brain diseases, particularly the classification of gliomas and brain metastases and the prediction of hemorrhagic transformation (HT) in strokes, pose significant challenges in healthcare. Existing methods, relying predominantly on clinical data or imaging-based techniques such as radiomics, often fall short of satisfactory classification accuracy. These methods fail to adequately capture the nuanced features crucial for accurate diagnosis and are often hindered by noise and the inability to integrate information across various scales. Methods: We propose a novel approach, termed M³, that combines mask attention mechanisms with multi-scale feature fusion for multimodal brain disease classification tasks and aims to extract features highly relevant to the disease. The extracted features are then dimensionally reduced using Principal Component Analysis (PCA), followed by classification with a Support Vector Machine (SVM) to obtain the predictive results. Results: Our methodology underwent rigorous testing on multi-parametric MRI datasets for both brain tumors and strokes. The results demonstrate a significant improvement in addressing critical clinical challenges, including the classification of gliomas and brain metastases and the prediction of hemorrhagic transformation in strokes. Ablation studies further validate the effectiveness of our attention mechanism and feature fusion modules. Discussion: These findings underscore the potential of our approach to meet and exceed current clinical diagnostic demands, offering promising prospects for enhancing healthcare outcomes in the diagnosis and treatment of brain diseases.

12.

Considering multi-scale built environment in modeling severity of traffic violations by elderly drivers: An interpretable machine learning framework.

Sun, Zhiyuan; Ai, Zhoumeng; Wang, Zehao; Wang, Jianyu; Gu, Xin; Wang, Duo; Lu, Huapu; Chen, Yanyan.

Accid Anal Prev ; 207: 107740, 2024 Aug 13.

Article in English | MEDLINE | ID: mdl-39142041

ABSTRACT

The causes of traffic violations by elderly drivers are different from those of other age groups. To reduce serious traffic violations that are more likely to cause serious traffic crashes, this study divided the severity of traffic violations into three levels (i.e., slight, ordinary, severe) based on point deduction, and explored the patterns of serious traffic violations (i.e., ordinary, severe) using multi-source data. This paper designed an interpretable machine learning framework, in which four popular machine learning models were enhanced and compared. Specifically, the adaptive synthetic sampling method was applied to overcome the effects of imbalanced data and improve the prediction accuracy of minority classes (i.e., ordinary, severe); multi-objective feature selection based on NSGA-II was used to remove redundant factors, increasing computational efficiency and making the patterns discovered by the explainer more effective; and Bayesian hyperparameter optimization was used to obtain a more effective combination of hyperparameters with fewer iterations and boost model adaptability. Results show that the proposed interpretable machine learning framework can significantly improve and distinguish the performance of four popular machine learning models and two post-hoc interpretation methods. It is found that six of the top ten important factors belong to multi-scale built environment attributes. By comparing feature contributions and interaction effects, three findings can be summarized: ordinary and severe traffic violations share some influencing factors and interaction effects; for some shared influencing factors or combinations of factors, the values of the factors differ between the two; and each has some unique influencing factors and unique combinations of influencing factors.

13.

A study on the classification of complexly shaped cultivated land considering multi-scale features and edge priors.

Xiao, Jianghui; Zhang, Dongmei; Li, Jiang; Liu, Jiancong.

Environ Monit Assess ; 196(9): 816, 2024 Aug 15.

Article in English | MEDLINE | ID: mdl-39145878

ABSTRACT

Obtaining accurate cultivated land distribution data is crucial for sustainable agricultural development. Current cultivated land extraction studies mainly analyze crops on regularly shaped, small-block plots. In complexly shaped cultivated land, plot fragmentation leads to variable scales and blurred edges, and the kernel convolution operations of CNN-based models have difficulty extracting context information. To address these problems, we propose a complexly shaped farmland extraction network considering multi-scale features and edge priors (MFEPNet). Specifically, we design a context cross-attention fusion module to couple the local-global features extracted by the two-terminal path CNN-transformer network, which obtains more accurate cultivated land plot representations. We construct relation maps through a multi-scale feature reconstruction module to realize multi-scale information compensation by combining the gated weight parameter based on information entropy. Additionally, we design a texture-enhanced edge module, which uses the attention mechanism to fuse the edge information from texture feature extraction with the reconstructed feature map to enhance the edge features. Overall, the network effectively reduces the influence of variable scale, blurred edges, and limited global field of view. The proposed model is compared with classical deep learning models such as UNet, DeepLabV3+, DANet, PSPNet, RefineNet, SegNet, ACFNet, and OCRNet on the regular and irregular farmland datasets drawn from the IFLYTEK and Netherlands datasets. The experimental results show that MFEPNet achieves 92.40% and 91.65% MIoU on the regular and irregular farmland datasets, respectively, which is better than the benchmark models.

Subjects

Agriculture , Crops, Agricultural , Crops, Agricultural/growth & development , Environmental Monitoring/methods , Conservation of Natural Resources , Neural Networks, Computer , Deep Learning , Farms

14.

High-resolution satellite estimates of coal mine methane emissions from local to regional scales in Shanxi, China.

Bai, Shengxi; Zhang, Yongguang; Li, Fei; Yan, Yingqi; Chen, Huilin; Feng, Shuzhuang; Jiang, Fei; Sun, Shiwei; Wang, Zhongting; Zhou, Chunyan; Zhou, Wei; Zhao, Shaohua.

Sci Total Environ ; 950: 175446, 2024 Aug 10.

Article in English | MEDLINE | ID: mdl-39134266

ABSTRACT

Coal mines are significant anthropogenic sources of methane emissions, detectable and traceable from high spatial resolution satellites. Nevertheless, estimating local or regional-scale coal mine methane emission intensities based on high-resolution satellite observations remains challenging. In this study, we devise a novel interpolation algorithm based on high-resolution satellite observations (including Gaofen5-01A/02, Ziyuan-1 02D, PRISMA, GHGSat-C1 to C5, EnMAP, and EMIT) and conduct assessments of annual mean coal mine methane emissions in Shanxi Province, China, one of the world's largest coal-producing regions, spanning the period 2019 to 2023 across various scales: point-source, local, and regional. We use high-resolution satellite observations to perform interpolation-based estimations of methane emissions from three typical coal-mining areas. This approach, known as IPLTSO (Interpolation based on Satellite Observations), provides spatially explicit maps of methane emission intensities in these areas, thereby providing a novel local-scale coal mine methane emission inventory derived from high-resolution top-down observations. For regional-scale estimation and mapping, we utilize high-resolution satellite data to complement and substitute facility-level emission inventories for interpolation (IPLTSO+GCMT, Interpolation based on Satellite Observations and Global Coal Mine Tracker). We evaluate our IPLTSO and IPLTSO+GCMT estimation with emission inventories, top-down methane emission estimates from TROPOMI observations, and TROPOMI's methane concentration enhancements. The results suggest a notable right-skewed distribution of methane emission flux rates from coal mine point sources. Our IPLTSO+GCMT estimates the annual average coal mine methane emission in Shanxi Province from 2019 to 2023 at 8.9 ± 0.5 Tg/yr, marginally surpassing top-down inversion results from TROPOMI (8.5 ± 0.6 Tg/yr in 2019 and 8.6 ± 0.6 Tg/yr in 2020). 
Furthermore, the spatial patterns of methane emission intensity delineated by IPLTSO+GCMT and IPLTSO closely mirror those observed in TROPOMI's methane enhancements. Our comparative assessment underscores the superior performance and substantial potential of the developed interpolation algorithm based on high-resolution satellite observations for multi-scale estimation of coal mine methane emissions.

15.

A lightweight Color-changing melon ripeness detection algorithm based on model pruning and knowledge distillation: leveraging dilated residual and multi-screening path aggregation.

Chen, Guojun; Hou, Yongjie; Chen, Haozhen; Cao, Lei; Yuan, Jianqiang.

Front Plant Sci ; 15: 1406593, 2024.

Article in English | MEDLINE | ID: mdl-39109070

ABSTRACT

Color-changing melons are a kind of cucurbit plant with both ornamental and food uses. With the aim of increasing the efficiency of harvesting Color-changing melon fruits while reducing the deployment cost of detection models on agricultural equipment, this study presents an improved YOLOv8s network approach that uses model pruning and knowledge distillation techniques. The method first merges Dilated Wise Residual (DWR) and Dilated Reparam Block (DRB) to reconstruct the C2f module in the Backbone for better feature fusion. Next, we designed a multilevel scale fusion feature pyramid network (HS-PAN) to enrich semantic information and strengthen localization information, enhancing the detection of Color-changing melon fruits with different maturity levels. Finally, we used Layer-Adaptive Sparsity Pruning and Block-Correlation Knowledge Distillation to simplify the model and recover its accuracy. On the Color-changing melon image dataset, the mAP0.5 of the improved model reaches 96.1%, the detection speed is 9.1% faster than YOLOv8s, the number of Params is reduced from 6.47M to 1.14M, and the number of computed FLOPs is reduced from 22.8GFLOPs to 7.5GFLOPs. The model's size has also decreased from 12.64MB to 2.47MB, and the improved YOLOv8 performs significantly better than other lightweight networks. The experimental results verify the effectiveness of the proposed method in complex scenarios, which provides a reference basis and technical support for the subsequent automatic picking of Color-changing melons.
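The reported before-and-after figures can be turned into relative reductions with a short calculation. The sketch below uses only the numbers quoted in the abstract; the helper name `percent_reduction` and the dictionary labels are illustrative.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before

# (before, after) pairs as quoted in the abstract.
reported = {
    "parameters (M)": (6.47, 1.14),
    "FLOPs (G)": (22.8, 7.5),
    "model size (MB)": (12.64, 2.47),
}

for name, (before, after) in reported.items():
    print(f"{name}: {percent_reduction(before, after):.1f}% smaller")
```

By this arithmetic the pruned and distilled model has roughly 82% fewer parameters, 67% fewer FLOPs, and an 80% smaller file than the YOLOv8s baseline.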

16.

Spatio-temporal evolution characteristics and driving mechanisms of waterlogging in urban agglomeration from multi-scale perspective: A case study of the Guangdong-Hong Kong-Macao Greater Bay Area, China.

Xu, Tao; Liu, Fan; Wan, Zixia; Zhang, Chunbo; Zhao, Yaolong.

J Environ Manage ; 368: 122109, 2024 Aug 09.

Article in English | MEDLINE | ID: mdl-39126843

ABSTRACT

Understanding the characteristics of waterlogging in urban agglomeration is essential for effective waterlogging prevention and management, as well as for promoting sustainable urban development. Previous studies have predominantly focused on the driving mechanisms of waterlogging in urban agglomeration at a single scale. However, urban agglomeration space has greater spatio-temporal heterogeneity, and it is often difficult to fully reveal such characteristics at a single scale. Consequently, this study explores the spatio-temporal evolution characteristics and underlying mechanisms of waterlogging incidents within urban agglomerations by adopting a multi-scale analytical approach. The results indicate that: (1) The waterlogging degree and high-density zones increase in the GBA, and the waterlogging points are spatially polycentric. However, the number of waterlogging points in Hong Kong is decreasing. (2) The influence of ISP and AI on waterlogging is dominant at all scales, followed by RE and Slope. ISP∩Slope and ISP∩RE are the key interactions for waterlogging. (3) The aggregation of waterlogging decreases with grid scale, and the influence of land cover factors on waterlogging increases with grid scale. Moreover, the findings at the grid scale outperformed those at the watershed scale, indicating that the grid scale is more conducive to the investigation of waterlogging in urban agglomerations. This research broadens our comprehension of the mechanisms behind waterlogging in urban agglomeration and provides references for policy decisions on waterlogging prevention and mitigation within urban agglomerations.

17.

Industry Image Classification Based on Stochastic Configuration Networks and Multi-Scale Feature Analysis.

Wang, Qinxia; Liu, Dandan; Tian, Hao; Qin, Yongpeng; Zhao, Difei.

Sensors (Basel) ; 24(15), 2024 Jul 24.

Article in English | MEDLINE | ID: mdl-39123845

ABSTRACT

For industrial image data, this paper proposes an image classification method based on stochastic configuration networks (SCNs) and multi-scale feature extraction. The multi-scale features are extracted from images of different scales using deep 2DSCN, and the hidden features of multiple layers are also concatenated to obtain more informative features. The integrated features are fed into SCNs to learn a classifier that improves the recognition rate for different categories. In the experiments, a handwritten digit database and an industrial hot-rolled steel strip database are used, and the comparison results demonstrate that the proposed method can effectively improve classification accuracy.

18.

Research into the Applications of a Multi-Scale Feature Fusion Model in the Recognition of Abnormal Human Behavior.

Li, Congcong; Li, Yifan; Wang, Bin; Zhang, Yuting.

Sensors (Basel) ; 24(15), 2024 Aug 05.

Article in English | MEDLINE | ID: mdl-39124111

ABSTRACT

As populations in modern society age, the accurate and timely identification of, and response to, sudden abnormal behaviors of the elderly has become an urgent and important issue. In current research on computer vision-based abnormal behavior recognition, most algorithms have shown poor generalization and recognition abilities in practical applications, and most are limited to recognizing single actions. To address these problems, an MSCS-DenseNet-LSTM model based on a multi-scale attention mechanism is proposed. This model integrates the MSCS (Multi-Scale Convolutional Structure) module into the initial convolutional layer of the DenseNet model to form a multi-scale convolution structure. It introduces the improved Inception X module into the Dense Block to form an Inception Dense structure, and gradually performs feature fusion through each Dense Block module. The CBAM attention mechanism module is added to the dual-layer LSTM to enhance the model's generalization ability while ensuring the accurate recognition of abnormal actions. Furthermore, to address the issue of single-action abnormal behavior datasets, an RGB image dataset (RIDS) and a contour image dataset (CIDS), both containing various abnormal behaviors, were constructed. The experimental results validate that the proposed MSCS-DenseNet-LSTM model achieved an accuracy, sensitivity, and specificity of 98.80%, 98.75%, and 98.82% on one of the two datasets, and of 98.30%, 98.28%, and 98.38% on the other.

Subjects

Algorithms; Neural Networks, Computer; Humans; Pattern Recognition, Automated/methods; Behavior/physiology; Image Processing, Computer-Assisted/methods

19.

Remote refocusing for multi-scale imaging.

Prince, Md Nasful Huda; Sain, Nikhil; Chakraborty, Tonmoy.

J Biomed Opt ; 29(8): 080501, 2024 Aug.

Article in English | MEDLINE | ID: mdl-39119134

ABSTRACT

Significance: The technique of remote focusing (RF) has attracted considerable attention among microscopists due to its ability to quickly adjust focus across different planes, thus facilitating quicker volumetric imaging. However, the difficulty in changing objectives to align with a matching objective in a remote setting while upholding key requirements remains a challenge. Aim: We aim to propose a customized yet straightforward technique to align multiple objectives with a remote objective, employing an identical set of optical elements to ensure meeting the criteria of remote focusing. Approach: We propose a simple optical approach for aligning multiple objectives with a singular remote objective to achieve a perfect imaging system. This method utilizes readily accessible, commercial optical components to meet the fundamental requirements of remote focusing. Results: Our experimental observations indicate that the proposed RF technique offers at least comparable, if not superior, performance over a significant axial depth compared with the conventional RF technique based on commercial lenses while offering the flexibility to switch the objective for multi-scale imaging. Conclusions: The proposed technique addresses various microscopy challenges, particularly in the realm of multi-resolution imaging. We have experimentally demonstrated the efficacy of this technique by capturing images of focal volumes generated by two distinct objectives in a water medium.

Subjects

Equipment Design; Image Processing, Computer-Assisted/methods; Microscopy/methods; Lenses

20.

Deep learning architecture with shunted transformer and 3D deformable convolution for voxel-level dose prediction of head and neck tumors.

Chen, Liting; Sun, Hongfei; Wang, Zhongfei; Zhang, Te; Zhang, Hailang; Wang, Wei; Sun, Xiaohuan; Duan, Jie; Gao, Yue; Zhao, Lina.

Phys Eng Sci Med ; 2024 Aug 05.

Article in English | MEDLINE | ID: mdl-39101991

ABSTRACT

Intensity-modulated radiation therapy (IMRT) has been widely used in treating head and neck tumors. However, due to the complex anatomical structures in the head and neck region, it is challenging for the plan optimizer to rapidly generate clinically acceptable IMRT treatment plans. A novel deep learning multi-scale Transformer (MST) model was developed in the current study, aiming to accelerate IMRT planning for head and neck tumors while generating a more precise prediction of the voxel-level dose distribution. The proposed end-to-end MST model employs the shunted Transformer to capture multi-scale features and learn a global dependency, and utilizes 3D deformable convolution bottleneck blocks to extract shape-aware features and compensate for the loss of spatial information in the patch merging layers. Moreover, data augmentation and self-knowledge distillation are used to further improve the prediction performance of the model. The MST model was trained and evaluated on the OpenKBP Challenge dataset. Its prediction accuracy was compared with three previous dose prediction models: C3D, TrDosePred, and TSNet. The predicted dose distributions of the proposed MST model in the tumor region are closest to the original clinical dose distribution. The MST model achieves a dose score of 2.23 Gy and a DVH score of 1.34 Gy on the test dataset, outperforming the other three models by 8%-17%. For clinically relevant DVH dosimetric metrics, the prediction accuracy in terms of mean absolute error (MAE) is 2.04% for D99, 1.54% for D95, 1.87% for D1, 1.87% for Dmean, and 1.89% for D0.1cc, superior to the other three models. The quantitative results demonstrate that the proposed MST model achieves more accurate voxel-level dose prediction than the previous models for head and neck tumors. The MST model has great potential to be applied to other disease sites to further improve the quality and efficiency of radiotherapy planning.
