1.
Neuroscience ; 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39265802

ABSTRACT

Auditory spatial attention detection (ASAD) aims to decipher the spatial locus of a listener's selective auditory attention from electroencephalogram (EEG) signals. However, current models may exhibit deficiencies in EEG feature extraction, leading to overfitting on small datasets or a decline in EEG discriminability. Furthermore, they often neglect topological relationships between EEG channels and, consequently, brain connectivities. Although graph-based EEG modeling has been employed in ASAD, effectively incorporating both local and global connectivities remains a great challenge. To address these limitations, we propose a new ASAD model. First, time-frequency feature fusion provides a more precise and discriminative EEG representation. Second, EEG segments are treated as graphs, and the graph convolution and global attention mechanism are leveraged to capture local and global brain connections, respectively. A series of experiments are conducted in a leave-trials-out cross-validation manner. On the MAD-EEG and KUL datasets, the accuracies of the proposed model are more than 9% and 3% higher than those of the corresponding state-of-the-art models, respectively, while the accuracy of the proposed model on the SNHL dataset is roughly comparable to that of the state-of-the-art model. EEG time-frequency feature fusion proves to be indispensable in the proposed model. EEG electrodes over the frontal cortex are most important for ASAD tasks, followed by those over the temporal lobe. Additionally, the proposed model performs well even on small datasets. This study contributes to a deeper understanding of the neural encoding related to human hearing and attention, with potential applications in neuro-steered hearing devices.

2.
Sensors (Basel) ; 24(13)2024 Jul 04.
Article in English | MEDLINE | ID: mdl-39001124

ABSTRACT

The integration of visual algorithms with infrared imaging technology has become an effective tool for industrial gas leak detection. However, existing research has mostly focused on simple scenarios where a gas plume is clearly visible, with limited studies on detecting gas in complex scenes where target contours are blurred and contrast is low. This paper uses a cooled mid-wave infrared (MWIR) system to provide high-sensitivity, fast-response imaging and proposes the MWIRGas-YOLO network for detecting gas leaks in mid-wave infrared imaging. This network effectively detects low-contrast gas leakage and segments the gas plume within the scene. MWIRGas-YOLO uses the global attention mechanism (GAM) to focus on gas plume targets during feature fusion, adds a small target detection layer to enhance information on small-sized targets, and employs transfer learning of similar features from visible light smoke to provide the model with prior knowledge of infrared gas features. On gas leak images collected with a cooled mid-wave infrared imager, the experimental results show that the proposed algorithm significantly improves on the performance of the original model. The segmentation mean average precision reached 96.1% (mAP50) and 47.6% (mAP50:95), outperforming the other mainstream algorithms. This can provide an effective reference for research on infrared imaging for gas leak detection.

3.
Comput Biol Med ; 173: 108369, 2024 May.
Article in English | MEDLINE | ID: mdl-38552283

ABSTRACT

BACKGROUND: Glomerular lesions reflect the onset and progression of renal disease. Pathological diagnoses are widely regarded as the definitive method for recognizing these lesions, as the deviations in histopathological structures closely correlate with impairments in renal function. METHODS: Deep learning plays a crucial role in streamlining the laborious, challenging, and subjective task of recognizing glomerular lesions by pathologists. However, the current methods treat pathology images as data in regular Euclidean space, limiting their ability to efficiently represent the complex local features and global connections. In response to this challenge, this paper proposes a graph neural network (GNN) that utilizes global attention pooling (GAP) to more effectively extract high-level semantic features from glomerular images. The model incorporates Bayesian collaborative learning (BCL), enhancing node feature fine-tuning and fusion during training. In addition, this paper adds a soft classification head to mitigate the semantic ambiguity associated with a purely hard classification. RESULTS: This paper conducted extensive experiments on four glomerular datasets, comprising a total of 491 whole slide images (WSIs) and 9030 images. The results demonstrate that the proposed model achieves impressive F1 scores of 81.37%, 90.12%, 87.72%, and 98.68% on four private datasets for glomerular lesion recognition. These scores surpass the performance of the other models used for comparison. Furthermore, this paper employed a publicly available BReAst Carcinoma Subtyping (BRACS) dataset with an 85.61% F1 score to further prove the superiority of the proposed model. CONCLUSION: The proposed model not only facilitates precise recognition of glomerular lesions but also serves as a potent tool for diagnosing kidney diseases effectively. Furthermore, the framework and training methodology of the GNN can be adeptly applied to address various pathology image classification challenges.


Subjects
Interdisciplinary Placement; Kidney Diseases; Humans; Bayes Theorem; Kidney Diseases/diagnostic imaging; Kidney Glomerulus/diagnostic imaging; Neural Networks, Computer

4.
Foods ; 13(6)2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38540915

ABSTRACT

As a traditional delicacy in China, preserved eggs inevitably experience instances of substandard quality during the production process. Chinese preserved egg production facilities can only rely on experienced workers to select the preserved eggs. However, the manual selection of preserved eggs presents challenges such as a low efficiency, subjective judgments, high costs, and hindered industrial production processes. In response to these challenges, this study procured the transmitted imagery of preserved eggs and refined the ConvNeXt network across four pivotal dimensions: the dimensionality reduction of model feature maps, the integration of multi-scale feature fusion (MSFF), the incorporation of a global attention mechanism (GAM) module, and the amalgamation of the cross-entropy loss function with focal loss. The resultant refined model, ConvNeXt_PEgg, attained proficiency in classifying and grading preserved eggs. Notably, the improved model achieved a classification accuracy of 92.6% across the five categories of preserved eggs, with a grading accuracy of 95.9% spanning three levels. Moreover, in contrast to its predecessor, the refined model witnessed a 24.5% reduction in the parameter volume, alongside a 3.2 percentage point augmentation in the classification accuracy and a 2.8 percentage point boost in the grading accuracy. Through meticulous comparative analysis, each enhancement exhibited varying degrees of performance elevation. Evidently, the refined model outshone a plethora of classical models, underscoring its efficacy in discerning the internal quality of preserved eggs. With its potential for real-world implementation, this technology portends to heighten the economic viability of manufacturing facilities.
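
The abstract mentions combining the cross-entropy loss with focal loss but does not say how the two terms are mixed. The following is a minimal sketch for a single example, assuming a simple weighted sum; the mixing weight `lam` and the focusing parameter `gamma` are hypothetical values chosen for illustration, not the settings used for ConvNeXt_PEgg.

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the true class for one example."""
    return -math.log(probs[target])


def focal_loss(probs, target, gamma=2.0):
    """Focal loss: scales cross-entropy by (1 - p_t)^gamma so that
    examples the model already classifies well contribute less."""
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def combined_loss(probs, target, lam=0.5, gamma=2.0):
    """Assumed combination: a convex mix of the two losses (lam is hypothetical)."""
    ce = cross_entropy(probs, target)
    fl = focal_loss(probs, target, gamma)
    return (1.0 - lam) * ce + lam * fl


# Predicted class probabilities for two examples whose true class is index 1.
easy = [0.05, 0.90, 0.05]  # confident and correct
hard = [0.50, 0.30, 0.20]  # true class gets low probability

print(combined_loss(easy, 1), combined_loss(hard, 1))
```

With `gamma=0` the focal term reduces to plain cross-entropy, which is a quick sanity check on any implementation.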

5.
Heliyon ; 10(6): e27364, 2024 Mar 30.
Article in English | MEDLINE | ID: mdl-38510021

ABSTRACT

The promoter is a key DNA sequence whose primary function is to control the timing of transcription initiation and the level of gene expression. Accurate identification of promoters is essential for understanding gene expression. Traditional sequencing techniques for identifying promoters are costly and time-consuming. Therefore, the development of computational methods to identify promoters has become critical. Since deep learning methods show great potential in identifying promoters, this study proposes a new promoter prediction model, called iPro2L-DG. The iPro2L-DG predictor is based on an improved Densely Connected Convolutional Network (DenseNet) and a Global Attention Mechanism (GAM). The promoter sequences are encoded by combining C2 encoding and nucleotide chemical property (NCP) encoding. An improved DenseNet extracts advanced feature information from the combined feature encoding. GAM evaluates the importance of the advanced feature information along the channel and spatial dimensions, and a Full Connect Neural Network (FNN) then derives the prediction probabilities. The experimental results showed that the accuracy of iPro2L-DG in the first layer (promoter identification) was 94.10%, with a Matthews correlation coefficient of 0.8833. In the second layer (promoter strength prediction), the accuracy was 89.42%, with a Matthews correlation coefficient of 0.7915. The iPro2L-DG predictor significantly outperforms other existing predictors in promoter identification and promoter strength prediction. The source code of the iPro2L-DG model can be found at https://github.com/leirufeng/iPro2L-DG.
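
The abstract reports accuracy together with the Matthews correlation coefficient (MCC). As a reminder of how the two metrics relate, here is a minimal sketch that computes both from a binary confusion matrix. The counts in the usage lines are made up for illustration and are not taken from the iPro2L-DG experiments.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for binary classification.
    Ranges from -1 (total disagreement) to +1 (perfect prediction);
    returns 0.0 when the denominator is zero, a common convention."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion-matrix counts, for illustration only.
print(accuracy(tp=90, tn=85, fp=15, fn=10))
print(mcc(tp=90, tn=85, fp=15, fn=10))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the positive and negative classes are imbalanced.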

6.
Sensors (Basel) ; 24(2)2024 Jan 18.
Article in English | MEDLINE | ID: mdl-38257709

ABSTRACT

In recent years, there has been significant growth in the ubiquity and popularity of three-dimensional (3D) point clouds, with an increasing focus on the classification of 3D point clouds. To extract richer features from point clouds, many researchers have turned their attention to various point set regions and channels within irregular point clouds. However, this approach has limited capability in attending to crucial regions of interest in 3D point clouds and may overlook valuable information from neighboring features during feature aggregation. Therefore, this paper proposes a novel 3D point cloud classification method based on global attention and adaptive graph convolution (Att-AdaptNet). The method consists of two main branches: the first branch computes attention masks for each point, while the second branch employs adaptive graph convolution to extract global features from the point set. It dynamically learns features based on point interactions, generating adaptive kernels to effectively and precisely capture diverse relationships among points from different semantic parts. Experimental results demonstrate that the proposed model achieves 93.8% in overall accuracy and 90.8% in average accuracy on the ModelNet40 dataset.

7.
Neural Netw ; 171: 104-113, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38091754

ABSTRACT

Network pruning has attracted increasing attention recently for its capability of transferring large-scale neural networks (e.g., CNNs) onto resource-constrained devices. Such a transfer is typically achieved by removing redundant network parameters, in a static or dynamic manner, while retaining generalization performance. Concretely, static pruning usually maintains a larger, fit-to-all (samples) compressed network by removing the same channels for all samples, which cannot fully exploit the redundancy in the given network. In contrast, dynamic pruning can adaptively remove (more) different channels for different samples and obtain state-of-the-art performance along with a higher compression ratio. However, since the system has to preserve the complete network information for sample-specific pruning, dynamic pruning methods are usually not memory-efficient. In this paper, our interest is to explore a static alternative, dubbed GlobalPru, from a different perspective by respecting the differences among data. Specifically, a novel channel attention-based learn-to-rank framework is proposed to learn a global ranking of channels with respect to network redundancy. In this method, each sample-wise (local) channel attention is forced to reach an agreement on the global ranking among different data. Hence, all samples can empirically share the same ranking of channels, and the pruning can be performed statically in practice. Extensive experiments on ImageNet, SVHN, and CIFAR-10/100 demonstrate that the proposed GlobalPru outperforms state-of-the-art static and dynamic pruning methods by significant margins.


Subjects
Data Compression; Generalization, Psychological; Learning; Neural Networks, Computer

8.
Cogn Emot ; : 1-16, 2023 Nov 28.
Article in English | MEDLINE | ID: mdl-38014823

ABSTRACT

It has been claimed that a broad attentional breadth buffers the impact of negative stimuli on human perception and cognition. Here we identify issues with the research on which this claim is based, and then rigorously test the claim. To induce narrow versus broad attentional breadth, participants attended to the local versus global elements of Navon stimuli, and to investigate the impact of emotionally salient stimuli on performance, we measured the effect of task-irrelevant stimuli of varying emotional salience (negative, neutral, or positive) on task performance. Across a series of experiments, we found that the Navon stimuli were effective in inducing different attentional breadths, and that both negative and positive task-irrelevant stimuli slowed responses relative to neutral stimuli, but that the magnitude of this emotion-induced slowing was invariant to whether attentional breadth was broad or narrow. This indicates that a broad attentional breadth did not buffer against the effect of either negative or positive emotionally salient stimuli. These results challenge the claim that broadening attentional breadth protects against the impact of emotionally salient stimuli.

9.
Entropy (Basel) ; 25(7)2023 Jul 05.
Article in English | MEDLINE | ID: mdl-37509971

ABSTRACT

Head pose estimation is an important technology for analyzing human behavior and has been widely researched and applied in areas such as human-computer interaction and fatigue detection. However, traditional head pose estimation networks suffer from the problem of easily losing spatial structure information, particularly in complex scenarios where occlusions and multiple object detections are common, resulting in low accuracy. To address the above issues, we propose a head pose estimation model based on the residual network and capsule network. Firstly, a deep residual network is used to extract features from three stages, capturing spatial structure information at different levels, and a global attention block is employed to enhance the spatial weight of feature extraction. To effectively avoid the loss of spatial structure information, the features are encoded and transmitted to the output using an improved capsule network, which is enhanced in its generalization ability through self-attention routing mechanisms. To enhance the robustness of the model, we optimize Huber loss, which is first used in head pose estimation. Finally, experiments are conducted on three popular public datasets, 300W-LP, AFLW2000, and BIWI. The results demonstrate that the proposed method achieves state-of-the-art results, particularly in scenarios with occlusions.
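
The abstract names the Huber loss as the robustness measure but gives no formula. The sketch below shows the standard definition for a single regression error, for example one pose angle in degrees; the threshold `delta=1.0` is an assumed default, not a value reported by the authors.

```python
def huber(error, delta=1.0):
    """Standard Huber loss for one residual.
    Quadratic for |error| <= delta, linear beyond it, so large
    outlier errors are penalized less harshly than with squared error."""
    a = abs(error)
    if a <= delta:
        return 0.5 * error * error
    return delta * (a - 0.5 * delta)


# Small errors behave like 0.5 * e^2; large errors grow only linearly.
for e in (0.5, 3.0, -3.0):
    print(e, huber(e))
```

The linear tail is what makes the loss less sensitive to occasional badly labelled or heavily occluded samples than a purely squared error would be.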

10.
Magn Reson Med ; 90(5): 1919-1931, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37382206

ABSTRACT

PURPOSE: Although recent convolutional neural network (CNN) methodologies have shown promising results in fast MR imaging, there is still a desire to explore how they can be used to learn the frequency characteristics of multicontrast images and reconstruct texture details. METHODS: A global attention-enabled texture enhancement network (GATE-Net) with a frequency-dependent feature extraction module (FDFEM) and a convolution-based global attention module (GAM) is proposed to address the highly undersampled MR image reconstruction problem. First, FDFEM enables GATE-Net to effectively extract high-frequency features from shareable information of multicontrast images to improve the texture details of reconstructed images. Second, GAM, with lower computational complexity, has a receptive field covering the entire image, which allows it to fully explore useful shareable information of multicontrast images and suppress less beneficial shareable information. RESULTS: Ablation studies are conducted to evaluate the effectiveness of the proposed FDFEM and GAM. Experimental results under various acceleration rates and datasets consistently demonstrate the superiority of GATE-Net in terms of peak signal-to-noise ratio, structural similarity, and normalized mean square error. CONCLUSION: A global attention-enabled texture enhancement network is proposed. It can be applied to multicontrast MR image reconstruction tasks with different acceleration rates and datasets and achieves superior performance in comparison with state-of-the-art methods.


Subjects
Magnetic Resonance Imaging; Neural Networks, Computer; Magnetic Resonance Imaging/methods; Image Processing, Computer-Assisted/methods; Brain/diagnostic imaging; Signal-To-Noise Ratio

11.
Plants (Basel) ; 12(8)2023 Apr 10.
Article in English | MEDLINE | ID: mdl-37111819

ABSTRACT

Rice lodging seriously affects rice quality and production. Traditional manual methods of detecting rice lodging are labour-intensive and can result in delayed action, leading to production loss. With the development of the Internet of Things (IoT), unmanned aerial vehicles (UAVs) provide imminent assistance for crop stress monitoring. In this paper, we proposed a novel lightweight detection system with UAVs for rice lodging. We leverage UAVs to acquire the distribution of rice growth, and then our proposed global attention network (GloAN) utilizes the acquisition to detect the lodging areas efficiently and accurately. Our methods aim to accelerate the processing of diagnosis and reduce production loss caused by lodging. The experimental results show that our GloAN can lead to a significant increase in accuracy with negligible computational costs. We further tested the generalization ability of our GloAN and the results show that the GloAN generalizes well in peers' models (Xception, VGG, ResNet, and MobileNetV2) with knowledge distillation and obtains the optimal mean intersection over union (mIoU) of 92.85%. The experimental results show the flexibility of GloAN in rice lodging detection.
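
The headline figure above is a mean intersection over union (mIoU). For readers unfamiliar with the metric, here is a minimal sketch computed over flattened label maps. The two-class toy labels (0 = normal, 1 = lodged) are hypothetical and unrelated to the GloAN experiments.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes.
    pred and target are equal-length sequences of integer class labels
    (one per pixel). Classes absent from both are skipped."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 4-pixel example: class 0 IoU = 1/2, class 1 IoU = 2/3.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
print(mean_iou(pred, target, num_classes=2))
```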

12.
Entropy (Basel) ; 25(3)2023 Feb 26.
Article in English | MEDLINE | ID: mdl-36981310

ABSTRACT

Monocular depth estimation techniques are used to recover the distance from the target to the camera plane in an image scene. However, there are still several problems, such as insufficient estimation accuracy, the inaccurate localization of details, and depth discontinuity in planes parallel to the camera plane. To solve these problems, we propose the Global Feature Interaction Network (GFI-Net), which aims to utilize geometric features, such as object locations and vanishing points, on a global scale. In order to capture the interactive information of the width, height, and channel of the feature graph and expand the global information in the network, we designed a global interactive attention mechanism. The global interactive attention mechanism reduces the loss of pixel information and improves the performance of depth estimation. Furthermore, the encoder uses the Transformer to reduce coding losses and improve the accuracy of depth estimation. Finally, a local-global feature fusion module is designed to improve the depth map's representation of detailed areas. The experimental results on the NYU-Depth-v2 dataset and the KITTI dataset showed that our model achieved state-of-the-art performance with full detail recovery and depth continuation on the same plane.

13.
Entropy (Basel) ; 25(2)2023 Feb 19.
Article in English | MEDLINE | ID: mdl-36832747

ABSTRACT

Advanced object detection methods always face high algorithmic complexity or low accuracy when used in pedestrian target detection for the autonomous driving system. This paper proposes a lightweight pedestrian detection approach called the YOLOv5s-G2 network to address these issues. We apply Ghost and GhostC3 modules in the YOLOv5s-G2 network to minimize computational cost during feature extraction while keeping the network's capability of extracting features intact. The YOLOv5s-G2 network improves feature extraction accuracy by incorporating the Global Attention Mechanism (GAM) module, which extracts information relevant to pedestrian identification and suppresses irrelevant information. Missed detections of occluded and small targets are reduced by replacing the GIoU loss function used in bounding box regression with the α-CIoU loss function. The YOLOv5s-G2 network is evaluated on the WiderPerson dataset to ensure its efficacy. Our proposed YOLOv5s-G2 network offers a 1.0% increase in detection accuracy and a 13.2% decrease in floating point operations (FLOPs) compared to the existing YOLOv5s network. As a result, the YOLOv5s-G2 network is preferable for pedestrian identification as it is both more lightweight and more accurate.

14.
Med Biol Eng Comput ; 61(3): 847-865, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36624356

ABSTRACT

Traumatic brain injury (TBI) engenders traumatic necrosis and penumbra, areas of secondary neural injury which are crucial targets for therapeutic interventions. Manually segmenting areas of ongoing change such as necrosis, edema, hematoma, and inflammation is tedious, error-prone, and biased. Using multi-parametric MR data from a rodent model study, we demonstrate the effectiveness of an end-to-end deep learning global-attention-based UNet (GA-UNet) framework for automatic segmentation and quantification of TBI lesions. Longitudinal MR scans (2 h, 1, 3, 7, 14, 30, and 60 days) were performed on eight Sprague-Dawley rats after controlled cortical injury. Segmentation of the TBI lesion and its sub-regions was performed using 3D-UNet and GA-UNet. Dice statistics (DSI) and Hausdorff distance were calculated to assess the performance. Data augmentation based on MR scan variations (bias, noise, blur, ghosting) was performed to develop a robust model. Training/validation median DSI for U-Net was 0.9368 with T2w and MPRAGE inputs, whereas GA-UNet had 0.9537 for the same. Testing accuracies were higher for GA-UNet than U-Net, with a DSI of 0.8232 for the T2w-MPRAGE inputs. Longitudinally, necrosis remained constant while oligemia and penumbra decreased, and edema appeared around day 3 and increased with time. GA-UNet shows promise for multi-contrast MR image-based segmentation/quantification of TBI in large cohort studies.
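
The two evaluation metrics named above, the Dice statistic and the Hausdorff distance, can be sketched in a few lines. This is an illustrative stdlib-only version operating on flattened binary masks and small point sets; it is not the evaluation code used in the study, and the toy inputs are made up.

```python
import math


def dice(a, b):
    """Dice similarity between two equal-length binary masks.
    Returns 1.0 when both masks are empty (a common convention)."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    return 1.0 if total == 0 else 2.0 * inter / total


def hausdorff(p, q):
    """Symmetric Hausdorff distance between two non-empty point sets:
    the largest distance from any point in one set to its nearest
    neighbour in the other set."""
    def directed(u, v):
        return max(min(math.dist(a, b) for b in v) for a in u)
    return max(directed(p, q), directed(q, p))


# Toy masks: one overlapping pixel out of 2 + 1 foreground pixels.
print(dice([1, 1, 0, 0], [1, 0, 0, 0]))
# Toy boundary points: the point (4, 0) is 3 units from its nearest neighbour.
print(hausdorff([(0, 0), (1, 0)], [(0, 0), (4, 0)]))
```

Dice rewards overall overlap, while the Hausdorff distance is driven by the single worst boundary error, which is why the two are often reported together.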


Subjects
Brain Injuries, Traumatic; Deep Learning; Rats; Animals; Rats, Sprague-Dawley; Magnetic Resonance Imaging; Cohort Studies; Brain Injuries, Traumatic/diagnostic imaging; Image Processing, Computer-Assisted

15.
Sensors (Basel) ; 23(2)2023 Jan 09.
Article in English | MEDLINE | ID: mdl-36679538

ABSTRACT

Sentiment analysis aims to mine polarity features in the text, which can empower intelligent terminals to recognize opinions and further enhance interaction capabilities with customers. Considerable progress has been made using recurrent neural networks or pre-trained models to learn semantic representations. However, recently published models with complex structures require increasing computational resources to reach state-of-the-art (SOTA) performance. It is still a significant challenge to deploy these models to run on micro-intelligent terminals with limited computing power and memory. This paper proposes a lightweight and efficient framework based on hybrid multi-grained embedding on sentiment analysis (MC-GGRU). The gated recurrent unit model is designed to incorporate a global attention structure that allows contextual representations to be learned from unstructured text using word tokens. In addition, a multi-grained feature layer can further enrich sentence representation features with implicit semantics from characters. Through hybrid multi-grained representation, MC-GGRU achieves high inference performance with a shallow structure. The experimental results of five public datasets show that our method achieves SOTA for sentiment classification with a trade-off between accuracy and speed.


Subjects
Semantics; Sentiment Analysis; Language; Neural Networks, Computer; Machine Learning

16.
Int J Comput Assist Radiol Surg ; 17(10): 1903-1913, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35680692

ABSTRACT

PURPOSE: Automatic image segmentation of surgical instruments is a fundamental task in robot-assisted minimally invasive surgery, which greatly improves the context awareness of surgeons during the operation. A novel method based on Mask R-CNN is proposed in this paper to realize accurate instance segmentation of surgical instruments. METHODS: A novel feature extraction backbone is built, which could extract both local features through the convolutional neural network branch and global representations through the Swin-Transformer branch. Moreover, skip fusions are applied in the backbone to fuse both features and improve the generalization ability of the network. RESULTS: The proposed method is evaluated on the dataset of MICCAI 2017 EndoVis Challenge with three segmentation tasks and shows state-of-the-art performance with an mIoU of 0.5873 in type segmentation and 0.7408 in part segmentation. Furthermore, the results of ablation studies prove that the proposed novel backbone contributes to at least 17% improvement in mIoU. CONCLUSION: The promising results demonstrate that our method can effectively extract global representations as well as local features in the segmentation of surgical instruments and improve the accuracy of segmentation. With the proposed novel backbone, the network can segment the contours of surgical instruments' end tips more precisely. This method can provide more accurate data for localization and pose estimation of surgical instruments, and make a further contribution to the automation of robot-assisted minimally invasive surgery.


Subjects
Image Processing, Computer-Assisted; Neural Networks, Computer; Automation; Endoscopy; Humans; Image Processing, Computer-Assisted/methods; Surgical Instruments

17.
Patterns (N Y) ; 3(5): 100491, 2022 May 13.
Article in English | MEDLINE | ID: mdl-35607621

ABSTRACT

Machine-learning-based materials property prediction models have emerged as a promising approach for new materials discovery, among which the graph neural networks (GNNs) have shown the best performance due to their capability to learn high-level features from crystal structures. However, existing GNN models suffer from their lack of scalability, high hyperparameter tuning complexity, and constrained performance due to over-smoothing. We propose a scalable global graph attention neural network model DeeperGATGNN with differentiable group normalization (DGN) and skip connections for high-performance materials property prediction. Our systematic benchmark studies show that our model achieves the state-of-the-art prediction results on five out of six datasets, outperforming five existing GNN models by up to 10%. Our model is also the most scalable one in terms of graph convolution layers, which allows us to train very deep networks (e.g., >30 layers) without significant performance degradation. Our implementation is available at https://github.com/usccolumbia/deeperGATGNN.

18.
Comput Biol Med ; 136: 104761, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34426168

ABSTRACT

In this paper, we propose a novel M-SegNet architecture with global attention for the segmentation of brain magnetic resonance imaging (MRI). The proposed architecture consists of a multiscale deep network at the encoder side, deep supervision at the decoder side, a global attention mechanism, different sizes of convolutional kernels, and combined-connections with skip connections and pooling indices. The multiscale side input layers were used to support deep layers for extracting the discriminative information and the upsampling layer at the decoder side provided deep supervision, which reduced the gradient problem. The global attention mechanism is utilized to capture rich contextual information in the decoder stage by integrating local features with their respective global dependencies. In addition, multiscale convolutional kernels of different sizes were used to extract abundant semantic features from brain MRI scans in the encoder and decoder modules. Moreover, combined-connections were used to pass features from the encoder to the decoder path to recover the spatial information lost during downsampling and makes the model converge faster. Furthermore, we adopted uniform non-overlapping input patches to focus on fine details for the segmentation of brain MRI. We verified the proposed architecture on publicly accessible datasets for the task of segmentation of brain MRI. The experimental results show that the proposed model outperforms conventional methods by achieving an average Dice similarity coefficient score of 0.96.


Subjects
Image Processing, Computer-Assisted; Neural Networks, Computer; Algorithms; Brain/diagnostic imaging; Magnetic Resonance Imaging

19.
Med Phys ; 48(9): 5004-5016, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34224147

ABSTRACT

PURPOSE: Accurate segmentation of complex tumors in lung computed tomography (CT) images is essential to improve the effectiveness and safety of lung cancer treatment. However, the characteristics of heterogeneity, blurred boundaries, and large-area adhesion to tissues with similar gray-scale features always make the segmentation of complex tumors difficult. METHODS: This study proposes an effective deep network for the automatic segmentation of complex lung tumors (CLT-Net). The network architecture uses an encoder-decoder model that combines long and short skip connections and a global attention unit to identify target regions using multiscale semantic information. A boundary-aware loss function integrating Tversky loss and boundary loss based on the level-set calculation is designed to improve the network's ability to perceive boundary positions of difficult-to-segment (DTS) tumors. We use a dynamic weighting strategy to balance the contributions of the two parts of the loss function. RESULTS: The proposed method was verified on a dataset consisting of 502 lung CT images containing DTS tumors. The experiments show that the Dice similarity coefficient and Hausdorff distance metric of the proposed method are improved by 13.2% and 8.5% on average, respectively, compared with state-of-the-art segmentation models. Furthermore, we selected three additional medical image datasets with different modalities to evaluate the proposed model. Compared with mainstream architectures, the Dice similarity coefficient is also improved to a certain extent, which demonstrates the effectiveness of our method for segmenting medical images. CONCLUSIONS: Quantitative and qualitative results show that our method outperforms current mainstream lung tumor segmentation networks in terms of Dice similarity coefficient and Hausdorff distance. Note that the proposed method is not limited to the segmentation of complex lung tumors; it also performs well on medical image segmentation in other modalities.
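
Of the two loss terms named in the abstract, the Tversky loss is simple enough to sketch directly. The version below works on flattened soft predictions; the weights `alpha` and `beta` are assumed illustrative values, and the level-set boundary term and the dynamic weighting strategy of CLT-Net are deliberately omitted because the abstract does not specify them.

```python
def tversky_loss(pred, target, alpha=0.3, beta=0.7, eps=1e-7):
    """Tversky loss = 1 - TP / (TP + alpha * FP + beta * FN).
    pred holds foreground probabilities in [0, 1]; target holds 0/1 labels.
    alpha and beta are hypothetical weights: beta > alpha penalizes
    missed foreground (false negatives) more than false positives."""
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)


# One false positive versus one false negative on a toy 4-pixel mask.
loss_fp = tversky_loss([1, 1, 0, 0], [1, 0, 0, 0])
loss_fn = tversky_loss([1, 0, 0, 0], [1, 1, 0, 0])
print(loss_fp, loss_fn)
```

Setting `alpha = beta = 0.5` recovers the Dice loss, so the Tversky loss can be read as a Dice loss with an adjustable trade-off between the two error types.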


Subjects
Image Processing, Computer-Assisted; Lung Neoplasms; Humans; Lung Neoplasms/diagnostic imaging; Tomography, X-Ray Computed

20.
Sensors (Basel) ; 21(10)2021 May 12.
Article in English | MEDLINE | ID: mdl-34066042

ABSTRACT

In this paper, we propose a multi-scale feature extraction with novel attention-based convolutional learning using the U-SegNet architecture to achieve segmentation of brain tissue from a magnetic resonance image (MRI). Although convolutional neural networks (CNNs) show enormous growth in medical image segmentation, there are some drawbacks with the conventional CNN models. In particular, the conventional use of encoder-decoder approaches leads to the extraction of similar low-level features multiple times, causing redundant use of information. Moreover, due to inefficient modeling of long-range dependencies, each semantic class is likely to be associated with non-accurate discriminative feature representations, resulting in low accuracy of segmentation. The proposed global attention module refines the feature extraction and improves the representational power of the convolutional neural network. Moreover, the attention-based multi-scale fusion strategy can integrate local features with their corresponding global dependencies. The integration of fire modules in both the encoder and decoder paths can significantly reduce the computational complexity owing to fewer model parameters. The proposed method was evaluated on publicly accessible datasets for brain tissue segmentation. The experimental results show that our proposed model achieves segmentation accuracies of 94.81% for cerebrospinal fluid (CSF), 95.54% for gray matter (GM), and 96.33% for white matter (WM) with a noticeably reduced number of learnable parameters. Our study shows better segmentation performance, improving the prediction accuracy by 2.5% in terms of dice similarity index while achieving a 4.5 times reduction in the number of learnable parameters compared to previously developed U-SegNet based segmentation approaches. This demonstrates that the proposed approach can achieve reliable and precise automatic segmentation of brain MRI images.


Subjects
Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Brain/diagnostic imaging; Neural Networks, Computer; Semantics