Results 1 - 20 of 117
1.
Article in English | MEDLINE | ID: mdl-39422438

ABSTRACT

Post-stroke Dysarthria (PSD) is one of the common sequelae of stroke. PSD can harm patients' quality of life and, in severe cases, be life-threatening. Most existing methods use frequency-domain features to recognize pathological voice, which makes it hard to fully represent its characteristics. Although some results have been achieved, there is still a long way to go before practical application. Therefore, an improved deep learning-based model is proposed to classify pathological and normal voice, using a novel fusion feature (MSA) and an improved 1D ResNet hybridized with a bidirectional LSTM and dilated convolution (named 1D DRN-biLSTM). The experimental results show that the fusion features bring a greater improvement in pathological speech recognition than analyzing MFCC features alone and better capture the hidden features that characterize pathological speech. In terms of model structure, introducing dilated convolution and LSTM further improves the performance of the 1D ResNet compared with ordinary networks such as CNN and LSTM. The accuracy of this method reaches 82.41% at the syllable level and 100% at the speaker level. The scheme outperforms other existing methods in feature learning capability and recognition rate and can play an important role in the assessment and diagnosis of PSD in China.
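
A minimal sketch of the general idea of a 1D dilated residual block feeding a bidirectional LSTM, in the spirit of the 1D DRN-biLSTM described above; channel sizes, dilation rates, the input feature dimension, and the classifier head are assumptions, not the authors' exact architecture.

```python
# Illustrative sketch (PyTorch): 1D dilated residual blocks followed by a BiLSTM.
import torch
import torch.nn as nn

class DilatedResBlock1d(nn.Module):
    def __init__(self, channels, dilation):
        super().__init__()
        pad = dilation  # keeps sequence length for kernel_size=3
        self.conv1 = nn.Conv1d(channels, channels, 3, padding=pad, dilation=dilation)
        self.conv2 = nn.Conv1d(channels, channels, 3, padding=pad, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)  # residual connection

class DRNBiLSTM(nn.Module):
    def __init__(self, in_feats=39, channels=64, num_classes=2):
        super().__init__()
        self.stem = nn.Conv1d(in_feats, channels, 3, padding=1)
        self.blocks = nn.Sequential(*[DilatedResBlock1d(channels, d) for d in (1, 2, 4)])
        self.lstm = nn.LSTM(channels, 32, batch_first=True, bidirectional=True)
        self.head = nn.Linear(64, num_classes)

    def forward(self, x):                      # x: (batch, features, frames)
        h = self.blocks(self.stem(x))
        h, _ = self.lstm(h.transpose(1, 2))    # (batch, frames, 2*32)
        return self.head(h[:, -1])             # classify from the last time step

logits = DRNBiLSTM()(torch.randn(4, 39, 200))  # 4 utterances, 39-dim features, 200 frames
```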

2.
J Imaging Inform Med ; 2024 Oct 22.
Article in English | MEDLINE | ID: mdl-39436477

ABSTRACT

Skin cancer is one of the three most hazardous cancer types and is caused by the abnormal proliferation of tumor cells. Diagnosing skin cancer accurately and early is crucial for saving patients' lives. However, it is a challenging task due to several significant issues, including lesion variations in texture, shape, color, and size; artifacts (hairs); uneven lesion boundaries; and poor contrast. To solve these issues, this research proposes a novel Convolutional Swin Transformer (CSwinformer) method for accurately segmenting and classifying skin lesions. The framework involves data preprocessing, segmentation, and classification phases. In the first phase, Gaussian filtering, Z-score normalization, and augmentation are executed to remove unnecessary noise, re-organize the data, and increase data diversity. In the segmentation phase, we design a new model, "Swinformer-Net," integrating Swin Transformer and U-Net frameworks to accurately define the region of interest. In the final classification phase, the segmented outcome is input into the newly proposed module "Multi-Scale Dilated Convolutional Neural Network meets Transformer (MD-CNNFormer)," where the data samples are assigned to their respective classes. We use four benchmark datasets-HAM10000, ISBI 2016, PH2, and Skin Cancer ISIC-for evaluation. The results demonstrated the designed framework's better efficiency compared with traditional approaches, with a classification accuracy of 98.72%, a pixel accuracy of 98.06%, and a Dice coefficient of 97.67%. The proposed method offers a promising solution for skin lesion segmentation and classification, supporting clinicians in accurately diagnosing skin cancer.

3.
Plants (Basel) ; 13(18)2024 Sep 22.
Article in English | MEDLINE | ID: mdl-39339630

ABSTRACT

Plants play a vital role in numerous domains, including medicine, agriculture, and environmental balance. Furthermore, they contribute to the production of oxygen and the retention of carbon dioxide, both of which are necessary for living beings. Numerous researchers have investigated plant species classification: certain studies have focused on limited numbers of classes, while others have employed conventional machine-learning and deep-learning models. To address these limitations, this paper introduces a novel dual-stream neural architecture embedded with a soft-attention mechanism specifically developed for accurately classifying plant species. The proposed model utilizes residual and inception blocks enhanced with dilated convolutional layers to acquire both local and global information. Following feature extraction, both streams are combined, and a soft-attention technique is used to emphasize the most distinctive characteristics. The efficacy of the model is shown via extensive experimentation on varied datasets covering several plant species. Moreover, we contribute a novel dataset that comprises 48 classes of different plant species. The results demonstrate a higher level of performance compared with current models, emphasizing the capability of the dual-stream design to improve accuracy and model generalization. The integration of a dual-stream architecture, dilated convolutions, and soft attention provides a strong and reliable foundation for the botanical community, supporting advances in plant species classification.
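
A rough sketch of how two feature streams might be fused and re-weighted with a soft-attention mask, in the spirit of the dual-stream design above; the stream contents, channel counts, and the exact attention form are placeholders, not the authors' residual/inception blocks.

```python
# Illustrative sketch (PyTorch): concatenate two streams, then apply a soft spatial attention mask.
import torch
import torch.nn as nn

class SoftAttentionFusion(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Conv2d(channels, 1, kernel_size=1)  # per-pixel attention logit

    def forward(self, local_feats, global_feats):
        fused = torch.cat([local_feats, global_feats], dim=1)         # (B, C, H, W)
        weights = torch.softmax(self.attn(fused).flatten(2), dim=-1)  # soft attention over H*W
        weights = weights.view(fused.size(0), 1, *fused.shape[2:])
        return fused + fused * weights                                # emphasize salient regions

# toy usage: two 64-channel streams from the same image
local_s, global_s = torch.randn(2, 64, 28, 28), torch.randn(2, 64, 28, 28)
out = SoftAttentionFusion(channels=128)(local_s, global_s)            # (2, 128, 28, 28)
```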

4.
Sensors (Basel) ; 24(17)2024 Aug 23.
Article in English | MEDLINE | ID: mdl-39275364

ABSTRACT

Different types of rural settlement agglomerations have formed and mixed in space during the implementation of the rural revitalization strategy in China. Discriminating them from remote sensing images is of great significance for rural land planning and living-environment improvement. Currently, there is a lack of automatic methods for obtaining information on rural settlement differentiation. In this paper, an improved encoder-decoder network structure, ASCEND-UNet, was designed based on the original UNet. It was implemented to segment and classify dispersed and clustered rural settlement buildings from high-resolution satellite images. The ASCEND-UNet model incorporates three components: first, the atrous spatial pyramid pooling (ASPP) multi-scale feature fusion module is added to the encoder; second, the spatial and channel squeeze and excitation (scSE) block is embedded at the skip connections; third, the hybrid dilated convolution (HDC) block is utilized in the decoder. In the proposed framework, ASPP and HDC serve as multiple dilated convolution blocks that expand the receptive field by introducing a series of convolutions with different dilation rates. The scSE is an attention block that focuses on features in both the spatial and channel dimensions. A series of model comparisons and accuracy assessments against the original UNet, PSPNet, DeepLabV3+, and SegNet verified the effectiveness of the proposed model. Compared with the original UNet, ASCEND-UNet achieved improvements of 4.67%, 2.80%, 3.73%, and 6.28% in precision, recall, F1-score, and MIoU, respectively. The contributions of the HDC, ASPP, and scSE modules were examined in ablation experiments. The proposed model obtained more accurate and stable results by integrating multiple dilated convolution blocks with an attention mechanism, enriching the automatic methods for semantic segmentation of different rural settlements from remote sensing images.
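
A minimal sketch of an ASPP-style block as referenced above: parallel 3x3 convolutions with different dilation rates, concatenated and projected back. The rates and channel counts are assumptions, not the exact ASCEND-UNet configuration.

```python
# Illustrative sketch (PyTorch) of atrous spatial pyramid pooling (ASPP).
import torch
import torch.nn as nn

class ASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        # one 3x3 branch per dilation rate; padding=rate preserves spatial size
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates]
        )
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)  # fuse the scales

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

y = ASPP(256, 128)(torch.randn(1, 256, 32, 32))  # -> (1, 128, 32, 32)
```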

5.
Genome Biol ; 25(1): 243, 2024 Sep 16.
Article in English | MEDLINE | ID: mdl-39285451

ABSTRACT

The process of splicing messenger RNA to remove introns plays a central role in creating genes and gene variants. We describe Splam, a novel method for predicting splice junctions in DNA using deep residual convolutional neural networks. Unlike previous models, Splam looks at a 400-base-pair window flanking each splice site, reflecting the biological splicing process that relies primarily on signals within this window. Splam also trains on donor and acceptor pairs together, mirroring how the splicing machinery recognizes both ends of each intron. Compared to SpliceAI, Splam is consistently more accurate, achieving 96% accuracy in predicting human splice junctions.
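
A toy sketch of the kind of input/model pairing described above: one-hot encoding a 400-bp window around a candidate splice site and scoring it with a small residual dilated 1D CNN. This is an illustrative stand-in, not a reimplementation of Splam.

```python
# Illustrative sketch: one-hot DNA window -> residual dilated 1D CNN -> donor/acceptor scores.
import torch
import torch.nn as nn

def one_hot_dna(seq):
    idx = {"A": 0, "C": 1, "G": 2, "T": 3}
    x = torch.zeros(4, len(seq))
    for i, base in enumerate(seq.upper()):
        if base in idx:               # ambiguous bases (e.g. N) stay as all-zero columns
            x[idx[base], i] = 1.0
    return x

class SpliceScorer(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.stem = nn.Conv1d(4, channels, 3, padding=1)
        self.dil1 = nn.Conv1d(channels, channels, 3, padding=2, dilation=2)
        self.dil2 = nn.Conv1d(channels, channels, 3, padding=4, dilation=4)
        self.head = nn.Linear(channels, 2)      # donor / acceptor scores

    def forward(self, x):                       # x: (batch, 4, 400)
        h = torch.relu(self.stem(x))
        h = torch.relu(self.dil2(torch.relu(self.dil1(h))) + h)   # residual over dilated convs
        return torch.sigmoid(self.head(h.mean(dim=-1)))

window = one_hot_dna("ACGT" * 100).unsqueeze(0)  # a dummy 400-bp window
scores = SpliceScorer()(window)                  # (1, 2)
```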


Subjects
Deep Learning; RNA Splice Sites; RNA Splicing; Humans; Introns; Sequence Alignment; Neural Networks, Computer
6.
Front Neurorobot ; 18: 1436052, 2024.
Article in English | MEDLINE | ID: mdl-39220588

ABSTRACT

To address the problems of traditional image super-resolution reconstruction algorithms, such as a small receptive field, insufficient multi-scale feature extraction, and easy loss of image feature information, a super-resolution reconstruction algorithm based on a multi-scale dilated convolution network is proposed in this paper. First, the algorithm extracts features from the same input image through dilated convolution kernels with different receptive fields to obtain feature maps at different scales. Then, a residual attention dense block further extracts features of the original low-resolution image; local residual connections are added to fuse multi-scale feature information across channels, while residual nested networks and skip connections are used to speed up the convergence of the deep network and avoid network degradation. Finally, the features extracted by the deep network are fused with the input features to increase the nonlinear expressive ability of the network and enhance the super-resolution reconstruction effect. Experimental results show that, compared with the Bicubic, SRCNN, ESPCN, VDSR, DRCN, LapSRN, MemNet, and DSRNet algorithms on the Set5, Set14, BSDS100, and Urban100 test sets, the proposed algorithm improves peak signal-to-noise ratio and structural similarity, and the reconstructed images have a better visual effect.

7.
Neural Netw ; 179: 106568, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39089152

ABSTRACT

Dilated convolution has been widely used in various computer vision tasks due to its ability to expand the receptive field while maintaining the resolution of feature maps. However, a critical challenge is the gridding problem caused by the isomorphic structure of dilated convolution, where the holes in the dilated kernel break the integrity of the extracted information and cut off the relevance of neighboring pixels. In this work, a novel heterogeneous dilated convolution, called HDConv, is proposed to address this issue by setting independent dilation rates on grouped channels while keeping the general convolution operation. The heterogeneous structure effectively avoids the gridding problem while introducing multi-scale kernels into the filters. Based on the heterogeneous structure of the proposed HDConv, we also explore the benefit of large receptive fields for feature extraction by comparing different combinations of dilation rates. Finally, a series of experiments is conducted to verify its effectiveness on computer vision tasks such as image segmentation and object detection. The results show that the proposed HDConv achieves competitive performance on ADE20K, Cityscapes, COCO-Stuff10k, COCO, and the medical image dataset UESTC-COVID-19. The proposed module can readily replace conventional convolutions in existing convolutional neural networks (i.e., it is plug-and-play), and it promises to extend dilated convolution to wider scenarios in image segmentation.
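
A minimal sketch of a heterogeneous dilated convolution in the spirit of HDConv: the input channels are split into groups and each group is convolved with its own dilation rate, so a single layer mixes several receptive-field sizes. Group sizes and rates here are assumptions.

```python
# Illustrative sketch (PyTorch): per-group dilation rates within one convolution layer.
import torch
import torch.nn as nn

class HeteroDilatedConv(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 2, 3, 5)):
        super().__init__()
        assert in_ch % len(rates) == 0 and out_ch % len(rates) == 0
        gi, go = in_ch // len(rates), out_ch // len(rates)
        self.group_convs = nn.ModuleList(
            [nn.Conv2d(gi, go, 3, padding=r, dilation=r) for r in rates]
        )

    def forward(self, x):
        chunks = torch.chunk(x, len(self.group_convs), dim=1)  # split channels into groups
        return torch.cat([conv(c) for conv, c in zip(self.group_convs, chunks)], dim=1)

y = HeteroDilatedConv(64, 64)(torch.randn(1, 64, 56, 56))  # -> (1, 64, 56, 56)
```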


Assuntos
Redes Neurais de Computação , Humanos , Processamento de Imagem Assistida por Computador/métodos , Algoritmos , COVID-19 , Aprendizado Profundo
8.
Sensors (Basel) ; 24(15)2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39124111

ABSTRACT

Due to the accelerating aging of populations in modern society, the accurate and timely identification of, and response to, sudden abnormal behaviors of the elderly have become an urgent and important issue. In current research on computer vision-based abnormal behavior recognition, most algorithms show poor generalization and recognition ability in practical applications, as well as issues with recognizing single actions. To address these problems, an MSCS-DenseNet-LSTM model based on a multi-scale attention mechanism is proposed. The model integrates the MSCS (Multi-Scale Convolutional Structure) module into the initial convolutional layer of DenseNet to form a multi-scale convolution structure. It introduces an improved Inception X module into the Dense Block to form an Inception-Dense structure and gradually performs feature fusion through each Dense Block module. A CBAM attention module is added to the dual-layer LSTM to enhance the model's generalization ability while ensuring accurate recognition of abnormal actions. Furthermore, to address the issue of single-action abnormal behavior datasets, the RGB image dataset RIDS and the contour image dataset CIDS, each containing various abnormal behaviors, were constructed. The experimental results show that the proposed MSCS-DenseNet-LSTM model achieved an accuracy, sensitivity, and specificity of 98.80%, 98.75%, and 98.82% and of 98.30%, 98.28%, and 98.38% on the two datasets, respectively.


Assuntos
Algoritmos , Redes Neurais de Computação , Humanos , Reconhecimento Automatizado de Padrão/métodos , Comportamento/fisiologia , Processamento de Imagem Assistida por Computador/métodos
9.
Sensors (Basel) ; 24(11)2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38894398

ABSTRACT

Image denoising, which removes additive noise introduced by imaging sensors, is regarded as an ill-posed problem in computer vision. Recently, several convolutional neural network-based image-denoising methods have achieved remarkable advances. However, it is difficult for a simple denoising network to recover aesthetically pleasing images owing to the complexity of image content. Therefore, this study proposes a multi-branch network to improve denoising performance. First, the proposed network is designed on the basis of a conventional autoencoder to learn multi-level contextual features from input images. Subsequently, we integrate two modules into the network, the Pyramid Context Module (PCM) and the Residual Bottleneck Attention Module (RBAM), to extract salient information during training. More specifically, PCM is applied at the beginning of the network to enlarge the receptive field and address the loss of global information using dilated convolution. Meanwhile, RBAM is inserted into the middle of the encoder and decoder to eliminate degraded features and reduce undesired artifacts. Finally, extensive experimental results prove the superiority of the proposed method over state-of-the-art deep-learning methods in terms of both objective and subjective performance.

10.
Biomed Tech (Berl) ; 69(5): 465-480, 2024 Oct 28.
Article in English | MEDLINE | ID: mdl-38712825

ABSTRACT

Subcortical brain structure segmentation plays an important role in neuroimaging diagnosis and has become the basis of computer-aided diagnosis. Due to the blurred boundaries and complex shapes of subcortical brain structures, labeling these structures by hand is a time-consuming and subjective task, greatly limiting their potential for clinical applications. Thus, this paper proposes the sparsification transformer (STF) module for accurate brain structure segmentation. The self-attention mechanism is used to establish global dependencies and efficiently extract the global information of the feature map with low computational complexity. A shallow network is also used to compensate for low-level detail information through the locality of convolutional operations, promoting the representation capability of the network. In addition, a hybrid residual dilated convolution (HRDC) module is introduced at the bottom layer of the network to extend the receptive field and extract multi-scale contextual information. Meanwhile, the octave convolution edge feature extraction (OCT) module is applied at the skip connections of the network to pay more attention to the edge features of brain structures. The proposed network is trained with a hybrid loss function. The experimental evaluation on two public datasets, IBSR and MALC, shows outstanding performance in terms of objective and subjective quality.
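
A small helper illustrating why stacking dilated 3x3 convolutions, as in an HRDC-style block, extends the receptive field quickly: with stride 1, each layer adds (kernel_size - 1) * dilation pixels. The dilation rates below are illustrative only, not the paper's configuration.

```python
# Receptive-field growth of stacked stride-1 dilated convolutions.
def receptive_field(kernel_size, dilations):
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

print(receptive_field(3, [1, 2, 5]))   # 17 pixels for three stacked dilated 3x3 convs
print(receptive_field(3, [1, 1, 1]))   # 7 pixels for three plain 3x3 convs
```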


Assuntos
Encéfalo , Redes Neurais de Computação , Humanos , Encéfalo/diagnóstico por imagem , Algoritmos , Neuroimagem/métodos , Processamento de Imagem Assistida por Computador/métodos
11.
Sensors (Basel) ; 24(9)2024 May 02.
Article in English | MEDLINE | ID: mdl-38733020

ABSTRACT

To address the various challenges in aluminum surface defect detection, such as multiscale intricacies, sensitivity to lighting variations, occlusion, and noise, this study proposes the AluDef-ClassNet model. Firstly, a Gaussian difference pyramid is utilized to capture multiscale image features. Secondly, a self-attention mechanism is introduced to enhance feature representation. Additionally, an improved residual network structure incorporating dilated convolutions is adopted to increase the receptive field, thereby enhancing the network's ability to learn from broader contextual information. A small-scale dataset of high-quality aluminum surface defect images is acquired using a CCD camera. To better tackle the challenges in surface defect detection, advanced deep learning techniques and data augmentation strategies are employed. To address the difficulty of data labeling, a transfer learning approach based on fine-tuning is utilized, leveraging prior knowledge to enhance the efficiency and accuracy of model training. In dataset testing, the model achieved a classification accuracy of 97.6%, demonstrating significant advantages over other classification models.
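
A minimal sketch of a difference-of-Gaussians pyramid for multiscale features, as mentioned above for aluminum surface images; the sigma values are assumptions.

```python
# Illustrative sketch: band-pass layers from differences of successively blurred images.
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_pyramid(image, sigmas=(1.0, 2.0, 4.0, 8.0)):
    """Return difference-of-Gaussians layers between successive blur levels."""
    blurred = [gaussian_filter(image.astype(np.float32), s) for s in sigmas]
    return [blurred[i] - blurred[i + 1] for i in range(len(blurred) - 1)]

layers = dog_pyramid(np.random.rand(256, 256))   # 3 band-pass layers from a dummy image
```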

12.
Med Eng Phys ; 126: 104138, 2024 04.
Article in English | MEDLINE | ID: mdl-38621836

ABSTRACT

Lung cancer is one of the deadliest diseases in the world, and its detection can save patients' lives. Although Computed Tomography (CT) is a leading imaging tool in the medical sector, clinicians find it challenging to interpret CT scan data and detect cancer from it. Positron Emission Tomography (PET) imaging is one of the most effective modalities for diagnosing certain malignancies such as lung tumours. Many diagnosis models have been developed to diagnose various diseases, and early lung cancer identification is very important for predicting the severity level of lung cancer in patients. To this end, an image fusion-based detection model is proposed for lung cancer detection using a deep learning model with an improved heuristic algorithm. Firstly, PET and CT images are gathered from the internet. These two collected images are then fused for further processing using the Adaptive Dilated Convolution Neural Network (AD-CNN), in which the hyperparameters are tuned by the Modified Initial Velocity-based Capuchin Search Algorithm (MIV-CapSA). Subsequently, the abnormal regions are segmented with TransUnet3+. Finally, the segmented images are fed into the Hybrid Attention-based Deep Networks (HADN) model, which encompasses MobileNet and ShuffleNet. The effectiveness of the novel detection model is analyzed using various metrics and compared with traditional approaches. The results show that it aids early detection, helping to treat patients effectively.


Assuntos
Aprendizado Profundo , Neoplasias Pulmonares , Humanos , Neoplasias Pulmonares/diagnóstico por imagem , Heurística , Tomografia Computadorizada por Raios X , Tomografia por Emissão de Pósitrons , Algoritmos
13.
Anal Biochem ; 690: 115491, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38460901

ABSTRACT

Bioactive peptides can hinder oxidative processes and microbial spoilage in foodstuffs and play important roles in treating diverse diseases and disorders. While most methods focus on single-function bioactive peptides and have obtained promising prediction performance, it remains a significant challenge to accurately detect complex and diverse functions simultaneously given the rapid increase of multi-functional bioactive peptides. In contrast to previous research on multi-functional bioactive peptide prediction based solely on sequence, we propose a novel multimodal dual-branch (MMDB) lightweight deep learning model that designs two different branches to effectively capture the complementary information of peptide sequence and structural properties. Specifically, a multi-scale dilated convolution with Bi-LSTM branch is presented to effectively model the sequence properties of peptides at different scales, while a multi-layer convolution branch is proposed to capture structural information. To the best of our knowledge, this is the first effective extraction of peptide sequence features using multi-scale dilated convolution without a parameter increase. Multimodal features from both branches are integrated via a fully connected layer for multi-label classification. Compared to state-of-the-art methods, the MMDB model exhibits competitive results across metrics, with a 9.1% increase in Coverage and 5.3% and 3.5% improvements in Precision and Accuracy, respectively.
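
A sketch of one way to obtain multi-scale dilated convolution "without a parameter increase": reuse a single 1D kernel at several dilation rates via the functional API. This is only our reading of the idea, not the authors' MMDB code; the input encoding and sizes are assumptions.

```python
# Illustrative sketch (PyTorch): one shared kernel applied at multiple dilation rates.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedMultiScaleConv1d(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 2, 4)):
        super().__init__()
        self.rates = rates
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, 3) * 0.02)  # one shared kernel

    def forward(self, x):                       # x: (batch, in_ch, length)
        outs = [F.conv1d(x, self.weight, padding=r, dilation=r) for r in self.rates]
        return torch.cat(outs, dim=1)           # scales stacked along the channel axis

feats = SharedMultiScaleConv1d(21, 32)(torch.randn(8, 21, 50))  # 8 peptides, 21-dim encoding, length 50
```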

14.
Sci Rep ; 14(1): 5087, 2024 03 01.
Article in English | MEDLINE | ID: mdl-38429300

ABSTRACT

When EEG signals are collected conventionally at the Nyquist rate, long-duration recordings produce a large amount of data. At the same time, limited bandwidth, end-to-end delay, and memory space put great pressure on the effective transmission of these data. Compressed sensing alleviates this transmission pressure. However, iterative compressed sensing reconstruction algorithms for EEG signals involve complex computation and slow data processing, limiting the application of compressed sensing in rapid EEG monitoring systems. This paper therefore presents a non-iterative, fast algorithm for reconstructing EEG signals using compressed sensing and deep learning techniques. The algorithm uses an improved residual network model, extracts the feature information of the EEG signal by one-dimensional dilated convolution, and directly learns the nonlinear mapping between the measurements and the original signal, allowing the EEG signal to be reconstructed quickly and accurately. The proposed method has been verified by simulation on an open BCI competition dataset. Overall, it achieves higher reconstruction accuracy and faster reconstruction speed than traditional CS reconstruction algorithms and existing deep learning reconstruction algorithms, and it enables rapid reconstruction of EEG signals.
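
A toy sketch of the non-iterative idea described above: an EEG segment is compressed by a random measurement matrix, and a small network (here one linear layer followed by 1D dilated convolutions) learns to map measurements straight back to the signal. The sizes, measurement ratio, and network depth are assumptions, not the paper's configuration.

```python
# Illustrative sketch (PyTorch): compressed measurement -> direct learned reconstruction.
import torch
import torch.nn as nn

N, M = 256, 64                        # signal length and number of measurements (ratio 0.25)
phi = torch.randn(M, N) / M ** 0.5    # fixed random sensing matrix

class CSRecon(nn.Module):
    def __init__(self):
        super().__init__()
        self.expand = nn.Linear(M, N)                       # initial estimate from measurements
        self.refine = nn.Sequential(
            nn.Conv1d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 16, 3, padding=2, dilation=2), nn.ReLU(),
            nn.Conv1d(16, 1, 3, padding=4, dilation=4),
        )

    def forward(self, y):                                   # y: (batch, M)
        x0 = self.expand(y).unsqueeze(1)                    # (batch, 1, N)
        return (x0 + self.refine(x0)).squeeze(1)            # residual refinement

x = torch.randn(4, N)                 # four EEG segments
y = x @ phi.T                         # compressed measurements
x_hat = CSRecon()(y)                  # reconstructed signals, shape (4, N)
```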


Assuntos
Compressão de Dados , Aprendizado Profundo , Processamento de Sinais Assistido por Computador , Compressão de Dados/métodos , Algoritmos , Eletroencefalografia/métodos
15.
Heliyon ; 10(5): e26589, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38468917

ABSTRACT

Roads are closely intertwined with human existence, and extracting road networks has become one of the most prominent tasks in remote sensing (RS). Automated road interpretation of remote sensing images (RSI) acquires road network data at a reduced expense compared with traditional visual interpretation. However, road networks differ greatly in length, width, material, and shape across areas, so their appearance in RSI varies widely, and extracting road network data from RSI remains a complex problem. In recent times, DL-based approaches have achieved notable progress in image segmentation, but many of them still cannot retain boundary details or attain high-resolution road segmentation maps when processing RSI. Traditional convolutional neural networks (CNNs) demonstrate impressive performance in road extraction tasks; however, they frequently encounter difficulties in capturing intricate details and contextual information. This study introduces a novel method, the Archimedes Optimisation Algorithm with Quantum Dilated Convolutional Neural Network for Road Extraction (AOA-QDCNNRE), to tackle these challenges in remote sensing images. The AOA-QDCNNRE technique aims to generate a high-resolution road segmentation map using DL with a hyperparameter tuning process. The technique primarily relies on the QDCNN model, which integrates quantum computing (QC) concepts with dilated convolutions to augment the network's capacity to capture both local and global contextual information. In addition, the dilated convolutions effectively enlarge the receptive field without sacrificing spatial resolution, enabling the extraction of precise road features. To improve the road extraction outcomes of the QDCNN approach, AOA-based hyperparameter tuning is exploited. The AOA-QDCNNRE system was evaluated on benchmark databases, and the results indicate that it surpasses recent algorithms.

16.
Sci Prog ; 107(1): 368504241231161, 2024.
Article in English | MEDLINE | ID: mdl-38400510

ABSTRACT

In modern urban traffic systems, intersection monitoring systems are used to monitor traffic flows and track vehicles by recognizing license plates. However, intersection monitors often produce motion-blurred images because of the rapid movement of cars. If a deep learning network is used for image deblurring, the blur can be removed first so that complete vehicle information can be obtained, improving the recognition rate. To restore a dynamically blurred image to a sharp image, this paper proposes a multi-scale modified U-Net deblurring network using dilated convolution and employs a variable-scaling iterative strategy to make the scheme more adaptable to real blurred images. The multi-scale architecture uses scale changes to learn the characteristics of images at different scales, and dilated convolution enlarges the receptive field and obtains more feature information without increasing the computational cost. Experimental results are obtained using a synthetic motion-blurred image dataset and a real blurred image dataset for comparison with existing deblurring methods. The results demonstrate that the proposed image deblurring method performs well on real motion-blurred images.

17.
Neural Netw ; 172: 106141, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38301340

ABSTRACT

Multi-view deep neural networks have shown excellent performance on 3D shape classification tasks. However, global features aggregated from multi-view data often lack content information and spatial relationships, which makes it difficult to identify the small variations among subcategories within the same category. To solve this problem, a novel multiscale dilated convolution neural network, termed MSDCNN, is proposed in this paper for multi-view fine-grained 3D shape classification. Firstly, a sequence of views is rendered from 12 viewpoints around the input 3D shape by the sequential view-capturing module. Then, the first 22 convolutional layers of ResNeXt50 are employed to extract the semantic features of each view, and a global mixed feature map is obtained through the element-wise maximum operation over the 12 output feature maps. Furthermore, an attention dilated module (ADM), which combines four concatenated attention dilated blocks (ADBs), is designed to extract larger receptive-field features from the global mixed feature map and enhance the context information among views. Specifically, each ADB consists of an attention mechanism module and a dilated convolution with a different dilation rate. In addition, a prediction module with label smoothing is proposed to classify features; it contains a 3 × 3 convolution and adaptive average pooling. The performance of the method is validated experimentally on the ModelNet10, ModelNet40, and FG3D datasets. Experimental results demonstrate the effectiveness and superiority of the proposed MSDCNN framework for fine-grained 3D shape classification.
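
A rough sketch of the view-fusion step described above: per-view feature maps are merged by an element-wise maximum and then passed through a dilated convolution gated by a simple channel attention. The shapes and the attention form are assumptions, not the exact ADB design.

```python
# Illustrative sketch (PyTorch): max-fusion over views, then an attention + dilated conv block.
import torch
import torch.nn as nn

class AttentionDilatedBlock(nn.Module):
    def __init__(self, channels, dilation):
        super().__init__()
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.conv = nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation)

    def forward(self, x):
        return self.conv(x * self.gate(x))      # channel attention, then dilated conv

views = torch.randn(2, 12, 256, 14, 14)         # (batch, 12 rendered views, C, H, W)
mixed, _ = views.max(dim=1)                     # element-wise maximum across the views
out = AttentionDilatedBlock(256, dilation=2)(mixed)   # (2, 256, 14, 14)
```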


Assuntos
Redes Neurais de Computação , Semântica
18.
Neural Netw ; 171: 466-473, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38150872

ABSTRACT

DNA molecules commonly exhibit wide interactions between the nucleobases. Modeling the interactions is important for obtaining accurate sequence-based inference. Although many deep learning methods have recently been developed for modeling DNA sequences, they still suffer from two major issues: 1) most existing methods can handle only short DNA fragments and fail to capture long-range information; 2) current methods always require massive supervised labels, which are hard to obtain in practice. We propose a new method to address both issues. Our neural network employs circular dilated convolutions as building blocks in the backbone. As a result, our network can take long DNA sequences as input without any condensation. We also incorporate the neural network into a self-supervised learning framework to capture inherent information in DNA without expensive supervised labeling. We have tested our model in two DNA inference tasks, the human variant effect and the open chromatin region of plants, where the experimental results show that our method outperforms five other deep learning models. Our code is available at https://github.com/wiedersehne/cdilDNA.
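
A minimal sketch of a circular dilated 1D convolution: circular padding lets the receptive field wrap around the sequence ends, so very long inputs can be covered with few layers. This mirrors the building block described above only in spirit; the channel sizes and dilation rate are assumptions.

```python
# Illustrative sketch (PyTorch): dilated 1D convolution with circular padding on a long sequence.
import torch
import torch.nn as nn

circ_dilated = nn.Conv1d(4, 32, kernel_size=3, padding=4, dilation=4,
                         padding_mode="circular")

dna = torch.randn(1, 4, 10_000)     # one-hot-like encoding of a 10 kb sequence
feats = circ_dilated(dna)           # (1, 32, 10000), length preserved by the padding
```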


Assuntos
DNA , Redes Neurais de Computação , Humanos , Sequência de Bases , DNA/genética , Aprendizado de Máquina Supervisionado
19.
Math Biosci Eng ; 20(11): 20135-20154, 2023 Nov 03.
Article in English | MEDLINE | ID: mdl-38052640

ABSTRACT

Accurate segmentation of infected regions in lung computed tomography (CT) images is essential for the detection and diagnosis of coronavirus disease 2019 (COVID-19). However, lung lesion segmentation has some challenges, such as obscure boundaries, low contrast and scattered infection areas. In this paper, the dilated multiresidual boundary guidance network (Dmbg-Net) is proposed for COVID-19 infection segmentation in CT images of the lungs. This method focuses on semantic relationship modelling and boundary detail guidance. First, to effectively minimize the loss of significant features, a dilated residual block is substituted for a convolutional operation, and dilated convolutions are employed to expand the receptive field of the convolution kernel. Second, an edge-attention guidance preservation block is designed to incorporate boundary guidance of low-level features into feature integration, which is conducive to extracting the boundaries of the region of interest. Third, the various depths of features are used to generate the final prediction, and the utilization of a progressive multi-scale supervision strategy facilitates enhanced representations and highly accurate saliency maps. The proposed method is used to analyze COVID-19 datasets, and the experimental results reveal that the proposed method has a Dice similarity coefficient of 85.6% and a sensitivity of 84.2%. Extensive experimental results and ablation studies have shown the effectiveness of Dmbg-Net. Therefore, the proposed method has a potential application in the detection, labeling and segmentation of other lesion areas.


Assuntos
COVID-19 , Humanos , COVID-19/epidemiologia , Algoritmos , Semântica , Tomografia Computadorizada por Raios X , Processamento de Imagem Assistida por Computador
20.
Sensors (Basel) ; 23(22)2023 Nov 16.
Article in English | MEDLINE | ID: mdl-38005604

ABSTRACT

Monocular panoramic depth estimation has various applications in robotics and autonomous driving due to its ability to perceive the entire field of view. However, panoramic depth estimation faces two significant challenges: global context capture and distortion awareness. In this paper, we propose a new framework for panoramic depth estimation that can simultaneously address panoramic distortion and extract global context information, thereby improving performance. Specifically, we introduce an attention mechanism into the multi-scale dilated convolution and adaptively adjust the receptive field size between different spatial positions, designing an adaptive attention dilated convolution module that effectively perceives distortion. At the same time, we design a global scene understanding module to integrate global context information into the feature maps generated by the feature extractor. Finally, we trained and evaluated our model on three benchmark datasets, which contain virtual and real-world RGB-D panoramas. The experimental results show that the proposed method achieves competitive performance, comparable to existing techniques in both quantitative and qualitative evaluations. Furthermore, our method has fewer parameters and more flexibility, making it a scalable solution for mobile AR.
