Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 93
Filtrar
1.
Brief Bioinform ; 25(2)2024 Jan 22.
Artigo em Inglês | MEDLINE | ID: mdl-38446739

RESUMO

Antimicrobial peptides (AMPs), short peptides with diverse functions, effectively target and combat various organisms. The widespread misuse of chemical antibiotics has led to increasing microbial resistance. Due to their low drug resistance and toxicity, AMPs are considered promising substitutes for traditional antibiotics. While existing deep learning technology enhances AMP generation, it also presents certain challenges. Firstly, AMP generation overlooks the complex interdependencies among amino acids. Secondly, current models fail to integrate crucial tasks like screening, attribute prediction and iterative optimization. Consequently, we develop a integrated deep learning framework, Diff-AMP, that automates AMP generation, identification, attribute prediction and iterative optimization. We innovatively integrate kinetic diffusion and attention mechanisms into the reinforcement learning framework for efficient AMP generation. Additionally, our prediction module incorporates pre-training and transfer learning strategies for precise AMP identification and screening. We employ a convolutional neural network for multi-attribute prediction and a reinforcement learning-based iterative optimization strategy to produce diverse AMPs. This framework automates molecule generation, screening, attribute prediction and optimization, thereby advancing AMP research. We have also deployed Diff-AMP on a web server, with code, data and server details available in the Data Availability section.


Assuntos
Aminoácidos , Peptídeos Antimicrobianos , Antibacterianos , Difusão , Cinética
2.
Brief Bioinform ; 25(3)2024 Mar 27.
Artigo em Inglês | MEDLINE | ID: mdl-38701417

RESUMO

Transcription factors (TFs) are proteins essential for regulating genetic transcriptions by binding to transcription factor binding sites (TFBSs) in DNA sequences. Accurate predictions of TFBSs can contribute to the design and construction of metabolic regulatory systems based on TFs. Although various deep-learning algorithms have been developed for predicting TFBSs, the prediction performance needs to be improved. This paper proposes a bidirectional encoder representations from transformers (BERT)-based model, called BERT-TFBS, to predict TFBSs solely based on DNA sequences. The model consists of a pre-trained BERT module (DNABERT-2), a convolutional neural network (CNN) module, a convolutional block attention module (CBAM) and an output module. The BERT-TFBS model utilizes the pre-trained DNABERT-2 module to acquire the complex long-term dependencies in DNA sequences through a transfer learning approach, and applies the CNN module and the CBAM to extract high-order local features. The proposed model is trained and tested based on 165 ENCODE ChIP-seq datasets. We conducted experiments with model variants, cross-cell-line validations and comparisons with other models. The experimental results demonstrate the effectiveness and generalization capability of BERT-TFBS in predicting TFBSs, and they show that the proposed model outperforms other deep-learning models. The source code for BERT-TFBS is available at https://github.com/ZX1998-12/BERT-TFBS.


Assuntos
Redes Neurais de Computação , Fatores de Transcrição , Fatores de Transcrição/metabolismo , Fatores de Transcrição/genética , Sítios de Ligação , Algoritmos , Biologia Computacional/métodos , Humanos , Aprendizado Profundo , Ligação Proteica
3.
BMC Bioinformatics ; 25(1): 250, 2024 Jul 30.
Artigo em Inglês | MEDLINE | ID: mdl-39080535

RESUMO

BACKGROUND: The potential benefits of drug combination synergy in cancer medicine are significant, yet the risks must be carefully managed due to the possibility of increased toxicity. Although artificial intelligence applications have demonstrated notable success in predicting drug combination synergy, several key challenges persist: (1) Existing models often predict average synergy values across a restricted range of testing dosages, neglecting crucial dose amounts and the mechanisms of action of the drugs involved. (2) Many graph-based models rely on static protein-protein interactions, failing to adapt to dynamic and higher-order relationships. These limitations constrain the applicability of current methods. RESULTS: We introduce SAFER, a Sub-hypergraph Attention-based graph model, addressing these issues by incorporating complex relationships among biological knowledge networks and considering dosing effects on subject-specific networks. SAFER outperformed previous models on the benchmark and the independent test set. The analysis of subgraph attention weight for the lung cancer cell line highlighted JAK-STAT signaling pathway, PRDM12, ZNF781, and CDC5L that have been implicated in lung fibrosis. CONCLUSIONS: SAFER presents an interpretable framework designed to identify drug-responsive signals. Tailored for comprehending dose effects on subject-specific molecular contexts, our model uniquely captures dose-level drug combination responses. This capability unlocks previously inaccessible avenues of investigation compared to earlier models. Furthermore, the SAFER framework can be leveraged by future inquiries to investigate molecular networks that uniquely characterize individual patients and can be applied to prioritize personalized effective treatment based on safe dose combinations.


Assuntos
Redes Neurais de Computação , Humanos , Linhagem Celular Tumoral , Sinergismo Farmacológico , Neoplasias Pulmonares/tratamento farmacológico , Neoplasias Pulmonares/metabolismo , Relação Dose-Resposta a Droga , Transdução de Sinais/efeitos dos fármacos , Antineoplásicos/farmacologia
4.
Brief Bioinform ; 23(6)2022 11 19.
Artigo em Inglês | MEDLINE | ID: mdl-36242566

RESUMO

MOTIVATION: Discovering the drug-target interactions (DTIs) is a crucial step in drug development such as the identification of drug side effects and drug repositioning. Since identifying DTIs by web-biological experiments is time-consuming and costly, many computational-based approaches have been proposed and have become an efficient manner to infer the potential interactions. Although extensive effort is invested to solve this task, the prediction accuracy still needs to be improved. More especially, heterogeneous network-based approaches do not fully consider the complex structure and rich semantic information in these heterogeneous networks. Therefore, it is still a challenge to predict DTIs efficiently. RESULTS: In this study, we develop a novel method via Multiview heterogeneous information network embedding with Hierarchical Attention mechanisms to discover potential Drug-Target Interactions (MHADTI). Firstly, MHADTI constructs different similarity networks for drugs and targets by utilizing their multisource information. Combined with the known DTI network, three drug-target heterogeneous information networks (HINs) with different views are established. Secondly, MHADTI learns embeddings of drugs and targets from multiview HINs with hierarchical attention mechanisms, which include the node-level, semantic-level and graph-level attentions. Lastly, MHADTI employs the multilayer perceptron to predict DTIs with the learned deep feature representations. The hierarchical attention mechanisms could fully consider the importance of nodes, meta-paths and graphs in learning the feature representations of drugs and targets, which makes their embeddings more comprehensively. Extensive experimental results demonstrate that MHADTI performs better than other SOTA prediction models. Moreover, analysis of prediction results for some interested drugs and targets further indicates that MHADTI has advantages in discovering DTIs. AVAILABILITY AND IMPLEMENTATION: https://github.com/pxystudy/MHADTI.


Assuntos
Reposicionamento de Medicamentos , Redes Neurais de Computação , Interações Medicamentosas , Desenvolvimento de Medicamentos , Serviços de Informação
5.
Biomed Eng Online ; 23(1): 76, 2024 Jul 31.
Artigo em Inglês | MEDLINE | ID: mdl-39085884

RESUMO

BACKGROUND: Transcranial sonography (TCS) plays a crucial role in diagnosing Parkinson's disease. However, the intricate nature of TCS pathological features, the lack of consistent diagnostic criteria, and the dependence on physicians' expertise can hinder accurate diagnosis. Current TCS-based diagnostic methods, which rely on machine learning, often involve complex feature engineering and may struggle to capture deep image features. While deep learning offers advantages in image processing, it has not been tailored to address specific TCS and movement disorder considerations. Consequently, there is a scarcity of research on deep learning algorithms for TCS-based PD diagnosis. METHODS: This study introduces a deep learning residual network model, augmented with attention mechanisms and multi-scale feature extraction, termed AMSNet, to assist in accurate diagnosis. Initially, a multi-scale feature extraction module is implemented to robustly handle the irregular morphological features and significant area information present in TCS images. This module effectively mitigates the effects of artifacts and noise. When combined with a convolutional attention module, it enhances the model's ability to learn features of lesion areas. Subsequently, a residual network architecture, integrated with channel attention, is utilized to capture hierarchical and detailed textures within the images, further enhancing the model's feature representation capabilities. RESULTS: The study compiled TCS images and personal data from 1109 participants. Experiments conducted on this dataset demonstrated that AMSNet achieved remarkable classification accuracy (92.79%), precision (95.42%), and specificity (93.1%). It surpassed the performance of previously employed machine learning algorithms in this domain, as well as current general-purpose deep learning models. CONCLUSION: The AMSNet proposed in this study deviates from traditional machine learning approaches that necessitate intricate feature engineering. It is capable of automatically extracting and learning deep pathological features, and has the capacity to comprehend and articulate complex data. This underscores the substantial potential of deep learning methods in the application of TCS images for the diagnosis of movement disorders.


Assuntos
Aprendizado Profundo , Processamento de Imagem Assistida por Computador , Doença de Parkinson , Ultrassonografia Doppler Transcraniana , Humanos , Doença de Parkinson/diagnóstico por imagem , Processamento de Imagem Assistida por Computador/métodos , Ultrassonografia Doppler Transcraniana/métodos
6.
BMC Med Inform Decis Mak ; 24(1): 19, 2024 Jan 22.
Artigo em Inglês | MEDLINE | ID: mdl-38247009

RESUMO

BACKGROUND: In clinical medicine, fetal heart rate (FHR) monitoring using cardiotocography (CTG) is one of the most commonly used methods for assessing fetal acidosis. However, as the visual interpretation of CTG depends on the subjective judgment of the clinician, this has led to high inter-observer and intra-observer variability, making it necessary to introduce automated diagnostic techniques. METHODS: In this study, we propose a computer-aided diagnostic algorithm (Hybrid-FHR) for fetal acidosis to assist physicians in making objective decisions and taking timely interventions. Hybrid-FHR uses multi-modal features, including one-dimensional FHR signals and three types of expert features designed based on prior knowledge (morphological time domain, frequency domain, and nonlinear). To extract the spatiotemporal feature representation of one-dimensional FHR signals, we designed a multi-scale squeeze and excitation temporal convolutional network (SE-TCN) backbone model based on dilated causal convolution, which can effectively capture the long-term dependence of FHR signals by expanding the receptive field of each layer's convolution kernel while maintaining a relatively small parameter size. In addition, we proposed a cross-modal feature fusion (CMFF) method that uses multi-head attention mechanisms to explore the relationships between different modalities, obtaining more informative feature representations and improving diagnostic accuracy. RESULTS: Our ablation experiments show that the Hybrid-FHR outperforms traditional previous methods, with average accuracy, specificity, sensitivity, precision, and F1 score of 96.8, 97.5, 96, 97.5, and 96.7%, respectively. CONCLUSIONS: Our algorithm enables automated CTG analysis, assisting healthcare professionals in the early identification of fetal acidosis and the prompt implementation of interventions.


Assuntos
Acidose , Doenças Fetais , Feminino , Gravidez , Humanos , Acidose/diagnóstico , Algoritmos , Cardiotocografia , Tomada de Decisões , Inteligência Artificial
7.
Sensors (Basel) ; 24(1)2024 Jan 02.
Artigo em Inglês | MEDLINE | ID: mdl-38203134

RESUMO

In ocean remote sensing missions, recognizing an underwater acoustic target is a crucial technology for conducting marine biological surveys, ocean explorations, and other scientific activities that take place in water. The complex acoustic propagation characteristics present significant challenges for the recognition of underwater acoustic targets (UATR). Methods such as extracting the DEMON spectrum of a signal and inputting it into an artificial neural network for recognition, and fusing the multidimensional features of a signal for recognition, have been proposed. However, there is still room for improvement in terms of noise immunity, improved computational performance, and reduced reliance on specialized knowledge. In this article, we propose the Residual Attentional Convolutional Neural Network (RACNN), a convolutional neural network that quickly and accurately recognize the type of ship-radiated noise. This network is capable of extracting internal features of Mel Frequency Cepstral Coefficients (MFCC) of the underwater ship-radiated noise. Experimental results demonstrate that the proposed model achieves an overall accuracy of 99.34% on the ShipsEar dataset, surpassing conventional recognition methods and other deep learning models.

8.
Sensors (Basel) ; 24(8)2024 Apr 22.
Artigo em Inglês | MEDLINE | ID: mdl-38676273

RESUMO

Deep neural networks must address the dual challenge of delivering high-accuracy predictions and providing user-friendly explanations. While deep models are widely used in the field of time series modeling, deciphering the core principles that govern the models' outputs remains a significant challenge. This is crucial for fostering the development of trusted models and facilitating domain expert validation, thereby empowering users and domain experts to utilize them confidently in high-risk decision-making contexts (e.g., decision-support systems in healthcare). In this work, we put forward a deep prototype learning model that supports interpretable and manipulable modeling and classification of medical time series (i.e., ECG signal). Specifically, we first optimize the representation of single heartbeat data by employing a bidirectional long short-term memory and attention mechanism, and then construct prototypes during the training phase. The final classification outcomes (i.e., normal sinus rhythm, atrial fibrillation, and other rhythm) are determined by comparing the input with the obtained prototypes. Moreover, the proposed model presents a human-machine collaboration mechanism, allowing domain experts to refine the prototypes by integrating their expertise to further enhance the model's performance (contrary to the human-in-the-loop paradigm, where humans primarily act as supervisors or correctors, intervening when required, our approach focuses on a human-machine collaboration, wherein both parties engage as partners, enabling more fluid and integrated interactions). The experimental outcomes presented herein delineate that, within the realm of binary classification tasks-specifically distinguishing between normal sinus rhythm and atrial fibrillation-our proposed model, albeit registering marginally lower performance in comparison to certain established baseline models such as Convolutional Neural Networks (CNNs) and bidirectional long short-term memory with attention mechanisms (Bi-LSTMAttns), evidently surpasses other contemporary state-of-the-art prototype baseline models. Moreover, it demonstrates significantly enhanced performance relative to these prototype baseline models in the context of triple classification tasks, which encompass normal sinus rhythm, atrial fibrillation, and other rhythm classifications. The proposed model manifests a commendable prediction accuracy of 0.8414, coupled with macro precision, recall, and F1-score metrics of 0.8449, 0.8224, and 0.8235, respectively, achieving both high classification accuracy as well as good interpretability.


Assuntos
Eletrocardiografia , Redes Neurais de Computação , Humanos , Eletrocardiografia/métodos , Fibrilação Atrial/fisiopatologia , Fibrilação Atrial/diagnóstico , Aprendizado Profundo , Frequência Cardíaca/fisiologia , Algoritmos , Processamento de Sinais Assistido por Computador
9.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 41(3): 544-551, 2024 Jun 25.
Artigo em Chinês | MEDLINE | ID: mdl-38932541

RESUMO

Skin cancer is a significant public health issue, and computer-aided diagnosis technology can effectively alleviate this burden. Accurate identification of skin lesion types is crucial when employing computer-aided diagnosis. This study proposes a multi-level attention cascaded fusion model based on Swin-T and ConvNeXt. It employed hierarchical Swin-T and ConvNeXt to extract global and local features, respectively, and introduced residual channel attention and spatial attention modules for further feature extraction. Multi-level attention mechanisms were utilized to process multi-scale global and local features. To address the problem of shallow features being lost due to their distance from the classifier, a hierarchical inverted residual fusion module was proposed to dynamically adjust the extracted feature information. Balanced sampling strategies and focal loss were employed to tackle the issue of imbalanced categories of skin lesions. Experimental testing on the ISIC2018 and ISIC2019 datasets yielded accuracy, precision, recall, and F1-Score of 96.01%, 93.67%, 92.65%, and 93.11%, respectively, and 92.79%, 91.52%, 88.90%, and 90.15%, respectively. Compared to Swin-T, the proposed method achieved an accuracy improvement of 3.60% and 1.66%, and compared to ConvNeXt, it achieved an accuracy improvement of 2.87% and 3.45%. The experiments demonstrate that the proposed method accurately classifies skin lesion images, providing a new solution for skin cancer diagnosis.


Assuntos
Algoritmos , Diagnóstico por Computador , Neoplasias Cutâneas , Humanos , Neoplasias Cutâneas/patologia , Neoplasias Cutâneas/diagnóstico por imagem , Neoplasias Cutâneas/classificação , Diagnóstico por Computador/métodos , Pele/patologia , Interpretação de Imagem Assistida por Computador/métodos
10.
Sensors (Basel) ; 23(5)2023 Feb 21.
Artigo em Inglês | MEDLINE | ID: mdl-36904579

RESUMO

Speech enhancement tasks for audio with a low SNR are challenging. Existing speech enhancement methods are mainly designed for high SNR audio, and they usually use RNNs to model audio sequence features, which causes the model to be unable to learn long-distance dependencies, thus limiting its performance in low-SNR speech enhancement tasks. We design a complex transformer module with sparse attention to overcome this problem. Different from the traditional transformer model, this model is extended to effectively model complex domain sequences, using the sparse attention mask balance model's attention to long-distance and nearby relations, introducing the pre-layer positional embedding module to enhance the model's perception of position information, adding the channel attention module to enable the model to dynamically adjust the weight distribution between channels according to the input audio. The experimental results show that, in the low-SNR speech enhancement tests, our models have noticeable performance improvements in speech quality and intelligibility, respectively.


Assuntos
Percepção da Fala , Fala , Cognição , Aprendizagem
11.
Sensors (Basel) ; 23(3)2023 Jan 26.
Artigo em Inglês | MEDLINE | ID: mdl-36772423

RESUMO

As the monitor probes are used more and more widely these days, the task of detecting abnormal behaviors in surveillance videos has gained widespread attention. The generalization ability and parameter overhead of the model affect how accurate the detection result is. To deal with the poor generalization ability and high parameter overhead of the model in existing anomaly detection methods, we propose a three-dimensional multi-branch convolutional fusion network, named "Branch-Fusion Net". The network is designed with a multi-branch structure not only to significantly reduce parameter overhead but also to improve the generalization ability by understanding the input feature map from different perspectives. To ignore useless features during the model training, we propose a simple yet effective Channel Spatial Attention Module (CSAM), which sequentially focuses attention on key channels and spatial feature regions to suppress useless features and enhance important features. We combine the Branch-Fusion Net and the CSAM as a local feature extraction network and use the Bi-Directional Gated Recurrent Unit (Bi-GRU) to extract global feature information. The experiments are validated on a self-built Crimes-mini dataset, and the accuracy of anomaly detection in surveillance videos reaches 93.55% on the test set. The result shows that the model proposed in the paper significantly improves the accuracy of anomaly detection in surveillance videos with low parameter overhead.

12.
Sensors (Basel) ; 23(14)2023 Jul 15.
Artigo em Inglês | MEDLINE | ID: mdl-37514723

RESUMO

With the wide application of visual sensors and development of digital image processing technology, image copy-move forgery detection (CMFD) has become more and more prevalent. Copy-move forgery is copying one or several areas of an image and pasting them into another part of the same image, and CMFD is an efficient means to expose this. There are improper uses of forged images in industry, the military, and daily life. In this paper, we present an efficient end-to-end deep learning approach for CMFD, using a span-partial structure and attention mechanism (SPA-Net). The SPA-Net extracts feature roughly using a pre-processing module and finely extracts deep feature maps using the span-partial structure and attention mechanism as a SPA-net feature extractor module. The span-partial structure is designed to reduce the redundant feature information, while the attention mechanism in the span-partial structure has the advantage of focusing on the tamper region and suppressing the original semantic information. To explore the correlation between high-dimension feature points, a deep feature matching module assists SPA-Net to locate the copy-move areas by computing the similarity of the feature map. A feature upsampling module is employed to upsample the features to their original size and produce a copy-move mask. Furthermore, the training strategy of SPA-Net without pretrained weights has a balance between copy-move and semantic features, and then the module can capture more features of copy-move forgery areas and reduce the confusion from semantic objects. In the experiment, we do not use pretrained weights or models from existing networks such as VGG16, which would bring the limitation of the network paying more attention to objects other than copy-move areas.To deal with this problem, we generated a SPANet-CMFD dataset by applying various processes to the benchmark images from SUN and COCO datasets, and we used existing copy-move forgery datasets, CMH, MICC-F220, MICC-F600, GRIP, Coverage, and parts of USCISI-CMFD, together with our generated SPANet-CMFD dataset, as the training set to train our model. In addition, the SPANet-CMFD dataset could play a big part in forgery detection, such as deepfakes. We employed the CASIA and CoMoFoD datasets as testing datasets to verify the performance of our proposed method. The Precision, Recall, and F1 are calculated to evaluate the CMFD results. Comparison results showed that our model achieved a satisfactory performance on both testing datasets and performed better than the existing methods.

13.
Sensors (Basel) ; 23(13)2023 Jun 22.
Artigo em Inglês | MEDLINE | ID: mdl-37447673

RESUMO

Safety helmets are essential in various indoor and outdoor workplaces, such as metallurgical high-temperature operations and high-rise building construction, to avoid injuries and ensure safety in production. However, manual supervision is costly and prone to lack of enforcement and interference from other human factors. Moreover, small target object detection frequently lacks precision. Improving safety helmets based on the helmet detection algorithm can address these issues and is a promising approach. In this study, we proposed a modified version of the YOLOv5s network, a lightweight deep learning-based object identification network model. The proposed model extends the YOLOv5s network model and enhances its performance by recalculating the prediction frames, utilizing the IoU metric for clustering, and modifying the anchor frames with the K-means++ method. The global attention mechanism (GAM) and the convolutional block attention module (CBAM) were added to the YOLOv5s network to improve its backbone and neck networks. By minimizing information feature loss and enhancing the representation of global interactions, these attention processes enhance deep learning neural networks' capacity for feature extraction. Furthermore, the CBAM is integrated into the CSP module to improve target feature extraction while minimizing computation for model operation. In order to significantly increase the efficiency and precision of the prediction box regression, the proposed model additionally makes use of the most recent SIoU (SCYLLA-IoU LOSS) as the bounding box loss function. Based on the improved YOLOv5s model, knowledge distillation technology is leveraged to realize the light weight of the network model, thereby reducing the computational workload of the model and improving the detection speed to meet the needs of real-time monitoring. The experimental results demonstrate that the proposed model outperforms the original YOLOv5s network model in terms of accuracy (Precision), recall rate (Recall), and mean average precision (mAP). The proposed model may more effectively identify helmet use in low-light situations and at a variety of distances.


Assuntos
Algoritmos , Dispositivos de Proteção da Cabeça , Humanos , Análise por Conglomerados , Redes Neurais de Computação
14.
Sensors (Basel) ; 23(15)2023 Aug 07.
Artigo em Inglês | MEDLINE | ID: mdl-37571773

RESUMO

Images captured under complex conditions frequently have low quality, and image performance obtained under low-light conditions is poor and does not satisfy subsequent engineering processing. The goal of low-light image enhancement is to restore low-light images to normal illumination levels. Although many methods have emerged in this field, they are inadequate for dealing with noise, color deviation, and exposure issues. To address these issues, we present CGAAN, a new unsupervised generative adversarial network that combines a new attention module and a new normalization function based on cycle generative adversarial networks and employs a global-local discriminator trained with unpaired low-light and normal-light images and stylized region loss. Our attention generates feature maps via global and average pooling, and the weights of different feature maps are calculated by multiplying learnable parameters and feature maps in the appropriate order. These weights indicate the significance of corresponding features. Specifically, our attention is a feature map attention mechanism that improves the network's feature-extraction ability by distinguishing the normal light domain from the low-light domain to obtain an attention map to solve the color bias and exposure problems. The style region loss guides the network to more effectively eliminate the effects of noise. The new normalization function we present preserves more semantic information while normalizing the image, which can guide the model to recover more details and improve image quality even further. The experimental results demonstrate that the proposed method can produce good results that are useful for practical applications.

15.
Sensors (Basel) ; 23(17)2023 Aug 23.
Artigo em Inglês | MEDLINE | ID: mdl-37687809

RESUMO

Road scene understanding, as a field of research, has attracted increasing attention in recent years. The development of road scene understanding capabilities that are applicable to real-world road scenarios has seen numerous complications. This has largely been due to the cost and complexity of achieving human-level scene understanding, at which successful segmentation of road scene elements can be achieved with a mean intersection over union score close to 1.0. There is a need for more of a unified approach to road scene segmentation for use in self-driving systems. Previous works have demonstrated how deep learning methods can be combined to improve the segmentation and perception performance of road scene understanding systems. This paper proposes a novel segmentation system that uses fully connected networks, attention mechanisms, and multiple-input data stream fusion to improve segmentation performance. Results show comparable performance compared to previous works, with a mean intersection over union of 87.4% on the Cityscapes dataset.

16.
Sensors (Basel) ; 24(1)2023 Dec 29.
Artigo em Inglês | MEDLINE | ID: mdl-38203066

RESUMO

To address the challenges of balancing accuracy and speed, as well as the parameters and FLOPs in current insulator defect detection, we propose an enhanced insulator defect detection algorithm, ML-YOLOv5, based on the YOLOv5 network. The backbone module incorporates depthwise separable convolution, and the feature fusion C3 module is replaced with the improved C2f_DG module. Furthermore, we enhance the feature pyramid network (MFPN) and employ knowledge distillation using YOLOv5m as the teacher model. Experimental results demonstrate that this approach achieved a 46.9% reduction in parameter count and a 43.0% reduction in FLOPs, while maintaining an FPS of 63.6. It exhibited good accuracy and detection speed on both the CPLID and IDID datasets, making it suitable for real-time inspection of high-altitude insulator defects.

17.
Sensors (Basel) ; 23(10)2023 May 11.
Artigo em Inglês | MEDLINE | ID: mdl-37430595

RESUMO

Industrial inspection is crucial for maintaining quality and safety in industrial processes. Deep learning models have recently demonstrated promising results in such tasks. This paper proposes YOLOX-Ray, an efficient new deep learning architecture tailored for industrial inspection. YOLOX-Ray is based on the You Only Look Once (YOLO) object detection algorithms and integrates the SimAM attention mechanism for improved feature extraction in the Feature Pyramid Network (FPN) and Path Aggregation Network (PAN). Moreover, it also employs the Alpha-IoU cost function for enhanced small-scale object detection. YOLOX-Ray's performance was assessed in three case studies: hotspot detection, infrastructure crack detection and corrosion detection. The architecture outperforms all other configurations, achieving mAP50 values of 89%, 99.6% and 87.7%, respectively. For the most challenging metric, mAP50:95, the achieved values were 44.7%, 66.1% and 51.8%, respectively. A comparative analysis demonstrated the importance of combining the SimAM attention mechanism with Alpha-IoU loss function for optimal performance. In conclusion, YOLOX-Ray's ability to detect and to locate multi-scale objects in industrial environments presents new opportunities for effective, efficient and sustainable inspection processes across various industries, revolutionizing the field of industrial inspections.

18.
Sensors (Basel) ; 23(5)2023 Mar 02.
Artigo em Inglês | MEDLINE | ID: mdl-36904923

RESUMO

To determine the appropriate treatment plan for patients, radiologists must reliably detect brain tumors. Despite the fact that manual segmentation involves a great deal of knowledge and ability, it may sometimes be inaccurate. By evaluating the size, location, structure, and grade of the tumor, automatic tumor segmentation in MRI images aids in a more thorough analysis of pathological conditions. Due to the intensity differences in MRI images, gliomas may spread out, have low contrast, and are therefore difficult to detect. As a result, segmenting brain tumors is a challenging process. In the past, several methods for segmenting brain tumors in MRI scans were created. However, because of their susceptibility to noise and distortions, the usefulness of these approaches is limited. Self-Supervised Wavele- based Attention Network (SSW-AN), a new attention module with adjustable self-supervised activation functions and dynamic weights, is what we suggest as a way to collect global context information. In particular, this network's input and labels are made up of four parameters produced by the two-dimensional (2D) Wavelet transform, which makes the training process simpler by neatly segmenting the data into low-frequency and high-frequency channels. To be more precise, we make use of the channel attention and spatial attention modules of the self-supervised attention block (SSAB). As a result, this method may more easily zero in on crucial underlying channels and spatial patterns. The suggested SSW-AN has been shown to outperform the current state-of-the-art algorithms in medical image segmentation tasks, with more accuracy, more promising dependability, and less unnecessary redundancy.


Assuntos
Neoplasias Encefálicas , Semântica , Humanos , Processamento de Imagem Assistida por Computador/métodos , Algoritmos , Imageamento por Ressonância Magnética/métodos
19.
Sensors (Basel) ; 23(19)2023 Sep 30.
Artigo em Inglês | MEDLINE | ID: mdl-37837012

RESUMO

To cope with the challenges of autonomous driving in complex road environments, the need for collaborative multi-tasking has been proposed. This research direction explores new solutions at the application level and has become a hot topic of great interest. In the field of natural language processing and recommendation algorithms, the use of multi-task learning networks has been proven to reduce time, computing power, and storage usage in various task coupling cases. Due to the characteristics of the multi-task learning network, it has also been applied to visual road feature extraction in recent years. This article proposes a multi-task road feature extraction network that combines group convolution with transformer and squeeze excitation attention mechanisms. The network can simultaneously perform drivable area segmentation, lane line segmentation, and traffic object detection tasks. The experimental results of the BDD-100K dataset show that the proposed method performs well for different tasks and has a higher accuracy than similar algorithms. The proposed method provides new ideas and methods for the autonomous road perception of vehicles and the generation of highly accurate maps in visual-based autonomous driving processes.

20.
Sensors (Basel) ; 24(1)2023 Dec 19.
Artigo em Inglês | MEDLINE | ID: mdl-38202878

RESUMO

Biometric authentication is a widely used method for verifying individuals' identities using photoplethysmography (PPG) cardiac signals. The PPG signal is a non-invasive optical technique that measures the heart rate, which can vary from person to person. However, these signals can also be changed due to factors like stress, physical activity, illness, or medication. Ensuring the system can accurately identify and authenticate the user despite these variations is a significant challenge. To address these issues, the PPG signals were preprocessed and transformed into a 2-D image that visually represents the time-varying frequency content of multiple PPG signals from the same human using the scalogram technique. Afterward, the features fusion approach is developed by combining features from the hybrid convolution vision transformer (CVT) and convolutional mixer (ConvMixer), known as the CVT-ConvMixer classifier, and employing attention mechanisms for the classification of human identity. This hybrid model has the potential to provide more accurate and reliable authentication results in real-world scenarios. The sensitivity (SE), specificity (SP), F1-score, and area under the receiver operating curve (AUC) metrics are utilized to assess the model's performance in accurately distinguishing genuine individuals. The results of extensive experiments on the three PPG datasets were calculated, and the proposed method achieved ACCs of 95%, SEs of 97%, SPs of 95%, and an AUC of 0.96, which indicate the effectiveness of the CVT-ConvMixer system. These results suggest that the proposed method performs well in accurately classifying or identifying patterns within the PPG signals to perform continuous human authentication.


Assuntos
Identificação Biométrica , Dispositivos Ópticos , Humanos , Fotopletismografia , Benchmarking , Fontes de Energia Elétrica
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA