Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 119
Filtrar
Mais filtros

Base de dados
País/Região como assunto
Tipo de documento
Intervalo de ano de publicação
1.
Brief Bioinform ; 25(3)2024 Mar 27.
Artigo em Inglês | MEDLINE | ID: mdl-38701417

RESUMO

Transcription factors (TFs) are proteins essential for regulating genetic transcriptions by binding to transcription factor binding sites (TFBSs) in DNA sequences. Accurate predictions of TFBSs can contribute to the design and construction of metabolic regulatory systems based on TFs. Although various deep-learning algorithms have been developed for predicting TFBSs, the prediction performance needs to be improved. This paper proposes a bidirectional encoder representations from transformers (BERT)-based model, called BERT-TFBS, to predict TFBSs solely based on DNA sequences. The model consists of a pre-trained BERT module (DNABERT-2), a convolutional neural network (CNN) module, a convolutional block attention module (CBAM) and an output module. The BERT-TFBS model utilizes the pre-trained DNABERT-2 module to acquire the complex long-term dependencies in DNA sequences through a transfer learning approach, and applies the CNN module and the CBAM to extract high-order local features. The proposed model is trained and tested based on 165 ENCODE ChIP-seq datasets. We conducted experiments with model variants, cross-cell-line validations and comparisons with other models. The experimental results demonstrate the effectiveness and generalization capability of BERT-TFBS in predicting TFBSs, and they show that the proposed model outperforms other deep-learning models. The source code for BERT-TFBS is available at https://github.com/ZX1998-12/BERT-TFBS.


Assuntos
Redes Neurais de Computação , Fatores de Transcrição , Fatores de Transcrição/metabolismo , Fatores de Transcrição/genética , Sítios de Ligação , Algoritmos , Biologia Computacional/métodos , Humanos , Aprendizado Profundo , Ligação Proteica
2.
Brief Bioinform ; 25(2)2024 Jan 22.
Artigo em Inglês | MEDLINE | ID: mdl-38446739

RESUMO

Antimicrobial peptides (AMPs), short peptides with diverse functions, effectively target and combat various organisms. The widespread misuse of chemical antibiotics has led to increasing microbial resistance. Due to their low drug resistance and toxicity, AMPs are considered promising substitutes for traditional antibiotics. While existing deep learning technology enhances AMP generation, it also presents certain challenges. Firstly, AMP generation overlooks the complex interdependencies among amino acids. Secondly, current models fail to integrate crucial tasks like screening, attribute prediction and iterative optimization. Consequently, we develop a integrated deep learning framework, Diff-AMP, that automates AMP generation, identification, attribute prediction and iterative optimization. We innovatively integrate kinetic diffusion and attention mechanisms into the reinforcement learning framework for efficient AMP generation. Additionally, our prediction module incorporates pre-training and transfer learning strategies for precise AMP identification and screening. We employ a convolutional neural network for multi-attribute prediction and a reinforcement learning-based iterative optimization strategy to produce diverse AMPs. This framework automates molecule generation, screening, attribute prediction and optimization, thereby advancing AMP research. We have also deployed Diff-AMP on a web server, with code, data and server details available in the Data Availability section.


Assuntos
Aminoácidos , Peptídeos Antimicrobianos , Antibacterianos , Difusão , Cinética
3.
Brief Bioinform ; 25(5)2024 Jul 25.
Artigo em Inglês | MEDLINE | ID: mdl-39175132

RESUMO

Numerous studies have demonstrated that microRNAs (miRNAs) are critically important for the prediction, diagnosis, and characterization of diseases. However, identifying miRNA-disease associations through traditional biological experiments is both costly and time-consuming. To further explore these associations, we proposed a model based on hybrid high-order moments combined with element-level attention mechanisms (HHOMR). This model innovatively fused hybrid higher-order statistical information along with structural and community information. Specifically, we first constructed a heterogeneous graph based on existing associations between miRNAs and diseases. HHOMR employs a structural fusion layer to capture structure-level embeddings and leverages a hybrid high-order moments encoder layer to enhance features. Element-level attention mechanisms are then used to adaptively integrate the features of these hybrid moments. Finally, a multi-layer perceptron is utilized to calculate the association scores between miRNAs and diseases. Through five-fold cross-validation on HMDD v2.0, we achieved a mean AUC of 93.28%. Compared with four state-of-the-art models, HHOMR exhibited superior performance. Additionally, case studies on three diseases-esophageal neoplasms, lymphoma, and prostate neoplasms-were conducted. Among the top 50 miRNAs with high disease association scores, 46, 47, and 45 associated with these diseases were confirmed by the dbDEMC and miR2Disease databases, respectively. Our results demonstrate that HHOMR not only outperforms existing models but also shows significant potential in predicting miRNA-disease associations.


Assuntos
MicroRNAs , MicroRNAs/genética , Humanos , Biologia Computacional/métodos , Predisposição Genética para Doença , Algoritmos , Neoplasias da Próstata/genética , Modelos Genéticos
4.
BMC Bioinformatics ; 25(1): 250, 2024 Jul 30.
Artigo em Inglês | MEDLINE | ID: mdl-39080535

RESUMO

BACKGROUND: The potential benefits of drug combination synergy in cancer medicine are significant, yet the risks must be carefully managed due to the possibility of increased toxicity. Although artificial intelligence applications have demonstrated notable success in predicting drug combination synergy, several key challenges persist: (1) Existing models often predict average synergy values across a restricted range of testing dosages, neglecting crucial dose amounts and the mechanisms of action of the drugs involved. (2) Many graph-based models rely on static protein-protein interactions, failing to adapt to dynamic and higher-order relationships. These limitations constrain the applicability of current methods. RESULTS: We introduce SAFER, a Sub-hypergraph Attention-based graph model, addressing these issues by incorporating complex relationships among biological knowledge networks and considering dosing effects on subject-specific networks. SAFER outperformed previous models on the benchmark and the independent test set. The analysis of subgraph attention weight for the lung cancer cell line highlighted JAK-STAT signaling pathway, PRDM12, ZNF781, and CDC5L that have been implicated in lung fibrosis. CONCLUSIONS: SAFER presents an interpretable framework designed to identify drug-responsive signals. Tailored for comprehending dose effects on subject-specific molecular contexts, our model uniquely captures dose-level drug combination responses. This capability unlocks previously inaccessible avenues of investigation compared to earlier models. Furthermore, the SAFER framework can be leveraged by future inquiries to investigate molecular networks that uniquely characterize individual patients and can be applied to prioritize personalized effective treatment based on safe dose combinations.


Assuntos
Redes Neurais de Computação , Humanos , Linhagem Celular Tumoral , Sinergismo Farmacológico , Neoplasias Pulmonares/tratamento farmacológico , Neoplasias Pulmonares/metabolismo , Relação Dose-Resposta a Droga , Transdução de Sinais/efeitos dos fármacos , Antineoplásicos/farmacologia
5.
Brief Bioinform ; 23(6)2022 11 19.
Artigo em Inglês | MEDLINE | ID: mdl-36242566

RESUMO

MOTIVATION: Discovering the drug-target interactions (DTIs) is a crucial step in drug development such as the identification of drug side effects and drug repositioning. Since identifying DTIs by web-biological experiments is time-consuming and costly, many computational-based approaches have been proposed and have become an efficient manner to infer the potential interactions. Although extensive effort is invested to solve this task, the prediction accuracy still needs to be improved. More especially, heterogeneous network-based approaches do not fully consider the complex structure and rich semantic information in these heterogeneous networks. Therefore, it is still a challenge to predict DTIs efficiently. RESULTS: In this study, we develop a novel method via Multiview heterogeneous information network embedding with Hierarchical Attention mechanisms to discover potential Drug-Target Interactions (MHADTI). Firstly, MHADTI constructs different similarity networks for drugs and targets by utilizing their multisource information. Combined with the known DTI network, three drug-target heterogeneous information networks (HINs) with different views are established. Secondly, MHADTI learns embeddings of drugs and targets from multiview HINs with hierarchical attention mechanisms, which include the node-level, semantic-level and graph-level attentions. Lastly, MHADTI employs the multilayer perceptron to predict DTIs with the learned deep feature representations. The hierarchical attention mechanisms could fully consider the importance of nodes, meta-paths and graphs in learning the feature representations of drugs and targets, which makes their embeddings more comprehensively. Extensive experimental results demonstrate that MHADTI performs better than other SOTA prediction models. Moreover, analysis of prediction results for some interested drugs and targets further indicates that MHADTI has advantages in discovering DTIs. AVAILABILITY AND IMPLEMENTATION: https://github.com/pxystudy/MHADTI.


Assuntos
Reposicionamento de Medicamentos , Redes Neurais de Computação , Interações Medicamentosas , Desenvolvimento de Medicamentos , Serviços de Informação
6.
Biomed Eng Online ; 23(1): 76, 2024 Jul 31.
Artigo em Inglês | MEDLINE | ID: mdl-39085884

RESUMO

BACKGROUND: Transcranial sonography (TCS) plays a crucial role in diagnosing Parkinson's disease. However, the intricate nature of TCS pathological features, the lack of consistent diagnostic criteria, and the dependence on physicians' expertise can hinder accurate diagnosis. Current TCS-based diagnostic methods, which rely on machine learning, often involve complex feature engineering and may struggle to capture deep image features. While deep learning offers advantages in image processing, it has not been tailored to address specific TCS and movement disorder considerations. Consequently, there is a scarcity of research on deep learning algorithms for TCS-based PD diagnosis. METHODS: This study introduces a deep learning residual network model, augmented with attention mechanisms and multi-scale feature extraction, termed AMSNet, to assist in accurate diagnosis. Initially, a multi-scale feature extraction module is implemented to robustly handle the irregular morphological features and significant area information present in TCS images. This module effectively mitigates the effects of artifacts and noise. When combined with a convolutional attention module, it enhances the model's ability to learn features of lesion areas. Subsequently, a residual network architecture, integrated with channel attention, is utilized to capture hierarchical and detailed textures within the images, further enhancing the model's feature representation capabilities. RESULTS: The study compiled TCS images and personal data from 1109 participants. Experiments conducted on this dataset demonstrated that AMSNet achieved remarkable classification accuracy (92.79%), precision (95.42%), and specificity (93.1%). It surpassed the performance of previously employed machine learning algorithms in this domain, as well as current general-purpose deep learning models. CONCLUSION: The AMSNet proposed in this study deviates from traditional machine learning approaches that necessitate intricate feature engineering. It is capable of automatically extracting and learning deep pathological features, and has the capacity to comprehend and articulate complex data. This underscores the substantial potential of deep learning methods in the application of TCS images for the diagnosis of movement disorders.


Assuntos
Aprendizado Profundo , Processamento de Imagem Assistida por Computador , Doença de Parkinson , Ultrassonografia Doppler Transcraniana , Humanos , Doença de Parkinson/diagnóstico por imagem , Processamento de Imagem Assistida por Computador/métodos , Ultrassonografia Doppler Transcraniana/métodos
7.
BMC Med Inform Decis Mak ; 24(1): 19, 2024 Jan 22.
Artigo em Inglês | MEDLINE | ID: mdl-38247009

RESUMO

BACKGROUND: In clinical medicine, fetal heart rate (FHR) monitoring using cardiotocography (CTG) is one of the most commonly used methods for assessing fetal acidosis. However, as the visual interpretation of CTG depends on the subjective judgment of the clinician, this has led to high inter-observer and intra-observer variability, making it necessary to introduce automated diagnostic techniques. METHODS: In this study, we propose a computer-aided diagnostic algorithm (Hybrid-FHR) for fetal acidosis to assist physicians in making objective decisions and taking timely interventions. Hybrid-FHR uses multi-modal features, including one-dimensional FHR signals and three types of expert features designed based on prior knowledge (morphological time domain, frequency domain, and nonlinear). To extract the spatiotemporal feature representation of one-dimensional FHR signals, we designed a multi-scale squeeze and excitation temporal convolutional network (SE-TCN) backbone model based on dilated causal convolution, which can effectively capture the long-term dependence of FHR signals by expanding the receptive field of each layer's convolution kernel while maintaining a relatively small parameter size. In addition, we proposed a cross-modal feature fusion (CMFF) method that uses multi-head attention mechanisms to explore the relationships between different modalities, obtaining more informative feature representations and improving diagnostic accuracy. RESULTS: Our ablation experiments show that the Hybrid-FHR outperforms traditional previous methods, with average accuracy, specificity, sensitivity, precision, and F1 score of 96.8, 97.5, 96, 97.5, and 96.7%, respectively. CONCLUSIONS: Our algorithm enables automated CTG analysis, assisting healthcare professionals in the early identification of fetal acidosis and the prompt implementation of interventions.


Assuntos
Acidose , Doenças Fetais , Feminino , Gravidez , Humanos , Acidose/diagnóstico , Algoritmos , Cardiotocografia , Tomada de Decisões , Inteligência Artificial
8.
Sensors (Basel) ; 24(1)2024 Jan 02.
Artigo em Inglês | MEDLINE | ID: mdl-38203134

RESUMO

In ocean remote sensing missions, recognizing an underwater acoustic target is a crucial technology for conducting marine biological surveys, ocean explorations, and other scientific activities that take place in water. The complex acoustic propagation characteristics present significant challenges for the recognition of underwater acoustic targets (UATR). Methods such as extracting the DEMON spectrum of a signal and inputting it into an artificial neural network for recognition, and fusing the multidimensional features of a signal for recognition, have been proposed. However, there is still room for improvement in terms of noise immunity, improved computational performance, and reduced reliance on specialized knowledge. In this article, we propose the Residual Attentional Convolutional Neural Network (RACNN), a convolutional neural network that quickly and accurately recognize the type of ship-radiated noise. This network is capable of extracting internal features of Mel Frequency Cepstral Coefficients (MFCC) of the underwater ship-radiated noise. Experimental results demonstrate that the proposed model achieves an overall accuracy of 99.34% on the ShipsEar dataset, surpassing conventional recognition methods and other deep learning models.

9.
Sensors (Basel) ; 24(8)2024 Apr 22.
Artigo em Inglês | MEDLINE | ID: mdl-38676273

RESUMO

Deep neural networks must address the dual challenge of delivering high-accuracy predictions and providing user-friendly explanations. While deep models are widely used in the field of time series modeling, deciphering the core principles that govern the models' outputs remains a significant challenge. This is crucial for fostering the development of trusted models and facilitating domain expert validation, thereby empowering users and domain experts to utilize them confidently in high-risk decision-making contexts (e.g., decision-support systems in healthcare). In this work, we put forward a deep prototype learning model that supports interpretable and manipulable modeling and classification of medical time series (i.e., ECG signal). Specifically, we first optimize the representation of single heartbeat data by employing a bidirectional long short-term memory and attention mechanism, and then construct prototypes during the training phase. The final classification outcomes (i.e., normal sinus rhythm, atrial fibrillation, and other rhythm) are determined by comparing the input with the obtained prototypes. Moreover, the proposed model presents a human-machine collaboration mechanism, allowing domain experts to refine the prototypes by integrating their expertise to further enhance the model's performance (contrary to the human-in-the-loop paradigm, where humans primarily act as supervisors or correctors, intervening when required, our approach focuses on a human-machine collaboration, wherein both parties engage as partners, enabling more fluid and integrated interactions). The experimental outcomes presented herein delineate that, within the realm of binary classification tasks-specifically distinguishing between normal sinus rhythm and atrial fibrillation-our proposed model, albeit registering marginally lower performance in comparison to certain established baseline models such as Convolutional Neural Networks (CNNs) and bidirectional long short-term memory with attention mechanisms (Bi-LSTMAttns), evidently surpasses other contemporary state-of-the-art prototype baseline models. Moreover, it demonstrates significantly enhanced performance relative to these prototype baseline models in the context of triple classification tasks, which encompass normal sinus rhythm, atrial fibrillation, and other rhythm classifications. The proposed model manifests a commendable prediction accuracy of 0.8414, coupled with macro precision, recall, and F1-score metrics of 0.8449, 0.8224, and 0.8235, respectively, achieving both high classification accuracy as well as good interpretability.


Assuntos
Eletrocardiografia , Redes Neurais de Computação , Humanos , Eletrocardiografia/métodos , Fibrilação Atrial/fisiopatologia , Fibrilação Atrial/diagnóstico , Aprendizado Profundo , Frequência Cardíaca/fisiologia , Algoritmos , Processamento de Sinais Assistido por Computador
10.
Sensors (Basel) ; 24(18)2024 Sep 11.
Artigo em Inglês | MEDLINE | ID: mdl-39338650

RESUMO

With the rapid advancement of intelligent manufacturing technologies, the operating environments of modern robotic arms are becoming increasingly complex. In addition to the diversity of objects, there is often a high degree of similarity between the foreground and the background. Although traditional RGB-based object-detection models have achieved remarkable success in many fields, they still face the challenge of effectively detecting targets with textures similar to the background. To address this issue, we introduce the WoodenCube dataset, which contains over 5000 images of 10 different types of blocks. All images are densely annotated with object-level categories, bounding boxes, and rotation angles. Additionally, a new evaluation metric, Cube-mAP, is proposed to more accurately assess the detection performance of cube-like objects. In addition, we have developed a simple, yet effective, framework for WoodenCube, termed CS-SKNet, which captures strong texture features in the scene by enlarging the network's receptive field. The experimental results indicate that our CS-SKNet achieves the best performance on the WoodenCube dataset, as evaluated by the Cube-mAP metric. We further evaluate the CS-SKNet on the challenging DOTAv1.0 dataset, with the consistent enhancement demonstrating its strong generalization capability.

11.
Sensors (Basel) ; 24(18)2024 Sep 21.
Artigo em Inglês | MEDLINE | ID: mdl-39338855

RESUMO

Accurate crop disease classification is crucial for ensuring food security and enhancing agricultural productivity. However, the existing crop disease classification algorithms primarily focus on a single image modality and typically require a large number of samples. Our research counters these issues by using pre-trained Vision-Language Models (VLMs), which enhance the multimodal synergy for better crop disease classification than the traditional unimodal approaches. Firstly, we apply the multimodal model Qwen-VL to generate meticulous textual descriptions for representative disease images selected through clustering from the training set, which will serve as prompt text for generating classifier weights. Compared to solely using the language model for prompt text generation, this approach better captures and conveys fine-grained and image-specific information, thereby enhancing the prompt quality. Secondly, we integrate cross-attention and SE (Squeeze-and-Excitation) Attention into the training-free mode VLCD(Vision-Language model for Crop Disease classification) and the training-required mode VLCD-T (VLCD-Training), respectively, for prompt text processing, enhancing the classifier weights by emphasizing the key text features. The experimental outcomes conclusively prove our method's heightened classification effectiveness in few-shot crop disease scenarios, tackling the data limitations and intricate disease recognition issues. It offers a pragmatic tool for agricultural pathology and reinforces the smart farming surveillance infrastructure.


Assuntos
Algoritmos , Produtos Agrícolas , Doenças das Plantas , Processamento de Imagem Assistida por Computador/métodos
12.
Sensors (Basel) ; 24(19)2024 Oct 04.
Artigo em Inglês | MEDLINE | ID: mdl-39409477

RESUMO

Detecting small objects in images poses significant challenges due to their limited pixel representation and the difficulty in extracting sufficient features, often leading to missed or false detections. To address these challenges and enhance detection accuracy, this paper presents an improved small object detection algorithm, CRL-YOLOv5. The proposed approach integrates the Convolutional Block Attention Module (CBAM) attention mechanism into the C3 module of the backbone network, which enhances the localization accuracy of small objects. Additionally, the Receptive Field Block (RFB) module is introduced to expand the model's receptive field, thereby fully leveraging contextual information. Furthermore, the network architecture is restructured to include an additional detection layer specifically for small objects, allowing for deeper feature extraction from shallow layers. When tested on the VisDrone2019 small object dataset, CRL-YOLOv5 achieved an mAP50 of 39.2%, representing a 5.4% improvement over the original YOLOv5, effectively boosting the detection precision for small objects in images.

13.
J Xray Sci Technol ; 2024 Sep 11.
Artigo em Inglês | MEDLINE | ID: mdl-39269816

RESUMO

BACKGROUND: Content-based image retrieval (CBIR) systems are vital for managing the large volumes of data produced by medical imaging technologies. They enable efficient retrieval of relevant medical images from extensive databases, supporting clinical diagnosis, treatment planning, and medical research. OBJECTIVE: This study aims to enhance CBIR systems' effectiveness in medical image analysis by introducing the VisualSift Ensembling Integration with Attention Mechanisms (VEIAM). VEIAM seeks to improve diagnostic accuracy and retrieval efficiency by integrating robust feature extraction with dynamic attention mechanisms. METHODS: VEIAM combines Scale-Invariant Feature Transform (SIFT) with selective attention mechanisms to emphasize crucial regions within medical images dynamically. Implemented in Python, the model integrates seamlessly into existing medical image analysis workflows, providing a robust and accessible tool for clinicians and researchers. RESULTS: The proposed VEIAM model demonstrated an impressive accuracy of 97.34% in classifying and retrieving medical images. This performance indicates VEIAM's capability to discern subtle patterns and textures critical for accurate diagnostics. CONCLUSIONS: By merging SIFT-based feature extraction with attention processes, VEIAM offers a discriminatively powerful approach to medical image analysis. Its high accuracy and efficiency in retrieving relevant medical images make it a promising tool for enhancing diagnostic processes and supporting medical research in CBIR systems.

14.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 41(5): 895-902, 2024 Oct 25.
Artigo em Zh | MEDLINE | ID: mdl-39462656

RESUMO

Existing classification methods for myositis ultrasound images have problems of poor classification performance or high computational cost. Motivated by this difficulty, a lightweight neural network based on a soft threshold attention mechanism is proposed to cater for a better IIMs classification. The proposed network was constructed by alternately using depthwise separable convolution (DSC) and conventional convolution (CConv). Moreover, a soft threshold attention mechanism was leveraged to enhance the extraction capabilities of key features. Compared with the current dual-branch feature fusion myositis classification network with the highest classification accuracy, the classification accuracy of the network proposed in this paper increased by 5.9%, reaching 96.1%, and its computational complexity was only 0.25% of the existing method. The obtained results support that the proposed method can provide physicians with more accurate classification results at a lower computational cost, thereby greatly assisting them in their clinical diagnosis.


Assuntos
Miosite , Redes Neurais de Computação , Ultrassonografia , Humanos , Miosite/diagnóstico por imagem , Miosite/classificação , Ultrassonografia/métodos , Algoritmos , Processamento de Imagem Assistida por Computador/métodos
15.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 41(3): 544-551, 2024 Jun 25.
Artigo em Zh | MEDLINE | ID: mdl-38932541

RESUMO

Skin cancer is a significant public health issue, and computer-aided diagnosis technology can effectively alleviate this burden. Accurate identification of skin lesion types is crucial when employing computer-aided diagnosis. This study proposes a multi-level attention cascaded fusion model based on Swin-T and ConvNeXt. It employed hierarchical Swin-T and ConvNeXt to extract global and local features, respectively, and introduced residual channel attention and spatial attention modules for further feature extraction. Multi-level attention mechanisms were utilized to process multi-scale global and local features. To address the problem of shallow features being lost due to their distance from the classifier, a hierarchical inverted residual fusion module was proposed to dynamically adjust the extracted feature information. Balanced sampling strategies and focal loss were employed to tackle the issue of imbalanced categories of skin lesions. Experimental testing on the ISIC2018 and ISIC2019 datasets yielded accuracy, precision, recall, and F1-Score of 96.01%, 93.67%, 92.65%, and 93.11%, respectively, and 92.79%, 91.52%, 88.90%, and 90.15%, respectively. Compared to Swin-T, the proposed method achieved an accuracy improvement of 3.60% and 1.66%, and compared to ConvNeXt, it achieved an accuracy improvement of 2.87% and 3.45%. The experiments demonstrate that the proposed method accurately classifies skin lesion images, providing a new solution for skin cancer diagnosis.


Assuntos
Algoritmos , Diagnóstico por Computador , Neoplasias Cutâneas , Humanos , Neoplasias Cutâneas/patologia , Neoplasias Cutâneas/diagnóstico por imagem , Neoplasias Cutâneas/classificação , Diagnóstico por Computador/métodos , Pele/patologia , Interpretação de Imagem Assistida por Computador/métodos
16.
Sensors (Basel) ; 23(5)2023 Feb 21.
Artigo em Inglês | MEDLINE | ID: mdl-36904579

RESUMO

Speech enhancement tasks for audio with a low SNR are challenging. Existing speech enhancement methods are mainly designed for high SNR audio, and they usually use RNNs to model audio sequence features, which causes the model to be unable to learn long-distance dependencies, thus limiting its performance in low-SNR speech enhancement tasks. We design a complex transformer module with sparse attention to overcome this problem. Different from the traditional transformer model, this model is extended to effectively model complex domain sequences, using the sparse attention mask balance model's attention to long-distance and nearby relations, introducing the pre-layer positional embedding module to enhance the model's perception of position information, adding the channel attention module to enable the model to dynamically adjust the weight distribution between channels according to the input audio. The experimental results show that, in the low-SNR speech enhancement tests, our models have noticeable performance improvements in speech quality and intelligibility, respectively.


Assuntos
Percepção da Fala , Fala , Cognição , Aprendizagem
17.
Sensors (Basel) ; 23(13)2023 Jun 22.
Artigo em Inglês | MEDLINE | ID: mdl-37447673

RESUMO

Safety helmets are essential in various indoor and outdoor workplaces, such as metallurgical high-temperature operations and high-rise building construction, to avoid injuries and ensure safety in production. However, manual supervision is costly and prone to lack of enforcement and interference from other human factors. Moreover, small target object detection frequently lacks precision. Improving safety helmets based on the helmet detection algorithm can address these issues and is a promising approach. In this study, we proposed a modified version of the YOLOv5s network, a lightweight deep learning-based object identification network model. The proposed model extends the YOLOv5s network model and enhances its performance by recalculating the prediction frames, utilizing the IoU metric for clustering, and modifying the anchor frames with the K-means++ method. The global attention mechanism (GAM) and the convolutional block attention module (CBAM) were added to the YOLOv5s network to improve its backbone and neck networks. By minimizing information feature loss and enhancing the representation of global interactions, these attention processes enhance deep learning neural networks' capacity for feature extraction. Furthermore, the CBAM is integrated into the CSP module to improve target feature extraction while minimizing computation for model operation. In order to significantly increase the efficiency and precision of the prediction box regression, the proposed model additionally makes use of the most recent SIoU (SCYLLA-IoU LOSS) as the bounding box loss function. Based on the improved YOLOv5s model, knowledge distillation technology is leveraged to realize the light weight of the network model, thereby reducing the computational workload of the model and improving the detection speed to meet the needs of real-time monitoring. The experimental results demonstrate that the proposed model outperforms the original YOLOv5s network model in terms of accuracy (Precision), recall rate (Recall), and mean average precision (mAP). The proposed model may more effectively identify helmet use in low-light situations and at a variety of distances.


Assuntos
Algoritmos , Dispositivos de Proteção da Cabeça , Humanos , Análise por Conglomerados , Redes Neurais de Computação
18.
Sensors (Basel) ; 23(15)2023 Aug 07.
Artigo em Inglês | MEDLINE | ID: mdl-37571773

RESUMO

Images captured under complex conditions frequently have low quality, and image performance obtained under low-light conditions is poor and does not satisfy subsequent engineering processing. The goal of low-light image enhancement is to restore low-light images to normal illumination levels. Although many methods have emerged in this field, they are inadequate for dealing with noise, color deviation, and exposure issues. To address these issues, we present CGAAN, a new unsupervised generative adversarial network that combines a new attention module and a new normalization function based on cycle generative adversarial networks and employs a global-local discriminator trained with unpaired low-light and normal-light images and stylized region loss. Our attention generates feature maps via global and average pooling, and the weights of different feature maps are calculated by multiplying learnable parameters and feature maps in the appropriate order. These weights indicate the significance of corresponding features. Specifically, our attention is a feature map attention mechanism that improves the network's feature-extraction ability by distinguishing the normal light domain from the low-light domain to obtain an attention map to solve the color bias and exposure problems. The style region loss guides the network to more effectively eliminate the effects of noise. The new normalization function we present preserves more semantic information while normalizing the image, which can guide the model to recover more details and improve image quality even further. The experimental results demonstrate that the proposed method can produce good results that are useful for practical applications.

19.
Sensors (Basel) ; 23(17)2023 Aug 23.
Artigo em Inglês | MEDLINE | ID: mdl-37687809

RESUMO

Road scene understanding, as a field of research, has attracted increasing attention in recent years. The development of road scene understanding capabilities that are applicable to real-world road scenarios has seen numerous complications. This has largely been due to the cost and complexity of achieving human-level scene understanding, at which successful segmentation of road scene elements can be achieved with a mean intersection over union score close to 1.0. There is a need for more of a unified approach to road scene segmentation for use in self-driving systems. Previous works have demonstrated how deep learning methods can be combined to improve the segmentation and perception performance of road scene understanding systems. This paper proposes a novel segmentation system that uses fully connected networks, attention mechanisms, and multiple-input data stream fusion to improve segmentation performance. Results show comparable performance compared to previous works, with a mean intersection over union of 87.4% on the Cityscapes dataset.

20.
Sensors (Basel) ; 23(3)2023 Jan 26.
Artigo em Inglês | MEDLINE | ID: mdl-36772423

RESUMO

As the monitor probes are used more and more widely these days, the task of detecting abnormal behaviors in surveillance videos has gained widespread attention. The generalization ability and parameter overhead of the model affect how accurate the detection result is. To deal with the poor generalization ability and high parameter overhead of the model in existing anomaly detection methods, we propose a three-dimensional multi-branch convolutional fusion network, named "Branch-Fusion Net". The network is designed with a multi-branch structure not only to significantly reduce parameter overhead but also to improve the generalization ability by understanding the input feature map from different perspectives. To ignore useless features during the model training, we propose a simple yet effective Channel Spatial Attention Module (CSAM), which sequentially focuses attention on key channels and spatial feature regions to suppress useless features and enhance important features. We combine the Branch-Fusion Net and the CSAM as a local feature extraction network and use the Bi-Directional Gated Recurrent Unit (Bi-GRU) to extract global feature information. The experiments are validated on a self-built Crimes-mini dataset, and the accuracy of anomaly detection in surveillance videos reaches 93.55% on the test set. The result shows that the model proposed in the paper significantly improves the accuracy of anomaly detection in surveillance videos with low parameter overhead.

SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA