Results 1 - 20 of 6,076
1.
Front Robot AI ; 11: 1340334, 2024.
Article in English | MEDLINE | ID: mdl-39092214

ABSTRACT

Learning from demonstration is an approach that allows users to personalize a robot's tasks. While demonstrations often focus on conveying the robot's motion or task plans, they can also communicate user intentions through object attributes in manipulation tasks. For instance, users might want to teach a robot to sort fruits and vegetables into separate boxes or to place cups next to plates of matching colors. This paper introduces a novel method that enables robots to learn the semantics of user demonstrations, with a particular emphasis on the relationships between object attributes. In our approach, users demonstrate essential task steps by manually guiding the robot through the necessary sequence of poses. We reduce the amount of data by utilizing only robot poses instead of trajectories, allowing us to focus on the task's goals, specifically the objects related to these goals. At each step, known as a keyframe, we record the end-effector pose, object poses, and object attributes. However, the number of keyframes saved in each demonstration can vary due to the user's decisions. This variability in each demonstration can lead to inconsistencies in the significance of keyframes, complicating keyframe alignment to generalize the robot's motion and the user's intention. Our method addresses this issue by focusing on teaching the higher-level goals of the task using only the required keyframes and relevant objects. It aims to teach the rationale behind object selection for a task and generalize this reasoning to environments with previously unseen objects. We validate our proposed method by conducting three manipulation tasks aiming at different object attribute constraints. In the reproduction phase, we demonstrate that even when the robot encounters previously unseen objects, it can generalize the user's intention and execute the task.

2.
Spectrochim Acta A Mol Biomol Spectrosc ; 323: 124897, 2024 Jul 28.
Article in English | MEDLINE | ID: mdl-39094271

ABSTRACT

Assessing crop seed phenotypic traits is essential for breeding innovations and germplasm enhancement. However, the tough outer layers of thin-shelled seeds present significant challenges for traditional methods aimed at the rapid assessment of their internal structures and quality attributes. This study explores the potential of combining terahertz (THz) time-domain spectroscopy and imaging with semantic segmentation models for the rapid and non-destructive examination of these traits. A total of 120 watermelon seed samples from three distinct varieties were curated in this study, facilitating a comprehensive analysis of both their outer layers and inner kernels. Utilizing a transmission imaging modality, THz spectral images were acquired and subsequently reconstructed employing a correlation coefficient method. Deep learning-based SegNet and DeepLab V3+ models were employed for automatic tissue segmentation. Our research revealed that DeepLab V3+ significantly surpassed SegNet in both speed and accuracy. Specifically, DeepLab V3+ achieved a pixel accuracy of 96.69% and an intersection over union of 91.3% for the outer layer, with the inner kernel results closely following. These results underscore the proficiency of DeepLab V3+ in distinguishing between the seed coat and kernel, thereby furnishing precise phenotypic trait analyses for seeds with thin shells. Moreover, this study accentuates the instrumental role of deep learning technologies in advancing agricultural research and practices.
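
The two metrics reported in this abstract, pixel accuracy and intersection over union (IoU), can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the nested-list mask layout, and the label values (0 = background, 1 = outer layer, 2 = kernel) are assumptions made only for illustration.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the reference label."""
    total = sum(len(row) for row in truth)
    correct = sum(p == t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    return correct / total


def iou(pred, truth, label):
    """Intersection over union for one class label."""
    inter = union = 0
    for pr, tr in zip(pred, truth):
        for p, t in zip(pr, tr):
            inter += (p == label and t == label)  # both masks agree on the class
            union += (p == label or t == label)   # either mask marks the class
    return inter / union if union else float("nan")


# Hypothetical 2x3 label maps: 0 = background, 1 = outer layer, 2 = kernel.
truth = [[1, 1, 0], [0, 2, 2]]
pred = [[1, 0, 0], [0, 2, 2]]
print(pixel_accuracy(pred, truth))  # 5 of 6 pixels match
print(iou(pred, truth, 1))          # 1 shared pixel out of 2 in the union
```

Both functions return values between 0 and 1, so a figure such as 96.69% in the abstract corresponds to 0.9669 on this scale.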

3.
Appl Neuropsychol Adult ; : 1-7, 2024 Aug 03.
Article in English | MEDLINE | ID: mdl-39096205

ABSTRACT

The aim of this study is to provide a test that allows for evaluation of both semantic memory (SM) and episodic memory (EM). The study sought to examine psychometric characteristics of the Modified Dead-Alive Test (M-DAT) in patients with neurocognitive disorders and the healthy elderly (HE). The M-DAT consists of 45 names of celebrities who have died in the remote past (15), died in the last five years (15), and are still alive (15), and participants are asked whether they are alive or dead. The M-DAT performances of patients with Diagnostic and Statistical Manual of Mental Disorders-5 (DSM-5) major neurocognitive disorder due to Alzheimer's Disease (MND-AD) (n = 69) and patients with minor neurocognitive disorder (MiND) (n = 27) who were admitted to a geriatric psychiatry clinic and healthy controls (HC) (n = 29) were compared. Age and level of education were taken as covariates, and an analysis of covariance (ANCOVA) was performed since the MND-AD group was older and less educated. The MND-AD group had lower performance in EM and SM scores of the M-DAT. M-DAT failed to differentiate between MiND and HE. Both subscale scores of the M-DAT were associated with other neuropsychological test performances as well as the level of education. The results suggest that M-DAT is a valid and reliable tool that examines both EM and SM performances. M-DAT is an alternative for the assessment of SM evaluated by verbal fluency or naming tests. Evaluating EM and SM together is an important advantage; however, M-DAT is influenced by education, and the items require updating.

4.
Comput Biol Med ; 180: 108975, 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39153395

ABSTRACT

Skin surface imaging has been used to examine skin lesions with a microscope for over a century and is commonly known as epiluminescence microscopy, dermatoscopy, or dermoscopy. Skin surface microscopy has been recommended to reduce the necessity of biopsy. This imaging technique could improve the clinical diagnostic performance of pigmented skin lesions. Different imaging techniques are employed in dermatology to find diseases. Segmentation and classification are the two main steps in the examination. The classification performance is influenced by the algorithm employed in the segmentation procedure. The most difficult aspect of segmentation is getting rid of the unwanted artifacts. Many deep-learning models are being created to segment skin lesions. In this paper, an analysis of common artifacts is proposed to investigate the segmentation performance of deep learning models with skin surface microscopic images. The most prevalent artifacts in skin images are hair and dark corners. These artifacts can be observed in the majority of dermoscopy images captured through various imaging techniques. While hair detection and removal methods are common, the introduction of dark corner detection and removal represents a novel approach to skin lesion segmentation. A comprehensive analysis of this segmentation performance is assessed using the surface density of artifacts. Assessment of the PH2, ISIC 2017, and ISIC 2018 datasets demonstrates significant enhancements, as reflected by Dice coefficients rising to 93.49 (86.81), 85.86 (79.91), and 75.38 (51.28) respectively, upon artifact removal. These results underscore the pivotal significance of artifact removal techniques in amplifying the efficacy of deep-learning models for skin lesion segmentation.
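
The Dice coefficient used to report these results can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's pipeline: masks are flattened binary lists, and the helper names `dice` and `dice_from_iou` are hypothetical.

```python
def dice(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flattened binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    size = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return 2 * inter / size if size else 1.0


def dice_from_iou(iou_value):
    """Convert IoU to Dice via the identity Dice = 2*IoU / (1 + IoU)."""
    return 2 * iou_value / (1 + iou_value)


# Hypothetical lesion masks flattened to one dimension.
pred = [1, 1, 0, 0]
truth = [1, 0, 0, 0]
print(dice(pred, truth))   # 2*1 / (2 + 1)
print(dice_from_iou(0.5))  # same value, since the IoU of these masks is 1/2
```

The abstract quotes Dice values as percentages, so a value of 0.9349 from this function would be reported as 93.49.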

5.
Comput Biol Med ; 180: 108955, 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39153392

ABSTRACT

Semantic fluency tests are among the key tests used in batteries for the early detection of Mild Cognitive Impairment (MCI), as impairments in speech and semantic memory are among the first symptoms, and they have attracted the attention of a large number of studies. Several new semantic categories and variables capable of providing complementary information of clinical interest have been proposed to increase their effectiveness. However, this also extends the time required to complete all tests and reach an overall diagnosis. Therefore, there is a need to reduce the number of tests in the batteries, and thus the time spent on them, while maintaining or increasing their effectiveness. This study used machine learning methods to determine the smallest and most efficient combination of semantic categories and variables to achieve this goal. We utilized a database containing 423 assessments from 141 subjects, with each subject having undergone three assessments spaced approximately one year apart. Subjects were categorized into three diagnostic groups: Healthy (if diagnosed as healthy in all three assessments), stable MCI (consistently diagnosed as MCI), and heterogeneous MCI (when exhibiting alternations between healthy and MCI diagnoses across assessments). We found that the most efficient combination of semantic fluency tests for distinguishing between these categories included the animals and clothes semantic categories with the variables corrects, switching, clustering, and total clusters. This combination is ideal for scenarios that require a balance between time efficiency and diagnostic capability, such as population-based screenings.

6.
Int J Neural Syst ; : 2450057, 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39155691

ABSTRACT

Typically, deep learning models for image segmentation tasks are trained using large datasets of images annotated at the pixel level, which can be expensive and highly time-consuming. A way to reduce the amount of annotated images required for training is to adopt a semi-supervised approach. In this regard, generative deep learning models, concretely Generative Adversarial Networks (GANs), have been adapted to semi-supervised training of segmentation tasks. This work proposes MaskGDM, a deep learning architecture combining some ideas from EditGAN, a GAN that jointly models images and their segmentations, together with a generative diffusion model. With careful integration, we find that using a generative diffusion model can improve EditGAN performance results in multiple segmentation datasets, both multi-class and with binary labels. According to the quantitative results obtained, the proposed model improves multi-class image segmentation when compared to the EditGAN and DatasetGAN models, respectively, by [Formula: see text] and [Formula: see text]. Moreover, using the ISIC dataset, our proposal improves the results from other models by up to [Formula: see text] for the binary image segmentation approach.

7.
Phys Med Biol ; 69(17)2024 Aug 20.
Article in English | MEDLINE | ID: mdl-39094615

ABSTRACT

Objective. Automatic segmentation of prostatic zones from MRI can improve clinical diagnosis of prostate cancer as lesions in the peripheral zone (PZ) and central gland (CG) exhibit different characteristics. Existing approaches are limited in their accuracy in localizing the edges of PZ and CG. The proposed boundary-aware semantic clustering network (BASC-Net) improves segmentation performance by learning features in the vicinity of the prostate zonal boundaries, instead of only focusing on manually segmented boundaries. Approach. BASC-Net consists of two major components: the semantic clustering attention (SCA) module and the boundary-aware contrastive (BAC) loss. The SCA module implements a self-attention mechanism that extracts feature bases representing essential features of the inner body and boundary subregions and constructs attention maps highlighting each subregion. SCA is the first self-attention algorithm that utilizes ground truth masks to supervise the feature basis construction process. The features extracted from the inner body and boundary subregions of the same zone were integrated by BAC loss, which promotes the similarity of features extracted in the two subregions of the same zone. The BAC loss further promotes the difference between features extracted from different zones. Main results. BASC-Net was evaluated on the NCI-ISBI 2013 Challenge and Prostate158 datasets. An inter-dataset evaluation was conducted to evaluate the generalizability of the proposed method. BASC-Net outperformed nine state-of-the-art methods in all three experimental settings, attaining Dice similarity coefficients of 79.9% and 88.6% for PZ and CG, respectively, in the NCI-ISBI dataset, 80.5% and 89.2% for PZ and CG, respectively, in the Prostate158 dataset, and 73.2% and 87.4% for PZ and CG, respectively, in the inter-dataset evaluation. Significance. As prostate lesions in PZ and CG have different characteristics, the zonal boundaries segmented by BASC-Net will facilitate prostate lesion detection.


Subject(s)
Image Processing, Computer-Assisted , Magnetic Resonance Imaging , Prostate , Semantics , Male , Humans , Magnetic Resonance Imaging/methods , Image Processing, Computer-Assisted/methods , Cluster Analysis , Prostate/diagnostic imaging , Prostatic Neoplasms/diagnostic imaging
8.
Prog Brain Res ; 287: 111-121, 2024.
Article in English | MEDLINE | ID: mdl-39097350

ABSTRACT

In this paper we investigate the notion of silence using different tools, in particular the hexagon of oppositions.


Subject(s)
Logic , Humans
9.
Sci Rep ; 14(1): 18609, 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-39127805

ABSTRACT

Semantic segmentation plays a crucial role in interpreting remote sensing images, especially in high-resolution scenarios where finer object details, complex spatial information and texture structures exist. To address the challenge of better extracting semantic information and addressing class imbalance in multiclass segmentation, we propose utilizing diffusion models for remote sensing image semantic segmentation, along with a lightweight classification module based on a spatial-channel attention mechanism. Our approach incorporates unsupervised pretrained components with a classification module to accelerate model convergence. The diffusion model component, built on the UNet architecture, effectively captures multiscale features with rich contextual and edge information from images. The lightweight classification module, which leverages spatial-channel attention, focuses more efficiently on spatial-channel regions with significant feature information. We evaluated our approach using three publicly available datasets: Potsdam, GID, and Five Billion Pixels. Our method achieved the best results on all three datasets. On the GID dataset, the overall accuracy was 96.99%, the mean IoU was 92.17%, and the mean F1 score was 95.83%. In the training phase, our model achieved good performance after only 30 training cycles. Compared with other models, our method reduces the number of parameters, improves the training speed, and has obvious performance advantages.

10.
Biodivers Data J ; 12: e125132, 2024.
Article in English | MEDLINE | ID: mdl-39131439

ABSTRACT

Background: Within the scope of the Helmholtz Metadata Collaboration (HMC), the ADVANCE project - Advanced metadata standards for biodiversity survey and monitoring data: supporting of research and conservation - aimed at supporting rich metadata generation with interoperable metadata standards and semantic artefacts that facilitate data access, integration and reuse across terrestrial, freshwater and marine realms. HMC's mission is to facilitate the discovery, access, machine-readability, and reuse of research data across and beyond the Helmholtz Association. New information: We revised, adapted and expanded existing metadata schemas, vocabularies and thesauri to build a FAIR metadata schema and a metadata entry form built on it for users to provide their metadata instances focused on biodiversity monitoring data. The schema is FAIR because it is both machine-interpretable and follows domain-relevant community standards. This report provides a general overview of the project results and instructions on how to access, re-use and complete the metadata form.

11.
Alzheimers Dement ; 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39115942

ABSTRACT

INTRODUCTION: Whether brain functional connectivity (FC) is consistently disrupted in individuals with mild cognitive impairment (MCI) with isolated language impairment (ilMCI), and whether it can differentiate between MCI subtypes, remains uncertain. METHODS: Cross-sectional data from 404 participants in two cohorts (the Chinese Preclinical Alzheimer's Disease Study and the Alzheimer's Disease Neuroimaging Initiative) were analyzed, including neuropsychological tests, resting-state functional magnetic resonance imaging (fMRI), cerebral amyloid positivity, and apolipoprotein E (APOE) status. RESULTS: Temporo-frontoparietal FC, particularly between the bilateral superior temporal pole and the left inferior frontal/supramarginal gyri, was consistently decreased in ilMCI compared to amnestic MCI (aMCI) and normal controls, which was correlated with semantic impairment. Using mean temporo-frontoparietal FC as a classifier could improve accuracy in identifying ilMCI subgroups with positive cerebral amyloid deposition and APOE risk alleles. DISCUSSION: Temporo-frontoparietal hypoconnectivity was observed in individuals with ilMCI, which may reflect semantic impairment and serve as a valuable biomarker to indicate potential mechanisms of underlying neuropathology. HIGHLIGHTS: Temporo-frontoparietal hypoconnectivity was observed in impaired language mild cognitive impairment (ilMCI). Temporo-frontoparietal hypoconnectivity may reflect semantic impairment. Temporo-frontoparietal functional connectivity can classify ilMCI subtypes.

12.
PeerJ Comput Sci ; 10: e2206, 2024.
Article in English | MEDLINE | ID: mdl-39145211

ABSTRACT

With the advent and improvement of ontological dictionaries (WordNet, BabelNet), the use of synset-based text representations is gaining popularity in classification tasks. More recently, ontological dictionaries were used for reducing dimensionality in this kind of representation (e.g., Semantic Dimensionality Reduction System (SDRS) (Vélez de Mendizabal et al., 2020)). These approaches are based on the combination of semantically related columns by taking advantage of semantic information extracted from ontological dictionaries. Their main advantage is that they not only eliminate features but can also combine them, minimizing (low-loss) or avoiding (lossless) the loss of information. The most recent (and accurate) techniques included in this group are based on using evolutionary algorithms to find how many features can be grouped to reduce the false positive (FP) and false negative (FN) errors obtained. The main limitation of these evolutionary-based schemes is the computational requirements derived from the use of optimization algorithms. The contribution of this study is a new lossless feature reduction scheme exploiting information from ontological dictionaries, which achieves slightly better accuracy (especially in FP errors) than optimization-based approaches while using far fewer computational resources. Instead of using computationally expensive evolutionary algorithms, our proposal determines whether two columns (synsets) can be combined by observing whether the instances included in a dataset (e.g., training dataset) containing these synsets are mostly of the same class. The study includes experiments using three datasets and a detailed comparison with two previous optimization-based approaches.
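
The merging criterion described at the end of this abstract can be sketched roughly as below. This is only an illustration of the idea, not the published scheme: the abstract does not say what "mostly" means, so the `min_purity` threshold, the choice to count instances containing either synset, and the function name `can_merge` are all assumptions. The requirement that the two synsets be semantically related in the ontology is not modeled.

```python
from collections import Counter


def can_merge(rows, labels, col_a, col_b, min_purity=0.9):
    """Decide whether two synset columns may be combined.

    rows   : list of sets, each holding the synset ids present in one instance
    labels : class label of each instance
    The columns are considered mergeable when the instances containing
    either synset belong mostly (>= min_purity, an assumed cutoff) to one class.
    """
    classes = [lab for row, lab in zip(rows, labels) if col_a in row or col_b in row]
    if not classes:
        return False  # no evidence in the dataset, so do not merge
    majority = Counter(classes).most_common(1)[0][1]
    return majority / len(classes) >= min_purity


# Hypothetical toy dataset with made-up synset ids.
rows = [{"dog", "cat"}, {"dog"}, {"cat"}, {"car"}]
labels = ["ham", "ham", "ham", "spam"]
print(can_merge(rows, labels, "dog", "cat"))  # all three instances are "ham"
print(can_merge(rows, labels, "dog", "car"))  # classes are mixed (2 of 3 agree)
```

A single pass over the instances per candidate pair is what keeps this kind of check cheap compared with an evolutionary search.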

13.
Cogn Neurodyn ; 18(4): 1743-1752, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39104667

ABSTRACT

The current study investigated the neural mechanisms of emoji processing as sentence predicate in written context. In the hybrid textuality, which is more cognitively engaging, emojis in sentential intermediate positions were designed as either congruent or incongruent with the context. The results showed that incongruent words led to a robust N400 effect, while incongruent emojis only elicited the P600 effect. This implies that the semantics and syntax of words can be separated, while those of emojis seem to be integrated. That is, when the meaning of the emoji violates the sentential context, its grammatical role cannot be well interpreted, especially when it is used as a key grammatical component in a sentence, such as the predicate. Thus, even though the meaning of emojis can be interpreted by readers, their syntactic and semantic functions cannot be clearly separated. In comparison with word processing, the larger amplitude with emojis in the time window of 350-500 ms shows greater cognitive effort in emoji semantic processing, possibly arising from the switch of modalities within the visual channel, that is, the multimodal cognitive load.

14.
Neural Netw ; 179: 106557, 2024 Jul 20.
Article in English | MEDLINE | ID: mdl-39106566

ABSTRACT

Unsupervised semantic segmentation is important for assigning each pixel to a category without annotation. Recent studies have demonstrated promising outcomes by employing a vision transformer backbone pre-trained on an image-level dataset in a self-supervised manner. However, those methods always depend on complex architectures or meticulously designed inputs. We therefore explore a more straightforward approach. To avoid over-complication, we introduce a simple Dense Embedding Contrast network (DECNet) for unsupervised semantic segmentation in this paper. Specifically, we propose a Nearest Neighbor Similarity strategy (NNS) to establish well-defined positive and negative pairs for dense contrastive learning. Meanwhile, we optimize a contrastive objective named Ortho-InfoNCE to alleviate the false negative problem inherent in contrastive learning and further enhance dense representations. Finally, extensive experiments conducted on the COCO-Stuff and Cityscapes datasets demonstrate that our approach outperforms state-of-the-art methods.
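
For orientation, the standard InfoNCE contrastive loss that the name Ortho-InfoNCE builds on can be sketched as below. The abstract does not define the Ortho-InfoNCE variant, so this shows only plain InfoNCE with cosine similarity; the function names and the temperature default are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Standard InfoNCE: -log( exp(s+/T) / (exp(s+/T) + sum_k exp(s_k-/T)) )."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


# Hypothetical 2-D embeddings: a well-matched pair gives a lower loss
# than a pair where a negative is closer to the anchor than the positive.
good = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]], temperature=1.0)
bad = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]], temperature=1.0)
print(good < bad)
```

In this formulation a negative that actually shares the anchor's category still sits in the denominator and is pushed away, which is the false negative problem the abstract refers to.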

15.
Global Spine J ; : 21925682241270036, 2024 Aug 07.
Article in English | MEDLINE | ID: mdl-39109794

ABSTRACT

STUDY DESIGN: Cross-sectional study. OBJECTIVES: Imaging classification of adolescent idiopathic scoliosis (AIS) is directly related to the surgical strategy, but manual classification is complex and depends on doctors' experience. This study investigated deep learning-based automated classification methods (DL group) for AIS and validated the consistency of machine classification and manual classification (M group). METHODS: A total of 506 cases (81 males and 425 females) and 1812 AIS full spine images in the anteroposterior (AP), lateral (LAT), left bending (LB) and right bending (RB) positions were retrospectively used for training. The mean age was 13.6 ± 1.8. The mean maximum Cobb angle was 46.8 ± 12.0. U-Net semantic segmentation neural network technology and deep learning methods were used to automatically segment and establish the alignment relationship between multiple views of the spine, and to extract spinal features such as the Cobb angle. The type of each test case was automatically calculated according to Lenke's rule. An additional 107 cases of adolescent idiopathic scoliosis imaging were prospectively used for testing. The consistency of the DL group and M group was compared. RESULTS: Automatic vertebral body segmentation and recognition, multi-view alignment of the spine and automatic Cobb angle measurement were implemented. Compared to the M group, the consistency of the DL group was significantly higher in 3 aspects: type of lateral convexity (0.989 vs 0.566), lumbar curvature modifier (0.932 vs 0.738), and sagittal plane modifier (0.987 vs 0.522). CONCLUSIONS: Deep learning enables automated Cobb angle measurement and automated Lenke classification of idiopathic scoliosis whole spine radiographs with higher consistency than manual measurement and classification.
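
As a geometric illustration of the Cobb angle mentioned above, the sketch below computes the angle between two endplate lines, each given by two image points. It assumes the endplate endpoints have already been extracted (for example from a segmentation mask); the function names and coordinates are hypothetical, and this is not the study's measurement code.

```python
import math


def line_angle_deg(p, q):
    """Orientation of the line through points p and q, in degrees."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))


def cobb_angle(upper_endplate, lower_endplate):
    """Acute angle between the two endplate lines, in degrees (0 to 90).

    Each endplate is a pair of (x, y) points; the order of the points
    within a pair does not matter.
    """
    diff = abs(line_angle_deg(*upper_endplate) - line_angle_deg(*lower_endplate)) % 180
    return min(diff, 180 - diff)


# Hypothetical pixel coordinates of the superior endplate of the upper end
# vertebra and the inferior endplate of the lower end vertebra.
upper = ((0, 0), (10, 2))
lower = ((0, 50), (10, 47))
print(round(cobb_angle(upper, lower), 2))
```

Because the result is folded into the range 0 to 90 degrees, curves beyond 90 degrees would need a different convention; that limit is a simplification of this sketch.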

16.
Neuropsychologia ; 203: 108968, 2024 Aug 06.
Article in English | MEDLINE | ID: mdl-39117064

ABSTRACT

We examined the neural correlates underlying the semantic processing of native- and nonnative-accented sentences, presented in quiet or embedded in multi-talker noise. Implementing a semantic violation paradigm, 36 English monolingual young adults listened to American-accented (native) and Chinese-accented (nonnative) English sentences with or without semantic anomalies, presented in quiet or embedded in multi-talker noise, while EEG was recorded. After hearing each sentence, participants verbally repeated the sentence, which was coded and scored as an offline comprehension accuracy measure. In line with earlier behavioral studies, the negative impact of background noise on sentence repetition accuracy was higher for nonnative-accented than for native-accented sentences. At the neural level, the N400 effect for semantic anomaly was larger for native-accented than for nonnative-accented sentences, and was also larger for sentences presented in quiet than in noise, indicating impaired lexical-semantic access when listening to nonnative-accented speech or sentences embedded in noise. No semantic N400 effect was observed for nonnative-accented sentences presented in noise. Furthermore, the frequency of neural oscillations in the alpha frequency band (an index of online cognitive listening effort) was higher when listening to sentences in noise versus in quiet, but no difference was observed across the accent conditions. Semantic anomalies presented in background noise also elicited higher theta activity, whereas processing nonnative-accented anomalies was associated with decreased theta activity. Taken together, we found that listening to nonnative accents or background noise is associated with processing challenges during online semantic access, leading to decreased comprehension accuracy. However, the underlying cognitive mechanism (e.g., associated listening efforts) might manifest differently across accented speech processing and speech in noise processing.

17.
Eur J Neurosci ; 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39138595

ABSTRACT

Mathematical learning and ability are crucial for individual and national economic and technological development, but the neural mechanisms underlying advanced mathematical learning remain unclear. The current study used functional magnetic resonance imaging (fMRI) to investigate how brain networks were involved in advanced mathematical learning and transfer. We recorded fMRI data from 24 undergraduate students as they learned the advanced mathematical concept of a commutative mathematical group. After learning, participants were required to complete learning and transfer behavioural tests. Results of single-trial interindividual brain-behaviour correlation analysis found that brain activity in the semantic and visuospatial networks, and the functional connectivity within the semantic network during advanced mathematical learning were positively correlated with learning and transfer effects. Additionally, the functional connectivity between the semantic and visuospatial networks was negatively correlated with the learning and transfer effects. These findings suggest that advanced mathematical learning relies on both semantic and visuospatial networks.

18.
Appl Neuropsychol Child ; : 1-13, 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-39126424

ABSTRACT

Graphophonological-semantic flexibility is the cognitive flexibility in reading that enables individuals to manage multiple phonological and semantic aspects of text simultaneously. This study investigated graphophonological-semantic flexibility and its contribution to reading comprehension in children with dyslexia, comparing them to age-matched, typically developing peers. Thirty children aged 8-11 were assessed using a reading-specific sorting task, where they categorized word cards by initial phoneme and meaning within a 2x2 matrix. After sorting, participants explained their arrangements, and their sorting speed, accuracy, and composite scores were evaluated. Additionally, reading comprehension was assessed through passages followed by questions. Results revealed significant differences between children with dyslexia and their peers in sorting accuracy and composite scores. Children with dyslexia exhibited poorer accuracy and longer sorting times, leading to lower composite scores indicative of reduced graphophonological-semantic flexibility. Age showed a positive correlation with sorting accuracy and composite scores. Moreover, sorting accuracy and composite scores were strong predictors of reading comprehension. These findings suggest that children with dyslexia face challenges in managing both phonological and semantic aspects of text concurrently, highlighting the importance of graphophonological-semantic flexibility in reading development.

19.
Sci Rep ; 14(1): 18092, 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39103394

ABSTRACT

Zero-shot stance detection is pivotal for autonomously discerning user stances on novel emerging topics. This task hinges on effective feature alignment transfer from known to unseen targets. To address this, we introduce a zero-shot stance detection framework utilizing multi-expert cooperative learning. This framework comprises two core components: a multi-expert feature extraction module and a gating mechanism for stance feature selection. Our approach involves a unique learning strategy tailored to decompose complex semantic features. This strategy harnesses the expertise of multiple specialists to unravel and learn diverse, intrinsic textual features, enhancing transferability. Furthermore, we employ a gating-based mechanism to selectively filter and fuse these intricate features, optimizing them for stance classification. Extensive experiments on standard benchmark datasets demonstrate that our model significantly surpasses existing baseline models in performance.

20.
Sci Rep ; 14(1): 18124, 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39103484

ABSTRACT

Printed Circuit Boards (PCBs) are key devices for modern-day electronic technologies. During the production of these boards, defects may occur. Several methods have been proposed to detect PCB defects. However, detecting significantly smaller and visually unrecognizable defects has been a long-standing challenge. The existing two-stage and multi-stage object detectors that use only one layer of the backbone, such as ResNet's third layer (C4) or fourth layer (C5), suffer from low accuracy, and those that use multi-layer feature map extractors, such as Feature Pyramid Network (FPN), incur higher computational cost. Motivated by these challenges, we propose a robust, less computationally intensive, and plug-and-play Attentive Context and Semantic Enhancement Module (ACASEM) for two-stage and multi-stage detectors to enhance PCB defect detection. This module consists of two main parts, namely adaptable feature fusion and attention sub-modules. The proposed model, ACASEM, takes in feature maps from different layers of the backbone and fuses them in a way that enriches the resulting feature maps with more context and semantic information. We test our module with state-of-the-art two-stage object detectors, Faster R-CNN and Double-Head R-CNN, and with the multi-stage Cascade R-CNN detector on the DeepPCB and Augmented PCB Defect datasets. Empirical results demonstrate improvement in the accuracy of defect detection.
