ABSTRACT
When speakers learn to change the way they produce a speech sound, how much does that learning generalize to other speech sounds? Past studies of speech sensorimotor learning have typically tested the generalization of a single transformation learned in a single context. Here, we investigate the ability of the speech motor system to generalize learning when multiple opposing sensorimotor transformations are learned in separate regions of the vowel space. We find that speakers adapt to a non-uniform "centralization" perturbation, learning to produce vowels with greater acoustic contrast, and that this adaptation generalizes to untrained vowels, which pattern like neighboring trained vowels and show increased contrast of a similar magnitude.
ABSTRACT
A brain-computer interface (BCI) system establishes a novel communication channel between the human brain and a computer. Most event-related potential (ERP)-based BCI applications rely on decoding models, which require training. This training process is often time-consuming and inconvenient for new users. In recent years, deep learning models, especially participant-independent models, have garnered significant attention in the domain of ERP classification. However, individual differences in EEG signals hamper model generalization, as the ERP components and other aspects of the EEG signal vary across participants, even when they are exposed to the same stimuli. This paper proposes a novel one-source domain transfer learning method based on an Attention Domain Adversarial Neural Network (OADANN) to mitigate data distribution discrepancies for cross-participant classification tasks. We train and validate the proposed model on both the publicly available OpenBMI dataset and a self-collected dataset, employing a leave-one-participant-out cross-validation scheme. Experimental results demonstrate that the proposed OADANN method achieves the highest and most robust classification performance, with significant improvements over baseline methods (CNN, EEGNet, ShallowNet, DeepCovNet) and domain generalization methods (ERM, Mixup, and GroupDRO). These findings underscore the efficacy of the proposed method.
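A minimal, dependency-free sketch of the leave-one-participant-out cross-validation scheme described above (function and participant labels are illustrative, not from the paper):

```python
def leave_one_participant_out(participant_ids):
    """Yield (train_idx, test_idx) pairs, holding out one participant per fold."""
    for held_out in sorted(set(participant_ids)):
        test = [i for i, p in enumerate(participant_ids) if p == held_out]
        train = [i for i, p in enumerate(participant_ids) if p != held_out]
        yield train, test

# Six trials recorded from three participants
ids = ["p1", "p1", "p2", "p2", "p3", "p3"]
folds = list(leave_one_participant_out(ids))
```

Each fold trains on all but one participant and tests on the held-out one, which is how cross-participant generalization is typically estimated.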
ABSTRACT
The delicate balance between discrimination and generalization of responses is crucial for survival in our ever-changing environment. In particular, it is important to understand how stimulus discrimination affects the level of stimulus generalization. For example, when we use non-differential training for Pavlovian eyeblink conditioning to investigate generalization of cerebellar-related eyelid motor responses, we find generalization effects on the amount, amplitude, and timing of the conditioned responses. However, it is unknown what the generalization effects are following differential training. We trained mice to close their eyelids to a 10 kHz tone paired with an air-puff as the reinforcing stimulus (CS+), while alternately exposing them to a tone of 4 kHz, 9 kHz, or 9.5 kHz without the air-puff (CS-) during the training blocks. We tested the generalization effects during the expression of the responses after the training period with tones ranging from 2 kHz to 20 kHz. Our results show that the level of generalization tended to correlate positively with the difference between the CS+ and CS- training stimuli. These effects of generalization were found for the probability and amplitude, but not for the timing, of the conditioned eyelid responses. These data indicate the specificity of the generalization effects following differential versus non-differential training, highlighting the relevance of discrimination learning for stimulus generalization.
ABSTRACT
Auditory spatial attention detection (ASAD) seeks to determine which speaker in a surround sound field a listener is focusing on, based on the listener's brain biosignals. Although existing studies have achieved ASAD from a single-trial electroencephalogram (EEG), the huge inter-subject variability makes them generally perform poorly in cross-subject scenarios. Moreover, most ASAD methods do not take full advantage of topological relationships between EEG channels, which are crucial for high-quality ASAD. Recently, some advanced studies have introduced graph-based brain topology modeling into ASAD, but how to calculate edge weights in a graph to better capture actual brain connectivity is worthy of further investigation. To address these issues, we propose a new ASAD method in this paper. First, we model a multi-channel EEG segment as a graph, where differential entropy serves as the node feature, and a static adjacency matrix is generated based on inter-channel mutual information to quantify brain functional connectivity. Then, different subjects' EEG graphs are encoded into a shared embedding space through a total variation graph neural network. Meanwhile, feature distribution alignment based on multi-kernel maximum mean discrepancy is adopted to learn subject-invariant patterns. Note that we align EEG embeddings of different subjects to reference distributions rather than aligning them to each other, for the purpose of privacy preservation. A series of experiments on open datasets demonstrate that the proposed model outperforms state-of-the-art ASAD models in cross-subject scenarios with relatively low computational complexity, and that feature distribution alignment improves the generalizability of the proposed model to a new subject.
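The differential entropy node feature mentioned above has a closed form under a Gaussian assumption; a small dependency-free sketch (variable names are ours, not the paper's):

```python
import math

def differential_entropy(segment):
    """Differential entropy of a channel segment under a Gaussian
    assumption: 0.5 * ln(2 * pi * e * variance)."""
    n = len(segment)
    mean = sum(segment) / n
    var = sum((x - mean) ** 2 for x in segment) / n
    return 0.5 * math.log(2 * math.pi * math.e * var)

# Node feature for one (toy, unit-variance) EEG channel segment
feature = differential_entropy([-1.0, 1.0, -1.0, 1.0])
```

In the model described above, one such value per channel forms the node features, while inter-channel mutual information fills the static adjacency matrix.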
ABSTRACT
The study focuses on risk-related generalization beliefs, i.e., the belief that the risk of a specific agent can be generalized across various conditions. These conditions are: G1: across the frequency of usage (from often to rare); G2: across exposure modalities (hot to cold); G3: across exposure routes (oral to dermal); and G4: across detrimental outcomes (from a specific detrimental endpoint to various detrimental endpoints). We examined how different risk descriptions impact those generalization beliefs, using the risks of bamboo tableware for consumers as an example. The research followed a 2x2 between-subjects design with repeated measurements, and the test subjects were non-experts. The first factor, disclosure format, refers to the disclosure (yes/no) of risk generalization limitations: half of the study participants were informed that bamboo tableware only poses a health risk if it is frequently used for hot beverages or foods, whereas the other half received no information about the risk restrictions regarding bamboo tableware use. The second factor, agent description, refers to whether the agent was described by a particular unfamiliar term (formaldehyde) or a generic, more familiar term (plastics). Furthermore, we tested whether subjects who were initially not informed about the limits of risk generalizations altered their risk generalization beliefs G1-G4 when they were later informed that only frequent hot food and beverage consumption from bamboo tableware poses risks. Respondents' four risk generalization beliefs G1-G4 were statistically significantly lower for those who were informed about the risk generalization limitations. Additionally, the generalization beliefs of subjects who were initially not informed but received the information about the restrictions later were statistically significantly lower than their initial beliefs, except for generalization across endpoints (G4).
We discuss the findings in terms of their implications for risk communication.
ABSTRACT
Generalized fear is a maladaptive behavior in which non-threatening stimuli elicit a fearful response. The ventral tegmental area (VTA) has been demonstrated to play important roles in fear responses and fear memory generalization, but the precise neural circuit mechanism is still unclear. Here, we demonstrate that the VTA-zona incerta (ZI) glutamatergic projection is involved in regulating the fear generalization and anxiety induced by high-intensity threat training. Combining calcium signal recording and chemogenetics, our work reveals that VTA glutamatergic neurons respond to closed-arm entry in a model of PTSD. Inhibition of VTA glutamatergic neurons, or of their glutamatergic projection to the ZI, relieves both fear generalization and anxiety. Together, our study shows that the VTA-ZI glutamatergic circuit is involved in mediating fear generalization and anxiety, and provides a potential target for treating post-traumatic stress disorder.
ABSTRACT
This paper proposes an advanced deep learning model that integrates a Diffusion-Transformer structure and a parallel attention mechanism for growth estimation and disease detection in jujube forests. Existing methods in forestry monitoring often fall short of the practical needs of large-scale, highly complex forest areas due to limitations in data processing capability and feature extraction precision. In response to this challenge, this paper designs and conducts a series of benchmark tests and ablation experiments to systematically evaluate the proposed model on key performance metrics such as precision, recall, accuracy, and F1-score. Experimental results demonstrate that, compared to traditional machine learning models such as Support Vector Machines and Random Forests, as well as common deep learning models such as AlexNet and ResNet, the proposed model achieves a precision of 95%, a recall of 92%, an accuracy of 93%, and an F1-score of 94% on the task of disease detection in jujube forests, and shows similarly superior performance on the growth estimation task. Furthermore, ablation experiments with different attention mechanisms and loss functions validate the effectiveness of the parallel attention mechanism and the parallel loss function in enhancing overall model performance. These findings provide a new technical path for forestry disease monitoring and health assessment, as well as a theoretical and experimental foundation for related fields.
ABSTRACT
Background: Atrial fibrillation (AFib) detection via mobile ECG devices is promising, but algorithms often struggle to generalize across diverse datasets and platforms, limiting their real-world applicability. Objective: This study aims to develop a robust, generalizable AFib detection approach for mobile ECG devices using crowdsourced algorithms. Methods: We developed a voting algorithm using random forest, integrating six open-source AFib detection algorithms from the PhysioNet Challenge. The algorithm was trained on an AliveCor dataset and tested on two disjoint AliveCor datasets and one Apple Watch dataset. Results: The voting algorithm outperformed the base algorithms across all metrics, achieving, averaged across all datasets, a sensitivity of 0.884, specificity of 0.988, PPV of 0.917, NPV of 0.985, and F1-score of 0.943. It also demonstrated the least variability among datasets, signifying the highest robustness and effectiveness in diverse data environments. Moreover, it surpassed Apple's algorithm on all metrics and showed higher specificity, but lower sensitivity, than AliveCor's Kardia algorithm. Conclusions: This study demonstrates the potential of crowdsourced, multi-algorithmic strategies for enhancing AFib detection. Our approach shows robust cross-platform performance, addressing key generalization challenges in AI-enabled cardiac monitoring and underlining the potential of collaborative algorithms in wearable monitoring devices.
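A hypothetical sketch of the stacking idea described above: a random-forest "voter" is trained on the outputs of several base detectors. Synthetic scores stand in for the six PhysioNet algorithms, and the toy labels are purely illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in: per-record AFib scores from six base detectors.
base_scores = rng.random((200, 6))
labels = (base_scores.mean(axis=1) > 0.5).astype(int)  # toy ground truth

# The voter learns how to weigh (possibly disagreeing) base algorithms.
voter = RandomForestClassifier(n_estimators=100, random_state=0)
voter.fit(base_scores, labels)
pred = voter.predict(base_scores)
```

The appeal of this design is that each base algorithm's systematic errors can be compensated by the others, which is what gives the combined detector its cross-dataset robustness.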
Subjects
Algorithms, Atrial Fibrillation, Crowdsourcing, Electrocardiography, Atrial Fibrillation/diagnosis, Atrial Fibrillation/physiopathology, Humans, Crowdsourcing/methods, Electrocardiography/methods, Wearable Electronic Devices
ABSTRACT
Background: The identification of compound-protein interactions (CPIs) is crucial for drug discovery and for understanding mechanisms of action. Accurate CPI prediction can elucidate drug-target-disease interactions, aiding in the discovery of candidate compounds and effective synergistic drugs, particularly from traditional Chinese medicine (TCM). Existing in silico methods face challenges in prediction accuracy and generalization due to compound and target diversity and the lack of large-scale interaction datasets and negative datasets for model learning. Methods: To address these issues, we developed a computational model for CPI prediction by integrating a constructed large-scale bioactivity benchmark dataset with a deep learning (DL) algorithm. To verify the accuracy of our CPI model, we applied it to predict the targets of compounds in TCM. The herb pair of Astragalus membranaceus and Hedyotis diffusa was used as a model, and the active compounds in this herb pair were collected from various public databases and the literature. The complete targets of these active compounds were predicted by the CPI model, resulting in an expanded target dataset. This dataset was then used to predict synergistic antitumor compound combinations. The predicted multi-compound combinations were subsequently examined through in vitro cellular experiments. Results: Our CPI model demonstrated superior performance over other machine learning models, achieving an area under the receiver operating characteristic curve (AUROC) of 0.98, an area under the precision-recall curve (AUPR) of 0.98, and an accuracy (ACC) of 93.31% on the test set. The model's generalization capability and applicability were further confirmed using external databases. Using this model, we predicted the targets of the compounds in the herb pair of Astragalus membranaceus and Hedyotis diffusa, yielding an expanded target dataset.
Then, we used this expanded target dataset to predict effective drug combinations with our drug synergy prediction model, DeepMDS. Experimental assays on the breast cancer cell line MDA-MB-231 confirmed the efficacy of the best predicted multi-compound combinations: Combination I (Epicatechin, Ursolic acid, Quercetin, Aesculetin, and Astragaloside IV) exhibited a half-maximal inhibitory concentration (IC50) of 19.41 µM and a combination index (CI) of 0.682, and Combination II (Epicatechin, Ursolic acid, Quercetin, Vanillic acid, and Astragaloside IV) displayed an IC50 of 23.83 µM and a CI of 0.805. These results validated the ability of our model to make accurate predictions for novel CPI data outside the training dataset and demonstrated the reliability of the predictions, showing good applicability in drug discovery and in the elucidation of the bioactive compounds in TCM. Conclusion: Our CPI prediction model can serve as a useful tool for accurately identifying potential CPIs for a wide range of proteins, and is expected to facilitate drug research and repurposing and to support the understanding of TCM.
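The combination index (CI) values reported above follow the standard Chou-Talalay definition (CI < 1 indicates synergy); a sketch with made-up doses, not the paper's measured values:

```python
def combination_index(combo_doses, single_agent_ic50s):
    """Chou-Talalay combination index at the 50%-inhibition level:
    CI = sum(d_i / D_i), where d_i is compound i's dose within the
    combination achieving the effect and D_i its single-agent IC50.
    CI < 1 suggests synergy, CI = 1 additivity, CI > 1 antagonism."""
    return sum(d / D for d, D in zip(combo_doses, single_agent_ic50s))

# Illustrative numbers only: two compounds at 5 uM each in the combination,
# with single-agent IC50s of 20 uM and 12.5 uM.
ci = combination_index([5.0, 5.0], [20.0, 12.5])
# 5/20 + 5/12.5 = 0.65 -> synergistic
```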
ABSTRACT
Transfer learning, the reuse of newly acquired knowledge under novel circumstances, is a critical hallmark of human intelligence that has frequently been pitted against the capacities of artificial learning agents. Yet the computations relevant to transfer learning have been little investigated in humans. The benefit of efficient inductive biases (meta-level constraints that shape learning, often referred to as priors in the Bayesian learning approach) has been both theoretically and experimentally established. The efficiency of inductive biases depends on their capacity to generalize earlier experiences. We argue that successful transfer learning upon task acquisition is ensured by updating inductive biases, and that transfer of knowledge hinges upon capturing the structure of the task in an inductive bias that can be reused in novel tasks. To explore this, we trained participants on a non-trivial visual stimulus sequence task (Alternating Serial Response Times, ASRT): during the Training phase, participants were exposed to one specific sequence for multiple days; then, in the Transfer phase, the sequence changed while the underlying structure of the task remained the same. Our results show that beyond acquiring the stimulus sequence, our participants were also able to update their inductive biases. Acquisition of the new sequence was considerably sped up by earlier exposure, but this enhancement was specific to individuals showing signatures of abandoning their initial inductive biases. The enhancement of learning was reflected in the development of a new internal model. Additionally, our findings highlight the ability of participants to construct an inventory of internal models and alternate between them based on environmental demands. Further, investigation of behavior during transfer revealed that it is individuals' subjective internal model that predicts transfer across tasks.
Our results demonstrate that even imperfect learning in a challenging environment helps learning in a new context by reusing the subjective and partial knowledge about environmental regularities.
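The ASRT design described above interleaves fixed pattern elements with random elements, so the regularity is hidden in second-order transitions rather than in the visible stream; a schematic generator (function and parameter names are ours, not from the study):

```python
import random

def asrt_stream(pattern, n_trials, n_stimuli=4, seed=0):
    """ASRT-style stimulus stream: pattern and random trials alternate
    (P-r-P-r-...), hiding the sequence structure in second-order
    regularities between every other trial."""
    rng = random.Random(seed)
    stream = []
    for t in range(n_trials):
        if t % 2 == 0:  # pattern trial
            stream.append(pattern[(t // 2) % len(pattern)])
        else:           # random trial
            stream.append(rng.randrange(n_stimuli))
    return stream

stream = asrt_stream([0, 1, 2, 3], 8)
```

Changing `pattern` while keeping this alternating structure mirrors the Transfer phase: the surface sequence is new, but the underlying task structure, and hence a well-updated inductive bias, still applies.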
ABSTRACT
Gastrointestinal endoscopic image analysis presents significant challenges, such as considerable variations in quality due to the challenging in-body imaging environment, the often-subtle nature of abnormalities with low interobserver agreement, and the need for real-time processing. These challenges pose strong requirements on the performance, generalization, robustness and complexity of deep learning-based techniques in such safety-critical applications. While Convolutional Neural Networks (CNNs) have been the go-to architecture for endoscopic image analysis, recent successes of the Transformer architecture in computer vision raise the possibility that this conclusion should be revisited. To this end, we evaluate and compare the clinically relevant performance, generalization and robustness of state-of-the-art CNNs and Transformers for neoplasia detection in Barrett's esophagus. We have trained and validated several top-performing CNNs and Transformers on a total of 10,208 images (2,079 patients), and tested them on a total of 7,118 images (998 patients) across multiple test sets, including a high-quality test set, two internal and two external generalization test sets, and a robustness test set. Furthermore, to expand the scope of the study, we have conducted performance and robustness comparisons for colonic polyp segmentation (Kvasir-SEG) and angiodysplasia detection (Giana). The results obtained for the featured models across a wide range of training set sizes demonstrate that Transformers achieve performance comparable to CNNs on various applications, show comparable or slightly improved generalization capabilities, and offer equally strong resilience and robustness against common image corruptions and perturbations. These findings confirm the viability of the Transformer architecture, which is particularly suited to the dynamic nature of endoscopic video analysis, characterized by image quality, appearance and equipment configurations that fluctuate from hospital to hospital.
The code is made publicly available at: https://github.com/BONS-AI-VCA-AMC/Endoscopy-CNNs-vs-Transformers.
ABSTRACT
Beehive health monitoring has gained interest in the study of bees in biology, ecology, and agriculture. As audio sensors are less intrusive, a number of audio datasets (mainly labeled with the presence of a queen in the hive) have appeared in the literature, and interest in their classification has been raised. All studies have exhibited good accuracy, and a few have shown that classification does not generalize to unseen hives. To increase the number of known hives, a review of open datasets is described, and a merger in the form of the "BeeTogether" dataset on the open Kaggle platform is proposed. This common framework standardizes the data format and features while providing data augmentation techniques and a methodology for measuring extrapolation across hives. A classical classifier is proposed to benchmark the whole dataset, achieving the same good accuracy and poor hive generalization found in the literature. Insight into the role of frequency in classifying the presence of a queen is provided, and it is shown that this frequency depends mostly on colony identity. New classifiers inspired by contrastive learning are introduced to circumvent the effect of colony identity and obtain both good accuracy and hive extrapolation abilities when learning from changes in labels. A process for obtaining absolute labels was prototyped on an unsupervised dataset. Solving hive extrapolation with a common open platform and a contrastive approach can enable effective applications in agriculture.
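A minimal form of the contrastive objective alluded to above, which pulls same-label pairs together and pushes different-label pairs apart in embedding space (a generic textbook sketch, not the authors' exact loss):

```python
def contrastive_pair_loss(distance, same_label, margin=1.0):
    """Classic pairwise contrastive loss on an embedding distance:
    same-label pairs are pulled together (quadratic penalty on distance);
    different-label pairs are pushed at least `margin` apart
    (zero loss once they are separated beyond the margin)."""
    if same_label:
        return 0.5 * distance ** 2
    return 0.5 * max(0.0, margin - distance) ** 2
```

Learning from such relative comparisons, rather than absolute per-recording labels, is one way a classifier can become invariant to which hive (colony) a recording comes from.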
Subjects
Artificial Intelligence, Bees/physiology, Animals, Algorithms
ABSTRACT
There has been much progress in understanding human social learning, including recent studies integrating social information into the reinforcement learning framework. Yet previous studies often assume identical payoffs between observer and demonstrator, overlooking the diversity of social information in real-world interactions. We address this gap by introducing a socially correlated bandit task that accommodates payoff differences among participants, allowing for the study of social learning under more realistic conditions. Our Social Generalization (SG) model, tested through evolutionary simulations and two online experiments, outperforms existing models by incorporating social information into the generalization process while treating it as noisier than individual observations. Our findings suggest that human social learning is more flexible than previously believed, with the SG model indicating a potential resource-rational trade-off in which social learning partially replaces individual exploration. This research highlights the flexibility of human social learning, which allows us to integrate social information from others with different preferences, skills, or goals.
Subjects
Reward, Social Learning, Humans, Male, Social Learning/physiology, Female, Adult, Individuality, Social Behavior, Young Adult
ABSTRACT
Accurate detection and tracking of animals across diverse environments are crucial for studying brain and behavior. Recently, computer vision techniques have become essential for high-throughput behavioral studies; however, localizing animals in complex conditions remains challenging due to intra-class visual variability and environmental diversity. These challenges hinder studies in naturalistic settings, such as when animals are partially concealed within nests. Moreover, current tools are laborious and time-consuming, requiring extensive, setup-specific annotation and training procedures. To address these challenges, we introduce the 'Detect-Any-Mouse-Model' (DAMM), an object detector for localizing mice in complex environments with minimal training. Our approach involved collecting and annotating a diverse dataset of single- and multi-housed mice in complex setups. We trained a Mask R-CNN, a popular object detector in animal studies, to perform instance segmentation and validated DAMM's performance on a collection of downstream datasets using zero-shot and few-shot inference. DAMM excels in zero-shot inference, detecting mice and even rats, in entirely unseen scenarios and further improves with minimal training. Using the SORT algorithm, we demonstrate robust tracking, competitive with keypoint-estimation-based methods. Notably, to advance and simplify behavioral studies, we release our code, model weights, and data, along with a user-friendly Python API and a Google Colab implementation.
Subjects
Algorithms, Animal Behavior, Animals, Mice, Animal Behavior/physiology, Rats, Environment
ABSTRACT
Learning to solve a new problem involves identifying the operating rules, which can be accelerated if known rules generalize in the new context. We ask how prior experience affects learning a new rule that is distinct from known rules. We examined how rats learned a new spatial navigation task after having previously learned tasks with different navigation rules. The new task differed from the previous tasks in spatial layout and navigation rule. We found that experience history did not impact overall performance. However, by examining navigation choice sequences in the new task, we found experience-dependent differences in exploration patterns during early stages of learning, as well as differences in the types of errors made during stable performance. The differences were consistent with the animals adopting experience-dependent memory strategies to discover and implement the new rule. Our results indicate that prior experience shapes the strategies for solving novel problems, and that its impact remains persistent.
Subjects
Long-Evans Rats, Spatial Learning, Animals, Spatial Learning/physiology, Male, Spatial Navigation/physiology, Rats, Exploratory Behavior/physiology
ABSTRACT
Chronic social defeat stress (CSDS), a widely used rodent model of stress, reliably leads to decreased social interaction in stress-susceptible animals. Here, we investigate a role for fear learning in this response using male 129 Sv/Ev mice, a strain that is more vulnerable to CSDS than the commonly used C57BL/6 strain. We first demonstrate that defeated 129 Sv/Ev mice avoid a CD-1 mouse, but not a conspecific, indicating that motivation to socialize is intact in this strain. CD-1 avoidance is characterized by approach behavior that results in running in the opposite direction, activity that is consistent with a threat response. We next test whether CD-1 avoidance is subject to the same behavioral changes found in traditional models of Pavlovian fear conditioning. We find that associative learning occurs across 10 days of CSDS, with defeated mice learning to associate the color of the CD-1 coat with threat. This leads to the gradual acquisition of avoidance behavior, a conditioned response that can be extinguished with 7 days of repeated social interaction testing (5 tests/day). Pairing a CD-1 with a tone leads to second-order conditioning, resulting in avoidance of an enclosure without a social target. Finally, we show that social interaction with a conspecific is a highly variable response in defeated mice that may reflect individual differences in generalization of fear to other social targets. Our data indicate that fear conditioning to a social target is a key component of CSDS, implicating the involvement of fear circuits in social avoidance.
ABSTRACT
Numerous automatic sleep stage classification systems have been developed, but none have become effective assistive tools for sleep technicians due to issues with generalization. Four key factors hinder the generalization of these models: instrumentation, recording montage, subject type, and scoring manual. This study aimed to develop a deep learning model that addresses these generalization problems by integrating enzyme-inspired specificity and employing separate training approaches. The subject type and scoring manual factors were controlled, while the focus was on the instrumentation and recording montage factors. The proposed model consists of three sets of signal-specific models: EEG-, EOG-, and EMG-specific models. The EEG-specific models further include three sets of channel-specific models. All signal-specific and channel-specific models were established with data manipulation and weighted loss strategies, resulting in three sets of data manipulation models and class-specific models, respectively. These models were CNNs. Additionally, BiLSTM models were applied to the EEG- and EOG-specific models to capture temporal information. Finally, the sleep stage classification task was handled by the last dense layer. The optimal sampling frequency for each physiological signal was identified and used during training. The proposed model was trained on the MGH dataset and evaluated both within-dataset and cross-dataset. On the MGH dataset, it achieved an overall accuracy of 81.05%, MF1 of 79.05%, Kappa of 0.7408, and per-class F1-scores of W (84.98%), N1 (58.06%), N2 (84.82%), N3 (79.20%), and REM (88.17%). Cross-dataset performances (overall accuracy, MF1, and Kappa, respectively) are as follows: SHHS1 (200 records) reached 79.54%, 70.56%, and 0.7078; SHHS2 (200 records) achieved 76.77%, 66.30%, and 0.6632; Sleep-EDF (153 records) gained 78.52%, 72.13%, and 0.7031; and BCI-MU (a local dataset, 94 records) achieved 83.57%, 82.17%, and 0.7769.
Additionally, the proposed model has approximately 9.3 M trainable parameters and takes around 26 s to process one PSG record. The results indicate that the proposed model generalizes well in sleep stage classification and shows potential as a feasible tool for real-world applications. The enzyme-inspired specificity effectively addresses the challenges posed by varying recording montages, while the identified optimal sampling frequencies mitigate instrument-related issues.
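One common recipe behind the "weighted loss strategies" mentioned above is inverse-frequency class weighting, which boosts rare stages such as N1; a sketch with made-up stage counts (the paper's exact weighting scheme may differ):

```python
def inverse_frequency_weights(labels, classes):
    """Per-class loss weights inversely proportional to class frequency,
    normalized so a perfectly balanced dataset yields weight 1.0 for
    every class: w_c = n / (k * count_c)."""
    counts = {c: labels.count(c) for c in classes}
    n, k = len(labels), len(classes)
    return {c: n / (k * counts[c]) for c in classes}

# Toy epoch labels; real hypnograms are far longer and include N3.
stages = ["W", "N2", "N2", "N2", "N1", "REM", "N2", "W"]
weights = inverse_frequency_weights(stages, ["W", "N1", "N2", "REM"])
```

Multiplying each class's cross-entropy term by such a weight penalizes errors on under-represented stages more heavily, which is one way to lift the otherwise low N1 F1-score.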
ABSTRACT
This mini-meta-analysis evaluated the internal consistency of the Anxiety and Preoccupation about Sleep Questionnaire (APSQ) across existing studies to assess its potential as a screening tool for orthosomnia (an obsessive preoccupation with achieving perfect sleep). A systematic literature search identified four studies with 2,506 participants using English, Swedish, Turkish, and Arabic versions. Cronbach's alpha ranged from 0.91 to 0.95 across studies. The APSQ demonstrated high overall internal consistency reliability (pooled Cronbach's alpha for the entire APSQ = 0.93, 95% CI 0.91-0.94), suggesting utility for screening orthosomnia symptoms. The pooled Cronbach's alphas of the first and second factors of the APSQ were 0.91 (95% CI 0.89-0.93) and 0.87 (95% CI 0.84-0.89), respectively. However, limited linguistic/cultural representation and significant heterogeneity across studies limit generalizability.
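Pooling reliability coefficients across studies is often done on Bonett's ln(1 - alpha) scale rather than by naively averaging; a simplified, sample-size-weighted sketch (the paper's exact meta-analytic model may differ, and the numbers below are illustrative):

```python
import math

def pooled_cronbach_alpha(alphas, sample_sizes):
    """Pool Cronbach's alpha across studies on the Bonett (2002)
    ln(1 - alpha) scale, weighting by sample size, then back-transform.
    Simplified fixed-effect sketch; real analyses also model
    between-study heterogeneity."""
    transformed = [math.log(1.0 - a) for a in alphas]
    total = sum(sample_sizes)
    mean_t = sum(n * t for n, t in zip(sample_sizes, transformed)) / total
    return 1.0 - math.exp(mean_t)

# Two hypothetical studies with alphas 0.91 and 0.95, n = 100 each.
pooled = pooled_cronbach_alpha([0.91, 0.95], [100, 100])
```

The log transform stabilizes the variance of alpha near its upper bound, which is why the pooled value is not simply the arithmetic mean of the study alphas.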
ABSTRACT
Identifying the origins of moral sensitivities, and their elaboration, within infancy and early childhood is a challenging task, given inherent limitations in infants' behavior. Here, I argue for a multi-pronged, multi-method approach that involves cleaving the moral response at its joints. Specifically, I chart the emergence of infants' moral expectations, evaluations, generalization and enforcement, demonstrating that while many moral sensitivities are present in the second year of life, these sensitivities are closely aligned with, and likely driven by, infants' everyday experience. Moreover, qualitative differences exist between the moral responses that are present in infancy and those of later childhood, particularly in terms of enforcement (i.e., a lack of punishment in infancy). These findings set the stage for addressing outstanding critical questions regarding moral development, that include identifying discrete causal inputs to early moral cognition, identifying whether moral cognition is distinct from social cognition early in life, and explaining gaps that exist between moral cognition and moral behavior in development.
Subjects
Psychological Generalization, Moral Development, Preschool Children, Humans, Infant, Child Development, Infant Behavior/psychology, Punishment/psychology, Social Cognition, Social Norms
ABSTRACT
Cartographic map generalization involves complex rules, and full automation has still not been achieved despite many efforts over the past few decades. Pioneering studies show that some map generalization tasks can be partially automated by deep neural networks (DNNs). However, DNNs are still used as black-box models in previous studies. We argue that integrating explainable AI (XAI) into a deep learning (DL)-based map generalization process can give more insight for developing and refining the DNNs by revealing exactly what cartographic knowledge is learned. Following an XAI framework for an empirical case study, visual analytics and quantitative experiments were applied to explain the importance of input features for the predictions of a pre-trained ResU-Net model. This experimental case study finds that the XAI-based visualization results can easily be interpreted by human experts. With the proposed XAI workflow, we further find that the DNN pays more attention to building boundaries than to the interior parts of the buildings. We thus suggest that boundary intersection over union is a better evaluation metric than the commonly used intersection over union for assessing raster-based map generalization results. Overall, this study shows the necessity and feasibility of integrating XAI into future DL-based map generalization development frameworks.
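The boundary intersection-over-union idea suggested above can be illustrated on small rasterized masks; this sketch restricts the IoU computation to boundary cells only (the cell-set representation and function names are ours, not from the study):

```python
def boundary(mask):
    """Set of foreground cells with at least one 4-neighbor outside the mask."""
    cells = set(mask)
    edge = set()
    for (r, c) in cells:
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if (r + dr, c + dc) not in cells:
                edge.add((r, c))
                break
    return edge

def boundary_iou(mask_a, mask_b):
    """IoU restricted to boundary cells - emphasizes outline agreement,
    which matters for building outlines in raster map generalization."""
    ba, bb = boundary(mask_a), boundary(mask_b)
    union = ba | bb
    return len(ba & bb) / len(union) if union else 1.0
```

Because interior cells are excluded, a prediction that fills the right area but misplaces the outline scores noticeably lower than under plain IoU, matching the observation that the DNN attends mostly to building boundaries.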