ABSTRACT
MOTIVATION: The rapid development of spatial transcriptome technologies has enabled researchers to acquire single-cell-level spatial data at an affordable price. However, computational analysis tools, such as annotation tools, tailored for these data are still lacking. Recently, many computational frameworks have emerged to integrate single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics datasets. While some frameworks can utilize well-annotated scRNA-seq data to annotate spatial expression patterns, they overlook critical aspects. First, existing tools do not explicitly consider cell type mapping when aligning the two modalities. Second, current frameworks lack the capability to detect novel cells, which remains a key interest for biologists. RESULTS: To address these problems, we propose an annotation method for spatial transcriptome data called SPANN. The main tasks of SPANN are to transfer cell-type labels from well-annotated scRNA-seq data to newly generated single-cell resolution spatial transcriptome data and discover novel cells from spatial data. The major innovations of SPANN come from two aspects: SPANN automatically detects novel cells from unseen cell types while maintaining high annotation accuracy over known cell types. SPANN finds a mapping between spatial transcriptome samples and RNA data prototypes and thus conducts cell-type-level alignment. Comprehensive experiments using datasets from various spatial platforms demonstrate SPANN's capabilities in annotating known cell types and discovering novel cell states within complex tissue contexts. AVAILABILITY: The source code of SPANN can be accessed at https://github.com/ddb-qiwang/SPANN-torch. CONTACT: dengmh@math.pku.edu.cn.
Subjects
Single-Cell Gene Expression Analysis, Transcriptome, RNA Sequence Analysis/methods, Single-Cell Analysis/methods, Gene Expression Profiling/methods, Software
ABSTRACT
Kinase fusion genes are the most active group among human cancer fusion genes. To help select clinically significant kinases, so that cancer patients carrying fusion genes can be better diagnosed, we need a metric for assessing kinases in pan-cancer fusion genes rather than relying on the sample frequency of expressed fusion genes. Notably, multiple studies have assessed human kinases as drug targets using multiple types of genomic and clinical information, but none used kinase fusion genes. Assessments of kinases that omit kinase fusion gene events can miss one of the mechanisms that enhance kinase function in cancer. To fill this gap, we propose a novel way of assessing genes that uses a network propagation approach to infer how likely individual kinases are to influence the kinase fusion gene network, which is composed of ~5K kinase fusion gene pairs. To select better seeds for propagation, we chose the top genes via dimensionality reduction, such as principal components or latent-layer information, of six features of individual genes in pan-cancer fusion genes. Our approach may provide a novel way to assess human kinases in cancer.
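The abstract does not say which propagation scheme was used. As a hedged illustration only, the sketch below runs a random walk with restart, one common form of network propagation, on a tiny made-up graph. The node names, the restart probability, and the function name are all assumptions for illustration and do not come from the study.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Propagate seed scores over an undirected graph.

    adjacency: dict mapping node -> list of neighbour nodes (hypothetical toy input).
    seeds: set of seed nodes that receive the restart mass.
    Assumes every node has at least one neighbour, so no probability mass is lost.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart step: a fixed share of the mass jumps back to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk step: each node splits the rest of its score evenly among neighbours.
        for n in nodes:
            share = (1.0 - restart) * p[n] / len(adjacency[n])
            for m in adjacency[n]:
                nxt[m] += share
        delta = max(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy fusion network: kinases linked to their fusion partners.
toy_graph = {
    "KINASE_A": ["GENE_1", "GENE_2", "GENE_3"],
    "KINASE_B": ["GENE_3"],
    "GENE_1": ["KINASE_A"],
    "GENE_2": ["KINASE_A"],
    "GENE_3": ["KINASE_A", "KINASE_B"],
}
scores = random_walk_with_restart(toy_graph, seeds={"KINASE_A"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Nodes that stay close to the seed in the network keep a higher steady-state score, which is the intuition behind ranking kinases by their influence on the fusion network.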
Subjects
Gene Regulatory Networks, Neoplasms, Humans, Neoplasms/genetics, Gene Fusion
ABSTRACT
Recent advances in spatially resolved transcriptomics (SRT) have brought ever-increasing opportunities to characterize the expression landscape in the context of tissue spatiality. Nevertheless, multiple challenges remain in accurately detecting spatial functional regions in tissue. Here, we present a novel contrastive learning framework, SPAtially Contrastive variational AutoEncoder (SpaCAE), which contrasts the transcriptomic signals of each spot and its spatial neighbors to achieve fine-grained detection of tissue structures. By employing a graph embedding variational autoencoder and incorporating a deep contrastive strategy, SpaCAE balances local spatial information against global expression information, enabling effective learning of representations with spatial constraints. In particular, SpaCAE provides a graph deconvolutional decoder to address the smoothing effect of local spatial structure on self-supervised learning of expression, an aspect often overlooked by current graph neural networks. We demonstrated that SpaCAE performs effectively on SRT data generated by multiple technologies for spatial domain identification and data denoising, making it a valuable tool for obtaining novel insights from SRT studies.
Subjects
Gene Expression Profiling, Transcriptome, Neural Networks (Computer)
ABSTRACT
Cancer is a complex and high-mortality disease regulated by multiple factors. Accurate cancer subtyping is crucial for formulating personalized treatment plans and improving patient survival rates. The underlying mechanisms that drive cancer progression can be comprehensively understood by analyzing multi-omics data. However, the high noise levels in omics data often pose challenges in capturing consistent representations and adequately integrating their information. This paper proposes a novel variational autoencoder-based deep learning model, named Deeply Integrating Latent Consistent Representations (DILCR). First, multiple independent variational autoencoders and contrastive loss functions were designed to separate noise from omics data and capture latent consistent representations. Subsequently, an Attention Deep Integration Network was proposed to integrate consistent representations across different omics levels effectively. Additionally, we introduced the Improved Deep Embedded Clustering algorithm to make the integrated variables clustering-friendly. The effectiveness of DILCR was evaluated using 10 typical cancer datasets from The Cancer Genome Atlas and compared with 14 state-of-the-art integration methods. The results demonstrated that DILCR effectively captures the consistent representations in omics data and outperforms other integration methods in cancer subtyping. In the Kidney Renal Clear Cell Carcinoma case study, DILCR identified cancer subtypes with clear biological significance and interpretability.
Subjects
Renal Cell Carcinoma, Kidney Neoplasms, Neoplasms, Humans, Multiomics, Neoplasms/genetics, Renal Cell Carcinoma/genetics, Algorithms, Cluster Analysis, Kidney Neoplasms/genetics
ABSTRACT
Prediction of therapy response has been a major challenge in cancer precision medicine due to extensive tumor heterogeneity. Recently, several deep learning methods have been developed to predict drug response by utilizing various omics data. Most of them train models on drug-response screening data generated from cell lines and then use these models to predict response in cancer patient data. In this study, we focus on and evaluate deep learning methods that use transcriptome data for the long-standing question of personalized drug-response prediction. We developed an embedding-based approach for drug-response prediction and benchmarked similar methods for their performance. For all methods, we used pretreatment transcriptome data to train models and then conducted a comprehensive evaluation and comparison of the models using cross-panels, cross-datasets and target genes. We further validated the methods using three independent datasets assessing multiple compounds for their predictive capability of drug response, survival outcome and cell line status. As a result, the methods building on gene embeddings had an overall competitive performance with reduced overfitting when we applied evaluation parameters for model fitting as well as the correlation with clinical outcomes in the validation data. We further developed an ensemble model to combine the results from the three most competitive methods for an overall prediction. Finally, we developed DrVAEN (https://bioinfo.uth.edu/drvaen), a user-friendly and easily accessible web server that hosts all these methods for drug-response prediction and model comparison for broad use in cancer research, method evaluation and drug development.
Subjects
Benchmarking, Neoplasms, Humans, Neoplasms/drug therapy, Neoplasms/genetics, Precision Medicine/methods
ABSTRACT
Single-cell RNA sequencing (scRNA-seq) is a revolutionary breakthrough that measures gene expression in individual cells and deciphers cell heterogeneity and subpopulations. However, scRNA-seq data are much noisier than traditional high-throughput RNA-seq data because of technical limitations, so many scRNA-seq studies of dimensionality reduction and visualization remain at the basic data-stacking stage. In this study, we propose an improved variational autoencoder model (termed DREAM) for dimensionality reduction and visual analysis of scRNA-seq data. DREAM combines the variational autoencoder and a Gaussian mixture model for cell type identification, while explicitly handling 'dropout' events by introducing a zero-inflated layer, to obtain a low-dimensional representation that describes the changes in the original scRNA-seq dataset. Benchmarking comparisons across nine scRNA-seq datasets show that DREAM outperforms four state-of-the-art methods on average. Moreover, we show that DREAM can accurately capture the expression dynamics of human preimplantation embryonic development. DREAM is implemented in Python and is freely available on GitHub at https://github.com/Crystal-JJ/DREAM.
Subjects
Single-Cell Analysis, Single-Cell Gene Expression Analysis, Humans, RNA Sequence Analysis/methods, Single-Cell Analysis/methods, RNA-Seq, Gene Expression Profiling/methods, Cluster Analysis
ABSTRACT
Alzheimer's disease is the most common major neurocognitive disorder. Although no cure currently exists, understanding the neurobiological substrate underlying Alzheimer's disease progression will facilitate early diagnosis and treatment, slow disease progression, and improve prognosis. In this study, we aimed to understand the morphological changes underlying Alzheimer's disease progression using structural magnetic resonance imaging data from cognitively normal individuals, individuals with mild cognitive impairment, and individuals with Alzheimer's disease via a contrastive variational autoencoder model. We used the contrastive variational autoencoder to generate synthetic data to boost downstream classification performance. Because it can parse out nonclinical factors such as age and gender, the contrastive variational autoencoder enabled a purer comparison between Alzheimer's disease stages to identify the pathological changes specific to Alzheimer's disease progression. We showed that brain morphological changes across Alzheimer's disease stages were significantly associated with individuals' neurofilament light chain concentration, a potential biomarker for Alzheimer's disease, highlighting the biological plausibility of our results.
Subjects
Alzheimer Disease, Brain, Cognitive Dysfunction, Disease Progression, Magnetic Resonance Imaging, Humans, Alzheimer Disease/diagnostic imaging, Alzheimer Disease/pathology, Female, Male, Aged, Magnetic Resonance Imaging/methods, Brain/diagnostic imaging, Brain/pathology, Cognitive Dysfunction/diagnostic imaging, Cognitive Dysfunction/pathology, Cognitive Dysfunction/physiopathology, Neurofilament Proteins/metabolism, Aged 80 and over, Biomarkers, Middle Aged
ABSTRACT
BACKGROUND: The integration of single-cell RNA sequencing data from multiple experimental batches and diverse biological conditions is important for the study of cellular heterogeneity. RESULTS: To expedite the exploration of systematic disparities under various biological contexts, we propose a scRNA-seq integration method called scDisco, which involves a domain-adaptive decoupling representation learning strategy for the integration of dissimilar single-cell RNA data. It constructs a condition-specific domain-adaptive network founded on variational autoencoders. scDisco not only effectively reduces batch effects but also successfully disentangles biological effects and condition-specific effects, and it further augments condition-specific representations through condition-specific Domain-Specific Batch Normalization layers. This enhancement enables the identification of genes specific to particular conditions. The effectiveness and robustness of scDisco as an integration method were analyzed using both simulated and real datasets, and the results demonstrate that scDisco can yield high-quality visualizations and quantitative outcomes. Furthermore, scDisco has been validated using real datasets, affirming its proficiency in cell clustering quality, retaining batch-specific cell types and identifying condition-specific genes. CONCLUSION: scDisco is an effective integration method based on variational autoencoders, which improves the analytical tasks of reducing batch effects, cell clustering, retaining batch-specific cell types and identifying condition-specific genes.
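The abstract names Domain-Specific Batch Normalization layers but gives no implementation details. The sketch below is a minimal illustration of the general idea only, assuming that each condition is standardized with its own mean and variance. It leaves out the learnable scale and shift parameters and the running statistics that real batch-normalization layers carry, and the function name and toy values are hypothetical.

```python
from statistics import fmean, pvariance


def domain_specific_normalize(values, domains, eps=1e-5):
    """Standardize each value using the mean and variance of its own domain only."""
    groups = {}
    for value, domain in zip(values, domains):
        groups.setdefault(domain, []).append(value)
    # One (mean, variance) pair per domain instead of one shared pair.
    stats = {d: (fmean(vs), pvariance(vs)) for d, vs in groups.items()}
    return [
        (value - stats[domain][0]) / (stats[domain][1] + eps) ** 0.5
        for value, domain in zip(values, domains)
    ]


# Hypothetical expression of one gene under two conditions with different scales.
expression = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
condition = ["control", "control", "control", "treated", "treated", "treated"]
normalized = domain_specific_normalize(expression, condition)
```

After normalization the two conditions share a common scale, so the within-condition structure can be compared without the condition-level offset dominating.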
Subjects
Learning, Single-Cell Gene Expression Analysis, Cluster Analysis, RNA, Single-Cell Analysis, RNA Sequence Analysis, Gene Expression Profiling, Algorithms
ABSTRACT
Recent studies indicate that differences in cognition among individuals may be partially attributed to unique brain wiring patterns. While functional connectivity (FC)-based fingerprinting has demonstrated high accuracy in identifying adults, early studies on neonates suggest that individualized FC signatures are absent. We posit that individual uniqueness is present in neonatal FC data and that conventional linear models fail to capture the rapid developmental trajectories characteristic of newborn brains. To explore this hypothesis, we employed a deep generative model, known as a variational autoencoder (VAE), leveraging two extensive public datasets: one comprising resting-state functional MRI (rs-fMRI) scans from 100 adults and the other from 464 neonates. VAE models trained on rs-fMRI from both adults and newborns produced superior age prediction performance (with r between predicted and actual age ~0.7) and individual identification accuracy (~45%) compared to models trained solely on adult or neonatal data. The VAE model also showed significantly higher individual identification accuracy than linear models (=10~30%). Importantly, the VAE differentiated connections reflecting age-related changes from those indicative of individual uniqueness, a distinction not possible with linear models. Moreover, we derived 20 latent variables, each corresponding to distinct patterns of cortical functional networks (CFNs). These CFNs varied in their representation of brain maturation and individual signatures; notably, certain CFNs that failed to capture neurodevelopmental traits in fact exhibited individual signatures. CFNs associated with neonatal neurodevelopment predominantly encompassed unimodal regions such as visual and sensorimotor areas, whereas those linked to individual uniqueness spanned multimodal and transmodal brain regions.
The VAE's capacity to extract features from rs-fMRI data beyond the capabilities of linear models positions it as a valuable tool for delineating cognitive traits inherent in rs-fMRI and exploring individualized imaging phenotypes.
Subjects
Brain, Connectome, Deep Learning, Magnetic Resonance Imaging, Humans, Newborn Infant, Connectome/methods, Magnetic Resonance Imaging/methods, Male, Female, Adult, Brain/diagnostic imaging, Brain/physiology, Brain/growth & development, Young Adult, Nerve Net/diagnostic imaging, Nerve Net/physiology
ABSTRACT
The functional connectivity (FC) graph of the brain has been widely recognized as a "fingerprint" that can be used to identify individuals within a group of subjects. Research has indicated that individual identification accuracy can be improved by eliminating the impact of information shared among individuals. However, current methods extract not only shared inter-subject information but also individual-specific information from FC graphs. The resulting incomplete separation of shared information from fingerprint information leads to lower individual identification accuracy across all functional magnetic resonance imaging (fMRI) state session pairs and to poor cognitive behavior prediction performance. In this paper, we propose a method to enhance inter-subject variability that combines a conditional variational autoencoder (CVAE) network with a sparse dictionary learning (SDL) module. By embedding fMRI state information in the encoding and decoding processes, the CVAE network can better capture and represent the common features among individuals and enhance inter-subject variability through the residual. Our experimental results on Human Connectome Project (HCP) data show that the refined connectomes obtained by using CVAE with SDL can accurately distinguish an individual from the remaining participants. The identification accuracies reached 99.7% and 99.6% in the session pair rest1-rest2 and the reverse pair rest2-rest1, respectively. In the identification experiment involving task-task combinations carried out on the same day, the identification accuracies ranged from 94.2% to 98.8%. Furthermore, we showed that the Frontoparietal and Default networks make the most significant contributions to individual identification, and that the edges contributing significantly to individual identification are found within and between the Frontoparietal and Default networks.
Additionally, high-level cognitive behaviors can be better predicted with the refined connectomes, suggesting that stronger fingerprinting can yield stronger behavioral associations. In summary, our proposed framework provides a promising approach to using functional connectivity networks for studying cognition and behavior, promoting a deeper understanding of brain function.
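For readers unfamiliar with how identification accuracy is usually scored in fingerprinting studies, the sketch below shows one common scheme: each subject's connectome from one session is matched to the most correlated connectome in the other session, and a match counts as correct when it belongs to the same subject. This is an assumption about the scoring step only. It does not reproduce the CVAE or SDL refinement, and the four-edge toy vectors and subject labels are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def identification_accuracy(session_a, session_b):
    """Fraction of subjects whose session-A vector best matches their own session-B vector."""
    hits = 0
    for subject, fc_a in session_a.items():
        best = max(session_b, key=lambda other: pearson(fc_a, session_b[other]))
        if best == subject:
            hits += 1
    return hits / len(session_a)


# Hypothetical vectorized FC edges for three subjects in two sessions.
rest1 = {"s1": [0.9, 0.1, 0.4, 0.7], "s2": [0.2, 0.8, 0.6, 0.1], "s3": [0.5, 0.5, 0.9, 0.2]}
rest2 = {"s1": [0.8, 0.2, 0.5, 0.6], "s2": [0.3, 0.7, 0.5, 0.2], "s3": [0.4, 0.6, 0.8, 0.1]}
accuracy = identification_accuracy(rest1, rest2)
```

Swapping the roles of the two sessions gives the reverse session pair, which is why results of this kind are typically reported in both directions.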
Subjects
Brain, Cognition, Connectome, Magnetic Resonance Imaging, Humans, Connectome/methods, Magnetic Resonance Imaging/methods, Brain/physiology, Brain/diagnostic imaging, Cognition/physiology, Adult, Nerve Net/physiology, Nerve Net/diagnostic imaging, Male, Female
ABSTRACT
Protein production in the biopharmaceutical industry necessitates the utilization of multiple analytical techniques and control methodologies to ensure both safety and consistency. To facilitate real-time monitoring and control of cell culture processes, Raman spectroscopy has emerged as a versatile analytical technology. This technique, categorized as a Process Analytical Technology, employs chemometric models to establish correlations between Raman signals and key variables of interest. One notable approach for achieving real-time monitoring is through the application of just-in-time learning (JITL), an industrial soft sensor modeling technique that utilizes Raman signals to estimate process variables promptly. The conventional Raman-based JITL method relies on the K-nearest neighbor (KNN) algorithm with Euclidean distance as the similarity measure. However, it falls short of addressing the impact of data uncertainties. To rectify this limitation, this study endeavors to integrate JITL with a variational autoencoder (VAE). This integration aims to extract dominant Raman features in a nonlinear fashion, which are expressed as multivariate Gaussian distributions. Three experimental runs using different cell lines were chosen to compare the performance of the proposed algorithm with commonly utilized methods in the literature. The findings indicate that the VAE-JITL approach consistently outperforms partial least squares, convolutional neural network, and JITL with KNN similarity measure in accurately predicting key process variables.
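The conventional baseline described above, JITL with K-nearest neighbors and Euclidean distance, can be sketched as follows. This is a minimal illustration under stated assumptions: the local model is simply the mean of the neighbors' target values, and the two-channel "spectra", the `glucose` variable, and the function name are hypothetical. It does not implement the proposed VAE-based variant.

```python
from math import dist


def jitl_knn_predict(query, spectra, targets, k=3):
    """Just-in-time estimate: average the targets of the k stored spectra nearest to the query."""
    # Rank historical samples by Euclidean distance to the incoming spectrum.
    ranked = sorted(range(len(spectra)), key=lambda i: dist(query, spectra[i]))
    nearest = ranked[:k]
    # Simplest possible local model: the mean of the neighbours' target values.
    return sum(targets[i] for i in nearest) / len(nearest)


# Hypothetical historical database: toy two-channel spectra and a measured variable.
historical_spectra = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [1.1, 0.9]]
glucose = [1.0, 1.2, 1.1, 5.0, 5.2]
estimate = jitl_knn_predict([0.05, 0.05], historical_spectra, glucose, k=3)
```

Because the local model is rebuilt for every query, the estimate depends entirely on which stored samples count as similar, which is why the choice of similarity measure matters.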
Subjects
Raman Spectrum Analysis, Raman Spectrum Analysis/methods, Cricetulus, CHO Cells, Animals, Cell Culture Techniques/methods, Machine Learning, Algorithms
ABSTRACT
In settings requiring synthetic data generation based on a clinical cohort, e.g., due to data protection regulations, heterogeneity across individuals might be a nuisance that we need to control or faithfully preserve. The sources of such heterogeneity might be known, e.g., as indicated by sub-group labels, or might be unknown and thus reflected only in properties of distributions, such as bimodality or skewness. We investigate how such heterogeneity can be preserved and controlled when obtaining synthetic data from variational autoencoders (VAEs), i.e., a generative deep learning technique that utilizes a low-dimensional latent representation. To faithfully reproduce unknown heterogeneity reflected in marginal distributions, we propose to combine VAEs with pre-transformations. For dealing with known heterogeneity due to sub-groups, we complement VAEs with models for group membership, specifically from propensity score regression. The evaluation is performed with a realistic simulation design that features sub-groups and challenging marginal distributions. The proposed approach faithfully recovers the latter, compared to synthetic data approaches that focus purely on marginal distributions. Propensity scores add complementary information, e.g., when visualized in the latent space, and enable sampling of synthetic data with or without sub-group specific characteristics. We also illustrate the proposed approach with real data from an international stroke trial that exhibits considerable distribution differences between study sites, in addition to bimodality. These results indicate that describing heterogeneity by statistical approaches, such as propensity score regression, might be more generally useful for complementing generative deep learning for obtaining synthetic data that faithfully reflects structure from clinical cohorts.
Subjects
Propensity Score, Humans, Deep Learning, Algorithms, Computer Simulation
ABSTRACT
Generative models have the potential to revolutionize 3D extended reality. A primary obstacle is that augmented and virtual reality need real-time computing. Current state-of-the-art point cloud random generation methods are not fast enough for these applications. We introduce a vector-quantized variational autoencoder model (VQVAE) that can synthesize high-quality point clouds in milliseconds. Unlike previous work on VQVAEs, our model offers a compact sample representation suitable for conditional generation and data exploration, with potential applications in rapid prototyping. We achieve this result by combining architectural improvements with an innovative approach for probabilistic random generation. First, we rethink current parallel point cloud autoencoder structures, and we propose several solutions to improve robustness, efficiency and reconstruction quality. Notable contributions in the decoder architecture include an innovative computation layer to process the shape semantic information, an attention mechanism that helps the model focus on different areas and a filter to cover possible sampling errors. Second, we introduce a parallel sampling strategy for VQVAE models consisting of a double encoding system, where a variational autoencoder learns how to generate the complex discrete distribution of the VQVAE, not only allowing quick inference but also describing the shape with a few global variables. We compare the proposed decoder and our VQVAE model with established and concurrent work, and we validate each contribution individually.
ABSTRACT
With the rapid development of industry, the risks factories face are increasing. Therefore, the anomaly detection algorithms deployed in factories need to have high accuracy, and they need to be able to promptly discover and locate the specific equipment causing the anomaly to restore the regular operation of the abnormal equipment. However, the neural network models currently deployed in factories cannot effectively capture both temporal features within dimensions and relationship features between dimensions; some algorithms that consider both types of features lack interpretability. Therefore, we propose a high-precision, interpretable anomaly detection algorithm based on variational autoencoder (VAE). We use a multi-scale local weight-sharing convolutional neural network structure to fully extract the temporal features within each dimension of the multi-dimensional time series. Then, we model the features from various aspects through multiple attention heads, extracting the relationship features between dimensions. We map the attention output results to the latent space distribution of the VAE and propose an optimization method to improve the reconstruction performance of the VAE, detecting anomalies through reconstruction errors. Regarding anomaly interpretability, we utilize the VAE probability distribution characteristics, decompose the obtained joint probability density into conditional probabilities on each dimension, and calculate the anomaly score, which provides helpful value for technicians. Experimental results show that our algorithm performed best in terms of F1 score and AUC value. The AUC value for anomaly detection is 0.982, and the F1 score is 0.905, which is 4% higher than the best-performing baseline algorithm, Transformer with a Discriminator for Anomaly Detection (TDAD). It also provides accurate anomaly interpretation capability.
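The interpretability step described above attributes an anomaly to individual dimensions. The sketch below shows the idea under a simplifying assumption that is not taken from the paper: the decoder is assumed to output an independent Gaussian (mean and standard deviation) for each dimension, so the joint negative log-likelihood is just the sum of per-dimension terms, and the largest term points to the suspect dimension. The paper itself decomposes the joint density into conditional probabilities, and all values and names here are hypothetical.

```python
from math import log, pi


def per_dimension_nll(x, mean, std):
    """Negative log-likelihood of each dimension under an independent Gaussian."""
    return [
        0.5 * log(2 * pi * s * s) + (xi - m) ** 2 / (2 * s * s)
        for xi, m, s in zip(x, mean, std)
    ]


# Hypothetical reading from three sensors and the model's reconstruction distribution.
observed = [0.1, 5.0, -0.2]
recon_mean = [0.0, 0.0, 0.0]
recon_std = [1.0, 1.0, 1.0]

scores = per_dimension_nll(observed, recon_mean, recon_std)
total_score = sum(scores)  # joint anomaly score under the independence assumption
suspect = max(range(len(scores)), key=scores.__getitem__)  # index of the worst dimension
```

A technician would then inspect the equipment behind the dimension with the highest contribution first.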
ABSTRACT
Structural health monitoring (SHM) has become paramount for developing cheaper and more reliable maintenance policies. The advantages of adopting such a process have proved particularly evident for plated structures. In this context, state-of-the-art methods are based on exciting and acquiring ultrasonic guided waves through a permanently installed sensor network. A baseline is registered when the structure is healthy, and newly acquired signals are compared to it to detect, localize, and quantify damage. For this purpose, traditional methods have been surpassed by data-driven approaches, which allow a larger amount of data to be processed without losing diagnostic information. However, to date, no diagnostic method can deal with varying environmental and operational conditions (EOCs). This work presents a proof of concept that state-of-the-art machine learning methods can be used to reduce the impact of EOCs on the performance of damage diagnosis methods. Generative artificial intelligence was leveraged to mitigate the impact of temperature variations on ultrasonic guided wave-based SHM. Specifically, variational autoencoders and singular value decomposition were combined to learn the influence of temperature on guided waves. After training, the generative part of the algorithm was used to reconstruct signals at new, unseen temperatures. Moreover, a refined version of the algorithm, called the forced variational autoencoder, was introduced to further improve the reconstruction capabilities. The accuracy of the proposed framework was demonstrated against real measurements on a composite plate.
ABSTRACT
Generative Adversarial Networks (GANs) for 3D volume generation and reconstruction, such as shape generation, visualization, automated design, real-time simulation, and research applications, are receiving increased amounts of attention in various fields. However, challenges such as limited training data, high computational costs, and mode collapse issues persist. We propose combining a Variational Autoencoder (VAE) and a GAN to uncover enhanced 3D structures and introduce a stable and scalable progressive growth approach for generating and reconstructing intricate voxel-based 3D shapes. The cascade-structured network involves a generator and discriminator, starting with small voxel sizes and incrementally adding layers, while subsequently supervising the discriminator with ground-truth labels in each newly added layer to model a broader voxel space. Our method enhances the convergence speed and improves the quality of the generated 3D models through stable growth, thereby facilitating an accurate representation of intricate voxel-level details. Through comparative experiments with existing methods, we demonstrate the effectiveness of our approach in evaluating voxel quality, variations, and diversity. The generated models exhibit improved accuracy in 3D evaluation metrics and visual quality, making them valuable across various fields, including virtual reality, the metaverse, and gaming.
ABSTRACT
Identifying the structural state without baseline data is an important engineering problem in the field of structural health monitoring, which is crucial for assessing the safety condition of structures. In the context of limited accelerometers available, this paper proposes a correlation-based damage identification method using Variational Autoencoder neural networks. The approach involves initially constructing a Variational Autoencoder network model for bridge damage detection, optimizing parameters such as loss functions and learning rates for the model, and ultimately utilizing response data from limited sensors for model training analysis to determine the structural state. The contribution of this paper lies in the ability to identify structural damage without baseline data using response data from a small number of sensors, reducing sensor costs and enhancing practical applications in engineering. The effectiveness of the proposed method is demonstrated through numerical simulations and experimental structures. The results show that the method can identify the location of damage under different damage conditions, exhibits strong robustness in detecting multiple damages, and further enhances the accuracy of identifying bridge structures.
ABSTRACT
Precise prediction of the remaining useful life (RUL) of bearings in rolling machinery is crucial for preventing sudden machine failures and optimizing equipment maintenance strategies. Given the significant interference encountered in real industrial environments and the high complexity of the machining process, accurate and robust RUL prediction for rolling bearings is of great research importance. Hence, this paper proposes a novel RUL prediction model called CNN-VAE-MBiLSTM, which integrates the advantages of a convolutional neural network (CNN), a variational autoencoder (VAE), and multiple bi-directional long short-term memory (MBiLSTM). The proposed approach includes a CNN-VAE model and an MBiLSTM model. The CNN-VAE model automatically extracts low-dimensional features from the time-frequency spectrum of multi-axis signals, which simplifies feature construction and minimizes the subjective bias of designers. Based on these features, the MBiLSTM model predicts the RUL of bearings by independently capturing the sequential characteristics of the features in each axis and then obtaining the differences among multi-axis features. The performance of the proposed approach is validated through an industrial case, and the results indicate that it achieves higher accuracy and better noise resistance in RUL prediction than comparable methods.
ABSTRACT
To address the class imbalance issue in network intrusion detection, which degrades performance of intrusion detection models, this paper proposes a novel generative model called VAE-WACGAN to generate minority class samples and balance the dataset. This model extends the Variational Autoencoder Generative Adversarial Network (VAEGAN) by integrating key features from the Auxiliary Classifier Generative Adversarial Network (ACGAN) and the Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP). These enhancements significantly improve both the quality of generated samples and the stability of the training process. By utilizing the VAE-WACGAN model to oversample anomalous data, more realistic synthetic anomalies that closely mirror the actual network traffic distribution can be generated. This approach effectively balances the network traffic dataset and enhances the overall performance of the intrusion detection model. Experimental validation was conducted using two widely utilized intrusion detection datasets, UNSW-NB15 and CIC-IDS2017. The results demonstrate that the VAE-WACGAN method effectively enhances the performance metrics of the intrusion detection model. Furthermore, the VAE-WACGAN-based intrusion detection approach surpasses several other advanced methods, underscoring its effectiveness in tackling network security challenges.
ABSTRACT
The remarkable human ability to predict others' intent during physical interactions develops at a very early age and is crucial for development. Intent prediction, defined as the simultaneous recognition and generation of human-human interactions, has many applications such as in assistive robotics, human-robot interaction, video and robotic surveillance, and autonomous driving. However, models for solving the problem are scarce. This paper proposes two attention-based agent models to predict the intent of interacting 3D skeletons by sampling them via a sequence of glimpses. The novelty of these agent models is that they are inherently multimodal, consisting of perceptual and proprioceptive pathways. The action (attention) is driven by the agent's generation error, and not by reinforcement. At each sampling instant, the agent completes the partially observed skeletal motion and infers the interaction class. It learns where and what to sample by minimizing the generation and classification errors. Extensive evaluation of our models is carried out on benchmark datasets and in comparison to a state-of-the-art model for intent prediction, which reveals that classification and generation accuracies of one of the proposed models are comparable to those of the state of the art even though our model contains fewer trainable parameters. The insights gained from our model designs can inform the development of efficient agents, the future of artificial intelligence (AI).