Results 1 - 20 of 32
1.
IEEE Intell Syst ; 37(2): 3-13, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35935446

ABSTRACT

The SARS-CoV-2 virus, the COVID-19 disease, and the resulting pandemic have reshaped the entire world in an unprecedented manner. Massive efforts have been made by AI communities to combat the pandemic. What roles has AI played in tackling COVID-19? How has AI performed in the battle against COVID-19? Where are the gaps and opportunities? What lessons can we learn to enhance the ability of AI to battle future pandemics? These questions, despite being fundamental, are yet to be answered in full or systematically. They need to be addressed by AI communities as a priority despite the easing of the omicron infectiousness and threat. This article reviews these issues with reflections on global AI research and the literature on tackling COVID-19. It is envisaged that the demand and priority of developing "pandemic AI" will increase over time, with smart global epidemic early warning systems to be developed by a global collaborative AI effort.

2.
Entropy (Basel) ; 20(6)2018 Jun 17.
Article in English | MEDLINE | ID: mdl-33265561

ABSTRACT

Attributed networks consist of not only a network structure but also node attributes. Most existing community detection algorithms focus only on network structure and ignore node attributes, which are also important. Although some algorithms using both node attributes and network structure information have been proposed in recent years, the complex hierarchical coupling relationships within and between attributes, nodes and network structure have not been considered. Such hierarchical couplings are driving factors in community formation. This paper introduces a novel coupled node similarity (CNS) to capture and learn attribute and structure couplings and compute the similarity within and between nodes with categorical attributes in a network. CNS learns and integrates the frequency-based intra-attribute coupled similarity within an attribute, the co-occurrence-based inter-attribute coupled similarity between attributes, and coupled attribute-to-structure similarity based on the homophily property. CNS is then used to generate the weights of edges and transform a plain graph into a weighted graph. Clustering algorithms then detect community structures that are topologically well-connected and semantically coherent on the weighted graphs. Extensive experiments verify the effectiveness of CNS-based community detection algorithms on several data sets by comparison with state-of-the-art node similarity measures, both those that involve node attribute information and hierarchical interactions and those that do not, and at various levels of network structure complexity.
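The step of turning a plain graph into a weighted one can be illustrated with a minimal sketch. It does not implement CNS: a plain attribute-matching ratio stands in for the coupled similarity, and the function names, node names, and attribute values are all hypothetical.

```python
def attribute_similarity(a, b):
    """Fraction of categorical attributes on which two nodes agree.

    Assumption: simple matching is used as a stand-in; the abstract's CNS
    additionally learns intra-attribute, inter-attribute, and
    attribute-to-structure couplings, which are not modeled here.
    """
    if len(a) != len(b):
        raise ValueError("attribute vectors must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def weight_edges(edges, attributes):
    """Map each edge (u, v) to a similarity-based weight."""
    return {
        (u, v): attribute_similarity(attributes[u], attributes[v])
        for u, v in edges
    }


# Hypothetical toy graph: a path a-b-c-d with two categorical attributes.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
attributes = {
    "a": ("red", "small"),
    "b": ("red", "small"),
    "c": ("red", "large"),
    "d": ("blue", "large"),
}
weights = weight_edges(edges, attributes)
```

Any clustering algorithm that accepts edge weights could then be run on the weighted graph; heavier edges join nodes that are both adjacent and similar in attributes.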

3.
J Biomed Inform ; 66: 19-31, 2017 02.
Article in English | MEDLINE | ID: mdl-28011233

ABSTRACT

BACKGROUND AND OBJECTIVE: Critical care patient events like sepsis or septic shock in intensive care units (ICUs) are dangerous complications which can cause multiple organ failures and eventual death. Preventive prediction of such events will allow clinicians to stage effective interventions for averting these critical complications. METHODS: It is widely understood that physiological variables of patients, such as blood pressure and heart rate, tend to change gradually over a certain period of time prior to the occurrence of septic shock. This work investigates the performance of a novel machine learning approach for the early prediction of septic shock. The approach combines highly informative sequential patterns extracted from multiple physiological variables and captures the interactions among these patterns via coupled hidden Markov models (CHMM). In particular, the patterns are extracted from three non-invasive waveform measurements: the mean arterial pressure levels, the heart rates and the respiratory rates of septic shock patients from a large clinical ICU dataset called MIMIC-II. EVALUATION AND RESULTS: For baseline estimations, SVM and HMM models are employed on the continuous time series data for the given patients, using MAP (mean arterial pressure), HR (heart rate), and RR (respiratory rate). Single-channel pattern-based HMM (SCP-HMM) and multi-channel pattern-based coupled HMM (MCP-HMM) are compared against the baseline models using 5-fold cross-validation accuracies over multiple rounds. In particular, the results of MCP-HMM are statistically significant (p-value of 0.0014) in comparison to the baseline models. Our experiments demonstrate strongly competitive accuracy in the prediction of septic shock, especially when the interactions between the multiple variables are coupled by the learning model. CONCLUSIONS: It can be concluded that the novelty of the approach stems from the integration of sequence-based physiological pattern markers with the sequential CHMM model to learn dynamic physiological behavior, as well as from the coupling of such patterns to build powerful risk stratification models for septic shock patients.


Subject(s)
Intensive Care Units, Machine Learning, Risk Assessment/methods, Septic Shock, Blood Pressure, Critical Care, Forecasting, Heart Rate, Humans, Multiple Organ Failure
4.
Sci Rep ; 14(1): 707, 2024 01 06.
Article in English | MEDLINE | ID: mdl-38184669

ABSTRACT

As COVID-19 vaccines became widely available worldwide, many countries implemented vaccination certification, also known as a "green pass", from the latter half of 2021 to promote and expedite vaccination and contain virus spread. This policy allowed vaccinated people more freedom in public activities while placing more constraints on the unvaccinated, in addition to existing non-pharmaceutical interventions (NPIs). Accordingly, vaccination certification also induced heterogeneous behaviors in the unvaccinated and vaccinated groups. This makes it essential yet challenging to model the behavioral impact of vaccination certification on the two groups and the transmission dynamics of COVID-19 within and between the groups. Very limited quantitative work is available for addressing these purposes. Here we propose an extended epidemiological model SEIQRD[Formula: see text] to effectively distinguish the behavioral impact of vaccination certification on unvaccinated and vaccinated groups by incorporating two contrastive transmission chains. SEIQRD[Formula: see text] also quantifies the impact of the green pass policy. For the resurgence of COVID-19 in Greece, Austria, and Israel in 2021, our simulation results indicate that their implementation of vaccination certification brought about more than a 14-fold decrease in the total number of infections and deaths compared with a scenario with no such policy. Additionally, a green pass policy may offer a reasonable practical solution for striking a balance between public health and individual freedom during the pandemic.


Subject(s)
COVID-19, Mass Vaccination, Humans, COVID-19 Vaccines, COVID-19/epidemiology, COVID-19/prevention & control, Vaccination, Certification
5.
Article in English | MEDLINE | ID: mdl-38546992

ABSTRACT

Variational autoencoders (VAEs) are challenged by the imbalance between representation inference and task fitting caused by surrogate loss. To address this issue, existing methods adjust their balance by directly tuning their coefficients. However, these methods suffer from a tradeoff uncertainty, i.e., nondynamic regulation over iterations and inflexible hyperparameters for learning tasks. Accordingly, we make the first attempt to introduce an evolutionary VAE (eVAE), building on the variational information bottleneck (VIB) theory and integrative evolutionary neural learning. eVAE integrates a variational genetic algorithm (VGA) into VAE with variational evolutionary operators, including variational mutation (V-mutation), crossover, and evolution. Its training mechanism synergistically and dynamically addresses and updates the learning tradeoff uncertainty in the evidence lower bound (ELBO) without additional constraints and hyperparameter tuning. Furthermore, eVAE presents an evolutionary paradigm to tune critical factors of VAEs and addresses the premature convergence and random search problem in integrating evolutionary optimization into deep learning. Experiments show that eVAE addresses the KL-vanishing problem for text generation with low reconstruction loss, generates all the disentangled factors with sharp images, and improves image generation quality. eVAE achieves better disentanglement, generation performance, and generation-inference balance than its competitors. Code available at: https://github.com/amasawa/eVAE.
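For orientation, the tradeoff that coefficient-tuning methods adjust can be written in the generic weighted form of the ELBO. This is the standard textbook objective, given here as an assumption about what "tuning their coefficients" refers to; it is not eVAE's own objective, which the abstract does not state. The symbols $\theta$, $\phi$, and $\beta$ are the usual decoder parameters, encoder parameters, and KL weight.

```latex
\mathcal{L}_{\beta}(\theta, \phi; x)
  = \underbrace{\mathbb{E}_{q_{\phi}(z \mid x)}\bigl[\log p_{\theta}(x \mid z)\bigr]}_{\text{task fitting (reconstruction)}}
  \; - \; \beta \,
    \underbrace{D_{\mathrm{KL}}\bigl(q_{\phi}(z \mid x) \,\|\, p(z)\bigr)}_{\text{representation inference}}
```

Setting $\beta = 1$ recovers the standard ELBO. A fixed $\beta$ chosen by hand is the kind of nondynamic, inflexible hyperparameter that the abstract describes as a limitation.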

6.
Article in English | MEDLINE | ID: mdl-38683706

ABSTRACT

Owing to their nonstationary nature, the distribution of real-world multivariate time series (MTS) changes over time, which is known as distribution drift. Most existing MTS forecasting models suffer greatly from distribution drift, and their forecasting performance degrades over time. Existing methods address distribution drift by adapting to the most recently arrived data or by self-correcting according to meta knowledge derived from future data. Despite their great success in MTS forecasting, these methods hardly capture the intrinsic distribution changes, especially from a distributional perspective. Accordingly, we propose a novel framework, the temporal conditional variational autoencoder (TCVAE), to model the dynamic distributional dependencies over time between historical observations and future data in MTSs and infer the dependencies as a temporal conditional distribution to leverage latent variables. Specifically, a novel temporal Hawkes attention (THA) mechanism represents temporal factors that are subsequently fed into feedforward networks to estimate the prior Gaussian distribution of latent variables. The representation of temporal factors further dynamically adjusts the structures of the Transformer-based encoder and decoder to distribution changes by leveraging a gated attention mechanism (GAM). Moreover, we introduce conditional continuous normalization flow (CCNF) to transform the prior Gaussian into a complex and form-free distribution to facilitate flexible inference of the temporal conditional distribution. Extensive experiments conducted on six real-world MTS datasets demonstrate TCVAE's superior robustness and effectiveness over state-of-the-art MTS forecasting baselines. We further illustrate TCVAE's applicability through multifaceted case studies and visualization in real-world scenarios.

7.
Int J Data Sci Anal ; 15(3): 231-246, 2023.
Article in English | MEDLINE | ID: mdl-37035277

ABSTRACT

The uncertain world has seen increasing emergencies, crises and disasters (ECDs), such as the COVID-19 pandemic, hurricane Ian, global financial inflation and recession, misinformation disaster, and cyberattacks. AI for smart disaster resilience (AISDR) transforms classic reactive and scripted disaster management to digital proactive and intelligent resilience across ECD ecosystems. A systematic overview of diverse ECDs, classic ECD management, ECD data complexities, and an AISDR research landscape are presented in this article. Translational disaster AI is essential to enable smart disaster resilience.

8.
Article in English | MEDLINE | ID: mdl-37506022

ABSTRACT

We address a challenging problem: modeling high-dimensional, long-range dependencies between nonnormal multivariates, which is important for demanding applications such as cross-market modeling (CMM). With heterogeneous indicators and markets, CMM aims to capture between-market financial couplings and influence over time and within-market interactions between financial variables. We make the first attempt to integrate deep variational sequential learning with copula-based statistical dependence modeling and characterize both temporal dependence degrees and structures between hidden variables representing nonnormal multivariates. Our copula variational learning network, weighted partial regular vine copula-based variational long short-term memory (WPVC-VLSTM), integrates variational long short-term memory (LSTM) networks and regular vine copula to model variational sequential dependence degrees and structures. The regular vine copula models nonnormal distributional dependence degrees and structures. VLSTM captures variational long-range dependencies coupling high-dimensional dynamic hidden variables without strong hypotheses and multivariate constraints. WPVC-VLSTM outperforms benchmarks, including linear models, stochastic volatility models, deep neural networks, and variational recurrent networks, in terms of both technical significance and portfolio forecasting performance. WPVC-VLSTM represents a step forward for CMM and deep variational learning.

9.
Article in English | MEDLINE | ID: mdl-37235465

ABSTRACT

Deep neural networks for image classification only learn to map in-distribution inputs to their corresponding ground-truth labels in training without differentiating out-of-distribution samples from in-distribution ones. This results from the assumption that all samples are independent and identically distributed (IID) without distributional distinction. Therefore, a pretrained network learned from in-distribution samples treats out-of-distribution samples as in-distribution and makes high-confidence predictions on them in the test phase. To address this issue, we draw out-of-distribution samples from the vicinity distribution of training in-distribution samples for learning to reject the prediction on out-of-distribution inputs. A cross-class vicinity distribution is introduced by assuming that an out-of-distribution sample generated by mixing multiple in-distribution samples does not share the same classes as its constituents. We thus improve the discriminability of a pretrained network by finetuning it with out-of-distribution samples drawn from the cross-class vicinity distribution, where each out-of-distribution input corresponds to a complementary label. Experiments on various in-/out-of-distribution datasets show that the proposed method significantly outperforms the existing methods in improving the capacity of discriminating between in- and out-of-distribution samples.
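The idea of mixing in-distribution samples from different classes and attaching a complementary label can be sketched roughly as follows. This is only an illustration under stated assumptions: the convex combination with Dirichlet weights, the function name `mix_cross_class`, and the toy data are all hypothetical, and the paper's actual sampling scheme and loss are not given in the abstract.

```python
import numpy as np


def mix_cross_class(x, y, num_classes, rng, k=2):
    """Mix k in-distribution samples taken from k distinct classes.

    Returns the mixed sample and a boolean mask marking the constituent
    classes, i.e. the classes the mixed sample is assumed NOT to belong to
    (its complementary label). Assumption: a random convex combination is
    used as the mixing rule.
    """
    classes = rng.choice(np.unique(y), size=k, replace=False)
    # Pick one random sample index from each chosen class.
    picks = [rng.choice(np.flatnonzero(y == c)) for c in classes]
    mix_weights = rng.dirichlet(np.ones(k))  # non-negative, sums to 1
    mixed = np.tensordot(mix_weights, x[picks], axes=1)
    complementary = np.zeros(num_classes, dtype=bool)
    complementary[classes] = True
    return mixed, complementary


# Hypothetical toy data: six 2-D samples in three classes.
rng = np.random.default_rng(0)
x = np.arange(12.0).reshape(6, 2)
y = np.array([0, 0, 1, 1, 2, 2])
mixed, complementary = mix_cross_class(x, y, num_classes=3, rng=rng)
```

During finetuning, a network would then be penalized for assigning probability to any class flagged in `complementary` for that mixed input.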

10.
IEEE Trans Pattern Anal Mach Intell ; 45(7): 8888-8901, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37015685

ABSTRACT

In deep neural learning, a discriminator trained on in-distribution (ID) samples may make high-confidence predictions on out-of-distribution (OOD) samples. This raises a significant issue for robust, trustworthy and safe deep learning. The issue is primarily caused by the limited ID samples observable in training the discriminator when OOD samples are unavailable. We propose a general approach for fine-tuning discriminators by implicit generators (FIG). FIG is grounded in information theory and applicable to standard discriminators without retraining. It improves the ability of a standard discriminator to distinguish between ID and OOD samples by generating and penalizing its specific OOD samples. According to the Shannon entropy, an energy-based implicit generator is inferred from a discriminator without extra training costs. Then, a Langevin dynamic sampler draws specific OOD samples for the implicit generator. Lastly, we design a regularizer fitting the design principle of the implicit generator to induce high entropy on those generated OOD samples. Experiments on different networks and datasets demonstrate that FIG achieves state-of-the-art OOD detection performance.

11.
Article in English | MEDLINE | ID: mdl-37962995

ABSTRACT

The integrity of training data, even when annotated by experts, is far from guaranteed, especially for non-independent and identically distributed (non-IID) datasets comprising both in- and out-of-distribution samples. In an ideal scenario, the majority of samples would be in-distribution, while samples that deviate semantically would be identified as out-of-distribution and excluded during the annotation process. However, experts may erroneously classify these out-of-distribution samples as in-distribution, assigning them labels that are inherently unreliable. This mixture of unreliable labels and varied data types makes the task of learning robust neural networks notably challenging. We observe that both in- and out-of-distribution samples can almost invariably be ruled out from belonging to certain classes, aside from those corresponding to unreliable ground-truth labels. This opens the possibility of utilizing reliable complementary labels that indicate the classes to which a sample does not belong. Guided by this insight, we introduce a novel approach, termed gray learning (GL), which leverages both ground-truth and complementary labels. Crucially, GL adaptively adjusts the loss weights for these two label types based on prediction confidence levels. By grounding our approach in statistical learning theory, we derive bounds for the generalization error, demonstrating that GL achieves tight constraints even in non-IID settings. Extensive experimental evaluations reveal that our method significantly outperforms alternative approaches grounded in robust statistics.

12.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 15743-15758, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37792646

ABSTRACT

The discrepancy between in-distribution (ID) and out-of-distribution (OOD) samples can lead to distributional vulnerability in deep neural networks, which can subsequently lead to high-confidence predictions for OOD samples. This is mainly due to the absence of OOD samples during training, which fails to constrain the network properly. To tackle this issue, several state-of-the-art methods add extra OOD samples to training and assign them manually defined labels. However, this practice can introduce unreliable labeling, negatively affecting ID classification. The distributional vulnerability presents a critical challenge for non-IID deep learning, which aims for OOD-tolerant ID classification by balancing ID generalization and OOD detection. In this paper, we introduce a novel supervision adaptation approach to generate adaptive supervision information for OOD samples, making them more compatible with ID samples. First, we measure the dependency between ID samples and their labels using mutual information, revealing that the supervision information can be represented in terms of negative probabilities across all classes. Second, we investigate data correlations between ID and OOD samples by solving a series of binary regression problems, with the goal of refining the supervision information for more distinctly separable ID classes. Our extensive experiments on four advanced network architectures, two ID datasets, and eleven diversified OOD datasets demonstrate the efficacy of our supervision adaptation approach in improving both ID classification and OOD detection capabilities.

13.
IEEE Trans Neural Netw Learn Syst ; 34(4): 1864-1878, 2023 Apr.
Article in English | MEDLINE | ID: mdl-33729957

ABSTRACT

Sequence analysis handles sequential discrete events and behaviors, which can be represented by temporal point processes (TPPs). However, TPPs model only occurring events and behaviors. This article explores an efficient method for negative sequential pattern (NSP) mining to leverage TPPs in modeling both frequently occurring and nonoccurring events and behaviors. NSP mining is good at the challenging modeling of nonoccurrences of events and behaviors and their combinations with occurring events, with existing methods built on incorporating various constraints into NSP representations, e.g., simplifying NSP formulations and reducing computational costs. Such constraints restrict the flexibility of NSPs, and nonoccurring behaviors (NOBs) cannot be comprehensively exposed. This article addresses this issue by loosening some inflexible constraints in NSP mining and solves a series of consequent challenges. First, we provide a new definition of negative containment using set theory according to the loose constraints. Second, an efficient method quickly calculates the supports of negative sequences. Our method only uses the information about the corresponding positive sequential patterns (PSPs) and avoids additional database scans. Finally, a novel algorithm, NegI-NSP, is proposed to efficiently identify highly valuable NSPs. Theoretical analyses, comparisons, and experiments on four synthetic and two real-life data sets clearly show that NegI-NSP can efficiently discover more useful NOBs.

14.
Comput Biol Med ; 155: 106586, 2023 03.
Article in English | MEDLINE | ID: mdl-36774888

ABSTRACT

Mortality prediction is crucial to evaluate the severity of illness and assist in improving the prognosis of patients. In clinical settings, one way is to analyze the multivariate time series (MTSs) of patients based on their medical data, such as heart rates and invasive mean arterial blood pressure. However, this suffers from sparse, irregularly sampled, and incomplete data issues. These issues can compromise the performance of follow-up MTS-based analytic applications. Plenty of existing methods try to deal with such irregular MTSs with missing values by capturing the temporal dependencies within a time series, yet in-depth research on modeling inter-MTS couplings remains rare and lacks model interpretability. To this end, we propose a bidirectional time and multi-feature attention coupled network (BiT-MAC) to capture the temporal dependencies (i.e., intra-time series coupling) and the hidden relationships among variables (i.e., inter-time series coupling) with a bidirectional recurrent neural network and multi-head attention, respectively. The resulting intra- and inter-time series coupling representations are then fused to estimate the missing values for a more robust MTS-based prediction. We evaluate BiT-MAC by applying it to the missing-data corrupted mortality prediction on two real-world clinical datasets, i.e., PhysioNet'2012 and COVID-19. Extensive experiments demonstrate the superiority of BiT-MAC over cutting-edge models, verifying the great value of the deep and hidden relations captured by MTSs. The interpretability of features is further demonstrated through a case study.


Subject(s)
COVID-19, Humans, Time Factors, Heart Rate, Neural Networks (Computer)
15.
Transl Oncol ; 35: 101714, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37331103

ABSTRACT

Persistent human papillomavirus (HPV) infection is necessary for the development of cervical cancer. An increasing number of retrospective studies have found that the depletion of Lactobacillus microbiota in the cervico-vagina facilitates HPV infection and might be involved in viral persistence and cancer development. However, there have been no reports confirming the immunomodulatory effects of Lactobacillus microbiota isolated from cervico-vaginal samples of women with HPV clearance. Using cervico-vaginal samples from women with persistent HPV infection and women with HPV clearance, this study investigated the local immune properties of the cervical mucosa. As expected, type I interferons, such as IFN-α and IFN-β, and TLR3 were globally downregulated in the HPV+ persistence group. Luminex cytokine/chemokine panel analysis revealed that L. jannaschii LJV03, L. vaginalis LVV03, L. reuteri LRV03, and L. gasseri LGV03 isolated from cervico-vaginal samples of women with HPV clearance altered the host's epithelial immune response, particularly L. gasseri LGV03. Furthermore, L. gasseri LGV03 enhanced the poly (I:C)-induced production of IFN by modulating the IRF3 pathway and attenuated the poly (I:C)-induced production of proinflammatory mediators by regulating the NF-κB pathway in Ect1/E6E7 cells, indicating that L. gasseri LGV03 keeps the innate system alert to potential pathogens and reduces the inflammatory effects during persistent pathogen infection. L. gasseri LGV03 also markedly inhibited the proliferation of Ect1/E6E7 cells in a zebrafish xenograft model, which may be attributed to an increased immune response mediated by L. gasseri LGV03.

16.
Sci Rep ; 12(1): 5891, 2022 04 07.
Article in English | MEDLINE | ID: mdl-35393500

ABSTRACT

The COVID-19 pandemic has posed significant challenges in modeling its complex epidemic transmissions, infection and contagion, which are very different from known epidemics. The challenges in quantifying COVID-19 complexities include effectively modeling its process and data uncertainties. The uncertainties are embedded in implicit and high-proportion undocumented infections, asymptomatic contagion, social reinforcement of infections, and various quality issues in the reported data. These uncertainties were even more apparent in the first 2 months of the COVID-19 pandemic, when the relevant knowledge, case reporting and testing were all limited. Here we introduce a novel hybrid approach, SUDR, by expanding the foundational compartmental epidemic Susceptible-Infected-Recovered (SIR) model with two compartments to a Susceptible-Undocumented infected-Documented infected-Recovered (SUDR) model. First, SUDR characterizes and distinguishes Undocumented (U) and Documented (D) infections commonly seen during COVID-19 incubation periods and asymptomatic infections. Second, SUDR characterizes the probabilistic density of infections by capturing exogenous processes like clustering contagion interactions, superspreading, and social reinforcement. Lastly, SUDR approximates the density likelihood of COVID-19 prevalence over time by incorporating Bayesian inference into SUDR. Different from existing COVID-19 models, SUDR characterizes the undocumented infections during unknown transmission processes. To capture the uncertainties of temporal transmission and social reinforcement during COVID-19 contagion, the transmission rate is modeled by a time-varying density function of undocumented infectious cases. By sampling from the mean-field posterior distribution with reasonable priors, SUDR handles the randomness, noise and sparsity of COVID-19 observations widely seen in the public COVID-19 case data. The results demonstrate a deeper quantitative understanding of the above uncertainties, in comparison with classic SIR, time-dependent SIR, and probabilistic SIR models.
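As background for the baseline that SUDR extends, the classic deterministic SIR model can be sketched in a few lines. This is only the textbook SIR system integrated with forward Euler steps; it contains none of SUDR's undocumented/documented split, time-varying transmission rate, or Bayesian inference, whose equations the abstract does not give. The function name and all parameter values are illustrative assumptions.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the classic SIR equations with forward Euler steps.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I

    Returns a list of (S, I, R) tuples, one per day including day 0.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population, constant in this model
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Illustrative parameters only; not fitted to any COVID-19 data.
history = simulate_sir(beta=0.3, gamma=0.1, s0=999, i0=1, r0=0, days=60)
```

In this baseline every infection is observed and the transmission rate `beta` is a fixed constant, which are the two simplifications the abstract says SUDR relaxes.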


Subject(s)
COVID-19, Asymptomatic Infections/epidemiology, Bayes Theorem, COVID-19/epidemiology, Humans, Pandemics, Social Reinforcement, SARS-CoV-2
17.
Article in English | MEDLINE | ID: mdl-36215383

ABSTRACT

Real-life events, behaviors, and interactions produce sequential data. An important but rarely explored problem is to analyze those nonoccurring (also called negative) yet important sequences, forming negative sequence analysis (NSA). A typical NSA area is to discover negative sequential patterns (NSPs) consisting of important nonoccurring and occurring elements and patterns. The limited existing work on NSP mining relies on frequentist and downward closure property-based pattern selection, producing large and highly redundant sets of NSPs that are nonactionable for business decision-making. This work makes the first attempt at actionable NSP discovery. It builds an NSP graph representation, quantifies both explicit occurrence and implicit nonoccurrence-based element and pattern relations, and then discovers significant, diverse, and informative NSPs in the NSP graph to represent the entire NSP set for discovering actionable NSPs. A DPP-based NSP representation and actionable NSP discovery method, EINSP, introduces novel and significant contributions to NSA and sequence analysis: 1) it represents NSPs by a determinantal point process (DPP)-based graph; 2) it quantifies actionable NSPs in terms of their statistical significance, diversity, and strength of explicit/implicit element/pattern relations; and 3) it models and measures both explicit and implicit element/pattern relations in the DPP-based NSP graph to represent direct and indirect couplings between NSP items, elements, and patterns. We substantially analyze the effectiveness of EINSP in terms of various theoretical and empirical aspects, including complexity, item/pattern coverage, pattern size and diversity, implicit pattern relation strength, and data factors.

18.
PLoS One ; 17(1): e0263010, 2022.
Article in English | MEDLINE | ID: mdl-35085347

ABSTRACT

Automated next-best action recommendation for each customer in a sequential, dynamic and interactive context is widely needed in natural, social and business decision-making. Personalized next-best action recommendation must involve past, current and future customer demographics and circumstances (states) and behaviors, long-range sequential interactions between customers and decision-makers, multi-sequence interactions between states, behaviors and actions, and their reactions to their counterpart's actions. No existing modeling theories and tools, including Markovian decision processes, user and behavior modeling, deep sequential modeling, and personalized sequential recommendation, can quantify such complex decision-making on a personal level. We take a data-driven approach to learn the next-best actions for personalized decision-making by a reinforced coupled recurrent neural network (CRN). CRN represents multiple coupled dynamic sequences of a customer's historical and current states, responses to decision-makers' actions, and decision rewards to actions, and learns long-term multi-sequence interactions between parties (customer and decision-maker). Next-best actions are then recommended for each customer at a time point to change their state for an optimal decision-making objective. Our study demonstrates the potential of personalized deep learning of multi-sequence interactions and automated dynamic intervention for personalized decision-making in complex systems.


Subject(s)
Computer-Assisted Decision Making, Theoretical Models, Neural Networks (Computer)
19.
IEEE Trans Pattern Anal Mach Intell ; 44(1): 533-549, 2022 Jan.
Article in English | MEDLINE | ID: mdl-32750827

ABSTRACT

Complex categorical data is often hierarchically coupled, with heterogeneous relationships between attributes and attribute values and couplings between objects. Such value-to-object couplings are heterogeneous, with complementary and inconsistent interactions and distributions. The limited existing research on unlabeled categorical data representation ignores the heterogeneous and hierarchical couplings, underestimates data characteristics and complexities, and overuses redundant information. Deep representation learning of unlabeled categorical data is challenging: it overlooks such value-to-object couplings, complementarity and inconsistency, and requires large data, disentanglement, and high computational power. This work introduces a shallow but powerful UNsupervised heTerogeneous couplIng lEarning (UNTIE) approach for representing coupled categorical data by untying the interactions between couplings and revealing the heterogeneous distributions embedded in each type of coupling. UNTIE is efficiently optimized w.r.t. a kernel k-means objective function for unsupervised representation learning of heterogeneous and hierarchical value-to-object couplings. Theoretical analysis shows that UNTIE can represent categorical data with maximal separability while effectively representing heterogeneous couplings and disclosing their roles in categorical data. The UNTIE-learned representations yield significant performance improvements over state-of-the-art categorical representations and deep representation models on 25 categorical data sets with diversified characteristics.

20.
IEEE Trans Neural Netw Learn Syst ; 33(10): 5125-5137, 2022 10.
Article in English | MEDLINE | ID: mdl-33852391

ABSTRACT

In recommendation, both stationary and dynamic user preferences on items are embedded in the interactions between users and items (e.g., rating or clicking) within their contexts. Sequential recommender systems (SRSs) need to jointly involve such context-aware user-item interactions in terms of the couplings between the user and item features and sequential user actions on items over time. However, such joint modeling is non-trivial and significantly challenges the existing work on preference modeling, which either only models user-item interactions by latent factorization models but ignores user preference dynamics or only captures sequential user action patterns without involving user/item features and context factors and their coupling and influence on user actions. We propose a neural time-aware recommendation network (TARN) with a temporal context to jointly model 1) stationary user preferences by a feature interaction network and 2) user preference dynamics by a tailored convolutional network. The feature interaction network factorizes the pairwise couplings between non-zero features of users, items, and temporal context by the inner product of their feature embeddings while alleviating data sparsity issues. In the convolutional network, we introduce a convolutional layer with multiple filter widths to capture multi-fold sequential patterns, where an attentive average pooling (AAP) obtains significant and large-span feature combinations. To learn the preference dynamics, a novel temporal action embedding represents user actions by incorporating the embeddings of items and temporal context as the inputs of the convolutional network. The experiments on typical public data sets demonstrate that TARN outperforms state-of-the-art methods and show the necessity and contribution of involving time-aware preference dynamics and explicit user/item feature couplings in modeling and interpreting evolving user preferences.
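The phrase "factorizes the pairwise couplings between non-zero features ... by the inner product of their feature embeddings" can be illustrated with a generic factorization-machine-style score. This is a sketch of that general technique only, assuming one embedding row per feature; it is not TARN's network, and the function name, embedding table, and feature indices are hypothetical.

```python
from itertools import combinations

import numpy as np


def pairwise_interaction_score(embeddings, active):
    """Sum of inner products over all pairs of active (non-zero) features.

    Uses the identity
        sum_{i<j} <v_i, v_j> = 0.5 * (||sum_i v_i||^2 - sum_i ||v_i||^2),
    which avoids the explicit double loop over feature pairs.
    """
    v = embeddings[active]   # shape: (num_active, embedding_dim)
    total = v.sum(axis=0)    # shape: (embedding_dim,)
    return 0.5 * (total @ total - (v * v).sum())


# Hypothetical embedding table: 10 features, 4-dimensional embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10, 4))
active = [1, 4, 7]  # e.g. one user feature, one item feature, one time slot

score = pairwise_interaction_score(embeddings, active)

# Brute-force check over explicit pairs, for comparison.
brute_force = sum(
    embeddings[i] @ embeddings[j] for i, j in combinations(active, 2)
)
```

Because only the active features contribute, the cost scales with the number of non-zero features rather than with the full feature vocabulary, which is one reason this form copes with sparse data.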


Subject(s)
Learning, Neural Networks (Computer)