1.
IEEE Trans Neural Netw Learn Syst ; 34(8): 4763-4775, 2023 Aug.
Article in English | MEDLINE | ID: mdl-34780337

ABSTRACT

The contextual bandit is a popular sequential decision-making framework for balancing the exploration-exploitation tradeoff in applications such as recommender systems and search engines. Motivated by two important factors in real-world applications, namely that 1) latent contexts (or features) often exist and 2) feedback often comes from humans in the loop, which introduces human biases, we formulate a generalized contextual bandit framework with latent contexts. Our proposed framework includes a two-layer, interpretable probabilistic model of human feedback with latent features. We design the GCL-PS algorithm for the proposed framework, which uses posterior sampling to balance exploration and exploitation. We prove a sublinear regret upper bound for GCL-PS, and we prove a lower bound for the proposed bandit framework that sheds light on the optimality of GCL-PS. To improve the computational efficiency of GCL-PS, we propose a Markov chain Monte Carlo (MCMC) algorithm to generate approximate samples, resulting in our GCL-PSMC algorithm. We not only prove a sublinear Bayesian regret upper bound for GCL-PSMC, but also reveal insights into the tradeoff between computational efficiency and sequential decision accuracy. Finally, we apply the proposed framework to hotel recommendations and news article recommendations, and show its superior performance over a variety of baselines via experiments on two public datasets.
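
As a rough illustration of the posterior sampling idea mentioned in the abstract, the sketch below runs Beta-Bernoulli Thompson sampling on a toy multi-armed bandit. It is not GCL-PS: it has no contexts, no latent features, and no human-feedback model. The function name, the arm means, and the horizon are all hypothetical choices made for the example.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Beta-Bernoulli posterior sampling on a toy bandit (illustrative only).

    true_means: hypothetical success probability of each arm; the learner
    never reads these directly, it only sees sampled 0/1 rewards.
    Returns (pull_counts, cumulative_expected_regret).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms  # number of observed rewards equal to 1, per arm
    failures = [0] * n_arms   # number of observed rewards equal to 0, per arm
    pulls = [0] * n_arms
    best_mean = max(true_means)
    regret = 0.0

    for _ in range(horizon):
        # Draw one sample per arm from its Beta(1 + s, 1 + f) posterior
        # (uniform Beta(1, 1) prior), then play the arm with the largest draw.
        draws = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        arm = max(range(n_arms), key=draws.__getitem__)

        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        regret += best_mean - true_means[arm]

    return pulls, regret


if __name__ == "__main__":
    pulls, regret = thompson_sampling([0.2, 0.5, 0.8], horizon=2000, seed=1)
    print("pulls per arm:", pulls)
    print("cumulative expected regret:", round(regret, 2))
```

Sampling from the posterior, instead of always playing the arm with the best estimate, is what lets exploration fade out on its own as the posteriors concentrate.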

2.
IEEE Trans Neural Netw Learn Syst ; 33(12): 7448-7460, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34111014

ABSTRACT

Generating more revenue is crucial to cloud providers. Evidence from the Amazon cloud system indicates that "dynamic pricing" would be more profitable than "static pricing." The challenges are: how to set the price in real time so as to maximize revenue, and how to estimate the price-dependent demand so as to optimize the pricing decision. We first design a discrete-time dynamic pricing scheme and formulate a Markov decision process to characterize the evolving dynamics of the price-dependent demand. We formulate a revenue maximization framework to determine the optimal price and theoretically characterize the "structure" of the optimal revenue and optimal price. We apply Q-learning to infer the optimal price from historical transaction data and derive sufficient conditions on the model that guarantee its convergence to the optimal price, although it converges slowly. To speed up the convergence, we incorporate the structure of the optimal revenue that we obtained earlier, leading to the VpQ-learning (Q-learning with value projection) algorithm. We derive sufficient conditions under which the VpQ-learning algorithm converges to the optimal policy. Experiments on a real-world dataset show that the VpQ-learning algorithm outperforms a variety of baselines: it improves revenue by as much as 50% over Q-learning, speedy Q-learning, and adaptive real-time dynamic programming (ARTDP), and by as much as 20% over the fixed pricing scheme.
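
To make the Q-learning step concrete, here is a minimal sketch of plain tabular Q-learning on a toy pricing problem. It is not the paper's model and does not implement VpQ-learning or its value projection. Everything in it is an assumption made for illustration: two demand states, two prices, the purchase and transition probabilities, the uniformly random logging policy standing in for "historical transaction data", and the step-size schedule.

```python
import random

# Hypothetical toy model, not taken from the paper.
PRICES = [1.0, 2.0]                   # price menu, indexed by action
BUY_PROB = [[0.8, 0.2], [1.0, 0.9]]   # BUY_PROB[demand_state][action]
P_HIGH_NEXT = [0.6, 0.4]              # P(next demand is high | action)


def step(state, action, rng):
    """Simulate one pricing period and return (revenue, next_state)."""
    sold = rng.random() < BUY_PROB[state][action]
    revenue = PRICES[action] if sold else 0.0
    next_state = 1 if rng.random() < P_HIGH_NEXT[action] else 0
    return revenue, next_state


def q_learning(n_steps=100_000, gamma=0.9, seed=0):
    """Tabular Q-learning from transitions logged under random prices."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]      # q[state][action]
    visits = [[0, 0], [0, 0]]
    state = 0                         # start in the low-demand state
    for _ in range(n_steps):
        # Q-learning is off-policy, so the data may come from any policy
        # that tries every price in every state; here it is uniform.
        action = rng.randrange(len(PRICES))
        revenue, next_state = step(state, action, rng)

        visits[state][action] += 1
        alpha = visits[state][action] ** -0.7   # decaying step size
        target = revenue + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Index of the highest-valued price in each demand state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q = q_learning()
    print("Q-table:", q)
    print("greedy price index per demand state:", greedy_policy(q))
```

In this toy model the low price is better in the low-demand state and the high price is better in the high-demand state, so the learned greedy policy can be checked against that. The slow convergence noted in the abstract shows up here as the large number of logged transitions needed even for a four-entry table.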

3.
Sci Rep ; 9(1): 19819, 2019 Dec 19.
Article in English | MEDLINE | ID: mdl-31852974

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

4.
Phys Rev E ; 100(5-1): 052311, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31870000

ABSTRACT

In contemporary society, understanding how information, such as trends and viruses, spreads in various social networks is an important topic in many areas. However, it is difficult to measure mathematically how widely the information has spread, especially for a general network structure. There have been studies on opinion spreading, but many are limited to specific spreading models, such as the susceptible-infected-recovered model and the independent cascade model, and are difficult to apply to other situations. In this paper, we first propose a general opinion spreading model (GOSM) that generalizes a large class of popular spreading models. In this model, each node is in one of several states, and the state changes through interaction with neighboring nodes at discrete time intervals. Next, we show that many GOSMs have a stability property that is a GOSM version of uniform equicontinuity. Then, we provide a method to approximate the expected spread size for stable GOSMs. For this method, we propose a concentration theorem that guarantees that a generalized mean-field theorem calculates the expected spreading size within small error bounds for finite time steps on a slightly dense network structure. Furthermore, we prove that a "single simulation" run of the Monte Carlo method is sufficient to approximate the expected spreading size. We conduct experiments on both synthetic and real-world networks and show that our generalized approximation method predicts the state density of the various models well, especially in graphs with a large number of nodes. Experimental results show that the generalized mean-field approximation and a single Monte Carlo simulation converge, as stated in the concentration theorem.
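
The comparison between a mean-field recursion and a single Monte Carlo run can be sketched on one concrete spreading model. The example below uses a discrete-time susceptible-infected-recovered (SIR) process on a dense Erdős-Rényi random graph. It is only one instance of the kind of model the abstract describes, not the GOSM itself or the paper's generalized mean-field theorem, and the graph size, edge probability, transmission probability, and recovery probability are all hypothetical values.

```python
import random


def random_graph(n, p, rng):
    """Adjacency lists of an Erdős-Rényi graph G(n, p)."""
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def simulate_once(adj, i0, b, gamma, steps, rng):
    """One Monte Carlo run; returns the susceptible fraction per step."""
    n = len(adj)
    state = ["I" if v < i0 else "S" for v in range(n)]
    s_frac = [state.count("S") / n]
    for _ in range(steps):
        infected = [v for v in range(n) if state[v] == "I"]
        # Count the infected neighbors of every susceptible node.
        exposure = {}
        for v in infected:
            for u in adj[v]:
                if state[u] == "S":
                    exposure[u] = exposure.get(u, 0) + 1
        new_state = state[:]  # synchronous update from the state at time t
        for u, k in exposure.items():
            # Each infected neighbor transmits independently with prob. b.
            if rng.random() < 1 - (1 - b) ** k:
                new_state[u] = "I"
        for v in infected:
            if rng.random() < gamma:
                new_state[v] = "R"
        state = new_state
        s_frac.append(state.count("S") / n)
    return s_frac


def mean_field(n, p, i0, b, gamma, steps):
    """Deterministic recursion on the susceptible and infected fractions."""
    s, i = 1 - i0 / n, i0 / n
    s_frac = [s]
    for _ in range(steps):
        # Replace each node's infected-neighbor count by its expected value.
        q = 1 - (1 - b) ** (p * (n - 1) * i)
        s, i = s - s * q, i + s * q - gamma * i
        s_frac.append(s)
    return s_frac


def compare(n=1000, p=0.1, i0=50, b=0.003, gamma=0.1, steps=80, seed=0):
    rng = random.Random(seed)
    adj = random_graph(n, p, rng)
    mc = simulate_once(adj, i0, b, gamma, steps, rng)
    mf = mean_field(n, p, i0, b, gamma, steps)
    return mc, mf


if __name__ == "__main__":
    mc, mf = compare()
    for t in range(0, len(mc), 10):
        print(t, round(mc[t], 3), round(mf[t], 3))
```

With a mean degree of about 100, each node's neighborhood is a reasonable sample of the whole population, which is why the single run and the recursion are expected to stay close here. On a sparse graph the same comparison would likely show larger gaps.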


Subject(s)
Theoretical Models, Public Opinion, Social Network

5.
Sci Rep ; 7(1): 3723, 2017 Jun 16.
Article in English | MEDLINE | ID: mdl-28623348

ABSTRACT

This paper establishes a Markov chain model as a unified framework for describing the evolution processes in complex networks. The unique feature of the proposed model is its ability to capture the formation mechanism behind the "trichotomy" observed in degree distributions, from which closed-form solutions can be derived. Important special cases of the proposed unified framework are the classical models, including networks with Poisson, exponential, and power-law degree distributions. Both simulation and experimental results demonstrate a good match of the proposed model with real datasets, showing its superiority over the classical models. Implications of the model for various applications, including citation analysis, online social networks, and vehicular network design, are also discussed in the paper.
