ABSTRACT
There is increasing interest in moving away from "one size fits all (OSFA)" approaches toward stratifying treatment decisions. Understanding how expected effectiveness and cost-effectiveness vary with patient covariates is a key aspect of stratified decision making. Recently proposed machine learning (ML) methods can learn heterogeneity in outcomes without pre-specifying subgroups or functional forms, enabling the construction of decision rules ('policies') that map individual covariates into a treatment decision. However, these methods do not yet integrate ML estimates into a decision modeling framework in order to reflect long-term policy-relevant outcomes and synthesize information from multiple sources. In this paper, we propose a method to integrate ML and decision modeling when individual patient data are available to estimate treatment-specific survival time. We also propose a novel implementation of policy tree algorithms to define subgroups using decision model output. We demonstrate these methods using SPRINT (the Systolic Blood Pressure Intervention Trial), comparing outcomes for "standard" and "intensive" blood pressure targets. We find that incorporating ML into a decision model can change the estimate of incremental net health benefit (INHB) for OSFA policies. We also find evidence that stratifying treatment using subgroups defined by a tree-based algorithm can increase the estimates of the INHB.
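As a rough illustration of the quantity discussed above, the sketch below computes INHB with the standard definition (incremental health effect minus incremental cost divided by a willingness-to-pay threshold) and derives a stratified policy from it. The subgroup names, increments, and threshold are hypothetical placeholders, not SPRINT results, and the code is not the authors' implementation.

```python
def incremental_net_health_benefit(delta_qalys: float, delta_cost: float,
                                   threshold: float) -> float:
    """INHB = incremental QALYs minus incremental cost expressed in QALYs."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return delta_qalys - delta_cost / threshold


# Hypothetical increments for intensive vs. standard targets (illustrative only).
subgroups = {
    "A": {"delta_qalys": 0.30, "delta_cost": 10_000.0},
    "B": {"delta_qalys": 0.05, "delta_cost": 10_000.0},
}
THRESHOLD = 100_000.0  # assumed willingness to pay per QALY

inhb = {
    name: incremental_net_health_benefit(v["delta_qalys"], v["delta_cost"], THRESHOLD)
    for name, v in subgroups.items()
}

# Stratified policy: choose the intensive target only where INHB is positive.
policy = {name: "intensive" if value > 0 else "standard"
          for name, value in inhb.items()}
print(inhb, policy)
```

With these made-up numbers an OSFA "intensive" policy would give up health in subgroup B, which is the kind of loss a stratified rule is meant to avoid.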
Subjects
Cost-Benefit Analysis , Decision Support Techniques , Machine Learning , Humans , Algorithms , Male , Female
ABSTRACT
A carbon tax and a decarbonization subsidy form an effective policy mix for reducing carbon emissions. However, there is a research gap between deterministic, static analyses of carbon reduction policy instruments and the dynamic green transition influenced by stochastic factors. This research applies stochastic optimal control theory to investigate optimal dynamic carbon reduction strategies that develop green technologies, increase abatement inputs, and reduce carbon emissions. Firms that are incentivized by decarbonization subsidies and regulated by a carbon tax choose optimal closed-loop control strategies for abatement inputs to maximize profit subject to carbon reduction constraints. Explicit solutions for the optimal carbon tax and decarbonization subsidy are provided. The simulation results illustrate that the optimal policy mix is feasible in the effective period when the carbon emission decreases significantly, which indicates that the abatement policy mix can effectively promote carbon reduction. Our results reveal that the dynamic optimal policy mix is conducive to achieving carbon abatement goals under capital uncertainty. The government should implement a dynamic carbon tax and decarbonization subsidy policy mix simultaneously, in association with optimal closed-loop carbon reduction strategies. Firms with asymmetric decarbonization efficiency can then transition progressively to cleaner production patterns.
Subjects
Carbon , Government , Computer Simulation , Policy , Technology , China
ABSTRACT
Alzheimer's Disease (AD) is believed to be the most common type of dementia. Even though screening for AD has been discussed widely, there is no screening program implemented as part of a policy in any country. Current medical research motivates focusing on the preclinical stages of the disease in a modeling initiative. We develop a partially observable Markov decision process model to determine optimal screening programs. The model contains disease-free and preclinical AD partially observable states, and the screening decision is taken while an individual is in one of those states. An observable diagnosed preclinical AD state is integrated along with observable mild cognitive impairment, AD and death states. Transition probabilities among states are estimated using data from the Knight Alzheimer's Disease Research Center (KADRC) and relevant literature. With the objective of maximizing expected total quality-adjusted life years (QALYs), the output of the model is an optimal screening program that specifies at what points in time an individual over 50 years of age with a given risk of AD will be directed to undergo screening. The screening test used to diagnose preclinical AD is imperfect and carries a positive disutility; its sensitivity and specificity are estimated using the KADRC data set. We study the impact of a potential intervention with a parameterized effectiveness and disutility on model outcomes for three different risk profiles (low, medium and high). When intervention effectiveness and disutility are at their best, the optimal screening policy is to screen every year between ages 50 and 95, with an overall QALY gain of 0.94, 1.9 and 2.9 for low, medium and high risk profiles, respectively. As intervention effectiveness diminishes and/or its disutility increases, the optimal policy changes to sporadic screening and then to never screening. Under several scenarios, some screening within the time horizon is optimal from a QALY perspective. Moreover, an in-depth analysis of costs reveals that implementing these policies is either cost-saving or cost-effective.
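To make the role of an imperfect test in a partially observable model concrete, the following minimal sketch applies Bayes' rule to the belief that an individual is in the preclinical AD state rather than the disease-free state. It ignores state transitions between screenings, and the prior, sensitivity, and specificity values are hypothetical; none are KADRC estimates or part of the authors' model.

```python
def update_belief(prior_preclinical: float, sensitivity: float,
                  specificity: float, test_positive: bool) -> float:
    """Posterior probability of preclinical AD after one imperfect test result."""
    if test_positive:
        like_preclinical = sensitivity          # true positive rate
        like_disease_free = 1.0 - specificity   # false positive rate
    else:
        like_preclinical = 1.0 - sensitivity    # false negative rate
        like_disease_free = specificity         # true negative rate
    numerator = like_preclinical * prior_preclinical
    evidence = numerator + like_disease_free * (1.0 - prior_preclinical)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return numerator / evidence


# Hypothetical inputs: 20% prior risk, 80% sensitivity, 90% specificity.
after_positive = update_belief(0.2, 0.8, 0.9, test_positive=True)
after_negative = update_belief(0.2, 0.8, 0.9, test_positive=False)
print(round(after_positive, 3), round(after_negative, 3))
```

A full POMDP solver would combine this update with the transition probabilities and a value function over beliefs; only the observation step is shown here.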
Subjects
Alzheimer Disease , Humans , Middle Aged , Aged , Aged, 80 and over , Alzheimer Disease/diagnosis , Sensitivity and Specificity , Cost-Benefit Analysis , Quality-Adjusted Life Years , Markov Chains
ABSTRACT
We analyze the role of disease containment policy in the form of treatment in a stochastic economic-epidemiological framework in which the probability of the occurrence of random shocks is state-dependent, namely it is related to the level of disease prevalence. Random shocks are associated with the diffusion of a new strain of the disease which affects both the number of infectives and the growth rate of infection, and the probability that such shocks are realized may be either increasing or decreasing in the number of infectives. We determine the optimal policy and the steady state of such a stochastic framework, which is characterized by an invariant measure supported on strictly positive prevalence levels, suggesting that complete eradication is never a possible long-run outcome and that endemicity will prevail instead. Our results show that: (i) independently of the features of the state-dependent probabilities, treatment shifts the support of the invariant measure leftward; and (ii) the features of the state-dependent probabilities affect the shape and spread of the distribution of disease prevalence over its support, allowing for a steady-state outcome characterized by a distribution that is either highly concentrated over low prevalence levels or more spread out over a larger range of (possibly higher) prevalence levels.
ABSTRACT
Agriculture is undergoing transformation in sub-Saharan Africa, where millions still do not have access to a healthy diet. Policy makers in this region should find ways to accelerate agricultural transformation while increasing access to healthy diets. Optimizing agriculture's public budget stands out as a practical option. By combining a dynamic computable general equilibrium model and a multi-criteria decision-making technique, and applying them in the context of Ethiopia, this paper points to an important trade-off that policy makers should keep in mind. An optimal allocation of agriculture's public budget aimed at increasing agri-food output, creating off-farm jobs and reducing rural poverty, which are agricultural transformation objectives, will help to reduce the cost of a healthy diet, allowing around 2 million more Ethiopians to afford it. This number could be even higher should policy makers allocate the budget optimally with the sole aim of lowering the cost of a healthy diet, but at the cost of reducing household income and slowing down transformation.
ABSTRACT
Poker has been considered a challenging problem in both artificial intelligence and game theory because it is characterized by imperfect information and uncertainty, which are shared by many realistic problems such as auctioning, pricing, cyber security, and operations. However, it is not yet clear whether playing an equilibrium policy in multi-player games is wise, and it is infeasible to theoretically validate whether a policy is optimal. Therefore, designing an effective optimal policy learning method has more practical significance. This paper proposes an optimal policy learning method for multi-player poker games based on Actor-Critic reinforcement learning. First, this paper builds the Actor network to make decisions with imperfect information and the Critic network to evaluate policies with perfect information. Second, this paper proposes two novel multi-player poker policy update methods: the asynchronous policy update algorithm (APU) and the dual-network asynchronous policy update algorithm (Dual-APU), for multi-player multi-policy scenarios and multi-player sharing-policy scenarios, respectively. Finally, this paper uses the most popular six-player Texas hold 'em poker to validate the performance of the proposed optimal policy learning method. The experiments demonstrate that the policies learned by the proposed methods perform well and gain steadily compared with the existing approaches. In sum, the policy learning methods for imperfect information games based on Actor-Critic reinforcement learning perform well on poker and can be transferred to other imperfect information games. Training with perfect information and testing with imperfect information in this way offers an effective and explainable approach to learning an approximately optimal policy.
ABSTRACT
We investigate the optimal response of unemployment insurance (UI) to economic shocks, both with and without commitment. The optimal policy with commitment follows a modified Baily-Chetty formula that accounts for job search responses to future UI benefit changes. As a result, the optimal policy with commitment tends to front-load UI, unlike the optimal discretionary policy. In response to shocks intended to mimic those that induced the COVID-19 recession, we find that a large and transitory increase in UI is optimal, and that a policy rule contingent on the change in unemployment, rather than its level, is a good approximation to the optimal policy.
ABSTRACT
Wireless sensors are becoming essential in machine-type communications and the Internet of Things. Spectral efficiency and energy efficiency are the key performance metrics considered when determining the effectiveness of sensor networks. In this paper, we present several power-splitting solutions to maximize the average harvested energy under a rate constraint when both information and power are transmitted through the same wireless channel to a sensor (i.e., a receiver). More specifically, we first design the optimal dynamic power-splitting policy, which decides the optimal fraction of the received signal power used for energy harvesting at the receiver. As effective solutions, we propose two types of single-threshold-based power-splitting policies, namely Policies I and II, which switch between energy harvesting and information decoding by comparing the received signal power with given thresholds. Additionally, we perform an asymptotic analysis for a large number of packets, along with practical statistics-based policies. Finally, we demonstrate the effectiveness of the proposed power-splitting solutions in terms of the rate-energy trade-off.
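The rate-energy trade-off behind power splitting can be sketched with the textbook receiver model: a fraction rho of the received power is harvested and the remainder is decoded at the Shannon rate. This is a simplified illustration under assumed parameters (`noise_power`, `efficiency`), and it enforces the rate target per sample rather than on average as in the abstract; it is not the paper's optimal policy or its Policies I and II.

```python
import math
from typing import Optional, Tuple


def split_outcome(received_power: float, rho: float,
                  noise_power: float = 1.0,
                  efficiency: float = 1.0) -> Tuple[float, float]:
    """Return (harvested energy, rate in bits/s/Hz) for a splitting ratio rho."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    harvested = efficiency * rho * received_power
    rate = math.log2(1.0 + (1.0 - rho) * received_power / noise_power)
    return harvested, rate


def max_rho_for_rate(received_power: float, target_rate: float,
                     noise_power: float = 1.0) -> Optional[float]:
    """Largest rho that still meets target_rate, or None if it is unreachable."""
    if received_power <= 0.0:
        raise ValueError("received_power must be positive")
    required_snr = 2.0 ** target_rate - 1.0
    available_snr = received_power / noise_power
    if available_snr < required_snr:
        return None  # even rho = 0 cannot deliver the target rate
    return 1.0 - required_snr / available_snr


rho = max_rho_for_rate(received_power=15.0, target_rate=2.0)
harvested, rate = split_outcome(15.0, rho)
print(rho, harvested, rate)
```

Raising the rate target lowers the feasible rho and therefore the harvested energy, which is the trade-off the proposed policies navigate.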
ABSTRACT
Pharmaceutical spending in the United States, Canada, and the EU is growing. Public payers cover a large portion of these costs and have responded by instituting various pricing and access policies to limit their expenditure. One challenge that public payers face is additional demand induced by a manufacturer's marketing effort. We use a game theoretic approach to study the impact of pharmaceutical marketing on six practical pricing and access policies: negotiated pricing, open pricing, controlled pricing, a listing process, a risk-sharing arrangement, and a value-based pricing with risk-sharing arrangement. We find that all non-value-based policies result in either restricted access or suboptimal treatment coverage. We find that marketing is the highest in the first-best setting where all decisions are made by a social planner. We also find that the value-based pricing with risk-sharing arrangement is preferred by the manufacturer and from a societal perspective whereas no policy is universally preferred by a health care payer. A value-based pricing with risk-sharing arrangement always results in zero net monetary benefit for a health care payer. Therefore, considering non-value-based arrangements, we find that a negotiated pricing policy, a controlled pricing policy, or a risk-sharing arrangement may be socially preferred.
Subjects
Drug Costs , Drug Industry/economics , Economics, Pharmaceutical/statistics & numerical data , Health Services Accessibility/economics , Marketing of Health Services , Models, Economic , Social Security , Benchmarking , Costs and Cost Analysis , Decision Making , Health Policy/economics , Humans , Insurance, Pharmaceutical Services/economics
ABSTRACT
This paper considers satellite communication networks in which each satellite terminal is equipped with energy harvesting (EH) devices that supply energy continuously and randomly transmits bursty packets to a geostationary satellite over a shared wireless channel. Packet replicas combined with a successive iteration cancellation scheme can reduce the negative impact of packet collisions but consume more energy. Hence, appropriate energy management policies are required to mitigate the adverse effect of energy outages. Although centralized access schemes can provide better network throughput, they require extra signalling to allocate resources, which leads to non-negligible communication latency, especially in satellite communication networks. To reduce the communication overhead and delay, a distributed random access (RA) scheme that accounts for the energy constraints is studied. Each EH satellite terminal (EH-ST) decides whether to transmit a packet and how many replicas to transmit according to its local energy and EH rate in order to maximize the average long-term network throughput. Owing to the nonconvexity of this problem, we adopt a game-theoretic method to approximate the optimal solution. By forcing all the EH-STs to employ the same policy, we characterize the symmetric Nash equilibrium (NE) of the game and prove its existence and uniqueness. Moreover, an efficient algorithm is proposed to calculate the symmetric NE by combining a policy iteration algorithm and the bisection method. The performance of the proposed RA scheme is investigated via numerous simulations. Simulation results show that the proposed RA scheme is applicable to EH devices in future low-cost interactive satellite communication systems.
ABSTRACT
Ecological balance and stable economic development are crucial for fisheries. This study proposes a predator-prey system for marine communities in which the growth of predators follows the Allee effect and which takes into account the rapid fluctuations in resource prices caused by supply and demand. The system predicts the existence of a catastrophic equilibrium, which may lead to the extinction of prey and consequently of predators, while fishing effort remains high. Marine protected areas are established near fishing areas to avoid such situations. Fish migrate rapidly between these two areas and are only harvested in the nonprotected areas. A three-dimensional simplified model is derived by applying variable aggregation to describe the variation of global variables on a slow time scale. To seek conditions that avoid species extinction and maintain sustainable fishing activities, the existence of positive equilibrium points and their local stability are explored based on the simplified model. Moreover, the long-term impact on fishery dynamics of establishing marine protected areas and levying taxes per unit catch is studied, and the optimal tax policy is obtained by applying Pontryagin's maximum principle. The theoretical analysis and numerical examples of this study demonstrate the overall effectiveness of increasing the proportion of marine protected areas and controlling taxes for the sustainable development of fisheries.
Subjects
Conservation of Natural Resources , Fisheries , Fishes , Animals , Fisheries/economics , Fisheries/statistics & numerical data , Conservation of Natural Resources/economics , Conservation of Natural Resources/methods , Predatory Behavior , Models, Biological , Taxes , Population Dynamics/statistics & numerical data
ABSTRACT
We consider the scheduling of battery charging of electric vehicles (EVs) integrated with renewable power generation. The increasing adoption of EVs and the development of renewable energy make this research increasingly important. Optimizing the charging schedule is challenging because of the large action space, the multi-stage decision making, and the high uncertainty. Solving this problem is time-consuming when the scale of the system is large. A practical and efficient method to properly schedule the charging of EVs is urgently needed. The contribution of this work is threefold. First, we provide a sufficient condition under which the charging of EVs can be completely self-sustained by distributed generation. An algorithm is proposed to obtain the optimal charging policy when the sufficient condition holds. Second, we investigate the scenario in which the supply of renewable power generation is deficient. We prove that when the renewable generation is deterministic, there exists an optimal policy which follows the modified least laxity and longer remaining processing time first (mLLLP) rule. Third, we provide an adaptive rule-based algorithm which obtains a near-optimal charging policy efficiently in general situations. We test the proposed algorithm by numerical experiments. The results show that it performs better than the other existing rule-based methods.
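The idea behind a laxity-based charging rule can be illustrated with a plain least-laxity-first scheduler that breaks ties in favor of the longer remaining processing time. This is a generic sketch, not the paper's mLLLP rule, whose modifications the abstract does not spell out; the `Vehicle` class, the slot-based time model, and the supply profile are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Vehicle:
    name: str
    remaining: int  # charging slots still needed
    deadline: int   # slot index by which charging must be finished


def laxity(vehicle: Vehicle, now: int) -> int:
    """Slack left if the vehicle charged in every slot from now on."""
    return (vehicle.deadline - now) - vehicle.remaining


def pick_vehicles(vehicles: List[Vehicle], now: int, capacity: int) -> List[Vehicle]:
    """Least laxity first; ties go to the longer remaining processing time."""
    pending = [v for v in vehicles if v.remaining > 0]
    ranked = sorted(pending, key=lambda v: (laxity(v, now), -v.remaining))
    return ranked[:capacity]


def simulate(vehicles: List[Vehicle], supply: List[int]) -> Dict[str, int]:
    """supply[t] is how many vehicles the generation can charge in slot t."""
    for now, capacity in enumerate(supply):
        for vehicle in pick_vehicles(vehicles, now, capacity):
            vehicle.remaining -= 1
    return {v.name: v.remaining for v in vehicles}


fleet = [Vehicle("a", 2, 2), Vehicle("b", 1, 3), Vehicle("c", 2, 4)]
print(simulate(fleet, supply=[1, 2, 1, 1]))
```

In this toy instance every vehicle finishes by its deadline; with a tighter supply profile the same rule shows which vehicles are left short.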
ABSTRACT
Avoiding physical contact is regarded as one of the safest and most advisable strategies to follow to reduce pathogen spread. The flip side of this approach is that a lack of social interactions may negatively affect other dimensions of health, for example by inducing immunosuppressive anxiety and depression or by preventing important interactions with a diversity of microbes, which may be necessary to train our immune system or to maintain its normal levels of activity. These may in turn negatively affect a population's susceptibility to infection and the incidence of severe disease. We suggest that future pandemic modelling may benefit from relying on 'SIR+ models': epidemiological models extended to account for the benefits of social interactions that affect immune resilience. We develop an SIR+ model and discuss which specific interventions may be more effective in balancing the trade-off between minimizing pathogen spread and maximizing other interaction-dependent health benefits. Our SIR+ model reflects the idea that health is not merely the absence of disease, but rather a state of physical, mental and social well-being that can also depend on the same social connections that allow pathogen spread, and the modelling of public health interventions for future pandemics should account for this multidimensionality.
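For orientation, the baseline that an SIR+ model extends can be sketched as a discrete-time SIR model in which a `contact_scale` factor stands in for reduced social interaction. The abstract does not specify how the "+" terms (interaction-dependent immune benefits) enter, so they are deliberately omitted; the parameter values and function names below are hypothetical.

```python
from typing import Tuple

State = Tuple[float, float, float]  # fractions (susceptible, infected, recovered)


def sir_step(s: float, i: float, r: float, beta: float, gamma: float,
             contact_scale: float = 1.0, dt: float = 1.0) -> State:
    """One forward-Euler step of the classic SIR equations."""
    new_infections = beta * contact_scale * s * i * dt
    new_recoveries = gamma * i * dt
    return s - new_infections, i + new_infections - new_recoveries, r + new_recoveries


def peak_prevalence(beta: float, gamma: float, contact_scale: float,
                    steps: int = 300, i0: float = 0.01) -> Tuple[float, State]:
    """Run the model and return the highest infected fraction and final state."""
    s, i, r = 1.0 - i0, i0, 0.0
    peak = i
    for _ in range(steps):
        s, i, r = sir_step(s, i, r, beta, gamma, contact_scale)
        peak = max(peak, i)
    return peak, (s, i, r)


peak_full, final_full = peak_prevalence(beta=0.3, gamma=0.1, contact_scale=1.0)
peak_half, final_half = peak_prevalence(beta=0.3, gamma=0.1, contact_scale=0.5)
print(peak_full, peak_half)
```

In this baseline, halving contacts only lowers the epidemic peak; an SIR+ formulation would add a health cost to the same reduction, which is what creates the trade-off described above.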
Subjects
Public Health , Humans , Disease Susceptibility , Epidemiological Models , Pandemics/prevention & control , Social Interaction , COVID-19/epidemiology , COVID-19/prevention & control
ABSTRACT
This paper provides a framework for understanding optimal lockdowns and makes three contributions. First, it theoretically analyzes lockdown policies and argues that policy makers systematically enact lockdowns that are too strict because their incentives are misaligned with achieving desired ends and they cannot adapt to changing circumstances. Second, it provides a benchmark to determine how strongly policy makers in different locations should respond to COVID-19. Finally, it provides a framework for understanding how, when, and why lockdown policy is expected to change.
ABSTRACT
Mass public quarantining, colloquially known as a lock-down, is a non-pharmaceutical intervention to check the spread of disease. This paper presents ESOP (Epidemiologically and Socio-economically Optimal Policies), a novel application of active machine learning techniques using Bayesian optimization that interacts with an epidemiological model to arrive at lock-down schedules that optimally balance the public health benefits and the socio-economic downsides of reduced economic activity during lock-down periods. The utility of ESOP is demonstrated using case studies with VIPER (Virus-Individual-Policy-EnviRonment), a stochastic agent-based simulator that this paper also proposes. However, ESOP is flexible enough to interact with arbitrary epidemiological simulators in a black-box manner and to produce schedules that involve multiple phases of lock-downs.