When Inaccuracies in Value Functions Do Not Propagate on Optima and Equilibria

Mathematics ◽  
2020 ◽  
Vol 8 (7) ◽  
pp. 1109 ◽  
Author(s):  
Agnieszka Wiszniewska-Matyszkiel ◽  
Rajani Singh

We study general classes of discrete-time dynamic optimization problems and dynamic games with feedback controls. In such problems, the solution is usually found via the Bellman or Hamilton–Jacobi–Bellman equation for the value function in the case of dynamic optimization, and via a set of such coupled equations for dynamic games, which cannot always be solved accurately. We derive general rules stating which kinds of errors in the calculation or computation of the value function do not propagate to errors in the computed optimal control or Nash equilibrium along the corresponding trajectory. This general result concerns not only errors resulting from numerical methods but also errors resulting from preliminary assumptions that replace the actual value functions by a priori assumed constraints on certain subsets. We illustrate the results with a motivating example, the Fish Wars, in which payoffs have singularities.
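A minimal sketch of one benign error of this kind (a toy MDP of my own, not from the paper): shifting the computed value function by a constant changes every action value by the same amount, so the greedy policy, and hence the optimal trajectory, is unchanged.

```python
import numpy as np

# A tiny deterministic MDP: 3 states, 2 actions; illustrative numbers only.
# next_state[s, a] and reward[s, a] define the dynamics and payoffs.
next_state = np.array([[1, 2], [2, 0], [0, 1]])
reward = np.array([[1.0, 0.0], [0.5, 2.0], [0.0, 1.5]])
beta = 0.9  # discount factor

def value_iteration(V0, iters=500):
    V = V0.copy()
    for _ in range(iters):
        V = (reward + beta * V[next_state]).max(axis=1)
    return V

def greedy_policy(V):
    return (reward + beta * V[next_state]).argmax(axis=1)

V = value_iteration(np.zeros(3))
V_err = V + 7.0  # a constant error in the computed value function

# The greedy policy -- and hence the optimal trajectory -- is unchanged.
assert (greedy_policy(V) == greedy_policy(V_err)).all()
```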

PLoS ONE ◽  
2021 ◽  
Vol 16 (11) ◽  
pp. e0260257 ◽
Author(s):  
Leif K. Sandal ◽  
Sturla F. Kvamsdal ◽  
José M. Maroto ◽  
Manuel Morán

An infinite-horizon, multidimensional optimization problem with arbitrary yet finite periodicity in discrete time is considered. The problem can be posed as a set of coupled equations. It is shown that the problem is a special case of a more general class of contraction problems that have unique solutions. Solutions are obtained by considering a vector-valued value function and by using an iterative process. Special cases of the general class of contraction problems include the classical Bellman problem and its stochastic formulations. Thus, our approach can be viewed as an extension of the Bellman problem to the special case of nonautonomy that periodicity represents, and it thereby facilitates consistent and rigorous treatment of, for example, seasonality in discrete-time dynamic optimization, as well as certain types of dynamic games. The contraction approach is illustrated in simple examples. In the main example, an infinite-horizon resource management problem with a periodic price, the optimal exploitation level is found to differ between high- and low-price time intervals, and the solution time paths approach a limit cycle.
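A minimal sketch of the iterative scheme in the simplest such setting (one stock variable, period-two price; the model, grids, and parameters below are illustrative, not the paper's): keep one value function per season and update them jointly until the contraction converges.

```python
import numpy as np

# Periodic Bellman iteration: one value function per season (period p = 2).
# Toy resource model with a seasonal price; all parameters are illustrative.
stock = np.linspace(0.1, 2.0, 60)          # discretized stock grid
harvest = np.linspace(0.0, 1.0, 40)        # harvest choices
price = [1.0, 2.0]                          # low / high season prices
beta, p = 0.95, 2

def step(x, h):
    """Escapement dynamics: logistic growth of what is left after harvest."""
    e = np.maximum(x - h, 1e-6)
    return np.clip(e + 0.5 * e * (1.0 - e / 2.0), stock[0], stock[-1])

V = np.zeros((p, stock.size))               # vector-valued value function
for _ in range(1000):
    V_new = np.empty_like(V)
    for s in range(p):
        # Candidate payoffs for every (stock, harvest) pair in season s.
        h = np.minimum(harvest[None, :], stock[:, None])   # feasible harvest
        x_next = step(stock[:, None], h)
        # Nearest-grid-point approximation of the next state's value.
        idx = np.searchsorted(stock, x_next).clip(0, stock.size - 1)
        Q = price[s] * h + beta * V[(s + 1) % p][idx]
        V_new[s] = Q.max(axis=1)
    V = V_new
```

Because the composed operator over a full period is a contraction, the iteration converges to a unique fixed point, and the resulting optimal harvest differs across the two price seasons.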


Author(s):  
Andreas Heinrich Hamel ◽  
Daniela Visetti

The complete-lattice approach to optimization problems with a vector- or even set-valued objective has already produced a variety of new concepts and results and has been successfully applied in finance, statistics and game theory. For example, the duality issue for multi-criteria and vector optimization problems could be solved using the complete-lattice approach; compare [11]. So far, it has been applied to set-valued dynamic risk measures (in the stochastic case), as discussed by Feinstein, Rudloff and others (see [11], for example), but it has not been applied to deterministic calculus of variations and optimal control problems. In this paper, the following set-valued optimization problem is considered: minimize the functional $$ \overline J_t[y]=\int_0^t \overline L(s,y(s),\dot y(s))\ ds + U_0(y(0)) $$ over all admissible arcs $y$, where $\overline L$ is the multifunction associated to a vector-valued Lagrangian $L$, the integral is in the Aumann sense and $U_0$ is the initial cost. A new concept of \emph{value function} is introduced, for which a Bellman optimality principle holds. The classical Hopf–Lax formula also holds for the generalized value function. Finally, a derivative with respect to time $t$ and a directional derivative with respect to $x$ of the value function are defined, based on ideas close to the concepts in [12]. The value function is proved to be a solution of a suitable Hamilton–Jacobi equation.
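For context (this is the classical scalar case, not the paper's set-valued statement): for a convex Lagrangian depending only on velocity, $L = L(\dot y)$, with initial cost $u_0$, the Hopf–Lax formula reads $$ u(t,x) \;=\; \min_{y}\left\{\, t\, L\!\left(\tfrac{x-y}{t}\right) + u_0(y) \,\right\}, \qquad t>0, $$ and it gives the value function solving the associated Hamilton–Jacobi equation. The paper establishes an analogue of this for the generalized, set-valued value function.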


Author(s):  
Yangchen Pan ◽  
Hengshuai Yao ◽  
Amir-massoud Farahmand ◽  
Martha White

Dyna is an architecture for model-based reinforcement learning (RL), in which simulated experience from a model is used to update policies or value functions. A key component of Dyna is search control, the mechanism that generates the states and actions from which the agent queries the model, and it remains largely unexplored. In this work, we propose to generate such states using the trajectory obtained by Hill Climbing (HC) on the current estimate of the value function. This has the effect of propagating value from high-value regions and of preemptively updating value estimates for the regions the agent is likely to visit next. We derive a noisy projected natural gradient algorithm for hill climbing and highlight a connection to Langevin dynamics. We demonstrate empirically on four classical domains that our algorithm, HC Dyna, obtains significant sample-efficiency improvements. We study the properties of different sampling distributions for search control and find that there appears to be a benefit specifically from using samples generated by climbing on current value estimates from low-value to high-value regions.
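A minimal sketch of the search-control idea under simplifying assumptions (a differentiable value estimate over a continuous state space; `V_hat`, the quadratic form, and all parameters are hypothetical stand-ins, not the paper's algorithm): ascend the value estimate with injected Gaussian noise and a projection onto the valid state set, which is the Langevin-style update the paper connects to.

```python
import numpy as np

rng = np.random.default_rng(0)

def V_hat(s):
    """Stand-in differentiable value estimate (hypothetical)."""
    return -np.sum((s - 1.0) ** 2)

def grad_V(s, eps=1e-5):
    """Finite-difference gradient of the value estimate."""
    g = np.zeros_like(s)
    for i in range(s.size):
        d = np.zeros_like(s)
        d[i] = eps
        g[i] = (V_hat(s + d) - V_hat(s - d)) / (2 * eps)
    return g

def hill_climb_states(s0, n=20, eta=0.05, noise=0.1, lo=-2.0, hi=2.0):
    """Generate search-control states by noisy ascent on V_hat,
    projected back onto the valid state box [lo, hi]^d."""
    states, s = [], s0.copy()
    for _ in range(n):
        s = s + eta * grad_V(s) + noise * rng.standard_normal(s.shape)
        s = np.clip(s, lo, hi)       # projection step
        states.append(s.copy())      # states from which to query the model
    return states

queries = hill_climb_states(np.zeros(2))
```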


2020 ◽  
Vol 9 (2) ◽  
pp. 459-470
Author(s):  
Helin Wu ◽  
Yong Ren ◽  
Feng Hu

In this paper, we investigate a class of Dynkin games under the g-expectation induced by a backward stochastic differential equation (BSDE). The lower and upper value functions are defined, respectively, as $$\underline{V}_t=\mathop{\mathrm{ess\,sup}}_{\tau \in \mathcal{T}_t}\, \mathop{\mathrm{ess\,inf}}_{\sigma \in \mathcal{T}_t}\, \mathcal{E}^g_t[R(\tau,\sigma)] \quad \text{and} \quad \overline{V}_t=\mathop{\mathrm{ess\,inf}}_{\sigma \in \mathcal{T}_t}\, \mathop{\mathrm{ess\,sup}}_{\tau \in \mathcal{T}_t}\, \mathcal{E}^g_t[R(\tau,\sigma)].$$ Under suitable assumptions, a pair of saddle points is obtained and the value function of the Dynkin game, $V(t)=\underline{V}_t=\overline{V}_t$, follows. Furthermore, we also consider the constrained case of the Dynkin game.
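A minimal discrete-time analogue under the classical expectation (replacing the paper's g-expectation; payoffs and the terminal convention are illustrative): on a binomial tree with lower payoff $L_t \le U_t$ upper payoff, the game value satisfies the classical backward recursion $V_t=\min(U_t,\max(L_t,\mathbb{E}[V_{t+1}\mid\mathcal{F}_t]))$, with the game ending at the lower payoff if neither player ever stops.

```python
import numpy as np

# Discrete-time Dynkin game on a symmetric binomial tree (classical
# expectation in place of the paper's g-expectation; payoffs illustrative).
T = 5
x = [np.arange(-t, t + 1, 2, dtype=float) for t in range(T + 1)]  # node values
L = [np.maximum(xt, 0.0) for xt in x]        # stopper's (lower) payoff
U = [np.maximum(xt, 0.0) + 1.0 for xt in x]  # opponent's (upper) payoff, L <= U

V = L[T].copy()  # at the horizon the game ends at the lower payoff
for t in range(T - 1, -1, -1):
    cont = 0.5 * (V[1:] + V[:-1])            # E[V_{t+1} | node] on the tree
    V = np.minimum(U[t], np.maximum(L[t], cont))

print("game value at the root:", V[0])
```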


2010 ◽  
Vol 42 (1) ◽  
pp. 158-182 ◽  
Author(s):  
Kurt Helmes ◽  
Richard H. Stockbridge

A new approach to the solution of optimal stopping problems for one-dimensional diffusions is developed. It arises by embedding the stochastic problem in a linear programming problem over a space of measures. Optimizing over a smaller class of stopping rules provides a lower bound on the value of the original problem. Then the weak duality of a restricted form of the dual linear program provides an upper bound on the value. An explicit formula for the reward earned using a two-point hitting time stopping rule allows us to prove strong duality between these problems and, therefore, to either optimize over these simpler stopping rules or solve the restricted dual program. Each optimization problem is parameterized by the initial value of the diffusion, and thus we are able to construct the value function by solving the family of optimization problems. This methodology requires little regularity of the terminal reward function. When the reward function is smooth, the optimal stopping locations are shown to satisfy the smooth pasting principle. The procedure is illustrated using two examples.
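A minimal sketch for driftless Brownian motion without discounting on a bounded interval (simplifying assumptions of mine, not the paper's general setting): started at $x \in (a,b)$, the process hits $b$ before $a$ with probability $(x-a)/(b-a)$, so the two-point rule's reward has a closed form that can simply be optimized over $(a,b)$.

```python
import numpy as np

def g(x):
    """Terminal reward; illustrative and deliberately non-smooth."""
    return np.maximum(1.0 - np.abs(x - 1.0), 0.0) + 0.3 * (x > 2.5)

def two_point_reward(x, a, b):
    """Expected reward of 'stop at the first exit from (a, b)' for
    driftless Brownian motion started at x, with no discounting."""
    if not (a <= x <= b) or a == b:
        return g(x)
    p = (x - a) / (b - a)            # probability of hitting b before a
    return (1.0 - p) * g(a) + p * g(b)

def value_function(x, grid):
    """Optimize over two-point rules; in this bounded-interval setting
    this recovers the optimal stopping value (the concave envelope of g)."""
    return max(two_point_reward(x, a, b)
               for a in grid if a <= x for b in grid if b >= x)

grid = np.linspace(0.0, 3.0, 301)
V = [value_function(x, grid) for x in np.linspace(0.0, 3.0, 61)]
```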


2009 ◽  
Vol 9 (1) ◽  
Author(s):  
Axel Anderson

This paper characterizes the behavior of value functions in dynamic stochastic discounted programming models near fixed points of the state space. When the second derivative of the flow payoff function is bounded, the value function is proportional to a linear function plus a geometric term. A specific formula for the exponent of this geometric term is provided. This exponent falls continuously in the rate of patience. If the state variable is a martingale, the second derivative of the value function is unbounded. If the state variable is instead a strict local submartingale, then the same holds for the first derivative of the value function. Thus, the proposed approximation is more accurate than a Taylor series approximation. The approximation result is used to characterize locally optimal policies in several fundamental economic problems.
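Schematically (the notation here is mine, not the paper's), the local approximation near a fixed point at $x=0$ takes the form $$ V(x) \;=\; \alpha + \beta x + \gamma\,|x|^{\kappa} + o\!\left(|x|^{\kappa}\right) \quad \text{as } x \to 0, $$ where the exponent $\kappa$ is pinned down by the paper's formula and falls continuously in the rate of patience. Such a term is exactly what a Taylor expansion cannot capture: for instance, $1<\kappa<2$ makes the second derivative of $|x|^{\kappa}$ blow up at $0$, and $\kappa<1$ even the first.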


Author(s):  
Junlong Zhang ◽  
Osman Y. Özaltın

We develop an exact value function-based approach to solve a class of bilevel integer programs with stochastic right-hand sides. We first study structural properties and design two methods to efficiently construct the value function of a bilevel integer program. Most notably, we generalize the integer complementary slackness theorem to bilevel integer programs. We also show that the value function of a bilevel integer program can be characterized by its values on a set of so-called bilevel minimal vectors. We then solve the value function reformulation of the original bilevel integer program with stochastic right-hand sides using a branch-and-bound algorithm. We demonstrate the performance of our solution methods on a set of randomly generated instances. We also apply the proposed approach to a bilevel facility interdiction problem. Our computational experiments show that the proposed solution methods can efficiently solve large-scale instances. The performance of our value function-based approach is relatively insensitive to the number of scenarios, but it is sensitive to the number of constraints with stochastic right-hand sides. Summary of Contribution: Bilevel integer programs arise in many different application areas of operations research including supply chain, energy, defense, and revenue management. This paper derives structural properties of the value functions of bilevel integer programs. Furthermore, it proposes exact solution algorithms for a class of bilevel integer programs with stochastic right-hand sides. These algorithms extend the applicability of bilevel integer programs to a larger set of decision-making problems under uncertainty.
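A minimal brute-force illustration of an integer-program value function over right-hand sides (single level and a tiny instance of my own; the paper's bilevel construction via minimal vectors is more involved): tabulating $z(b) = \max\{c^\top x : Ax \le b,\ x \ge 0 \text{ integer}\}$ shows the nondecreasing, piecewise-constant structure such approaches exploit.

```python
import itertools
import numpy as np

# Value function z(b) = max { c.x : A x <= b, x >= 0 integer } of a tiny
# integer program, tabulated over right-hand sides by brute force.
c = np.array([3, 2])
A = np.array([[2, 1],
              [1, 2]])

def z(b, box=10):
    """Brute-force IP value over a bounded box; x = 0 is always feasible
    for the nonnegative right-hand sides enumerated below."""
    best = -np.inf
    for x in itertools.product(range(box + 1), repeat=2):
        x = np.array(x)
        if np.all(A @ x <= b):
            best = max(best, c @ x)
    return best

# z(b) is nondecreasing and piecewise constant in b.
for b1 in range(5):
    print([z(np.array([b1, b2])) for b2 in range(5)])
```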

