
On gradual-impulse control of continuous-time Markov decision processes with exponential utility

Published online by Cambridge University Press: 01 July 2021

Xin Guo* (Tsinghua University)
Aiko Kurushima** (Sophia University)
Alexey Piunovskiy*** (University of Liverpool)
Yi Zhang**** (University of Liverpool)
*Postal address: School of Economics and Management, Tsinghua University, Beijing 100084, China. Email address: [email protected]
**Postal address: Department of Economics, Sophia University, 7-1 Kioi-cho, Chiyoda-ku, Tokyo 102-8554, Japan. Email address: [email protected]
***Postal address: Department of Mathematical Sciences, University of Liverpool, Liverpool L69 7ZL, UK.
****Postal address: Department of Mathematical Sciences, University of Liverpool, Liverpool L69 7ZL, UK.

Abstract

We consider a gradual-impulse control problem of continuous-time Markov decision processes, where the system performance is measured by the expectation of the exponential utility of the total cost. We show, under natural conditions on the system primitives, the existence of a deterministic stationary optimal policy out of a more general class of policies that allow multiple simultaneous impulses, randomized selection of impulses with random effects, and accumulation of jumps. After characterizing the value function using the optimality equation, we reduce the gradual-impulse control problem to an equivalent simple discrete-time Markov decision process, whose action space is the union of the sets of gradual and impulsive actions.
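The performance criterion itself is not displayed on this page. As a rough orientation only, a risk-sensitive total-cost functional of the kind the abstract describes is often written in the following schematic form, where the gradual cost rate $c$, the impulse cost $C$, the impulse epochs $\tau_n$, and the impulses $\xi_n$ are assumed notation for illustration rather than the paper's own:

$$
V(x) \;=\; \inf_{\pi}\; \mathbb{E}_x^{\pi}\!\left[\exp\!\left(\int_0^{\infty} c(X_t, a_t)\,\mathrm{d}t \;+\; \sum_{n} C\bigl(X_{\tau_n-}, \xi_n\bigr)\right)\right].
$$

Minimizing an expectation of this type over the general policy class, and showing the infimum is attained by a deterministic stationary policy, is the existence result the abstract refers to; the reduction in its last sentence then treats a gradual action and an impulse as two branches of a single action set in a discrete-time model.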

Type: Original Article
Copyright: © The Author(s), 2021. Published by Cambridge University Press on behalf of Applied Probability Trust

