Vanishing discount approximations in controlled Markov chains with risk-sensitive average criterion

Rolando Cavazos-Cadena; Daniel Hernández-Hernández

doi:10.1017/apr.2018.10

Published online by Cambridge University Press: 20 March 2018

Rolando Cavazos-Cadena*
Affiliation: Universidad Autónoma Agraria Antonio Narro
Daniel Hernández-Hernández**
Affiliation: Centro de Investigación en Matemáticas
* Postal address: Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Boulevard Antonio Narro 1923, Buenavista, Saltillo, Coah, 25315, México.
** Postal address: Centro de Investigación en Matemáticas, Apartado Postal 402, Guanajuato, Gto, 36000, México. Email address: [email protected]

Abstract

This work concerns Markov decision chains on a finite state space. The decision-maker has a constant and nonnull risk sensitivity coefficient, and the performance of a control policy is measured by two different indices, namely, the discounted and average criteria. Motivated by well-known results for the risk-neutral case, the problem of approximating the optimal risk-sensitive average cost in terms of the optimal risk-sensitive discounted value functions is addressed. Under suitable communication assumptions, it is shown that, as the discount factor increases to 1, appropriate normalizations of the optimal discounted value functions converge to the optimal average cost, and to the functional part of the solution of the risk-sensitive average cost optimality equation.
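The convergence statement above can be explored numerically. The following is a minimal sketch under stated assumptions, not the authors' construction: it assumes the discounted value function V_alpha is the fixed point of the operator T_alpha[V](x) = min_a { C(x,a) + (1/lambda) log sum_y p(y|x,a) exp(lambda * alpha * V(y)) }, one formulation used in this literature, and it assumes a toy two-state, two-action model with strictly positive transition probabilities as a stand-in for the communication conditions. The names `discounted_value`, `average_cost`, `P`, `C`, and `lam` are hypothetical. The average cost is computed by enumerating stationary policies and taking (1/lambda) log of the spectral radius of the matrix with entries exp(lambda * C_f(x)) p_f(y|x).

```python
import itertools
import numpy as np


def discounted_value(P, C, lam, alpha, tol=1e-10, max_iter=200_000):
    """Iterate the assumed operator T_alpha to its fixed point.

    P has shape (actions, states, states); C has shape (states, actions).
    T_alpha is an alpha-contraction in the sup norm, so iteration converges.
    """
    V = np.zeros(C.shape[0])
    for _ in range(max_iter):
        z = lam * alpha * V
        shift = z.max()  # subtract the max before exponentiating, for stability
        # log sum_y P[a, x, y] * exp(z[y]); result has shape (states, actions)
        log_mgf = shift + np.log(np.einsum("axy,y->xa", P, np.exp(z - shift)))
        V_new = (C + log_mgf / lam).min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V


def average_cost(P, C, lam):
    """Smallest value of (1/lam) * log(spectral radius) over stationary policies.

    Assumes every stationary policy induces an irreducible chain, so the
    average cost of each policy does not depend on the initial state.
    """
    n_actions, n_states, _ = P.shape
    rows = np.arange(n_states)
    best = np.inf
    for f in itertools.product(range(n_actions), repeat=n_states):
        acts = list(f)
        P_f = P[acts, rows, :]            # row x is p(. | x, f(x))
        C_f = C[rows, acts]               # cost C(x, f(x))
        Q_f = np.exp(lam * C_f)[:, None] * P_f
        rho = np.max(np.abs(np.linalg.eigvals(Q_f)))
        best = min(best, np.log(rho) / lam)
    return best


# Hypothetical toy model: all transition probabilities are positive.
P = np.array([
    [[0.7, 0.3], [0.4, 0.6]],   # action 0
    [[0.2, 0.8], [0.9, 0.1]],   # action 1
])
C = np.array([[1.0, 2.0], [3.0, 0.5]])  # C[x, a]
lam = 0.5                                # nonnull risk-sensitivity coefficient

g = average_cost(P, C, lam)
for alpha in (0.9, 0.99, 0.999):
    V = discounted_value(P, C, lam, alpha)
    # Normalizations named in the abstract: (1 - alpha) * V and V - V[0].
    print(alpha, (1 - alpha) * V, V - V[0], "average cost:", g)
```

Printing the normalized values for discount factors closer to 1 lets one compare them with the computed average cost. The sketch makes no claim about the rate of convergence, and the relative value function is normalized at state 0 purely by convention.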

Keywords

Exponential utility; certainty equivalent; vanishing discount method; Hölder's inequality; convex function

MSC classification

Secondary: 90C39: Dynamic programming

Type: Original Article
Information: Advances in Applied Probability, Volume 50, Issue 1, March 2018, pp. 204–230

DOI: https://doi.org/10.1017/apr.2018.10
Copyright: © Applied Probability Trust 2018
