
Vanishing discount approximations in controlled Markov chains with risk-sensitive average criterion

Published online by Cambridge University Press:  20 March 2018

Rolando Cavazos-Cadena*
Affiliation:
Universidad Autónoma Agraria Antonio Narro
Daniel Hernández-Hernández**
Affiliation:
Centro de Investigación en Matemáticas
* Postal address: Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Boulevard Antonio Narro 1923, Buenavista, Saltillo, Coah, 25315, México.
** Postal address: Centro de Investigación en Matemáticas, Apartado Postal 402, Guanajuato, Gto, 36000, México. Email address: [email protected]

Abstract

This work concerns Markov decision chains on a finite state space. The decision-maker has a constant and nonnull risk sensitivity coefficient, and the performance of a control policy is measured by two different indices, namely, the discounted and average criteria. Motivated by well-known results for the risk-neutral case, the problem of approximating the optimal risk-sensitive average cost in terms of the optimal risk-sensitive discounted value functions is addressed. Under suitable communication assumptions, it is shown that, as the discount factor increases to 1, appropriate normalizations of the optimal discounted value functions converge to the optimal average cost, and to the functional part of the solution of the risk-sensitive average cost optimality equation.
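The risk-neutral results that motivate the abstract can be illustrated numerically. The sketch below is not taken from the paper: it covers only the classical risk-neutral case, and the two-state, two-action model (the `COST` and `P` tables) is a hypothetical example chosen so that every transition has positive probability. Under those assumptions, it compares the normalized discounted values `(1 - alpha) * V_alpha(x)` and `V_alpha(x) - V_alpha(0)` with the average cost `g` and relative value function `h` obtained by relative value iteration.

```python
# Hypothetical 2-state, 2-action model; all numbers are illustrative assumptions.
COST = [[1.0, 2.0], [3.0, 0.5]]            # COST[x][a]: cost of action a in state x
P = [[[0.9, 0.1], [0.2, 0.8]],             # P[x][a][y]: transition probability x -> y
     [[0.5, 0.5], [0.7, 0.3]]]
STATES = range(2)
ACTIONS = range(2)


def bellman(v, alpha=1.0):
    """Apply the risk-neutral dynamic programming operator to v."""
    return [min(COST[x][a] + alpha * sum(P[x][a][y] * v[y] for y in STATES)
                for a in ACTIONS)
            for x in STATES]


def discounted_value(alpha, tol=1e-9):
    """Value iteration for the optimal alpha-discounted cost (0 < alpha < 1)."""
    v = [0.0 for _ in STATES]
    while True:
        new_v = bellman(v, alpha)
        if max(abs(n - o) for n, o in zip(new_v, v)) < tol:
            return new_v
        v = new_v


def average_cost(iterations=1000):
    """Relative value iteration; returns (g, h) normalized so that h[0] = 0."""
    h = [0.0 for _ in STATES]
    g = 0.0
    for _ in range(iterations):
        t = bellman(h)
        g = t[0]                           # current estimate of the average cost
        h = [t[x] - g for x in STATES]     # re-center at the reference state 0
    return g, h


if __name__ == "__main__":
    g, h = average_cost()
    print("average cost g =", g, " h(1) - h(0) =", h[1])
    for alpha in (0.9, 0.99, 0.999):
        v = discounted_value(alpha)
        print(alpha, (1 - alpha) * v[0], v[1] - v[0])
```

As the discount factor approaches 1, the two printed normalizations should approach `g` and `h(1) - h(0)` respectively. The risk-sensitive setting studied in the paper replaces expected costs with exponential-utility certainty equivalents and is not reproduced here.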

Type
Original Article
Copyright
Copyright © Applied Probability Trust 2018 

