
An Inequality for Variances of the Discounted Rewards

Published online by Cambridge University Press:  14 July 2016

Eugene A. Feinberg*
Affiliation:
Stony Brook University
Jun Fei*
Affiliation:
Stony Brook University
*
Postal address: Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794, USA.

Abstract


We consider the following two definitions of discounting: (i) a multiplicative coefficient applied to the rewards, and (ii) the probability that the process has not yet been stopped, where the stopping time has an exponential distribution independent of the process. It is well known that the expected total discounted rewards under these two definitions coincide. In this note we show that the variance of the total discounted rewards is smaller under the first definition than under the second.
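The equality of means and the ordering of variances can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it uses a discrete-time analogue with i.i.d. rewards and discount factor β, where definition (i) weights the reward at step t by β^t, and definition (ii) pays undiscounted rewards up to a geometric stopping time T with P(T > t) = β^t (the discrete counterpart of an independent exponential stopping time). All parameter values are assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 0.9          # assumed discount factor
horizon = 200       # truncation horizon; beta**200 is negligible
n_paths = 100_000   # number of Monte Carlo sample paths

# Illustrative reward process: i.i.d. normal rewards per step.
rewards = rng.normal(loc=1.0, scale=1.0, size=(n_paths, horizon))

# Definition (i): multiplicative discounting, sum_t beta^t * r_t.
discounts = beta ** np.arange(horizon)
total_i = rewards @ discounts

# Definition (ii): rewards accrue while t < T, where T is geometric
# with success probability 1 - beta, so that P(T > t) = beta^t.
T = rng.geometric(1 - beta, size=n_paths)          # T >= 1
mask = np.arange(horizon)[None, :] < T[:, None]    # indicator {t < T}
total_ii = (rewards * mask).sum(axis=1)

# Both means approximate 1 / (1 - beta) = 10; the variance under (i)
# is markedly smaller than under (ii).
print("means:", total_i.mean(), total_ii.mean())
print("variances:", total_i.var(), total_ii.var())
```

In this simple case the gap is easy to see analytically: under (i) the variance is σ²·Σ β^{2t} = σ²/(1 − β²), while under (ii) the randomness of the stopping time adds a term of order Var(T)·(E r)², which dominates when β is close to 1.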

Type
Research Article
Copyright
Copyright © Applied Probability Trust 2009 
