Optimality conditions for a Markov decision chain with unbounded costs
Published online by Cambridge University Press: 14 July 2016
Abstract
It is known that, when costs are unbounded, satisfaction of the appropriate dynamic programming ‘optimality’ equation by a policy is not sufficient to guarantee its average optimality. A ‘lowest-order potential’ condition is introduced which, along with the dynamic programming equation, is sufficient to establish the optimality of the policy. It is also shown that, under fairly general conditions, if the lowest-order potential condition is not satisfied, there exists a non-memoryless policy with smaller average cost than the policy satisfying the dynamic programming equation.
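For orientation, the average-cost dynamic programming equation is commonly written in the form sketched here. The notation is assumed for illustration and is not taken from the paper: $S$ is a countable state space, $A(i)$ the set of actions available in state $i$, $c(i,a)$ the one-step cost (possibly unbounded), $p_{ij}(a)$ the transition probabilities, $g$ a constant and $h$ a real-valued function on $S$.

```latex
% Assumed notation (not from the paper): S, A(i), c(i,a), p_{ij}(a), g, h.
\[
  g + h(i) \;=\; \min_{a \in A(i)} \Bigl\{ c(i,a) + \sum_{j \in S} p_{ij}(a)\, h(j) \Bigr\},
  \qquad i \in S .
\]
% Textbook verification step, assuming all expectations are well defined:
% iterating the inequality "<=" along an arbitrary policy \pi for n steps gives
\[
  n g + h(i) \;\le\; \mathbb{E}^{\pi}_{i}\Bigl[\sum_{t=0}^{n-1} c(X_t, a_t)\Bigr]
  + \mathbb{E}^{\pi}_{i}\bigl[h(X_n)\bigr] .
\]
```

A stationary policy $f$ satisfies the equation when $f(i)$ attains the minimum for every $i$. Dividing the second display by $n$ shows that concluding $g$ is the minimal average cost requires some control of $\mathbb{E}^{\pi}_{i}[h(X_n)]/n$, which is not automatic when $h$ is unbounded. This is only the standard argument under the assumed notation; it is not a statement of the paper's lowest-order potential condition.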
- Type: Research Papers
- Copyright: © Applied Probability Trust