
A MARKOV CHAIN CHOICE PROBLEM

Published online by Cambridge University Press:  10 December 2012

Sheldon M. Ross*
Affiliation:
Epstein Department of Industrial and Systems Engineering, University of Southern California, Los Angeles, CA 90089, USA. E-mail: [email protected]

Abstract

Consider two independent Markov chains having states 0 and 1, with identical transition probabilities. At each stage one of the chains is observed, and a reward equal to the observed state is earned. Assuming prior probabilities on the initial states of the chains, it is shown that the myopic policy, which always chooses to observe the chain most likely to be in state 1, stochastically maximizes the sequence of rewards earned in each period.
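The myopic policy described in the abstract can be sketched as a small simulation. This is an illustrative sketch, not the paper's own analysis: the transition matrix `P`, the priors, and the function names are assumptions chosen for demonstration. At each stage the decision maker observes the chain whose current probability of being in state 1 is larger, earns the observed state as a reward, and then propagates both beliefs through the (common) transition matrix.

```python
import random

# Assumed transition matrix for illustration (not from the paper):
# P[i][j] = probability of moving from state i to state j.
P = [[0.7, 0.3],
     [0.4, 0.6]]

def simulate_myopic(p1, p2, n_stages, seed=0):
    """Simulate the myopic policy on two independent two-state chains.

    p1, p2 : prior probabilities that chains 0 and 1 start in state 1.
    Returns the total reward earned over n_stages observations.
    """
    rng = random.Random(seed)
    # Draw the hidden initial states from the priors.
    states = [1 if rng.random() < p1 else 0,
              1 if rng.random() < p2 else 0]
    beliefs = [p1, p2]  # beliefs[i] = P(chain i is currently in state 1)
    total = 0
    for _ in range(n_stages):
        # Myopic choice: observe the chain most likely to be in state 1.
        i = 0 if beliefs[0] >= beliefs[1] else 1
        total += states[i]               # reward equals the observed state
        beliefs[i] = float(states[i])    # the observed state is now known
        # Advance both chains one step and propagate each belief through P.
        for j in (0, 1):
            beliefs[j] = beliefs[j] * P[1][1] + (1 - beliefs[j]) * P[0][1]
            states[j] = 1 if rng.random() < P[states[j]][1] else 0
    return total
```

The belief update for the unobserved chain is the usual one-step prediction: the new probability of state 1 mixes the transition rows according to the current belief. The paper's result concerns stochastic maximization of the whole reward sequence, which a single simulated run only illustrates, not proves.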

Type
Research Article
Copyright
Copyright © Cambridge University Press 2013

