Two-armed bandits with a goal, II. Dependent arms

Donald A. Berry; Bert Fristedt

doi:10.2307/1426751

Published online by Cambridge University Press: 01 July 2016

Donald A. Berry∗, University of Minnesota
Bert Fristedt∗∗, University of Minnesota

∗ Postal address: School of Statistics, University of Minnesota, 270 Vincent Hall, 206 Church St S.E., Minneapolis, MN 55455, U.S.A.
∗∗ Postal address: School of Mathematics, University of Minnesota, 270 Vincent Hall, 206 Church St S.E., Minneapolis, MN 55455, U.S.A.

Abstract

One of two random variables, X and Y, can be selected at each of a possibly infinite number of stages. Depending on the outcome, one's fortune is either increased or decreased by 1. The probability of increase may not be known for either X or Y. The objective is to increase one's fortune to G before it decreases to g, for some integral g and G; either may be infinite.

In Part I (Berry and Fristedt (1980)), the distribution of X is unknown and that of Y is known. In the current part, it is known that either X or Y has probability α of increasing the current fortune by 1 and the other has probability β of increasing the fortune by 1, where α and β are known, but which goes with X is not known. We show that optimal strategies exist in general and find all optimal schemes when α = 0 and when α + β = 1. In both cases myopic strategies are shown to be optimal. A counterexample is used to show that myopic strategies, while intuitively very appealing, are not optimal for general (α, β).
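The dependent-arms setup described above can be illustrated with a minimal sketch. It assumes a prior probability p that X is the arm with success probability α (so Y has β), and it assumes that "myopic" means choosing the arm with the larger current probability of an immediate increase; the paper's exact definition may differ. The function names update_posterior and myopic_choice are hypothetical and chosen for illustration only.

```python
def update_posterior(p, alpha, beta, chose_x, success):
    """Return the posterior probability that X is the alpha-arm.

    p is the prior probability that X increases the fortune with
    probability alpha (and Y with beta); with probability 1 - p the
    roles are swapped. This is plain Bayes' rule, not code from the paper.
    """
    # Success probability of the chosen arm under each hypothesis.
    if chose_x:
        win_if_x_is_alpha, win_if_x_is_beta = alpha, beta
    else:
        win_if_x_is_alpha, win_if_x_is_beta = beta, alpha

    # Likelihood of the observed outcome under each hypothesis.
    like_a = win_if_x_is_alpha if success else 1 - win_if_x_is_alpha
    like_b = win_if_x_is_beta if success else 1 - win_if_x_is_beta

    denom = p * like_a + (1 - p) * like_b
    if denom == 0:
        # The outcome has probability zero under the prior; leave p unchanged.
        return p
    return p * like_a / denom


def myopic_choice(p, alpha, beta):
    """Pick the arm with the larger immediate probability of an increase.

    Assumption: this is one reading of "myopic"; ties go to X arbitrarily.
    """
    win_x = p * alpha + (1 - p) * beta
    win_y = p * beta + (1 - p) * alpha
    return "X" if win_x >= win_y else "Y"
```

For instance, with α = 0 a single increase observed on X makes the posterior that X is the α-arm equal to 0, so from then on X is known to be the β-arm.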

Keywords

ACHIEVING A GOAL; TWO-ARMED BANDITS; HOW TO GAMBLE IF YOU MUST; GAMBLER'S RUIN; SEQUENTIAL DECISIONS; BAYESIAN DECISION MAKING; SEQUENTIAL MEDICAL TREATMENTS; STOCHASTIC CONTROL; OPTIMAL DYNAMIC DESIGNS; MYOPIC STRATEGIES

Type: Research Article
Information: Advances in Applied Probability, Volume 12, Issue 4, December 1980, pp. 958–971

DOI: https://doi.org/10.2307/1426751
Copyright: Copyright © Applied Probability Trust 1980

Footnotes

∗ This author's research sponsored by the NSF under Grant No. MCS 78-02694.

∗∗ This author's research sponsored by the NSF under Grant No. MCS 78-01168 A01.

References

Berry, D. A. (1972) A Bernoulli two-armed bandit. Ann. Math. Statist. 43, 871–897.

Berry, D. A. and Fristedt, B. (1980) Two-armed bandits with a goal, I. One arm known. Adv. Appl. Prob. 12, 775–798.

DeGroot, M. H. (1970) Optimal Statistical Decisions. McGraw-Hill, New York.

Fabius, J. and van Zwet, W. R. (1970) Some remarks on the two-armed bandit. Ann. Math. Statist. 41, 1906–1916.

Feldman, D. (1962) Contributions to the ‘two-armed bandit’ problem. Ann. Math. Statist. 33, 847–856.

Kelly, T. A. (1974) A note on the Bernoulli two-armed bandit. Ann. Statist. 2, 1056–1062.
