Contextual Bandits

Tong Zhang

doi:10.1017/9781009093057.018

17 - Contextual Bandits

Published online by Cambridge University Press: 20 July 2023

Affiliation: Hong Kong University of Science and Technology

Summary

In the standard multiarmed bandit problem, one observes a fixed number of arms. To achieve optimal regret bounds, one estimates confidence intervals of the arms by counting. In the contextual bandit problem, one observes side information for each arm, which can be used as features for more accurate confidence interval estimation. This chapter studies contextual bandit problems with both linear and nonlinear models.
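The contrast drawn in the summary can be illustrated with a minimal sketch. It is not taken from the chapter: the function names `count_radius` and `linear_radius`, the constants `c` and `beta`, and the ridge-regularized design matrix `A` are all assumptions chosen for illustration. The first function is a UCB1-style radius that depends only on how often an arm was pulled; the second is the elliptical width commonly used with linear models, which depends on the arm's feature vector.

```python
import numpy as np


def count_radius(n_pulls, t, c=2.0):
    """Count-based confidence radius sqrt(c * ln(t) / n), UCB1-style.

    n_pulls: array of pull counts per arm; t: current round.
    Counts are clipped at 1 to avoid division by zero (an assumption
    made here for convenience, not part of the source).
    """
    n = np.maximum(np.asarray(n_pulls, dtype=float), 1.0)
    return np.sqrt(c * np.log(t) / n)


def linear_radius(x, A_inv, beta=1.0):
    """Feature-based confidence width beta * sqrt(x^T A^{-1} x).

    A is assumed to be lambda * I plus the sum of outer products of
    previously observed feature vectors (a ridge-regression design matrix).
    """
    x = np.asarray(x, dtype=float)
    return beta * np.sqrt(x @ A_inv @ x)


# Hypothetical history: the direction e1 was observed nine times,
# the direction e2 never, with ridge parameter lambda = 1.
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])
A = np.eye(2) + 9.0 * np.outer(e1, e1)
A_inv = np.linalg.inv(A)

radii = count_radius([1, 4], t=10)   # fewer pulls give a wider interval
width_seen = linear_radius(e1, A_inv)    # well-explored feature direction
width_unseen = linear_radius(e2, A_inv)  # unexplored feature direction
```

In this sketch the count-based radius shrinks only for the arm that was actually pulled, while the feature-based width shrinks for any arm whose features lie in a direction already explored, which is one way to read the summary's claim about more accurate confidence interval estimation.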

Type: Chapter
Information: Mathematical Analysis of Machine Learning Algorithms, pp. 345-372

DOI: https://doi.org/10.1017/9781009093057.018

Publisher: Cambridge University Press

Print publication year: 2023
