Journal article

When can the two-armed bandit algorithm be trusted?

Abstract:

We investigate the asymptotic behavior of one version of the so-called two-armed bandit algorithm. It is an example of a stochastic approximation procedure whose associated ODE has both a repulsive and an attractive equilibrium, at which the procedure is noiseless. We show that if the gain parameter is constant or goes to 0 not too fast, the algorithm does fall into the noiseless repulsive equilibrium with positive probability, whereas it always converges to its natural attractive target when the...

Publication status:
Published

Publisher copy:
10.1214/105051604000000350

Authors

Institution:
University of Oxford
Department:
Oxford, MPLS, Mathematical Inst
Role:
Author
Journal:
ANNALS OF APPLIED PROBABILITY
Volume:
14
Issue:
3
Pages:
1424-1454
Publication date:
2004-08-05
DOI:
10.1214/105051604000000350
EISSN:
1050-5164
ISSN:
1050-5164
URN:
uuid:6c38dcc1-ceb6-48ba-a3af-ec6bb3d74de2
Source identifiers:
20051
Local pid:
pubs:20051
