Conference item

An alternative to variance: Gini deviation for risk-averse policy gradient

Abstract:
Restricting the variance of a policy's return is a popular choice in risk-averse Reinforcement Learning (RL) due to its clear mathematical definition and easy interpretability. Traditional methods directly restrict the total return variance. Recent methods restrict the per-step reward variance as a proxy. We thoroughly examine the limitations of these variance-based methods, such as sensitivity to numerical scale and hindering of policy learning, and propose to use an alternative risk measure, Gini deviation, as a substitute. We study various properties of this new risk measure and derive a policy gradient algorithm to minimize it. Empirical evaluation in domains where risk-aversion can be clearly defined shows that our algorithm can mitigate the limitations of variance-based risk measures and achieve high return with low risk, in terms of both variance and Gini deviation, when other methods fail to learn a reasonable policy.
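Illustrative note (not part of the published abstract): the sensitivity to numerical scale mentioned above can be sketched with a small sample estimate. The sketch assumes the common definition of Gini deviation as half the expected absolute difference between two independent copies of the return; the function name `gini_deviation` and the sample returns are hypothetical, and this is not the paper's policy gradient algorithm.

```python
import numpy as np

def gini_deviation(samples):
    """Sample estimate of 0.5 * E|X1 - X2| over all ordered pairs.

    Assumed definition; includes the zero-valued i == j pairs,
    so it is a simple plug-in estimate, not an unbiased one.
    """
    x = np.asarray(samples, dtype=float)
    # Pairwise absolute differences via broadcasting: shape (n, n).
    diffs = np.abs(x[:, None] - x[None, :])
    return 0.5 * diffs.mean()

# Hypothetical episode returns, then the same returns scaled by 10.
returns = np.array([0.0, 1.0, 2.0, 3.0])
scaled = 10.0 * returns

# Gini deviation grows linearly with the scale factor,
# whereas variance grows with its square.
gd_ratio = gini_deviation(scaled) / gini_deviation(returns)
var_ratio = np.var(scaled) / np.var(returns)
print(gd_ratio, var_ratio)
```

Under these assumptions, rescaling rewards by a constant changes the Gini deviation by that constant and the variance by its square, which is one way to read the scale sensitivity the abstract attributes to variance-based methods.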
Publication status:
Published
Peer review status:
Peer reviewed

Publisher copy:
10.52202/075280-2662

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
ORCID:
0009-0000-8297-9045


Funder identifier:
https://ror.org/00t33hh48
Grant:
UDF01002911


Publisher:
Curran Associates
Host title:
Advances in Neural Information Processing Systems 36
Volume:
36
Pages:
60922-60946
Publication date:
2024-07-01
Event title:
37th Conference on Neural Information Processing Systems (NeurIPS 2023)
Event location:
New Orleans, Louisiana, USA
Event website:
https://neurips.cc/Conferences/2023
Event start date:
2023-12-10
Event end date:
2023-12-16
ISSN:
1049-5258
EISBN:
9781713899921


Language:
English
Pubs id:
1994747
Local pid:
pubs:1994747
Deposit date:
2026-06-16