Conference item

A baseline for any order gradient estimation in stochastic computation graphs

Abstract:

By enabling correct differentiation in Stochastic Computation Graphs (SCGs), the infinitely differentiable Monte-Carlo estimator (DiCE) can generate correct estimates for the higher-order gradients that arise in, e.g., multi-agent reinforcement learning and meta-learning. However, the baseline term in DiCE that serves as a control variate for reducing variance applies only to first-order gradient estimation, limiting the utility of higher-order gradient estimates. To improve the sample effici...

Publication status:
Published
Peer review status:
Peer reviewed
Version:
Publisher's Version

Authors


Institution:
University of Oxford
Division:
MPLS Division
Department:
Computer Science
Rocktaschel, T
Al-Shedivat, M
Institution:
University of Oxford
Division:
MPLS Division
Department:
Computer Science
Publisher:
Proceedings of Machine Learning Research
Volume:
97
Pages:
4343-4351
Publication date:
2019-05-24
Acceptance date:
2019-05-14
Pubs id:
pubs:998015
URN:
uri:39df29a4-1e9a-4e06-a6a0-291dd673c682
UUID:
uuid:39df29a4-1e9a-4e06-a6a0-291dd673c682
Local pid:
pubs:998015
