Conference item

DiCE: The infinitely differentiable Monte Carlo estimator

Abstract:
The score function estimator is widely used for estimating gradients of stochastic objectives in stochastic computation graphs (SCGs), e.g., in reinforcement learning and meta-learning. While deriving first-order gradient estimators by differentiating a surrogate loss (SL) objective is computationally and conceptually simple, using the same approach for higher-order derivatives is more challenging. Firstly, analytically deriving and implementing such estimators is laborious and not compliant with automatic differentiation. Secondly, repeatedly applying SL to construct a new objective for each order of derivative involves increasingly cumbersome graph manipulations. Lastly, to match the first-order gradient under differentiation, SL treats part of the cost as a fixed sample, which we show leads to missing and wrong terms in estimators of higher-order derivatives. To address all these shortcomings in a unified way, we introduce DiCE, which provides a single objective that can be differentiated repeatedly, generating correct estimators of derivatives of any order in SCGs. Unlike SL, DiCE relies on automatic differentiation to perform the requisite graph manipulations. We verify the correctness of DiCE both through a proof and through numerical evaluation of the DiCE derivative estimates. We also use DiCE to propose and evaluate a novel approach for multi-agent learning. Our code is available at https://goo.gl/xkkGxN.
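As a rough illustration of the abstract's central claim, the JAX sketch below compares a DiCE-style objective with a naive surrogate loss on a toy problem. It is not the authors' released code. The operator uses the `exp(tau - stop_gradient(tau))` form commonly associated with DiCE. The Bernoulli setup, the cost values, and all function names are assumptions chosen for illustration. Expectations are computed by exact enumeration over the two outcomes rather than by sampling, so the comparison is deterministic.

```python
import jax
import jax.numpy as jnp


def magic_box(tau):
    # Evaluates to 1, but its derivative w.r.t. tau is magic_box(tau) itself,
    # so every differentiation multiplies in another score-function factor.
    return jnp.exp(tau - jax.lax.stop_gradient(tau))


# Hypothetical costs f(x) for the two outcomes of a Bernoulli variable x in {0, 1}.
COSTS = jnp.array([1.0, 3.0])


def log_probs(theta):
    # [log p(x=0), log p(x=1)] with p(x=1) = sigmoid(theta).
    return jnp.stack([jax.nn.log_sigmoid(-theta), jax.nn.log_sigmoid(theta)])


def true_objective(theta):
    # Exact expected cost J(theta) = sum_x p(x; theta) * f(x).
    return jnp.sum(jnp.exp(log_probs(theta)) * COSTS)


def expected_dice(theta):
    # Expectation of the DiCE-style objective over x. The sampling weights are
    # wrapped in stop_gradient because samples are fixed once drawn.
    lp = log_probs(theta)
    weights = jax.lax.stop_gradient(jnp.exp(lp))
    return jnp.sum(weights * magic_box(lp) * COSTS)


def expected_surrogate(theta):
    # Expectation of the naive surrogate loss log p(x; theta) * f(x).
    lp = log_probs(theta)
    weights = jax.lax.stop_gradient(jnp.exp(lp))
    return jnp.sum(weights * lp * COSTS)


def nth_derivative(fn, n):
    # Apply jax.grad n times to a scalar function of a scalar.
    for _ in range(n):
        fn = jax.grad(fn)
    return fn


if __name__ == "__main__":
    theta = 0.5
    for n in (1, 2, 3):
        print(
            n,
            float(nth_derivative(true_objective, n)(theta)),
            float(nth_derivative(expected_dice, n)(theta)),
            float(nth_derivative(expected_surrogate, n)(theta)),
        )
```

In this toy setting the repeatedly differentiated DiCE-style objective should track the exact derivatives of the expected cost at every order, while the surrogate loss agrees only at first order. Its second derivative lacks the squared score-function term.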
Publication status:
Published
Peer review status:
Peer reviewed

Authors


Institution:
University of Oxford
Division:
MPLS Division
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS Division
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS Division
Department:
Computer Science
Role:
Author


Publisher:
Journal of Machine Learning Research
Host title:
35th International Conference on Machine Learning (ICML 2018)
Journal:
35th International Conference on Machine Learning (ICML 2018)
Publication date:
2018-07-03
Acceptance date:
2018-06-12


Pubs id:
pubs:857026
UUID:
uuid:4cc58c06-d591-498a-9d67-05f359356931
Local pid:
pubs:857026
Source identifiers:
857026
Deposit date:
2018-06-12
