Conference item
A baseline for any order gradient estimation in stochastic computation graphs
- Abstract:
- By enabling correct differentiation in Stochastic Computation Graphs (SCGs), the infinitely differentiable Monte Carlo estimator (DiCE) can generate correct estimates for the higher-order gradients that arise in, e.g., multi-agent reinforcement learning and meta-learning. However, the baseline term in DiCE that serves as a control variate for reducing variance applies only to first-order gradient estimation, limiting the utility of higher-order gradient estimates. To improve the sample effici...
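For readers unfamiliar with the operator the abstract refers to, the core of DiCE is the "magic box" trick: an expression that evaluates to 1 in the forward pass but reinjects the score-function term under differentiation of any order. A minimal sketch in JAX follows (an illustration only, not the authors' implementation; the names `magic_box` and `surrogate` are chosen here for exposition):

```python
import jax
import jax.numpy as jnp

def magic_box(tau):
    """DiCE 'magic box': evaluates to 1, but carries the gradient of tau."""
    # exp(tau - stop_gradient(tau)) is exactly 1 in the forward pass,
    # while its derivative w.r.t. tau is exp(tau - stop_gradient(tau)),
    # i.e. 1 at evaluation, so multiplying an objective by it leaves the
    # value unchanged but makes dependence on tau visible to autodiff.
    return jnp.exp(tau - jax.lax.stop_gradient(tau))

def surrogate(tau):
    # Toy objective: a constant sampled cost that does not itself depend
    # on tau (here tau stands in for a summed log-probability).
    cost = 3.0
    return magic_box(tau) * cost

value = surrogate(2.0)            # forward value is unchanged: 3.0
grad = jax.grad(surrogate)(2.0)   # gradient picks up cost * 1 = 3.0
```

Because `magic_box` is built from ordinary differentiable primitives, the same surrogate can be differentiated repeatedly, which is what makes the estimator "any order".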
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Authors
Funding
NVIDIA
Bibliographic Details
- Publisher:
- Proceedings of Machine Learning Research
- Journal:
- Proceedings of Machine Learning Research
- Volume:
- 97
- Pages:
- 4343-4351
- Host title:
- Proceedings of Machine Learning Research
- Publication date:
- 2019-05-24
- Acceptance date:
- 2019-05-14
- Source identifiers:
- 998015
Item Description
- Pubs id:
- pubs:998015
- UUID:
- uuid:39df29a4-1e9a-4e06-a6a0-291dd673c682
- Local pid:
- pubs:998015
- Deposit date:
- 2019-05-14
Terms of use
- Copyright holder:
- Mao, J., et al.
- Copyright date:
- 2019
- Notes:
- © Mao, J., et al. Conference paper presented at the 36th International Conference on Machine Learning (ICML 2019), 10-15 June 2019, Long Beach, California. The final published version and supplementary materials are available online from Proceedings of Machine Learning Research at: http://proceedings.mlr.press/v97/mao19a.html