Journal article

An expectation maximization algorithm for continuous Markov decision processes with arbitrary rewards

Abstract:

We derive a new expectation maximization algorithm for policy optimization in linear Gaussian Markov decision processes, where the reward function is parameterized in terms of a flexible mixture of Gaussians. This approach exploits both analytical tractability and numerical optimization. Consequently, on the one hand, it is more flexible and general than closed-form solutions, such as the widely used linear quadratic Gaussian (LQG) controllers. On the other hand, it is more accurate and faste...

Journal: Journal of Machine Learning Research
Volume: 5
Pages: 232-239
Publication date: 2009-01-01
EISSN: 1533-7928
ISSN: 1532-4435
URN: uuid:8d413ac0-b40c-4db3-93b9-43933a17311e
Source identifiers: 341630
Local pid: pubs:341630
Language: English
