Journal article

NLBAC: a neural ODE-based algorithm for state-wise stable and safe reinforcement learning

Abstract:
Ensuring safety and stability is critical when using reinforcement learning (RL) to control safety-critical systems. However, model-free RL algorithms usually suffer from low sample efficiency, and widely used methods such as dual ascent can be difficult to apply to constrained RL problems because of their sensitivity to hyperparameters. To address these difficulties, we first propose an augmented Lagrangian-based method that maintains safety and stability through state-wise control Lyapunov function (CLF) and pre-defined control barrier function (CBF) constraints in non-constrained Markov decision process (non-CMDP) settings. To handle tasks without pre-defined CBFs, we extend this method by training a barrier certificate jointly with the control policy, with theoretical guarantees of monotonically improved control performance. Moreover, we investigate the infeasibility that can arise when multiple state-wise constraints are present. A practical algorithm, Neural ordinary differential equations-based Lyapunov-Barrier Actor-Critic (NLBAC), is then designed by integrating the proposed method with Soft Actor-Critic (SAC) and using neural ordinary differential equations (NODEs) for system modeling. Comparisons with baselines and ablation experiments demonstrate that our algorithm achieves superior performance in terms of safety and of driving the system towards the desired state, with higher sample efficiency.
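As a rough illustration of the augmented Lagrangian idea mentioned in the abstract, the sketch below applies it to a toy scalar problem: minimize (x - 2)^2 subject to x <= 1. It is not the NLBAC algorithm and does not come from the article; the function name, the toy objective, the constraint, and all parameter values are assumptions chosen only to show how the multiplier update differs from plain dual ascent by adding a quadratic penalty term.

```python
def augmented_lagrangian_demo(rho=1.0, outer_iters=50, inner_iters=200, lr=0.05):
    """Toy example (hypothetical, not from the article).

    Minimize f(x) = (x - 2)^2 subject to g(x) = x - 1 <= 0, using the
    augmented Lagrangian for an inequality constraint:
        L(x, lam) = f(x) + (max(0, lam + rho * g(x))^2 - lam^2) / (2 * rho)
    """
    x, lam = 0.0, 0.0
    for _ in range(outer_iters):
        # Inner loop: approximately minimize L(x, lam) over x by gradient descent.
        for _ in range(inner_iters):
            g = x - 1.0
            shifted = max(0.0, lam + rho * g)
            # dL/dx = f'(x) + max(0, lam + rho * g(x)) * g'(x), with g'(x) = 1.
            grad = 2.0 * (x - 2.0) + shifted
            x -= lr * grad
        # Outer step: multiplier update, projected onto lam >= 0.
        lam = max(0.0, lam + rho * (x - 1.0))
    return x, lam


if __name__ == "__main__":
    x_opt, lam_opt = augmented_lagrangian_demo()
    print(f"x = {x_opt:.6f}, multiplier = {lam_opt:.6f}")
```

For this toy problem the constrained minimizer is x = 1 with multiplier 2, and the iteration approaches both values. In a state-wise RL setting the constraint would instead be evaluated per state on CLF and CBF conditions, which this sketch does not attempt to model.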
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.1016/j.neucom.2025.130041

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Oxford college:
Kellogg College
Role:
Author
ORCID:
0000-0002-3565-8967



Publisher:
Elsevier
Journal:
Neurocomputing
Volume:
638
Article number:
130041
Publication date:
2025-03-26
Acceptance date:
2025-03-15
DOI:
10.1016/j.neucom.2025.130041
EISSN:
1872-8286
ISSN:
0925-2312
