V-OCBF: Learning Safe Filters from Offline Data via Value-Guided Offline Control Barrier Functions

Mumuksh Tayal; Manan Tayal; Ravi Prakash

Published: 11 Nov 2025, Last Modified: 16 Jan 2026. Venue: DAI (Oral). License: CC BY 4.0

Keywords: Safe Control, HJ Reachability, Value Function, Control Barrier Function

TLDR: We propose a framework to learn a Control Barrier Function from offline demonstrations using value functions to address hard safety constraints.

Abstract: Deploying autonomous systems in safety-critical domains requires hard, state-wise safety guarantees. Most offline RL methods enforce only soft constraints, which permits a non-zero risk of violation, and they struggle with high-dimensional systems and complex actuation limits. We introduce the Value-Guided Offline Control Barrier Function (V-OCBF), a framework that draws on Hamilton-Jacobi reachability to learn a forward-invariant, actuation-aware Neural CBF from offline demonstrations. Our pipeline yields an optimal control barrier function that identifies the maximal safe set and can be certified as a valid Neural CBF, encouraging hard safety that respects control limits without risky online interaction. We first validate V-OCBF on a simple Dubins car and then evaluate it on several high-dimensional Gymnasium tasks (HalfCheetah, Hopper, Swimmer, Ant, Walker2D), where it consistently outperforms prior methods in safety and effectiveness.
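For readers unfamiliar with how a control barrier function acts as a safety filter, the following is a minimal sketch of the generic, textbook construction, not the V-OCBF method itself. It assumes a 2D single-integrator system (x_dot = u), a hand-written barrier h(x) = ||x - c||^2 - r^2 around a circular obstacle rather than a learned Neural CBF, and no actuation limits. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def barrier(x, center, radius):
    """Hand-written barrier (an assumption): h(x) >= 0 outside the obstacle."""
    d = x - center
    return float(d @ d - radius ** 2)

def cbf_filter(x, u_nom, center, radius, alpha=1.0):
    """Minimally modify u_nom so that h_dot + alpha * h >= 0.

    For x_dot = u the condition is a @ u + b >= 0 with
    a = grad h(x) = 2 (x - c) and b = alpha * h(x).
    With a single affine constraint, the QP solution is a projection.
    """
    a = 2.0 * (x - center)
    b = alpha * barrier(x, center, radius)
    slack = float(a @ u_nom + b)
    if slack >= 0.0 or float(a @ a) < 1e-12:
        return u_nom                      # nominal control already satisfies the condition
    return u_nom - slack * a / (a @ a)    # project onto the constraint boundary

def simulate(steps=2000, dt=0.01):
    """Drive toward a goal behind the obstacle; return the smallest h seen."""
    center, radius = np.zeros(2), 1.0
    x, goal = np.array([-2.0, 0.1]), np.array([2.0, 0.0])
    min_h = barrier(x, center, radius)
    for _ in range(steps):
        u = cbf_filter(x, goal - x, center, radius)  # proportional nominal controller
        x = x + dt * u                               # forward Euler step
        min_h = min(min_h, barrier(x, center, radius))
    return min_h
```

Calling `simulate()` returns the smallest barrier value along the trajectory; a non-negative value means the filtered trajectory never entered the obstacle. The setting described in the abstract is harder than this sketch because the barrier is learned from offline data and must remain valid under control limits.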

Submission Number: 25
