Frame-Based Scene Understanding: Structured Representations for Introspective Perception in Autonomous Driving

Li Liu; Leilani H. Gilpin

Published: 17 Sept 2025, Last Modified: 06 Nov 2025, ACS 2025 Poster, CC BY 4.0

Keywords: frame theory, autonomous system, spatial perception, neuro-symbolic AI

Abstract: Autonomous driving systems excel at low-level perception but often lack a structured, human-interpretable understanding of dynamic scenes, which limits transparency, robustness, and introspection. We present a cognitive framework for frame-based scene understanding that transforms sensor-aligned observations into a hierarchical set of symbolic frames at the sample, object, and scene levels. Our pipeline constructs ego-referenced trajectories, applies rule-based behavior parsing, and produces natural language descriptors aligned with symbolic slot-filler structures. These representations support introspective capabilities such as expectation-driven anomaly detection, reasoning over uncertainty, and queryable explanations. We further integrate frames with large language models to probe symbolic-to-language reasoning tasks (summarization, intent inference, and counterfactuals) without raw sensor input. We describe implementation details, visualization tools, and application use cases, and outline an evaluation protocol combining qualitative case studies and task-based assessments. This work takes a step toward hybrid neuro-symbolic cognition for autonomy, enabling interpretable, reflective scene understanding and human-aligned communication.
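To make the sample/object/scene hierarchy and the expectation-driven anomaly check more concrete, the following is a minimal Python sketch. It is not the authors' implementation: every class name, slot, behavior label, threshold, and axis convention here (`SampleFrame`, `ObjectFrame`, `SceneFrame`, `parse_behavior`, `find_anomalies`, `still_eps`) is a hypothetical chosen for illustration, and the behavior rule is a deliberately simple stand-in for the paper's rule-based parsing.

```python
from dataclasses import dataclass, field


@dataclass
class SampleFrame:
    """One timestamped observation of an object, in ego-referenced coordinates."""
    t: float  # seconds
    x: float  # metres ahead of the ego vehicle (assumed convention)
    y: float  # metres to the left of the ego vehicle (assumed convention)


@dataclass
class ObjectFrame:
    """Slot-filler structure summarizing one tracked object across its samples."""
    object_id: str
    category: str
    samples: list[SampleFrame]
    behavior: str = "unknown"  # filled in by parse_behavior


@dataclass
class SceneFrame:
    """Scene-level frame: object frames plus expected behaviors per category."""
    objects: list[ObjectFrame] = field(default_factory=list)
    expected: dict[str, set[str]] = field(default_factory=dict)


def parse_behavior(obj: ObjectFrame, still_eps: float = 0.5) -> str:
    """Hypothetical rule: label an object by its net displacement relative to ego."""
    first, last = obj.samples[0], obj.samples[-1]
    dx, dy = last.x - first.x, last.y - first.y
    if abs(dx) < still_eps and abs(dy) < still_eps:
        return "holding_distance"
    if abs(dy) > abs(dx):
        return "crossing"
    return "closing" if dx < 0 else "opening"


def find_anomalies(scene: SceneFrame) -> list[str]:
    """Fill each behavior slot, then flag fillers that violate scene expectations."""
    anomalies = []
    for obj in scene.objects:
        obj.behavior = parse_behavior(obj)
        allowed = scene.expected.get(obj.category)
        if allowed is not None and obj.behavior not in allowed:
            anomalies.append(f"{obj.object_id}: {obj.category} is {obj.behavior}")
    return anomalies


# Usage with made-up data: a pedestrian moving laterally and a lead vehicle
# whose gap to ego shrinks by 5 m.
ped = ObjectFrame("ped_1", "pedestrian",
                  [SampleFrame(0.0, 10.0, -3.0), SampleFrame(1.0, 10.2, 1.0)])
lead = ObjectFrame("lead_1", "lead_vehicle",
                   [SampleFrame(0.0, 20.0, 0.0), SampleFrame(1.0, 15.0, 0.0)])
scene = SceneFrame([ped, lead],
                   {"lead_vehicle": {"holding_distance", "opening"}})
anomalies = find_anomalies(scene)
```

Because the frames are plain symbolic records, the same structures could in principle be serialized to text and handed to a language model for summarization or queried directly for explanations, without access to the raw sensor data.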

Paper Track: Technical paper

Submission Number: 11
