Keywords: robotics, domain adaptation, continual learning, semantic segmentation, depth estimation, embodied agent
TL;DR: Embodied agents should adapt to their own states, such as position, altitude, and orientation, to better handle changing environments and improve performance on fundamental vision tasks.
Abstract: Fundamental vision tasks are increasingly important for robotics, from ground navigation to aerial monitoring. As robots are deployed in diverse scenarios, domain adaptation has been proposed to address the challenge of changing environments. However, current methods focus on bridging the gap between pre-defined, static domains such as synthetic-to-real or sunny-to-rainy. We argue that this static view of adaptation is insufficient for embodied agents. Instead, embodied agents should adapt to their own states, such as position, altitude, and orientation, to better handle changing environments and improve performance on core vision tasks. The goal should not be merely to cope with pre-defined shifts, but to enable systems to adapt continuously based on their operational status. Current models, despite their impressive performance, remain fundamentally unaware of their own states. We posit that the next generation of robust perception systems must be state-adaptive: dynamically modulating their internal processes in response to ever-changing conditions. This position paper calls for a paradigm shift from building generic, one-size-fits-all models toward adaptive systems that are intrinsically aware of their own states, paving the way for true domain robustness in robotic vision.
Submission Type: Position Paper (< 4 Pages)
Submission Number: 73