Abstract: Goal imagination is an emerging concept in robotics: the capability to automatically generate realistic goals, which in turn requires assessing whether the transition from the current conditions of an initial scene to the desired goal state is feasible. Existing research has explored using diverse image-generative models to create images of potential goal states from the current state and instructions. In this paper, we illustrate the limitations of current state-of-the-art image-generative models in accurately assessing the feasibility of specific actions in particular situations. We then show how integrating large language models, which possess extensive knowledge of real-world objects and affordances, can improve the ability of image-generative models to distinguish plausible from implausible actions and to simulate the outcomes of actions in a given context. This is a step towards achieving pragmatic goal imagination in robotics.
External IDs: dblp:conf/icdl/AregbedeAPLL24