Keywords: Task and motion planning, learning-to-plan, mobile manipulation
TL;DR: To speed up TAMP, we developed a novel learning architecture, PIGINet, that encodes the initial state, goal, and candidate plans to predict plan feasibility, achieving a substantial reduction in planning time.
Abstract: We present a learning-enabled robot Task and Motion Planning (TAMP) algorithm that generates diverse plan skeletons and sorts them by their feasibility, i.e., the likelihood of finding values for the action parameters that satisfy all geometric constraints. We propose PIGINet, a novel Transformer-based architecture that predicts plan feasibility by tokenizing the plan skeleton, goal condition, and initial state as a single sequence, fusing image, text, and value embeddings. We evaluate the runtime of our learning-enabled TAMP algorithm on several distributions of kitchen rearrangement problems and compare it with that of non-learning baselines. Our experiments show that PIGINet substantially improves planning efficiency, cutting runtime by 80% on average on pick-and-place problems with articulated obstacles. It also achieves zero-shot generalization to problems with unseen object categories thanks to its visual encoding of objects.
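
The "generate skeletons, sort by feasibility, then refine" loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: `plan_with_feasibility_ranking`, `predict_feasibility` (standing in for a learned scorer such as PIGINet), and `refine` (standing in for the search over action parameter values) are hypothetical names introduced here, and the skeletons and scores in the toy usage are made up for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A plan skeleton: a sequence of action names with no continuous parameters bound yet.
Skeleton = Tuple[str, ...]


def plan_with_feasibility_ranking(
    skeletons: Sequence[Skeleton],
    predict_feasibility: Callable[[Skeleton], float],  # hypothetical learned scorer
    refine: Callable[[Skeleton], Optional[list]],      # hypothetical parameter search
) -> Tuple[Optional[list], List[Skeleton]]:
    """Try skeletons in descending order of predicted feasibility.

    Returns the first successful refinement (or None if every skeleton fails)
    together with the list of skeletons that were attempted, in order.
    """
    ranked = sorted(skeletons, key=predict_feasibility, reverse=True)
    attempted: List[Skeleton] = []
    for skeleton in ranked:
        attempted.append(skeleton)
        solution = refine(skeleton)  # expensive step: runs only as often as needed
        if solution is not None:
            return solution, attempted
    return None, attempted


# Toy usage with invented skeletons and scores (assumptions, not data from the paper).
candidates = [
    ("pick", "place"),
    ("open_door", "pick", "place"),
    ("pick", "open_door", "place"),
]
toy_scores = {candidates[0]: 0.2, candidates[1]: 0.9, candidates[2]: 0.4}


def toy_refine(skeleton: Skeleton) -> Optional[list]:
    # Pretend that only plans which open the door first can be refined.
    return list(skeleton) if skeleton[0] == "open_door" else None


solution, attempted = plan_with_feasibility_ranking(candidates, toy_scores.get, toy_refine)
```

In this toy setting the highest-scoring skeleton is also the only refinable one, so a single refinement attempt suffices; an unranked planner that tried the candidates in their listed order would need two. The runtime saving comes from this reordering, and how large it is depends on how well the scorer separates feasible from infeasible skeletons.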