SSVTP: Self-Supervised Visuo-Tactile Pretraining to contact deformation representation learning via multi-sensor
Keywords: Contact-Rich Manipulation
Abstract: In most contact-rich manipulation tasks, humans
apply time-varying forces to the target object, compensating
for inaccuracies in the vision-guided hand trajectory. However,
current robot learning algorithms primarily focus on
trajectory-based policies, with limited attention given to learning
force-related skills. To address this limitation, we introduce
ForceMimic, a force-centric robot learning system, providing
a natural, force-aware and robot-free robotic demonstration
collection system, along with a hybrid force-motion imitation
learning algorithm for robust contact-rich manipulation. Using
the proposed ForceCapture system, an operator can peel a
zucchini in 5 minutes, while force-feedback teleoperation takes
over 13 minutes and struggles with task completion. With the
collected data, we propose HybridIL to train a force-centric
imitation learning model, equipped with a hybrid force-position
control primitive to fit the predicted wrench-position parameters
during robot execution. Experiments demonstrate that
our approach enables the model to learn a more robust policy
for the contact-rich task of vegetable peeling, improving the
success rate by a relative 54.5% over state-of-the-art pure-vision-based imitation learning.
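To illustrate what a hybrid force-position control primitive generally does, here is a minimal sketch of the textbook idea: each Cartesian axis is either position-controlled or force-controlled, selected by a per-axis mask. This is not the HybridIL implementation; the function name `hybrid_force_position_step`, the gains `kp` and `kf`, the velocity-command output, and the sign convention (a positive force error moves the tool along the positive axis) are all assumptions made for illustration.

```python
def hybrid_force_position_step(pos, pos_target, force, force_target,
                               force_axes, kp=1.0, kf=0.002):
    """Return a per-axis Cartesian velocity command (hypothetical helper).

    pos, pos_target     : current and target tool position per axis (m)
    force, force_target : measured and target contact force per axis (N)
    force_axes          : True where the axis is force-controlled,
                          False where it is position-controlled
    kp, kf              : illustrative proportional gains (assumed values)
    """
    command = []
    for p, p_t, f, f_t, is_force in zip(pos, pos_target, force,
                                        force_target, force_axes):
        if is_force:
            # Force-controlled axis: move in proportion to the force error.
            command.append(kf * (f_t - f))
        else:
            # Position-controlled axis: move in proportion to the position error.
            command.append(kp * (p_t - p))
    return command


# Example: slide along x while regulating the contact force along z.
cmd = hybrid_force_position_step(
    pos=[0.0, 0.0, 0.0],
    pos_target=[0.1, 0.0, 0.0],
    force=[0.0, 0.0, 2.0],
    force_target=[0.0, 0.0, 5.0],
    force_axes=[False, False, True],
)
```

In a peeling-style task, the force-controlled axis would typically be the one normal to the surface, while the remaining axes follow the predicted motion; which axes the actual system assigns to force control is not stated in the abstract.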
Submission Number: 5