FORTE: Force Optimization via Riemannian Trajectory Estimation for Zero-Shot Contact-Rich Manipulation
Keywords: contact-rich manipulation, Riemannian stiffness estimation, SPD manifold, force-aware MPPI, vision-language models, LLM control loop, zero-shot manipulation
Abstract: Vision-language planners decompose manipulation tasks well at the semantic level but remain brittle in contact-rich regimes: their cost functions treat contact as rigid and position-controlled, so the resulting interaction force is whatever the underlying tracking error happens to produce. We propose \textbf{FORTE}, a zero-shot framework that closes this gap inside the running control loop. A vision-language model emits a structured multi-phase plan (per-phase cost weights, force band, contact strategy, and action prior), which is gated by a small per-task validator that filters physically impossible LLM outputs; a semantic monitor rotates phases on live triggers and re-queries the LLM on failure. The planner is an \emph{Energy-Aware MPPI} whose cost augments geometric tracking with three terms: an interaction-energy term $\bm{\delta}^\top \hat{\bm{\Sigma}}\bm{\delta}$, a force upper bound that prevents jamming, and a contact-maintenance lower bound that stops the planner from minimising the energy term by avoiding contact entirely. The stiffness matrix $\hat{\bm{\Sigma}} \in \mathrm{SPD}(3)$ is identified online by a Riemannian estimator that uses the affine-invariant exponential map, which guarantees positive-definiteness for any learning rate and scale-invariance across orders of magnitude in environmental stiffness. We validate the estimator on a spring-button task with closed-form ground-truth $k \in \{50, 200, 800\}$~N/m and report a failure analysis on two box-and-wall tasks.
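As an illustration of the estimator idea named in the abstract, the following is a minimal sketch of one gradient-style step on the SPD manifold using the affine-invariant exponential map, $\mathrm{Exp}_{\Sigma}(V) = \Sigma^{1/2}\exp(\Sigma^{-1/2} V \Sigma^{-1/2})\Sigma^{1/2}$. The abstract does not specify the loss or the exact update rule, so the function names (`spd_exp_step`, `interaction_energy`), the Euclidean-gradient input, and the learning-rate parameter are assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np


def _sym_apply(M, fn):
    """Apply a scalar function to a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * fn(w)) @ V.T


def spd_exp_step(S, euclid_grad, lr):
    """One descent step on SPD(n) under the affine-invariant metric (hypothetical helper).

    The Riemannian gradient is S @ G @ S for a symmetrised Euclidean gradient G,
    so Exp_S(-lr * S G S) simplifies to S^{1/2} expm(-lr * S^{1/2} G S^{1/2}) S^{1/2}.
    The matrix exponential of a symmetric matrix is SPD, hence so is the result
    (in exact arithmetic; very large steps can still overflow in floating point).
    """
    G = 0.5 * (euclid_grad + euclid_grad.T)   # symmetrise the gradient
    S_half = _sym_apply(S, np.sqrt)           # S^{1/2}
    inner = _sym_apply(-lr * S_half @ G @ S_half, np.exp)
    S_new = S_half @ inner @ S_half
    return 0.5 * (S_new + S_new.T)            # remove round-off asymmetry


def interaction_energy(delta, S):
    """Quadratic energy term delta^T S delta from the planner cost."""
    return float(delta @ S @ delta)
```

Because the update is multiplicative in $\Sigma^{1/2}$, the same step size produces the same relative change whether the stiffness is near 50 N/m or 800 N/m, which is one way to read the scale-invariance claim.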
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 33