Rollout Examples
Comparing ground-truth (left) and generated (right) rollouts.
Gripper Control & Control Sweeps
Evaluation Rollouts For Bridge Data
Policies (columns): OpenVLA-7B, Octo Base-1.5, RT-1-X
Tasks (rows):
Put eggplant into Pot
Put corn on plate
Take grapes out of Pot
Evaluation Rollouts For RT-1 Data
Policies (columns): OpenVLA-7B, Octo Base-1.5, RT-1-X
Tasks (rows):
Pick Blue Chip Bag
Close the drawer
OOD Input Image: Successes
Policies (columns): OpenVLA-7B, Octo Base-1.5, RT-1-X
Tasks (rows):
Pick up Orange (Replace Carrot with Radish)
Pick up Carrot (With Computer on side)
Pick Circle
OOD Input Image: Failures
Policies (columns): OpenVLA-7B, Octo Base-1.5, RT-1-X
Tasks (rows):
Pick up Carrot (With Computer on side)
Pick up Carrot (Computer closer to gripper)
Pick up Orange (Carrot closer to Gripper)
Pick Circle (circle on the right side)
OOD Language Instruction: Successes
Policies (columns): OpenVLA-7B, Octo Base-1.5, RT-1-X
Tasks (rows):
Put Plate On Drying Rack
Move Pot With Grapes Into Drying Rack
OOD Language Instruction: Failures
Policies (columns): OpenVLA-7B, Octo Base-1.5, RT-1-X
Tasks (rows):
Put Yellow Corn In Red Cup
Move The Pot To The Counter
Ground Truth vs Distraction Rollouts
Policies (columns): OpenVLA-7B, Octo Base-1.5, RT-1-X
Tasks (rows):
Original
Out of Distribution (Distraction)