The following videos correspond to Fig. 4 of the paper, showing a full range of views. The full scene is shown alongside the background scene and the foreground object.
| Full Scene | Background Scene | Foreground Object | ObjectNeRF (Yang et al.,) |
The following videos correspond to Fig. 5 of the paper. The full scene is shown alongside the background scene.
| Full Scene | Background Scene |
The following videos correspond to Fig. 6 of the paper. The first column shows the full scene. The second column shows the background scene without the TV. The third column shows the disentangled foreground TV object. Lastly, the fourth column demonstrates the scene with the enlarged TV.
| Full Scene | Background Scene | Foreground Object | Transformed Scene |
The following videos correspond to Fig. 7 of the paper. The first row shows the RGB output and the second row shows the disparity. The first column shows the full scene. The second column shows the background without the fortress object. The third column shows the camouflaged scene. Note how the RGB output resembles that of the second column, while the disparity matches that of the full scene in the first column.
| Full Scene | Background Scene | Camouflaged Foreground Object |
The following videos correspond to Fig. 8 of the paper. The first column shows the full scene. The second column shows the residual 3D scene that is added to the full scene. The third column shows the result of adding the residual to the full scene. The fourth column shows the desired output (the background scene).
| Full Scene | Residual Scene | Inpainted Scene | Background Scene |
The following videos correspond to Fig. 9 of the paper. (a) The first row, first column shows the full scene. (b) The first row, second column shows the removal of the tree trunk. (c) The first row, third column shows the removal of the window mullion. Note that the occluding tree object is not removed. (d-f) The second row (columns one, two and three) shows the semantic manipulation results for the tree trunk using target text prompts. The text prompts used are: "old tree" (d), "aspen tree" (e) and "strawberry" (f).
2D baseline comparison to 3D Object Removal and 3D text-based Semantic Manipulation. For Object Removal, we show the scenes corresponding to Leaf and Whiteboard removal. For Semantic Manipulation, we show the manipulation of the tree trunk using the text prompts "old tree" and "strawberry".
| Full Scene | Background (Ours) | DeepFill-v2 (Yu et al.,) | EdgeConnect (Nazeri et al.,) |
| Full Scene | "Old Tree" | Blended (Avrahami et al.,) | GLIDE (Nichol et al.,) |
| Full Scene | "Strawberry" | Blended (Avrahami et al.,) | GLIDE (Nichol et al.,) |
The following videos correspond to Fig. 10 (a) of the paper. The first column shows the full trex scene. The second column shows the trex scene after the attempted removal of the light source on the left.
| Trex Full Scene | Trex Left Light Removed |