Relationship to layered-depth images (2gPY)
Added a subsection to the related work section introducing this relationship (page 3, subsection "Layered Representations for View Synthesis").
Discussion of more recent fast NeRF training as potential solution to slow object-NeRFs (z9ns)
Expanded the related work section to explain why these methods are not viable solutions for our problem (page 3, section "3D Compositional Scene Representations", from "While recent work" to "yet been demonstrated").
More thorough discussion of design choices for the light field compositor (z9ns)
Elaborated on the theoretical motivation behind the compositor and on why we chose the current parameterization over the other solutions we considered (page 5, paragraph just below equation 2).
Consider moving experiments 5.5 and 5.4 from appendix to main paper (z9ns)
Merged experiment 5.5 from the appendix into the main paper in Figure 4 (page 7), with a reference on page 5 (just above section 3.3). We decided to keep 5.4 in the supplement because we felt the experiment was not significant enough to merit space in the main paper.
Shadow-based segmentation results concerning and confusing (2gPY, 2uG2)
Elaborated on the model's shadow-based segmentations and on their correctness relative to standard annotated segmentation benchmarks (page 10, section "results" of 4.2, from "Although some" to "removed as well").
Hints on applications to real world scenes (2gPY)
We expanded on potential applications and on the concurrent work on improving the robustness of object-encoders to real world scenes, which we believe will benefit from our lightweight 3D representation when applied to large-scale real-world datasets (page 12, section "Discussion", from "These concurrent works" to "can be large").
Discrepancy in FG-ARI values of uORF in Table 2 (2uG2)
We fixed these results in Table 2 and thank 2uG2 for catching our error.