Abstract: Current approaches to Lego 3D structural assembly are typically trained to maximize the intersection over union (IoU) between the generated output and the target construction. We propose an approach that instead builds stable structures using a physics-aware reward. Our method employs a two-level agent architecture in which a high-level planner based on proximal policy optimization (PPO) proposes a scheme, while a low-level wave function collapse (WFC) agent handles precise brick placement under constraint satisfaction. Experimental results demonstrate that our hierarchical method consistently constructs buildings that satisfy stress constraints while reducing material usage. We also show that replacing the finite element method (FEM) solver with a Fourier neural operator (FNO) achieves comparable performance, providing a proof of concept that the proposed approach can work with neural surrogates.
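For reference, the intersection over union (IoU) objective mentioned in the abstract can be illustrated with a minimal sketch. It assumes structures are represented as sets of occupied voxel coordinates; the function name `voxel_iou`, the example coordinates, and the empty-set convention are hypothetical and are not taken from the paper.

```python
def voxel_iou(generated, target):
    """Intersection over union of two sets of occupied (x, y, z) voxels."""
    generated, target = set(generated), set(target)
    union = generated | target
    if not union:
        # Both structures are empty; treating this as a perfect match
        # is an assumed convention, not something the paper specifies.
        return 1.0
    return len(generated & target) / len(union)


# Hypothetical example: the generated structure misplaces one of four voxels.
target = {(0, 0, 0), (1, 0, 0), (0, 0, 1), (1, 0, 1)}
generated = {(0, 0, 0), (1, 0, 0), (0, 0, 1), (2, 0, 0)}
iou = voxel_iou(generated, target)  # 3 shared voxels out of 5 in the union
```

A score of this kind measures only geometric overlap with the target and carries no information about stresses, which is the gap the physics-aware reward is meant to address.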
Submission Type: Long submission (more than 12 pages of main content)
Changes Since Last Submission: - Structural reorganization. A new "Preliminaries" section now consolidates all domain-specific concepts (Action Masking, WFC, FEM, FNO, von Mises stress) and is placed after Related Work. Section "Problem Formulation" has been refactored to contain only the environment description and the multi-objective goal. The reward design has been extracted into a dedicated "Reward Design" subsection inside "Proposed Method", and the "Stress Evaluation" content has been merged with "Physics Integration" in the same section.
- New content in the method section. A pipeline overview figure has been added at the beginning of "Proposed Method", together with two paragraphs explaining the high-level motivation for the 2D and 3D planner designs. The Preliminaries section also includes an algorithmic contrast between FEM and FNO.
- New experiments and numerical evidence. A non-hierarchical flat PPO baseline has been added to Figure 7 (page 11), evaluated at 10x10 and 15x15 scales, demonstrating that flat PPO reaches near-perfect success on 10x10 but collapses to 0 percent on 15x15. A new subsection "Low-Level Executor Ablation" with Figure 8 (page 13) reports the comparison between WFC and a low-level PPO executor under the same 2D high-level planner.
- Presentation improvements. Reward equations now carry tags, paragraph titles have been added, our method's curves in Figure 7 are drawn with thicker lines and labeled "(ours)", and a notation table has been added to the Appendix (page 16).
Assigned Action Editor: ~Weijian_Deng1
Submission Number: 6341