Keywords: LMM Agent, Geometry Reasoning, Integrated Circuit Footprint
Abstract: Integrated circuit (IC) footprint geometry labeling refers to the process of converting pin diagrams in IC datasheets into machine-readable geometric parameters. This task is critical in printed circuit board (PCB) design and component assembly, as accurate labeling ensures proper IC placement and reliable connectivity. The process is challenged by unstructured annotations, complex footprint arrangements, and abstract geometric diagrams, which make existing automated labeling methods inadequate. Traditional EDA tools are slow and require heavy manual input. Existing automation methods, such as OCR or object detection, can extract text or simple shapes but fail to capture the implicit geometric relationships in IC diagrams, leaving the labeling task incomplete. Recent work has shown that end-to-end large multimodal models (LMMs) can perform IC geometry labeling. However, because they treat the task as a black box, these methods are prone to shortcut learning and lack interpretability. In this work, we introduce ICLabAgent, the first multi-agent framework for fully automated IC footprint geometry labeling that explicitly models the workflow of expert engineers to produce more interpretable and reliable labeling outcomes. Furthermore, we present ICAgent-Instruct, the first dynamic planning and reasoning dataset tailored for IC footprint geometry labeling. Extensive experiments show that ICLabAgent improves overall accuracy by $10.3$% compared to the previous SOTA method and by $79.5$% compared to manual annotation. Despite using only simple supervised fine-tuning on a 7B model (Qwen2-VL-7B), ICLabAgent surpasses general-purpose LMMs such as GPT-5 (by $94.6$%) and Gemini-2.5 Flash (by $378.8$%).
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 9030