Abstract: Highlights
• Presents BlueprintSymVL, a new benchmark for VLM symbol recognition.
• A novel one-shot evaluation method eliminates need for pre-trained knowledge.
• Benchmarks GPT-4o, Gemini 2.5 Pro, InternVL 2.5 78B, and Qwen 2.5 VL 72B.
• Pinpoints key VLM failure modes: clutter, similarity, and hallucination.
• Shows current VLMs are not yet ready for autonomous industrial deployment.
External IDs: doi:10.1016/j.rineng.2025.108171