Spatial Reasoning with MLLMs: A New Path to Graph-Structured Optimization

Jie Zhao; Kang Hao Cheong

28 Sept 2024 (modified: 25 Nov 2024) | ICLR 2025 Conference Withdrawn Submission | CC BY 4.0

Keywords: Multimodal large language model, graph representation, graph-structured problem, combinatorial optimization

Abstract: Graph-structured problems are challenging because of their complex structures and large scales, which often make traditional computational approaches suboptimal or costly. When these problems are represented visually, however, humans can often solve them more intuitively by drawing on their spatial reasoning capabilities. In this work, we introduce an approach that feeds graphs as images into multimodal large language models (MLLMs), aiming for a loss-free representation that preserves the graph's structural integrity and enables machines to mimic this human-like thinking. We explore MLLMs on a range of graph-structured challenges, from combinatorial tasks such as influence maximization to sequential decision-making processes such as network dismantling, along with six basic graph-related problems. Our experiments reveal that MLLMs possess remarkable spatial intelligence and a distinct aptitude for these problems, a significant step toward enabling machines to understand and analyze graph-structured data with human-like depth and intuition. These findings also suggest that combining MLLMs with straightforward optimization techniques could offer a new, effective paradigm for managing large-scale graph problems without complex derivations or computationally demanding training and fine-tuning.
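
To make the idea of feeding a graph to a model as an image concrete, here is a minimal sketch that renders a small graph as an SVG picture and base64-encodes it, the form in which an image is commonly attached to a multimodal request. It is not the authors' pipeline: the function name `graph_to_svg`, the circular layout, the canvas size, and the example prompt are all assumptions made for illustration, and a real MLLM endpoint may require a raster format such as PNG instead of SVG.

```python
import base64
import math


def graph_to_svg(edges, size=400, radius=12):
    """Render an undirected graph as an SVG string (hypothetical helper).

    Nodes are placed on a circle; this layout is an assumption, not the
    layout used in the paper.
    """
    nodes = sorted({v for edge in edges for v in edge})
    n = len(nodes)
    center = size / 2
    ring = size / 2 - 3 * radius  # keep node circles inside the canvas
    pos = {}
    for i, node in enumerate(nodes):
        angle = 2 * math.pi * i / n
        pos[node] = (center + ring * math.cos(angle),
                     center + ring * math.sin(angle))

    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" '
             f'width="{size}" height="{size}">']
    # Draw edges first so that node circles are painted on top of them.
    for u, v in edges:
        (x1, y1), (x2, y2) = pos[u], pos[v]
        parts.append(f'<line x1="{x1:.1f}" y1="{y1:.1f}" '
                     f'x2="{x2:.1f}" y2="{y2:.1f}" stroke="black"/>')
    for node, (x, y) in pos.items():
        parts.append(f'<circle cx="{x:.1f}" cy="{y:.1f}" r="{radius}" '
                     f'fill="white" stroke="black"/>')
        parts.append(f'<text x="{x:.1f}" y="{y + 4:.1f}" '
                     f'text-anchor="middle" font-size="12">{node}</text>')
    parts.append("</svg>")
    return "\n".join(parts)


# Toy graph: a triangle (0, 1, 2) with a tail 2-3-4.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
svg = graph_to_svg(edges)

# Base64 text that could accompany a textual question about the graph.
image_b64 = base64.b64encode(svg.encode("utf-8")).decode("ascii")
prompt = "Which node, if removed, splits this graph into the most pieces?"  # illustrative only
```

The picture carries every node and edge of the original graph, which is the sense in which an image can serve as a loss-free representation for small graphs; for large graphs, overlapping edges and labels would make this naive layout hard to read.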

Supplementary Material: zip

Primary Area: learning on graphs and other geometries & topologies

Submission Number: 12596
