Matrix-Driven Detection and Reconstruction of LLM Weight Homology

ICLR 2026 Conference Submission16995 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: Large Language Models, Weight Homology, Similarity Detection, Large Deviation

Abstract: Concerns about intellectual property in large language models (LLMs) have grown significantly, particularly around the unattributed reuse or replication of model weights. However, existing methods for detecting LLM weight homology fall short in key areas, including recovering the correspondence between weights and computing significance measures such as $p$-values. We propose Matrix-Driven Instant Review (MDIR), a method that leverages matrix analysis and Large Deviation Theory. MDIR accurately reconstructs weight relationships, provides rigorous $p$-value estimation, and focuses exclusively on homologous weights without requiring full model inference. We demonstrate that MDIR reliably detects homology even after extensive mutations, such as random permutations and continual pretraining on trillions of tokens. Moreover, all detections can be performed on a single consumer PC, making MDIR efficient and accessible.

Primary Area: foundation or frontier models, including LLMs

Submission Number: 16995
