Cosine Similarity as Logits?: A Scalable Knowledge Probe Using Embedding Vectors from Generative Language Models

ACL ARR 2025 July Submission1354 Authors

29 Jul 2025 (modified: 04 Sept 2025) | ACL ARR 2025 July Submission | CC BY 4.0
Abstract: Recently, the use of pretrained language models (PLMs) as soft knowledge bases has attracted growing interest, sparking the development of knowledge probes to evaluate their factual knowledge retrieval capabilities. However, existing knowledge probes for generative PLMs that support multi-token entities exhibit quadratic time complexity $\mathcal{O}(n^2)$, limiting the size of knowledge graphs that can be used for probing. To address this, we propose the DEcoder Embedding-based Relational (DEER) probe, which uses embedding vectors extracted from generative PLMs. The DEER probe achieves an effective time complexity of linear order $\mathcal{O}(n)$, supports rank-based evaluation metrics including Hit@$k$, handles multi-token entity names, and enables probing while disambiguating homographic tail-entity names. We empirically show that the DEER probe correlates with existing knowledge probes, validating its probing capability, and we demonstrate the practical benefits of its improved scalability.
Paper Type: Short
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: interpretability, knowledge base QA, benchmarking, evaluation
Contribution Types: Model analysis & interpretability, Approaches low compute settings-efficiency
Languages Studied: English
Previous URL: https://openreview.net/forum?id=FtOD8CdSi5
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: Yes, I want a different area chair for our submission
Reassignment Request Reviewers: Yes, I want a different set of reviewers
Justification For Not Keeping Action Editor Or Reviewers: The focus of our paper has changed since the last submission. We believe reviewers with expertise in interpretability would be best suited as a result of this change.
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: N/A
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Section 5.1, Appendix C3
B2 Discuss The License For Artifacts: Yes
B2 Elaboration: Section 8. Ethical Considerations
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Section 8. Ethical Considerations
B4 Data Contains Personally Identifying Info Or Offensive Content: N/A
B5 Documentation Of Artifacts: N/A
B6 Statistics For Data: Yes
B6 Elaboration: Table 3 Appendix
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Section 5. Experiments
C2 Experimental Setup And Hyperparameters: N/A
C3 Descriptive Statistics: N/A
C4 Parameters For Packages: N/A
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: Yes
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: yes
Submission Number: 1354