Abstract: We introduce LM-Lexicon, a definition modeling approach that combines data clustering, semantic expert learning, and model merging in a sparse mixture-of-experts architecture. By decomposing the definition modeling task into specialized semantic domains, with small language models trained as domain experts, LM-Lexicon achieves substantial improvements over existing methods on five widely used benchmarks (+7% BLEU compared with the prior state-of-the-art model). Empirically, we demonstrate that 1) the clustering strategy enables fine-grained expert specialization, with nearly 10% improvement in definition quality; 2) the semantic-aware domain-level routing mechanism achieves higher expert efficacy (+1%) than conventional token-level routing; and 3) further performance gains can be obtained through test-time compute and semantic expert scaling. Our work advances definition modeling while providing insights into the development of efficient language models for semantic-intensive applications. The code, data, and models will be made publicly available upon completion of the review process.
Paper Type: Long
Research Area: Semantics: Lexical and Sentence-Level
Research Area Keywords: paraphrasing, definition modeling, polysemy, sparse models
Contribution Types: NLP engineering experiment
Languages Studied: English
Previous URL: https://openreview.net/forum?id=QJWrIrDrUe
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: Yes, I want a different area chair for our submission
Reassignment Request Reviewers: Yes, I want a different set of reviewers
Justification For Not Keeping Action Editor Or Reviewers: Dismissing the work without any concrete comments regarding correctness of the results or argumentation.
Software: zip
Data: zip
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: N/A
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Section 4
B2 Discuss The License For Artifacts: Yes
B2 Elaboration: Section 4
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Section 4
B4 Data Contains Personally Identifying Info Or Offensive Content: Yes
B4 Elaboration: Section 4
B5 Documentation Of Artifacts: Yes
B5 Elaboration: Section 4
B6 Statistics For Data: Yes
B6 Elaboration: Section 4
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Section 4
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Section 4
C3 Descriptive Statistics: Yes
C3 Elaboration: Section 4
C4 Parameters For Packages: Yes
C4 Elaboration: Section 4
D Human Subjects Including Annotators: Yes
D1 Instructions Given To Participants: Yes
D1 Elaboration: Section 4
D2 Recruitment And Payment: Yes
D2 Elaboration: Section 4
D3 Data Consent: Yes
D3 Elaboration: Section 4
D4 Ethics Review Board Approval: Yes
D4 Elaboration: Section 4
D5 Characteristics Of Annotators: Yes
D5 Elaboration: Section 4
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: yes
Submission Number: 468