Indian Grammatical Tradition-Inspired Universal Semantic Representation Bank (USR Bank 1.0)

ACL ARR 2025 July Submission690 Authors

28 Jul 2025 (modified: 02 Sept 2025), ACL ARR 2025 July Submission, CC BY 4.0

Abstract: We introduce USR Bank 1.0, a multi-layered, text-level semantic representation system designed to capture speakers’ intention as it is linguistically expressed, which distinguishes it from existing representations. Universal Semantic Representation (USR) is modelled on Universal Semantic Grammar (USG), a framework inspired by Pāṇini and the Indian Grammatical Tradition (IGT). This work presents the development of the USR Bank: initial USRs are generated automatically by a dedicated USR builder tool and then validated through a web-based validation interface. The high inter-annotator consistency observed in the annotation of both dependency and discourse relations demonstrates the robustness, clarity, and semantic groundedness of the proposed tagset, and supports the practical viability and distinctive contribution of IGT principles for contemporary Natural Language Processing.

Paper Type: Long

Research Area: Semantics: Lexical and Sentence-Level

Research Area Keywords: Indian Grammatical Tradition, Panini, Universal Semantic Representation, Treebank, Semantics, Speaker's Intention, Annotation

Contribution Types: Data resources, Data analysis, Theory

Languages Studied: Hindi

Reassignment Request Area Chair: This is not a resubmission

Reassignment Request Reviewers: This is not a resubmission

Justification For Not Keeping Action Editor Or Reviewers: NA

A1 Limitations Section: This paper has a limitations section.

A2 Potential Risks: N/A

B Use Or Create Scientific Artifacts: Yes

B1 Cite Creators Of Artifacts: Yes

B1 Elaboration: Section 5

B2 Discuss The License For Artifacts: No

B2 Elaboration: Not yet ready for public distribution.

B3 Artifact Use Consistent With Intended Use: N/A

B4 Data Contains Personally Identifying Info Or Offensive Content: No

B4 Elaboration: The data are taken mainly from textbooks.

B5 Documentation Of Artifacts: Yes

B5 Elaboration: Section 4

B6 Statistics For Data: Yes

B6 Elaboration: 5.2.3 Annotated Data Statistics

C Computational Experiments: No

C1 Model Size And Budget: N/A

C2 Experimental Setup And Hyperparameters: N/A

C3 Descriptive Statistics: Yes

C3 Elaboration: Section 6 Evaluation

C4 Parameters For Packages: N/A

D Human Subjects Including Annotators: Yes

D1 Instructions Given To Participants: Yes

D1 Elaboration: 5.2.1 First Data: Manually Curated Simple Sentences; 6 Evaluation; 6.1 Inter-Annotator Agreement (IAA)

D2 Recruitment And Payment: No

D2 Elaboration: Annotators were appointed as interns and project staff; they work on various annotation tasks, some of which are not relevant to this paper.

D3 Data Consent: No

D3 Elaboration: All data have been used for research purposes only.

D4 Ethics Review Board Approval: N/A

D5 Characteristics Of Annotators: N/A

E AI Assistants In Research Or Writing: No

E1 Information About Use Of AI Assistants: N/A

Author Submission Checklist: yes

Submission Number: 690
