Gentags: Discrete Semantic State for Constraint-Sensitive Decision Pipelines

Published: 10 Jun 2026, Last Modified: 10 Jun 2026 | OpenReview Archive | Readers: Everyone | License: CC BY 4.0
Abstract: Large language model (LLM) pipelines increasingly use free-form text to support decisions under explicit constraints, but they often lack an inspectable intermediate semantic state before decision-making. This paper introduces Gentags, a discrete, schema-free representation that externalizes semantic content from text as short, evidence-conditioned propositional units. Gentags are designed to make semantic state easier to inspect, compare, and reuse in constraint-sensitive LLM systems. We evaluate Gentags in a restaurant-review decision setting by comparing them with lexical baselines, including RAKE, YAKE, and TF-IDF. The evaluation examines three dimensions: stability across runs, prompts, and extractor models; structural properties such as semantic facet coverage and State-Gini; and downstream decision utility using four personas with explicit hard constraints. Results show that Gentags are more semantically stable than surface-level keyword representations, provide broader and more balanced coverage of diagnostic semantic facets, and improve downstream decision reliability. In the decision study, Gentags achieve higher agreement with full-evidence reference decisions and stronger hard-constraint compliance than lexical baselines, including under a token-matched truncation condition. These findings suggest that the structure of intermediate semantic state is an important design variable in LLM decision pipelines, and that discrete propositional representations can provide a practical interface for more reliable and auditable constraint-sensitive decisions.