Hierarchical Implicit/Explicit Feedback Recommender System

Published: 28 Nov 2025, Last Modified: 30 Nov 2025. Venue: NeurIPS 2025 Workshop MLxOR. License: CC BY 4.0
Keywords: Recommender System, Partially observable Markov decision process, Bayesian experimental design, dialog systems
TL;DR: We introduce the Mixed-Slate Agent (MSA), an inner explicit-feedback loop embedded inside any standard implicit-feedback recommender system; the combined system is called the Hierarchical Implicit/Explicit Feedback Recommender System (HIER).
Abstract: In the modern attention economy, ranking is a ubiquitous task: relevant news feeds in social media, websites in search, products in e-commerce, music and movies in audio and video streaming services, and so on. Actions (tasks) in task-oriented dialogue systems (TODS) can also be viewed through this lens. Current recommender systems often deliver only ranked items, and feedback comes mostly from clicks, dwell time, and other implicit signals. They are therefore prone to wasting substantial resources on ambiguous items, especially when the target item is buried in a larger set of candidate items and the user needs to navigate multiple slates. This scenario is expected to become more prevalent with the next generation of resource-constrained wearable computing platforms, where TODS will be bandwidth-constrained and users will have a low tolerance for errors. We propose the Mixed-Slate Agent (MSA) method, which replaces the item-only slate with a mixed slate that includes either a fixed or a dynamic set of binary facet or attribute queries, selected by maximizing an acquisition function that depends on the joint item/response belief state. A partially observable Markov decision process (POMDP) on the item belief state formalises the dialogue. This explicit feedback loop is used only for immediate disambiguation and is embedded inside any existing recommender system. The resulting method is called the Hierarchical Implicit/Explicit Feedback Recommender System (HIER). For $K$-element slates out of $N$ ranked items, our method can deliver up to a factor of $\mathcal{O}\left(\frac{N\log_2 K}{K\log_2 N}\right)$ asymptotic improvement in scroll depth in comparison to the usual top-$K$ approach. Numerical experiments on a toy problem, a realistic simulated goal-space environment, and real e-commerce and movie recommendation datasets demonstrate the impact of the method.
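The query-selection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the acquisition function, so expected information gain is assumed here, answers are assumed noiseless, and every function name (`select_query`, `update_belief`, `run_dialogue`, `scroll_depth_ratio`) is hypothetical. The last function simply evaluates the stated $N\log_2 K / (K\log_2 N)$ factor with the hidden constant set to 1.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a yes/no answer that is 'yes' with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def select_query(belief, facets):
    """Return the index of the most informative binary facet query.

    belief: list of item probabilities summing to 1.
    facets: facets[q][i] is 1 if item i has facet q, else 0.
    Assumption: answers are noiseless, so the expected information
    gain of a query equals the entropy of its answer.
    """
    def gain(q):
        p_yes = sum(b for b, f in zip(belief, facets[q]) if f)
        return binary_entropy(p_yes)

    return max(range(len(facets)), key=gain)


def update_belief(belief, facet, answer):
    """Bayes update: drop items inconsistent with the answer, renormalise."""
    posterior = [b if f == answer else 0.0 for b, f in zip(belief, facet)]
    total = sum(posterior)
    if total == 0.0:
        return list(belief)  # contradictory answer: keep the prior
    return [p / total for p in posterior]


def run_dialogue(target, belief, facets):
    """Ask queries until one item has all the mass; return (belief, n_asked)."""
    asked = 0
    while max(belief) < 1.0 and asked < len(facets):
        q = select_query(belief, facets)
        belief = update_belief(belief, facets[q], facets[q][target])
        asked += 1
    return belief, asked


def scroll_depth_ratio(n, k):
    """The abstract's improvement factor with the constant taken as 1."""
    return (n * math.log2(k)) / (k * math.log2(n))


# Toy setup (hypothetical): 8 items, one skewed facet plus three bit facets.
N_ITEMS = 8
FACETS = [[1, 0, 0, 0, 0, 0, 0, 0]] + [
    [(i >> b) & 1 for i in range(N_ITEMS)] for b in range(3)
]
PRIOR = [1.0 / N_ITEMS] * N_ITEMS

final_belief, n_asked = run_dialogue(5, PRIOR, FACETS)
print(final_belief.index(max(final_belief)), n_asked)
print(scroll_depth_ratio(1024, 4))
```

In this toy run the balanced bit facets are preferred over the skewed one, and the target is isolated after $\log_2 8 = 3$ answers, whereas an item-only slate of size $K$ would need up to $N/K$ slates in the worst case.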
Submission Number: 100