Iterative Machine Teaching for Black-Box Markov Learners

TMLR Paper 2127 Authors

31 Jan 2024 (modified: 17 Sept 2024) · Rejected by TMLR · CC BY 4.0
Abstract: Machine teaching has traditionally been constrained by the assumption of a fixed learner model, in which the learner's progress follows prescribed rules, such as gradient updates with a fixed learning rate or version space updates under a given preference function. In this paper, we consider a generic setting that views the learner as a black box whose dynamics can be learned during the teaching process. We model the learner's dynamics as a Markov decision process (MDP) with unknown parameters, encompassing a wide range of learner types studied in the machine teaching literature. In this setting, machine teaching reduces to finding an optimal policy for the underlying MDP. We then introduce an algorithm for teaching such black-box Markov learners and analyze its teaching cost under both discounted and undiscounted settings. The Markov learners considered in this work can be naturally linked to epiphany learning as studied in decision psychology. Supported by numerical results, this paper offers a new perspective on machine teaching in the black-box setting, introducing a robust, versatile learner model with a rigorous theoretical foundation.
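To make the stated reduction concrete, below is a minimal, hypothetical sketch of teaching a black-box Markov learner: the learner's knowledge state evolves under hidden transition probabilities, and the teacher estimates those dynamics from observed transitions while replanning with value iteration. This is not the paper's algorithm; all names, state/action counts, and numbers (e.g., `TRUE_DYNAMICS`, the four-state chain, the cost of one per example) are illustrative assumptions.

```python
"""Sketch only: teacher interacting with a black-box Markov learner.
The learner's knowledge state follows hidden dynamics; the teacher keeps
empirical transition counts and plans via value iteration, illustrating how
teaching reduces to solving the underlying MDP."""
import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS, TARGET = 4, 3, 3          # state 3 = "concept learned"
# Hidden learner dynamics P[a, s, s'] (unknown to the teacher; assumed here).
TRUE_DYNAMICS = rng.dirichlet(np.ones(N_STATES), size=(N_ACTIONS, N_STATES))
TRUE_DYNAMICS[:, TARGET, :] = 0.0
TRUE_DYNAMICS[:, TARGET, TARGET] = 1.0         # target state is absorbing

def plan(P_hat, gamma=0.95, iters=200):
    """Value iteration on the teacher's current model; cost 1 per example."""
    V = np.zeros(N_STATES)
    for _ in range(iters):
        Q = 1.0 + gamma * P_hat @ V            # Q[a, s]
        Q[:, TARGET] = 0.0                     # no further cost once learned
        V = Q.min(axis=0)
    return Q.argmin(axis=0)                    # greedy teaching policy

# Teacher's empirical model from transition counts (uniform prior counts).
counts = np.ones((N_ACTIONS, N_STATES, N_STATES))
state, total_examples = 0, 0
for _ in range(200):                           # teaching budget
    if state == TARGET:
        break
    P_hat = counts / counts.sum(axis=2, keepdims=True)
    action = plan(P_hat)[state]                # choose the next teaching example
    next_state = rng.choice(N_STATES, p=TRUE_DYNAMICS[action, state])
    counts[action, state, next_state] += 1     # refine the black-box estimate
    state, total_examples = next_state, total_examples + 1

print(f"learner reached the target state after {total_examples} examples")
```

The greedy replanning loop above is only one way to act under an estimated model; the paper's actual algorithm and its cost analysis under discounted and undiscounted settings are given in the main text.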
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Amir-massoud_Farahmand1
Submission Number: 2127