Interpretable AI in Human-Machine Systems: Insights from Human-in-the-Loop Product Recommendation Engines
Keywords: Interpretable AI, Human-Machine Systems, Human-in-the-Loop, Product Recommendation Engines
Abstract: Recent advances in human-machine systems have renewed a commonly held expectation that Machine Learning (ML) may be most effectively used in conjunction with human intervention. This expectation rests on the assumption that synthesized human-machine systems, which bring humans into the loop by allowing them to provide input, oversight, or supervision, outperform humans or machines alone and create a whole that is greater than the sum of its parts. Despite the appeal of such an expectation, what we know about the technical and practical requirements of effective ML utilization has not been applied to carefully assess when human-machine systems might deliver on it, i.e., outperform humans or machines alone. In this paper, we showcase the importance of recognizing and quantifying false alarms in any technically sound and practically interpretable analysis of the effectiveness of ML systems. Specifically, we propose that quantifying the costs and risks of ML-generated false alarms is directly tied to tuning the regularization hyper-parameters and, consequently, to reducing the complexity and improving the effectiveness of ML systems. Using a series of A/B experiments and simulations, we demonstrate the application of our theory by comparing the effectiveness of two popular human-centric product recommendation engines: Assessment Based Recommendation (ABR), where customers are first filtered by human assessment; and Broad Spectrum Recommendation (BSR), where the new product or service is introduced to all possible customers. We show that failing to recognize false alarms, or quantifying them incorrectly, undermines the measurability and interpretability of the economic value of using ML (alone or in conjunction with humans) to the extent that it may render such use practically unjustifiable, and we postulate conditions under which human-machine systems outperform humans or machines alone. By doing so, we call on researchers to transform how they conceptualize and utilize ML: from an approach primarily concerned with accuracy-consistency trade-offs to one that also incorporates the costs and risks of false alarms into the ML objective function to enhance its interpretability (illustrative sketches of both ideas follow the submission details below).
Track: Main track
Submitted Paper: No
Published Paper: No
Submission Number: 40
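As a concrete illustration of the abstract's first proposal, the minimal sketch below shows one way a false-alarm cost could enter an ML objective alongside a regularization term. This is an assumption-laden sketch, not the paper's exact formulation: the choice of logistic regression, the cost names (c_fa, c_miss), and the L2 weight lam are all illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def objective(w, X, y, c_fa=5.0, c_miss=1.0, lam=0.1):
    """Hypothetical cost-sensitive loss: false alarms (predicting 1
    when y == 0) are penalized c_fa times harder than misses, plus an
    L2 penalty whose weight lam would, per the paper's proposal, be
    tuned in light of the quantified false-alarm cost."""
    p = sigmoid(X @ w)
    eps = 1e-12  # numerical guard against log(0)
    # Weighted cross-entropy: c_miss scales the positive-class term,
    # c_fa scales the negative-class (false-alarm) term.
    loss = -np.mean(c_miss * y * np.log(p + eps)
                    + c_fa * (1 - y) * np.log(1 - p + eps))
    return loss + lam * np.sum(w ** 2)
```

Raising c_fa relative to c_miss shifts the fitted decision boundary toward fewer positive predictions, which is the mechanism by which a quantified false-alarm cost reshapes the model.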
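Similarly, a minimal simulation can illustrate the ABR/BSR comparison. All parameters here (base conversion rate, per-conversion margin, contact cost, and the human filter's pass rates) are hypothetical; the point is only that when each false alarm carries a real cost, a human-filtered engine can outperform a broad-spectrum one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters; not taken from the paper's experiments.
n_customers = 100_000
base_rate = 0.02       # fraction of customers who would convert
margin = 40.0          # profit per conversion
contact_cost = 1.0     # cost of pitching one customer; contacting a
                       # non-converter is a costly false alarm

truth = rng.random(n_customers) < base_rate

# BSR: introduce the product to all possible customers.
bsr_value = truth.sum() * margin - n_customers * contact_cost

# ABR: a human assessor passes converters with probability 0.7 and
# non-converters (false alarms) with probability 0.05.
passed = np.where(truth,
                  rng.random(n_customers) < 0.7,
                  rng.random(n_customers) < 0.05)
abr_value = (truth & passed).sum() * margin - passed.sum() * contact_cost

print(f"BSR value: {bsr_value:,.0f}   ABR value: {abr_value:,.0f}")
```

Under these assumed numbers BSR loses money (its false alarms swamp the margin on converters) while ABR is profitable, which sketches one of the conditions the abstract postulates for human-machine systems outperforming machines alone.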