Abstract: Machine-learning (ML) models are ubiquitously used to make a variety of inferences, a common application being to predict and categorize user behavior. However, ML models often suffer from only being exposed to biased data – for instance, a search ranking model that uses clicks to determine how to rank will suffer from position bias. The difficulty arises due to user feedback only being observed for one treatment and not existing counterfactually for other potential treatments. In this work, we discuss a real-world situation in which a binary classification model is used in production in order to make decisions about how to treat users. We introduce the model and discuss the limitations of our modeling approach. We show that by using unit selection criterion we can do a better job classifying users. Following, we propose a causal modeling method in which we can take the existing data and use it to derive bounds that can be used to modify the objective function in order to incorporate causal learning into our training process. We demonstrate the effectiveness of this approach in a real-world setting.
0 Replies
Loading