Abstract: This paper addresses the problem of creating decision trees for identifying hypotheses, also known as entities, in a setting where the cost of an action is dependent on the true hypothesis. Specifically, we consider the scenario where n hypotheses are divided into m groups based on their priority levels. Taking an action on a higher priority hypothesis incurs a higher cost. This is relevant to many real-world applications where cost-sensitive decisions need to be made. For example, in a medical diagnosis task, the goal is to take a series of actions (such as medical tests) to identify a cause. Each action in this process requires conducting a test on the patient and observing the outcome, which can take anywhere from a few minutes to several weeks depending on the test. In this case, the cost (the result of waiting for the outcome) is higher if the true hypothesis is more time-sensitive. For example, if the true hypothesis is toxic chemical exposure (as opposed to a chronic disease such as diabetes), a delay of a few minutes could significantly increase the patient's risk of mortality. We propose a group greedy algorithm to solve this problem. We demonstrate that under worst-case scenarios, our algorithm has an approximation ratio of O(mlogn)<math><mi is="true">O</mi><mo stretchy="false" is="true">(</mo><mi is="true">m</mi><mi mathvariant="normal" is="true">log</mi><mo is="true"></mo><mi is="true">n</mi><mo stretchy="false" is="true">)</mo></math>. Importantly, when m=1<math><mi is="true">m</mi><mo linebreak="goodbreak" linebreakstyle="after" is="true">=</mo><mn is="true">1</mn></math>, meaning there is only one group of hypotheses, our result is consistent with the logarithmic approximation bound for the traditional optimal decision tree problem.
Loading