Abstract: Checklists have been widely recognized as effective tools for completing complex tasks in a systematic manner. Although originally intended for use in procedural tasks, their interpretability and ease of use have led to their adoption for predictive tasks as well, including in clinical settings. However, designing checklists can be challenging, often requiring expert knowledge and manual rule design based on available data. Recent work has attempted to address this issue by using machine learning to automatically generate predictive checklists from data, although these approaches have been limited to Boolean data. We propose a novel method for learning predictive checklists from diverse data modalities, such as images and time series. Our approach relies on probabilistic logic programming, a learning paradigm that enables matching the discrete nature of checklists with continuous-valued data. We propose a regularization technique that balances the information captured in the discrete concepts derived from continuous data with a tunable level of interpretability for the learned checklist concepts. We demonstrate that our method outperforms various explainable machine learning techniques on prediction tasks involving image sequences, medical time series, and clinical notes.
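As a rough illustration of how a discrete checklist can interface with continuous-valued data (a minimal sketch, not the paper's implementation): a predictive checklist outputs a positive prediction when at least T of its N items are checked. If a neural concept extractor maps raw inputs (images, time series) to per-item check probabilities, the probability that the checklist fires is the tail of a Poisson-binomial distribution, which remains differentiable in those probabilities. The function name, the independence assumption, and the example threshold below are illustrative.

```python
import torch

def checklist_positive_prob(concept_probs: torch.Tensor, threshold: int) -> torch.Tensor:
    """Probability that at least `threshold` of the N checklist items are checked.

    concept_probs: shape (N,), the probability that each checklist item is
    checked (e.g., produced by a neural concept extractor from continuous data).
    Items are treated as independent Bernoulli variables, so the number of
    checked items follows a Poisson-binomial distribution, evaluated here with
    a simple dynamic program.
    """
    n = concept_probs.shape[0]
    # dist[k] = probability that exactly k items are checked so far
    dist = torch.zeros(n + 1, dtype=concept_probs.dtype)
    dist[0] = 1.0
    for p in concept_probs:
        shifted = torch.cat([torch.zeros(1, dtype=dist.dtype), dist[:-1]])
        dist = (1 - p) * dist + p * shifted  # item unchecked vs. checked
    return dist[threshold:].sum()

# Example: a 5-item checklist that predicts positive when >= 3 items are checked.
probs = torch.tensor([0.9, 0.8, 0.2, 0.6, 0.1])
print(checklist_positive_prob(probs, threshold=3))
```

Because this probability is differentiable with respect to the concept probabilities, gradients can flow back into the concept extractor, which is the general idea behind coupling probabilistic logic with continuous inputs.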
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: In this revision, we have incorporated the feedback provided by Reviewer n7Qo, and all changes are highlighted in red. The changes are:
- Updated the caption of Figure 1.
- Added a comparison of our technique with DeepProbLog in Related Works.
- Corrected the typo in the notation for the binary mapping in Section 3.
- Elaborated on the hyperparameter d_k’ in Section 4.3.
- Moved parts of the proof of Proposition 4.1 from Appendix A to Section 4.4.
- Corrected the typo in Equation 6 (Decision Tree Logic Rule).
- Reordered the datasets in Section 5.1 to match the order in Table 1.
- Updated the discussion on the interpretability of concepts in Section 6 to address reasoning shortcuts in neuro-symbolic models.
- Updated Appendix A to include an example of probabilistic logic programming and the program used for learning checklists and decision trees.
- Added a discussion on why balanced trees enhance the interpretability of the concepts in Appendix K.2.
- Included a discussion on k-subset sampling in Appendix L (Limitations).
Assigned Action Editor: ~Alp_Kucukelbir1
Submission Number: 3053