TL;DR: We propose the Optimal Information Retention Principle to guide the interpretation of deep models for time-series data, theoretically derive the corresponding objective function, and propose a practical framework.
Abstract: Explaining deep models for time-series data is crucial for identifying key patterns in sensitive domains such as healthcare and finance. However, lacking a unified optimization criterion, existing explanation methods often suffer from redundancy and incompleteness: irrelevant patterns are included in explanations, or key patterns are missed. To address this challenge, we propose the Optimal Information Retention Principle, which uses conditional mutual information to cast minimizing redundancy and maximizing completeness as optimization objectives, and we theoretically derive the corresponding objective function. As a practical realization, we introduce ORTE, an explanation framework that learns a binary mask to eliminate redundant information while mining the temporal patterns of explanations. We decouple the discrete mapping process to stabilize gradient propagation, and we employ contrastive learning so that the mask filters explanatory patterns precisely, realizing a trade-off between low redundancy and high completeness. Extensive quantitative and qualitative experiments on synthetic and real-world datasets demonstrate that the proposed principle significantly improves the accuracy and completeness of explanations compared to baseline methods. The code is available at https://github.com/moon2yue/ORTE_public.
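To make the principle concrete, here is one plausible formalization in our own notation (a sketch, not necessarily the paper's exact derivation): with input X, model output Y, and binary mask M, completeness is rewarded by the mutual information the masked input retains about the output, and redundancy is penalized by the conditional mutual information the masked input shares with the input beyond what is relevant to the output:

$$\max_{M \in \{0,1\}^{T \times D}} \; I(M \odot X;\, Y) \;-\; \lambda\, I(M \odot X;\, X \mid Y),$$

where $\odot$ is elementwise masking over $T$ time steps and $D$ features, and $\lambda$ trades off the two terms.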
Lay Summary: Explaining deep models for time-series data is crucial in domains such as healthcare and finance. However, time-series explanations are often mixed with redundant information or miss key information. To solve this problem, we propose the principle of optimal information retention from the perspective of information theory: minimize redundant information and maximize effective information. We theoretically derive objective functions from this principle. As a practical framework, we propose ORTE, an explanation framework that captures temporal patterns by learning a mask matrix. We decouple the discrete mapping process to ensure stable gradient propagation, and we use contrastive learning so that the mask filters explanatory patterns accurately, achieving a trade-off between low redundancy and high completeness (see the sketch below). Extensive quantitative and qualitative experiments on synthetic and real-world datasets show that the proposed principle significantly improves the accuracy and completeness of explanations compared to baseline methods.
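The "decoupled discrete mapping" described above is consistent with a straight-through estimator, a common way to learn a hard 0/1 mask while keeping gradients stable. The sketch below is hypothetical (our illustration, not the ORTE code): the forward pass applies a discrete mask, while the backward pass routes gradients through the underlying continuous logits.

```python
import torch

class StraightThroughMask(torch.nn.Module):
    """Binary mask over a (time, feature) grid with a straight-through
    estimator: forward uses hard 0/1 values, backward flows through the
    continuous mask probabilities. A sketch of one way to decouple the
    discrete mapping from gradient propagation; names are hypothetical."""

    def __init__(self, seq_len: int, n_features: int):
        super().__init__()
        self.logits = torch.nn.Parameter(torch.zeros(seq_len, n_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        probs = torch.sigmoid(self.logits)   # soft mask in (0, 1)
        hard = (probs > 0.5).float()         # discrete 0/1 mask
        # Straight-through trick: value equals `hard`, gradient sees `probs`.
        mask = hard + probs - probs.detach()
        return mask * x                      # masked (explained) input

# Usage: mask broadcasts over the batch dimension.
mask_layer = StraightThroughMask(seq_len=100, n_features=8)
x = torch.randn(32, 100, 8)                 # (batch, time, features)
x_explained = mask_layer(x)
```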
Primary Area: Applications->Time Series
Keywords: Explanations, Time Series, Optimal Information Retention
Submission Number: 1308