- Abstract: Learning explainable patient temporal embeddings from observational data has mostly ignored the use of RNN architecture that excel in capturing temporal data dependencies but at the expense of explainability. This paper addresses this problem by introducing and applying an information theoretic approach to estimate the degree of explainability of such architectures. Using a communication paradigm, we formalize metrics of explainability by estimating the amount of information that an AI model needs to convey to a human end user to explain and rationalize its outputs. A key aspect of this work is to model human prior knowledge at the receiving end and measure the lack of explainability as a deviation from human prior knowledge. We apply this paradigm to medical concept representation problems by regularizing loss functions of temporal autoencoders according to the derived explainability metrics to guide the learning process towards models producing explainable outputs. We illustrate the approach with convincing experimental results for the generation of explainable temporal embeddings for critical care patient data.