\section{Conclusion}
We used HMLNs to combine embeddings learned from a DNN with symbolic relational knowledge. However, when the representations used during training and testing come from different DNNs, the embeddings may exhibit covariate shift even if the two DNNs optimize the same function. We developed an approach to learn HMLNs under covariate shift in the embeddings, based on two main ideas. First, instead of a single model, we learned a mixture model to reduce uncertainty in the learned representation. Second, we reparameterized the learned HMLN weights to account for covariate shift in the embeddings during inference. To determine this shift efficiently, we used a calibrated probabilistic classifier that estimates the density ratio between embeddings in the training and test distributions. We evaluated our approach on Graph Neural Networks and within Deep Knowledge Tracing, a well-known cognitive model that infers student knowledge over time. Our results showed that in the presence of relational structure and covariate shift, our approach outperforms state-of-the-art methods.

In the future, we plan to extend our approach to verify properties of embeddings. Specifically, we want to understand whether embeddings align with symbolic relational properties that can be specified by domain experts. We plan to apply this approach to cognitive models of student learning, where explainability of embeddings is important for practical applicability. Further, we plan to develop more scalable solutions for inference within our model. As mentioned earlier, there is a trade-off between inference efficiency and reduced uncertainty. We see two potential directions for improving scalability. First, mixture-of-experts architectures have been very successful in DNNs, including Large Language Models, and we can adapt similar approaches for computational efficiency within our model. Second, there has been substantial research on {\em lifted inference} methods for Markov Logic Networks~\cite{gogate2011probabilistic}, where the idea is to leverage exact or approximate symmetries to improve the scalability of standard inference approaches such as Gibbs sampling~\cite{venugopal&gogate12}. We plan to adapt some of these approaches to scale up inference in our HMLN model.
