\section{Introduction}\label{sec:intro}
Hybrid Markov Logic Networks (HMLNs)~\citep{10.5555/1620163.1620244} are statistical relational models that compactly represent probabilistic graphical models using first-order logic structures. They are particularly well-suited for neuro-symbolic (NeSy) reasoning~\citep{Kautz_2022} since they allow us to unify discrete symbolic knowledge with real-valued functions. Specifically, real-world data in critical domains such as education and healthcare often has relational structure, and using an HMLN, we can develop models that combine symbolic relationships in the data with deep representations (or embeddings) learned from the same data.

While a standard HMLN can, representation-wise, be used directly to express such a model, learning and inference pose two main challenges that we address in this paper. First, real-valued functions defined over embeddings may carry greater uncertainty than symbolic relationships. For instance, suppose $e_1,e_2$ are embeddings learned by a deep neural network (DNN) for $X_1,X_2$ respectively, and consider the product of a function over the embeddings, $f(e_1,e_2)$, and a symbolic relationship such as $\texttt{Friends}(X_1,X_2)$ $\wedge$ $\texttt{LikesSports}(X_1)$ $\Rightarrow$ $\texttt{LikesSports}(X_2)$. Here, the symbolic relationship is directly observable in the data, whereas $f(e_1,e_2)$ is inferred only indirectly from the data through the DNN. In a typical HMLN formulation, we would assume that a single real-valued weight can parameterize the hybrid formula that combines the real-valued function with the symbolic relationship. However, when the domain of the real-valued function consists of embeddings learned by the DNN, embedding variability adds an extra layer of uncertainty. To address this, we develop a mixture model that combines variations of the embedding to reduce uncertainty in the parameterization.

The second challenge is related to inference. In a typical scenario, we fix the parameters of the HMLN learned from training data and then perform inference conditioned on evidence observed in test data. However, in some cases, the embeddings learned from test data may exhibit covariate shift, that is, their distribution differs from that of the training embeddings even though the conditional label distribution remains invariant (in the case of discriminative learning). For instance, if we update the model architecture or if the test data varies even slightly, the change in embeddings can be quite significant~\citep{Shu_Zhu_2019}. This implies that we may not be able to use exactly the same parameters that we learned during training to perform inference on test data. To address this, we develop a reparameterization approach that modifies the parameters learned during training using the covariate shift in embeddings observed at test time. Specifically, we normalize the learned parameters of the HMLN with density ratios of the embeddings that occur within those formulas. Since the exact densities are intractable to compute, we estimate the ratios with a probabilistic classifier~\citep{NEURIPS2019_d76d8dee}.
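
As a brief sketch of how a probabilistic classifier can yield a density ratio (a standard identity; the notation here is introduced only for illustration and is an assumption, not necessarily the formulation used later in the paper), let $p_{\mathrm{tr}}$ and $p_{\mathrm{te}}$ denote the training and test densities of an embedding $e$, and let $c(e)$ be a classifier trained on $n_{\mathrm{tr}}$ training and $n_{\mathrm{te}}$ test embeddings to output the probability that $e$ was drawn from the test set. By Bayes' rule,
\[
\frac{p_{\mathrm{te}}(e)}{p_{\mathrm{tr}}(e)}
= \frac{P(\mathrm{tr})}{P(\mathrm{te})}\cdot\frac{P(\mathrm{te}\mid e)}{P(\mathrm{tr}\mid e)}
\approx \frac{n_{\mathrm{tr}}}{n_{\mathrm{te}}}\cdot\frac{c(e)}{1-c(e)},
\]
so the ratio can be estimated from the classifier's output alone, without evaluating either density. The direction of the ratio (test over training) is likewise an assumption made for this illustration.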

In our experiments, we show that using embeddings learned from Graph Neural Networks on commonly used benchmarks, we can learn a model that has a better fit (measured through conditional log-likelihood) than current state-of-the-art methods that augment statistical relational models with DNNs, such as Neural PSL~\citep{ijcai2023p0461} and DeepStochLog~\citep{deepstoch}. We show that the difference between our approach and existing methods is particularly significant when the test embedding distribution differs from the training distribution.
Next, we demonstrate our approach using a cognitive model called Deep Knowledge Tracing (DKT)~\citep{NIPS2015_bac9162b}, which is commonly used to represent student knowledge, and show that our approach outperforms the standard DKT model in the presence of covariate shift.

