% This is samplepaper.tex, a sample chapter demonstrating the
% LLNCS macro package for Springer Computer Science proceedings;
% Version 2.21 of 2022/01/12
%
\documentclass[runningheads]{llncs}
%
\usepackage[T1]{fontenc}
% T1 fonts will be used to generate the final print and online PDFs,
% so please use T1 fonts in your manuscript whenever possible.
% Other font encodings may result in incorrect characters.
%
\usepackage{graphicx}
% Used for displaying a sample figure. If possible, figure files should
% be included in EPS format.
%
% If you use the hyperref package, please uncomment the following two lines
% to display URLs in blue roman font according to Springer's eBook style:
\usepackage{hyperref}
% \usepackage{cite}
\usepackage{color}
\renewcommand\UrlFont{\color{blue}\rmfamily}
\urlstyle{rm}
%references
\usepackage[english]{babel}
\usepackage[backend=biber, style=lncs]{biblatex}
\addbibresource{references.bib}
%comments
\usepackage{comment}
\usepackage[textwidth=2.5cm,textsize=footnotesize,colorinlistoftodos]{todonotes}
%macros
\usepackage{macros}
%math
\usepackage{amsmath}
\usepackage{amssymb}
% \usepackage{amsthm}
\usepackage{cleveref}
\usepackage{stmaryrd}
%\theoremstyle{definition}
%code
\usepackage{minted}
\usemintedstyle{tango}

% macros
\newcommand{\tensor}{\cmcon}
\newcommand{\cotensor}{\cmdis}
\newcommand{\de}{\mathrm{d}}

\begin{document}
%
%\title{Formalizing First-Order Differentiable Logics}
\title{Quantifiers for Differentiable Logics in Rocq (Extended Abstract)}
%
\titlerunning{Quantifiers for Quantitative Logics in Rocq}
% If the paper title is too long for the running head, you can set
% an abbreviated paper title here
%
\author{Jairo Miguel Marulanda-Giraldo\inst{1} \and
Ekaterina Komendantskaya \inst{1,2}
%\orcidID{0000-0002-3240-0987}
\and
Alessandro Bruni\inst{3}
%\orcidID{0000-0003-2946-9462}
\and
Reynald Affeldt\inst{4}
%\orcidID{0000-0002-2327-953X}
\and
Matteo Capucci\inst{5}
\and
Enrico Marchioni\inst{1}
}
%
\authorrunning{Jairo M.\ Marulanda-Giraldo et al.}
% First names are abbreviated in the running head.
% If there are more than two authors, 'et al.' is used.
%
\institute{University of Southampton, UK \and
Heriot-Watt University, UK \and
IT-University of Copenhagen, Denmark
\and
National Institute of Advanced Industrial Science and Technology (AIST), Japan
 \and
Independent Researcher, Italy
}
%
\maketitle              % typeset the header of the contribution
%
\footnotetext{Komendantskaya and Capucci are funded by the Advanced Research + Invention Agency (ARIA).}
%
%
%
\begin{abstract}
    The interpretation of logical expressions into loss functions has given rise to so-called differentiable logics. They function as a bridge between formal logic and machine learning, offering a novel approach for property-driven training. The added expressiveness of these logics comes at the price of a more intricate semantics for first-order quantifiers. To ease their integration into machine-learning backends, we explore how to formalize semantics for first-order differentiable logics using the Mathematical Components library in the Rocq proof assistant. We seek to give rigorous semantics for quantifiers, verify their properties with respect to other logical connectives, as well as prove the soundness and completeness of the resulting logics.
\end{abstract}

\keywords{Neural
Network Verification  \and Formal Specifications \and Loss Functions \and Differentiable Logics \and Interactive Theorem Proving.}

\section{Introduction}
Quantitative logics, i.e.~logics that have semantics over the real numbers instead of over $\{0,1\}$, have been studied for decades and date back to the ideas of Kleene, G\"{o}del, and Łukasiewicz at the start of the 20th century \cite{cintula2011handbook,prooffuzzy}. Fuzzy logics \cite{prooffuzzy} and the logics of the Lawvere quantale \cite{prooffuzzy, bacci, dl2} are important examples of quantitative logics. To illustrate, consider a toy syntax with atomic propositions and conjunction, such as
\begin{equation}
\begin{split}
    \Phi \ni \phi &:= A \,|\, \phi \land \phi
\end{split}
\end{equation}
where $A$ is interpreted in a domain $D \subseteq \Ereal$. The domain $D$ varies among logics and restricts the interpretation of the connectives. For example,
G\"{o}del logic has a standard semantics over $[0, 1]$ in which conjunction is interpreted as the minimum function.

Recently, there has been a surge of interest in quantitative logics, stimulated by the growing interest in \emph{AI safety} \cite{davidad24, dalrymple2024guaranteedsafeaiframework}. Differentiable Logics (DLs) form a family of methods that apply key insights from quantitative logics to this domain for property-driven learning~\cite{ldl}.
%\knote{what does interpretability refers to here? either  make clear or remove}
Generally, it is considered desirable to be able to use machine learning algorithms in a way that imposes certain logical specifications during training \cite{robust,varnai}. Differentiable logics have been shown to effectively translate arbitrary logical specifications into real-valued and differentiable functions that, in turn, can be used as \emph{loss functions} in standard gradient-descent algorithms \cite{casadio}. Such loss functions
%\knote{I'd remove property-driven here, and just say "such loss functions"}
help improve the adherence of the resulting neural networks  to specifications \cite{comparingdls}. At the same time, DLs have proven to be useful in compiling specifications for the back-ends of neural network verifiers \cite{vehicle}, a process necessary to provide programming language support to property-driven training \cite{FoMLAS2023:Vehicle_Tutorial_Neural_Network}. This calls for stronger guarantees about the correctness of such compilers, and rigorous semantics for DLs, as well as their soundness, completeness, and compositionality \cite{taming, casadio, ldl}. 

Nevertheless, there is one fundamental problem that differentiable logics face. Many specifications of interest for machine learning involve quantifiers, yet the majority of quantitative logics are propositional \cite{bacci, prooffuzzy, ldl}.
A canonical specification of this kind is \textit{robustness} \cite{casadio2022neuralnetworkrobustnessverification}, i.e.\ small perturbations to the inputs of a neural network should result in small changes to its output. Formally:
\begin{definition}[$\epsilon$-$\delta$-Robustness] % no space here
\label{Robustness}%
    Let $\epsilon, \delta \in \real^+$, $\|\cdot\|$ be a norm, and $f : \real^n \rightarrow \real^m$ be a measurable function.
    One says \textit{$f$ is $\epsilon$-$\delta$-robust} around $\bar x \in \real ^ n$ if
    \begin{equation}
    \label{eq:robustness}
        \forall x\in \real^n , \|x - \bar x\| \leq \epsilon \Rightarrow \|f(x) - f(\bar x)\| \leq \delta.
    \end{equation}
\end{definition}
Extending sound and complete propositional quantitative logics to first-order logic often comes at the expense of either completeness or continuity.
For example, the first-order extension of Gödel logic is the only one, among the most prominent fuzzy logics \cite{prooffuzzy,ldl}, that is sound and complete w.r.t.\ models with values in $[0,1]$ and with universal and existential quantifiers interpreted as infima and suprema \cite{firstgodel}.
However, the connectives of this logic are not continuous and therefore not suitable for gradient-descent algorithms.
%\knote{the end of this sentence is grammatically wrong, and implies the opposite of what you are trying to say. It seems you say that not all connectives of this logic are not suitable. Should be: "However, connectives of this logic are not continuous and therefore not suitable for gradient-descent algorithms."}

%For example, among several known fuzzy logics \cite{prooffuzzy, ldl}, the only first-order extension that is sound and complete involves the Gödel logic, that interprets conjunction as $\min$, disjunction as a $\max$, and universal and existential quantifiers as infima and suprema \cite{firstgodel}. However, connectives of this logic are not continuous and therefore not suitable for gradient-descent algorithms.

Recently, a promising solution was proposed by \citeauthor{capucci}: interpreting quantifiers as \textit{$p$-means} \cite{capucci}, a generalization of $p$-norms over a probability space \cite{lpspaces}. This new semantics gives hope that the open problem of finding a suitable approach to quantification in DLs can be resolved, and that a logic that is sound and complete relative to this quantitative semantics can be found.

With rigorous semantics for quantifiers, first-order DLs could be integrated into verifier back-ends. We must hence provide guarantees about the resulting logics, as well as about the behaviour of quantifiers with respect to the other logical connectives. Rigorous computer formalizations of propositional semantics for DLs have been used to this end \cite{taming}. Extending these formalizations to first-order logics is a non-trivial challenge that is yet to be overcome. Furthermore,
%\knote{Furthermore}
the new semantics proposed by Capucci presents a particular challenge for formal verification, since, unlike the previous formalizations of DLs \cite{taming}, it also involves results from real analysis and probability. Most notably, it involves formalizations of measure spaces, probability spaces, and Lebesgue integrals, as well as the use of results such as Jensen's and Hölder's inequalities \cite{inequalities}.

Rocq's Mathematical Components library (\mathcomp{}) \cite{mathcomp}
is a particularly good fit for this task, due to its extensive mathematical libraries. Many of the aforementioned standard results from measure theory are formalized in the library modules on algebra and analysis. However, some, such as the encoding of extended real numbers, still require further development.

 In this extended abstract, we first quickly review the approach to quantification proposed by Capucci, explain its relation to the available mathematical libraries in Rocq, and report on our current work on formalizing the novel semantics. With this formalization, we contribute towards developing the semantics for quantifiers in DLs. %Within this project, we plan to eventually prove the soundness and completeness of the resulting logic in Rocq.
 Tangentially, we extend \mathcomp{} as necessary. In the long term, this formalization is expected to become part of a larger collaborative project \cite{grant}, which develops a novel first-order quantitative logic and provides its full formalization in Rocq, including, if and when they are proven, the formalization of the soundness and completeness results for the logic.
 Our work seeks to aid in the development of programming language support for property-driven development of neural networks, as well as influence machine learning research in general \cite{vehicle,grant}. 
 

\section{Preliminaries}
\label{Preliminaries}
We introduce preliminaries from the extended arithmetic of the reals. They are an abridged version of \cite{capucci}; specifically, we do not address the `non-linear' fragment therein. We also diverge from \emph{ibid.} in notation, preferring standard linear logic notation.

Our base setting is the positive extended reals $[0,\infty]$, considered as a sup-lattice with the usual order $\leq$.
The topology on $\real^+$ is extended to $[0,\infty]$ by adding to the opens all the intervals $(a, \infty]$.
As a measure space, $[0,\infty]$ is considered equipped with the completion of its Borel $\sigma$-field (i.e.\ the Lebesgue $\sigma$-field), and then further equipped with the obvious extension of the Lebesgue measure given by setting $\lambda((a,\infty]) = \infty$ for $a < \infty$ and $\lambda(\{\infty\})=0$.

\begin{definition}[Multiplication]
\label{Multiplication}
    On $[0,\infty]$, \textbf{conjunctive multiplication} and \textbf{disjunctive multiplication} are, respectively, the following operations:
    \begin{equation}
		\begin{tabular}{c|ccc}
			$a \tensor b$ & $0$ & $a \in (0,\infty)$ & $\infty$\\
			\cline{1-4}
			$0$ 			   & $0$ & $0$ 		& $0$\\
			$b \in (0,\infty)$ & $0$ & $ab$		& $\infty$\\
			$\infty$ 		   & $0$ & $\infty$ & $\infty$
		\end{tabular}
		\hspace*{10ex}
		\begin{tabular}{c|ccc}
			\textnormal{$a \cotensor b$} & $0$ & $a \in (0,\infty)$ & $\infty$\\
			\cline{1-4}
			$0$ 		 	   & $0$ 		& $0$ 	   & $\infty$\\
			$b \in (0,\infty)$ & $0$ 		& $ab$	   & $\infty$\\
			$\infty$ 		   & $\infty$ 	& $\infty$ & $\infty$
		\end{tabular}
	\end{equation}
\end{definition}

Notice $\tensor$ and $\cotensor$ differ only when $a$ is $0$ and $b$ is $\infty$, or \textit{vice versa}. Often we write $ab$ instead of $a \tensor b$.

\begin{definition}[Duality Operator]
\label{dual}
    Let $a \in [0,\infty]$. Then the \textbf{dual} of $a$ is
    \[  \cdual{a} =
    \begin{cases}
    1/a  & a \in (0,\infty)  \\
    \infty & a = 0 \\
    0 & a = \infty.
    \end{cases}
    \]
\end{definition}

Note $a \cotensor b = \cdual{(\cdual{a} \tensor \cdual{b})}$. Moreover we define $a \multimap b = \cdual{a} \cotensor b$, which extends the definition of $b/a$.
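
As a quick sanity check of these conventions, the operations can be evaluated on a finite pair and on the critical pair $(0,\infty)$. This is a worked illustration only; the values $2$ and $6$ are arbitrary choices.
\[
\begin{aligned}
2 \tensor 6 &= 2 \cotensor 6 = 12, &
2 \multimap 6 &= \cdual{2} \cotensor 6 = \tfrac{1}{2} \cdot 6 = 3,\\
0 \tensor \infty &= 0, &
0 \cotensor \infty &= \cdual{(\cdual{0} \tensor \cdual{\infty})} = \cdual{(\infty \tensor 0)} = \cdual{0} = \infty.
\end{aligned}
\]
The first row agrees with ordinary multiplication and division ($6/2 = 3$), while the second row shows the only case in which the two multiplications disagree, and that the value of $0 \cotensor \infty$ computed through duality matches the table in \cref{Multiplication}.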

\begin{comment}

\begin{definition}[Measurable Spaces and Functions]
    Let $S_{1}, S_{2}$ be sets and $\sigmal{S_1}, \sigmal{S_2}$ be $\sigma$-algebras. The pairs $(S_1, \sigmal{S_1})$ and $(S_2, \sigmal{S_1})$ are \textbf{measurable spaces}.\\ The function $f : S_1 \rightarrow S_2$ is \textbf{measurable} if and only if for every $E \in \sigmal{S_2}$ the preimage of $E$ under $f$ is in $\sigmal{S_1}$, that is, for all $E \in \sigmal{S_2}$
    \begin{equation}
        f^{-1}(E) = \{x \in S_{1} \, | \,f(x) \in E \} \in \sigmal{S_1}.
    \end{equation}
\end{definition}

\begin{definition}[Measure Space]
    Let $S$ be a set and $\sigmal{S}$ be a $\sigma$-algebra over $S$. A \textbf{measure} on $(S,\sigmal{S})$ is a function $\mu : \sigmal{S} \rightarrow [0,\infty]$ such that (1) $\mu (\varnothing) = 0$ and (2) if $\{ A_i : i \in I \}$ is a countable collection of pairwise disjoint sets in $\sigmal{S}$ then
    \begin{equation}
        \mu \left( \bigcup_{i \in I} A_i \right) = \sum_{i \in I} \mu (A_i).
    \end{equation}
    The triple $(S, \sigmal{S}, \mu)$ is called a \textbf{measure space}, and a \textbf{probability space} when $\mu(S)=1$, in which case $\mu$ is often denoted as $\mathbb{P}$.
\end{definition}

\begin{definition}[Random Variable]
    A \textbf{random variable} with values in a measurable space $T$ is a measurable function $X : S \rightarrow T$.
\end{definition}

We give the following definitions for positive functions only, since this is the integrals we use below.

\begin{definition}[Lebesgue Integral]
    Let $(S, \sigmal{S})$ be a measurable space.
    A \textbf{simple function} on $S$ is one that can be written as a finite linear combination of indicator functions of measurable subsets of $S$, i.e. one of the form $f = \sum_{i \in I}a_{i}\textbf{1}_{A_i}$.
    If $\mu$ is a measure on $(S, \sigmal{S})$ then:
    \begin{enumerate}
        \item If $f = \sum_{i \in I}a_{i}\textbf{1}_{A_i}$ is a nonnegative simple function, the \textbf{Lebesgue integral} of $f$ is
        \begin{equation}
            \int_{S} f \, \de\mu = \sum_{i \in I} a_i \tensor \mu(A_i).
        \end{equation}
        \item If $f : S \rightarrow [0, \infty]$ is a measurable function, the \textbf{Lebesgue integral} of $f$ is
        \begin{equation}
            \int_{S} f \, \de\mu = \sup\left\{ \int_{S} g\, \de\mu : g \text{ is simple and } g \leq f \right\}.
        \end{equation}
        % \item If $f : S \rightarrow \Ereal$ is a measurable function, the \textbf{Lebesgue integral} of $f$ is
        %  \begin{equation}
        %      \int_{S} f \, \de\mu = \int_{S} max(f,0) \, \de\mu - \int_{S} max(-f,0) \, \de\mu.
        %  \end{equation}
        %  Assuming at least one of the integrals on the right is finite.
    \end{enumerate}
\end{definition}
\end{comment}

\subsection[p-Means]{$p$-Means}
\label{p-mean}
The following definitions relate specifically to the new quantifier semantics. They are what are classically known as generalized weighted means \cite{mitrinovic1970analytic}, though the geometric mean, much like multiplication above, bifurcates into a conjunctive and a disjunctive version.
\\\\
Throughout the following, fix a probability space $(S,\sigmal{S}, \mathbb{P})$.

\begin{definition}[$p$-Means]
\label{pmean}
    Let $f : S \rightarrow \PEreal$ be a measurable function. For $p \in (0, \infty)$, the \textbf{(generalized weighted) $p$-mean} of $f$ is
    \begin{equation}
        \LM{f}_{S,p} := \left(\int_{S} f(s)^p\, \de\mathbb{P}(s)\right)^{1/p}
    \end{equation}
    where we extend the function $(-)^p$ to $[0,\infty]$ as follows
    \begin{equation}
        \infty^{p} =
        \begin{cases}
            1  & p = 0  \\
            \infty & p > 0
        \end{cases}
        \hspace*{10ex}
        0^{p} = 0.
    \end{equation}
    Dually, the \textbf{(generalized weighted) harmonic $p$-mean} of $f$ is
    \begin{equation}
        \LM{f}_{S,-p} := \cdual{\left(\LM{\cdual{f}}_{S,p}\right)}.
    \end{equation}
\end{definition}

When $S$ can be inferred from the context, we write $\LM{f}_{p}$.
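
To make the definition concrete, here is a small worked instance under assumed data (the space and the function are chosen arbitrarily for illustration). Take $S = \{s_1, s_2\}$ with every subset measurable, $\mathbb{P}(\{s_1\}) = \mathbb{P}(\{s_2\}) = 1/2$, and $f(s_1) = 1$, $f(s_2) = 4$. Then
\[
\begin{aligned}
\LM{f}_{1} &= \tfrac{1}{2}\cdot 1 + \tfrac{1}{2}\cdot 4 = \tfrac{5}{2},\\
\LM{f}_{2} &= \left(\tfrac{1}{2}\cdot 1^2 + \tfrac{1}{2}\cdot 4^2\right)^{1/2} = \sqrt{17/2} \approx 2.92,\\
\LM{f}_{-1} &= \cdual{\left(\tfrac{1}{2}\cdot 1 + \tfrac{1}{2}\cdot \tfrac{1}{4}\right)} = \cdual{\left(\tfrac{5}{8}\right)} = \tfrac{8}{5}.
\end{aligned}
\]
In this instance $\LM{f}_{1}$ is the arithmetic mean and $\LM{f}_{-1}$ the harmonic mean of the two values of $f$.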

The definition of $p$-means can be extended to $p=0$ and $p=\infty$ by taking limits \cite{capucci}. First we have

\begin{lemma}
\label{limitinfty}
    As $p \longrightarrow +\infty$,
    \begin{equation}
        \LM{f}_{+p} \longrightarrow \esup{f} =: \LM{f}_{+\infty},
        \qquad
        \LM{f}_{-p} \longrightarrow \einf{f} =: \LM{f}_{-\infty}.
    \end{equation}
\end{lemma}

These quantities are defined as follows:

\begin{definition}[Essential Extrema]
    Let $(S, \sigmal{S}, \mu)$ be a measure space and $f : S \rightarrow \PEreal$ a measurable function.
    \begin{enumerate}
        \item Let $U = \left\{ a \in \PEreal : \mu(\{ x \in S : a < f(x)\}) = 0\right\}$ and $\inf(U)$ be the infimum of $U$. The \textbf{essential supremum} of $f$
        is
        \begin{equation}
            \esup{f} = \inf U
        \end{equation}
        recalling that $\inf \varnothing = \infty$.
        \item The \textbf{essential infimum} of $f$ is
        \begin{equation}
            \einf{f} = - \,\esup{- f}
        \end{equation}
    \end{enumerate}
\end{definition}
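
The difference between the essential supremum and the ordinary supremum can be seen on a simple example, sketched here under the assumption that $S = [0,1]$ carries the Lebesgue measure $\lambda$ (the function below is hypothetical and chosen only for illustration). Let $f(1/2) = 5$ and $f(x) = 1$ for $x \neq 1/2$. Then
\[
\lambda(\{x \in S : a < f(x)\}) =
\begin{cases}
1 & a < 1 \\
\lambda(\{1/2\}) = 0 & 1 \leq a < 5 \\
\lambda(\varnothing) = 0 & a \geq 5
\end{cases}
\qquad\text{so}\qquad
U = [1,\infty], \quad \esup{f} = \inf U = 1,
\]
whereas $\sup f = 5$. The single point at which $f$ is large has measure zero and is therefore ignored.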

On the other end of the spectrum, we have:

\begin{lemma}
\label{limitzero}
    As $p \longrightarrow 0$, both $\LM{f}_{+p}$ and $\LM{f}_{-p}$ converge; their limits define the \textbf{disjunctive} and \textbf{conjunctive geometric means}, respectively:
    \begin{equation}
        \LM{f}_{+0} := \lim_{p \to 0} \LM{f}_{+p},
        \qquad
        \LM{f}_{-0} := \lim_{p \to 0} \LM{f}_{-p}.
    \end{equation}
\end{lemma}

For bounded functions, these quantities coincide with the classical (weighted) geometric mean:

\begin{definition}[Geometric Mean]
    Let $(S,\sigmal{S}, \mu)$ be a measure space with $0 < \mu(S) < \infty$ and $f : S \rightarrow [0,\infty)$ a measurable function. The \textbf{geometric mean} of $f$ is
    \begin{equation}
        GM[f] = \exp \left( \frac{1}{\mu(S)} \int_{S} \ln f(s)\, \de\mu(s)\right)
    \end{equation}
\end{definition}

For unbounded functions, conjunctive and disjunctive geometric means may differ in the same way as $\tensor$ and $\cotensor$, namely in the way they handle $0$ and $\infty$. See \cite{capucci} for clarifications.

\section{Proposed Language and its Semantics}
We introduce the main ideas for first-order quantitative logic following Capucci \cite{capucci}, where the case is made that the positive reals support a family of substructural logics in which the multiplicative connectives are interpreted as actual multiplication and the additive connectives as $p$-norms, which converge to the \emph{actual} additives as $p \to \infty$.
We stress that only a language (i.e.\ a syntax for formulae), and not a logic (i.e.\ an entailment relation), is defined therein. Here we propose a simplified version of that language which features only multiplicative connectives (in the style of classical multiplicative linear logic \cite{galatos2007residuated}).

For simplicity, we use the same symbols as in \Cref{Preliminaries} for our \emph{language}.

A \emph{first-order theory} over this language is given by a fixed set of sorts $\cal S$ and a family of atomic predicates for each context, denoted as $\{{\cal A}(\vec X)\}_{\vec X \in \mathrm{List}\cal S}$.
Recall a context is a finite (and possibly empty) list of typed variables $\vec X = (x_1:X_1, \ldots, x_n:X_n)$, where $X_i \in \cal S$. Then, for each context, and simultaneously over all contexts, we inductively define the set of formulae of the theory $\Phi(\vec X)$ by closing the atomic predicates under duality, multiplicative conjunction, and universal and existential quantification over \emph{fresh} variables:
\begin{equation}
    \begin{split}
        \Phi(\vec X) \ni \phi(\vec x) :=\ &A(\vec x) \in {\cal A}(\vec X) \\
        &|\, \cdual{\phi(\vec x)} \\
        &|\, \phi(\vec x) \cmcon \phi(\vec x) \\
        &|\, \pforall{p}{y}{Y}{\psi(y,\vec x)} \\
        &|\, \pexists{p}{y}{Y}{\psi(y,\vec x)}
    \end{split}
\end{equation}
where $p \in [0,\infty]$, $Y$ is a sort, and $\psi \in \Phi(Y, \vec X)$.
Let $\Phi = \bigcup_{\vec X} \Phi(\vec X)$ be the set of formulae over arbitrary contexts.
We encode multiplicative disjunction and linear implication, respectively, as
\[
\phi_{1} \cmdis \phi_{2} := \cdual{(\cdual{\phi_{1}} \cmcon \cdual{\phi_{2}})} \qquad \phi_{1} \cdiv \phi_{2} := \cdual{\phi_{1}} \cmdis \phi_{2}.
\]

An interpretation of such a theory is given by (1) a choice of probability space $\m{X}$ for each sort $X \in \cal S$, where, for a context $\vec X$ as above, we let $\m{\vec X} = \m{X_1} \times \cdots \times \m{X_n}$ (as well as $\m{()} = 1$); and (2) a given measurable function $\m{A}:\m{\vec X} \to [0,\infty]$ for each atomic predicate $A \in {\cal A}(\vec X)$.
Then the translation function (corresponding to \emph{multiplicative semantics} in \cite{capucci}) $\m{\,\cdot\,} : \Phi \rightarrow [0, \infty]$ is defined inductively on the structure of formulae as follows:
\begin{equation}
\label{semantics}
    \begin{split}
    &\m{\cdual{\phi(\vec x)}} := \cdual{\m{\phi(\vec x)}}\\
    &\m{\phi_{1}(\vec x) \cmcon \phi_{2}(\vec x)} := \m{\phi_{1}(\vec x)} \m{\phi_{2}(\vec x)}\\
    &\m{\pforall{p}{y}{Y}{\psi(y, \vec x)}} := \LM{\m{\psi(\cdot,\vec x)}}_{\m{Y},-p}\\
    &\m{\pexists{p}{y}{Y}{\psi(y,\vec x)}} := \LM{\m{\psi(\cdot,\vec x)}}_{\m{Y},p}
    \end{split}
\end{equation}
Hence the semantics is defined w.r.t. the quantale $[0,\infty]_{\cmcon}$ we described above (\cref{Multiplication}, \cite{capucci}), which we note is isomorphic to the \emph{Lawvere quantale} introduced in \cite{lawvereMetricSpacesGeneralized1973} and central in \cite{bacci, bacciPolynomialLawvereLogic2024}.

As an example of the usefulness of this semantics, we can use it to construct the \textit{softmax operator} \cite{softmax}, using the same logical formulae used for \textit{argmax}, as shown in \cite{capucci}.
Indeed, suppose $f : S \rightarrow [0,\infty)$ is a measurable function whose softmax we want to express.
The first-order theory of softmax has one sort $X$ and a single atomic predicate $\phi(x) \in {\cal A}(X)$.
We target $f$ by interpreting $X$ as $S$ and setting $\m{\phi(\bar x)} = f(\bar s)$.
Then the softmax of $f$ is obtained as follows:
\begin{equation}
\begin{split}
    (\operatorname{softmax} f)(\bar s) &= \m{(\pexists{1}{x}{X}{\phi(x)})\cdiv \phi(\bar x)}\\
    &=\cdual{\m{(\pexists{1}{x}{X}{\phi(x)})}} \cotensor \m{\phi(\bar x)}\\
    &=\cdual{(\LM{\m{\phi(x)}}_{S,1})} \cotensor \m{\phi(\bar x)}\\
    &=\cdual{\left(\int_{S} \m{\phi}(s)\de\mathbb{P}(s)\right)} \cotensor \m{\phi(\bar x)}\\
    &=\cdual{\left(\int_{S} f(s) \de\mathbb{P}(s)\right)} \cotensor f(\bar s)\\
    &=\dfrac{f(\bar s)}{\int_{S} f(s) \de\mathbb{P}(s)}
\end{split}
\end{equation}
Note that often $f=\exp(-\beta u)$ for some scoring function $u:S \to [-\infty,\infty]$ and inverse temperature $\beta \in (0,\infty]$---this is the form most common in machine learning \cite{deeplearning}.
Similarly, $\bar s \in \operatorname{argmax} f \iff \m{(\pexists{\infty}{x}{X}{\phi(x)})\cdiv \phi(\bar x)} \geq 1$.
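
To illustrate the normalization at work, consider the following worked instance, which assumes a finite space with uniform weights (the values of $f$ are arbitrary). Let $S = \{s_1, s_2, s_3\}$ with $\mathbb{P}(\{s_i\}) = 1/3$ and $f(s_1) = 1$, $f(s_2) = 2$, $f(s_3) = 5$. Then
\[
\int_{S} f(s)\, \de\mathbb{P}(s) = \tfrac{1}{3}(1 + 2 + 5) = \tfrac{8}{3},
\qquad
(\operatorname{softmax} f)(s_1) = \tfrac{3}{8}, \quad
(\operatorname{softmax} f)(s_2) = \tfrac{3}{4}, \quad
(\operatorname{softmax} f)(s_3) = \tfrac{15}{8}.
\]
These values integrate to $1$ against $\mathbb{P}$, since $\tfrac{1}{3}\left(\tfrac{3}{8} + \tfrac{6}{8} + \tfrac{15}{8}\right) = 1$, so the result is a density with respect to $\mathbb{P}$ and individual values may exceed $1$. Multiplying by the weights $\mathbb{P}(\{s_i\}) = 1/3$ recovers the familiar finite form $f(s_i)/\sum_j f(s_j)$, namely $\tfrac{1}{8}, \tfrac{2}{8}, \tfrac{5}{8}$. For the argmax characterization, $\esup{f} = 5$, so the quotient $f(\bar s)/\esup{f}$ is $1$ at $s_3$ and strictly below $1$ at $s_1$ and $s_2$.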

For a second example, we show how the robustness property of \Cref{Robustness} can be encoded in such a language.
Since this is usually a `hard' predicate, there are many ways to approach it as a soft predicate. Here we give a very crude such encoding, parametrised by the given constants $\epsilon,\delta \in \real^+$, the function $f \colon \real^m \to \real^n$, the point $\bar x \in \real^m$, as well as by a `softness degree' $p \in [0,\infty]$.
Thus we look at a first-order theory with one sort $X$ and predicates $E,D \in {\cal A}(X)$, and we interpret it by setting
\begin{equation}
    \m{X} = \real^m, \quad \m{E} = \mathbf{1}_{\{x \in \real^m \,\mid\, \|x-\bar x\| \leq \epsilon\}}, \quad \m{D} = \mathbf{1}_{\{x \in \real^m\,\mid\,\|f(x)-f(\bar x)\| \leq \delta\}}.
\end{equation}
%\knote{what is the bang?}
where $\mathbf{1}_A$ denotes the indicator function of a measurable set $A$.
Then a soft version of \eqref{eq:robustness} is given by
\begin{equation}
\begin{split}
     &\m{\pforall{p}{x}{X}{(E(x) \cdiv D(x))}} = \left(\int_{\real^m} \left(\dfrac{\m{E}(s)}{\m{D}(s)}\,\right)^{p} \de\mathbb{P}(s)\right)^{-1/p}
\end{split}
\end{equation}
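To see what this encoding computes, the integral can be evaluated by hand. The following is a sketch under the conventions of \cref{Multiplication,pmean}, assuming $p \in (0,\infty)$ and reading the quotient $\m{E}/\m{D}$ as $\m{E} \tensor \cdual{\m{D}}$, which is the dual of $\cdual{\m{E}} \cotensor \m{D}$. The names $B$ and $V$ are introduced here only for illustration. Let $B = \{x \in \m{X} \mid \|x - \bar x\| \leq \epsilon\}$ be the $\epsilon$-ball and $V = \{x \in B \mid \|f(x) - f(\bar x)\| > \delta\}$ the set of violations. The integrand equals $\infty$ on $V$, $1$ on $B \setminus V$, and $0$ outside $B$, hence
\[
\m{\pforall{p}{x}{X}{(E(x) \cdiv D(x))}} =
\begin{cases}
0 & \mathbb{P}(V) > 0 \\
\mathbb{P}(B)^{-1/p} & \mathbb{P}(V) = 0 \text{ and } \mathbb{P}(B) > 0 \\
\infty & \mathbb{P}(B) = 0.
\end{cases}
\]
In particular, the value is at least $1$ exactly when the hard property holds $\mathbb{P}$-almost everywhere. This also shows in what sense the encoding is crude: with indicator-valued predicates, any set of violations of positive measure drives the value to $0$, regardless of how large the violations are.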
\begin{comment}
\begin{equation}
    \m{X} = \real^m, \quad \m{E}(x) = \|x-\bar x\| \leq \epsilon, \quad \m{D}(x) = \|f(x)-f(\bar x)\| \leq \delta
\end{equation}
then
\begin{equation}
\begin{split}
    \eqref{eq:robustness} = \m{\pforall{\infty}{x}{X}{E(x) \cdiv D(x)}}% = \LM{\LM{\;g_2 / g_1\;}_{S,-p}}_{S,-p}
\end{split}
\end{equation}
\end{comment}
% We recover \Cref{Robustness} at $p=\infty$, whereas finite $p$ are soft versions of the same predicate.

% The predicate itself is then defined on $\real^m \times \real^m$, which is also the only sort in the theory

% and given measurable functions $g_1: S \to [0,\infty]$, $g_2: S \to [0,\infty]$ such that $g_1 = \m{\; ||\cdot - \cdot|| \leq \epsilon \;}$, $g_2 = \m{\; ||f(\cdot) - f(\cdot)|| \leq \delta \;}$, $S = \m{\real^m \times \real^m}$, the $\epsilon$-$\delta$-robustness of $f$ is
% \begin{equation}
% \begin{split}
%     \m{\textbf{$\epsilon$-$\delta$-}\operatorname{robustness} f} &= \m{\pforall{p}{x}{\real^m}{\pforall{p}{y}{\real^m}\\
%     &\quad\quad\quad{||x - y|| \leq \epsilon \cdiv ||f(x) - f(y)|| \leq \delta}}}\\
%     &= \LM{\LM{\;g_2 / g_1\;}_{S,-p}}_{S,-p}
% \end{split}
% \end{equation}

\begin{comment}

So far, $p$-means have been encoded as follows:

\begin{minted}{Coq}
Definition expeR x :=
  match x with
  | r%:E => (expR r)%:E
  | +oo => +oo
  | -oo => 0
end.

Definition lne x := 
    match x with
    | x'%:E => if x' == 0%R  then -oo else (ln x')%:E
    | +oo => +oo
    | -oo => 0 
end.

Definition ess_supe f :=
    ereal_inf [set r | mu [set x | r < f x] = 0].

Definition ess_infe f := - ess_supe (\- f).

Definition geo_mean P f :=
    expeR (\int[P]_x lne (f x)).

Definition Lnorme T mu p f :=
  match p with
  | p%:E => 
    if p == 0%R then mu (f @^-1` (setT `\ 0))
    else (\int[mu]_x (`|f x| `^ p%:E)) `^ p^-1%:E
  | +oo%E => 
    if mu [set: T] > 0 then ess_supe mu (abse \o f) else 0
  | -oo%E => 
    if mu [set: T] > 0 then ess_infe mu (abse \o f) else 0
  end.

Definition Lmeane T P p f :=
  if p == 0 then geo_mean P f else Lnorme P p f.
\end{minted}

\indent
Where \texttt{expR} denotes the real exponential function and \texttt{ln} denotes the natural logarithm \cite{mathcomp}. The functions \texttt{expeR} and \texttt{lne} correspond to their respective extensions into the extended real numbers. 
\\
This implementation corresponds to the \cref{pmean}. Here the encodings of the extended natural logarithm, the essential infimum, the  and geometric mean are and the $p$-mean are novel, while the encodings of the essential supremum and the $p$-norm are extensions of previous implementations able to take functions that go to extended reals.
\end{comment}

\section{Properties of Quantifiers}
\label{Properties of Quantifiers}
In the machine learning community there is a general consensus on the desirable properties of loss functions---convexity or continuity are widely considered desirable \cite{robust}. From a programming language perspective, there is no consensus as to how to define soundness for quantitative logics.
In the future, we intend to follow the general approach applied by \citeauthor{ldl}. %that is, for a typed FOL,  take provable FOL formulae to characterize the set of true FOL formulae \cite{ldl}. 
Moreover, \citeauthor{varnai} suggest characterizing quantitative logics in terms of their \textit{geometric properties}, valuable for optimization tasks \cite{varnai}. As for quantifiers, we want our formulation to possess good numerical properties and to behave similarly to quantifiers in classical logic. Currently we are working to formalize and prove the following properties in Rocq, which were presented by \citeauthor{capucci} \cite{capucci}.\\\\
\noindent
Throughout the following, let $\vec X$ be a context, $Y$ a sort, and $\phi(\vec x) \in \Phi(\vec X)$, $\psi_i(y, \vec x) \in \Phi(Y, \vec X)$.
\begin{lemma}[Duality]
\label{Duality}
\begin{enumerate}
\itemsep1em
    \item $\m{\pforall{p}{y}{Y}{\psi(y, \vec x)}} = \cdual{\m{\pexists{p}{y}{Y}{\cdual{\psi(y, \vec x)}}}}$ 
    
    \item $\m{\pexists{p}{y}{Y}{\psi(y, \vec x)}} = \cdual{\m{\pforall{p}{y}{Y}{\cdual{\psi(y, \vec x)}}}}$ 
\end{enumerate}
\end{lemma}
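
For $p \in (0,\infty)$, both identities can be obtained by unfolding the semantics. The following derivation is a sketch that relies only on the defining clauses of $\m{\,\cdot\,}$, on the definition of the harmonic $p$-mean (\cref{pmean}), and on the fact that $\cdual{(\cdual{a})} = a$ for all $a \in [0,\infty]$; the limiting cases $p \in \{0,\infty\}$ are not covered by it.
\[
\begin{aligned}
\cdual{\m{\pexists{p}{y}{Y}{\cdual{\psi(y, \vec x)}}}}
  &= \cdual{\left(\LM{\cdual{\m{\psi(\cdot,\vec x)}}}_{\m{Y},p}\right)}\\
  &= \LM{\m{\psi(\cdot,\vec x)}}_{\m{Y},-p}
   = \m{\pforall{p}{y}{Y}{\psi(y, \vec x)}},\\
\cdual{\m{\pforall{p}{y}{Y}{\cdual{\psi(y, \vec x)}}}}
  &= \cdual{\left(\LM{\cdual{\m{\psi(\cdot,\vec x)}}}_{\m{Y},-p}\right)}
   = \cdual{\left(\cdual{\left(\LM{\cdual{(\cdual{\m{\psi(\cdot,\vec x)}})}}_{\m{Y},p}\right)}\right)}\\
  &= \LM{\m{\psi(\cdot,\vec x)}}_{\m{Y},p}
   = \m{\pexists{p}{y}{Y}{\psi(y, \vec x)}}.
\end{aligned}
\]
The first identity is the definition of the harmonic $p$-mean read backwards; the second uses it once and then cancels the two pairs of nested duals.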

\begin{lemma}[Distributivity over Implication]
\begin{enumerate}
\itemsep1em
    \item $\m{\phi(\vec x) \cdiv \pforall{p}{y}{Y}{\psi(y, \vec x)}} = \m{\pforall{p}{y}{Y}{(\phi(\vec x) \cdiv \psi(y, \vec x))}}$ 
    
    \item $\m{\pforall{p}{y}{Y}{(\psi(y, \vec x) \cdiv \phi(\vec x))}} = \m{(\pexists{p}{y}{Y}{\psi(y, \vec x)}) \cdiv \phi(\vec x)}.$
\end{enumerate}
\end{lemma}
These lemmas are potentially useful for proving the \textit{residuation property}, an important feature of many quantitative logics \cite{galatos2007residuated}.
\begin{lemma}[Abductive]
Let $\m{Z} \subseteq \m{Y}$. Then
\begin{enumerate}
\itemsep1em
    \item $\m{\pexists{p}{z}{Z}{\psi(z, \vec x)}} \leq \m{\pexists{p}{y}{Y}{\psi(y, \vec x)}}$ 
    
    \item $\m{\pforall{p}{y}{Y}{\psi(y, \vec x)}} \leq \m{\pforall{p}{z}{Z}{\psi(z, \vec x)}}$.
\end{enumerate}
\end{lemma}
Intuitively, confidence depends on the amount of evidence. \\\\
%\begin{lemma}[Mean Embedding]
%\end{lemma}
\noindent
The following are often desirable properties of loss functions. 
\begin{lemma}[Monotonic]
If $\m{\psi_1} \leq \m{\psi_2}$ then 
\begin{enumerate}
\itemsep1em
    \item $\m{\pexists{p}{y}{Y}{\psi_1(y, \vec x)}} \leq \m{\pexists{p}{y}{Y}{\psi_2(y, \vec x)}}$
    \item $\m{\pforall{p}{y}{Y}{\psi_1(y, \vec x)}} \leq \m{\pforall{p}{y}{Y}{\psi_2(y, \vec x)}}$.
\end{enumerate}
\end{lemma}
\begin{lemma}[$p$-Monotonic and Bounded]
If $0 \leq q \leq p$ then 
\begin{enumerate}
\itemsep1em
    \item $\m{\pexists{q}{y}{Y}{\psi(y, \vec x)}} \leq \m{\pexists{p}{y}{Y}{\psi(y, \vec x)}} \leq \m{\pexists{\infty}{y}{Y}{\psi(y, \vec x)}}$
    \item $\m{\pforall{\infty}{y}{Y}{\psi(y, \vec x)}} \leq \m{\pforall{p}{y}{Y}{\psi(y, \vec x)}} \leq \m{\pforall{q}{y}{Y}{\psi(y, \vec x)}}$.
\end{enumerate}
\end{lemma}
Hence we can approximate the quantifier semantics of Gödel logic while maintaining differentiability.
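
The ordering can be checked numerically on a small instance; the data below are assumptions made purely for illustration. Interpret $Y$ as a two-point space with uniform weights $1/2$, and let $\m{\psi}$ take the values $1$ and $9$. Then
\[
\begin{aligned}
\m{\pexists{1}{y}{Y}{\psi(y)}} &= \tfrac{1}{2}(1 + 9) = 5, &
\m{\pexists{2}{y}{Y}{\psi(y)}} &= \sqrt{\tfrac{1}{2}(1 + 81)} = \sqrt{41} \approx 6.40, &
\m{\pexists{\infty}{y}{Y}{\psi(y)}} &= 9,\\
\m{\pforall{1}{y}{Y}{\psi(y)}} &= \cdual{\left(\tfrac{1}{2}\left(1 + \tfrac{1}{9}\right)\right)} = \tfrac{9}{5}, &
\m{\pforall{2}{y}{Y}{\psi(y)}} &= \cdual{\left(\sqrt{\tfrac{1}{2}\left(1 + \tfrac{1}{81}\right)}\right)} = \tfrac{9}{\sqrt{41}} \approx 1.41, &
\m{\pforall{\infty}{y}{Y}{\psi(y)}} &= 1.
\end{aligned}
\]
With $q = 1$ and $p = 2$ this gives $5 \leq 6.40 \leq 9$ for the existential quantifier and $1 \leq 1.41 \leq 1.8$ for the universal one, so increasing $p$ moves the values towards the supremum and the infimum, respectively.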

\section{Work in progress on the Rocq formalization}
In \citetitle{taming} a formalization for several quantitative logics was developed \cite{taming}. We seek to expand this formalization so that it is suitable for reasoning about first-order DLs, with $p$-means as the semantics for quantifiers. 
So far we have formalized the semantics presented in \cref{semantics}, and some basic properties of the $p$-means. To illustrate, we present the encodings needed for \cref{Duality}. Note the following implementations have been simplified for clarity.

To encode the $p$-mean, we use \texttt{Lnorm}, \mathcomp{}'s encoding of the $p$-norm \cite{lpspaces}, and add an encoding for the geometric mean.
\begin{minted}{Coq}
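(* p-norm over the extended reals; integration is w.r.t. the measure P *)
Definition Lnorm P p f :=
  match p with
  | p%:E => (\int[P]_x `|f x| `^ p) `^ p^-1
  | +oo => ess_sup P (abse \o f)
  | -oo => ess_inf P (abse \o f)
  end.

(* geometric mean: exponential of the integral of the logarithm *)
Definition geo_mean P f :=
  expeR (\int[P]_x lne (f x)).

(* the p-mean falls back to the geometric mean at p = 0 *)
Definition pmean P p f :=
  if p == 0 then geo_mean P f else Lnorm P p f.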
\end{minted}

Here, \texttt{ess\_sup}, \texttt{ess\_inf}, \texttt{geo\_mean}, and \texttt{pmean} correspond respectively to the essential supremum, essential infimum, geometric mean, and $p$-mean. For the dual, we use \mathcomp{}'s power function.
\begin{minted}{Coq}
Definition dual a := if a == 0 then +oo else a `^ -1.
\end{minted}

We can represent quantifiers in terms of the previous encodings, and add notations for clarity.
\begin{minted}{Coq}
Notation "x ^'" := (dual x).
Notation "'forall_  p f " := (pmean P p f).
Notation "'exists_  p f " := (('forall_p (fun y => (f y)^'))^').
\end{minted}
Lastly, \cref{Duality} is encoded as \texttt{Lemma Duality}, using the facts that the dual is involutive (i.e.\ applying it twice returns the original value) and that the harmonic $p$-mean is non-negative, encoded as \texttt{Lemma idem\_dual} and \texttt{Lemma forall\_gt0}, respectively.
\begin{minted}{Coq}
Lemma Duality p x : 
    (0 < p) -> 
    'forall_p (psi x) = ('exists_p (fun y => (psi x y)^'))^'. 
Proof.
  by move=> ?; rewrite (*this is true since*)
    idem_dual //= (*the dual is involutive and*)
    ?forall_gt0 //; (*the harmonic p-mean is non-negative and*)
  under eq_fun do rewrite (*in the body of the harmonic p-mean*)
    idem_dual //. (*the dual is involutive.*)
Qed.
\end{minted}
To formalize the rest of \cref{Properties of Quantifiers} in Rocq, as well as the lemmas in \cref{p-mean}, we are currently working on extending the analysis module of \mathcomp{}.
In particular, Hölder's inequalities must be generalized to extended-real-valued functions. In this process, we noticed that the original encoding of the power function over extended real numbers incorrectly assumed that its exponent is a real number greater than or equal to zero. The implementation has now been generalized to negative exponents.

\section{Conclusions and Future Work}
In this extended abstract we described our work in progress. We presented the main ideas behind a first-order quantitative logic to be applied in AI verification.
We presented a promising translation for quantifiers and introduced some desirable properties for this translation, following closely \cite{capucci}.
We argued for the usefulness of a computer formalization to provide compilation guarantees.
Lastly, we presented some preliminary progress on the formalization of these results in Rocq.\\\\
\noindent
In the future we hope to: 
\begin{enumerate}
    \item Develop Hilbert-style and sequent calculi for the language.
    \item Prove soundness and completeness for the resulting logic.
    \item Formalize the properties mentioned in \cref{Properties of Quantifiers} and the resulting proofs of soundness and completeness.
    \item Test the performance of the logic for property-driven training.
    \item Integrate our results into verification back-ends such as that of Vehicle \cite{vehicle}. 
\end{enumerate}

\section{Acknowledgements}
J. Marulanda-Giraldo and E. Komendantskaya acknowledge the partial support of the EPSRC grant AISEC: AI Secure and Explainable by Construction (EP/T026960/1).
M. Capucci and E. Komendantskaya were supported by the ARIA: Mathematics for Safe AI grant.
J. Marulanda-Giraldo received a PhD scholarship from the University of Southampton.\\\\
\hfill\begin{minipage}{\dimexpr\textwidth-1cm}
\textbf{Disclosure of Interests.} The authors have no competing interests to declare that are relevant to the content of this article. 
\end{minipage}


\printbibliography


\end{document}
