\section{Strictly Proper IP Scores}
\label{sec:rand-tailored-rules}

With a randomized choice of aggregation, the DM picks an aggregation rule at random after elicitation and uses it to evaluate the reported forecast. The forecaster therefore does not know which aggregation function will be applied, which can lead the forecaster back to indecision. To resolve this, the DM shares a distribution $\theta\in\Delta(\bm{\rho})$, where $\bm{\rho}$ is the class of aggregation functions the DM will pick from. The forecaster can then evaluate a report $\cQ$ by its expected value under $\theta$:
\begin{equation}
\label{eq:random-tailored-scoring-truthfulness}
V^{\cP}_{\theta}(\cQ):= \mathbb{E}_{\rho\sim \theta}[V^{\cP}_{\rho}(\cQ)].
\end{equation}
This allows the tailored scoring rule $s_\rho$ to be randomized with respect to the random variable $\rho$. Analogous to \Cref{def:tailoredscoringrule} for tailored scoring rules, we now define the randomized tailored scoring rule $s_{\theta}$.
\begin{definition}
    A regular IP scoring rule $s_{\theta}$ is randomized tailored for a DM with a class of aggregation functions $\bm{\rho}$ and a distribution $\theta\in\Delta(\bm{\rho})$ if, for any $k_\rho,c_\rho\in\mathbb{R}_{\geq0}$ and an arbitrary function $\Pi:2^{\Delta(\cO)}\times\cO\rightarrow \mathbb{R}$, the score is defined as
    \begin{equation*}
        s_{\theta}(\cQ,o)(\rho) = \begin{cases}
        k_\rho u(a^*_{\rho,\cQ},o)+c_\rho &\text{if } \theta(\rho)>0 \\
        \Pi_o(\cQ) & \text{if } \theta(\rho)=0
        \end{cases}.
    \end{equation*}
\end{definition}
Given that we have now extended the tailored scoring rule to random variables, in a similar spirit to \Cref{def:properscoringimprecisedecisionrule} on properness of IP scoring rules with aggregation, we define properness of randomized tailored scoring rules as follows.
\begin{definition}
    \label{def:properscoring-tailored}
    A randomized tailored scoring rule $s_{\theta}$ for a distribution $\theta\in\Delta(\bm{\rho})$ and a class of aggregation rules $\bm{\rho}$ is considered proper if, for all $\cP,\cQ\subseteq\Delta(\cO)$ with $\cQ\not\simeq\cP$,
    \begin{equation}
    \label{eq:properscoring-tailored-random}
        V_\theta^\cP(\cP)\geq V_\theta^\cP(\cQ).
    \end{equation}
    $s_{\theta}$ is strictly proper if the inequality in \Cref{eq:properscoring-tailored-random} is strict.
\end{definition}
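As a brief sketch of how the expectation in \Cref{eq:random-tailored-scoring-truthfulness} interacts with this definition, suppose, purely for illustration, that $\theta$ has finite support $\{\rho_1,\dots,\rho_m\}$ and that each $\rho_i$ individually satisfies $V^{\cP}_{\rho_i}(\cP)\geq V^{\cP}_{\rho_i}(\cQ)$. Both the finite support and the index $m$ are assumptions introduced only for this sketch. Then
\begin{equation*}
    V_\theta^\cP(\cP)-V_\theta^\cP(\cQ) = \sum_{i=1}^{m}\theta(\rho_i)\Big[V^{\cP}_{\rho_i}(\cP)-V^{\cP}_{\rho_i}(\cQ)\Big] \geq 0,
\end{equation*}
since every weight $\theta(\rho_i)$ is nonnegative. Under the same assumptions, the inequality becomes strict as soon as at least one $\rho_i$ with $\theta(\rho_i)>0$ yields a strict gap between $\cP$ and $\cQ$.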
Again, the strictness in \Cref{def:properscoring-tailored} adheres to the notion of truthfulness defined in \Cref{def:truthfulness}; we establish this connection later in this section. From \Cref{eq:random-tailored-scoring-truthfulness}, we can observe that randomized tailored scoring rules are proper for any choice of $\theta\in\Delta(\bm{\rho})$ as a direct consequence of \Cref{prop:tailoredscoringrule}. Before discussing how to build strictly proper IP scoring rules, we need to determine whether the credal set admits a unique representation in the action space, which would let the DM identify the credal set.
\begin{lemma}
\label{lemma:unqiue-representation-of-credal-set}
For any reported credal set $\cQ\subseteq\Delta(\cO)$ and a DM using a utility function $u$ such that $a^*_q := \arg\max_{a\in\cA}\mathbb{E}_q[u(a,o)]$ is unique for all $q\in\Delta(\cO)$, the set of actions $\cA^{\ext}_\cQ:=\Big\{a^{*}_{q}\Big\}_{q\in\ext(\cQ)}$ acts as a unique representation of a credal set $\cQ$ in action space $\cA$.
\end{lemma}
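To make the lemma concrete, consider a minimal example under assumptions that are not part of the setup above: a binary outcome space $\cO=\{0,1\}$, each $q\in\Delta(\cO)$ identified with the probability it assigns to $o=1$, an action space $\cA=[0,1]$, and the quadratic utility $u(a,o)=-(a-o)^2$. Then
\begin{equation*}
    \mathbb{E}_q[u(a,o)] = -q(1-a)^2-(1-q)a^2,
    \qquad
    \frac{\partial}{\partial a}\mathbb{E}_q[u(a,o)] = 2(q-a),
\end{equation*}
so the expected utility is strictly concave in $a$ with the unique maximizer $a^*_q=q$. For the illustrative credal set $\cQ=\co(\{0.2,0.6\})$ we have $\ext(\cQ)=\{0.2,0.6\}$ and hence $\cA^{\ext}_\cQ=\{0.2,0.6\}$. In this example the optimal actions coincide with the extreme points themselves, so $\cQ$ can be read off directly from $\cA^{\ext}_\cQ$.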
The implication of unique representation $\cA^{\ext}_\cQ$ in the action space for any credal set $\cQ$ is that the DM is able to identify the credal set from the set of actions $\cA^{\ext}_\cQ$. In a naive analogy, all actions in $\cA^{\ext}_\cQ$ together act as a fingerprint of credal set $\cQ$ which can be uniquely incentivised by the DM to elicit $\cQ$. We now introduce a common class of linear aggregations to operationalise scoring rules based on \Cref{lemma:unqiue-representation-of-credal-set}.

\textbf{Fixed Linear Aggregations} are another common class of aggregation rules, which aggregate the expected utilities of a credal set $\cQ$ for any input $x\in\cX$, i.e., $\{\mathbb{E}_q[u(x,o)]\}_{q\in\cQ}$, into a convex combination of utilities with mixing coefficient $\bm{\lambda}\in\Delta^{|\cQ|}$ as
\begin{align*}
    \rho_{\bm{\lambda}}[\{\mathbb{E}_q[u(x,o)]\}_{q\in\cQ}] &:= \int_{q\in\cQ}\bm{\lambda}(q)\mathbb{E}_q[u(x,o)]dq\\
    &=\mathbb{E}_{\int\bm{\lambda}(q)q dq}[u(x,o)] .
\end{align*}
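As a small numerical illustration, assume a finite credal set $\cQ=\{q_1,q_2\}$ over a binary outcome space $\cO=\{0,1\}$, so that the integral reduces to a sum; the numbers below are hypothetical. Let $q_1(1)=0.2$, $q_2(1)=0.6$, and $\bm{\lambda}=(0.25,0.75)$. The mixed belief $q_{\bm{\lambda}}:=0.25\,q_1+0.75\,q_2$ satisfies
\begin{equation*}
    q_{\bm{\lambda}}(1) = 0.25\cdot 0.2+0.75\cdot 0.6 = 0.5,
\end{equation*}
and, by linearity of the expectation in the underlying distribution, for every $x\in\cX$,
\begin{equation*}
    0.25\,\mathbb{E}_{q_1}[u(x,o)]+0.75\,\mathbb{E}_{q_2}[u(x,o)] = \mathbb{E}_{q_{\bm{\lambda}}}[u(x,o)].
\end{equation*}
Aggregating the expected utilities with $\bm{\lambda}$ is therefore the same as evaluating the utility under the single precise belief $q_{\bm{\lambda}}$.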
Although the class of fixed linear aggregations is Pareto-efficient and non-dictatorial in classic social choice theory, in our setup fixed linear aggregations are dictatorships, as they directly aggregate the epistemic uncertainty. Due to \Cref{prop:credalsets}, a forecaster can report $\cQ$ or $\co(\cQ)$. To illustrate, for any report $\cQ\subseteq\Delta(\cO)$ and any choice of fixed linear aggregation $\bm{\lambda}$, we obtain $Q:=\bm{\lambda}^\top \cQ$. Even though $Q$ may not be in $\cQ$, it is guaranteed that $Q\in \co(\cQ)$, and therefore $Q$ acts as a dictator. This means that although the DM uses the full credal set, in the sense of all extreme points, to perform decision-making, their preference over actions can be fully represented by a precise belief $P\in \co(\cP)$.
Recall from \Cref{subsection:aggregation} that non-dictatorship was desirable only because of strategic manipulation by the forecaster. When forecasters are unaware of the exact aggregation rule, using a random dictatorial $\rho_{\bm{\lambda}}$ allows the DM to keep PE and IIA. To this end, we show the strict properness of these randomized dictatorships. Strict properness for imprecise forecasters implicitly requires strictness for precise forecasts, so $s_{\theta}$ must satisfy \Cref{lemma:strictness-for-precise-distributions} for every $\bm{\lambda}$.
\begin{theorem}
\label{theorem:strictly-proper-imprecise-scoring-rule}
    Assuming $s_\theta$ to be strictly proper for precise distributions and $\bm{\rho}$ as fixed linear aggregations, $s_{\theta}$ is strictly proper for imprecise forecasts, i.e. $s_\theta$ is a strictly proper IP scoring rule if $\theta$ has full support over $\bm{\rho}$.
\end{theorem}
\Cref{theorem:strictly-proper-imprecise-scoring-rule} allows us to build strictly proper IP scoring rules which can be characterized as follows. A randomized tailored scoring rule $s_{\theta}$ made using the class of fixed linear aggregation rules is characterized as
\begin{align*}
    s_{\theta}(\cQ,o)(\bm{\lambda}) = \begin{cases}
        k_{\bm{\lambda}} u(a^*_{\rho_{\bm{\lambda}},\cQ},o)+c_{\bm{\lambda}} &\text{if } \theta(\bm{\lambda})>0 \\
        \Pi_o(\cQ) & \text{if } \theta(\bm{\lambda})=0
        \end{cases},
\end{align*}
where $\bm{\lambda}\in\Delta^{|\ext(\cQ)|}$ and $\Pi:2^{\Delta(\cO)}\times \cO\rightarrow \mathbb{R}$ is an arbitrary regular scoring function. The rule $s_{\theta}$ is strictly proper if $\text{supp}(\theta)=[0,1]$. To verify the strict properness of our score, we conduct a simulation (see \Cref{appendix:simulation}).
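
As a hedged illustration of this characterization, consider a toy instance whose ingredients are assumptions made only here: a binary outcome space $\cO=\{0,1\}$, a report $\cQ$ with two extreme points $q_1(1)=0.2$ and $q_2(1)=0.6$, the quadratic utility $u(a,o)=-(a-o)^2$ on $\cA=[0,1]$, and $a^*_{\rho_{\bm{\lambda}},\cQ}$ read as the maximizer of the aggregated expected utility. Writing $\bm{\lambda}=(\lambda_1,1-\lambda_1)$ with $\lambda_1\in[0,1]$, the aggregated belief is $q_{\bm{\lambda}}(1)=0.2\,\lambda_1+0.6\,(1-\lambda_1)$, the optimal action under the quadratic utility is $a^*_{\rho_{\bm{\lambda}},\cQ}=q_{\bm{\lambda}}(1)$, and for every $\bm{\lambda}$ with $\theta(\bm{\lambda})>0$,
\begin{equation*}
    s_{\theta}(\cQ,o)(\bm{\lambda}) = c_{\bm{\lambda}}-k_{\bm{\lambda}}\big(q_{\bm{\lambda}}(1)-o\big)^2 .
\end{equation*}
As $\lambda_1$ ranges over $[0,1]$, $q_{\bm{\lambda}}(1)$ sweeps the whole interval $[0.2,0.6]$, so in this toy instance every point of $\co(\cQ)$ is scored by some aggregation rule. This gives some intuition for why the support condition on $\theta$ matters.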

In recent years, several learning frameworks have been proposed that challenge the implicit assumption, made in the standard ML pipeline, that the loss functions \citep{gopalan2021omnipredictors} or preferences \citep{singh2024domain} of the users are known to the model trainer. They focus on training models that perform well for a class of losses or aggregation rules. Within our setup, these frameworks translate to the DM abstaining from sharing the exact aggregation rule with the forecaster. However, they are not exact implementations of the score we propose. Applying the proposed score to ML applications is one avenue for future research.
