\section{Introduction}
\label{sec::intro}

Self-supervised learning (SSL) has opened the door to the development of general-purpose embedding functions, often referred to as \emph{foundation models}, that can be used or fine-tuned for various downstream tasks \citep{jaiswal2020survey, jing2020self}. 
These embedding functions are pre-trained on large corpora spanning different data modalities, including visual \citep{chen2020simple}, textual \citep{brown2020language}, audio \citep{al2021clar}, and their combinations \citep{radford2021learning, morgado2021audio}, with the aim of being general-purpose and agnostic to the downstream tasks for which they may be used. For instance, the recent surge in large pre-trained models such as CLIP \citep{radford2021learning} and ChatGPT \citep{chatgpt} has led to many prompt-based or dialogue-based downstream use cases, none of which are known a priori when the pre-trained model is deployed.


Embedding functions learned through SSL may produce unreliable outputs. For example, large language models can generate factually inaccurate information with a high level of confidence \citep{bommasani2021opportunities, tran2022plex}. With the increasing use of SSL to generate textual, visual, and audio content, unreliable embedding functions could have significant implications. Furthermore, because these embedding functions are frequently employed as frozen backbones for various downstream use cases, adding more labeled downstream data may not improve performance if the initial representation is unreliable.
Therefore, having notion(s) of \textbf{reliability/uncertainty} for such pre-trained models, alongside their abstract representations, would be a key enabler for their reliable deployment, especially in safety-critical settings.



In this paper, we introduce a formal definition of representation reliability based on its impact on downstream tasks. Our definition pertains to the representation of a given test point produced by an embedding function. If a variety of downstream tasks that build upon this representation consistently yield accurate results for the test point, we consider the representation reliable. Existing uncertainty quantification (UQ) frameworks mostly focus on the supervised learning setting, where they rely on the consistency of predictions across various predictive models. We provide a counter-example showing that these frameworks cannot be directly applied to our setting, as representations lack a ground truth for comparison. In other words, inconsistent predictions often indicate that the predictions are not reliable, but inconsistent representations do not necessarily imply that the representations are unreliable. Hence, before comparing representation spaces, it is critical to align them so that corresponding regions have similar semantic meanings.



We propose an ensemble-based approach for estimating representation reliability without knowing downstream tasks a priori. We prove that a test point has a reliable representation if it has a reliable neighbor that remains consistently close to the test point across multiple representation spaces generated by different embedding functions. Based on this theoretical insight, we select a set of embedding functions and reference data (e.g., data used for training the embedding functions). We then count the consistent neighboring points among the reference data to estimate the representation reliability. The underlying reasoning is that a test point with more consistent neighbors is more likely to have a neighbor that is both reliable and consistent. Such a neighbor can be used to align the representation spaces generated by different embedding functions.
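
As an informal illustration of this counting step (a sketch under assumed notation, not the formal definition used in the remainder of the paper), let $h_1, \dots, h_M$ denote the embedding functions in the ensemble, let $X_{\mathrm{ref}}$ denote the reference data, and let $N_k^{(i)}(x^*) \subseteq X_{\mathrm{ref}}$ denote the $k$ reference points nearest to the test point $x^*$ in the representation space of $h_i$. One simple way to count consistent neighbors would be
\[
    \widehat{c}_k(x^*) \;=\; \Bigl|\, \bigcap_{i=1}^{M} N_k^{(i)}(x^*) \,\Bigr| ,
\]
that is, the number of reference points that remain among the $k$ nearest neighbors of $x^*$ under every embedding function. The symbols $M$, $k$, $X_{\mathrm{ref}}$, $N_k^{(i)}$, and $\widehat{c}_k$ are introduced here only for illustration, and the strict intersection is an assumption; a softer variant could instead average the pairwise overlap between neighbor sets. Under either choice, a larger value would suggest a more reliable representation of $x^*$.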



\rev{We conduct extensive numerical experiments to validate our approach and compare it with baselines, including state-of-the-art out-of-distribution (OOD) detection measures and UQ methods from supervised learning. Our approach consistently captures the representation reliability across all applications considered, which range from predicting the performance of embedding functions when they are adapted to in-distribution or out-of-distribution downstream tasks to ranking the reliability of different embedding functions. While the baselines may occasionally surpass our approach, their performance fluctuates significantly across settings and can even become negative, posing a risk when they are used to assess reliability in safety-critical settings.
}



















In summary, our main contributions are:
\begin{itemize}[leftmargin=1.0em]
    
    \item We present a formal definition of representation reliability, which quantifies how well an abstract representation can be used across various downstream tasks. 
    To the best of our knowledge, this is the first comprehensive study to investigate uncertainty in representation space.

    \item We provide a counter-example, showing that existing UQ tools in supervised learning cannot be directly applied to estimate the representation reliability. 
    
    
    \item We prove a theorem stating that a reliable and consistent neighbor of a test point can serve as an anchor point for aligning different representation spaces and assuring the representation reliability.
    
    \item Based on our theoretical findings, we introduce an ensemble-based approach that uses neighborhood consistency to estimate the representation reliability.
    
    \item We conduct comprehensive numerical experiments, showing that our approach consistently captures the representation reliability whereas the baselines do not.
\end{itemize}


\paragraph{Broader Impact and Implications.}
Our work introduces a way to quantify the reliability of pre-trained models prior to their deployment in specific downstream tasks, which has several implications. Imagine a practitioner has access to multiple pre-trained models that have been trained using distinct learning paradigms, data, or architectures. Our method helps compare and rank these models based on their reliability scores, enabling the practitioner to select the model with the highest reliability score. This is particularly valuable when transmitting downstream training data is challenging due to privacy concerns, or when the downstream tasks shift over time. 
Similarly, in cases where a pre-trained model yields incorrect (or harmful) decisions for specific individuals, our method can help explain whether the issue stems from the abstract representation or the projection heads added to the pre-trained model. 
We hope our effort can push the frontiers of self-supervised learning towards more responsible and reliable deployment, encouraging further research to ensure the transferability of knowledge acquired during pre-training across diverse tasks and domains.



\input{related_works}
