% \documentclass{uai2024} % for initial submission
\documentclass[accepted]{uai2024} % after acceptance, for a revised version; 
% also before submission to see how the non-anonymous paper would look like 
                        
%% There is a class option to choose the math font
% \documentclass[mathfont=ptmx]{uai2024} % ptmx math instead of Computer
                                         % Modern (has noticeable issues)
% \documentclass[mathfont=newtx]{uai2024} % newtx fonts (improves upon
                                          % ptmx; less tested, no support)
% NOTE: Only keep *one* line above as appropriate, as it will be replaced
%       automatically for papers to be published. Do not make any other
%       change above this note for an accepted version.

%% Choose your variant of English; be consistent
\usepackage[american]{babel}
% \usepackage[british]{babel}

%% Some suggested packages, as needed:
\usepackage{natbib} % has a nice set of citation styles and commands
    \bibliographystyle{plainnat}
    \renewcommand{\bibsection}{\subsubsection*{References}}
\usepackage{mathtools} % amsmath with fixes and additions
% \usepackage{siunitx} % for proper typesetting of numbers and units
\usepackage{booktabs} % commands to create good-looking tables
\usepackage{tikz} % nice language for creating drawings and diagrams

\usepackage{hyperref}       % hyperlinks

\hypersetup{
colorlinks   = true, %Colours links instead of ugly boxes
urlcolor     = blue, %Colour for external hyperlinks
linkcolor    = blue, %Colour of internal links
citecolor    = blue %Colour of citations
}


\usepackage{url}            % simple URL typesetting
\usepackage{amsfonts}       % blackboard math symbols
\usepackage{nicefrac}       % compact symbols for 1/2, etc.
\usepackage{microtype}      % microtypography
\usepackage{xcolor}         % colors

\usepackage{algorithm}
\usepackage{algorithmic}

\usepackage{amsmath}  
\usepackage{multirow}

\usepackage{amsthm}         % theorem and proof environments
\usepackage{tabularx}

\usepackage{graphicx} % more modern
%\usepackage{epsfig} % less modern
\usepackage{subfigure} 

\usepackage{titletoc}

\newcommand\DoToC{%
  \startcontents
  \printcontents{}{1}{\textbf{Table of Contents}\vskip3pt\hrule\vskip5pt}
  \vskip3pt\hrule\vskip5pt
}

% For math	
\usepackage{amssymb}

% customized commands
\DeclareMathOperator{\argmax}{\arg\max}
\DeclareMathOperator{\argmin}{\arg\min}
\DeclareMathOperator*{\minimize}{\text{minimize}}
\DeclareMathOperator*{\maximize}{\text{maximize}}
\DeclareMathOperator*{\st}{\text{subject to}}

% customized theorem environment
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\newtheorem{corollary}{Corollary}
\newtheorem{definition}{Definition}
\newtheorem{remark}{Remark}

% \usepackage[labelsep=period]{caption}
% \captionsetup[table]{name=TABLE}
\renewcommand{\thetable}{\Roman{table}}
\renewcommand{\thefigure}{\Roman{figure}}
\usepackage{caption}

\usepackage{wrapfig}

\usepackage{color}
\definecolor{lightgray}{gray}{0.75}
\usepackage{tcolorbox}

%\usepackage{algorithm}
%\usepackage{algpseudocode}


\usepackage{titlesec}
\titlespacing\section{0pt}{0pt plus 0pt minus 2pt}{0pt plus 0pt minus 2pt}
\titlespacing\subsection{0pt}{0pt plus 0pt minus 2pt}{0pt plus 0pt minus 2pt}
\titlespacing\subsubsection{0pt}{0pt plus 0pt minus 2pt}{0pt plus 0pt minus 2pt}
% \usepackage[nodisplayskipstretch]{setspace}
% \setstretch{1.5}

\AtBeginDocument{%
  \addtolength\abovedisplayskip{-0.3\baselineskip}%
  \addtolength\belowdisplayskip{-0.3\baselineskip}%
 \addtolength\abovedisplayshortskip{-0.3\baselineskip}%
 \addtolength\belowdisplayshortskip{-0.3\baselineskip}%
}


%% Provided macros
% \smaller: Because the class footnote size is essentially LaTeX's \small,
%           redefining \footnotesize, we provide the original \footnotesize
%           using this macro.
%           (Use only sparingly, e.g., in drawings, as it is quite small.)

%% Self-defined macros
\newcommand{\swap}[3][-]{#3#1#2} % just an example

\title{Revisiting Kernel Attention with Correlated Gaussian Process Representation}

% The standard author block has changed for UAI 2024 to provide
% more space for long author lists and allow for complex affiliations
%
% All author information is automatically removed by the class for the
% anonymous submission version of your paper, so you can already add your
% information below.
%
% Add authors
\author[1,3]{\href{mailto:<minhlongbui2000@gmail.com>?Subject=Your UAI 2024 paper}{Long Minh Bui}{}}
\author[1,2]{Tho Tran Huu}
\author[1]{Duy Dinh}
\author[2,*]{Tan Minh Nguyen}
\author[3,*]{Trong Nghia Hoang}
% Add affiliations after the authors
\affil[1]{%
    FPT Software AI Center
}
\affil[2]{%
    Department of Mathematics,
    National University of Singapore
}
\affil[3]{%
    School of Electrical Engineering and Computer Science, Washington State University
}

\affil[*]{Co-last author}

  
\begin{document}

\maketitle

\begin{abstract}
\vspace{-6mm}
Transformers have become the de facto method for modeling sequential data, achieving state-of-the-art performance. Given their widespread use, the ability to estimate and calibrate their modeling uncertainty is important for understanding and designing robust transformer models. To this end, previous works have used Gaussian processes (GPs) to calibrate the uncertainty of the attention units of transformers, with notable success. However, such approaches must confine the transformer to the space of symmetric attention to satisfy the symmetry requirement of the GP kernel specification, which reduces the representation capacity of the model. To mitigate this restriction, we propose the Correlated Gaussian Process Transformer (CGPT), a new class of transformers whose self-attention units are modeled as the cross-covariance between two correlated GPs (CGPs). This allows asymmetries in attention and can enhance the representation capacity of GP-based transformers. We also derive a sparse approximation for CGP to improve its scalability. Our empirical studies show that both CGP-based and sparse CGP-based transformers achieve better performance than state-of-the-art GP-based transformers on a variety of benchmark tasks.
\end{abstract}

\section{Introduction}
\label{sec:intro}
\input{gptransformers/sections/intro}

\section{Preliminaries}
\label{sec:background}
\input{gptransformers/sections/background}

\section{Revisiting Kernel Attention}
\label{sec:method}
\input{gptransformers/sections/method}

\section{Experimental Results}
\label{sec:experiments}
\input{gptransformers/sections/experiments}

% \section{Empirical Analysis}
% \label{sec:analysis}
% \input{gptransformers/sections/analysis}

\section{Related Work}
\label{sec:related_work}
\input{gptransformers/sections/related_work}

\section{Concluding Remarks}
\label{sec:conclusion}
\input{gptransformers/sections/conclusion}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\bibliography{gptransformers/references}
%\bibliographystyle{plain}

\newpage
\appendix
\onecolumn
\begin{center}
{\Large\bfseries Supplementary Materials for}
\end{center}
\vspace{-0.4cm}
\begin{center}
{\large\bfseries\itshape Revisiting Kernel Attention with Correlated Gaussian Process Representation}
\end{center}

\DoToC

\newpage

\input{gptransformers/sections/appendix}
\end{document}
