\section{Introduction}\label{sec:intro}
Many machine learning (ML) systems are built on the assumption that training and testing data are sampled independently and identically from the same distribution. However, this assumption is commonly violated in real applications, where the environment changes during model deployment and distribution shifts arise between training and testing data. The problem of training models that are robust under distribution shifts is typically referred to as domain adaptation (or generalization), where the goal is to train a model on a \textit{source} domain that generalizes well to a \textit{target} domain. Specifically, domain adaptation (DA) aims to deploy a model on a \textit{specific} target domain and assumes that data from this target domain is accessible during training. In contrast, domain generalization (DG) considers a more realistic scenario where target domain data is unavailable during training; instead, it leverages multiple source domains to learn models that generalize to \textit{unseen} target domains.

For both DA and DG, various approaches have been proposed to learn a robust model with high performance on target domains. However, most of them assume that both source and target domains are sampled from a \textit{stationary} environment; they are not suitable for settings where the data distribution evolves along a specific direction (e.g., time, space). In stationary DG, the domains are treated as an unordered set, while in non-stationary DG, they form an ordered tuple with a sequential structure (see Figure~\ref{fig:5}). This defining characteristic makes non-stationary DG a challenging setting that calls for novel solutions accounting for non-stationary mechanisms. In practice, evolving data distributions have been observed in many applications. For example, satellite images change over time due to city development and climate change \citep{christie2018functional}, clinical data evolves due to changes in disease prevalence \citep{guo2022evaluation}, and facial images gradually evolve because of changes in fashion and social norms \citep{ginosar2015century}. Without accounting for the non-stationary patterns across domains, existing DA/DG methods designed for stationary settings may not perform well in non-stationary environments. As evidenced by \citet{guo2022evaluation}, clinical predictive models trained with existing DA/DG methods do not outperform empirical risk minimization on future clinical data.
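
As an informal illustration of the set-versus-tuple distinction above, let $D_1, \dots, D_T$ denote the source domain distributions; this notation, the number of targets $K \geq 1$, and the convention that the unseen targets continue the source sequence are assumptions made here for exposition only, not definitions taken from later sections. The two settings can then be contrasted as
\[
\mbox{stationary DG: } \{D_1, \dots, D_T\} \rightarrow D_{\mathrm{target}},
\qquad
\mbox{non-stationary DG: } (D_1, \dots, D_T) \rightarrow (D_{T+1}, \dots, D_{T+K}).
\]
In the first case, permuting the source domains leaves the problem unchanged; in the second, the index $t$ (e.g., a time stamp) carries information that a learner can exploit to extrapolate beyond $D_T$.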

In this paper, we study \textit{domain generalization} (DG) in non-stationary environments. The goal is to learn a model from a sequence of source domains that captures the non-stationary patterns and generalizes well to (multiple) \textit{unseen} target domains. We first examine the impact of non-stationary distribution shifts and study how the performance a model attains on source domains is affected when the model is deployed on target domains. Based on these theoretical findings, we propose an algorithm named Adaptive Invariant Representation Learning (\texttt{AIRL}); it minimizes the error on target domains by learning a sequence of representations that are \textit{invariant} for every two consecutive source domains but \textit{adaptive} across these pairs.

In particular, \texttt{AIRL} consists of two components: (i) a \textit{representation network}, which is trained on the sequence of source domains to learn invariant representations between every two consecutive source domains, and (ii) a \textit{classification network}, which minimizes the prediction errors on source domains. Our main idea is to create adaptive representation and classification networks that evolve in response to the dynamic environment. In other words, we aim to find networks that effectively capture the non-stationary patterns in the sequence of source domains. At the inference stage, the representation network is used to generate the optimal representation mappings and the classification network is used to make predictions in the target domains, without the need to access target data. To verify the effectiveness of \texttt{AIRL}, we conduct extensive experiments on both synthetic and real data and compare \texttt{AIRL} with various existing methods.

\begin{figure*}[t]
    \centering
    \includegraphics[width=\linewidth]{figs/Temporal_DG.png}
    \caption{An illustrative comparison between conventional DG and DG in a non-stationary environment: domains in conventional DG are independently sampled from a stationary environment, whereas DG in a non-stationary environment considers domains that evolve along a specific direction. As shown in the right plot, the data (i.e., images) change over time, and a model trained on past data may not perform well on future data due to non-stationarity (i.e., temporal shift).}
    \label{fig:5}
\end{figure*}
