\begin{abstract}
In this work, we solve the problem of novel category detection under distribution shift. This problem is critical to ensuring the safety and efficacy of machine learning models, particularly in domains such as healthcare where timely detection of novel subgroups of patients is crucial.

To address this problem, we propose a method based on constrained learning. Our approach is guaranteed to detect a novel category under a relatively weak assumption, namely that rare events in past data have bounded frequency under the shifted distribution. Prior works on the problem do not provide such guarantees, as they either attend to very specific types of distribution shift or make stringent assumptions that limit their guarantees.

We demonstrate favorable performance of our method on challenging novel category detection problems over real world datasets.
% We solve the problem of novel class detection under distribution shift, by developing a method based on constrained learning.
% This problem is critical to ensuring the safety and efficacy of machine learning models, particularly in domains such as healthcare where timely detection of novel subgroups of patients is crucial.

% Given two datasets, where one contains a novel class and the rest of the data is shifted from 
% Our approach is guaranteed to detect a novel class, under a relatively weak assumption. Namely, that rare events in past data have bounded frequency under the shifted distribution. Such guarantees are not given in prior works on the problem, as they either attend to very specific types of distribution shift, or make stringent assumptions which limit their guarantees.

% The problem is important for ensuring safety of machine learning models. For instance in the healthcare domain, by detecting potential failures that stem from the emergence of a novel subgroup of patients, such as COVID-19 patients at the onset of the pandemic. In such events, changes in the underlying population can lead machine learning models trained on past data to underperform, and detection of novel subgroups is crucial for timely attendance of the issue.

% We show favorable performance of our method on challenging novel class detection problems over real world datasets, where we introduce shifts to the dataset where the novel class appears.

\begin{comment}
We propose a constrained learning approach to the problem of identifying a novel subgroup within a dataset $S_{\gT}$.
Assuming events that are rare in previously observed data have bounded frequency in $S_{\gT}$, our method is guaranteed to detect the novel group even when the distribution of the rest of the data shifts. Our motivation is in safety-related problems, where detection of a novel subgroup can help prevent failures, and distribution shifts occur in real-world data due to a variety of possible changes in the data generating process.

We provide finite sample guarantees on the classification of a novel subgroup under distribution shift, a problem we pose in the framework of learning from Positive and Unlabelled data.
Although the true error of a classifier for the task cannot be estimated from observed data, our method optimizes an upper bound on this error, which holds under our assumption. Previous methods rely on more stringent conditions and hence do not provide this type of guarantee. 

To demonstrate the effectiveness of our approach, we apply it to the MIMIC-III and Tabula-Muris datasets, subjecting them to randomly generated distribution shifts. Our experiments show that our method performs favorably over baselines, highlighting its potential for practical applications.
\end{comment}
\end{abstract}