TL;DR: We propose an offline multichange point detection method for regenerative Markov chains via minimising clutering variance
Abstract: Offline change point detection tries to detect $\textit{time points}$ of distribution change in a given data sequence; and is now routinely used in signal processing, speech processing, climatology etc. Despite this broad applicability across economics, computer science, and planetary sciences, rigorous, nonparametric techniques for change point detection with non-independent and identically distributed (i.i.d.) datasets has remained elusive. This paper establishes such guarantees by proposing a non-parametric clustering algorithm which can accurately obtain the change points from a given Markovian dataset of length $n$. It does so by bridging together two different components of mathematical statistics; Rademacher complexities of Markov chains, and adaptive clustering via penalisation. Our first result uses recent advances in Rademacher complexities of regenerating Markov chains to derive a Dvoretzky Kiefer Wolfowitz (DKW) type inequality for the empirical distribution of the Markov chain. We then use this to show that an adaptive clustering algorithm recovers the correct change points for a Markovian sequence. We establish the tightness of our rates by showing that they essentially coincide with the best known rates for i.i.d. data. We end the paper by discussing the computational considerations of the problem.
Code Dataset Promise: Yes
Code Dataset Url: https://github.com/Ibanerj750/Ibanerj750.github.io/blob/master/files/ai26cd.txt
Signed Copyright Form: pdf
Format Confirmation: I agree that I have read and followed the formatting instructions for the camera ready version.
Submission Number: 974
Loading