Canonical correlation for principal components of time series

S. Yaser Samadi, Lynne Billard, Mohammad Reza Meshkani, Ahmad Khodadadi

Published: 18 Jun 2016, Last Modified: 14 May 2024, Computational Statistics, CC BY 4.0

Abstract: With contemporary data collection capacity, data sets containing large numbers of different multivariate time series relating to a common entity (e.g., fMRI, financial stocks) are becoming more prevalent. One pervasive question is whether there are patterns or groups of series within the larger data set (e.g., disease patterns in brain scans; mining stocks that are internally similar yet distinct from banking stocks). There is a relatively large body of literature on clustering methods for univariate and multivariate time series, though most of these methods do not utilize the time dependencies inherent to time series. This paper develops an exploratory data methodology which, in addition to the time dependencies, simultaneously utilizes the dependency information between the S series themselves and the dependency information between the p variables within the series, while still retaining the distinctiveness of the two types of variables. This is achieved by combining the principles of canonical correlation analysis and principal component analysis for time series to obtain a new type of covariance/correlation matrix for a principal component analysis, producing a so-called “principal component time series”. The results are illustrated on two data sets.
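
The general idea of feeding a canonical-correlation-based matrix into a principal component analysis can be sketched as follows. This is only a simplified illustration under stated assumptions, not the construction developed in the paper: it uses the first canonical correlation at lag 0 as the between-series association, ignores the time-lag structure the paper exploits, and all function names (`first_canonical_correlation`, `canonical_correlation_matrix`, `principal_components`) are hypothetical. Each of the S series is assumed to be a T x p array with T larger than p and full column rank.

```python
import numpy as np

def first_canonical_correlation(X, Y):
    """Largest canonical correlation between the columns of X (T x p) and Y (T x q).

    Assumes both centered matrices have full column rank.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases of the two column spaces; the singular values of
    # Qx^T Qy are the canonical correlations (cosines of principal angles).
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(np.clip(s[0], 0.0, 1.0))

def canonical_correlation_matrix(series):
    """S x S symmetric matrix of pairwise first canonical correlations (lag 0 only)."""
    S = len(series)
    R = np.eye(S)
    for i in range(S):
        for j in range(i + 1, S):
            R[i, j] = R[j, i] = first_canonical_correlation(series[i], series[j])
    return R

def principal_components(R):
    """Eigenvalues and eigenvectors of a symmetric matrix, largest eigenvalue first."""
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Synthetic illustration (made-up data): four series with p = 3 variables each,
# forming two groups that each share a latent driver.
rng = np.random.default_rng(0)
T, p = 200, 3
latent_a = rng.standard_normal((T, 1))
latent_b = rng.standard_normal((T, 1))
series = [latent + 0.5 * rng.standard_normal((T, p))
          for latent in (latent_a, latent_a, latent_b, latent_b)]

R = canonical_correlation_matrix(series)
eigvals, eigvecs = principal_components(R)
print(np.round(R, 2))
print(np.round(eigvals, 2))
```

A matrix assembled from pairwise canonical correlations in this way is symmetric with unit diagonal but is not guaranteed to be positive semidefinite, so small negative eigenvalues can occur; the loadings in the leading eigenvectors are what one would inspect for groupings among the S series.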