Keywords: transfer entropy estimator, information theory, machine learning, locality sensitive hashing, perturbation, regularization, deep neural networks, unsupervised learning, conditional entropy estimator
TL;DR: Estimating Transfer Entropy under Long-Range Dependencies
Abstract: Estimating Transfer Entropy (TE) between time series is a highly impactful problem in fields such as finance and neuroscience. The well-known nearest-neighbor estimator of TE can fail when temporal dependencies are noisy and long-ranged, primarily because it estimates TE indirectly, relying on estimates of joint entropy terms in high dimensions, which is a hard problem in itself. Other estimators, such as those based on copula entropy or conditional mutual information, have similar limitations. Leveraging the success of modern discriminative models that operate in high-dimensional (noisy) feature spaces, we express TE as a difference of two conditional entropy terms, which we estimate directly from conditional likelihoods computed in-sample by any discriminator (time series forecaster) trained under the maximum likelihood principle. To ensure that the in-sample log-likelihood estimates are not overfit to the data, we propose a novel perturbation model based on locality-sensitive hash (LSH) functions, which regularizes a discriminative model to have smooth functional outputs within local neighborhoods of the input space. Our estimator is consistent, and its variance decreases linearly with sample size. We also demonstrate its superiority over state-of-the-art estimators through empirical evaluations on a synthetic dataset as well as real-world datasets from the neuroscience and finance domains.
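To illustrate the decomposition the abstract relies on, TE from X to Y can be written as H(Y_t | Y_past) - H(Y_t | Y_past, X_past), where each conditional entropy is the average negative conditional log-likelihood under a forecaster. The Python sketch below is only a toy instance of that idea and not the estimator proposed in the paper: it assumes a linear-Gaussian forecaster fit by least squares, and it omits the LSH-based perturbation entirely. The function names (`gaussian_cond_entropy`, `lagged`, `transfer_entropy`), the lag order, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_cond_entropy(target, features):
    """In-sample H(target | features) in nats, assuming a linear-Gaussian forecaster."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    sigma2 = np.mean(resid ** 2)  # maximum likelihood estimate of the noise variance
    # Average negative Gaussian log-likelihood evaluated at the fitted parameters.
    return 0.5 * np.log(2.0 * np.pi * np.e * sigma2)

def lagged(series, lags):
    """Row i holds series[t-1], ..., series[t-lags] for t = lags + i."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])

def transfer_entropy(source, target, lags=2):
    """TE(source -> target) as a difference of two conditional entropy estimates."""
    future = target[lags:]
    target_past = lagged(target, lags)
    source_past = lagged(source, lags)
    h_restricted = gaussian_cond_entropy(future, target_past)
    h_full = gaussian_cond_entropy(future, np.column_stack([target_past, source_past]))
    return h_restricted - h_full

# Simulated data (an assumption for this demo): x drives y with a one-step delay.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
noise = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + noise[t]

te_xy = transfer_entropy(x, y)  # should be clearly positive
te_yx = transfer_entropy(y, x)  # should be close to zero
print(f"TE(x -> y) = {te_xy:.3f} nats, TE(y -> x) = {te_yx:.3f} nats")
```

Because both entropies are computed in-sample, a flexible forecaster could drive the second term down simply by overfitting, which inflates the TE estimate. That is the failure mode the LSH-based regularization described in the abstract is meant to address; the low-capacity linear model used here largely sidesteps it at the cost of expressiveness.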
Supplementary Material: zip