D2KE: From Distance to Kernel and Embedding via Random Features for Structured Inputs

27 Sept 2018 (modified: 05 May 2023), ICLR 2019 Conference Withdrawn Submission
Abstract: We present a new methodology that constructs a family of \emph{positive definite kernels} from any given dissimilarity measure on structured inputs whose elements are either real-valued time series or discrete structures such as strings, histograms, and graphs. Our approach, which we call D2KE (from Distance to Kernel and Embedding), draws on the literature of Random Features. However, instead of deriving random feature maps from a user-defined kernel to approximate kernel machines, we build a kernel from a random feature map that we specify given the distance measure. We further propose the use of a finite number of random objects to produce a random feature embedding of each instance. We provide a theoretical analysis showing that D2KE enjoys better generalizability than universal Nearest-Neighbor estimates. On one hand, D2KE subsumes the widely used \emph{representative-set method} as a special case and relates to the well-known \emph{distance substitution kernel} in a limiting case. On the other hand, D2KE generalizes existing \emph{Random Features methods}, which are applicable only to vector input representations, to complex structured inputs of variable sizes. We conduct classification experiments over such disparate domains as time series, strings, and histograms (for texts and images), in which our proposed framework compares favorably to existing distance-based learning methods in terms of both testing accuracy and computational time.
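The abstract does not spell out the feature map, but the construction it describes suggests embeddings of the form phi(x) proportional to (exp(-gamma * d(x, w_r)))_{r=1..R}, where w_1, ..., w_R are random structured objects drawn from some distribution over the input domain and d is the given distance. Below is a minimal Python sketch along those lines; the function names (d2ke_features, levenshtein), the exponential feature form, and the choices of gamma and of the random-object distribution are illustrative assumptions, not necessarily the authors' exact construction.

```python
import numpy as np

def levenshtein(a, b):
    """Classic edit distance between two strings (dynamic programming)."""
    m, n = len(a), len(b)
    dp = np.zeros((m + 1, n + 1), dtype=int)
    dp[:, 0] = np.arange(m + 1)
    dp[0, :] = np.arange(n + 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i, j] = min(dp[i - 1, j] + 1,          # deletion
                           dp[i, j - 1] + 1,          # insertion
                           dp[i - 1, j - 1] + cost)   # substitution
    return dp[m, n]

def d2ke_features(X, random_objects, distance, gamma=0.5):
    """Assumed D2KE-style embedding: map each structured input x to
        phi(x) = (1/sqrt(R)) * [exp(-gamma * d(x, w_1)), ..., exp(-gamma * d(x, w_R))],
    so that <phi(x), phi(y)> approximates a positive definite kernel
    defined as an expectation over the random objects w."""
    R = len(random_objects)
    Phi = np.array([[np.exp(-gamma * distance(x, w)) for w in random_objects]
                    for x in X])
    return Phi / np.sqrt(R)

# Usage: strings as structured inputs; the random objects are short
# random strings over the same alphabet (an assumed sampling scheme).
rng = np.random.default_rng(0)
alphabet = list("abc")
random_objects = ["".join(rng.choice(alphabet, size=rng.integers(1, 5)))
                  for _ in range(64)]
X = ["abba", "abab", "ccc"]
Phi = d2ke_features(X, random_objects, levenshtein)
K = Phi @ Phi.T   # approximate kernel matrix; Phi can also feed a linear model
print(K.shape)    # (3, 3)
```

Because the embedding is finite-dimensional, a linear classifier trained on Phi scales linearly in the number of instances, which is one way the abstract's claimed computational advantage over distance-based methods could be realized.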
Keywords: Distance Kernel, Embeddings, Random Features, Structured Inputs
TL;DR: From Distance to Kernel and Embedding via Random Features for Structured Inputs