Abstract: Online learning algorithms in a reproducing kernel Hilbert space associated with convex loss functions are studied. We show that, in terms of the expected excess generalization error, they can converge at rates comparable to those of the corresponding kernel-based batch learning algorithms. Under mild conditions on the loss functions and approximation errors, fast learning rates and finite-sample upper bounds are established using polynomially decreasing step-size sequences. For some commonly used loss functions for classification, such as the logistic loss and the p-norm hinge loss with p ∈ [1, 2], the learning rates are the same as those for Tikhonov regularization and can be of order O(T^{-1/2} log T), which is optimal up to a logarithmic factor. Our novelty lies in a sharp estimate for the expected values of norms of the learning sequence (or an inductive argument to uniformly bound the expected risks of the learning sequence) and a refined error decomposition for online learning algorithms.
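For illustration only, the kind of algorithm the abstract refers to can be sketched as follows. The abstract does not state the update rule, so this sketch assumes the standard unregularized form f_{t+1} = f_t - η_t φ'(y_t f_t(x_t)) y_t K(x_t, ·) with the logistic loss φ(v) = log(1 + e^{-v}), a Gaussian kernel, and polynomially decreasing step sizes η_t = η_1 t^{-θ}. All function and parameter names (`online_kernel_learning`, `eta1`, `theta`, `sigma`) are hypothetical and not taken from the paper.

```python
import math


def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel; the kernel choice is an assumption of this sketch."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def logistic_loss_derivative(v):
    """Derivative of phi(v) = log(1 + exp(-v)), computed in a numerically stable way."""
    if v >= 0:
        e = math.exp(-v)
        return -e / (1.0 + e)
    return -1.0 / (1.0 + math.exp(v))


def online_kernel_learning(samples, eta1=1.0, theta=0.5, sigma=1.0):
    """One pass of unregularized online gradient descent in an RKHS.

    samples: iterable of (x, y) with x a tuple of floats and y in {-1, +1}.
    Step sizes decay polynomially: eta_t = eta1 * t ** (-theta).
    Returns a function that evaluates the final iterate at a point.
    """
    centers = []  # points x_t seen so far
    coeffs = []   # coefficient of K(x_t, .) in the current iterate

    def evaluate(x):
        return sum(c * gaussian_kernel(z, x, sigma) for c, z in zip(coeffs, centers))

    for t, (x, y) in enumerate(samples, start=1):
        f_x = evaluate(x)              # f_t(x_t), computed before the update
        eta_t = eta1 * t ** (-theta)   # polynomially decreasing step size
        grad_coeff = logistic_loss_derivative(y * f_x) * y
        centers.append(x)
        coeffs.append(-eta_t * grad_coeff)

    return evaluate


# Toy usage: two well-separated one-dimensional clusters.
toy_samples = [((2.0,), 1), ((-2.0,), -1)] * 10
f_final = online_kernel_learning(toy_samples, eta1=1.0, theta=0.5)
```

The sign of `f_final(x)` gives the predicted class. Each step stores one new kernel center, so evaluating the iterate at step t costs O(t) kernel evaluations in this naive form.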