Keywords: multi-armed bandits, best-of-both-worlds, FTPL, utility model, extreme value theory
TL;DR: This paper broadens the theoretical foundation of FTPL and highlights the need for further investigation of its behavior in more general settings.
Abstract: Follow-the-Regularized-Leader (FTRL) policies have achieved Best-of-Both-Worlds (BOBW) results in various settings through hybrid regularizers, whereas analogous results for Follow-the-Perturbed-Leader (FTPL) remain limited due to inherent analytical challenges. 
    To advance the analytical foundations of FTPL, we revisit the classical FTRL-FTPL duality for unbounded perturbations and establish BOBW results for FTPL under a broad family of asymmetric unbounded Fréchet-type perturbations, including hybrid perturbations that combine Gumbel-type and Fréchet-type tails.
    These results not only extend the BOBW results of FTPL but also offer new insights into designing alternative FTPL policies competitive with hybrid regularization approaches.
    Motivated by earlier observations in two-armed bandits, we further investigate the connection between the $1/2$-Tsallis entropy and a Fréchet-type perturbation.
    Our numerical observations suggest that the $1/2$-Tsallis entropy corresponds to a symmetric Fréchet-type perturbation, and based on this, we establish the first BOBW guarantee for symmetric unbounded perturbations in the two-armed setting.
    In contrast, in general multi-armed bandits, we find an instance in which symmetric Fréchet-type perturbations violate the standard condition for BOBW analysis, a problem not observed with asymmetric or nonnegative Fréchet-type perturbations.
    Although this example does not rule out alternative analyses achieving BOBW results, it indicates the limits of directly applying the relationship observed in the two-armed case to the general case, and thus emphasizes the need for further investigation to fully understand the behavior of FTPL in broader settings.
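As a rough illustration of the kind of policy the abstract refers to, the following Python sketch shows a single FTPL arm-selection step with nonnegative Fréchet perturbations. The function names, the shape parameter `alpha = 2`, the learning rate `eta`, and the use of cumulative loss estimates as input are illustrative assumptions, not details taken from the paper. Loss estimation and the learning-rate schedule are omitted.

```python
import random


def frechet_sample(alpha, rng=random):
    """Draw one sample from the Frechet distribution F(x) = exp(-x**(-alpha)), x > 0.

    Uses the fact that if E ~ Exp(1), then E**(-1/alpha) is Frechet(alpha).
    Assumes alpha >= 1 so that the floored value below cannot overflow.
    """
    # Floor guards against the (vanishingly rare) exact zero from expovariate.
    e = max(rng.expovariate(1.0), 1e-300)
    return e ** (-1.0 / alpha)


def ftpl_choose_arm(cum_loss_estimates, eta, alpha=2.0, rng=random):
    """One FTPL step: pick the arm minimizing (scaled cumulative loss - perturbation).

    cum_loss_estimates: hypothetical list of cumulative loss estimates, one per arm.
    eta: hypothetical learning rate (how it is scheduled is not specified here).
    """
    scores = [eta * loss - frechet_sample(alpha, rng) for loss in cum_loss_estimates]
    return min(range(len(scores)), key=scores.__getitem__)


# Example call with made-up loss estimates for three arms.
arm = ftpl_choose_arm([3.0, 1.5, 4.2], eta=0.5)
```

A symmetric or hybrid perturbation, as discussed in the abstract, would replace `frechet_sample` with a different sampler; the selection rule itself would stay the same in this sketch.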
Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)
Submission Number: 14635