Abstract: Modern robotic visuomotor policy learning has witnessed significant progress through Diffusion Policy (DP) frameworks built upon Convolutional Neural Networks (CNNs) and Transformers. Despite their empirical success, these architectures remain fundamentally constrained by their relatively discrete computational nature, which inherently limits their capacity to generate efficient and smooth motion trajectories. To address this challenge, we introduce Kolmogorov-Arnold Networks (KANs) into Diffusion Policy learning. The proposed KAN Policy (KP) leverages KANs' intrinsic continuity, realized through learnable base-parameterized activation functions, to produce continuous trajectories with shorter execution times and lower jerk. Specifically, we design a novel Embedding KAN (Emb-KAN) for CNN-based models, which preserves structural continuity in high-dimensional latent spaces through adaptive spline embeddings. In addition, we apply Group-KAN to Transformer-based models to learn continuous representations. Across the main simulation experiments, KP achieves average improvements of 6.06%, 8.03%, and 26.4% in success rate, execution time, and smoothness, respectively. Similarly, in real-world experiments, KP achieves average improvements of 53.8%, 7.89%, and 29.4% across the same metrics.
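The abstract's core idea, replacing fixed activations with learnable univariate functions on each edge, can be illustrated with a minimal sketch. This is not the paper's Emb-KAN or Group-KAN; it is a generic KAN-style layer, with Gaussian bumps standing in for the B-spline basis the KAN literature typically uses, and all names chosen for illustration:

```python
import numpy as np

class KANLayer:
    """Minimal KAN-style layer sketch (illustrative, not the paper's Emb-KAN).

    Each input-output edge carries its own learnable univariate function,
    parameterized by coefficients over a fixed Gaussian basis on a grid.
    Because every basis function is smooth, the layer's output varies
    continuously with its input -- the property the abstract attributes
    to KANs' learnable activations.
    """

    def __init__(self, in_dim, out_dim, grid_size=8, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(-2.0, 2.0, grid_size)  # basis centers
        # Learnable coefficients: one set per (output, input) edge.
        self.coef = rng.normal(0.0, 0.1, (out_dim, in_dim, grid_size))

    def _basis(self, x):
        # Smooth Gaussian bumps stand in for B-splines; shape (..., grid_size).
        return np.exp(-((x[..., None] - self.grid) ** 2) / 0.5)

    def forward(self, x):
        # x: (batch, in_dim) -> basis activations: (batch, in_dim, grid_size)
        phi = self._basis(x)
        # Evaluate each edge's univariate function and sum over inputs,
        # the additive composition of the Kolmogorov-Arnold form.
        return np.einsum('big,oig->bo', phi, self.coef)

layer = KANLayer(in_dim=3, out_dim=2)
out = layer.forward(np.zeros((4, 3)))
print(out.shape)  # (4, 2)
```

In a diffusion-policy context, layers like this would replace MLP blocks in the denoising network, so the smoothness of the learned functions carries through to the predicted action trajectory.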
External IDs: dblp:journals/ral/ChenGYL25