EFDTR: Learnable Elliptical Fourier Descriptor Transformer for Instance Segmentation

Jiawei Cao; Chaochen Gu; Hao Cheng; Xiaofeng Zhang; Kaijie Wu; Changsheng Lu

Published: 01 May 2025, Last Modified: 23 Jul 2025. Venue: ICML 2025 poster. License: CC BY-NC 4.0.

TL;DR: EFDTR is a polygon-based instance segmentation framework that uses elliptical Fourier descriptors (EFDs) for precise contour modeling, outperforming prior polygon-based methods and rivaling mask-based approaches.

Abstract: Polygon-based object representations efficiently model object boundaries but are limited by high optimization complexity, which hinders their adoption compared to more flexible pixel-based methods. In this paper, we introduce a novel vertex regression loss grounded in elliptical Fourier descriptors, which removes the need for rasterization or heuristic approximations and resolves ambiguities in boundary point assignment through frequency-domain matching. To advance polygon-based instance segmentation, we further propose EFDTR (Elliptical Fourier Descriptor Transformer), an end-to-end learnable framework that leverages the expressiveness of Fourier-based representations. The model achieves precise contour predictions through a two-stage approach: the first stage predicts elliptical Fourier descriptors for global contour modeling, while the second stage refines contours for fine-grained accuracy. Experimental results on the COCO dataset show that EFDTR outperforms existing polygon-based methods, offering a promising alternative to pixel-based approaches. Code is available at https://github.com/chrisclear3/EFDTR.
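To make the idea of frequency-domain matching more concrete, the following is a minimal sketch of how elliptical Fourier descriptors can be computed from a closed polygon using the classical Kuhl-Giardina formulation, together with a toy coefficient-space distance. It is not taken from the EFDTR code base: the function names (`elliptic_fourier_descriptors`, `reconstruct_contour`, `efd_l1_loss`), the default `order=8`, and the plain L1 distance are all illustrative assumptions, and the loss actually used in the paper may differ (for example in normalization or start-point handling).

```python
import numpy as np


def elliptic_fourier_descriptors(polygon, order=8):
    """Return (coeffs, center) for a closed polygon given as (K, 2) vertices.

    coeffs has shape (order, 4) with columns (a_n, b_n, c_n, d_n);
    center is the arc-length-weighted mean of the contour (the DC term).
    """
    pts = np.asarray(polygon, dtype=float)
    closed = np.vstack([pts, pts[:1]])            # repeat first vertex to close the loop
    starts, ends = closed[:-1], closed[1:]
    delta = ends - starts                          # per-edge displacement
    length = np.hypot(delta[:, 0], delta[:, 1])    # per-edge length
    keep = length > 0                              # drop zero-length edges
    starts, ends = starts[keep], ends[keep]
    delta, length = delta[keep], length[keep]

    t = np.concatenate([[0.0], np.cumsum(length)])  # arc length at each vertex
    perimeter = t[-1]
    n = np.arange(1, order + 1)[:, None]            # harmonic indices, shape (order, 1)
    phase = 2.0 * np.pi * n * t[None, :] / perimeter
    d_cos = np.diff(np.cos(phase), axis=1)          # (order, K)
    d_sin = np.diff(np.sin(phase), axis=1)
    slope = delta / length[:, None]                 # dx/dt and dy/dt on each edge
    scale = perimeter / (2.0 * np.pi ** 2 * n[:, 0] ** 2)

    coeffs = np.stack(
        [
            scale * (d_cos @ slope[:, 0]),  # a_n
            scale * (d_sin @ slope[:, 0]),  # b_n
            scale * (d_cos @ slope[:, 1]),  # c_n
            scale * (d_sin @ slope[:, 1]),  # d_n
        ],
        axis=1,
    )
    # Exact mean of a piecewise-linear contour: edge midpoints weighted by edge length.
    center = ((starts + ends) / 2.0 * length[:, None]).sum(axis=0) / perimeter
    return coeffs, center


def reconstruct_contour(coeffs, center, num_points=64):
    """Sample the truncated Fourier series at num_points evenly spaced parameters."""
    t = np.linspace(0.0, 1.0, num_points, endpoint=False)
    n = np.arange(1, len(coeffs) + 1)[:, None]
    cos, sin = np.cos(2.0 * np.pi * n * t), np.sin(2.0 * np.pi * n * t)
    x = center[0] + coeffs[:, 0] @ cos + coeffs[:, 1] @ sin
    y = center[1] + coeffs[:, 2] @ cos + coeffs[:, 3] @ sin
    return np.stack([x, y], axis=1)


def efd_l1_loss(pred_polygon, gt_polygon, order=8):
    """Toy distance (an assumption, not the paper's loss): L1 gap in coefficient space."""
    pred_coeffs, pred_center = elliptic_fourier_descriptors(pred_polygon, order)
    gt_coeffs, gt_center = elliptic_fourier_descriptors(gt_polygon, order)
    return np.abs(pred_coeffs - gt_coeffs).mean() + np.abs(pred_center - gt_center).mean()
```

Because the coefficients are integrals over the piecewise-linear contour, two polygons that trace the same shape from the same starting vertex yield the same descriptors even when they use different numbers of vertices, so no point-to-point assignment between predicted and ground-truth vertices is required. The descriptors in this sketch still depend on the starting vertex and traversal direction; handling that would need an additional normalization step that is omitted here.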

Lay Summary: Most image segmentation methods work by teaching computers to label every pixel in a picture, creating detailed object masks. While accurate, these masks can be large and take more effort for the computer to handle. A more compact alternative is to trace just the outlines (the contours) of objects. But earlier attempts at this often struggled to determine exactly where the boundary points should go, making them less accurate than pixel-based methods. We introduce EFDTR, a new approach that solves this problem using elliptical Fourier descriptors, a mathematical tool that captures object boundaries as smooth curves. By expressing contours in the frequency domain, EFDTR avoids the ambiguity in point matching and improves prediction accuracy. Built on a transformer-based architecture, EFDTR combines the global modeling power of Fourier descriptors with precise boundary refinement. It outperforms previous contour-based methods and narrows the accuracy gap with pixel-based techniques, offering a promising direction for applications such as autonomous driving and medical image analysis.

Link To Code: https://github.com/chrisclear3/EFDTR

Primary Area: Applications->Computer Vision

Keywords: Instance Segmentation, Learnable Elliptical Fourier Descriptor, Contour Prediction

Submission Number: 14224
