UniCalli: A Unified Diffusion Framework for Column-Level Generation and Recognition of Chinese Calligraphy
Keywords: calligraphy
Abstract: Computational replication of Chinese calligraphy, a cornerstone of cultural heritage, remains challenging. Existing methods split into two flawed camps: some render high-quality isolated characters yet miss page-level aesthetics (ligatures, spacing, scale), while others attempt page/column synthesis but sacrifice calligraphic correctness. We introduce UniCalli, a unified diffusion framework for column-level recognition and generation. Training both tasks in one model is deliberate: recognition constrains the generator to preserve character identity and stroke structure, while generation supplies strong style/layout priors—together fostering concept-level abstractions (radicals, stroke configurations) that improve both tasks under long-tail, limited-label regimes. We curate a dataset of 8,000+ digitized pieces, with ~4,000 densely annotated (script labels, character boxes, transcriptions). UniCalli employs asymmetric noising and a rasterized box map to inject spatial priors, and is trained on a mix of synthetic, labeled, and unlabeled data. The model is robust to rare styles, better disentangles style from script, and attains state-of-the-art generative quality with clear gains in ligature continuity and layout fidelity, alongside stronger recognition. The framework extends to other ancient scripts, demonstrated by successful transfer to Oracle bone inscriptions and Egyptian hieroglyphs. Code and data will be released.
Primary Area: generative models
Submission Number: 2963
Loading