Keywords: Error correction, Nuclei Segmentation, Connectomics
TL;DR: We present NucEMFix, a benchmark dataset, and TANGO, a tokenized analysis-by-synthesis framework, for correcting nuclei segmentation errors in microscopy.
Abstract: Accurate 3D nuclei segmentation underpins studies from development and regeneration to large-scale anatomy. Landmark volumetric EM datasets, whole-brain Drosophila (FAFB) and petabyte-scale mouse brain tissue (MICrONS), have enabled cellular-scale mapping, yet their released nuclei segmentations retain errors despite extensive proofreading. The last mile remains difficult: the residual errors are rare, heterogeneous false merges and splits, together with missing-slice and misalignment artifacts, and discriminative correction models overfit to their training degradations and generalize poorly.
We introduce TANGO, a tokenized analysis-by-synthesis framework for 3D nuclei segmentation correction. TANGO tokenizes the erroneous seed mask into sub-nucleus fragments and generatively completes multiple shape hypotheses conditioned on the image and tokens. Training applies slice-patch masking to complete-nucleus annotations (without using error labels). A lightweight ordinal selector ranks overlapping hypotheses, and simple NMS decodes a reliable subset of fixes.
To evaluate at the brain scale, we curate NucEMFix, a systematic benchmark of nuclei error cases across FAFB and MICrONS (8,000+ annotated error nuclei). Beyond EM, we assess generality on public C. elegans L1 confocal volumes. TANGO consistently improves F1 over strong baselines, achieving state-of-the-art correction without prompt engineering or error-specific supervision. We release NucEMFix, code, and evaluation scripts for reproducible assessment and for quantifying proofreading-time savings.
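The final decoding step described above (ranked, overlapping hypotheses reduced by simple NMS) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mask_iou` and `mask_nms`, the use of voxel IoU as the overlap measure, and the 0.5 threshold are all assumptions made for illustration, and the scores stand in for whatever the ordinal selector outputs.

```python
import numpy as np


def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Voxel-wise intersection over union of two boolean 3D masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / float(union) if union > 0 else 0.0


def mask_nms(masks, scores, iou_threshold=0.5):
    """Greedy NMS over mask hypotheses (hypothetical helper).

    Hypotheses are visited from highest to lowest score; one is kept only
    if its IoU with every already-kept hypothesis is at most the threshold.
    Returns the indices of the kept hypotheses in visiting order.
    """
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    for idx in order:
        if all(mask_iou(masks[idx], masks[k]) <= iou_threshold for k in keep):
            keep.append(int(idx))
    return keep


# Toy example: two heavily overlapping hypotheses and one disjoint one.
shape = (4, 8, 8)
a = np.zeros(shape, dtype=bool)
a[:, 0:4, 0:4] = True
b = np.zeros(shape, dtype=bool)
b[:, 0:4, 1:5] = True  # IoU with a is 48 / 80 = 0.6
c = np.zeros(shape, dtype=bool)
c[:, 4:8, 4:8] = True  # no overlap with a or b

kept = mask_nms([a, b, c], scores=[0.9, 0.8, 0.7])
print(kept)  # [0, 2]
```

In this toy case the lower-scored hypothesis `b` is suppressed because it overlaps `a` above the threshold, while the disjoint hypothesis `c` survives. A real pipeline would likely restrict the pairwise comparison to spatially nearby hypotheses to stay tractable at brain scale.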
Primary Area: applications to neuroscience & cognitive science
Submission Number: 9354