Abstract: This study focuses on the segmentation of handwritten ink in historical documents using hyperspectral imaging in two spectral ranges (visible and near-infrared). Binarization is useful as a pre-processing step for material identification based on reflectance spectra. To showcase the challenges of using hyperspectral imaging, classical single-band algorithms (Howe and Sauvola) are compared with deep learning-based algorithms (DeepLabv3, SAM, DINOv2). For algorithms that take a single image as input, a procedure is presented to select the optimal band for binarization. The deep learning-based semantic segmentation algorithm DeepLabv3 instead uses the full spectrum. A hyperspectral database of 226 samples is introduced as a benchmark for comparing the algorithms. The study also introduces a novel semi-automatic method for generating the ground truths needed to compute performance metrics. DeepLabv3 performs on par with the best traditional algorithm in both ranges while offering more consistent and reliable results overall. DINOv2 demonstrates good semantic understanding in separating foreground from background but suffers from limited spatial resolution. Conversely, SAM excels at capturing fine details but lacks the ability to identify text regions. Binarization quality on three-channel images is also assessed and generally yields lower average performance. Our findings contribute to the advancement of technologies for the analysis of text in documents of historical interest.
External IDs: dblp:journals/mta/BuzzelliMFLNV25