MegaHan97K: A large-scale dataset for mega-category Chinese character recognition with over 97K categories
Abstract: Highlights
• MegaHan97K contains 97,455 Chinese character categories, six times more than existing datasets.
• MegaHan97K supports the GB18030-2022 standard for comprehensive Chinese character coverage.
• MegaHan97K comprises handwritten, historical, and synthetic subsets.
• MegaHan97K provides balanced samples across categories.