Abstract: Highlights
• We adopt a Channel Grouping Vision Transformer (CGViT) for lightweight fruit and vegetable recognition.
• We benchmark various lightweight deep learning networks on four fruit and vegetable datasets.
• Evaluations on four fruit and vegetable datasets demonstrate that our approach achieves state-of-the-art performance while consuming fewer resources.
External IDs: dblp:journals/eswa/LiuMSYSYWJ25