Abstract: Highlights
• Overall apple features are described by capturing images of each apple from multiple perspectives.
• A lightweight CNN makes the model more practical.
• Spatial apple features are aggregated by a bidirectional long short-term memory (Bi-LSTM) network.
• The model achieves strong performance and efficiency, with 99.23% accuracy.
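
The pipeline named in the highlights (multi-view images, a lightweight CNN, then a recurrent aggregation step) can be written down schematically as follows. This is only a sketch under stated assumptions, not the authors' exact formulation: the symbols $x_v$, $f_v$, $h_v$, $V$, $W$ and $b$ are introduced here for illustration, and the mean pooling and softmax classifier in the last two lines are assumed rather than taken from the source.

```latex
% Schematic only: all symbols are illustrative assumptions, not the paper's notation.
\begin{align}
  f_v &= \mathrm{CNN}_{\theta}(x_v), \qquad v = 1, \dots, V
      && \text{(shared lightweight CNN applied to each view } x_v\text{)} \\
  \overrightarrow{h}_v &= \mathrm{LSTM}_{\rightarrow}\bigl(f_v,\ \overrightarrow{h}_{v-1}\bigr)
      && \text{(forward pass over the view sequence)} \\
  \overleftarrow{h}_v &= \mathrm{LSTM}_{\leftarrow}\bigl(f_v,\ \overleftarrow{h}_{v+1}\bigr)
      && \text{(backward pass over the view sequence)} \\
  h_v &= \bigl[\,\overrightarrow{h}_v \,;\, \overleftarrow{h}_v\,\bigr]
      && \text{(concatenate both directions)} \\
  z &= \frac{1}{V} \sum_{v=1}^{V} h_v
      && \text{(assumed: mean pooling over views)} \\
  \hat{y} &= \mathrm{softmax}(W z + b)
      && \text{(assumed: linear classifier over the classes)}
\end{align}
```

Treating the $V$ views as a sequence lets each view's representation $h_v$ depend on the views on both sides of it, which is the usual motivation for a two-directional recurrent layer over a single-direction one.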