Abstract: The recognition of insect pests is a critical task in agricultural technology, vital for ensuring food security and environmental sustainability. However, pest identification remains difficult because pests are often highly camouflaged and span a large number of species. Existing methods struggle to extract the fine-grained features needed to distinguish between closely related pest species. Although recent work has improved accuracy by modifying network structures and combining deep learning approaches, challenges persist because pests closely resemble their surroundings. To address this problem, we introduce InsectMamba, a novel approach that integrates State Space Models (SSMs), Convolutional Neural Networks (CNNs), Multi-Head Self-Attention (MSA), and Multilayer Perceptrons (MLPs) within Mix-SSM blocks. This integration facilitates the extraction of comprehensive visual features by leveraging the strengths of each encoding strategy. A selective module is also proposed to combine these features adaptively, enhancing the model’s ability to discern pest characteristics. InsectMamba was evaluated against strong competitors on insect classification and detection datasets. The results demonstrate its superior performance, and an ablation study verifies the contribution of each model component.
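To make the idea of adaptively combining branch features concrete, the following is a minimal sketch and not the InsectMamba implementation. The abstract does not say how the selective module computes its weights, so the sketch assumes a simple softmax gate over one scalar score per branch. The names `selective_fuse`, `gate_scores`, and the toy feature vectors are hypothetical.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def selective_fuse(
    branches: Dict[str, List[float]],
    gate_scores: Dict[str, float],
) -> List[float]:
    """Weighted sum of per-branch feature vectors.

    Assumption: each branch (SSM, CNN, MSA, MLP) yields a feature vector of
    the same length, and a gate supplies one scalar score per branch. In a
    trained model the scores would be learned; here they are passed in.
    """
    names = list(branches)
    weights = softmax([gate_scores[name] for name in names])
    dim = len(branches[names[0]])
    fused = [0.0] * dim
    for name, weight in zip(names, weights):
        features = branches[name]
        if len(features) != dim:
            raise ValueError(f"branch {name!r} has a mismatched feature length")
        for i, value in enumerate(features):
            fused[i] += weight * value
    return fused


# Toy two-dimensional features standing in for the four encoders' outputs.
branches = {
    "ssm": [1.0, 0.0],
    "cnn": [0.0, 1.0],
    "msa": [1.0, 1.0],
    "mlp": [0.0, 0.0],
}

# Equal scores give equal weights, so the fusion reduces to a plain average.
uniform = selective_fuse(branches, {name: 0.0 for name in branches})

# A much larger score for one branch makes that branch dominate the output.
cnn_heavy = selective_fuse(
    branches, {"ssm": 0.0, "cnn": 10.0, "msa": 0.0, "mlp": 0.0}
)
```

With equal gate scores the fused vector is the mean of the four branches; raising one score shifts the output toward that branch, which is the sense in which the combination is adaptive in this sketch.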