ITF-WPI: Image and text based cross-modal feature fusion model for wolfberry pest recognition

Published: 01 Jan 2023, Last Modified: 05 Apr 2025, Comput. Electron. Agric. 2023, CC BY-SA 4.0
Abstract: Highlights
• Propose ITF-WPI, an image and text cross-modal feature fusion model for identifying 17 pests common to wolfberry.
• Introduce a contextual Transformer network and the Pyramid Squeezed Attention (PSA) mechanism into the model for visual recognition.
• A CNN-LSTM-style model (convolutional neural network and long short-term memory), constructed by stacking 1D convolutional and bidirectional long short-term memory (BiLSTM) networks, achieved competitive performance.
• An image and text dataset was constructed for wolfberry pest identification scenarios. The text describes the pest images, and each description covers the scientific name, profile, source distribution, habitat, and control methods.
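The idea of fusing image and text features before classification can be illustrated with a minimal, dependency-free sketch. The abstract does not say how ITF-WPI combines the two modalities, so the concatenation rule, the single linear classifier, the feature sizes, and the names `fuse`, `linear`, `softmax`, and `classify` are all assumptions made for illustration. Only the number of classes (17) comes from the text.

```python
import math
import random

NUM_CLASSES = 17  # the abstract mentions 17 wolfberry pest classes


def fuse(image_feat, text_feat):
    """Concatenate the two modality vectors (an assumed fusion rule)."""
    return list(image_feat) + list(text_feat)


def linear(x, weights, bias):
    """Dense layer: one score per class from the fused vector."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(scores):
    """Numerically stable softmax over class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(image_feat, text_feat, weights, bias):
    """Fuse both feature vectors, then map them to class probabilities."""
    return softmax(linear(fuse(image_feat, text_feat), weights, bias))


# Random stand-ins for the outputs of a visual branch and a text branch.
# The sizes 8 and 4 are arbitrary placeholders, not values from the paper.
rng = random.Random(0)
image_feat = [rng.gauss(0, 1) for _ in range(8)]
text_feat = [rng.gauss(0, 1) for _ in range(4)]

fused_dim = len(image_feat) + len(text_feat)
weights = [[rng.gauss(0, 0.1) for _ in range(fused_dim)]
           for _ in range(NUM_CLASSES)]
bias = [0.0] * NUM_CLASSES

probs = classify(image_feat, text_feat, weights, bias)
predicted = max(range(NUM_CLASSES), key=probs.__getitem__)
```

In a real system the two input vectors would come from trained image and text encoders, and the classifier weights would be learned rather than sampled at random.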
