Abstract: Computer vision aims to enable computers to emulate human visual perception, and it encompasses a range of techniques for extracting and interpreting information from two-dimensional images. Supervised deep 2-D image feature representation is a fundamental problem in computer vision: it applies deep learning, under supervised settings, to extract and process information from a given 2-D image. The goal is to obtain a feature vector that can be used by downstream computer vision applications, so the quality of the representation algorithm directly affects their performance. However, most existing vision research explores supervised deep 2-D image feature representation only for specific subtasks, so a comprehensive discussion of the topic is needed. In this article, we propose a taxonomy of supervised deep 2-D image feature representation methods with four categories (global representation, region representation, hash representation, and hybrid representation) and introduce typical approaches for each. Furthermore, we compare representative methods on three fundamental tasks (image classification, object detection, and semantic segmentation) as well as on other common tasks. We also discuss the limitations of supervised deep 2-D image feature representation and investigate future directions in image representation to help advance computer vision.
External IDs: dblp:journals/tai/DongWDYRLLWT25