Abstract: The multimedia community has witnessed the rise of deep learning–based techniques in analyzing multimedia content more effectively. In the past decade, the convergence of deep-learning and multimedia analytics has boosted the performance of several traditional tasks, such as classification, detection, and regression, and has also fundamentally changed the landscape of several relatively new areas, such as semantic segmentation, captioning, and content generation. This article aims to review the development path of major tasks in multimedia analytics and take a look into future directions. We start by summarizing the fundamental deep techniques related to multimedia analytics, especially in the visual domain, and then review representative high-level tasks powered by recent advances. Moreover, the performance review of popular benchmarks gives a pathway to technology advancement and helps identify both milestone works and future directions.
0 Replies
Loading