Abstract: Image AI is essential for various applications, such as self-driving
cars, medical imaging, and smart farming. Data management is
key for efficient image AI, from how to store images to how to
manage data while processing the images. This tutorial surveys
the emerging area of image AI pipelines, combining approaches
from several disciplines, including data management,
digital signal processing, computer vision, and machine learning.
We specifically focus on image storage and data management.
The tutorial first walks through image AI pipelines step by
step, covering how they work and their main challenges. We then describe
the main approaches to making image AI pipelines more efficient.
We first cover how image AI pipelines store images based on stan-
dard storage formats, learned formats, task-specific learned formats,
and self-designed formats. Second, we cover how state-of-the-art
approaches manage data within image AI pipelines. We identify
and describe three main ways to manage this data efficiently: (i)
compressing intermediate data, (ii) materializing and re-using data
objects, and (iii) exploiting parallelism for better hardware utilization. Lastly,
the tutorial covers open data management and systems problems
and future directions in making image AI pipelines more efficient.