Keywords: Computer Vision, Active Learning, Uncertainty Quantification
Abstract: Understanding semantics and dynamics has been crucial for embodied agents in various tasks. Both tasks have much more data redundancy than the static scene understanding task. We formulate the view selection problem as an active learning problem and propose a view selection algorithm with Fisher Information that could work for semantics and dynamics in scene understanding. We evaluate our method on large-scale static images and dynamic video datasets by selecting informative frames from multi-camera setups. Experimental results demonstrate that our approach consistently improves rendering quality and semantic segmentation performance, outperforming baseline methods based on random selection and uncertainty-based heuristics.
Supplementary Material: zip
Submission Number: 306
Loading