Keywords: autoencoder, inverse problem, Siamese networks
TL;DR: The paper introduces SMCVAE, which uses Siamese networks to restore missing information in video frames.
Abstract: Restoring missing information in video frames is a challenging inverse problem, particularly in applications such as autonomous driving and surveillance. This paper introduces the Siamese Masked Conditional Variational Autoencoder (SMCVAE), a novel model built on a Siamese network architecture with Siamese Vision Transformer (SiamViT) encoders. By exploiting the inherent similarities between paired frames, SMCVAE reconstructs missing content more accurately. The approach targets missing patches, which often result from camera malfunctions, and restores them through variational inference. Experimental results demonstrate SMCVAE's superior performance in restoring lost information, highlighting its potential to solve complex inverse problems in real-world environments.
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 463
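The abstract names three ingredients: patch masking, a Siamese (weight-sharing) encoder applied to paired frames, and variational inference. The toy sketch below illustrates only those generic ideas and is not the authors' implementation. Every name in it (`split_into_patches`, `mask_patches`, `SharedEncoder`, `reparameterize`) is hypothetical, and a mean-pooled linear map stands in for the SiamViT encoder purely to keep the example dependency-free.

```python
import math
import random


def split_into_patches(frame, patch):
    """Split an H x W frame (list of rows) into flattened, non-overlapping patches."""
    h, w = len(frame), len(frame[0])
    return [
        [frame[r + i][c + j] for i in range(patch) for j in range(patch)]
        for r in range(0, h, patch)
        for c in range(0, w, patch)
    ]


def mask_patches(patches, mask_ratio, rng):
    """Zero out a random subset of patches to simulate missing content."""
    n_masked = int(len(patches) * mask_ratio)
    masked_idx = set(rng.sample(range(len(patches)), n_masked))
    masked = [
        [0.0] * len(p) if i in masked_idx else list(p)
        for i, p in enumerate(patches)
    ]
    return masked, sorted(masked_idx)


class SharedEncoder:
    """Toy stand-in for a Siamese encoder (assumption: linear, not a ViT).

    "Siamese" here means one set of weights is reused for both frames.
    """

    def __init__(self, in_dim, latent_dim, rng):
        self.w_mu = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                     for _ in range(latent_dim)]
        self.w_logvar = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                         for _ in range(latent_dim)]

    def __call__(self, patches):
        # Mean-pool the patch vectors, then project to a Gaussian's parameters.
        n, dim = len(patches), len(patches[0])
        pooled = [sum(p[k] for p in patches) / n for k in range(dim)]
        mu = [sum(w * x for w, x in zip(row, pooled)) for row in self.w_mu]
        logvar = [sum(w * x for w, x in zip(row, pooled)) for row in self.w_logvar]
        return mu, logvar


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


rng = random.Random(0)
frame_a = [[float(r * 4 + c) for c in range(4)] for r in range(4)]  # damaged frame
frame_b = [[v + 0.5 for v in row] for row in frame_a]               # paired frame

patches_a, masked_idx = mask_patches(split_into_patches(frame_a, 2), 0.5, rng)
patches_b = split_into_patches(frame_b, 2)

encoder = SharedEncoder(in_dim=4, latent_dim=3, rng=rng)  # one instance, two inputs
mu_a, logvar_a = encoder(patches_a)
mu_b, logvar_b = encoder(patches_b)
z = reparameterize(mu_a, logvar_a, rng)
```

A decoder conditioned on `z` and on the paired frame's encoding would complete a conditional VAE. It is omitted here because the abstract does not describe it.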