LSTMVAEF: Vivid Layout via LSTM-Based Variational Autoencoder Framework

Jie He, Xingjiao Wu, Wenxin Hu, Jing Yang

ICDAR (2) 2021 (modified: 15 Apr 2022)

Abstract: The lack of training data remains a challenge in the Document Layout Analysis (DLA) task. Synthetic data is an effective way to tackle this challenge. In this paper, we propose an LSTM-based Variational Autoencoder framework (LSTMVAEF) to synthesize layouts for DLA. Compared with the previous method, our method can generate more complicated layouts and needs only DLA training data, with no extra annotation. We use LSTM models as the basic building blocks to learn a latent representation of the class and position information of the elements within a page. We also design a weight adaptation strategy to help the model train faster. Experiments show that our model can generate more vivid layouts while requiring only a few real document pages.
