Abstract: Diagram question answering is a challenging multimodal machine learning task in which questions must be answered from given diagrams in specific fields. Compared with natural images, these diagrams use more abstract expressions and encode more complex logical relations, which makes diagram question answering more difficult. In this paper, we propose a new approach to the diagram question answering task. We add bottom-up and top-down attention to identify the regions of interest relevant to each question, and we use a single model to jointly train on multiple-choice and true/false questions. On the test set of the official CCKS2022 textbook diagram question answering session, our approach achieves an accuracy of 58.09%.
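
As a rough illustration of the idea described in the abstract, the sketch below shows how question-guided (top-down) attention over precomputed region features could feed one shared scorer for both question types. It is not the paper's implementation: the dot-product scoring, the element-wise fusion, the treatment of a true/false question as a two-candidate choice, and all names (`top_down_attention`, `predict`, the toy vectors) are assumptions made for illustration. The region features stand in for the output of a bottom-up detector, which is not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_down_attention(region_feats, question_vec):
    """Weight each region by its relevance to the question.

    region_feats: one feature vector per diagram region (assumed to come
                  from a bottom-up detector, not implemented here).
    Returns the attention weights and the attended diagram vector.
    """
    weights = softmax([dot(r, question_vec) for r in region_feats])
    dim = len(region_feats[0])
    attended = [
        sum(w * r[i] for w, r in zip(weights, region_feats))
        for i in range(dim)
    ]
    return weights, attended


def predict(region_feats, question_vec, candidate_vecs):
    """Shared scorer (hypothetical): return the index of the best candidate.

    Multiple-choice questions pass one vector per option; true/false
    questions pass two vectors, one for "true" and one for "false".
    """
    _, attended = top_down_attention(region_feats, question_vec)
    # Element-wise fusion of diagram and question (an assumed choice).
    fused = [a * q for a, q in zip(attended, question_vec)]
    scores = [dot(fused, c) for c in candidate_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy data: two regions, a question that matches the first region.
regions = [[1.0, 0.0], [0.0, 1.0]]
question = [2.0, 0.0]
mc_options = [[0.0, 1.0], [1.0, 0.0], [-1.0, 0.0], [0.0, -1.0]]
tf_options = [[1.0, 0.0], [-1.0, 0.0]]  # index 0 = "true", 1 = "false"

weights, _ = top_down_attention(regions, question)
mc_answer = predict(regions, question, mc_options)
tf_answer = predict(regions, question, tf_options)
```

Because both question types go through the same `predict` function, a single set of parameters (in a real model, the learned embeddings and attention weights) would receive training signal from both kinds of question.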