Hierarchical Attention Decoder for Solving Math Word Problems

Published: 2024 · Last Modified: 22 Feb 2025 · ICPRAI (1) 2024 · CC BY-SA 4.0
Abstract: To answer math word problems (MWPs), models must formalize equations from the source text of the problems. Recently, tree-structured decoders have significantly improved performance on this task by generating the target equation in tree form. However, current decoders usually ignore the hierarchical relationships between tree nodes and their parents, which hinders further improvement. We therefore propose a structure called the hierarchical attention tree to guide the decoder's generation procedure. Since our decoder follows a graph-based encoder, the full model is named Graph to Hierarchical Attention Tree (G2HAT). We show that a tree-structured decoder with hierarchical accumulative multi-head attention yields significant performance gains and outperforms various strong baselines on both the English MAWPS and Chinese Math23k MWP benchmarks. For further study, we also apply a pre-trained language model to G2HAT, which improves performance even further.
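The abstract does not spell out how the hierarchical accumulative multi-head attention is computed, but the core idea (a node being generated attends over the accumulated hidden states along its ancestor path in the partial equation tree) can be sketched as below. This is a minimal illustrative sketch, not the paper's actual implementation: the function name, shapes, head count, and the choice of attending over the root-to-parent path are all assumptions.

```python
import numpy as np

def hierarchical_attention(query, ancestors, num_heads=4):
    """Multi-head attention from the current tree node's query over the
    accumulated hidden states of its ancestors (root-to-parent path).

    query:     (d,)   hidden state of the node being generated
    ancestors: (k, d) stacked hidden states along the path to the root
    Returns a (d,) context vector. All names/shapes are illustrative.
    """
    d = query.shape[0]
    assert d % num_heads == 0
    hd = d // num_heads
    # Split into heads: q -> (num_heads, hd), a -> (num_heads, k, hd)
    q = query.reshape(num_heads, hd)
    a = ancestors.reshape(ancestors.shape[0], num_heads, hd).transpose(1, 0, 2)
    # Scaled dot-product scores per head: (num_heads, k)
    scores = np.einsum('hd,hkd->hk', q, a) / np.sqrt(hd)
    # Numerically stable softmax over the ancestor axis
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    # Weighted sum of ancestor states, then concatenate heads
    ctx = np.einsum('hk,hkd->hd', weights, a)
    return ctx.reshape(d)

rng = np.random.default_rng(0)
node = rng.standard_normal(8)
path = rng.standard_normal((3, 8))  # e.g. root, grandparent, parent
context = hierarchical_attention(node, path)
print(context.shape)  # (8,)
```

In this reading, "accumulative" means the set of attended states grows as decoding descends the tree, so deeper nodes condition on their full ancestor chain rather than only their immediate parent.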