Abstract: Highlights•A novel AdamR-GRU, decoder unit in encoding–decoding architecture for HMER Problems.•Proposed model incorporates momentum adaptive momentum with GRU and dropout as a regularization at the input nodes.•AdamR-GRU outperforms many state-of-art models with less memory and training time in top 1, 2, 3 error accuracy in CROHME 2014 and 2016 datasets. Furthermore, It achieves state-of-art in top 2 and 3 error accuracy in CROHME 2019 dataset.