Abstract: Highlights
• Object-agnostic hand-object 3D reconstruction from monocular hand-object motion video
• Robust rigid-transformation estimation network that leverages a large pre-trained model
• Two-stage pipeline for 3D hand-object reconstruction
• New hand-object dataset to benchmark hand-object 3D reconstruction
• Analysis of rigid-transformation estimation performance versus object sizes and textures