Abstract: In manual assembly, differences in orientation have a significant impact on assembly accuracy. The reliability of posture estimation depends on accurately estimating the six degree-of-freedom (6DoF) position and orientation (pose) of the tracked objects. In this paper, we present an improved Efficient Pose algorithm, a single-shot, learning-based approach to hand and object pose estimation. Building on the original Efficient Pose algorithm, we add a subnetwork for hand prediction, replace some MBConv modules with Fused-MBConv modules, modify the number of network layers, and use different training strategies. Experimental results on a public dataset for monocular red-green-blue (RGB) 6DoF marker-less hand and surgical instrument pose tracking show that the proposed method improves performance and shortens training time compared to other methods.
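To make the term 6DoF pose concrete, the following minimal sketch represents a pose as three rotational degrees of freedom (an axis-angle rotation) and three translational ones, and applies it to a point on an object. It illustrates the quantity being estimated only; it is not the network described in this paper, and the function names `rotation_from_axis_angle` and `apply_pose` are hypothetical choices made for this example.

```python
import math


def rotation_from_axis_angle(axis, angle):
    """Build a 3x3 rotation matrix with Rodrigues' formula.

    `axis` is any non-zero 3-vector (normalized here); `angle` is in radians.
    """
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    x, y, z = x / norm, y / norm, z / norm
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]


def apply_pose(rotation, translation, point):
    """Map a point from the object frame to the camera frame: R * p + t."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


# Illustrative values (assumptions, not data from the paper):
# rotate 90 degrees about the z-axis, then shift 0.5 units along z.
R = rotation_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
t = (0.0, 0.0, 0.5)
p_camera = apply_pose(R, t, (1.0, 0.0, 0.0))
```

A pose estimator of the kind summarized above would output the rotation and translation for each tracked hand or instrument; the sketch only shows how such an output acts on object points.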