Guidelines for Optimal Mesh Generation Using Deep Learning-Driven Avatar Reconstruction For Gait Analysis
Keywords: AI for Health
TL;DR: A systematic approach for determining the optimal number and orientation of cameras required for deep learning-based mesh reconstruction aimed at gait analysis.
Abstract:
Gait analysis plays a vital role across numerous scientific domains, typically relying on marker-based or markerless motion capture (MoCap) techniques. Recent advances in deep learning (DL), particularly in human mesh generation, have greatly enhanced markerless methods. However, poor camera placement often degrades mesh quality. This paper introduces a DL-based framework for accurate human mesh reconstruction using avatar reconstruction algorithms (ARAs). Leveraging a simulated environment, our method [1] systematically determines the optimal number and placement of cameras. Mesh quality is further refined through alignment, evaluation, and surface reconstruction to eliminate artifacts. We also present a simulation- and reality-tested gait analysis tool that detects gait phases, extracts key joint angles, and animates the gait cycle in 3D. This open-source framework is adaptable and suited for diverse gait analysis applications.
I. INTRODUCTION
Recent advances in computer vision and deep learning (DL) have enabled accessible gait analysis via human avatar generation from RGB images, developed mainly for AR/VR applications [2, 3]. However, results are highly sensitive to camera placement, with suboptimal angles producing poor meshes and inaccurate stance detection [4]. This work presents a novel, open-source approach to gait analysis that uses DL-based ARAs to produce robust results of value to the motion analysis community.
II. METHODOLOGY
Our methodology is based on the following key steps:
(a) Simulation-based camera placement optimization: A simulated environment produces a naturally moving digital actor, recorded by a simulated MoCap system. The resulting videos are used to generate 3D meshes, whose quality is maximized by optimizing the camera setup.
(b) Avatar reconstruction: ECON [3] is used to generate detailed, realistic human avatars from monocular RGB input, enabling markerless motion capture.
(c) Mesh quality assessment: Reconstructed meshes are evaluated quantitatively using the mean cloud-to-cloud (C2C) distance metric, an objective measure of geometric accuracy relative to ground-truth models.
(d) Novel evaluation indices: Two metrics are introduced, the Attainable Improvement Index (AII) and the Attainable Deterioration Index (ADI), which quantify the expected enhancement or degradation in mesh quality caused by changes in the camera configuration.
(e) Modular processing pipeline: The framework comprises sequential modules for data initialization, video processing, integration of the avatar reconstruction algorithm (ARA), mesh post-processing and optimization, and final visualization and animation in Blender.
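As a minimal sketch of the C2C metric used in step (c): for each point of the reconstructed cloud, take the distance to its nearest neighbour in the ground-truth cloud and average. The toy point clouds and the brute-force nearest-neighbour search below are illustrative assumptions, not the paper's implementation (a k-d tree would be used at realistic mesh sizes):

```python
import numpy as np

def mean_c2c_distance(source: np.ndarray, target: np.ndarray) -> float:
    """Mean cloud-to-cloud (C2C) distance: for each point in `source`,
    find its nearest neighbour in `target`, then average those distances.
    Brute-force O(N*M) for clarity; use a k-d tree for large clouds."""
    # (N, M) matrix of pairwise Euclidean distances
    d = np.linalg.norm(source[:, None, :] - target[None, :, :], axis=-1)
    return float(d.min(axis=1).mean())

# Hypothetical example: a reconstructed cloud offset from ground truth
gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
rec = gt + np.array([0.1, 0.0, 0.0])  # uniform 0.1 m shift along x
print(mean_c2c_distance(rec, gt))  # 0.1 for this toy offset
```

Lower values indicate a reconstruction geometrically closer to the ground-truth model; the camera-placement optimization seeks the configuration that minimizes this score.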
III. RESULTS
Among the parameters of high scientific relevance in gait analysis are the joint angles of the hip, knee, and ankle. These angles are computed from anatomically defined landmarks on the leg and foot. Fig. 16 in [1] shows the joint angle trajectories obtained from ECON keypoints, compared against population-average trajectories reported in [5]; the experimental results fall within the expected range for the reference dataset. Changes in the foot contact area indicate phase-changing events and can therefore be used for phase detection (Fig. 15 in [1]). The phases are separated by six events: Initial Contact (IC), Foot Flat (FF), Heel Rise (HR), Contralateral IC (CIC), Toe Off (TO), and Feet Adjacent (FA).
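A hedged sketch of how a joint angle can be derived from three keypoints: the paper computes hip, knee, and ankle angles from ECON keypoints, but the landmark coordinates and the vector-angle formula below are illustrative assumptions, not the authors' exact pipeline:

```python
import numpy as np

def joint_angle(a, b, c) -> float:
    """Angle at landmark b (in degrees) between segments b->a and b->c,
    e.g. the knee angle from hip (a), knee (b), and ankle (c) keypoints."""
    u = np.asarray(a, float) - np.asarray(b, float)
    v = np.asarray(c, float) - np.asarray(b, float)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip guards against arccos domain errors from rounding
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Hypothetical 3D keypoints (metres): a fully extended leg gives ~180 deg
hip, knee, ankle = [0.0, 1.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]
print(joint_angle(hip, knee, ankle))  # 180.0
```

Evaluating this per frame over a gait cycle yields the angle trajectories that are then compared against population averages.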
REFERENCES
[1] O. I. Stavrakakis, A. Mastrogeorgiou, A. Smyrli and E. Papadopoulos, "Guidelines For Optimal Human Mesh Generation Using Deep Learning-Driven Avatar Reconstruction For Gait Analysis," 2024 IEEE-RAS 23rd International Conference on Humanoid Robots (Humanoids), Nancy, France, 2024, pp. 787-794, doi: 10.1109/Humanoids58906.2024.10769832.
[2] A. S. Alharthi, S. U. Yunas, and K. B. Ozanyan, "Deep learning for monitoring of human gait: A review," IEEE Sensors Journal, vol. 19, no. 21, pp. 9575–9591, 2019.
[3] Y. Xiu, J. Yang, X. Cao, D. Tzionas, and M. J. Black, "ECON: Explicit Clothed humans Optimized via Normal integration," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2023.
[4] O. Stavrakakis, A. Mastrogeorgiou, A. Smyrli and E. Papadopoulos, "Gait Analysis with Trinocular Computer Vision Using Deep Learning," 2023 IEEE International Conference on Image Processing Challenges and Workshops (ICIPCW), Kuala Lumpur, Malaysia, 2023, pp. 3702-3706, doi: 10.1109/ICIPC59416.2023.10328370.
[5] C. A. Fukuchi et al., "A public dataset of overground and treadmill walking kinematics and kinetics in healthy individuals," PeerJ, 2018.
Submission Number: 124