Care-PD: A Multi-Site Anonymized Clinical Dataset for Parkinson’s Disease Gait Assessment

Published: 18 Sept 2025, Last Modified: 30 Oct 2025
Venue: NeurIPS 2025 Datasets and Benchmarks Track poster
License: CC BY-NC 4.0
Keywords: Parkinson’s Disease, Gait Assessment, Clinical Severity Prediction, Multi-site Dataset, Human Motion Encoder, Healthcare AI
TL;DR: We introduce Care-PD, a multi-site dataset and benchmark for Parkinson’s gait analysis that enables robust clinical severity prediction and improves motion representation learning through diverse, anonymized pathological gait data.
Abstract: Objective gait assessment in Parkinson’s Disease (PD) is limited by the absence of large, diverse, and clinically annotated motion datasets. We introduce Care-PD, the largest publicly available archive of 3D mesh gait data for PD, and the first multi-site collection spanning 9 cohorts from 8 clinical centers. All recordings (RGB video or motion capture) are converted into anonymized SMPL meshes via a harmonized preprocessing pipeline. Care-PD supports two key benchmarks: supervised clinical score prediction (estimating Unified Parkinson’s Disease Rating Scale, UPDRS, gait scores) and unsupervised motion pretext tasks (2D-to-3D keypoint lifting and full-body 3D reconstruction). Clinical prediction is evaluated under four generalization protocols: within-dataset, cross-dataset, leave-one-dataset-out, and multi-dataset in-domain adaptation. To assess clinical relevance, we compare state-of-the-art motion encoders with a traditional gait-feature baseline, finding that encoders consistently outperform handcrafted features. Pretraining on Care-PD reduces MPJPE (from 60.8 mm to 7.5 mm) and boosts PD severity macro-F1 by 17%, underscoring the value of clinically curated, diverse training data. Care-PD and all benchmark code are released for non-commercial research (Code, Data).
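The two metrics named in the abstract can be illustrated with a minimal sketch. This uses the standard textbook definitions of MPJPE (mean Euclidean distance between predicted and reference joints) and macro-F1 (unweighted mean of per-class F1), not the benchmark's own evaluation code. The function names, the toy inputs, and the label set are assumptions made for illustration, and the abstract does not say whether any alignment (for example root-centering) is applied before computing MPJPE.

```python
import math


def mpjpe(pred, target):
    """Mean per-joint position error.

    pred, target: nested sequences shaped [frames][joints][3], in the
    same unit (for example millimetres). No alignment is applied here;
    that is an assumption of this sketch.
    """
    dists = [
        math.dist(p, t)
        for pred_frame, target_frame in zip(pred, target)
        for p, t in zip(pred_frame, target_frame)
    ]
    return sum(dists) / len(dists)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 over the given labels."""
    scores = []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: one frame, two joints, each prediction offset by (3, 4, 0).
reference = [[(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]]
predicted = [[(3.0, 4.0, 0.0), (13.0, 4.0, 0.0)]]
error = mpjpe(predicted, reference)

# Hypothetical severity labels 0..2, chosen only for illustration.
score = macro_f1([0, 0, 1, 2], [0, 1, 1, 2], labels=[0, 1, 2])
```

Because macro-F1 weights every class equally, it is sensitive to performance on rare severity levels, which is one common reason to prefer it over accuracy on imbalanced clinical labels.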
Croissant File: json
Dataset URL: https://doi.org/10.5683/SP3/TWIKMK
Code URL: https://github.com/TaatiTeam/CARE-PD
Supplementary Material: pdf
Primary Area: AI/ML Datasets & Benchmarks for health sciences (e.g. climate, health, life sciences, physics, social sciences)
Submission Number: 718