Abstract: Parkinson’s Disease (PD) is a progressive neurodegenerative
disorder that affects motor and speech functions.
Early and accurate detection of PD is crucial for timely medical
intervention. This study uses machine learning to develop a
non-invasive classification model based on vocal biomarkers extracted
from the UCI Parkinson’s Disease dataset. These biomarkers include
jitter, shimmer, fundamental frequency, recurrence period density
entropy (RPDE), and pitch period entropy (PPE), which have
previously been identified as indicators of PD. To distinguish
PD patients from healthy individuals, ten machine learning
models were evaluated, including LightGBM, XGBoost, Random
Forest, AdaBoost, Bagging, Decision Tree, Logistic Regression,
Support Vector Machine (SVM), K-Nearest Neighbor (KNN),
and Naïve Bayes. Feature selection techniques were employed
to enhance model efficiency by reducing redundancy while
maintaining classification performance. Experimental results
demonstrated that LightGBM achieved the highest accuracy of
98.00% with an AUC of 97.00%, outperforming other classifiers.
This study highlights the potential of machine learning-based
speech analysis for early, cost-effective, and scalable PD detection,
providing a foundation for future clinical applications in
non-invasive neurological assessments.