Predictive Modeling of Early Stage Parkinson's Disease
Abstract
Background: Early stage (preclinical) detection of Parkinson's disease (PD) remains challenging, yet it is crucial both to differentiate PD from other disorders and to facilitate timely administration of neuroprotective treatment as it becomes available. Objective: In a cross-validation paradigm, two binary classification analyses were conducted: early PD versus controls and early PD versus SWEDD (scan without evidence of dopaminergic deficit). It was hypothesized that five distinct model types using combined non-motor and biomarker features would distinguish early PD from controls with a cross-validated (CV) AUC above .80, but that the diverse nature of SWEDD would reduce the early PD versus SWEDD CV AUC and alter the model-based ranking of predictor importance across model types. Methods: Baseline data were acquired from the Parkinson's Progression Markers Initiative (PPMI). Logistic regression, generalized additive (GAM), decision tree, random forest and XGBoost models were fitted using non-motor clinical and biomarker features. Randomized train and test data partitions were used. Model CV classification performance was compared using the area under the curve (AUC), accuracy, sensitivity, specificity and the Kappa statistic. Results: All five models achieved a CV AUC above .80 in distinguishing early PD from controls using non-motor clinical and biomarker features. The GAM (CV AUC .928, sensitivity .898, specificity .897) and XGBoost (CV AUC .923, sensitivity .875, specificity .897) models were the top classifiers. Performance across all models was consistently lower in the early PD/SWEDD analyses. The two highest performing models were XGBoost (CV AUC .863, sensitivity .905, specificity .748) and random forest (CV AUC .822, sensitivity .809, specificity .721); XGBoost detection of non-PD SWEDD matched 1-2 year curated diagnoses in 81.25% (13/16) of cases.
In both the early PD/control and early PD/SWEDD analyses, and across all models, olfactory function was the single most important feature for classification; rapid eye movement sleep behaviour disorder and cognition were the next most commonly highly ranked features. Alpha-synuclein was an important feature for early PD/control classification but not for early PD/SWEDD classification; conversely, daytime sleepiness was important to the latter but not the former. Interpretation: Non-motor clinical and biomarker variables enable high CV discrimination of early PD versus controls but are less effective at discriminating early PD from SWEDD.
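For readers less familiar with the performance measures named in the Methods, the following is a minimal sketch of how AUC, accuracy, sensitivity, specificity and Cohen's Kappa can be computed for a binary classifier. It is not the study's code: the function names `binary_metrics` and `auc` are hypothetical, the labels and scores are invented toy values (not PPMI data), the 0.5 decision threshold is an assumption, and both classes are assumed to be present.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and Cohen's Kappa for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate

    # Agreement expected by chance, from the marginal totals.
    p_chance = ((tp + fn) / n) * ((tp + fp) / n) + ((tn + fp) / n) * ((tn + fn) / n)
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = early PD, 0 = comparison group.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
toy_auc = auc(y_true, scores)
print(metrics, toy_auc)
```

In a cross-validation setting, these quantities would typically be computed on each held-out partition and then summarized; the sketch above covers a single partition only.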