Predictive Modeling of Early Stage Parkinsons Disease

Leger, Charles Stevens

dc.contributor.advisor	DeSouza, Joseph FX
dc.contributor.author	Leger, Charles Stevens
dc.date.accessioned	2020-11-13T14:03:05Z
dc.date.available	2020-11-13T14:03:05Z
dc.date.copyright	2020-10
dc.date.issued	2020-11-13
dc.date.updated	2020-11-13T14:03:04Z
dc.degree.discipline	Psychology (Functional Area: Brain, Behaviour & Cognitive Science)
dc.degree.level	Doctoral
dc.degree.name	PhD - Doctor of Philosophy
dc.description.abstract	Background: Early stage (preclinical) detection of Parkinson's disease (PD) remains challenging, yet it is crucial both to differentiate PD from other disorders and to facilitate timely administration of neuroprotective treatment as it becomes available. Objective: In a cross-validation (CV) paradigm, two binary classification analyses were conducted: early PD versus controls and early PD versus SWEDD (scan without evidence of dopaminergic deficit). It was hypothesized that five distinct model types using combined non-motor and biomarker features would distinguish early PD from controls with a CV AUC > .80, but that the diverse nature of SWEDD would reduce the early PD versus SWEDD CV AUC and alter the model-based rank of predictor importance among model types. Methods: Baseline data were acquired from the Parkinson's Progression Markers Initiative (PPMI). Logistic regression, generalized additive (GAM), decision tree, random forest and XGBoost models were fitted using non-motor clinical and biomarker features. Randomized train and test data partitions were used. CV classification performance was compared across models using the area under the curve (AUC), accuracy, sensitivity, specificity and the Kappa statistic. Results: All five models achieved a CV AUC > .80 in distinguishing early PD from controls using non-motor clinical and biomarker features. The GAM (CV AUC .928, sensitivity .898, specificity .897) and XGBoost (CV AUC .923, sensitivity .875, specificity .897) models were the top classifiers. Performance across all models was consistently lower in the early PD/SWEDD analyses. The two highest performing models were XGBoost (CV AUC .863, sensitivity .905, specificity .748) and random forest (CV AUC .822, sensitivity .809, specificity .721); XGBoost detection of non-PD SWEDD matched 1-2 year curated diagnoses in 81.25% (13/16) of cases. In both the early PD/control and early PD/SWEDD analyses, and across all models, olfactory function was the single most important feature for classification; rapid eye movement sleep behaviour disorder and cognition were the next most commonly high-ranked features. Alpha-synuclein was an important feature for early PD/control but not for early PD/SWEDD classification; conversely, daytime sleepiness was important to the latter but not the former. Interpretation: Non-motor clinical and biomarker variables enable high CV discrimination of early PD versus controls but are less effective at discriminating early PD from SWEDD.
dc.identifier.uri	http://hdl.handle.net/10315/37993
dc.language	en
dc.rights	Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject	Health sciences
dc.subject.keywords	Predicting Parkinson’s
dc.subject.keywords	SWEDD
dc.subject.keywords	Random forest
dc.subject.keywords	XGBoost
dc.subject.keywords	Logistic regression
dc.title	Predictive Modeling of Early Stage Parkinsons Disease
dc.type	Electronic Thesis or Dissertation
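
Illustrative note on the performance measures named in the abstract: the sketch below shows one common way to compute AUC, accuracy, sensitivity, specificity and Cohen's Kappa for a binary classifier. It is not the thesis's code. The function names, the 0.5 decision threshold and the eight toy labels and scores are all assumptions made up for illustration, and none of the numbers come from PPMI or from the reported results.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and Cohen's kappa (1 = early PD, 0 = comparison group)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + fp + tn + fn

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate

    # Cohen's kappa: observed agreement corrected for agreement expected by chance.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 "early PD" cases (1) and 4 comparison cases (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]  # made-up predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
toy_auc = auc(y_true, scores)
print(metrics, toy_auc)
```

AUC is computed from the continuous scores and so does not depend on the threshold, whereas accuracy, sensitivity, specificity and Kappa all change with the chosen cut-off. In a cross-validation setting these quantities would typically be computed on each held-out partition and then summarized.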