Published: June 22, 2022
Background: Disability assessment using the Expanded Disability Status Scale (EDSS) is important to inform treatment decisions and monitor the progression of multiple sclerosis. Yet, EDSS scores are documented infrequently in electronic medical records.
Objective: To validate a machine learning model to estimate EDSS scores for multiple sclerosis patients using clinical notes from neurologists.
Methods: A machine learning model was developed to estimate EDSS scores on specific encounter dates using clinical notes from neurologist visits. The OM1 MS Registry data were used to create a training cohort of 2632 encounters and a separate validation cohort of 857 encounters, all with clinician-recorded EDSS scores. Model performance was assessed using the area under the receiver-operating-characteristic curve (AUC), positive predictive value (PPV), and negative predictive value (NPV), calculated using a binarized version of the outcome. Spearman and Pearson correlation coefficients between the continuous estimated and clinician-recorded EDSS scores were also calculated. The model was then applied to encounters without clinician-recorded EDSS scores in the MS Registry.
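The evaluation described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the binarization cutoff (`THRESHOLD = 4.0`), the toy scores, and all function names are hypothetical, since the abstract does not state the cutoff or the tooling used. The sketch also assumes that the AUC is computed from the continuous estimate against the binarized clinician-recorded score, and that PPV and NPV come from applying the same cutoff to both scores.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson_r(average_ranks(x), average_ranks(y))

def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, lab in zip(ranks, labels) if lab)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def ppv_npv(labels, predicted):
    """Positive and negative predictive value from binary labels."""
    tp = sum(1 for lab, p in zip(labels, predicted) if lab and p)
    fp = sum(1 for lab, p in zip(labels, predicted) if not lab and p)
    tn = sum(1 for lab, p in zip(labels, predicted) if not lab and not p)
    fn = sum(1 for lab, p in zip(labels, predicted) if lab and not p)
    return tp / (tp + fp), tn / (tn + fn)

# Hypothetical cutoff for binarizing EDSS; the abstract does not give one.
THRESHOLD = 4.0

# Toy data (invented for illustration, not from the OM1 MS Registry).
recorded = [1.0, 2.0, 3.5, 6.0, 6.5, 2.5, 4.0, 7.0]   # clinician-recorded EDSS
estimated = [1.5, 2.0, 4.5, 5.5, 6.0, 3.0, 3.5, 6.5]  # model-estimated EDSS

labels = [int(v >= THRESHOLD) for v in recorded]
predicted = [int(v >= THRESHOLD) for v in estimated]

ppv, npv = ppv_npv(labels, predicted)
auc_value = auc(labels, estimated)
rho = spearman_r(recorded, estimated)
r = pearson_r(recorded, estimated)
```

The binary metrics (PPV, NPV, AUC) depend on the chosen cutoff, whereas the two correlation coefficients use the continuous scores directly, which is why both kinds of measure are reported.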
Results: The model had a PPV of 0.85, NPV of 0.85, and AUC of 0.91. Comparing the continuous estimated EDSS (eEDSS) scores with clinician-recorded EDSS scores yielded a Spearman R of 0.75 and a Pearson R of 0.74. Application of the model to eligible encounters generated eEDSS scores for an additional 190,282 encounters from 13,249 patients.
Conclusion: EDSS scores can be estimated with very good performance (AUC of 0.91) using a machine learning model applied to clinical notes, increasing the utility of real-world data sources for research.