Peripheral Artery Disease Prediction using Medical Notes

In this project, we fine-tune a BERT model on medical notes to predict Peripheral Artery Disease (PAD). PAD, or atherosclerotic occlusive disease of the lower extremities, affects 8-12 million American adults and more than 200 million people worldwide. Its prevalence is as high as 12-30% in patients over the age of 65, and annual Medicare expenditures related to the treatment of PAD alone total $4 billion. PAD is a highly morbid condition that can lead to limb loss secondary to acute or chronically progressive lower extremity ischemia. Moreover, PAD can lead to a 6-fold increase in the risk of premature mortality and of major adverse cardiovascular and cerebrovascular events (MACCE). To date, standard machine learning algorithms such as logistic regression and random forest have been applied to electronic health record (EHR) data to classify PAD. We investigate whether deep learning classifies PAD more accurately than these standard machine learning algorithms.