Transfer Learning for Medication Adherence Prediction from Social Forums Self-Reported Data
Left unaddressed, medication non-adherence and non-compliance can compound into severe medical problems for patients. Identifying patients who are likely to become non-adherent can help reduce these problems, but monitoring adherence at scale is cost-prohibitive. Social forums offer an easily accessible, affordable, and timely alternative to traditional methods based on claims data. This study investigates the potential of predicting medication adherence for diabetes and fibromyalgia therapies from social forum data by using transfer learning from the Medical Expenditure Panel Survey (MEPS).
Predictive adherence models are developed from both survey and social forum data using different random forest (RF) techniques. The first technique uses inputs binned by k-means clustering. The second is based on ternary trees rather than the widely used binary decision trees. Both techniques can handle missing data, a prevalent characteristic of social forum data.
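As a rough illustration of how binning and a three-way split can absorb missing values, the following minimal sketch clusters one numeric feature with a plain one-dimensional k-means and assigns unreported values to a dedicated bin. It is not the implementation used in this study: the names `kmeans_1d`, `to_bin`, `ternary_route`, and `MISSING_BIN` are hypothetical, the ages are toy values, and the reading of the third branch of a ternary node as a "missing" branch is an assumption.

```python
import math
import random

MISSING_BIN = -1  # hypothetical label for "value not reported"

def kmeans_1d(values, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of floats; returns sorted centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(sorted(set(values)), k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Keep the previous centroid if a cluster ends up empty.
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)

def to_bin(value, centroids):
    """Map a raw value to the index of its nearest centroid, or MISSING_BIN."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return MISSING_BIN
    return min(range(len(centroids)), key=lambda j: abs(value - centroids[j]))

def ternary_route(value, threshold):
    """Route a sample at a tree node: left, right, or a third branch
    reserved for missing values (an assumption of this sketch)."""
    if value is None:
        return "missing"
    return "left" if value <= threshold else "right"

# Toy feature with gaps, as is typical of self-reported forum posts.
ages = [23.0, 25.0, 27.0, 61.0, 64.0, 66.0, None, 24.0, None, 63.0]
observed = [a for a in ages if a is not None]
centroids = kmeans_1d(observed, k=2)
binned = [to_bin(a, centroids) for a in ages]
```

In both cases a missing value becomes an ordinary category or branch, so a tree can split on it without discarding the record.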
The results of this study show that transfer learning between survey models and social forum models is possible. When RF models were derived from MEPS survey data with the techniques described above, accuracy on the MEPS test dataset and on the social forum test dataset differed by less than 5%. In addition to these RF techniques, another RF implementation, which replaces missing values with imputed means, was developed and shown to predict adherence for social forum patients with an accuracy above 70%.
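The mean-imputation step and the accuracy comparison can be sketched as follows. This is only an illustration under stated assumptions: the column means are taken from the survey training rows and reused to fill gaps in the forum rows, all values and labels are toy data rather than results of this study, and the helper names `column_means`, `impute`, and `accuracy` are hypothetical. The classifier itself (an RF in this study) is omitted; it would be trained on the filled survey rows and applied to the filled forum rows.

```python
def column_means(rows):
    """Per-column mean over the observed (non-None) entries."""
    means = []
    for j in range(len(rows[0])):
        seen = [r[j] for r in rows if r[j] is not None]
        means.append(sum(seen) / len(seen))
    return means

def impute(rows, means):
    """Replace each missing entry with the mean of its column."""
    return [[means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy survey (source) rows and forum (target) rows; None marks a gap.
survey_train = [[50.0, 2.0], [60.0, None], [70.0, 4.0]]
forum_rows = [[None, 3.0], [55.0, None]]

means = column_means(survey_train)        # statistics come from the survey data
forum_filled = impute(forum_rows, means)  # and are reused on the forum data

# Toy labels and predictions, used only to show how the gap is computed.
acc_survey = accuracy([1, 0, 1, 1], [1, 0, 1, 1])
acc_forum = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
accuracy_gap = abs(acc_survey - acc_forum)
```

Computing the means on the survey data alone keeps the forum data out of model fitting, which is one reasonable way to set up a transfer comparison.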
This thesis shows that a model trained with verified survey data can be used to complement traditional medical adherence models by predicting adherence from unverified, self-reported data in a dynamic and timely manner. Furthermore, this model provides a method for discovering objective insights from subjective social reports. Additional investigation is needed to improve the prediction accuracy of the proposed model and to assess biases that may be inherent to self-reported adherence measures in social health networks.