Model-based Approach for Determining Optimal Dynamic Treatment Regimes
Thesis posted on 19.12.2021, 17:56 by Bing Yu
Dynamic treatment regimes (DTRs) are often considered for the medical care of chronic diseases and complex conditions. They consist of multistage treatment decisions, each based on the individual's health information and their treatment and response history. In this dissertation, we consider this setting with binary responses (i.e., a patient responds either favorably or unfavorably to a treatment) and highlight one type of heterogeneity: the existence of subgroups of patients who respond favorably to only a distinct subset of the study treatments.
Currently, most work employs model-free approaches to find the optimal DTR. In contrast, we propose a model-based approach, which focuses more on describing heterogeneity in treatment responses. We first consider the scenario in which baseline covariates are not included. We propose a mixture of mixed logit models, along with an EM algorithm, to estimate the subgroup proportions and the probabilities of a favorable response. We describe how an optimal dynamic treatment regime can be determined given the model information. We also discuss the necessary identifiability conditions (i.e., what sets of parameters are necessary for DTR determination).
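To illustrate how a regime might be read off from such model information, the following is a minimal sketch and not the dissertation's procedure. It assumes the subgroup proportions and per-subgroup response probabilities are already known, that responses are conditionally independent given the subgroup, that a patient switches treatment only after an unfavorable response, and that a later favorable response is weighted by a hypothetical `discount` factor. All function and parameter names are illustrative.

```python
from itertools import permutations


def marginal_success(pi, p, a):
    """Probability of a favorable response to treatment a, averaged over subgroups."""
    return sum(pi_k * p_k[a] for pi_k, p_k in zip(pi, p))


def posterior_after_failure(pi, p, a):
    """Subgroup proportions updated (Bayes' rule) after an unfavorable response to a."""
    weights = [pi_k * (1.0 - p_k[a]) for pi_k, p_k in zip(pi, p)]
    total = sum(weights)
    if total == 0.0:
        # Failure is impossible under these parameters; keep the prior unchanged.
        return list(pi)
    return [w / total for w in weights]


def best_two_stage_regime(pi, p, discount=1.0):
    """Enumerate (first, second) treatment pairs and return (value, first, second).

    pi[k]    : assumed proportion of subgroup k
    p[k][t]  : assumed probability that subgroup k responds favorably to treatment t
    discount : hypothetical weight (<= 1) on a response that arrives at stage 2
    """
    n_treatments = len(p[0])
    best = None
    for a1, a2 in permutations(range(n_treatments), 2):
        s1 = marginal_success(pi, p, a1)
        # Non-responders at stage 1 are re-weighted toward subgroups unlikely to respond to a1.
        post = posterior_after_failure(pi, p, a1)
        s2 = marginal_success(post, p, a2)
        value = s1 + discount * (1.0 - s1) * s2
        if best is None or value > best[0]:
            best = (value, a1, a2)
    return best


if __name__ == "__main__":
    # Two subgroups, three treatments; the numbers are made up for illustration.
    pi = [0.5, 0.5]
    p = [[0.8, 0.1, 0.1],
         [0.1, 0.7, 0.1]]
    print(best_two_stage_regime(pi, p, discount=0.9))
```

In this toy setting, an unfavorable response to the first treatment shifts the posterior toward the subgroup that does not respond to it, which is what makes the second-stage choice depend on the response history.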
Then, we extend the proposed model to incorporate baseline covariates. Specifically, we include certain baseline covariates in the logistic model for the probabilities of a favorable response and develop a multivariate Bernoulli model to incorporate the remaining covariates in the determination of subgroup proportions. Furthermore, time effects are considered in the model to allow for a potential overall decline in response effectiveness over time.
In each setting, simulation studies are performed to demonstrate the effectiveness of the proposed method in both parameter and DTR estimation. We also compare our approach with a competing method, Q-learning, and identify the scenarios in which our mixture model outperforms Q-learning in finding the optimal DTR.