Clinical trials are the gold standard for inferring the causal effects of treatments or interventions. This thesis develops methodologies for two problems in modern clinical trials. The first is analyzing binary repeated measures in clinical trials using models that reflect the complicated autocorrelation patterns in the data, so as to obtain high power when inferring treatment effects. The second is simulating realistic outcomes and subject nonadherence mechanisms in Phase III pharmaceutical clinical trials under the Tripartite Framework.

Bayesian Models for Binary Repeated Data: The Bayesian General Logistic Autoregressive Model and the Polya-Gamma Logistic Autoregressive Model

Autoregressive processes in generalized linear mixed effects regression models are convenient for the analysis of clinical trials that have a moderate to large number of binary repeated measurements, collected across a fixed set of structured time points, for each subject. However, much of the existing literature and methods for autoregressive processes on repeated binary measurements permit only one order and only one autoregressive process in the model. This limits the flexibility of the resulting generalized linear mixed effects regression model to fully capture the dynamics in the data, which can result in decreased power for testing treatment effects. Nested autoregressive structures enable more holistic modeling of clinical trials that can lead to increased power for testing effects.
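To make the idea of an autoregressive process on binary repeated measures concrete, the following is a minimal sketch of one common observation-driven form, in which the log-odds of the current outcome depend on a treatment indicator and on the previous p outcomes. It is an illustrative assumption and not the model specification used in this thesis: the function name, parameter names, and coefficient values are hypothetical, and the sketch contains a single autoregressive process with no nesting and no random effects.

```python
import math
import random


def simulate_subject(n_times, intercept, treatment_effect, treated, ar_coefs, rng):
    """Simulate one subject's binary series from a logistic AR(p) transition model.

    Hypothetical illustration: logit P(Y_t = 1 | past) equals
    intercept + treatment_effect * treated + sum_k ar_coefs[k-1] * Y_{t-k}.
    """
    history = []
    for t in range(n_times):
        eta = intercept + treatment_effect * treated
        # Add the contribution of each lagged outcome that is already observed.
        for lag, phi in enumerate(ar_coefs, start=1):
            if t - lag >= 0:
                eta += phi * history[t - lag]
        prob = 1.0 / (1.0 + math.exp(-eta))
        history.append(1 if rng.random() < prob else 0)
    return history


# Example: 12 structured time points, order p = 2, one treated subject.
rng = random.Random(2024)
series = simulate_subject(
    n_times=12,
    intercept=-1.0,
    treatment_effect=0.8,
    treated=1,
    ar_coefs=[1.2, 0.5],
    rng=rng,
)
print(series)
```

The length of `ar_coefs` sets the order of the process, so a model restricted to one fixed order corresponds to fixing that length in advance.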

We introduce the Bayesian General Logistic Autoregressive Model (BGLAM) for the analysis of repeated binary measures in clinical trials. The BGLAM extends previous Bayesian models for binary repeated measures by accommodating flexible and nested autoregressive processes with non-informative priors. We describe methods for selecting the order of the autoregressive process in the BGLAM based on the Deviance Information Criterion (DIC) and marginal log-likelihood, and develop an importance sampling-weighted posterior predictive p-value to test for treatment effects in BGLAM. The frequentist properties of BGLAM compared to existing likelihood- and non-likelihood-based statistical models are evaluated by means of extensive simulation studies involving different data generation mechanisms.

Two features of BGLAM that can limit its application in practice are the computational effort involved in executing it and its inability to integrate added heterogeneity across time in its autoregressive processes. We develop the Polya-Gamma Logistic Autoregressive Model (PGLAM) to address these limitations. This new model enables the integration of additional layers of variability through random effects and heterogeneity across time in nested autoregressive processes. Furthermore, PGLAM is computationally more efficient than BGLAM because it eliminates the need for the complex samplers for truncated latent variables that are involved in the Markov chain Monte Carlo algorithm for BGLAM.
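For context, the computational gain from Polya-Gamma augmentation is usually explained through the standard identity from the Polya-Gamma literature, stated below as background rather than as the derivation used in this thesis. The symbols \psi (linear predictor), a, b, \kappa, and \omega are notation introduced here for illustration.

```latex
% Standard Polya-Gamma identity, with \omega \sim \mathrm{PG}(b, 0) and \kappa = a - b/2:
\frac{\left(e^{\psi}\right)^{a}}{\left(1 + e^{\psi}\right)^{b}}
  = 2^{-b} \, e^{\kappa \psi}
    \int_{0}^{\infty} e^{-\omega \psi^{2} / 2} \, p(\omega) \, d\omega ,
\qquad
\omega \mid \psi \sim \mathrm{PG}(b, \psi).
% For a single binary outcome y with logit-scale predictor \psi: a = y, b = 1, \kappa = y - 1/2.
```

Conditional on \omega, the likelihood contribution is proportional to a Gaussian kernel in \psi, so parameters that enter \psi linearly and have Gaussian priors have Gaussian full conditionals, without truncated latent variables.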

Data Generating Model for Phase III Clinical Trials With Intercurrent Events

Although clinical trials are designed with strict controls, complications inevitably arise during the course of a trial. One significant type of complication is missing subject outcomes due to subject drop-out or nonadherence during the trial; such events are referred to in general as intercurrent events. They can arise from, among other causes, adverse reactions, lack of efficacy of the assigned treatment, administrative reasons, and excess efficacy of the assigned treatment. Intercurrent events typically confound causal inferences on the effects of the treatments under investigation because the resulting missingness corresponds to a Missing Not at Random missing data mechanism. The pharmaceutical industry is therefore increasingly focused on developing methods for obtaining valid causal inferences on the receipt of treatment in clinical trials with intercurrent events. However, it is extremely difficult to compare the frequentist properties and performance of these competing methods, because real-life clinical trial data cannot be easily accessed or shared, and because the different methods make distinct assumptions about the underlying data generating mechanism in the clinical trial. We develop a novel simulation model for clinical trials with intercurrent events. Our simulator operates under the Rubin Causal Model. We implement the simulator by means of an R Shiny application. This app enables users to control patient compliance through different sources of discontinuity with varying functional trends, and to understand the frequentist properties of treatment effect estimators obtained by different models for various estimands.
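The kind of experiment such a simulator supports can be illustrated with a deliberately small sketch. It is not the simulator described in this thesis, which is an R Shiny application; it is a Python illustration under assumed settings. Each subject has two potential outcomes, treatment is randomized, and the probability of discontinuation depends on the outcome the subject would otherwise have reported, so the missingness is Missing Not at Random whenever `dropout_slope` is nonzero. All names, the logistic discontinuation rule, and the constant treatment effect are hypothetical choices.

```python
import math
import random


def simulate_trial(n_subjects, true_effect, dropout_slope, rng):
    """Simulate a two-arm trial with outcome-dependent discontinuation.

    Hypothetical sketch: potential outcomes are y_control and
    y_control + true_effect; a subject discontinues with probability
    1 / (1 + exp(1 + dropout_slope * y)), where y is the outcome under
    the assigned arm, so lower outcomes drop out more often when the
    slope is positive.
    """
    observed = {0: [], 1: []}
    n_dropped = 0
    for _ in range(n_subjects):
        y_control = rng.gauss(0.0, 1.0)
        y_treated = y_control + true_effect
        arm = rng.randint(0, 1)  # randomized assignment
        y = y_treated if arm == 1 else y_control
        p_drop = 1.0 / (1.0 + math.exp(1.0 + dropout_slope * y))
        if rng.random() < p_drop:
            n_dropped += 1  # outcome is missing for this subject
        else:
            observed[arm].append(y)

    def mean(values):
        return sum(values) / len(values)

    return {
        "completers_estimate": mean(observed[1]) - mean(observed[0]),
        "dropout_rate": n_dropped / n_subjects,
    }


# Compare outcome-independent dropout (slope 0) with outcome-dependent dropout.
for slope in (0.0, 1.5):
    result = simulate_trial(4000, true_effect=0.5, dropout_slope=slope,
                            rng=random.Random(7))
    print(slope, result)
```

Repeating such runs over many seeds and comparing each estimator against the known `true_effect` is one way to study frequentist properties such as bias and coverage; the completers-only difference in means used here is only one naive estimator among many.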