Adjusting for confounding bias in a heterogeneous control arm in a pragmatic trial: a simulation study

What is being explored by the simulation study?

As the aim of pragmatic trials is to mimic real-life situations, the comparator or control treatment in a pragmatic trial is ideally whatever is currently used in clinical practice. However, since clinical practice rarely involves just one treatment, the ‘comparator’ arm of the trial is likely to include a number of treatments. Because the choice of treatment for patients in the control arm is also likely to be affected by factors that affect the binary outcome (confounders), this heterogeneity may introduce confounding bias. When the new treatment is compared with the control treatments as a whole, confounding is not an issue; when the new treatment must be compared with one specific treatment from the comparator arm, however, confounding is a problem. This bias complicates both the analysis and the interpretation of results from pragmatic trials. GetReal has examined different methods for adjusting for this type of confounding in pragmatic clinical trials using a simulation study.

What was examined in the simulation study?

This simulation study aimed to answer how adjusting for confounding affects the estimates of the treatment effect in three different scenarios.

  • Scenario 1: a fully-explained confounder was included
  • Scenario 2: a partially-explained confounder was included
  • Scenario 3: a fully observed confounder measured with error was included.

In the fully-explained confounder scenario, the confounder was a hypothetical risk of cardiovascular disease (CVD) affecting the control group. In the partially-explained confounder scenario, the observed confounder was the LDL level and the unobserved confounder was a pharmacokinetic drug risk. In the third scenario, the fully observed confounder was the same hypothetical risk as in the first scenario, but measured with error.
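The study's exact data-generating model is not given here, but the general setup can be sketched as follows: a confounder influences both which control treatment a patient receives and the binary outcome. All variable names and coefficients below are hypothetical illustrations, not the study's actual parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical confounder: a baseline CVD risk score
# (illustrative only; not the study's data-generating model).
cvd_risk = rng.normal(0.0, 1.0, n)

# Treatment choice within the control arm depends on the confounder:
# higher-risk patients are more likely to receive treatment 1.
p_treat = 1.0 / (1.0 + np.exp(-0.8 * cvd_risk))
treatment = rng.binomial(1, p_treat)

# The binary outcome depends on both treatment and the confounder,
# so a naive comparison of treatments would be confounded.
true_log_or = -0.5  # assumed true treatment effect (log odds ratio)
p_outcome = 1.0 / (1.0 + np.exp(-(-1.0 + true_log_or * treatment + 1.0 * cvd_risk)))
outcome = rng.binomial(1, p_outcome)
```

Because `cvd_risk` drives both treatment choice and outcome, the treated and untreated groups differ systematically at baseline, which is exactly the confounding the adjustment methods below try to remove.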

The following methods of accounting for confounding bias in the control arm were compared for each scenario in terms of their ability to address the bias (in other words, which method produced results that were closest to the actual treatment effect):

  • Multivariable logistic regression, corrected for confounders: modelling the binary outcome using treatment and the confounders as covariates
  • Propensity score method: modelling the binary outcome using treatment and the propensity score as a continuous covariate (the propensity score is the predicted probability of treatment from a model of treatment on the confounders)
  • Disease risk score: modelling the binary outcome using treatment and the disease risk score as a continuous covariate (the disease risk score is the predicted risk from a model of the binary outcome on the confounders)
  • Doubly robust inverse probability weighting: modelling the binary outcome using treatment and the confounders, with stabilised weights applied as weights rather than as a covariate (a stabilised weight is the observed marginal frequency of a patient's treatment divided by the predicted probability of that treatment from a model of treatment on the confounders)
  • Inverse probability weighting: modelling the binary outcome using treatment only, with the same stabilised weights applied as weights rather than as a covariate
  • Standardisation: fitting a model for the binary outcome on treatment and the confounders, then averaging the predicted risks over the confounder distribution under each treatment

(note: some of these methods are also used to adjust for confounding bias in non-randomised or observational studies)
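As an illustration of the inverse probability weighting bullet above, here is a minimal sketch for a single binary confounder. The function name and the stratum-frequency estimation of the propensity score are illustrative choices, not taken from the study (which used model-based estimation with continuous confounders).

```python
import numpy as np

def ipw_risk_difference(outcome, treatment, confounder):
    """Marginal risk difference via inverse probability weighting with
    stabilised weights, for a single binary confounder (illustrative)."""
    outcome = np.asarray(outcome, float)
    treatment = np.asarray(treatment)
    confounder = np.asarray(confounder)

    # Numerator of the stabilised weight: P(T = t), the observed
    # marginal frequency of each patient's own treatment.
    p_t = np.where(treatment == 1, treatment.mean(), 1 - treatment.mean())

    # Denominator: P(T = t | C), the propensity estimated within each
    # confounder stratum (with a binary confounder this is just the
    # stratum-specific treatment frequency).
    p_t_given_c = np.empty_like(p_t)
    for c in np.unique(confounder):
        mask = confounder == c
        p1 = treatment[mask].mean()
        p_t_given_c[mask] = np.where(treatment[mask] == 1, p1, 1 - p1)

    w = p_t / p_t_given_c  # stabilised weights

    # Weighted outcome means give marginal risks under each treatment.
    risk1 = np.average(outcome[treatment == 1], weights=w[treatment == 1])
    risk0 = np.average(outcome[treatment == 0], weights=w[treatment == 0])
    return risk1 - risk0
```

The weights create a pseudo-population in which treatment is independent of the confounder, so the simple weighted comparison of outcome means is no longer confounded.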

What were the findings and conclusions?

  • Scenario 1: all methods worked well and produced unbiased estimates, as expected, since the confounding was fully explained.
  • Scenario 2: all methods were biased to a similar degree; in some cases the bias was severe enough that a positive treatment effect could appear negative, and vice versa.
  • Scenario 3: all methods gave biased results, except the marginal models obtained from inverse probability weighting and standardisation.
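Standardisation, one of the two approaches whose marginal estimates remained unbiased in Scenario 3, can be sketched for a binary confounder as follows. This is an illustrative helper, not the study's code; with a binary confounder the "model" reduces to stratum-specific outcome frequencies, which are then averaged over the observed confounder distribution.

```python
import numpy as np

def standardised_risk_difference(outcome, treatment, confounder):
    """Standardisation (g-computation) for a single binary confounder:
    estimate the risk under each treatment within each confounder stratum,
    then average those risks over the observed confounder distribution.
    Each stratum must contain both treated and untreated patients."""
    outcome = np.asarray(outcome, float)
    treatment = np.asarray(treatment)
    confounder = np.asarray(confounder)

    risks = {0: 0.0, 1: 0.0}
    for c in np.unique(confounder):
        mask = confounder == c
        p_c = mask.mean()  # prevalence of this confounder stratum
        for t in (0, 1):
            # Stratum-specific risk under treatment t, weighted by
            # how common the stratum is in the whole sample.
            risks[t] += p_c * outcome[mask & (treatment == t)].mean()
    return risks[1] - risks[0]
```

Because both arms are averaged over the same confounder distribution, the comparison is marginal and free of confounding by the stratifying variable.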

What are the limitations?

  • There is a risk of believing that all the confounders have been explained when this may not be true. If there is missing data on a variable, the safest option is to use inverse probability weighting or standardisation.
  • All the scenarios are hypothetical.

Key contributor

Paraskevi Pericleous, University of Manchester