Studies that use non-randomised methods to determine who will receive different treatments (for example, by clinician preference and patient suitability) may, as a result, have systematic differences between participants in different treatment arms. When these differences, whether known or unknown, are also related to the outcome they are considered to be confounding factors. For example, if participants in one arm have more severe disease, they may respond differently to the treatment. Results from studies with confounding are less reliable and considered to be biased (this is called selection bias).
By design, well-conducted randomised studies with an adequate study size should balance both known and unknown differences between treatment arms that may influence the outcome (i.e. have a low risk of selection bias), because treatment allocation is random.
Where randomisation has not occurred, it is possible to control for some known factors and attempt to produce less biased results, for example by stratification or matching, but this is not always possible.
In studies that do not use randomisation to control for confounding, statistical methods can be used to adjust the results and provide a less biased, more accurate estimate of the treatment effect. However, research on methods to control for confounding is ongoing, and statistical methods cannot fully compensate for unmeasured confounders. The methods can broadly be categorised into those that adjust for known confounding factors and those that adjust for unknown confounding factors. The table below summarises some of the more commonly used methods.
Table. Summary of methods to adjust for either known or unknown confounding
Methods that adjust for known confounding | |
Regression adjustment using regression models (such as logistic regression on prognostic factors^{a}) | Regression models use covariates (such as prognostic factors) to predict the outcome. Separate models are fitted for the treated and untreated samples, and the treatment effect is then based on the difference between the predictions of the two models. |
Inverse probability weighting (IPW)^{a} | This method aims to make the groups more comparable by using a propensity score to ‘weight’ each observation (a propensity score is the probability of receiving treatment given a set of covariates or prognostic factors, such as patient characteristics). The inverse of the propensity score is used to calculate a weighted mean in each group. |
Doubly robust methods | This method combines regression adjustment and IPW: a regression model is estimated for the outcome, and a propensity model is estimated for the probability of receiving treatment. The resulting estimate remains consistent if either of the two models is correctly specified. |
Regression based on propensity score^{a} | This method uses the propensity score to control for correlation between treatment and covariates; it most often uses parametric regression for the outcome variable. This method may only be sufficient when there are relatively few outcomes. |
Regression based on disease risk score^{a} | This method uses the disease risk score (the predicted risk of the outcome given baseline characteristics) to control for correlation between treatment and covariates. This method may only be sufficient (and less biased) when there are relatively few outcomes (Schmidt et al 2016). |
Matching | While matching can be done at the study design stage, it can also be used as an analytical method, aiming to ‘match’ control individuals to treated patients on one or more characteristics. Matching may be done on a propensity score. |
Parametric regression on a matched sample | This approach combines regression adjustment with matching, using the regression to control for any factors not adjusted for by matching. |
Methods that adjust for unknown confounding | |
Instrumental variable methods | This is the most commonly used method for dealing with unknown confounding. The approach finds a variable (an ‘instrument’) that is correlated with the treatment but not directly with the outcome (except through the treatment). A causal treatment effect is identified by comparing outcomes across values of the instrument. |
Panel data models | This approach uses repeated observations on the same individuals over time, so that each individual acts as their own control at different time points. |
^{a} These were examined in a GetReal simulation study. |
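As a minimal sketch of the weighting idea behind IPW, the toy simulation below (all numbers and variable names are illustrative assumptions, not from the source) creates a cohort in which disease severity drives both treatment choice and outcome, then compares a naive comparison of group means with an IPW estimate that weights each subject by the inverse probability of the treatment they actually received:

```python
import random

random.seed(0)

# Toy simulated cohort: 'severe' patients are more likely to be treated,
# and severity also worsens the outcome, so severity is a confounder.
n = 20000
data = []
for _ in range(n):
    severe = random.random() < 0.5
    p_treat = 0.8 if severe else 0.2          # treatment depends on severity
    treated = random.random() < p_treat
    # true treatment effect is +1.0; severity lowers the outcome by 2.0
    outcome = 1.0 * treated - 2.0 * severe + random.gauss(0, 1)
    data.append((severe, treated, outcome))

def naive_effect(data):
    # Simple difference in group means, ignoring confounding.
    t = [y for s, a, y in data if a]
    c = [y for s, a, y in data if not a]
    return sum(t) / len(t) - sum(c) / len(c)

def ipw_effect(data):
    # Estimate the propensity score within each severity stratum
    # as the empirical treatment frequency.
    ps = {}
    for stratum in (True, False):
        treat_flags = [a for s, a, y in data if s == stratum]
        ps[stratum] = sum(treat_flags) / len(treat_flags)
    # Weight each subject by the inverse probability of the
    # treatment they actually received, then take weighted means.
    num_t = den_t = num_c = den_c = 0.0
    for s, a, y in data:
        if a:
            w = 1.0 / ps[s]
            num_t += w * y; den_t += w
        else:
            w = 1.0 / (1.0 - ps[s])
            num_c += w * y; den_c += w
    return num_t / den_t - num_c / den_c

print(round(naive_effect(data), 2))  # biased (confounded) estimate
print(round(ipw_effect(data), 2))    # close to the true effect of +1.0
```

The naive comparison is pulled away from the true effect because severe patients are over-represented in the treated arm; weighting restores balance between the strata.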
These methods can also be categorised by the purpose of the analysis: they may aim to make the groups more comparable (for example, matching, inverse probability weighting), control for the effect of the confounding factors (e.g. regression adjustment, instrumental variable methods), or use natural experiments that aim to mimic randomisation (e.g. difference-in-differences and regression discontinuity) (Faria et al 2015).
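The difference-in-differences idea mentioned above can be sketched in a few lines. This is a toy numerical illustration (all values are invented for the example): any fixed difference between the groups, and any time trend shared by both, cancel out of the estimate.

```python
# Toy difference-in-differences: two groups observed before and after
# one of them receives an intervention.
before = {"intervention": 10.0, "control": 14.0}   # baseline means
after  = {"intervention": 13.0, "control": 15.0}   # follow-up means

change_intervention = after["intervention"] - before["intervention"]  # 3.0
change_control = after["control"] - before["control"]                 # 1.0 shared trend

did_estimate = change_intervention - change_control
print(did_estimate)  # 2.0: the change net of the shared time trend
```

Note that the baseline difference between the groups (10 vs 14) never enters the estimate; only the within-group changes do.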
A NICE Decision Support Unit technical support document (Faria et al 2015) was produced ‘to help improve the quality of analysis, reporting, critical appraisal and interpretation of estimates of treatment effect from non-RCT studies’. This document summarises commonly available methods to analyse comparative individual participant data (IPD) from non-RCTs to estimate a treatment effect. The document also provides various tools, including an algorithm to help users to select the appropriate method for analysis.
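One of the simplest instrumental variable estimators, the Wald estimator, divides the instrument's effect on the outcome by its effect on treatment uptake. The toy simulation below (all variable names and numbers are illustrative assumptions, not from the source or the DSU document) uses a random instrument, such as prescriber preference, and an unmeasured confounder that an ordinary comparison of treated vs untreated cannot adjust for:

```python
import random

random.seed(1)

n = 50000
z_list, a_list, y_list = [], [], []
for _ in range(n):
    z = random.random() < 0.5          # instrument, e.g. prescriber preference
    u = random.random() < 0.5          # unmeasured confounder
    p_treat = 0.2 + 0.5 * z + 0.2 * u  # treatment depends on both
    a = random.random() < p_treat
    # true treatment effect +2.0; the confounder adds +3.0 to the outcome
    y = 2.0 * a + 3.0 * u + random.gauss(0, 1)
    z_list.append(z); a_list.append(a); y_list.append(y)

def mean_where(vals, cond):
    # Mean of vals over the subjects where cond is True.
    sel = [v for v, c in zip(vals, cond) if c]
    return sum(sel) / len(sel)

# Naive treated-vs-untreated comparison: biased by the unmeasured confounder.
naive_estimate = (mean_where(y_list, a_list)
                  - mean_where(y_list, [not a for a in a_list]))

# Wald estimator: instrument's effect on the outcome divided by
# its effect on treatment uptake.
dy = mean_where(y_list, z_list) - mean_where(y_list, [not z for z in z_list])
da = mean_where(a_list, z_list) - mean_where(a_list, [not z for z in z_list])
iv_estimate = dy / da

print(round(naive_estimate, 2))  # biased upwards by the confounder
print(round(iv_estimate, 2))     # close to the true effect of +2.0
```

The instrument is independent of the confounder, so comparing outcomes across instrument values recovers the causal effect even though the confounder is never measured.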
Many of the methods described here have been tested for adjusting for bias in pragmatic trials in GetReal simulations.
GetReal simulation on adjusting for confounding
As part of GetReal, a simulation was performed that examined different methods to adjust for confounding in post-launch settings. This simulation found that disease risk score methods may be an alternative to logistic regression when there are low event rates or low numbers of participants in the treatment arm, but even these methods are imperfect.