Assure quality and credibility of RWE

The defining feature of a randomised controlled trial (RCT), the random assignment of participants to treatment groups, ensures that the characteristics of participants are similar in the groups being compared, provided the trial is well conducted. This matters most when those characteristics also directly influence the effect of a medicine, such as the severity of the disease; such characteristics are often called confounding variables or treatment effect modifiers.

While there are methods other than randomisation that can be used to balance these factors between groups (such as matching), random allocation is particularly important because there may be characteristics that influence the treatment effect but are not known.

Although other factors may influence the internal validity of a study, including adherence to treatment protocols and the measurement of outcomes, the internal validity of well-conducted RCTs is likely to be high, providing more reliable estimates of a medicine’s effect. However, traditional RCTs are less likely to reflect the real world in the populations they include, the way interventions are administered, or other factors (i.e. they may have lower external validity).
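To make this concrete, the following simulation (purely illustrative; the trait, its prevalence and the trial size are hypothetical) shows that simple random allocation balances even a characteristic that is never measured or used in the allocation:

```python
import random

random.seed(1)

# Hypothetical unmeasured characteristic (e.g. a genetic effect modifier)
# present in about 30% of a trial population of 10,000 participants.
n = 10_000
has_trait = [random.random() < 0.3 for _ in range(n)]

# Random 1:1 allocation to treatment or control, ignoring the trait.
arm = [random.random() < 0.5 for _ in range(n)]
n_treated = sum(arm)

# Prevalence of the trait in each arm after randomisation.
trait_in_treated = sum(t for t, a in zip(has_trait, arm) if a) / n_treated
trait_in_control = sum(t for t, a in zip(has_trait, arm) if not a) / (n - n_treated)

print(f"trait prevalence, treated arm: {trait_in_treated:.3f}")
print(f"trait prevalence, control arm: {trait_in_control:.3f}")
# Both prevalences come out close to each other (and to 0.30), even though
# the allocation made no use of the trait.
```

With a sample this large the two arms end up with nearly identical prevalences; in small trials chance imbalances are larger, which is one reason sample size matters for the balancing property of randomisation.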

Studies using data collected outside RCTs (real-world data [RWD]) may have better external validity. However, the potential lack of internal validity and the potential for bias (the ‘robustness’ of the data) cause the most uncertainty when using these data as a source of evidence on relative effectiveness. More information on the potential limitations of different RWD sources and study designs for informing relative effectiveness is found here and here.

Determining whether the effectiveness estimates reported in a study are credible and can be relied on for decision-making depends on multiple aspects relating to the quality of the study. Checklists to help assess a study for quality and credibility are discussed below.

Checklists for quality assessment

One of the key concerns about the use of evidence collected outside RCTs is the quality of studies used.

In the field of evidence-based medicine, checklists are often used to assess the quality of different study designs, aiming to ensure consistency across quality assessors. A number of existing checklists focus on methodological quality, but some also incorporate broader elements such as those relevant to cost-effectiveness analyses considered by payers or health technology assessment agencies.

A NICE Decision Support Unit technical support document (Faria et al 2015) has been produced ‘to help improve the quality of analysis, reporting, critical appraisal and interpretation of estimates of treatment effect from non-RCT studies’. This document includes a review and assessment of a number of existing checklists for quality assessment of the analysis of non-randomised studies.

The table below includes a list of commonly used checklists, organised by study design, some of which were reviewed by Faria et al 2015.

Table: Commonly used quality checklists by study design

Study designᵃ and the corresponding quality checklists:

Randomised controlled trials (RCTs)
- Cochrane risk of bias tool
- CASP randomised controlled trial checklist

Non-randomised study designs (controlled cohort, controlled before-and-after studies)
In the context of cost-effectiveness analyses:
- ISPOR checklist for prospective observational studiesᵇ
- ISPOR checklist for retrospective database studiesᵇ
- Checklist for statistical methods to address selection bias in estimating incremental costs, effectiveness and cost-effectiveness (Kreif et al 2013)ᵇ
- NICE DSU QuEENS checklist (for use on its own or to complement other checklists)
In general:
- GRACE checklistᵃ
- STROBE combined checklist for cohort, case-control, and cross-sectional studiesᵇ
- ROBINS-I assessment tool

Cohort and cross-sectional studies
- STROBE checklists for cohort and cross-sectional studies
- ROBINS-I assessment tool
- CASP cohort study checklist
- Newcastle-Ottawa scale

Case-control studies
- STROBE checklist for case-control studies
- CASP case-control checklist
- Newcastle-Ottawa scale

a A difficulty in choosing the appropriate checklist is determining the classification of a study, particularly for observational studies. A checklist of design features is covered in the Cochrane handbook for systematic reviews of interventions (see tables 13.2a and 13.2b), and Box 13.4a of the handbook provides useful notes for completing the appropriate checklist. The appropriate checklist for a pragmatic trial will depend on whether or not randomisation was used as a feature of the study.
b Included in the review and assessment by Faria et al 2015.
Abbreviations: CASP, Critical Appraisal Skills Programme; GRACE, Good Research for Comparative Effectiveness; ISPOR, International Society for Pharmacoeconomics and Outcomes Research; NICE DSU, National Institute for Health and Care Excellence Decision Support Unit; QuEENS, Quality of Effectiveness Estimates from Non-randomised Studies; ROBINS-I, Risk Of Bias In Non-randomised Studies – of Interventions; STROBE, Strengthening the Reporting of Observational Studies in Epidemiology.

In addition to the checklists above, Grading of Recommendations Assessment, Development and Evaluation (GRADE) is an approach that guides users to assess the quality or certainty of evidence in terms of the directness of the evidence to the decision, the precision of the effect estimates and the heterogeneity of the results, in addition to the risk of bias. This system is used to assess evidence when determining the strength of recommendations in the context of a clinical guideline.

It may be possible to adjust or control for some of the bias in non-randomised and observational studies; for methods of controlling for confounding bias, see here.
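As a minimal illustrative sketch (the numbers, the severity variable and the use of stratification are assumptions for this example, not a method prescribed by this guidance), the following simulation shows how adjusting for a known confounder such as disease severity can recover a treatment effect that a naive comparison of treated and untreated patients distorts:

```python
import random

random.seed(0)

# Simulated observational cohort: disease severity confounds both the
# choice of treatment and the outcome.
n = 100_000
data = []
for _ in range(n):
    severe = random.random() < 0.5
    # Severe patients are more likely to receive the new medicine.
    treated = random.random() < (0.8 if severe else 0.2)
    # True treatment effect: +0.10 on recovery probability;
    # severity lowers recovery probability by 0.30.
    p_recover = 0.5 + 0.10 * treated - 0.30 * severe
    recovered = random.random() < p_recover
    data.append((severe, treated, recovered))

def mean_outcome(rows):
    return sum(r for _, _, r in rows) / len(rows)

# Naive (confounded) comparison: all treated vs all untreated.
treated_rows = [d for d in data if d[1]]
control_rows = [d for d in data if not d[1]]
naive = mean_outcome(treated_rows) - mean_outcome(control_rows)

# Stratified comparison: estimate the effect within each severity
# stratum, then average with weights equal to stratum prevalence.
adjusted = 0.0
for severe in (False, True):
    stratum = [d for d in data if d[0] == severe]
    t = [d for d in stratum if d[1]]
    c = [d for d in stratum if not d[1]]
    weight = len(stratum) / n
    adjusted += weight * (mean_outcome(t) - mean_outcome(c))

print(f"naive estimate:    {naive:+.3f}")     # pulled well below the true +0.10,
                                              # because treated patients are sicker
print(f"adjusted estimate: {adjusted:+.3f}")  # close to the true +0.10
```

Stratification only removes bias from confounders that are measured; unmeasured confounding, which randomisation handles automatically, cannot be adjusted away, which is why the residual uncertainty discussed above remains.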