Effectiveness issues – Outcome

1. Trial outcomes are not considered to be relevant measures of effectiveness

Outcomes reported in pivotal trials may not be considered measures of relative effectiveness from a health technology assessment (HTA) or reimbursement perspective. These efficacy (or safety) outcomes are selected to meet the needs of regulatory approval, but may not be optimal for some HTA agencies or health care payers. Trial outcomes may represent physiological parameters, such as tumour response, blood haemoglobin level or lung function, which are not considered to be patient-relevant. These may be considered to be surrogate endpoints (proxies) for effectiveness outcomes of relevance to HTA, but the relationship between the surrogate and ‘final’ endpoints will need to be demonstrated quantitatively. Outcomes that are clinically assessed disease activity indices may be considered to be measures of effectiveness if they are validated and are widely used. Outcomes that are composite endpoints (for example, MACE: major adverse cardiac events) may need to be disaggregated into their components for consideration in HTA.

2. There is uncertainty in reported trial outcomes

Trial results may be difficult to interpret because of large uncertainty (wide confidence intervals) in the outcome measures. Trials may not have been powered specifically to detect differences in patient-relevant endpoints such as health-related quality of life. In some healthcare systems, outcomes defined in the clinical trial as secondary or tertiary may be considered to be lower-quality evidence. Uncertainty may also be high if the rate of an outcome event observed in the trial is lower than expected. Results for subgroups of study patients of importance to the healthcare system of interest are likely to have greater uncertainty, because they rest on fewer patients and fewer events.
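The effect of a lower than expected event rate on uncertainty can be illustrated with a minimal sketch. It assumes a two-arm trial with a binary outcome and uses the standard log-scale approximation for the confidence interval of a risk ratio. The function name and all counts are hypothetical and chosen only for illustration.

```python
import math

def risk_ratio_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Risk ratio with an approximate 95% confidence interval.

    Uses the usual large-sample standard error of log(RR); this is an
    approximation and is unreliable when event counts are very small.
    """
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log_rr = math.sqrt(1 / events_trt - 1 / n_trt
                          + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical trial, 1000 patients per arm, same risk ratio in both scenarios.
# "expected": event rates assumed at the design stage (15% vs 20%).
# "observed": event rates five times lower than planned (3% vs 4%).
expected = risk_ratio_ci(150, 1000, 200, 1000)
observed = risk_ratio_ci(30, 1000, 40, 1000)

for label, (rr, lo, hi) in (("expected", expected), ("observed", observed)):
    print(f"{label}: RR = {rr:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

With these illustrative counts the point estimate is the same in both scenarios, but the interval in the low-event scenario is much wider and includes a risk ratio of 1, so the same trial size no longer excludes "no difference".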

3. There are heterogeneous results across or within trials

Trial results may be difficult to interpret because of heterogeneous, potentially inconsistent, efficacy results across the pivotal trials. This may be due to differences in the composition of the trial populations, for example differences in the underlying level of the outcome of interest, or differences in efficacy in important sub-populations.
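One common way to quantify heterogeneity across trials is Cochran's Q statistic and the derived I² statistic. The sketch below assumes each trial reports an effect estimate on an additive scale (for example, a log hazard ratio) with its standard error, and uses inverse-variance weights. The function name and the trial values are hypothetical.

```python
def heterogeneity(effects, std_errors):
    """Return (pooled effect, Cochran's Q, I-squared in percent).

    Fixed-effect inverse-variance pooling; I-squared is truncated at 0.
    """
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i_squared

# Hypothetical log hazard ratios from three pivotal trials (illustrative only).
log_hr = [-0.40, -0.10, -0.35]
se = [0.10, 0.10, 0.10]

pooled, q, i_squared = heterogeneity(log_hr, se)
print(f"pooled log HR = {pooled:.3f}, Q = {q:.2f}, I^2 = {i_squared:.0f}%")
```

I² estimates the share of the total variation in effect estimates that is attributable to between-trial differences rather than chance. With only a few trials, both Q and I² are themselves imprecise, so they should be read alongside the individual trial results and not on their own.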

Within an individual trial, efficacy results may vary in magnitude and/or direction across different efficacy outcomes, which complicates interpretation. For example, this may occur for the individual components of a composite endpoint.

4. Definition of trial outcome is inconsistent across studies or with usual practice

The definition of an outcome used in the trial may be inconsistent with that used in studies of comparator therapies, including usual care or standard of care. This may be for a number of reasons: lack of standardisation of outcomes in a disease area, variations in what is considered important for good health (and is therefore included in patient-reported outcomes instruments), or definitions of outcomes changing over time. In addition, there may be variations in the availability of study instruments: for example, a patient-reported outcomes instrument may not have been validated, or versions may not be available in all languages. Differences in outcome definitions may mean that the results of studies cannot be combined in meta-analyses.

In addition, the definition of an outcome used in the trial may differ from its definition when used in routine clinical practice. In a clinical study, patients are often assessed more intensively and more frequently, with access to a wider range of diagnostic services, which may enable outcomes to be defined and measured in a way that would not be feasible in routine clinical practice.

5. There is a high risk of biased comparisons from observational (non-randomised) data

During medicine development, results from observational (non-randomised) studies on the effectiveness of the new therapy are generally not available. Observational studies are, however, used for other purposes: to describe the natural history of disease, disease burden and treatment patterns, to describe the relationship between surrogate and final endpoints, to validate new study endpoints, or to provide information on the effectiveness of comparator therapies. Observational data may also be used to estimate the effectiveness of the new therapy, for example through predictive modelling of long-term outcomes. Healthcare decision-makers reviewing data from these studies will be particularly interested in the differences between the study population and their population of interest (generalisability of the study results) and the statistical methods used to adjust for potential bias.

In situations of conditional reimbursement, potentially within an adaptive pathway for a new medicine, observational data (for example, from registries) used to estimate the effectiveness of the new medicine may be presented and assessed at HTA reviews. Comparisons of health outcomes between therapy options which are based on non-randomised studies are particularly subject to bias. Careful study design is needed to minimise the bias. Various statistical techniques are available to adjust for observed imbalances between study groups that may affect the comparison, although these are unlikely to eliminate bias fully, because imbalances in unobserved factors remain.
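As an illustration of adjusting for an observed imbalance, the sketch below compares a crude risk difference with one standardised across strata of a single observed confounder (disease severity). Stratified standardisation is only one of the available techniques and is used here purely as an example. The registry counts, the names and the choice of confounder are all hypothetical.

```python
# Hypothetical registry counts per severity stratum (illustrative only):
# stratum -> (treated events, treated patients, control events, control patients)
registry = {
    "mild":   (10, 200, 40, 800),
    "severe": (160, 800, 50, 200),
}

def crude_risk_difference(strata):
    """Risk difference ignoring the confounder (treated minus control)."""
    treated_events = sum(te for te, _, _, _ in strata.values())
    treated_n = sum(tn for _, tn, _, _ in strata.values())
    control_events = sum(ce for _, _, ce, _ in strata.values())
    control_n = sum(cn for _, _, _, cn in strata.values())
    return treated_events / treated_n - control_events / control_n

def standardised_risk_difference(strata):
    """Stratum-specific risk differences, weighted by each stratum's
    share of the whole study population (direct standardisation)."""
    total = sum(tn + cn for _, tn, _, cn in strata.values())
    result = 0.0
    for te, tn, ce, cn in strata.values():
        weight = (tn + cn) / total
        result += weight * (te / tn - ce / cn)
    return result

crude = crude_risk_difference(registry)
adjusted = standardised_risk_difference(registry)
print(f"crude risk difference:        {crude:+.3f}")
print(f"standardised risk difference: {adjusted:+.3f}")
```

In this constructed example treated patients are mostly severe cases, so the crude comparison makes the treatment look harmful, while the comparison within severity strata points the other way. The adjustment only removes the imbalance in the factor that was recorded; any confounder missing from the registry would still bias the result.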

6. Modelling of final outcomes from trial efficacy is not robust

Evidence of relative effectiveness derived from modelling of final (long-term) outcomes from trial (efficacy) outcomes may be considered weak or unacceptable. This may be because the association between surrogate (trial) and final outcomes is weak, and therefore the latter are estimated with low confidence (wide uncertainty). The relationship may be poorly specified statistically, or not well validated, or the modelling approach itself may be inadequately described or justified. The data sources used in the modelling, especially sources other than trials, may not be considered relevant to the population of interest to the healthcare decision-maker. They may be of poor quality, with missing data or potential biases in their analysis. In some healthcare systems the use of modelling may be inadmissible or considered to deliver supportive evidence only.
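The way a weakly established surrogate relationship widens the uncertainty in a modelled final outcome can be sketched as follows. The sketch makes strong simplifying assumptions that are not taken from any specific model: the final-outcome effect is the surrogate effect multiplied by a slope, the two estimates are independent, and a normal approximation is used for the interval. The function name and all numbers are hypothetical.

```python
import math

def predicted_final_effect(surrogate_effect, surrogate_se,
                           slope, slope_se, z=1.96):
    """Predicted final-outcome effect = slope * surrogate effect.

    The variance is that of a product of two independent estimates;
    the interval uses a normal approximation.
    """
    mean = slope * surrogate_effect
    variance = (slope ** 2 * surrogate_se ** 2
                + surrogate_effect ** 2 * slope_se ** 2
                + surrogate_se ** 2 * slope_se ** 2)
    se = math.sqrt(variance)
    return mean, mean - z * se, mean + z * se

# Hypothetical trial effect on the surrogate endpoint (illustrative only).
surrogate_effect, surrogate_se = -0.50, 0.10

# Same slope estimate, but with a precise versus an imprecise relationship.
well_established = predicted_final_effect(surrogate_effect, surrogate_se,
                                          slope=0.6, slope_se=0.05)
weak = predicted_final_effect(surrogate_effect, surrogate_se,
                              slope=0.6, slope_se=0.30)

for label, (mean, lo, hi) in (("well established", well_established),
                              ("weak", weak)):
    print(f"{label}: effect = {mean:.2f}, 95% interval {lo:.2f} to {hi:.2f}")
```

With these illustrative inputs the predicted effect is identical in both cases, but the interval based on the imprecise slope includes zero. The trial result on the surrogate is unchanged; only the confidence in the link to the final outcome differs.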