Definition of trial outcome is inconsistent across studies or with usual practice

Trial results may be difficult to interpret because of different (possibly inconsistent) efficacy results across the pivotal studies. This may be due to differences in the trial populations associated with differences in efficacy across important sub-populations.

Results for an individual trial may vary in magnitude and/or direction for different outcomes, making interpretation less straightforward, for example when reviewing results for individual components of composite endpoints.

Administration of trial comparator differs from usual practice

Although the usual care (or standard of care) medicine for the healthcare system is included as a comparator in the trial, its administration in the study (for example, dose, dose titration/escalation, frequency, route of administration, monitoring) may differ from usual practice in the country of interest. This may raise concerns about the transferability of study results to (local) usual practice. In some cases the clinical background and skill level of the administering clinicians may be important.

Administration of therapy is inconsistent with usual practice

The administration of the study medicine in trials (for example, dose, dose titration/escalation, frequency, route of administration, monitoring) may differ from the schedule that is likely to be used in clinical practice. This is perhaps more likely if the new therapy is added to current usual care, or if the treatment itself may be intentionally misused by patients if its administration is not under strict control (for example, opioids in pain relief). In some cases the clinical background and skill level of the administering clinicians may be important.

Trial treatment pathway is not generalisable to usual practice

‘Treatment pathway’ refers to the sequence of previous treatments received, based on patient selection criteria and response to previous therapies, often reflected in clinical guidelines. Assessment of the position in the treatment pathway drives the definition of the treatment population and choice of appropriate comparator. This pathway may have changed during the course of the trial if a new medicine has been approved or new clinical guidelines have been implemented locally. In some cases, the introduction of the medicine of interest will itself alter the treatment pathway. Alternatively there may not yet be an established treatment pathway or sequencing, and as a result there may be variations across countries or health centres within countries.

Trial comparators do not include current usual care or standard of care

Current usual care (or standard of care) for the healthcare system of interest is not included as a comparator in the clinical trial. This may be because there is wide variation in usual care across healthcare systems, so that not all options can be included in a single study. A ‘usual care’ medicine in the country of interest may not be licensed or reimbursed, or it may not be recommended for use (for example, in clinical guidelines) in some study countries, preventing its inclusion in the trial. In some cases a placebo-controlled trial may have been required to support regulatory approval, for example to resolve safety concerns, which might preclude the inclusion of usual care as a comparator in the trial. It is possible that more than one usual care comparator is relevant for different segments of the target population, for example if a new diagnostic paradigm is involved.

Any comparison with usual care based on clinical trial data will therefore need to rely on an indirect comparison, for example a network meta-analysis based on an evidence network of results from all trials in populations with the disease of interest. Such analyses depend on statistical modelling assumptions (mostly concerning heterogeneity of the results across the source trials) as well as similarity in the design of the source trials (for example, study durations, study populations and definitions of health outcomes). Results of such meta-analyses may be associated with high levels of uncertainty. They are viewed with caution by some decision-makers because they are quite new (not yet fully ‘tried and tested’), are quite complex (loss of transparency) and are not yet widely understood.
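
The simplest form of indirect comparison can be sketched in a few lines. The example below assumes two hypothetical trials that share a common comparator B (for example, placebo) and combines their log hazard ratios in the manner of an adjusted indirect comparison. All numbers and function names are illustrative assumptions, not results from any real study, and a full network meta-analysis would involve considerably more modelling.

```python
import math

def indirect_comparison(log_hr_ab, se_ab, log_hr_cb, se_cb, z=1.96):
    """Indirect estimate of A versus C through a common comparator B.

    Inputs are log hazard ratios (A vs B, C vs B) and their standard errors.
    Assumes the two trials are similar enough for the comparison to be valid.
    """
    log_hr_ac = log_hr_ab - log_hr_cb            # effects subtract on the log scale
    se_ac = math.sqrt(se_ab ** 2 + se_cb ** 2)   # variances add, so uncertainty grows
    lower = math.exp(log_hr_ac - z * se_ac)
    upper = math.exp(log_hr_ac + z * se_ac)
    return math.exp(log_hr_ac), lower, upper, se_ac

# Hypothetical inputs: new medicine vs placebo, and usual care vs placebo.
hr, lower, upper, se_indirect = indirect_comparison(
    log_hr_ab=math.log(0.80), se_ab=0.10,
    log_hr_cb=math.log(0.80), se_cb=0.10,
)
print(f"Indirect HR {hr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the variances of the two source estimates add, the indirect estimate is always less precise than either direct estimate, which is one reason such results carry high levels of uncertainty.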

Trial participants are at a different position in treatment pathway

‘Treatment pathway’ refers to the sequence of previous treatments received, based on patient selection criteria and response to previous therapies, often reflected in clinical guidelines. The position of patients in the treatment pathway drives the choice of appropriate comparator.

The position in the treatment pathway occupied by trial participants may differ from that which they would occupy in usual practice in the healthcare system of interest. For example, in usual practice the new medicine may be considered for use (only) after the disease has progressed on two types of therapy, whereas trial participants may have experienced disease progression on just one. This assumes that the pathway has not changed during the course of the trial.

Evidence available is from single arm trials only

In some circumstances, in particular for rare diseases, accelerated approval may be sought for medicines in the absence of data from comparative randomised trials. In this situation the effectiveness of the new medicine needs to be estimated from single arm trials of the new medicine and trials or observational studies of comparator interventions.

Population-adjusted indirect comparisons (a type of standardisation) have been developed to map treatment effects observed in one population into effects that would be observed in another population. Matching-Adjusted Indirect Comparison (MAIC, based on propensity score weighting) and Simulated Treatment Comparison (STC, based on outcome regression) use individual patient data (IPD) from one study to adjust for between-study differences in the distribution of variables that influence outcome. ‘Unanchored’ comparisons are required when considering single arm studies as there is no common comparator across studies. Although these methods are superior to naïve comparisons (for example, with historical ‘controls’), they require strong assumptions about the presence of all effect modifiers and prognostic variables in the data. Their results need to be treated with some caution as an unknown amount of residual bias may remain in the statistically modelled comparisons.
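
The weighting idea behind MAIC can be illustrated with a deliberately simplified sketch. It assumes a single covariate (age), a handful of made-up IPD values and a made-up published mean for the comparator study, and it estimates method-of-moments weights with a basic Newton iteration. Real analyses balance several covariates and use dedicated statistical software; the function names here are hypothetical.

```python
import math

def maic_weights(ipd_values, target_mean, tol=1e-12, max_iter=100):
    """Weights w_i = exp(a * (x_i - target_mean)) for one covariate.

    The coefficient a is chosen so that the weighted mean of the IPD covariate
    equals target_mean. Assumes target_mean lies within the range of the IPD
    values and that the values are not all identical.
    """
    centred = [x - target_mean for x in ipd_values]
    a = 0.0
    for _ in range(max_iter):
        exps = [math.exp(a * c) for c in centred]
        grad = sum(c * e for c, e in zip(centred, exps))      # zero when means match
        hess = sum(c * c * e for c, e in zip(centred, exps))
        step = grad / hess
        a -= step                                             # Newton update
        if abs(step) < tol:
            break
    return [math.exp(a * c) for c in centred]

def effective_sample_size(weights):
    """Approximate number of independent patients the weighted sample represents."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

# Hypothetical data: ages in the single arm trial (IPD) and the mean age
# reported for the comparator study (aggregate data only).
ages = [50, 55, 60, 65, 70]
weights = maic_weights(ages, target_mean=62.0)
weighted_mean = sum(w * x for w, x in zip(weights, ages)) / sum(weights)
ess = effective_sample_size(weights)
print(f"Weighted mean age {weighted_mean:.2f}, effective sample size {ess:.2f}")
```

The effective sample size falls below the number of patients whenever the weights are unequal, so heavy reweighting reduces precision. Balancing observed covariates in this way does nothing for unmeasured effect modifiers or prognostic variables, which is the source of the residual bias noted above.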

These methods were not reviewed explicitly by the GetReal project. Further information can be found in NICE DSU Technical Support Document 18 (Phillippo, 2016) and in an article published in Value in Health (Signorovitch, 2012).

Other study design choices may limit generalisability

In multi-national studies the general level of care (concomitant therapies, access to technologies, patient support programmes) received by trial participants in some healthcare systems or study sites may differ from usual practice in the healthcare system of interest. This may have implications for the generalisability of the effectiveness results to that system. Recruitment of study participants may require a specific diagnostic activity that is currently not part of usual practice. Clinicians’ or participants’ willingness to participate or complete the trial may not be independent of factors such as adherence or underlying risk, which may be associated with effectiveness. This may be important when considering the applicability of results from studies in highly-resourced healthcare systems such as the US, with increased access to sophisticated diagnostic and monitoring services as well as high-intensity care, to local populations where such services may not be available. If the effect of the medicine itself cannot be isolated from the trial setting, then the combination of setting and intervention may need to be considered more broadly as an intervention strategy in its own right.

Trial settings and sites do not reflect usual clinical practice

In multi-national studies the general level of care (concomitant therapies, access to technologies, patient support programmes) received by trial participants in some healthcare systems or study sites may differ from usual practice in the healthcare system of interest. This may have implications for the generalisability of the effectiveness results to that system. Recruitment of study participants may require a specific diagnostic activity that is currently not part of usual practice. Clinicians’ or participants’ willingness to participate or complete the trial may not be independent of factors such as adherence or underlying risk, which may be associated with effectiveness. This may be important when considering the applicability of results from studies in highly-resourced healthcare systems such as the US, with increased access to sophisticated diagnostic and monitoring services as well as high-intensity care, to local populations where such services may not be available. If the effect of the medicine itself cannot be isolated from the trial setting, then the combination of setting and intervention may need to be considered more broadly as an intervention strategy in its own right.

There is a high risk of biased comparisons from observational (non-randomised) data

During medicine development, observational data are generally not available on effectiveness of the new therapy. They are however used for a number of purposes: to describe the natural history of disease, disease burden and treatment patterns, or the relationship between surrogate and final endpoints, or to validate new study endpoints or provide information on the effectiveness of comparator therapies. Some of this information may be used indirectly in estimating the effectiveness of the new therapy, for example through predictive modelling of long-term outcomes. Assessors will be particularly interested in the generalisability of the study population and the statistical methods used to control for bias.

In situations of conditional reimbursement, potentially within an adaptive pathway for a new medicine, observational data (for example, from registries) on the effectiveness of the new medicine may be presented and assessed at HTA reviews. Comparisons of health outcomes between therapy alternatives based on non-randomised studies are particularly subject to bias: careful study design is required to minimise the bias. A variety of analytical techniques are available to adjust for imbalances observed between study groups that may affect the comparison, although these are unlikely to fully eliminate bias.

There is uncertainty in reported trial outcomes

Trial results may be difficult to interpret because of large uncertainty (wide confidence intervals) in the outcome measures. Trials may not have been powered specifically to detect differences in patient-relevant endpoints such as health-related quality of life. In some healthcare systems, secondary and tertiary outcomes may be considered to be lower quality evidence. High levels of uncertainty may be reported if there are lower rates of the outcome event than expected. Results for subgroups of importance to the health system of interest will have greater uncertainty, and in some cases may not have been reported.
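
The effect of lower-than-expected event rates on precision can be shown with a small sketch. It uses the standard large-sample confidence interval for a risk ratio on the log scale; the event counts and arm sizes are hypothetical and chosen only so that the point estimate is the same in both scenarios.

```python
import math

def risk_ratio_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Risk ratio with an approximate 95% confidence interval (log scale)."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log_rr = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical trial with 500 participants per arm.
planned = risk_ratio_ci(60, 500, 80, 500)    # event rates as anticipated at design
observed = risk_ratio_ci(30, 500, 40, 500)   # half the anticipated events occurred

for label, (rr, lower, upper) in (("planned", planned), ("observed", observed)):
    print(f"{label}: RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Both scenarios give the same risk ratio, but the interval is wider when fewer events are observed, because the standard error is driven mainly by the number of events rather than the number of participants.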

Trial outcomes not considered to be measures of effectiveness

Outcomes reported in pivotal trials may not be considered to be measures of relative effectiveness from a health technology assessment (HTA) or reimbursement perspective. These outcomes (efficacy or safety) are selected to meet the needs of regulatory approval, but may not be optimal for some HTA agencies. A number of factors may be relevant. Trial outcomes may represent physiological parameters, such as tumour response, blood haemoglobin level or lung function, which are not considered to be patient-relevant. These may serve as surrogate endpoints (proxies) for effectiveness outcomes of relevance to HTA, but the relationship between the surrogate and ‘final’ endpoint needs to be demonstrated quantitatively. Outcomes that are clinically assessed disease activity indices may be considered measures of effectiveness if they are validated and are widely used. Outcomes that are composite endpoints (for example, MACE: major adverse cardiac events) may need to be disaggregated into their components for consideration in HTA.

Adherence in study differs from usual practice

Outcomes reported in trials are usually for study participants at a high level of adherence to the study medicine and comparators. However, in usual practice lower levels of adherence are expected, potentially with different levels of adherence for different therapy options resulting from differences in side effects or methods of administration. In the absence of real-world evidence, an understanding of the relationship between effectiveness and adherence is required to project estimates of effectiveness in usual practice (with sub-optimal adherence) from efficacy reported in trials. However, the relationship (often non-linear) may be difficult to predict. Having real-world data on adherence alone is insufficient to estimate effectiveness.

Stopping rules for therapy are unclear

The administration of the study medicine in trials (for example, dose, dose titration/escalation, frequency, route of administration, monitoring) may differ from the schedule that is likely to be used in clinical practice. This is perhaps more likely if the new therapy is added to current usual care, or if the treatment itself may be intentionally misused by patients if its administration is not under strict control (for example, opioids in pain relief). In some cases the clinical background and skill level of the administering clinicians may be important.

Trial population (usual care) differs from populations in other studies

Effectiveness results from previous or concurrent studies of comparator therapies (including usual care or standard of care) are potentially useful for planning new studies, and for inclusion in indirect comparisons or meta-analyses that include results for the new medicine of interest. However, the populations included in these studies may have different characteristics to the target population for the new medicine. Differences may be related to demographics, risk profile, place in the treatment pathway, concomitant care received or referral practices. These may be a result of secular changes in patient management since these data were reported. As a result there may be variations in effectiveness reported for usual care in such trials or non-randomised studies, which may present problems for trial design (for example, using expected event rates to calculate adequate study size) as well as for meta-analyses synthesising results from relevant trials.
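
The sensitivity of study size to the assumed usual care event rate can be sketched as follows. The example uses a textbook normal-approximation formula for comparing two proportions; the event rates, relative risk, significance level and power are all assumptions made for illustration only.

```python
import math
from statistics import NormalDist

def participants_per_arm(control_rate, relative_risk, alpha=0.05, power=0.80):
    """Approximate participants per arm for a two-sided test of two proportions."""
    treated_rate = control_rate * relative_risk
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = (control_rate * (1 - control_rate)
                + treated_rate * (1 - treated_rate))
    difference = control_rate - treated_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / difference ** 2)

# Hypothetical usual care event rates taken from two earlier studies,
# with the same assumed relative risk of 0.75 for the new medicine.
n_if_rate_is_20_percent = participants_per_arm(0.20, 0.75)
n_if_rate_is_10_percent = participants_per_arm(0.10, 0.75)
print(n_if_rate_is_20_percent, n_if_rate_is_10_percent)
```

Under these assumptions, halving the expected usual care event rate roughly doubles the required study size, so a trial planned on an event rate taken from a dissimilar population may turn out to be underpowered.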

Trial population mix differs from usual practice

Application of trial inclusion/exclusion criteria, together with other factors affecting recruitment, may mean that the trial population does not coincide with the population likely to be treated locally in usual practice. There may be differences in the distribution by age, gender, ethnicity, socio-economic factors, co-morbidities or other factors. Firstly, this is of concern if the intervention of interest has different efficacy across these factors. Secondly, even in the absence of such efficacy differences, the risk profile of the study population for the outcome of interest may differ from that of the population likely to be treated in usual practice. In this case there may be different levels of absolute effect (events averted etc.) associated with levels of trial-reported efficacy for the different populations.
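
The second point can be made concrete with a short worked example. It assumes, purely for illustration, a relative risk of 0.75 that applies equally in both populations, a baseline risk of 20% in the trial population and a baseline risk of 8% in the local population.

```python
def absolute_risk_reduction(baseline_risk, relative_risk):
    """Events averted per person treated, for a given baseline risk."""
    return baseline_risk * (1 - relative_risk)

def number_needed_to_treat(baseline_risk, relative_risk):
    """People who need to be treated to avert one event."""
    return 1 / absolute_risk_reduction(baseline_risk, relative_risk)

RELATIVE_RISK = 0.75  # hypothetical trial-reported efficacy

arr_trial = absolute_risk_reduction(0.20, RELATIVE_RISK)   # trial population
arr_local = absolute_risk_reduction(0.08, RELATIVE_RISK)   # local usual practice

print(f"Trial population: {arr_trial * 1000:.0f} events averted per 1000 treated")
print(f"Local population: {arr_local * 1000:.0f} events averted per 1000 treated")
```

With identical relative efficacy, the lower-risk local population sees fewer events averted per 1000 people treated, and correspondingly a higher number needed to treat.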

Trial participants withdraw from therapy or cross over between treatment groups

Differential withdrawal rates between study arms in a trial may complicate interpretation of the findings, especially if there is divergence between intention-to-treat and on-treatment results. If study participants cross over between treatment groups after reaching a study endpoint (for example, disease progression in cancer) the ability of the trial to report unbiased comparisons for longer-term ‘effectiveness’ outcomes (for example, overall survival) is compromised.

A trial-based indirect comparison is not possible

In a situation where there is no direct (head-to-head) trial comparison with usual care (or standard of care) for the healthcare system of interest, it may not be possible to construct a connected evidence network based on available trials to support any required indirect comparison with usual care. For example, there may be no trials comparing the new medicine and usual care with the same alternative therapy option (such as placebo). It is also possible that patient groups in a study directly comparing alternative therapies are sufficiently different to preclude the study being pooled with other studies as part of a network meta-analysis.
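
Whether an evidence network is connected can be checked mechanically. The sketch below treats each trial as a link between the two treatments it compares and searches for a path from the new medicine to usual care. The treatment names and trial lists are hypothetical, and the check says nothing about whether the linked trials are similar enough to be pooled.

```python
from collections import defaultdict, deque

def is_connected(trials, start, target):
    """Return True if a chain of trials links start to target.

    trials is a list of (treatment, treatment) pairs, one pair per
    head-to-head comparison available in the evidence base.
    """
    network = defaultdict(set)
    for first, second in trials:
        network[first].add(second)
        network[second].add(first)

    seen = {start}
    queue = deque([start])
    while queue:                      # breadth-first search over treatments
        treatment = queue.popleft()
        if treatment == target:
            return True
        for neighbour in network[treatment]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False

# Both treatments were compared with placebo, so an indirect comparison is possible.
linked = is_connected(
    [("new medicine", "placebo"), ("usual care", "placebo")],
    "new medicine", "usual care",
)

# No shared comparator, so the network is disconnected.
unlinked = is_connected(
    [("new medicine", "placebo"), ("usual care", "older medicine")],
    "new medicine", "usual care",
)
print(linked, unlinked)
```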

Modelling of final outcomes from trial efficacy is not robust

Evidence of relative effectiveness derived from modelling final outcomes (effectiveness) from trial outcomes (efficacy) may be considered weak or unacceptable. This may be because the association between surrogate (trial) and final (model) outcomes is weak, and therefore estimated with low confidence (wide uncertainty). More technically, the relationship may be poorly specified statistically, or not well validated. The modelling approach itself may be inadequately described or justified. The data sources (especially sources other than trials) may not be considered relevant to the population under consideration, or may be of poor quality (missing data, potential biases in their analysis). In some healthcare systems the use of modelling per se may be inadmissible or considered to be (only) supportive evidence.
