Introduction
Developments in the medical and pharmaceutical sectors have transformed treatment for patients with rare sub-indications of disease areas that were not traditionally considered rare, for example, anaplastic lymphoma kinase (ALK) fusion-positive nonsmall cell lung cancer (Reference Frampton, Fichtenholtz and Otto1–Reference Ramagopalan, Leahy and Ray3). The increasing personalization of treatment for patients with these rare sub-indications, combined with high unmet need and sometimes no established standard of care, can make the conduct of randomized controlled trials (RCTs) challenging. Even when RCTs can be conducted, the standard of care used may not be relevant for all settings and thus nonrandomized comparisons may still be required. This has led to an increasing requirement to consider single-arm trials and nonrandomized treatment effects in health technology assessment (HTA) (Reference Patel, Grimson and Mihaylova4). As a result, external controls (ECs) have been used to form a comparator cohort to trial arms to analyze treatment effectiveness, a common nonrandomized study (NRS) design (Reference Patel, Grimson and Mihaylova4). The EC can be formed from any data external to the trial population; this could be from prior clinical trials or routinely collected health care data (i.e., real-world data (RWD)) (Reference Concato and Corrigan-Curay5). Prior clinical trial data is often preferred by HTA bodies due to the quality and consistency of the data collection process and the availability of key confounders and effect modifiers. However, in rare conditions, there may be little to no historical trial data available and therefore RWD may be used to form the EC.
Several approaches can be used to mitigate the impact of bias on a nonrandomized comparison, such as target trial emulation, matching-adjusted indirect comparison, simulated treatment comparison, and propensity score methods such as inverse probability of treatment weighting. Even so, the use of an EC, and correspondingly the nonrandomized treatment effect, often still results in a substantial risk of residual bias, and treatment effects from NRS have received criticism due to such biases (6–Reference Wieseler, Neyt, Kaiser, Hulstaert and Windeler10). This poses significant challenges for decision-makers in interpreting the results of such comparisons. Quantitative bias analysis (QBA) methods are a range of methods that can be used to understand the potential impact of a range of biases, including unmeasured confounding, misclassification, and selection bias, based on the results from an NRS. These methods attempt to quantitatively estimate the direction, magnitude, and uncertainty associated with biases that can affect measures of association (Reference Lash, Fox, Cooney, Lu and Forshee11). The methods, which have been described in detail elsewhere, generally come in two forms. The first form uses external sources of information to adjust the results to estimate what would have been expected had the biases been considered in the primary analysis. The second form uses existing data to develop a threshold that an unmeasured confounder or other bias must exceed to explain away an effect estimate (Reference Lash, Fox, Cooney, Lu and Forshee11–Reference Gray, Grimson, Layton, Pocock and Kim14). Examples of QBA methods for unmeasured confounding include, but are not limited to, derived bias formulae approaches such as the E-value (Reference VanderWeele and Ding15), Rosenbaum-type approaches (Reference Rosenbaum16), and simulation-based approaches (Reference Groenwold, Sterne and Lawlor17).
Although QBA represents a potentially powerful tool to support decision-making under uncertainty, there are few examples of its use in the HTA setting. Recent work has highlighted a number of key considerations requiring discussion and alignment within the HTA community to support the greater use and determine best practices for including QBA in HTA (Reference Leahy, Kent and Sammon12;Reference Sammon, Leahy, Gsteiger and Ramagopalan13). In an initial effort to address this need, a workshop was conducted with experts in HTA policy and science to elicit opinions on the use of QBA generally and specifically for unmeasured confounding in the HTA process. This manuscript describes the findings of the workshop.
Methods
A workshop was conducted with a panel of six experts who formed an advisory group to elicit opinions on the use of QBA in the HTA setting. All experts co-authored this manuscript and the list of experts is included in the Supplementary materials. The advisory group members were selected based on their knowledge and experience in HTA and their ability to represent experts and stakeholders of the HTA process. Experts were identified by reviewing the authorship of relevant literature and searching HTA agencies for appropriate contacts while maintaining diversity with respect to clinical, health-economic, and HTA agency stakeholders, as well as geographical diversity. The advisory group consisted of experts from a variety of backgrounds including members from HTA agencies, health economists, and medical practitioners. They provided expertise in HTA policy and science across a range of markets including the United Kingdom (UK), Germany (DE), France (FR), Italy (IT), Spain (ES), and Canada (CA). The workshop comprised a pre-read that was sent to all participants that introduced the setting and background context for the workshop, provided a brief review of literature on QBA, introduced QBA methods, and provided examples of applications focusing on NRS. Participants were offered the opportunity for a one-to-one pre-meeting discussion of the pre-read materials.
A 2.5-h virtual workshop was then conducted with all participants where several topics relating to the use of QBA in clinical studies used to inform HTA were discussed, and the opinions of participants were elicited. The workshop was structured as follows: an initial overview of the area, a discussion on each individual topic where each expert gave their opinion which was recorded, followed by a summary of what was discussed, and any other topics raised by experts. Following the workshop, a survey was sent to all participants of the expert advisory group to obtain specific feedback regarding the discussion in the workshop (see Supplementary materials).
The survey consisted of several statements and recommendations whereby members of the expert advisory group either agreed or disagreed and provided supporting comments and feedback. The recommendations below are the consensus of all experts.
Recommendations
Table 1 summarizes the 10 overarching recommendations from the workshop. All participants reached a consensus with minimal comments. The subsequent sections of this paper expand on each of these recommendations.
Recommendation 1: QBA is recommended as a potentially useful tool when conducting NRS; approaches such as the E-value are useful when considering unknown unknowns.
RCTs and systematic reviews of RCTs, potentially with meta-analyses, will always be the preferable source of evidence on treatment effectiveness for HTA (18–Reference Akobeng20). When an NRS is conducted, the rationale for not being able to conduct an RCT should therefore be clearly justified (Reference Kent, Salcher-Konrad and Boccia21).
When an NRS is conducted, for example, between a single-arm trial and an EC, the expert advisory group emphasized a strong preference to address all important biases through the use of high-quality, fit-for-purpose data sources and the application of appropriate approaches to study design and analysis. There may be an acceptable justification for not having addressed important biases in the main analysis of an NRS. For example, a confounder may not have been captured in any existing data source and the prospective collection of data may not have been feasible due to time constraints. In addition, when a potential confounder is challenging to measure accurately, for example, a concept like frailty, potential residual bias may present a justifiable rationale for using QBA.
In cases such as these, QBA may be a useful tool to support decision-makers in determining the potential impact of systematic error on treatment effect estimates from NRS.
Even when all anticipated sources of bias have been addressed adequately in the main analysis of an NRS, there may be concerns about residual bias from unforeseen sources such as unknown confounders or “unknown unknowns.” The expert advisory group also highlighted that QBA could play a role in such situations.
Recommendation 2: The manufacturer (developer) should have the primary responsibility for conducting QBA.
The expert advisory group agreed that primary responsibility for designing and conducting a QBA should fall to the manufacturer of the technology or technologies under assessment. This is consistent with the responsibility for all other evidence-generation activities related to HTA typically lying with the manufacturer (Reference Leahy, Kent and Sammon12). Although the manufacturer has a vested interest in the analysis drawing conclusions that support the value of their product, the risk of investigator bias can be mitigated by enforcing best practices such as a detailed specification and ideally publication of the planned QBA analyses in advance of execution.
Ahead of the finalization of the study protocol, early engagement with HTA bodies to align on the nature of the QBA is recommended where such processes allow. This engagement is key and should involve alignment between the manufacturers and HTA bodies regarding the QBA methods and parameters planned for analysis. The Joint Scientific Consultations between manufacturers and HTA bodies covered by the EU HTA regulation could be a potential forum for this engagement for European markets (22). All code and data used to inform and conduct QBA analyses should be shared with the respective HTA bodies to enhance transparency and enable replication.
Recommendation 3: A variety of sources should be used to identify potential biases such as validation studies and expert elicitation.
A key component of the overall planning and design of any study should be to identify potential biases, including but not limited to residual/unmeasured confounding, selection bias, and misclassification. For comparative effectiveness studies for HTA purposes, a transparent and thorough approach should be used to identify the most important potential biases. A combination of validation studies, systematic reviews, expert elicitation, knowledge of a data source including completeness and accuracy, and directed acyclic graphs (DAGs) could be used to support the identification of relevant confounders and conceptualization of the various sources of bias that can potentially impact a study (Reference Lash, Fox, Cooney, Lu and Forshee11). The primary approach to address all biases should be applied in the main analysis, by using fit-for-purpose data sources and best practices in the study design and analysis. The biases that cannot be addressed using these approaches should be considered for investigation in a QBA. It is important to be pragmatic and choose the most relevant biases, that is, those expected to have the greatest magnitude, to assess with QBA, rather than trying to address many potential biases, which would increase the complexity of the analysis and results and potentially reduce transparency (Reference Lash, Fox, Cooney, Lu and Forshee11).
As previously outlined, the use of QBA to contextualize the potential impact of bias from unforeseen sources, in addition to those identified using the above steps, should always be considered.
Recommendation 4: Consider the use of estimate-adjustment and threshold-based QBA approaches in different scenarios.
After the main analyses have accounted for biases through appropriate study design, the QBA approaches that can be applied can generally be divided into two groups: estimate-adjustment and threshold-based approaches. The estimate-adjustment QBA approaches adjust the treatment effect estimate and corresponding confidence interval to estimate those that would have been observed had the bias of interest been accounted for, whereas the threshold-based approaches provide a measure of the bias that must be present to meet a specific threshold, such as to nullify the treatment effect (Reference Leahy, Kent and Sammon12;Reference VanderWeele and Ding15). Both the estimate-adjustment and threshold-based QBA approaches have a potential ancillary role in HTA.
Estimate-adjustment QBA should be preferred to address the “known unknowns,” that is, where a manufacturer chooses to proceed with an NRS with the knowledge that there will likely be residual biases introduced through known mechanisms, for example, a known unmeasured confounder. HTA bodies may reasonably request that manufacturers carry out the work to design and parameterize a detailed, potentially complex estimate-adjustment QBA to explore these biases. Although the need for substantial and accurate data is a challenge in using these methods, this should not be seen as a rationale for not seeking to source such data in a systematic manner (Reference Leahy, Kent and Sammon12). The advantage and value of the estimate-adjustment approaches is that they produce an adjusted estimate that allows HTA bodies to understand the impact of the bias on the treatment effect estimate and incorporate it into quantitative sensitivity analyses.
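As an illustration of what an estimate-adjustment QBA can look like in its simplest form, the sketch below applies the classical external-adjustment bias formula for a single binary unmeasured confounder to a risk ratio. It is a minimal sketch and not a method endorsed by the workshop: the function names, the example inputs (an observed risk ratio of 0.70 with a 95 percent confidence interval of 0.55 to 0.89, a confounder-outcome risk ratio of 2.0, and confounder prevalences of 0.2 and 0.4) are all hypothetical, and the formula assumes that the confounder-outcome association is the same in both arms and that the bias parameters are known without error.

```python
def bias_factor(rr_cd: float, p_treated: float, p_control: float) -> float:
    """Multiplicative confounding bias for one binary unmeasured confounder.

    rr_cd     -- risk ratio linking the confounder to the outcome
                 (assumed identical in the treated and control arms)
    p_treated -- prevalence of the confounder in the treated arm
    p_control -- prevalence of the confounder in the external control
    """
    if rr_cd <= 0:
        raise ValueError("rr_cd must be positive")
    for p in (p_treated, p_control):
        if not 0.0 <= p <= 1.0:
            raise ValueError("prevalences must lie in [0, 1]")
    return (rr_cd * p_treated + (1 - p_treated)) / (rr_cd * p_control + (1 - p_control))


def adjust_estimate(rr_obs, ci_low, ci_high, rr_cd, p_treated, p_control):
    """Divide the observed risk ratio and its confidence limits by the bias factor.

    Treating the bias parameters as fixed ignores their uncertainty, so the
    adjusted interval reflects random error only.
    """
    b = bias_factor(rr_cd, p_treated, p_control)
    return rr_obs / b, ci_low / b, ci_high / b


# Hypothetical inputs, for illustration only.
adjusted_rr, adjusted_low, adjusted_high = adjust_estimate(
    rr_obs=0.70, ci_low=0.55, ci_high=0.89,
    rr_cd=2.0, p_treated=0.2, p_control=0.4,
)
print(f"Adjusted RR {adjusted_rr:.2f} ({adjusted_low:.2f} to {adjusted_high:.2f})")
```

With these hypothetical inputs, the confounder is less common in the treated arm, so the observed effect is biased away from the null and the adjusted estimate moves toward 1.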
Threshold-based QBA may be more suitable for addressing the “unknown unknowns” (Reference Leahy, Kent and Sammon12;Reference VanderWeele and Ding15;Reference Schneeweiss23). These approaches would be a means to provide the decision-maker with an impression of the robustness of the results to additional unforeseen sources of bias. In countries/jurisdictions using cost-effectiveness for reimbursement decisions, the threshold could be set against the outputs of the cost-effectiveness decision model reaching a certain threshold (e.g., incremental cost-effectiveness ratio (ICER) < £30,000 per QALY).
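For readers less familiar with threshold-based approaches, the following minimal sketch computes the E-value for a risk ratio using the published formula of VanderWeele and Ding, E = RR + sqrt(RR × (RR − 1)), applied to the inverse of the risk ratio when the estimate is below 1. The function names and the example estimate (0.70, 95 percent confidence interval 0.55 to 0.89) are hypothetical and chosen for illustration; applying the formula to other effect measures, such as hazard ratios, requires additional approximations that are not shown here.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a risk ratio (VanderWeele and Ding formula)."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr  # work on the scale where the effect is above 1
    return rr + math.sqrt(rr * (rr - 1))


def e_value_for_ci(rr: float, ci_low: float, ci_high: float) -> float:
    """E-value for the confidence limit closest to the null.

    Returns 1.0 when the interval already includes the null value of 1.
    """
    if ci_low <= 1 <= ci_high:
        return 1.0
    limit = ci_low if rr > 1 else ci_high
    return e_value(limit)


# Hypothetical estimate, for illustration only.
print(f"E-value for the point estimate: {e_value(0.70):.2f}")
print(f"E-value for the confidence limit: {e_value_for_ci(0.70, 0.55, 0.89):.2f}")
```

The E-value is read as the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need to have with both treatment and outcome to fully explain away the observed estimate.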
Recommendation 5: Use internal and external data as well as expert elicitation to acquire information on QBA inputs.
In order for QBA to be well accepted in HTA, the evidence used to inform the analysis must be collected in a transparent manner and have sufficiently high accuracy. As such, internal validation datasets and systematic reviews of external data are likely to be the best accepted sources of these data (Figure 1). The use of internal data from a subset of the dataset used for the main analysis may be an acceptable option if that sample has been randomly selected. If using external data, the use of a systematic literature review (SLR) to identify external data sources is the preferred approach, as this is the common standard used across many HTA bodies. An SLR also helps ensure evidence has not been “cherry picked” to support a preferred narrative and therefore helps to guard against investigator bias. However, there are challenges in conducting an SLR in this context, as the relevant information may not always be the main focus of a paper and is more likely to be incidentally reported in a study. Individual external data sources not identified via an SLR are likely to be less well accepted, or not accepted at all. As described in Recommendation 3, the same sources that were used to identify potential sources of bias could also be used to acquire information on the QBA inputs.
Expert elicitation approaches may have a place in supporting QBA but primarily as a means to validate or “plausibility-check” inputs obtained from other sources, rather than providing the primary source of evidence supporting an input. They may be acceptable to inform inputs that are challenging to source from external sources (e.g., prevalence of a confounder in treated/untreated groups) or in very rare conditions where limited data is available.
Regardless of the source used, input parameters should be identified through a systematic and transparent approach and specified a priori in a protocol or analysis plan.
Recommendation 6: Sensitivity analyses should be conducted to account for uncertainty in QBA inputs.
It is important to reflect uncertainty in the QBA inputs, and the approach used to reflect uncertainty should be transparent and understandable to the relevant HTA stakeholders. One recommended approach is to use multiple plausible QBA input parameters to test the sensitivity of the results to these parameters, potentially including a “worst case” scenario. This can be conducted by running the QBA across a range of possible input values (i.e., deterministically) or by defining the inputs using probability distributions derived from available data or by eliciting expert opinion (Reference Leahy, Duffield and Kent24). In addition, the results of the QBA should be suitably reported, including the range or distribution of the adjusted or threshold outcomes.
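The deterministic and probabilistic options described above can be sketched as follows for a single binary unmeasured confounder and a risk ratio. This is an illustrative sketch under stated assumptions rather than a recommended specification: the function names, the grid values, the lognormal and beta input distributions, and the standard error of 0.12 on the log scale are all hypothetical, and in practice the inputs would be sourced and pre-specified as described in Recommendation 5.

```python
import math
import random
import statistics


def bias_factor(rr_cd, p_treated, p_control):
    """Confounding bias factor for one binary unmeasured confounder."""
    return (rr_cd * p_treated + (1 - p_treated)) / (rr_cd * p_control + (1 - p_control))


def deterministic_qba(rr_obs, rr_cd_values, prevalence_pairs):
    """Adjusted risk ratio for every combination of plausible input values."""
    return [
        (rr_cd, p_t, p_c, rr_obs / bias_factor(rr_cd, p_t, p_c))
        for rr_cd in rr_cd_values
        for (p_t, p_c) in prevalence_pairs
    ]


def probabilistic_qba(rr_obs, se_log_rr, n_draws=10_000, seed=2023):
    """Monte Carlo QBA: sample the bias parameters and the random error."""
    rng = random.Random(seed)
    adjusted = []
    for _ in range(n_draws):
        # Hypothetical input distributions, for illustration only.
        rr_cd = math.exp(rng.gauss(math.log(2.0), 0.2))
        p_treated = rng.betavariate(20, 80)
        p_control = rng.betavariate(40, 60)
        # Random error around the observed estimate on the log scale.
        rr_draw = math.exp(rng.gauss(math.log(rr_obs), se_log_rr))
        adjusted.append(rr_draw / bias_factor(rr_cd, p_treated, p_control))
    cuts = statistics.quantiles(adjusted, n=40)  # cut points every 2.5 percent
    return {
        "median": statistics.median(adjusted),
        "lower": cuts[0],    # 2.5th percentile
        "upper": cuts[-1],   # 97.5th percentile
    }


grid = deterministic_qba(0.70, [1.5, 2.0, 3.0], [(0.2, 0.4), (0.1, 0.5)])
worst_case = max(row[3] for row in grid)  # adjusted estimate closest to or beyond the null
summary = probabilistic_qba(rr_obs=0.70, se_log_rr=0.12)
print(f"Worst-case adjusted RR: {worst_case:.2f}")
print(f"Median {summary['median']:.2f} ({summary['lower']:.2f} to {summary['upper']:.2f})")
```

The deterministic grid supports reporting of a range, including a worst case, whereas the probabilistic version yields a distribution of adjusted estimates that reflects uncertainty in both the bias parameters and the original estimate.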
Recommendation 7: QBA should be applied to outcomes relevant to HTA.
The findings of a QBA should ideally be reflected in the outputs that are most relevant to HTA decision-makers; this may include outputs relevant to clinical effectiveness and/or cost-effectiveness and/or budget impact. That is, although the QBA itself is carried out on the clinical effectiveness results, it is possible to design and execute a QBA in such a way that its impact can be reflected in subsequent results that the clinical effectiveness data feed into (Reference Leahy, Duffield and Kent24). For example, threshold-based QBA may be designed to focus on the amount of bias required to move the ICER such that the technology is no longer cost-effective. QBA could be applied to other relevant cost-effectiveness outcomes, for example, the net monetary benefit. As such, QBA should be designed such that the impact of residual bias can be seen on all results relevant to the decision problem. Further, the expert advisory group also communicated that QBA may have a potential role in price negotiations by relating the output range of the QBA results to a range of prices.
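One way a threshold-based QBA could be expressed on a cost-effectiveness outcome is sketched below: a simple bisection search for the multiplicative bias on the risk ratio at which the net monetary benefit reaches zero. Everything in the sketch is hypothetical, including the function names, the toy model that maps a risk ratio to incremental quality-adjusted life years (QALYs), the incremental cost of £10,000, and the use of a £30,000 per QALY willingness-to-pay value. A real analysis would pass the bias-adjusted effect through the full decision model.

```python
from typing import Callable, Optional


def toy_nmb(rr: float, wtp: float = 30_000.0) -> float:
    """Hypothetical stand-in for a decision model: net monetary benefit given a risk ratio."""
    delta_qalys = 1.5 * (1 - rr)   # assumed linear link between effect and QALY gain
    delta_costs = 10_000.0         # assumed fixed incremental cost
    return wtp * delta_qalys - delta_costs


def threshold_bias_factor(
    rr_obs: float,
    nmb_from_rr: Callable[[float], float],
    max_bias: float = 10.0,
    tol: float = 1e-6,
) -> Optional[float]:
    """Smallest bias factor B >= 1 such that NMB(rr_obs * B) is no longer positive.

    Assumes a protective observed effect (rr_obs < 1) and a net monetary
    benefit that decreases as the risk ratio increases. Returns 1.0 if the
    benefit is already non-positive and None if it stays positive up to max_bias.
    """
    if nmb_from_rr(rr_obs) <= 0:
        return 1.0
    if nmb_from_rr(rr_obs * max_bias) > 0:
        return None
    low, high = 1.0, max_bias
    while high - low > tol:
        mid = (low + high) / 2
        if nmb_from_rr(rr_obs * mid) > 0:
            low = mid
        else:
            high = mid
    return high


b = threshold_bias_factor(0.70, toy_nmb)
print(f"Bias factor needed to remove cost-effectiveness: {b:.3f}")
```

The resulting bias factor can then be compared with the bias considered plausible for the study, in the same spirit as a threshold set against the treatment effect itself.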
Recommendation 8: QBA results and associated assumptions should be presented.
The expert advisory group emphasized that results from QBA should be clearly and transparently reported alongside all assumptions of the methods used. For estimate-adjustment approaches, the use of forest plots is recommended to present results as these are generally well understood. In addition, a plot of the distribution of adjusted estimates could be considered when multiple assumptions are used for the QBA parametrization. The expert advisory group also discussed the potential to compound the results of QBA for the adjustment of different biases, that is, to implement multiple adjustments cumulatively; however, caution must be used due to the complexity of this approach. An example of a forest plot was developed to show consecutive adjustments for multiple QBA and is given in Figure 2.
Recommendation 9: QBA should generally be considered as a sensitivity analysis.
Experts provided their opinions regarding the current perceived receptiveness to QBA by HTA bodies. The expert advisory group agreed that QBA would form supportive analyses and, for estimate-adjustment methods, would be unlikely to be considered as part of the main analysis. However, it was highlighted that when making decisions under uncertainty, it is better to have some understanding of the impact of potential biases than none. It was also emphasized that early engagement with the respective HTA bodies would aid in the understanding and interpretation of results produced from QBA and potentially lead to greater acceptability of these methods.
Recommendation 10: Further work should be conducted to support the use of QBA in HTA.
Activities should be pursued to increase awareness and understanding of QBA methods in the HTA community, further develop consensus around their use in this setting, and potentially increase the prevalence of their use.
In order to support understanding, and therefore uptake, of these methods it will be important to be able to determine where QBA sits relative to the existing/traditional approaches to design and analysis of NRS, and what it adds beyond these approaches. The development of a clear, relatively concise visual framework that clarifies the role and/or place of QBA may be a useful resource in supporting the greater use of QBA in the HTA community to ultimately improve the HTA assessment process.
Engagement with existing scientific networks/communities of HTA stakeholders (e.g., EUnetHTA, HTAi, ISPOR, and INAHTA) would facilitate the consideration and potential use of QBA to improve the quality of evidence for HTA. Demonstration projects to show where QBA may have helped in real-world HTA settings will also raise awareness and potentially support greater use of QBA methods. Finally, the development of methods to communicate QBA results clearly and transparently, and on HTA-relevant outcomes, would be beneficial.
Discussion
The outcomes of the workshop conducted with experts in HTA policy and science provided several recommendations on the use of QBA, with a focus on unmeasured confounding. These recommendations can aid manufacturers and HTA representatives in the use of QBA in the absence of specific HTA guidelines.
An important consideration of the use of QBA methods is the barriers to their application in practice in the HTA setting. As already mentioned, the lack of guidance on the application of these methods is a potential barrier to their use. This workshop and resulting recommendations aimed to take an initial step at bridging this gap. Another barrier may be the lack of awareness of such methods, although there has been growing interest in this area (Reference Leahy, Kent and Sammon12;Reference Sammon, Leahy, Gsteiger and Ramagopalan13;Reference Wilkinson, Gupta and Scheuer25;26). As there have been limited applications of QBA in HTA, there may be a lack of credibility in such methods to provide meaningful evidence. One approach to overcome this may be further research showing their use, and the resulting increased transparency that they may provide to decision-makers when interpreting the uncertainty of results.
As discussed in Recommendation 10, it is important to understand where QBA sits relative to a more traditional, design-based approach to mitigating bias. Generally, as discussed in Recommendation 1, we believe there should be a strong preference to address all important biases through the use of appropriate approaches to study design and analysis and that QBA can be applied as a supportive tool for sensitivity analyses (Recommendation 9) when such study design and analysis approaches cannot account for important biases.
Beyond the consideration of the use of QBA in the HTA setting, it is also worthwhile considering their use in the regulatory setting (Reference Lash, Fox, Cooney, Lu and Forshee11). Although this was not discussed as part of the workshop and no recommendations are made for the regulatory context, some of the recommendations made may be relevant. An aligned approach among decision-makers in both the HTA and regulatory setting would allow for more consistent use of QBA methods across market access activities.
Although the expert panel varied in terms of geography and area of expertise, not all countries and HTA agencies were represented. Although we believe these recommendations can generally be applied across HTA agencies and countries, further work is required to assess their generalizability to other countries. Further, the recommendations were agreed upon via informal methods: a workshop, a post-workshop survey, and revisions of the recommendations text. Although these methods are similar to formal approaches to achieving consensus, such as a Delphi process, formal approaches were not used due to time constraints.
Conclusion
QBA represents a potentially powerful tool to support HTA decision-makers in using the results of NRS. However, there is a lack of guidance on key aspects of QBA relevant to HTA. Our findings from a workshop with experts in HTA science and policy provide those individuals conceptualizing and assessing NRS with an initial set of guiding principles for the conduct of QBA for HTA, based on the views of a geographically and methodologically diverse set of stakeholders with expertise in HTA.
Further work to engage more broadly with HTA agencies and HTA policy experts is recommended to highlight the research priorities produced by the workshop. Introducing the use of QBA in the HTA community holds promise for HTA decision-makers to make decisions under uncertainty with more confidence, thereby ultimately helping to support patient access to vital medicines.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0266462323002702.
Acknowledgments
The authors would like to acknowledge the contributions of Sreeram Ramagopalan and Cormac Sammon for facilitating the workshop and manuscript development.
Funding statement
The workshop was funded by F. Hoffmann-La Roche.
Competing interest
I.D-Z., L.S-C., Y.Z., D.C., and G.C. received consulting fees for participation in the workshop. T.P.L. is an employee of PHMR Ltd. S.K. has no conflict of interest.