Trends in use of composite endpoints in clinical trials: A comparison between acute heart failure trials and COVID-19 trials

Lan Shi; Christopher John Lindsell; Dandan Liu

doi:10.1017/cts.2024.492

Published online by Cambridge University Press: 08 March 2024

Lan Shi: Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
Christopher John Lindsell: Department of Biostatistics & Bioinformatics, Duke University, Durham, NC, USA
Dandan Liu*: Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
*Corresponding author: D. Liu; Email: [email protected]

Abstract

Composite endpoints can encode multiple pieces of information and are increasingly adopted in clinical trials. Advocacy for using composite endpoints began decades ago in cardiovascular trials, leading to incorporation of patient-oriented outcomes and consideration of a hierarchical ranking system. The use of composite endpoints in coronavirus disease (COVID-19) trials has evolved similarly. We conducted a literature review to investigate the use of composite endpoints in acute heart failure and COVID-19 clinical trials. The results showed more frequent use of patient-oriented outcomes and ordinal composite endpoints in COVID-19 trials, which might be driven by global consensus on a set of common outcome measures.

Keywords

Acute heart failure; composite endpoint; clinical trial; COVID-19; patient-reported outcome

Type: Brief Report
Information: Journal of Clinical and Translational Science, Volume 8, Issue 1, 2024, e55

DOI: https://doi.org/10.1017/cts.2024.492
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use and/or adaptation of the article.
Copyright: © The Author(s), 2024. Published by Cambridge University Press on behalf of Association for Clinical and Translational Science

Introduction

There rarely exists a single measure that encodes sufficient information for efficient clinical and statistical evaluation of efficacy in clinical trials [1]. By encoding multiple pieces of information into single variables, composite endpoints are appealing as they minimize sample size needs, shorten follow-up durations, and cut costs [1–3]. Composite endpoints are increasingly adopted in clinical trials, especially in diseases with complex presentations like acute heart failure (AHF) or coronavirus disease (COVID-19).

The evolution of composite endpoints has been driven mainly by the need to increase the ability of clinical trials to detect treatment effects across the full spectrum of disease. Decades ago, composite endpoints were developed to better inform the impact of innovative clinical therapies in randomized clinical trials of AHF [4,5]. As many as half of cardiovascular trials have adopted a composite endpoint inclusive of morbidity and mortality [2]. Advances in cardiovascular medicine have led to a decline in morbidity and mortality and to the subsequent inclusion of nonfatal events like hospitalization in defining time-to-first-event primary composite endpoints [6]. These endpoints are commonly analyzed under the survival analysis framework [7–9]. Meanwhile, a hierarchical clinical composite evaluated at a fixed time point was developed as an ordinal outcome in chronic heart failure trials [10] and later adapted to AHF trials by incorporating the occurrence of worsening clinical events beyond a fixed time point to fully evaluate the clinical course of patients [11]. Recent FDA guidance clarifies that therapies for treating heart failure can be approved based on their effect on symptoms or physical function, even if they fail to show a favorable effect on survival or hospitalization risk [12]. This further shifts the emphasis from clinical endpoints to more patient-oriented outcomes in AHF clinical trials [5]. Such a shift naturally motivates advanced statistical analysis methods. For example, the win ratio was proposed to address an inherent limitation of time-to-event composite endpoints, in which each patient’s first event is emphasized regardless of its clinical importance [13]. Another example is the global ranking approach that, similar to hierarchical clinical endpoints, ranks patients using varying aspects of the clinical course based on a prespecified hierarchical ranking system [14,15]. Regardless of approach, there is a clear trend toward studying AHF treatments using composite endpoints as mortality rates decline and improving how a patient feels and functions becomes the primary motivator.
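To make the hierarchical idea behind the win ratio concrete, the following is a minimal sketch of an unmatched pairwise comparison. It is a simplification, not the procedure of reference [13]: it assumes every patient has the same complete follow-up (no censoring), uses a two-level hierarchy of death and then hospitalization, and the `Patient` class, function names, and toy data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # Day of each event within follow-up; None means the event did not occur.
    # Assumption: all patients share the same complete follow-up (no censoring).
    death_day: Optional[int]
    hosp_day: Optional[int]


def compare_event(day_a, day_b):
    """Return 1 if a fares better than b on one event, -1 if worse, 0 if tied."""
    if day_a == day_b:
        return 0
    if day_a is None:
        return 1   # a never had the event, b did
    if day_b is None:
        return -1  # b never had the event, a did
    return 1 if day_a > day_b else -1  # a later event counts as better


def compare_pair(treated, control):
    """Walk the hierarchy: death first, hospitalization only if death is tied."""
    for field in ("death_day", "hosp_day"):
        result = compare_event(getattr(treated, field), getattr(control, field))
        if result != 0:
            return result
    return 0


def win_ratio(treated_group, control_group):
    """Compare every treated patient with every control patient."""
    wins = losses = 0
    for t in treated_group:
        for c in control_group:
            outcome = compare_pair(t, c)
            if outcome == 1:
                wins += 1
            elif outcome == -1:
                losses += 1
    ratio = wins / losses if losses else float("inf")
    return wins, losses, ratio


# Toy data, invented purely for illustration.
treated = [Patient(None, None), Patient(None, 30), Patient(200, 50)]
control = [Patient(None, 20), Patient(100, 10), Patient(None, None)]
wins, losses, ratio = win_ratio(treated, control)
print(wins, losses, round(ratio, 3))
```

Because death is compared before hospitalization, a patient who is hospitalized early but survives still wins against a patient who dies, which is the ordering a time-to-first-event analysis cannot express.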

The evolution of composite endpoints for COVID-19 appears to follow a very similar pattern, though over a much shorter course. Early COVID-19 clinical trials primarily focused on mortality and serious clinical events. After generations of mutation, the severity of COVID-19 has decreased such that clinical trials focused on mortality and serious clinical events are less feasible. Trials testing effectiveness in reducing symptoms, improving patient quality of life, and preventing long-COVID have been critical. Morbidity and mortality remain important, leading to endpoints that combine clinical outcomes with patient-reported outcomes (PROs). PROs are defined as “any report of the status of a patient’s health condition that comes directly from the patient, without interpretation of the patient’s response by a clinician or anyone else” and can encompass symptoms, quality of life, and more [16]. The World Health Organization (WHO) COVID-19 ordinal scale for clinical improvement [17] is a clear example that incorporates both clinical endpoints and PROs for consideration as an efficacy endpoint in COVID-19 clinical trials [18].

Given the similar evolution of composite endpoints from an exclusive focus on clinical events to a multi-dimensional evaluation incorporating PROs in both AHF and COVID-19 clinical trials, we conducted a literature review to investigate and compare the adoption of composite endpoints. To the best of our knowledge, this is the first review that annotates composite endpoints by composition and statistical type for both AHF and COVID-19 trials and explores the trends in their use analytically. Our study, combined with the observed evolution, will shed light on the potential facilitators of and barriers to composite endpoint uptake as well as the potential for composite outcomes in other areas.

Materials and methods

Literature search strategy

PubMed was searched to identify eligible trials. The final search was conducted on May 4, 2023. No date restrictions were placed on the search. The primary search strategy had three parts, identifying clinical trials, composite endpoints, and disease types. To identify clinical trials, the checkbox “Clinical Trials” in the PubMed filter of article types was used. To identify AHF and COVID-19 clinical trials using composite endpoints, titles and abstracts were searched for terms implying the use of composite endpoints (“composite endpoint” OR “composite” OR “multiple endpoint” OR “ordinal” OR “-free” OR “win ratio” OR “global ranking”) AND terms indicating disease types (“acute heart failure” OR “COVID-19”). The extra terms “ordinal,” “-free” (e.g., hospital-free or ventilator-free days), “win ratio,” and “global ranking” were added to capture studies that used composite endpoints without explicitly stating that the endpoint was a composite. To avoid missing clinical trial papers not marked as “Clinical Trials” in PubMed and to capture AHF papers not specifically mentioning “acute heart failure,” a supplementary search was conducted without the “Clinical Trials” checkbox, adding “clinical trial” to the keyword search and removing the disease-type terms.
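The Boolean structure of the two searches can be sketched as query strings. The search terms are taken from the text; the `[Title/Abstract]` and `[Publication Type]` field tags, and the helper names, are assumptions about how such a search might be entered, not a record of the exact queries that were run.

```python
# Terms quoted from the search strategy described above.
COMPOSITE_TERMS = [
    "composite endpoint", "composite", "multiple endpoint",
    "ordinal", "-free", "win ratio", "global ranking",
]
DISEASE_TERMS = ["acute heart failure", "COVID-19"]


def or_group(terms, field="Title/Abstract"):
    """Join quoted terms with OR, each restricted to one PubMed field (assumed tag)."""
    return "(" + " OR ".join(f'"{t}"[{field}]' for t in terms) + ")"


def primary_query():
    """Composite terms AND disease terms AND the clinical-trial article-type filter."""
    return " AND ".join([
        or_group(COMPOSITE_TERMS),
        or_group(DISEASE_TERMS),
        '"clinical trial"[Publication Type]',  # stands in for the filter checkbox
    ])


def supplementary_query():
    """No article-type filter and no disease terms; 'clinical trial' as a keyword."""
    return " AND ".join([
        or_group(COMPOSITE_TERMS),
        or_group(["clinical trial"]),
    ])


print(primary_query())
print(supplementary_query())
```

Writing the two queries side by side shows why the supplementary search returns papers outside AHF and COVID-19, which then have to be screened out during annotation.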

Annotation strategy

Papers were included if they presented primary results evaluating efficacy from AHF or COVID-19 clinical trials with at least one composite endpoint as either the primary or secondary outcome. Each publication was independently annotated by two authors (Liu, Faculty; and Shi, Research Assistant), who recorded the exclusion reason, year of publication, number of composite endpoints, statistical type (time-to-event, binary, ordinal, count, continuous, or mixed; mixed mainly denotes the use of win ratio and global ranking approaches), composition type (clinical-only, PRO-only, or both), and number of components in the primary composite endpoint. Annotations were then compared and reconciled. Conflicts were resolved by consensus. If a resolution could not be reached, another faculty member would be involved to make the final decision.
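As an illustration of the double-annotation step, one annotation can be held as a small record and the two annotators' records compared field by field. This is a hypothetical sketch: the class, field names, and example values are invented, and only the list of recorded attributes comes from the text.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    # Attributes recorded per publication, as listed in the annotation strategy.
    exclusion_reason: Optional[str]  # None when the paper is included
    publication_year: int
    n_composite_endpoints: int
    statistical_type: str            # time-to-event, binary, ordinal, count, continuous, mixed
    composition_type: str            # clinical-only, PRO-only, both
    n_primary_components: int


def conflicting_fields(first, second):
    """Names of fields on which two annotators disagree, in declaration order."""
    return [
        f.name for f in fields(Annotation)
        if getattr(first, f.name) != getattr(second, f.name)
    ]


# Invented example: the annotators disagree only on the statistical type.
annotator_a = Annotation(None, 2021, 2, "ordinal", "both", 8)
annotator_b = Annotation(None, 2021, 2, "binary", "both", 8)
print(conflicting_fields(annotator_a, annotator_b))
```

An empty list means the two annotations agree; any listed field would go to the consensus discussion described above.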

Data analysis

Descriptive statistics were calculated to characterize the overall use of composite endpoints for AHF and COVID-19 trials. The Wilcoxon Rank Sum test (for continuous variables) or Fisher’s Exact test (for categorical variables) was used for two-group comparisons. The proportions of statistical types and compositions of the primary endpoints were plotted by publication year for each clinical field.
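For readers unfamiliar with the two tests, the sketch below implements both using only the Python standard library. It is illustrative, not the authors' analysis code: the rank-sum p-value uses a normal approximation without tie or continuity correction, so it can differ from statistical software on small or heavily tied samples, and all inputs are toy numbers rather than study data.

```python
from math import comb, erfc, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1 = a + b, a + c
    total = a + b + c + d
    denom = comb(total, row1)

    def prob(k):
        # Hypergeometric probability of observing k in the top-left cell.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    observed = prob(a)
    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


def rank_sum_test(x, y):
    """Mann-Whitney U for x (equivalent to the Wilcoxon rank sum) and a
    two-sided p-value from a normal approximation, with no tie correction."""
    u = sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x for yj in y
    )
    n, m = len(x), len(y)
    z = (u - n * m / 2) / sqrt(n * m * (n + m + 1) / 12)
    return u, erfc(abs(z) / sqrt(2))


# Toy inputs, not taken from the study.
p_fisher = fisher_exact_two_sided(8, 2, 1, 5)
u_stat, p_rank = rank_sum_test([1, 2, 3], [4, 5, 6])
print(round(p_fisher, 4), u_stat, round(p_rank, 4))
```

In the setting of this review, a 2x2 table would cross disease (AHF vs COVID-19) with a yes/no attribute such as PRO inclusion, while the rank-sum test suits a numeric attribute such as the number of components.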

Results

Our search identified 946 publications, 419 from the primary search and 527 from the supplementary search. Of these, 227 met inclusion criteria with 46 AHF trials and 181 COVID-19 trials (Fig. 1). There are clear differences in the use of composite endpoints between AHF and COVID-19 trials (Table 1). COVID-19 trials were more likely to include multiple composite endpoints than AHF trials (43.5% in AHF vs 59.1% in COVID-19, p < 0.001). Among trials where the primary outcome was a composite endpoint, COVID-19 trials were more likely to include PROs in the primary composite endpoints (PRO-only + both: 7% in AHF vs 35.1% in COVID-19, p < 0.001), although the number of components in the primary endpoint was similar between diseases (p = 0.73). AHF trials predominantly used time-to-event composite endpoints (62.8% in AHF vs 26% in COVID-19), whereas COVID-19 trials were much more likely to use ordinal composite endpoints (2.3% in AHF vs 24.7% in COVID-19) with more diversity in statistical types (26% using time-to-event, 37.7% using binary, and 9.7% using continuous). The use of win ratio and global ranking approaches (i.e., mixed statistical type) in trials of both diseases is low (4.7% in AHF vs 1.9% in COVID-19).

Figure 1. Flow diagram for the inclusion/exclusion of clinical trial publications in our study. There are 946 publications in total with 419 from the primary search and 527 from the supplementary search. The 6 exclusion reasons are listed in a hierarchical order, with 1 > 2 > … > 6. That is, if a paper is neither a randomized clinical trial (RCT; reason 1) nor related to acute heart failure (AHF) or COVID-19 (reason 2), it will be classified as “1. Not RCT.” The number of papers excluded due to a specific reason is denoted as “N_ex” and presented in parentheses. In the end, 227 papers met inclusion criteria with 46 from AHF trials and 181 from COVID-19 trials.
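The hierarchical ordering of exclusion reasons amounts to a first-match rule, sketched below. Only reasons 1 and 2 are described in the caption, so reasons 3 to 6 are omitted; the label wording for reason 2, the dictionary keys, and the function name are hypothetical.

```python
def first_exclusion_reason(paper, ordered_reasons):
    """Return the label of the highest-priority reason that applies, else None."""
    for label, applies in ordered_reasons:
        if applies(paper):
            return label
    return None


# Reasons in priority order; only the first two are named in the caption.
REASONS = [
    ("1. Not RCT", lambda p: not p["is_rct"]),
    ("2. Not AHF or COVID-19", lambda p: p["disease"] not in {"AHF", "COVID-19"}),  # label wording assumed
]

# A paper failing both checks is counted once, under reason 1.
print(first_exclusion_reason({"is_rct": False, "disease": "sepsis"}, REASONS))
```

Assigning each excluded paper to a single reason keeps the per-reason counts in the flow diagram mutually exclusive, so they sum to the total number excluded.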

Table 1. Descriptive statistics of the use of composite endpoints in acute heart failure (AHF) and COVID-19 trials. In the table, “Q1,” “Q3,” “min,” “max,” “cat.,” “No.,” and “PRO” are abbreviations for the first and third quartiles, minimum, maximum, “categorical,” “the number of,” and “patient-reported outcome”

* The p-values are based on the Wilcoxon Rank Sum test for continuous variables and Fisher’s Exact test for categorical variables.

† For the continuous variable “Publication Year,” a categorized variable was derived as “Publication Year (cat.)” to provide more detailed information about the distribution. Note that this additional categorical variable was not used for hypothesis testing.

‡ The categories of publication years of AHF trials were derived based on quartiles; for COVID-19, all eligible papers were published in or after 2020, so the number of trials for each year of 2020–2023 was listed.

§ For proportions related to the attributes of primary composite endpoints, the denominators are the total numbers of papers with primary endpoints being composite, which were summarized in “Primary is Composite - Yes.”

The composition and statistical type of the primary composite endpoints are shown by publication year in Figure 2. For AHF trials, the “both” composition type began to appear after 2011, yet it remains in use in fewer than 10% of studies (Fig. 2a). In comparison, composites of the “both” composition type were heavily used throughout the COVID-19 pandemic: the proportion dropped from 38.5% in 2020 to 26.5% in 2022, then rose to 55.6% in 2023. Time-to-event analysis approaches consistently dominate in AHF (Fig. 2b). The use of a mixed approach to analysis appeared after 2016, with 1 trial in 2016–2019 and another post-2019. In contrast, the diversity of statistical types remained consistent over time in COVID-19 trials, where ordinal, time-to-event, and binary approaches were the most frequent.

Figure 2. Distribution of (a) composition type and (b) statistical type of the primary composite endpoints summarized over publication years for acute heart failure trials (≤2010, 2011–2015, 2016–2019, and ≥2020) and COVID-19 trials (2020–2023). In the figure, “cat.” and “PRO” are abbreviations for “categorical” and “patient-reported outcome.”

Discussion

This literature review compared trends in the use, construction, and analysis of composite endpoints in clinical trials for two highly disparate clinical fields. AHF was chosen because there has been increasing advocacy for using composite endpoints over the past two decades [4,5,10]. This advocacy was driven partly by the lack of success in short-term pharmacological therapy since the 1970s and partly by the need to assess multiple domains for safety and efficacy [19]. COVID-19 was chosen for comparison due to its similar evolution pattern, though over a much shorter course of four years. Although it is still early to observe the use of composite endpoints in COVID-19 trials, this field has caught up with AHF trials in the shift from an exclusive focus on clinical events to a multi-dimensional evaluation that additionally incorporates PROs. Our findings suggest that it is critical to understand the current stage of disease management when picking endpoints; as a disease transitions from being fatal to treatable, there is a need to measure change in outcomes over the full range of disease severity, reflecting not just survival but also how the patient functions and feels. Among trials where the primary outcome was a composite endpoint, COVID-19 trials were more likely than AHF trials to include PROs and to use ordinal composite endpoints.

The well-recognized benefits of using composite endpoints are accompanied by challenges in their use, partly reflected in our results. Time-to-event composite endpoints consider only the time to the first clinical event and ignore its clinical importance, whereas ordinal composite endpoints can be difficult to interpret. Methods such as the win ratio and global ranking rely on ranking patients against one another within the trial, and clinical effect sizes are difficult to extract. Although advocacy for including PROs and using a prespecified hierarchical ranking system for composite endpoints occurred early in AHF trials [4,5,8,11], we posit that the lack of global consensus impedes progress and explains the dominant use of clinical and time-to-event composite endpoints in AHF trials. In contrast, during the rapidly evolving COVID-19 outbreak, multiple international organizations including the WHO developed common outcome measures for COVID-19 clinical research [17,20], and guidance appeared early for construction of new endpoints. Many COVID-19 trials either used or refined the WHO ordinal composite endpoint, although dichotomizing the scale to improve interpretation appears to have remained common.

There are some limitations to this study. It was not intended as a systematic review but as an exploration of trends in the use of composite outcomes in two disease entities: one with a long history of clinical trials and the other an infectious disease causing a global pandemic. We searched only for clinical trials with composite endpoints, not all clinical trials in AHF and COVID-19; thus, we are not able to assess the overall proportion of studies adopting composite endpoints in these clinical fields. The number of studies, especially for AHF, was smaller than expected despite the supplementary search. This may be because many published trials did not explicitly state the use of composite endpoints and were missed in our search. Many papers provided only a partial list of secondary composite endpoints, and some placed the full list in separate supplemental material, precluding comprehensive annotation. There was no clear consensus on which secondary endpoint to annotate for composition and statistical type. Therefore, for secondary endpoints we recorded only their contribution to the number of composite endpoints. Lastly, the comparison of composite endpoint use over time was purely descriptive.

In summary, for AHF and COVID-19 trials, the use of composite endpoints has evolved to include PROs as well as clinical events, and their use commonly ranks multiple events. The change corresponds to the change in purpose of the trials from preventing mortality, then preventing progression, to improving quality of life. Achieving consensus on common outcome measurements and hierarchical rankings is expected to accelerate uptake.

Author contributions

LS, CJL, and DL designed the study. LS and DL conducted the literature review and annotation. LS performed data analyses and drafted the initial manuscript. DL and CJL critically reviewed and revised the manuscript. All authors reviewed and edited sections of the manuscript and approved the final manuscript.

Funding statement

This study was supported by the National Center for Advancing Translational Sciences (NCATS) Clinical Translational Science Award Program, Award Number UL1TR002243, and NCATS Center for Innovative TRIals in ChilDrEN and AdulTs Award U24TR001608. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NCATS and NIH.

Competing interests

Dr Lindsell reports receiving grants to the institution from the NCATS for the submitted work; grants to the institution from NIH and Department of Defense and contracts to the institution from the CDC, bioMerieux, AstraZeneca, AbbVie, Entegrion Inc., and Endpoint Health outside the submitted work; patents for risk stratification in sepsis and septic shock issued to Cincinnati Children’s Hospital Medical Center; service on DSMBs unrelated to the current work; and stock options in Bioscape Digital unrelated to the current work. Other authors have no conflicts of interest to declare.

References

1. Sankoh, AJ, Li, H, D’Agostino, RB Sr. Use of composite endpoints in clinical trials. Stat Med. 2014;33(27):4709–4714.

2. Ferreira-González, I, Busse, JW, Heels-Ansdell, D, et al. Problems with use of composite end points in cardiovascular trials: systematic review of randomised controlled trials. BMJ. 2007;334(7597):786.

3. Ferreira-González, I, Alonso-Coello, P, Solà, I, et al. Composite endpoints in clinical trials. Rev Esp Cardiol. 2008;61(3):283–290.

4. Califf, RM, Harrelson-Woodlief, L, Topol, EJ. Left ventricular ejection fraction may not be useful as an end point of thrombolytic therapy comparative trials. Circulation. 1990;82(5):1847–1853.

5. Braunwald, E, Cannon, CP, McCabe, CH. Use of composite endpoints in thrombolysis trials of acute myocardial infarction. Am J Cardiol. 1993;72(19):G3–G12.

6. Hussain, A, Misra, A, Bozkurt, B. Endpoints in heart failure drug development. Cardiac Fail Rev. 2022;8:e01.

7. Fox, K, Ford, I, Steg, PG, Tendera, M, Ferrari, R, BEAUTIFUL Investigators. Ivabradine for patients with stable coronary artery disease and left-ventricular systolic dysfunction (BEAUTIFUL): a randomised, double-blind, placebo-controlled trial. Lancet. 2008;372(9641):807–816.

8. Boden, WE, van Gilst, WH, Scheldewaert, RG, et al. Diltiazem in acute myocardial infarction treated with thrombolytic agents: a randomised placebo-controlled trial. Incomplete infarction trial of european research collaborators evaluating prognosis post-thrombolysis (INTERCEPT). Lancet. 2000;355(9217):1751–1756.

9. Brugaletta, S, Gomez-Lara, J, Ortega-Paz, L, et al. 10-year follow-up of patients with everolimus-eluting versus bare-metal stents after ST-segment elevation myocardial infarction. J Am Coll Cardiol. 2021;77(9):1165–1178.

10. Packer, M. Proposal for a new clinical end point to evaluate the efficacy of drugs and devices in the treatment of chronic heart failure. J Card Fail. 2001;7(2):176–182.

11. Packer, M. Development and evolution of a hierarchical clinical composite end point for the evaluation of drugs and devices for acute and chronic heart failure: a 20-year perspective. Circulation. 2016;134(21):1664–1678.

12. Food and Drug Administration. Treatment for Heart Failure: Endpoints for Drug Development Guidance for Industry. FDA, 2020. (https://www.fda.gov/regulatory-information/search-fda-guidance-documents/treatment-heart-failure-endpoints-drug-development-guidance-industry). Accessed October 4, 2023.

13. Pocock, SJ, Ariti, CA, Collier, TJ, Wang, D. The win ratio: a new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. Eur Heart J. 2012;33(2):176–182.

14. O’Brien, PC. Procedures for comparing samples with multiple endpoints. Biometrics. 1984;40(4):1079–1087.

15. Felker, GM, Anstrom, KJ, Rogers, JG. A global ranking approach to end points in trials of mechanical circulatory support devices. J Card Fail. 2008;14(5):368–372.

16. U.S. Department of Health and Human Services FDA Center for Drug Evaluation and Research, U.S. Department of Health and Human Services FDA Center for Biologics Evaluation and Research, U.S. Department of Health and Human Services FDA Center for Devices and Radiological Health. Guidance for industry: patient-reported outcome measures: use in medical product development to support labeling claims: draft guidance. Health Qual Life Out. 2006;4(1):79.

17. WHO R&D Blueprint novel Coronavirus COVID-19 Therapeutic Trial Synopsis. (https://www.who.int/docs/default-source/blue-print/covid-19-therapeutic-trial-synopsis.pdf?sfvrsn=44b83344_1&download=true). Accessed October 4, 2023.

18. Thorlund, K, Smith, D, Linsell, C, et al. The importance of appropriate selection of clinical endpoints in outpatient COVID-19 clinical trials. Commun Med. 2023;3(1):1–5.

19. Felker, GM, Pang, PS, Adams, KF, et al. Clinical trials of pharmacological therapies in acute heart failure syndromes. Circulation. 2010;3(2):314–325.

20. WHO Working Group on the Clinical Characterisation and Management of COVID-19 infection. A minimal common outcome measure set for COVID-19 clinical research. Lancet. 2020;20(8):e192–e197.
