Introduction
Randomized controlled trials play a major role in forming the evidence base and shaping clinical practice. However, clinical decision-making is based on accessible and published studies. Trials with statistically significant results are more likely to be published than trials with nonsignificant results, thus inflating estimates of drug efficacy and safety (Dickersin, Chan, Chalmers, Sacks, & Smith, 1987; Suñé, Suñé, & Montoro, 2013).
The degree to which published trial results overestimate efficacy can be determined using FDA review documents (Turner, 2013). In the US, drug companies are required to register with the FDA all trials they intend to conduct for purposes of US marketing approval. Results of these trials are documented in FDA medical and statistical reviews. These are publicly available for download at Drugs@FDA (https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm) for drugs approved by the FDA since 1998 (Turner, 2013). Reviews of drugs approved prior to 1998 require a request through the Freedom of Information Act (https://www.accessdata.fda.gov/scripts/foi/FOIRequest/requestinfo.cfm).
To date, research utilizing FDA data to examine reporting biases has been mostly limited to a few psychiatric drug classes, especially drugs approved for the treatment of major depressive disorder (Turner, Matthews, Linardatos, Tell, & Rosenthal, 2008), anxiety disorders (Roest et al., 2015) and schizophrenia (Turner, Knoepflmacher, & Shapley, 2012). Similar work on additional drug classes is needed.
Benzodiazepines are prescribed to over 5% of US adults (Bachhuber, Hennessy, Cunningham, & Starrels, 2016) and are involved in nearly 14% of fatal opioid overdoses (National Institute on Drug Abuse, 2022), potentiating the opioids' lethality (Boon, van Dorp, Broens, & Overdyk, 2020). While their potential harms have been acknowledged, the consensus view appears to be that benzodiazepines are ‘highly effective’ (Silberman et al., 2021) and ‘efficacious for the short- and long-term treatment of anxiety disorders’ (Nardi & Quagliato, 2022). The American Psychiatric Association Practice Guideline for the treatment of panic disorder recommends the use of benzodiazepines (along with SSRIs, SNRIs, TCAs, or CBT) as initial therapy, citing ‘demonstrated efficacy in numerous controlled trials’ (Stein et al., 2010). Also, multiple meta-analyses have reported that benzodiazepines may be superior to placebo in the treatment of panic disorder (Bighelli et al., 2016; Boyer, 1995; Breilmann et al., 2019; Wilkinson, Balestrieri, Ruggeri, & Bellantuono, 1991). Several of these meta-analyses acknowledged the possibility of publication bias, but few have searched for unpublished trial data. A Cochrane review of 24 trials comparing various benzodiazepines to placebo (Breilmann et al., 2019) included data from just one unpublished trial (however, please see Discussion).
Notably, the authors questioned the superiority of benzodiazepines over placebo, in part because they inferred ‘probable publication bias’ based on funnel plot asymmetry. To our knowledge, no studies have formally investigated publication bias with benzodiazepines by searching for unpublished benzodiazepine trial data in regulatory documents and reassessing their efficacy.
Among all benzodiazepines, the one with the highest prescribing rate in the US – and with the highest frequency for all measures of nonmedical use, abuse, and related harms – is alprazolam (Ait-Daoud, Hamby, Sharma, & Blevins, 2018). In the current study, we examined efficacy data for alprazolam, specifically its extended-release formulation (Xanax XR), for the treatment of panic disorder. Because it was approved in 2003, its FDA reviews are available for download from Drugs@FDA. By contrast, data on the original immediate-release formulation of alprazolam, and all other benzodiazepines, are much less accessible, since they were approved many years before the 1997 launch of Drugs@FDA. Our objective was to compare alprazolam XR's efficacy according to the published literature with its efficacy according to the FDA. Efficacy was examined in terms of overall trial outcome (positive or not) and meta-analytic effect size.
Methods
Data from FDA reviews
At Drugs@FDA (http://www.accessdata.fda.gov), we downloaded the medical and statistical reviews for alprazolam XR. From these we identified all phase 2/3 randomized, double-blind, placebo-controlled trials of alprazolam XR for the treatment of panic disorder with or without agoraphobia; patients included were adult male and female outpatients aged 18–65. The FDA reviews of these trials identified multiple primary outcome measures, of which the following five were common to all trials: (1) mean change from baseline in total number of panic attacks, (2) percentage of patients achieving zero panic attacks, (3) mean change from baseline in Clinical Global Impression–Severity, (4) Clinical Global Impression–Change in Condition, and (5) overall phobia state. For each of these, summary statistics were extracted for purposes of meta-analysis (see below). For each trial, we also extracted the FDA's regulatory decision, that is, whether, for purposes of approval, the trial was judged to be positive (supportive of efficacy).
Regarding one trial (Study 5, #0032), the FDA reported its overall conclusion but no summary statistics other than its sample size. Since this trial was unpublished, we established email contact with the sponsor, but our follow-up emails were not answered. We also requested additional data on this trial from the FDA's Freedom of Information Office, but they estimated that our ‘complex track request’ would not be processed until fall 2024.
Data from journal articles
For each FDA-registered premarketing trial, we searched PubMed, bibliographies of review articles (Breilmann et al., 2019; Perugi, Frare, & Toni, 2007) and Google Scholar. In PubMed, we used the following search syntax: (alprazolam[Title/Abstract] OR xanax[Title/Abstract]) AND (XR[Title/Abstract] OR extended release[Title/Abstract] OR sustained release) AND placebo[Title/Abstract]. In Google Scholar, we combined ‘alprazolam XR’ (and ‘alprazolam extended release’) with ‘placebo’ and ‘panic disorder’ and reviewed the first ten pages of search results. Author RA identified the best match between FDA-registered clinical trials and journal articles based on drug name, comparator, dosage groups, associated sample sizes, trial duration, and investigator names. Stand-alone publications (i.e. full article reporting the results of a single trial) were preferred. For any given trial, if no stand-alone publication could be found, we included publications aggregating the results of multiple trials, as long as they did not cover other trials separately published in stand-alone format (Turner et al., 2012).
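To make the PubMed search easier to rerun, the query quoted above can be wrapped in a request URL for NCBI's E-utilities ESearch endpoint. The sketch below only builds the URL and does not send it; the function name, the `retmax` value, and the choice of JSON output are assumptions for illustration and were not part of the search as described.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# NCBI E-utilities ESearch endpoint for PubMed queries.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Search syntax exactly as given in the text.
QUERY = (
    "(alprazolam[Title/Abstract] OR xanax[Title/Abstract]) AND "
    "(XR[Title/Abstract] OR extended release[Title/Abstract] OR sustained release) "
    "AND placebo[Title/Abstract]"
)


def build_esearch_url(term, retmax=200):
    """Return an ESearch URL for `term` (hypothetical helper; retmax is arbitrary)."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


url = build_esearch_url(QUERY)
# Decode the URL again to confirm the query survives percent-encoding unchanged.
round_trip = parse_qs(urlsplit(url).query)["term"][0]
```

Fetching `url` would return the matching PubMed IDs; whether the hits match a search run through the PubMed web interface on a given date is not guaranteed.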
Each trial publication conclusion was classified as positive or not positive based on the overall conclusion expressed in the abstract. If no abstract was available, we used the overall conclusion expressed in the conclusion/discussion section. We then noted whether the two data sources (trial publication v. FDA review) (dis)agreed regarding the trial's outcome.
For purposes of meta-analysis (below), both authors independently identified and extracted summary data on the apparent primary outcome, that is, the drug-placebo comparison reported first in the text of the results section or in the table or figure first cited in the text (Turner et al., 2008).
Statistical analysis
We conducted two meta-analyses, one based on the FDA reviews and one based on corresponding publications. For outcomes based on continuous data, the measure of effect size was Hedges's g (Hedges, 1982), calculated as:
(This method is algebraically equivalent to using means and standard deviations (Rosenthal, 1991) which, in our experience, are inconsistently provided in FDA reviews.) The t statistic was calculated from p and N using the TINV function in Excel, multiplying t by −1 if drug underperformed placebo. If a precise p value was unavailable (e.g. p < 0.05), t was calculated from other summary statistics (SDs, SEs, 95% CIs), if available; otherwise, we calculated t from the upper limit of the p value range. There were two multiple-dose trials for which we calculated a single study-level effect size, as done previously (Roest et al., 2015; Turner et al., 2008, 2012; Turner, Cipriani, Furukawa, Salanti, & de Vries, 2022), avoiding a spuriously low standard error by counting the shared placebo N only once.
For one outcome based on count data (percentage of patients achieving zero panic attacks), we used the csi command in Stata 11 (StataCorp LP, 2009) to calculate the odds ratio (OR), which we converted to Cohen's d using the method of Cox and Snell (Anzures-Cabrera, Sarpatwari, & Higgins, 2011; Thorlund, Walter, Johnston, Furukawa, & Guyatt, 2011).
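The displayed formula for g is not reproduced in this text. As a minimal sketch of the steps described, and not necessarily the authors' exact computation, the Python below recovers t from a two-tailed p value (the quantity Excel's TINV returns) and converts it to Hedges's g using the standard relation d = t·sqrt(1/n_drug + 1/n_placebo) followed by the small-sample correction J = 1 − 3/(4df − 1). The function names, the use of df = n_drug + n_placebo − 2, and the application of J at this step are assumptions for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value for |t|, integrating the density with Simpson's rule."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (total * h / 3.0)


def t_from_p(p, df):
    """Positive t whose two-tailed p value equals p, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if two_tailed_p(mid, df) > p:
            lo = mid  # mid is too small: its p value is still above the target
        else:
            hi = mid
    return (lo + hi) / 2


def hedges_g_from_t(t, n_drug, n_placebo):
    """Convert t to Hedges's g (assumed form: J * t * sqrt(1/n1 + 1/n2))."""
    df = n_drug + n_placebo - 2
    d = t * math.sqrt(1 / n_drug + 1 / n_placebo)
    return (1 - 3 / (4 * df - 1)) * d


def hedges_g_from_p(p, n_drug, n_placebo, drug_better=True):
    """Hedges's g from a two-tailed p value; negative if drug underperformed."""
    t = t_from_p(p, n_drug + n_placebo - 2)
    return hedges_g_from_t(t if drug_better else -t, n_drug, n_placebo)
```

When only a range such as p < 0.05 is reported, passing the upper limit (0.05) gives the smallest |g| compatible with that range, which matches the conservative choice described above.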
(We considered the method of Hasselblad and Hedges, which differs only in that it employs a slightly larger denominator (1.81), but this would have led to smaller FDA-based d values, increasing the gap between them and the corresponding journal-based d values.)
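The conversion formula itself is not reproduced in this text. Consistent with the description above, the Cox and Snell approach divides the log odds ratio by a constant, and the Hasselblad and Hedges approach divides it by π/√3 ≈ 1.81. The sketch below assumes the constant 1.65 usually quoted for the Cox conversion; the function names and the example OR are hypothetical.

```python
import math

COX_DENOMINATOR = 1.65                                 # assumed constant for the Cox conversion
HASSELBLAD_HEDGES_DENOMINATOR = math.pi / math.sqrt(3)  # about 1.81


def d_cox(odds_ratio):
    """Cohen's d from an odds ratio: ln(OR) / 1.65."""
    return math.log(odds_ratio) / COX_DENOMINATOR


def d_hasselblad_hedges(odds_ratio):
    """Cohen's d from an odds ratio: ln(OR) * sqrt(3) / pi."""
    return math.log(odds_ratio) / HASSELBLAD_HEDGES_DENOMINATOR


example_or = 2.0  # illustrative value, not taken from the FDA review
cox_value = d_cox(example_or)
hh_value = d_hasselblad_hedges(example_or)
```

For any OR above 1 the larger denominator yields the smaller d, which is why the Hasselblad and Hedges variant would have lowered the FDA-based values.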
The variance of d was calculated as:
where a, b, c, and d are the cell frequencies in the 2-by-2 contingency table (Wilson, 2017). Cohen's d was multiplied by the correction factor J, where
to obtain Hedges's g (Cooper, Hedges, & Valentine, 2019) and the standard error of g was calculated as the square root of vg, where
For each FDA-reviewed clinical trial, we calculated a composite effect size using the arithmetic mean of the five outcome-level effect sizes and of their variances (López-López, Page, Lipsey, & Higgins, 2018).
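The expressions for the variance of d, for J, and for vg are not reproduced in this text. The sketch below uses the forms commonly given for the Cox conversion and for Hedges's correction: v_d = 0.367(1/a + 1/b + 1/c + 1/d), J = 1 − 3/(4df − 1) with df = n_drug + n_placebo − 2, and v_g = J²·v_d. These forms, the function name, and the cell counts are assumptions for illustration, not values from the FDA review.

```python
import math


def binary_outcome_to_g(a, b, c, d):
    """Hedges's g and its SE from a 2-by-2 table.

    a, b: drug group with / without the outcome
    c, d: placebo group with / without the outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    cohen_d = math.log(odds_ratio) / 1.65            # assumed Cox constant
    var_d = 0.367 * (1 / a + 1 / b + 1 / c + 1 / d)  # assumed variance form
    df = (a + b) + (c + d) - 2
    j = 1 - 3 / (4 * df - 1)                         # small-sample correction
    var_g = j ** 2 * var_d
    return {
        "odds_ratio": odds_ratio,
        "d": cohen_d,
        "g": j * cohen_d,
        "var_g": var_g,
        "se_g": math.sqrt(var_g),
    }


# Hypothetical counts: 30 of 50 responders on drug, 20 of 50 on placebo.
result = binary_outcome_to_g(30, 20, 20, 30)
```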
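A minimal sketch of this composite, assuming the five outcome-level values of g and their variances are already in hand; the function name and the numbers are hypothetical.

```python
import math
from statistics import mean


def composite_effect_size(effect_sizes, variances):
    """Average the outcome-level effect sizes and, separately, their variances."""
    if not effect_sizes or len(effect_sizes) != len(variances):
        raise ValueError("need one variance per effect size")
    g = mean(effect_sizes)
    v = mean(variances)
    return g, v, math.sqrt(v)


# Illustrative values for one trial's five primary outcomes (not real data).
g_values = [0.40, 0.25, 0.35, 0.30, 0.20]
v_values = [0.02, 0.02, 0.02, 0.02, 0.02]
composite_g, composite_v, composite_se = composite_effect_size(g_values, v_values)
```

Averaging the variances, rather than dividing by the number of outcomes as one would for independent estimates, is a conservative choice: outcomes measured on the same patients are correlated, so the composite is not treated as more precise than its components.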
Study 5 (trial #0032) could not be incorporated into the meta-analysis due to insufficient statistical summary data (above). Statistical analyses were independently performed by RA using R 4.0.2 (R Core Team, 2020) and by ET using Stata 11 (StataCorp LP, 2009); their results were compared and reconciled.
Results
The numerical results extracted from the FDA review and from the published literature are presented in Table 1.
CGI, Clinical Global Impression rating scale. Sample sizes, shown as [N drug; N placebo], represent number of patients analyzed, which may be slightly less than number treated, as given in text.
a Count data reported in FDA review (alprazolam yes|alprazolam no||placebo yes|placebo no), from which authors calculated odds ratios (OR) ± 95% confidence intervals.
b Precise p value calculated from means, SDs, and Ns; dose groups pooled.
c Upper limit of p value range used to calculate Hedges's g. Brackets: [N drug; N placebo]. Gray shading indicates nonsignificant outcomes. NS, outcomes assumed nonsignificant because FDA stated trial had ‘failed on face’.
Study 1 (2000/0271)
Results per FDA
Study 1 was conducted at three centers in two countries (US and Canada) from July 1986 to March 1989 (M-53/56). In this study, 208 male and female adult outpatients with panic disorder with or without agoraphobia were treated, 69 with alprazolam XR, 70 with alprazolam immediate release (IR, not included in this article's tables or figures), and 69 with placebo (M-17/56 and M-53/56). (Abbreviations here refer to pages 17 and 53 of the 56-page FDA medical (M) review. Henceforth, ‘S-’ refers to the FDA statistical review.) The FDA statistician stated Study 1 was ‘positive in 5 of 7 primary endpoints’ (S-4/26). However, because the FDA deems studies positive only if all primary endpoints achieve statistical significance (FDA, 2022), the FDA medical review included Study 1 as one of four that had ‘failed on face’ (M-5/56 and M-15/56).
Results published
The article on Study 1 (Pecknold, Luthe, Munjack, & Alexander, 1994) presented it as a positive trial and gave no indication that, overall, it should be considered negative. The abstract stated, ‘On global measures, Hamilton Rating Scale for Anxiety, phobia rating, and work disability measures, both active treatment groups were equally effective and significantly more efficacious than the placebo cell on endpoint MANOVA analysis. On analysis of the panic factor…significantly more effective than the placebo group’. The Results section, Outcome Measures subsection, began, ‘Both the physician-rated and patient-treated scales estimating overall improvement showed that [sic] both CT [compressed tablet] alprazolam and the XR alprazolam to be superior to placebo. For the patient-rated scale…p < 0.001 on all weeks’. The remainder of the Results section presented several additional statistically significant results. The Comment section began: ‘During this 6-week trial, both the CT and the XR alprazolam patients had significantly more improved ratings than the placebo group on most of the outcome variables’.
Study 2 (2000/0369)
Results per FDA
Study 2 was conducted at three centers in the United States from June 1988 to January 1990. In this study, 200 male and female adult outpatients with panic disorder were treated, 104 patients with alprazolam XR and 96 with placebo (M-17 and -53/56). Of the five alprazolam XR trials, Study 2 was the only study the FDA deemed clearly positive: ‘All 7 co-primary efficacy endpoints were statistically significantly superior in the group treated with alprazolam XR compared to the group treated with placebo’ (M-47/56).
That a single positive study was deemed sufficient for alprazolam XR's approval apparently reflects a change from earlier practice: ‘This study was first submitted [~33 characters redacted], but the NDA [new drug application] was [~31 characters redacted] due to the fact that there was only one positive study and per guideline at that time required two positive studies for approval of the submission’ (S-5/26).
The FDA review also reported an irregularity at one of the three study sites: ‘…[site] inspection findings were as follows: ‘28 of 37 subject records were not available for review, as Dr Rosenthal had destroyed these records in March 1999…validity of the data reported could not be verified…it is recommended that the Review Division should consider excluding all data generated at this site and reanalyzing efficacy data in support of this NDA’’ (M-11/56). After doing so, the FDA review division found, ‘these results do not change the conclusion’ (S-12/26).
Results published
Study 2 was published as positive (Schweizer, Patterson, Rickels, & Rosenthal, 1993). However, the above-mentioned investigator was included as a co-author, as were the data from his site.
Study 3 (2002/0002)
Results per FDA
Study 3 was conducted at 15 centers in the United States from June 1990 to October 1991. In this study, 231 male and female adult outpatients with panic disorder were treated, 155 with alprazolam XR and 76 with placebo (M-17 and -53/56). According to the FDA, the primary analyses involved comparisons for primary endpoints at week eight for all patients, using the last-observation-carried-forward (LOCF) method of handling dropouts (Committee for Proprietary Medicinal Products, 2001; Leon et al., 2006). As reported in Table 1 and the FDA review (S-24/26), the results were nonsignificant on 5 of 5 primary outcomes.
Results published
In the corresponding journal publication, Study 3 (Alexander, 1993) was presented as positive, contrary to the FDA report. The first half of the results section (there was no abstract) was an ‘Initial Analysis’, which began, ‘Statistically significant improvements…with both doses of alprazolam XR within the first week of treatment’. The second sentence referred to Table 1, whose title indicated that the results were observed values, that is, obtained via complete case analysis, which, unlike the (primary) LOCF method, violates the intention-to-treat principle by simply omitting the data from patients who drop out (Committee for Proprietary Medicinal Products, 2001; Leon et al., 2006). Table 1 displayed Week 1 results of p = 0.014 and p = 0.0005 for the two doses, respectively. This table displayed p values only for one other time point, Week 6, both with p < 0.05. Results for Week 8, the primary analysis time point (see FDA section above), were not shown. Later, in the fourth paragraph of the Results section, LOCF results were mentioned, but two of its three sentences reported on the relative performance of the two active dose groups; its last sentence acknowledged that ‘Similar reductions were seen in the placebo group…’ but did not report the corresponding nonsignificant p values.
The second half of the Results section was entitled ‘Analysis Excluding High Placebo-Response Centers’, for which, judging from the FDA review, there was no prespecified plan. This section's first sentence reported, ‘excluding four [of 15] centers with unusually high placebo responses, revealed clear differences between alprazolam-XR and placebo at every time point and on all efficacy measures’. This section devoted two tables and two figures to these post hoc results. Table 2 displayed ten p values (2 doses × 5 time points), including seven with p < 0.05; Table 3 displayed 16 p values, including 14 with p < 0.05.
The Conclusion section began, ‘The results of this study indicate that both 4 and 6 mg alprazolam-XR are effective in reducing panic attacks and producing clinical improvements in patients with panic disorder’. The Conclusion section ended, ‘…active treatment with alprazolam-XR clearly demonstrates that twice-daily dosing with an extended-release preparation is an effective treatment for panic disorder’.
Study 4 (2002/0003)
Results per FDA
Study 4 was conducted at 15 centers in the United States from May 1990 to October 1991. In this study, 261 male and female adult outpatients with panic disorder were treated, 178 patients with alprazolam XR and 83 with placebo (M-17 and -53/56). As reflected in Table 1, the FDA review (S-21/26) reported nonsignificant results on all five primary endpoints.
Results (un)published
There was no stand-alone paper for Study 4; it was mentioned briefly in a review publication (Stahl, 1993) also covering Studies 1–3, which were separately published in stand-alone format. Consequently, Study 4 was considered not transparently published.
Study 5 (2002/0032)
Results per FDA
Study 5 was conducted at one center in the United States from 1994 to 1995. In this study, 50 male and female outpatients with panic disorder with or without agoraphobia were treated, 23 patients with alprazolam XR and 24 with placebo (M-17 and -53/56). Patients in both treatment groups also received cognitive-behavioral therapy. Beyond sample sizes and the fact that the results were nonsignificant, the FDA provided no statistical results. Referring to this plus three other studies, the FDA medical review (M-15/56) stated, ‘[A]s these 4 studies failed on face, the efficacy data were not reviewed in detail. Furthermore, the Division had required that the sponsor submit data from only one positive well-controlled trial for the purpose of establishing efficacy for this NDA [new drug application]’.
Results (un)published
Study 5 was neither published as a stand-alone article nor acknowledged in the above-mentioned review article (Stahl, 1993).
Summary comparison of FDA v. journal reporting for the five studies
Overall, in the five studies, a total of 950 male and female adult outpatients with panic disorder were treated, 531 patients with alprazolam XR, 70 with alprazolam IR, and 349 with placebo (M-17 and -53/56). As shown in Fig. 1, the FDA review indicated five efficacy trials of alprazolam XR, only one of which (Study 2) was positive, with the remaining four trials ‘failed on face’. These four trials were not transparently published: two (Studies 1 and 3) were published as positive (Alexander, 1993; Pecknold et al., 1994), contrary to the FDA's conclusion, while the other two (Studies 4 and 5) were not published.
Meta-analyses comparing published v. FDA data
Table 2 shows, for each study, an FDA-based effect size (ES) value for each primary outcome and a composite ES. In Fig. 2, each study's FDA-based composite ES is compared to the journal-based ES. Across the studies, the overall FDA-based composite ES was 0.33 (95% CI 0.06–0.60) while the journal-based ES was 0.47 (95% CI 0.29–0.65), representing an increase of 0.14 or 42%.
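The 42% figure is the absolute difference expressed relative to the FDA-based estimate. The short Python check below reproduces that arithmetic and, as an aside, backs out the standard error implied by each confidence interval; the helper name and the normal-approximation multiplier of 1.96 are assumptions for illustration.

```python
fda_es = 0.33      # overall FDA-based composite ES
journal_es = 0.47  # overall journal-based ES

abs_increase = journal_es - fda_es
pct_increase = 100 * abs_increase / fda_es  # relative to the FDA-based value


def se_from_ci(lower, upper, z=1.96):
    """Standard error implied by a symmetric 95% CI (normal approximation)."""
    return (upper - lower) / (2 * z)


fda_se = se_from_ci(0.06, 0.60)
journal_se = se_from_ci(0.29, 0.65)
```

The wider FDA-based interval reflects a larger implied standard error than the journal-based one.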
Gray shading indicates outcomes with nonsignificant results. Columns represent results for the five primary outcomes and a composite ES across outcomes; rows represent the various studies. CGI refers to Clinical Global Impression rating scale.
Discussion
Key findings
We found that alprazolam XR may be less effective than the published literature would suggest. According to the published literature, every trial of alprazolam XR found it to be effective. By contrast, according to the FDA, only one of five trials was positive. Consequently, the effect size derived from FDA data (0.33) was substantially lower than the effect size derived from the published literature (0.47). The FDA-based effect size was similar to the FDA-based effect size reported previously for antidepressants in the treatment of panic disorder (0.28) (Roest et al., 2015). These findings arguably alter the risk-benefit ratio for the prescribing of this benzodiazepine, especially in the light of recent attention to benzodiazepines' contribution to the opioid crisis (Park, Saitz, Ganoczy, Ilgen, & Bohnert, 2015) and the availability of safer alternatives.
Strengths and Limitations
Prior studies have compared trial data from the FDA v. journal articles for other drug classes (see Introduction). This is the first study, to our knowledge, to quantify the effect of selective publication on the apparent efficacy of a benzodiazepine, and it arguably provides a more realistic estimate of its efficacy.
A strength of this study is its use of FDA reviews to discover unpublished clinical trial data. Unfortunately, this resource is seldom utilized in Cochrane systematic reviews (Schroll, Bero, & Gøtzsche, 2013). Indeed, two Cochrane reviews on various benzodiazepines (Bighelli et al., 2016; Breilmann et al., 2019) found only one unpublished alprazolam trial (GSK), but here alprazolam served as an active control v. the SSRI paroxetine, for which GSK was seeking FDA approval for panic disorder. (Our group previously reported this trial's paroxetine data using the FDA review for that drug-indication combination (Roest et al., 2015). In that trial, neither drug demonstrated superiority to placebo.) The latter Cochrane review (Breilmann et al., 2019), more relevant because it focused on benzodiazepines v. placebo, missed the two unpublished trials we found (Studies 4 and 5) plus one of the published trials (Study 3) (Alexander, 1993). As the authors acknowledged, they were unable to rule out overestimation of treatment effects due to sponsorship bias; by contrast, FDA statistical reviewers, with access to patient-level data and the original protocol, conduct independent analyses (Turner, 2004).
This study has several limitations. FDA reviews are limited to premarketing trials, so postmarketing trials are excluded. However, we know of no reason to suspect that drug performance should change after v. before marketing approval; and postmarketing trials can be susceptible to sponsorship-based reporting biases (Heres et al., 2006). This study is restricted to one benzodiazepine for one anxiety-related indication, and it is restricted to issues of efficacy, as opposed to ‘real-world’ effectiveness (Singal, Higgins, & Waljee, 2014). Examination of safety/harm outcomes, which would impact the overall risk-benefit ratio, was beyond the scope of this study. We lacked summary data on one small nonsignificant trial. Lastly, it is possible that trials could have been misclassified as unpublished; however, given our literature search methods, it seems unlikely such publications would be discoverable by most interested clinicians.
Implication
This study brings to light unpublished trial data and provides a more balanced and realistic view of the efficacy of alprazolam XR, compared to what has been previously reported. It is unknown whether the discrepancy between FDA and journal trial data is greater or smaller for other benzodiazepines. This study adds to the literature on publication bias in clinical trials for drugs for psychiatric conditions, including major depressive disorder (Melander, Ahlqvist-Rastad, Meijer, & Beermann, 2003; Turner et al., 2008), bipolar disorder (Ghaemi, 2009), anxiety disorders (Roest et al., 2015), and schizophrenia (Heres et al., 2006; Turner et al., 2012).
While one might expect that trial registration would prevent the type of reporting bias documented here, ClinicalTrials.gov did not come into existence until the year 2000, many years after all other currently marketed benzodiazepines were approved. If alprazolam XR were approved in the current era, its trials might well have been reported more transparently. Indeed, an increase in transparent reporting has been found with trials involving, for example, antidepressants (Turner et al., 2022). This is arguably due to policy changes, such as the advent of ClinicalTrials.gov and its later augmentation with required results reporting (Zarin, Tse, Williams, & Carr, 2016), increased awareness of unpublished negative studies (Eyding et al., 2010; Jureidini, McHenry, & Mansfield, 2008; Melander et al., 2003; Turner et al., 2008; Whittington et al., 2004), and recent incentives to increase transparency (Miller, Wilenzick, Ritcey, Ross, & Mello, 2017).
Our findings are consistent with prior studies that have compared clinical trial data in journal publications with those in FDA reviews. Selective reporting of clinical trials undermines the integrity of the evidence base and deprives clinicians, patients, researchers, and policymakers of accurate data critical for decision-making. Our study highlights the value of regulatory data to the public health.
Data
All data used in this paper – extracted from journal articles and FDA review documents (Drugs@FDA, https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm) – are publicly accessible.
Funding statement
This study was not funded.
Competing interests
RA has no competing interests to declare. ET declares that he formerly worked at the FDA as a clinical reviewer and has no other competing interests to declare.
Ethical standards
Approval by an ethics board was not required, as only summary data in the public domain were used.