According to treatment guidelines,Reference Bandelow, Zohar, Hollander, Kasper and Moller1–3 psychological therapies and psychopharmacological drugs have shown efficacy for the treatment of the three major anxiety disorders: panic disorder with or without agoraphobia (PDA), generalised anxiety disorder (GAD) and social anxiety disorder (SAD). Among psychotherapies, cognitive–behavioural therapy (CBT) is the method studied most, but some trials have also investigated applied relaxation, psychodynamic therapy, interpersonal therapy, mindfulness meditation and therapies conducted via the internet. Medications used for anxiety disorders include selective serotonin reuptake inhibitors, serotonin–noradrenaline reuptake inhibitors, pregabalin, tricyclic antidepressants, benzodiazepines and others.Reference Bandelow, Zohar, Hollander, Kasper and Moller1
In a recent meta-analysis of 234 acute treatment studies for anxiety disorders involving 37 333 patients, we showed that medications were associated with significantly higher average pre-post effect sizes (Cohen's d = 2.02) than psychotherapies (Cohen's d = 1.22).Reference Bandelow, Reitt, Rover, Michaelis, Gorlich and Wedekind4 We did not find evidence that this result was influenced by heterogeneity, publication bias or allegiance effects.
Enduring effects
It is a common opinion that patients treated with drugs show immediate relapse after stopping medication, whereas gains of psychological therapies are maintained for months or years after treatment termination. This would offer psychological therapies a considerable advantage over drug treatment, and in some guidelines (e.g. the UK National Institute for Health and Care Excellence guidelines5, 6), CBT is preferred over medication because of the assumed longer duration of effect. However, there have always been doubts as to whether it is an oversimplification to assume that the differences in relapse rates between drug treatment and psychotherapy are substantial. In naturalistic studies following up patients with anxiety, considerable relapse rates were found years after CBT treatment. For example, in an analysis of 8 controlled studies of CBT for anxiety disorders, 48% of patients were still symptomatic after 2–14 years of follow-up.Reference Durham, Chambers, Power, Sharp, Macdonald and Major7 On the other hand, in relapse prevention studiesReference Bandelow, Zohar, Hollander, Kasper and Moller1, Reference Donovan, Glue, Kolluri and Emir8 in which treatment responders to open drug treatment for 8–12 weeks were re-randomised to long-term treatment (24–52 weeks) with the same drug or placebo, the relapse rates of patients randomised to placebo ranged from 8 to 56%.
Available follow-up studies directly comparing the durability of CBT with drug therapy did not show clearly longer-lasting effects of CBT: in only oneReference Marks, Swinson, Basoglu, Kuch, Noshirvani and O'Sullivan9 of five studies of PDA could a longer-lasting effect of CBT be demonstrated.Reference Loerch, Graf-Morgenstern, Hautzinger, Schlegel, Hain and Sandmann10–Reference Mavissakalian, Michelson and Dealy13 Likewise, in SAD, only twoReference Clark, Ehlers, McManus, Hackmann, Fennell and Campbell14, Reference Nordahl, Vogel, Morken, Stiles, Sandvik and Wells15 of four studies have shown longer-lasting effects for CBT than for medication, whereas two did not.Reference Liebowitz, Heimberg, Schneier, Hope, Davies and Holt16, Reference Haug, Blomhoff, Hellstrom, Holme, Humble and Madsbu17
Lack of controlled follow-up studies
There is a lack of controlled follow-up studies for psychotherapy, as 70–75% of randomised controlled studies for anxiety disorders use a waitlist as a control condition during the acute treatment period.Reference Bandelow, Reitt, Rover, Michaelis, Gorlich and Wedekind4, Reference Patterson, Boyle, Kivlenieks and Van Ameringen18 For follow-up assessments, the waitlist patients can no longer be used as a control group because they are assigned to active treatment after their waiting period. When no deterioration is observed during follow-up after treatment discontinuation, we have to disentangle whether this is attributable to a true enduring effect of the treatment or simply to spontaneous remission or regression to the mean.
To our knowledge, no previous meta-analysis has studied whether psychological therapies have longer-lasting effects than control conditions. Therefore, we used studies involving a drug treatment arm and studies using a pill placebo or a psychological (attention) placebo as a control to see if there was a significantly larger decline of effect size after termination of drug treatment compared with psychological therapies. A ‘psychological placebo’ is defined as a conversation of the same length as a psychotherapy session, in which study staff who do not necessarily have psychotherapeutic training establish a supportive, listening and nondirective relationship without applying specific techniques.
The meta-analytic procedure has the advantage that all of the many available follow-up studies can be included in the analysis and not only the few head-to-head comparisons of psychotherapy and drug or placebo conditions.
Method
Selection of studies
The present study extends a comprehensive meta-analysis on efficacy of treatments for anxiety disorders in short-term studies to follow-up studies.Reference Bandelow, Reitt, Rover, Michaelis, Gorlich and Wedekind4
Randomised treatment studies from 1980 to 2016 for PDA, GAD and SAD were found by electronic and hand search. Study quality was assessed with the Scottish Intercollegiate Guidelines Network statement.19 Reasons for exclusion were missing information making it impossible to compute effect sizes, a sample size of any of the treatment arms at inclusion less than ten, reports that were restricted to subsamples (e.g. only elderly patients) and studies that included children and/or adolescents. We did not include open studies because these may have been influenced by expectation effects. Drugs were included that had been shown to be effective in randomised controlled studies and are licensed in at least some countries for the treatment of anxiety disorders.Reference Bandelow, Zohar, Hollander, Kasper and Moller1 Psychological therapies were categorised as follows: ‘CBT’ included individual or group CBT or exposure techniques or a combination of both, as well as CBT treatments conducted via the internet; ‘other psychotherapies’ comprised psychodynamic therapy (n = 5), interpersonal therapy (n = 2), relaxation (n = 16), mindfulness therapy (n = 2) and bibliotherapy (n = 7). Drugs used in the studies were alprazolam, citalopram, clomipramine, fluoxetine, fluvoxamine, imipramine, lorazepam, moclobemide, paroxetine, phenelzine and sertraline. Control conditions included pill placebo (n = 7 studies) and ‘psychological placebo’ (n = 8), and a combination of both (n = 2). None of the included control conditions involved treatment as usual. Because of the small sample sizes in the different placebo groups, we did not analyse the placebo conditions separately. In the psychological placebo studies, the number and length of the sessions were the same as in the experimental conditions, except in two studies, in which patients in the psychological placebo condition had fewer sessions than the psychotherapy group.
From the original database of 234 eligible studies used in the meta-analysis, 91 studies with 180 study arms were chosen that had investigated psychological therapies, medications or a psychological placebo and had included at least one follow-up assessment. In addition, two new follow-up studies that appeared since 1 October 2013 were added.Reference Nordahl, Vogel, Morken, Stiles, Sandvik and Wells15, Reference Leichsenring, Salzer, Beutel, Herpertz, Hiller and Hoyer20
A total of 93 studies with 185 study arms were included in the analysis (CBT, n = 120 study arms; other psychotherapies, n = 32 study arms; medications, n = 16 study arms and placebo conditions, n = 17 study arms). The selection of studies is displayed in a Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement (Fig. 1). All studies are listed in online Table DS3.
For every follow-up week, we pooled all available studies for the three anxiety disorders for this time point. We only included studies with up to 24 months duration, as there was only one evaluable study with one CBT arm using a longer follow-up period (36 months), and no further eligible studies with longer follow-up intervals.
Meta-analytical procedure
Outcome measures
Three reviewers (B.B., Y.G., and A.S.) independently extracted all data. Any discrepancy was resolved by consensus. To limit heterogeneity and to achieve maximum comparability, we preferably used the most commonly applied scales: the Hamilton Rating Scale for AnxietyReference Hamilton21 for PDA and GAD and the Liebowitz Social Anxiety ScaleReference Liebowitz22 for SAD. If these were not available for the follow-up time points, we chose other scales following an algorithm described in Bandelow et al.Reference Bandelow, Reitt, Rover, Michaelis, Gorlich and Wedekind4
Effect sizes (Cohen's d) were calculated from the differences between baseline and end-point or follow-up time point by subtracting the post-treatment mean from the pre-treatment mean and dividing the difference by the pre-treatment s.d. of the measure.Reference Dunlap, Cortina, Vaslow and Burke23 If there were more than two treatment groups in a study, a pooled baseline s.d. based on all treatment arms in the study was used.
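The calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the example scores are hypothetical, and the pooling formula (variances weighted by n − 1) is an assumption, as the text does not state how the baseline s.d. was pooled.

```python
from math import sqrt


def pooled_baseline_sd(sds, ns):
    """Pool baseline s.d. across study arms.

    Assumption: each arm's variance is weighted by (n - 1); the text
    does not specify the pooling formula that was actually used.
    """
    numerator = sum((n - 1) * sd ** 2 for sd, n in zip(sds, ns))
    denominator = sum(n - 1 for n in ns)
    return sqrt(numerator / denominator)


def pre_post_d(pre_mean, post_mean, baseline_sd):
    """Pre-post Cohen's d: (pre-treatment mean - later mean) / pre-treatment s.d."""
    return (pre_mean - post_mean) / baseline_sd


# Hypothetical arm: scale score falls from 25.0 at baseline to 12.0 at follow-up,
# with a baseline s.d. of 6.5.
d_example = pre_post_d(25.0, 12.0, 6.5)
```

Because the denominator is always the pre-treatment s.d., the end-point and every follow-up effect size of an arm share the same scale, so a change in d over follow-up reflects only a change in the mean score.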
Whenever available, intention-to-treat data were used. Where a study only reported data from dichotomous outcomes (the proportion of responders to treatment, e.g. defined by a 50% reduction on the Hamilton Anxiety Scale), it was assumed that participants who ceased to engage in the study from whatever group had an unfavourable outcome. Odds ratios were transformed to Cohen's d.Reference Borenstein, Hedges, Higgins and Rothstein24
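The conversion from odds ratios to Cohen's d mentioned above is commonly done with the logit method given in Borenstein et al., d = ln(OR) × √3/π. The sketch below assumes this is the conversion that was applied; the function names are illustrative only.

```python
from math import log, pi, sqrt


def odds_ratio_to_d(odds_ratio):
    """Convert an odds ratio to Cohen's d via the logit method: d = ln(OR) * sqrt(3) / pi."""
    return log(odds_ratio) * sqrt(3) / pi


def log_or_variance_to_d_variance(var_log_or):
    """Convert the variance of ln(OR) to the variance of d: V_d = V_lnOR * 3 / pi^2."""
    return var_log_or * 3 / pi ** 2


# An odds ratio of 1 (no difference) maps to d = 0.
d_null = odds_ratio_to_d(1.0)
```

The method rests on the assumption that the dichotomous outcome (e.g. response versus non-response) reflects a cut-off on an underlying continuous, logistically distributed score.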
We calculated the effect sizes for all three anxiety disorders together, as effect size did not differ significantly between the different disorders in our first study.Reference Bandelow, Reitt, Rover, Michaelis, Gorlich and Wedekind4
Analysis
Meta-analyses were performed with Comprehensive Meta-Analysis Version 3.0 (Biostat, USA). IBM SPSS Statistics 24 was used to conduct further analyses. Because most studies differed considerably in scheduling follow-up assessments, including intervals between 4 and 104 weeks (see online Table DS2 for an overview), we interpolated missing data linearly between the nearest empirical follow-up assessments by calculating the mean of the available follow-up scores before and after the interpolated follow-up time point. If no further follow-up scores were available, we used the last-observation-carried-forward (LOCF) method. We preferred this combined model, as solely using the LOCF method can lead to an inappropriate bias.Reference Little, D'Agostino, Cohen, Dickersin, Emerson and Farrar25 We chose a general linear model for repeated measures, as this fits best to analyse the variance for within-subject measurements (follow-up assessments) and for between-subject factors (treatments) within the same model. Follow-up effect sizes were added as a 13-stage within-subject factor, also including the interpolated and LOCF scores. Treatment arms were included as a four-stage between-subject factor. For multiple comparisons, P values were corrected by the Bonferroni method (significance was set at P < 0.05, two-tailed).
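The handling of missing follow-up values can be illustrated with a short sketch. It is not the authors' procedure as implemented in SPSS: the function name, the example effect sizes and the grid of time points are hypothetical (the 13 time points of the within-subject factor are not listed in the text), and the sketch uses time-weighted linear interpolation, which equals the simple mean of the two neighbouring scores when the missing time point lies midway between them.

```python
def fill_follow_up(observed, grid):
    """Fill effect sizes on a common grid of follow-up weeks.

    observed: dict mapping week -> Cohen's d actually reported by a study arm
    grid:     weeks at which a value is needed
    Missing points between two observations are interpolated linearly;
    points after the last observation are carried forward (LOCF).
    """
    weeks = sorted(observed)
    filled = {}
    for t in grid:
        if t in observed:
            filled[t] = observed[t]
            continue
        before = [w for w in weeks if w < t]
        after = [w for w in weeks if w > t]
        if before and after:
            w0, w1 = before[-1], after[0]
            fraction = (t - w0) / (w1 - w0)
            filled[t] = observed[w0] + fraction * (observed[w1] - observed[w0])
        elif before:
            filled[t] = observed[before[-1]]  # LOCF
        else:
            filled[t] = None  # nothing observed earlier; left missing
    return filled


# Hypothetical arm: end-point (week 0) plus follow-ups at weeks 26 and 52.
observed_d = {0: 1.2, 26: 1.4, 52: 1.3}
filled = fill_follow_up(observed_d, [0, 13, 26, 39, 52, 78, 104])
```

In this example the values at weeks 13 and 39 are interpolated (about 1.30 and 1.35), whereas weeks 78 and 104 repeat the week-52 value. Carried-forward values cannot show deterioration by construction, which is one reason to interpolate wherever a later observation exists.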
Heterogeneity was assessed with the Q statistic and the I² metric with 95% confidence intervals.Reference Ioannidis, Patsopoulos and Evangelou26 Because moderate (I² > 50%) to high (I² > 75%) heterogeneityReference Higgins, Thompson, Deeks and Altman27 was found for most comparisons (online Table DS2), the random effects model in which studies are weighted based on the inverse variances and an additional variance component reflecting the observed heterogeneity was applied in all analyses. In general, including a random effect will lead to more conservative results than the fixed effects model.
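To make the weighting scheme concrete, the following sketch computes Q, I² and a random effects pooled estimate. It assumes the DerSimonian–Laird estimator for the between-study variance (τ²); the text does not name the estimator, and the function name and example values are hypothetical. Confidence intervals for I² are omitted.

```python
def random_effects(effects, variances):
    """Inverse-variance pooling with a DerSimonian-Laird between-study variance.

    effects:   per-arm effect sizes (e.g. Cohen's d)
    variances: their sampling variances
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed effects mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # I^2: share of total variation attributable to heterogeneity, in per cent.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of tau^2 (assumed estimator).
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau^2 to every study's variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)

    return {
        "Q": q,
        "I2": i2,
        "tau2": tau2,
        "pooled": pooled,
        "se_random": (1.0 / sum(re_weights)) ** 0.5,
        "se_fixed": (1.0 / sum_w) ** 0.5,
    }


# Two hypothetical arms with clearly different effect sizes.
result = random_effects([1.0, 2.0], [0.1, 0.1])
```

Whenever τ² is greater than zero, the random effects standard error exceeds the fixed effects one, giving wider confidence intervals; this is the sense in which the random effects model is the more conservative choice.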
The analysis of publication bias has been described in a previous publication.Reference Bandelow, Reitt, Rover, Michaelis, Gorlich and Wedekind4 Possible allegiance effects for all study arms were analysed by two independent raters and were assumed when a medication study was sponsored by the current manufacturer of the investigated drug, when authors disclosed financial support from the manufacturer or when one of the authors was a staff member of the manufacturer. For psychological treatments, allegiance effects were assumed when authors had developed the treatment, contributed to an aetiological model or published manuals for the treatment.
Results
After termination of treatment, we observed no significant deterioration in any of the four conditions during the follow-up period (all Bonferroni-corrected pairwise comparisons were not significant; see Table 1 and Fig. 2). Raw scores for Cohen's d for all studies at the different follow-up time points are shown in online Table DS2.
Table 1 and Fig. 2 (not reproduced here): CBT, cognitive–behavioural therapy; d, Cohen's d; n, number of study arms. Missing values were interpolated.
Post hoc analyses revealed that patients in the CBT group even showed a significant improvement over time for all follow-ups from 26 weeks onwards (P values between 0.05 and 0.003), as measured from end-point. However, this improvement was moderate.
The general linear model for repeated measures showed a significant difference between the four treatment arms (F (3, 181) = 3.268; P = 0.023; see Fig. 2). However, post hoc analyses revealed that only the difference between CBT and placebo was significant (P = 0.021). The medication arms did not differ significantly from the CBT and other psychotherapies arms, demonstrating that patients who stopped taking a drug showed the same durable improvement as patients who stopped psychotherapy.
Substantial heterogeneity was found in all four conditions, with moderate to high I² values (online Table DS2), indicating that the studies did not estimate a common population mean. For 40 (26.3%) of the psychotherapy arms and 5 (31.3%) of the drug arms, allegiance effects were assumed.
Discussion
In the present meta-analysis, we could demonstrate that patients treated with psychological therapies (CBT and other psychotherapies) maintained their gains for up to 2 years after treatment was stopped. In the CBT group, patients even improved relative to their values at treatment termination.
The enduring effect of drug treatment was not significantly smaller than that obtained with psychotherapy. Our study casts doubt on the widespread assumption that only psychological treatments have enduring effects and that gains achieved with medications are lost soon after they are stopped. However, the good news for patients with anxiety disorders is that the chance of deterioration within 2 years after treatment termination is low, and independent of previous treatment.
We know from relapse prevention studies that drugs may have long-lasting effects. These enduring effects may have a neurobiological background. For example, serotonergic drugs may exert sustaining effects on serotonin neurotransmitter systems in the brain, which last longer than the actual treatment period. However, expectancy effects may also take effect, as patients who have experienced improvement with a drug know that they can restart the drug at any time in case a relapse occurs. Many patients with anxiety disorders have concerns that they ‘will have to take the drug forever’, once a medication has been started. In clinical practice, however, many patients take their medication only for some months and stop the drug soon after remission has occurred, although guidelines recommend a treatment duration of 12Reference Bandelow, Zohar, Hollander, Kasper and Moller1 or 6Reference Baldwin, Anderson, Nutt, Allgulander, Bandelow and den Boer2 months after remission.
Also, the effect sizes in the placebo conditions did not show a significant decline during the follow-up period. However, in contrast to the other psychotherapies, CBT showed significantly higher effect sizes than the placebo arms, which may have been a result of the large number of study arms in this condition. No differences were found between CBT and the other psychotherapies; however, as the latter group comprised various forms of psychotherapy, no conclusions can be drawn from this finding.
Our results will have to be reconciled with relapse prevention drug studies, which mostly show some deterioration in the drug arm and a significantly greater deterioration in the placebo arms. Also, psychotherapy is associated with substantial relapse rates, as has been shown in naturalistic studies. The most probable explanation for this lack of deterioration during the treatment-free period in our meta-analysis is that the long-lasting effects seen in all four conditions may be superimposed by effects of spontaneous remission and/or regression to the mean. Moreover, follow-up studies are hampered by methodological problems, i.e. high attrition rates and many confounding factors. For example, it is common for clinical trial protocols to require participants to refrain from involvement in any other treatments during the active treatment period. However, in follow-up studies, it is almost impossible to control what alternative treatments patients utilise after stopping their original treatment. In the treatment-free period, patients who re-experience symptoms of their disorder may start new psychological therapies or take medications, most probably the ones that they have found to be helpful previously. Thus, possible differences between the experimental and control conditions are levelled out.
In contrast to the original meta-analysis using the full data-set of 234 randomised controlled trials, which showed that effect sizes for medications were almost twice as high at post-treatment as for psychotherapies,Reference Bandelow, Reitt, Rover, Michaelis, Gorlich and Wedekind4 the effect size of medications did not differ significantly from the psychotherapy effect size before the follow-up period (post-treatment) in the present study. This may be because we could only include 16 drug trials that had follow-up assessments, and these studies did not use the medications with the highest effect sizes (e.g. imipramine, sertraline, phenelzine, fluoxetine, alprazolam and lorazepam). Generally, the end-point effect sizes of all treatments analysed in this study were lower than the effect size achieved with some drugs in the analysis of the acute studies, in which an effect size of d = 2.02 was calculated for all medications together.
Our study has limitations. There were only a few follow-up studies using medication or placebo conditions. Substantial heterogeneity was found. Follow-up studies with the medications that showed higher effect sizes in the acute treatment period are lacking. Therefore, our results should be interpreted with caution. Patients on medications should be monitored for a relapse, and treatment should not be terminated too early. There were substantial differences in the number of available follow-up assessments at the various time points and in the intervals between these time points. Therefore, we had to deal with missing values in the statistical analysis. We pooled studies from all three major anxiety disorders because in the analysis of the acute studies we did not find significant differences between these disorders with respect to response. However, some anxiety disorders may be more inclined to respond to a certain treatment than others, and some drugs may be more effective than others.
In summary, uncontrolled studies that report stable improvements after a treatment-free follow-up period may overestimate the ‘durability’ of psychotherapies, as these may be caused by unspecific effects. The often-cited advantage of psychotherapy over pharmacotherapy for anxiety disorders – a longer-lasting improvement – could not be confirmed in our study. Future follow-up studies should use a protocol that monitors confounding factors, e.g. additional, unscheduled medications or psychological therapies during the follow-up period.
Supplementary material
Supplementary material is available online at https://doi.org/10.1192/bjp.2018.49.