Introduction
Major depressive disorder (MDD) is not only highly prevalent (Alonso et al. 2004; Waraich et al. 2004; Kessler et al. 2005; Wittchen et al. 2011; Rozental et al. 2014) but also associated with substantial impairment (Ustün et al. 2004; Saarni et al. 2007) and economic costs (Berto et al. 2000; Greenberg & Birnbaum, 2005; Smit et al. 2006).
Psychological treatments have been shown to be effective in the treatment of depression (Cuijpers et al. 2008a, 2014). However, not all patients benefit from these treatments, and many affected individuals remain untreated (Kohn et al. 2004; Wittchen et al. 2011).
Internet-based guided self-help interventions might be an acceptable (Cavanagh et al. 2011), effective (Johansson & Andersson, 2012; Richards & Richardson, 2012), and cost-effective (Hedman et al. 2012) treatment alternative that could provide treatment to individuals not reached so far (Ebert et al. 2015a). While researchers have consistently demonstrated positive effects of Internet-based guided self-help for depression, both for adults (Richards & Richardson, 2012) and youths (Ebert et al. 2015b), little is known about potential negative effects of Internet-based psychological treatments for depression (Boettcher et al. 2014; Ebert et al. 2014a; Rozental et al. 2014). This is not unique to Internet treatments, as only limited data are available regarding negative effects of psychotherapy in general (Barlow, 2010; Emmelkamp et al. 2014).
While in pharmacological outcome research the standard is to always evaluate both risks and benefits of an intervention (Willan et al. 1997; Curtin & Schulz, 2011), psychotherapy outcome research has so far mostly focused on treatment benefits only (Lilienfeld, 2007; Dimidjian & Hollon, 2010).
Among the different potential negative effects of psychotherapy, one particularly unfavorable outcome is deterioration of symptoms as a consequence of treatment. Evidence from uncontrolled psychotherapy outcome studies indicates that a substantial number of patients experience symptom deterioration while in psychotherapy. The proportion of patients with symptom deterioration in these uncontrolled studies ranges from 3% to 14% (Smith & Glass, 1977; Mohr, 1995; Hansen et al. 2006; Lambert et al. 2006). This phenomenon, ‘the deterioration effect’, was noted even in the early years of psychotherapy research (Bergin, 1966; Garfield et al. 1971).
With regard to Internet-based self-help treatments, it could be argued that such interventions may be associated with an even greater risk of symptom deterioration than face-to-face approaches. For example, for some individuals a self-help approach might not be intense enough (Kiluk et al. 2011). Further, individuals might be overstrained by trying to apply psychotherapeutic self-help strategies, and some therapeutic techniques could be implemented inappropriately by participants without direct guidance from a therapist. These problems could further aggravate hopelessness in severely affected individuals. It could also be argued that it is much easier to observe and react to early signs of deterioration in face-to-face treatment than via the Internet. Finally, self-help treatments could delay help-seeking, which could result in further deterioration of symptoms if the initial low-intensity self-help treatment proves insufficient.
Although potential negative effects of both Internet-based treatments (Kiluk et al. 2011; Boettcher et al. 2014; Rozental et al. 2014, 2015; Bengtsson et al. 2015) and face-to-face psychotherapy (Lilienfeld, 2007; Barlow, 2010; Dimidjian & Hollon, 2010; Linden, 2013; Ladwig et al. 2014) have recently gained attention in the literature, empirical evidence on potential negative effects drawn from randomized controlled trials (RCTs) is still almost absent (Lilienfeld, 2007).
RCTs are the most reasonable approach to determine whether a treatment is differentially associated with a deterioration in functioning or an increase in symptomatology (Dimidjian & Hollon, 2010). If a randomized trial shows that participants in the active condition show greater deterioration in functioning than those in the non-treatment control condition, one can confidently conclude that the deterioration was a consequence of therapy (Lilienfeld, 2007). However, given that the number of people deteriorating during treatment is expected to be small, randomized trials are mostly underpowered to examine this research question adequately.
Moreover, RCTs evaluating psychological treatments seldom report the number of patients who deteriorated during treatment; it is therefore not possible to investigate mean deterioration effects found in randomized trials and their moderators using traditional meta-analytical approaches. Consequently, to the best of our knowledge, there is no meta-analytical review on symptom deterioration and its moderators within RCTs evaluating Internet-based treatments or psychological interventions for depression in general. Given the increasing popularity of Internet-based treatments in healthcare systems worldwide (Andersson et al. 2014), there is a pressing need to evaluate potential deterioration effects of Internet-based treatments.
Individual participant data meta-analysis (IPDMA) can overcome some of the limitations of conventional meta-analysis at the study level (Clarke, 2005; Riley et al. 2010). By collecting and pooling the primary data of individual trials, analyses can be conducted that were not reported in the original studies. Furthermore, trials designed to detect overall treatment effects have limited power to detect treatment × subgroup interactions (Brookes et al. 2004). By combining the primary data of multiple trials using an IPDMA approach, it is possible to obtain a large sample size with sufficient power to examine effects in relevant subgroups and identify moderators of outcome (Cooper & Patall, 2009).
Hence, the present study aims to investigate deterioration rates and moderators of deterioration within randomized trials on Internet-based guided self-help interventions for adult depression, using IPDMA. We also evaluated deterioration rates in a number of subgroups of interest.
Method
Identification and selection of studies
In this study, we included randomized trials in which the effects of an Internet-based guided self-help treatment were compared with a control or comparison group (waiting list, care-as-usual, other) in adults (aged ⩾18 years) with depression (established by diagnostic interview or elevated levels of depressive symptoms based on self-report measures). Studies were excluded if participants were not currently in a depressive episode (e.g. if they were in remission); if the interventions were provided without guidance (i.e. without support from a therapist or other healthcare professional), a restriction intended to increase internal validity and reduce potential heterogeneity; or if the interventions were delivered in a group format or at a location that required the individual to travel to use the programme (e.g. a clinic). Co-morbid general medical or psychiatric disorders were not used as a study exclusion criterion. No language restrictions were applied. Fig. 1 shows the selection process for included studies.
For the identification of potential studies for inclusion, we used a database of 1476 papers on the psychological treatment of depression that has been described in detail elsewhere (Cuijpers et al. 2008b). The underlying searches covered papers published until January 2014; in these searches we examined 14 164 abstracts in PubMed (3638 abstracts), PsycInfo (2824), EMBASE (4682), and the Cochrane Central Register of Controlled Trials (3020). These abstracts were identified by combining terms indicative of psychological treatment and depression (both MeSH terms and text words). Further, the primary studies from 42 meta-analyses of psychological treatment for depression were checked to ensure that no published studies were missed. From the 14 164 abstracts (10 474 after removal of duplicates), 1476 full-text papers were retrieved for possible inclusion in the database.
Data collection, characteristics of included studies and participants
Corresponding authors were contacted for each of the identified papers and asked to provide raw data from their study. Of the 15 published studies identified from the database search, primary data were obtained from 14 (Andersson et al. 2005; van Straten et al. 2008; Warmerdam et al. 2008; Perini et al. 2009; Ruwaard et al. 2009; Vernmark et al. 2010; Titov et al. 2010; Berger et al. 2011; van Bastelaar et al. 2011; Choi et al. 2012; Johansson et al. 2012a, b; Sheeber et al. 2012; Ünlü Ince et al. 2013). Data for one study (Titov et al. 2011) could not be obtained, as the dataset was no longer available to the Titov research team. The study that was not included did not differ from the other studies in terms of design, participants, intervention, or quality. We also asked all authors whether they were aware of other recently completed RCTs that met our inclusion criteria, but were not yet published.
Four more studies were identified by this method, and the authors were all willing to contribute their primary data to this project (Carlbring et al. 2013; Newby et al. 2013; Ebert et al. 2014c; Kleiboer et al. 2015). This process resulted in a dataset with the primary data from 18 RCTs including 2079 cases. These 18 randomized controlled studies included 21 comparisons between an Internet-based guided self-help group v. control condition from baseline to post-test, five additional comparisons from baseline to follow-up I (1–4 months, mean = 2.44, s.d. = 1.09, range 1–4, n = 737 participants) and four comparisons from baseline to follow-up II (⩾6 months, mean = 6.96, s.d. = 1.7, range 6–10, n = 594 participants). If a study had three conditions there would be two comparisons (i.e. the active treatment condition with each of the two control conditions). Only one study provided data for both follow-up time points (Ebert et al. 2014b). Characteristics of each included study are described in Table 1. Detailed information on sociodemographic and clinical characteristics of study participants can be found in Table 2.
Notes to Table 1: Recr, recruitment population; Comm, community sample; Clin, clinical sample; Depression, confirmation of depression; CBT, cognitive behaviour therapy; ACT, acceptance and commitment therapy; PD, psychodynamic therapy; PST, problem-solving therapy; N mod, number of modules in the intervention; WL, waiting list control; BDI, Beck Depression Inventory; CES-D, Centre for Epidemiological Studies Depression Scale; Qual, risk of bias score; Publ, publication of result (0 = unpublished, 1 = published); SWE, Sweden; SWZ, Switzerland; GER, Germany; NL, The Netherlands; AU, Australia; USA, United States of America.
a In this column a positive or negative sign is given for four quality criteria, respectively: allocation sequence; concealment of allocation to conditions; blinding of assessors; and intention-to-treat analyses.
Notes to Table 2: BDI, Beck Depression Inventory; CES-D, Centre for Epidemiological Studies Depression Scale.
a Percentages refer to those participants of studies who reported data.
b Medication: BDI and CES-D data refer to the imputed values.
c Studies that used the BDI as primary outcome did not assess follow-up I.
Risk of bias assessment
The validity of included studies was assessed using four criteria of the ‘Risk of Bias’ assessment tool developed by the Cochrane Collaboration (Higgins et al. 2011). This tool assesses possible sources of bias in randomized trials, including the adequate generation of allocation sequence; the concealment of allocation to conditions; the prevention of knowledge of the allocated intervention (masking of assessors); and dealing with incomplete outcome data (this was assessed as positive when intention-to-treat (ITT) analyses were conducted, meaning that all randomized participants were included in the analyses). Assessment of the quality was conducted independently by two assessors. Overall risk of bias was low. All studies reported an adequate sequence generation, and allocation to conditions by an independent (third) party. Sixteen studies reported blinding of outcome assessors or used only self-report outcomes, whereas five did not report blinding. All studies were coded as having handled missing data adequately, as ITT analyses were applied and missing data were imputed for all studies using multiple imputation. Sixteen studies met all four quality criteria; the remaining five studies met three of four criteria. Agreement between independent raters (P.C., L.D.) on the risk of bias was 95% across studies.
Missing data
Analyses were conducted according to the ITT principle. Missing data in the raw datasets were handled using multiple imputation (Schafer & Graham, 2002) with a Markov Chain Monte Carlo multivariate imputation algorithm (Missing data module in SPSS v. 20; IBM Corp., USA) and 100 estimations per missing value. For the imputation of the primary outcome, depression severity, we used all complete participant and study characteristics (study identifier, intervention group, baseline depression score, age, sex, recruitment population, confirmation of depression diagnosis method, intervention type, country of study, bias score and, when imputing follow-up, post-intervention depression score). We did not impute baseline predictors.
Calculating deterioration rates
All studies used either the Centre for Epidemiological Studies – Depression Scale (CES-D; Radloff, 1977) or the Beck Depression Inventory (BDI; Beck et al. 1961) as outcome measures. Where multiple depression measures were present, the BDI was coded as the primary outcome measure given that it was the most frequently used outcome measure across studies. For both measures we calculated deterioration and response rates according to the widely used reliable change index (RCI; Jacobson & Truax, 1991). Participants whose scores from pre-treatment to post-treatment had RCIs below the cut point of −1.96 were considered to have experienced deterioration. An RCI of −1.96 is equivalent to an increase in depression of 7.68 points on the CES-D and 7.63 points on the BDI.
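The classification rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the computation performed for this study: the function names are hypothetical, and the standard deviation and reliability passed in are placeholders for whatever norm values are adopted for the CES-D or BDI. The sketch follows the Jacobson & Truax formulation, in which the standard error of the difference is √2 × s.d. × √(1 − reliability), and it signs the index so that an increase in depression scores yields a negative RCI, in line with the −1.96 cut point.

```python
import math


def standard_error_of_difference(sd_baseline, reliability):
    """Standard error of the difference between two scores (Jacobson & Truax)."""
    standard_error = sd_baseline * math.sqrt(1.0 - reliability)
    return math.sqrt(2.0) * standard_error


def reliable_change_index(pre, post, sd_baseline, reliability):
    """RCI signed so that a rise in depression score gives a negative value."""
    return (pre - post) / standard_error_of_difference(sd_baseline, reliability)


def classify_change(pre, post, sd_baseline, reliability, cutoff=1.96):
    """Label one participant as deteriorated, responded, or unchanged."""
    rci = reliable_change_index(pre, post, sd_baseline, reliability)
    if rci < -cutoff:
        return "deteriorated"
    if rci > cutoff:
        return "responded"
    return "no reliable change"


# Placeholder norm values (assumptions, not taken from the study).
print(classify_change(pre=20, post=31, sd_baseline=10.0, reliability=0.85))
```

With the placeholder values s.d. = 10 and reliability = 0.85, the standard error of the difference is about 5.48, so a worsening of 11 points would count as reliable deterioration whereas a worsening of 10 points would not.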
Analyses
Effects of Internet-based treatments on deterioration rates were calculated using the standard two-step IPDMA approach (Riley et al. 2010). Thus, after calculating whether or not a participant deteriorated (yes/no), we calculated event rates for each study separately on the basis of the imputed data. Following this, pooled event rates across studies were calculated according to a random-effects model as implemented in the Comprehensive Meta-analysis software package version 2.2.021 (https://www.meta-analysis.com), accounting for both the clustering of participants within studies and between-study heterogeneity (Abo-Zaid et al. 2013). We proceeded by calculating the relative risk (RR) for each study, and pooled the results across the studies using a random-effects DerSimonian–Laird model (DerSimonian & Laird, 1986). For all analyses we chose a random-effects model, as we expected considerable heterogeneity among the studies. If there were significant differences between the groups with regard to deterioration, response, and remission rates, we also calculated the number needed to harm (NNH) and/or the number needed to treat (NNT) and the associated 95% confidence intervals (CIs), compared to the control group. The NNH indicates the number of participants treated in the experimental condition for one extra person to demonstrate symptom deterioration as compared to the control group. We also calculated a benefit–risk ratio (Willan et al. 1997) by dividing the NNH for one extra symptom deterioration by the NNT to achieve one response [response was also defined using the reliable change criteria, such that participants with a reliable positive change (+1.96 on the RCI) were considered responders]. This procedure is usually used within drug treatment research (Curtin & Schulz, 2011), and quantifies the number of favourable outcomes achieved for each additional unfavourable outcome event incurred. Benefit–risk ratios were only calculated if there was a higher risk of deterioration in the intervention group as compared to the control group (Willan et al. 1997).
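The second step of the two-step approach, pooling study-level relative risks with a DerSimonian–Laird random-effects model, can be illustrated with the following minimal sketch. It is not the implementation used by the Comprehensive Meta-analysis package; the function names and the trial counts are hypothetical, and the 0.5 continuity correction for arms without events is an assumption made here so that the log relative risk is always defined.

```python
import math


def log_relative_risk(events_t, n_t, events_c, n_c):
    """Return (log RR, variance of log RR) for a single trial."""
    a, n1, c, n2 = float(events_t), float(n_t), float(events_c), float(n_c)
    if a == 0 or c == 0:
        # Assumption: add 0.5 to every cell when an arm has no events.
        a, c = a + 0.5, c + 0.5
        n1, n2 = n1 + 1.0, n2 + 1.0
    log_rr = math.log((a / n1) / (c / n2))
    variance = 1.0 / a - 1.0 / n1 + 1.0 / c - 1.0 / n2
    return log_rr, variance


def pool_dersimonian_laird(trials, z=1.96):
    """Pool (events_t, n_t, events_c, n_c) tuples on the log RR scale."""
    estimates = [log_relative_risk(*trial) for trial in trials]
    y = [e[0] for e in estimates]
    v = [e[1] for e in estimates]
    w = [1.0 / vi for vi in v]  # inverse-variance (fixed-effect) weights
    sum_w = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    k = len(trials)
    # Method-of-moments estimate of between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if (k > 1 and c > 0) else 0.0
    w_re = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "rr": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se),
        "ci_high": math.exp(pooled + z * se),
        "tau2": tau2,
        "q": q,
    }


# Hypothetical counts: (deteriorated in IG, n IG, deteriorated in CG, n CG).
print(pool_dersimonian_laird([(5, 100, 10, 100), (3, 80, 7, 85)]))
```

Pooling is done on the log scale because the sampling distribution of the log relative risk is closer to normal; the pooled estimate and its confidence limits are exponentiated only at the end.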
Sensitivity analyses
To test the robustness of our findings, we also conducted a sensitivity analysis applying an alternative criterion for deterioration: individuals whose depression scores had increased by ⩾50% relative to baseline at follow-up were categorized as having experienced deterioration. This criterion refers to a relative rather than an absolute change in symptoms.
Multiple treatments within one study
There were three studies in which two treatments were compared with a single control group (Warmerdam et al. 2008; Titov et al. 2010; Johansson et al. 2012b). In these cases, we treated each comparison as a separate study, and we avoided double counting of controls by randomly assigning half the control participants to each comparison.
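One way to realise the random split of a shared control group is sketched below. The helper name, the participant identifiers, and the fixed seed are illustrative assumptions; the text does not specify how the randomization was generated.

```python
import random


def split_controls(control_ids, seed=0):
    """Randomly divide control participants into two non-overlapping halves."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    shuffled = list(control_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical control-group identifiers for a three-arm trial.
controls_a, controls_b = split_controls(range(1, 12), seed=1)
print(len(controls_a), len(controls_b))
```

Each control participant then contributes to exactly one of the two comparisons, so no control observation is counted twice in the pooled analysis.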
Heterogeneity
As a test of homogeneity of effect sizes, we calculated the I² statistic as an indicator of heterogeneity in percentages (Ioannidis et al. 2007). A value of 0% indicates no observed heterogeneity, and larger values indicate increasing heterogeneity, with 25% as low, 50% as moderate, and 75% as high heterogeneity. We calculated 95% CIs around I² using the non-central χ²-based approach within the heterogi module for Stata (Orsini et al. 2013). We also calculated the Q statistic, but only report whether this was significant.
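For orientation, the usual relationship between the Q statistic and I² is I² = max(0, (Q − df)/Q) × 100, with df equal to the number of comparisons minus one. The short sketch below applies that formula; the function name and the example values are hypothetical, and the confidence interval computed by the heterogi module is not reproduced here.

```python
def i_squared(q, n_studies):
    """I² in percent from Cochran's Q and the number of pooled comparisons."""
    df = n_studies - 1
    if q <= 0 or df < 1:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical example: Q = 10 across six comparisons.
print(i_squared(10.0, 6))
```

A Q value at or below its degrees of freedom therefore yields an I² of 0%, which is the situation described as "no observed heterogeneity".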
Publication bias
Publication bias was tested by inspecting the funnel plot and by Egger's test (Egger et al. 1997). We also applied Duval & Tweedie's trim-and-fill procedure (Duval & Tweedie, 2000), which yields an estimate of the effect size after publication bias has been taken into account (Borenstein et al. 2009).
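Egger's test amounts to an ordinary least-squares regression of each study's standardized effect (effect divided by its standard error) on its precision (the reciprocal of the standard error); an intercept that departs from zero suggests funnel-plot asymmetry. The following is a minimal sketch of that regression, assuming effects are supplied on the log RR scale. The function name and input values are hypothetical, and the p value, which would be obtained from a t distribution with n − 2 degrees of freedom, is left out to keep the example free of external libraries.

```python
import math


def egger_regression(effects, standard_errors):
    """Return intercept, its standard error, and t statistic for Egger's test."""
    n = len(effects)
    if n != len(standard_errors) or n < 3:
        raise ValueError("need at least three studies with matching inputs")
    x = [1.0 / se for se in standard_errors]  # precision
    z = [y / se for y, se in zip(effects, standard_errors)]  # standardized effect
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are identical")
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    sse = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    residual_variance = sse / (n - 2)
    se_intercept = math.sqrt(residual_variance * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else None
    return {"intercept": intercept, "slope": slope,
            "se_intercept": se_intercept, "t": t_stat, "df": n - 2}


# Hypothetical log RR estimates and their standard errors.
result = egger_regression([-0.9, -0.6, -0.8, -0.2, -1.1],
                          [0.50, 0.30, 0.45, 0.20, 0.60])
print(result["intercept"], result["t"])
```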
Subgroup analyses
We conducted a series of subgroup analyses. Pooling of the results was conducted according to the mixed-effects model. In this model, studies within subgroups are pooled with the random-effects model, while tests for significant differences between subgroups are conducted with the fixed-effects model. Subgroup analyses were only conducted for post-treatment data and not for follow-up data, as the sample sizes of follow-up datasets were not large enough to test for significant differences between subgroups. The following subgroups were investigated. Participant characteristics: sex (male/female); age group [adults (18–59 years), older adults (⩾60 years)]; education [low (up to high school), medium to high (high school degree or further education after high school)]; co-morbid anxiety disorder (yes/no); depression severity at baseline [mild to moderate (BDI < 29); severe (BDI ⩾29)]. Subgroup analyses of baseline depression severity were only calculated for participants of studies using the BDI, as the CES-D does not have an established cut-off score for depression severity. Study characteristics: MDD confirmed using an established diagnostic interview (yes/no); recruitment (community, clinical setting); risk of bias score [low (4); some risk (<4)]; type of control group (non-active/active). Intervention characteristics: theoretical model of the intervention (CBT, other); number of modules (4–5, 6–7, 8–11).
Ethical standards
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.
Results
Deterioration rates
Overall pooled reliable deterioration rates across measurements are summarised in Table 3. The risk for a reliable deterioration from baseline to post-treatment was significantly lower in the intervention v. control conditions (RR 0.47, 95% CI 0.29–0.75) and the NNT to avoid one additional deterioration was 43.21 (95% CI 25.83–132.10). Heterogeneity was zero. The risk of a deterioration from baseline to follow-up I (1–4 months) appeared to demonstrate a trend towards being lower in the intervention groups compared to the control groups (RR 0.47, 95% CI 0.20–1.42), although the difference did not reach statistical significance (p = 0.097). There were no significant differences between the groups in the relative risk of a deterioration from baseline to follow-up II (p = 0.72). Heterogeneity was zero at both follow-ups.
Notes to Table 3: N co, number of comparisons; ER, event rate (number of patients with reliable deterioration); IG, intervention group; CG, control group; RR, relative risk for a deterioration; RCI, Reliable Change Index; BDI, Beck Depression Inventory; CES-D, Centre for Epidemiological Studies Depression Scale; MDD, major depressive disorder; CBT, cognitive behavioural therapy.
a A negative number in the column NNT/NNH indicates numbers needed to harm. A positive number indicates numbers needed to treat.
b The p value indicates whether differences between subgroups are significant.
c 95% CI around I² cannot be calculated, as there must be at least three studies with patients experiencing a reliable deterioration to calculate 95% CIs.
* p < 0.05, ** p < 0.01, *** p < 0.001, † p < 0.1.
Sensitivity and subgroup analyses are presented in Table 3. Applying the alternative deterioration criterion (50% symptom increase) resulted in slightly lower deterioration rates, but demonstrated a similar overall pattern.
Effects of Internet-based treatments compared to control conditions on deterioration rates at post-treatment were non-significant (p > 0.05) in 11 out of 25 subgroups tested and showed significantly better outcomes for those in Internet-based treatment in 14 out of 25 subgroups tested. RR was higher among those in Internet-based treatments than those in control conditions in only one subgroup (i.e. those with low education), although this difference was not significant (RR 1.72, NNH 30.3, 95% CI −26.32 to 9.62, p = 0.37).
Moderator of deterioration effects
Education level was also the only significant moderator of treatment effects on deterioration, such that there was significantly higher risk for deterioration for participants with lower levels of educational attainment compared to those with more education. All other differences between subgroups on deterioration rates were non-significant (p > 0.10).
Benefit–risk ratio
Participating in Internet-based treatments for depression was not associated with an increased risk of deterioration when compared to a control group. There was only one subgroup analysis (i.e. participants with low levels of education) in which the relative risk for a deterioration was higher in the intervention group compared to the controls, although this difference was non-significant. Nonetheless, analyses of response rates at post-treatment for the subgroup of participants with low education showed that response rates were significantly higher in the intervention group compared to controls, with a relative risk of 1.91 and an NNT to achieve one additional response of 3.23 (Ebert et al. unpublished data). Dividing the NNH for one additional symptom deterioration by the NNT to achieve one treatment response results in a benefit–risk ratio of 9.38, indicating that 9.38 participants with low education achieved a treatment response, relative to the control group, for each participant experiencing a deterioration in symptoms.
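The arithmetic behind this ratio can be checked directly from the figures reported for the low-education subgroup (NNH 30.3, NNT 3.23). The helper names below are hypothetical and the snippet is only a worked illustration; the general relation used is that a number needed to treat or harm is the reciprocal of the absolute risk difference between conditions.

```python
def number_needed(risk_intervention, risk_control):
    """Reciprocal of the absolute risk difference (NNT or NNH, by direction)."""
    difference = risk_intervention - risk_control
    if difference == 0:
        raise ValueError("risks are equal; the number needed is undefined")
    return 1.0 / abs(difference)


def benefit_risk_ratio(nnh, nnt):
    """Responses gained per additional deterioration incurred."""
    return nnh / nnt


# Values reported for participants with low education.
print(round(benefit_risk_ratio(nnh=30.3, nnt=3.23), 2))  # 9.38
```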
Publication bias
Inspection of the funnel plot and Egger's test indicated some possible publication bias. However, adjustment for publication bias using Duval & Tweedie's trim-and-fill procedure did not result in substantial changes. After adjustment for missing studies (five imputed studies), the RR for deterioration by post-test was 0.58 (95% CI 0.38–0.90) and the NNT was 43.21 (95% CI 25.83–132.10). Results at both follow-ups stayed the same.
Discussion
This IPDMA evaluated deterioration rates of Internet-based guided self-help interventions for depression compared to control conditions in randomized trials. In addition, deterioration rates were evaluated in subgroups of interest and potential moderating effects were examined.
Results showed that overall deterioration rates were low and the risk of deterioration was significantly lower for participants in Internet-based guided self-help conditions compared to controls. Education significantly moderated the risk of deterioration such that participants with lower educational attainment displayed a higher risk of deterioration compared to participants with more education. Nonetheless, a benefit–risk ratio analysis indicated that, even in the subgroup of participants with low education, the likelihood of a positive response to Internet-based treatment clearly outweighs the possible risk of deterioration.
To the best of our knowledge, the present study is the first meta-analysis that evaluated deterioration effects and their moderators in RCTs evaluating a psychological treatment. Observed deterioration rates (3.6%) are in the lower range of those found in observational studies of face-to-face psychotherapy (3–14%) (Strupp et al. 1977; Mohr, 1995; Hansen et al. 2006; Lambert et al. 2006). In contrast to early indications of possible adverse effects of psychological treatments (Bergin, 1966; Garfield et al. 1971), we did not find a deterioration effect as a consequence of therapy. Instead, results indicate that participating in Internet-based guided self-help programmes is associated with a lower risk of deterioration (RR 0.47) relative to controls. This effect held for the overall group and most subgroups. However, education level was identified as a significant moderator, with participants with low education at a greater risk of deterioration than highly educated participants. This finding corresponds to results from some randomized trials in which lower educational attainment was associated with worse treatment outcomes in Internet-based self-help interventions (Spek et al. 2008; Warmerdam et al. 2013). An explanation for such findings may be that some patients with a lower educational level experience difficulties in understanding the treatment modules, as most self-help manuals require quite advanced reading comprehension. That may, in turn, decrease their self-efficacy and create feelings of hopelessness.
Although all trials involved some form of guidance, this kind of support might not be sufficient for some individuals to overcome the barrier of low education. A more intensive treatment modality, such as seeing a therapist face-to-face, could potentially help these patients understand the treatment rationale and thus, hypothetically, result in less deterioration (Martinez et al. 2007). However, given that predictors of deterioration have so far not been addressed in face-to-face psychotherapy, future studies should examine whether participants at high risk of deterioration in Internet interventions would be better suited to face-to-face psychotherapy. Another explanation may be that low education is confounded with other factors (e.g. low income, poor physical health status, physical comorbidities, lower social support, less access to health services), which may contribute either to increased severity or to a lower ability to engage with the content of these programmes and to practise the skills they teach.
It should be noted, however, that the deterioration rate of participants with low education (10%) was still within the range found in observational studies of face-to-face psychotherapy (3–14%, see above). Further, the benefit–risk ratio indicates that, relative to the control group, 9.38 patients with low education achieve a treatment response for each participant experiencing a deterioration in symptoms. It is also of note that, with regard to response as opposed to deterioration, a previously reported study using the same dataset did not find that education was a moderator of treatment response. Patients with low levels of education profited significantly and to almost the same extent (NNT = 3.23 for response) as patients with more education (NNT = 3.25 for response; Ebert et al. unpublished data). Thus, most participants with low education experience response rather than deterioration in Internet-based treatment. Therefore, low education alone should not be used to identify someone as being at high risk of deterioration, and further research is needed to identify more specifically those who may be at high risk.
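As a rough illustration of the effect measures quoted above (RR and NNT), the following Python sketch uses the standard textbook definitions. The function names and the counts in the usage lines are hypothetical and are not taken from the included trials; only the NNT values 3.23 and 3.25 come from the text. The computation behind the 9.38 benefit–risk figure is not described in this section and is therefore not reproduced here.

```python
def risk_ratio(events_tx, n_tx, events_ctrl, n_ctrl):
    """Risk ratio: event risk under treatment divided by event risk under control."""
    if n_tx <= 0 or n_ctrl <= 0 or events_ctrl == 0:
        raise ValueError("group sizes must be positive and control events non-zero")
    return (events_tx / n_tx) / (events_ctrl / n_ctrl)


def number_needed_to_treat(rate_tx, rate_ctrl):
    """NNT: reciprocal of the absolute difference between two response rates."""
    difference = abs(rate_tx - rate_ctrl)
    if difference == 0:
        raise ValueError("NNT is undefined when both rates are equal")
    return 1.0 / difference


def implied_rate_difference(nnt):
    """Absolute difference in response rates implied by a reported NNT."""
    if nnt <= 0:
        raise ValueError("NNT must be positive")
    return 1.0 / nnt


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not study data):
    # 5 of 100 treated and 10 of 100 control participants deteriorate.
    print(risk_ratio(5, 100, 10, 100))

    # NNT values reported in the text for low and higher education.
    for reported_nnt in (3.23, 3.25):
        print(reported_nnt, implied_rate_difference(reported_nnt))
```

Under these definitions, both reported NNTs correspond to an absolute difference in response rates of roughly 0.31 between intervention and control, which is consistent with the statement that the two education groups profited to almost the same extent.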
When interpreting results from this study, several limitations need to be considered. First, the only negative effect evaluated was deterioration in depression symptoms. Other adverse effects may also occur and should be examined alongside RCTs of Internet-based guided self-help interventions in the future. For example, providing less intensive treatment than necessary through a self-help intervention might lead to lower treatment expectations in participants who fail to achieve a treatment response (Ebert et al. 2014a). Hence, although the present study did not find indications of harm from Internet-based guided self-help interventions, the study design does not allow the conclusion that harm is absent. For a complete discussion of negative effects in Internet-based psychotherapy see Rozental et al. (2014), and for psychological treatments in general see Linden (2013). Second, given the limited number of studies that included a follow-up assessment for both the intervention and the control condition, the analyses were underpowered to adequately conduct subgroup and moderator analyses at follow-up. Third, although the total number of participants was very high (2079), the low number of participants per subgroup did not allow for an examination of the association between the intervention and participant characteristics within subgroups.
For example, given the result for education as a moderator, future studies should investigate predictors of deterioration in the subgroup of participants with low levels of education in order to differentiate between those with a high chance of treatment success and those at high risk of failure. Moreover, each programme was examined in only one randomized trial. Hence, we were not able to investigate potential negative effects of specific programmes, which should be done in future studies. Fourth, the present study only investigated guided treatments for depression; hence, we cannot conclude anything about potential negative effects of unguided self-help treatments. Fifth, given the nature of IPDMA, the examined moderators of outcome were limited to those assessed in the original randomized trials. There might be other relevant moderators that have not been assessed.
The present study has relevant implications for both clinical practice and research. First, the differential results for moderators of effects on deterioration and treatment response indicate that the chances of positive change and the risks of negative change in psychological treatments might be two distinct constructs. Thus, future studies on the differential effectiveness of psychological treatments should investigate moderators of both outcomes separately, instead of only evaluating the effects on mean change scores as is commonly done in psychotherapy research. Second, while many healthcare systems hesitate to implement Internet-based guided self-help approaches, the present study indicates that such interventions are not associated with an increased risk of deterioration, but instead reduce the risk of a further aggravation of symptoms. Taken together with findings showing that such interventions have substantial positive effects on mean symptom improvement and on treatment response and remission (Johansson & Andersson, 2012; Ebert et al. 2015b), this further supports the need for dissemination of such treatments in routine mental health care. Nevertheless, the moderator result for education indicates that monitoring of participants with low education seems warranted, as they face an increased risk of deterioration compared to participants with high education. Given that most participants with low education nevertheless achieve treatment response, and that the mean effects on treatment response are comparable to those of participants with higher education, low education should not be used as an exclusion criterion in clinical practice. Instead, therapists should closely monitor treatment and symptom progress in order to detect and react to early signs of deterioration, e.g. by referral to more intensive treatment modalities.
Given that the present study is the first study on deterioration rates in RCTs, one cannot yet conclude whether these results are specific to Internet-based guided self-help interventions or whether such findings apply to psychological treatments for depression in general. Thus, future studies should evaluate deterioration rates and their moderators for face-to-face psychotherapy and should also compare overall and differential deterioration effects of Internet-based and face-to-face psychotherapy.
The present study did not show any evidence of harm from Internet-based guided self-help interventions and indicates that such interventions reduce the risk of symptom deterioration.
Acknowledgements
The European Union funded this study [EU EFRE: ZW6-67280119999, CCI 2007DE161PR001 & FP7 E-Compared HEALTH.2013.3.1-1: Comparative Effectiveness Research CER in health systems and health services interventions (603098)].
Declaration of Interest
None.