Introduction
Anxiety disorders are considered among the most significant global health burdens (Gustavsson et al., Reference Gustavsson, Svensson, Jacobi, Allgulander, Alonso, Beghi and Group2011). These disorders, including generalized anxiety disorder (GAD), panic disorder with or without agoraphobia (PD/A), social anxiety disorder (SAD) and specific phobias (SP), are the world's most prevalent mental health disorders (Bandelow & Michaelis, Reference Bandelow and Michaelis2015). As a result, anxiety disorders account for a substantial financial liability and an ever-pressing demand to increase access to evidence-based treatments (Gustavsson et al., Reference Gustavsson, Svensson, Jacobi, Allgulander, Alonso, Beghi and Group2011). It is reported that less than 20% of people facing anxiety disorders receive adequate treatment (Roberge et al., Reference Roberge, Normand-Lauzière, Raymond, Luc, Tanguay-Bernard, Duhoux and Fournier2015; Teija et al., Reference Teija, Mauri, Terhi, Jonna, Samuli and Jaana2016).
Solving the need for access to evidence-based treatments for anxiety disorders is critically important. Common barriers to treatment access include the lack of qualified mental health professionals, time and travel inconvenience and the reluctance of people in need to seek treatment due to mental health stigmatization (Stefanopoulou, Lewis, Taylor, Broscombe, & Larkin, Reference Stefanopoulou, Lewis, Taylor, Broscombe and Larkin2019). Digitally delivered interventions have the potential to resolve a variety of access constraints, through treatment self-guidance, patient privacy and convenience, reduction in therapist time commitment and overall cost of care provision (Stefanopoulou et al., Reference Stefanopoulou, Lewis, Taylor, Broscombe and Larkin2019; Weightman, Reference Weightman2020). Although considerable advances have been made in the development and utilization of digital interventions over the past two decades (Hollis et al., Reference Hollis, Morriss, Martin, Amani, Cotton, Denis and Lewis2015), adoption of these interventions in practice depends in large part on improving the precision of treatment effect estimates and practitioner confidence in expected benefits across treatment applications (Feijt, de Kort, Bongers, & Ijsselsteijn, Reference Feijt, de Kort, Bongers and Ijsselsteijn2018).
Valuable progress has been made to synthesize the evidence base for the effectiveness of digital interventions for anxiety disorders. Disorder specific meta-analyses are available for GAD (Richards, Richardson, Timulak, & McElvaney, Reference Richards, Richardson, Timulak and McElvaney2015), SAD (Kampmann, Emmelkamp, & Morina, Reference Kampmann, Emmelkamp and Morina2016) and PD/A (Stech, Lim, Upton, & Newby, Reference Stech, Lim, Upton and Newby2020). Moreover, prior meta-analyses have synthesized the evidence base encompassing all anxiety disorders to provide overarching estimates of treatment effectiveness in addition to disorder-specific estimates (Andrews et al., Reference Andrews, Cuijpers, Craske, McEvoy, Titov and Baune2010, Reference Andrews, Basu, Cuijpers, Craske, McEvoy, English and Newby2018; Cuijpers et al., Reference Cuijpers, Marks, van Straten, Cavanagh, Gega and Andersson2009; Păsărelu, Andersson, Bergman Nordgren, & Dobrean, Reference Păsărelu, Andersson, Bergman Nordgren and Dobrean2017). However, the dramatic increase in the volume and diversity of research since the last all-encompassing review that included studies published until September 2016 (Andrews et al., Reference Andrews, Basu, Cuijpers, Craske, McEvoy, English and Newby2018) makes a current state assessment of the field essential. Studies published since September 2016 represent nearly the majority of randomized controlled trials (RCT) on the efficacy of digital interventions for anxiety disorders. Recent trials have included diverse intervention formats, such as unguided interventions (Ciuca, Berger, Crişan, & Miclea, Reference Ciuca, Berger, Crişan and Miclea2018), group interventions (Schulz et al., Reference Schulz, Stolz, Vincent, Krieger, Andersson and Berger2016) and mobile-based interventions (Stolz et al., Reference Stolz, Schulz, Krieger, Vincent, Urech, Moser and Berger2018). Finally, the Andrews et al. (Reference Andrews, Basu, Cuijpers, Craske, McEvoy, English and Newby2018) meta-analysis did not include studies with mixed anxiety disorder samples (Bell, Colhoun, Carter, & Frampton, Reference Bell, Colhoun, Carter and Frampton2012) or studies on interventions other than cognitive behavioral therapy, such as internet-delivered psychodynamic therapy (Andersson, Carlbring, & Furmark, Reference Andersson, Carlbring and Furmark2012) or acceptance and commitment-based therapy (Ivanova et al., Reference Ivanova, Lindner, Ly, Dahlin, Vernmark, Andersson and Carlbring2016).
The present systematic review and meta-analysis aimed to examine the effectiveness of digital interventions across all anxiety disorders and specific to each anxiety disorder in comparison to inactive control conditions. More precise and powerful estimates of effectiveness overall and specific to each disorder were expected to add precision to the evidence base, with an enlarged and more representative sample of studies, and provide further justification for the practical application of digital interventions.
Methods
Study sources, search and selection
A comprehensive anxiety literature database was used as the information source of the present meta-analysis and is registered at the open science framework (Papola, Barbui, Cuijpers, Karyotaki, & Sijbrandij, Reference Papola, Barbui, Cuijpers, Karyotaki and Sijbrandij2020). Development of the database began with a systematic search on 25 April 2019 and a subsequent update on 13 February 2020 by two independent researchers. A systematic search was conducted using a full range of terms related to the applicable interventions, disorders and outcomes. The full search string for the PubMed database search is provided in Other Supplementary Material, eAppendix 1. Published studies were searched from inception to 1 January 2020, using the following electronic databases: PubMed (MEDLINE), EMBASE, PsycINFO, and The Cochrane Central Register of Controlled Trials (CENTRAL). Reference tracking was also conducted on recent systematic reviews and meta-analyses of digital interventions across anxiety disorders and specific to certain disorder (Andrews et al., Reference Andrews, Cuijpers, Craske, McEvoy, Titov and Baune2010, Reference Andrews, Basu, Cuijpers, Craske, McEvoy, English and Newby2018; Arnberg et al., Reference Arnberg, Linton, Hultcrantz, Heintz, Jonsson and Davey2014; Cuijpers et al., Reference Cuijpers, Marks, van Straten, Cavanagh, Gega and Andersson2009; Kampmann et al., Reference Kampmann, Emmelkamp and Morina2016; Newby, Twomey, Yuan Li, & Andrews, Reference Newby, Twomey, Yuan Li and Andrews2016; Păsărelu et al., Reference Păsărelu, Andersson, Bergman Nordgren and Dobrean2017; Richards et al., Reference Richards, Richardson, Timulak and McElvaney2015; Stech et al., Reference Stech, Lim, Upton and Newby2020; Stefanopoulou et al., Reference Stefanopoulou, Lewis, Taylor, Broscombe and Larkin2019)
Studies were included based on the following criteria: (1) participants age 18 or older, (2) clinician validated diagnosis of any primary anxiety disorder as defined by the diagnostic and statistical manual version 5 (American Psychiatric, 2013) including GAD, PD/A, SAD or SP, (3) use of a guided or unguided digital intervention conducted without any physical presence of a therapist or a requirement to participate outside of a personal setting of choice to treat anxiety disorder symptoms, (4) use of an inactive control comparison group such as wait-list control (WLC) or care-as-usual (CAU), (5) use of a customary RCT design and (6) publication in peer-reviewed journals, including advanced online publication. Control comparison groups were categorized as WLC if participants were put on a waiting list to receive the intervention after the intervention was received by the active intervention group, or CAU if participants received or had access to routine care and did not wait to receive the active intervention. In addition, studies that compared digital interventions to face-to-face psychotherapy were selected and extracted for supplemental analysis.
Types of outcome measures
The primary efficacy outcome was anxiety symptoms at the study endpoint, measured as effect size. One outcome measure was selected for each study, based on a pre-determined hierarchy of outcome instruments according to (1) most used and (2) most valid instrument. The hierarchy is available in Other Supplementary Material, eAppendix 2. If none of the outcomes in the pre-determined hierarchy were used in a given study, the primary outcome measure as defined by the study was used. All outcomes will refer to acute-phase treatment (study endpoint), which normally last two to six months.
Data extraction
Post-treatment outcome measure means, standard deviations and the number of participants randomized and eligible for analysis per condition were extracted from each study. In addition, descriptive data were extracted and coded as follows: (1) guided or unguided treatment, (2) mean number of treatment sessions completed, (3) treatment adherence as defined by the percentage of participants who completed all treatment sessions, (4) recruitment setting (community or clinical), and (5) continental location of study. Guided treatment was defined by the provision of support related to treatment content by a trained professional or para-professional, whereas unguided treatment did not include any treatment content-related guidance.
To assess study quality and risk of bias, data were extracted according to the Risk of Bias 2 (RoB 2) tool (Sterne et al., Reference Sterne, Savović, Page, Elbers, Blencowe, Boutron and Higgins2019). This risk of bias assessment entails the review and grading of five quality domains including (1) the randomization process, (2) deviations from the intended interventions, (3) missing outcome data, (4) measurement of the outcome and (5) selection of the reported outcome, resulting in a summary assessment of low, some concern or high risk of bias.
Data extraction decisions were assessed for quality and validity in comparison to a parallel independent study. Interrater reliability was calculated and reported using the interclass correlation coefficient (ICC) and Cohen's Kappa (Banerjee, Capozzoli, McSweeney, & Sinha, Reference Banerjee, Capozzoli, McSweeney and Sinha1999; Barnhart, Haber, & Lin, Reference Barnhart, Haber and Lin2007; Cohen, Reference Cohen1968). Agreement reliability was interpreted using ICC estimates as follows: poor if less than 0.5, moderate if between 0.5 and 0.75, good if between 0.75 and 0.9, and excellent if higher than 0.9 (Koo & Li, Reference Koo and Li2016). Agreement reliability was interpreted using Cohen's Kappa estimates as follows: none to slight if between 0.01 and 0.2, fair if between 0.21 and 0.4, moderate if between 0.41 and 0.6, substantial if between 0.61 and 0.8, and almost perfect if between 0.81 and 1.0 (McHugh, Reference McHugh2012).
Statistical analysis
Statistical analyses were completed using Comprehensive Meta-Analysis version 3 (CMA, 2016), following PRISMA guidelines (Moher, Liberati, Tetzlaff, & Altman, Reference Moher, Liberati, Tetzlaff and Altman2010) and standard guidance for meta-analysis (Cuijpers, Reference Cuijpers2016). Effect size indicates the scale score difference between treatment and control groups at post-treatment, and is calculated by subtracting the mean score of the treatment group from the mean score of the control group, divided by pooled standard deviation. The effect size was estimated as Hedges' g to correct for small sample size bias (Hedges & Olkin, Reference Hedges and Olkin1985). To ease interpretation of effect size, corresponding numbers needed to treat (NNT; Gloster et al., Reference Gloster, Sonntag, Hoyer, Meyer, Heinze, Ströhle and Wittchen2015) figures were calculated according to the method provided by Kraemer and Kupfer (Kraemer & Kupfer, Reference Kraemer and Kupfer2006), representing the number of persons requiring treatment in order to achieve one additional successful treatment outcome.
To account for multiple treatment arms within a study, a decision hierarchy was followed according to the guidance of Cochrane handbook for systematic reviews, hereafter referred to as Cochrane (Higgins, Green, & Cochrane, Reference Higgins, Green and Cochrane2008). As recommended by Cochrane, to avoid multiple correlated comparisons and potential unit of analysis error, treatment groups were combined when possible (Higgins et al., Reference Higgins, Green and Cochrane2008). RevMan software was used to combine treatment arms (RevMan, 2020) according to the Cochrane guidelines (Higgins et al., Reference Higgins, Green and Cochrane2008). An exception to this method was used when treatment arms within a study compared guided and unguided support conditions. In these cases, treatment arms were kept separate in order to retain these comparisons for treatment support subgroup analysis, and the total control group was equally split across the guided and unguided treatment condition arms. According to Cochrane, this method is acceptable and creates sufficient independence between treatment arms to mitigate the unit of analysis error (Higgins et al., Reference Higgins, Green and Cochrane2008).
When trials had missing standard deviation data, standard errors or confidence intervals (CI) were used to calculate standard deviations. When none of these data was available, standard deviations were imputed according to the method outlined by Furukawa et al. (Furukawa, Barbui, Cipriani, Brambilla, & Watanabe, Reference Furukawa, Barbui, Cipriani, Brambilla and Watanabe2006). In the present study, this method was only considered if (1) no standard deviation, standard error or CI data were available and (2) 10 or more comparable studies, using the same outcome instrument, could be used to generate a pooled standard deviation for imputation.
All meta-analyses were conducted in CMA using a random-effects pooling model. Heterogeneity was assessed by the I 2 statistic and corresponding 95% CI (Ioannidis, Patsopoulos, & Evangelou, Reference Ioannidis, Patsopoulos and Evangelou2007), using the non-central chi-squared heterogi module from Stata (Orsini, Bottai, Higgins, & Buchan, Reference Orsini, Bottai, Higgins and Buchan2005). I 2 values of 25, 50, and 75% typically indicate low, medium, and high heterogeneity, respectively. Publication bias was assessed using Egger's test for funnel plot asymmetry and a funnel plot adjusted for publication bias according to the Duval and Tweedie trim-and-fill procedure (Duval & Tweedie, Reference Duval and Tweedie2000; Egger, Smith, Schneider, & Minder, Reference Egger, Smith, Schneider and Minder1997). In addition, an overall analysis of effect size was conducted after the exclusion of outlier studies. Outlier studies were those with a 95% CI for effect size which did not overlap with the 95% CI for the overall pooled effect size.
Subgroup analyses
A series of subgroup analyses were conducted to explore potential explanations of heterogeneity based on differences in relative effect sizes between subgroups and the relative relationship of each subgroup variable to the overall pooled effect size. The series included subgroup comparisons for (1) primary diagnosis of GAD, mixed anxiety samples, PD/A or SAD, (2) low, some concern and high risk of bias, (3) guided v. unguided interventions, (4) community v. clinical recruitment setting and (5) continental location of study, all as categorical variables, as well as (6) mean number of treatment sessions completed and (7) treatment adherence, as continuous variables. All subgroup analyses for categorical variables were conducted in CMA using the mixed-effects model, which pools studies within subgroups based on the random-effects model while testing significant differences between subgroups based on the fixed-effects model. All subgroup analyses for continuous variables were conducted using meta-regression.
Results
Selection, inclusion and extraction
In total, 15 030 abstracts were identified. After the removal of 5606 duplicates, 9424 records remained for title and abstract or full-text screening. After screening, 1307 records were retained for inclusion in the database. All 1307 studies in the comprehensive anxiety database (Papola et al., Reference Papola, Barbui, Cuijpers, Karyotaki and Sijbrandij2020) were screened for inclusion. Reference tracking resulted in 14 additional studies for screening (Andrews et al., Reference Andrews, Cuijpers, Craske, McEvoy, Titov and Baune2010, Reference Andrews, Basu, Cuijpers, Craske, McEvoy, English and Newby2018; Arnberg et al., Reference Arnberg, Linton, Hultcrantz, Heintz, Jonsson and Davey2014; Cuijpers et al., Reference Cuijpers, Marks, van Straten, Cavanagh, Gega and Andersson2009; Kampmann et al., Reference Kampmann, Emmelkamp and Morina2016; Newby et al., Reference Newby, Twomey, Yuan Li and Andrews2016; Păsărelu et al., Reference Păsărelu, Andersson, Bergman Nordgren and Dobrean2017; Richards et al., Reference Richards, Richardson, Timulak and McElvaney2015; Stech et al., Reference Stech, Lim, Upton and Newby2020; Stefanopoulou et al., Reference Stefanopoulou, Lewis, Taylor, Broscombe and Larkin2019). Following title and abstract review, 1176 studies were excluded for specific reasons, leaving 146 studies for full-text review. Following full-text review, 99 studies were excluded for specific reasons, leaving 47 studies for the present meta-analysis. Among the 47 included studies, seven studies had multiple trial arms which were merged for analysis (Berger, Boettcher, & Caspar, Reference Berger, Boettcher and Caspar2014; Christensen et al., Reference Christensen, Batterham, Mackinnon, Griffiths, Kalia Hehir, Kenardy and Bennett2014; Oromendia, Orrego, Bonillo, & Molinuevo, Reference Oromendia, Orrego, Bonillo and Molinuevo2016; Richards, Klein, & Austin, Reference Richards, Klein and Austin2006; Robinson et al., Reference Robinson, Titov, Andrews, McIntyre, Schwencke and Solley2010; Schulz et al., Reference Schulz, Stolz, Vincent, Krieger, Andersson and Berger2016; Stolz et al., Reference Stolz, Schulz, Krieger, Vincent, Urech, Moser and Berger2018), according to the recommended method of Cochrane (Higgins et al., Reference Higgins, Green and Cochrane2008). In four studies, the control group was equally split and shared between guided and unguided intervention arms (Ciuca et al., Reference Ciuca, Berger, Crişan and Miclea2018; Ivanova et al., Reference Ivanova, Lindner, Ly, Dahlin, Vernmark, Andersson and Carlbring2016; Paxling et al., Reference Paxling, Almlov, Dahlin, Carlbring, Breitholtz, Eriksson and Andersson2011; Titov, Andrews, Choi, Schwencke, & Mahoney, Reference Titov, Andrews, Choi, Schwencke and Mahoney2008). Standard deviations were imputed for one study (Titov et al., Reference Titov, Andrews, Choi, Schwencke and Mahoney2008). The 47 studies resulted in 4958 participants (2808 treatment group and 2150 control group) and 53 comparisons quantified in analysis.
The selection process and exclusion rationales are provided in a PRISMA flowchart in Fig. 1 (Moher et al., Reference Moher, Liberati, Tetzlaff and Altman2010). The search also identified nine studies that compared the digital intervention to face-to-face psychotherapy, for supplemental analysis. References for all studies included are provided in Other Supplementary Material, eAppendix 3.
Interrater reliability of data extraction decisions ranged from good to perfect. The Cohen's Kappa for overall risk of bias judgements was perfect at 1.0, while ICC assessing agreement on each individual domain judgement was good to excellent at 0.88 [95% CI: 0.81–0.92]. Similarly, ICC assessing agreement on outcome and characteristics data was good at 0.81 [95% CI: 0.70–0.88].
Characteristics of included studies
Characteristics of the 53 comparisons analyzed are summarized in Table 1. The most represented primary diagnosis was SAD (20 comparisons; 1960 participants), followed by PD/A (15 comparisons; 837 participants), GAD (nine comparisons; 1203 participants) and mixed anxiety samples (nine comparisons, 958 participants). The control comparisons in all studies were categorized as WLC. Additional detail about the WLC condition was provided in 34% (n = 16) of studies, in which the WLC was described as having access to specific support provisions such as routine care (n = 5; 11%), attention control (n = 4; 9%) and online discussion forum (n = 7; 15%). Cognitive behavioral therapy was the most common digital intervention used in the 53 comparisons (n = 48; 90%), followed by acceptance and commitment therapy (n = 3; 6%), psychodynamic therapy (n = 1; 2%) and mindfulness-based therapy (n = 1; 2%). A guided intervention format was used in 42 (79%) comparisons as compared to an unguided format in 11 (21%). Recruitment was conducted in a community setting for 49 (92%) comparisons compared to 4 (8%) in a clinical setting. Regarding the continental location of study, 70% of comparisons were conducted in Europe (n = 37; Austria, Ireland, Denmark, Germany, Netherlands, Romania, Spain, Sweden, Switzerland), followed by 26% in Oceania (n = 14; Australia, New Zealand), 2% in North America (n = 1; Canada) and 2% across mixed continents (n = 1; Australia and Scotland). Finally, the mean number of treatment sessions completed was 5.8 across 43 comparisons with available data, while treatment adherence was 55.9% across 41 comparisons with available data.
Note: GAD: generalized anxiety disorder; PD/A: panic disorder with or without agoraphobia; SAD: social anxiety disorder; iCBT: internet delivered cognitive behavioral therapy; iPDT: internet delivered psychodynamic therapy; iMBT: internet delivered mindfulness based therapy; iACT: internet delivered acceptance and commitment therapy; cCBT: computer delivered cognitive behavioral therapy; WL: wait- list control; WL-IC: wait-list with access to information control provision; WL-OD: wait-list with access to online discussion group; WL-UC: wait-list with indicated access to usual care; Guided: content support provided by trained/para-professional; Unguided: no content support provided; N: total sample analyzed by intention to treat; Hedges' g: effect size between treatment and control according to Hedges' g formula; M Sess. (%): mean number of sessions completed by the treatment group and also reported as a percentage of total sessions; Adherence %: percentage of treatment group that completed all treatment sessions; RoB: Risk of bias by category of low, some concern (S.C.) or high; Com.: open, public, voluntary recruitment; Clin.: clinical referral to study from a medical professional for recruitment method; Europe: continental Europe; Oceania: continent including studies from Australia and or New Zealand; Mix: sample from multiple international regions (Scotland, Australia); N.A.: North America (Canada); PSWQ: Penn State Worry questionnaire; GAD-7: Generalized anxiety disorder-7; BAI: Beck Anxiety Inventory; PDSS-SR: Panic Disorder Severity Scale – Self-rated; PDSS: Panic disorder severity scale; BSQ: Body sensations questionnaire; LSAS-SR: Liebowitz social anxiety scale – Self-Rated; LSAS: Liebowitz social anxiety scale; BFNE: Brief Fear of Negative Evaluation; SIAS: Social interaction anxiety scale.
Risk of bias
The risk of bias was judged to be ‘low’ in 6 (11%) study comparisons, ‘some concern’ in 34 (64%) study comparisons and ‘high’ in 13 (25%) study comparisons. The primary areas of concern were found within domain three and four, related to missing outcome data and measurement of outcomes respectively, where 5 (11%) of the studies were judged to be high risk in each domain.
Meta-analysis
The results for the meta-analysis of digital interventions v. inactive controls are summarized in Table 2 and a forest plot of effect sizes in Fig. 2. The overall effect size was g = 0.80 [95% CI: 0.68–0.93], corresponding with an NNT of 2.3. Heterogeneity was high [I 2 = 75, 95% CI: 68–80]. The effect size for GAD comparisons alone (n = 9) was moderate at g = 0.62 [95% CI: 0.31–0.93] with high heterogeneity [I 2 = 81, 95% CI: 61–88]. The effect size for mixed anxiety disorder comparisons alone (n = 9) was moderate at g = 0.68 [95% CI: 0.39–0.97] with high heterogeneity [I 2 = 77, 95% CI: 51–87]. Regarding PD/A comparisons alone (n = 15), the effect size was large at g = 1.08 [95% CI: 0.77–1.39] with high heterogeneity [I 2 = 76, 95% CI: 57–84]. Similarly, the effect size for SAD comparisons alone (n = 20) was large at g = 0.76 [95% CI: 0.62–0.91] with moderate heterogeneity [I 2 = 53, 95% CI: 11–71].
Note: Diagnosis: primary diagnosis under intended observation; Mixed anxiety: study samples with comorbid or mixed anxiety populations; Guided: support provided by a trained profession and related to treatment content in any format; Unguided: no support provided related to treatment content; Risk of bias: categorically low, some concern (S.C.) or high; Community: recruitment through open community promotion; Clinical: recruitment through clinical referral; Adherence: percentage of participants who completed all treatment sessions; Hedges' g: effect size between treatment and control according to Hedges' g formula; 95% CI (g): 95% confidence interval for effect size (g); p value: significance difference between the effect sizes in the subgroups at .alpha 05; I 2: heterogeneity as a proportion; 95% CI (I 2): 95% confidence interval for I 2; NNT: number needed to treat; Europe: continental Europe; Oceania: continent including studies from Australia and or New Zealand; Mix: sample from multiple international regions (Scotland, Australia); Coeff: meta-regression coefficient; s.e.: standard error; Outliers include: Carlbring et al. (Reference Carlbring, Bohman, Brunt, Buhrman, Westling, Ekselius and Andersson2006); Christensen et al. (Reference Christensen, Batterham, Mackinnon, Griffiths, Kalia Hehir, Kenardy and Bennett2014); Dear et al. (Reference Dear, Staples, Terides, Karin, Zou, Johnston and Titov2015); Klein et al. (Reference Klein, Richards and Austin2006); Kok et al. (Reference Kok, Van Straten, Beekman and Cuijpers2014); Richards et al. (Reference Richards, Timulak, Rashleigh, Mcloughlin, Colla, Joyce and Anderson-Gibbons2016); Schroder et al. (Reference Schroder, Jelinek and Moritz2017); Titov et al. (Reference Titov, Andrews, Johnston, Robinson and Spence2010); van Ballegooijen et al. (Reference Van Ballegooijen, Riper, Klein, Ebert, Kramer, Meulenbeek and Cuijpers2013).
An overall analysis of effect size was also conducted after removing nine outlier studies, resulting in a slightly higher overall effect size g = 0.82 [95% CI: 0.73–0.92] with considerably lower heterogeneity [I 2 = 41, 95% CI: 9–58]. Egger's test for asymmetry was significant [Intercept: 3.34; 95% CI: 1.94–4.74; p < 0.000]. However, there was no indication of publication bias based on Duval and Tweedie's trim and fill procedure (2000), as no studies were recommended for imputation and no difference was found between the observed effect size and the effect size after adjustment for publication bias (g = 0.80). Publication bias is illustrated by funnel plot in Other Supplementary Material, eFigure 1.
In a supplemental analysis, nine comparisons (683 participants) were made between digital interventions v. face-to-face treatment groups. No difference was found between treatment conditions [g = 0.14 favoring digital interventions; 95% CI: −0.01 to 0.30]. Heterogeneity was low [I 2 = 3, 95% CI: 0–56]. Characteristics of the nine comparisons analyzed are summarized in Other Supplementary Material, eTable 1.
Subgroup analyses
No significant differences between subgroups were found, including (1) risk of bias; (p = 0.819), (2) guided v. unguided treatment support (p = 0.177), (3) community v. clinical recruitment setting (p = 0.409), (4) continental location of study (p = 0.101), (5) mean number of treatment sessions completed as a continuous subgroup (p = 0.079) and (6) treatment adherence as a continuous subgroup (p = 0.215). An auxiliary subgroup analysis revealed no difference between WLC control comparison categories (p = 0.131). Results for subgroup analyses are provided in Table 2 and Other Supplementary Material, eTable 2.
Discussion
The present meta-analysis aimed to examine the effectiveness of digital interventions v. inactive controls across anxiety disorders and specific to each disorder. In result, a large, significant pooled effect size across all anxiety disorders (g = 0.80) was found. The results are in line with prior research and can be considered more precisely representative of anxiety disorder treatment effectiveness (Andrews et al., Reference Andrews, Cuijpers, Craske, McEvoy, Titov and Baune2010, Reference Andrews, Basu, Cuijpers, Craske, McEvoy, English and Newby2018; Cuijpers et al., Reference Cuijpers, Marks, van Straten, Cavanagh, Gega and Andersson2009; Păsărelu et al., Reference Păsărelu, Andersson, Bergman Nordgren and Dobrean2017). First, Andrews et al. (Reference Andrews, Basu, Cuijpers, Craske, McEvoy, English and Newby2018) updated their prior meta-analysis of digital interventions (Andrews et al., Reference Andrews, Cuijpers, Craske, McEvoy, Titov and Baune2010) while including major depression studies for 50% of the comparisons. Next, the Cuijpers et al. (Reference Cuijpers, Marks, van Straten, Cavanagh, Gega and Andersson2009) meta-analysis included post-traumatic stress disorder and obsessive−compulsive disorder studies, as well as studies on specific phobia treatment through interventions conducted in an office setting with a therapist or para-professional contact. Last, the Păsărelu et al. (Reference Păsărelu, Andersson, Bergman Nordgren and Dobrean2017) meta-analysis was limited to 14 comparisons (1513 participants) of transdiagnostic or tailored digital interventions. Therefore, the present meta-analysis with 53 comparisons and 4958 participants is considerably larger and specific to anxiety disorders as currently defined by the DSM-5 (American Psychiatric, 2013).
Moderate to large, significant effect sizes were also found specific to GAD (g = 0.62), mixed anxiety disorder samples (g = 0.68), SAD (g = 0.76) and PD/A (g = 1.08). The disorder-specific and mixed sample-specific findings align with prior research while adding value by updating the field (Andrews et al., Reference Andrews, Cuijpers, Craske, McEvoy, Titov and Baune2010, Reference Andrews, Basu, Cuijpers, Craske, McEvoy, English and Newby2018; Arnberg et al., Reference Arnberg, Linton, Hultcrantz, Heintz, Jonsson and Davey2014; Cuijpers et al., Reference Cuijpers, Marks, van Straten, Cavanagh, Gega and Andersson2009; Kampmann et al., Reference Kampmann, Emmelkamp and Morina2016; Păsărelu et al., Reference Păsărelu, Andersson, Bergman Nordgren and Dobrean2017; Richards et al., Reference Richards, Richardson, Timulak and McElvaney2015; Stech et al., Reference Stech, Lim, Upton and Newby2020) The estimate for SAD can be considered more precise and powerful based on 20 comparisons v. the 11 comparisons in Andrews et al. (Reference Andrews, Basu, Cuijpers, Craske, McEvoy, English and Newby2018) and 16 comparisons in Kampmann et al. (Reference Kampmann, Emmelkamp and Morina2016).
Differences in effect sizes across studies were not significantly related to the risk of bias, treatment support, recruitment setting, continental location of study, mean treatment sessions completed or treatment adherence. The results align with prior research on recruitment method and continental location (Păsărelu et al., Reference Păsărelu, Andersson, Bergman Nordgren and Dobrean2017), while contrasting with Andrews et al. (Reference Andrews, Basu, Cuijpers, Craske, McEvoy, English and Newby2018) on the risk of bias. The difference could be explained by a mix of anxiety and depression trials in the Andrews et al. (Reference Andrews, Basu, Cuijpers, Craske, McEvoy, English and Newby2018) meta-analysis compared to the present meta-analysis on anxiety trials. This is the first time that trials comparing guided and unguided intervention arms to inactive control conditions have been meta-analyzed and reported as comparative subgroups. The results are consistent with several studies on the relative efficacy of guided and unguided interventions for anxiety disorders covering GAD, SAD and PD/A, as well as a prior Cochrane systematic review, all of which found no evidence of a difference between guided and unguided interventions (Berger et al., Reference Berger, Caspar, Richardson, Kneubuhler, Sutter and Andersson2011; Ciuca et al., Reference Ciuca, Berger, Crişan and Miclea2018; Dear et al., Reference Dear, Staples, Terides, Karin, Zou, Johnston and Titov2015, Reference Dear, Staples, Terides, Fogliati, Sheehan, Johnston and Titov2016; Fogliati et al., Reference Fogliati, Dear, Staples, Terides, Sheehan, Johnston and Titov2016; Gershkovich, Herbert, Forman, Schumacher, & Fischer, Reference Gershkovich, Herbert, Forman, Schumacher and Fischer2017; Ivanova et al., Reference Ivanova, Lindner, Ly, Dahlin, Vernmark, Andersson and Carlbring2016; Olthuis, Watt, Bailey, Hayden, & Stewart, Reference Olthuis, Watt, Bailey, Hayden and Stewart2015; Titov, Andrews, Choi, Schwencke, & Johnston, Reference Titov, Andrews, Choi, Schwencke and Johnston2009). The non-significant finding is contrary to a prior systematic review, however, the results reported were based on a small mix of depression and anxiety studies rather than the subgroup of anxiety studies alone (Baumeister, Reichler, Munzinger, & Lin, Reference Baumeister, Reichler, Munzinger and Lin2014). Based on the large scale of anxiety-specific studies in the present meta-analysis, all subgroup findings and notably the guided v. unguided intervention findings can be considered robust and distinct.
No difference in effect was found between digital interventions and face-to-face interventions in supplemental analysis, strengthening prior research that also indicated digital interventions to be equally effective as face-to-face interventions in treating anxiety disorders (Andrews et al., Reference Andrews, Basu, Cuijpers, Craske, McEvoy, English and Newby2018; Carlbring, Andersson, Cuijpers, Riper, & Hedman-Lagerlöf, Reference Carlbring, Andersson, Cuijpers, Riper and Hedman-Lagerlöf2018; Cuijpers et al., Reference Cuijpers, Marks, van Straten, Cavanagh, Gega and Andersson2009). Further research is required to advance these preliminary findings and firmly establish equivalence between the two treatment formats for each anxiety disorder in particular. Nonetheless, the results position digital interventions as a promising alternative to face-to-face treatment and underscore the potential to tailor care delivery models that suit the needs of patients and providers (Schuster, Topooco, Keller, Radvogin, & Laireiter, Reference Schuster, Topooco, Keller, Radvogin and Laireiter2020).
Applying digital interventions in practice could address barriers to treatment access and affordability. Digital interventions could be offered as the first stage of care as an alternative to waiting periods or by treatment plan design, consistent with a stepped care model (Nordgreen et al., Reference Nordgreen, Haug, Öst, Andersson, Carlbring, Kvale and Havik2016; Stiles et al., Reference Stiles, Chatterton, Le, Lee, Whiteford and Mihalopoulos2019). This could benefit patients both in terms of the timeliness and the flexibility of care access, particularly advantageous for patients reluctant to seek treatment due to stigmatization or for patients in countries that lack sufficient care infrastructure. Furthermore, finding no evidence of a difference in effectiveness between guided and unguided treatments implies that digital interventions could be effectively facilitated by professionals and non-professionals alike or even self-administered. This could increase the volume of care resources, and if organized systematically such as in stepped care, enable therapists to more efficiently allocate valuable time and resources in such high demand. A study comparing a stepped care model to face-to-face therapy found that most patients who reached a response threshold did so before reaching the later stages of treatment requiring higher demand on therapist time and accessibility (Nordgreen et al., Reference Nordgreen, Haug, Öst, Andersson, Carlbring, Kvale and Havik2016). To highlight the promising cost−benefit, research suggests that the adoption of stepped care models alone could yield incremental cost-effective ratios over €1800 per disability-adjusted life year in comparison to CAU (Stiles et al., Reference Stiles, Chatterton, Le, Lee, Whiteford and Mihalopoulos2019). Finally, digital interventions could also be offered in the form of massive open online interventions (Ricardo et al., Reference Ricardo, Eduardo, Ken, Stephen, Julia, Elizabeth and Eliseo2016), providing open-access, self-managed care as a valuable treatment strategy where healthcare systems lack the infrastructure to provide mental health services at scale, a common barrier in low- and middle-income countries (Cuijpers, Karyotaki, Reijnders, Purgato, & Barbui, Reference Cuijpers, Karyotaki, Reijnders, Purgato and Barbui2018). Improving the balance and organization of treatment delivery could result in a considerable reduction in wait time, cost and overall disease burden (Chisholm et al., Reference Chisholm, Sweeny, Sheehan, Rasmussen, Smit, Cuijpers and Saxena2016; Cuijpers, Kleiboer, Karyotaki, & Riper, Reference Cuijpers, Kleiboer, Karyotaki and Riper2017).
Strengths and limitations
Notable strengths of the present meta-analysis include the magnitude of quantified comparisons giving the results power and precision, the systematic methodology used to search and select studies specific to anxiety disorders and the absence of observed publication bias.
Nonetheless, certain limitations must be acknowledged to cautiously interpret the results. Multiple factors may limit the generalizability of findings. First, the WLC, which is known to potentially inflate effects of treatment conditions (Furukawa et al., Reference Furukawa, Noma, Caldwell, Honyashiki, Shinohara, Imai and Churchill2014; Guidi et al., Reference Guidi, Brakemeier, Bockting, Cosci, Cuijpers, Jarrett and Fava2018), was used for 100% of control condition comparisons. Limited information is monitored and provided to describe the WLC condition, such as the extent to which routine care was used during the waitlist period. Moreover, other potential influencing factors such as treatment history and medication use should be examined, although the studies reported no differences among these factors when analyzing treatment and control group demographics. Second, only 8% of studies recruited participants from the clinical setting. Third, 96% of trials were conducted in continental Europe or Oceania. Fourth, the present meta-analysis was limited to adult samples. Fifth, the findings are based only on outcome data at post-treatment. Finally, assessing the validity of data extraction decisions in comparison to a parallel independent study may not yield a complete resolution of discrepancies, although high interrater reliability estimates found provide strong evidence of data quality and validity. These limitations considered, the present meta-analysis adds substantial strength to the evidence base for digital interventions for anxiety disorders.
Conclusion
The findings of the present study inform several priorities for future research. First, future research is encouraged to prioritize study designs that improve the generalizability of findings, such as trials with diverse control comparison groups and sufficient monitoring of control condition provisions, pragmatic effectiveness trials, various care populations and geographic locations of study. Also, a critical future step could be to examine which digital intervention formats and components most significantly influence treatment outcomes, such as the type and frequency of support offered during guided or unguided treatment. This advancement is considered essential for shaping policy and clinical practice guidelines for psychological interventions (Cuijpers, Cristea, Karyotaki, Reijnders, & Hollon, Reference Cuijpers, Cristea, Karyotaki, Reijnders and Hollon2019; England, Butler, & Gonzalez, Reference England, Butler, Gonzalez and IOM2015). Network meta-analysis (NMA) and component NMA are promising methods that have already been used to link probable differences in effectiveness to treatment components and component combinations for the treatment of panic disorder (Pompoli et al., Reference Pompoli, Furukawa, Efthimiou, Imai, Tajika and Salanti2018). Finally, extending the present meta-analysis across all age groups could provide insight regarding the psychopathology of anxiety disorders across the lifespan relative to the effectiveness of treatment per age group.
In conclusion, the powerful and precise effectiveness estimates found in the present meta-analysis provide a consequential part of the evidence base that could further justify the adoption of digital interventions practice. Enabling the adoption of digital interventions in practice, through accurate and rigorous effectiveness findings, could shape clinical practice guidelines and make effective treatment more accessible, affordable and effective.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291721001999.
Acknowledgements
None.
Financial support
This research received no specific grant from any funding agency, commercial or not-for-profit sectors.
Conflict of interest
No potential conflict of interest was reported by the authors.