Major depressive disorder (MDD) has a point prevalence of approximately 5% in the general population. Reference Otte, Gold, Penninx, Pariante, Etkin and Fava1,2 It is one of the most common comorbidities in individuals with physical conditions, with an estimated prevalence of up to 20%. Reference Momen, Plana-Ripoll, Agerbo, Benros, Børglum and Christensen3–Reference Gold, Köhler-Forsberg, Moss-Morris, Mehnert, Miranda and Bullinger5 In these populations, MDD is associated with worse quality of life, daily functioning, overall disease burden, medication adherence, disease progression and suicide risk. Reference Rosenblat, Kurdyak, Cosci, Berk, Maes and Brunoni4,Reference Read, Sharpe, Modini and Dear6–Reference Gürhan, Beşer, Polat and Koç9
Antidepressants are prescribed in about 15–30% of individuals with physical conditions. Reference Sanjida, Janda, Kissane, Shaw, Pearson and DiSipio10–Reference Cipriani, Furukawa, Salanti, Chaimani, Atkinson and Ogawa12 Although extensive meta-analyses on antidepressants in the general population are available, Reference Cipriani, Furukawa, Salanti, Chaimani, Atkinson and Ogawa12 most of the included trials have explanatory designs and typically exclude individuals with physical conditions, which raises concerns about the generalisability of these findings.
Rationale of the antidepressant treatment
Depression is a multifactorial disorder encompassing biological, psychological and social components. Biologically, various physical conditions can significantly influence inflammation, immunity, endocrine and metabolic pathways, stress responses and neural activity, increasing vulnerability to depression. Reference Krishnan and Nestler13–Reference Inserra, Mastronardi, Rogers, Licinio and Wong15 Moreover, each physical condition can activate different sets of etiological pathways. Similarly, various physical conditions may be associated with different psychological experiences based on their intrinsic features, such as the degree of social and work impairment or reduced life expectancy. For this reason, the efficacy and tolerability of antidepressants might be different in these individuals compared to the general population.
Treating depression and anxiety and enhancing quality of life in individuals with physical conditions is crucial, but antidepressants can pose risks of serious side-effects that may worsen the underlying illness. Reference Dragioti, Solmi, Favaro, Fusar-Poli, Dazzan and Thompson16 A large umbrella review Reference Köhler-Forsberg, Stiglbauer, Brasanac, Chae, Wagener and Zimbalski17 examining 176 systematic reviews across 43 physical diseases concluded that antidepressants are effective and safe in people with physical conditions and comorbid depression. However, the qualitative approach employed in this analysis prevents a clear estimation of the clinical effect of antidepressants and a comparison between available agents.
On these grounds, we performed a systematic review and network meta-analysis (NMA) to assess the comparative efficacy and tolerability of antidepressants in adults suffering from depression and comorbid physical conditions.
Method
This study was conducted and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for NMA Reference Hutton, Salanti, Caldwell, Chaimani, Schmid and Cameron18 (see Supplement A, pp. 2–4). The study protocol was registered in advance in the online repository Open Science Forum (https://osf.io/9cjhe/). Changes to the original protocol are reported in Supplement S (p. 179).
Search strategy and selection criteria
We performed a systematic review and NMA including randomised controlled trials (RCTs) comparing antidepressants with placebo and between each other. Only studies with at least 4 weeks of follow-up were included, as this is a reasonable timeframe within which we expect to observe a clinical response to antidepressants. We searched for RCTs that included adults (≥18 years of age) with a primary diagnosis of one or more physical conditions and a diagnosis of acute depressive episode (including single episode of depression, recurrent depressive disorder, mixed depressive and anxiety disorder or adjustment disorder with clinically relevant depressive symptoms) as assessed by clinical examination with or without the support of validated rating scales (e.g. Hamilton Depression Rating Scale (HDRS), Montgomery–Åsberg Depression Rating Scale (MADRS), Beck Depression Inventory (BDI)) or manualised diagnostic criteria (e.g. the Diagnostic and Statistical Manual of Mental Disorders (DSM) 19 and the International Classification of Diseases (ICD) 20 ). If the presence of clinically relevant depression was not clearly described, we assessed its presence according to baseline mean scores on validated rating scales measuring depression, considering the following validated cut-offs indicating at least ‘moderate’ depression: HDRS Reference Hamilton21 ≥ 14; MADRS Reference Montgomery and Asberg22 ≥ 20; BDI version I (BDI-I) Reference Beck, Ward, Mendelson, Mock and Erbauch23 ≥ 10 and version II (BDI-II) Reference Beck, Steer and Brown24 ≥ 14; Major Depression Inventory (MDI) Reference Bech, Timmerby, Martiny, Lunde and Soendergaard25 ≥ 26; Center for Epidemiologic Studies Depression Scale (CES-D) Reference Lewinsohn, Seeley, Roberts and Allen26 ≥ 16; Brief Zung Self-Rating Depression Scale (BZSRD) Reference Zung27 ≥ 44; Hospital Anxiety and Depression Scale (HADS) Reference Snaith and Zigmond28 ≥ 11; Patient Health Questionnaire (PHQ-9) Reference Kroenke, Spitzer and Williams29 ≥ 10; Symptom Checklist-90-Revised (SCL-90-R) Reference Derogatis, Lipman and Covi30 depression subscale ≥ 20. In the case of studies with mixed populations (i.e. depression and anxiety), we included only those with at least 80% of participants suffering from clinically relevant depression, as defined above.
We excluded conditions for which a clear pathogenesis could not be identified (e.g. physically unexplained symptoms, functional symptoms), or conditions for which a relevant psychiatric or psychological component may be particularly relevant (e.g. somatic symptoms disorders, fibromyalgia, irritable bowel syndrome). We also excluded minor depression, dysthymia, prolonged grief disorder or complicated bereavement. No exclusion criteria were applied in terms of setting (e.g. in- and out-patient settings; physical, surgical, psychiatric setting) or type of antidepressant, as even those that are uncommonly prescribed or no longer marketed (e.g. nomifensine) can indirectly contribute to estimate the differential effect between other antidepressants.
We searched electronic databases (MEDLINE, Embase, PsycINFO, CENTRAL, CINAHL), databases of regulatory agencies (e.g. Food and Drug Administration, European Medicines Agency) and trial registers (e.g. clinicaltrials.gov, the World Health Organization’s International Clinical Trials Registry Platform) from inception until 30 April 2024, without language restrictions (see Supplement B for the full research syntax). Two review authors (B.D.L., A.C.) independently assessed titles and abstracts of all retrieved records, and then the full texts of all potentially eligible records. Disagreements were resolved by discussion and consensus with a third review author (G.O.). Data extraction was performed between 1 June 2024 and 15 July 2024 by two review authors (B.D.L., A.C.) independently, in agreement with the recommendations of the Cochrane Handbook for Systematic Reviews of interventions. Reference Higgins, Thomas, Chandler, Cumpston, Li and Page31 Two review authors (G.V., A.C.) independently assessed the risk of bias of included studies using the Cochrane Risk-of-Bias tool, version 2 (RoB2). Reference Sterne, Savović, Page, Elbers, Blencowe and Boutron32 Disagreements were resolved by discussion and consensus with a third review author (G.O., C.B.).
Outcomes
The primary outcomes were (a) depressive symptoms measured as the mean change score (or, alternatively, the end-point score) Reference Ostinelli, Efthimiou, Luo, Miguel, Karyotaki and Cuijpers33 on validated rating scales (HDRS, BDI, MADRS or any other) by the end of the trial, and (b) tolerability, defined as the number of individuals withdrawing from treatment because of adverse events by the end of the trial, as a proportion of the total number of randomised participants.
Secondary outcomes included the following dichotomous outcomes: response; remission; acceptability; death caused by worsening of the physical condition; death from any cause; serious adverse events (SAEs); and the following continuous outcomes: anxiety symptoms; quality of life; social functioning.
Data analysis
Analyses were conducted with Stata (version 18.5 for macOS), and with the R programming language (R version 4.4.0 for macOS; R Foundation for Statistical Computing, Vienna, Austria; https://cran.r-project.org/) within the R Studio integrated development environment. For each outcome and comparison, we performed a standard pairwise, random-effects meta-analysis and a NMA with a random-effects model in a frequentist framework, using the R packages netmeta and the Stata package mvmeta. Reference Balduzzi, Rücker, Nikolakopoulou, Papakonstantinou, Salanti, Efthimiou and et al34,Reference White35 For dichotomous outcomes, risk ratios were log-transformed and pooled using a conventional normal-normal random-effects model with 95% confidence intervals, applying a strict intention-to-treat (ITT) approach (all randomly assigned participants as the denominator). For continuous outcomes, we pooled mean differences if all trials used the same rating scale; otherwise, we employed standardised mean differences (SMDs), including all participants that contributed to the analysis in the original trial. We calculated SMDs as the ratio between the mean difference between groups and the standard deviation of outcome among participants, according to the Cochrane Handbook. Reference Higgins, Thomas, Chandler, Cumpston, Li and Page31 If we could not retrieve missing data from trial authors, we imputed them with validated statistical methods, including imputing response and remission according to mean rating scale scores, standard deviations and validated cut-offs. Reference Higgins, Thomas, Chandler, Cumpston, Li and Page31,Reference Furukawa, Barbui, Cipriani, Brambilla and Watanabe36,Reference Furukawa, Cipriani, Barbui, Brambilla and Watanabe37
For the NMA, common heterogeneity across all comparisons was assumed in each network, and global heterogeneity was assessed using τ2 (low, τ ² ≤ 0.010; moderate, 0.010 < τ ² ≤ 0.242; and high, τ ² > 0.242), estimated by the DerSimonian–Laird method through the R netmeta function Reference Huhn, Nikolakopoulou, Schneider-Thoma, Krause, Samara and Peter38 and I² (low, 0–40%; moderate, 30–60%; substantial, 50–90%; and considerable, 75–100%). Reference Higgins, Thomas, Chandler, Cumpston, Li and Page31
To assess the assumption of transitivity, we compared the distribution of the following variables across treatment strategies: sample size; year of publication; follow-up duration; blinding (double blind versus open label); setting (in-patient, out-patient or mixed settings); industry sponsorship; mean age; percentage of female participants; mean psychopathology score at baseline; mean duration of depression; depression as the primary study outcome; mean number of co-occurring physical conditions; mean severity of the physical condition at baseline (according to a classification derived from the Severity of Illness Instrument). Reference Horn, Horn and Sharkey39
For the primary outcomes, we visually inspected boxplots on the distribution of continuous variables across treatments and assessed whether imbalances were large enough to threaten the transitivity assumption according to both the Kruskal–Wallis test and meta-regression analyses.
We evaluated the presence of inconsistency, defined as the statistical disagreement between direct and indirect evidence of a treatment comparison, by comparing direct and indirect evidence within each closed loop and comparing the goodness of fit for a NMA model that assumes consistency with a model that allows for inconsistency in a ‘design-by-treatment interaction model’ framework by using the R Studio netmeta package and decomp.design and netsplit commands. Reference Balduzzi, Rücker, Nikolakopoulou, Papakonstantinou, Salanti, Efthimiou and et al34,Reference Horn, Horn and Sharkey39 Where we found evidence of inconsistency, we investigated this further using both a node-splitting and a side-splitting approach between comparisons. For each outcome, we calculated the probability of each antidepressant of being at each possible rank and the treatment hierarchy by means of p-scores (mean normalised rank probabilities), which reflect the extent of certainty that a treatment is better than competing treatments, based on the point estimates and standard errors. Reference Rücker and Schwarzer40 If a comparison included ten studies or more, we assessed publication bias by visually inspecting the funnel plot, tested them for asymmetry with the Egger’s regression test and investigated possible reasons for funnel plot asymmetry.
We performed sensitivity analyses excluding RCTs with the following characteristics: placebo controlled; not double blind; with overall high risk of bias according to RoB2; with follow-up shorter than 3 months; efficacy on depressive symptoms was not the primary study aim; without a formal diagnosis of major depression (post hoc); with a small sample size (<50 participants; post hoc analysis); recruiting participants with illness of less than moderate severity. Furthermore, we performed meta-regression analyses on the same variables assessed for transitivity, to identify possible treatment effect moderators.
We also performed the following subgroup analyses for the primary outcomes, provided that at least three RCTs can be pooled together: analysing each subgroup of diseases (according to the ICD-11 classification) and each physical condition separately; grouping treatments according to their class, namely selective serotonin reuptake inhibitors (SSRIs), selective serotonin and noradrenaline reuptake inhibitors (SNRIs), tricyclic antidepressants (TCAs) and other antidepressants.
For the primary outcomes, certainty of the pooled evidence was assessed using the Confidence in Network Meta-Analysis (CINeMA) approach. Reference Nikolakopoulou, Higgins, Papakonstantinou, Chaimani, Del Giovane and Egger41
Results
We identified 2924 records after the database and hand-search. After removing duplicates and examining titles and abstracts, we selected 315 records for full-text assessment. Of these, 115 primary studies were eligible for inclusion. Reference Agarwal, Palka, Gajewski, Khan and Brown42–Reference Rampello, Alvano, Chiechio, Raffaele, Vecchio and Malaguarnera156 Of these, 104 RCTs including 7714 participants contributed to the analysis of efficacy, and 82 RCTs including 6083 participants contributed to the analysis of tolerability (Fig. 1; Supplement C, pp. 8–21). The study by Rampello et al Reference Rampello, Alvano, Chiechio, Raffaele, Vecchio and Malaguarnera156 met the inclusion criteria, but was not included in the meta-analysis because of concerns regarding the quality of the data presented. Results showed an exceptionally large effect size (SMD 5.9) for reboxetine versus placebo in post-stroke depression, which we deemed clinically improbable and possibly caused by a reporting error of the variance. Unfortunately, we could not retrieve additional information from the study authors to clarify this issue. Detailed characteristics of included studies are reported in Supplement D (pp. 22–7). The mean sample size of included studies was 80.9 individuals (median: 51; interquartile range (IQR): 31–88) with 55 (47.4%) studies including ≤50 participants. For the primary outcome of efficacy, the mean age of included participants was 56.1 years (s.d. 11.45) and the mean proportion of female participants was 51.8%, while for tolerability the mean age was 55.6 years (s.d. 12.45) and the mean proportion of female participants was 52.2%. According to the RoB2, 33 studies (31.7%) for the primary outcome of efficacy and six studies (7.3%) for tolerability had an overall high risk of bias (Supplement E, pp. 28–9). Table 1 describes the characteristics of studies included in the two primary analyses. The transitivity assumption was not violated for any of the potential effect modifiers analysed (Supplement F, pp. 30–2).

Fig. 1 Process of study selection.
Table 1 Characteristics of studies contributing to the co-primary outcomes

TCA, tricyclic antidepressants; SSRI, selective serotonin reuptake inhibitors; SNRI, serotonin and noradrenaline reuptake inhibitors.
In terms of efficacy (mean change at rating scales measuring depression), the following antidepressants outperformed placebo (Fig. 2; ordered by effect size): imipramine, nortriptyline, amitriptyline, desipramine among TCAs; sertraline, paroxetine, citalopram, fluoxetine, escitalopram among SSRIs; mianserin, mirtazapine, agomelatine among other antidepressants, with SMDs ranging from −1.01 (imipramine) to −0.34 (escitalopram). Head-to-head comparisons showed relatively few statistically significant differences between antidepressants (Supplement G, pp. 39–40). Notably, imipramine outperformed fluoxetine and venlafaxine. P-scores showed imipramine as the best-performing treatment, followed by mianserin and nortriptyline.

Fig. 2 Network map and forest plot for the primary outcome efficacy. SMD, standardised mean difference; SSRI, selective serotonin reuptake inhibitor; TCA, tricyclic antidepressant; SNRI, serotonin and noradrenaline reuptake inhibitor.
Overall, the NMA showed moderate heterogeneity (τ2 = 0.16; I 2 = 70.0%), and no overall incoherence emerged according to the global approach (design-by-treatment test, P = 0.30), while the local separate indirect from direct evidence (SIDE) approach showed significant inconsistency of two over 38 comparisons (agomelatine versus escitalopram; placebo versus fluoxetine).
Certainty of evidence according to the CINeMA approach was ‘low’ or ‘very low’ for most comparisons, with few exceptions (i.e. ‘moderate’ certainty for the comparisons desipramine versus placebo, dothiepin versus imipramine, imipramine versus nomifensine) (Supplement G, pp. 44–7).
Sensitivity analyses (Supplement G, pp. 48–52) removing studies not blinded, with follow-up durations shorter than 3 months, recruiting individuals without a formal diagnosis of depression, recruiting individuals with illness of less than moderate severity, for which the efficacy on depressive symptoms was not the primary outcome and those with a sample size <50 did not show relevant differences compared with the primary analysis in terms of heterogeneity, which instead was moderately reduced after removing placebo-controlled studies (τ2 = 0.02; I 2 = 18.6%). Effect estimates from sensitivity analyses did not change significantly compared to the primary analysis. Meta-regression analyses showed that increasing baseline severity of the physical condition was associated with a smaller treatment effect (β = −0.18; P = 0.04).
When performing subgroups on different clusters of diseases (Supplement G, pp. 52–9), sertraline and paroxetine outperformed placebo in four out of seven physical conditions; fluoxetine in three; nortriptyline and citalopram in two; agomelatine, imipramine, trazodone, desipramine, amitriptyline, mirtazapine, mianserin and moclobemide in one (Table 2). No consistency issues emerged in any of the subgroup analyses; heterogeneity was substantial for circulatory system diseases (I 2 = 85.5%, which remained high also after analysing RCTs on ischemic heart disease separately) (Table 2; Supplement G, pp. 52–9). After grouping antidepressants in classes, TCAs, other antidepressants and SSRIs outperformed placebo in terms of efficacy, and no differences emerged between classes. Heterogeneity was moderate, and no inconsistency issues emerged (Table 2, Supplement I, pp. 86–94).
Table 2 Subgroup analyses

ESM, effect size measure; TCA, tricyclic antidepressant; SMD, standardised mean difference; SSRI, selective serotonin reuptake inhibitor; IHD, ischemic heart disease; PCOS, polycystic ovarian syndrome; HIV, human immunodeficiency virus; AIDS, acquired immunodeficiency syndrome; COPD, chronic obstructive pulmonary disease.
In terms of tolerability (individuals discontinuing treatment because of adverse events), placebo outperformed sertraline, imipramine and nortriptyline (Fig. 3; ordered by effect size), with risk ratios ranging from 1.47 (sertraline) to 3.41 (nortriptyline). Head-to-head comparisons showed relatively few statistically significant differences between antidepressants, with amitriptyline, doxepin, paroxetine, sertraline and fluoxetine being more tolerable than nortriptyline (Supplement H, p. 65). P-scores showed doxepin as the best-performing treatment, followed by mianserin and venlafaxine.

Fig. 3 Network map and forest plot for the primary outcome tolerability. SSRI, selective serotonin reuptake inhibitor; TCA, tricyclic antidepressant; SNRI, serotonin and noradrenaline reuptake inhibitor.
Overall, the NMA on tolerability showed no heterogeneity (τ2 = 0; I 2 = 0%), and no overall incoherence emerged according to both the global approach (design-by-treatment test, P = 0.99) and the local SIDE approach, which did not show inconsistency for any of the 34 comparisons.
Certainty of evidence according to the CINeMA approach was ‘low’ or ‘very low’ for most comparisons, with few exceptions. Notably, certainty was ‘high’ for the comparisons amitriptyline versus paroxetine, sertraline versus placebo, citalopram and paroxetine; and ‘moderate’ for nortriptyline versus placebo, doxepin, venlafaxine, sertraline, paroxetine, citalopram, amitriptyline; amitriptyline versus citalopram, sertraline, placebo; paroxetine and citalopram versus placebo (Supplement H, pp. 70–4).
Sensitivity analyses did not affect heterogeneity or global and local consistency, and effect estimates did not change significantly compared to the primary analysis (Supplement H, pp. 75–9). Meta-regression analyses did not show any possible moderating effect of the analysed variables.
When performing subgroups on different clusters of diseases (Supplement H, pp. 80–5), sertraline was less tolerable than placebo for circulatory system diseases; imipramine was less tolerable than placebo for infectious diseases; citalopram and mirtazapine were less tolerable than placebo for nervous system diseases (Table 2) (Supplement J, pp. 95–101).
We examined the efficacy/tolerability balance by plotting point estimates of efficacy and tolerability against placebo in a two-dimensional plot (Fig. 4). While SSRIs and SNRIs appeared as relatively consistent groups, being more effective but slightly less tolerable than placebo, for TCAs and other antidepressants results were more heterogeneous, with nortriptyline and imipramine standing out as highly effective but poorly tolerated treatments, and mianserin appearing to be particularly effective and tolerable.

Fig. 4 Efficacy/tolerability balance. For each antidepressant, efficacy versus placebo expressed as standardised mean difference is plotted on the x-axis (values below 0 indicate better efficacy than placebo), and tolerability versus placebo expressed as risk ratios in logarithm scale are plotted on the y-axis (values below 0 indicate better tolerability than placebo). AGO, agomelatine; AMI, amitriptyline; CIT, citalopram; CLO, clomipramine; DES, desipramine; ESC, escitalopram; FLU, fluoxetine; IMI, imipramine; MIA, mianserin; MIR, mirtazapine; NEF, nefazodone; NOM, nomifensine; NOR, nortriptyline; PAR, paroxetine; SER, sertraline; SMD, standardised mean difference; SNRI, serotonin and noradrenaline reuptake inhibitor; SSRI, selective serotonin reuptake inhibitor; TCA, tricyclic antidepressant; TRA, trazodone; TRI, trimipramine; VEN, venlafaxine.
Findings for secondary outcomes are reported in Table 3 and in detail in Supplements K–R (pp. 102–78). In decreasing order of effect size, imipramine, nortriptyline, mianserin, moclobemide, paroxetine, agomelatine, mirtazapine, fluoxetine, citalopram, sertraline and escitalopram were superior to placebo in terms of response. Imipramine, moclobemide, desipramine, mianserin, nortriptyline, paroxetine, citalopram, amitriptyline, fluoxetine, sertraline and escitalopram were superior to placebo in terms of remission. No differences emerged between any antidepressant and placebo in terms of anxiety measured with rating scales, deaths caused by physical conditions and deaths from any cause. Paroxetine outperformed placebo in terms of quality of life; trazodone outperformed placebo in terms of social functioning; mianserin outperformed placebo, and placebo outperformed nortriptyline, in terms of all cause discontinuations; mianserin and citalopram outperformed placebo in terms of discontinuations because of inefficacy. Analyses on anxiety symptoms and social functioning showed substantial heterogeneity; the analysis on quality of life showed moderate heterogeneity, while the remaining analyses did not show relevant heterogeneity. No consistency issues emerged for the secondary analyses (Table 3, Supplements K–R, pp. 102–78).
Table 3 Secondary analyses

ESM, effect size measure; SMD, standardised mean difference.
Discussion
To our knowledge, this is the largest and most comprehensive systematic review and meta-analysis on the efficacy and tolerability of antidepressants in individuals suffering from depression and physical comorbidities. Overall, we found SSRIs, most of the commonly prescribed TCAs (i.e. imipramine, nortriptyline, amitriptyline and desipramine) and some of the other antidepressants (i.e. mianserin, mirtazapine and agomelatine) to be effective over placebo, although the certainty of evidence according to the CINeMA assessment was ‘low’ or ‘very low’ in most cases, mostly because of relevant within-study bias, imprecision and heterogeneity. Sertraline and paroxetine were effective for the largest number of physical conditions. For other efficacy-related outcomes, such as anxiety symptoms, quality of life, social functioning and inefficacy-related discontinuation, relatively few studies were analysed, providing imprecise results that prevent clear conclusions. In terms of undesirable effects, imipramine, nortriptyline and sertraline were less tolerable than placebo, with ‘low’, ‘moderate’ and ‘high’ certainty, respectively. It is important to note that while the magnitude of sertraline’s effect against placebo was comparable to other SSRIs, the estimate was significantly more precise because of the large number of trials and participants in which it was tested, thereby increasing the certainty of the evidence. As sertraline proved to be effective in many disease-specific subgroup analyses and no relevant differences in tolerability emerged when compared with other SSRIs, we consider that this medication is a reasonable first-line choice in physically ill individuals with depression. After comparing medication classes, SSRIs were more tolerable than TCAs but less tolerable than placebo. Secondary safety outcomes, such as deaths caused by the physical condition and from any cause, did not show differences between antidepressants and placebo.
Compared with the available literature, we found that the benefits of antidepressants in physically ill individuals are overall consistent with those detected in the general population. At the same time, tolerability appeared to be generally worse in the former. Reference Cipriani, Furukawa, Salanti, Chaimani, Atkinson and Ogawa12 This is particularly evident for some of the most commonly used TCAs, which have up to a 3.4-fold risk of clinically relevant adverse events, but is still relevant for SSRIs, which are generally recommended in people suffering from chronic physical conditions. Reference Köhler-Forsberg, Stiglbauer, Brasanac, Chae, Wagener and Zimbalski17,Reference Bigger and Glassman157,Reference Katon158 Although most trials involved participants who were older than those recruited in typical antidepressant studies (mean age around 55 years), a meta-regression analysis did not show a moderating effect of age on poorer tolerability of antidepressants.
The results of this NMA should be interpreted in the light of possible limitations. First, we analysed the effect of antidepressants on physically ill individuals, pooling together studies including different physical conditions. This choice might be questionable, as individuals with different diseases might also differ in terms of sociodemographic variables (e.g. distribution of genders) and clinical variables (e.g. level of disability, life expectancy), which can affect the response to antidepressants. Also, depression arising in different physical conditions might be theoretically underpinned by different etiopathological mechanisms (e.g. immunological, inflammatory, endocrinological), which can modulate the response to antidepressants. However, the similar distributions of sociodemographic, clinical and methodological variables across treatments, along with the absence of statistical inconsistency, indicate that the transitivity assumption was upheld in our analyses. In the efficacy analysis, moderate heterogeneity was found, and sensitivity analyses suggest that this was mostly related to the inclusion of placebo-controlled studies, which are generally more likely to deviate from real-world populations, rather than important differences across physical conditions. Further, most heterogeneity likely originated within clusters of studies on the same physical condition, particularly ischemic heart disease, as shown in subgroup analyses. The comparable efficacy of antidepressants across different physical conditions may be explained by their ability to target common underlying pathways, despite differences in the conditions themselves. Second, the quality of included studies was suboptimal because of relatively small sample sizes, bias related to deviations from intended interventions and missing outcome data. Third, although it is recognised that adverse events of antidepressants might differ according to specific vulnerabilities related to each physical condition, we could analyse only generic proxies of tolerability and safety in the overall population, including all conditions together. Further analyses on individual conditions might be useful to detect specific patterns of adverse events, although this aim was beyond the primary objective of our work. Finally, data on quality of life and social functioning were scarce, preventing firm conclusions on such important patient-centred clinical outcomes.
Despite these limitations, the findings of this NMA might have relevant implications for clinical practice, policy and research. According to our results, SSRIs represent a first choice in physically vulnerable populations, with paroxetine and sertraline being investigated in the largest number of physical conditions and being supported by the strongest certainty of evidence. However, a higher vulnerability of comorbid individuals should be recognised also for commonly prescribed antidepressants, such as SSRIs, highlighting the importance of closer clinical monitoring. The use of TCAs remains a highly effective option for the management of depressive symptoms, although, in these populations, the risk of adverse events is arguably higher compared to individuals without comorbid conditions. Thus, this choice should be limited to selected individuals under close clinical monitoring. Other antidepressants, particularly mianserin, mirtazapine and agomelatine, showed relatively large effect sizes in terms of efficacy, although imprecise estimates and overall poor certainty warrant further experimental investigation. Moreover, these medications might effectively target distressing symptoms that are commonly reported in some physical conditions (e.g. insomnia, inappetence), and might have a favourable tolerability profile because of the lack of direct serotonin reuptake inhibition (e.g. lower risk of bleeding and gastrointestinal symptoms). In general, future studies should routinely assess patient-centred outcomes, such as quality of life and social functioning, that are particularly important in these populations.
In conclusion, tailoring treatments to meet individual patient needs, while balancing benefits and risks, poses a significant challenge for practitioners treating individuals with depression and physical comorbidities. Including the best evidence base available within a shared decision-making process is essential to improve the quality of individualised treatments globally.
Supplementary material
The supplementary material is available online at https://doi.org/10.1192/bjp.2025.18
Data availability
The database will be made available upon motivated request to the corresponding author, G.O., which will be considered and discussed by the main authors of the study (B.D.L., A.C., C.B., G.V., G.O.).
Author contributions
B.D.L.: conceptualisation, data curation, formal analysis, visualisation, writing (original draft preparation), writing (review and editing). A.C.: conceptualisation, data curation, visualisation, writing (original draft preparation), writing (review and editing). C.M.: data curation, visualisation, writing (original draft preparation). C.G.: conceptualisation, methodology, writing (review and editing). D.P.: conceptualisation, methodology, writing (review and editing). A.M.: data curation, visualisation, writing (original draft preparation), writing (review and editing). F.T.: formal analysis, methodology, writing (review and editing). F.A.: methodology, supervision, writing (review and editing). M.P.: methodology, supervision, writing (review and editing). M.S.: methodology, supervision, writing (review and editing). C.B.: conceptualisation, methodology, supervision, writing (review and editing). G.V.: conceptualisation, data curation, formal analysis, methodology, writing (original draft preparation), writing (review and editing). G.O.: conceptualisation, data curation, formal analysis, methodology, supervision, writing (original draft preparation), writing (review and editing).
Funding
This work was supported by #NEXTGENERATIONEU (NGEU), which is funded by the Ministry of University and Research (MUR) and the National Recovery and Resilience Plan (NRRP), project MNESYS (PE0000006) – A multiscale integrated approach to the study of the nervous system in health and disease (DN. 1553 11.10.2022).
Declaration of interest
M.S. received honoraria/has been a consultant for Angelini, AbbVie, Boehringer Ingelheim, Lundbeck and Otsuka. The other authors have no potential conflicts of interest to disclose.
Transparency statement
The authors affirm that this manuscript is an honest, accurate and transparent account of the study being reported. No important aspects of the study have been omitted, and any discrepancies from the registered study protocol have been fully explained.
eLetters
No eLetters have been published for this article.