In psychiatry, the need to measure the impact of treatments on patient outcomes has led to a gradual increase in the variety of instruments available and in their use as outcome measures in clinical trials. These instruments, in the form of questionnaires or rating scales, generate continuous outcome data, in which each individual's outcome is measured as a number. Continuous data are data that can take any value in a specified range, for example weight, rating scale scores, area and volume; in principle, any value may be measured and reported to arbitrarily many decimal places.
In terms of data management and analysis, continuous outcomes may be categorized (into two categories, such as improved and not improved) or kept continuous. The aim of this ABC of Methodology is to briefly discuss the pros and cons of these two approaches, which are commonly employed in the analysis of rating scale scores in clinical trials and systematic reviews; a toy illustration of both approaches follows.
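To make the two approaches concrete, the following minimal Python sketch analyses the same simulated rating-scale scores both ways: once dichotomized against an arbitrary response cut-off, and once kept continuous as a difference in means. All means, standard deviations and the cut-off are invented for illustration and carry no clinical meaning.

```python
import random
import statistics

random.seed(1)

# Hypothetical end-of-trial rating scale scores (lower = better);
# every parameter below is invented purely for illustration.
drug = [random.gauss(14, 6) for _ in range(200)]
placebo = [random.gauss(16, 6) for _ in range(200)]

# Approach 1: dichotomize against an (arbitrary) cut-off of 15 points
# and compare proportions of "responders".
cutoff = 15
p_drug = sum(score < cutoff for score in drug) / len(drug)
p_placebo = sum(score < cutoff for score in placebo) / len(placebo)
print(f"Responders: drug {p_drug:.0%}, placebo {p_placebo:.0%}")

# Approach 2: keep the data continuous and compare group means.
md = statistics.mean(drug) - statistics.mean(placebo)
print(f"Mean difference (drug - placebo): {md:.2f} points")
```

The same trial thus yields either a difference in proportions or a difference in means, and the rest of this article weighs the merits of each presentation.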
Clinically, re-expressing continuous data as dichotomous can facilitate understanding and applicability of results, as it allows doctors to express results in terms of the proportion of patients who respond rather than in terms of means and standard deviations (Table 1). In clinical trials and meta-analyses of trial data, categorization of continuous outcome measures allows us to express differences between competing treatments in terms of risk difference, relative risk or odds ratio, which are commonly employed and relatively easy to understand measures of treatment effect. However, dichotomizing leads to several problems (Table 1) (Altman & Royston, 2006). A first issue is that it may seriously underestimate the extent of variation in outcome between groups, losing information and statistical power. This increases the risk of a type II error, that is, of failing to detect a difference that is real, a major drawback in clinical trials and meta-analyses. A second issue is that defining the cut-point may not be straightforward; the choice may be rather arbitrary and not based on any solid clinical reasoning. Consequently, individuals close to, but on opposite sides of, the cut-point are treated as very different rather than very similar, which is clinically counterintuitive (Table 1). A third issue is that re-expressing continuous data as dichotomous may artificially produce large differences in proportions. Moncrieff & Kirsch (2005) hypothesized a one-point difference between drug and placebo in mean change scores on the Hamilton rating scale for depression. They showed that, if improvement is normally distributed and response is defined as a minimum 12-point improvement (a criterion close to the mean improvement), response rates of 50% in the drug group and 32% in the placebo group can be obtained.
[Table 1 not reproduced here.] RD, risk difference; RR, relative risk; OR, odds ratio; NNT, number needed to treat.
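The Moncrieff & Kirsch figures can be reproduced, at least approximately, under their stated assumptions. In the sketch below, the group means and the standard deviation (about 2.1 points) are hypothetical values chosen only so that the response criterion sits at the drug group's mean improvement and the published 50% versus 32% rates emerge; the effect measures abbreviated in Table 1 are then computed from those proportions.

```python
from statistics import NormalDist

# Hypothetical normally distributed improvements on the Hamilton scale:
# a one-point difference in mean change, with an SD (~2.1) chosen purely
# so that the 50% / 32% rates of Moncrieff & Kirsch (2005) are reproduced.
drug = NormalDist(mu=12.0, sigma=2.1)
placebo = NormalDist(mu=11.0, sigma=2.1)
criterion = 12.0  # "response" = improvement of at least 12 points

p_drug = 1 - drug.cdf(criterion)        # ~0.50
p_placebo = 1 - placebo.cdf(criterion)  # ~0.32


def odds(p):
    return p / (1 - p)


rd = p_drug - p_placebo               # risk difference
rr = p_drug / p_placebo               # relative risk
or_ = odds(p_drug) / odds(p_placebo)  # odds ratio
nnt = 1 / rd                          # number needed to treat

print(f"Response: drug {p_drug:.0%}, placebo {p_placebo:.0%}")
print(f"RD {rd:.2f}, RR {rr:.2f}, OR {or_:.2f}, NNT {nnt:.1f}")
```

A one-point mean difference, arguably trivial on the Hamilton scale, thus reappears as an 18-percentage-point difference in response rates with an NNT of about 5, illustrating how a cut-point placed near the mean can inflate the apparent treatment effect.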
In contrast, the main advantage of keeping data continuous is that all the available information is used (Table 1). This is of paramount importance, as even a small difference between two means may have a significant impact on many patients. Guyatt et al. (1998) hypothesized a randomized clinical trial showing a mean difference of 0.25 on a questionnaire for which the minimal important difference is 0.5. It may be erroneously concluded that the difference is not clinically relevant, but this interpretation rests on the assumption that every treated patient scored 0.25 better than they would have scored had they received the control treatment. It ignores the possibility that the treatment effect is heterogeneous and, depending on the true distribution of results, the appropriate interpretation might be different. Keeping data continuous may therefore help identify heterogeneity in treatment effect (Table 1). At the same time, however, this approach has several problems (Table 1). The first concern is that expressing results in terms of means and standard deviations is counterintuitive in clinical practice, where doctors treat individual patients. The clinical meaning of differences in means may be difficult to extrapolate, as mean differences are not easily translated into proportions of patients who may benefit. In meta-analyses of trial data, additionally, the need to pool means and standard deviations from different rating scales has led to the use of standardized mean differences, which are an artefact in the sense that they refer to a theoretical reference rating scale that does not exist in real life.
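Guyatt et al.'s point about heterogeneity can be made concrete with an invented example: two distributions of individual benefit that share the same 0.25 mean difference yet contain very different proportions of patients reaching the 0.5 minimal important difference. All numbers below are hypothetical.

```python
from statistics import mean

MID = 0.5  # minimal important difference on the questionnaire

# Scenario A: homogeneous effect; every patient gains exactly 0.25,
# so nobody reaches the minimal important difference.
homogeneous = [0.25] * 100

# Scenario B: heterogeneous effect; half the patients gain 0.5
# (an important benefit) and half gain nothing. Same mean gain.
heterogeneous = [0.5] * 50 + [0.0] * 50

for name, gains in (("homogeneous", homogeneous),
                    ("heterogeneous", heterogeneous)):
    share = sum(g >= MID for g in gains) / len(gains)
    print(f"{name}: mean gain {mean(gains):.2f}, "
          f"patients reaching the MID: {share:.0%}")
```

Both scenarios print the same mean gain of 0.25, but the proportion of patients obtaining an important benefit is 0% in one and 50% in the other, which is why the distribution behind a mean difference matters.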
We argue that critical appraisal of findings from randomized clinical trials and systematic reviews should consider how continuous outcome data from rating scales have been handled and analyzed. Physicians should be encouraged to interpret study findings taking into consideration all the implications of re-expressing continuous data as dichotomous versus keeping them continuous. Physicians should also be aware that it is possible to design clinical trials (Lieberman et al. 2005) and systematic reviews (Barbui et al. 2008) that, instead of relying on rating scale scores as primary outcome measures, employ pragmatic outcomes (Barbui et al. 2007), such as suicide attempts, treatment switching, hospitalization, school failure or truancy, job loss, or even dropping out of the trial itself. These outcomes have the added value of being very close to real life and of requiring no form of manipulation.