The term orthorexia nervosa (ON) was coined by Bratman (1997; Bratman & Knight, 2000) to describe an excessive preoccupation with consuming only biologically pure foods and eating in a healthy way, leading to dietary limitation, malnutrition, and underweight. People with this problem are worried about the quality of food, which for them is more important than its quantity or taste. They avoid foods considered “impure”, unhealthy, or contaminated (Moroze, Dunn, Holland, Yager, & Weintraub, 2015; Varga, Dukay-Szabó, Túry, & van Furth, 2013). This preoccupation, when extreme, leads to restricted social lives, as orthorexic people do not eat out because they do not trust the quality of the food or the way it is cooked. They spend excessive amounts of money and time buying food and planning, preparing, and consuming meals (Koven & Abry, 2015; Varga et al., 2013), and if they do not follow their rigid self-imposed rules, they feel guilty and punish themselves (Moroze et al., 2015; Varga et al., 2013).
Although the interest in ON has increased in recent years and professionals from the field of eating disorders recognize this behavioral pattern in their own practice (Vandereycken, 2011), with descriptions of some clinical cases resembling the definition of ON (Moroze et al., 2015), this concept is still controversial. ON is not recognized as a disorder by the Diagnostic and Statistical Manual of Mental Disorders (5th Ed.; American Psychiatric Association, 2013) or by the International Classification of Diseases (10th Ed.; World Health Organization, 1992). Moreover, there is no consensus about the definition of ON or the possible diagnostic criteria. There have been two attempts to establish diagnostic criteria, by Moroze et al. (2015) and by Dunn and Bratman (2016). Both models conceptualize orthorexia as obsessive preoccupation or focus (which in our opinion would be better conceptualized as an overvalued idea, see Veale, 2002). Moroze’s criteria include aspects such as guilt, the excessive amount of money and time consumed, and exclusion criteria (see Moroze et al., 2015) that are not included in Dunn’s proposal. Dunn’s criteria include the idea that the accomplishment of the self-imposed eating behavior excessively determines the body image, self-worth, identity and/or satisfaction (Dunn & Bratman, 2016).
Some authors point to a symptom overlap between ON and eating disorders, obsessive-compulsive disorder, hypochondriasis, and even psychotic disorders (Brytek-Matera, 2012). Previous studies found significant associations between ON scores and eating disorder symptomatology (Bundros, Clifford, Silliman, & Morris, 2016; Brytek-Matera, Krupa, Poggiogalle, & Donini, 2014; Musolino, Warin, Wade, & Gilchrist, 2015; Segura-Garcia et al., 2012, 2015; Stochel et al., 2015) and obsessive-compulsive traits and symptoms (Koven & Senbonmatsu, 2013; Segura-Garcia et al., 2012, 2015). However, further studies are needed to clarify what place ON should take in the spectrum of disordered eating.
To help clarify the research on ON, it is crucial to have valid instruments for its assessment. As far as we know, three instruments have been developed with this goal in mind: Bratman’s test (Bratman & Knight, 2000), with no validation studies; the ORTO-15 test (Donini, Marsili, Graziani, Imbriale, & Cannella, 2005); and more recently, the Eating Habits Questionnaire (EHQ; Gleaves, Graham, & Ambwani, 2013). The limitation of the EHQ is that it does not include items reflecting the negative emotionality associated with ON, such as the sadness, fear, shame, guilt, frustration, self-loathing, and self-punishment that patients experience when they are forced to eat or are faced with unhealthy food. Until now, the ORTO-15 has been translated and validated in several languages, and it has been used in most of the studies in the field, as if it were considered the gold standard in the assessment of ON, independently of its psychometric properties.
The ORTO-15 was developed based on Bratman’s Test (Bratman & Knight, 2000). It consists of 15 items with responses on a four-point scale ranging from Always to Never. Donini et al. (2005) considered that the questionnaire tapped three dimensions (although no factor analysis was provided): Six items addressed a cognitive-rational area (e.g., “When eating, do you pay attention to the calories of the food?”); five items were about the clinical area (e.g., “In the last three months, did the thought of food worry you?”); and four items were about the emotional area (e.g., “When you go in a food shop, do you feel confused?”). Four items were reverse scored and, importantly, two items followed a specific scoring method: For Item 1 and Item 13, Always = 2, Often = 4, Sometimes = 3, and Never = 1. We will refer to these two items as recoded items. A single total score is computed by adding the scores on each item. Lower total scores are indicative of ON, and high scores indicate normal eating behavior.
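The scoring rules described above can be illustrated with a minimal Python sketch. The recoded mapping for Items 1 and 13 is taken from the text, and the reverse-keyed item numbers (2, 5, 8, and 9) are those named in the Analysis section. The numeric coding of the remaining direct items (Always = 1 through Never = 4) and its mirror image for reverse-keyed items are assumptions chosen to agree with the statement that lower totals indicate ON; the function name `orto15_total` is hypothetical.

```python
# Numeric codes per response option.
DIRECT = {"Always": 1, "Often": 2, "Sometimes": 3, "Never": 4}    # assumed coding
REVERSED = {"Always": 4, "Often": 3, "Sometimes": 2, "Never": 1}  # assumed mirror of DIRECT
RECODED = {"Always": 2, "Often": 4, "Sometimes": 3, "Never": 1}   # stated in the text

REVERSE_KEYED_ITEMS = {2, 5, 8, 9}  # item numbers given in the Analysis section
RECODED_ITEMS = {1, 13}             # the two "recoded" items


def orto15_total(answers):
    """Sum the 15 item scores; `answers` lists the responses to Items 1..15."""
    if len(answers) != 15:
        raise ValueError("ORTO-15 requires exactly 15 answers")
    total = 0
    for item_number, answer in enumerate(answers, start=1):
        if item_number in RECODED_ITEMS:
            total += RECODED[answer]
        elif item_number in REVERSE_KEYED_ITEMS:
            total += REVERSED[answer]
        else:
            total += DIRECT[answer]
    return total
```

Under these assumptions, a respondent answering Always to every item would obtain 29 points and one answering Never to every item would obtain 42, which shows how the non-monotonic recoding compresses the range of the total score.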
The ORTO-15 questionnaire was originally created in Italian, and it has been translated and validated in various languages: Turkish (Arusoğlu, Kabakçi, Köksal, & Kutluay Merdol, 2008), Portuguese (Alvarenga et al., 2012), Hungarian (Varga, Thege, Dukay-Szabó, Túry, & van Furth, 2014), Polish (Brytek-Matera et al., 2014; Stochel et al., 2015), and German (Missbach et al., 2015). Five studies have analyzed the internal structure of the ORTO-15, with very different results among them. These studies have used principal components analysis (Alvarenga et al., 2012; Arusoğlu et al., 2008), which is not an appropriate technique to evaluate the internal factor structure of a set of items (e.g., Widaman, 2007), and confirmatory factor analysis (Brytek-Matera et al., 2014; Missbach et al., 2015; Varga et al., 2014), which could be considered a doubtful choice, given that the internal structure was fairly unclear. A summary of the proposed structures can be seen in Table 1. The number of factors retained ranges from one (Varga et al., 2014) to three (Alvarenga et al., 2012; Arusoğlu et al., 2008).
In some versions, several items have been dropped due to problematic loadings, leading to shortened versions ranging from nine items (Brytek-Matera et al., 2014; Missbach et al., 2015) to 12 items (Alvarenga et al., 2012). Of the 15 items, nine have been dropped in one or more of the validation studies: the four reverse-keyed items, the two items that are not linear, and Items 6, 14, and 15.
Note: ITA = Italian version; TUR = Turkish version; POR = Portuguese version; POL = Polish version; HUN = Hungarian version; GER = German version; SPA = Spanish version; P = problematic item (deleted in a previous version); R = reversed item; RC = recoded item. Shaded cells correspond to items removed from the final version.
The present study had two goals. The first goal, given that there are no data about the ORTO-15 in the Spanish language, was to carry out the adaptation of the ORTO-15 to Spanish and study its psychometric properties. The second, given the inconsistencies in the previous results about the internal structure of the ORTO-15, was to offer additional evidence about its factorial structure. To do so, we studied two independent samples in order to cross-validate our results.
Method
Participants and Procedure
Sample 1
Data were collected through the Internet. The link to the survey was distributed through the e-mail distribution lists of students from the university where the first two authors worked. Participants completed the protocol, which took approximately 20 minutes. After reading the description of the study, participants provided informed consent, where the anonymity of the responses was clearly stated. Participants had to be at least 18 years old to take the survey, and participants who reported having a diagnosis of any mental disorder and/or taking psychotropic medication were not included in the study. Individuals did not receive any compensation for their participation in the study.
A total of 837 participants completed the measures. Data from 30 participants were excluded because they reported having a mental disorder. Of the 807 remaining participants, 598 were women (74.1%), and 209 were men (25.9%). The mean age was 23.65 years (SD = 6.01, range [18, 66]). Regarding education, 0.1% of the sample reported having no studies or only primary studies, 1.1% had secondary studies, 74.8% were university students, and 23.9% had completed university studies.
Sample 2
This sample was recruited using the “snowball” method. Advanced psychology students at the third author’s university participated voluntarily and received one course credit for their recruitment efforts. They attended a two-hour seminar where they received training about the purpose of the study and how to present the study and the instruments to prospective participants. Each student individually administered the assessment protocol to three friends and/or family members. Participants were volunteers. Before completing the questionnaires, each participant gave written informed consent. Participants who had current mental health problems and/or were taking psychotropic medication were not included in the study. Two weeks later, each student administered a retest of the ORTO-15 questionnaire to one of the three friends and/or family members. A total of 72 participants provided responses to the ORTO-15 on both test and retest.
A total of 248 participants completed the measures. Data from six participants were excluded because they reported having a mental disorder. Thus, 242 individuals made up this sample, of whom 153 were women (63.2%). The mean age was 24.94 years (SD = 7.07; range [18, 54]). Regarding education, 7% of the sample reported having primary studies, 21.1% had secondary studies, and 71.1% were university students or had completed university studies. The mean BMI was 22.41 (SD = 3.30; range [16.94, 35.14]). More than half of the sample were students (62.4%), single (73.6%), and had a self-rated medium socio-economic level (63.3%). The present study was approved by the corresponding ethics committee.
Instruments
In Sample 1, the following questionnaires were administered:
Sociodemographics
Participants reported their sex, age, and education level.
ORTO-15 (Donini et al., 2005)
This is the questionnaire under study, and it is described in detail in the introduction section. It was translated into Spanish by a group of clinical psychologists with strong English language skills. It was then reviewed by a professional native translator and back-translated into English.
In Sample 2, in addition, these questionnaires were included:
Sociodemographics
As in Sample 1, participants in Sample 2 provided information about their sex, age, and education level, and additionally reported their weight (to the nearest kilogram) and height (to the nearest centimeter). Moreover, they were asked four yes/no questions: “go on a diet”, “practice regular physical exercise”, “tobacco consumption”, and “alcohol consumption”.
Eating Attitudes Test-26 (EAT-26; Garner, Olmsted, Bohr, & Garfinkel, 1982)
This 26-item questionnaire measures the frequency of the individual’s behavior or attitudes about eating disorders. Items are responded to on a 6-point scale ranging from Always to Never, and responses are recoded to a 4-point scale. Although the internal structure of the EAT-26 is not completely clear (e.g., Koslowsky et al., 1992; Ocker, Lam, Jensen, & Zhang, 2007), three different subscales are usually considered: (a) a Diet dimension score (13 items; e.g., “I am terrified about being overweight”), (b) a Bulimia and Food Preoccupation score (6 items; e.g., “I find myself preoccupied with food”), and (c) an Oral Control score (7 items; e.g., “I avoid eating when I am hungry”). For the present study, the Spanish version was used (Castro, Toro, Salamero, & Guimerá, 1991). Cronbach’s alphas for the present sample are shown in Table 4.
Self-Report Yale–Brown Cornell Eating Disorders Scale (SR-YBC-EDS; Bellace et al., 2012)
The Spanish adaptation of the SR-YBC-EDS is composed of two parts. Part 1 assesses symptom frequency in the past month through a 65-item checklist with a 5-point scale ranging from Never = 0 to Always = 4. Two different scores are usually computed, a score for Preoccupations (21 items; e.g., “Think excessively about the fat content of the food”), and a score for Rituals (44 items; e.g., “Need to cut each piece of food into a specific size”). Part 2 of the SR-YBC-EDS assesses, on a 5-point scale ranging from None = 0 to Extreme = 4, the severity of preoccupations and rituals separately, based on four items about time occupied, interference, distress, and degree of control. The Spanish version was used (Perpiñá, Giraldo-O’Meara, Roncero, & Martínez-Gómez, 2015). Cronbach’s alphas for the present sample are shown in Table 4.
Analysis
All the analyses, whenever possible, were performed with both samples in order to check whether they cross-validated. We analyzed our data in four steps. First, we studied the scoring scheme for the ORTO-15. As previously noted, all the recoded items and all the reverse-keyed items have been found problematic in at least one validation study. We computed the Pearson correlations between the total score computed using only the direct items and (a) the items that should be recoded (Items 1 and 13; with and without recoding) and (b) the reverse-keyed items (Items 2, 5, 8, and 9; with non-reversed scores and reversed scores). Evidence favoring the scoring scheme proposed by Donini et al. (2005) would be found if the correlations were higher for recoded and reversed scores, as proposed.
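The scoring check in this first step can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' SPSS procedure: raw responses are assumed to be coded 1 = Always to 4 = Never, the lookup tables and the function name `correlation_with_direct_total` are hypothetical, and the toy data are invented purely to show the mechanics.

```python
import numpy as np

# Lookup tables indexed by the raw response code (index 0 is unused).
REVERSE = np.array([0, 4, 3, 2, 1])  # assumed reversal: 1<->4, 2<->3
RECODE = np.array([0, 2, 4, 3, 1])   # Always=2, Often=4, Sometimes=3, Never=1


def correlation_with_direct_total(responses, direct_cols, target_col, lookup=None):
    """Pearson r between one item and the sum of the direct items.

    `responses` is an (n_people, n_items) integer array of raw codes 1..4.
    If `lookup` is given, the target item is transformed before correlating.
    """
    total = responses[:, direct_cols].sum(axis=1)
    item = responses[:, target_col]
    if lookup is not None:
        item = lookup[item]
    return np.corrcoef(item, total)[0, 1]


# Toy data (invented): columns 0-1 are direct items, column 2 is the target.
toy = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]])
r_raw = correlation_with_direct_total(toy, [0, 1], 2)
r_reversed = correlation_with_direct_total(toy, [0, 1], 2, REVERSE)
r_recoded = correlation_with_direct_total(toy, [0, 1], 2, RECODE)
```

Reversing an item only flips the sign of its correlation with the total, whereas the non-monotonic recoding changes its magnitude as well, so the two transformations have to be compared separately against the untransformed scores.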
Second, we studied the internal structure of the scale. Considering the inconsistencies in previous studies, we opted for an exploratory factor analysis (EFA). To determine the number of dimensions to be retained, we used parallel analysis, following the recommendation of Garrido, Abad, and Ponsoda (2013), and interpretability of the solutions. Models were analyzed using weighted least squares–mean and variance adjusted (WLSMV). By using this estimator instead of one based on maximum-likelihood, we were able to maintain the categorical nature of the responses and obtain less biased estimates (Finney & DiStefano, 2006; Rhemtulla, Brosseau-Liard, & Savalei, 2012). Goodness-of-fit in all derived models was assessed with the common cut-off values for the fit indices (Hu & Bentler, 1999). Thus, as indications of model fit, we considered whether the comparative fit index (CFI) and Tucker-Lewis index (TLI) had values greater than .95, and whether the root mean square error of approximation (RMSEA) was less than .06.
Third, after defining the internal structure of the scale, and removing items if needed, we evaluated the reliability of the scores by computing Cronbach’s alpha and the test-retest correlation. Fourth, and finally, we computed correlations between the ORTO scores and BMI, EAT-26, and SR-YBC-EDS scores, and we evaluated mean differences with Cohen’s d.
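The logic of parallel analysis can be illustrated with a short Python sketch. It is a simplified version under explicit assumptions: it uses Pearson correlations and normally distributed random data, and it compares each real eigenvalue with a chosen percentile of the random eigenvalues. The source does not state which correlation type or comparison rule the R syntax used, so `parallel_analysis`, `n_sims`, and `percentile` are hypothetical names and defaults.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Count leading eigenvalues that exceed their random-data counterparts.

    `data` is an (n_people, n_items) array. Returns the number of factors
    suggested, the real eigenvalues, and the random-data thresholds.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Eigenvalues of the observed correlation matrix, largest first.
    real = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    random_eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        sim = rng.standard_normal((n, p))  # uncorrelated data of the same size
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False))[::-1]
    thresholds = np.percentile(random_eigs, percentile, axis=0)
    n_factors = 0
    for real_value, threshold in zip(real, thresholds):
        if real_value <= threshold:
            break  # stop at the first eigenvalue that random data can match
        n_factors += 1
    return n_factors, real, thresholds
```

For ordinal items such as those of the ORTO-15, a polychoric correlation matrix would be a natural replacement for `np.corrcoef`; the retention rule itself would stay the same.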
EFA was performed with Mplus 7 (Muthén & Muthén, 1998–2012), without modifying any of its default specifications, so a GEOMIN rotation was applied. Parallel analysis was computed with syntax written in R. For all the other analyses, SPSS 20 was used.
Results
According to Donini et al. (2005), Item 1 and Item 13 are not linear with ON, and so they must be recoded. For both items and both samples, when the items were recoded, the correlations with the total score (computed only with direct items) were not only lower in magnitude, but also negative: The mean correlation before recoding was .50; the mean correlation after recoding was –.35 (see Table 2).
For the items that were designed to be reverse-keyed, the correlation with a total score (again, computed with only the direct items) should be negative before reversing and positive after being reversed. Contrary to this expectation, for two out of the four reverse-keyed items, the correlations were negative after being reversed for both samples (mean correlation = –.31). For the remaining two items, the correlations were positive (see Table 2), but very small (mean correlation = .07). Considering these results, in the next steps we kept the original scores of the items, without reversing or recoding.
For both samples, the first two factors in the EFA had eigenvalues that were clearly above their random data counterparts (real versus random eigenvalues –Sample 1/Sample 2–: factor 1 = 5.13/4.05 vs. 1.36/1.70, factor 2 = 1.59/1.94 vs. 1.27/1.54), whereas the third eigenvalue was below its random counterpart (real = 1.17/1.23, random = 1.23/1.42). For Sample 1, model fit could be considered satisfactory [χ2(76) = 275.40, CFI = .962, TLI = .947, RMSEA = .057]. For Sample 2, model fit was not as good [χ2(76) = 131.35, CFI = .935, TLI = .911, RMSEA = .055]. The correlation between factors was equal to .32 in Sample 1 and –.06 in Sample 2.
When assessing the interpretability of a two-factor solution, shown in Table 3, we found several problems. Item 14 presented very low loadings in any factor for both samples (maximum loading = .25). For several items, allocation to a single dimension was not possible, as the loadings were very similar in both factors (Item 4 and Item 12 in Sample 1, Item 6 in Sample 2). Finally, the pattern of results was not consistent across samples.
Note: 2FS = Two factor solution (for ORTO-15). 1FS = One factor solution (for ORTO-11). F1= First factor. F2 = Second factor.
Considering the problems with a two-factor solution, we decided to create a unidimensional version of the test. To do so, we retained a single factor with the scores from Sample 1, and we removed the items with loadings below .30 in this solution. With this cut-off criterion, Items 5, 6, 8, and 14 were discarded (loadings –.10, .29, –.06, .28, respectively). The loadings of the shortened version can be seen in Table 3. For Sample 1, the fit of this model can be considered to be basically satisfactory [χ2(44) = 267.86, CFI = .954, TLI = .942, RMSEA = .079]. The mean, maximum, and minimum loadings were .62, .87, and .31, respectively.
We checked the stability of the factor solution for the shortened test with Sample 2. Although the model fit approached satisfactory levels [χ2(44) = 84.13, CFI = .947, TLI = .934, RMSEA = .061], the loadings of two items fell below .30. The loading of Item 2 was .25, and for Item 15, it was .16. Considering that the size of Sample 1 is much larger than Sample 2, to avoid further item elimination, and accepting that this shortened version is not consistent in both samples, we considered that the best solution was a shortened version of the scale with the 11 items selected with Sample 1. We will call this shortened version ORTO-11. For comparability with previous studies, we also computed scores for the ORTO-15 following the indications of the authors.
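The degrees of freedom and RMSEA values reported for the EFA models can be cross-checked from the chi-square statistics and sample sizes. The sketch below uses the standard degrees-of-freedom formula for an EFA with p items and m factors, and a common RMSEA formula with N in the denominator; some programs use N − 1 instead, so the choice of N here is an assumption, as are the function names.

```python
import math


def efa_df(n_items, n_factors):
    """Degrees of freedom of an exploratory factor model."""
    return ((n_items - n_factors) ** 2 - (n_items + n_factors)) // 2


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA (variant with N in the denominator)."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * n))


# (label, chi-square, items, factors, N), all taken from the Results section.
reported_models = [
    ("Sample 1, two factors, 15 items", 275.40, 15, 2, 807),
    ("Sample 2, two factors, 15 items", 131.35, 15, 2, 242),
    ("Sample 1, one factor, 11 items", 267.86, 11, 1, 807),
    ("Sample 2, one factor, 11 items", 84.13, 11, 1, 242),
]

for label, chi2, items, factors, n in reported_models:
    df = efa_df(items, factors)
    print(f"{label}: df = {df}, RMSEA = {rmsea(chi2, df, n):.3f}")
```

With these inputs the sketch gives 76 and 44 degrees of freedom and RMSEA values of .057, .055, .079, and .061, matching the figures in the text. Only the two-factor solutions fall below the .06 cut-off.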
The reliability of the scores on the ORTO-11, as measured by Cronbach’s alpha, was satisfactory for both samples and times (Sample 1 – α = .83; Sample 2, test – α = .74; Sample 2, retest – α = .78). Scores showed a high level of temporal stability, with a Pearson correlation of .92; p < .001. When we consider the original ORTO-15, Cronbach’s alphas were .20 for Sample 1 and .23 for Sample 2 (test), and .45 (retest); test-retest correlation was equal to .78.
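For readers who want to reproduce this kind of reliability estimate, Cronbach's alpha can be computed from an item-score matrix as in the following minimal sketch. The function name `cronbach_alpha` is hypothetical, and the snippet illustrates the standard formula rather than the SPSS routine used in the study.

```python
import numpy as np


def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_people, n_items) array of item scores."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)      # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of the sum score
    return n_items / (n_items - 1) * (1 - item_variances.sum() / total_variance)
```

Alpha rises when the items covary strongly relative to their individual variances. Items that correlate negatively with the rest of a scale reduce the variance of the total and pull the coefficient down, which is one way a scoring scheme can depress alpha.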
The associations between variables calculated with Sample 2 can be seen in Table 4. To simplify the interpretation, we computed scores for the ORTO-11 and ORTO-15 so that higher values are indicative of higher ON. The correlation between the ORTO-11 and the ORTO-15 was medium-sized (r = .44). When we considered the additional variables measured, in all cases the associations were higher with the ORTO-11 scores than with the ORTO-15 scores. Considering only the ORTO-11, this measure presented a high overlap (rs > .50) with the Diet and Bulimia factors of the EAT-26 and with Preoccupations and Rituals of the SR-YBC-EDS. BMI and ORTO-11 scores were positively correlated, r = .22. Women and non-consumers of alcohol presented higher scores on the ORTO-11, both Cohen’s d = .29. Importantly, those on a diet presented much higher mean ORTO-11 scores than those who were not on a diet, d = 1.14. The relationship with smoking or physical exercise was not statistically significant.
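The standardized mean differences reported here are Cohen's d values. A minimal sketch of the usual pooled-standard-deviation version is shown below; the source does not say which variant of d was computed, so the pooled form and the function name `cohens_d` are assumptions.

```python
import numpy as np


def cohens_d(group_a, group_b):
    """Cohen's d: mean of group_a minus mean of group_b, in pooled-SD units."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    pooled_variance = (
        (len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)
    ) / (len(a) + len(b) - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_variance)
```

For example, `cohens_d([2, 4, 6], [1, 3, 5])` evaluates to 0.5, because the two groups differ by one point and share a standard deviation of two. By the same logic, d = 1.14 means that dieters scored more than one pooled standard deviation above non-dieters.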
Note: ORTO-11 = Reduced version of the ORTO-15 scale; EAT = Eating Attitudes Test; YBC = Self-Report Yale–Brown Cornell Eating Disorders Scale; BMI = body mass index. Underlined values correspond to statistically significant correlations or mean differences (p < .05). To facilitate the interpretation of ORTO-11 and ORTO-15 scores, and contrary to Donini et al. (2005), higher values correspond to higher orthorexia nervosa. Sex was coded with a dummy variable, where 0 = women and 1 = men. Dieting, exercise, smoking, and alcohol were coded with dummy variables, where 0 = no and 1 = yes.
Discussion
The ORTO-15 is the most widely used instrument to measure ON. Results about its internal structure have been inconsistent across validation studies in several languages. Therefore, the main objective of the present study was to analyze the psychometric properties of this instrument. To do so, we relied on two independent samples from a non-clinical population. Moreover, no data about the Spanish version of this questionnaire had previously been published.
First, the scoring scheme of the questionnaire was studied. We found no evidence supporting the advisability of recoding some items, and some items that were supposed to be reversed offered higher correlations with the direct items when non-reversed. This original scoring scheme has also been found to be problematic in previous studies (Alvarenga et al., 2012).
Regarding the internal structure of the questionnaire, in a first step, a two-factor solution was found. However, due to interpretability problems (i.e., several items had similar loadings in both factors, and the patterns of results were inconsistent in the two samples), we opted for a unidimensional solution. To do so, four items had to be removed, leading to the ORTO-11. This version of the ORTO questionnaire corresponds to the one found in the Hungarian version. Moreover, as Table 1 shows, Items 8 and 14 have repeatedly been deleted in the Polish, German, and Hungarian versions.
When the stability of the one-factor solution of the ORTO-11 was tested in Sample 2, the model fit was satisfactory, but two items showed loadings lower than .30: items 2 and 15. These items have been found to be problematic in other validations (see Table 1). Item 2 was deleted in the Portuguese, Polish, and German versions, whereas Item 15 was deleted in the Polish version.
The internal consistency with the 11 items was adequate (α = .74). This result agrees with the Hungarian version (α = .82), which shares the same structure. However, in the Portuguese, Turkish, and German versions, with other item structures, the internal consistency was not adequate (α < .70). The Cronbach’s alpha of our proposed version of the ORTO is much higher than the Cronbach’s alpha of the full version with item reversal and recoding. Moreover, our shortened version presented higher temporal stability.
The ORTO-11 was associated with external variables to a larger degree than the original ORTO-15. We consistently found that higher eating psychopathology scores were associated with higher scores on the ORTO-11, mainly Diet and Bulimia factors of the EAT-26 and Preoccupations and Rituals factors of the SR-YBC-EDS. This pattern of results suggesting the association with eating psychopathology agrees with previous studies (Barnes & Caltabiano, 2016; Brytek-Matera et al., 2014; Bundros et al., 2016; Sanlier, Yassibas, Bilici, Sahin, & Celik, 2016).
However, this result about the association between ON and eating disorders should be interpreted with caution, due to the redundancy observed between the ORTO-15 and EAT-26. For instance, the Item 1 wording – “When eating, do you pay attention to the calories of the food?”– presents a high overlap with the wording of Item 6 on the EAT-26 (“I am aware of the calorie content of foods that I eat”), but this item on the EAT-26 is supposed to measure dieting, not ON. There is also a similar overlap between Item 7 on the ORTO-15 –“Does the thought of food worry you for more than three hours a day?”– and items 3 and 21 on the EAT-26 –“Find myself preoccupied with food” and “Give too much time and thought to food”–.
The ORTO-11 was associated with a measure that evaluates preoccupations and rituals related to eating disorders, the SR-YBC-EDS. Results showed that the severity of preoccupations and rituals was associated with orthorexia in a positive way, i.e., the greater the severity of the preoccupations and rituals, the greater the severity of ON. However, the association was lower when compared to the indices for the association between ORTO-11 and EAT-26. The association between SR-YBC-EDS and ORTO has been observed in previous studies (Segura-García et al., 2012, 2015) and agrees with the suggestion that ON is associated with some obsessive-compulsive traits (Bundros et al., 2016; Brytek-Matera, 2012; Koven & Abry, 2015; Koven & Senbonmatsu, 2013; Varga et al., 2013).
Integrating our results with those from previous studies, mainly those from Missbach et al. (2015), we consider that the use of the ORTO-15, and the shortened versions derived from it, should be discontinued. The performance of this instrument has been inconsistent across samples, with almost as many versions as studies. The method for computing a total score is not clear, as the items that should be reversed or recoded are found to be problematic or offer better performance without incorporating the changes suggested by Donini et al. (2005). The content validity of the instrument is doubtful. The high correlations with EAT-26 and SR-YBC-EDS scores, the theoretically unexpected correlation between the ORTO-11 and BMI, and the very large mean difference between dieters and non-dieters, can be easily interpreted if we consider that, instead of assessing ON, the ORTO-11 is mainly measuring dieting or restrictive eating. The satisfactory reliability of the ORTO-11 scores indicates that this scale measures something with a small error, but the other available data suggest that we are not measuring what we are supposed to be measuring, ON.
The present study has some limitations. Data are based on self-reports, and both samples were convenience samples mainly composed of university students. The questions about going on a diet, doing exercise, or alcohol and tobacco consumption were yes/no questions, and in future studies these behaviors should be assessed in more detail to better understand the ON pattern in non-clinical populations. Cronbach’s alphas for the Bulimia and Oral Control scales were rather low, probably due to a floor effect. And finally, our Spanish version of the ORTO-15 was translated from the English version, and not directly from the Italian version.
This study represents an additional effort to evaluate the validity of the ORTO-15. We have done so with two independent samples and starting from the basics: the scoring scheme, internal structure with an exploratory analysis, and relation with other variables. The validity of the instrument has been questioned by other authors (see Dunn & Bratman, 2016; Koven & Abry, 2015; Moroze et al., 2015; Varga et al., 2014), and our results agree with these previous studies, suggesting that the psychometric properties of the ORTO-15 are not adequate. Our results indicate that the ORTO-11 scores detect people who are on a diet, but this instrument is not efficient in detecting the severity of orthorexic behaviors and attitudes. With all of this in mind, greater attention is needed on other measures of ON (e.g., the EHQ) or on the development of new instruments. Furthermore, much of what is supposed to be known about ON should be reconsidered, as the ORTO, used in most of the studies as if it were considered the gold-standard, is not very golden.