A proof-of-concept study applying machine learning methods to putative risk factors for eating disorders: results from the multi-centre European project on healthy eating

Published online by Cambridge University Press:  29 November 2021

I. Krug*
Affiliation:
Melbourne School of Psychological Sciences, The University of Melbourne, Melbourne, VIC, Australia
J. Linardon
Affiliation:
School of Psychology, Deakin University, Geelong, Australia
C. Greenwood
Affiliation:
Centre for Social and Early Emotional Development, Deakin University, Burwood, Australia Centre for Adolescent Health, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC, Australia
G. Youssef
Affiliation:
School of Psychology, Deakin University, Geelong, Australia Centre for Adolescent Health, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, VIC, Australia
J. Treasure
Affiliation:
Department of Psychological Medicine, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
F. Fernandez-Aranda
Affiliation:
Eating Disorders Unit, Department of Psychiatry, University Hospital of Bellvitge, Barcelona, Spain Consorcio CIBER, Fisiopatología de la Obesidad y Nutrición (CIBERObn), Instituto de Salud Carlos III (ISCIII), Madrid, Spain Department of Clinical Sciences, School of Medicine and Health Sciences, University of Barcelona, Barcelona, Spain Psychiatry and Mental Health Group, Neuroscience Program, Institut d'Investigació Biomèdica de Bellvitge—IDIBELL, L'Hospitalet de Llobregat, Spain
A. Karwautz
Affiliation:
Department of Child and Adolescent Psychiatry, Medical University of Vienna, Vienna, Austria
G. Wagner
Affiliation:
Department of Child and Adolescent Psychiatry, Medical University of Vienna, Vienna, Austria
D. Collier
Affiliation:
SGDP Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, De Crespigny Park, Denmark Hill, London, UK Discovery Neuroscience Research, Eli Lilly and Company Ltd, Lilly Research Laboratories, Erl Wood Manor, Surrey, UK
M. Anderluh
Affiliation:
Department of Child Psychiatry, University Children's Hospital, University Medical Center Ljubljana, Ljubljana, Slovenia
K. Tchanturia
Affiliation:
Department of Psychological Medicine, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
V. Ricca
Affiliation:
Department of Neuroscience, Psychology, Drug Research and Child Health, University of Florence, Florence, Italy IRCCS Fondazione Don Carlo Gnocchi, Florence, Italy
S. Sorbi
Affiliation:
Department of Neuroscience, Psychology, Drug Research and Child Health, University of Florence, Florence, Italy IRCCS Fondazione Don Carlo Gnocchi, Florence, Italy
B. Nacmias
Affiliation:
Department of Neuroscience, Psychology, Drug Research and Child Health, University of Florence, Florence, Italy IRCCS Fondazione Don Carlo Gnocchi, Florence, Italy
L. Bellodi
Affiliation:
Department of Neuropsychiatric Sciences, Fondazione Centro San Raffaele del Monte Tabor, Milan, Italy
M. Fuller-Tyszkiewicz
Affiliation:
School of Psychology, Deakin University, Geelong, Australia Centre for Social and Early Emotional Development, Deakin University, Burwood, Australia
*
Author for correspondence: I. Krug, E-mail: [email protected]

Abstract

Background

Despite a wide range of proposed risk factors and theoretical models, prediction of eating disorder (ED) onset remains poor. This study undertook the first comparison of two machine learning (ML) approaches [penalised logistic regression (LASSO), and prediction rule ensembles (PREs)] to conventional logistic regression (LR) models to enhance prediction of ED onset and differential ED diagnoses from a range of putative risk factors.

Method

Data were part of a European Project and comprised 1402 participants, 642 ED patients [52% with anorexia nervosa (AN) and 40% with bulimia nervosa (BN)] and 760 controls. The Cross-Cultural Risk Factor Questionnaire, which assesses retrospectively a range of sociocultural and psychological ED risk factors occurring before the age of 12 years (46 predictors in total), was used.

Results

All three statistical approaches had satisfactory model accuracy, with an average area under the curve (AUC) of 86% for predicting ED onset and 70% for predicting AN v. BN. Predictive performance was greatest for the two regression methods (LR and LASSO), although the PRE technique relied on fewer predictors with comparable accuracy. The individual risk factors differed depending on the outcome classification (EDs v. non-EDs and AN v. BN).

Conclusions

Even though the conventional LR performed comparably to the ML approaches in terms of predictive accuracy, the ML methods produced more parsimonious predictive models. ML approaches offer a viable way to modify screening practices for ED risk that balance accuracy against participant burden.

Type
Original Article
Copyright
Copyright © The Author(s), 2021. Published by Cambridge University Press

Introduction

Eating disorders (EDs) are psychiatric conditions characterised by high rates of comorbidity and relapse (e.g. Klump, Bulik, Kaye, Treasure, & Tyson, 2009). Decades of retrospective and longitudinal studies have sought to elucidate the key factors that contribute to EDs (e.g. Striegel-Moore & Bulik, 2007). Converging evidence across studies indicates that various socio-cultural (Stice, 2002), demographic, family history, negative life event, personality, and genetic factors play a role in the onset and progression of EDs (Jacobi, Hayward, de Zwaan, Kraemer, & Agras, 2004). The large number of potential risk factors suggests that the pathway to an ED is complex and heterogeneous.

Most studies that have aimed to identify ED risk factors have done so using a set of commonly used statistical methods (e.g. general or generalised linear models) applied to a single dataset, without sufficient consideration of model parsimony or replicability of results. Although these studies have significantly advanced knowledge around the aetiology of EDs and have led to effective prevention programmes (Stice, Marti, Shaw, & Rohde, 2019), conventional statistical approaches have limitations that can hinder our ability to fully understand and more accurately predict ED onset and progression.

Conventional logistic regression (LR)-based approaches are best suited for the examination of smaller subsets of predictor variables that are theoretically justified. This is because the blind inclusion of many predictors can cause overfitting, an artefactual increase in the predictive accuracy of a model simply because it contains more predictors than are genuinely important. Moreover, when studies focus on sets of predictors that are unique to each study, overfitting can occur because there is rarely an opportunity to evaluate the performance of a predictive model in new datasets, which affects the generalisability of findings. These practices reduce the ability to identify a parsimonious model within a large and complex set of risk variables and can mean that predictive models have very low cross-sample replicability (Ranganathan, Pramesh, & Aggarwal, 2017). Thus, although previous studies have appropriately utilised conventional LR approaches on smaller subsets of predictors within a single dataset (e.g. Krug et al., 2009), it is likely necessary to incorporate the breadth of putative risk factors to establish which factors are most important and, in turn, devise the best models to accurately predict ED outcomes (Ali et al., 2019).

Moreover, although several ED-relevant models incorporate interaction terms within their models (e.g. Bardone-Cone, Abramson, Vohs, Heatherton, & Joiner, 2006), the list of proposed interactions is typically short, and there is seldom exploration of interactions that may be useful, yet not anticipated a priori (Stice & Desjardins, 2018). In light of evident heterogeneity in risk factor predictive value across samples, hitherto unexplored yet complex interactions among risk factors may enhance prediction. Traditional general(ised) linear models are ill-suited to searches for interaction terms as they are reliant on user anticipation of relevant variables to model in this way.

Machine-learning (ML) methods can overcome the limitations associated with conventional statistical approaches and are optimally suited to enhance the prediction of ED onset and progression. ML involves a constellation of data-driven techniques that enable computer algorithms to identify and iteratively refine the ideal parameters to fit complex patterns between variables (Bzdok & Meyer-Lindenberg, 2018). ML approaches are well suited to circumstances where there is a large amount of data to model and the best combination of predictors is uncertain a priori, as some ML approaches [such as prediction rule ensembles (PREs)] are designed to facilitate the detection of complex (non-linear) high-dimensional interactions that might inform predictions at the individual patient level (Dwyer, Falkai, & Koutsouleris, 2018). Accumulating research has demonstrated the advantages of ML over conventional statistical approaches in predicting the course, trajectory, and treatment outcomes of severe psychiatric conditions (e.g. Chekroud et al., 2016).

In contrast, the application of ML methods to predict ED onset and progression is limited. Some studies have used ML-based classification-tree analyses to establish empirically derived cut-points on various baseline measures for identifying individuals most at risk for an ED (for review, see Stice, 2016). Results from these studies suggest that there may be many transdiagnostic and diagnostic-specific interactions between putative risk factors that increase the probability of future illness onset and progression. However, it is noteworthy that each investigated a small number of putative risk factors that feature in prominent theoretical models, and as such did not capitalise on the capability of ML to handle and detect complex patterns among many variables. Furthermore, these studies have not compared ML analyses against conventional LR models, so it remains unknown whether ML methods improve accuracy and model parsimony over commonly used methods for predicting ED risk outcomes.

Two recent prospective studies that examined the progression of EDs capitalised on the strengths of ML methods. The first study, by Haynos et al. (2020), used ML techniques (elastic net, random forests) to predict ED behaviour and diagnostic persistence from more than 50 self-reported sociodemographic, clinical, and patient history variables in 320 participants at Year 1 and 277 individuals at Year 2 follow-up assessments. They found that the ML models provided consistently higher prediction accuracy over 1 and 2 years than the conventional LR models (with 19% greater classification accuracy). Another recent study, by Espel-Huynh et al. (2021), compared ML approaches [support vector machine (SVM) and k-nearest neighbours] to conventional LR models to generate personalised predictions of symptom trajectories among 333 ED patients during the first two weeks of residential treatment. Contrary to Haynos et al.'s (2020) findings, this study revealed that the ML models did not improve predictive power beyond that achieved by the LR analyses.

The current study

The present retrospective study builds on those two prior studies by applying ML techniques to several putative ED risk factors on two outcomes: the presence v. absence of an ED, and the type of ED diagnosis (AN v. BN). The first outcome was selected because it is more important to identify those at elevated risk for any ED. This is because prevention programmes should ideally target all EDs, rather than just one subtype (Stice et al., 2019). The second outcome was selected as it enabled evaluation of whether this predictor set could also differentiate participants at the level of the ED subtype, thus informing treatment planning efforts.

The aims of the present study were as follows. First, we sought to compare two different statistical ML approaches against the traditional LR approach for accuracy in the prediction of our key outcomes. Second, we aimed to explore which predictors were identified as important in these competing models.

Methods

Participants and procedure

Data were drawn from six ED centres in five different European countries. The current sample of 1402 participants comprised 588 ED patients [333 with AN (174 with AN-restrictive subtype (AN-R), 159 with AN-binge purging (AN-BP) subtype) and 255 with BN (224 with BN purging subtype and 31 with BN non-purging subtype)] and 760 healthy controls (54 had missing data on diagnosis). Diagnoses (DSM-IV-R, APA, 2000) were derived through a semi-structured clinical interview carried out by experienced psychologists and psychiatrists. Most of the ED participants were recruited from the participating clinical sites.

The healthy control group was recruited from community sources from the same catchment areas. The exclusion criterion for the control group was a lifetime history of any health or mental illness (including EDs), screened by the General Health Questionnaire-28 (Goldberg, 1981). Ethical approval for the study was obtained from all study sites.

The cross-cultural (environmental) risk factor questionnaire (CCQ)

This retrospective self-administered questionnaire comprises a total of 51 items, divided into six sections, which assess a wide range of factors related to the development and maintenance of EDs. Information on how the CCQ items were developed can be found in our previous publications (e.g. Krug et al., 2009).

Only items asking about events before the age of 12 years that took place before the ED emerged were included. A full list of all the 46 risk factors split into the four sections [(1) Eating and weight concerns; (2) Individual and family eating patterns; (3) Family style, expectations, and lifestyle behaviours; (4) Social ideals of thinness] can be found in Table 1 and a copy of the full CCQ questionnaire can be requested from the corresponding author. Most of these questions were answered in a YES/NO format or on a five-point response scale ranging from ‘Not at all’ to ‘Extremely’.

Table 1. Sample demographics

ED, eating disorder; AN, anorexia nervosa; BN, bulimia nervosa.

a 54 cases missing ED status in raw data, but imputed in analyses; M, mean; s.d., standard deviation; n, number of cases.

Participants were also asked to provide their demographics, including age, gender, level of education and employment. In Section 3, questions on lifestyle behaviours were included because the developmental phenotype of social difficulties (i.e. loneliness, shyness, inferiority, and low social support) commonly found in ED patients (e.g. Krug et al., 2013) could be assessed indirectly through these CCQ questions.

A previous study assessing the psychometric properties of the CCQ (including all 127 items; Penelo et al., 2011) provided satisfactory accuracy for discriminating between ED cases and controls (area under the ROC curve = 0.88). In previous publications, Cronbach alpha values ranged from 0.75 to 0.92 (Krug et al., 2009; Penelo et al., 2011).

Statistical analyses

Three modelling approaches were used for comparison: (1) conventional binary LR, (2) LASSO, and (3) PREs. The conventional binary LR included all 46 risk factors simultaneously and retained all these factors in the final model. By contrast, the LASSO approach aims to balance parsimony with overall model performance by penalising predictors that make only small contributions to the model, shrinking their effect sizes towards zero, with the outcome being a model that retains only those predictors that are important (Tibshirani, 1996). Finally, interaction effects were examined using PREs. PRE is a relatively new statistical learning method that automatically identifies stratifications within a set of predictors that improve the prediction of an outcome. These stratifications (subgroups defined by multiple interacting variables) are defined by cut points that are estimated by the PREs and can be written as simple rules of the form if [condition] then [prediction] (Fokkema & Strobl, 2020). The condition could be a decision rule that spans multiple interacting variables (e.g. if an individual reports dieting and weight-related teasing occurred in childhood, then they are more likely to have an ED currently). Specific decision rules are ranked based on a measure known as importance, which can take values >0 (higher values mean stronger performance of the prediction rule) and accounts for both the magnitude of the rule's regression coefficient and the extent to which the stratification varies in the sample (Fokkema & Strobl, 2020).
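As a rough illustration of the rule form described above, the following Python sketch (not the authors' R code) encodes one hypothetical rule as a 0/1 indicator and computes an importance value as the absolute coefficient multiplied by the standard deviation of that indicator in the sample, sqrt(s(1 - s)), where s is the proportion of participants meeting the condition. The participant data, variable names, and coefficient are invented for illustration, and it is an assumption that this formula, which is commonly given for rule ensembles, matches what the pre package computes.

```python
import math

# Hypothetical participants: 1 = reported in childhood, 0 = not reported.
sample = [
    {"dieting": 1, "teasing": 1},
    {"dieting": 1, "teasing": 0},
    {"dieting": 0, "teasing": 0},
    {"dieting": 0, "teasing": 1},
]

def rule_applies(person):
    """if [dieting AND weight-related teasing] then [higher ED likelihood]."""
    return int(person["dieting"] > 0 and person["teasing"] > 0)

def rule_importance(coefficient, indicators):
    """|coefficient| times the SD of the 0/1 rule indicator in the sample."""
    support = sum(indicators) / len(indicators)  # share meeting the condition
    return abs(coefficient) * math.sqrt(support * (1 - support))

indicators = [rule_applies(p) for p in sample]
importance = rule_importance(0.8, indicators)  # 0.8 is an invented coefficient
```

A rule that every participant satisfies (or none does) has zero spread and therefore zero importance under this definition, however large its coefficient.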

All analyses were conducted in R version 6.3. Standard and penalised LR were estimated using the glmnet package version 3.0.2 (Friedman, Hastie, & Tibshirani, 2010) and PREs were estimated using the pre package version (Fokkema & Strobl, 2020). Several analytic decisions were held constant across these methods. First, an imputed data set was created for the control v. ED and BN v. AN samples using the mice package version 3.8.0 (Van Buuren & Groothuis-Oudshoorn, 2011). For each comparison, a positive coefficient reflected a higher risk for EDs or AN, whereas negative coefficients indicated a higher risk for non-ED or BN. Continuous variables were standardised by dividing scores by 2 standard deviations, as recommended to improve alignment with binary indicators (Gelman, 2008). Second, the overall dataset was divided with an 80/20% split for training and test sets, respectively, to mitigate potential overfitting to the training set and improve out-of-sample performance. Reported results are based on the 20% test set. Third, the predictive performance of models was assessed using the area under the curve (AUC) of the receiver operating characteristic and F1 scores, a metric that incorporates both the positive predictive value (a.k.a. precision) and sensitivity (a.k.a. recall) of model-predicted outcomes. In accordance with other ML studies (Haynos et al., 2020), an AUC of 0.5 was taken to indicate no discrimination and an AUC of 1 perfect discrimination. AUC scores were classified as ‘extremely poor’ (0.5–0.59), ‘poor’ (0.60–0.69), ‘fair’ (0.70–0.79), ‘good’ (0.80–0.89), or ‘excellent’ (>0.90). Fourth, all models were run to predict both ED v. non-ED cases (n = 1402) and AN v. BN cases (n = 588, AN and BN only).
Further details specific to each analytic approach are provided in the online Supplementary Material (S1).
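The analytic steps held constant across methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' R pipeline: the function names, the random seed, the 0.5 classification cut-off, and the toy data are all hypothetical, and treating an AUC of exactly 0.90 as 'excellent' is an assumption because the text gives that band as >0.90.

```python
import random
import statistics

def scale_by_two_sd(values):
    """Divide a continuous predictor by two sample SDs, as the text describes."""
    two_sd = 2 * statistics.stdev(values)
    return [v / two_sd for v in values]

def split_train_test(rows, test_share=0.2, seed=1):
    """Shuffle a copy of the rows and hold out test_share of them."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seed is arbitrary
    n_test = round(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]

def auc(labels, scores):
    """Chance that a random case outscores a random non-case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1_score(labels, predicted):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def auc_label(value):
    """Descriptive bands quoted in the text (0.90 itself assumed 'excellent')."""
    if value >= 0.90:
        return "excellent"
    if value >= 0.80:
        return "good"
    if value >= 0.70:
        return "fair"
    if value >= 0.60:
        return "poor"
    return "extremely poor"

# Invented toy data: 1 = ED case, 0 = control.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2]           # model-predicted probabilities
predicted = [int(s >= 0.5) for s in scores]  # 0.5 cut-off is an assumption
toy_auc = auc(labels, scores)
toy_f1 = f1_score(labels, predicted)
train, test = split_train_test(range(10))
```

Because the AUC is computed from the ranking of predicted probabilities while F1 depends on a chosen classification cut-off, the two indices can order the same models differently.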

Results

Sample descriptives

Demographics for the full sample, the ED samples, including AN and BN and the healthy control group are presented in Table 1. Most of the sample was female, employed and had high education levels.

Predictive performance

Predictive performance (AUC) was greater for distinguishing ED from healthy individuals (LR = 0.88; LASSO = 0.89; PREs = 0.82) than for differentiating individuals with AN v. BN (LR = 0.71; LASSO = 0.70; PREs = 0.69; Table 2).

Table 2. Predictive performance of the different statistical models

ED, eating disorder; AN, anorexia nervosa; BN, bulimia nervosa; AUC, area under the curve.

AUC and F1 scores range from 0–1, with higher scores reflecting greater accuracy.

Based on AUC scores, predictive performance for distinguishing ED from healthy controls was greatest for the LASSO method, followed by conventional LR and then the PREs. However, the methods were comparable for the overall ED analyses based on F1 scores (LR = 0.80; LASSO and PREs = 0.81), which consider precision and recall. For differentiating AN from BN cases, predictive performance was comparable based on AUC scores, although F1 scores favoured the PREs (0.74), followed by conventional LR (0.71), and then the LASSO approach (0.69).

Important predictors

Conventional and penalised (LASSO) logistic regression

Coefficient estimates for the conventional and penalised (LASSO) LR models predicting ED (v. non-ED) and AN (v. BN) are presented in Table 3.

Table 3. Parameter estimates for conventional logistic regression and penalised (LASSO) regression for ED v. Non-ED and AN v. BN

ED, eating disorder; AN, anorexia nervosa; BN, bulimia nervosa.

For the conventional LR approach, we retained all predictors in the model. The strongest five predictors of ED status were own appearance concerns influenced eating (β = 0.81), number of times ate in fast-food restaurants (β = 0.80), unwanted sexual experiences (β = 0.64), doing schoolwork at school (β = 0.61) and family relationships influenced eating (β = 0.61).

The strongest five predictors of AN status were knowing anyone with an AN diagnosis (β = 1.08), had first meal of day before lessons started (β = 0.99), played computer games (β = 0.93), food prepared for respondent (β = 0.79), and own appearance concerns influenced eating (β = −0.71).

LASSO analyses retained 26 predictors of both non-ED v. ED and AN v. BN. The strongest five predictors of ED status were own appearance concerns influenced eating (β = 0.75), family relationships influenced eating (β = 0.56), doing schoolwork at school (β = 0.43), relationships with friends influenced eating (β = 0.41) and number of meals ate in fast-food restaurants (β = 0.38).

The strongest five predictors of AN status were played computer games (β = 0.59), own appearance concerns influenced eating (β = −0.59), having first meal of day before lessons started (β = 0.58), schoolwork at school (β = 0.44) and knowing someone with an AN diagnosis (β = 0.41).
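One way to see why the LASSO can drop predictors entirely, while the retained coefficients tend to be smaller than their conventional LR counterparts, is the soft-thresholding operator. For a linear model with uncorrelated, standardised predictors, the LASSO estimate equals the unpenalised estimate shrunk towards zero by the penalty and set to exactly zero when it is smaller than the penalty. The Python sketch below illustrates that special case only. The predictor names, coefficient values, and penalty are invented and are not taken from Table 3; a real penalised logistic fit with correlated predictors, such as the one reported here, is estimated iteratively and will not reduce to this one-step formula.

```python
def soft_threshold(beta, lam):
    """Shrink beta towards zero by lam; return exactly 0.0 if |beta| <= lam."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0

# Invented unpenalised coefficients and penalty, for illustration only.
unpenalised = {"predictor_a": 0.81, "predictor_b": -0.30, "predictor_c": 0.05}
lam = 0.10

penalised = {name: soft_threshold(b, lam) for name, b in unpenalised.items()}
retained = [name for name, b in penalised.items() if b != 0.0]
```

In this toy case the weakest predictor is removed and the other two are retained with smaller absolute coefficients.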

Prediction rule ensemble

Analyses identified 33 rules to differentiate individuals with an ED from those without and 13 prediction rules to differentiate individuals with a diagnosis of AN from those with a diagnosis of BN. Twenty distinct predictors were included in the 33 prediction rules for differentiating ED from non-ED participants, while twelve predictors were included in the 13 prediction rules for differentiating individuals with AN v. BN. Table 4 provides the ten strongest rules for differentiation of ED v. non-ED and AN v. BN [full list presented in online Supplementary Table S2 (ED v. non-ED) and online Supplementary Table S3 (AN v. BN)].

Table 4. Prediction rules for predicting ED status

ED, eating disorder; AN, anorexia nervosa; BN, bulimia nervosa; Imp, variable importance; Coeff, coefficient; s.d., standard deviation.

Variable names: AbuseFath = ‘Abusive relationship from father’, CompGames = ‘Played computer games’, FamAppInf = ‘Family weight/shape concerns influenced eating’, FamDietInf = ‘Joint dieting with family member(s) influenced eating’, FamRelInf = ‘Family Relationships influenced eating’, FoodPrepRes = ‘Food prepared for respondent’, FriendRelInf = ‘Relationships with friend influenced eating’, MealBefore = ‘Had first meal of day before lessons started’, MediaInf = ‘Mass media influenced eating’, NegFath = ‘Negative parenting from father’, NegMoth = ‘Negative parenting from mother’, NummealsRes = ‘Number of times ate in fast-food restaurants’, OwnAppInf = ‘Own appearance influenced eating’, PosMoth = ‘Positive parenting from mother’, SaltSugarRest = ‘Access to salty or sugary snacks was more restricted than friends’, SchoolWork = ‘Schoolwork at school’, SnackFreq = ‘Frequency with which ate fatty/sugary snacks’, StrictRules = ‘Parents had strict rules about food’, UnwantedSex = ‘Unwanted Sex’.

The most important prediction rules for ED v. non-ED cases suggest that individuals were more likely to have an ED if they report: (1) at least some influence from family relationships on eating (scores above 0), had to complete schoolwork at school (scores greater than 0), and reported at least some (>once a week) consumption of fatty/sugary snacks (scores greater than 1).

Conversely, participants were less likely to have an ED, if they exhibited the following protective factors in combination: (2) a low score on own physical appearance concerns influenced eating (scores of 3 or less), never eating at fast-food restaurants, and less family relationship influence on their eating (scores less than or equal to 2); and (3) no reported influence of family relationships on eating (scores of 0) and low reported mass media influenced eating and relationships with friends influenced eating (scores of less than or equal to 1).

The best prediction rules for AN v. BN cases suggest that AN was more likely for individuals who: (1) played computer games (scores greater than 0) and reported at least some negative parenting from father (scores greater than 0.33); and (2) reported usually having first meal of day before lessons started (scores greater than 0) and doing schoolwork at school.

Participants were more likely to present with BN if: (3) they reported that their own appearance concerns influenced their eating (scores greater than 2) and the absence of strict parental rules about food (scores of 0).
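The rules reported above can be read as simple conjunctions of thresholds. The Python sketch below (not the authors' code) writes the first ED v. non-ED rule and the BN rule as predicates, using the variable abbreviations from the Table 4 notes and the cut-points quoted in the text. The participant record and function names are invented for illustration. In a fitted ensemble each rule is assumed to contribute a weighted term to the overall prediction rather than to decide the outcome on its own.

```python
def rule_ed_1(p):
    """Family relationships influenced eating, schoolwork done at school,
    and fatty/sugary snack frequency above the quoted cut-points."""
    return p["FamRelInf"] > 0 and p["SchoolWork"] > 0 and p["SnackFreq"] > 1

def rule_bn_3(p):
    """Own appearance concerns influenced eating and no strict parental rules."""
    return p["OwnAppInf"] > 2 and p["StrictRules"] == 0

# Invented participant record, scored on the same scales as the quoted cut-points.
participant = {
    "FamRelInf": 2,
    "SchoolWork": 1,
    "SnackFreq": 3,
    "OwnAppInf": 1,
    "StrictRules": 0,
}

matches = {
    "ED rule 1": rule_ed_1(participant),
    "BN rule 3": rule_bn_3(participant),
}
```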

The most important predictor for both models was own appearance concerns influenced eating. For differentiating ED v. non-ED cases, family relationships influenced eating, relationship with friends influenced eating, frequency of eating fatty/sugary snacks, and family weight/shape concerns influenced eating were the next most important predictors across all prediction rules. In contrast, for differentiating AN v. BN cases, the next most important predictors were played computer games, parents had strict rules about food, negative parenting from father, and had first meal of day before lessons started (Fig. 1).

Fig. 1. Overall predictor importance for ED v. Non-ED (left) and AN v. BN (right). Notes. For ED v. Non-ED: 1 = OwnAppInf: ‘Own appearance influenced eating’; 2 = FamRelInf: ‘Family Relationships influenced eating’; 3 = FriendRelInf: ‘Relationships with friend influenced eating’; 4 = SnackFreq: ‘Frequency with which ate fatty/sugary snacks’; 5 = FamAppInf: ‘Family weight/shape concerns influenced eating’; 6 = MediaInf: ‘Mass media influenced eating’; 7 = NummealsRes: ‘Number of times ate in fast-food restaurants’; 8 = FamDietInf: ‘Joint dieting with family member(s) influenced eating’; 9 = NegMoth: ‘Negative parenting from mother’; 10 = PosMoth: ‘Positive parenting from mother’; 11 = NegFath: ‘Negative parenting from father’; 12 = Schoolwork: ‘Schoolwork at school’; 13 = BodSatChild: ‘Body Satisfaction as a Child’; 14 = AbuFath: ‘Abusive relationship from father’; 15 = SocialFood: ‘Inclusion in social events/meals’; 16 = UnwantedSex: ‘Unwanted Sex’; 17 = StrictRules: ‘Parents had strict rules about food’; 18 = TeaseAppInf: ‘Teasing about weight/shape by family/friends influenced eating’; 19 = NumFammeals: ‘Number of family members present at most meals’; 20 = TeaseEatInf: ‘Teasing about eating habits by family/friends influenced eating’. For AN v. BN: 1 = OwnAppInf: ‘Own appearance influenced eating’; 2 = CompGames: ‘Played computer games’; 3 = StrictRules: ‘Parents had strict rules about food’; 4 = NegFath: ‘Negative parenting from father’; 5 = MealBefore: ‘Had first meal of day before lessons started’; 6 = Schoolwork: ‘Schoolwork at school’; 7 = SnackFreq: ‘Frequency with which ate fatty/sugary snacks’; 8 = SaltSugarRest: ‘Access to salty or sugary snacks was more restricted than friends’; 9 = FoodVal: ‘Value placed on food’; 10 = MediaInf: ‘Mass media influenced eating’; 11 = FamDietInf: ‘Joint dieting with family member(s) influenced eating’; 12 = FoodPrepRes: ‘Food prepared for family’.

Discussion

We compared the results of the conventional LR analyses to two ML (LASSO and PREs) techniques in predicting ED classification and diagnostic ED subtype based on numerous risk factors. Three main findings emerged. First, we found a higher prediction accuracy for classifying ED v. non-ED cases relative to classifying AN v. BN cases across each approach. Second, the highest accuracy was obtained for the two regression methods (binary LR and LASSO regressions), although the PRE technique relied on fewer predictors with comparable accuracy. Third, a range of individual and familial risk factors emerged across the three statistical approaches, with different risk factors emerging depending on the outcome classification (EDs v. non-EDs and AN v. BN).

Our first main finding indicated that the prediction accuracy was higher for differentiating overall EDs from the healthy controls than AN from BN. This finding is in line with the notion of transdiagnostic risk factors for any ED (e.g. Stice, Marti, & Durant, 2011), and highlights the importance of developing ED prevention programmes targeting all EDs, rather than just one ED subtype. Our finding is also in agreement with Haynos et al. (2020), who found that, for persistence of any ED diagnosis, a baseline BN diagnosis and the absence of a partial baseline binge eating disorder (BED) diagnosis were important predictors in the year 1, but not the year 2, follow-up prediction models. However, it should be noted that the models in Haynos et al.'s (2020) study yielded ‘poor’ predictive performance, which might partially explain these inconsistent findings.

Our second result was that all three statistical approaches performed well when taking into consideration the indices (AUC and F1) and outcomes (ED v. non-ED and AN v. BN). This finding is consistent with a recent systematic review which found no performance benefit of ML over conventional LR (Christodoulou et al., 2019). Our result is also in line with the ED-specific study by Espel-Huynh et al. (2021), which found that all SVM models performed similarly to the conventional LR analyses. The best performing SVM in Espel-Huynh et al.'s (2021) study was the radial-kernel SVM (AUC = 0.94), which was almost identical in performance to the LR (AUC = 0.93). However, the current finding contradicts the findings of Haynos et al. (2020), which suggested clearer improvements in the predictive performance of ML models in comparison to conventional LR. It is worth noting that our study had better prediction accuracy for the overall ED models across all statistical approaches (AUC average = 0.86) than Haynos et al.'s (2020) study (AUC average = 0.72). This may be because our study focused on a more comprehensive range of risk factor predictors for EDs, whereas Haynos et al.'s (2020) models were based on a mixture of demographic, clinical and treatment predictors. The larger sample size in the present study may also have facilitated stronger predictive performance in the current test sample.

Given the similarly adequate predictive performance across all methods in our study, consideration of their feature selection capabilities helps provide context for how each model achieved this level of accuracy. Whereas the conventional LR required all 46 predictors, the LASSO approach required only 26 predictors. Similarly, given that the predictive performance of the PREs was comparable to that of the other two methods, their greatest value lies in: (1) identifying interaction terms not anticipated a priori, and (2) developing a more parsimonious set of predictors through the predictive value of incorporating these interaction terms. In the current study, the PREs identified 20 risk factors for the ED v. non-ED model and 12 risk factors for the AN v. BN model (cf. 46 predictors in the LR model), highlighting the gain in parsimony relative to the decrement in predictive performance.
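The way an L1 (LASSO) penalty removes predictors can be illustrated with the soft-thresholding operator that underlies coordinate-descent LASSO solvers. The sketch below is illustrative only: the predictor names, coefficient values and penalty strength are hypothetical, and a penalised logistic regression applies this kind of shrinkage iteratively rather than in the single step shown here.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalised coefficients for four invented predictors.
unpenalised = {
    "predictor_a": 0.80,
    "predictor_b": -0.35,
    "predictor_c": 0.10,
    "predictor_d": -0.05,
}
lam = 0.20  # hypothetical penalty strength

# Shrink every coefficient towards zero; small ones become exactly zero.
penalised = {name: soft_threshold(b, lam) for name, b in unpenalised.items()}
retained = [name for name, b in penalised.items() if b != 0.0]

print(penalised)
print(retained)
```

Coefficients whose magnitude falls below the penalty are set exactly to zero, which is why a LASSO model can retain fewer predictors than a conventional regression fitted to the same candidate set.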

Interaction terms, as they are included in the PREs, are seldom featured in ED risk models. We articulated a range of interactions in our decision rules, many of which were anticipated given the current ED risk factor literature (Stice & Desjardins, 2018; Stice et al., 2011). However, there was one noteworthy exception: for BN, we found one PRE to be based on own appearance concerns having influenced eating and the absence of strict parental rules about food. In contrast to this finding, previous research has shown strict parenting practices around food to lead to BN (e.g. MacBrayer, Smith, McCarthy, Demos, & Simmons, 2001). Future research is needed to verify these unexpected prediction rules identified in the current study.
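For readers unfamiliar with prediction rules, the following sketch shows how a rule that combines two conditions acts as an interaction term within a rule ensemble. The variable names OwnAppInf, StrictRules and SnackFreq follow the labels used in Fig. 1, but the response coding, cut-offs, rule weights and intercept are all hypothetical and are not the estimates from this study.

```python
import math


def rule_appearance_without_strict_rules(person):
    # Hypothetical coding: OwnAppInf on a 0-4 scale, StrictRules as 0/1.
    return person["OwnAppInf"] >= 3 and person["StrictRules"] == 0


def rule_frequent_snacks(person):
    # Hypothetical cut-off on an invented 0-5 frequency scale.
    return person["SnackFreq"] >= 4


# Each rule is a 0/1 indicator with its own weight (all values invented).
RULES = [
    (rule_appearance_without_strict_rules, 1.5),
    (rule_frequent_snacks, 0.5),
]
INTERCEPT = -1.0


def predicted_probability(person):
    """Logistic model whose terms are rule indicators rather than raw variables."""
    logit = INTERCEPT + sum(weight for rule, weight in RULES if rule(person))
    return 1.0 / (1.0 + math.exp(-logit))


person_a = {"OwnAppInf": 4, "StrictRules": 0, "SnackFreq": 1}  # first rule applies
person_b = {"OwnAppInf": 4, "StrictRules": 1, "SnackFreq": 1}  # no rule applies

print(round(predicted_probability(person_a), 3))
print(round(predicted_probability(person_b), 3))
```

The first rule contributes to the prediction only when both of its conditions hold, so the effect of appearance concerns depends on the level of parental rules. This is the sense in which a rule encodes an interaction without it having to be specified in advance.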

Finally, we found some common ED risk factors for the three statistical approaches. The strongest risk factor for the overall ED models was the extent to which one's own appearance concerns influenced eating. This finding is in accordance with the findings of recent prospective studies (Stice, Gau, Rohde, & Shaw, 2017; Stice et al., 2011), which also revealed body dissatisfaction and existing disordered eating pathology to be the most salient proximal risk factors for EDs. A few other top-ranked risk factors [e.g. family relationships influenced eating (Krug et al., 2009), doing schoolwork at school, eating in fast-food restaurants (e.g. Mitchell et al., 2015) and eating sugary snacks (e.g. Easter et al., 2013)] were also shared amongst all three statistical approaches for the overall ED models. All these factors have commonly been revealed in previous ED risk factor studies, except for schoolwork undertaken at school. It is possible that undertaking schoolwork at school might interfere with the establishment of a nurturing, stable home environment because more time is spent away from parents and siblings, though this variable requires future consideration in large-scale prospective studies.

Implications

The findings of our current study provide important insights into the utility of novel computational ML approaches, such as LASSO and PREs, for advancing research on ED risk factors, with the final aim of developing data-derived personalised prevention and early intervention efforts. Findings of complex interactions among proposed risk factors highlight that the role of a specific risk factor may depend on a range of other contextual influences in one's environment. As such, prediction rules may enable a more nuanced understanding at the individual level of both the risks to an individual and the mechanisms that place this individual at risk.

In the current study, all statistical approaches yielded favourable results for our risk predictions. However, the LASSO and PREs have several desirable features over conventional LR analyses, which make them attractive approaches for future ED risk prediction studies. First, they tend to use fewer variables to achieve comparable accuracy. This can be useful for screening purposes, since researchers would need fewer measures to assess one's risk status. Second, the PREs approach utilises an automated search for useful interactions among a list of variables and may thus identify interactions not considered in previous research. It is therefore described as an exploratory, hypothesis-generating approach (Stice & Desjardins, 2018).

Limitations

The results of this study must be interpreted within the context of some methodological limitations. First, the retrospective, self-report data collection procedures may have limited the validity and the reliability of our findings. It is, for instance, possible that some of the risk factors may have been heavily biased by the person's active ED status at the time of assessment (e.g. appearance concerns were identified as an important predating risk factor for an ED in hindsight simply because they were salient during the illness). Second, all presumed risk factors were collected post ED diagnosis, which does not allow us to translate the findings into prediction before ED onset. All inferences about prediction are therefore primarily based on the prediction accuracy of our statistical models. Third, the current ED sample was assessed using DSM-IV-TR criteria (APA, 2000). Since information on symptom level was not available, we could not convert these diagnoses into DSM-5 (APA, 2013) diagnoses. Fourth, the sample sizes for AN-R and AN-BP were not large enough to allow for separate analyses. Future studies would benefit from comprising larger ED samples to assess prediction accuracy in a range of ED diagnoses [also including BED and otherwise specified feeding and eating disorder (OSFED)]. Fifth, control participants were not explicitly screened for psychiatric diagnosis; therefore, it is not clear that the current model differentiated individuals with EDs from psychiatrically healthy individuals. Sixth, data on race/ethnicity were not collected for this sample; it is therefore unclear to what extent the results generalise across this key demographic. Finally, the current study was not able to assess for biological and/or genetic factors that could have interacted with the current environmental risk factors.
Upcoming research should try to replicate the current findings in a longitudinal at-risk sample, including a range of biological, psychological, and social risk factors, before we can draw strong conclusions from the current findings.

Conclusion

This study applied ML methods to predict key ED outcomes from several risk factors. Our findings revealed that all three statistical methods yielded similar and appropriate statistical prediction accuracy. The risk factors retained in the different prediction models were mainly consistent with the literature and covered a range of previously established individual, familial and social ED risk factors, although new risk factors were also identified. Even so, the overall performance of these models suggests a need for consideration of additional risk factors if we are to achieve strong predictive accuracy that limits false positives and ensures treatment is prioritised to those in greatest need. Present findings will ideally help generate more complex aetiological models that highlight distinct risk and protective pathways.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S003329172100489X.

Financial support

Financial support was received from the European Union (Framework – V Multicenter Research Grant, QCK1-1999-916). CIBERobn is an initiative of ISCIII. J.L. holds a National Health and Medical Research Council Investigator Grant (APP1196948).

Conflicts of interest

None.

Ethical standards

The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.

The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional guides on the care and use of laboratory animals.

References

Ali, A., Ali, S., Khan, S. A., Khan, D. M., Abbas, K., Khalil, A., … Khalil, U. (2019). Sample size issues in multilevel logistic regression models. PLoS One, 14(11), e0225427. doi: 10.1371/journal.pone.0225427
American Psychiatric Association (2000). Diagnostic and statistical manual of mental disorders (4th ed., Text Revision). Washington, DC: American Psychiatric Association.
American Psychiatric Association (2013). Diagnostic and statistical manual of mental disorders (DSM-5®). American Psychiatric Publication.
Bardone-Cone, A. M., Abramson, L. Y., Vohs, K. D., Heatherton, T. F., & Joiner, T. E. (2006). Predicting bulimic symptoms: An interactive model of self-efficacy, perfectionism, and perceived weight status. Behavior Research and Therapy, 44(1), 27–42. doi: 10.1016/j.brat.2004.09.009
Bzdok, D., & Meyer-Lindenberg, A. (2018). Machine learning for precision psychiatry: Opportunities and challenges. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging, 3(3), 223–230. doi: 10.1016/j.bpsc.2017.11.007
Chekroud, A. M., Zotti, R. J., Shehzad, Z., Gueorguieva, R., Johnson, M. K., Trivedi, M. H., … Corlett, P. R. (2016). Cross-trial prediction of treatment outcome in depression: A machine learning approach. The Lancet Psychiatry, 3(3), 243–250. doi: 10.1016/S2215-0366(15)00471-X
Christodoulou, E., Ma, J., Collins, G. S., Steyerberg, E. W., Verbakel, J. Y., & Van Calster, B. (2019). A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. Journal of Clinical Epidemiology, 110, 12–22. doi: 10.1016/j.jclinepi.2019.02.004
Dwyer, D. B., Falkai, P., & Koutsouleris, N. (2018). Machine learning approaches for clinical psychology and psychiatry. Annual Review of Clinical Psychology, 14, 91–118. doi: 10.1146/annurev-clinpsy-032816-045037
Easter, A., Naumann, U., Northstone, K., Schmidt, U., Treasure, J., & Micali, N. (2013). A longitudinal investigation of nutrition and dietary patterns in children of mothers with eating disorders. The Journal of Pediatrics, 163(1), 173–178.e171. doi: 10.1016/j.jpeds.2012.11.092
Espel-Huynh, H., Zhang, F., Thomas, J. G., Boswell, J. F., Thompson-Brenner, H., Juarascio, A. S., & Lowe, M. R. (2021). Prediction of eating disorder treatment response trajectories via machine learning does not improve performance versus a simpler regression approach. International Journal of Eating Disorders, 54(7), 1250–1259. doi: 10.1002/eat.23510
Fokkema, M., & Strobl, C. (2020). Fitting prediction rule ensembles to psychological research data: An introduction and tutorial. Psychological Methods, 25(5), 636–652. doi: 10.1037/met0000256
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1–22. doi: 10.18637/jss.v033.i01
Gelman, A. (2008). Scaling regression inputs by dividing by two standard deviations. Statistics in Medicine, 27(15), 2865–2873. doi: 10.1002/sim.3107
Goldberg, D. P. (1981). Manual of the General Health Questionnaire (GHQ-28). Swindon, Wiltshire, UK: NFER Nelson Publishing.
Haynos, A. F., Wang, S. B., Lipson, S., Peterson, C. B., Mitchell, J. E., Halmi, K. A., … Crow, S. J. (2020). Machine learning enhances prediction of illness course: A longitudinal study in eating disorders. Psychological Medicine, 51(8), 1392–1402. doi: 10.1017/S0033291720000227
Jacobi, C., Hayward, C., de Zwaan, M., Kraemer, H. C., & Agras, W. S. (2004). Coming to terms with risk factors for eating disorders: Application of risk terminology and suggestions for a general taxonomy. Psychological Bulletin, 130(1), 19–65. doi: 10.1037/0033-2909.130.1.19
Klump, K. L., Bulik, C. M., Kaye, W. H., Treasure, J., & Tyson, E. (2009). Academy for eating disorders position paper: Eating disorders are serious mental illnesses. International Journal of Eating Disorders, 42(2), 97–103. doi: 10.1002/eat.20589
Krug, I., Penelo, E., Fernandez-Aranda, F., Anderluh, M., Bellodi, L., Cellini, E., … Treasure, J. (2013). Low social interactions in eating disorder patients in childhood and adulthood: A multi-centre European case-control study. Journal of Health Psychology, 18(1), 26–37. doi: 10.1177/1359105311435946
Krug, I., Treasure, J., Anderluh, M., Bellodi, L., Cellini, E., Collier, D., … Fernández-Aranda, F. (2009). Associations of individual and family eating patterns during childhood and early adolescence: A multicentre European study of associated eating disorder factors. British Journal of Nutrition, 101(6), 909–918. doi: 10.1017/s0007114508047752
MacBrayer, E. K., Smith, G. T., McCarthy, D. M., Demos, S., & Simmons, J. (2001). The role of family of origin food-related experiences in bulimic symptomatology. International Journal of Eating Disorders, 30(2), 149–160. doi: 10.1002/eat.1067
Mitchell, J. E., King, W. C., Courcoulas, A., Dakin, G., Elder, K., Engel, S., … Wolfe, B. (2015). Eating behavior and eating disorders in adults before bariatric surgery. International Journal of Eating Disorders, 48(2), 215–222. doi: 10.1002/eat.22275
Penelo, E., Granero, R., Krug, I., Treasure, J., Karwautz, A., Anderluh, M., … Fernández-Aranda, F. (2011). Factors of risk and maintenance for eating disorders: Psychometric exploration of the cross-cultural questionnaire (CCQ) across five European countries. Clinical Psychology & Psychotherapy, 18(6), 535–552. doi: 10.1002/cpp.728
Ranganathan, P., Pramesh, C. S., & Aggarwal, R. (2017). Common pitfalls in statistical analysis: Logistic regression. Perspectives in Clinical Research, 8(3), 148–151. doi: 10.4103/picr.PICR_87_17
Stice, E. (2002). Risk and maintenance factors for eating pathology: A meta-analytic review. Psychological Bulletin, 128(5), 825–848. doi: 10.1037/0033-2909.128.5.825
Stice, E. (2016). Interactive and mediational etiologic models of eating disorder onset: Evidence from prospective studies. Annual Review of Clinical Psychology, 12, 359–381. doi: 10.1146/annurev-clinpsy-021815-093317
Stice, E., & Desjardins, C. D. (2018). Interactions between risk factors in the prediction of onset of eating disorders: Exploratory hypothesis-generating analyses. Behavior Research and Therapy, 105, 52–62. doi: 10.1016/j.brat.2018.03.005
Stice, E., Gau, J. M., Rohde, P., & Shaw, H. (2017). Risk factors that predict future onset of each DSM–5 eating disorder: Predictive specificity in high-risk adolescent females. Journal of Abnormal Psychology, 126(1), 38. doi: 10.1037/abn0000219
Stice, E., Marti, C. N., & Durant, S. (2011). Risk factors for onset of eating disorders: Evidence of multiple risk pathways from an 8-year prospective study. Behavior Research and Therapy, 49(10), 622–627. doi: 10.1016/j.brat.2011.06.009
Stice, E., Marti, C. N., Shaw, H., & Rohde, P. (2019). Meta-analytic review of dissonance-based eating disorder prevention programs: Intervention, participant, and facilitator features that predict larger effects. Clinical Psychology Review, 70, 91–107. doi: 10.1016/j.cpr.2019.04.004
Striegel-Moore, R. H., & Bulik, C. M. (2007). Risk factors for eating disorders. The American Psychologist, 62(3), 181–98. doi: 10.1037/0003-066X.62.3.181
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288. doi: 10.1111/j.2517-6161.1996.tb02080.x
Van Buuren, S., & Groothuis-Oudshoorn, K. (2011). Multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3), 1–67. doi: 10.18637/jss.v045.i03

Table 1. Sample demographics

Table 2. Predictive performance of the different statistical models

Table 3. Parameter estimates for conventional logistic regression and penalised (LASSO) regression for ED v. Non-ED and AN v. BN

Table 4. Prediction rules for predicting ED status

Fig. 1. Overall predictor importance for ED v. Non-ED (left) and AN v. BN (right). Notes. For ED v. Non-ED: 1 = OwnAppInf: ‘Own appearance influenced eating’; 2 = FamRelInf: ‘Family Relationships influenced eating’; 3 = FriendRelInf: ‘Relationships with friend influenced eating’; 4 = SnackFreq: ‘Frequency with which ate fatty/sugary snacks’; 5 = FamAppInf: ‘Family weight/shape concerns influenced eating’; 6 = MediaInf: ‘Mass media influenced eating’; 7 = NummealsRes: ‘Number of times ate in fast-food restaurants’; 8 = FamDietInf: ‘Joint dieting with family member(s) influenced eating’; 9 = NegMoth: ‘Negative parenting from mother’; 10 = PosMoth: ‘Positive parenting from mother’; 11 = NegFath: ‘Negative parenting from father’; 12 = Schoolwork: ‘Schoolwork at school’; 13 = BodSatChild: ‘Body Satisfaction as a Child’; 14 = AbuFath: ‘Abusive relationship from father’; 15 = SocialFood: ‘Inclusion in social events/meals’; 16 = UnwantedSex: ‘Unwanted Sex’; 17 = StrictRules: ‘Parents had strict rules about food’; 18 = TeaseAppInf: ‘Teasing about weight/shape by family/friends influenced eating’; 19 = NumFammeals: ‘Number of family members present at most meals’; 20 = TeaseEatInf: ‘Teasing about eating habits by family/friends influenced eating’. For AN v. BN: 1 = OwnAppInf: ‘Own appearance influenced eating’; 2 = CompGames: ‘Played computer games’; 3 = StrictRules: ‘Parents had strict rules about food’; 4 = NegFath: ‘Negative parenting from father’; 5 = MealBefore: ‘Had first meal of day before lessons started’; 6 = Schoolwork: ‘Schoolwork at school’; 7 = SnackFreq: ‘Frequency with which ate fatty/sugary snacks’; 8 = SaltSugarRest: ‘Access to salty or sugary snacks was more restricted than friends’; 9 = FoodVal: ‘Value placed on food’; 10 = MediaInf: ‘Mass media influenced eating’; 11 = FamDietInf: ‘Joint dieting with family member(s) influenced eating’; 12 = FoodPrepRes: ‘Food prepared for family’.
