Introduction
The suicide rate among transitioning U.S. service members (TSMs) in the year following military separation is 2.5 times the rate among age-sex matched personnel who remain in service (Shen, Cunha, & Williams, 2016). A 2018 Presidential Executive Order sought to address this problem by calling for increased programming and care coordination between the Departments of Defense (DoD) and Veterans Affairs (VA) to support TSMs (Executive Office of the President, 2018; U.S. Secretary of DoD, U.S. Secretary of VA, & U.S. Secretary of Homeland Security, 2018; 116th Congress [2019–2020], 2020). The U.S. Army responded by expanding its Soldier for Life (SFL) program to focus on transitional assistance needs of TSMs. The VA responded by establishing the Solid Start program, which makes three contacts with each TSM in the 12 months after leaving active service to provide guidance on accessing VA benefits (U.S. VA, 2020). However, SFL and Solid Start are both universal programs that do not target special efforts or services to TSMs estimated to be at high risk of suicide-related behaviors.
In an earlier report (Stanley et al., 2022), we described a machine learning model based on information available at the time the soldier left active service. The model was developed to predict suicide attempts (SAs) after leaving service and help target high-risk TSMs for preventive interventions. The analysis was based on data from the Study to Assess Risk and Resilience in Servicemembers-Longitudinal Study (STARRS-LS), an ongoing cohort study of Army soldiers surveyed both during and after leaving active service. We found that a model could be developed that predicted SAs with good accuracy using a combination of predictors taken from surveys administered while still in service, information from military records, and geocoded data about the areas soldiers planned to move to after leaving service. Top predictors were in the domains of lifetime self-injurious thoughts and behaviors, traumatic events, and socio-demographics, with fewer predictors in the domains of Army career and mental and physical disorders.
Based on the good performance of this prediction model, a VA and Army initiative was established to implement a suicide-prevention intervention for high-risk TSMs in conjunction with SFL. The STARRS model was to be used for targeting. However, to make this practical, a simplified version of the model was needed that used only a short series of self-report survey questions, as these questions would need to be included in a brief needs assessment survey of TSMs implemented at enrollment to SFL. The original STARRS model used much more extensive survey questions along with administrative variables that are not available for real-time use in SFL risk targeting. The original model also included small area geocode data not available for use in SFL risk targeting because the SFL survey is administered 6–12 months before leaving active service, at which time a substantial proportion of TSMs are unsure about their post-transition residential plans. A critical question consequently arose early in intervention planning whether it would be possible to develop a useful high-risk targeting model based only on responses to a short series of self-report survey questions. The current report presents the results of a reanalysis of the STARRS data to answer that question.
Materials and methods
Sample
Baseline surveys
As detailed in Stanley et al. (2022), Army STARRS included three separate baseline surveys, all using group in-person self-administration (Fig. 1): (1) the 2011–2012 New Soldier Study (n = 50 765 new soldiers surveyed during the first few days of service before beginning Basic Combat Training); (2) the 2011–2013 All Army Study (n = 39 666 soldiers surveyed in representative duty units throughout the world); and (3) the 2012–2014 Pre-Post Deployment study (n = 9415 soldiers surveyed shortly before deployment to Afghanistan). Field procedures are described extensively elsewhere (Heeringa et al., 2013; Kessler et al., 2013a, 2013b; Ursano et al., 2014). The Human Subjects Committees of the University of Michigan, the Uniformed Services University of the Health Sciences, and the Army Medical Research and Materiel Command approved all recruitment, consent, and field procedures. The n = 72 387 respondents in these surveys who gave written informed consent to link their deidentified survey data with Army administrative data were the focus of subsequent study. We applied calibration weights to these cases to adjust for differences in survey responses between respondents who did v. did not agree to administrative linkage and for differences between the population and the sample on a wide range of administrative variables. A probability subsample of these weighted survey respondents was then selected to participate in the STARRS Longitudinal Surveys (LS), a series of mixed-mode mail-phone surveys carried out in successive waves beginning in 2016–2017 (LS1; n = 14 508). We attempted to resurvey all LS1 respondents in 2018–2019 (LS2; n = 12 156).
The focus of the current report is on the LS1 and LS2 respondents who were in the Regular Army at baseline, in service more than 6 months before leaving active service, and out of active service for at least 12 months at the time of one or both of their LS surveys. This is a smaller sample than in Stanley et al. (2022), where we did not require being in service more than 6 months before leaving. This new requirement was imposed because SFL is offered only to soldiers who are in service more than 6 months before leaving active service. We included both soldiers who separated completely from the Army and those who transitioned to the Reserve or National Guard. Data were pooled over the LS1 and LS2 surveys. The n = 11 LS2 respondents who had already left active service as of LS1 and reported a SA in the 12 months before their LS1 survey were excluded from the LS2 analysis to avoid double-counting multiple SAs for any single TSM. This means that, by construction, none of the n = 70 TSMs who reported a SA in the 12 months before LS1 were included with the n = 3110 respondents in the analysis who were in both LS1 and LS2. The full analysis sample included n = 8335 observations, composed of n = 3935 at LS1 (n = 3405 separated from active service and n = 530 no longer activated) and n = 4400 at LS2 (n = 3785 separated and n = 615 no longer activated). More detailed information about recruitment is presented in online Supplementary Figs S1 and S2.
Measures
Self-reported suicide attempts
SA was assessed with questions adapted from the Columbia-Suicide Severity Rating Scale (Posner et al., 2011) that asked: Did you ever make a suicide attempt (i.e. purposefully hurt yourself with at least some intention to die) at any time since your last survey? Respondents who said yes were asked about the number and recency (age) of such SAs. When reported recency age was within one year of age at interview, we asked if the most recent SA was in the past 12 months v. more than 12 months ago. We focus here on SAs that occurred within 12 months of the survey. SAs recorded in electronic health records (EHRs) were not included because we had no access to EHRs for LS respondents no longer in active service. Previous studies found that self-reports capture about two-thirds of the SAs detected either by self-reports or medical records (Chu et al., 2020; Lee et al., 2018).
Predictors
Our earlier report (Stanley et al., 2022) detailed the nine categories of predictors considered in our initial analysis, which were based on previous research (Franklin et al., 2017; Holliday et al., 2020; Klonsky, May, & Saffer, 2016; Nock et al., 2013): socio-demographics, Army career variables, mental disorders, self-injurious thoughts and behaviors, physical health problems, chronic stressors, adverse childhood experiences, other lifetime traumatic events, and personality characteristics. These measures were obtained from baseline Army STARRS surveys, Army/DoD administrative data systems, and geospatial databases in public records about Census Block Groups and Counties where respondents resided. In developing this model, we assumed that any future use of the model would be based on new surveys administered to all TSMs shortly before leaving active service in addition to Army administrative data systems and information provided by TSMs about the civilian addresses to which they planned to move after leaving active service. These assumptions turned out to be incorrect with respect to SFL, though: we do not have access to Army administrative systems at the time of the SFL survey, and many TSMs report during the survey that they are not yet sure where they will relocate after leaving active duty.
Based on these considerations, we revised the predictor set in two ways for purposes of developing a new model for use in the SFL intervention. First, we focused only on the Army STARRS survey measures, some of which were scales and others individual items, considered in developing the initial model (n = 137 measures) along with a subset of 83 administrative variables that we felt could be assessed quickly and accurately with self-report equivalents. We estimated a preliminary machine learning model using those 220 variables. Second, as many of the predictors in the initial model involved long multi-item scales, we disaggregated these scales to the item level to facilitate creation of a short question series. We then replicated the machine learning analysis with the more extensive set of disaggregated items from all scales that entered the initial model in addition to all single-item predictors. Prior research has shown that it is sometimes possible in this way to recover most of the predictive power of an initial model that used scales as predictors while substantially reducing the number of questions needed in the survey used as input to the model (e.g. Nock et al., 2022).
Analysis methods
Analysis was carried out November–December 2022. As reviewed elsewhere (Kessler et al., 2019), most machine learning studies that predict suicide-related behaviors either use a single algorithm or try several different algorithms and choose the one with the best prediction accuracy. We instead used the Super Learner (SL) stacked generalization method, which pools across multiple algorithms using weights generated via cross-validation, an approach guaranteed to perform at least as well in expectation as the best component algorithm (Polley, LeDell, Kennedy, Lendle, & van der Laan, 2018). We used a diverse set of algorithms in the ensemble to capture nonlinearities and interactions and reduce misspecification (online Supplementary Table S2) (Kennedy, 2017). As discussed in more detail in the Methodology Appendix, this use of a diverse ensemble also addressed the issue of fairness that has been raised in many recent machine learning studies (Chen et al., 2022). The model was trained in a random 70% training sample and validated in the remaining 30% test sample.
Given the need for a short questionnaire, we evaluated the implications of restricting the number of predictors to only the top 10, 20, 30, 40, and 50 most important. Predictor importance was defined for this purpose using the model-agnostic kernel Shapley Additive exPlanations (SHAP) method, which estimates the marginal contribution of each variable in a predictor set to model predictions (Lundberg & Lee, 2017). In addition, we estimated a simple lasso penalized regression model as a benchmark (Tibshirani, 1996). The best model was defined as the one with the highest area under the receiver operating characteristic curve (ROC-AUC) in the test sample. Once that model was selected, we divided the test sample into deciles of predicted risk and calculated both conditional and cumulative sensitivity (SN; the proportion of self-reported SAs within and across deciles of predicted risk) and positive predictive value (PPV; prevalence of self-reported SAs within and across deciles of predicted risk).
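The stacking idea can be illustrated with a deliberately small sketch. This is not the SL specification used in the analysis (which pooled many algorithms in R); it assumes only two toy learners, simulated data, and a grid search over a single convex weight, and every function name in it is hypothetical.

```python
import random

def fit_base_rate(X, y):
    """Toy learner 0: ignore predictors and predict the training base rate."""
    rate = sum(y) / len(y)
    return lambda x: rate

def fit_one_item(X, y):
    """Toy learner 1: predict the outcome rate within each level of a binary item."""
    overall = sum(y) / len(y)
    rates = {}
    for level in (0, 1):
        ys = [yi for xi, yi in zip(X, y) if xi[0] == level]
        rates[level] = sum(ys) / len(ys) if ys else overall
    return lambda x: rates[x[0]]

def cv_predictions(learners, X, y, k=5):
    """Out-of-fold predictions, one column per learner."""
    n = len(y)
    folds = [list(range(i, n, k)) for i in range(k)]
    Z = [[0.0] * len(learners) for _ in range(n)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        for j, fit in enumerate(learners):
            model = fit(X_train, y_train)
            for i in held_out:
                Z[i][j] = model(X[i])
    return Z

def cv_loss(Z, y, j):
    """Cross-validated squared error of learner j on its own."""
    return sum((z[j] - yi) ** 2 for z, yi in zip(Z, y)) / len(y)

def stack_weight(Z, y, steps=100):
    """Grid-search the weight w on learner 0 (1 - w on learner 1) that minimizes CV squared error."""
    best_w, best_loss = 0.0, float("inf")
    for s in range(steps + 1):
        w = s / steps
        loss = sum((w * z[0] + (1 - w) * z[1] - yi) ** 2 for z, yi in zip(Z, y)) / len(y)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss

# Simulated data for illustration only: one binary item, outcome more common when item == 1.
random.seed(0)
X = [[random.randint(0, 1)] for _ in range(400)]
y = [1 if random.random() < (0.30 if x[0] else 0.05) else 0 for x in X]

Z = cv_predictions([fit_base_rate, fit_one_item], X, y)
w, loss = stack_weight(Z, y)
print("weight on base-rate learner:", w, "ensemble CV loss:", loss)
```

Because the grid includes w = 0 and w = 1, the ensemble's cross-validated loss in this toy setting can never exceed that of either learner alone. That is the finite-sample analogue of the property described above, not a proof of it.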
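The decile-based sensitivity and PPV calculations can be sketched as follows. This is an unweighted illustration on made-up data; the analysis itself used weighted data, and the function name `decile_table` is hypothetical.

```python
def decile_table(risk, outcome):
    """Rows of conditional and cumulative sensitivity and PPV; decile 1 = highest predicted risk.

    Assumes at least 10 observations and at least one event.
    """
    order = sorted(range(len(risk)), key=lambda i: risk[i], reverse=True)
    n = len(order)
    total_events = sum(outcome)
    rows, cum_n, cum_events = [], 0, 0
    for d in range(10):
        idx = order[d * n // 10:(d + 1) * n // 10]
        events = sum(outcome[i] for i in idx)
        cum_n += len(idx)
        cum_events += events
        rows.append({
            "decile": d + 1,
            "sensitivity": events / total_events,        # share of all events in this decile
            "ppv": events / len(idx),                     # event prevalence in this decile
            "cum_sensitivity": cum_events / total_events, # share of events in deciles 1..d
            "cum_ppv": cum_events / cum_n,                # prevalence pooled over deciles 1..d
        })
    return rows

# Made-up example: 20 respondents, 4 events, risk scores increasing with index.
risk = [i / 20 for i in range(20)]
outcome = [0] * 20
for i in (19, 18, 15, 3):
    outcome[i] = 1

rows = decile_table(risk, outcome)
for row in rows[:3]:
    print(row)
```

In this toy example the top decile holds two of the four events (sensitivity 0.5, PPV 1.0), and the top three deciles together hold three of four (cumulative sensitivity 0.75, cumulative PPV 0.5).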
Data management and calculation of prevalence and ROC-AUCs were carried out in SAS version 9.4 (SAS Institute, 2013). The SL models and SHAP values were estimated in R version 3.6.3 (R Core Team, 2021). The R packages used for each algorithm are listed in online Supplementary Table S2.
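For readers unfamiliar with the ROC-AUC, it has a simple pairwise interpretation: the probability that a randomly chosen respondent with the outcome has a higher predicted risk than a randomly chosen respondent without it. The Python sketch below computes it that way on made-up data. It is unweighted and is not the SAS procedure used in the analysis; the function name `roc_auc` is hypothetical.

```python
def roc_auc(risk, outcome):
    """Share of (case, non-case) pairs in which the case has the higher score; ties count one half."""
    cases = [r for r, o in zip(risk, outcome) if o == 1]
    controls = [r for r, o in zip(risk, outcome) if o == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Made-up scores: two cases (0.9, 0.7) and four non-cases.
risk = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
outcome = [1, 0, 1, 0, 0, 0]
auc = roc_auc(risk, outcome)
print(auc)  # 7 of 8 pairs are correctly ordered -> 0.875
```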
Results
Sample composition
As noted above, the LS surveys were administered to individuals who participated in any of the three baseline Army STARRS surveys. These surveys were combined with mean unit weights of 1.0 within each survey for the purpose of the analysis. Dummy variables for which initial sample the respondent came from were included in the predictor set to determine if model performance varied depending on initial sample. Comparisons of weighted sample distributions with population distributions found generally good consistency (online Supplementary Tables S3 and S4). Median respondent age at time of leaving active service was 26 (Table 1). Most LS respondents were male (84.9%), Non-Hispanic White (67.4%), and heterosexual (93.8%). Most had a high school education (69.7%) and were either married (59.0%) or never married (35.6%) at the time of leaving active service. Most had not deployed to a combat theater (39.5%) or had only one such deployment (35.2%), were of junior enlisted rank (60.3%), and were separated (87.8%; i.e. terminated their relationship with the Army) rather than released from active service (12.2%; i.e. continued in service as a member of the Reserve or National Guard but no longer activated).
Abbreviations: s.e., standard error; GED, General Educational Development.
a Wald χ2 test comparing the distribution of demographic variables between the two groups: attempt and no attempt.
Note: Model estimates reflect weighted data.
SA prevalence
Overall, n = 110 respondents reported a SA in the past 12 months, representing a prevalence (s.e.) of 1.0% (0.1), with higher prevalence in LS1 [1.3% (0.2)] than LS2 [0.7% (0.2)], and comparable prevalence among respondents still in the Reserve or National Guard [1.1% (0.4)] v. those separated from active service [1.0% (0.1)]. Based on the one-in-ten rule of thumb, we would expect that this sample size would support the development of a prediction model with roughly 11 predictors (i.e. 110/10), which would be adequate for our goal of developing a risk calculator based on a brief set of survey questions. However, simulations show that it is often possible to detect stable associations of a larger number of predictors depending on the data structure (van Smeden et al., 2016). This led us to cast a wider net in evaluating the incremental benefits of adding predictors in our analysis.
Model results
Model estimation
As noted above, we began by estimating a SL model using all 220 predictors, which comprised scale-level baseline survey predictors and administrative measures that we felt we could easily convert into survey measures. Six survey scales, 11 survey items, and 7 administrative variables had nonzero SHAP values in that initial model. The scales were then disaggregated into items. We also expanded the set of traumatic events (both those experienced during deployments and others) from those that entered the initial model to all those in the baseline surveys, resulting in a total of 64 variables (57 survey variables plus the 7 administrative variables) used in an item-level model (online Supplementary Table S6).
Overall model fit
As detailed in the Methodology Appendix, nine algorithms had nonzero SL importance weights in the item-level SL model: a penalized logistic regression, four different random forest specifications, three different extreme gradient boosting specifications, and one support vector machine specification (online Supplementary Table S5). The ROC-AUC (s.e.) of that model in the test sample was 0.81 (0.03) (Fig. 2). This is higher than the test sample ROC-AUC of the initial SL model using all 220 scale-level predictors, indicating that the initial model over-fitted the data.
We then examined the implications of limiting the model to include only the 10, 20, 30, 40, or 50 most important of the 64 predictors as a way of reducing the question series to a manageable number for use in the SFL survey. The highest SL test sample ROC-AUC (s.e.), 0.86 (0.03), was found when we limited the item-level predictor set to only 20 predictors (Fig. 2). However, a very similar ROC-AUC, 0.85 (0.03), was obtained when the lasso model was estimated from the 64 predictors. Only 17 predictors were selected by this lasso model. Based on the much easier scoring in lasso than SL, along with the equally strong performance of the lasso model in the part of the risk distribution important for defining high SA risk (Fig. 2), we selected the lasso model over SL for implementation.
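The "much easier scoring" of a lasso model comes down to a weighted sum of item responses. The sketch below shows the form such a calculator could take. It assumes a logistic link, and the item names, intercept, and coefficients are all hypothetical placeholders chosen for illustration; none of them are the published model's values.

```python
import math

# Hypothetical values for illustration only; NOT the coefficients of the published model.
INTERCEPT = -5.0
COEFS = {
    "lifetime_suicide_plan": 0.9,  # hypothetical positive weight
    "recent_ideation": 0.7,        # hypothetical positive weight
    "age_34_plus": -0.4,           # hypothetical negative weight
}

def predicted_risk(answers):
    """Map 0/1 item responses to a predicted probability via an assumed logistic link."""
    linear = INTERCEPT + sum(COEFS[item] * answers.get(item, 0) for item in COEFS)
    return 1.0 / (1.0 + math.exp(-linear))

baseline = predicted_risk({})
with_plan = predicted_risk({"lifetime_suicide_plan": 1})
older = predicted_risk({"age_34_plus": 1})
print(baseline, with_plan, older)
```

A scoring rule of this form needs only the selected items and a short coefficient table, whereas scoring an SL ensemble requires running every component algorithm.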
Inspection of the lasso predicted risk deciles in the test sample showed that 44.9% of all SAs after leaving active service occurred among the 10% of TSMs estimated to be at highest risk, 67.8% of SAs among the 20% of TSMs estimated to be at highest risk, and 92.5% of SAs among the 30% of TSMs estimated to be at highest risk (Table 2). SA prevalence was 4.3% in the top risk decile, 3.3 and 3.0%, respectively, pooled across the top two and three risk deciles, and 0.1% in the remainder of the sample.
Abbreviations: s.e., standard error.
a The n = 2509 respondents in the test sample represent roughly 30% of the n = 8335 in the total sample, including n = 34 of the n = 110 total-sample respondents who reported attempting suicide in the 12 months before their STARRS-LS survey. The remaining 70% of the total sample were in the training sample.
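The cumulative sensitivity and prevalence figures reported above are linked by Bayes' rule: PPV among the flagged group equals cumulative sensitivity times overall prevalence, divided by the fraction of the sample flagged. The sketch below applies that identity under the assumption that test-sample prevalence equals the overall weighted prevalence of 1.0%; the function name is hypothetical.

```python
def implied_ppv(cum_sensitivity, prevalence, fraction_flagged):
    """P(SA | flagged) = P(flagged | SA) * P(SA) / P(flagged)."""
    return cum_sensitivity * prevalence / fraction_flagged

PREVALENCE = 0.010  # assumption: overall weighted prevalence applies to the test sample

# (fraction of TSMs flagged, reported cumulative sensitivity)
reported = [(0.1, 0.449), (0.2, 0.678), (0.3, 0.925)]
implied = {frac: implied_ppv(sn, PREVALENCE, frac) for frac, sn in reported}

# Prevalence among the 70% of TSMs outside the top three deciles.
remainder = (1 - 0.925) * PREVALENCE / (1 - 0.3)

for frac, ppv in implied.items():
    print(f"top {frac:.0%}: implied PPV {ppv:.2%}")
print(f"remainder: {remainder:.2%}")
```

Under that assumption the implied values (about 4.5%, 3.4%, 3.1%, and 0.1%) are close to the reported 4.3%, 3.3%, 3.0%, and 0.1%. The small gaps would be expected if test-sample prevalence were slightly below 1.0%.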
Predictor importance
Five of the 17 lasso predictors, including 4 of the 10 with highest RRs, assessed self-injurious thoughts and behaviors, all associated with increased SA risk: lifetime passive and active ideation, plans, and attempts; and ideation (either passive or active) in the 2 years prior to leaving active service (Table 3). Another 5 were indicators of externalizing disorders, again all of which were associated with increased SA risk, including 3 indicators of childhood conduct problems (school truancy, running away from home, bullying), one of interpersonal problems linked to substance use, and another of physically assaulting others. Four of the remaining 7 predictors were associated with increased SA risk: having 1+ child ages 6–13; having an honorable discharge or being discharged under honorable conditions; identifying as gay, lesbian, or bisexual; and being the victim of any crime in the 4 years before leaving active service. The other 3 predictors were associated with decreased SA risk: being 34+ years of age at time of ending active service; having 2+ Global War on Terror deployments; and having any life-threatening accident or other experience that put the respondent at risk of death or serious injury other than physical or sexual assault, illness or injury, or a natural disaster. It is noteworthy that the confidence intervals (CIs) for many of these predictors included 1.0. This means that these predictors would not be considered statistically significant using conventional criteria, but these CIs should be used only heuristically, as they are not exact when predictor selection is done using lasso. The lasso selected these predictors because they best represent the joint associations of all survey predictors with the outcome.
Abbreviations: RR, relative-risk; CI, confidence interval.
a The coefficients and CI were estimated in multivariable and univariable Poisson regression models with a stable regularization method used to estimate standard errors. However, as a prior lasso model was used to select the predictors included in the models, the confidence intervals should be used only heuristically, as they are not exact when predictor selection is done using lasso. It is noteworthy that some predictors would not be considered statistically significant using conventional criteria but were selected by lasso because they best represent joint effects of all survey predictors. Each predictor was standardized to have a mean of 0 and variance of 1 prior to estimation, resulting in the RR estimates describing the proportional differences in risk of the outcome associated with 1 s.d. changes in each predictor.
b See online Supplementary Table S1 for a description of the predictor variables. All variables were defined as of the time period prior to the respondent leaving or being released from active service.
c Other than physical or sexual assault, illness or injury, or a natural disaster.
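Because the RRs in footnote a are expressed per 1 s.d. of each standardized predictor, they are not directly comparable to the familiar "exposed v. unexposed" contrast for a yes/no item. In a log-linear model, a binary predictor with prevalence p has s.d. sqrt(p(1 − p)), so the RR for moving from 0 to 1 is the per-s.d. RR raised to the power 1/s.d. The sketch below applies that conversion to hypothetical inputs; neither value is taken from Table 3.

```python
import math

def rr_per_unit(rr_per_sd, prevalence):
    """Convert an RR per 1 s.d. of a binary predictor into the RR for a 0 -> 1 change.

    Assumes a log-linear (e.g. Poisson) model and a 0/1 predictor with the given prevalence.
    """
    sd = math.sqrt(prevalence * (1.0 - prevalence))
    return rr_per_sd ** (1.0 / sd)

# Hypothetical inputs for illustration: per-s.d. RR of 1.3 for an item endorsed by 10%.
example = rr_per_unit(1.3, 0.10)
print(round(example, 2))
```

The rarer the item, the smaller its s.d. and the larger the 0-to-1 RR implied by a given per-s.d. RR. This is one reason why standardized coefficients for rare and common items should be compared with care.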
Discussion
Most of the support provided by DoD and VA for TSMs is universal; that is, the same services are offered to all TSMs. Efforts to target enhanced transitional assistance resources to the TSMs with greatest need have been limited by lack of information about predictors of differential need (Bullman, Hoffmire, Schneiderman, & Bossarte, 2015; Ravindran, Morley, Stephens, Stanley, & Reger, 2020; Reger et al., 2015; Shen et al., 2016). We found that a parsimonious model can be developed using self-report survey data obtained before TSMs leave active service to predict self-reported SAs after leaving service. About 45% of the reported SAs occurred among the 10% of TSMs in the top risk decile, more than two-thirds among the 20% of TSMs in the top two risk deciles, and more than 90% among the 30% of TSMs in the top three risk deciles. These results are likely conservative, as the baseline survey data were collected up to six years before the TSM left active service and the analyses in developing our initial model found that prediction accuracy was inversely related to time between the baseline survey and time of leaving service (Stanley et al., 2022). This means the model would be expected to perform better when based on the SFL survey, which is administered 6–12 months before TSMs leave active service.
It is noteworthy that none of the dummy variables for initial Army STARRS survey membership entered our final model, indicating that model results are stable across the three baseline samples that we combined in developing the model. It should also be noted that the predictors identified as important by our model should not be interpreted as causal, but rather as the best marker items representing the joint associations of all the individually significant survey predictors in the full predictor set with SA (Hubbard, Kennedy, & van der Laan, 2018; Kraemer et al., 1997). Furthermore, as the lasso model is designed to optimize model accuracy rather than the accuracy of coefficients involving individual predictors, additional caution is needed not to interpret relative coefficient sizes or even signs as indicative of the relative importance of individual predictors themselves. These coefficients should instead be interpreted as indicating the importance of the selected predictors as markers of the joint associations between the many variables in the predictor set and the outcome (Hastie, Tibshirani, & Wainwright, 2016).
Within the context of these cautions, two dominant patterns were found in the analysis of predictor importance. First, consistent with much prior research (Franklin et al., 2017), 5 of the 17 predictors in the final lasso model assessed self-injurious thoughts and behaviors. These were all positively associated with SA. Second, again consistent with prior research (Moselli, Casini, Frattini, & Williams, 2023), another five predictors were indicators of externalizing disorders in childhood or adulthood. These were all positively associated with SAs. Two other predictors were also consistent with prior research: identifying as gay, lesbian, or bisexual, which was positively associated with SA (Plöderl et al., 2013); and being 34+ years of age at the time of leaving active service, which was negatively associated with SA (Ravindran et al., 2020).
The signs of the other five predictors were less consistent with expectations. Two of these were being the victim of any crime in the 4 years before leaving active service, which was positively associated with SA, and having any lifetime life-threatening accidents or other risky near-death experiences other than physical or sexual assault, illness or injury, or a natural disaster, which was negatively associated with SA. Why this opposite-sign pattern occurred is unclear. Nor is it clear why this particular pair of stressors was selected when information was also included in the initial predictor set about many more specific victimization experiences (e.g. physical assault, sexual assault, property crime) and life-threatening experiences (e.g. combat experiences, illness or injury, natural disaster). The negative sign for life-threatening accidents or other near-death experiences is inconsistent with some other work documenting positive associations of trauma exposure with SA as well as with theory (Nock et al., 2013). The same could be said for two of the other remaining predictors: having 1+ dependents ages 6–13 and being discharged honorably/under honorable conditions, both positively associated with SA.
It is plausible to think that these counter-intuitive associations are related to complex exogenous relationships among the predictors leading to sign inversions in the lasso model, but comparison with univariable associations shows that there were no sign inversions. Nonrandom selection is another possibility. The latter might explain the negative association between having 2+ Global War on Terror deployments and SA, as most multiple combat deployments occur only among soldiers who reenlist after an initial tour of duty, and we know that mental health problems are significant predictors of plans to leave service after a first tour of duty (Beymer, Reagan, Rabbitt, Webster, & Watkins, 2021). Unpublished STARRS data suggest that onset of mental disorders in the wake of combat deployments is associated with high probability of leaving service at the end of that tour of duty. In addition, given that commanders select only a fraction of the soldiers in their units for combat deployments, a phenomenon known as the ‘healthy warrior effect’ has been observed in which soldiers selected for deployment have better pre-deployment mental health than those not selected (Wilson et al., 2009). It might be that these selection processes together lead to low risk of SA after leaving active service.
Limitations
The study has several noteworthy limitations. First, the sample was restricted to soldiers who participated in Army STARRS surveys in 2011–2014 and could be traced and resurveyed in 2016–2019. As a result, the possibility of sample bias cannot be ruled out despite our use of adjustment weights. Nor can we rule out the possibility that the significant predictors of SAs during the transition between military and civilian life have changed since those years. Second, SAs were assessed exclusively with self-reports, as we had no access to administrative records after leaving active service. Self-reports under-represent true SAs (Millner, Lee, & Nock, 2015). It is not clear whether prediction accuracy would be different for SAs assessed only by administrative data. This could be investigated in future studies of TSMs who obtain VA healthcare, but this is true of only a minority of recently separated servicemembers (U.S. VA, 2014) and the suicide rate is lower for Veterans who receive VA healthcare compared to Veterans who do not (U.S. VA, 2022). One way to address this problem would be to work in future extensions with one or more of the health information technology aggregation companies that allow deidentified access to health information data across many different health systems throughout the country. Third, the limited number of SAs in the follow-up surveys limited statistical power to detect a larger set of predictors. Fourth, as the STARRS surveys were explicitly advertised as independent academic surveys in which identified respondent reports would not be made available to military leaders, different results might be found when the same questions are administered in the SFL survey. Fifth, as substantial variation existed in the time lag between the baseline Army STARRS surveys, the LS surveys, and date of leaving or being released from active service, model prediction accuracy is likely to have been under-estimated.
We will be able to evaluate some of these limitations once we administer follow-up surveys and evaluate model performance in the control subsample of our anticipated intervention evaluation.
Conclusions and next steps
Within the context of these limitations, the study demonstrated that data available prior to a soldier leaving active service can be used to predict self-reported SAs after leaving with good accuracy. In addition, the finding that the lasso model is as accurate as more complex ML models shows that it is possible to combine this information into a simple risk calculator. Based on these results, the 17 survey questions in our lasso model have been added to the SFL survey and a pilot is underway to evaluate the effects of providing an intensive case management intervention to high-risk TSMs focused on SA prevention (Manuel et al., 2022) based on this risk targeting scheme. Future intervention work will examine the accuracy of the model (among soldiers in the control group of the intervention evaluation) and investigate opportunities for model refinement. Efforts will also be made to investigate whether additional precision treatment models might help improve efforts to match high-risk TSMs to the services within and beyond the intervention that are most likely to result in optimal outcomes (Kessler et al., 2021).
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291723000491
Acknowledgements
The Army STARRS Team consists of Co-Principal Investigators: Robert J. Ursano, MD (Uniformed Services University) and Murray B. Stein, MD, MPH (University of California San Diego and VA San Diego Healthcare System); Site Principal Investigators: James Wagner, PhD (University of Michigan) and Ronald C. Kessler, PhD (Harvard Medical School); Army scientific consultant/liaison: Kenneth Cox, MD, MPH (Office of the Assistant Secretary of the Army (Manpower and Reserve Affairs)); and Other team members: Pablo A. Aliaga, MA (Uniformed Services University); David M. Benedek, MD (Uniformed Services University); Laura Campbell-Sills, PhD (University of California San Diego); Carol S. Fullerton, PhD (Uniformed Services University); Nancy Gebler, MA (University of Michigan); Meredith House, BA (University of Michigan); Paul E. Hurwitz, MPH (Uniformed Services University); Sonia Jain, PhD (University of California San Diego); Tzu-Cheg Kao, PhD (Uniformed Services University); Lisa Lewandowski-Romps, PhD (University of Michigan); Alex Luedtke, PhD (University of Washington and Fred Hutchinson Cancer Research Center); Holly Herberman Mash, PhD (Uniformed Services University); James A. Naifeh, PhD (Uniformed Services University); Matthew K. Nock, PhD (Harvard University); Victor Puac-Polanco, MD, DrPH (Harvard Medical School); Nancy A. Sampson, BA (Harvard Medical School); and Alan M. Zaslavsky, PhD (Harvard Medical School). As a cooperative agreement, scientists employed by the National Institute of Mental Health and U.S. Army liaisons and consultants collaborated to develop the study protocol and data collection instruments, supervise data collection, interpret results, and prepare reports. Although a draft of the manuscript was submitted to the U.S. Army and National Institute of Mental Health for review and comment before submission for publication, this was done with the understanding that comments would be no more than advisory.
Financial support
Dr Kearns was supported in part by VA Suicide Prevention Center grant I01CX002621-01. Dr Marx was supported by VA Suicide Prevention Center grant I01CX002621-01. Army STARRS was sponsored by the Department of the Army and funded under cooperative agreement number U01MH087981 with the U.S. Department of Health and Human Services, National Institutes of Health, National Institute of Mental Health (NIH/NIMH). Subsequently, STARRS-LS was sponsored and funded by the Department of Defense (USUHS grant numbers HU00011520004 and HU0001202003). The grants were administered by the Henry M. Jackson Foundation for the Advancement of Military Medicine Inc. (HJF). The contents are solely the responsibility of the authors and do not necessarily represent the views of the Department of Health and Human Services, NIMH, the Department of the Army, Department of Defense or HJF.
Conflict of interest
In the past 3 years, Dr Kessler was a consultant for Cambridge Health Alliance, Canandaigua VA Medical Center, Holmusk, Partners Healthcare, Inc., RallyPoint Networks, Inc., and Sage Therapeutics. He has stock options in Cerebral Inc., Mirah, PYM, and Roga Sciences. In the past 3 years, Dr Stein received consulting income from Actelion, Acadia Pharmaceuticals, Aptinyx, atai Life Sciences, Boehringer Ingelheim, Bionomics, BioXcel Therapeutics, Clexio, EmpowerPharm, Engrail Therapeutics, GW Pharmaceuticals, Janssen, Jazz Pharmaceuticals, and Roche/Genentech. He has stock options in Oxeia Biopharmaceuticals and EpiVario. He is paid for his editorial work on Biological Psychiatry (Deputy Editor) and UpToDate (Co-Editor-in-Chief for Psychiatry). Dr Goodman is a consultant for Boehringer Ingelheim Pharmaceuticals. The other authors declare that they have no conflicts of interest.
Ethical standards
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.