Since the end of the Persian Gulf War of 1991, its veterans have reported a range of health complaints attributed to service during the war. These veterans report an increased prevalence of a whole range of common symptoms compared with other service personnel who were not deployed to the Gulf.
It is now widely recognised that exposure to combat and other wartime experiences can have both short-term and long-term psychological effects. These psychological consequences are varied, but the concept of post-traumatic stress disorder (PTSD) has arisen to describe the syndrome of intrusive thoughts, flashbacks, hyperarousal and numbing that can occur after exposure to any traumatic event, including those common in wartime.
The Persian Gulf War was brief and there were relatively few casualties among the troops deployed on behalf of the United Nations. Nevertheless, a number of aspects of the war exposed service personnel to traumatic and stressful events: these included the risk of chemical and biological warfare, exposure to combat, and dealing with prisoners and dead and wounded Iraqi soldiers. This paper describes a systematic review of studies that have compared the prevalence of psychiatric disorder in Gulf War veterans with its prevalence in a comparison group who were not deployed to the Gulf (non-Gulf veterans).
METHOD
Data search
Studies between January 1990 and May 2001 were identified from a range of electronic databases, including EMBASE, Medline, ASSIA, SIGLE, PsycINFO, CancerLit, HealthSTAR, Dissertation Abstracts, Current Contents, Health and Psychosocial Instruments, CINAHL and Biological Abstracts. Keywords used to identify the studies were: DESERT STORM or DESERT SHIELD or DESERTSHIELD or GULF WAR or GULF SYNDROME or GULF WAR SYNDROME or PERSIAN GULF WAR or PERSIAN GULF SYNDROME. References of identified studies were searched for further studies. Specialist Gulf veterans' illnesses research websites (US Department of Defense Center for Deployment Health Research site and the Walter Reed Army Medical Center Gulf War database) and more-general Gulf websites were also searched for any additional references. Researchers who had expressed an interest in Gulf veterans' illness research were contacted for any non-published information. There was no restriction on the identification of studies in terms of publication status or language. This search strategy was first applied to data published up to the end of 1998 (n=4156) and then repeated to the end of May 2001 (n=1231).
Studies were included if they contained data on veterans who had been deployed to the Gulf War on military, medical or peace-keeping grounds (i.e. those involved in operations Desert Shield, Desert Storm, Granby or Desert Peace). Any study design was eligible for inclusion provided that an appropriate control or comparison group was included to compare the prevalence of psychiatric disorder.
The 5387 abstracts identified by the original search were screened by N.J.S. and the 2296 that remained eligible were examined by two members of the research team to decide whether they might meet our inclusion criteria. Printed copies of 409 papers were then obtained and examined by two members of the research team to confirm eligibility and extract data.
In our original search we also included studies that compared ill and well Gulf War veterans, but these were excluded from the review reported here. Studies were also excluded if they measured simulated exposures, if they measured non-health-related outcomes, or if the study population included inhabitants of the Persian Gulf states rather than deployed military, medical or peace-keeping personnel.
All identified papers that fulfilled the pre-stated inclusion criteria were categorised by health outcome. Forty-nine studies included data on psychiatric disorder, 29 of which reported on Gulf War veterans and an external comparison group of non-Gulf War veterans. We further restricted the studies to those with a limited range of outcomes concerned with psychiatric disorder (20 studies). The outcomes we chose to include were as follows:
(a) PTSD diagnosed using a recognised standardised assessment;
(b) common mental disorder: depression or anxiety diagnosed using a recognised standardised assessment; or self-reported symptoms of depression recorded on a checklist;
(c) problems related to alcohol misuse.
We have chosen to use the term ‘common mental disorder’ (Goldberg & Huxley, 1992) to refer to the common symptoms of depression and anxiety that are seen in the community and reflect the use of assessments such as the General Health Questionnaire (GHQ; Goldberg & Williams, 1988) and the Symptom Checklist (and its derivatives) (Derogatis et al, 1974; Derogatis, 1977; Derogatis & Spencer, 1982).
Data extraction
Data relating to the studies' main hypotheses and to methodological quality were extracted independently by two members of the research team. Information on the methodological quality of the individual studies included the response rate, the potential for selection bias in the sampling of the study participants, the potential for bias in the measurement of outcomes, the availability of data on confounders, and any adjustment for such variables.
Statistical analysis
Summary odds ratios and risk ratios were calculated with a random-effects model using the inverse variance method. The degree of heterogeneity was assessed using the chi-squared test within a fixed-effects model. All analyses were performed using the METAN command (Bradburn et al, 1998) in Stata version 6 (StataCorp, 1999). We chose the random-effects approach because of the inherent heterogeneity in the data: in particular, we were combining studies with a variety of outcome measures. A random-effects model assumes that the studies in a meta-analysis are sampled from a distribution of effect sizes that is estimated from the data in the meta-analysis. In contrast, a fixed-effects model assumes that all the studies are sampled from a population with the same effect estimate.
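The random-effects pooling described above can be sketched as follows. This is a minimal illustration and not the METAN implementation: it assumes the DerSimonian and Laird moment estimator for the between-study variance (the text does not name the estimator), and the function name `pool_random_effects` and its inputs are hypothetical. Effects are pooled on the log scale, and Cochran's Q is computed with fixed-effect weights, in keeping with the heterogeneity test described above.

```python
import math


def pool_random_effects(log_effects, variances):
    """Pool log effect sizes (e.g. log odds ratios) across studies.

    Hypothetical helper. DerSimonian-Laird is ASSUMED as the estimator
    of the between-study variance; the source does not specify one.
    """
    k = len(log_effects)
    # Fixed-effect step: inverse-variance weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    # Moment estimate of the between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects step: add tau^2 to every within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "q": q,
        "df": k - 1,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Made-up inputs for illustration only (not data from the review).
    result = pool_random_effects(
        [math.log(1.5), math.log(3.0), math.log(2.0)], [0.04, 0.09, 0.02]
    )
    print(result)
```

When the studies agree closely, the estimated between-study variance is zero and the result coincides with the fixed-effects estimate; as heterogeneity grows, the weights become more equal and the confidence interval widens.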
We chose to perform analyses on dichotomous outcomes because the distribution of scores from continuous scales is often difficult to establish from published articles, and this — together with the wide variety of scales that were used — can introduce difficulties in performing a quantitative synthesis. Using ratio measures to estimate association should be less sensitive to the different case definitions and measures used in the constituent studies.
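As an illustration of the ratio measures used for dichotomous outcomes, the sketch below computes an odds ratio and a risk ratio with 95% confidence intervals from a 2×2 table. The function name `ratio_measures` is hypothetical. The example counts are an assumption: they are reconstructed from the PTSD prevalences and sample sizes listed for Gray et al (1999) in Table 1 (15.2% of 527 and 9.0% of 970, rounded to whole people), not taken from the original paper.

```python
import math


def ratio_measures(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Odds ratio and risk ratio with Wald 95% CIs from a 2x2 table.

    Hypothetical helper. Zero cells are not handled (they would need a
    continuity correction before the ratios can be computed).
    """
    a, b = cases_exposed, n_exposed - cases_exposed
    c, d = cases_unexposed, n_unexposed - cases_unexposed
    odds_ratio = (a * d) / (b * c)
    risk_ratio = (a / n_exposed) / (c / n_unexposed)
    # Standard errors on the log scale.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    se_log_rr = math.sqrt(1 / a - 1 / n_exposed + 1 / c - 1 / n_unexposed)

    def ci(estimate, se):
        log_est = math.log(estimate)
        return (math.exp(log_est - 1.96 * se), math.exp(log_est + 1.96 * se))

    return {
        "odds_ratio": odds_ratio,
        "or_ci": ci(odds_ratio, se_log_or),
        "risk_ratio": risk_ratio,
        "rr_ci": ci(risk_ratio, se_log_rr),
    }


if __name__ == "__main__":
    # ASSUMED counts: about 80 of 527 Gulf veterans and 87 of 970
    # non-Gulf veterans screened positive (reconstructed from Table 1).
    r = ratio_measures(80, 527, 87, 970)
    print(round(r["odds_ratio"], 1), [round(x, 1) for x in r["or_ci"]])
    # prints: 1.8 [1.3, 2.5]
```

With these reconstructed counts the odds ratio agrees with the value listed in Table 1 for that study (OR 1.8, 95% CI 1.3-2.5). The risk ratio is closer to 1 than the odds ratio, as expected when the outcome is not rare.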
RESULTS
The systematic review process is shown in Fig. 1. We identified 20 primary studies that investigated the association between deployment to the Gulf War and psychiatric disorder (Perconte et al, 1993; Sutker et al, 1993, 1994; Stretch et al, 1996a,b; Iowa Persian Gulf Study Group, 1997; Pierce, 1997; Stuart & Halverson, 1997; Goss Gilroy Inc., 1998; Holmes et al, 1998; Proctor et al, 1998; Stuart & Bliese, 1998; Gray et al, 1999; Ishoy et al, 1999; Unwin et al, 1999; Wolfe et al, 1999; Bartone, 2000; Kang et al, 2000; Steele, 2000; Cherry et al, 2001). We excluded nine other studies that included data on psychiatric disorder in Gulf War veterans but did not meet our inclusion criteria: five repeated results already included, three did not include any of the psychiatric outcomes defined above, and one compared Gulf veterans with reported illness with a comparison sample (further details available from the authors upon request).
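The counts reported at each stage of the review can be checked for internal consistency with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the stage labels are paraphrases, and the names `STAGES`, `EXCLUDED` and `flow_with_losses` are hypothetical.

```python
# Counts as reported in the text; labels are paraphrased.
FIRST_SEARCH = 4156    # records to the end of 1998
UPDATED_SEARCH = 1231  # records added to the end of May 2001

STAGES = [
    ("abstracts identified", FIRST_SEARCH + UPDATED_SEARCH),
    ("abstracts still eligible after first screen", 2296),
    ("papers obtained in full", 409),
    ("studies with data on psychiatric disorder", 49),
    ("studies with an external non-Gulf comparison group", 29),
    ("studies with the chosen psychiatric outcomes", 20),
]

# Reasons given for excluding studies at the final step.
EXCLUDED = {
    "repeated results already included": 5,
    "none of the defined psychiatric outcomes": 3,
    "ill Gulf veterans v. comparison sample": 1,
}


def flow_with_losses(stages):
    """Return (label, count, number dropped since the previous stage)."""
    rows = []
    previous = None
    for label, count in stages:
        dropped = 0 if previous is None else previous - count
        rows.append((label, count, dropped))
        previous = count
    return rows


if __name__ == "__main__":
    for label, count, dropped in flow_with_losses(STAGES):
        print(f"{count:>5}  {label}  (dropped: {dropped})")
```

The two search totals sum to the 5387 abstracts screened, and the nine exclusions listed above account for the step from 29 to 20 studies.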
Table 1 summarises the studies we identified. All are best described as cross-sectional surveys. Some studies, for example those by Kang et al (2000), Unwin et al (1999) and Ishoy et al (1999), resemble cohort studies, as the population was defined in terms of ‘exposure’ to the Gulf War. However, these studies had little or no information on health status before deployment and therefore share most of the methodological limitations of cross-sectional surveys.
Reference | Study design | Sample | Study period | Main outcomes | Main results | Selection or response bias | Confounding |
---|---|---|---|---|---|---|---|
Samples from national military databases | | | | | | | |
Pierce (1997) | Longitudinal study comparing GWV (n=153) with those deployed elsewhere (n=331) | Stratified random sample, female US Air Force GWV and ‘other deployed’ veterans. Sampling frame from US DoD manpower data centre. Stratified by active duty/reserve, parent/non-parent | T1 was 2 years after war and T2 4 years | Common mental disorder: sub-scales for depression, anxiety and somatisation based on Hopkins Symptom Checklist; depression item from RAND Questionnaire. PTSD: Mississippi scale | Common mental disorder: ‘No significant differences’ with any of the mental health measures used; unclear whether at T1 or at T2. Mean scores on depression item (no s.d. given): T1: NGV 1.84, GWV 2.02; T2: NGV 1.66, GWV 1.87. PTSD: GWV 24%, NGV 15%; unclear whether at T1 or at T2 | Started with 638 women, traced 525, of whom 92% and 87% responded at T1 and T2 respectively | Controlled for age using analysis of variance. Possibly frequency-matched for service status and parent |
Goss Gilroy Inc. (1998) | Cross-sectional survey: postal | All Canadian GWV (n=3113) identified from the Canadian Department of National Defense and a ‘deployed elsewhere’ control group matched for gender, age, reserve/regular (n=3439) | Survey June-Dec. 1997 | Common mental disorder: minor and major depression (PRIME-MD); alcohol misuse (measure not specified). PTSD: based on symptom report (PCL-M) | GWV had higher levels of minor depression adjusted for rank (OR=1.78; 95% CI 1.51-2.11) and major depression adjusted for rank and income (OR=3.67; 95% CI 3.04-4.44) compared with NGV. No significant association with alcohol misuse. PTSD: OR=2.69 (95% CI 1.59-4.26) adjusted for income | Response rate: GWV 73.0%, NGV 60.3% | In addition to the matching, the ORs were adjusted for other confounders that were statistically significantly associated with outcome |
Stuart & Bliese (1998) | Cross-sectional survey: postal | Randomly selected units from all US Army National Guard and Reserves taken from the DoD manpower data centre; NGV were deployed to USA or Germany. GWV n=991, NGV n=279 | Jan. and Feb. 1993 | Common mental disorder: Brief Symptom Inventory; cases defined on the Global Severity Index | Prevalence: GWV 32.1%, NGV 17.3%. NGV group from USA omitted because of error in table | Return rate was 31% | Multiple regression was used to control for rank, gender, age, marital status, ethnicity, current life stress scale |
Ishoy et al (1999) | Cross-sectional survey | All Danish military GWV and NGO aid workers compared with NGV group frequency-matched on gender, age and ‘profession’ using a Danish military database. GWV n=686, NGV n=231 | Feb. 1997-Jan. 1998 | Common mental disorder: self-reported symptom of depression or sadness during previous 12 months that started during or after Gulf War | Prevalence: GWV 22.6%, NGV 10.4% | Response rate: GWV 83.6%, NGV 57.8% | Controls matched for gender, age and profession. Non-response meant GWV more likely to be older males |
Unwin et al (1999)¹ | Cross-sectional survey: postal | Stratified random sample drawn from UK military database: GWV n=3284, Bosnian veterans n=1815, era veterans n=2408. Bosnia and era samples frequency-matched for age, service, service status, gender, rank and fitness level. Different sample from Cherry et al (2001) | Aug. 1997-11 Nov. 1998 | Common mental disorder: case was ≥3 on GHQ-12 (Goldberg & Williams, 1988). PTSD: items from Mississippi scale; authors' own case definition based on DSM-IV | Common mental disorder prevalence: GWV 39.2%, Bosnia 26.3%, era 24.0%; GWV v. Bosnia, OR=1.6 (95% CI 1.4-1.8); GWV v. era, OR=2.1 (95% CI 1.9-2.4); adjusted ORs reported. PTSD prevalence: GWV 13.2%, Bosnia 4.7%, era 4.1%; GWV v. Bosnia, OR=2.6 (95% CI 1.9-3.4); GWV v. era, OR=3.8 (95% CI 2.8-4.9); adjusted ORs reported | Response rate: GWV 70.4%, Bosnia 61.9%, era 62.9%. Responders were significantly older and more likely to be in service, but did not differ on SF-36 | Potential confounders (age, marital status, rank, education, employment, still serving or discharged, smoking and alcohol consumption) are adjusted for using logistic regression. The samples were frequency-matched by age, service, service status, gender, rank and fitness. Analyses restricted to men only |
Kang et al (2000) | Cross-sectional survey | Stratified random sample taken from the DoD manpower data centre. GWV n=11 441, NGV n=9476. Frequency-matched on gender and service status | Not mentioned | Common mental disorder: self-reported symptom of ‘depression’ | Prevalence: GWV 36%, NGV 22%; rate difference 14% (95% CI 13.9-14.1) | Response rate: GWV 75%, NGV 64%. Non-responders were more likely | No adjustment made for potential confounders except for the matching variables, gender and service status |
Cherry et al (2001) | Cross-sectional survey | Stratified random sample from the UK Ministry of Defence database: GWV (n=8014) and NGV (n=3900) frequency-matched by gender, age, service and rank. Different sample from Unwin et al (1999) | Dec. 1997-Sept. 1999 | Common mental disorder: self-reported symptom ‘Feeling unhappy and depressed’: visual analogue scale | Mean score: GWV 6.3, NGV 3.7 (from figure) | Response rate: GWV 84.3%, NGV 82.1% | No adjustment apart from matching variables: gender, age, service and rank |
Representative samples restricted to a single geographical area | | | | | | | |
Iowa Persian Gulf Study Group (1997) | Cross-sectional telephone survey | DoD manpower data centre used to create a stratified random sample from 28 968 military personnel from Iowa: GWV n=1896, NGV n=1799 | Sept. 1995-May 1996 | Common mental disorder: Brief Symptom Inventory (case definition not stated); CAGE questionnaire for alcohol misuse (case definition not stated). PTSD: PTSD Checklist-Military (case defined as score ≥50) | Prevalence difference in depression GWV v. NGV, 6.0 (95% CI 4.0-7.9). Prevalence of depression (estimated from tables): GWV 9.0%, NGV 4.6%. Prevalence of alcohol misuse (estimated from tables): 17.4% v. 12.6%, P=0.02. PTSD prevalence (estimated from tables): GWV 1.9%, NGV 0.8% | Response rate: GWV 78.3%, NGV 73% | Adjusted for stratification variables: military service (regular/National Guard), age, gender, race, branch of service and rank |
Steele (2000) | Cross-sectional telephone survey | GWV n=1435, NGV n=409. DoD manpower data centre used to create a stratified random sample from 16 566 military personnel from Kansas. Stratified by reservist and gender | Feb.-Aug. 1998 | Common mental disorder: self-reported ‘Feeling down or depressed’ | Prevalence: GWV 23%, NGV 9%; OR=2.99 (95% CI 2.07-4.31) | Response rate: 65% (3138 eligible). Gulf and female veterans were more likely to respond | Odds ratio adjusted for gender, age, income and education level |
Other sampling strategies | | | | | | | |
Perconte et al (1993) | Cross-sectional survey | Selected from ‘various’ reserve units in western Pennsylvania: NGV n=126, GWV n=439, EDV n=26 | Not provided | Common mental disorder: cases defined on Global Severity Index (score ≥0.7) of SCL-90-R; Beck Depression Inventory (case score ≥10). Administered after psycho-educational presentations and discussion. PTSD: Mississippi scale | Common mental disorder prevalence on GSI: GWV 27.9%, EDV 11.5%, NGV 15.1%; on BDI: GWV 26.9%, EDV 23.1%, NGV 16.7%. PTSD prevalence: GWV 15.5%, NGV 4.0%, EDV 3.9% | Response rate: 95% (approximate) | No adjusted results presented |
Sutker et al (1993) | Cross-sectional survey | Participants were drawn from 5 National Guard and Army Reserve units in Louisiana: 215 troops deployed to the Gulf and 60 troops from the same units who were activated but not deployed overseas | Troops were assessed 4-10 months after completion of Desert Storm | Common mental disorder: Beck Depression Inventory; anxiety: State-Trait Anxiety Inventory; Health Symptom Checklist (includes 9 items from Hopkins Symptom Checklist) | The GWV who experienced ‘high war-zone stress’ scored significantly more highly on the BDI and STAI anxiety measures | Response rate: GWV 70.3%. No response rate given for NGV | All personnel had returned to the USA and not sought mental health treatment services. GWV were more likely to be younger and higher military rank, although correlational analyses found no significant difference between these measures and scores on mood measures |
Sutker et al (1994) | Cross-sectional survey | 60 Army reservists assigned grave registration duties: GWV (n=40), NGV (n=20) ‘activated but remained stateside’ | 12 months after return from Desert Storm | Common mental disorder: depressive disorder NOS; SCID-P. PTSD: defined using SCID-P | Common mental disorder: higher prevalence in GWV (13%); no case found in NGV. PTSD: higher prevalence in GWV; no cases in NGV | The 60 respondents were selected from 124 in a survey of unspecified design. Response rate uncertain | All troops with previous combat experience were GWV. GWV older than NGV. No adjustment for confounding |
Stretch et al (1996a) | Cross-sectional survey: postal | Various units from Hawaii and Pennsylvania who either deployed to the Gulf or did not deploy. GWV n=1524, NGV n=2512 | 1993 | Common mental disorder: Brief Symptom Inventory | GWV significantly higher on all sub-scales (and after adjustment) | Response rate: 31% | Hierarchical multiple regression used to adjust for age, rank, service branch, race, education, drinking and smoking |
Stretch et al (1996b) | Cross-sectional survey: postal | Various units from Hawaii and Pennsylvania who either deployed to the Gulf or did not deploy. GWV n=1524, NGV n=2512 | 1993 | PTSD: Walter Reed Army Institute of Research PTSD algorithm (9 items from IES) and 8 items from BSI | Prevalence: GWV 8.6%, NGV 1.6% | Response rate: 31% | No adjustment was made for potential confounders. Current life stresses were found to be strongly related to PTSD symptoms |
Stuart & Halverson (1997) | Cross-sectional survey | No details provided of sampling strategy. GWV n=2180 male, n=182 female; Bosnia veterans n=1254 male, n=184 female, serving in Bosnia May-July 1996 | GWV were studied from Nov. 1990-Mar. 1991; NGV were sampled at 2 time points: May-Sep. 1991 and Jan.-Mar. 1996 | Common mental disorder: Global Severity Index of the Brief Symptom Inventory | GSI mean (s.d.): GWV, male 0.7 (0.7), female 0.8 (0.7); Bosnia veterans, male 0.6 (0.6), female 0.7 (0.6) | Response rate: unknown | No apparent adjustment for confounders |
Holmes et al (1998) | Cross-sectional survey: postal | All members of an Air National Guard Unit. GWV n=296, NGV n=210 | 11 months after end of hostilities | Common mental disorder: case defined as score ≥70 on Global Severity Index of SCL-90-R. PTSD: Mississippi scale (case ≥90) | Common mental disorder prevalence: GWV 11.5%, NGV 7.3%; significant increase in mean scores on GSI in GWV. PTSD prevalence: GWV 6.8%, NGV 1.7% | Response rate: GWV 57.2%, NGV 42.3% | Used logistic regression but insufficient details to decide what adjustments were made |
Proctor et al (1998) | Cross-sectional survey | Stratified random samples of GWV from Fort Devens, New England (n=186) and New Orleans (n=66), and an NGV comparison group from an air ambulance unit (n=48) | Spring 1994 to autumn 1996 | Common mental disorder: self-reported ‘frequent periods of feeling depressed’. PTSD: CAPS; Mississippi scale for Desert Storm war zone personnel. CAPS possibly subject to observer bias | Common mental disorder prevalence: New England 22.6%, New Orleans 5.8%, NGV 1.6%; OR=6.0 for New England, 3.9 for New Orleans. PTSD prevalence, CAPS: New England 5% (8/148), New Orleans 8% (4/58), NGV 0%; prevalence, Mississippi: New England 8.1%, New Orleans 7.6%, NGV not stated | Response rate: New England 62%, New Orleans 38%, NGV 51%. Participants were recruited after taking part in a previous study whose response rate was 78%. NGV not very comparable | Prevalence estimates account for stratification. Odds ratios adjusted for age, gender and education with logistic regression |
Wolfe et al (1999) | Cross-sectional survey | Stratified random samples of GWV from New England (Fort Devens) (n=148) and New Orleans (n=56) and an NGV comparison group from an ambulance unit (n=48) (same study as Proctor et al, 1998) | 1994-1996 | Common mental disorder: SCID used to define DSM-III-R major depressive disorder. CAPS and SCID administered by trained clinicians. Blindness to exposure status not mentioned | Major depressive disorder: New England GWV 6.6%, New Orleans GWV 4.5%, NGV 0% | Response rate: GWV 30-42%, NGV 51% of unit. GWV non-responders tended to be younger and unmarried. The GWV groups were quite different from the NGV group | Adjusted for sampling design to reflect distribution of gender and reported health symptoms. No other adjustments were made |
Gray et al (1999) | Cross-sectional survey | Active duty Seabees (naval construction workers) who remained in service after Desert Storm/Shield and were serving in one of two large Seabee centres. Selected using the DoD manpower database. GWV n=527, NGV n=970 | Sept. 1994-June 1995 | Common mental disorder: five dimensions of Hopkins Symptom Checklist; self-reported symptom of depression. PTSD: PTSD screen (items from DSM-IV) | GWV had statistically significantly increased scores on all 5 dimensions of the Hopkins Symptom Checklist. Prevalence of self-reported depression: GWV 6.8%, NGV 2.8%; OR=2.6 (95% CI 1.5-4.4). PTSD prevalence: GWV 15.2%, NGV 9.0%; OR=1.8 (95% CI 1.3-2.5) | Estimated overall response rate: 53% | GWVs younger, more often male and less educated. No adjustments for confounders |
Bartone (2000) | Cross-sectional survey | Six army reserve medical units. GWV n=389, NGV n=381 (236 to USA, 145 to Germany) | 4-6 months after the end of the Gulf War | PTSD: Impact of Events Scale | Mean (s.d.): GWV 13.8 (9.7), NGV (USA) 4.0 (7.2), NGV (Germany) 9.3 (9.7) | Response rate: approximately 50% | Statistically significant after adjustment for number of stressful life events and ‘hardiness’ |
Sampling
The sampling design of the studies varied. For example, Unwin et al (1999), Kang et al (2000), Goss Gilroy Inc. (1998), Ishoy et al (1999) and Cherry et al (2001) identified samples of service personnel from military databases. The Unwin et al and Cherry et al studies were of two independent samples drawn from the same UK military database. They employed stratified random sampling in order to frequency-match the characteristics of Gulf War veterans with those who were on active duty at the time but were not deployed to the Gulf. These comparison groups are referred to as non-Gulf veterans; the proportion actually deployed to areas other than the Gulf varied between studies. An alternative sampling strategy used by two studies, the Iowa Persian Gulf Study Group (1997) and Steele (2000), identified all military service personnel who had served during the period of the Gulf War and who lived in one US state (Iowa and Kansas, respectively). Within this standard survey design the investigators then compared those who had been deployed to the Gulf with those who had not. Pierce (1997) also used a military database but selected only women from the US Air Force to study.
There were also more ad hoc sampling procedures that did not use the large national databases. For example, Holmes et al (1998), Gray et al (1999) and Sutker et al (1993) compared Gulf War veterans and non-Gulf veterans within a selection of units. Some studies also chose a small number of military bases without any apparent justification for inclusion (Proctor et al, 1998; Wolfe et al, 1999).
Response rates
Response rates also varied considerably between studies (Table 1). Most importantly, in the studies that reported response rates separately, the response rate of the Gulf War veterans was higher than that of the non-Gulf veterans. This could bias the comparison. For example, Unwin et al (1999) had a 70% response rate in the Gulf War veterans and a 63% response rate in the non-Gulf War veterans sample. Goss Gilroy Inc. (1998) in the Canadian study reported response rates of 73% for Gulf War veterans and 60% for non-Gulf War veterans.
Measurement
Most of the studies took place after there had been considerable publicity about illness in Gulf War veterans. However, four studies included here reported findings based upon surveys carried out within about a year of the end of the Gulf War: these studies were by Sutker et al (1993, 1994), Holmes et al (1998) and Stuart & Halverson (1997). All reported a significant excess of psychopathological disorder within the Gulf War veterans.
Many of the studies used the Mississippi scale (Keane et al, 1988) or modified versions thereof to assess symptoms of PTSD; this is a self-administered scale and it is generally assumed to be less valid than some of the more detailed questionnaires. Some studies used their own method for assessing PTSD based upon questions modelled on the DSM-III-R (American Psychiatric Association, 1987) criteria. A few studies used structured interviews administered by clinicians (Sutker et al, 1994; Proctor et al, 1998; Wolfe et al, 1999), but these assessments had the potential disadvantage of introducing observer bias, as the interviewers would not have been masked to the participants' deployment status.
We identified 17 studies that included data on common mental disorders. The self-administered questionnaire used most frequently to assess common mental disorder, in eight studies, was the Hopkins Symptom Checklist or Brief Symptom Inventory (Derogatis et al, 1974; Derogatis, 1977; Derogatis & Spencer, 1982; Derogatis & Melisaratos, 1983). This scale was reported either as a continuous outcome or used to define a ‘case’ of common mental disorder. The other studies used a variety of methods to assess common mental disorder, ranging from self-reported symptoms of depression (Proctor et al, 1998; Gray et al, 1999; Ishoy et al, 1999; Kang et al, 2000; Steele, 2000), through other self-administered scales such as the GHQ (Unwin et al, 1999), to lengthy clinician-administered structured interviews (Wolfe et al, 1999).
Confounding
There was considerable variation in the extent to which the authors attempted to adjust for confounders. Many of the studies that selected from the military databases used a stratified sampling procedure and frequency-matched the non-Gulf veterans on some characteristics in order to adjust for confounding. Some studies included these variables in a multivariate model when analysing their results, which was probably necessary given the differential response rate between the Gulf War veterans and non-Gulf veterans. The most thorough adjustments were carried out by Unwin et al (1999). In particular, only Unwin et al and Stuart & Bliese (1998) adjusted for marital status. This is likely to be an important confounding variable, as single people usually have higher rates of common mental disorder and were more likely to be deployed to the Gulf War (although not in Unwin et al's study, possibly because the UK military have fewer members who are never deployed on active service). Unwin et al (1999) found that the odds ratio for being a case on the GHQ changed only from 2.0 to 2.1 after adjustment, indicating that there was little evidence of confounding by the variables identified in that study. Results similar to these were obtained using PTSD as the outcome.
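The comparison of crude and adjusted odds ratios can be expressed as a percentage change in the estimate. The sketch below applies this to the GHQ figures quoted above (2.0 before adjustment, 2.1 after). The 10% threshold is a commonly used rule of thumb and an assumption here, not a criterion stated by the authors; the function names are hypothetical.

```python
def percent_change_in_estimate(crude, adjusted):
    """Relative change (%) in a ratio estimate after adjustment."""
    return 100.0 * (adjusted - crude) / crude


def suggests_confounding(crude, adjusted, threshold=10.0):
    """Flag a shift larger than the threshold (ASSUMED rule of thumb)."""
    return abs(percent_change_in_estimate(crude, adjusted)) > threshold


if __name__ == "__main__":
    # GHQ caseness odds ratios from Unwin et al (1999), as quoted in the text.
    change = percent_change_in_estimate(2.0, 2.1)
    print(f"change after adjustment: {change:.1f}%")
    print("material confounding suggested:", suggests_confounding(2.0, 2.1))
```

A shift of about 5% falls below this conventional threshold, which is consistent with the conclusion that the measured variables produced little confounding in that study.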
Meta-analysis
Post-traumatic stress disorder
It was possible to conduct a meta-analysis of 9 of the 11 studies that reported dichotomous outcomes for PTSD. We were unable to use the data from Goss Gilroy Inc. (1998) and Bartone (2000). The results are summarised in Fig. 2. The overall summary estimate using a random-effects model was an odds ratio of 3.17 (95% CI 2.16-4.65), indicating an increased risk in Gulf War veterans. There was significant heterogeneity (χ2=29.4, d.f.=8, P<0.0001). In particular, the two large studies by Unwin et al and Gray et al differed: the former found an OR of 3.5 and the latter an OR of 1.8. The summary estimate for the risk ratios was 2.9 (95% CI 2.0-4.2).
Common mental disorder
We were able to perform a meta-analysis on 11 of the studies that reported on the prevalence of common mental disorder (Fig. 3). Two studies used the same sample, but one (Wolfe et al, 1999) reported results from the Structured Clinical Interview for DSM-III-R (Spitzer et al, 1990) and the other (Proctor et al, 1998) presented results from self-reported symptoms of depression. The summary estimate was an odds ratio of 2.04 (95% CI 1.94-2.15), irrespective of whether the data from either of these studies were excluded, indicating an increased risk of common mental disorder in the deployed service personnel. Despite the variation between studies in the outcome used, there was no statistical evidence to support heterogeneity in this sample using odds ratios (heterogeneity test χ2=9.39, d.f.=10, P=0.5). The summary estimate for risk ratio was 1.8 (95% CI 1.6-2.0). It should be noted that the studies by Kang et al (2000) and Unwin et al (1999) accounted for 90% of the variance weights in the meta-analysis. The other studies therefore had little influence on the summary estimate.
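One way to put the two heterogeneity statistics on a common scale is the I² statistic, which expresses the excess of the chi-squared value over its degrees of freedom as a percentage. I² is not reported in this review; it is shown here only as an illustration, computed from the chi-squared values and degrees of freedom quoted above. The function name `i_squared` is hypothetical.

```python
def i_squared(q, df):
    """I^2 (%) from a heterogeneity chi-squared statistic and its d.f.

    Illustrative only: I^2 = 100 * (Q - df) / Q, truncated at zero.
    """
    if q <= 0:
        return 0.0
    return max(0.0, 100.0 * (q - df) / q)


if __name__ == "__main__":
    # Values quoted in the text for the two meta-analyses.
    print(f"PTSD: {i_squared(29.4, 8):.1f}%")
    print(f"Common mental disorder: {i_squared(9.39, 10):.1f}%")
```

On these figures roughly 73% of the variation in the PTSD estimates would be attributed to between-study differences rather than chance, whereas the value for common mental disorder is zero because the chi-squared statistic is smaller than its degrees of freedom.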
A funnel plot of the standard error of the estimate against the size of effect suggested that there were fewer small non-significant findings than would be expected. This would not have had much influence on the findings, given the presence of a number of large studies.
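Funnel-plot asymmetry is usually judged by eye, as it was here. One common numerical complement, which the authors did not report using, is Egger's regression: each study's standardised effect (effect divided by its standard error) is regressed on its precision (the reciprocal of the standard error), and an intercept far from zero suggests asymmetry. The following minimal sketch computes only the point estimates, omits the significance test on the intercept, and uses hypothetical values constructed so that smaller studies show larger effects.

```python
def egger_regression(effects, std_errors):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    Returns (intercept, slope). The slope estimates the underlying effect;
    an intercept far from zero suggests funnel-plot asymmetry.
    """
    x = [1 / se for se in std_errors]
    y = [e / se for e, se in zip(effects, std_errors)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical log odds ratios (NOT from the reviewed studies), built so that
# studies with larger standard errors report systematically larger effects
std_errors = [0.05, 0.06, 0.20, 0.30, 0.40, 0.50]
effects = [0.7 + 1.5 * se for se in std_errors]

intercept, slope = egger_regression(effects, std_errors)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

In this constructed example the intercept recovers the built-in small-study bias (1.5) and the slope recovers the underlying log odds ratio (0.7).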
Alcohol misuse
There was little evidence concerning alcohol misuse or dependence. Goss Gilroy Inc. (1998) stated that there was no statistically significant association between alcohol misuse and deployment. The Iowa study (Iowa Persian Gulf Study Group, 1997) reported an increased prevalence of alcohol misuse measured by the CAGE questionnaire (Ewing, 1984).
DISCUSSION
The results of our systematic review and meta-analysis indicate an increased prevalence of PTSD and common mental disorder in service personnel who had been deployed to the Persian Gulf War. The size of this effect was somewhat larger for PTSD, with an OR of 3.2 (95% CI 2.2-4.7) compared with 2.0 (95% CI 1.9-2.1) for common mental disorder.
Publication bias
We adopted a thorough search strategy but — as in all systematic reviews — may have failed to identify some studies. We are also aware that other studies on this topic are in progress and have yet to report their findings. It is difficult to assess the effect of any publication or citation bias in our data, given the small number of studies that reported data in a form permitting meta-analysis. A funnel plot for the ‘common mental disorder’ outcome suggested that there was an underrepresentation of small studies finding no association between deployment to the Gulf and disorder. However, these small studies would not have had a major impact on the summary odds ratio, despite suggesting that it might be a slight overestimate. The summary estimate was dominated by the two large studies.
Sample selection
A critical part of these designs is the comparability of the deployed and non-deployed troops. Some of the studies used military databases and took care to ensure that their sample was representative of both Gulf War veterans and the comparison group. It is likely that the characteristics of troops selected for deployment systematically differed from those of other active service personnel who were not deployed. This could be less marked for the UK military service, in which almost everyone is likely to be deployed on active duty. Potential confounding factors include gender, fitness level and marital status, along with other aspects (such as propensity for risk-taking) that are more difficult to measure. It is also likely that within individual units, the reasons for choosing people for deployment would lead to a greater selection bias than in studies sampling from national databases, in which whole units would have been selected.
It is difficult to be sure about the effect of selection on the results reported here. Some authors have suggested a ‘healthy warrior’ effect, that the deployed have better underlying health. On the other hand, single people, who are more likely to have been deployed (at least in some studies; Proctor et al, 1998), tend to have poorer mental health (Kessler et al, 1994; Jenkins et al, 1997). None of the studies had any independent information about the mental health of participants before the Gulf War and so were not able to take any account of this factor.
Non-response bias
The studies that reported response rates according to deployment status all found that the Gulf War veterans had a higher response rate. It is likely that the publicity surrounding illnesses in Gulf War veterans increased the relevance of a questionnaire about health effects to respondents who had been deployed to the Gulf. This differential response rate could introduce a systematic bias.
Some studies have reported that non-responders tended to have poorer mental health than those who responded (Williams & Macdonald, 1986), although Unwin et al (1999), in a more intensive follow-up of non-respondents, did not find a statistically significant increased risk of common mental disorder. Kang et al (2000) also compared those who responded to the later mailings with those who returned the first mailshot. They did not find that the later respondents had poorer self-rated general health. On balance, it is unlikely that the differential response rate seen in these studies could have explained such a large association as that reported.
Outcome measurements
The majority of studies relied on self-reported symptoms to assess the prevalence of psychiatric disorder. Some of the studies used well-recognised and validated measures of psychiatric disorder, but others (including some of the larger studies) reported results from a single question asking about depressive symptoms. Despite this variation in measurement methods, there was little evidence of heterogeneity in the estimates for common mental disorder. Studies that used the longer semi-structured interviews might have introduced observer bias, given the difficulty in ‘blinding’ the interviewers. In contrast, there was evidence of heterogeneity for the PTSD estimates. In particular, Unwin et al (1999) reported a larger effect than did Gray et al (1999), although both reported a significant increase in prevalence in the Gulf War veterans. Gray et al restricted their sample to naval construction workers, so the different result might merely have reflected the different experiences of this group of service personnel. It should also be noted that the Unwin et al study used a UK military cohort in which almost all the non-Gulf War veterans comparison group would have been deployed on active service at one time or another.
Five of the 20 studies were carried out within 12 months of the end of the war and at a time when publicity concerning illness in Gulf War veterans was minimal. All these studies reported a statistically significant increase in psychopathological disorders in Gulf War veterans. These early studies tended to be less robust from a methodological point of view than the later ones: the samples were less representative, response rates were lower and the studies smaller in size. In contrast, the later and often more robust studies could have been subject to a reporting bias following publicity about illnesses in Gulf War veterans. Because an increase was found both before and after this publicity, it appears unlikely that a reporting bias could have led to the findings reported in the constituent studies.
Illnesses in Gulf War veterans
We found that veterans deployed to the Persian Gulf War reported more PTSD and more symptoms of common mental disorder than did service personnel who had not been deployed to this war. Increased rates of PTSD have often been reported after conflicts and can be attributed to the increased likelihood of psychologically traumatising events during wartime. The increased rates of other psychiatric symptoms might just be a consequence of the same process. There is evidence that psychologically traumatic events also lead to an increase in other psychological symptoms, particularly anxiety, in addition to the symptoms more specifically associated with the syndrome of PTSD. An increased rate of psychiatric disorders would therefore be expected in Gulf War veterans, although this does not diminish the importance of this morbidity in affecting veterans many years after returning from the conflict.
What is less clear is how these findings relate to the issue of Gulf War illnesses. Gulf War veterans have reported a wide variety of symptoms, aside from psychiatric symptoms. Unwin et al (1999), in the UK study, reported an increased prevalence of a whole range of symptoms after having adjusted for the increased prevalence of common mental disorder in the Gulf War veterans. This supports the view that some other factors must be contributing to illnesses in these veterans, in addition to any increase in psychiatric disorder.
Psychiatric disorder is common, disabling and burdensome. It is an important source of disability after war, yet this is often inadequately recognised and acknowledged. Developing more-effective means of preventing and treating psychiatric disorder in service personnel is an important priority for future research.
Clinical Implications and Limitations
CLINICAL IMPLICATIONS
▪ Veterans of the Gulf War report more post-traumatic stress disorder and more depression and anxiety than do war veterans not deployed to that conflict.
▪ Service in a war zone leads to an increase in symptoms many years afterwards.
▪ The presence of psychiatric symptoms probably does not explain the increased prevalence of other somatic symptoms reported by Gulf War veterans.
LIMITATIONS
▪ Most of the large studies were conducted after public concern about illnesses in Gulf War veterans had been voiced.
▪ The studies relied upon self-reported information about psychiatric symptoms.
▪ Some of the larger studies used non-standard methods of assessing psychiatric symptoms.
Acknowledgements
We thank Simon Wessely and Matthew Hotopf for comments on an earlier draft of the manuscript. We thank the Department of Information Services at University of Wales College of Medicine for assistance in obtaining references for this review.