Introduction
Individuals with schizophrenia generally exhibit neurocognitive deficits in multiple cognitive domains, including executive function, memory, attention, and problem-solving (Harvey & Rosenthal, 2018; Mesholam-Gately, Giuliano, Goff, Faraone, & Seidman, 2009; Sheffield, Karcher, & Barch, 2018). In addition to neurocognitive impairments, deficits in social cognition – the ability to learn social norms and perceive emotions and other social cues in interpersonal interactions – are commonly seen in individuals with schizophrenia (Green, Horan, & Lee, 2019). The social cognition domain is divided into four sub-domains: emotion processing, social perception, attributional style, and theory of mind (i.e. mentalizing) (Green et al., 2019).
The Measurement and Treatment Research to Improve Cognition in Schizophrenia (MATRICS) Consensus Cognitive Battery (MCCB) (Nuechterlein et al., 2008) is the most widely used battery to comprehensively assess cognition in schizophrenia. However, some authors (Hellemann, Green, Kern, Sitarenios, & Nuechterlein, 2017) have expressed concerns about the cross-cultural validity of the test used to assess social cognition in this battery, the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) (Mayer, Salovey, & Caruso, 2002). The MSCEIT expects respondents to interpret stories or vignettes about social situations that are unfamiliar to many respondents from non-Western cultures, particularly rural respondents, so it is frequently omitted in studies of cognition in schizophrenia (Deng et al., 2022; Stone et al., 2020).
The Reading the Mind in the Eyes Test (RMET) (Baron-Cohen, Wheelwright, Hill, Raste, & Plumb, 2001) is another measure used to assess social cognition in schizophrenia. The RMET assesses ‘theory of mind’, a different component of social cognition from the one assessed by the MSCEIT (‘emotion processing’). It shows respondents the eye region of 36 Caucasian faces and asks them to select one of four accompanying labels that best describes the mental state of the individual pictured. The RMET – which has been translated into more than 20 languages – may be less culture-dependent than the MSCEIT; however, there has been no systematic review integrating the results of studies about the use of the RMET in schizophrenia, so it is uncertain whether it could be used as an alternative to the MSCEIT in comprehensive measures of cognitive functioning in schizophrenia. Moreover, very few of the available studies that assess the social cognition of individuals with schizophrenia or healthy controls report multivariate analyses that explore the association between RMET results and important covariates, such as age, years of schooling, IQ, race and language of administration – factors that could potentially explain the considerable heterogeneity of RMET performance among participants.
This systematic review identified all studies that use RMET to assess social cognition in separate samples of individuals with schizophrenia or healthy control subjects, not limited to studies that include both these groups. We also conducted a formal assessment of the quality of the reports of these studies. We then compared the RMET results of all identified samples of individuals with schizophrenia with those of all samples of healthy controls and conducted a meta-analysis of data from the subgroup of studies that directly compare RMET results in individuals with schizophrenia and healthy controls. Other study-level meta-regression analyses assessed the relationship of age, level of education, IQ, race, and language of administration (English v. non-English) to RMET scores in healthy controls and individuals with schizophrenia.
Method
Search
The search algorithm identified some studies that include both patients with schizophrenia and healthy controls, other studies that include patients with schizophrenia with no controls (or with different types of controls), and studies that include healthy controls compared to other types of patients (e.g. patients with autism, bipolar disorder, etc.).
We searched for relevant articles published before 15 July 2020 in three English-language databases (PubMed, Web of Science, and PsycINFO/EBSCO) and two Chinese-language databases (China National Knowledge Infrastructure [CNKI] and Wanfang). The search strategy of the title and abstract of documents included the following terms: (‘RMET’ or ‘Reading the Mind in the Eyes’ or ‘Reading the Mind in the Eye’) OR (‘schizophrenia’ AND ‘eye test’). The detailed search strategy for each database is shown in the online Supplementary materials. Reference lists of the papers meeting eligibility criteria were individually searched to identify additional studies.
Eligibility criteria
Original research studies using the 36-item version of RMET that report the crude RMET score (i.e. the number of correctly classified pictures) of patients with schizophrenia or healthy controls were included. Studies were excluded if the individuals with schizophrenia or healthy controls were under 18 or had a history of mental retardation, autism spectrum disorder, epilepsy, brain injury, brain disease, substance use disorder, or other mental disorders. To reduce the heterogeneity between the samples of individuals with schizophrenia included in the analysis, studies with samples that combined different psychotic disorders (for example, schizophrenia and schizoaffective disorder, delusional disorder or affective disorders with psychotic symptoms) were only included if they provided separate results for the subsample of individuals with schizophrenia (results for non-schizophrenia subsamples in these studies were not included in this review).
Selection of studies
Several reviewers (MAB, YRC, JT, XB, YC, JL, ZL, and QY) screened the titles and abstracts of studies identified in the electronic searches of the databases to decide whether they potentially met the eligibility criteria. Two independent reviewers had to agree on the classification of each article; disagreement was resolved by the senior author (FD). Full-text versions of the potentially eligible articles were then retrieved and independently reassessed by two reviewers (MAB, YRC, JT, XB, YC, JL, and ZL) to ensure that they met the inclusion criteria; disagreements about the final selection were resolved through discussion with the senior author (FD).
Data extraction
The following information about each selected article was entered in a pre-designed table:
• study characteristics (first author, title, journal, year of publication, and language of publication);
• type of study population(s) (patients with schizophrenia only, healthy subjects only, both patients with schizophrenia and healthy controls, or healthy controls compared to patients with other diagnoses);
• characteristics of the study population (country of test administration, source of participants, sampling method, inclusion or exclusion criteria of the study, diagnostic criteria employed to screen subjects, sample size);
• characteristics of included participants (gender, age, years of schooling, urban or rural residence, ethnicity, treatment status [of individuals with schizophrenia]);
• language of RMET test;
• method of administering RMET (interviewer-completed, paper and pencil self-completion, computer-based self-completion, or online self-completion);
• RMET test results (mean and s.d. of RMET scores and results of multivariate analyses if available) and
• (only from papers that include patients with schizophrenia and healthy controls) crude and adjusted results of comparing RMET scores between patients with schizophrenia and healthy controls.
Two independent reviewers (MAB, YRC, JT, XB, YC, and QY) extracted data for each included study; the senior author (FD) made a final determination in cases where the two reviewers disagreed.
Quality assessment
The quality assessment scale developed for this study included the 11 items listed in Table 1. The list combined adapted versions of items used in the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) statement (von Elm, Altman, Egger, Pocock, & Gøtzsche, 2008) with items based on the authors' experience administering the RMET. Each item was coded as ‘1’ or ‘0’ based on whether the paper fulfilled the criteria specified in the item. Thus, the theoretical range of the total quality score was 0 to 11. We categorized overall quality based on the total score: 0–4 = ‘poor’, 5–7 = ‘fair’, 8–11 = ‘good’. Two reviewers independently assessed the quality of each paper (MAB, YRC, JT, XB, YC, JL, and ZL); disagreements in any of the 11 item scores for each paper were resolved by the senior author (FD).
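The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `quality_category` and the example item codes are hypothetical, and the cut-offs are the ones stated in the text.

```python
def quality_category(item_codes):
    """Sum 11 binary item codes and map the total to poor/fair/good."""
    if len(item_codes) != 11 or any(c not in (0, 1) for c in item_codes):
        raise ValueError("expected 11 item codes, each coded 0 or 1")
    total = sum(item_codes)
    if total <= 4:
        label = "poor"    # 0-4
    elif total <= 7:
        label = "fair"    # 5-7
    else:
        label = "good"    # 8-11
    return total, label


# Hypothetical paper that fulfils 7 of the 11 items.
example = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1]
print(quality_category(example))
```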
a Number (percent) of the 198 studies included in the review that provide this information.
Analysis
The t test was used to compare the study quality score between study samples of patients with schizophrenia and healthy controls and between samples using different language versions (English v. non-English). The mean RMET score, that is, the mean number of correctly classified pictures in each group of respondents, was used as the outcome variable for each study. Both regular random-effects models and DerSimonian-Laird random-effects models were used to estimate the pooled RMET score separately in patients with schizophrenia and healthy controls. The DerSimonian-Laird random-effects model is particularly useful when pooling samples that have heterogeneous results (DerSimonian & Laird, 1986). The Z test was used to compare pooled estimates of RMET scores in patient samples and healthy control samples.
A random-effects model was used to compare the standardized mean difference (SMD) of RMET scores between individuals with schizophrenia and healthy controls in the studies that included both types of respondents because the effect size estimates were heterogeneous. In this analysis, effect sizes for each group were weighted using the inverse variance method. Q statistics, which follow a chi-square distribution, were used to assess standardized within-study differences. The heterogeneity of estimates across studies was assessed using I², which represents the proportion of the variance in the estimates due to heterogeneity (Higgins, Thompson, Deeks, & Altman, 2003). A funnel plot was used to evaluate potential publication bias, and Egger's test assessed small-study effects (Egger, Davey, Schneider, & Minder, 1997). We also used other methods to assess publication bias recommended by Carter, Schonbrodt, Gervais, and Hilgard (2019): trim-and-fill imputation, precision-effect test (PET), and precision-effect estimate with standard error (PEESE). Subgroup analysis evaluated the possible influence of the language of the administered RMET on the outcome.
Both univariate and multivariate meta-regression assessed the association of age, years of schooling, IQ, race, and language of administration with the RMET score in individuals with schizophrenia and healthy controls. The meta-regression equations were estimated using two different methods: restricted maximum likelihood (Viechtbauer, 2005) and bootstrap (Davison & Hinkley, 1997).
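As an illustration of the pooling step, the following Python sketch implements the DerSimonian-Laird estimator for a set of sample means. It is not the authors' code (the analysis was run in Stata); the function name and the input values are hypothetical, and the sampling variance of each mean is assumed to be s.d.²/n.

```python
import math


def dersimonian_laird(means, sds, ns):
    """Pool sample means with a DerSimonian-Laird random-effects model."""
    variances = [sd ** 2 / n for sd, n in zip(sds, ns)]  # variance of each mean
    w = [1.0 / v for v in variances]                     # fixed-effect weights
    fixed = sum(wi * m for wi, m in zip(w, means)) / sum(w)
    # Cochran's Q around the fixed-effect estimate.
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))
    df = len(means) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                        # between-sample variance
    w_star = [1.0 / (v + tau2) for v in variances]       # random-effects weights
    pooled = sum(wi * m for wi, m in zip(w_star, means)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical RMET means, standard deviations and sample sizes.
pooled, se, tau2 = dersimonian_laird([19.2, 21.5, 18.4], [4.8, 5.1, 4.3], [40, 55, 32])
print(f"pooled = {pooled:.2f}, 95% CI {pooled - 1.96 * se:.2f} to {pooled + 1.96 * se:.2f}")
```

When the samples agree closely, the between-sample variance is truncated at zero and the result reduces to the fixed-effect inverse-variance estimate.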
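The quantities named in this paragraph can be sketched as follows. The text does not state which SMD estimator was used, so the small-sample-corrected Hedges' g here is an assumption, as are all function names and the example inputs; the I² formula is the one given by Higgins et al.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (Hedges' g) and its approximate variance."""
    sp = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    j = 1 - 3 / (4 * (n1 + n2) - 9)          # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * var_d


def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the inverse-variance mean."""
    w = [1.0 / v for v in variances]
    pooled = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    return sum(wi * (e - pooled) ** 2 for wi, e in zip(w, effects))


def i_squared(q, k):
    """I-squared (percent) from Q and the number of studies k."""
    df = k - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0


# Hypothetical study: patients (mean 20, s.d. 5, n 50) v. controls (mean 25, s.d. 5, n 50).
g, var_g = hedges_g(20, 5, 50, 25, 5, 50)
print(round(g, 3), round(var_g, 4))
```

As a consistency check, `i_squared(92.5, 26)` and `i_squared(32.8, 20)` reproduce, to one decimal place, the I² values of 73.0% and 42.1% reported in the Results for the 26-study and 20-study analyses.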
The mean age in the 180 samples of healthy controls that provided age data covered a wide range (from 18.7 to 71.7 years old), making it feasible to conduct a meta-regression with spline construction of age to identify a potential non-monotonic relationship between age and the RMET score in both univariate and multivariate analyses. Each age from 25 to 45 was fitted as a candidate knot, and the model with the lowest AIC was considered the best-fitted model.
Data were analyzed using Stata version 17.0.
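The knot search can be sketched as below. This is a deliberately simplified stand-in, not the authors' procedure: it fits a one-knot linear spline by inverse-variance weighted least squares (no between-sample variance term and no restricted maximum likelihood) and computes the AIC from a Gaussian likelihood with known sampling variances. All function names and the synthetic data are hypothetical.

```python
import math


def solve(a, b):
    """Solve the small linear system a x = b (Gaussian elimination, pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_spline(ages, scores, variances, knot):
    """Return ([intercept, slope before knot, slope after knot], AIC)."""
    rows = [[1.0, min(a, knot), max(a - knot, 0.0)] for a in ages]
    w = [1.0 / v for v in variances]
    p = 3
    xtwx = [[sum(wi * r[i] * r[j] for wi, r in zip(w, rows)) for j in range(p)]
            for i in range(p)]
    xtwy = [sum(wi * r[i] * y for wi, r, y in zip(w, rows, scores)) for i in range(p)]
    beta = solve(xtwx, xtwy)
    neg2ll = 0.0
    for r, y, v in zip(rows, scores, variances):
        resid = y - sum(b * x for b, x in zip(beta, r))
        neg2ll += math.log(2 * math.pi * v) + resid ** 2 / v
    return beta, neg2ll + 2 * p


def best_knot(ages, scores, variances, knots=range(25, 46)):
    """Try every candidate knot and keep the fit with the lowest AIC."""
    fits = {k: fit_spline(ages, scores, variances, k) for k in knots}
    k_best = min(fits, key=lambda k: fits[k][1])
    return k_best, fits[k_best][0]


# Synthetic, noise-free sample means with a true change of slope at age 31.
ages = list(range(19, 71))
scores = [22 + 0.12 * min(a, 31) - 0.07 * max(a - 31, 0) for a in ages]
variances = [0.25] * len(ages)
knot, beta = best_knot(ages, scores, variances)
print(knot)
```

The spline is parameterized so that the two coefficients are the slopes before and after the knot, which matches how the results are reported (one β for each age segment).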
Registration
The protocol of this systematic review was registered on PROSPERO on 30 November 2020 before starting the title and abstract screening of the electronically identified studies (registration ID: CRD 42020216401).
Results
Selection of studies
As shown in Fig. 1, the titles and abstracts of 1886 articles identified in English-language databases and 157 articles identified in Chinese-language databases were screened to identify potentially eligible papers. Based on this preliminary screening by two independent reviewers, the kappa values for potential inclusion were 0.72 for English articles and 0.77 for Chinese articles. The full text of potentially eligible articles (556 in English and 61 in Chinese) was then reviewed by two independent reviewers; the kappa value for inclusion based on this final screening was 0.61 for English articles and 0.52 for Chinese articles. After screening the electronically identified articles and identifying additional articles from the reference lists of selected articles, 198 studies were included in the analysis, 5 in Chinese and 193 in English. These 198 studies included 41 separate samples of patients with schizophrenia (with a total of 1836 patients) and 197 separate samples of healthy controls (with a total of 23 976 individuals). Only 26 (13.1%) of the studies (with 1455 patients with schizophrenia and 1087 healthy controls) directly compared RMET results in individuals with schizophrenia and healthy controls. Among the 41 samples of patients with schizophrenia, 8 (19.5%) used the English-language version of RMET, 3 (7.3%) used the Chinese-language version, 29 (70.7%) used other language versions, and 1 (2.4%) used two language versions (English and Korean). Among the 197 samples of healthy controls, 75 (38.1%) used the English-language version of RMET, 7 (3.6%) used the Chinese-language version, 110 (55.8%) used other language versions of RMET, 1 (0.5%) used two language versions (English and Korean), and the language version used in 4 (2.0%) study samples was unknown. The detailed characteristics of these studies are shown in Table 2.
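For readers unfamiliar with the agreement statistic reported above, Cohen's kappa for two reviewers making include/exclude decisions can be computed as in this minimal sketch. The function name and the counts are hypothetical; the per-cell screening counts are not reported in the text.

```python
def cohens_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two raters making a binary include/exclude decision."""
    n = both_include + only_a + only_b + both_exclude
    observed = (both_include + both_exclude) / n       # raw agreement
    a_inc = (both_include + only_a) / n                # reviewer A inclusion rate
    b_inc = (both_include + only_b) / n                # reviewer B inclusion rate
    expected = a_inc * b_inc + (1 - a_inc) * (1 - b_inc)  # chance agreement
    return (observed - expected) / (1 - expected)


# Hypothetical screening table for 100 abstracts.
print(cohens_kappa(40, 10, 10, 40))
```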
HC, healthy controls; SCH, patients with schizophrenia; NA, not available.
a Quality score assessed by study authors based on 11 items listed in Table 1 (total score ranges from 0 to 11).
Quality evaluation
Among the 41 samples of patients with schizophrenia included in the 198 papers, one reported a mean RMET score without an accompanying standard deviation (or standard error), and five did not include data on the mean educational level of participants. Among the 197 samples of healthy controls included in the 198 papers, four reported mean RMET scores without an accompanying standard deviation, 17 did not include data on the mean age of participants, and 99 did not include data on the mean educational level of participants.
The items used to assess study quality are shown in Table 1, and the results of the quality assessment of the 198 included studies are shown in the last column of Table 2. The total quality score (theoretical range 0–11) varied from 2 to 10. The mean (s.d.) quality score of all papers was 5.9 (1.4); 28 (14.1%) papers were classified as ‘poor quality’ (score 0–4), 148 (74.7%) as ‘fair quality’ (score = 5–7), and 19 (9.6%) as ‘good quality’ (score = 8–11). Among the 11 separate items, only five items were present in more than 75% of studies (items 1, 2, 6, 9, and 11 shown in Table 1). Four items were absent in more than 75% of the studies: description of study setting (item 3), rationale for sample size (item 5), number of study drop-outs (item 8), and adjustment of RMET results (item 10).
When the quality score of each paper was assigned to every sample included in that paper, the overall mean (s.d.) quality score for the 238 samples was 6.0 (1.5); 32 (13.5%) were of poor quality, 173 (72.7%) of fair quality, and 33 (13.9%) of good quality. The mean quality score of the 41 samples of patients with schizophrenia was significantly higher than that of the 197 samples of healthy controls [6.7 (1.8) v. 5.9 (1.4); t = 3.41, p < 0.001]. The mean quality score in the 149 samples administered non-English versions of RMET was significantly higher than that of the 83 samples administered the English version of RMET [6.3 (1.4) v. 5.6 (1.5); t = 3.13, p = 0.002].
Pooled RMET scores of patients with schizophrenia and healthy controls
The pooled RMET scores in patients with schizophrenia and healthy controls are shown in Figs 2 and 3. Based on the results of 1823 patients reported in 40 separate study samples that provided both the mean and standard deviation of RMET scores, the pooled estimate for the RMET score in patients was 19.76 (95% CI 18.91–20.60). Based on the results of 23 619 healthy controls reported in 193 separate study samples that provided both the mean and standard deviation of RMET scores, the pooled RMET score in healthy controls was 25.53 (95% CI 25.19–25.86) – significantly higher than that in the patient samples (z = 12.41, p < 0.001).
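The z statistic above can be approximately reconstructed from the reported estimates and confidence intervals. The sketch below assumes that the intervals are symmetric, normal-based 95% intervals and that the two pooled estimates are independent; the function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Back out a standard error from a symmetric normal-based 95% CI."""
    return (upper - lower) / (2 * z_crit)


def z_for_difference(est1, se1, est2, se2):
    """z statistic for the difference between two independent estimates."""
    return (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)


se_patients = se_from_ci(18.91, 20.60)   # patients: 19.76 (18.91-20.60)
se_controls = se_from_ci(25.19, 25.86)   # controls: 25.53 (25.19-25.86)
z = z_for_difference(25.53, se_controls, 19.76, se_patients)
print(round(z, 2))
```

This gives a value of about 12.4, close to the reported z = 12.41; the small difference is what one would expect from rounding in the published estimates and intervals.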
Direct comparison of RMET results between patients with schizophrenia and healthy controls
Among the 26 studies that directly compared mean RMET scores of patients with schizophrenia and healthy controls, only one study (Scherzer, Achim, Leveille, Boisseau, & Stip, 2015) did not find a statistically significant difference between the two groups; all other studies reported significantly lower mean RMET scores in the patient group. As shown in Fig. 4A, the pooled standardized mean difference (SMD) for the 26 studies estimated by a random-effects meta-analysis model indicated that the RMET scores in patients with schizophrenia were 1.10 standard deviations lower than the RMET scores in healthy controls (z = −12.32, p < 0.001).
There was substantial heterogeneity in the estimated effect sizes of the 26 studies: the I² value was 73.0%, and the corresponding Q statistic value was 92.5 (p < 0.001). The funnel plot for the 26 studies (Fig. 5A) identifies the main reason for this heterogeneity; the plot is imbalanced because the six smallest studies (total sample sizes ranging from 37 to 60) have the six largest effect sizes. Thus, the potential for publication bias is high, a finding supported by the results of Egger's test (z = −4.53, p < 0.001). None of the statistical methods recommended to reduce the effect of publication bias due to the six outlier studies (trim-and-fill imputation, PET, and PEESE) effectively reduced the bias, so we conducted a sensitivity analysis by re-assessing the results after removing the data from the six studies. After removing these six outliers, the funnel plot for the remaining 20 studies was balanced (Fig. 5B); the pooled standardized mean difference was reduced but still statistically significant (SMD = −0.89; z = −13.81, p < 0.001); the I² value was reduced to 42.1%; and the corresponding Q statistic value was 32.8 (p = 0.03) (Fig. 4B).
Among the 26 studies that directly compared patients with schizophrenia to healthy controls, five studies used the original English version of RMET (Baron-Cohen et al., 2001), one study used the English version in half of the participants and a Korean version in the other half, and 20 studies used translated versions of RMET (Turkish, Hungarian, Italian, and Spanish were each used in three papers; Thai was used in two papers; and Chinese, French, German, Japanese, Lebanese, and Polish were each used in a single paper). Based on the stratified analyses (Fig. 4C), the pooled SMD was greater in the 20 studies using non-English versions (SMD = −1.16, z = 11.22, p < 0.001) than in the five studies using the English version (SMD = −0.84, z = 3.28, p = 0.001) and heterogeneity was greater in studies using the English version (I² = 74.6%, p < 0.001) than in studies using non-English versions (I² = 67.9%, p < 0.001). The SMD was not significantly different between these language-based subgroups when all 25 study samples were included in the analysis (Chi[Q] = 2.48, p = 0.12). However, after excluding the six small-sample outlier studies (Fig. 4D), the SMD in the remaining 15 non-English RMET studies was significantly greater than the SMD in the remaining four English RMET studies (−0.95 v. −0.64, Chi[Q] = 8.54, p < 0.001), but the four remaining studies using the English version were less heterogeneous than the 15 remaining studies that used non-English versions (I² = 0.0% in the four English RMET studies, and I² = 37.9% in the 15 non-English RMET studies).
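The idea behind Egger's test can be illustrated with the classic regression formulation: each study's standardized effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero indicates funnel-plot asymmetry. The sketch below uses ordinary least squares and returns a t statistic; it is not necessarily the variant that produced the z value reported above, and the function name and data are hypothetical.

```python
import math


def egger_test(effects, ses):
    """Regress effect/SE on 1/SE; return the intercept and its t statistic.

    Needs at least three studies whose points do not lie exactly on a line.
    """
    x = [1.0 / s for s in ses]                   # precision
    y = [e / s for e, s in zip(effects, ses)]    # standardized effect
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, intercept / se_intercept


# Hypothetical SMDs in which the least precise studies show the largest deficits.
effects = [-1.60, -1.40, -1.10, -0.90, -0.85]
ses = [0.35, 0.30, 0.20, 0.12, 0.10]
intercept, t_stat = egger_test(effects, ses)
print(round(intercept, 2), round(t_stat, 2))
```

The t statistic is compared with a t distribution on n − 2 degrees of freedom, where n is the number of studies.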
Meta-regression on the covariates
There were 36 studies with 40 distinct samples of individuals with schizophrenia (combined sample size = 1823) that provided both the mean age of the sample and the mean and standard deviation of the RMET scores; 29 of these studies included 35 distinct samples with schizophrenia (combined sample size = 1620) that also provided the mean years of schooling of the sample. These data made it possible to conduct three separate regression analyses that included age, schooling, and both age and schooling as independent variables. Each regression equation was estimated using two methods: restricted maximum likelihood and the bootstrap method. As shown in Table 3, when the regression only had age as an independent variable (Model 1, Fig. 6A), the RMET score decreased with increasing age, but this decreasing trend was not statistically significant (β = −0.045, p = 0.516). When the regression only included years of schooling as an independent variable (Model 2, Fig. 6B), the RMET score increased with increasing years of schooling, but this increasing trend was not statistically significant (β = 0.399, p = 0.149). Multivariate meta-regression using both mean age and mean years of schooling as independent variables (Model 3) also showed the negative relationship between RMET score and age (β = −0.032, p = 0.635) and the positive relationship between RMET score and years of schooling (β = 0.418, p = 0.140) in patients with schizophrenia, but neither of these associations was statistically significant. The results using the two estimation methods were quite similar, but the p values for the coefficients related to years of schooling are substantially smaller when using the bootstrap method.
* p-values printed in bold indicate that the result is statistically significant.
A parallel meta-regression analysis of healthy control subjects used the results from 180 distinct samples (combined sample size = 21 494) that included data on the mean age of respondents; 98 of these samples (combined sample size = 7946) also included data on the mean years of schooling of respondents. In these analyses, the regression that only included age as an independent variable (Model 1, Fig. 6C) identified a statistically significant decrease in RMET scores with increasing age (β = −0.031, p = 0.020); the regression that only included years of schooling as an independent variable (Model 2, Fig. 6D) found a statistically significant increase in RMET scores with increasing years of schooling (β = 0.477, p < 0.001); and the multivariate meta-regression that included both age and years of schooling as independent variables (Model 3) found that increasing years of schooling remained significantly associated with increasing RMET scores (β = 0.423, p < 0.001), but the relationship of increasing age with decreasing RMET scores was no longer statistically significant (β = −0.026, p = 0.126). In this case, the only difference in the two estimation methods was a smaller p value for age in Model 3.
The differences in the association of age and education with RMET scores between the patient samples and healthy control samples may be related to the number of distinct samples available for the different analyses. For example, in the regressions using age as an independent variable, the coefficient for the 40 patient samples was substantially greater in magnitude than that for the 180 healthy control samples (β = −0.045 v. β = −0.031), but the relationship of decreasing RMET scores with increasing age in the healthy control samples was statistically significant, whereas that in the patient samples was not. Similarly, in the multivariate meta-regression analysis, the coefficient for the adjusted relationship of years of schooling in the 35 patient samples (β = 0.418) is essentially identical to that for the 98 healthy control samples (β = 0.423). However, the relationship of increasing RMET scores with increasing years of schooling is not statistically significant for the patient groups (p = 0.140), while it is statistically significant for the healthy control groups (p < 0.001).
In the multivariate meta-regression, the larger negative coefficient for age in the patient samples compared to that in the healthy control samples (β = −0.032 v. β = −0.026) suggests that after adjusting for years of schooling, the annual rate of decline in social cognitive functioning (as assessed by RMET) in patients with schizophrenia is 23% ([0.032 − 0.026]/0.026 ≈ 0.23) faster than that in healthy controls.
We also considered IQ and race (Caucasian v. other) as potential covariates. However, only 6 of the 41 studies with patient samples provided IQ, and only 7 of the studies provided data on race, so it was not feasible to conduct a meta-regression in the patient samples. There were, however, 26 studies with samples of healthy controls that provided IQ (19 of which also provided data on years of schooling) and 21 studies with samples of healthy controls that provided data on race (9 of which also provided data on years of schooling). In the univariate meta-regression of the RMET score and IQ, IQ had a non-significant positive association with the RMET score (β = 0.046, p = 0.413); in the multivariate meta-regression (RMET scores v. IQ and years of schooling), the positive association of RMET with years of schooling was statistically significant (β = 0.492, p = 0.045) while that with IQ remained non-significant (β = −0.066, p = 0.319). In the univariate meta-regression of RMET score and race, the proportion of Caucasians in the sample had a non-significant negative association with the RMET score (β = −0.152, p = 0.907) while in the multivariate meta-regression (RMET scores v. race and years of schooling) the positive association of RMET scores with years of schooling was no longer statistically significant (β = 1.199, p = 0.064) and the proportion of Caucasian subjects in the sample had a non-significant positive association with RMET scores (β = 3.296, p = 0.151).
Assessment of non-monotonic relationship between age and RMET score in healthy controls
The mean age of individuals in the 180 samples of healthy controls that included data on age ranged from 18.7 to 71.7, making it possible to assess a potential non-linear relationship of age with RMET scores using linear regression with spline construction. Assessing potential knots from 25 to 45 years of age, we identified 31 years of age as the point of inflection (i.e. the knot with the lowest AIC) for both univariate regression (only including age, AIC = 792.2) and multivariate analysis (including age and years of schooling, AIC = 422.7). As shown in Table 4 and Fig. 7, in the univariate analysis, the RMET score increased with age before age 31 (β = 0.123, p = 0.008) and declined with age after age 31 (β = −0.074, p < 0.001). In the multivariate model (Table 4), after adjusting for years of schooling (which was significantly associated with RMET score), the RMET showed a significant increase with age before age 31 (β = 0.179, p = 0.048) and a statistically significant decline with age after age 31 (β = −0.048, p = 0.011).
* p-values printed in bold indicate that the result is statistically significant.
Discussion
This review identified 198 studies that used RMET to assess social cognition in 41 separate samples of patients with schizophrenia and 197 separate samples of healthy controls. The pooled mean RMET score of the 1823 patients and 23 619 healthy controls included in these studies was much lower in patients than in healthy controls (19.8 [18.9–20.6] v. 25.5 [25.2–25.9], z = 12.41, p < 0.001). Meta-analysis of the results of 26 studies that directly compared RMET scores in patients with schizophrenia and healthy controls found that the pooled mean of patients' scores was more than one standard deviation lower than the pooled mean score of healthy controls. Significant publication bias was identified among these studies (studies with smaller sample sizes were more likely to report larger SMDs between the two groups), but the differences between groups remained significant after removing the six outlier studies with potential publication bias. These results confirm previous findings that patients with schizophrenia have substantial deficits in theory of mind.
Subgroup analyses indicated that after excluding the outlier studies the difference in RMET performance between patients with schizophrenia and healthy controls was greater in studies using non-English versions of RMET than in those using the original English version (Chi [Q] = 8.54, p < 0.001). The reasons for this difference are unclear. All of the studies used the same sets of pictures (with Caucasian subjects), so it is likely (though not certain) that some of the respondents administered non-English versions of RMET were less racially and ethnically similar to the individuals in the stimulus pictures than respondents administered the English version of RMET. The difficulty patients have in identifying emotions in the RMET may be magnified when presented with pictures of persons with an ethnicity different from their own, resulting in a greater assessed deficit compared to healthy controls in studies that use non-English versions of RMET. One previous study reporting that children perform better when recognizing the emotions of their own-race faces than other-race faces (Segal, Reyes, Moulson, & Gobin, 2019) supports this hypothesis. Further research with RMET using non-Caucasian pictures is needed to clarify this issue.
The results for patients and healthy controls were quite heterogeneous, so we used meta-regression methods to explore the relationship between mean RMET performance, mean age, and mean level of education in patient samples and, separately, in healthy control samples. In the univariate analyses, age was negatively related to the RMET score and educational level was positively related to the RMET score in both the patient samples and the healthy control samples, but the results were only statistically significant for the healthy control samples, possibly because of the much smaller number of patient samples available for analysis. A separate meta-regression with spline construction in the healthy control samples found that RMET scores increased with age before age 31 and decreased with age after age 31. (The much smaller number of samples of patients with schizophrenia and the smaller range in the mean age of these samples made it infeasible to conduct a spline construction meta-regression using the patient samples.) These relationships persisted in the multivariate analysis (including age and years of schooling as covariates), though the effect of age was attenuated after adjustment for years of schooling.
Previous findings about the relationship between age and RMET scores have been inconsistent. Dodell-Feder, Ressler, and Germine (2020) used online interviews to assess RMET in 40 248 participants 10–70 years of age and found that RMET scores increased with age up until 65. Cabinio et al. (2015) reported unchanging RMET scores in healthy respondents aged 20–70. Two cross-sectional studies (Jankowiak-Siuda et al., 2016; Slessor, Phillips, & Bull, 2007) comparing RMET performance in persons over 65 to that of persons under 35 found that the older participants had significantly lower RMET scores. Finally, Pardini and Nichelli (2009), Deng et al. (2022), and Lee, Nam, and Hur (2020) reported that RMET performance started to decline in the fifth decade of life, at age 60 and age 66, respectively. Several hypotheses have been proposed to explain increasing deficits in theory of mind with aging. Slessor et al. (2007) suggested that deficits in theory of mind are manifestations of general impairment in the ability to decode cues. Some researchers suggest that the decline of theory of mind is mediated by impairment in other cognitive domains, such as executive function, information processing speed (Charlton, Barrick, Markus, & Morris, 2009), destination memory (El Haj, Raffard, & Gély-Nargeot, 2016), and verbal intelligence (Slessor et al., 2007).
Furthermore, neuroimaging studies report that declines in RMET score with aging are correlated with decreasing volume in the bilateral precentral gyrus, bilateral posterior insula, left superior temporal gyrus, and left inferior frontal gyrus (Cabinio et al., 2015). Our systematic review of 198 studies that administered RMET to 180 separate samples of healthy subjects is the first study to identify a non-monotonic relationship between RMET score and age, suggesting that individuals accumulate knowledge and skills of theory of mind until they reach early middle age (31 years of age), and then their theory of mind performance gradually declines with normal aging. This raises the possibility that the neurodevelopmental trajectory of social cognition is more prolonged than that of other types of cognition (i.e. continuing to develop as the individual's social world expands during adolescence and young adulthood) and, thus, can be disrupted at later ages by serious mental illnesses like schizophrenia.
In this review we found that the association of years of schooling with RMET scores was more robust than the association of age with RMET scores, but there has been much less research about the role of education in the development of theory of mind. Khorashad et al. (2018) found no significant relationship between RMET score and educational attainment, while other studies (Deng et al., 2022; Dodell-Feder et al., 2020; Schimit & Zachariae, 2009) found that years of schooling can explain some variance in the RMET score.
Familiarity with the four terms provided as potential response choices for each presented picture in the RMET is, presumably, a prerequisite for making the correct selection. It is reasonable to expect that persons with lower levels of education will have lower verbal intelligence and, thus, have greater difficulty achieving a high RMET score because they are less familiar with the presented terms. Moreover, the relative difficulty of the terms associated with each picture and the distinctiveness of the meanings of the four presented terms will vary across languages, so it is likely that the association of education level with the total RMET score (and with the pattern of incorrect RMET items) will vary for different language versions of the RMET. Assessment of the difficulty of the response terms in each language (e.g. their frequency of use in daily speech) and comparison of RMET scores with measures of verbal intelligence will be needed to (1) decide on the minimum education level appropriate for administering the RMET; (2) develop a method of adjusting RMET scores based on education level or vocabulary skill; and (3) develop alternative versions of RMET suitable for persons with little formal education.
Limitations
There are several potential limitations. (1) We only searched for studies published in English or Chinese, so the analyses did not include studies published in other languages or unpublished studies. (2) Some samples in the papers did not include data about key variables needed in the analysis (i.e. the standard deviation of mean RMET score, age of the sample, educational level of the sample) and some other studies were of low methodological quality. (3) Only 26 of the 198 studies directly compared RMET results of patients with schizophrenia and healthy controls, limiting our ability to conduct meta-analyses of results. (4) Most samples of patients with schizophrenia were chronic patients regularly using antipsychotic medications, so their deficits in theory of mind may not be representative of those in all individuals with schizophrenia. (5) The range in the mean age and mean years of education of the 40 samples of patients was relatively narrow, making it difficult to accurately assess the potential relation of age and education with RMET scores in the patients. (6) The distribution of the mean age of the 180 separate samples of healthy controls was imbalanced (the mean age of 88% of the samples was below 50), which potentially biased the assessment of the inflection point (at 32 years of age) in the meta-regression spline construction analysis. (7) Few studies reported other covariates of interest, including the race, vocabulary level, and IQ of participants; this made it difficult to explore the potential relationship of these variables with RMET performance in persons with schizophrenia.
Conclusion
This is the first systematic review and meta-analysis of studies using the RMET to assess social cognitive functioning among individuals with schizophrenia. Meta-analyses of data from 198 identified studies confirm previous single-study findings that patients with schizophrenia experience severe impairments in theory of mind and, thus, support the construct validity of RMET. The consistency of these findings in multiple languages and several countries suggests that RMET may be a more cross-culturally valid measure of social cognition than other measures of social cognition like the MSCEIT that depend on respondents' interpretation of social scenarios or vignettes. RMET scores decrease with age and increase with years of schooling in both patients and healthy controls, though these relationships were only statistically significant in the healthy control samples, possibly due to the much smaller number of patient samples available for analysis. The unexpectedly more significant differences between patients and controls when using non-English versions of the RMET than when using the original English version suggest that linguistic, racial, ethnic, and cultural differences also need to be considered when interpreting the results of the RMET. The assessed quality of most of the reports (based on a revised version of the STROBE reporting guidelines) was ‘fair’, and, interestingly, the quality of reports of studies using non-English versions of RMET was higher than that of studies using the original English version. In the multivariate meta-analysis of healthy control samples that included both age and years of schooling as covariates, years of schooling remained significantly associated with RMET scores, but age was no longer significantly associated with RMET scores. We also found a previously unreported non-monotonic relationship between age and RMET performance in healthy controls: the RMET score increased with age before age 31 and decreased with age after age 31.
These findings highlight the need to clarify the relationships between age, education, verbal intelligence, and social cognition; they also suggest the need for a more nuanced assessment of the neurodevelopment of theory of mind – which may differ from the neurodevelopment of other cognitive abilities.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291723003501
Author contributions
FD designed the study, coordinated the collection and analysis of the data, and wrote the initial draft of the manuscript. MAB, YRC, JT, XB, YC, JL, ZL, and QY screened articles and extracted data from selected articles. MQ provided advice about data analysis. MRP, LHY, and WSS provided technical support throughout all the steps of the study. MRP made detailed revisions to the manuscript.
Funding statement
This study was supported by a National Institute of Mental Health grant for the project ‘Characterizing Cognition Across the Lifespan in Untreated Psychosis in China’ (PIs: Yang, Phillips, Keshavan; number, MH108385 R01), the Shanghai Mental Health Qi Hang Project (PI, Deng; number, 2018-QH-06), and the Shanghai Mental Health Project (PI, Deng; number, 2021-YJG05). None of these funding agencies had any role in the design or conduct of the study; in the collection, management, analysis, or interpretation of the data; in the preparation, review, or approval of the manuscript; or in the decision to submit the manuscript for publication.
Competing interest
None.