Introduction
This article examines some key issues around data collection and data analysis for EDI (Equality, Diversity and Inclusion) in the context of mathematics higher education and discusses implications for evidence-based EDI policy. EDI questions have recently taken on increasing importance within the mathematics higher education community and have generated much debate in the UK. For example, in March 2023, the Quality Assurance Agency for Higher Education in the UK released a benchmark statement (QAA 2023) for mathematics, statistics and operations research which stated that ‘values of EDI should permeate the curriculum and every aspect of the learning experience’. This statement generated an open letter signed by over 50 leading mathematicians asserting that the ‘values of EDI are a topic of fierce academic and political debate’ and criticizing the QAA for politicizing the mathematical curriculum (Clarence-Smith 2023). A national Academy for the Mathematical Sciences is currently being established in the UK to act as the coordinating focal point for the community of mathematics educators, researchers and scientists, following a key recommendation of the 2018 Bond Review, The Era of Mathematics (Bond 2018). Initial consultations (Academy for the Mathematical Sciences Consultation Document 2023) during the proto-academy phase of the new Academy had identified that EDI should be ‘consciously placed at the heart of the new Academy’. Following an extended consultation, the executive committee of the Academy acknowledged in December 2023 the concerns about politicizing EDI and declared that it will follow a data- and evidence-based approach to EDI in mathematics (Academy for the Mathematical Sciences 2023).
There has been a growing consensus within the higher education sector in the UK that good EDI policies should involve data, as can be seen from the Engineering and Physical Sciences Research Council EDI action plan (EPSRC 2022), the Advance HE ‘Equality Audits’ (Advance HE Bespoke Equality Research 2023) and the above-mentioned plans of the Academy for the Mathematical Sciences. However, as a recent UKRI review (Guyan and Oloyede 2019) shows, existing data collection for EDI in higher education is largely centred on gathering data on diversity characteristics for various outcomes and recording various EDI interventions. Data showing discrepancies in outcomes are often treated as a proxy for evidence of unequal opportunity. While discrepancies suggest that there may be a problem, they do not tell us what the problem is. If policymakers fail to look at the evidence and see where the problem lies, they could potentially harm the groups that they are seeking to help, by misdiagnosing the problem and thus failing to tackle it adequately. So far, there has been little work on how one may gather reliable data that directly capture the experiences of disadvantage encountered by academic mathematicians, or on how such data could inform effective EDI policy.
By delving into these questions, this article provides a timely contribution to the ongoing debate about EDI within the UK mathematics higher education community. While the article focuses on mathematics and is largely written in the context of the UK, much of the analysis should also be of interest to researchers and educators in other countries and in other disciplines.
Gathering Data on the Mathematics Pipeline
It would be useful to gather more quantitative data on the diversity characteristics of mathematicians and mathematical scientists at various points along the pipeline from school to university to employment. Such quantitative data should be viewed as the beginning of a story, not as a call for action. Nevertheless, they can serve as starting points for further investigation in several ways.
First, the data can be compared with other fields beyond mathematics, with underlying census and education participation data, and with existing international data in order to understand whether (and to what extent) the diversity statistics of the UK mathematics pipeline deviate from these. This would be helpful in pinning down which areas need further investigation. However, on their own, such comparisons cannot tell us whether any observed under- or over-representation of some group – relative to some baseline, itself possibly arbitrary – whether in the entire mathematical community or at a particular career stage, is the result of a problem that needs to be addressed.
Second, by combining these data with targeted surveys designed to find out why people leave mathematics, we can begin to identify any systemic problems and quantify experiences of disadvantage or discrimination. These latitudinal (cross-sectional) surveys would ideally be combined with longitudinal studies on focus groups. This part of the data-gathering process should combine qualitative and quantitative elements: the qualitative answers obtained should be grouped into clear categories (incorporating sets of common reasons why people leave mathematics at various stages), which would ultimately lead to a database of rich quantitative data.
An example of how such longer-term focus-group studies can work is provided by the well-known longitudinal studies (Lubinski and Benbow 2006; Lubinski et al. 2014) performed on the SMPY (Study of Mathematically Precocious Youth) cohorts. These studies offer some of the best understanding we currently have of mathematically highly able people and the reasons for their career choices and outcomes. Indeed, SMPY’s Cohort 3 is the largest database of highly gifted persons ever assembled for systematic longitudinal study. Each cohort was studied longitudinally over several decades. A list of standard questions was used to measure their life and work priorities, their attraction to STEM and non-STEM subjects and careers, their career outcomes and the reasons for various choices. Moreover, these cohorts were given some standard instruments for measuring subjective well-being. The results, documented in the above-cited papers, shed important light on the lifestyle preferences and priorities, time allocation, and career trajectories of mathematically gifted boys and girls over a period exceeding four decades.
While the SMPY cohorts comprised young people with very high mathematical abilities, similar studies could be done with representative cohorts of older people who studied mathematics at university but ultimately left the field at some stage. Initial qualitative surveys on smaller groups may help in formulating the standard questions to use in the larger longitudinal studies.
In addition to longitudinal studies on focus groups, researchers could perform periodic latitudinal studies to gather one-time data on much larger samples of mathematicians at various career stages. The questions here could measure, among other things, well-being and job satisfaction, experiences of disadvantage or discrimination, and the reasons for career choices made. Only some of the questions and answers will have relevance to EDI concerns, but it is critical to provide a full range of questions and possible answers to all participants to avoid skewing the data. If the survey design is too restrictive or the answer options too limited, it may be difficult for respondents to give answers that accurately reflect their experience or opinions, leading to unreliable data.
The surveys proposed above can only measure perceived discrimination or disadvantage, and perceptions might not always be accurate. However, by gathering data on such perceptions in relation to various factors and across various groups of people, researchers can gain a better understanding of what may be holding certain groups back, where or how one should intervene, and what further studies may be necessary.
In order to gather reliable and representative data on experiences of disadvantage or discrimination, the answers from the above surveys should be grouped into meaningful categories. The data in relevant categories should be disaggregated by the different protected characteristics and compared with meaningful baseline data, such as census data and education participation data, to gather evidence for the hypothesis that members of a particular group are more likely to perceive themselves as disadvantaged in a specific way. As a simple example to illustrate this, suppose that in a sample of 100 surveyed working mathematicians, 30 feel disadvantaged due to caring responsibilities. By disaggregating the data by sex, one could find out whether this is experienced disproportionately by women. On the other hand, by comparing these figures with national data on the perceptions of carers in employment (Carers UK 2023), one may gain insight into how the experiences of mathematicians compare with those of the general population.
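To make the example concrete, the short sketch below (in Python) illustrates how such a disaggregation and baseline comparison might be carried out. It is a minimal illustration only: the counts, the split by sex and the national baseline figure are all invented for the purpose of the example and do not come from any real survey.

# Illustrative only: all counts and the baseline figure below are invented.
from scipy.stats import chi2_contingency, binomtest

# Hypothetical survey of 100 working mathematicians:
# rows = sex, columns = (feels disadvantaged by caring responsibilities, does not)
table = [
    [22, 23],   # women: 22 of 45 report the disadvantage
    [8, 47],    # men:    8 of 55 report the disadvantage
]

# Test whether reporting the disadvantage is independent of sex.
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")

# Compare the overall rate (30/100) with a hypothetical national baseline,
# e.g. the proportion of carers in employment reporting similar disadvantage.
national_baseline = 0.20   # assumed figure, for illustration only
result = binomtest(30, 100, national_baseline)
print(f"sample rate 0.30 vs assumed baseline {national_baseline}: p = {result.pvalue:.3f}")

In practice, the choice of categories, baselines and statistical tests would need to be made by the social scientists and statisticians involved; the point of the sketch is simply that both the within-sample disaggregation and the external comparison are straightforward once the data are grouped into meaningful categories.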
As a final point, many mathematicians are experts in statistical analysis but not in social statistics or behavioural science. Mathematics organizations that aim to effectively perform the above studies would need to work with behavioural scientists (psychologists, cognitive scientists, behavioural neuroscientists) as well as quantitative social scientists. Moreover, it is crucial that the raw data obtained through the above studies be published and shared openly with other scientists to carry out their own analysis. I elaborate on some of this in the next two sections.
Harmonization of Questions and Open Publication of Data are Key
The questions used in the surveys outlined in the previous section should be standardized as far as possible and harmonized with standard questions used elsewhere and across surveys conducted in different years. Without such harmonization, the data risk being misleading. In addition, harmonization and standardization of questions will help researchers compare the data with fields beyond mathematics and with meaningful baseline data. If the task of gathering data is delegated to individual institutions, it is crucial that they are required to harmonize survey questions across institutions. While all this may be obvious, it has not always been the practice: Athena Swan has recently produced standard staff survey questions, which is a considerable improvement over the previous free-for-all. An example of the kind of problem that can result from non-standard questions is seen in the Athena Swan submission of the Bartlett at UCL. As noted by Armstrong and Sullivan (2023), there was evidence in their survey that staff did not feel confident reporting sexual harassment, but this was not flagged, because the question was non-standard and the analysis could be buried in an Athena Swan report rather than quickly viewed on a website. Further examples, where questions were changed from year to year, can be seen in cases where institutions followed an Advance HE recommendation (active from 2016 to 2021) to conflate gender identity and sex in their surveys, leading to misleading data and preventing institutions from fulfilling their public sector equality duty to monitor and publish data on the protected characteristic of sex (Sullivan and Armstrong 2021; Armstrong and Sullivan 2023).
It is also crucial that the data gathered are published openly. This will allow other researchers to use the data to perform their own analysis, which can only lead to better scientific understanding. Restricting use of the data to internal studies runs several risks, including poor or one-sided analysis, policy capture, and groupthink. Public data sharing is likely to lead to a greater diversity of approaches to solving EDI problems and to more innovation. As mathematicians and mathematically highly able people form a small fraction of the population, the data obtained by latitudinal and longitudinal surveys of mathematicians will be particularly valuable, making it all the more important to share them widely. Such data may also contribute one day to a widely accepted and comprehensive multi-disciplinary theory of differences across demographic groups that goes beyond EDI concerns and helps us better understand human behaviour and human diversity. It is unfortunate that, at present, even basic data are hard to come by. As an anecdotal item of evidence, I was informed by Dr John Armstrong (Reader in Financial Mathematics at King’s College London) that when he tried to find out whether LGBT staff are under- or over-represented in science, he first made a Freedom of Information request to every Russell Group university in the UK, and then learned that the Higher Education Statistics Agency holds relevant data which it does not advertise but which can be accessed on request – at a cost, in his case, of £600. It ought to be possible simply to look up this kind of data.
A Warning on Interpreting Data (and the Importance of Academic Freedom in Data Analysis)
It is important to be cautious in the interpretation of data. It should not be automatically assumed that every observed disparity in outcomes is the result of discrimination or structural disadvantage. For example, consider the well-known fact that women are under-represented in mathematically intensive fields. Popular explanations for this disparity often focus on sexism and other social factors, such as a lack of female role models, the burden of child-rearing, stereotypes, barriers in the working environment and a masculine research culture (Moss-Racusin et al. 2012; Pártay et al. 2023). However, evidence in the behavioural sciences suggests that sexism and socialization are only part of the story (Stewart-Williams and Halsey 2021). As the social factors mentioned above are commonly studied and discussed in EDI circles, I would like to highlight some of the other possible factors that have been proposed.
There are well-documented personality differences, on average, between males and females – especially along the ‘interest in people vs interest in things’ dimension; indeed, these differences are among the most robust and replicable effects in all of psychology (Su et al. 2009; Del Giudice et al. 2012; Archer 2019). Many of these sex differences in personality appear to persist from infancy (Connellan et al. 2000; McClure 2000). In addition, there are well-documented differences in breadth of interests between males and females (for example, females with strong mathematical interests tend, on average, to be also highly interested in verbal domains, while males typically have a more asymmetric profile), and the psychologists Ceci, Valla and their colleagues have put these ideas together to develop a theoretical explanation of sex differences in attraction to STEM subjects, known as the breadth-based model (Valla and Ceci 2014). Studies have also shown average sex differences in neurocognitive profiles (Halpern 2013) and ability tilts (Wai et al. 2018). It should be emphasized that all the differences noted above relate to statistical features of the distributions and do not imply binary divisions – in fact, members of both sexes vary greatly in interests and cognitive profiles and the distributions for the sexes overlap almost completely (Hyde 2005). The question of how various differences in personality, interests and neurocognitive profiles correlate with differences in educational and vocational choices and outcomes is an active area of study (Ceci and Williams 2009; Pinker 2009; Valla and Ceci 2014; Stewart-Williams and Halsey 2021). Furthermore, on certain traits there is no mean difference between males and females but a modest difference in variability, with males overrepresented at both the low and the high end of these distributions (Halpern et al. 2007; Stevens and Haidt 2017).
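To illustrate the purely statistical point in the last sentence, the sketch below computes tail ratios for two normal distributions with equal means but modestly different standard deviations. The parameters are arbitrary and chosen only to show the arithmetic; they are not estimates of any real trait or population.

# Purely illustrative: parameters are arbitrary, not estimates of any real trait.
from scipy.stats import norm

mean = 0.0
sd_a, sd_b = 1.0, 1.1   # equal means, modestly different variability (assumed)

for cutoff in (2.0, 3.0):   # thresholds in units of the smaller standard deviation
    p_a = norm.sf(cutoff, loc=mean, scale=sd_a)   # proportion of group A above the cutoff
    p_b = norm.sf(cutoff, loc=mean, scale=sd_b)   # proportion of group B above the cutoff
    print(f"cutoff {cutoff}: ratio of group B to group A above cutoff = {p_b / p_a:.2f}")

The arithmetic shows that even with identical means, a small difference in variability produces increasingly unequal representation the further one moves into either tail, which is why tail-heavy selection processes can show disparities without any mean difference.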
The underlying reasons behind the sex differences highlighted in the previous paragraph continue to form an area of intense study and debate (Fine 2017; Stewart-Williams 2018; Rippon 2019; Murray 2020; Stewart-Williams and Halsey 2021). Some purely environmental explanations (e.g., stereotype threat) for these differences have been cast into doubt (Flore and Wicherts 2015). There have been critiques of biological explanations of sex differences, as well as progress in understanding the organizational role of prenatal sex hormones in shaping the human brain and mind. To get an idea of some of the arguments made by each side, it is worth reading two articles side by side: a critique by Fine et al. (2019) and the response to it by Del Giudice et al. (2019).
I have focused on sex above as an illustrative example, partly because of the volume of studies that exist. However, I hope it is clear from the above discussion that there is no real consensus among scientists about the relative importance of various factors or the overarching reasons behind observed sex differences in STEM fields. This is an area that is still very much at the frontiers of scholarship. Moreover, our knowledge is even more incomplete for differences observed for other protected characteristics, where the data are often harder to parse. We should indeed collect more data on the experiences, life choices and career outcomes of people who study or work in mathematical fields: such data are valuable and, if shared openly with other scientists, may contribute one day to a widely accepted and comprehensive multi-disciplinary theory of differences across demographic groups.
However, policymakers, university departments or EDI committees in the higher education sector rarely have any expertise in the behavioural sciences and, given our current state of understanding, they should be careful not to embrace one particular grand theory or another to explain under-representation of some group.
Moreover, as the example above illustrates, topics related to equality and diversity are politically controversial. The tension between the Stonewall Workplace Equality Index and academic freedom is well documented (Armstrong and Sullivan 2023), and the Race Equality Charter has led institutions to focus on ‘decolonizing’ the mathematics curriculum (Ogundimu and de Korte 2023), which can prevent academics from teaching their subjects according to their best professional judgement.
In view of this, it is paramount to affirm the importance of academic freedom in the analysis of the data gathered through the surveys proposed in the first section, and to guard against ideological bias. Academic freedom has been the foundation of progress in human knowledge for centuries. If academics and students cannot challenge the status quo, if we cannot express controversial opinions or question the dominant orthodoxy, then our understanding of the world will stall and intellectual life on campus will be diminished.
Researchers should be free to analyse the EDI data generated by latitudinal and longitudinal studies without being constrained by institutional pressures or policies. Viewpoint diversity is important here to avoid groupthink. Much EDI thinking in the recent past has been influenced by postmodernist frameworks based on critical social justice, which reject merit and objective reality in favour of lived narratives and view science as a tool for power (Abbot et al. 2023); there is a real need now to involve behavioural scientists and quantitative social scientists. The pursuit of truth and knowledge also demands that findings that are unwelcome for theoretical or ideological reasons should not be censored. There is no magic bullet to ensure this, but open data sharing, engagement with a variety of researchers, encouraging viewpoint diversity, and having robust academic freedom protections should all help.
Unfortunately, it is all too common for studies that go against the EDI orthodoxy to be suppressed (e.g., not sent for peer review) or downplayed by researchers. For two compelling examples of this, see Clark and Hatfield (2003) and Hill (2018). Krylov and Tanzman (2023) have described various ways in which research studies running counter to the narrative of critical social justice are censored and suppressed by publishers. Del Giudice et al. (2019) call this kind of suppression of studies for ideological or political reasons the ‘reverse bias’ and caution that if a ‘certain effect is never published or discussed in the literature, it may go completely unrecognized for some time’. As they correctly note, when academics themselves argue that certain scientific claims can be dangerous and socially harmful, it creates obvious incentives for ideological suppression of research findings. Scientists’ political views are largely unrepresentative of those of the general population (Honeycutt and Freberg 2017). Williams and Ceci (2023) have identified lack of political diversity and politically motivated social media as key factors leading to ideological bias in recent social science research. A recent analysis (Clark et al. 2023) by 39 prominent scientists shows that soft censorship is often driven by other scientists who are motivated by prosocial concerns. Academic freedom (including fostering an atmosphere within the academic community where diversity of expressed opinion on all matters, within the law, is tolerated and encouraged) can act as a bulwark against one-sided conclusions from EDI data.
Implications for EDI Policy
The previous sections focused on data collection and data analysis for EDI. In this section, I would like to discuss evidence-based EDI policy. Ultimately, EDI policy is a political question and therefore proposed EDI actions should be discussed and debated in a setting which includes multiple sides of the argument.
As noted in the previous section, we do not currently know enough about the relative importance of the various factors that may explain the under-representation of certain groups. This has implications for EDI policy. In particular, EDI committees should not automatically assume that any observed under-representation is the result of conscious or unconscious bias. It is also important, especially in this context, to bear in mind that non-mathematical (or non-STEM) careers are not objectively less valuable. As Lubinski and Benbow (2006) eloquently put it: ‘Given the ever-increasing importance of quantitative and scientific reasoning skills in modern cultures, when mathematically gifted individuals choose to pursue careers outside engineering and the physical sciences, it should be seen as a contribution to society, not a loss of talent.’
Given what we know so far, my preferred framework for evidence-based EDI policy is that EDI action should be based on evidence of disadvantage or discrimination rather than on the existence of quantitative disparities in outcomes. As noted by Pinker (2003), ‘equality is not the empirical claim that all groups of humans are interchangeable; it is the moral principle that individuals should not be judged or constrained by the average properties of their group’. By addressing real, evidenced problems that people face rather than chasing artificial targets for the career outcomes of each group, EDI policy can support fairness, foster equality of opportunity, and place science and rationality at its heart. However, an observed disparity of outcomes can motivate further investigation, as noted at the beginning of the second section of this article. It can also provide food for thought about the way a subject is presented, rather than its actual content.
In addition to enforcing anti-discrimination law, it would be appropriate for universities to take targeted action when there is evidence of unequal opportunity or disadvantage. For example, in response to data showing that women leave mathematics at higher rates because of caring responsibilities (Cech and Blair-Loy 2019), it would be sensible to create or improve policies such as flexible working, better support for those returning from maternity leave, and childcare funds for research travel, along with other actions that may address the underlying problem (but see the next section on monitoring the effectiveness of EDI policies). Similarly, if there is evidence from latitudinal surveys (such as those proposed in the second section) that a particular ethnic or demographic group feels less included or supported by the community and is disadvantaged as a result, it may be appropriate to consider mechanisms to address this. If the data show high levels of bullying, unfairness or lack of inclusivity experienced by a cross-section of people (not associated with a particular protected characteristic), then action should be taken to counter this: it may well be that the most valuable EDI actions are those that make a better environment for everyone. It is also worth noting that many academic appointment and promotion procedures, and much of academic culture, depend on a long continuous record, and those who have career gaps, for whatever reason, are likely to be disadvantaged.
It is important to consider the possible costs of all EDI policies alongside their benefits. Unfortunately, many EDI actions that are pursued at present have limited value and may even be counterproductive. For example, there is much emphasis on unconscious bias training, despite the fact that such training appears to have limited efficacy in changing behaviour (Dobbin and Kalev 2022). Devine and Ash (2022) suggest that the monetary investment in diversity training programmes ‘has clearly outpaced the available evidence that such programs are effective in achieving their goals’. More generally, many EDI interventions focus on reducing prejudice, but a large-scale meta-analysis (Paluck et al. 2021) found that much existing research effort is ill-suited to providing evidence-based recommendations for such interventions. Another direction pursued enthusiastically by EDI committees is the motivational theory of role models; however, a meta-analysis of this theory (Gladstone and Cimpian 2021) observes that ‘the enthusiasm for role model interventions among educators and the general public continues to run ahead of the research’. For yet another example, consider the recent emphasis on ‘decolonizing’ and ‘diversifying’ the mathematics curriculum, despite a lack of support from the academic community, amid concerns about academic freedom (Clarence-Smith 2022; Clarence-Smith 2023) and evidence that the decolonial perspective disputes the epistemic privilege of mathematical reasoning (Armstrong 2023).
Conclusions: Using Data to Improve Decision Making on EDI
Mathematics has made progress towards becoming a more welcoming and inclusive discipline but challenges remain. This article contributes to the ongoing debate about EDI within the UK mathematical community by discussing key issues around data collection and data analysis and implications for evidence-based EDI policy in higher education. It cautions against arriving at hasty conclusions from diversity data and argues that EDI action should be based on evidence of disadvantage or discrimination rather than on the existence of quantitative disparities in outcomes.
In addition to data on mathematicians and the mathematics pipeline, it is important that data are gathered on EDI policies themselves. Gathering data on both the effectiveness of EDI policies as well as any negative consequences of such policies can only lead to better quality decision making on EDI.
Anyone introducing an EDI intervention should first evidence and pilot it, and should monitor its effects following implementation. Regular and meaningful surveys and data collection on the effects of EDI policies are therefore vital. The questions in surveys used to monitor the effects of EDI policies should be standardized and harmonized. For multiple-choice questions intended to gather the outcomes of a particular EDI action, it is important that the choices capture the full set of possible reactions from all the people affected, including negative ones. Only after gathering reliable and representative data on the variety of experiences of those affected by EDI policies can one balance the costs against the benefits and thereby measure the effectiveness of the policies themselves.
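As a minimal sketch of what such monitoring might look like in quantitative terms, the illustration below compares the proportion of respondents reporting a specific problem before and after a hypothetical pilot policy. All counts are invented, and a simple before-and-after comparison of this kind cannot by itself establish that the policy caused any change; it is intended only to show the shape of the analysis once standardized survey data are available.

# Illustrative only: counts are invented; in practice they would come from
# standardized pre- and post-intervention surveys with harmonized questions.
from scipy.stats import chi2_contingency

# Respondents reporting a specific problem (e.g. feeling unable to undertake
# research travel because of caring responsibilities), before and after a pilot policy.
#          reported problem, did not report it
before = [54, 146]   # 200 respondents surveyed before the pilot
after = [38, 162]    # 200 respondents surveyed after the pilot

chi2, p, dof, expected = chi2_contingency([before, after])
rate_before = before[0] / sum(before)
rate_after = after[0] / sum(after)
print(f"reported problem: before {rate_before:.0%}, after {rate_after:.0%}, p = {p:.3f}")

Because harmonized questions are reused across waves, the same comparison can be repeated year on year, and negative as well as positive reactions to the policy can be tracked in the same way.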
Acknowledgements
I would like to thank Dr John Armstrong, Professor Saul Jacka and Dr Prakash Shah for providing helpful feedback on the first version of this manuscript. I am grateful to the anonymous reviewers for their suggestions and commentaries that have significantly improved the quality of this article. This article grew out of my efforts to answer some questions related to data-gathering and benchmarking for EDI in mathematics which were asked of me by Professor Tom Coates.
Competing Interests
The author has no relevant financial or non-financial interests to declare.
Funding
The author did not receive any funding from any organization for this work.
About the Author
Abhishek Saha is Professor of Mathematics at Queen Mary University of London. He is a founder member of the London Universities’ Council for Academic Freedom.