Almaatouq et al. propose a break from tradition to accelerate scientific progress, and we applaud them for it. However, we urge an even further shift to incorporate theory and methods from causal discovery, a subfield of machine learning with decades of research on artificial intelligence (AI)-guided causal learning and experiment design. Causal discovery has not been well leveraged in the experimental sciences perhaps because it also breaks from tradition – statistical tradition.
Causal discovery contains a growing collection of methods for learning multivariate structural causal models (Pearl, 2000; Spirtes et al., 2000). Design spaces can be represented as a substructure of a larger structural causal model (illustrated in Fig. 1), making causal discovery closely aligned with research cartography. It is not surprising then that some of the challenges faced by integrative experiment design might be overcome with causal discovery. We focus on three such challenges: Practical application and scalability, confined inferential scope, and unknown causal factors.
Figure 1. (a) Hypothetical design space with three binary dimensions: Veteran status, rural status, and sex. Different experiment outcomes are colored red, green, blue, and yellow. Note that in this hypothetical example, rural status makes no difference to the outcome of the experiment, while each of the four combinations of veteran status and sex produces a different outcome. (b) A causal model that would correspond to the example design space. The structure of the causal model is shown on the left, and the two causal dependency tables are shown on the right: One for veteran status, which depends on sex and rural, and the other for outcome. The table for outcome is shown with rural included, to make the comparison with the design space clear, but in a normal causal model rural would not be included in this table as no arrow points directly from rural to outcome in the model structure.
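The correspondence described in the caption can be sketched in a few lines of Python. This is only an illustration: the caption does not say which combination of veteran status and sex maps to which color, so the assignment in `OUTCOME_BY_PARENTS` is an assumption, as are all function and variable names.

```python
from itertools import product

# Hypothetical color assignment: the caption only says that the four
# (veteran, sex) combinations give four different outcomes.
OUTCOME_BY_PARENTS = {
    (0, 0): "red",
    (0, 1): "green",
    (1, 0): "blue",
    (1, 1): "yellow",
}

def outcome(veteran: int, rural: int, sex: int) -> str:
    """Outcome depends only on its direct causes, veteran status and sex."""
    return OUTCOME_BY_PARENTS[(veteran, sex)]

# Full design space: one cell per combination of the three binary dimensions.
design_space = {
    (veteran, rural, sex): outcome(veteran, rural, sex)
    for veteran, rural, sex in product((0, 1), repeat=3)
}

def is_irrelevant(dimension: int) -> bool:
    """True if flipping this dimension never changes the outcome of any cell."""
    for cell, color in design_space.items():
        flipped = list(cell)
        flipped[dimension] = 1 - flipped[dimension]
        if design_space[tuple(flipped)] != color:
            return False
    return True

for name, index in (("veteran", 0), ("rural", 1), ("sex", 2)):
    print(name, "irrelevant to outcome:", is_irrelevant(index))
```

The eight-cell design space is recovered from a four-row dependency table, which is why rural can be dropped from the outcome table in the causal model.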
Regarding the practical application of design spaces, causal discovery can learn entire causal models from nonexperimental data alone, but the direction of causal relationships can be difficult to identify (Hoyer, Janzing, Mooij, Peters, & Schölkopf, 2008; Peters, Janzing, & Schölkopf, 2011; Peters et al., 2014; Shimizu, Hoyer, Hyvärinen, & Kerminen, 2006; Shimizu et al., 2011; Spirtes et al., 2000). Causal discovery can be applied to experimental data to resolve this limitation. Multiple methods can combine datasets containing both experimental and observational samples, samples with nonidentical variables, and samples from different contexts and populations (Bareinboim & Pearl, 2016; Huang et al., 2020; Mooij, Magliacane, & Claassen, 2020; Peters, Bühlmann, & Meinshausen, 2016). Incorporating these methods could enable increased flexibility when dealing with practical study design challenges.
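A small simulation can make the direction problem concrete. The sketch below assumes two hypothetical linear-Gaussian models, the textbook case in which direction cannot be identified from observational data; the coefficients, sample size, and function names are illustrative choices, not taken from any of the cited methods.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate(model, n, rng, randomize_x=False):
    """Return the X-Y correlation in n samples from a hypothetical model.

    model is "x_causes_y" or "y_causes_x". With randomize_x=True, X is
    assigned by the experimenter, which removes any arrow pointing into X.
    """
    xs, ys = [], []
    for _ in range(n):
        if model == "x_causes_y":
            x = rng.gauss(0, 1)  # X has no causes, so randomizing it changes nothing
            y = 0.8 * x + rng.gauss(0, 0.6)
        else:
            y = rng.gauss(0, 1)
            x = rng.gauss(0, 1) if randomize_x else 0.8 * y + rng.gauss(0, 0.6)
        xs.append(x)
        ys.append(y)
    return pearson(xs, ys)

rng = random.Random(0)
results = {
    (model, randomize_x): simulate(model, 5000, rng, randomize_x)
    for model in ("x_causes_y", "y_causes_x")
    for randomize_x in (False, True)
}
for (model, randomize_x), r in results.items():
    label = "experimental" if randomize_x else "observational"
    print(f"{model:11s} {label:13s} corr(X, Y) = {r:.2f}")
```

Both models are built to imply the same observational distribution, so their observational correlations agree up to sampling error. Once X is randomized, the correlation persists only in the model where X causes Y.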
Scalability is another practical issue: The size of these spaces makes complete search infeasible. Causal discovery methods, however, can scale to large numbers of variables. Even a million variables is possible (Ramsey, Glymour, Sanchez-Romero, & Glymour, 2017), but only for sparse models, in which each variable is directly related to only a small number of other variables. When variables have large numbers of interacting causes, causal discovery also suffers scalability problems (Spirtes et al., 2000). However, such situations may not be common in reality. Just as linear and Gaussian models are often surprisingly effective, sparse models often capture the important elements of a causal system. As alternatives, the active learning methods Almaatouq et al. point to could be used, and active learning causal discovery methods also exist (Ghassami, Salehkaleybar, Kiyavash, & Bareinboim, 2018; Hyttinen, Eberhardt, & Hoyer, 2013a; Lindgren, Kocaoglu, Dimakis, & Vishwanath, 2018).
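A rough counting argument shows why sparsity matters. The sketch below is an illustration rather than a result from the cited work: it assumes binary variables and compares the number of cells in a complete design space with an upper bound on the total number of rows in the dependency tables of a sparse model. It says nothing about the running time of any particular algorithm, and the function names are hypothetical.

```python
def full_design_space_cells(n_variables: int) -> int:
    """Cells in a design space with n binary dimensions."""
    return 2 ** n_variables

def sparse_model_table_rows(n_variables: int, max_parents: int) -> int:
    """Upper bound on the total rows across all dependency tables when each
    binary variable has at most max_parents binary direct causes."""
    return n_variables * 2 ** max_parents

for n in (10, 20, 50):
    cells = full_design_space_cells(n)
    rows = sparse_model_table_rows(n, max_parents=3)
    print(f"{n} variables: {cells} cells versus at most {rows} table rows")
```

The first quantity grows exponentially in the number of variables, while the second grows linearly as long as the maximum number of direct causes stays fixed.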
Confined inferential scope limits the kinds of information that can be learned. For example, let X, Y, and Z be variables. Some study designs allow researchers to learn that X causes Z and Y causes Z, but prevent researchers from learning whether X mediates the effect of Y on Z. In a pair of papers, Mayo-Wilson (2011, 2014) proved: (1) certain causal facts cannot be learned from a system of experiments that each investigate only a single exposure–outcome pair, (2) the proportion of unlearnable facts approaches 100% as the complexity of the system increases, and (3) overcoming this requires that each experiment measures more variables than an exposure–outcome pair. By focusing on a single experiment under different conditions, Almaatouq et al. are at risk of being confined to a space of causal facts not much greater than the ad hoc experimentation they are trying to break away from.
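The X, Y, Z example can be made concrete with two hypothetical noise-free linear models. This is a minimal sketch and not Mayo-Wilson's construction: the coefficients are assumptions chosen so that the two models agree on every effect on Z, and the study design considered is one that runs only the Y-Z and X-Z experiments.

```python
A, B = 0.5, 2.0   # hypothetical coefficients for Y -> X and X -> Z
C = A * B         # direct Y -> Z coefficient, chosen to match the mediated effect

def mediated_model(do_y=None, do_x=None):
    """Y -> X -> Z: X carries the whole effect of Y on Z."""
    y = 1.0 if do_y is None else do_y
    x = A * y if do_x is None else do_x
    z = B * x
    return {"X": x, "Y": y, "Z": z}

def direct_model(do_y=None, do_x=None):
    """Y -> Z <- X: X and Y affect Z separately, and Y does not affect X."""
    y = 1.0 if do_y is None else do_y
    x = 0.25 if do_x is None else do_x
    z = C * y + B * x
    return {"X": x, "Y": y, "Z": z}

def effect(model, cause, outcome):
    """Change in outcome when cause is set to 1 rather than 0."""
    key = "do_" + cause.lower()
    high = model(**{key: 1.0})[outcome]
    low = model(**{key: 0.0})[outcome]
    return high - low

for cause, outcome in (("Y", "Z"), ("X", "Z"), ("Y", "X")):
    print(
        f"effect of {cause} on {outcome}:",
        "mediated =", effect(mediated_model, cause, outcome),
        "direct =", effect(direct_model, cause, outcome),
    )
```

The two models give identical answers for the Y-Z and X-Z experiments. They come apart only when X is also measured while Y is manipulated, which is the kind of additional measurement that point (3) calls for.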
Researchers ought to simultaneously measure as many relevant variables as possible. This happens naturally when planning to use causal discovery methods. Most causal discovery methods treat all variables equally, with no labeled outcome variable. It is normal in causal discovery to cast a wide net and use measurements from a larger number of variables, and then simultaneously model them with an algorithm. There is a growing body of papers applying this approach, including some in the social and behavioral sciences (Bronstein, Everaert, Kummerfeld, Haynos, & Vinogradov, 2022a; Bronstein, Kummerfeld, MacDonald, & Vinogradov, 2022b; Shen, Ma, Vemuri, & Simon, 2020; Stevenson et al., 2022).
Unknown causal factors are ubiquitous in science and, unbeknownst to the researcher, can modify the context under which the data were collected. This commonly manifests as latent confounding. In the integrative experimental design paradigm it would occur as a failure to fully specify the design space. Research cartography could possibly solve this, but it is unclear how.
In contrast, causal discovery offers multiple solutions to unknown causal factors. Many causal discovery algorithms are only correct assuming “causal sufficiency”: That there are no unknown causal factors causing two or more measured variables. However, there are also many papers developing theory and methods without assuming causal sufficiency (Chen et al., 2021; Hyttinen, Hoyer, Eberhardt, & Jarvisalo, 2013b; Ogarrio, Spirtes, & Ramsey, 2016; Spirtes et al., 2000; Zhang, 2008). In many cases the presence or absence of unknown causal factors can be identified from measured data, and there are even causal discovery methods designed to learn the causal relationships among the unmeasured factors themselves (Huang, Low, Xie, Glymour, & Zhang, 2022; Kummerfeld & Ramsey, 2016; Xie et al., 2022).
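One classical way that measured data can carry information about unmeasured factors is through tetrad constraints on covariances. The sketch below is an illustration under stated assumptions, not a description of any of the methods cited above: it assumes a linear measurement model with independent noise, works with population covariances rather than sample estimates, and uses hypothetical loadings and function names. A real method would test these constraints statistically.

```python
def make_cov(loadings, latent_of, latent_cov):
    """Population covariance between distinct indicators i and j when
    X_i = loadings[i] * L[latent_of[i]] + independent noise."""
    def cov(i, j):
        return loadings[i] * loadings[j] * latent_cov[latent_of[i]][latent_of[j]]
    return cov

def tetrad_differences(cov):
    """The three tetrad differences among indicators 0, 1, 2, and 3."""
    return (
        cov(0, 1) * cov(2, 3) - cov(0, 2) * cov(1, 3),
        cov(0, 1) * cov(2, 3) - cov(0, 3) * cov(1, 2),
        cov(0, 2) * cov(1, 3) - cov(0, 3) * cov(1, 2),
    )

loadings = [1.0, 0.5, 2.0, 1.5]  # hypothetical values

# One unmeasured factor causing all four measured variables.
one_factor = make_cov(loadings, latent_of=[0, 0, 0, 0], latent_cov=[[1.0]])

# Two correlated unmeasured factors, each causing two measured variables.
two_factor = make_cov(
    loadings,
    latent_of=[0, 0, 1, 1],
    latent_cov=[[1.0, 0.5], [0.5, 1.0]],
)

print("one factor: ", tetrad_differences(one_factor))
print("two factors:", tetrad_differences(two_factor))
```

With a single unmeasured common cause, all three tetrad differences vanish. With two correlated factors, only the difference that pairs indicators across the two factors in both products vanishes, so the pattern of vanishing tetrads carries information about how the unmeasured factors are arranged.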
Unfortunately, causal discovery has had limited application in the experimental sciences. We hope this commentary helps to raise awareness of these resources. Almaatouq et al. make it clear that there is a demand for these research products in the social and behavioral sciences. There is a serious barrier to the adoption and use of causal discovery: Much of it is buried and scattered among journals covering relatively unapplied topics such as theoretical machine learning and philosophy of science. We expect that in the future causal discovery will gain presence in journals on experimental methods and design or topics such as behavioral and brain sciences.
Financial support
E. K. was supported by funding through Grant No. NCRR 1UL1TR002494-01 and B. A. was supported by T32 DA037183. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interest
None.