Bayesian Dynamic Borrowing of Historical Information with Applications to the Analysis of Large-Scale Assessments

Published online by Cambridge University Press:  01 January 2025

David Kaplan*
Affiliation:
University of Wisconsin – Madison
Jianshen Chen
Affiliation:
The College Board
Sinan Yavuz
Affiliation:
University of Wisconsin – Madison
Weicong Lyu
Affiliation:
University of Wisconsin – Madison
* Correspondence should be made to David Kaplan, Department of Educational Psychology, University of Wisconsin – Madison, 1025 W. Johnson Street, Madison, WI, 53706, USA. Email: [email protected]

Abstract

The purpose of this paper is to demonstrate and evaluate the use of Bayesian dynamic borrowing (Viele et al., Pharm Stat 13:41–54, 2014) as a means of systematically utilizing historical information, with specific applications to large-scale educational assessments. Dynamic borrowing via Bayesian hierarchical models is a special case of a general framework of historical borrowing in which the degree of borrowing depends on the heterogeneity between the historical data and the current data. A joint prior distribution over the historical and current data sets is specified, with the degree of heterogeneity across the data sets controlled by the variance of the joint distribution. We apply Bayesian dynamic borrowing to both single-level and multilevel models and compare this approach to other historical borrowing methods such as complete pooling, Bayesian synthesis, and power priors. Two case studies using data from the Program for International Student Assessment reveal the utility of Bayesian dynamic borrowing in terms of predictive accuracy. These are followed by two simulation studies that reveal the utility of Bayesian dynamic borrowing over simple pooling and power priors, based on bias, mean squared error, and predictive accuracy, in cases where the historical data are heterogeneous compared to the current data. In cases of homogeneous historical data, Bayesian dynamic borrowing performs similarly to data pooling, Bayesian synthesis, and power priors. In contrast, for heterogeneous historical data, Bayesian dynamic borrowing performs at least as well as, if not better than, other methods of borrowing with respect to mean squared error, percent bias, and leave-one-out cross-validation.
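
To make concrete the idea that the variance of the joint prior controls the degree of borrowing, the following is a minimal sketch under a deliberately simplified toy model. It is not the specification used in the paper. It assumes one historical and one current data set, each summarized by a sample mean with a known standard error, group means drawn from a common normal distribution with standard deviation `tau`, and a flat prior on the common mean. The function name `borrowed_posterior` and all inputs are hypothetical. In dynamic borrowing proper, `tau` would receive its own prior and be estimated from the data; here it is held fixed so the effect of its size is visible.

```python
import math


def borrowed_posterior(ybar_c, se_c, ybar_h, se_h, tau):
    """Posterior for the current-data mean under a two-group normal hierarchy.

    Assumed toy model (an illustration, not the paper's specification):
      ybar_c ~ N(theta_c, se_c^2),  ybar_h ~ N(theta_h, se_h^2),
      theta_c, theta_h ~ N(mu, tau^2),  flat prior on mu,  tau held fixed.
    """
    # Given only the historical mean, theta_c ~ N(ybar_h, se_h^2 + 2 * tau^2):
    # one tau^2 from theta_h around mu, one from theta_c around mu.
    prior_var = se_h**2 + 2.0 * tau**2
    prec_c = 1.0 / se_c**2       # precision contributed by the current data
    prec_h = 1.0 / prior_var     # precision borrowed from the historical data
    weight_h = prec_h / (prec_c + prec_h)
    post_mean = (1.0 - weight_h) * ybar_c + weight_h * ybar_h
    post_sd = math.sqrt(1.0 / (prec_c + prec_h))
    return post_mean, post_sd, weight_h


if __name__ == "__main__":
    # Hypothetical summary statistics; the values are made up for illustration.
    for tau in (0.0, 5.0, 50.0):
        mean, sd, w = borrowed_posterior(500.0, 4.0, 480.0, 4.0, tau)
        print(f"tau={tau:5.1f}  weight on historical={w:.3f}  "
              f"posterior mean={mean:.2f}  posterior sd={sd:.2f}")
```

With `tau = 0` the two data sets are treated as exchangeable draws around the same mean and the result coincides with precision-weighted pooling. As `tau` grows, the weight on the historical mean shrinks toward zero and the posterior approaches an analysis of the current data alone.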

Type
Application Reviews and Case Studies
Copyright
Copyright © 2022 The Author(s) under exclusive licence to The Psychometric Society

Footnotes

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/s11336-022-09869-3.

References

Bainter, S. A., & Curran, P. J. (2015). Advantages of integrative data analysis for developmental research. Journal of Cognition and Development, 16(1), 1–10. https://doi.org/10.1080/15248372.2013.871721 (PMID: 25642149)
Chen, M. H., Ibrahim, J. G., & Shao, Q.-M. (2000). Power prior distributions for generalized linear models. Journal of Statistical Planning and Inference, 84, 121–137. https://doi.org/10.1016/S0378-3758(99)00140-8
Curran, P. J., & Hussong, A. M. (2009). Integrative data analysis: The simultaneous analysis of multiple data sets. Psychological Methods, 14, 81–100. https://doi.org/10.1037/a0015914 (PMID: 19485623; PMCID: PMC2777640)
Dawid, A. P. (1982). The well-calibrated Bayesian. Journal of the American Statistical Association, 77, 605–610. https://doi.org/10.1080/01621459.1982.10477856
Du, H., Bradbury, T. N., Lavner, J. A., Meltzer, A. L., McNulty, J. K., Neff, L. A., & Karney, B. R. (2020). A comparison of Bayesian synthesis approaches for studies comparing two means: A tutorial. Research Synthesis Methods, 11, 36–65. https://doi.org/10.1002/jrsm.1365 (PMID: 31782621)
Enders, C. K., Keller, B. T., & Levy, R. (2018). A fully conditional specification approach to multilevel imputation of categorical and continuous variables. Psychological Methods, 23(2), 298–317. https://doi.org/10.1037/met0000148 (PMID: 28557466)
Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models. Bayesian Analysis, 1, 515–533. https://doi.org/10.1214/06-BA117A
Gelman, A. (2007). Struggles with survey weighting and regression modeling. Statistical Science, 22(2), 153–164.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2014). Bayesian data analysis (3rd ed.). London, UK: Chapman & Hall.
Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge: Cambridge University Press.
Gelman, A., & Little, T. C. (1997). Poststratification into many categories using hierarchical logistic regression. Survey Methodology, 23, 127–135.
Hobbs, B. P., Carlin, B. P., Mandrekar, S. J., & Sargent, D. J. (2011). Hierarchical commensurate and power prior models for adaptive incorporation of historical information in clinical trials. Biometrics, 67, 1047–1056. https://doi.org/10.1111/j.1541-0420.2011.01564.x (PMID: 21361892; PMCID: PMC3134568)
Hobbs, B. P., Carlin, B. P., & Sargent, D. J. (2012). Commensurate priors for incorporating historical information in clinical trials using general and generalized linear models. Bayesian Analysis, 7(2), 1–36.
Ibrahim, J. G., & Chen, M. H. (2000). Power prior distributions for regression models. Statistical Science, 15, 46–60.
Ibrahim, J. G., Chen, M. H., Gwon, Y., & Chen, F. (2015). The power prior: Theory and applications. Statistics in Medicine, 34, 3724–3749. https://doi.org/10.1002/sim.6728 (PMID: 26346180; PMCID: PMC4626399)
Jackman, S. (2009). Bayesian analysis for the social sciences. New York, USA: John Wiley & Sons. https://doi.org/10.1002/9780470686621
Kaplan, D. (2014). Bayesian statistics for the social sciences. New York, USA: Guilford Press.
Kaplan, D. (2016). Causal inference with large-scale assessments in education from a Bayesian perspective: A review and synthesis. Large-Scale Assessments in Education, 4. https://doi.org/10.1186/s40536-016-0022-6
Kaplan, D., & Kuger, S. (2016). The methodology of PISA: Past, present, and future. In S. Kuger, E. Klieme, N. Jude, & D. Kaplan (Eds.), Assessing contexts of learning world-wide - Extended context assessment frameworks. Dordrecht: Springer.
Kaplan, D., & Park, S. (2013). Analyzing international large-scale assessment data within a Bayesian framework. In L. Rutkowski, M. von Davier, & D. Rutkowski (Eds.), A handbook of international large-scale assessment: Background, technical issues, and methods of data analysis. London: Chapman Hall/CRC Press.
Keller, B. T., & Enders, C. K. (2019). Blimp user’s guide (version 2.1).
Lewandowski, D., Kurowicka, D., & Joe, H. (2009). Generating random correlation matrices based on vines and extended onion method. Journal of Multivariate Analysis, 100, 1989–2001. https://doi.org/10.1016/j.jmva.2009.04.008
Little, R. J. (2006). Calibrated Bayes: A Bayes/frequentist roadmap. The American Statistician, 60, 213–223. https://doi.org/10.1198/000313006X117837
Little, R. J. (2011). Calibrated Bayes, for statistics in general, and missing data in particular. Statistical Science, 26, 162–174. https://doi.org/10.1214/10-STS318
Liu, G. F. (2018). A dynamic power prior for borrowing historical data in noninferiority trials with binary endpoint. Pharmaceutical Statistics, 17, 61–73. https://doi.org/10.1002/pst.1836 (PMID: 29125220)
Marcoulides, K. M. (2017). A Bayesian synthesis approach to data fusion using augmented data-dependent priors (Unpublished doctoral dissertation). Arizona State University.
Martin, M. O., Mullis, I., & Hooper, M. (2016). Methods and procedures in TIMSS 2015. Chestnut Hill, MA: TIMSS and PIRLS International Study Center, Boston College.
Mislevy, R. J. (1991). Randomization-based inference about latent variables from complex samples. Psychometrika, 56, 177–196. https://doi.org/10.1007/BF02294457
Mislevy, R. J., Beaton, A. E., Kaplan, B., & Sheehan, K. M. (1992). Estimating population characteristics from sparse matrix samples of item responses. Journal of Educational Measurement, 29, 133–161. https://doi.org/10.1111/j.1745-3984.1992.tb00371.x
Morita, S., Thall, P. F., & Müller, P. (2008). Determining the effective sample size of a parametric prior. Biometrics, 64, 595–602. https://doi.org/10.1111/j.1541-0420.2007.00888.x (PMID: 17764481)
NCES. (2018). Early Childhood Longitudinal Program (ECLS) - Overview. National Center for Education Statistics, Institute of Education Sciences, U.S. Dept. of Education, Washington, DC. https://nces.ed.gov/ecls/
Neuenschwander, B., Capkun-Niggli, G., Branson, M., & Spiegelhalter, D. J. (2010). Summarizing historical information on controls in clinical trials. Clinical Trials, 7(1), 5–18. https://doi.org/10.1177/1740774509356002 (PMID: 20156954)
OECD. (2002). PISA 2000 technical report. Paris: Organization for Economic Cooperation and Development.
OECD. (2019). PISA 2018 Results: (Volumes I-IV): What students know and can do. https://doi.org/10.1787/5f07c754-en
O’Hagan, A., Buck, C. E., Daneshkhah, A., Eiser, J. R., Garthwaite, P. H., Jenkinson, D. J., & Rakow, T. (2006). Uncertain judgements: Eliciting experts’ probabilities. West Sussex, England: Wiley. https://doi.org/10.1002/0470033312
O’Malley, J., Normand, S., & Kuntz, R. (2002). Sample size calculation for a historically controlled clinical trial with adjustment for covariates. Journal of Biopharmaceutical Statistics, 12(2), 227–247. https://doi.org/10.1081/BIP-120015745
Pocock, S. J. (1976). The combination of randomized and historical controls in clinical trials. Journal of Chronic Diseases, 29, 175–188. https://doi.org/10.1016/0021-9681(76)90044-8 (PMID: 770493)
R Core Team. (2019). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. https://www.R-project.org/
Rässler, S. (2002). Statistical matching: A frequentist theory, practical applications, and alternative Bayesian approaches. New York, USA: Springer. https://doi.org/10.1007/978-1-4613-0053-3
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage Publications.
Rubin, D. B. (1986). Statistical matching using file concatenation with adjusted weights and multiple imputation. Journal of Business and Economic Statistics, 4, 87–95.
Rutkowski, L., von Davier, M., & Rutkowski, D. (2013). Handbook of international large-scale assessment: Background, technical issues, and methods of data analysis. Boca Raton: Chapman Hall/CRC. https://doi.org/10.1201/b16061
Schmidli, H., Gsteiger, S., Roychoudhury, S., O’Hagan, A., Spiegelhalter, D., & Neuenschwander, B. (2014). Robust meta-analytic-predictive priors in clinical trials with historical control information. Biometrics, 70(4), 1023–1032. https://doi.org/10.1111/biom.12242 (PMID: 25355546)
Stan Development Team. (2020). RStan: the R interface to Stan. R package version 2.21.2. http://mc-stan.org/
Sung, Y. J., Schwander, K., Arnett, D. K., Kardia, S. L. R., Rankinen, T., Bouchard, C., & Rao, D. (2014). An empirical comparison of meta-analysis and mega-analysis of individual participant data for identifying gene-environment interactions. Genetic Epidemiology, 38, 369–378. https://doi.org/10.1002/gepi.21800 (PMID: 24719363; PMCID: PMC4332385)
Thompson, L., Chu, J., Xu, J., Li, X., Nair, R., & Tiwari, R. (2021). Dynamic borrowing from a single prior data source using the conditional power prior. Journal of Biopharmaceutical Statistics, 31(4), 403–424. https://doi.org/10.1080/10543406.2021.1895190 (PMID: 34520325)
Tierney, J., Vale, C., Riley, R., Smith, C. T., Stewart, L., Clarke, M., & Rovers, M. (2015). Individual participant data (IPD) meta-analyses of randomised controlled trials: Guidance on their use. PLoS Medicine, 12(7). https://doi.org/10.1371/journal.pmed.1001855
US Department of Education. (2019). NAEP: Nation's Report Card. https://nces.ed.gov/nationsreportcard/. Accessed Nov. 16, 2019.
Vehtari, A., Gabry, J., Yao, Y., & Gelman, A. (2019). loo: Efficient leave-one-out cross-validation and WAIC for Bayesian models. R package version 2.1.0. https://CRAN.R-project.org/package=loo
Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27, 1413–1432. https://doi.org/10.1007/s11222-016-9696-4
Viele, K., Berry, S., Neuenschwander, B., Amzal, B., Chen, F., Enas, N., & Thompson, L. (2014). Use of historical control data for assessing treatment effects in clinical trials. Pharmaceutical Statistics, 13, 41–54. https://doi.org/10.1002/pst.1589 (PMID: 23913901)
von Davier, M. (2013). Imputing proficiency data under planned missingness in population models. In L. Rutkowski, M. von Davier, & D. Rutkowski (Eds.), Handbook of international large-scale assessment: Background, technical issues, and methods of data analysis. Boca Raton: Chapman Hall/CRC.
Zhou, X., & Reiter, J. P. (2010). A note on Bayesian inference after multiple imputation. The American Statistician, 64, 159–163. https://doi.org/10.1198/tast.2010.09109

Supplementary material

Kaplan et al. supplementary material 1 (File, 1.1 MB)
Kaplan et al. supplementary material 2 (File, 907.6 KB)