The Reliability Factor: Modeling Individual Reliability with Multiple Items from a Single Assessment

Stephen R. Martin; Philippe Rast

doi:10.1007/s11336-022-09847-9

Published online by Cambridge University Press: 01 January 2025

Stephen R. Martin*: University of California, Davis
Philippe Rast: University of California, Davis
*: Correspondence should be made to Stephen R. Martin, Department of Psychology, University of California, Davis, 135 Young Hall, 1 Shields Avenue, Davis, CA 95616, USA. Email: stephenSRMMartin@gmail.com

Abstract

Reliability is a crucial concept in psychometrics. Although it is typically estimated as a single fixed quantity, previous work suggests that reliability can vary across persons, groups, and covariates. We propose a novel method for estimating and modeling case-specific reliability without repeated measurements or parallel tests. The proposed method employs a “Reliability Factor” that models the error variance of each case across multiple indicators, thereby producing case-specific reliability estimates. Additionally, we use Gaussian process modeling to estimate a nonlinear, non-monotonic function between the latent factor itself and the reliability of the measure, providing an analogue to test information functions in item response theory. The reliability factor model is a new tool for examining latent regions with poor conditional reliability, and correlates thereof, in a classical test theory framework.
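To make the idea of case-specific reliability concrete, the following is a minimal sketch, not the authors' implementation. It assumes a one-factor model with unit factor variance and a log-linear model in which a person-level score shifts each item's error standard deviation. The function name, the parameter names, the log-linear form, and the sign convention (higher score means more error) are all illustrative assumptions; the abstract does not give the model equations.

```python
import math


def case_omega(loadings, log_sd_intercepts, log_sd_slopes, scale_score):
    """Omega-style reliability of a sum score for one case (illustrative only)."""
    # Assumed scale model: log(error SD of item j) = intercept_j + slope_j * scale_score
    error_sds = [
        math.exp(intercept + slope * scale_score)
        for intercept, slope in zip(log_sd_intercepts, log_sd_slopes)
    ]
    # Sum-score variance attributable to the latent factor (factor variance fixed to 1)
    true_var = sum(loadings) ** 2
    # Sum-score variance attributable to independent item errors
    error_var = sum(sd ** 2 for sd in error_sds)
    return true_var / (true_var + error_var)


# Hypothetical parameter values, chosen only for illustration
loadings = [0.8, 0.7, 0.6]
log_sd_intercepts = [0.0, 0.0, 0.0]
log_sd_slopes = [0.5, 0.5, 0.5]

for score in (-1.0, 0.0, 1.0):
    omega = case_omega(loadings, log_sd_intercepts, log_sd_slopes, score)
    print(f"scale score {score:+.1f}: omega = {omega:.3f}")
```

Under these assumptions, cases with larger scores have larger error variances on every item and therefore lower reliability, so a single set of item parameters yields a different reliability value for each case.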

Keywords

Omega; Bayesian; reliability

Type: Theory and Methods
Information: Psychometrika, Volume 87, Issue 4, December 2022, pp. 1318–1342

DOI: https://doi.org/10.1007/s11336-022-09847-9
Copyright: © 2022 The Author(s), under exclusive licence to The Psychometric Society

Footnotes

Research reported in this publication was supported by the National Institute on Aging of the National Institutes of Health under Award Number R01AG050720 to PR. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
