The Reliability Factor: Modeling Individual Reliability with Multiple Items from a Single Assessment

Published online by Cambridge University Press: 01 January 2025

Stephen R. Martin*
Affiliation: University of California, Davis
Philippe Rast
Affiliation: University of California, Davis
* Correspondence should be made to Stephen R. Martin, Department of Psychology, University of California, Davis, 135 Young Hall, 1 Shields Avenue, Davis, CA 95616, USA. Email: [email protected]

Abstract

Reliability is a crucial concept in psychometrics. Although it is typically estimated as a single fixed quantity, previous work suggests that reliability can vary across persons, groups, and covariates. We propose a novel method for estimating and modeling case-specific reliability without repeated measurements or parallel tests. The proposed method employs a “Reliability Factor” that models the error variance of each case across multiple indicators, thereby producing case-specific reliability estimates. Additionally, we use Gaussian process modeling to estimate a nonlinear, non-monotonic function between the latent factor itself and the reliability of the measure, providing an analogue to test information functions in item response theory. The reliability factor model is a new tool for examining latent regions with poor conditional reliability, and correlates thereof, in a classical test theory framework.
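
To make the idea of case-specific reliability concrete, the following is a minimal sketch and not the authors' specification. It assumes a one-factor model in which each case's residual standard deviations follow a log-linear function of a latent error-variance score, and it computes an omega-style reliability for a unit-weighted sum score. The function name, the parameter names, the log-linear form, and the sign convention (higher score means more error) are all assumptions made for illustration.

```python
import math


def case_reliability(loadings, log_sd_intercepts, log_sd_slopes,
                     error_factor_score, factor_variance=1.0):
    """Omega-style reliability of a unit-weighted sum score for one case.

    All names here are hypothetical; the log-linear error model is an
    illustrative assumption, not the model from the article.
    """
    # Assumed model: residual SD of item j for this case is
    # exp(intercept_j + slope_j * error_factor_score).
    error_sds = [math.exp(a + b * error_factor_score)
                 for a, b in zip(log_sd_intercepts, log_sd_slopes)]
    # Variance in the sum score attributable to the latent factor.
    true_variance = factor_variance * sum(loadings) ** 2
    # Variance in the sum score attributable to (uncorrelated) item errors.
    error_variance = sum(sd ** 2 for sd in error_sds)
    return true_variance / (true_variance + error_variance)


# Made-up parameter values for three items.
loadings = [1.0, 1.0, 1.0]
intercepts = [0.0, 0.0, 0.0]
slopes = [0.5, 0.5, 0.5]

for score in (-1.0, 0.0, 1.0):
    omega = case_reliability(loadings, intercepts, slopes, score)
    print(f"error factor score {score:+.1f}: reliability {omega:.3f}")
```

With unit loadings and unit residual standard deviations, a score of 0 gives 9 / (9 + 3) = 0.75. Under the assumed positive slopes, cases with higher scores have larger error variances and therefore lower reliability.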

Type
Theory and Methods
Copyright
Copyright © 2022 The Author(s) under exclusive licence to The Psychometric Society

Footnotes

Research reported in this publication was supported by the National Institute on Aging of the National Institutes of Health under Award Number R01AG050720 to PR. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
