
Assessing Features of Psychometric Assessment Instruments: A Comparison of the COSMIN Checklist with Other Critical Appraisal Tools

Published online by Cambridge University Press:  07 December 2017

Ulrike Rosenkoetter
Affiliation:
John Walsh Centre for Rehabilitation Research, Kolling Institute of Medical Research, University of Sydney, New South Wales, Australia
Robyn L. Tate*
Affiliation:
John Walsh Centre for Rehabilitation Research, Kolling Institute of Medical Research, University of Sydney, New South Wales, Australia
*Address for correspondence: Professor Robyn L. Tate, John Walsh Centre for Rehabilitation Research, University of Sydney, Level 9, Kolling Institute of Medical Research, Royal North Shore Hospital, St Leonards, New South Wales 2065, Australia. E-mail: [email protected]

Abstract

The past 20 years have seen the development of instruments designed to specify standards and evaluate the adequacy of published studies with respect to the quality of study design, the quality of findings, as well as the quality of their reporting. In the field of psychometrics, the first minimum set of standards for the review of psychometric instruments was published in 1996 by the Scientific Advisory Committee of the Medical Outcomes Trust. Since then, a number of tools have been developed with similar aims. The present paper reviews basic psychometric properties (reliability, validity and responsiveness), compares six tools developed for the critical appraisal of psychometric studies and provides a worked example of using the COSMIN checklist, Terwee-m statistical quality criteria, and the levels of evidence synthesis using the method of Schellingerhout and colleagues (2012). This paper will aid users and reviewers of questionnaires in the quality appraisal and selection of appropriate instruments by presenting available assessment tools, their characteristics and utility.

Type
Articles
Copyright
Copyright © Australasian Society for the Study of Brain Impairment 2017 


References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. American Educational Research Association.
Anastasi, A., & Urbina, S. (1997). Psychological testing. New Jersey: Prentice Hall.
Andresen, E.M. (2000). Criteria for assessing the tools of disability outcomes research. Archives of Physical Medicine and Rehabilitation, 81 (Suppl. 2), S15–S20.
Bayley, M.T., Tate, R., Douglas, J.M., Turkstra, L.S., Ponsford, J., Stergiou-Kita, M., . . . Bragge, P. (2014). INCOG guidelines for cognitive rehabilitation following traumatic brain injury: Methods and overview. The Journal of Head Trauma Rehabilitation, 29 (4), 290–306.
Bondy, M. (1974). Psychiatric antecedents of psychological testing (before Binet). Journal of the History of the Behavioral Sciences, 10 (2), 180–194.
Bossuyt, P.M., Reitsma, J.B., Bruns, D.E., Gatsonis, C.A., Glasziou, P.P., Irwig, L., . . . de Vet, H.C. (2015). STARD 2015: An updated list of essential items for reporting diagnostic accuracy studies. Radiology, 277 (3), 826–832.
Cardol, M., Beelen, A., van den Bos, G.A., de Jong, B.A., de Groot, I.J., & de Haan, R.J. (2002). Responsiveness of the Impact on Participation and Autonomy questionnaire. Archives of Physical Medicine and Rehabilitation, 83 (11), 1524–1529.
Cardol, M., de Haan, R.J., de Jong, B.A., van den Bos, G.A., & de Groot, I.J. (2001). Psychometric properties of the Impact on Participation and Autonomy questionnaire. Archives of Physical Medicine and Rehabilitation, 82 (2), 210–216.
Cardol, M., de Haan, R.J., van den Bos, G.A., de Jong, B.A., & de Groot, I.J. (1999). The development of a handicap assessment questionnaire: The Impact on Participation and Autonomy (IPA). Clinical Rehabilitation, 13 (5), 411–419.
Charters, E., Gillett, L., & Simpson, G.K. (2015). Efficacy of electronic portable assistive devices for people with acquired brain injury: A systematic review. Neuropsychological Rehabilitation, 25 (1), 82–121.
Costa, D.S. (2015). Reflective, causal, and composite indicators of quality of life: A conceptual or an empirical distinction? Quality of Life Research, 24 (9), 2057–2065.
de Vet, H., Terwee, C., & Bouter, L. (2003). Clinimetrics and psychometrics: Two sides of the same coin. Journal of Clinical Epidemiology, 56 (12), 1146–1147.
de Vet, H.C., Terwee, C.B., Mokkink, L.B., & Knol, D.L. (2011a). Measurement in medicine: A practical guide. Cambridge: Cambridge University Press.
de Vet, H.C., Terwee, C.B., Mokkink, L.B., & Knol, D.L. (2011b). Systematic reviews of measurement properties. Measurement in medicine: A practical guide (pp. 275–314). Cambridge: Cambridge University Press.
de Vet, H.C., Terwee, C.B., Ostelo, R.W., Beckerman, H., Knol, D.L., & Bouter, L.M. (2006). Minimal changes in health status questionnaires: Distinction between minimally detectable change and minimally important change. Health and Quality of Life Outcomes, 4 (1), 54.
DeVellis, R.F. (2003). Scale development: Theory and applications. Thousand Oaks, CA: Sage Publications.
Downs, S.H., & Black, N. (1998). The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. Journal of Epidemiology & Community Health, 52 (6), 377–384.
Dunn, T.J., Baguley, T., & Brunsden, V. (2014). From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology, 105 (3), 399–412.
Francis, D.O., McPheeters, M.L., Noud, M., Penson, D.F., & Feurer, I.D. (2016). Checklist to operationalize measurement characteristics of patient-reported outcome measures. Systematic Reviews, 5 (1), 129.
Frost, M.H., Reeve, B.B., Liepa, A.M., Stauffer, J.W., & Hays, R.D. (2007). What is sufficient evidence for the reliability and validity of patient-reported outcome measures? Value in Health, 10, S94–S105.
Gibby, R.E., & Zickar, M.J. (2008). A history of the early days of personality testing in American industry: An obsession with adjustment. History of Psychology, 11 (3), 164–184.
Guyatt, G.H., Kirshner, B., & Jaeschke, R. (1992). Measuring health status: What are the necessary measurement properties? Journal of Clinical Epidemiology, 45 (12), 1341–1345.
Hayton, J.C., Allen, D.G., & Scarpello, V. (2004). Factor retention decisions in exploratory factor analysis: A tutorial on parallel analysis. Organizational Research Methods, 7 (2), 191–205.
Howell, R.D., Breivik, E., & Wilcox, J.B. (2007). Reconsidering formative measurement. Psychological Methods, 12 (2), 205–218.
Jarvis, C.B., MacKenzie, S.B., & Podsakoff, P.M. (2003). A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of Consumer Research, 30 (2), 199–218.
Kirshner, B., & Guyatt, G. (1985). A methodological framework for assessing health indices. Journal of Chronic Diseases, 38 (1), 27–36.
Kottner, J., Audigé, L., Brorson, S., Donner, A., Gajewski, B.J., Hróbjartsson, A., . . . Streiner, D.L. (2011). Guidelines for reporting reliability and agreement studies (GRRAS) were proposed. International Journal of Nursing Studies, 48 (6), 661–671.
Kratochwill, T.R., Hitchcock, J., Horner, R., Levin, J.R., Odom, S., Rindskopf, D., & Shadish, W. (2010). Single-case designs technical documentation what works clearinghouse. Retrieved from What Works Clearinghouse website: http://ies.ed.gov/ncee/wwc/pdf/wwc_scd.pdf
Lohr, K.N., Aaronson, N.K., Alonso, J., Burnam, M.A., Patrick, D.L., Perrin, E.B., & Roberts, J.S. (1996). Evaluating quality-of-life and health status instruments: Development of scientific review criteria. Clinical Therapeutics, 18 (5), 979–992.
MacCallum, R.C., Widaman, K.F., Zhang, S., & Hong, S. (1999). Sample size in factor analysis. Psychological Methods, 4 (1), 84–99.
MacKenzie, S.B., Podsakoff, P.M., & Podsakoff, N.P. (2011). Construct measurement and validation procedures in MIS and behavioral research: Integrating new and existing techniques. MIS Quarterly, 35 (2), 293–334.
Maher, C.G., Sherrington, C., Herbert, R.D., Moseley, A.M., & Elkins, M. (2003). Reliability of the PEDro scale for rating quality of randomized controlled trials. Physical Therapy, 83 (8), 713–721.
Mokkink, L.B., Terwee, C.B., Patrick, D.L., Alonso, J., Stratford, P.W., Knol, D.L., . . . de Vet, H.C. (2010a). The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: An international Delphi study. Quality of Life Research, 19 (4), 539–549.
Mokkink, L.B., Terwee, C.B., Patrick, D.L., Alonso, J., Stratford, P.W., Knol, D.L., . . . de Vet, H.C. (2010b). The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. Journal of Clinical Epidemiology, 63 (7), 737–745.
Mokkink, L.B., Terwee, C.B., Patrick, D.L., Alonso, J., Stratford, P.W., Knol, D.L., . . . de Vet, H.C. (2012). COSMIN checklist manual. Amsterdam, The Netherlands: University Medical Center.
Mokkink, L.B., Terwee, C.B., Stratford, P.W., Alonso, J., Patrick, D.L., Riphagen, I., . . . de Vet, H.C. (2009). Evaluation of the methodological quality of systematic reviews of health status measurement instruments. Quality of Life Research, 18 (3), 313–333.
Noel-Storr, A.H., McCleery, J.M., Richard, E., Ritchie, C.W., Flicker, L., Cullum, S.J., . . . Rutjes, A.W. (2014). Reporting standards for studies of diagnostic test accuracy in dementia: The STARDdem Initiative. Neurology, 83 (4), 364–373.
Revelle, W., & Zinbarg, R.E. (2009). Coefficients alpha, beta, omega, and the glb: Comments on Sijtsma. Psychometrika, 74 (1), 145–154.
Rust, J., & Golombok, S. (2009). Modern psychometrics. The science of psychological assessment (3rd ed.). London: Routledge.
Schellingerhout, J.M., Verhagen, A.P., Heymans, M.W., Koes, B.W., de Vet, H.C., & Terwee, C.B. (2012). Measurement properties of disease-specific questionnaires in patients with neck pain: A systematic review. Quality of Life Research, 21 (4), 659–670.
Schmidt, S., Garin, O., Pardo, Y., Valderas, J.M., Alonso, J., Rebollo, P., . . . Grp, E. (2014). Assessing quality of life in patients with prostate cancer: A systematic and standardized comparison of available instruments. Quality of Life Research, 23 (8), 2169–2181.
Schreiber, J.B., Nora, A., Stage, F.K., Barlow, E.A., & King, J. (2006). Reporting structural equation modeling and confirmatory factor analysis results: A review. The Journal of Educational Research, 99 (6), 323–338.
Schulz, K.F., Altman, D.G., & Moher, D. (2010). CONSORT 2010 statement: Updated guidelines for reporting parallel group randomised trials. BMC Medicine, 8 (1), 18.
Scientific Advisory Committee of the Medical Outcomes Trust, Aaronson, N., Alonso, J., Burnam, A., Lohr, K.N., Patrick, D.L., . . . Stein, R.E. (2002). Assessing health status and quality-of-life instruments: Attributes and review criteria. Quality of Life Research, 11 (3), 193–205.
Streiner, D. (2003a). Clinimetrics vs. psychometrics: An unnecessary distinction. Journal of Clinical Epidemiology, 56 (12), 1142–1145. doi: 10.1016/j.jclinepi.2003.08.011.
Streiner, D.L. (2003b). Test development: Two-sided coin or one-sided Möbius strip? Journal of Clinical Epidemiology, 56 (12), 1148–1149.
Streiner, D.L., & Kottner, J. (2014). Recommendations for reporting the results of studies of instrument and scale development and testing. Journal of Advanced Nursing, 70 (9), 1970–1979.
Streiner, D.L., Norman, G.R., & Cairney, J. (2015a). Health measurement scales (5th ed.). Oxford, UK: Oxford University Press.
Streiner, D.L., Norman, G.R., & Cairney, J. (2015b). Reporting test results. In Streiner, D.L., Norman, G.R., & Cairney, J. (Eds.), Health measurement scales (5th ed., pp. 349–356). Oxford, UK: Oxford University Press.
Tang, W., Cui, Y., & Babenko, O. (2014). Internal consistency: Do we really know what it is and how to assess it. Journal of Psychology and Behavioral Science, 2 (2), 205–220.
Tate, R.L., Perdices, M., Rosenkoetter, U., Shadish, W., Vohra, S., Barlow, D.H., . . . Wilson, B. (2016). The single-case reporting guideline in behavioural interventions (SCRIBE) 2016 statement. Archives of Scientific Psychology, 4 (1), 1–9.
Tate, R.L., Rosenkoetter, U., Wakim, D., Sigmundsdottir, L., Doubleday, J., Togher, L., . . . Perdices, M. (2015). The Risk-of-bias in N-of-1 Trials (RoBiNT) scale: An expanded manual for the critical appraisal of single-case reports. Sydney, Australia: The Author(s).
Terwee, C.B., Bot, S.D., de Boer, M.R., van der Windt, D.A., Knol, D.L., Dekker, J., . . . de Vet, H.C. (2007). Quality criteria were proposed for measurement properties of health status questionnaires. Journal of Clinical Epidemiology, 60 (1), 34–42.
Terwee, C.B., Mokkink, L.B., Knol, D.L., Ostelo, R.W., Bouter, L.M., & de Vet, H.C. (2012). Rating the methodological quality in systematic reviews of studies on measurement properties: A scoring system for the COSMIN checklist. Quality of Life Research, 21 (4), 651–657.
The AGREE Collaboration. (2003). Development and validation of an international appraisal instrument for assessing the quality of clinical practice guidelines: The AGREE project. Quality and Safety in Health Care, 12, 18–23.
Trizano-Hermosilla, I., & Alvarado, J.M. (2016). Best alternatives to Cronbach's alpha reliability in realistic conditions: Congeneric and asymmetrical measurements. Frontiers in Psychology, 7, 769.
Turner-Stokes, L., Pick, A., Nair, A., Disler, P.B., & Wade, D.T. (2015). Multi-disciplinary rehabilitation for acquired brain injury in adults of working age. The Cochrane Library, Issue 12. Art. No.: CD004170.
Valderas, J.M., Ferrer, M., Mendívil, J., Garin, O., Rajmil, L., Herdman, M., & Alonso, J. (2008). Development of EMPRO: A tool for the standardized assessment of patient-reported outcome measures. Value in Health, 11 (4), 700–708.