Assessing Features of Psychometric Assessment Instruments: A Comparison of the COSMIN Checklist with Other Critical Appraisal Tools

Ulrike Rosenkoetter; Robyn L. Tate

doi:10.1017/BrImp.2017.29

Published online by Cambridge University Press: 07 December 2017

Ulrike Rosenkoetter: John Walsh Centre for Rehabilitation Research, Kolling Institute of Medical Research, University of Sydney, New South Wales, Australia
Robyn L. Tate*: John Walsh Centre for Rehabilitation Research, Kolling Institute of Medical Research, University of Sydney, New South Wales, Australia
* Address for correspondence: Professor Robyn L. Tate, John Walsh Centre for Rehabilitation Research, University of Sydney, Level 9, Kolling Institute of Medical Research, Royal North Shore Hospital, St Leonards, New South Wales 2065, Australia. E-mail: [email protected]

Abstract

The past 20 years have seen the development of instruments designed to specify standards and evaluate the adequacy of published studies with respect to the quality of study design, the quality of findings, as well as the quality of their reporting. In the field of psychometrics, the first minimum set of standards for the review of psychometric instruments was published in 1996 by the Scientific Advisory Committee of the Medical Outcomes Trust. Since then, a number of tools have been developed with similar aims. The present paper reviews basic psychometric properties (reliability, validity and responsiveness), compares six tools developed for the critical appraisal of psychometric studies and provides a worked example of using the COSMIN checklist, Terwee-m statistical quality criteria, and the levels of evidence synthesis using the method of Schellingerhout and colleagues (2012). This paper will aid users and reviewers of questionnaires in the quality appraisal and selection of appropriate instruments by presenting available assessment tools, their characteristics and utility.

Keywords

psychometrics; COSMIN; reporting guideline; evidence standards; instrument development

Type: Articles
Information: Brain Impairment, Volume 19, Special Issue 1: Quantitative Data Analysis; by Robyn Tate and Michael Perdices, March 2018, pp. 103–118

DOI: https://doi.org/10.1017/BrImp.2017.29
Copyright: © Australasian Society for the Study of Brain Impairment 2017

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. American Educational Research Association.

Anastasi, A., & Urbina, S. (1997). Psychological testing. New Jersey: Prentice Hall.

Andresen, E.M. (2000). Criteria for assessing the tools of disability outcomes research. Archives of Physical Medicine and Rehabilitation, 81(Suppl. 2), S15–S20.

Bayley, M.T., Tate, R., Douglas, J.M., Turkstra, L.S., Ponsford, J., Stergiou-Kita, M., . . . Bragge, P. (2014). INCOG guidelines for cognitive rehabilitation following traumatic brain injury: Methods and overview. The Journal of Head Trauma Rehabilitation, 29(4), 290–306.

Bondy, M. (1974). Psychiatric antecedents of psychological testing (before Binet). Journal of the History of the Behavioral Sciences, 10(2), 180–194.

Bossuyt, P.M., Reitsma, J.B., Bruns, D.E., Gatsonis, C.A., Glasziou, P.P., Irwig, L., . . . de Vet, H.C. (2015). STARD 2015: An updated list of essential items for reporting diagnostic accuracy studies. Radiology, 277(3), 826–832.

Cardol, M., Beelen, A., van den Bos, G.A., de Jong, B.A., de Groot, I.J., & de Haan, R.J. (2002). Responsiveness of the Impact on Participation and Autonomy questionnaire. Archives of Physical Medicine and Rehabilitation, 83(11), 1524–1529.

Cardol, M., de Haan, R.J., de Jong, B.A., van den Bos, G.A., & de Groot, I.J. (2001). Psychometric properties of the Impact on Participation and Autonomy questionnaire. Archives of Physical Medicine and Rehabilitation, 82(2), 210–216.

Cardol, M., de Haan, R.J., van den Bos, G.A., de Jong, B.A., & de Groot, I.J. (1999). The development of a handicap assessment questionnaire: The Impact on Participation and Autonomy (IPA). Clinical Rehabilitation, 13(5), 411–419.

Charters, E., Gillett, L., & Simpson, G.K. (2015). Efficacy of electronic portable assistive devices for people with acquired brain injury: A systematic review. Neuropsychological Rehabilitation, 25(1), 82–121.

Costa, D.S. (2015). Reflective, causal, and composite indicators of quality of life: A conceptual or an empirical distinction? Quality of Life Research, 24(9), 2057–2065.

de Vet, H., Terwee, C., & Bouter, L. (2003). Clinimetrics and psychometrics: Two sides of the same coin. Journal of Clinical Epidemiology, 56(12), 1146–1147.

de Vet, H.C., Terwee, C.B., Mokkink, L.B., & Knol, D.L. (2011a). Measurement in medicine: A practical guide. Cambridge: Cambridge University Press.

de Vet, H.C., Terwee, C.B., Mokkink, L.B., & Knol, D.L. (2011b). Systematic reviews of measurement properties. In Measurement in medicine: A practical guide (pp. 275–314). Cambridge: Cambridge University Press.

de Vet, H.C., Terwee, C.B., Ostelo, R.W., Beckerman, H., Knol, D.L., & Bouter, L.M. (2006). Minimal changes in health status questionnaires: Distinction between minimally detectable change and minimally important change. Health and Quality of Life Outcomes, 4(1), 54.

DeVellis, R.F. (2003). Scale development: Theory and applications. Thousand Oaks, CA: Sage Publications.

Downs, S.H., & Black, N. (1998). The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. Journal of Epidemiology & Community Health, 52(6), 377–384.

Dunn, T.J., Baguley, T., & Brunsden, V. (2014). From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology, 105(3), 399–412.

Francis, D.O., McPheeters, M.L., Noud, M., Penson, D.F., & Feurer, I.D. (2016). Checklist to operationalize measurement characteristics of patient-reported outcome measures. Systematic Reviews, 5(1), 129.

Frost, M.H., Reeve, B.B., Liepa, A.M., Stauffer, J.W., & Hays, R.D. (2007). What is sufficient evidence for the reliability and validity of patient-reported outcome measures? Value in Health, 10, S94–S105.

Gibby, R.E., & Zickar, M.J. (2008). A history of the early days of personality testing in American industry: An obsession with adjustment. History of Psychology, 11(3), 164–184.

Guyatt, G.H., Kirshner, B., & Jaeschke, R. (1992). Measuring health status: What are the necessary measurement properties? Journal of Clinical Epidemiology, 45(12), 1341–1345.

Hayton, J.C., Allen, D.G., & Scarpello, V. (2004). Factor retention decisions in exploratory factor analysis: A tutorial on parallel analysis. Organizational Research Methods, 7(2), 191–205.

Howell, R.D., Breivik, E., & Wilcox, J.B. (2007). Reconsidering formative measurement. Psychological Methods, 12(2), 205–218.

Jarvis, C.B., MacKenzie, S.B., & Podsakoff, P.M. (2003). A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of Consumer Research, 30(2), 199–218.

Kirshner, B., & Guyatt, G. (1985). A methodological framework for assessing health indices. Journal of Chronic Diseases, 38(1), 27–36.

Kottner, J., Audigé, L., Brorson, S., Donner, A., Gajewski, B.J., Hróbjartsson, A., . . . Streiner, D.L. (2011). Guidelines for reporting reliability and agreement studies (GRRAS) were proposed. International Journal of Nursing Studies, 48(6), 661–671.

Kratochwill, T.R., Hitchcock, J., Horner, R., Levin, J.R., Odom, S., Rindskopf, D., & Shadish, W. (2010). Single-case designs technical documentation. Retrieved from What Works Clearinghouse website: http://ies.ed.gov/ncee/wwc/pdf/wwc_scd.pdf

Lohr, K.N., Aaronson, N.K., Alonso, J., Burnam, M.A., Patrick, D.L., Perrin, E.B., & Roberts, J.S. (1996). Evaluating quality-of-life and health status instruments: Development of scientific review criteria. Clinical Therapeutics, 18(5), 979–992.

MacCallum, R.C., Widaman, K.F., Zhang, S., & Hong, S. (1999). Sample size in factor analysis. Psychological Methods, 4(1), 84–99.

MacKenzie, S.B., Podsakoff, P.M., & Podsakoff, N.P. (2011). Construct measurement and validation procedures in MIS and behavioral research: Integrating new and existing techniques. MIS Quarterly, 35(2), 293–334.

Maher, C.G., Sherrington, C., Herbert, R.D., Moseley, A.M., & Elkins, M. (2003). Reliability of the PEDro scale for rating quality of randomized controlled trials. Physical Therapy, 83(8), 713–721.

Mokkink, L.B., Terwee, C.B., Patrick, D.L., Alonso, J., Stratford, P.W., Knol, D.L., . . . de Vet, H.C. (2010a). The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: An international Delphi study. Quality of Life Research, 19(4), 539–549.

Mokkink, L.B., Terwee, C.B., Patrick, D.L., Alonso, J., Stratford, P.W., Knol, D.L., . . . de Vet, H.C. (2010b). The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. Journal of Clinical Epidemiology, 63(7), 737–745.

Mokkink, L.B., Terwee, C.B., Patrick, D.L., Alonso, J., Stratford, P.W., Knol, D.L., . . . de Vet, H.C. (2012). COSMIN checklist manual. Amsterdam, The Netherlands: University Medical Center.

Mokkink, L.B., Terwee, C.B., Stratford, P.W., Alonso, J., Patrick, D.L., Riphagen, I., . . . de Vet, H.C. (2009). Evaluation of the methodological quality of systematic reviews of health status measurement instruments. Quality of Life Research, 18(3), 313–333.

Noel-Storr, A.H., McCleery, J.M., Richard, E., Ritchie, C.W., Flicker, L., Cullum, S.J., . . . Rutjes, A.W. (2014). Reporting standards for studies of diagnostic test accuracy in dementia: The STARDdem Initiative. Neurology, 83(4), 364–373.

Revelle, W., & Zinbarg, R.E. (2009). Coefficients alpha, beta, omega, and the glb: Comments on Sijtsma. Psychometrika, 74(1), 145–154.

Rust, J., & Golombok, S. (2009). Modern psychometrics. The science of psychological assessment (3rd ed.). London: Routledge.

Schellingerhout, J.M., Verhagen, A.P., Heymans, M.W., Koes, B.W., de Vet, H.C., & Terwee, C.B. (2012). Measurement properties of disease-specific questionnaires in patients with neck pain: A systematic review. Quality of Life Research, 21(4), 659–670.

Schmidt, S., Garin, O., Pardo, Y., Valderas, J.M., Alonso, J., Rebollo, P., . . . Grp, E. (2014). Assessing quality of life in patients with prostate cancer: A systematic and standardized comparison of available instruments. Quality of Life Research, 23(8), 2169–2181.

Schreiber, J.B., Nora, A., Stage, F.K., Barlow, E.A., & King, J. (2006). Reporting structural equation modeling and confirmatory factor analysis results: A review. The Journal of Educational Research, 99(6), 323–338.

Schulz, K.F., Altman, D.G., & Moher, D. (2010). CONSORT 2010 statement: Updated guidelines for reporting parallel group randomised trials. BMC Medicine, 8(1), 18.

Scientific Advisory Committee of the Medical Outcomes Trust, Aaronson, N., Alonso, J., Burnam, A., Lohr, K.N., Patrick, D.L., . . . Stein, R.E. (2002). Assessing health status and quality-of-life instruments: Attributes and review criteria. Quality of Life Research, 11(3), 193–205.

Streiner, D. (2003a). Clinimetrics vs. psychometrics: An unnecessary distinction. Journal of Clinical Epidemiology, 56(12), 1142–1145. doi: 10.1016/j.jclinepi.2003.08.011

Streiner, D.L. (2003b). Test development: Two-sided coin or one-sided Möbius strip? Journal of Clinical Epidemiology, 56(12), 1148–1149.

Streiner, D.L., & Kottner, J. (2014). Recommendations for reporting the results of studies of instrument and scale development and testing. Journal of Advanced Nursing, 70(9), 1970–1979.

Streiner, D.L., Norman, G.R., & Cairney, J. (2015a). Health measurement scales (5th ed.). Oxford, UK: Oxford University Press.

Streiner, D.L., Norman, G.R., & Cairney, J. (2015b). Reporting test results. In Streiner, D.L., Norman, G.R., & Cairney, J. (Eds.), Health measurement scales (5th ed., pp. 349–356). Oxford, UK: Oxford University Press.

Tang, W., Cui, Y., & Babenko, O. (2014). Internal consistency: Do we really know what it is and how to assess it? Journal of Psychology and Behavioral Science, 2(2), 205–220.

Tate, R.L., Perdices, M., Rosenkoetter, U., Shadish, W., Vohra, S., Barlow, D.H., . . . Wilson, B. (2016). The single-case reporting guideline in behavioural interventions (SCRIBE) 2016 statement. Archives of Scientific Psychology, 4(1), 1–9.

Tate, R.L., Rosenkoetter, U., Wakim, D., Sigmundsdottir, L., Doubleday, J., Togher, L., . . . Perdices, M. (2015). The Risk-of-bias in N-of-1 Trials (RoBiNT) scale: An expanded manual for the critical appraisal of single-case reports. Sydney, Australia: The Author(s).

Terwee, C.B., Bot, S.D., de Boer, M.R., van der Windt, D.A., Knol, D.L., Dekker, J., . . . de Vet, H.C. (2007). Quality criteria were proposed for measurement properties of health status questionnaires. Journal of Clinical Epidemiology, 60(1), 34–42.

Terwee, C.B., Mokkink, L.B., Knol, D.L., Ostelo, R.W., Bouter, L.M., & de Vet, H.C. (2012). Rating the methodological quality in systematic reviews of studies on measurement properties: A scoring system for the COSMIN checklist. Quality of Life Research, 21(4), 651–657.

The AGREE Collaboration. (2003). Development and validation of an international appraisal instrument for assessing the quality of clinical practice guidelines: The AGREE project. Quality and Safety in Health Care, 12, 18–23.

Trizano-Hermosilla, I., & Alvarado, J.M. (2016). Best alternatives to Cronbach's alpha reliability in realistic conditions: Congeneric and asymmetrical measurements. Frontiers in Psychology, 7, 769.

Turner-Stokes, L., Pick, A., Nair, A., Disler, P.B., & Wade, D.T. (2015). Multi-disciplinary rehabilitation for acquired brain injury in adults of working age. The Cochrane Library, Issue 12. Art. No.: CD004170.

Valderas, J.M., Ferrer, M., Mendívil, J., Garin, O., Rajmil, L., Herdman, M., & Alonso, J. (2008). Development of EMPRO: A tool for the standardized assessment of patient-reported outcome measures. Value in Health, 11(4), 700–708.
