Explaining the Weak Relationship Between Job Performance and Ratings of Job Performance

Kevin R. Murphy

doi:10.1111/j.1754-9434.2008.00030.x

Published online by Cambridge University Press: 07 January 2015

Affiliation: The Pennsylvania State University
Correspondence: E-mail: [email protected]. Address: Department of Psychology, The Pennsylvania State University, Moore Building, University Park, PA 16802

Abstract

Ratings of job performance are widely viewed as poor measures of job performance. Three models of the performance–performance rating relationship offer very different explanations and solutions for this seemingly weak relationship. One-factor models suggest that measurement error is the main difference between performance and performance ratings and they offer a simple solution—that is, the correction for attenuation. Multifactor models suggest that the effects of job performance on performance ratings are often masked by a range of systematic nonperformance factors that also influence these ratings. These models suggest isolating and dampening the effects of these nonperformance factors. Mediated models suggest that intentional distortions are a key reason that ratings often fail to reflect ratee performance. These models suggest that raters must be given both the tools and the incentive to perform well as measurement instruments and that systematic efforts to remove the negative consequences of giving honest performance ratings are needed if we hope to use performance ratings as serious measures of job performance.
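The correction for attenuation mentioned under one-factor models can be illustrated with the classical Spearman formula, which divides an observed correlation by the square root of the product of the two measures' reliabilities. The sketch below is a minimal illustration only: the function name, parameter names, and the numbers are hypothetical and do not come from the article, and the appropriate reliability estimate for performance ratings is itself contested in the literature.

```python
import math


def correct_for_attenuation(r_observed, rel_x, rel_y=1.0):
    """Classical correction for attenuation.

    r_observed: observed correlation between two measures
    rel_x, rel_y: reliability estimates of each measure, in (0, 1]
    Returns the estimated correlation between the underlying true scores.
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must be in the interval (0, 1]")
    return r_observed / math.sqrt(rel_x * rel_y)


# Hypothetical values: an observed validity of .30 against ratings
# whose reliability is assumed to be .64 (the other measure is treated
# as perfectly reliable for simplicity).
corrected = correct_for_attenuation(0.30, rel_x=0.64)
print(round(corrected, 3))
```

The correction only removes the effect of random measurement error. It does nothing about the systematic nonperformance factors or intentional distortions that the multifactor and mediated models emphasize.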

Type: Focal Article
Information: Industrial and Organizational Psychology, Volume 1, Issue 2, June 2008, pp. 148–160

DOI: https://doi.org/10.1111/j.1754-9434.2008.00030.x
Copyright: Copyright © Society for Industrial and Organizational Psychology 2008

Footnotes

Department of Psychology, The Pennsylvania State University
