Does effectiveness in performance appraisal improve with rater training?

Authors: Christian Rosales Sánchez; Dolores Díaz-Cabrera; Estefanía Hernández-Fernaud
Author affiliation: Universidad de La Laguna, Tenerife, Islas Canarias, España
Published in: PLoS ONE 14(9)
Category: Research Article
doi: 10.1371/journal.pone.0222694


Performance appraisal is a complex process by which an organization can determine the extent to which employees are performing their work effectively. However, this appraisal may not be accurate unless the impact of potentially subjective rater judgements is reduced. The main objective of this work is to test the effectiveness, separately and jointly, of the following four training programmes from the extant literature aimed at improving the accuracy of performance assessment: 1) Performance Dimension Training, 2) Frame-of-Reference, 3) Rater Error Training, and 4) Behavioural Observation Training. Based on these training strategies, three programmes were designed and applied separately. A fourth programme was a combination of the other three. In two studies using different samples (85 students and 42 employees), we analyzed whether levels of knowledge of performance and its dimensions, rater errors, observational accuracy, and accuracy of task and citizenship performance appraisal differ according to the type of training raters receive. The main results are as follows. First, training based on performance dimensions and the creation of a common framework, as well as the training that includes the four programmes (Training_4_programmes), increases the level of knowledge of performance and its dimensions. Second, groups that receive training in rater error score higher in knowledge of biases than the other groups, whether or not those groups have received training. Third, participants’ observational accuracy improves at each successive measurement point (post-training and follow-up), though not because of the type of training received. Fourth, participants trained through the programme that combines the four training strategies gave a task performance appraisal that was closer to that of the expert judges than the other groups did. Finally, students’ citizenship performance appraisal does not vary according to type of training or measurement point, whereas the group of employees who received all four types of training gave a more accurate citizenship performance assessment.

Keywords:

Social sciences – Economics – Labor economics – Employment – Jobs – Research and analysis methods – Research design – Survey research – Questionnaires – Research assessment – Mathematical and statistical techniques – Statistical methods – Metaanalysis – Biology and life sciences – Neuroscience – Cognitive science – Cognition – Memory – Memory recall – Learning and memory – Psychology – Computer and information sciences – Computer applications – Physical sciences – Mathematics – Statistics



The article was published in PLoS ONE, 2019, Issue 9
