Evaluating and enhancing supervisee competence is a key function of supervision and can be aided by the use of direct assessments of clinical competence, e.g. the Cognitive Therapy Scale – Revised (CTS-R). We aimed to review the literature on inter-rater reliability and rater training for the CTS and CTS-R, and to present exploratory data on training raters to use the CTS-R. We conducted a systematic review of this literature. An exploratory study then evaluated the outcomes of a CTS-R supervisor training workshop (n = 34), including self-reported familiarity with and confidence in using the tool, and inter-rater consistency on three CTS-R subscales, pre- and post-training. In the reviewed literature, CTS and CTS-R inter-rater reliability was variable, with evidence that rater training enhances reliability, although the form, duration and frequency of such training are unclear. The exploratory study found that supervisors rated themselves as more familiar with and more confident in using the CTS-R at the end of training than at the beginning. However, inter-rater reliability was poor both at the beginning and at the end of the training. Rating competence requires supervisors to make qualitative judgements, which are inherently variable. Training raters has been shown to improve rater reliability, although this was not demonstrated in the exploratory study. Practice implications and future research priorities are identified.