Many examples of calibration in climate science raise no alarms regarding model reliability. We examine one such example and show that, in employing classical hypothesis testing, it involves calibrating a base model against data that are also used to confirm the model. This runs counter to the 'intuitive position' (in favor of use-novelty and against double-counting). We argue, however, that aspects of the intuitive position are upheld by some methods, in particular the general cross-validation method. We also discuss how cross-validation relates to other prominent classical methods such as the Akaike information criterion and the Bayesian information criterion.
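To make the contrast between the methods mentioned above concrete, the following is a minimal, self-contained sketch (not the paper's climate example; the synthetic polynomial data and all function names are our own) of how leave-one-out cross-validation scores a model by withholding each data point from calibration before predicting it, whereas AIC and BIC are computed from a fit to the full data set, penalized by parameter count:

```python
# Illustrative sketch only: leave-one-out cross-validation (LOO-CV) for
# choosing a polynomial degree, compared with AIC and BIC under a Gaussian
# error assumption. All data here are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n = 40
x = np.linspace(0, 1, n)
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(scale=0.1, size=n)  # "truth" + noise

def loo_cv_mse(x, y, degree):
    """Mean squared prediction error; each point is predicted from the rest,
    so no datum is used both to calibrate and to assess the model."""
    errs = []
    for i in range(len(x)):
        coeffs = np.polyfit(np.delete(x, i), np.delete(y, i), degree)
        errs.append((y[i] - np.polyval(coeffs, x[i])) ** 2)
    return float(np.mean(errs))

def aic_bic(x, y, degree):
    """AIC = 2k - 2 ln L_max, BIC = k ln n - 2 ln L_max, assuming Gaussian
    residuals; here the full data set is used for fitting."""
    n = len(x)
    k = degree + 2  # polynomial coefficients plus the noise variance
    resid = y - np.polyval(np.polyfit(x, y, degree), x)
    sigma2 = np.mean(resid**2)  # maximum-likelihood variance estimate
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return 2 * k - 2 * log_lik, k * np.log(n) - 2 * log_lik

for d in range(1, 6):
    aic, bic = aic_bic(x, y, d)
    print(f"degree {d}: LOO-CV MSE={loo_cv_mse(x, y, d):.5f}  "
          f"AIC={aic:.1f}  BIC={bic:.1f}")
```

The design difference on display is the one at issue in the abstract: cross-validation partially separates the data used for calibration from the data used for assessment, while the information criteria use all data for calibration and correct for this with an explicit complexity penalty.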