1. Eight experienced judges used a 20-point scoring scale in different ways when assessing the ‘proportion of lean to fat’ from the same photographs of cut bacon sides presented in the same arrangements. The average score for the 220 judgements made during the experiment by a single judge, which represents his over-all level of scoring or ‘standard of judging’, ranged from 12·0 to 13·7. The standard deviation, a measure of the degree of discrimination attempted by the judge, ranged from 3·26 to 4·74.
2. The judges varied in their consistency of assessment or ‘repeatability’. The standard error for a single judge, measuring the extent of the variation of his scores within photographs, ranged from 0·89 to 1·99. The judges who attempted to discriminate most between photographs tended to be the least consistent in their judgements, although one judge was a notable exception to this trend.
3. The consistency of judgement tended to decrease from the good (high-scoring) rashers to the poor (low-scoring) rashers, but this effect was more marked for some judges than for others.
4. Some of the variation in the scores awarded to each photograph was due to alterations in the standard of judging from batch to batch. When this was allowed for, the standard errors were all reduced and ranged from 0·61 to 1·31. The judges tended to adjust their standards according to the average quality of the batch being assessed. This led to the variation among the average scores for the batches being less than it would have been had their standards remained constant.
5. Alterations in the standard of judging during the experiment affected the scores awarded to good (high-scoring) rashers rather less than those awarded to poor (low-scoring) ones.
6. The correlations between the individual judge's mean scores for the forty-four photographs and the over-all mean scores were very high. For seven judges, they ranged from 0·962 to 0·984, whereas for the eighth judge the correlation was 0·918. The lower correlation for this judge was due to two rashers with very thin fat being heavily marked down.
7. The correlations between the over-all mean scores and two different combinations of three objective measurements were both about 0·92; these measurements were therefore slightly less closely related to the over-all scores than were an individual judge's mean scores.
8. The possibilities of making the experimental technique more realistic and of improving the precision of such visual judgements by providing photographic scales of reference are discussed.
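
The per-judge quantities named in the summary (standard of judging, discrimination, repeatability, and the correlation of a judge's photograph means with the over-all means) can be illustrated with a minimal sketch. It is not the authors' computation: the function names, the data layout and the toy scores are hypothetical, the standard deviation is assumed to be taken over all of a judge's scores, and the repeatability is assumed to be the pooled within-photograph standard deviation.

```python
from math import sqrt
from statistics import mean, stdev


def judge_summary(scores):
    """Summarise one judge (hypothetical helper).

    scores: dict mapping a photograph id to that judge's repeated scores.
    """
    all_scores = [s for reps in scores.values() for s in reps]
    # Pooled within-photograph sum of squares and its degrees of freedom.
    ss_within = sum((s - mean(reps)) ** 2 for reps in scores.values() for s in reps)
    df_within = sum(len(reps) - 1 for reps in scores.values())
    return {
        "standard": mean(all_scores),             # over-all level of scoring
        "discrimination": stdev(all_scores),      # assumed: SD of all scores
        "repeatability": sqrt(ss_within / df_within),  # assumed: pooled within-photograph SD
    }


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy data, invented for illustration only: three photographs, two scores each.
judge_a = {"p1": [15, 17], "p2": [10, 12], "p3": [5, 7]}
summary = judge_summary(judge_a)

# Correlate the judge's photograph means with hypothetical over-all means.
judge_means = [mean(reps) for reps in judge_a.values()]
overall_means = [15, 11, 7]
r = pearson(judge_means, overall_means)
```

With the real data, each judge's dictionary would hold forty-four photographs; 220 judgements per judge implies five scores per photograph, although the summary does not state the layout explicitly. Allowing for batch-to-batch changes in standard, as in point 4, would need a further term for batches that this sketch omits.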