EXPLORING THE EXISTENCE OF GRADER BIAS IN BEEF GRADING

JU WON JANG; ARIUN ISHDORJ; DAVID P. ANDERSON; TSENGEG PUREVJAV; GARLAND DAHLKE

doi:10.1017/aae.2017.9


Published online by Cambridge University Press: 02 May 2017


JU WON JANG*: Affiliation:
Department of Agricultural Economics, Texas A&M University, College Station, Texas
ARIUN ISHDORJ: Affiliation:
Department of Agricultural Economics, Texas A&M University, College Station, Texas
DAVID P. ANDERSON: Affiliation:
Department of Agricultural Economics, Texas A&M University, College Station, Texas
TSENGEG PUREVJAV: Affiliation:
INTI Service Corp., College Station, Texas
GARLAND DAHLKE: Affiliation:
Department of Animal Science, Iowa State University, Ames, Iowa
*Corresponding author's e-mail: [email protected]


Abstract

The U.S. Department of Agriculture (USDA) beef grading system plays an important role in marketing and promoting beef. USDA graders inspect beef carcasses and determine a quality grade within a few seconds. Although the graders are well trained, the nature of this grading process may lead to grading errors. Significant differences in the USDA graders’ “called” and “camera-graded” quality grades were observed, as well as variations in quality grades across seasons and years. Under grid pricing, producers gained financially from grades called by USDA graders rather than grades measured by cameras.

Keywords

Beef grading system; grader bias; marbling score; quality grade; Q13; Q18

Type: Research Article
Information: Journal of Agricultural and Applied Economics, Volume 49, Issue 3, August 2017, pp. 467-489

DOI: https://doi.org/10.1017/aae.2017.9
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: Copyright © The Author(s) 2017

1. Introduction

The U.S. Department of Agriculture (USDA) beef carcass grading standards comprise quality grades and yield grades, which are designed to assess the eating quality of a carcass and the amount of lean edible meat it yields, respectively. Producers use these grades to predict roughly the market value of the cattle they sell to packers, and the grading system gives them a financial incentive to produce high-quality, good-tasting beef. Consumers make informed purchasing decisions using USDA quality grades and labels. In short, the system simplifies the marketing process and makes communication among producers, packers, and consumers easier (Field, 2007).

The integrity of the beef grading system depends on accurate and precise grading. In practice, however, graders employed by the USDA determine carcass grades by a brief visual inspection that takes only a few seconds. Although USDA graders are well trained and independent of both producers and packers, the nature of the grading process could lead to grading errors. These errors could diminish the incentive to produce a higher-quality product (Chalfant and Sexton, 2003). When quality grades called by USDA graders are lower than actual quality grades, cattle producers take a loss on transactions with packers. Beef consumers, in turn, pay more (or less) than the actual value of the beef because of the grading errors. Grading errors in quality grades therefore impede communication among beef consumers, producers, and packers. The influence of grading errors on the efficiency of the market and promotion of beef can be minimized if the errors are not systematically biased across time and location (Hueth, Marcoul, and Lawrence, 2007). Grading accuracy and consistency, thus, are crucial for improving producers’, packers’, and consumers’ confidence in the efficiency of the beef marketing system.

In 2006, two camera-based grading systems were approved by the USDA in order to improve beef carcass grading accuracy and uniformity within the industry.¹ In August 2014, the USDA Agricultural Marketing Service (AMS) sought public input on possible revisions to the U.S. Standards for Grades of Carcass Beef (USDA-AMS, 2014) to help adjust for recent improvements and trends in the raising and feeding of cattle. Although the USDA-AMS has been working on improving the accuracy of beef grading, relatively few studies have examined the presence and sources of grader bias. Mafi, Harsh, and Scanga (2014) documented that cameras/instruments were more accurate and consistent than the USDA graders in assessing marbling score to determine quality grade. They also found that cameras/instruments reduced grader-to-grader and plant-to-plant variations. Hueth, Marcoul, and Lawrence (2007) used a behavioral model and showed the existence of grader bias in assigning yield grade. They defined grading as biased when the distribution of the “true” yield grade (the grade that should be assigned according to the USDA standards) systematically differs from that of the “called” yield grade (the grade actually assigned by a USDA grader). To measure the divergence of the two distributions, they estimated a mean and variance of the true yield index² and compared them with the sample mean and standard deviation of the index. They also estimated cutoff values for each yield grade to capture the USDA graders’ behavior.

The current study builds on the previous literature by looking for evidence of the existence and possible sources of grading errors using data from two large-scale Midwest packing plants from 2005 through 2008. The data on quality grade called by USDA graders (“called” quality grade) and “camera-graded” quality grade of each carcass were provided along with year, month, and day of the week when cattle were processed.

The specific objectives of this study are threefold. First, we analyzed the difference between called and camera-graded quality grades. Then, using these given quality grades, we estimated the cutoff points for each quality grade (e.g., Choice or Select) and compared them with the USDA standards cutoff points for each grade. From the analysis, we expected to find possible sources of grading errors. A well-known error in ratings is “central tendency bias.” Central tendency bias would appear in beef grading if USDA graders do not follow the USDA standards and instead tend to call grades close to the mean and avoid calling extreme grades. Second, we further investigated the patterns of grading errors by estimating seasonal and annual cutoff points, which extends the existing literature. Existing research documented seasonal changes in beef carcass characteristics (Gray et al., 2012), the number of cattle marketed, and consumer demand (McCully, 2015). The patterns of estimated intervals for quality grades across seasons and years were compared with the Choice-Select spread, physical characteristics of beef carcasses, and the number of slaughter cattle processed in order to help identify possible sources of grading errors. Finally, because the USDA intends to utilize the camera grading system more widely in the future, it is worthwhile to analyze and discuss the impact of potential changes on producers and packers. For this analysis, weekly weighted averages of premiums and discounts for each quality grade were collected from USDA-AMS (2005–2008) 5-Area Weekly Direct Slaughter Cattle Reports. The premium and discount data, along with called and camera-graded quality grades, allow the measurement of the financial impact of fully utilizing the camera grading system on cattle producers and packers.

To our knowledge, this is the first study that quantifies the variations in beef quality grading by USDA graders and camera systems across seasons and years. In addition, we address the impact of increased utilization of cameras in grading on cattle producers and packers. These analyses were possible because the data used contain a much larger number of observations over the years than those in earlier studies (Hueth, Marcoul, and Lawrence, 2007; Mafi, Harsh, and Scanga, 2014).

A few points must be made about the terminology and assumptions used in this research. The term grader bias in this article does not imply deception or dishonesty, but simply that the called quality grade is different from the USDA standards. In this article, we assume that the camera-graded quality grade is not identical to the true USDA quality grade. Grades determined by cameras can be biased because of the initial settings, sensitivity, accuracy, and errors related to calibration of cameras (Mafi, Harsh, and Scanga, 2014; Moore et al., 2010). Furthermore, part of our data was collected before the camera grading system was officially approved by the USDA in 2006. Given these potential errors in camera grading, the quality grade measured by the camera is unlikely to be identical to the true quality grade. These factors led us to develop a different behavioral model from the model developed in Hueth, Marcoul, and Lawrence (2007).

2. Model

There are eight USDA quality grades: Prime, Choice, Select, Standard, Commercial, Utility, Cutter, and Canner. The factors that are used to determine the quality grade are the degree of marbling and the maturity class, which are classified into nine and five different levels,³ respectively. The degree of marbling and the maturity class are combined to determine the final quality grade (Hale, Goodson, and Savell, 2013). When slaughter cattle are processed before 42 months of age, their carcasses are categorized as Prime, Choice, Select, or Standard according to marbling score. If slaughter cattle are processed after 42 months of age, the carcasses are graded as Commercial, Utility, Cutter, or Canner. The USDA graders subjectively determine both maturity and marbling class based on the descriptions and illustrations provided in the standards and their own practical work experiences.

Results of the 2005 National Beef Quality Audit determined that more than 97% of carcasses in U.S. fed beef plants were classified as A-level maturity (9–30 months) (Garcia et al., 2008). Hence, in this study we assume that maturity class was A (9–30 months) or B (30–42 months). Given the maturity class, the primary determinant of quality grade will be the marbling score. The analysis in this study includes only beef carcasses that were graded as Prime, Choice, Select, or Standard. Given this restriction, a model that uses marbling score as the determinant of the quality grade is specified as follows.

Let MSI_k be the marbling score interval for quality grade k. These intervals allow us to express quality grade in a functional form:

(1)

$$\begin{eqnarray} \text{Quality grade} &=& \{ k \mid \text{marbling score} \in {\rm MSI}_{k},\nonumber\\ k &=& \text{Prime, Choice, Select, Standard} \mid \text{maturity} \le 42\ \text{months}\} . \end{eqnarray}$$

Let c_i be a called quality grade, m_i be a camera-graded quality grade, and t_i be a true quality grade for a carcass i. True quality grade is unobserved. Using these definitions, the called and true quality grades can be expressed as follows:

(2)

$$\begin{equation} {c_i} = {m_i} + {u_i},{u_i}\ \sim N(0,\sigma _u^2),{t_i} = {m_i} + {v_i},{v_i}\ \sim N(0,\sigma _v^2), \end{equation}$$

where u_i and v_i are error terms for called and true quality grades, respectively. We assume that error terms are distributed normally with mean zero and standard deviations, σ_u and σ_v. This assumption allows the use of a likelihood function to estimate cutoff points and standard errors.

The USDA standard marbling score intervals ($\widehat{\rm MSI}_{k}$) for each quality grade are $\widehat{\rm MSI}_{\rm Prime} = [8.0, +\infty)$, $\widehat{\rm MSI}_{\rm Choice} = [5.0, 8.0)$, $\widehat{\rm MSI}_{\rm Select} = [4.0, 5.0)$, and $\widehat{\rm MSI}_{\rm Standard} = (-\infty, 4.0)$. The interval $\widehat{\rm MSI}_{\rm Prime}$ means that USDA graders should call Prime when an observed marbling score is greater than or equal to 8.0. Other quality grades should be called in a similar way, in that a grade is called when the marbling score falls within the indicated interval.
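The mapping from marbling score to quality grade under the USDA standard intervals can be sketched in a few lines of Python. This is only an illustration of the half-open intervals listed above; the function name `quality_grade` and the dictionary layout are assumptions made for the sketch, not part of the original study.

```python
# Lower cutoff of each grade under the USDA standards quoted in the text.
# Standard has no lower cutoff: it covers every score below the Select cutoff.
USDA_CUTOFFS = {"Select": 4.0, "Choice": 5.0, "Prime": 8.0}


def quality_grade(marbling_score, cutoffs=USDA_CUTOFFS):
    """Return the grade whose half-open interval [lower, upper) holds the score.

    Hypothetical helper for illustration; assumes maturity of 42 months or less.
    """
    if marbling_score >= cutoffs["Prime"]:
        return "Prime"
    if marbling_score >= cutoffs["Choice"]:
        return "Choice"
    if marbling_score >= cutoffs["Select"]:
        return "Select"
    return "Standard"


if __name__ == "__main__":
    for score in (3.9, 4.0, 5.0, 7.9, 8.0):
        print(score, quality_grade(score))
```

Because the intervals are closed on the left, a score exactly equal to a cutoff receives the higher grade.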

Because our data indicate that the called quality grade is not identical to the camera-graded quality grade, we presume that the USDA graders have their own marbling score intervals, which could be different from those of the USDA standards. Using this premise, the USDA graders’ marbling score intervals ($\widetilde{\rm MSI}_{k}$) are defined by the following implicit cutoff points ($C_k$, k = Prime, Choice, and Select): $\widetilde{\rm MSI}_{\rm Prime} = [C_{\rm Prime}, +\infty)$, $\widetilde{\rm MSI}_{\rm Choice} = [C_{\rm Choice}, C_{\rm Prime})$, $\widetilde{\rm MSI}_{\rm Select} = [C_{\rm Select}, C_{\rm Choice})$, and $\widetilde{\rm MSI}_{\rm Standard} = (-\infty, C_{\rm Select})$. If these implicit cutoff points are different from those of the USDA standards across time, then we can conclude that grader bias exists.

We assume that the called quality grade and the probability of the called quality grade being the true quality grade are independent. Then the likelihood function can be defined as follows:

(3)

$$\begin{equation} \begin{array}{@{}*{1}{l}@{}} {{L^i}{\rm{(}}{c_i},{\rm{\ }}{m_i}{\rm{\ |\ }}{\sigma _u},{\sigma _v},{\rm{\ }}{C_{{\rm{Prime}}}},{C_{{\rm{Choice}}}},{C_{{\rm{Select}}}})}\\[6pt] {{\rm{\ }} = \bm{I}\left( {{c_i} = {\rm{Standard}}} \right)\left\{ {{\rm{\Phi }}\left( {\frac{{{C_{{\rm{Select}}}} - {m_i}}}{{{\sigma _u}}}} \right) \times {\rm{\Phi }}\left( {\frac{{4 - {m_i}}}{{{\sigma _v}}}} \right)} \right\}}\\[6pt] { \times \bm{I}\left( {{c_i} = {\rm{Select}}} \right)\left\{ {\left[ {{\rm{\Phi }}\left( {\frac{{{C_{{\rm{Choice}}}} - {\rm{\ }}{m_i}}}{{{\sigma _u}}}} \right) - {\rm{\Phi }}\left( {\frac{{{C_{{\rm{Select}}}} - {\rm{\ }}{m_i}}}{{{\sigma _u}}}} \right)} \right] \times \left[ {{\rm{\Phi }}\left( {\frac{{5{\rm{\ }} - {\rm{\ }}{m_i}}}{{{\sigma _v}}}} \right) - {\rm{\Phi }}\left( {\frac{{4{\rm{\ }} - {\rm{\ }}{m_i}}}{{{\sigma _v}}}} \right)} \right]} \right\}}\\[6pt] { \times \bm{I}\left( {{c_i} = {\rm{Choice}}} \right)\left\{ {\left[ {{\rm{\Phi }}\left( {\frac{{{C_{{\rm{Prime}}}} - {m_i}}}{{{\sigma _u}}}} \right) - {\rm{\Phi }}\left( {\frac{{{C_{{\rm{Choice}}}} - {m_i}}}{{{\sigma _u}}}} \right)} \right] \times \left[ {{\rm{\Phi }}\left( {\frac{{8 - {m_i}}}{{{\sigma _v}}}} \right) - {\rm{\Phi }}\left( {\frac{{5 - {m_i}}}{{{\sigma _v}}}} \right)} \right]} \right\}}\\[6pt] { \times \bm{I}\left( {{c_i} = {\rm{Prime}}} \right)\left\{ {\left[ {1 - {\rm{\Phi }}\left( {\frac{{{C_{{\rm{Prime}}}} - {\rm{\ }}{m_i}}}{{{\sigma _u}}}} \right)} \right] \times \left[ {1 - {\rm{\Phi }}\left( {\frac{{8{\rm{\ }} - {\rm{\ }}{m_i}}}{{{\sigma _v}}}} \right)} \right]} \right\},} \end{array} \end{equation}$$

where I(·) is an indicator function, and Φ(·) is the cumulative distribution function of the standard normal distribution. The likelihood function is derived from the assumption that the USDA graders call quality grade to maximize the probability of calling the true quality grade by using their own implicit intervals. Because the true quality grade is unknown to USDA graders, they call quality grade using visual inspection and their own implicit cutoff points. A log transformation of the likelihood function was used in the maximum likelihood estimation process. The estimated cutoff points provide information about grading behavior of USDA graders in assigning quality grades.
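A minimal Python sketch of the log-likelihood may help make equation (3) concrete. It rests on one stated assumption: for each carcass, only the bracketed term whose indicator matches the called grade contributes. The function names, the data layout (pairs of called grade and camera marbling score), the numerical floor, and the parameter values in the usage lines are all hypothetical and are not taken from the authors' estimation code.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# USDA standard intervals, used for the sigma_v part of each term.
USDA_BOUNDS = {
    "Standard": (-math.inf, 4.0),
    "Select": (4.0, 5.0),
    "Choice": (5.0, 8.0),
    "Prime": (8.0, math.inf),
}


def interval_prob(lower, upper, m, sigma):
    """Phi((upper - m) / sigma) - Phi((lower - m) / sigma)."""
    return phi((upper - m) / sigma) - phi((lower - m) / sigma)


def log_likelihood(data, sigma_u, sigma_v, c_select, c_choice, c_prime):
    """Sum of log contributions over (called_grade, camera_score) pairs.

    Assumption of this sketch: each observation contributes only the term
    selected by its called grade.
    """
    grader_bounds = {
        "Standard": (-math.inf, c_select),
        "Select": (c_select, c_choice),
        "Choice": (c_choice, c_prime),
        "Prime": (c_prime, math.inf),
    }
    total = 0.0
    for called, m in data:
        g_lo, g_hi = grader_bounds[called]
        s_lo, s_hi = USDA_BOUNDS[called]
        p = interval_prob(g_lo, g_hi, m, sigma_u) * interval_prob(s_lo, s_hi, m, sigma_v)
        total += math.log(max(p, 1e-300))  # floor guards against log(0)
    return total


if __name__ == "__main__":
    # Hypothetical observations and standard deviations, for illustration only.
    sample = [("Choice", 4.7), ("Select", 3.6), ("Prime", 9.2)]
    print(log_likelihood(sample, 0.5, 0.5, 3.18, 4.50, 8.90))
```

In practice this function would be passed to a numerical optimizer to obtain the cutoff points and standard deviations; the optimizer is omitted here to keep the sketch dependency-free.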

3. Data

The data used in the analysis provide information on called and camera-graded quality grades of beef carcasses from May 2005 to October 2008. Figure 1 presents the distribution of called quality grade for the entire sample (n = 134,451⁴) and shows that 94.4% of beef carcasses were graded Choice or Select. Although the called quality grade was available for the entire sample, the camera-graded quality grade was only available for the subsample of the data (n = 18,080). Because the values for both called and camera-graded quality grades are required to estimate the implicit cutoff points, the subsample (n = 18,080) of the entire data (n = 134,451) was used in estimating the cutoff points and conducting premium-discount analysis.

Figure 1. The Distribution of Quality Grade (n = 134,451, the number of head, percent of total graded in parentheses)

In our data, called marbling grades were reported as USDA quality grades (Prime, Choice, Select, or Standard), and camera-graded marbling scores were reported as numeric values (e.g., 5.0 for small) for some of the carcasses in our sample and as a degree of marbling (e.g., small 20) for the remaining carcasses. To make the marbling measurements consistent across carcasses and to estimate the cutoff points, we converted each degree of marbling into a numeric marbling score. Figure 2 shows the distribution of the numeric (camera-graded) marbling scores. Each number on the horizontal axis of Figure 2 corresponds to a degree of marbling score.

Figure 2. The Distribution of the Numeric (Camera-Graded) Marbling Score (n = 18,080)

As shown in Figures 1 and 3, the distributions of called quality grade from the entire sample (n = 134,451) and the subsample (n = 18,080) used in the analysis were similar. Both distributions show that most carcasses were graded as Choice or Select. The distribution from the entire sample (subsample) shows that the USDA graders graded 67.3% (70.2%) and 27.1% (27.2%) of carcasses as Choice and Select, respectively. The National Summary of Meats Graded Reports announced by the USDA-AMS (2015) at the beginning of each year showed that most carcasses were graded either Choice or Select (Table 1). Although the distributions for called and camera-graded quality grades differ somewhat, in percentage terms, from the national averages reported in Table 1, the shapes of both distributions are similar to the national summary, indicating that the sample data used in the analysis closely represent the national-level data.

Figure 3. The Distribution of Called and Camera-Graded Quality Grade (n = 18,080, the number of head, percent of total graded in parentheses)

Table 1. National Summary of Meat Graded (million pounds, percent of total graded in parentheses)

Source: USDA-AMS (http://www.ams.usda.gov/reports/meat-grading).

As shown in Figure 3, 70.2% (27.2%) of carcasses were graded as Choice (Select) by the USDA graders, whereas 51.1% (35.8%) of the carcasses were graded as Choice (Select) by cameras. This indicates that the USDA graders tend to call more Choice and less Select compared with cameras. Also, the two distributions show that the USDA graders were more generous in grading carcasses compared with cameras.

The distributions of camera-graded quality grade given called quality grade are shown in Figure 4. These conditional distributions allow us to analyze the differences between called and camera-graded quality grades. If there were no divergences between these two grades, then all the carcasses called as Prime by the USDA graders should be graded as Prime by the cameras. However, as shown in Figure 4, out of 395 beef carcasses that were graded as Prime by the USDA graders, only 144 carcasses (36.5%) were graded as Prime by the cameras, and the remaining 251 carcasses (63.5%) were graded as Choice. Furthermore, from all the carcasses graded as Choice by the USDA graders, 33.4% were graded as Select by the cameras. In the case of Select, 42.6% were graded as Standard by the cameras. These conditional distributions suggest that noticeable differences exist between called and camera-graded quality grades, except for Standard, and that the cameras generally assigned lower quality grades than the USDA graders.
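The conditional shares reported for carcasses called Prime can be reproduced directly from the two counts given above. The sketch below is only a worked check of that arithmetic; the helper name `conditional_shares` is hypothetical.

```python
def conditional_shares(counts):
    """Convert camera-grade counts within one called grade to percentages."""
    total = sum(counts.values())
    return {grade: 100.0 * n / total for grade, n in counts.items()}


# Camera grades of the 395 carcasses that USDA graders called Prime (from the text).
called_prime = {"Prime": 144, "Choice": 251}

if __name__ == "__main__":
    for grade, share in conditional_shares(called_prime).items():
        print(f"{grade}: {share:.1f}%")
```

The same helper applies to any row of a called-by-camera cross-tabulation once the counts are available.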

Figure 4. The Distribution of Camera-Graded Quality Grade Given Called Quality Grade (n = 18,080, the number of head, percent of total graded in parentheses)

Figure 5 illustrates the distribution of called quality grade given camera-graded quality grade. Almost all beef carcasses graded as Choice by the cameras were graded as Choice by the USDA graders. A similar pattern was observed for Prime, where a majority of carcasses graded as Prime by the cameras were also graded as Prime by the USDA graders. However, of all carcasses graded as Select by the cameras, 65.6% were graded as Choice by the USDA graders, and 96.1% of all carcasses graded as Standard by the cameras were graded as Select by the USDA graders. The comparison of conditional distributions in Figure 5 indicates that the difference between called and camera-graded quality grades was smaller when the USDA graders assessed Choice grade carcasses, but this was not the case for the other quality grade carcasses. This smaller divergence for Choice quality grade could be explained by Piazza and Izard's (2009) findings: the more often humans are exposed to a number of objects or a sequence, the more likely they are to reproduce it accurately. As shown in Figure 3, 70.2% of carcasses in our observations were graded as Choice by the USDA graders. This could indicate that the USDA graders were more accurate in assessing Choice grade carcasses because of repeated exposures to Choice grade carcasses.

Figure 5. The Distribution of Called Quality Grade Given Camera-Graded Quality Grade (n = 18,080, the number of head, percent of total graded in parentheses)

The distributional analyses in this section are not sufficient to confirm grader bias on the part of the USDA graders, because the camera-graded quality grade can itself differ from the true USDA standard quality grade owing to calibration errors or initial camera settings. The implicit cutoff points for each quality grade, thus, were estimated to further analyze the existence of grader bias and explore possible sources of the bias.

4. Results

4.1. Subsample Analysis

Cutoff points for each quality grade were estimated using equation (3) to identify the implicit USDA graders’ interval. The existence of grader bias can be checked by comparing the estimated and USDA standards cutoff points. As shown in Table 2, the estimated cutoff point for Prime was 8.90, which was greater than the USDA standards cutoff point of 8.00. The estimated interval for Prime [8.90, +∞) indicates that the USDA graders called Choice when the marbling score was between 8.00 and 8.90, that is, the USDA graders applied a higher standard for Prime than the USDA standards require.

Table 2. Estimates of Standard Errors (σ_u, σ_v) and Cutoff Values (C_k)

Note: Standard errors in parentheses; all estimated parameters were significant at the 1% level.

Table 2 also shows that the estimated cutoff point for Choice was 4.50, which was lower than the cutoff point of 5.00 for Choice defined by the USDA standards. This difference between the two cutoff points indicates that the USDA graders called Choice instead of Select when the marbling score was between 4.50 and 5.00.⁵ The estimated cutoff points also identify the estimated implicit interval for Choice as [4.50, 8.90). This interval is much wider than the one from the USDA standards for Choice [5.00, 8.00), indicating that the USDA graders had a tendency to call more Choice.

The estimated cutoff point for Select was 3.18. This value is smaller than 4.00, the value from the USDA standards for Select. Using the estimated cutoff points, the estimated intervals for Select and Standard quality grades were identified as [3.18, 4.50) and (−∞, 3.18), respectively. These intervals indicate that USDA graders called Select when the marbling score was between 3.18 and 4.00, below the USDA standards cutoff point for Select, again indicating that the USDA graders were generous in grading beef carcasses with less marbling.
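To see what the estimated cutoffs imply for individual carcasses, the short Python sketch below grades the same marbling scores under the USDA standards and under the full-subsample estimates quoted above (3.18, 4.50, 8.90). The three example scores and the helper names are hypothetical and chosen only to fall in the zones where the two sets of cutoffs disagree.

```python
USDA = {"Select": 4.00, "Choice": 5.00, "Prime": 8.00}
ESTIMATED = {"Select": 3.18, "Choice": 4.50, "Prime": 8.90}  # estimates quoted in the text


def grade(score, cutoffs):
    """Assign the grade whose half-open interval [lower, upper) holds the score."""
    if score >= cutoffs["Prime"]:
        return "Prime"
    if score >= cutoffs["Choice"]:
        return "Choice"
    if score >= cutoffs["Select"]:
        return "Select"
    return "Standard"


def interval_width(cutoffs, lower_grade, upper_grade):
    """Width of a bounded interval, e.g. Choice runs from C_Choice to C_Prime."""
    return cutoffs[upper_grade] - cutoffs[lower_grade]


if __name__ == "__main__":
    for score in (3.5, 4.7, 8.4):  # hypothetical marbling scores
        print(score, grade(score, USDA), grade(score, ESTIMATED))
    for name, cut in (("USDA", USDA), ("estimated", ESTIMATED)):
        print(name,
              "Choice width:", round(interval_width(cut, "Choice", "Prime"), 2),
              "Select width:", round(interval_width(cut, "Select", "Choice"), 2))
```

Under these cutoffs the bounded Choice and Select intervals are both wider than their USDA counterparts, which is the pattern the text attributes to the graders.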

Potential sources of grader bias could be identified by comparing the estimated and USDA standards intervals across quality grades. Although the estimated intervals for Prime and Standard were narrower than the USDA standards intervals, the estimated intervals for Choice and Select were wider than the USDA standards intervals. This nonconformity can be explained by a central tendency bias, which has been studied mostly by educational theorists. Saal, Downey, and Lahey (1980) define this bias as a rater's (grader's) tendency to restrict the range of scores around a mean and to avoid awarding extreme scores. Existing studies in the field (Engelhard, 1994; Leckie and Goldstein, 2011; Myford and Wolfe, 2009) have found that there is a central tendency to a rater's scoring. Beef grading behavior is very similar to scoring behavior in schools. Both USDA graders and raters, although well trained, are human beings and evaluate subjects based on their subjective observations with given grading standards. These similarities have led us to consider the central tendency bias as the potential source of grader bias in beef carcass grading.

The narrow estimated intervals for Prime and Standard quality grades mean that the USDA graders tend to avoid calling extreme grades. The wider intervals for Choice and Select mean that graders preferred to call the quality grade around the mean marbling score of 5.10 for our sample (Table 3). These results indicate that USDA graders tend to call central grades and avoid calling extreme grades (i.e., Prime and Standard). These grading patterns are evidence of the central tendency bias in beef carcass grading.

Table 3. Summary Statistics of Marbling Score

A reason for the central tendency bias in beef carcass grading may be found in the economic impact of quality grade on producers and packers. Producers can receive a premium or discount based on the quality grade of a beef carcass, if slaughter cattle were sold or priced based on their eventual grade. As shown in Table 4, Choice grade carcasses do not receive any premium or discount when priced based on a grid pricing system.⁶ Under grid pricing, calling Choice is a way to make a smaller impact on the financial rewards/losses of producers and packers. Moreover, calling central grades, especially Choice, may be a way to avoid complaints from producers and packers. If USDA graders call extreme grades (Prime and Standard) more frequently, the probability of receiving complaints and regrading requests could be higher. Because USDA graders are independent of producers and packers, they may have no intention of affecting the profit margin of either producers or packers through their grading. According to Hueth, Marcoul, and Lawrence (2007), packing plants hire a “tagger” who identifies grader miscalls and requests regrading. With the presence of a tagger, the USDA graders could become more generous in grading and have a tendency to call the central grades (Choice and Select) more often to avoid regrading requests.

Table 4. U.S. Department of Agriculture Reported Average Premiums and Discounts (May 2005–October 2008, $/cwt.)

4.2. Seasonal and Annual Analyses

Dynamics in beef carcass grading were analyzed by estimating seasonal and annual cutoff points and comparing them with the USDA standards. The results reported in Table 2 show that the estimated cutoff points for Prime varied significantly by season. The estimated cutoff point for Prime in the summer was 7.97, which was close to the USDA standards cutoff point for Prime. With respect to other seasons, the estimated cutoff points for Prime were noticeably higher than the USDA standards, indicating that during those seasons the USDA graders were much stricter in grading high-quality beef carcasses compared with summer. The estimated interval for Choice in the summer [4.32, 7.97) was narrower than those for other seasons; however, the cutoff point of 4.32 was smaller compared with the USDA standards of 5.00, indicating that USDA graders were more generous and graded Select carcasses as Choice. The estimated interval for Select was the widest in fall and narrowest in winter. These seasonal differences in the estimated intervals and cutoff points can be caused by many factors, such as seasonality in the Choice-Select spread, the volume of carcasses processed, and the physical characteristics of beef carcasses.

The Choice-Select spread, which is defined as the difference between the Choice and Select wholesale boxed-beef values, is used as an indicator of demand for high-quality beef in the industry (McCully, 2015). For example, when the Choice-Select spread reaches a high level (>$8/cwt.), the industry assumes strong demand for high marbled beef, such as Choice, and when the spread is low (<$3/cwt.), the industry assumes weak demand for Choice (McCully, 2015).
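The rule of thumb above can be written as a small classifier. The thresholds ($8/cwt. and $3/cwt.) come from the text; the function name, the label for the middle range, and the example spreads are assumptions of this sketch, since the source does not say how the industry reads a spread between the two thresholds.

```python
def demand_signal(spread, high=8.0, low=3.0):
    """Classify a Choice-Select spread ($/cwt.) as a demand signal for Choice beef.

    The "intermediate" label is an assumption of this sketch; the text only
    characterizes spreads above $8/cwt. and below $3/cwt.
    """
    if spread > high:
        return "strong"
    if spread < low:
        return "weak"
    return "intermediate"


if __name__ == "__main__":
    for spread in (12.0, 5.0, 2.5):  # hypothetical spreads in $/cwt.
        print(spread, demand_signal(spread))
```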

In this article, the Choice-Select spread data were collected from the USDA-AMS 5-Area Weekly Direct Slaughter Cattle Reports for the period covered in our data and are summarized in Figure 6. As illustrated in Figure 6, the average Choice-Select spread peaked during the cookout month, May, and during the holiday months, November through January, indicating a high demand for Choice beef during these months. The spread decreased significantly after May and after the holidays, indicating lower demand for Choice beef. In particular, the spread in summer (June–August) was lower than those for other seasons. Beef supplies also tend to increase in summer. When comparing the patterns of the Choice-Select spread with those of the estimated intervals, we can argue that the low Choice-Select spread (low demand for Choice beef) influenced the narrow interval for Choice (calling less Choice) in summer compared with other seasons. The similarity in the two patterns suggests that the demand for beef of a specific quality grade possibly influences the grading behavior.

Figure 6. Average Choice-Select Spread during Our Sample Period (May 2005–October 2008) (source: USDA-AMS, 2005–2008)

As reported in Table 3, the majority of cattle in our sample were processed in the spring and summer seasons, 43.1% and 45.1%, respectively, and in 2006 and 2007, indicating that for our sample the volume of slaughter cattle fluctuated greatly by season and year. The seasonality in the number of slaughter cattle processed can be explained by the fact that the majority of calves are born in the spring months, weaned in fall, and either backgrounded or placed on feed during October and November. The majority of these cattle are marketed and slaughtered during the summer months or later of the following year. These trends in seasons and years from our findings are consistent with the national averages reported in the monthly Cattle on Feed report provided by the USDA National Agricultural Statistics Service (USDA-NASS, 2015) and summarized in Figure 7 for the time period covered in our data. Marketing of cattle tends to be highest in May through August of every year, which covers the last month of spring and all the months of summer. During the spring and summer months, the busy time of the year, USDA graders were more generous and were more likely to call Choice when the actual quality grade was Select. The estimated cutoff points for Choice in fall and winter were 4.56 and 4.81, respectively, and were closer to 5.00, the USDA standards cutoff for Choice, compared with spring and summer. Seasonal variations in the number of slaughter cattle processed at the packing plants can influence the grades called by USDA graders. Graders had a tendency to call more central grades during the busy seasons of the year, which may be associated with taking shorter breaks, working longer hours, and/or using more temporary help.

Figure 7. Number of Fed Cattle Marketed on 1,000+ Capacity Feedlots, United States, May 2005–October 2008 (unit: 1,000 head) (source: USDA-NASS, 2015)

We observed seasonal and yearly variations in carcass characteristics such as marbling score (Table 3), rib eye area, fat thickness, and hot carcass weight (Figure 8) in our data. High grain and oilseed prices between 2006 and 2008 increased the cost of production for beef cattle producers. Beef cattle producers can respond to high feed ingredient costs by adjusting the types and amount of ingredients in feed rations, as well as the length of time spent in the feedlot, which in turn can affect the quality grade of slaughter cattle. Other factors, such as age at slaughter and the type of breed, can explain variations in carcass characteristics by season and over time (Gray et al., 2012). These seasonal variations in carcass characteristics can influence the graders’ judgement and serve as one of the potential sources of grader bias in beef carcass grading.

Figure 8. Average Rib Eye Area, Fat Thickness, and Hot Carcass Weight from the Whole Sample (seasonal, May 2005–Oct 2008)

Figure 6 shows that the average Choice-Select spread in 2008 ($5.31/cwt.) was lower than in 2005, 2006, and 2007 ($9.33/cwt., $13.81/cwt., and $9.73/cwt., respectively), indicating lower demand for higher-quality beef in 2008, a period that overlaps with the global financial crisis. During the economic recession, the demand for Choice beef declined, as shown by the decrease in the Choice-Select spread (Figure 6). Changes in demand may influence USDA graders and lead them to call fewer carcasses Choice. The entire sample, thus, is separated into two subsamples (before and during the crisis) to analyze the potential impact of the economic recession on grading behavior.

As shown in Table 2, the interval for Choice during the crisis [5.04, 8.47) was significantly narrower than the one before the crisis [4.20, 8.28). Both the lower and upper cutoff points during the crisis were significantly higher than the cutoff points before the crisis. The estimation results also show that the estimated cutoff points for each quality grade, 3.93, 5.04, and 8.47, were close to the USDA standards, 4.00, 5.00, and 8.00, after the crisis broke out. These results indicate that the USDA graders were more precise and careful when grading. Their possible awareness of higher demand for cheaper beef cuts during the recession might have influenced their grading. It is possible that USDA graders were trying to avoid grading errors to prevent giving financial advantages/disadvantages to either producers or packers.
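To make the comparison of cutoff points concrete, the following minimal sketch maps a marbling score to a quality grade under two sets of cutoffs quoted above: the USDA standards (4.00, 5.00, 8.00) and the cutoffs estimated for the crisis period (3.93, 5.04, 8.47). It assumes that the grade depends on the marbling score alone and that each interval is closed on the left and open on the right, as in [5.04, 8.47). The function and variable names are illustrative and are not taken from the study.

```python
from bisect import bisect_right

# Grades ordered from lowest to highest marbling score.
GRADES = ["Standard", "Select", "Choice", "Prime"]

# Lower cutoffs for Select, Choice, and Prime, as quoted in the text.
USDA_CUTOFFS = [4.00, 5.00, 8.00]      # USDA standards
CRISIS_CUTOFFS = [3.93, 5.04, 8.47]    # estimated, crisis period


def quality_grade(marbling_score, cutoffs):
    """Return the grade whose interval [lower, upper) contains the score."""
    # bisect_right counts how many cutoffs are <= score, so a score that
    # equals a cutoff falls into the higher grade (left-closed interval).
    return GRADES[bisect_right(cutoffs, marbling_score)]


if __name__ == "__main__":
    for score in (3.95, 5.02, 8.20):
        print(score,
              quality_grade(score, USDA_CUTOFFS),
              quality_grade(score, CRISIS_CUTOFFS))
```

Under these assumptions, a carcass with a marbling score of 5.02 is Choice by the USDA standards but falls just below the estimated crisis-period cutoff for Choice, which illustrates how small shifts in the cutoffs change the called grade for carcasses near a boundary.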

4.3. Premiums and Discounts Analysis⁷

The trend in premiums and discounts for each quality grade during our sample period is illustrated in Figure 9. The data were collected from the USDA-AMS (2005–2008) 5-Area Weekly Direct Slaughter Cattle Reports for the period covered in our data. The premiums and discounts for Choice are zero because it serves as a base quality grade from which premiums and discounts are added/subtracted for Prime, Select, and Standard.⁸ In 2008, premiums for Prime decreased, and discounts for Select and Standard also decreased (Figure 9). This means that the premium-discount spread between Prime and Select, as well as between Prime and Standard, became narrower. Because the change in premiums and discounts relates to consumer preferences and packers send signals to producers about the quality of beef demanded through premiums and discounts,⁹ the narrow spread in 2008 implies that consumers preferred less expensive beef instead of high-quality beef as their income declined.

Figure 9. Premiums and Discounts, Weekly Average Direct Beef Carcasses ($/cwt.) (source: USDA-AMS, 2005–2008)

Our quality grade data include weights of each beef carcass. Using weekly weighted averages of premiums and discounts provided by the USDA and camera-graded and called quality grades along with the weight of each carcass from our data set, we were able to calculate the premiums and discounts of camera-graded and called quality grades for each carcass. The difference in camera-graded and called quality grade premiums and discounts indicates how much producers or packers would have financially gained or lost if USDA graders were to be replaced by cameras during our sample period. By measuring this difference, we can forecast how replacing human graders with cameras may influence the future earnings of producers and packers. For example, if camera-graded quality grade discounts were greater than called quality grade discounts, the difference of the discounts provides the amount of money that producers or packers may lose if USDA graders were replaced by cameras. Although the amount of money that producers or packers could have lost is not identical with what they will lose in the future, we could roughly estimate the financial impact of the replacement on producers and packers. However, in the analysis we do not account for the dynamics of the market. If the volume of beef is changed by the full adoption of cameras, then the premium or discount of beef carcasses may be altered. Because we do not account for this change in the analysis, the findings of this section need to be interpreted with caution.
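The per-carcass calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the premium and discount schedule and the three carcasses are hypothetical numbers, carcass weight is assumed to be in pounds (so dollars = $/cwt. × weight / 100), a single schedule stands in for the weekly schedules, and the difference is taken as camera-graded minus called.

```python
from dataclasses import dataclass

# Hypothetical premium/discount schedule for one week, in $/cwt.
# Choice is the base grade, so its premium is zero.
SCHEDULE = {"Prime": 8.0, "Choice": 0.0, "Select": -9.0, "Standard": -16.0}


@dataclass
class Carcass:
    weight_lb: float  # carcass weight in pounds (assumed unit)
    called: str       # grade called by the USDA grader
    camera: str       # grade assigned by the camera system


def grade_value(grade, weight_lb, schedule):
    """Dollar premium (+) or discount (-) for one carcass."""
    return schedule[grade] * weight_lb / 100.0


def camera_minus_called(carcass, schedule):
    """Dollar change if the camera grade replaced the called grade."""
    camera = grade_value(carcass.camera, carcass.weight_lb, schedule)
    called = grade_value(carcass.called, carcass.weight_lb, schedule)
    return camera - called


# Hypothetical carcasses for illustration only.
carcasses = [
    Carcass(800.0, called="Choice", camera="Select"),
    Carcass(750.0, called="Select", camera="Select"),
    Carcass(820.0, called="Choice", camera="Prime"),
]

differences = [camera_minus_called(c, SCHEDULE) for c in carcasses]
total_difference = sum(differences)

if __name__ == "__main__":
    print(differences, total_difference)
```

A negative total means that, for these carcasses, the called grades paid more than the camera grades would have. In an application to weekly data, each carcass would be matched to the schedule of the week in which it was graded.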

Cattle are marketed mainly by three pricing methods: (1) live weight pricing, (2) dressed weight pricing, or (3) grid pricing (Schroeder and Davis, 1998). When slaughter cattle are priced on a live or dressed weight basis, packers and producers negotiate prices based on the expected value of the cattle. The expected value is determined by expected quality and yield grade, weight premiums and discounts, by-products, slaughter costs (sellers generally pay transportation on dressed cattle sales), and the packer's profit. Because packers pay before cattle are graded by the USDA graders, packers can have financial gains if beef carcasses are graded at a higher quality grade than their expected value, and vice versa. Hence, under live and dressed weight pricing methods, only packers’ earnings are influenced by the called quality grade. When slaughter cattle are marketed based on yield and quality grade (i.e., grid pricing), price is based on the called grade of each animal. Under grid pricing, the quality grade and yield grade influence producers’ earnings, unlike live and dressed weight pricing. Therefore, under grid pricing, producers will lose financially when USDA graders call a lower quality grade than the true grade. In the case of the live and dressed weight pricing, packers will lose when USDA graders call a lower quality grade than the expected value for which they paid. Information on pricing method used for each carcass and the expected value of cattle was unavailable to us, so we were not able to calculate the amount of money that each producer and each packer would gain or lose under different pricing methods. We, however, were able to calculate the combined financial gains/losses of producers and packers after replacing human graders with cameras by calculating the difference between called and camera-graded quality grade premiums and discounts. The expected value of the cattle did not affect the calculated difference, because the expected values of camera-graded and called quality grades were identical for each cattle carcass and cancel out when the difference is calculated.

The differences reported in Table 5 were calculated by subtracting the sum of called quality grade premiums and discounts from the sum of camera-graded quality grade premiums and discounts. The average difference in value of −$3.00/cwt. is the amount of money producers and packers would have jointly lost on average per hundredweight of carcass if a camera grading system had been used instead of USDA graders during our sample period.

Table 5. Premiums and Discounts of Camera-Graded and Called Quality Grade

Note: All values are reported in dollars per hundredweight ($/cwt.).

Traditionally, live weight pricing was very popular. However, over the past two decades dressed weight pricing and grid pricing methods became increasingly popular. According to the USDA report, more than 50% of cattle sold during the period covered in our data were sold on grid pricing. Specifically, 56.3% (in 2005), 53.3% (in 2006), 57.2% (in 2007), and 62.3% (in 2008) of cattle were sold based on grid pricing (USDA, 2014). To calculate and interpret the change in the earnings of producers and packers, respectively, we assume that the proportion of the grid pricing in our sample is similar to the national level. Hence, the combined difference of −$53,981, as reported in Table 5, can be separated into producers’ and packers’ differences, −$29,825 and −$24,156, respectively. The difference of −$29,825 for producers implies that producers will lose financially when the number of USDA graders is reduced through increased use of the camera grading system under grid pricing. The difference for packers (−$24,156) implies that under dressed weight pricing, packers gain from grades called by USDA graders instead of camera grades. Here we are only considering transactions between producers and packers. In reality, the process is more complex and depends on how packers profitably market high- and low-quality carcasses in the wholesale market. The discount for low-quality carcasses can be high if packers have difficulty profitably marketing low-quality beef. At the same time, packers may pay high premiums for high-quality carcasses if there is a demand for high-quality beef. Our results from Table 5 imply that on average packers were penalized more for low-quality carcasses. The discounts for camera-graded and called grades for Standard, on average, were $15.6/cwt. and $15.7/cwt., respectively, and the premiums for camera-graded and called grades for Prime were $4.2/cwt. and $11.7/cwt., respectively. Both called and camera-graded quality grade discounts for Standard were very similar to the national averages reported in Table 1, whereas called and camera-graded grade premiums for Prime were well below the national average.
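One way to carry out such a separation is sketched below. The grid pricing shares are the national figures quoted above; the yearly combined differences are hypothetical placeholders, because the yearly values from Table 5 are not reproduced here. The sketch assumes that the grid share of each year's combined difference accrues to producers and the remainder to packers, which may differ from the exact procedure used for Table 5.

```python
# National share of cattle sold on grid pricing, by year (USDA, 2014).
GRID_SHARE = {2005: 0.563, 2006: 0.533, 2007: 0.572, 2008: 0.623}

# Hypothetical yearly combined differences (camera minus called), in dollars.
# These are placeholders for illustration, not the values in Table 5.
combined_by_year = {2005: -10000.0, 2006: -20000.0, 2007: -18000.0, 2008: -5000.0}


def split_difference(combined, grid_share):
    """Split the combined difference into producer and packer parts.

    The grid-priced share of each year's difference is assigned to
    producers; the remainder (live and dressed weight sales) to packers.
    """
    producers = sum(grid_share[year] * diff for year, diff in combined.items())
    packers = sum(combined.values()) - producers
    return producers, packers


producers_diff, packers_diff = split_difference(combined_by_year, GRID_SHARE)

if __name__ == "__main__":
    print(round(producers_diff, 2), round(packers_diff, 2))
```

Because the yearly shares all lie between 53% and 63%, any split of this kind assigns somewhat more than half of the combined difference to producers.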

Our results in this section are consistent with our findings in previous sections that the USDA graders were more generous in grading than the cameras. Because the USDA is working on reducing human graders, this might imply that producers and packers will lose financially if more cameras are used in grading.

Table 5 also shows that the difference in premiums and discounts decreased noticeably after 2007. This result is consistent with the findings of our annual analysis. We found that after the financial crisis started, USDA graders became much more precise and stricter in grading, and, at the same time, as illustrated in Figure 9, both premiums and discounts decreased. These changes could be among the reasons why the difference decreased after 2007.

5. Conclusion

The role of USDA graders is crucial in cattle and beef markets. Although USDA graders are well trained, a subjective determination of quality grades could cause grading errors. This study uses a unique data set to provide a comprehensive analysis of the existence and possible sources of grader bias in assigning quality grades to beef carcasses, and it adds to the existing body of research that has addressed this issue.

The analyses in this article used data from two large-scale Midwest packing plants. The data included called and camera-graded quality grades for each beef carcass from May 2005 to October 2008. We also used the USDA reported weekly weighted averages of premiums and discounts for each quality grade along with called and camera-graded quality grades to estimate the financial impact of the reduced use of USDA graders and adoption of a camera grading system on beef cattle producers and packers.

The results of the interval estimation analysis indicate that USDA graders’ called grades were noticeably different from those measured by the camera grading system. The analyses suggest that seasonality in Choice-Select spread, consumer demand, number of carcasses processed, and carcass characteristics can influence grading behavior of human graders. We also observed a central tendency bias in the grading behavior of USDA graders.

Our results have important implications for the current debate surrounding the widespread adoption of camera grading systems at packing plants. After verifying the existence of systematic grader bias across time, we investigated the possible impact of using camera grading methods instead of USDA graders on the economic gains/losses of producers and packers. When grading errors are systematically biased, the reduction of USDA graders' utilization can influence the financial rewards of producers and packers. The results of the premiums and discounts analysis support the findings of the interval estimation analysis and show that combined earnings of producers and packers will decline when more camera grading is utilized in the beef grading system. Under grid pricing, producers will lose financially if camera grading is used instead of the USDA graders.

There are a number of limitations to the present work. First, in this article we used data from 2005 through 2008. Conducting the analysis using newer data that were collected using more recent computerized technology in grading beef carcasses would provide more up-to-date information on beef grading and the presence of grading errors. Second, we focused on investigating the financial impact of the replacement of USDA graders with cameras on packers and producers. The calculations were done without accounting for market response to changes in relative shares of different grades; hence the results provided in this article need to be interpreted with caution. Third, it is also important to examine the welfare impact of the policy change on consumers. According to our results, we expect that beef prices will change when USDA graders are replaced by cameras. This price change will influence consumers’ welfare in one way or another. Because of the lack of price information, it was not feasible to investigate this impact in this study. Hence, future research that focuses on using more recent data and more nationally representative samples in comparing called and camera-graded beef carcass grades is needed. Nonetheless, the findings of this study are relevant to a variety of policy questions.

Footnotes

The authors would like to thank the anonymous reviewers for their valuable comments that greatly contributed to improving the final version of the paper. The authors are also grateful to participants of the 2015 Southern Agricultural Economics Association meetings for their constructive comments and suggestions.

1 Nine packing plants use these instruments to assist in grading operations for approximately 40% of the beef carcasses graded each day by the USDA (2013).

2 To define its yield grade standard, the USDA uses the following equation: Yield index = 2.50 + (2.5 × fat thickness) + (0.20 × kph) + (0.0038 × weight) − (0.32 × rib eye area), where kph refers to kidney, pelvic, and heart fat.

3 Degree of marbling is segmented into abundant, moderately abundant, slightly abundant, moderate, modest, small, slight, traces, and practically devoid. Maturity classes are classified into A (9–30 months), B (30–42 months), C (42–72 months), D (72–96 months), and E (>96 months).

4 The total number of observations in our data does not necessarily reflect all the cattle processed at the packing plants.

5 If USDA graders follow the USDA standards, they should call Select when the marbling score is greater than or equal to 4 and less than 5.

6 There are three cattle pricing methods: live weight pricing, dressed weight pricing, and grid pricing. Although the price of carcasses is determined by called yield and quality grade under grid pricing, the price is determined based on the expected value under live and dressed weight pricing.

7 Financial terms (loss/gain) in this analysis are used to express the amount of money that producers or packers would have earned if USDA human graders had been replaced by a camera grading system during the research period and do not have any normative meanings.

8 Choice as the par value without premium or discount represents all Choices and does not account for high Choice, which may have a premium in some grid pricing scales.

9 If there is a market for high-quality beef, then packers penalize more heavily the low-quality beef carcasses, whereas the premiums for high-quality beef increase. However, when there is high demand for beef in general, then packers do not consider beef quality and decrease (increase) premiums (discounts).
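As a worked illustration of the yield index equation in footnote 2, the sketch below evaluates it for one hypothetical carcass. The units are an assumption, since the footnote does not state them: fat thickness in inches, kph as a percentage of carcass weight, hot carcass weight in pounds, and rib eye area in square inches. The input values are invented for the example.

```python
def yield_index(fat_thickness_in, kph_pct, hot_carcass_weight_lb, rib_eye_area_sq_in):
    """Evaluate the yield index equation given in footnote 2.

    Units are assumed (inches, percent, pounds, square inches).
    """
    return (2.50
            + 2.5 * fat_thickness_in
            + 0.20 * kph_pct
            + 0.0038 * hot_carcass_weight_lb
            - 0.32 * rib_eye_area_sq_in)


# Hypothetical carcass: 0.5 in fat, 2.5% kph, 800 lb, 13.0 sq in rib eye.
example_index = yield_index(0.5, 2.5, 800.0, 13.0)

if __name__ == "__main__":
    print(round(example_index, 2))
```

For this carcass the terms are 2.50 + 1.25 + 0.50 + 3.04 − 4.16, giving an index of about 3.13. A larger rib eye area lowers the index, while more fat or a heavier carcass raises it.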

References

Chalfant, J.A., and Sexton, R.I. “Marketing Orders, Grading Errors, and Price Discrimination.” American Journal of Agricultural Economics 84(February 2003):53–66.

Engelhard, G. “Examining Rater Errors in the Assessment of Written Composition with a Many-Faceted Rasch Model.” Journal of Educational Measurement 31(Summer 1994):93–112.

Field, T.G. Beef Production Management Decisions. 5th ed. Upper Saddle River, NJ: Prentice Hall, 2007.

Garcia, L.G., Nicholson, K.L., Hoffman, T.W., Lawrence, T.E., Hale, D.S., Griffin, D.B., Savell, J.W., et al. “National Beef Quality Audit-2005: Survey of Targeted Cattle and Carcass Characteristics Related to Quality, Quantity, and Value of Fed Steers and Heifers.” Journal of Animal Science 86(December 2008):3533–43.

Gray, G.D., Moore, M.C., Hale, D.S., Kerth, C.R., Griffin, D.B., Savell, J.W., Raines, C.R., et al. “National Beef Quality Audit-2011: Survey of Instrument Grading Assessments of Beef Carcass Characteristics.” Journal of Animal Science 90(December 2012):5152–58.

Hale, D.S., Goodson, K., and Savell, J.W. “USDA Beef Quality and Yield Grades.” 2013. Internet site: http://meat.tamu.edu/beefgrading/ (Accessed January 2, 2015).

Hueth, B., Marcoul, P., and Lawrence, J. “Grader Bias in Cattle Markets? Evidence from Iowa.” American Journal of Agricultural Economics 89(November 2007):890–903.

Leckie, G., and Goldstein, H. “Understanding Uncertainty in School League Tables.” Fiscal Studies 32(June 2011):207–24.

Mafi, G., Harsh, B., and Scanga, J. “Review of Instrument Augmented Assessment of USDA Beef Carcass Quality Grades.” Champaign, IL: American Meat Science Association, 2014.

McCully, M.A. “Trends in the Choice-Select Spread and Implications to Cattle Producers.” Internet site: http://www.cabpartners.com/articles/news/217/ChoiceSelectWhitePaper.pdf (Accessed June 9, 2015).

Moore, C.B., Bass, P.D., Green, M.D., Chapman, P.L., O'Connor, M.E., Yates, L.D., Scanga, J.A., Tatum, J.D., Smith, G.C., and Belk, K.E. “Establishing an Appropriate Mode of Comparison for Measuring the Performance of Marbling Score Output from Video Image Analysis Beef Carcass Grading Systems.” Journal of Animal Science 88(July 2010):2464–75.

Myford, C.M., and Wolfe, E.W. “Monitoring Rater Performance over Time: A Framework for Detecting Differential Accuracy and Differential Scale Category Use.” Journal of Educational Measurement 46(Winter 2009):371–89.

Piazza, M., and Izard, V. “How Humans Count: Numerosity and the Parietal Cortex.” Neuroscientist 15(June 2009):261–73.

Saal, F.E., Downey, R.G., and Lahey, M.A. “Rating the Ratings: Assessing the Psychometric Quality of Rating Data.” Psychological Bulletin 88(September 1980):413–28.

Schroeder, T.C., and Davis, E.E. Fed Cattle Grid Pricing. College Station: Texas Agricultural Extension Service, The Texas A&M University System, 1998.

U.S. Department of Agriculture. FSIS’ and AMS’ Field-Level Workforce Challenges. Washington, DC: U.S. Department of Agriculture, Office of Inspector General, Audit Report 50601-0002-31, 2013.

U.S. Department of Agriculture. 2013 Packers and Stockyards Annual Report. Washington, DC: U.S. Department of Agriculture, Grain Inspection, Packers and Stockyards Administration, 2014.

U.S. Department of Agriculture, Agricultural Marketing Service (USDA-AMS). 5-Area Weekly Direct Slaughter Cattle Reports. St. Joseph, MO: USDA Market News Service, USDA Livestock, Poultry & Grain Market News Division, 2005–2008.

U.S. Department of Agriculture, Agricultural Marketing Service (USDA-AMS). “USDA Seeks Input on Revisions to Beef Grading Standards.” Washington, DC: USDA-AMS, 2014. Internet site: https://www.ams.usda.gov/press-release/usda-seeks-input-revisions-beef-grading-standards (Accessed September 2014).

U.S. Department of Agriculture, Agricultural Marketing Service (USDA-AMS). “National Summary of Meats Graded - Historical Grading Volumes BEEF.” Internet site: https://catalog.data.gov/dataset/national-summary-of-meats-graded-historical-grading-volumes-beef (Accessed October, 2015).

U.S. Department of Agriculture, National Agricultural Statistics Services (USDA-NASS). “Cattle on Feed.” Internet site: http://usda.mannlib.cornell.edu/MannUsda/viewDocumentInfo.do?documentID=1020 (Accessed January 2015).

Figure 1. The Distribution of Quality Grade (n = 134,451, the number of head, percent of total graded in parentheses)

Figure 2. The Distribution of the Numeric (Camera-Graded) Marbling Score (n = 18,080)

Figure 3. The Distribution of Called and Camera-Graded Quality Grade (n = 18,080, the number of head, percent of total graded in parentheses)

Table 1. National Summary of Meat Graded (million pounds, percent of total graded in parentheses)

Figure 4. The Distribution of Camera-Graded Quality Grade Given Called Quality Grade (n = 18,080, the number of head, percent of total graded in parentheses)

Figure 5. The Distribution of Called Quality Grade Given Camera-Graded Quality Grade (n = 18,080, the number of head, percent of total graded in parentheses)

Table 2. Estimates of Standard Errors (σu) and Cutoff Values (Ck)

Table 3. Summary Statistics of Marbling Score

Table 4. U.S. Department of Agriculture Reported Average Premiums and Discounts (May 2005–October 2008, $/cwt.)

