Predicting the body weight of crossbred Holstein × Zebu dairy cows using multivariate adaptive regression splines algorithm

Ignacio Vázquez-Martínez; Cem Tirink; Fernando Casanova-Lugo; Dixan Pozo-Leyva; Daniel Mota-Rojas; Murat Baitugelovich Kalmagambetov; Rashit Uskenov; Ömer Gülboy; Ricardo A. Garcia-Herrera; Alfonso J. Chay-Canul

doi:10.1017/S0022029924000578

Published online by Cambridge University Press: 14 November 2024

Ignacio Vázquez-Martínez: Affiliation:
División Académica de Ciencias Agropecuarias, Universidad Juárez Autónoma de Tabasco, Villahermosa, Tabasco, México; Benemérita Universidad Autónoma de Puebla, Complejo Regional Norte, Tetela de Ocampo, Puebla, México
Cem Tirink*: Affiliation:
Faculty of Agriculture, Department of Animal Science, Igdir University, TR76000, Igdir, Türkiye
Fernando Casanova-Lugo: Affiliation:
Tecnológico Nacional de México, Instituto Tecnológico de la Zona Maya, Othón P. Blanco, Quintana Roo, México
Dixan Pozo-Leyva: Affiliation:
Tecnológico Nacional de México, Instituto Tecnológico de la Zona Maya, Othón P. Blanco, Quintana Roo, México
Daniel Mota-Rojas: Affiliation:
Neurophysiology, Behavior and Animal Welfare Assessment, Department of Animal Production and Agriculture (DPAA), Universidad Autónoma Metropolitana Xochimilco Campus, Mexico City 04960, Mexico
Murat Baitugelovich Kalmagambetov: Affiliation:
Aktobe Agricultural Experimental Station, Aktobe, Republic of Kazakhstan
Rashit Uskenov: Affiliation:
Agronomic Faculty, S. Seifullin Kazakh Agrotechnical University, Z10P6B8, 62 Zhenis av., Astana, Kazakhstan
Ömer Gülboy: Affiliation:
Faculty of Agriculture, Department of Animal Science, Ondokuz Mayis University, TR55139, Samsun, Türkiye
Ricardo A. Garcia-Herrera: Affiliation:
División Académica de Ciencias Agropecuarias, Universidad Juárez Autónoma de Tabasco, Villahermosa, Tabasco, México
Alfonso J. Chay-Canul: Affiliation:
División Académica de Ciencias Agropecuarias, Universidad Juárez Autónoma de Tabasco, Villahermosa, Tabasco, México
*: Corresponding author: Cem Tirink; Email: [email protected]

Abstract

This study aimed to estimate live body weight from body measurements of Holstein × Zebu dairy cows (n = 156) reared under humid tropical conditions in Mexico using the multivariate adaptive regression splines (MARS) algorithm with several train-test proportions. The body measurements included withers height, rump height, hip width, heart girth, body length and diagonal body length. The data were split into training and testing sets at proportions of 65:35, 70:30 and 80:20. The MARS algorithm was used to construct a prediction model, which predicted body weight from the body measurements of the test dataset. With the 80:20 split, the MARS algorithm had an explanation rate of 0.836 and 0.711 for the training and test sets, respectively, with minimum Akaike information criterion values. The results suggest that body weight can be predicted reliably from body measurements with the MARS algorithm; therefore, this algorithm may be a useful tool for animal breeders and researchers in developing feeding and selection strategies.

Keywords

Body measurements; body weight; crossbred; multivariate adaptive regression splines (MARS); tropical cows

Type: Research Article
Information: Journal of Dairy Research, Volume 91, Issue 3, August 2024, pp. 267–272

DOI: https://doi.org/10.1017/S0022029924000578
Copyright: Copyright © The Author(s), 2024. Published by Cambridge University Press on behalf of Hannah Dairy Research Foundation

Regular and reliable information on the body weight (BW) of animals is very important for sustainable animal husbandry and breeding. Accurate BW determination or estimation enables more precise calculation of the ideal feed allocation, makes it easier to decide drug doses, and helps to identify the most appropriate slaughter time and likely marketing price (Tırınk et al., 2023a). Unfortunately, BW is rarely measured by small farmers because they lack weighscales (Lukuyu et al., 2016; Tebug et al., 2018). Of the various methods for measuring or estimating BW, the weighscale is the most accurate but is less preferred by producers because it is cumbersome, slow, expensive to implement and stressful for the animals (Wangchuk et al., 2018). On the other hand, visual measurement techniques such as image analysis require mathematical models to predict features such as BW and are currently only really applicable in research studies (Stajnko et al., 2008; Altay and Delialioğlu, 2022; Coşkun et al., 2023a). Therefore, there is a need to develop other practical methods that are low-cost and easy for small farmers to apply (Dingwell et al., 2006; Oliveira et al., 2013).
Alternative methods do exist based on biometric measures such as withers height (WH), heart girth (HG), hip width (HW), rump height (RH) and body length (BL) (Heinrichs et al., 1992; Dingwell et al., 2006; Lesosky et al., 2012; Bretschneider et al., 2014; Lukuyu et al., 2016; Herrera-López et al., 2018; Putra et al., 2020).

Milk production systems in the tropical regions of Mexico typically use crosses of Bos taurus and Bos indicus breeds, with forage as the main feed and the sole source for maintenance and milk production, and supply around 20% of the milk consumed in the country (Magaña et al., 2006; Rojo-Rubio et al., 2009; Román-Ponce et al., 2013).
Some studies have evaluated the relationship between biometric measures and BW in crossbred cattle (Reis et al., 2008; Mota et al., 2013; Oliveira et al., 2013; Franco et al., 2017) as well as buffalo (Ramos-Zapata et al., 2023; Cruz-Tamayo et al., 2024), but models have not yet been developed for animals of this type under the conditions of the humid tropics of Mexico, nor have the available models been evaluated for local applicability.

Studies in the last two decades have predicted BW using multiple linear regression analysis; however, these regression analyses are often inadequate for prediction because of non-linearity (Ruchay et al., 2021). Various machine-learning approaches have been applied to estimate the BW of cattle, sheep, camels and goats. A common finding of these studies is the potential of various machine-learning algorithms to predict linear or nonlinear relationships between BW and biometric traits accurately and reliably (Ruchay et al., 2021). However, studies reporting the prediction of BW of tropical dairy cows through machine-learning methods are limited. Therefore, this study aimed to model the relationship between BW and biometric measures in Holstein × Zebu crossbred cows using the MARS algorithm.

Material and methods

Data recording, study site, animals and handling

BW and biometric measurements were recorded in 157 crossbred dairy cows (Holstein × Zebu). The cows were between 3 and 6 years of age and grazed paddocks of star grass (Cynodon nlemfuensis) and humidicola grass (Brachiaria humidicola) without supplementation. The data were collected on the commercial farm ‘Rancho la Esperanza’, located at 17°36′27″N, 93°11′35″W, 120 masl, 10 km along the Juárez-Reforma road in the municipality of Juárez, Chiapas, in southern Mexico.

Biometric measurements were expressed in cm and recorded as described by Oliveira et al. (2013) and Bretschneider et al. (2014). These were: heart girth (HG), withers height (WH), rump height (RH), hip width (HW), body length (BL) and diagonal body length (DBL). We used a flexible fibreglass tape (Truper®) and a large 65 cm caliper (Haglof®, Sweden). The animals were weighed on a fixed platform scale with a capacity of 2000 kg and an accuracy of 1 kg.

Statistical analysis

The multivariate adaptive regression splines (MARS) algorithm is a non-parametric regression procedure that helps to describe linear, nonlinear and interaction effects among the variables examined within a cause-and-effect relationship. Its most important advantage is that it does not need to meet the assumptions required by the classical regression approach (Eyduran et al., 2019; Akin et al., 2020; Coşkun et al., 2023b; Tırınk et al., 2023b). The procedure generates basis functions in a stepwise manner, considering all possible interaction effects among candidate knots and explanatory variables (Arthur et al., 2020). The first stage is called the forward pass and the second the backward pass. In the forward pass, the algorithm starts with a model containing only the intercept and iteratively adds the basis functions that give the largest reduction in training error. This process typically yields an over-fitted, overly complex model (Friedman, 1991; Eyduran et al., 2019). Although such a model fits the training data well, it may generalize poorly to new data.
In the backward pass, which is carried out to resolve the overfitting, the basis functions that contribute least to the model are removed (Zaborski et al., 2019; Arthur et al., 2020; Faraz et al., 2021). The MARS model used to estimate BW from the explanatory variables can be written as:

(1)

$$\hat{y} = \beta_0 + \sum_{m=1}^{M} \beta_m \prod_{k=1}^{K_m} h_{km}\left(X_{v(k,m)}\right)$$

Where: $\hat{y}$ is the predicted BW value, $\beta_0$ is the intercept of the model, $\beta_m$ is the coefficient of the mth basis function, $M$ is the number of basis functions, $K_m$ is the interaction order limit parameter, $h_{km}(X_{v(k,m)})$ is the basis function of the prediction model and $v(k,m)$ is the index of the explanatory variable in the kth component of the mth product. Basis functions that reduce model performance after the two stages described above are removed by means of the generalized cross-validation error (GCV), whose equation is given below (Eyduran et al., 2019; Zaborski et al., 2019; Çanga and Boğa, 2022):

(2)

$$\mathrm{GCV}(\lambda) = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\left[1 - \frac{M(\lambda)}{n}\right]^2}$$

where n is the sample size of the training set, $y_i$ is the observed value of BW, $\hat{y}_i$ is the predicted value of BW and M(λ) is the penalty term for the complexity of a model containing λ terms.
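Equations (1) and (2) can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used R); the function names `hinge`, `mars_predict` and `gcv`, the term layout and the toy numbers are assumptions made for illustration only. The hinge form max(0, ±(x − knot)) is the usual MARS basis function, and `gcv` follows equation (2) exactly as printed; many presentations additionally divide the sum of squared errors by n, which is not shown in equation (2) and is therefore omitted here.

```python
def hinge(x, knot, direction=1):
    """Hinge basis function: max(0, x - knot) if direction=1, max(0, knot - x) if direction=-1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate equation (1).

    x: dict mapping variable name to value.
    terms: list of (coefficient, factors); each factor is a
           (variable_name, knot, direction) tuple, and the factors of one
           term are multiplied together (the product over k).
    """
    total = intercept
    for coefficient, factors in terms:
        product = 1.0
        for name, knot, direction in factors:
            product *= hinge(x[name], knot, direction)
        total += coefficient * product
    return total


def gcv(y_obs, y_pred, effective_params):
    """Equation (2) as printed: SSE / (1 - M(lambda) / n)^2."""
    n = len(y_obs)
    sse = sum((y - p) ** 2 for y, p in zip(y_obs, y_pred))
    return sse / (1.0 - effective_params / n) ** 2


# Hypothetical one-term model: 100 + 2 * max(0, x1 - 10)
toy_terms = [(2.0, [("x1", 10.0, 1)])]
print(mars_predict({"x1": 12.0}, 100.0, toy_terms))
```

A larger penalty M(λ) inflates the GCV for the same residual error, which is how the backward pass trades fit against model complexity.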

Before the MARS procedure, multicollinearity among the explanatory variables must be tested. The data were then divided into training and test sets at proportions of 80:20, 70:30 and 65:35. In the training process, 10-fold cross-validation was used to choose the best MARS model among the 180 MARS models tested (degree = 1:6 and nprune = 2:38). The goodness-of-fit criteria, whose equations are given below, were used to compare the performance of the models obtained from the MARS algorithm for the training and test sets at the different proportions (Grzesiak and Zaborski, 2012; Eyduran et al., 2019; Olfaz et al., 2019; Zaborski et al., 2019; Tırınk et al., 2023a, 2023b).

1. Pearson correlation coefficient (r):
(3)$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$$
2. Root-mean-square error (RMSE):
(4)$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
3. Standard deviation ratio (SDR):
(5)$$\mathrm{SD_{ratio}} = \frac{s_m}{s_d}$$
4. Performance index (PI):
(6)$$\mathrm{PI} = \frac{\mathrm{rRMSE}}{1 + r}$$
5. Global relative approximation error (RAE):
(7)$$\mathrm{RAE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} y_i^2}}$$
6. Mean absolute percentage error (MAPE):
(8)$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n} \left\vert \frac{y_i - \hat{y}_i}{y_i} \right\vert \times 100$$
7. Akaike information criterion (AIC):
(9)$$\begin{cases} \mathrm{AIC} = n \ln\left[\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2\right] + 2k, & \text{if } n/k > 40 \\ \mathrm{AIC_c} = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1}, & \text{otherwise} \end{cases}$$

where n is the sample size of the training set, k is the number of explanatory variables in the model, $y_i$ is the observed value of BW, $\hat{y}_i$ is the predicted value of BW, rRMSE is the relative RMSE, $s_d$ is the standard deviation of BW, and $s_m$ is the standard deviation of the errors of the optimal model. The best model was defined as the one with the smallest AIC, RMSE, SD_ratio, MAPE, RAE, PI and CV values for both sets and the highest R² and r values across all models (Tatliyer, 2020).
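As a rough illustration of criteria (3) to (9), the following Python sketch computes them from observed and predicted values using only the standard library. It is not the 'ehaGoF' implementation. Three points are assumptions: the Pearson correlation is taken between observed and predicted BW, rRMSE is taken as RMSE expressed as a percentage of the mean observed BW, and sample standard deviations (n − 1 denominator) are used for the SD ratio. The function name `goodness_of_fit` and the example values are hypothetical.

```python
import math
import statistics


def goodness_of_fit(y_obs, y_pred, k):
    """Return the criteria of equations (3)-(9) as a dict.

    k is the number of explanatory variables in the model.
    """
    n = len(y_obs)
    resid = [y - p for y, p in zip(y_obs, y_pred)]
    sse = sum(e ** 2 for e in resid)

    mean_y = sum(y_obs) / n
    mean_p = sum(y_pred) / n
    cov = sum((y - mean_y) * (p - mean_p) for y, p in zip(y_obs, y_pred))
    ss_y = sum((y - mean_y) ** 2 for y in y_obs)
    ss_p = sum((p - mean_p) ** 2 for p in y_pred)
    r = cov / math.sqrt(ss_y * ss_p)                      # equation (3)

    rmse = math.sqrt(sse / n)                             # equation (4)
    sd_ratio = statistics.stdev(resid) / statistics.stdev(y_obs)  # equation (5)
    rrmse = 100.0 * rmse / mean_y                         # assumed definition of rRMSE
    pi = rrmse / (1.0 + r)                                # equation (6)
    rae = math.sqrt(sse / sum(y ** 2 for y in y_obs))     # equation (7)
    mape = 100.0 * sum(abs(e / y) for e, y in zip(resid, y_obs)) / n  # equation (8)

    aic = n * math.log(sse / n) + 2 * k                   # equation (9), first case
    if n / k <= 40:
        aic += 2 * k * (k + 1) / (n - k - 1)              # small-sample correction (AICc)

    return {"r": r, "RMSE": rmse, "SDR": sd_ratio, "PI": pi,
            "RAE": rae, "MAPE": mape, "AIC": aic}


# Hypothetical observed and predicted body weights (kg)
example = goodness_of_fit([400, 450, 500, 550], [410, 440, 510, 540], k=2)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Computing every criterion on both the training and the test set, as the study does, makes it easier to spot a model that fits the training data well but generalizes poorly.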

Descriptive statistical evaluation was carried out with R software (R Core Team, 2018). Descriptive statistics for all variables were tabulated with the ‘psych’ package (Revelle, 2017). To show the relationship between the explanatory and response variables, Pearson's correlation coefficients were determined with the ‘PerformanceAnalytics’ package (Peterson and Carl, 2020). Multicollinearity was tested with the variance inflation factor function of the ‘car’ package in R (Fox and Weisberg, 2019). The MARS algorithm was run with the ‘caret’ package for the different proportions (Kuhn, 2022). The performance of the models fitted for all proportions was reported with the ‘ehaGoF’ package (Eyduran, 2020).
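The partitioning scheme described in this section (three train/test proportions, then 10-fold cross-validation within the training set) might be sketched in Python as follows. This is an illustration only: the study used the R ‘caret’ package, whose exact partitioning method is not stated in the text, and the helper names `split_indices` and `k_folds`, the seed and the simple random shuffle are assumptions.

```python
import random


def split_indices(n, train_fraction, seed=0):
    """Randomly split the row indices 0..n-1 into training and test lists."""
    order = list(range(n))
    random.Random(seed).shuffle(order)      # seeded for reproducibility
    n_train = round(n * train_fraction)
    return order[:n_train], order[n_train:]


def k_folds(indices, k=10):
    """Deal the given indices into k folds of near-equal size."""
    return [indices[i::k] for i in range(k)]


n_cows = 157  # number of animals reported in Material and methods
for fraction in (0.80, 0.70, 0.65):
    train_idx, test_idx = split_indices(n_cows, fraction)
    folds = k_folds(train_idx, k=10)
    print(fraction, len(train_idx), len(test_idx), [len(f) for f in folds])
```

In each cross-validation round, one fold would be held out for validation and the model refitted on the other nine; the test set is touched only once, after the tuning parameters have been chosen.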

Results

Table 1 shows the calculated descriptive statistics for the data. The coefficient of variation (CV %) was calculated to be lower than 30% for all traits, meaning that the measured data were reliable for the data analysis.

Table 1. Descriptive statistics for all variables recorded in 157 crossbred dairy cows (Holstein × Zebu)

BW, Body weight; HG, Heart girth; WH, Withers height; RH, Rump height; HW, Hip width; BL, Body length; DBL, diagonal body length.

Pearson's correlation analysis was performed to examine the relationships between the explanatory and response variables. The estimated Pearson correlation coefficients and their significance levels are given in Figure 1. All correlation coefficients in Figure 1 were statistically significant. The greatest correlation coefficient was between BW and HG. In addition, HW and WH had higher coefficients with BW than with each other. The lowest correlation coefficients with BW were those of BL (0.51), DBL (0.47) and RH (0.37). Correlation coefficients among the other variables were relatively high, and all relationships were positive.

Figure 1. Pearson correlation coefficients between various biometric measures and body weight in cross-bred Holstein × Zebu cattle. All correlations were significant at P < 0.001. BW, body weight; HG, heart girth; WH, withers height; RH, rump height; HW, hip width; BL, body length; DBL, diagonal body length.

Before implementing the MARS algorithm, multicollinearity was assessed by determining the variance inflation factor (VIF) of each explanatory variable. Values were 2.18, 2.16, 1.46, 1.80, 1.42 and 1.40 for HG, WH, RH, HW, BL and DBL, respectively. Since all were below 10, there was no multicollinearity problem that would cause overfitting, and hence the MARS algorithm could be applied.
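For readers unfamiliar with the variance inflation factor, a minimal Python sketch is given below. It relies on the standard identity that the VIF of variable j equals the jth diagonal element of the inverse of the correlation matrix of the explanatory variables, which is equivalent to 1 / (1 − R²) from regressing variable j on the others. It is not the ‘car’ implementation used in the study; the function names and the toy data are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """columns: list of equal-length lists, one per explanatory variable."""
    p = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(p)] for i in range(p)]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(p)]


# Toy data (not from the study): two variables with correlation 0.8
print(variance_inflation_factors([[1, 2, 3, 4, 5], [2, 1, 4, 3, 5]]))
```

With only two explanatory variables the VIF reduces to 1 / (1 − r²), so the toy pair above gives the same value for both variables.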

To compare the models obtained with the MARS algorithm, model comparison criteria were applied to the different training/test proportions, as shown in Table 2. The model evaluation criteria showed that the greatest predictive power was obtained for the 80:20 training/test proportion, which had the lowest AIC values for both sets. The R² values of this model were 0.836 and 0.711 for the training and test sets, respectively. In addition, relative importance values were calculated for all proportions, as shown in Table 3. HG was the most influential variable for determining BW in all proportions.

Table 2. Goodness-of-fit criteria results of the different proportions of MARS models

Table 3. Relative importance values for predicting body weight according to the MARS prediction model

HG, heart girth; WH, withers height; HW, hip width; DBL, diagonal body length.

The best predictive model was provided by the 80:20 proportion. Equation (10) shows that BW can be described with five terms (an intercept and four basis functions) in the MARS prediction model.

(10)

$$BW = 476.671 + \begin{cases} (HG - 187) \times 6.183, & \text{if } HG < 187 \\ (HG - 187) \times 4.249, & \text{if } HG > 187 \end{cases} + (HW - 50) \times 5.929 + (DBL - 116) \times 5.472$$

According to this, the first term of the selected MARS prediction model was the intercept, with a value of 476.671. The second term was a basis function for HG with a cutpoint of 187 cm and a negative coefficient of −6.183, active when HG was below the cutpoint; it is written in equation (10) as (HG − 187) × 6.183, which is equivalent to −6.183 × (187 − HG). The third term was also for HG with a cutpoint of 187 cm, with a coefficient of 4.249 when HG exceeded the cutpoint. The fourth term, and third basis function, was for HW, with a cutpoint of 50 cm and a coefficient of 5.929; that is, each 1 cm change in HW relative to 50 cm changed the predicted BW by 5.929 kg. The fifth term was for DBL, with a cutpoint of 116 cm and a coefficient of 5.472.
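Equation (10) can be turned into a small calculator, sketched here in Python. The coefficients and cutpoints are those reported above; the function name `predict_bw` and the example measurements are hypothetical. The HW and DBL terms are evaluated exactly as printed in equation (10), that is, linearly around their cutpoints. A MARS hinge would normally be zero on one side of the cutpoint, but the equation shows no such condition for these two terms, so none is applied here.

```python
def predict_bw(hg, hw, dbl):
    """Predicted body weight (kg) from equation (10).

    hg:  heart girth (cm)
    hw:  hip width (cm)
    dbl: diagonal body length (cm)
    """
    # Piecewise slope for heart girth around the 187 cm cutpoint;
    # at exactly 187 cm the term is zero whichever slope is used.
    hg_slope = 6.183 if hg < 187 else 4.249
    return (476.671
            + (hg - 187) * hg_slope
            + (hw - 50) * 5.929
            + (dbl - 116) * 5.472)


# Hypothetical cow: HG = 190 cm, HW = 55 cm, DBL = 120 cm
print(round(predict_bw(190, 55, 120), 3))
```

A cow measuring exactly 187 cm HG, 50 cm HW and 116 cm DBL is predicted at the intercept, 476.671 kg, which gives a quick sanity check on any re-implementation.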

Discussion

The dearth of studies on predicting BW from biometric measurements in Holstein × Zebu crossbred cattle is a challenge for researchers and producers alike. Only a few studies have focused on this issue, with most of the research conducted on Holstein or Zebu breeds. Nevertheless, some studies do exist that use some of the same biometric measurements as ours (HG, WH, HW, BL) (Reis et al., 2008; Mota et al., 2013; Oliveira et al., 2013; Franco et al., 2017). Our correlation coefficients among biometric measurements were lower than those reported by Putra et al. (2020) for Pasundan cows, whilst in our hands the BW predictive power of WH was higher, but that of BL and HG lower, than in Ongole cows (Putra, 2020). Bene et al. (2007) compared cattle of different beef breeds. Our correlation between BW and WH was higher than theirs for Angus, Hereford and Hungarian Simmental, whereas in our hands RH was less useful as a predictor. We obtained correlations between BW and BL similar to those they reported (Bene et al., 2007). Our best correlation (HG) was lower than that of Kashoma et al. (2011) for Tanzanian shorthorn Zebu cattle, and in another study using the same crossbred type of cattle as ourselves, Mota et al. (2013) found high correlation coefficients between BW and HG, hip height and rump height. On the other hand, our data may be more reliable since we used far more cattle (156 compared to 24).

Reis et al. (2008) report that the accuracy of estimating BW can be affected by breed, age, body size, body condition and physiological state. Franco et al. (2017) reported an R² of 0.83 between BW and HW in Holstein crossbred heifers. These authors conclude that although HW was highly correlated with BW, it showed a low R² with a high coefficient of variation when compared with other variables such as body length, hip height and rump height. Using HG and WH to estimate the BW of dairy cows in low-input systems in Senegal, Tebug et al. (2018) reported that R² varied from 0.77 to 0.94; they also reported that the RMSE of the developed models corresponded to 9.4 to 12.33% (29.27 to 39.24 kg) of the average BW of animals. Also, Bretschneider et al. (2014) determined that the RMSE of their model was 5.8% of the average BW (15.95 kg). Mota et al. (2013) concluded that the correlations between measurements and body development of heifers with different parentages are distinct, and that specific equations are necessary for predicting body weight. Additionally, Tedde et al. (2021) indicated that estimating BW through biometric measurements can be approached as a regression problem, where the input features are the body measurements and the target value is the BW that the regression model predicts.

Ruchay et al. (2021) stated that the random forest regression algorithm, one of the machine learning methods, is the most effective algorithm for predicting the BW of Hereford cows and may be more effective than traditional models. Similarly, Dang et al. (2022) determined that the light generalized boosted regression tree-based model had the best performance, and planned to use these findings to develop a method for indirectly estimating the live weight of Hanwoo cows using machine vision technology that measures ten different body features. Celik (2019) compared MARS, Chi-square automatic interaction detection (CHAID), exhaustive-CHAID and classification and regression tree (CART) algorithms for predicting BW in Pakistani goats. Goodness-of-fit criteria were used to evaluate model performance, and the model obtained with the MARS procedure was reported to be the most reliable among the models studied. Our evaluation criteria yielded similar results to theirs. Canga (2022) used MARS to predict hot carcass weight from several features, and once again the MARS algorithm gave results similar to those of the current study within the scope of the model comparison criteria.

We can conclude that the use of statistical procedures for predicting BW from biometric measurements is an important and useful tool that can be applied to crossbred cattle. They are relatively easy to use and require minimal effort. The MARS algorithm is particularly useful. It is a non-parametric approach, which allows for the incorporation of non-linear relationships between the independent and dependent variables. This makes it especially useful for predicting body weight from body measurements, as these can often be subject to non-linear relationships. The MARS algorithm also provides an efficient and accurate way to construct prediction models, without the need to perform many variable transformations. Furthermore, MARS allows for the incorporation of both continuous and categorical variables, thus making it an ideal method for predicting BW from biometric measurements.

References

Akin, M, Eyduran, SP, Eyduran, E and Reed, BM (2020) Analysis of macro nutrient related growth responses using multivariate adaptive regression splines. Plant Cell, Tissue and Organ Culture 140, 661–670.CrossRef Google Scholar

Altay, Y and Delialioğlu, RA (2022) Diagnosing lameness with the random forest classification algorithm using thermal cameras and digital colour parameters. Mediterranean Agricultural Sciences 35, 47–54.CrossRef Google Scholar

Arthur, CK, Temeng, VA and Ziggah, YY (2020) Multivariate adaptive regression splines (MARS) approach to blast-induced ground vibration prediction. International Journal of Mining, Reclamation and Environment 34, 198–222.CrossRef Google Scholar

Bene, S, Nagy, B, Kiss, B, Polgar, JP and Szabo, F (2007) Comparison of body measurements of beef cows of different breeds. Archiv fur Tierzucht 50, 363–373.Google Scholar

Bretschneider, G, Cuatrin, A, Arias, D and Vottero, D (2014) Estimation of body weight by an indirect measurement method in developing replacement Holstein heifers raised on pasture. Archivos de Medicina Veterinaria 46, 439–443.CrossRef Google Scholar

Canga, D (2022) Use of MARS data mining algorithm based on training and test sets in determining carcass weight of cattle in different breeds. Journal of Agricultural Sciences 28, 259–268.Google Scholar

Canga, D and Boğa, M (2022) Detection of correct pregnancy status in lactating dairy cattle using MARS data mining algorithm. Turkish Journal of Veterinary & Animal Sciences 46, 809–819.CrossRef Google Scholar

Celik, S (2019) Comparing predictive performances of tree-based data mining algorithms and MARS algorithm in the prediction of live body weight from body traits in Pakistani goats. Pakistan Journal of Zoology 51, 1447–1456.CrossRef Google Scholar

Coşkun, G, Şahin, Ö, Delialioğlu, RA, Altay, Y and Aytekin, I (2023a) Diagnosis of lameness via data mining algorithm by using thermal camera and image processing method in Brown Swiss cows. Tropical Animal Health and Production 55, 50.CrossRef Google Scholar PubMed

Coşkun, G, Şahin, Ö, Altay, Y and Aytekin, I (2023b) Final fattening live weight prediction in Anatolian merinos lambs from somebody characteristics at the initial of fattening by using some data mining algorithms. Black Sea Journal of Agriculture 6, 47–53.CrossRef Google Scholar

Cruz-Tamayo, AA, Ramírez-Bautista, MA, Mota-Rojas, D, Escobar-España, JC, García-Herrera, R, Gurgel, ALC, Dias-Silva, TP, de Araújo, MJ, Santana, JCS, Aguiar, IOM, Ítavo, LCV and Chay-Canul, AJ (2024) Relationship between body weight and hip width in dairy buffaloes (Bubalus bubalis). Journal of Dairy Research 91, 180–183.CrossRef Google Scholar

Dang, CG, Choi, TJ, Lee, SS, Lee, SH, Alam, M, Park, MN, Han, S, Lee, JG and Hoang, DT (2022) Machine learning-based live weight estimation for Hanwoo cow. Sustainability 14, 12661.CrossRef Google Scholar

Dingwell, RT, Wallace, MM, McLaren, CJ, Leslie, CF and Leslie, KE (2006) An evaluation of two indirect methods of estimating body weight in Holstein calves and heifers. Journal of Dairy Science 89, 3992–3998.CrossRef Google Scholar PubMed

Eyduran, E (2020) ehaGoF: calculates goodness of fit statistics. R package version 0.1.1.CrossRef Google Scholar

Eyduran, E, Akin, M and Eyduran, SP (2019) Application of Multivariate Adaptive Regression Splines Through R Software. Ankara. Türkiye: Nobel Academic Publishing, p. 104.Google Scholar

Faraz, A, Tirink, C, Eyduran, E, Waheed, A, Taukir, NA, Nabeel, MS and Tariq, MM (2021) Prediction of live body weight based on body measurements in Thalli sheep under tropical conditions of Pakistan using CART and MARS. Tropical Animal Health and Production 53, 301.CrossRef Google Scholar PubMed

Fox, J and Weisberg, S (2019) An R Companion to Applied Regression, 3rd Edn. Thousand Oaks, CA: Sage.

Franco, MDO, Marcondes, MI, Campos, JMDS, Freitas, DRD, Detmann, E and Valadares, SDC (2017) Evaluation of body weight prediction equations in growing heifers. Acta Scientiarum. Animal Sciences 39, 201–206.

Friedman, J (1991) Multivariate adaptive regression splines. Annals of Statistics 19, 1–67.

Grzesiak, W and Zaborski, D (2012) Examples of the use of data mining methods in animal breeding. In Karahoca, A (ed.), Data Mining Applications in Engineering and Medicine. London: InTech, pp. 303–324.

Heinrichs, AJ, Rogers, GW and Cooper, JB (1992) Predicting body weight and wither height in Holstein heifers using body measurements. Journal of Dairy Science 75, 3576–3581.

Herrera-López, S, García-Herrera, R, Chay-Canul, AJ, González-Ronquillo, M, Macías-Cruz, U, Díaz-Echeverría, VF, Casanova-Lugo, F and Piñeiro-Vázquez, A (2018) Desarrollo y evaluación de una ecuación para predecir el peso vivo en novillas cruzadas usando el ancho de cadera. ITEA-Información Técnica Económica Agraria 114, 368–377.

Kashoma, IPB, Luziga, C, Werema, CW, Shirima, GA and Ndossi, D (2011) Predicting body weight of Tanzania shorthorn zebu cattle using heart girth measurements. Livestock Research for Rural Development 23, 94.

Kuhn, M (2022) caret: Classification and regression training. R package version 6.0–93.

Lesosky, M, Dumas, S, Conradie, I, Handel, IG, Jennings, A, Thumbi, S, Toye, F and de Clare Bronsvoort, BM (2012) A live weight–heart girth relationship for accurate dosing of East African shorthorn zebu cattle. Tropical Animal Health and Production 45, 311–316.

Lukuyu, MN, Gibson, JP, Savage, DB, Duncan, AJ, Mujibi, FDN and Okeyo, AM (2016) Use of body linear measurements to estimate live weight of crossbred dairy cattle in smallholder farms in Kenya. SpringerPlus 5, 1–14.

Magaña, MJG, Ríos, AG and Martínez, GJC (2006) Los sistemas de doble propósito y los desafíos en los climas tropicales de México. Archivos Latinoamericanos de Producción Animal 14, 105–114.

Mota, DA, Berchielli, TT, Canesin, RC, Rosa, BL, Ribeiro, AF and Brandt, HV (2013) Nutrient intake, productive performance and body measurements of dairy heifers fed with different sources of protein. Acta Scientiarum. Animal Sciences 35, 273–279.

Olfaz, M, Tirink, C and Onder, H (2019) Use of CART and CHAID algorithms in Karayaka sheep breeding. Kafkas Üniversitesi Veteriner Fakültesi Dergisi 25, 105–110.

Oliveira, AS, Abreu, DC, Fonseca, MA and Antoniassi, PMB (2013) Development and evaluation of predictive models of body weight for crossbred Holstein-Zebu dairy heifers. Journal of Dairy Science 96, 6697–6702.

Peterson, BG and Carl, P (2020) PerformanceAnalytics: econometric tools for performance and risk analysis. R package version 2.0.4.

Putra, WPB (2020) The assessment of body weight of Sumba Ongole cattle (Bos indicus) by body measurements. Manas Journal of Agriculture Veterinary and Life Sciences 10, 52–57.

Putra, WPB, Said, S and Arifin, J (2020) Principal component analysis (PCA) of body measurements and body indices in the Pasundan cows. Black Sea Journal of Agriculture 3, 49–55.

R Core Team (2018) R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.

Ramos-Zapata, R, Dominguez-Madrigal, C, García-Herrera, R-A, Camacho-Perez, E, Lugo-Quintal, JM, Tyasi, TL, Gurgel, ALC, Ítavo, LCV and Chay-Canul, AJ (2023) Predicting live weight using body volume formula in lactating water buffalo. Journal of Dairy Research 90, 138–141.

Reis, GL, Albuquerque, FHMAR, Valente, BD, Martins, GA, Teodoro, RL, Ferreira, MBD, Monteiro, JBN, de Almeida, M and Madalena, FE (2008) Predição do peso vivo a partir de medidas corporais em animais mestiços Holandês/Gir. Ciência Rural 38, 778–783.

Revelle, WR (2017) psych: procedures for personality and psychological research.

Rojo-Rubio, R, Vázquez-Armijo, JF, Pérez-Hernández, P, Mendoza-Martínez, GD, Salem, AZM, Albarrán-Portillo, B, González-Reyna, A, Hernández-Martínez, J, Rebollar-Rebollar, S, Cardoso-Jiménez, D, Dorantes-Coronado, EJ and Gutierrez-Cedillo, JG (2009) Dual purpose cattle production in Mexico. Tropical Animal Health and Production 41, 715–721.

Román-Ponce, SI, Ruiz-López, FDJ, Montaldo, HH, Rizzi, R and Román-Ponce, H (2013) Efectos de cruzamiento para producción de leche y características de crecimiento en bovinos de doble propósito en el trópico húmedo. Revista Mexicana de Ciencias Pecuarias 4, 405–416.

Ruchay, AN, Kolpakov, VI, Kalschikov, VV, Dzhulamanov, KM and Dorofeev, KA (2021) Predicting the body weight of Hereford cows using machine learning. IOP Conference Series: Earth and Environmental Science 624, 1–5.

Stajnko, D, Brus, M and Hočevar, M (2008) Estimation of bull live weight through thermographically measured body dimensions. Computers and Electronics in Agriculture 61, 233–240.

Tatliyer, A (2020) The effects of raising type on performances of some data mining algorithms in lambs. Journal of Agriculture and Nature 23, 772–780.

Tebug, SF, Missohou, A, Sourokou Sabi, S, Juga, J, Poole, EJ, Tapio, M and Marshall, K (2018) Using body measurements to estimate live weight of dairy cattle in low-input systems in Senegal. Journal of Applied Animal Research 46, 87–93.

Tedde, A, Grelet, C, Ho, PN, Pryce, JE, Hailemariam, D, Wang, Z, Plastow, G, Gengler, N, Brostaux, Y, Froidmont, E, Dehareng, F, Bertozzi, C, Crowe, MA, Dufrasne, I, GplusE Consortium Group and Soyeurt, H (2021) Validation of dairy cow bodyweight prediction using traits easily recorded by dairy herd improvement organizations and its potential improvement using feature selection algorithms. Animals 11, 1288.

Tırınk, C, Önder, H, Francois, D, Marcon, D, Şen, U, Shaikenova, K, Omarova, K and Tyasi, TL (2023a) Comparison of the data mining and machine learning algorithms for predicting the final body weight for Romane sheep breed. PLoS ONE 18, e0289348.

Tırınk, C, Piwczyński, D, Kolenda, M and Önder, H (2023b) Estimation of body weight based on biometric measurements by using random forest regression, support vector regression and CART algorithms. Animals 13, 798.

Wangchuk, K, Wangdi, J and Mindu, M (2018) Comparison and reliability of techniques to estimate live cattle body weight. Journal of Applied Animal Research 46, 349–352.

Zaborski, D, Ali, M, Eyduran, E, Grzesiak, W, Tariq, MM, Abbas, F, Waheed, A and Tirink, C (2019) Prediction of selected reproductive traits of indigenous Harnai sheep under the farm management system via various data mining algorithms. Pakistan Journal of Zoology 51, 421–431.

Table 1. Descriptive statistics for all variables recorded in 157 crossbred dairy cows (Holstein × Zebu)

Table 2. Goodness-of-fit criteria results of the different proportions of MARS models

Table 3. Relative importance values for predicting body weight according to the MARS prediction model
