Kafkas Üniversitesi Veteriner Fakültesi Dergisi Early View
Hybrid Ensemble Model for Lactation Milk Yield Prediction of Holstein Cows
Derviş TOPUZ¹, Selçuk TEKGÖZ²
¹ Niğde Ömer Halisdemir University, Niğde Zübeyde Hanım Vocational School of Health Services, Department of Health Services Science, TR-51240 Niğde-TÜRKİYE
² Niğde Ömer Halisdemir University, Graduate School of Natural and Applied Sciences, Department of Interdisciplinary Disaster Management, TR-51240 Niğde-TÜRKİYE
DOI: 10.9775/kvfd.2025.34031

Machine learning (ML) algorithms are widely used across many domains to identify patterns and relationships in large datasets and to perform tasks such as prediction and classification. This study investigates the use of ML techniques to predict lactation milk yield in Holstein dairy cows within the field of veterinary sciences. The dataset comprises records from 128 cows, with lactation milk yield categorized into three classes (low, medium, and high) based on threshold values determined by expert opinion. The independent variables are Age (in days), Days in Milk (DIM), Service Period (in days), Calving Date, and Parity. To reduce the dimensionality of the dataset, Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were applied. The performance of nine classification algorithms was evaluated on both the original and the reduced datasets using 10-fold cross-validation and bootstrap resampling. Because the classes were imbalanced, the weighted F1-score was used as the primary performance metric instead of accuracy. Among the original models, the highest weighted F1-scores were achieved by Decision Tree (DT), Gradient Boosting Machine (GBM), and Extreme Gradient Boosting (XGBoost), with scores of 0.47, 0.53, and 0.51, respectively. A hybrid ensemble model that combines these three top-performing algorithms outperformed them, yielding a weighted F1-score of 1.00, an accuracy of 1.00, and an ROC-AUC of 1.00. These findings suggest that hybrid ensemble models can provide more effective and robust solutions in veterinary applications and similar research fields.

Keywords: Machine Learning, Decision tree, Gradient boosting machine, XGBoost, Milk yield, Holstein, F1-score, AUC, Hybrid model
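
The abstract names the weighted F1-score as the primary metric because the three yield classes are imbalanced. As a minimal illustration of how that metric is commonly defined (the per-class F1 averaged with weights proportional to each class's number of true samples), the sketch below computes it in plain Python. The function names and the toy labels are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)          # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0   # F1 = 2TP / (2TP + FP + FN)
        score += (n / total) * f1
    return score


# Hypothetical, deliberately imbalanced toy labels (NOT the study's data):
# a classifier that always predicts the majority class "medium".
y_true = ["medium"] * 8 + ["low"] + ["high"]
y_pred = ["medium"] * 10

print(f"accuracy    = {accuracy(y_true, y_pred):.2f}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.2f}")
```

In this toy case the majority-class predictor scores lower on weighted F1 than on accuracy, because the two minority classes each contribute an F1 of zero.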