Wine Quality Prediction by Using Backward Elimination Based on XGBoosting Algorithm

  • Umer Zukaib (Department of Computer Science, Comsats University Islamabad, Abbottabad Campus) ;
  • Mir Hassan (Institute of Data Science and Digital Technologies, Vilnius University) ;
  • Tariq Khan (Department of Information Engineering, University of Politecnico Delle Marche) ;
  • Shoaib Ali (Department of Computer Science, Virtual University)
  • Received : 2024.02.05
  • Published : 2024.02.29

Abstract

Many industries rely on quality certification to promote their products and brands. Obtaining certification, particularly from human experts, is difficult. Machine learning has many applications in assessing and assigning quality certifications to products at scale, and wine, which is sold under many brands, is one product whose quality it can help assess. In this research, we use two publicly available datasets from the UC Irvine Machine Learning Repository to predict wine quality: a red wine dataset with 1599 records and a white wine dataset with 4898 records. The study is twofold. First, we use backward elimination to determine how the dependent variable depends on the independent variables and to predict the dependent variable; the technique helps identify which independent variables are most likely to improve wine quality. Second, we use the robust machine learning algorithm XGBoost to predict wine quality efficiently. We evaluate the model using four measures: root mean square error, mean absolute error, mean square error, and R². We compare the results of XGBoost with those of other state-of-the-art machine learning techniques; the experimental results show that XGBoost outperforms them.
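The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions of each measure, since the abstract does not give formulas. The function name `regression_errors` and the sample quality scores are hypothetical and are not taken from the paper or its datasets.

```python
import math


def regression_errors(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for two equal-length sequences.

    Standard definitions are assumed; the paper's exact formulas are not
    given in the abstract.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)      # residual sum of squares
    mse = ss_res / n                            # mean square error
    rmse = math.sqrt(mse)                       # root mean square error
    mae = sum(abs(r) for r in residuals) / n    # mean absolute error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined when every true value is identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical quality scores, for illustration only.
actual = [5, 6, 7, 5, 6]
predicted = [5.5, 6.0, 6.5, 5.0, 7.0]
scores = regression_errors(actual, predicted)
```

Lower values of the three error measures indicate a better fit, whereas R² is a goodness-of-fit score for which higher is better, so the two kinds of number should be read in opposite directions when comparing models.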
