Comparative Study of Attendance Prediction Using Machine Learning and Regression Analysis

머신러닝과 회귀분석을 활용한 관중수 예측방안 비교연구

  • Seung-Yong Lee (Big Data Contents Convergence Department, Namseoul University)
  • 이승용 (남서울대학교 빅데이터콘텐츠융합학과)
  • Received : 2025.05.22
  • Accepted : 2025.07.20
  • Published : 2025.07.30

Abstract

This study aims to identify an effective methodology for predicting spectator attendance in professional sports. Using home game data from a professional baseball team for 2015 through 2017, prediction models were built with multiple regression and three machine learning methods (Deep Learning, Random Forest, and XGBoost), and their predictions were compared against 2024 attendance figures. The multiple regression showed that events, weather, and the day of the game significantly influenced attendance, with a MAPE of 38.58%. Among the machine learning methods, XGBoost achieved the highest accuracy (MAPE 6.83%) by effectively mitigating overfitting, which underlines the need for proper overfitting control when working with small-scale data. These findings offer practical guidance for sports marketing firms in selecting an appropriate prediction model. Future research will examine ways to improve machine-learning prediction accuracy on the small-scale data held by SMEs.

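The purpose of this study is to identify a methodology suited to predicting spectator attendance in professional sports. Using data on the home games that professional baseball club A played from 2015 through 2017, prediction models were derived with multiple regression analysis and three machine learning techniques, and their accuracy was checked against 2024 attendance figures. In the multiple regression analysis, events, weather, and the day of the game had a significant effect on attendance, and prediction accuracy was 38.58% by MAPE. Among the predictions made with Deep Learning, Random Forest, and XGBoost, XGBoost, which adequately controlled overfitting, showed an accuracy of 6.83% by MAPE, confirming that overfitting is a key issue when predicting from small-scale data such as attendance figures. This study can help professional sports marketing firms and similar organizations choose an appropriate attendance prediction methodology. Future research will develop ways to improve the accuracy of machine-learning-based prediction using the small-scale data held by small and medium-sized enterprises.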
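
The abstract compares models by MAPE (mean absolute percentage error), where a lower value means a more accurate prediction. The following is a minimal sketch of the standard MAPE formula in Python, not the study's own code. The function name `mape` and the attendance figures are hypothetical and chosen only for illustration.

```python
def mape(actual, predicted):
    """Return the mean absolute percentage error, expressed in percent.

    Standard definition: 100 * mean(|actual - predicted| / |actual|).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical per-game attendance, NOT data from the study.
actual_attendance = [10000, 8000, 12000]
predicted_attendance = [9000, 8800, 12000]

print(f"MAPE: {mape(actual_attendance, predicted_attendance):.2f}%")
```

With these made-up numbers the first two games are each off by 10% and the third is exact, so the MAPE is about 6.67%. Because the error is relative to each game's actual attendance, a miss of 1,000 spectators weighs more heavily on a low-attendance game than on a sold-out one.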

Keywords

Acknowledgement

This paper was supported by the Namseoul University academic research fund in 2024.
