Development of Examination Model of Weather Factors on Garlic Yield Using Big Data Analysis

  • Received : 2018.04.04
  • Accepted : 2018.05.04
  • Published : 2018.05.31

Abstract

Advances in information and communication technology have spurred active research in agriculture on applying big data technology to generate and use valuable information from large volumes of data. The crops and varieties that can be cultivated are determined by natural environmental conditions such as temperature, precipitation, and sunshine hours. This paper uses the growth process of garlic and daily meteorological variables to identify the climatic factors that affect crop production, and develops a model for predicting garlic yield per unit area. The meteorological variables were analyzed with big data techniques that take the growth stages of garlic into account. For the exploratory data analysis, agricultural production data, including production volume, wholesale market arrivals, and growth data, were provided by the National Statistical Office, the Rural Development Administration, and the Korea Rural Economic Institute. Meteorological observation data, including AWS, ASOS, and weather warning status data, were collected from the Korea Meteorological Administration. In the correlation analysis, the yield prediction model was designed by comparing the goodness of fit and predictive power of models obtained through variable selection, candidate model derivation, model diagnosis, and scenario prediction. The large number of weather variables was reduced in dimension by factor analysis, and the resulting factors were used as explanatory variables. This approach effectively controlled the multicollinearity and low degrees of freedom that can arise in regression analysis, and improved the fit and predictive power of the regression.

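
The dimension-reduction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the factor extraction method, so principal components computed with NumPy's SVD stand in for factor analysis here. The data are synthetic, and all names (`weather`, `yield_per_area`, `component_scores`, `fit_ols`) are hypothetical. The sketch reduces many correlated weather variables to a few uncorrelated scores and regresses yield on those scores.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit sample variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def component_scores(X, n_components):
    """Project standardized variables onto their leading principal components.

    Assumption: principal components are used as a simple stand-in for
    the factor analysis named in the abstract.
    """
    Z = standardize(X)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def fit_ols(F, y):
    """Least-squares fit of y on F with an intercept; returns (beta, R^2)."""
    A = np.column_stack([np.ones(len(F)), F])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = resid @ resid
    ss_tot = (y - y.mean()) @ (y - y.mean())
    return beta, 1.0 - ss_res / ss_tot

# Synthetic data (assumption): 30 seasons, 12 correlated weather variables
# driven by 2 latent factors, plus a yield that depends on those factors.
rng = np.random.default_rng(0)
n_years, n_vars, n_factors = 30, 12, 2
latent = rng.normal(size=(n_years, n_factors))
loadings = rng.normal(size=(n_factors, n_vars))
weather = latent @ loadings + 0.3 * rng.normal(size=(n_years, n_vars))
yield_per_area = (1200 + 80 * latent[:, 0] - 50 * latent[:, 1]
                  + 20 * rng.normal(size=n_years))

scores = component_scores(weather, n_factors)
beta, r2 = fit_ols(scores, yield_per_area)

# Condition number of the predictor correlation matrix: large values
# signal multicollinearity; uncorrelated scores give a value near 1.
cond_raw = np.linalg.cond(np.corrcoef(weather, rowvar=False))
cond_scores = np.linalg.cond(np.corrcoef(scores, rowvar=False))

# Residual degrees of freedom = observations - (predictors + intercept).
df_raw = n_years - (n_vars + 1)
df_reduced = n_years - (n_factors + 1)

print("condition number, raw variables:", cond_raw)
print("condition number, component scores:", cond_scores)
print("residual degrees of freedom:", df_raw, "->", df_reduced)
print("R^2 of regression on scores:", r2)
```

Because the component scores are mutually uncorrelated, the regression on them avoids the multicollinearity present among the raw weather variables, and using 2 predictors instead of 12 leaves more residual degrees of freedom for the same 30 observations.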
