MapReduce-Based Partitioner Big Data Analysis Scheme for Processing Rate of Log Analysis

(Korean title, translated: MapReduce-Based Split Big Data Analysis Technique for Improving the Log Analysis Processing Rate)

  • Received : 2018.09.07
  • Accepted : 2018.10.01
  • Published : 2018.10.30

Abstract

With the spread of the Internet and smart devices, media such as social media have become easy to access, and large volumes of big data are being generated as a result. Companies that provide Internet services, in particular, analyze this data with MapReduce-based techniques to understand customer preferences and behavior patterns and to strengthen security. However, MapReduce defines the number of reducer objects created in the reduce stage as one, so the data to be processed concentrates on a single reducer object. That reducer becomes a bottleneck, and the processing rate of the analysis falls. This paper therefore proposes a MapReduce-based split big data analysis method that improves the processing rate of log analysis. The proposed method separates processing into a reducer partitioning stage and an analysis result combining stage, and it creates reducer objects dynamically, which reduces the bottleneck and improves the big data processing rate.
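The two stages named in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation and does not use Hadoop. The log format (status code as the last field), the sizing rule in `choose_reducer_count`, and all function names are assumptions made only for illustration.

```python
from collections import Counter, defaultdict
from math import ceil
import zlib


def map_phase(log_lines):
    """Emit (status_code, 1) per line; assumes the status code is the last field."""
    for line in log_lines:
        fields = line.split()
        if fields:
            yield fields[-1], 1


def choose_reducer_count(num_records, records_per_reducer=1000, max_reducers=8):
    """Hypothetical sizing rule: one reducer per block of records, within [1, max]."""
    return max(1, min(max_reducers, ceil(num_records / records_per_reducer)))


def partition(pairs, num_reducers):
    """Reducer partitioning stage: route each key to one bucket by a stable hash."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[zlib.crc32(key.encode()) % num_reducers].append((key, value))
    return buckets


def reduce_phase(bucket):
    """Sum the values per key inside a single reducer's bucket."""
    counts = Counter()
    for key, value in bucket:
        counts[key] += value
    return counts


def combine(partial_results):
    """Analysis result combining stage: merge the per-reducer partial counts."""
    total = Counter()
    for partial in partial_results:
        total.update(partial)
    return total


def analyze(log_lines, records_per_reducer=1000, max_reducers=8):
    """Run map, partition, reduce, and combine; return (totals, reducer count)."""
    pairs = list(map_phase(log_lines))
    n = choose_reducer_count(len(pairs), records_per_reducer, max_reducers)
    buckets = partition(pairs, n)
    # In a real cluster each bucket would be reduced in parallel on its own node.
    partials = [reduce_phase(buckets[i]) for i in range(n)]
    return combine(partials), n


if __name__ == "__main__":
    logs = ["GET /a 200", "GET /b 404", "POST /c 200", "GET /d 500", "GET /e 200"]
    totals, n = analyze(logs, records_per_reducer=2, max_reducers=4)
    print(n, dict(totals))
```

Because every key is routed to exactly one bucket, the combined totals match what a single reducer would produce, while the reduce work is spread over several buckets. How the paper actually chooses the reducer count is not stated in the abstract, so the threshold rule here is only a placeholder.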
