The Blog Ranking Algorithm Reflecting Trend Index

  • Lee, Yong-Suk (Department of Big Data Application and Security, Graduate School of Information Security, Korea University)
  • Kim, Hyoung Joong (Department of Big Data Application and Security, Graduate School of Information Security, Korea University)
  • Received : 2017.06.07
  • Accepted : 2017.06.25
  • Published : 2017.06.30

Abstract

The growth of blogs has two aspects: a positive one, the provision of diverse information, and a negative one, their use as a marketing tool. This study collected the rankings of blog posts on a large portal using OpenAPI and examined the features of highly ranked blogs through exploratory data analysis. The analysis showed that the blogger's influence and the recency of the post's creation date were closely related to a top rank. Because of this weakness in the evaluation algorithm, search results were skewed toward posts by power bloggers. We propose an algorithm that makes the application of ranking scores fairer by using a trend index that represents the varied interests of the public, and that improves content reliability by adding information from a trust DB verified by experts. Restaurant search results produced with the improved algorithm were confirmed to be highly similar to the restaurant recommendations of actual local students. The improved algorithm makes it possible to provide more reliable information, and as an additional benefit it is expected to make it harder to manipulate rankings with illegal apps that inflate visitor counts.
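The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a simple weighted sum of a base relevance score, a trend index, and a trust-DB flag. The names `Post`, `rerank_score`, `rerank`, and `jaccard`, the weights, and the [0, 1] score ranges are all hypothetical. The Jaccard function illustrates one possible way to compare a re-ranked result list against an external recommendation list; the abstract does not say which similarity measure was used.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    base_score: float   # assumed: relevance score from the original ranking, in [0, 1]
    trend_index: float  # assumed: public-interest index for the post's topic, in [0, 1]
    in_trust_db: bool   # assumed: True if the content appears in an expert-verified DB


def rerank_score(post: Post, w_trend: float = 0.3, w_trust: float = 0.2) -> float:
    """Weighted sum of base score, trend index, and trust flag (hypothetical weights)."""
    w_base = 1.0 - w_trend - w_trust
    trust = 1.0 if post.in_trust_db else 0.0
    return w_base * post.base_score + w_trend * post.trend_index + w_trust * trust


def rerank(posts: list[Post]) -> list[Post]:
    """Sort posts by the combined score, highest first."""
    return sorted(posts, key=rerank_score, reverse=True)


def jaccard(a, b) -> float:
    """Jaccard similarity between two collections of item names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


posts = [
    Post("power blogger review", base_score=0.9, trend_index=0.1, in_trust_db=False),
    Post("verified local review", base_score=0.6, trend_index=0.8, in_trust_db=True),
]
ranked = rerank(posts)
```

With these assumed weights, the post backed by the trust DB and a high trend index moves ahead of the post that relies on base score alone. Because visitor counts are not an input to this sketch, inflating them would not change the order.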
