Target and Swear Word Detection Using Sentence Analysis in Real-Time Chatting

실시간 채팅 환경에서 문장 분석을 이용한 대상자 및 비속어 검출

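  • 염충석 (Department of Software, Sangmyung University) ;
  • 장준영 (Department of Software, Sangmyung University) ;
  • 장유환 (Department of Software, Sangmyung University) ;
  • 김현철 (Department of Software, Sangmyung University) ;
  • 박희민 (Department of Software, Sangmyung University)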
  • Received : 2021.03.12
  • Accepted : 2021.03.17
  • Published : 2021.03.31

Abstract

As internet usage has grown, online communication has become part of everyday life, and many people have consequently been exposed to profanity from anonymous users. Many recent studies have tried to address this problem using artificial intelligence, but most of the solutions target non-real-time settings. In this paper, we propose a Telegram plugin that detects swear words using word2vec, together with an algorithm that identifies the target of a sentence. We vectorize the input sentence to capture its relationship to similar words, then feed the resulting vectors into a pre-trained CNN (Convolutional Neural Network) model to detect swear words. For target recognition, we propose a sequential algorithm based on KoNLPy.
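The two stages described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The tiny `EMBED` table stands in for word2vec vectors, `FILTER` and `BIAS` are hand-set values standing in for a pre-trained CNN, and `find_target` is a hypothetical heuristic (nearest preceding noun), since the abstract does not spell out the sequential algorithm. The tagged input is hand-written in the `(token, tag)` shape that a KoNLPy tagger's `pos()` method returns, not actual tagger output.

```python
import math

# Hypothetical 3-dimensional embeddings standing in for word2vec vectors.
EMBED = {
    "you": [0.1, 0.0, 0.2],
    "are": [0.0, 0.1, 0.0],
    "a": [0.0, 0.0, 0.1],
    "jerk": [0.9, 0.8, 0.1],
    "friend": [0.1, 0.2, 0.9],
}
UNK = [0.0, 0.0, 0.0]  # vector used for out-of-vocabulary tokens

# One hand-set convolution filter spanning a window of two tokens.
# A real model would learn many such filters during training.
FILTER = [[1.0, 1.0, -1.0], [1.0, 1.0, -1.0]]
BIAS = -0.5


def embed(tokens):
    """Map each token to its vector (the 'vectorize the sentence' step)."""
    return [EMBED.get(t, UNK) for t in tokens]


def conv_max_pool(vectors, filt, bias):
    """Slide the filter over the sentence, apply ReLU, keep the maximum."""
    width = len(filt)
    # Pad short sentences so that at least one window exists.
    padded = vectors + [UNK] * max(0, width - len(vectors))
    scores = []
    for i in range(len(padded) - width + 1):
        s = bias
        for j in range(width):
            s += sum(w * x for w, x in zip(filt[j], padded[i + j]))
        scores.append(max(0.0, s))  # ReLU
    return max(scores)


def swear_probability(tokens):
    """Squash the pooled feature into (0, 1) with a hand-set logistic layer."""
    feature = conv_max_pool(embed(tokens), FILTER, BIAS)
    return 1.0 / (1.0 + math.exp(-(4.0 * feature - 2.0)))


def find_target(tagged, swear_index):
    """Assumed heuristic: scan left from the swear word for the nearest noun."""
    for word, tag in reversed(tagged[:swear_index]):
        if tag == "Noun":
            return word
    return None


if __name__ == "__main__":
    print(swear_probability(["you", "are", "a", "jerk"]))
    print(swear_probability(["you", "are", "a", "friend"]))
    # Hand-written (token, tag) pairs; index 2 is the flagged word.
    tagged = [("철수", "Noun"), ("는", "Josa"), ("바보", "Noun"), ("야", "Josa")]
    print(find_target(tagged, 2))
```

In a real-time chat plugin, each incoming message would pass through the detection step first, and the target search would run only on messages that were flagged, which keeps the per-message cost low.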
