Developing a Framework for Detecting Phishing URLs Using Machine Learning

  • Received : 2023.10.05
  • Published : 2023.10.30

Abstract

Attacks that target end users through phishing URLs are a serious threat today. With this technique, attackers can steal user data or take control of the victim's system. Early detection of phishing URLs is therefore essential. In this paper, we propose a method for detecting phishing URLs based on supervised learning algorithms and abnormal behaviors extracted from URLs. Based on the research results, we then build a framework that detects phishing URLs for end users. The novelty and advantage of the proposed method is that the abnormal behaviors are extracted from URLs that are monitored and collected directly from attack campaigns, rather than from outdated and less effective datasets.
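As a rough illustration of what extracting abnormal behaviors from a URL could look like before the result is passed to a supervised classifier, the sketch below computes a handful of lexical features with the Python standard library. The feature set and the names `url_features` and `host_is_ip` are assumptions chosen for illustration; the abstract does not list the features actually used in the paper.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(host: str) -> bool:
    """Return True when the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url: str) -> dict:
    """Map a URL to a few illustrative lexical features.

    The feature set is hypothetical: it is not taken from the paper.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        # Overall length of the raw URL string.
        "url_length": len(url),
        # Number of dots in the host, a rough proxy for subdomain depth.
        "host_dots": host.count("."),
        # Number of hyphens in the host name.
        "host_hyphens": host.count("-"),
        # A raw IP address in place of a domain name.
        "host_is_ip": host_is_ip(host),
        # An '@' in the authority part hides the real host after it.
        "has_at_sign": "@" in parts.netloc,
        # Whether the URL uses the https scheme.
        "uses_https": parts.scheme == "https",
        # Number of non-empty path segments.
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
    }
```

Each URL becomes a fixed set of numeric and boolean values, so a labeled collection of URLs can be turned into a feature table and used to train any supervised model. Which features are informative depends on the data collected, so this set should be read as a placeholder rather than a recommendation.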
