Social Media Data Analysis Trends and Methods

  • Rokaya, Mahmoud (Department of Information Technology, College of Computers and Information Technology, Taif University)
  • Al Azwari, Sanaa (Department of Information Technology, College of Computers and Information Technology, Taif University)
  • Received : 2022.09.05
  • Published : 2022.09.30

Abstract

Social media gives individuals, communities, and companies a window through which to spread ideas and promote trends and products. These opportunities have brought challenges related to security, privacy, and rights. The data accumulated from social media has also become a fertile source for analytics, inference, and experimentation with new technologies in data science. This chapter focuses on methods of trend analysis, especially ensemble learning methods, which embrace cooperation between different learning methods rather than competition between them. We discuss the most important trends in ensemble learning and their applications in analysing social media data and anticipating the most important future trends.
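
The idea of cooperation between learners can be illustrated with a minimal sketch of plurality voting, one of the simplest ensemble combination rules. This is only an illustration and not the method of the chapter: the three rule-based "learners", the labels `spam` and `ham`, and all function names are hypothetical stand-ins for trained classifiers.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of base learners."""
    # most_common(1) returns a one-element list of (label, count).
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical base learners: each is a crude rule on the text of a post.
def has_link(post):
    return "spam" if "http" in post else "ham"


def has_many_exclamations(post):
    return "spam" if post.count("!") >= 3 else "ham"


def mentions_prize(post):
    return "spam" if "prize" in post.lower() else "ham"


BASE_LEARNERS = [has_link, has_many_exclamations, mentions_prize]


def ensemble_predict(post):
    """Combine all base learners by vote instead of selecting a single one."""
    return majority_vote([learner(post) for learner in BASE_LEARNERS])
```

With three voters and two labels there is always a strict majority, so a post that trips only one rule (for example, a legitimate message containing a link) is still labelled `ham`: the other two learners outvote the one that errs.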
