Information Retrieval Systems: Between Morphological Analyzers and Stemming Algorithms

  • Mohamed, Afaf Abdel Rhman (Department of Computer Science and Information, College of Science at Zulf, Majmaah University) ;
  • Ouni, Chafika (Department of Computer Science and Information, College of Science at Zulf, Majmaah University) ;
  • Eljack, Sarah Mustafa (Department of Computer Science and Information, College of Science at Zulf, Majmaah University) ;
  • Alfayez, Fayez (Department of Computer Science and Information, College of Science at Zulf, Majmaah University)
  • Received : 2022.03.05
  • Published : 2022.03.30

Abstract

The main objective of an Information Retrieval System (IRS) is to obtain suitable information within a reasonable time to satisfy a user's need. To achieve this purpose, an IRS should have a good indexing system that is based on natural language processing. In this context, we focus on the available Arabic language processing techniques for an IRS, with the goal of improving retrieval performance. Our contribution consists of integrating morphological analysis into an IRS in order to compare its impact with that of stemming algorithms.
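To make the comparison concrete, the following is a minimal sketch of an index whose term normalizer can be swapped between a light stemmer and a lemma lookup. It is not the authors' system: the affix lists, the `LEMMAS` table (a hand-written stand-in for a real morphological analyzer), the sample documents, and all function names are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Set

# Illustrative affix lists only (assumption); real Arabic light stemmers
# use larger, carefully tuned sets.
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "و")
SUFFIXES = ("ات", "ون", "ين", "ها", "ة")


def light_stem(token: str) -> str:
    """Strip at most one prefix and one suffix, keeping at least 3 letters."""
    for prefix in PREFIXES:
        if token.startswith(prefix) and len(token) - len(prefix) >= 3:
            token = token[len(prefix):]
            break
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            token = token[:-len(suffix)]
            break
    return token


# Hypothetical stand-in for a morphological analyzer: a tiny hand-written
# table mapping surface forms to a lemma. A real analyzer computes this.
LEMMAS = {
    "الكتاب": "كتاب",  # "the book"
    "الكتب": "كتاب",   # "the books" (broken plural)
    "كتب": "كتاب",
    "كتاب": "كتاب",
}


def lemma_lookup(token: str) -> str:
    """Return the lemma if the toy table knows the token, else the token."""
    return LEMMAS.get(token, token)


def build_index(docs: Dict[int, str],
                normalize: Callable[[str], str]) -> Dict[str, Set[int]]:
    """Build an inverted index: normalized term -> set of document ids."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[normalize(token)].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str,
           normalize: Callable[[str], str]) -> Set[int]:
    """Return ids of documents containing every normalized query term."""
    postings = [index.get(normalize(t), set()) for t in query.split()]
    return set.intersection(*postings) if postings else set()


# Two toy documents: one uses the singular, the other the broken plural.
DOCS = {1: "الكتاب جديد", 2: "الكتب قديمة"}

stem_hits = search(build_index(DOCS, light_stem), "كتاب", light_stem)
lemma_hits = search(build_index(DOCS, lemma_lookup), "كتاب", lemma_lookup)
print(sorted(stem_hits), sorted(lemma_hits))
```

In this toy setting, affix stripping alone does not relate the broken plural to its singular, so the stemmed index matches only the first document, while the lemma-based index matches both. Whether that difference carries over to real collections is what an evaluation of the kind described in the abstract would have to measure.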

Acknowledgement

The authors would like to thank the Deanship of Scientific Research at Majmaah University for supporting this work under Project Number R-2022-18.
