AJRSP
International peer-reviewed journal

ISSN: 2706-6495
Email: editor@ajrsp.com



  • All copyrights reserved to the Academic Journal of Research and Scientific Publishing by Creative Commons License

    A Template-Based Approach for Tagging Non-Vocalized Arabic Nouns

    Author: Mr. Hashem Saadaldin Alghalib Alsharif
    Managing Director, Deanship of Admission & Registration, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia
    Email: halshareef@kau.edu.sa
    Doi: doi.org/10.52132/Ajrsp.e.2021.32.1


    Abstract:

    There exist no corpora of Arabic nouns, and nouns can appear in many different forms in any Arabic text. Tagging the nouns in an Arabic text also makes it possible to determine whether each sentence begins with a noun or a verb. Part-of-speech (POS) tagging is the task of labeling each word in a sentence with its appropriate category, called a tag (Noun, Verb, or Article). In this thesis, we attempt to tag non-vocalized Arabic text. The proposed POS tagger for Arabic text searches for each word of the text in our lists of Verbs and Articles, and Nouns are found by elimination: our hypothesis states that if a word in the text is not found in either list, then it is a Noun. This comparison is repeated for each word until the whole text has been tagged. To apply the method, we prepared a list of Arabic verbs and articles with a total of 112 million entries combined, which we use in the comparisons to test our hypothesis. To evaluate the proposed method, we used 78,245 pre-tagged words from "The Quranic Arabic Corpus" and compared our template-based tagging approach with AraMorph, a rule-based tagger, and with the Stanford Log-linear Part-Of-Speech Tagger. AraMorph correctly tagged 40% of the words and the Stanford Log-linear Part-Of-Speech Tagger correctly tagged 68%, while our method correctly tagged 68,501 words (88%).
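The elimination hypothesis described in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation: the word lists below are tiny hypothetical samples written in Buckwalter-style transliteration, whitespace tokenization is an assumption, and the names `tag_word`, `tag_text`, and `accuracy` are introduced here only for illustration. The sketch covers only the list lookup the abstract describes, plus the accuracy arithmetic behind the reported 88%.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Hypothetical toy lists (Buckwalter-style transliteration).
# The lists described in the abstract are far larger.
VERBS: FrozenSet[str] = frozenset({"*hb", "qAl", "ktb"})
ARTICLES: FrozenSet[str] = frozenset({"fy", "mn", "<lY"})


def tag_word(word: str,
             verbs: FrozenSet[str] = VERBS,
             articles: FrozenSet[str] = ARTICLES) -> str:
    """Tag one word: look it up in both lists, otherwise call it a Noun."""
    if word in verbs:
        return "Verb"
    if word in articles:
        return "Article"
    # Elimination hypothesis: a word found in neither list is a Noun.
    return "Noun"


def tag_text(text: str) -> List[Tuple[str, str]]:
    """Tag every whitespace-separated word (tokenization is an assumption)."""
    return [(word, tag_word(word)) for word in text.split()]


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predicted tags that match the gold (pre-tagged) tags."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("tag sequences must be non-empty and of equal length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


if __name__ == "__main__":
    for word, tag in tag_text("*hb Alwld <lY Almdrsp"):
        print(word, tag)
    # Figures reported in the abstract: 68,501 correct out of 78,245 words.
    print(f"{68501 / 78245:.2%}")
```

Frozen sets keep each lookup at constant expected time, which matters when the lists are large. One limitation the sketch makes visible is that a non-vocalized form shared by a verb and a noun is always tagged as a Verb, because the list lookup takes precedence over elimination.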

    Keywords:

    Template, Approach, Tagging, Non-Vocalized, Arabic Nouns
