Sentiment classification on Arabic corpora: A preliminary cross-study

Title: Sentiment classification on Arabic corpora: A preliminary cross-study
Publication Type: Journal Article
Year of Publication: 2013
Authors: Mountassir, A, Benbrahim, H, Berrada, I
Journal: Document Numerique
Volume: 16
Pagination: 73-96
Abstract

The rise of social media (such as online web forums and social networking sites) has attracted interest in mining and analyzing the opinions available on the web. Online opinion has become an object of study in many research areas, especially the one known as "Opinion Mining and Sentiment Analysis". Several interesting and advanced works have been carried out on a few languages (in particular English). However, there have been very few studies on Morphologically Rich Languages such as Arabic. This paper presents the study we carried out to investigate supervised sentiment classification in an Arabic context. We use two Arabic corpora that differ in many respects. We use three common classifiers known for their effectiveness, namely Naïve Bayes, Support Vector Machines and k-Nearest Neighbor. We investigate several settings to identify those that achieve the best results. These settings concern stemming type, term frequency thresholding, term weighting and word n-grams. We show that Naïve Bayes and Support Vector Machines are competitively effective, whereas the effectiveness of k-Nearest Neighbor depends on the corpus. On the basis of this study, we recommend using light-stemming rather than stemming, removing terms that occur only once, combining unigram and bigram words, and using presence-based rather than frequency-based weighting. Our results also show that classification performance can be influenced by document length, document homogeneity and the nature of the document authors. However, the size of the data sets does not have an impact on classification results. © 2013 Lavoisier.
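
Three of the recommended settings (combined unigram and bigram words, removal of terms that occur only once, and presence-based weighting) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the whitespace tokenization and the English toy documents are assumptions made for illustration, and light-stemming is omitted because it requires Arabic-specific tooling.

```python
from collections import Counter

def ngrams(tokens, n_max=2):
    """Return all word n-grams of length 1..n_max as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]

def build_vocabulary(docs, min_count=2):
    """Keep terms whose total count over the corpus is at least min_count.

    With min_count=2 this drops terms that occur only once (assumed here
    to mean once in the whole corpus).
    """
    counts = Counter(term for doc in docs for term in ngrams(doc.split()))
    return sorted(term for term, count in counts.items() if count >= min_count)

def presence_vector(doc, vocabulary):
    """Presence-based weighting: 1 if the term occurs at all, else 0."""
    present = set(ngrams(doc.split()))
    return [1 if term in present else 0 for term in vocabulary]

# Hypothetical toy corpus; a real experiment would use tokenized Arabic text.
docs = [
    "good film good story",
    "bad film",
    "good story",
]
vocab = build_vocabulary(docs)
vectors = [presence_vector(doc, vocab) for doc in docs]
```

The resulting binary vectors could then be passed to any of the three classifiers named in the abstract. Note that "good" appears twice in the first toy document but still contributes a weight of 1, which is the difference between presence-based and frequency-based weighting.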

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84879022187&doi=10.3166%2fDN.16.1.73-96&partnerID=40&md5=203c5325261be58119451ab15e8141b5
DOI: 10.3166/DN.16.1.73-96